ClickHouse/docs/en/query_language/insert_into.md
Ivan Blinkov 0a4a5b36cc
Some WIP on documentation refactoring (#2659)
* Additional .gitignore entries

* Merge a bunch of small articles about system tables into single one

* Merge a bunch of small articles about formats into single one

* Adapt table with formats to English docs too

* Add SPb meetup link to main page

* Move Utilities out of top level of docs (the location is probably not yet final) + translate couple articles

* Merge MacOS.md into build_osx.md

* Move Data types higher in ToC

* Publish changelog on website alongside documentation

* Few fixes for en/table_engines/file.md

* Use smaller header sizes in changelogs

* Group up table engines inside ToC

* Move table engines out of top level too

* Specificy in ToC that query language is SQL based. Thats a bit excessive, but catches eye.

* Move stuff that is part of query language into respective folder

* Move table functions lower in ToC

* Lost redirects.txt update

* Do not rely on comments in yaml + fix few ru titles

* Extract major parts of queries.md into separate articles

* queries.md has been supposed to be removed

* Fix weird translation

* Fix a bunch of links

* There is only table of contents left

* "Query language" is actually part of SQL abbreviation

* Change filename in README.md too

* fix mistype
2018-07-18 13:00:53 +03:00

2.7 KiB

INSERT

Adding data.

Basic query format:

INSERT INTO [db.]table [(c1, c2, c3)] VALUES (v11, v12, v13), (v21, v22, v23), ...

The query can specify a list of columns to insert [(c1, c2, c3)]. In this case, the rest of the columns are filled with:

  • The values calculated from the DEFAULT expressions specified in the table definition.
  • Zeros and empty strings, if DEFAULT expressions are not defined.

If strict_insert_defaults=1, columns that do not have DEFAULT defined must be listed in the query.

Data can be passed to the INSERT in any format supported by ClickHouse. The format must be specified explicitly in the query:

INSERT INTO [db.]table [(c1, c2, c3)] FORMAT format_name data_set

For example, the following query format is identical to the basic version of INSERT ... VALUES:

INSERT INTO [db.]table [(c1, c2, c3)] FORMAT Values (v11, v12, v13), (v21, v22, v23), ...

ClickHouse removes all spaces and one line feed (if there is one) before the data. When forming a query, we recommend putting the data on a new line after the query operators (this is important if the data begins with spaces).

Example:

INSERT INTO t FORMAT TabSeparated
11  Hello, world!
22  Qwerty

You can insert data separately from the query by using the command-line client or the HTTP interface. For more information, see the section "Interfaces".

Inserting the results of SELECT

INSERT INTO [db.]table [(c1, c2, c3)] SELECT ...

Columns are mapped according to their position in the SELECT clause. However, their names in the SELECT expression and the table for INSERT may differ. If necessary, type casting is performed.

None of the data formats except Values allow setting values to expressions such as now(), 1 + 2, and so on. The Values format allows limited use of expressions, but this is not recommended, because in this case inefficient code is used for their execution.

Other queries for modifying data parts are not supported: UPDATE, DELETE, REPLACE, MERGE, UPSERT, INSERT UPDATE. However, you can delete old data using ALTER TABLE ... DROP PARTITION.

Performance considerations

INSERT sorts the input data by primary key and splits them into partitions by month. If you insert data for mixed months, it can significantly reduce the performance of the INSERT query. To avoid this:

  • Add data in fairly large batches, such as 100,000 rows at a time.
  • Group data by month before uploading it to ClickHouse.

Performance will not decrease if:

  • Data is added in real time.
  • You upload data that is usually sorted by time.