# HTTP Interface {#http_interface}
The HTTP interface lets you use ClickHouse on any platform from any programming language. We use it for working from Java and Perl, as well as from shell scripts. In other departments, the HTTP interface is used from Perl, Python, and Go. The HTTP interface is more limited than the native interface, but it has better compatibility.

By default, clickhouse-server listens for HTTP on port 8123 (this can be changed in the config).
If you make a `GET /` request without parameters, the server returns the string "Ok." followed by a line feed. You can use this in health-check scripts.

```bash
$ curl 'http://localhost:8123/'
Ok.
```
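
A health-check script along these lines could be sketched as follows. The `CH_URL` variable and the `ch_health_check` function name are illustrative assumptions, not part of ClickHouse, and the two-second timeout is an arbitrary choice.

```bash
# Base URL of the HTTP interface; CH_URL is a hypothetical override variable.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Succeeds only if the server answers GET / with exactly "Ok.".
ch_health_check() {
  # -s hides the progress meter; --max-time bounds the whole request to 2 seconds.
  body=$(curl -s --max-time 2 "$CH_URL") || return 1
  # Command substitution strips the trailing line feed, so compare with "Ok.".
  [ "$body" = "Ok." ]
}

# Usage in a script:
#   ch_health_check || { echo "ClickHouse is not responding" >&2; exit 1; }
```
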

Send the query as the URL 'query' parameter, or in the body of a POST request. You can also send the beginning of the query in the 'query' parameter and the rest in the POST body (we'll explain later why this is necessary). The size of the URL is limited to 16 KB, so keep this in mind when sending large queries.

If successful, you receive the 200 response code and the result in the response body.
If an error occurs, you receive the 500 response code and an error description text in the response body.
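
A script can tell success from failure by inspecting the response code. The sketch below assumes a hypothetical helper named `ch_query` and a `CH_URL` variable; it relies on curl's `-w '%{http_code}'` option to append the status code after the response body.

```bash
# Base URL of the HTTP interface; CH_URL is a hypothetical override variable.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Sends the query in the POST body. Prints the result on success;
# on any other status, prints the error text to stderr and fails.
ch_query() {
  # -w appends a line feed and the HTTP status code after the response body.
  response=$(printf '%s' "$1" |
    curl -sS --data-binary @- -w '\n%{http_code}' "$CH_URL") || return 1
  status=$(printf '%s\n' "$response" | tail -n 1)   # last line: status code
  body=$(printf '%s\n' "$response" | sed '$d')      # everything before it
  if [ "$status" = "200" ]; then
    [ -z "$body" ] || printf '%s\n' "$body"
  else
    printf 'HTTP %s: %s\n' "$status" "$body" >&2
    return 1
  fi
}

# Usage:
#   ch_query 'SELECT 1' || echo "query failed" >&2
```
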

When using the GET method, the 'readonly' setting is applied. In other words, for queries that modify data, you can only use the POST method. You can send the query itself either in the POST body, or in the URL parameter.

Examples:

```bash
$ curl 'http://localhost:8123/?query=SELECT%201'
1

$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
1

$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123
HTTP/1.0 200 OK
Connection: Close
Date: Fri, 16 Nov 2012 19:21:50 GMT

1
```

As you can see, curl is somewhat inconvenient in that spaces must be URL escaped.
Although wget escapes everything itself, we don't recommend using it because it doesn't work well over HTTP 1.1 when using keep-alive and Transfer-Encoding: chunked.
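
One way to avoid escaping the query by hand is to let curl do it. The sketch below uses curl's `-G` and `--data-urlencode` options, which percent-encode the value and place it in the URL query string; the `ch_get` function name and the `CH_URL` variable are illustrative assumptions.

```bash
# Base URL of the HTTP interface; CH_URL is a hypothetical override variable.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Sends a read-only query with GET; curl percent-encodes the query text.
ch_get() {
  # -G moves the --data-urlencode payload into the URL query string,
  # so 'SELECT 1' is sent as ?query=SELECT%201.
  curl -sS -G --data-urlencode "query=$1" "$CH_URL"
}

# Usage:
#   ch_get 'SELECT 1'
```
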

```bash
$ echo 'SELECT 1' | curl 'http://localhost:8123/' --data-binary @-
1

$ echo 'SELECT 1' | curl 'http://localhost:8123/?query=' --data-binary @-
1

$ echo '1' | curl 'http://localhost:8123/?query=SELECT' --data-binary @-
1
```

If part of the query is sent in the parameter, and part in the POST, a line feed is inserted between these two data parts.

Example (this won't work):

```bash
$ echo 'ECT 1' | curl 'http://localhost:8123/?query=SEL' --data-binary @-
Code: 59, e.displayText() = DB::Exception: Syntax error: failed at position 0: SEL
ECT 1
, expected One of: SHOW TABLES, SHOW DATABASES, SELECT, INSERT, CREATE, ATTACH, RENAME, DROP, DETACH, USE, SET, OPTIMIZE., e.what() = DB::Exception
```

By default, data is returned in the TabSeparated format (for more information, see the "Formats" section).
Use the FORMAT clause of the query to request any other format.

```bash
$ echo 'SELECT 1 FORMAT Pretty' | curl 'http://localhost:8123/?' --data-binary @-
┏━━━┓
┃ 1 ┃
┡━━━┩
│ 1 │
└───┘
```

The POST method of transmitting data is necessary for INSERT queries. In this case, you can write the beginning of the query in the URL parameter, and use POST to pass the data to insert. The data to insert could be, for example, a tab-separated dump from MySQL. In this way, the INSERT query replaces LOAD DATA LOCAL INFILE from MySQL.

Examples:

Creating a table:

```bash
echo 'CREATE TABLE t (a UInt8) ENGINE = Memory' | curl 'http://localhost:8123/' --data-binary @-
```

Using the familiar INSERT query for data insertion:

```bash
echo 'INSERT INTO t VALUES (1),(2),(3)' | curl 'http://localhost:8123/' --data-binary @-
```

Data can be sent separately from the query:

```bash
echo '(4),(5),(6)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20VALUES' --data-binary @-
```

You can specify any data format. The 'Values' format is the same as what is used when writing INSERT INTO t VALUES:

```bash
echo '(7),(8),(9)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20FORMAT%20Values' --data-binary @-
```

To insert data from a tab-separated dump, specify the corresponding format:

```bash
echo -ne '10\n11\n12\n' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20FORMAT%20TabSeparated' --data-binary @-
```

Reading the table contents. Data is output in random order due to parallel query processing:

```bash
$ curl 'http://localhost:8123/?query=SELECT%20a%20FROM%20t'
7
8
9
10
11
12
1
2
3
4
5
6
```

Deleting the table:

```bash
echo 'DROP TABLE t' | curl 'http://localhost:8123/' --data-binary @-
```

For successful requests that don't return a data table, an empty response body is returned.

You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you will need to use the special `clickhouse-compressor` program to work with it (it is installed with the `clickhouse-client` package). To increase the efficiency of data insertion, you can disable server-side checksum verification with the [http_native_compression_disable_checksumming_on_decompress](../operations/settings/settings.md#settings-http_native_compression_disable_checksumming_on_decompress) setting.

If you specify `compress=1` in the URL, the server compresses the data it sends you.
If you specify `decompress=1` in the URL, the server decompresses the data that you pass in the `POST` body.

It is also possible to use the standard `gzip`-based [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression). To send a `POST` request compressed using `gzip`, add the request header `Content-Encoding: gzip`.
In order for ClickHouse to compress the response using `gzip`, you must add `Accept-Encoding: gzip` to the request headers and enable the ClickHouse [enable_http_compression](../operations/settings/settings.md#settings-enable_http_compression) setting. You can configure the compression level of the data with the [http_zlib_compression_level](#settings-http_zlib_compression_level) setting.

You can use this to reduce network traffic when transmitting a large amount of data, or for creating dumps that are immediately compressed.

Examples of sending the data with compression:

```bash
# Receiving compressed data from the server:
curl -vsS "http://localhost:8123/?enable_http_compression=1" -d 'SELECT number FROM system.numbers LIMIT 10' -H 'Accept-Encoding: gzip'
# Sending compressed data to the server:
echo "SELECT 1" | gzip -c | curl -sS --data-binary @- -H 'Content-Encoding: gzip' 'http://localhost:8123/'
```

!!! note "Note"
    Some HTTP clients decompress data (`gzip` and `deflate`) from the server by default, so you may get decompressed data even if you use the compression settings correctly.

You can use the 'database' URL parameter to specify the default database.

```bash
$ echo 'SELECT number FROM numbers LIMIT 10' | curl 'http://localhost:8123/?database=system' --data-binary @-
0
1
2
3
4
5
6
7
8
9
```

By default, the database that is registered in the server settings is used as the default database. Unless configured otherwise, this is the database called 'default'. Alternatively, you can always specify the database explicitly by writing its name and a dot before the table name.

The username and password can be indicated in one of two ways:

1. Using HTTP Basic Authentication. Example:

```bash
echo 'SELECT 1' | curl 'http://user:password@localhost:8123/' -d @-
```

2. In the 'user' and 'password' URL parameters. Example:

```bash
echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @-
```

If the user name is not indicated, the username 'default' is used. If the password is not indicated, an empty password is used.

You can also use the URL parameters to specify any settings for processing a single query, or entire profiles of settings. Example: `http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1`

For more information, see the section "Settings".

```bash
$ echo 'SELECT number FROM system.numbers LIMIT 10' | curl 'http://localhost:8123/?' --data-binary @-
0
1
2
3
4
5
6
7
8
9
```
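
Passing settings as URL parameters, as described above, could be wrapped in a small helper. This is only a sketch: the `ch_query_with_settings` name and the `CH_URL` variable are assumptions, and the `web` profile in the usage line is taken from the example URL and must exist on your server.

```bash
# Base URL of the HTTP interface; CH_URL is a hypothetical override variable.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Runs a query with extra URL parameters applied to this request only.
#   $1: query text, sent in the POST body
#   $2: settings as URL parameters, for example "max_rows_to_read=1000000000"
ch_query_with_settings() {
  printf '%s' "$1" | curl -sS --data-binary @- "${CH_URL}?$2"
}

# Usage:
#   ch_query_with_settings 'SELECT 1' 'profile=web&max_rows_to_read=1000000000'
```
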

For information about other parameters, see the section "SET".

Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, add the `session_id` GET parameter to the request. You can use any string as the session ID. By default, the session is terminated after 60 seconds of inactivity. To change this timeout, modify the `default_session_timeout` setting in the server configuration, or add the `session_timeout` GET parameter to the request. To check the session status, use the `session_check=1` parameter. Only one query at a time can be executed within a single session.
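
As a rough sketch of working with a session, the helper below adds `session_id` (and optionally `session_timeout`) to each request. The `ch_session_query` name and the `CH_URL` variable are assumptions, and the session ID is assumed to contain only URL-safe characters. Because only one query at a time can run within a session, the calls are issued sequentially.

```bash
# Base URL of the HTTP interface; CH_URL is a hypothetical override variable.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Runs a query inside a session.
#   $1: session ID (any string; assumed URL-safe here)
#   $2: query text, sent in the POST body
#   $3: optional session timeout in seconds
ch_session_query() {
  url="${CH_URL}?session_id=$1"
  # Append session_timeout only when a third argument was given.
  [ -z "${3:-}" ] || url="${url}&session_timeout=$3"
  printf '%s' "$2" | curl -sS --data-binary @- "$url"
}

# Usage: two sequential requests that share the same session.
#   ch_session_query my_session 'SET max_threads = 1' 120
#   ch_session_query my_session 'SELECT 1'
```
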

You can receive information about the progress of query execution in `X-ClickHouse-Progress` response headers. To do this, enable the `send_progress_in_http_headers` setting.
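
To look at these headers from the command line, something like the following sketch could be used. It enables the setting through a URL parameter and prints only the matching header lines; the `ch_query_progress` name and the `CH_URL` variable are assumptions, and how often the server emits the header is not shown here.

```bash
# Base URL of the HTTP interface; CH_URL is a hypothetical override variable.
CH_URL="${CH_URL:-http://localhost:8123/}"

# Runs a query and prints only the X-ClickHouse-Progress response headers.
ch_query_progress() {
  # -D - writes the response headers to stdout; -o /dev/null discards the result body.
  printf '%s' "$1" |
    curl -sS -D - -o /dev/null --data-binary @- \
      "${CH_URL}?send_progress_in_http_headers=1" |
    grep -i '^X-ClickHouse-Progress'
}

# Usage:
#   ch_query_progress 'SELECT number FROM system.numbers LIMIT 10'
```
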

Running requests don't stop automatically if the HTTP connection is lost. Parsing and data formatting are performed on the server side, so using the network might be inefficient.

The optional 'query_id' parameter can be passed as the query ID (any string). For more information, see the section "Settings, replace_running_query".

The optional 'quota_key' parameter can be passed as the quota key (any string). For more information, see the section "Quotas".

The HTTP interface allows passing external data (external temporary tables) for querying. For more information, see the section "External data for query processing".

## Response Buffering

You can enable response buffering on the server side. The `buffer_size` and `wait_end_of_query` URL parameters are provided for this purpose.

`buffer_size` determines the number of bytes in the result to buffer in the server memory. If the result body is larger than this threshold, the buffer is written to the HTTP channel, and the remaining data is sent directly to the HTTP channel.

To ensure that the entire response is buffered, set `wait_end_of_query=1`. In this case, the data that is not stored in memory will be buffered in a temporary server file.

Example:

```bash
curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wait_end_of_query=1' -d 'SELECT toUInt8(number) FROM system.numbers LIMIT 9000000 FORMAT RowBinary'
```

Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client side, the error can only be detected at the parsing stage.

[Original article](https://clickhouse.yandex/docs/en/interfaces/http_interface/) <!--hide-->