mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-12-04 13:32:13 +00:00
99 lines
3.7 KiB
Markdown
---
slug: /en/engines/table-engines/special/url
sidebar_position: 80
sidebar_label: URL
---

# URL Table Engine

Reads data from and writes data to a remote HTTP/HTTPS server. This engine is similar to the [File](../../../engines/table-engines/special/file.md) engine.

Syntax: `URL(URL [,Format] [,CompressionMethod])`

- The `URL` parameter must conform to the structure of a Uniform Resource Locator. The specified URL must point to a server that uses HTTP or HTTPS. The server must return a response without requiring any additional headers.

- The `Format` must be one that ClickHouse can use in `SELECT` queries and, if necessary, in `INSERT` queries. For the full list of supported formats, see [Formats](../../../interfaces/formats.md#formats).

    If this argument is not specified, ClickHouse detects the format automatically from the suffix of the `URL` parameter. If the suffix does not match any supported format, table creation fails. For example, for the engine expression `URL('http://localhost/test.json')`, the `JSON` format is applied.

- `CompressionMethod` indicates whether the HTTP body should be compressed. If compression is enabled, the HTTP packets sent by the URL engine contain a `Content-Encoding` header that indicates which compression method is used.

    To enable compression, first make sure that the remote HTTP endpoint indicated by the `URL` parameter supports the corresponding compression algorithm.

    The supported `CompressionMethod` must be one of the following:
    - gzip or gz
    - deflate
    - brotli or br
    - lzma or xz
    - zstd or zst
    - lz4
    - bz2
    - snappy
    - none
    - auto

    If `CompressionMethod` is not specified, it defaults to `auto`. In this case ClickHouse detects the compression method automatically from the suffix of the `URL` parameter. If the suffix matches one of the compression methods listed above, the corresponding compression is applied; otherwise no compression is enabled.

    For example, for the engine expression `URL('http://localhost/test.gzip')`, the `gzip` compression method is applied, but for `URL('http://localhost/test.fr')`, no compression is enabled because the suffix `fr` does not match any of the compression methods above.

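
The suffix rule for `auto` described above can be illustrated with a short Python sketch. This is not ClickHouse's implementation: the function name `detect_compression`, the lookup table, the choice of one name per alias pair, and the case-insensitive match are all assumptions made for illustration. Only the list of suffixes comes from the text above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Suffixes from the list above. Treating each "a or b" pair as aliases and
# picking the first name as the label is an assumption of this sketch.
SUFFIX_TO_METHOD = {
    "gzip": "gzip", "gz": "gzip",
    "deflate": "deflate",
    "brotli": "brotli", "br": "brotli",
    "lzma": "lzma", "xz": "lzma",
    "zstd": "zstd", "zst": "zstd",
    "lz4": "lz4",
    "bz2": "bz2",
    "snappy": "snappy",
}


def detect_compression(url: str) -> str:
    """Return the method implied by the last suffix of the URL path, or 'none'."""
    path = urlparse(url).path
    # Lower-casing the suffix is an assumption, not documented behavior.
    suffix = PurePosixPath(path).suffix.lstrip(".").lower()
    return SUFFIX_TO_METHOD.get(suffix, "none")


print(detect_compression("http://localhost/test.gzip"))  # gzip
print(detect_compression("http://localhost/test.fr"))    # none
```

The two calls mirror the examples in the text: the `gzip` suffix selects a compression method, while `fr` matches nothing and falls back to no compression.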
## Usage {#using-the-engine-in-the-clickhouse-server}

`INSERT` and `SELECT` queries are transformed to `POST` and `GET` requests,
respectively. For processing `POST` requests, the remote server must support
[Chunked transfer encoding](https://en.wikipedia.org/wiki/Chunked_transfer_encoding).

You can limit the maximum number of HTTP GET redirect hops using the [max_http_get_redirects](../../../operations/settings/settings.md#setting-max_http_get_redirects) setting.

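
Because `POST` bodies arrive with chunked transfer encoding, a server built on Python's `http.server` has to decode the chunks itself. The following is a minimal sketch of such a decoder. The function name `decode_chunked` is hypothetical, and the sketch ignores chunk extensions and discards trailer fields rather than interpreting them.

```python
import io


def decode_chunked(stream) -> bytes:
    """Read a chunked HTTP body from a binary stream and return the payload."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        chunk = stream.read(size)
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        body += chunk
        stream.readline()  # consume the CRLF that follows the chunk data
    # Skip optional trailer fields up to the final blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


# Two 8-byte chunks followed by the zero-length terminating chunk.
raw = b"8\r\nHello,1\n\r\n8\r\nWorld,2\n\r\n0\r\n\r\n"
print(decode_chunked(io.BytesIO(raw)))  # b'Hello,1\nWorld,2\n'
```

Inside a `do_POST` method of a `BaseHTTPRequestHandler` subclass, `self.rfile` could be passed as the stream when the request carries a `Transfer-Encoding: chunked` header.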
## Example {#example}

**1.** Create a `url_engine_table` table on the server:

``` sql
CREATE TABLE url_engine_table (word String, value UInt64)
ENGINE=URL('http://127.0.0.1:12345/', CSV)
```

**2.** Create a basic HTTP server using the standard Python 3 tools, save it
as `server.py`, and start it:

``` python3
from http.server import BaseHTTPRequestHandler, HTTPServer

class CSVHTTPServer(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with a two-row CSV document.
        self.send_response(200)
        self.send_header('Content-type', 'text/csv')
        self.end_headers()

        self.wfile.write(bytes('Hello,1\nWorld,2\n', "utf-8"))

if __name__ == "__main__":
    # Same host and port as in the URL of the table definition.
    server_address = ('127.0.0.1', 12345)
    HTTPServer(server_address, CSVHTTPServer).serve_forever()
```

``` bash
$ python3 server.py
```

**3.** Request data:

``` sql
SELECT * FROM url_engine_table
```

``` text
┌─word──┬─value─┐
│ Hello │     1 │
│ World │     2 │
└───────┴───────┘
```

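
To check the HTTP side of this example without ClickHouse, the same kind of `GET` request can be issued from Python. The sketch below is an illustration under stated assumptions: it starts its own copy of the CSV handler on a port chosen by the operating system instead of port 12345, and the names `CSVHandler` and `fetch_rows` are hypothetical.

```python
import csv
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class CSVHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/csv")
        self.end_headers()
        self.wfile.write(b"Hello,1\nWorld,2\n")

    def log_message(self, format, *args):
        pass  # suppress the per-request log lines


def fetch_rows():
    # Port 0 lets the OS pick a free port (an assumption of this sketch).
    server = HTTPServer(("127.0.0.1", 0), CSVHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        opener = build_opener(ProxyHandler({}))  # bypass any configured proxy
        with opener.open(f"http://127.0.0.1:{port}/") as response:
            text = response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
    # Parse the CSV body into (word, value) pairs matching the table columns.
    return [(word, int(value)) for word, value in csv.reader(io.StringIO(text))]


rows = fetch_rows()
print(rows)  # [('Hello', 1), ('World', 2)]
```

The parsed rows correspond to the `word` and `value` columns returned by the `SELECT` query in step 3.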
## Details of Implementation {#details-of-implementation}

- Reads and writes can be parallel
- Not supported:
    - `ALTER` and `SELECT...SAMPLE` operations.
    - Indexes.
    - Replication.