ClickHouse/docs/en/operations/table_engines/url.md
Ivan Blinkov 94f86eda79
WIP on docs: improvements for search + some content changes (#2842)
2018-08-10 17:44:49 +03:00


URL(URL, Format)

This engine manages data stored on a remote HTTP/HTTPS server. It is similar to the File engine.

Usage in ClickHouse Server

URL(URL, Format)

Format must be one that ClickHouse supports for SELECT queries, INSERT queries, or both. For the full list of supported formats, see Formats.

URL must conform to the structure of a Uniform Resource Locator and must point to a server that speaks HTTP or HTTPS. The server must not require any additional HTTP headers to return a response.
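The URL requirement above can be checked on the client side before creating a table. The following is a minimal sketch using Python's standard `urllib.parse`; the helper name `looks_like_http_url` is hypothetical, and the check is only an approximation, not the validation that ClickHouse itself performs.

```python
from urllib.parse import urlsplit


def looks_like_http_url(url: str) -> bool:
    """Return True if url has an http/https scheme and a host part.

    Hypothetical helper for illustration; ClickHouse's own checks may differ.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a broken IPv6 literal.
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


print(looks_like_http_url("http://127.0.0.1:12345/"))     # scheme and host present
print(looks_like_http_url("ftp://example.com/data.csv"))  # wrong scheme
print(looks_like_http_url("127.0.0.1:12345"))             # no scheme at all
```

Only the first of the three sample strings passes, because it is the only one with both an `http`/`https` scheme and a host.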

INSERT and SELECT queries are transformed into POST and GET requests, respectively. To handle POST requests correctly, the remote server must support chunked transfer encoding.
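Python's `http.server` does not decode chunked request bodies on its own, so a test server that should accept INSERT data has to do it by hand. The sketch below shows one way to do that, assuming standard HTTP/1.1 chunk framing; the names `read_chunked_body` and `CSVPostHandler` are hypothetical, and the handler simply discards the received rows.

```python
import io
from http.server import BaseHTTPRequestHandler


def read_chunked_body(stream) -> bytes:
    """Decode an HTTP/1.1 chunked body from a binary file-like object."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal, optionally followed by ";extension".
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        chunk = stream.read(size)
        if len(chunk) != size:
            raise ValueError("stream ended in the middle of a chunk")
        body += chunk
        stream.readline()  # consume the CRLF that follows the chunk data
    # Skip optional trailer headers up to the final blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


class CSVPostHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that accepts POSTed rows and acknowledges them."""

    def do_POST(self):
        if "chunked" in self.headers.get("Transfer-Encoding", "").lower():
            data = read_chunked_body(self.rfile)
        else:
            data = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # A real server would store `data`; this sketch only acknowledges it.
        self.send_response(200)
        self.end_headers()


# Two 8-byte chunks followed by the zero-length terminating chunk.
raw = b"8\r\nHello,1\n\r\n8\r\nWorld,2\n\r\n0\r\n\r\n"
print(read_chunked_body(io.BytesIO(raw)))
```

The decoder is kept separate from the handler so it can be exercised against an in-memory stream without starting a server.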

Example:

1. Create the url_engine_table table:

CREATE TABLE url_engine_table (word String, value UInt64)
ENGINE=URL('http://127.0.0.1:12345/', CSV)

2. Create a simple HTTP server using the Python 3 standard library, save it as server.py, and start it:

from http.server import BaseHTTPRequestHandler, HTTPServer

class CSVHTTPServer(BaseHTTPRequestHandler):
    """Serves a fixed two-row CSV document in response to every GET request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/csv')
        self.end_headers()

        # One row per line: word (String), value (UInt64).
        self.wfile.write(b'Hello,1\nWorld,2\n')

if __name__ == "__main__":
    # Must match the host and port used in the ENGINE=URL(...) definition.
    server_address = ('127.0.0.1', 12345)
    HTTPServer(server_address, CSVHTTPServer).serve_forever()

python3 server.py

3. Query the data:

SELECT * FROM url_engine_table

┌─word──┬─value─┐
│ Hello │     1 │
│ World │     2 │
└───────┴───────┘
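To see how the CSV body returned by the server maps onto the table's two columns, the response can be parsed directly in Python. This is an illustrative sketch only: `parse_csv_rows` and `fetch_rows` are hypothetical helpers, and the parsing here is the standard `csv` module, not ClickHouse's CSV reader.

```python
import csv
import io
from urllib.request import urlopen


def parse_csv_rows(body: bytes):
    """Turn a CSV body into (word, value) tuples matching (String, UInt64)."""
    reader = csv.reader(io.StringIO(body.decode("utf-8"), newline=""))
    return [(word, int(value)) for word, value in reader]


def fetch_rows(url: str):
    """GET the URL and parse the response; requires the example server to be running."""
    with urlopen(url) as response:
        return parse_csv_rows(response.read())


# The same bytes that the example server writes in do_GET.
print(parse_csv_rows(b"Hello,1\nWorld,2\n"))
```

With the server from step 2 running, `fetch_rows('http://127.0.0.1:12345/')` should yield the same two rows as the SELECT above.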

Details of Implementation

  • Reads and writes can run in parallel
  • Not supported:
    • ALTER
    • SELECT ... SAMPLE
    • Indices
    • Replication