URL(URL, Format)
This engine works with data on a remote HTTP/HTTPS server. It is similar to
the File engine.
Usage in ClickHouse Server
URL(URL, Format)
Format must be one that ClickHouse supports for SELECT and/or INSERT queries.
For the full list of supported formats, see Formats.
URL must conform to the structure of a Uniform Resource Locator. The specified
URL must point to a server that uses HTTP or HTTPS. The server shouldn't
require any additional HTTP headers.
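As a rough illustration of the URL requirement above, the following sketch checks that a string parses as a URL with an http or https scheme and a host. The function name looks_like_http_url is hypothetical, and this is only an approximation of the rule as stated here, not the validation ClickHouse itself performs.

```python
from urllib.parse import urlsplit


def looks_like_http_url(url):
    """Return True if url has an http/https scheme and a host part.

    Hypothetical helper for illustration; ClickHouse's own checks may differ.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the scheme, so 'HTTP://...' is accepted as well.
    return parts.scheme in ("http", "https") and bool(parts.hostname)


print(looks_like_http_url("http://127.0.0.1:12345/"))
print(looks_like_http_url("ftp://example.com/data.csv"))
```

The first call accepts the address used in the example below; the second is rejected because the scheme is neither http nor https.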
INSERT and SELECT queries are transformed into POST and GET requests
respectively. To process POST requests correctly, the remote server should
support chunked transfer encoding.
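To show what supporting chunked transfer encoding involves on the receiving side, here is a minimal sketch that decodes an HTTP/1.1 chunked body from a binary stream. The function name decode_chunked and the sample payload are assumptions made for illustration; a production server would normally rely on its HTTP framework for this.

```python
import io


def decode_chunked(stream):
    """Read an HTTP/1.1 chunked body from a binary stream; return the payload."""
    payload = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size_field = size_line.split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break
        chunk = stream.read(size)
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        payload += chunk
        # Every chunk's data is followed by CRLF.
        if stream.read(2) != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
    # Skip optional trailer fields up to the closing blank line.
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break
    return bytes(payload)


# Two 8-byte chunks followed by the zero-length terminating chunk.
body = io.BytesIO(b"8\r\nHello,1\n\r\n8\r\nWorld,2\n\r\n0\r\n\r\n")
print(decode_chunked(body))
```

In a BaseHTTPRequestHandler subclass, a do_POST method could pass self.rfile to such a function when the request carries the header Transfer-Encoding: chunked.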
Example:
1. Create the url_engine_table table:
CREATE TABLE url_engine_table (word String, value UInt64)
ENGINE=URL('http://127.0.0.1:12345/', CSV)
2. Implement a simple HTTP server using Python 3:
from http.server import BaseHTTPRequestHandler, HTTPServer


class CSVHTTPServer(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the same two CSV rows.
        self.send_response(200)
        self.send_header('Content-type', 'text/csv')
        self.end_headers()
        self.wfile.write(bytes('Hello,1\nWorld,2\n', "utf-8"))


if __name__ == "__main__":
    # Listen on the address used in the CREATE TABLE statement above.
    server_address = ('127.0.0.1', 12345)
    HTTPServer(server_address, CSVHTTPServer).serve_forever()
Save the script as server.py and start it:
python3 server.py
3. Query the data:
SELECT * FROM url_engine_table
┌─word──┬─value─┐
│ Hello │     1 │
│ World │     2 │
└───────┴───────┘
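The round trip in the steps above can also be checked without ClickHouse. The sketch below runs the same handler on a background thread, issues a GET request, and parses the CSV reply into (word, value) rows, roughly what a SELECT against the table does according to the description above. The names fetch_rows and demo, and the use of an ephemeral port, are assumptions made for illustration.

```python
import csv
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class CSVHTTPServer(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/csv')
        self.end_headers()
        self.wfile.write(bytes('Hello,1\nWorld,2\n', "utf-8"))

    def log_message(self, format, *args):
        pass  # suppress per-request logging in this demo


def fetch_rows(url):
    """GET the URL and parse the CSV body into (word, value) tuples."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = build_opener(ProxyHandler({}))
    with opener.open(url) as response:
        text = response.read().decode("utf-8")
    return [(word, int(value)) for word, value in csv.reader(io.StringIO(text))]


def demo():
    # Port 0 asks the OS for a free port, so the demo cannot collide
    # with a server already listening on 12345.
    server = HTTPServer(("127.0.0.1", 0), CSVHTTPServer)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch_rows(f"http://127.0.0.1:{port}/")
    finally:
        server.shutdown()
        server.server_close()


print(demo())
```

The rows returned match the word and value columns shown in the query result above.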
Details of Implementation
- Reads and writes can be parallel
- Not supported:
    - ALTER and SELECT ... SAMPLE operations
    - Indices
    - Replication