diff --git a/docs/en/query_language/dicts/external_dicts_dict_sources.md b/docs/en/query_language/dicts/external_dicts_dict_sources.md index 023f14f5a70..01952e18717 100644 --- a/docs/en/query_language/dicts/external_dicts_dict_sources.md +++ b/docs/en/query_language/dicts/external_dicts_dict_sources.md @@ -1,433 +1,433 @@ - - -# Sources of External Dictionaries - -An external dictionary can be connected from many different sources. - -The configuration looks like this: - -```xml - - - ... - - - - - - ... - - ... - -``` - -The source is configured in the `source` section. - -Types of sources (`source_type`): - -- [Local file](#dicts-external_dicts_dict_sources-local_file) -- [Executable file](#dicts-external_dicts_dict_sources-executable) -- [HTTP(s)](#dicts-external_dicts_dict_sources-http) -- [ODBC](#dicts-external_dicts_dict_sources-odbc) -- DBMS - - [MySQL](#dicts-external_dicts_dict_sources-mysql) - - [ClickHouse](#dicts-external_dicts_dict_sources-clickhouse) - - [MongoDB](#dicts-external_dicts_dict_sources-mongodb) - - - -## Local File - -Example of settings: - -```xml - - - /opt/dictionaries/os.tsv - TabSeparated - - -``` - -Setting fields: - -- `path` – The absolute path to the file. -- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. - - - -## Executable File - -Working with executable files depends on [how the dictionary is stored in memory](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout). If the dictionary is stored using `cache` and `complex_key_cache`, ClickHouse requests the necessary keys by sending a request to the executable file's `STDIN`. - -Example of settings: - -```xml - - - cat /opt/dictionaries/os.tsv - TabSeparated - - -``` - -Setting fields: - -- `command` – The absolute path to the executable file, or the file name (if the program directory is written to `PATH`). -- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. - - - -## HTTP(s) - -Working with an HTTP(s) server depends on [how the dictionary is stored in memory](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout). If the dictionary is stored using `cache` and `complex_key_cache`, ClickHouse requests the necessary keys by sending a request via the `POST` method. - -Example of settings: - -```xml - - - http://[::1]/os.tsv - TabSeparated - - -``` - -In order for ClickHouse to access an HTTPS resource, you must [configure openSSL](../../operations/server_settings/settings.md#server_settings-openSSL) in the server configuration. - -Setting fields: - -- `url` – The source URL. -- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. - - - -## ODBC - -You can use this method to connect any database that has an ODBC driver. - -Example of settings: - -```xml - - DatabaseName - ShemaName.TableName
- DSN=some_parameters - SQL_QUERY -
-``` - -Setting fields: - -- `db` – Name of the database. Omit it if the database name is set in the `` parameters. -- `table` – Name of the table and schema if exists. -- `connection_string` – Connection string. -- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime). - -ClickHouse receives quoting symbols from ODBC-driver and quote all settings in queries to driver, so it's necessary to set table name accordingly to table name case in database. - -### Known vulnerability of the ODBC dictionary functionality - -!!! attention - When connecting to the database through the ODBC driver connection parameter `Servername` can be substituted. In this case values of `USERNAME` and `PASSWORD` from `odbc.ini` are sent to the remote server and can be compromised. - -**Example of insecure use** - -Let's configure unixODBC for PostgreSQL. Content of `/etc/odbc.ini`: - -``` -[gregtest] -Driver = /usr/lib/psqlodbca.so -Servername = localhost -PORT = 5432 -DATABASE = test_db -#OPTION = 3 -USERNAME = test -PASSWORD = test -``` - -If you then make a query such as - -``` -SELECT * FROM odbc('DSN=gregtest;Servername=some-server.com', 'test_db'); -``` - -ODBC driver will send values of `USERNAME` and `PASSWORD` from `odbc.ini` to `some-server.com`. - -### Example of Connecting PostgreSQL - -Ubuntu OS. - -Installing unixODBC and the ODBC driver for PostgreSQL: - - sudo apt-get install -y unixodbc odbcinst odbc-postgresql - -Configuring `/etc/odbc.ini` (or `~/.odbc.ini`): - -``` - [DEFAULT] - Driver = myconnection - - [myconnection] - Description = PostgreSQL connection to my_db - Driver = PostgreSQL Unicode - Database = my_db - Servername = 127.0.0.1 - UserName = username - Password = password - Port = 5432 - Protocol = 9.3 - ReadOnly = No - RowVersioning = No - ShowSystemTables = No - ConnSettings = -``` - -The dictionary configuration in ClickHouse: - -```xml - - - table_name - - + + +# Sources of External Dictionaries + +An external dictionary can be connected from many different sources. + +The configuration looks like this: + +```xml + + + ... + + + + + + ... + + ... + +``` + +The source is configured in the `source` section. + +Types of sources (`source_type`): + +- [Local file](#dicts-external_dicts_dict_sources-local_file) +- [Executable file](#dicts-external_dicts_dict_sources-executable) +- [HTTP(s)](#dicts-external_dicts_dict_sources-http) +- DBMS + - [MySQL](#dicts-external_dicts_dict_sources-mysql) + - [ClickHouse](#dicts-external_dicts_dict_sources-clickhouse) + - [MongoDB](#dicts-external_dicts_dict_sources-mongodb) + - [ODBC](#dicts-external_dicts_dict_sources-odbc) + + + +## Local File + +Example of settings: + +```xml + + + /opt/dictionaries/os.tsv + TabSeparated + + +``` + +Setting fields: + +- `path` – The absolute path to the file. +- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. + + + +## Executable File + +Working with executable files depends on [how the dictionary is stored in memory](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout). If the dictionary is stored using `cache` and `complex_key_cache`, ClickHouse requests the necessary keys by sending a request to the executable file's `STDIN`. + +Example of settings: + +```xml + + + cat /opt/dictionaries/os.tsv + TabSeparated + + +``` + +Setting fields: + +- `command` – The absolute path to the executable file, or the file name (if the program directory is written to `PATH`). +- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. + + + +## HTTP(s) + +Working with an HTTP(s) server depends on [how the dictionary is stored in memory](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout). If the dictionary is stored using `cache` and `complex_key_cache`, ClickHouse requests the necessary keys by sending a request via the `POST` method. + +Example of settings: + +```xml + + + http://[::1]/os.tsv + TabSeparated + + +``` + +In order for ClickHouse to access an HTTPS resource, you must [configure openSSL](../../operations/server_settings/settings.md#server_settings-openSSL) in the server configuration. + +Setting fields: + +- `url` – The source URL. +- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. + + + +## ODBC + +You can use this method to connect any database that has an ODBC driver. + +Example of settings: + +```xml + + DatabaseName + ShemaName.TableName
+ DSN=some_parameters + SQL_QUERY +
+``` + +Setting fields: + +- `db` – Name of the database. Omit it if the database name is set in the `` parameters. +- `table` – Name of the table and schema if exists. +- `connection_string` – Connection string. +- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime). + +ClickHouse receives quoting symbols from ODBC-driver and quote all settings in queries to driver, so it's necessary to set table name accordingly to table name case in database. + +### Known vulnerability of the ODBC dictionary functionality + +!!! attention + When connecting to the database through the ODBC driver connection parameter `Servername` can be substituted. In this case values of `USERNAME` and `PASSWORD` from `odbc.ini` are sent to the remote server and can be compromised. + +**Example of insecure use** + +Let's configure unixODBC for PostgreSQL. Content of `/etc/odbc.ini`: + +``` +[gregtest] +Driver = /usr/lib/psqlodbca.so +Servername = localhost +PORT = 5432 +DATABASE = test_db +#OPTION = 3 +USERNAME = test +PASSWORD = test +``` + +If you then make a query such as + +``` +SELECT * FROM odbc('DSN=gregtest;Servername=some-server.com', 'test_db'); +``` + +ODBC driver will send values of `USERNAME` and `PASSWORD` from `odbc.ini` to `some-server.com`. + +### Example of Connecting PostgreSQL + +Ubuntu OS. + +Installing unixODBC and the ODBC driver for PostgreSQL: + + sudo apt-get install -y unixodbc odbcinst odbc-postgresql + +Configuring `/etc/odbc.ini` (or `~/.odbc.ini`): + +``` + [DEFAULT] + Driver = myconnection + + [myconnection] + Description = PostgreSQL connection to my_db + Driver = PostgreSQL Unicode + Database = my_db + Servername = 127.0.0.1 + UserName = username + Password = password + Port = 5432 + Protocol = 9.3 + ReadOnly = No + RowVersioning = No + ShowSystemTables = No + ConnSettings = +``` + +The dictionary configuration in ClickHouse: + +```xml + + + table_name + + - - DSN=myconnection - postgresql_table
-
- - - 300 - 360 - - - - - - - id - - - some_column - UInt64 - 0 - - -
-
-``` - -You may need to edit `odbc.ini` to specify the full path to the library with the driver `DRIVER=/usr/local/lib/psqlodbcw.so`. - -### Example of Connecting MS SQL Server - -Ubuntu OS. - -Installing the driver: : - -``` - sudo apt-get install tdsodbc freetds-bin sqsh -``` - -Configuring the driver: : - -``` - $ cat /etc/freetds/freetds.conf - ... - - [MSSQL] - host = 192.168.56.101 - port = 1433 - tds version = 7.0 - client charset = UTF-8 - - $ cat /etc/odbcinst.ini - ... - - [FreeTDS] - Description = FreeTDS - Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so - Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so - FileUsage = 1 - UsageCount = 5 - - $ cat ~/.odbc.ini - ... - - [MSSQL] - Description = FreeTDS - Driver = FreeTDS - Servername = MSSQL - Database = test - UID = test - PWD = test - Port = 1433 -``` - -Configuring the dictionary in ClickHouse: - -```xml - - - test - - - dict
- DSN=MSSQL;UID=test;PWD=test -
- - - - 300 - 360 - - - - - - - - - k - - - s - String - - - -
-
-``` - -## DBMS - - - -### MySQL - -Example of settings: - -```xml - - - 3306 - clickhouse - qwerty - - example01-1 - 1 - - - example01-2 - 1 - - db_name - table_name
- id=10 - SQL_QUERY -
- -``` - -Setting fields: - -- `port` – The port on the MySQL server. You can specify it for all replicas, or for each one individually (inside ``). - -- `user` – Name of the MySQL user. You can specify it for all replicas, or for each one individually (inside ``). - -- `password` – Password of the MySQL user. You can specify it for all replicas, or for each one individually (inside ``). - -- `replica` – Section of replica configurations. There can be multiple sections. - - `replica/host` – The MySQL host. - - \* `replica/priority` – The replica priority. When attempting to connect, ClickHouse traverses the replicas in order of priority. The lower the number, the higher the priority. - -- `db` – Name of the database. - -- `table` – Name of the table. - -- `where ` – The selection criteria. Optional parameter. - -- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime). - -MySQL can be connected on a local host via sockets. To do this, set `host` and `socket`. - -Example of settings: - -```xml - - - localhost - /path/to/socket/file.sock - clickhouse - qwerty - db_name - table_name
- id=10 - SQL_QUERY -
- -``` - - - -### ClickHouse - -Example of settings: - -```xml - - - example01-01-1 - 9000 - default - - default - ids
- id=10 -
- -``` - -Setting fields: - -- `host` – The ClickHouse host. If it is a local host, the query is processed without any network activity. To improve fault tolerance, you can create a [Distributed](../../operations/table_engines/distributed.md#table_engines-distributed) table and enter it in subsequent configurations. -- `port` – The port on the ClickHouse server. -- `user` – Name of the ClickHouse user. -- `password` – Password of the ClickHouse user. -- `db` – Name of the database. -- `table` – Name of the table. -- `where ` – The selection criteria. May be omitted. -- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime). - - - -### MongoDB - -Example of settings: - -```xml - - - localhost - 27017 - - - test - dictionary_source - - -``` - -Setting fields: - -- `host` – The MongoDB host. -- `port` – The port on the MongoDB server. -- `user` – Name of the MongoDB user. -- `password` – Password of the MongoDB user. -- `db` – Name of the database. -- `collection` – Name of the collection. - -[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_sources/) + + DSN=myconnection + postgresql_table
+
+ + + 300 + 360 + + + + + + + id + + + some_column + UInt64 + 0 + + +
+
+``` + +You may need to edit `odbc.ini` to specify the full path to the library with the driver `DRIVER=/usr/local/lib/psqlodbcw.so`. + +### Example of Connecting MS SQL Server + +Ubuntu OS. + +Installing the driver: : + +``` + sudo apt-get install tdsodbc freetds-bin sqsh +``` + +Configuring the driver: : + +``` + $ cat /etc/freetds/freetds.conf + ... + + [MSSQL] + host = 192.168.56.101 + port = 1433 + tds version = 7.0 + client charset = UTF-8 + + $ cat /etc/odbcinst.ini + ... + + [FreeTDS] + Description = FreeTDS + Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so + Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so + FileUsage = 1 + UsageCount = 5 + + $ cat ~/.odbc.ini + ... + + [MSSQL] + Description = FreeTDS + Driver = FreeTDS + Servername = MSSQL + Database = test + UID = test + PWD = test + Port = 1433 +``` + +Configuring the dictionary in ClickHouse: + +```xml + + + test + + + dict
+ DSN=MSSQL;UID=test;PWD=test +
+ + + + 300 + 360 + + + + + + + + + k + + + s + String + + + +
+
+``` + +## DBMS + + + +### MySQL + +Example of settings: + +```xml + + + 3306 + clickhouse + qwerty + + example01-1 + 1 + + + example01-2 + 1 + + db_name + table_name
+ id=10 + SQL_QUERY +
+ +``` + +Setting fields: + +- `port` – The port on the MySQL server. You can specify it for all replicas, or for each one individually (inside ``). + +- `user` – Name of the MySQL user. You can specify it for all replicas, or for each one individually (inside ``). + +- `password` – Password of the MySQL user. You can specify it for all replicas, or for each one individually (inside ``). + +- `replica` – Section of replica configurations. There can be multiple sections. + - `replica/host` – The MySQL host. + + \* `replica/priority` – The replica priority. When attempting to connect, ClickHouse traverses the replicas in order of priority. The lower the number, the higher the priority. + +- `db` – Name of the database. + +- `table` – Name of the table. + +- `where ` – The selection criteria. Optional parameter. + +- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime). + +MySQL can be connected on a local host via sockets. To do this, set `host` and `socket`. + +Example of settings: + +```xml + + + localhost + /path/to/socket/file.sock + clickhouse + qwerty + db_name + table_name
+ id=10 + SQL_QUERY +
+ +``` + + + +### ClickHouse + +Example of settings: + +```xml + + + example01-01-1 + 9000 + default + + default + ids
+ id=10 +
+ +``` + +Setting fields: + +- `host` – The ClickHouse host. If it is a local host, the query is processed without any network activity. To improve fault tolerance, you can create a [Distributed](../../operations/table_engines/distributed.md#table_engines-distributed) table and enter it in subsequent configurations. +- `port` – The port on the ClickHouse server. +- `user` – Name of the ClickHouse user. +- `password` – Password of the ClickHouse user. +- `db` – Name of the database. +- `table` – Name of the table. +- `where ` – The selection criteria. May be omitted. +- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime). + + + +### MongoDB + +Example of settings: + +```xml + + + localhost + 27017 + + + test + dictionary_source + + +``` + +Setting fields: + +- `host` – The MongoDB host. +- `port` – The port on the MongoDB server. +- `user` – Name of the MongoDB user. +- `password` – Password of the MongoDB user. +- `db` – Name of the database. +- `collection` – Name of the collection. + +[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_sources/)