# Sources of External Dictionaries {#dicts-external_dicts_dict_sources}
An external dictionary can be connected from many different sources.
If the dictionary is configured using an XML file, the configuration looks like this:
```xml
<yandex>
    <dictionary>
        ...
        <source>
            <source_type>
                <!-- Source configuration -->
            </source_type>
        </source>
        ...
    </dictionary>
    ...
</yandex>
```
In the case of a [DDL query](../create.md#create-dictionary-query), the equivalent configuration looks like this:
```sql
CREATE DICTIONARY dict_name (...)
...
SOURCE(SOURCE_TYPE(param1 val1 ... paramN valN)) -- Source configuration
...
```
The source is configured in the `source` section.
Types of sources (`source_type`):
- [Local file](#dicts-external_dicts_dict_sources-local_file)
- [Executable file](#dicts-external_dicts_dict_sources-executable)
- [HTTP(s)](#dicts-external_dicts_dict_sources-http)
- DBMS
    - [ODBC](#dicts-external_dicts_dict_sources-odbc)
    - [MySQL](#dicts-external_dicts_dict_sources-mysql)
    - [ClickHouse](#dicts-external_dicts_dict_sources-clickhouse)
    - [MongoDB](#dicts-external_dicts_dict_sources-mongodb)
    - [Redis](#dicts-external_dicts_dict_sources-redis)
## Local File {#dicts-external_dicts_dict_sources-local_file}
Example of settings:
```xml
<source>
    <file>
        <path>/opt/dictionaries/os.tsv</path>
        <format>TabSeparated</format>
    </file>
</source>
```
or
```sql
SOURCE(FILE(path '/opt/dictionaries/os.tsv' format 'TabSeparated'))
```
Setting fields:
- `path` – The absolute path to the file.
- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported.
## Executable File {#dicts-external_dicts_dict_sources-executable}
Working with executable files depends on [how the dictionary is stored in memory](external_dicts_dict_layout.md). If the dictionary is stored using `cache` or `complex_key_cache`, ClickHouse requests the necessary keys by sending a request to the executable file's STDIN. Otherwise, ClickHouse starts the executable file and treats its output as dictionary data.
Example of settings:
```xml
<source>
    <executable>
        <command>cat /opt/dictionaries/os.tsv</command>
        <format>TabSeparated</format>
    </executable>
</source>
```
or
```sql
SOURCE(EXECUTABLE(command 'cat /opt/dictionaries/os.tsv' format 'TabSeparated'))
```
Setting fields:
- `command` – The absolute path to the executable file, or the file name (if the program directory is written to `PATH`).
- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported.
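
The `cache` behavior described in this section can be sketched in Python. This is only an illustration, assuming ClickHouse writes the requested keys to STDIN one per line and expects the matching rows back in the configured `TabSeparated` format. The names `OS_TABLE` and `serve_keys` are hypothetical and not part of ClickHouse.

```python
import io

# Hypothetical in-memory table standing in for the real data behind the dictionary.
OS_TABLE = {1: "Linux", 2: "Windows", 3: "macOS"}


def serve_keys(stdin, stdout, table=OS_TABLE):
    """Read one key per line and write the matching rows as TabSeparated."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        key = int(line)
        if key in table:
            # Unknown keys are simply not returned.
            stdout.write(f"{key}\t{table[key]}\n")


# Simulate a request for three keys, one of which is unknown.
output = io.StringIO()
serve_keys(io.StringIO("1\n3\n42\n"), output)
print(output.getvalue(), end="")
```

In a real script, the function would be called with `sys.stdin` and `sys.stdout`, and the script path would go into the `command` setting.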
## HTTP(s) {#dicts-external_dicts_dict_sources-http}
Working with an HTTP(s) server depends on [how the dictionary is stored in memory](external_dicts_dict_layout.md). If the dictionary is stored using `cache` or `complex_key_cache`, ClickHouse requests the necessary keys by sending a request via the `POST` method.
Example of settings:
```xml
<source>
    <http>
        <url>http://[::1]/os.tsv</url>
        <format>TabSeparated</format>
        <credentials>
            <user>user</user>
            <password>password</password>
        </credentials>
        <headers>
            <header>
                <name>API-KEY</name>
                <value>key</value>
            </header>
        </headers>
    </http>
</source>
```
or
```sql
SOURCE(HTTP(
    url 'http://[::1]/os.tsv'
    format 'TabSeparated'
    credentials(user 'user' password 'password')
    headers(header(name 'API-KEY' value 'key'))
))
```
For ClickHouse to access an HTTPS resource, you must [configure OpenSSL](../../operations/server_settings/settings.md#server_settings-openssl) in the server configuration.
Setting fields:
- `url` – The source URL.
- `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported.
- `credentials` – Basic HTTP authentication. Optional parameter.
    - `user` – Username required for the authentication.
    - `password` – Password required for the authentication.
- `headers` – All custom HTTP header entries used for the HTTP request. Optional parameter.
    - `header` – Single HTTP header entry.
        - `name` – Identifier name used for the header sent with the request.
        - `value` – Value set for a specific identifier name.
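
To see what the `credentials` and `headers` settings amount to on the wire, here is a rough Python sketch of the request headers a client would send. It assumes the standard Basic scheme (a Base64-encoded `user:password` pair) and is not ClickHouse's own code. The function name `request_headers` is hypothetical.

```python
import base64


def request_headers(user, password, custom_headers):
    """Build the HTTP headers implied by Basic credentials plus custom entries."""
    # Basic authentication: "Basic " + base64("user:password").
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}"}
    # Each <header> entry contributes one name/value pair.
    headers.update(custom_headers)
    return headers


headers = request_headers("user", "password", {"API-KEY": "key"})
for name, value in headers.items():
    print(f"{name}: {value}")
```

Because Base64 is an encoding and not encryption, these credentials are only protected in transit when the source URL uses HTTPS.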
## ODBC {#dicts-external_dicts_dict_sources-odbc}
You can use this method to connect to any database that has an ODBC driver.
Example of settings:
```xml
<source>
    <odbc>
        <db>DatabaseName</db>
        <table>SchemaName.TableName</table>
        <connection_string>DSN=some_parameters</connection_string>
        <invalidate_query>SQL_QUERY</invalidate_query>
    </odbc>
</source>
```
or
```sql
SOURCE(ODBC(
    db 'DatabaseName'
    table 'SchemaName.TableName'
    connection_string 'DSN=some_parameters'
    invalidate_query 'SQL_QUERY'
))
```
Setting fields:
- `db` – Name of the database. Omit it if the database name is set in the `<connection_string>` parameters.
- `table` – Name of the table, including the schema if one exists.
- `connection_string` – Connection string.
- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md).
ClickHouse receives quoting symbols from the ODBC driver and quotes all settings in queries to the driver, so the table name must be set with the same letter case as the table name in the database.
If you have problems with encodings when using Oracle, see the corresponding [FAQ](../../faq/general.md#oracle-odbc-encodings) article.
### Known vulnerability of the ODBC dictionary functionality
!!! attention
    When connecting to the database through the ODBC driver, the connection parameter `Servername` can be substituted. In this case, the values of `USERNAME` and `PASSWORD` from `odbc.ini` are sent to the remote server and can be compromised.
**Example of insecure use**
Let's configure unixODBC for PostgreSQL. Content of `/etc/odbc.ini`:
```text
[gregtest]
Driver = /usr/lib/psqlodbca.so
Servername = localhost
PORT = 5432
DATABASE = test_db
#OPTION = 3
USERNAME = test
PASSWORD = test
```
If you then make a query such as
```sql
SELECT * FROM odbc('DSN=gregtest;Servername=some-server.com', 'test_db');
```
The ODBC driver will send the values of `USERNAME` and `PASSWORD` from `odbc.ini` to `some-server.com`.
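
The substitution can be modeled with a short Python sketch. It is a simplified model of the merge, not the driver's actual logic: it assumes attributes given in the connection string override same-named keys from the DSN section, and it compares key names exactly. The function name `effective_settings` is hypothetical.

```python
def effective_settings(dsn_section, connection_string):
    """Merge an odbc.ini section with overrides taken from a connection string."""
    overrides = {}
    for part in connection_string.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            overrides[key.strip()] = value.strip()
    overrides.pop("DSN", None)  # DSN only selects the section, it is not a setting
    merged = dict(dsn_section)
    merged.update(overrides)  # connection-string values win over odbc.ini values
    return merged


# The [gregtest] section from the odbc.ini example above.
gregtest = {
    "Servername": "localhost",
    "PORT": "5432",
    "DATABASE": "test_db",
    "USERNAME": "test",
    "PASSWORD": "test",
}

settings = effective_settings(gregtest, "DSN=gregtest;Servername=some-server.com")
print(settings["Servername"], settings["USERNAME"], settings["PASSWORD"])
```

The server changes to the substituted host while the stored credentials stay the same, which is why they end up being sent to a host the administrator never configured.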
### Example of Connecting PostgreSQL
Ubuntu OS.
Installing unixODBC and the ODBC driver for PostgreSQL:
```bash
$ sudo apt-get install -y unixodbc odbcinst odbc-postgresql
```
Configuring `/etc/odbc.ini` (or `~/.odbc.ini`):
```text
[DEFAULT]
Driver = myconnection
[myconnection]
Description = PostgreSQL connection to my_db
Driver = PostgreSQL Unicode
Database = my_db
Servername = 127.0.0.1
UserName = username
Password = password
Port = 5432
Protocol = 9.3
ReadOnly = No
RowVersioning = No
ShowSystemTables = No
ConnSettings =
```
The dictionary configuration in ClickHouse:
```xml
<yandex>
    <dictionary>
        <name>table_name</name>
        <source>
            <odbc>
                <!-- You can specify the following parameters in connection_string: -->
                <!-- DSN=myconnection;UID=username;PWD=password;HOST=127.0.0.1;PORT=5432;DATABASE=my_db -->
                <connection_string>DSN=myconnection</connection_string>
                <table>postgresql_table</table>
            </odbc>
        </source>
        <lifetime>
            <min>300</min>
            <max>360</max>
        </lifetime>
        <layout>
            <hashed/>
        </layout>
        <structure>
            <id>
                <name>id</name>
            </id>
            <attribute>
                <name>some_column</name>
                <type>UInt64</type>
                <null_value>0</null_value>
            </attribute>
        </structure>
    </dictionary>
</yandex>
```
or
```sql
CREATE DICTIONARY table_name (
    id UInt64,
    some_column UInt64 DEFAULT 0
)
PRIMARY KEY id
SOURCE(ODBC(connection_string 'DSN=myconnection' table 'postgresql_table'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 360)
```
You may need to edit `odbc.ini` to specify the full path to the driver library: `DRIVER=/usr/local/lib/psqlodbcw.so`.
### Example of Connecting MS SQL Server
Ubuntu OS.
Installing the driver:
```bash
$ sudo apt-get install tdsodbc freetds-bin sqsh
```
Configuring the driver:
```bash
$ cat /etc/freetds/freetds.conf
...
[MSSQL]
host = 192.168.56.101
port = 1433
tds version = 7.0
client charset = UTF-8
$ cat /etc/odbcinst.ini
...
[FreeTDS]
Description = FreeTDS
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
FileUsage = 1
UsageCount = 5
$ cat ~/.odbc.ini
...
[MSSQL]
Description = FreeTDS
Driver = FreeTDS
Servername = MSSQL
Database = test
UID = test
PWD = test
Port = 1433
```
Configuring the dictionary in ClickHouse:
```xml
<yandex>
    <dictionary>
        <name>test</name>
        <source>
            <odbc>
                <table>dict</table>
                <connection_string>DSN=MSSQL;UID=test;PWD=test</connection_string>
            </odbc>
        </source>
        <lifetime>
            <min>300</min>
            <max>360</max>
        </lifetime>
        <layout>
            <flat/>
        </layout>
        <structure>
            <id>
                <name>k</name>
            </id>
            <attribute>
                <name>s</name>
                <type>String</type>
                <null_value></null_value>
            </attribute>
        </structure>
    </dictionary>
</yandex>
```
or
```sql
CREATE DICTIONARY test (
    k UInt64,
    s String DEFAULT ''
)
PRIMARY KEY k
SOURCE(ODBC(table 'dict' connection_string 'DSN=MSSQL;UID=test;PWD=test'))
LAYOUT(FLAT())
LIFETIME(MIN 300 MAX 360)
```
## DBMS
### MySQL {#dicts-external_dicts_dict_sources-mysql}
Example of settings:
```xml
<source>
    <mysql>
        <port>3306</port>
        <user>clickhouse</user>
        <password>qwerty</password>
        <replica>
            <host>example01-1</host>
            <priority>1</priority>
        </replica>
        <replica>
            <host>example01-2</host>
            <priority>1</priority>
        </replica>
        <db>db_name</db>
        <table>table_name</table>
        <where>id=10</where>
        <invalidate_query>SQL_QUERY</invalidate_query>
    </mysql>
</source>
```
or
```sql
SOURCE(MYSQL(
    port 3306
    user 'clickhouse'
    password 'qwerty'
    replica(host 'example01-1' priority 1)
    replica(host 'example01-2' priority 1)
    db 'db_name'
    table 'table_name'
    where 'id=10'
    invalidate_query 'SQL_QUERY'
))
```
Setting fields:
- `port` – The port on the MySQL server. You can specify it for all replicas, or for each one individually (inside `<replica>`).
- `user` – Name of the MySQL user. You can specify it for all replicas, or for each one individually (inside `<replica>`).
- `password` – Password of the MySQL user. You can specify it for all replicas, or for each one individually (inside `<replica>`).
- `replica` – Section of replica configurations. There can be multiple sections.
    - `replica/host` – The MySQL host.
    - `replica/priority` – The replica priority. When attempting to connect, ClickHouse traverses the replicas in order of priority. The lower the number, the higher the priority.
- `db` – Name of the database.
- `table` – Name of the table.
- `where` – The selection criteria. Optional parameter.
- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md).
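
The priority rule for `replica/priority` can be illustrated with a small Python sketch. It only models the ordering stated above (lower number first) and is not ClickHouse's connection code. How replicas with equal priority are ordered is an assumption here (configuration order), and the names `pick_replica` and `can_connect` are hypothetical.

```python
def pick_replica(replicas, can_connect):
    """Return the host of the first reachable replica, lowest priority number first."""
    # sorted() is stable, so replicas with equal priority keep their listed order.
    for replica in sorted(replicas, key=lambda r: r["priority"]):
        if can_connect(replica["host"]):
            return replica["host"]
    return None  # no replica could be reached


replicas = [
    {"host": "example01-1", "priority": 2},
    {"host": "example01-2", "priority": 1},
]

# With every host reachable, the replica with priority 1 is chosen.
print(pick_replica(replicas, lambda host: True))
# If that replica is down, the next priority is tried.
print(pick_replica(replicas, lambda host: host != "example01-2"))
```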
ClickHouse can connect to MySQL on a local host via sockets. To do this, set `host` and `socket`.
Example of settings:
```xml
<source>
    <mysql>
        <host>localhost</host>
        <socket>/path/to/socket/file.sock</socket>
        <user>clickhouse</user>
        <password>qwerty</password>
        <db>db_name</db>
        <table>table_name</table>
        <where>id=10</where>
        <invalidate_query>SQL_QUERY</invalidate_query>
    </mysql>
</source>
```
or
```sql
SOURCE(MYSQL(
    host 'localhost'
    socket '/path/to/socket/file.sock'
    user 'clickhouse'
    password 'qwerty'
    db 'db_name'
    table 'table_name'
    where 'id=10'
    invalidate_query 'SQL_QUERY'
))
```
### ClickHouse {#dicts-external_dicts_dict_sources-clickhouse}
Example of settings:
```xml
<source>
    <clickhouse>
        <host>example01-01-1</host>
        <port>9000</port>
        <user>default</user>
        <password></password>
        <db>default</db>
        <table>ids</table>
        <where>id=10</where>
    </clickhouse>
</source>
```
or
```sql
SOURCE(CLICKHOUSE(
    host 'example01-01-1'
    port 9000
    user 'default'
    password ''
    db 'default'
    table 'ids'
    where 'id=10'
))
```
Setting fields:
- `host` – The ClickHouse host. If it is a local host, the query is processed without any network activity. To improve fault tolerance, you can create a [Distributed](../../operations/table_engines/distributed.md) table and enter it in subsequent configurations.
- `port` – The port on the ClickHouse server.
- `user` – Name of the ClickHouse user.
- `password` – Password of the ClickHouse user.
- `db` – Name of the database.
- `table` – Name of the table.
- `where` – The selection criteria. Optional parameter.
- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](external_dicts_dict_lifetime.md).
### MongoDB {#dicts-external_dicts_dict_sources-mongodb}
Example of settings:
```xml
<source>
    <mongodb>
        <host>localhost</host>
        <port>27017</port>
        <user></user>
        <password></password>
        <db>test</db>
        <collection>dictionary_source</collection>
    </mongodb>
</source>
```
or
```sql
SOURCE(MONGODB(
    host 'localhost'
    port 27017
    user ''
    password ''
    db 'test'
    collection 'dictionary_source'
))
```
Setting fields:
- `host` – The MongoDB host.
- `port` – The port on the MongoDB server.
- `user` – Name of the MongoDB user.
- `password` – Password of the MongoDB user.
- `db` – Name of the database.
- `collection` – Name of the collection.
### Redis {#dicts-external_dicts_dict_sources-redis}
Example of settings:
```xml
<source>
    <redis>
        <host>localhost</host>
        <port>6379</port>
        <storage_type>simple</storage_type>
        <db_index>0</db_index>
    </redis>
</source>
```
or
```sql
SOURCE(REDIS(
    host 'localhost'
    port 6379
    storage_type 'simple'
    db_index 0
))
```
Setting fields:
- `host` – The Redis host.
- `port` – The port on the Redis server.
- `storage_type` – The structure of internal Redis storage used for working with keys. `simple` is for simple sources and for hashed single-key sources; `hash_map` is for hashed sources with two keys. Ranged sources and cache sources with a complex key are unsupported. May be omitted; the default value is `simple`.
- `db_index` – The numeric index of the Redis logical database. May be omitted; the default value is 0.
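
The difference between the two `storage_type` values can be pictured with a Python sketch that uses plain dictionaries in place of a Redis server. It assumes `simple` corresponds to plain key-to-value entries and `hash_map` to Redis hashes addressed by a key plus a field. No Redis client is involved, and the names `lookup_simple` and `lookup_hash_map` are hypothetical.

```python
def lookup_simple(store, key):
    """`simple`: one dictionary key maps directly to one stored value."""
    return store.get(key)


def lookup_hash_map(store, key, field):
    """`hash_map`: the first key selects a hash, the second key a field inside it."""
    return store.get(key, {}).get(field)


# Stand-ins for the contents of a Redis logical database.
simple_store = {"1": "Linux", "2": "Windows"}
hash_store = {"desktop": {"1": "Linux"}, "mobile": {"1": "Android"}}

print(lookup_simple(simple_store, "2"))
print(lookup_hash_map(hash_store, "mobile", "1"))
```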
[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_sources/) <!--hide-->