---
slug: /en/operations/named-collections
sidebar_position: 69
sidebar_label: "Named collections"
---
# Storing details for connecting to external sources in configuration files
Details for connecting to external sources (dictionaries, tables, table functions) can be saved
in configuration files. This simplifies the creation of objects and hides credentials
from users who have only SQL access.

Parameters set in XML, for example `<format>CSV</format>`, act as defaults and can be overridden in SQL, for example `format = 'TSV'`.
Overrides in SQL are written as `key = value` pairs, such as `compression_method = 'gzip'`.

Named collections are stored in the `config.xml` file of the ClickHouse server (or in a file under `config.d/`, which is merged into it), in the `<named_collections>` section, and are applied when ClickHouse starts.

Example of configuration:
```xml
<!-- /etc/clickhouse-server/config.d/named_collections.xml -->
<clickhouse>
<named_collections>
...
</named_collections>
</clickhouse>
```
## Named collections for accessing S3

For a description of the parameters, see [s3 Table Function](../sql-reference/table-functions/s3.md).

Example of configuration:
```xml
<clickhouse>
<named_collections>
<s3_mydata>
<access_key_id>AKIAIOSFODNN7EXAMPLE</access_key_id>
<secret_access_key>wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY</secret_access_key>
<format>CSV</format>
<url>https://s3.us-east-1.amazonaws.com/yourbucket/mydata/</url>
</s3_mydata>
</named_collections>
</clickhouse>
```
### Example of using named collections with the s3 function
```sql
INSERT INTO FUNCTION s3(s3_mydata, filename = 'test_file.tsv.gz',
format = 'TSV', structure = 'number UInt64', compression_method = 'gzip')
SELECT * FROM numbers(10000);
SELECT count()
FROM s3(s3_mydata, filename = 'test_file.tsv.gz');
┌─count()─┐
│ 10000 │
└─────────┘
1 rows in set. Elapsed: 0.279 sec. Processed 10.00 thousand rows, 90.00 KB (35.78 thousand rows/s., 322.02 KB/s.)
```
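To make the override rule concrete, the following sketch reuses the `s3_mydata` collection defined above. The first query relies entirely on the stored parameters, so the file is read with the `CSV` format from the XML. The second overrides `format` and `compression_method` for a single call. The file name `test_file.csv` is hypothetical and is assumed to exist under the configured `url`.

```sql
-- Only the file name is supplied; credentials, url and format (CSV)
-- all come from the s3_mydata collection in the configuration.
-- test_file.csv is a hypothetical object used for illustration.
SELECT count()
FROM s3(s3_mydata, filename = 'test_file.csv');

-- Same collection, but format and compression are overridden
-- with key = value pairs for this call only.
SELECT count()
FROM s3(s3_mydata, filename = 'test_file.tsv.gz',
        format = 'TSV', compression_method = 'gzip');
```

Overrides apply only to the call that carries them. The collection stored in the configuration is not modified.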
### Example of using named collections with an S3 table
```sql
CREATE TABLE s3_engine_table (number Int64)
ENGINE=S3(s3_mydata, url='https://s3.us-east-1.amazonaws.com/yourbucket/mydata/test_file.tsv.gz', format = 'TSV')
SETTINGS input_format_with_names_use_header = 0;
SELECT * FROM s3_engine_table LIMIT 3;
┌─number─┐
│ 0 │
│ 1 │
│ 2 │
└────────┘
```
## Named collections for accessing MySQL database

For a description of the parameters, see [mysql](../sql-reference/table-functions/mysql.md).

Example of configuration:
```xml
<clickhouse>
<named_collections>
<mymysql>
<user>myuser</user>
<password>mypass</password>
<host>127.0.0.1</host>
<port>3306</port>
<database>test</database>
<connection_pool_size>8</connection_pool_size>
<on_duplicate_clause>1</on_duplicate_clause>
<replace_query>1</replace_query>
</mymysql>
</named_collections>
</clickhouse>
```
### Example of using named collections with the mysql function
```sql
SELECT count() FROM mysql(mymysql, table = 'test');
┌─count()─┐
│ 3 │
└─────────┘
```
### Example of using named collections with a MySQL table
```sql
CREATE TABLE mytable(A Int64) ENGINE = MySQL(mymysql, table = 'test', connection_pool_size=3, replace_query=0);
SELECT count() FROM mytable;
┌─count()─┐
│ 3 │
└─────────┘
```
### Example of using named collections with a database with engine MySQL
```sql
CREATE DATABASE mydatabase ENGINE = MySQL(mymysql);
SHOW TABLES FROM mydatabase;
┌─name───┐
│ source │
│ test │
└────────┘
```
### Example of using named collections with a dictionary with source MySQL
```sql
CREATE DICTIONARY dict (A Int64, B String)
PRIMARY KEY A
SOURCE(MYSQL(NAME mymysql TABLE 'source'))
LIFETIME(MIN 1 MAX 2)
LAYOUT(HASHED());
SELECT dictGet('dict', 'B', 2);
┌─dictGet('dict', 'B', 2)─┐
│ two │
└─────────────────────────┘
```
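When a dictionary returns unexpected values, it can help to read its source through the same collection. The sketch below assumes the `source` table has the columns `A` and `B` used in the dictionary definition above. It reads the row that `dictGet('dict', 'B', 2)` is expected to resolve to.

```sql
-- Read the dictionary's source table directly through the named collection.
-- Columns A and B are assumed from the dictionary definition above.
SELECT A, B
FROM mysql(mymysql, table = 'source')
WHERE A = 2;
```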
## Named collections for accessing PostgreSQL database

For a description of the parameters, see [postgresql](../sql-reference/table-functions/postgresql.md).

Example of configuration:
```xml
<clickhouse>
<named_collections>
<mypg>
<user>pguser</user>
<password>jw8s0F4</password>
<host>127.0.0.1</host>
<port>5432</port>
<database>test</database>
<schema>test_schema</schema>
<connection_pool_size>8</connection_pool_size>
</mypg>
</named_collections>
</clickhouse>
```
### Example of using named collections with the postgresql function
```sql
SELECT * FROM postgresql(mypg, table = 'test');
┌─a─┬─b───┐
│ 2 │ two │
│ 1 │ one │
└───┴─────┘
SELECT * FROM postgresql(mypg, table = 'test', schema = 'public');
┌─a─┐
│ 1 │
│ 2 │
│ 3 │
└───┘
```
### Example of using named collections with a PostgreSQL table
```sql
CREATE TABLE mypgtable (a Int64) ENGINE = PostgreSQL(mypg, table = 'test', schema = 'public');
SELECT * FROM mypgtable;
┌─a─┐
│ 1 │
│ 2 │
│ 3 │
└───┘
```
### Example of using named collections with a database with engine PostgreSQL
```sql
CREATE DATABASE mydatabase ENGINE = PostgreSQL(mypg);
SHOW TABLES FROM mydatabase;
┌─name─┐
│ test │
└──────┘
```
### Example of using named collections with a dictionary with source POSTGRESQL
```sql
CREATE DICTIONARY dict (a Int64, b String)
PRIMARY KEY a
SOURCE(POSTGRESQL(NAME mypg TABLE test))
LIFETIME(MIN 1 MAX 2)
LAYOUT(HASHED());
SELECT dictGet('dict', 'b', 2);
┌─dictGet('dict', 'b', 2)─┐
│ two │
└─────────────────────────┘
```
## Named collections for accessing remote ClickHouse database

For a description of the parameters, see [remote](../sql-reference/table-functions/remote.md/#parameters).

Example of configuration:
```xml
<clickhouse>
<named_collections>
<remote1>
<host>remote_host</host>
<port>9000</port>
<database>system</database>
<user>foo</user>
<password>secret</password>
<secure>1</secure>
</remote1>
</named_collections>
</clickhouse>
```
The `secure` parameter is not needed for table function connections, because `remoteSecure` already selects a secure connection, but it can be used for dictionaries.

### Example of using named collections with the `remote`/`remoteSecure` functions
```sql
SELECT * FROM remote(remote1, table = one);
┌─dummy─┐
│ 0 │
└───────┘
SELECT * FROM remote(remote1, database = merge(system, '^one'));
┌─dummy─┐
│ 0 │
└───────┘
INSERT INTO FUNCTION remote(remote1, database = default, table = test) VALUES (1,'a');
SELECT * FROM remote(remote1, database = default, table = test);
┌─a─┬─b─┐
│ 1 │ a │
└───┴───┘
```
### Example of using named collections with a dictionary with source ClickHouse
```sql
CREATE DICTIONARY dict(a Int64, b String)
PRIMARY KEY a
SOURCE(CLICKHOUSE(NAME remote1 TABLE test DB default))
LIFETIME(MIN 1 MAX 2)
LAYOUT(HASHED());
SELECT dictGet('dict', 'b', 1);
┌─dictGet('dict', 'b', 1)─┐
│ a │
└─────────────────────────┘
```