ClickHouse/docs/en/sql-reference/table-functions/remote.md

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

183 lines
7.5 KiB
Markdown
Raw Normal View History

2020-04-03 13:23:32 +00:00
---
2022-08-28 14:53:34 +00:00
slug: /en/sql-reference/table-functions/remote
sidebar_position: 175
sidebar_label: remote
2020-04-03 13:23:32 +00:00
---
2022-06-02 10:55:18 +00:00
# remote, remoteSecure
2023-11-15 13:32:31 +00:00
Table function `remote` allows to access remote servers on-the-fly, i.e. without creating a [Distributed](../../engines/table-engines/special/distributed.md) table. Table function `remoteSecure` is same as `remote` but over a secure connection.
2021-01-19 22:39:12 +00:00
Both functions can be used in `SELECT` and `INSERT` queries.
2022-07-13 19:07:20 +00:00
## Syntax
2020-03-20 10:10:48 +00:00
``` sql
remote(addresses_expr, [db, table, user [, password], sharding_key])
remote(addresses_expr, [db.table, user [, password], sharding_key])
remote(named_collection[, option=value [,..]])
remoteSecure(addresses_expr, [db, table, user [, password], sharding_key])
remoteSecure(addresses_expr, [db.table, user [, password], sharding_key])
remoteSecure(named_collection[, option=value [,..]])
```
2022-07-13 19:07:20 +00:00
## Parameters
2021-01-19 22:39:12 +00:00
2023-11-15 13:32:31 +00:00
- `addresses_expr` — A remote server address or an expression that generates multiple addresses of remote servers. Format: `host` or `host:port`.
2021-07-29 15:27:50 +00:00
2023-11-15 13:32:31 +00:00
The `host` can be specified as a server name, or as a IPv4 or IPv6 address. An IPv6 address must be specified in square brackets.
2021-07-29 15:27:50 +00:00
2023-11-15 13:32:31 +00:00
The `port` is the TCP port on the remote server. If the port is omitted, it uses [tcp_port](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-tcp_port) from the server config file for table function `remote` (by default, 9000) and [tcp_port_secure](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-tcp_port_secure) for table function `remoteSecure` (by default, 9440).
2021-01-19 22:39:12 +00:00
2023-11-15 13:32:31 +00:00
For IPv6 addresses, a port is required.
2021-01-19 23:02:46 +00:00
2023-11-15 13:32:31 +00:00
If only parameter `addresses_expr` is specified, `db` and `table` will use `system.one` by default.
2021-01-19 23:02:46 +00:00
Type: [String](../../sql-reference/data-types/string.md).
2021-01-19 22:39:12 +00:00
- `db` — Database name. Type: [String](../../sql-reference/data-types/string.md).
- `table` — Table name. Type: [String](../../sql-reference/data-types/string.md).
2023-11-15 13:32:31 +00:00
- `user` — User name. If not specified, `default` is used. Type: [String](../../sql-reference/data-types/string.md).
- `password` — User password. If not specified, an empty password is used. Type: [String](../../sql-reference/data-types/string.md).
- `sharding_key` — Sharding key to support distributing data across nodes. For example: `insert into remote('127.0.0.1:9000,127.0.0.2', db, table, 'default', rand())`. Type: [UInt32](../../sql-reference/data-types/int-uint.md).
2021-01-19 22:39:12 +00:00
Arguments also can be passed using [named collections](/docs/en/operations/named-collections.md).
2022-07-13 19:07:20 +00:00
## Returned value
2021-01-19 22:39:12 +00:00
2023-11-15 13:32:31 +00:00
A table located on a remote server.
2021-01-19 22:39:12 +00:00
2022-07-13 19:07:20 +00:00
## Usage
2023-11-15 13:32:31 +00:00
As table functions `remote` and `remoteSecure` re-establish the connection for each request, it is recommended to use a `Distributed` table instead. Also, if hostnames are set, the names are resolved, and errors are not counted when working with various replicas. When processing a large number of queries, always create the `Distributed` table ahead of time, and do not use the `remote` table function.
2021-01-19 22:39:12 +00:00
The `remote` table function can be useful in the following cases:
2023-11-15 13:32:31 +00:00
- One-time data migration from one system to another
- Accessing a specific server for data comparison, debugging, and testing, i.e. ad-hoc connections.
2022-07-13 19:07:20 +00:00
- Queries between various ClickHouse clusters for research purposes.
- Infrequent distributed requests that are made manually.
- Distributed requests where the set of servers is re-defined each time.
### Addresses
2020-03-20 10:10:48 +00:00
``` text
example01-01-1
2022-07-13 19:07:20 +00:00
example01-01-1:9440
example01-01-1:9000
localhost
127.0.0.1
2022-07-13 19:07:20 +00:00
[::]:9440
[::]:9000
[2a02:6b8:0:1111::11]:9000
```
2023-11-15 13:32:31 +00:00
Multiple addresses can be comma-separated. In this case, ClickHouse will use distributed processing and send the query to all specified addresses (like shards with different data). Example:
2020-03-20 10:10:48 +00:00
``` text
example01-01-1,example01-02-1
```
2022-07-13 19:07:20 +00:00
## Examples
2022-07-13 19:07:20 +00:00
### Selecting data from a remote server:
2021-01-19 22:39:12 +00:00
``` sql
SELECT * FROM remote('127.0.0.1', db.remote_engine_table) LIMIT 3;
```
Or using [named collections](/docs/en/operations/named-collections.md):
```sql
CREATE NAMED COLLECTION creds AS
host = '127.0.0.1',
database = 'db';
SELECT * FROM remote(creds, table='remote_engine_table') LIMIT 3;
```
### Inserting data into a table on a remote server:
2018-09-06 18:15:16 +00:00
2021-01-19 22:39:12 +00:00
``` sql
CREATE TABLE remote_table (name String, value UInt32) ENGINE=Memory;
INSERT INTO FUNCTION remote('127.0.0.1', currentDatabase(), 'remote_table') VALUES ('test', 42);
SELECT * FROM remote_table;
2021-01-19 22:39:12 +00:00
```
2022-07-13 19:07:20 +00:00
### Migration of tables from one system to another:
2023-11-15 13:32:31 +00:00
This example uses one table from a sample dataset. The database is `imdb`, and the table is `actors`.
2022-07-13 19:07:20 +00:00
#### On the source ClickHouse system (the system that currently hosts the data)
2023-11-15 13:32:31 +00:00
- Verify the source database and table name (`imdb.actors`)
2023-11-15 13:32:31 +00:00
2022-07-13 23:27:33 +00:00
```sql
show databases
```
2022-07-13 19:07:20 +00:00
2022-07-13 23:27:33 +00:00
```sql
show tables in imdb
```
2022-07-13 19:07:20 +00:00
- Get the CREATE TABLE statement from the source:
2023-11-15 13:32:31 +00:00
2022-07-13 23:27:33 +00:00
```
select create_table_query
from system.tables
where database = 'imdb' and table = 'actors'
```
Response
2023-11-15 13:32:31 +00:00
2022-07-13 23:27:33 +00:00
```sql
CREATE TABLE imdb.actors (`id` UInt32,
`first_name` String,
`last_name` String,
`gender` FixedString(1))
ENGINE = MergeTree
ORDER BY (id, first_name, last_name, gender);
2022-07-13 23:27:33 +00:00
```
2022-07-13 19:07:20 +00:00
#### On the destination ClickHouse system:
- Create the destination database:
2023-11-15 13:32:31 +00:00
2022-07-13 23:27:33 +00:00
```sql
CREATE DATABASE imdb
```
2022-07-13 19:07:20 +00:00
- Using the CREATE TABLE statement from the source, create the destination:
2023-11-15 13:32:31 +00:00
2022-07-13 23:27:33 +00:00
```sql
CREATE TABLE imdb.actors (`id` UInt32,
`first_name` String,
`last_name` String,
`gender` FixedString(1))
ENGINE = MergeTree
ORDER BY (id, first_name, last_name, gender);
2022-07-13 23:27:33 +00:00
```
2022-07-13 19:07:20 +00:00
#### Back on the source deployment:
Insert into the new database and table created on the remote system. You will need the host, port, username, password, destination database, and destination table.
2023-11-15 13:32:31 +00:00
2022-07-13 19:07:20 +00:00
```sql
INSERT INTO FUNCTION
remoteSecure('remote.clickhouse.cloud:9440', 'imdb.actors', 'USER', 'PASSWORD')
2022-07-13 19:07:20 +00:00
SELECT * from imdb.actors
```
2023-11-15 13:32:31 +00:00
## Globbing {#globs-in-addresses}
2021-07-29 15:20:55 +00:00
Patterns in curly brackets `{ }` are used to generate a set of shards and to specify replicas. If there are multiple pairs of curly brackets, then the direct product of the corresponding sets is generated.
2023-11-15 13:32:31 +00:00
The following pattern types are supported.
2023-11-15 13:32:31 +00:00
- `{a,b,c}` - Represents any of alternative strings `a`, `b` or `c`. The pattern is replaced with `a` in the first shard address and replaced with `b` in the second shard address and so on. For instance, `example0{1,2}-1` generates addresses `example01-1` and `example02-1`.
- `{N..M}` - A range of numbers. This pattern generates shard addresses with incrementing indices from `N` to (and including) `M`. For instance, `example0{1..2}-1` generates `example01-1` and `example02-1`.
- `{0n..0m}` - A range of numbers with leading zeroes. This pattern preserves leading zeroes in indices. For instance, `example{01..03}-1` generates `example01-1`, `example02-1` and `example03-1`.
- `{a|b}` - Any number of variants separated by a `|`. The pattern specifies replicas. For instance, `example01-{1|2}` generates replicas `example01-1` and `example01-2`.
2023-12-20 18:26:36 +00:00
The query will be sent to the first healthy replica. However, for `remote` the replicas are iterated in the order currently set in the [load_balancing](../../operations/settings/settings.md#load_balancing) setting.
The number of generated addresses is limited by [table_function_remote_max_addresses](../../operations/settings/settings.md#table_function_remote_max_addresses) setting.