---
slug: /ru/operations/utilities/clickhouse-copier
sidebar_position: 59
sidebar_label: clickhouse-copier
---

# clickhouse-copier {#clickhouse-copier}

Copies data from the tables of one cluster to the tables of another (or the same) cluster.

You can run several `clickhouse-copier` instances on different servers to perform the same task. ZooKeeper is used to synchronize the processes.

After starting, `clickhouse-copier`:

-   Connects to ZooKeeper and receives:

    -   The copying tasks.
    -   The state of the copying tasks.

-   Performs the tasks.

    Each running process chooses the "closest" shard of the source cluster and copies the data into the destination cluster, resharding the data if necessary.

`clickhouse-copier` tracks changes in ZooKeeper and applies them on the fly.

To reduce network traffic, we recommend running `clickhouse-copier` on the same server where the source data is located.

## Running clickhouse-copier {#zapusk-clickhouse-copier}

The utility should be run manually as follows:

``` bash
$ clickhouse-copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir
```

Launch parameters:

-   `daemon` - starts `clickhouse-copier` in daemon mode.
-   `config` - the path to the `zookeeper.xml` file with the ZooKeeper connection parameters.
-   `task-path` - the path to the ZooKeeper node. The node is used to synchronize `clickhouse-copier` processes and to store tasks. Tasks are stored in `$task-path/description`.
-   `task-file` - an optional path to a file with the task configuration to upload to ZooKeeper.
-   `task-upload-force` - upload `task-file` to ZooKeeper even if a task description has already been uploaded.
-   `base-dir` - the path to logs and auxiliary files. On startup, `clickhouse-copier` creates `clickhouse-copier_YYYYMMHHSS_<PID>` subdirectories in `$base-dir`. If this parameter is omitted, the directories are created in the directory where `clickhouse-copier` was launched.
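
As an illustration of how the parameters above combine, the following sketch only prints a command line that uploads a task description from a file; it does not start the utility. The file name `task.xml` and the helper `build_copier_cmd` are hypothetical, and passing the value `1` to `--task-upload-force` is an assumption that should be checked against `clickhouse-copier --help` for your version.

```bash
#!/usr/bin/env bash
# Build (but do not execute) a clickhouse-copier command line.
# All paths are placeholders; the "1" after --task-upload-force is an assumption.
build_copier_cmd() {
    local config="$1" task_path="$2" task_file="$3" base_dir="$4"
    printf 'clickhouse-copier --config %s --task-path %s --task-file %s --task-upload-force 1 --base-dir %s\n' \
        "$config" "$task_path" "$task_file" "$base_dir"
}

build_copier_cmd zookeeper.xml /task/path task.xml /path/to/dir
```

Printing the command first makes it easy to review the arguments before running it on each server.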

## Format of zookeeper.xml {#format-zookeeper-xml}

``` xml
<clickhouse>
    <logger>
        <level>trace</level>
        <size>100M</size>
        <count>3</count>
    </logger>

    <zookeeper>
        <node index="1">
            <host>127.0.0.1</host>
            <port>2181</port>
        </node>
    </zookeeper>
</clickhouse>
```
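
A ZooKeeper ensemble usually has more than one node. The sketch below writes a `zookeeper.xml` with three `<node>` entries, following the same layout as the example above. The host names `zk1.example.com` to `zk3.example.com` and the `CONFIG_FILE` variable are hypothetical placeholders, not values from this documentation.

```bash
#!/usr/bin/env bash
# Write a zookeeper.xml that lists three ZooKeeper nodes.
# The host names are hypothetical; replace them with your own ensemble.
config_file="${CONFIG_FILE:-zookeeper.xml}"

cat > "$config_file" <<'EOF'
<clickhouse>
    <logger>
        <level>trace</level>
        <size>100M</size>
        <count>3</count>
    </logger>

    <zookeeper>
        <node index="1">
            <host>zk1.example.com</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>zk2.example.com</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>zk3.example.com</host>
            <port>2181</port>
        </node>
    </zookeeper>
</clickhouse>
EOF

echo "wrote $config_file"
```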

## Configuration of Copying Tasks {#konfiguratsiia-zadanii-na-kopirovanie}

``` xml
<clickhouse>
    <!-- Configuration of clusters as in an ordinary server config -->
    <remote_servers>
        <source_cluster>
            <!--
                The source and destination clusters accept exactly the same
                parameters as the usual Distributed table;
                see https://clickhouse.com/docs/ru/engines/table-engines/special/distributed/
            -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>127.0.0.1</host>
                    <port>9000</port>
                    <!--
                    <user>default</user>
                    <password>default</password>
                    <secure>1</secure>
                    -->
                </replica>
            </shard>
            ...
        </source_cluster>

        <destination_cluster>
        ...
        </destination_cluster>
    </remote_servers>

    <!-- Maximum number of simultaneously active workers. If you run more workers, the superfluous ones will sleep. -->
    <max_workers>2</max_workers>

    <!-- Settings used to fetch (pull) data from the source cluster tables -->
    <settings_pull>
        <readonly>1</readonly>
    </settings_pull>

    <!-- Settings used to insert (push) data into the destination cluster tables -->
    <settings_push>
        <readonly>0</readonly>
    </settings_push>

    <!-- Common settings for fetch (pull) and insert (push) operations. The copier process context also uses them.
         They are overridden by <settings_pull/> and <settings_push/> respectively. -->
    <settings>
        <connect_timeout>3</connect_timeout>
        <!-- Sync insert is set forcibly, leave it here just in case. -->
        <distributed_foreground_insert>1</distributed_foreground_insert>
    </settings>

    <!-- Description of the copying tasks.
         You can specify several table tasks in the same task description (in the same ZooKeeper node); they will be performed
         sequentially.
    -->
    <tables>
        <!-- A table task, copies one table. -->
        <table_hits>
            <!-- Source cluster name (from <remote_servers/> section) and tables in it that should be copied -->
            <cluster_pull>source_cluster</cluster_pull>
            <database_pull>test</database_pull>
            <table_pull>hits</table_pull>

            <!-- Destination cluster name and tables in which the data should be inserted -->
            <cluster_push>destination_cluster</cluster_push>
            <database_push>test</database_push>
            <table_push>hits2</table_push>

            <!-- Engine of destination tables.
                 If the destination tables have not been created, workers create them using the column definitions from the source tables
                 and the engine definition from here.

                 NOTE: If the first worker starts inserting data and detects that the destination partition is not empty, the partition
                 will be dropped and refilled. Take this into account if you already have some data in the destination tables. You can
                 directly specify the partitions that should be copied in <enabled_partitions/>; they should be in quoted format, like
                 the partition column of the system.parts table.
            -->
            <engine>
            ENGINE=ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/hits2', '{replica}')
            PARTITION BY toMonday(date)
            ORDER BY (CounterID, EventDate)
            </engine>

            <!-- Sharding key used to insert data to destination cluster -->
            <sharding_key>jumpConsistentHash(intHash64(UserID), 2)</sharding_key>

            <!-- Optional expression that filters data while pulling it from the source servers -->
            <where_condition>CounterID != 0</where_condition>

            <!-- This section specifies the partitions that should be copied; other partitions will be ignored.
                 Partition names should have the same format as
                 the partition column of the system.parts table (i.e. quoted text).
                 Since the partition keys of the source and destination clusters could be different,
                 these partition names specify destination partitions.

                 NOTE: Although this section is optional (if it is not specified, all partitions will be copied),
                 it is strongly recommended to specify the partitions explicitly.
                 If you already have some ready partitions on the destination cluster, they
                 will be removed at the start of the copying, since they will be interpreted
                 as unfinished data from the previous copying!
            -->
            <enabled_partitions>
                <partition>'2018-02-26'</partition>
                <partition>'2018-03-05'</partition>
                ...
            </enabled_partitions>
        </table_hits>

        <!-- Next table to copy. It is not copied until the previous table has finished copying. -->
        <table_visits>
        ...
        </table_visits>
        ...
    </tables>
</clickhouse>
```
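
Because partitions that already exist on the destination are dropped and refilled, it can help to list them before filling in `<enabled_partitions>`. The sketch below only prints a query against `system.parts`; the helper `partition_query` is hypothetical, and `test` and `hits2` are the destination database and table from the example task. How the returned values map to the quoted form expected in `<partition>` should be checked on your own data.

```bash
#!/usr/bin/env bash
# Print a query that lists the active partitions of a table from system.parts.
partition_query() {
    local database="$1" table="$2"
    printf "SELECT DISTINCT partition FROM system.parts WHERE database = '%s' AND table = '%s' AND active ORDER BY partition\n" \
        "$database" "$table"
}

# "test" and "hits2" come from the example task configuration.
partition_query test hits2
```

The printed query could then be run on a destination server, for example with `clickhouse-client --query "$(partition_query test hits2)"`.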

`clickhouse-copier` tracks changes in `/task/path/description` and applies them on the fly. For example, if you change the value of `max_workers`, the number of processes running tasks also changes.