---
sidebar_position: 59
sidebar_label: clickhouse-copier
---

# clickhouse-copier

Copies data from the tables in one cluster to tables in another (or the same) cluster.

:::warning
To get a consistent copy, the data in the source tables and partitions should not change during the entire process.
:::

You can run multiple `clickhouse-copier` instances on different servers to perform the same job. ClickHouse Keeper, or ZooKeeper, is used for syncing the processes.

After starting, `clickhouse-copier`:

- Connects to ClickHouse Keeper and receives:

    - Copying jobs.
    - The state of the copying jobs.

- Performs the jobs.

    Each running process chooses the “closest” shard of the source cluster and copies the data into the destination cluster, resharding the data if necessary.

`clickhouse-copier` tracks the changes in ClickHouse Keeper and applies them on the fly.

To reduce network traffic, we recommend running `clickhouse-copier` on the same server where the source data is located.

## Running clickhouse-copier {#running-clickhouse-copier}

The utility should be run manually:

``` bash
$ clickhouse-copier --daemon --config keeper.xml --task-path /task/path --base-dir /path/to/dir
```

Parameters:

- `daemon`: Starts `clickhouse-copier` in daemon mode.
- `config`: The path to the `keeper.xml` file with the parameters for the connection to ClickHouse Keeper.
- `task-path`: The path to the ClickHouse Keeper node. This node is used for syncing `clickhouse-copier` processes and storing tasks. Tasks are stored in `$task-path/description`.
- `task-file`: Optional path to a file with the task configuration for the initial upload to ClickHouse Keeper.
- `task-upload-force`: Force the upload of `task-file` even if the node already exists.
- `base-dir`: The path to logs and auxiliary files. When it starts, `clickhouse-copier` creates `clickhouse-copier_YYYYMMHHSS_<PID>` subdirectories in `$base-dir`.
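If this parameter is omitted, the directories are created in the directory where `clickhouse-copier` was launched.

## Format of keeper.xml {#format-of-zookeeper-xml}

``` xml
<clickhouse>
    <logger>
        <level>trace</level>
        <size>100M</size>
        <count>3</count>
    </logger>

    <zookeeper>
        <node index="1">
            <host>127.0.0.1</host>
            <port>2181</port>
        </node>
    </zookeeper>
</clickhouse>
```

## Configuration of Copying Tasks {#configuration-of-copying-tasks}

``` xml
<clickhouse>
    <!-- Cluster definitions, as in an ordinary server config -->
    <remote_servers>
        <source_cluster>
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>127.0.0.1</host>
                    <port>9000</port>
                </replica>
            </shard>
            ...
        </source_cluster>

        <destination_cluster>
        ...
        </destination_cluster>
    </remote_servers>

    <!-- Maximum number of simultaneously active workers -->
    <max_workers>2</max_workers>

    <!-- Settings used to fetch (pull) data from the source cluster tables -->
    <settings_pull>
        <readonly>1</readonly>
    </settings_pull>

    <!-- Settings used to insert (push) data into the destination cluster tables -->
    <settings_push>
        <readonly>0</readonly>
    </settings_push>

    <!-- Common settings for pull and push operations -->
    <settings>
        <connect_timeout>3</connect_timeout>
        <insert_distributed_sync>1</insert_distributed_sync>
    </settings>

    <!-- Copying task descriptions -->
    <tables>
        <table_hits>
            <!-- Source table -->
            <cluster_pull>source_cluster</cluster_pull>
            <database_pull>test</database_pull>
            <table_pull>hits</table_pull>

            <!-- Destination table -->
            <cluster_push>destination_cluster</cluster_push>
            <database_push>test</database_push>
            <table_push>hits2</table_push>

            <engine>
            ENGINE=ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/hits2', '{replica}')
            PARTITION BY toMonday(date)
            ORDER BY (CounterID, EventDate)
            </engine>

            <sharding_key>jumpConsistentHash(intHash64(UserID), 2)</sharding_key>

            <where_condition>CounterID != 0</where_condition>

            <enabled_partitions>
                <partition>'2018-02-26'</partition>
                <partition>'2018-03-05'</partition>
                ...
            </enabled_partitions>
        </table_hits>

        <table_visits>
        ...
        </table_visits>
        ...
    </tables>
</clickhouse>
```

`clickhouse-copier` tracks the changes in `/task/path/description` and applies them on the fly. For instance, if you change the value of `max_workers`, the number of processes running tasks will also change.

[Original article](https://clickhouse.com/docs/en/operations/utils/clickhouse-copier/)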
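
The task configuration above uses the expression `jumpConsistentHash(intHash64(UserID), 2)` to spread rows over two destination shards. To illustrate the idea, the Python sketch below implements the published jump consistent hash algorithm (Lamping and Veach). It is an assumption, not something this page confirms, that the result matches ClickHouse's `jumpConsistentHash` bit for bit. `intHash64` is not reproduced: the helper name `jump_consistent_hash` and the sample keys are hypothetical, and the keys stand in for already-hashed 64-bit integers.

```python
from collections import Counter

MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate unsigned 64-bit wraparound


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket index in [0, num_buckets)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    key &= MASK64
    bucket, jump = -1, 0
    while jump < num_buckets:
        bucket = jump
        # Linear congruential step from the published algorithm.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Next jump target; the ratio is always >= 1, so jump > bucket.
        jump = int((bucket + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return bucket


if __name__ == "__main__":
    # Hypothetical pre-hashed keys standing in for intHash64(UserID).
    sample_keys = range(1, 10_001)
    # Count how many sample keys land on each of the two shards.
    print(Counter(jump_consistent_hash(k, 2) for k in sample_keys))
```

The property that matters for resharding is that growing from `n` to `n + 1` buckets moves a key only into the new bucket and never between existing ones, so changing the shard count relocates a limited share of the rows.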