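# clickhouse-copier {#clickhouse-copier}

Copies data from the tables in one cluster to tables in another (or the same) cluster.

You can run multiple `clickhouse-copier` instances on different servers to perform the same job. ZooKeeper is used to synchronize the processes.

After starting, `clickhouse-copier`:

- Connects to ZooKeeper and receives:
    - Copying jobs.
    - The state of the copying jobs.
- Performs the jobs.

    Each running process chooses the "closest" shard of the source cluster and copies the data into the destination cluster, resharding the data if necessary.

`clickhouse-copier` tracks the changes in ZooKeeper and applies them on the fly.

To reduce network traffic, we recommend running `clickhouse-copier` on the same server where the source data is located.

## Running clickhouse-copier {#running-clickhouse-copier}

The utility should be run manually:

``` bash
clickhouse-copier copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir
```

Parameters:

- `daemon` — Starts `clickhouse-copier` in daemon mode.
- `config` — The path to the `zookeeper.xml` file with the parameters for the connection to ZooKeeper.
- `task-path` — The path to the ZooKeeper node. This node is used for syncing `clickhouse-copier` processes and storing tasks. Tasks are stored in `$task-path/description`.
- `base-dir` — The path to logs and auxiliary files. When it starts, `clickhouse-copier` creates `clickhouse-copier_YYYYMMHHSS_<PID>` subdirectories in `$base-dir`. If this parameter is omitted, the directories are created in the directory where `clickhouse-copier` was launched.

## Format of zookeeper.xml {#format-of-zookeeper-xml}

``` xml
<yandex>
    <logger>
        <level>trace</level>
        <size>100M</size>
        <count>3</count>
    </logger>

    <zookeeper>
        <node index="1">
            <host>127.0.0.1</host>
            <port>2181</port>
        </node>
    </zookeeper>
</yandex>
```

## Configuration of Copying Tasks {#configuration-of-copying-tasks}

``` xml
<yandex>
    <!-- Cluster definitions, as in an ordinary server config -->
    <remote_servers>
        <source_cluster>
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>127.0.0.1</host>
                    <port>9000</port>
                </replica>
            </shard>
            ...
        </source_cluster>

        <destination_cluster>
        ...
        </destination_cluster>
    </remote_servers>

    <!-- Maximum number of simultaneously active workers -->
    <max_workers>2</max_workers>

    <!-- Settings used to fetch (pull) data from the source cluster tables -->
    <settings_pull>
        <readonly>1</readonly>
    </settings_pull>

    <!-- Settings used to insert (push) data into the destination cluster tables -->
    <settings_push>
        <readonly>0</readonly>
    </settings_push>

    <!-- Settings common to pull and push operations -->
    <settings>
        <connect_timeout>3</connect_timeout>
        <insert_distributed_sync>1</insert_distributed_sync>
    </settings>

    <!-- Descriptions of the copying tasks -->
    <tables>
        <!-- A table task copies one table -->
        <table_hits>
            <!-- Source cluster name (from remote_servers) and the table to copy -->
            <cluster_pull>source_cluster</cluster_pull>
            <database_pull>test</database_pull>
            <table_pull>hits</table_pull>

            <!-- Destination cluster name and the table to insert into -->
            <cluster_push>destination_cluster</cluster_push>
            <database_push>test</database_push>
            <table_push>hits2</table_push>

            <engine>
            ENGINE=ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/hits2', '{replica}')
            PARTITION BY toMonday(date)
            ORDER BY (CounterID, EventDate)
            </engine>

            <!-- Sharding key used when inserting into the destination cluster -->
            <sharding_key>jumpConsistentHash(intHash64(UserID), 2)</sharding_key>

            <!-- Optional filter applied while pulling data from the source -->
            <where_condition>CounterID != 0</where_condition>

            <enabled_partitions>
                <partition>'2018-02-26'</partition>
                <partition>'2018-03-05'</partition>
                ...
            </enabled_partitions>
        </table_hits>

        ...
    </tables>
</yandex>
```

`clickhouse-copier` tracks the changes in `/task/path/description` and applies them on the fly. For instance, if you change the value of `max_workers`, the number of processes running tasks will also change.

[Original article](https://clickhouse.tech/docs/en/operations/utils/clickhouse-copier/)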
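
Because tasks are stored in `$task-path/description`, the task XML has to be written to that ZooKeeper node before the copier can pick it up. The page does not prescribe a tool for this, so the following is only a minimal sketch in Python. It assumes the third-party `kazoo` ZooKeeper client is installed; the helper names `description_path` and `upload_task` are hypothetical and not part of ClickHouse.

```python
import xml.etree.ElementTree as ET


def description_path(task_path: str) -> str:
    """Return the ZooKeeper node that holds the task description."""
    return task_path.rstrip("/") + "/description"


def upload_task(hosts: str, task_path: str, task_xml: bytes) -> str:
    """Create or overwrite $task-path/description with the given XML.

    Hypothetical helper for illustration; it is not shipped with ClickHouse.
    """
    # Fail early on malformed XML (raises xml.etree.ElementTree.ParseError),
    # since running copiers apply description changes on the fly.
    ET.fromstring(task_xml)

    # Assumption: kazoo (pip install kazoo) is used as the ZooKeeper client.
    # It is imported here so description_path() works without it.
    from kazoo.client import KazooClient

    path = description_path(task_path)
    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        if zk.exists(path):
            # Node already exists: overwrite the stored description.
            zk.set(path, task_xml)
        else:
            # First upload: create the node and any missing parents.
            zk.create(path, task_xml, makepath=True)
    finally:
        zk.stop()
        zk.close()
    return path
```

With the example values used on this page, a call might look like `upload_task("127.0.0.1:2181", "/task/path", open("task.xml", "rb").read())`, where `task.xml` is an assumed local file name. Calling it again with an edited file, for example with a different `max_workers`, overwrites the same node.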