From c3d8668eb66b1a3c2e39a37c1d607964149749bf Mon Sep 17 00:00:00 2001 From: Dan Roscigno Date: Fri, 20 May 2022 17:34:14 -0400 Subject: [PATCH 1/6] add ClickHouse Keeper to docs --- docs/en/operations/clickhouse-keeper.md | 26 +++++++++---------- docs/en/operations/configuration-files.md | 2 +- docs/en/operations/system-tables/zookeeper.md | 6 ++--- docs/en/operations/tips.md | 6 ++--- .../operations/utilities/clickhouse-copier.md | 16 ++++++------ 5 files changed, 28 insertions(+), 28 deletions(-) diff --git a/docs/en/operations/clickhouse-keeper.md b/docs/en/operations/clickhouse-keeper.md index 44f6a57f6d4..7dbe0601343 100644 --- a/docs/en/operations/clickhouse-keeper.md +++ b/docs/en/operations/clickhouse-keeper.md @@ -5,15 +5,15 @@ sidebar_label: ClickHouse Keeper # ClickHouse Keeper {#clickHouse-keeper} -ClickHouse server uses [ZooKeeper](https://zookeeper.apache.org/) coordination system for data [replication](../engines/table-engines/mergetree-family/replication.md) and [distributed DDL](../sql-reference/distributed-ddl.md) queries execution. ClickHouse Keeper is an alternative coordination system compatible with ZooKeeper. +ClickHouse Keeper provides the coordination system for data [replication](../engines/table-engines/mergetree-family/replication.md) and [distributed DDL](../sql-reference/distributed-ddl.md) queries execution. ClickHouse Keeper is compatible with ZooKeeper. ## Implementation details {#implementation-details} -ZooKeeper is one of the first well-known open-source coordination systems. It's implemented in Java, has quite a simple and powerful data model. ZooKeeper's coordination algorithm called ZAB (ZooKeeper Atomic Broadcast) doesn't provide linearizability guarantees for reads, because each ZooKeeper node serves reads locally. Unlike ZooKeeper ClickHouse Keeper is written in C++ and uses [RAFT algorithm](https://raft.github.io/) [implementation](https://github.com/eBay/NuRaft). 
This algorithm allows to have linearizability for reads and writes, has several open-source implementations in different languages. +ZooKeeper is one of the first well-known open-source coordination systems. It's implemented in Java, and has quite a simple and powerful data model. ZooKeeper's coordination algorithm, ZooKeeper Atomic Broadcast (ZAB), doesn't provide linearizability guarantees for reads, because each ZooKeeper node serves reads locally. Unlike ZooKeeper, ClickHouse Keeper is written in C++ and uses the [RAFT algorithm](https://raft.github.io/) [implementation](https://github.com/eBay/NuRaft). This algorithm allows linearizability for reads and writes, and has several open-source implementations in different languages. -By default, ClickHouse Keeper provides the same guarantees as ZooKeeper (linearizable writes, non-linearizable reads). It has a compatible client-server protocol, so any standard ZooKeeper client can be used to interact with ClickHouse Keeper. Snapshots and logs have an incompatible format with ZooKeeper, but `clickhouse-keeper-converter` tool allows to convert ZooKeeper data to ClickHouse Keeper snapshot. Interserver protocol in ClickHouse Keeper is also incompatible with ZooKeeper so mixed ZooKeeper / ClickHouse Keeper cluster is impossible. +By default, ClickHouse Keeper provides the same guarantees as ZooKeeper (linearizable writes, non-linearizable reads). It has a compatible client-server protocol, so any standard ZooKeeper client can be used to interact with ClickHouse Keeper. Snapshots and logs have an incompatible format with ZooKeeper, but the `clickhouse-keeper-converter` tool enables the conversion of ZooKeeper data to ClickHouse Keeper snapshots. The interserver protocol in ClickHouse Keeper is also incompatible with ZooKeeper, so a mixed ZooKeeper / ClickHouse Keeper cluster is impossible. 
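Because standard ZooKeeper clients work against ClickHouse Keeper, ZooKeeper-style digest ACL credentials apply as well. A minimal sketch of computing one, assuming the standard ZooKeeper digest format `username:Base64(SHA-1(username:password))` (the helper name and sample credentials are illustrative, not from the docs):

```python
import base64
import hashlib

def digest_credential(username: str, password: str) -> str:
    # ZooKeeper's "digest" scheme stores the ACL id as
    # "username:Base64(SHA-1(username:password))".
    raw = f"{username}:{password}".encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{digest}"

# Illustrative usage: the resulting string is what a digest ACL id looks like.
print(digest_credential("user1", "password"))
```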
-ClickHouse Keeper supports Access Control List (ACL) the same way as [ZooKeeper](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl) does. ClickHouse Keeper supports the same set of permissions and has the identical built-in schemes: `world`, `auth`, `digest`, `host` and `ip`. Digest authentication scheme uses pair `username:password`. Password is encoded in Base64. +ClickHouse Keeper supports Access Control Lists (ACLs) the same way as [ZooKeeper](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl) does. ClickHouse Keeper supports the same set of permissions and has the identical built-in schemes: `world`, `auth`, `digest`, `host` and `ip`. The digest authentication scheme uses the pair `username:password`; the password is encoded in Base64. :::note External integrations are not supported. @@ -21,25 +21,25 @@ External integrations are not supported. ## Configuration {#configuration} -ClickHouse Keeper can be used as a standalone replacement for ZooKeeper or as an internal part of the ClickHouse server, but in both cases configuration is almost the same `.xml` file. The main ClickHouse Keeper configuration tag is `<keeper_server>`. Keeper configuration has the following parameters: +ClickHouse Keeper can be used as a standalone replacement for ZooKeeper or as an internal part of the ClickHouse server. In both cases the configuration is almost the same `.xml` file. The main ClickHouse Keeper configuration tag is `<keeper_server>`. Keeper configuration has the following parameters: - `tcp_port` — Port for a client to connect (default for ZooKeeper is `2181`). - `tcp_port_secure` — Secure port for an SSL connection between client and keeper-server. - `server_id` — Unique server id, each participant of the ClickHouse Keeper cluster must have a unique number (1, 2, 3, and so on). -- `log_storage_path` — Path to coordination logs, better to store logs on the non-busy device (same for ZooKeeper). 
+- `log_storage_path` — Path to coordination logs; just like ZooKeeper, it is best to store logs on a non-busy device. - `snapshot_storage_path` — Path to coordination snapshots. Other common parameters are inherited from the ClickHouse server config (`listen_host`, `logger`, and so on). -Internal coordination settings are located in `<keeper_server>.<coordination_settings>` section: +Internal coordination settings are located in the `<keeper_server>.<coordination_settings>` section: - `operation_timeout_ms` — Timeout for a single client operation (ms) (default: 10000). - `min_session_timeout_ms` — Min timeout for client session (ms) (default: 10000). - `session_timeout_ms` — Max timeout for client session (ms) (default: 100000). -- `dead_session_check_period_ms` — How often ClickHouse Keeper check dead sessions and remove them (ms) (default: 500). +- `dead_session_check_period_ms` — How often ClickHouse Keeper checks for dead sessions and removes them (ms) (default: 500). - `heart_beat_interval_ms` — How often a ClickHouse Keeper leader will send heartbeats to followers (ms) (default: 500). -- `election_timeout_lower_bound_ms` — If the follower didn't receive heartbeats from the leader in this interval, then it can initiate leader election (default: 1000). -- `election_timeout_upper_bound_ms` — If the follower didn't receive heartbeats from the leader in this interval, then it must initiate leader election (default: 2000). +- `election_timeout_lower_bound_ms` — If the follower does not receive a heartbeat from the leader in this interval, then it can initiate leader election (default: 1000). +- `election_timeout_upper_bound_ms` — If the follower does not receive a heartbeat from the leader in this interval, then it must initiate leader election (default: 2000). - `rotate_log_storage_interval` — How many log records to store in a single file (default: 100000). - `reserved_log_items` — How many coordination log records to store before compaction (default: 100000). 
- `snapshot_distance` — How often ClickHouse Keeper will create new snapshots (in the number of records in logs) (default: 100000). @@ -55,7 +55,7 @@ Internal coordination settings are located in `..` section and contain servers description. +Quorum configuration is located in the `<keeper_server>.<raft_configuration>` section and contains the server descriptions. The only parameter for the whole quorum is `secure`, which enables encrypted connection for communication between quorum participants. The parameter can be set `true` if SSL connection is required for internal communication between nodes, or left unspecified otherwise. @@ -66,7 +66,7 @@ The main parameters for each `<server>` are: - `port` — Port where this server listens for connections. :::note -In the case of a change in the topology of your ClickHouse Keeper cluster (eg. replacing a server), please make sure to keep the mapping `server_id` to `hostname` consistent and avoid shuffling or reusing an existing `server_id` for different servers (eg. it can happen if your rely on automation scripts to deploy ClickHouse Keeper) +In the case of a change in the topology of your ClickHouse Keeper cluster (e.g., replacing a server), please make sure to keep the mapping of `server_id` to `hostname` consistent and avoid shuffling or reusing an existing `server_id` for different servers (e.g., this can happen if you rely on automation scripts to deploy ClickHouse Keeper). ::: Examples of configuration for quorum with three nodes can be found in [integration tests](https://github.com/ClickHouse/ClickHouse/tree/master/tests/integration) with `test_keeper_` prefix. 
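The election timeout bounds listed in the coordination settings above work the way Raft-style implementations typically do: each follower picks a randomized timeout between the lower and upper bound, so simultaneous candidacies rarely collide. An illustrative sketch (not NuRaft's actual code; constants mirror the documented defaults):

```python
import random

# Defaults from election_timeout_lower_bound_ms / election_timeout_upper_bound_ms.
ELECTION_TIMEOUT_LOWER_BOUND_MS = 1000
ELECTION_TIMEOUT_UPPER_BOUND_MS = 2000

def pick_election_timeout_ms() -> int:
    # Randomizing within the bounds staggers followers: whichever follower's
    # timer fires first starts the election; the rest usually just vote.
    return random.randint(ELECTION_TIMEOUT_LOWER_BOUND_MS,
                          ELECTION_TIMEOUT_UPPER_BOUND_MS)
```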
Example configuration for server #1: @@ -112,7 +112,7 @@ ClickHouse Keeper is bundled into the ClickHouse server package, just add config clickhouse-keeper --config /etc/your_path_to_config/config.xml ``` -If you don't have the symlink (`clickhouse-keeper`) you can create it or specify `keeper` as argument: +If you don't have the symlink (`clickhouse-keeper`), you can create it or specify `keeper` as an argument to `clickhouse`: ```bash clickhouse keeper --config /etc/your_path_to_config/config.xml ``` diff --git a/docs/en/operations/configuration-files.md b/docs/en/operations/configuration-files.md index 582e90544e0..4a5431fa57c 100644 --- a/docs/en/operations/configuration-files.md +++ b/docs/en/operations/configuration-files.md @@ -34,7 +34,7 @@ You can also declare attributes as coming from environment variables by using `f The config can also define “substitutions”. If an element has the `incl` attribute, the corresponding substitution from the file will be used as the value. By default, the path to the file with substitutions is `/etc/metrika.xml`. This can be changed in the [include_from](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-include_from) element in the server config. The substitution values are specified in `/clickhouse/substitution_name` elements in this file. If a substitution specified in `incl` does not exist, it is recorded in the log. To prevent ClickHouse from logging missing substitutions, specify the `optional="true"` attribute (for example, settings for [macros](../operations/server-configuration-parameters/settings.md#macros)). -If you want to replace an entire element with a substitution use `include` as element name. +If you want to replace an entire element with a substitution, use `include` as the element name. 
XML substitution example: diff --git a/docs/en/operations/system-tables/zookeeper.md b/docs/en/operations/system-tables/zookeeper.md index e8232483f6f..31625b0fed6 100644 --- a/docs/en/operations/system-tables/zookeeper.md +++ b/docs/en/operations/system-tables/zookeeper.md @@ -1,7 +1,7 @@ # zookeeper {#system-zookeeper} -The table does not exist if ZooKeeper is not configured. Allows reading data from the ZooKeeper cluster defined in the config. -The query must either have a ‘path =’ condition or a `path IN` condition set with the `WHERE` clause as shown below. This corresponds to the path of the children in ZooKeeper that you want to get data for. +The table does not exist unless ClickHouse Keeper or ZooKeeper is configured. Th `system.zookeeper` exposes data from the Keeper cluster defined in the config. +The query must either have a ‘path =’ condition or a `path IN` condition set with the `WHERE` clause as shown below. This corresponds to the path of the children that you want to get data for. The query `SELECT * FROM system.zookeeper WHERE path = '/clickhouse'` outputs data for all children on the `/clickhouse` node. To output data for all root nodes, write path = ‘/’. @@ -9,7 +9,7 @@ If the path specified in ‘path’ does not exist, an exception will be thrown. The query `SELECT * FROM system.zookeeper WHERE path IN ('/', '/clickhouse')` outputs data for all children on the `/` and `/clickhouse` node. If a path specified in the ‘path’ collection does not exist, an exception will be thrown. -It can be used to do a batch of ZooKeeper path queries. +It can be used to do a batch of Keeper path queries. Columns: diff --git a/docs/en/operations/tips.md b/docs/en/operations/tips.md index c727c636579..26dc59d72ba 100644 --- a/docs/en/operations/tips.md +++ b/docs/en/operations/tips.md @@ -118,11 +118,11 @@ in XML configuration. This is important for ClickHouse to be able to get correct information with `cpuid` instruction. 
Otherwise you may get `Illegal instruction` crashes when hypervisor is run on old CPU models. -## ZooKeeper {#zookeeper} +## ClickHouse Keeper and ZooKeeper {#zookeeper} -You are probably already using ZooKeeper for other purposes. You can use the same installation of ZooKeeper, if it isn’t already overloaded. +ClickHouse Keeper is the recommended replacement for ZooKeeper in ClickHouse clusters. See the [ClickHouse Keeper](clickhouse-keeper.md) documentation. -It’s best to use a fresh version of ZooKeeper – 3.4.9 or later. The version in stable Linux distributions may be outdated. +If you would like to continue using ZooKeeper, it is best to use a fresh version of ZooKeeper – 3.4.9 or later. The version in stable Linux distributions may be outdated. You should never use manually written scripts to transfer data between different ZooKeeper clusters, because the result will be incorrect for sequential nodes. Never use the “zkcopy” utility for the same reason: https://github.com/ksprojects/zkcopy/issues/15 diff --git a/docs/en/operations/utilities/clickhouse-copier.md b/docs/en/operations/utilities/clickhouse-copier.md index f152c177992..f3806d1afbc 100644 --- a/docs/en/operations/utilities/clickhouse-copier.md +++ b/docs/en/operations/utilities/clickhouse-copier.md @@ -11,11 +11,11 @@ Copies data from the tables in one cluster to tables in another (or the same) cl To get a consistent copy, the data in the source tables and partitions should not change during the entire process. ::: -You can run multiple `clickhouse-copier` instances on different servers to perform the same job. ZooKeeper is used for syncing the processes. +You can run multiple `clickhouse-copier` instances on different servers to perform the same job. ClickHouse Keeper, or ZooKeeper, is used for syncing the processes. After starting, `clickhouse-copier`: -- Connects to ZooKeeper and receives: +- Connects to ClickHouse Keeper and receives: - Copying jobs. - The state of the copying jobs. 
@@ -24,7 +24,7 @@ After starting, `clickhouse-copier`: Each running process chooses the “closest” shard of the source cluster and copies the data into the destination cluster, resharding the data if necessary. -`clickhouse-copier` tracks the changes in ZooKeeper and applies them on the fly. +`clickhouse-copier` tracks the changes in ClickHouse Keeper and applies them on the fly. To reduce network traffic, we recommend running `clickhouse-copier` on the same server where the source data is located. @@ -33,19 +33,19 @@ To reduce network traffic, we recommend running `clickhouse-copier` on the same The utility should be run manually: ``` bash -$ clickhouse-copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir +$ clickhouse-copier --daemon --config keeper.xml --task-path /task/path --base-dir /path/to/dir ``` Parameters: - `daemon` — Starts `clickhouse-copier` in daemon mode. -- `config` — The path to the `zookeeper.xml` file with the parameters for the connection to ZooKeeper. -- `task-path` — The path to the ZooKeeper node. This node is used for syncing `clickhouse-copier` processes and storing tasks. Tasks are stored in `$task-path/description`. -- `task-file` — Optional path to file with task configuration for initial upload to ZooKeeper. +- `config` — The path to the `keeper.xml` file with the parameters for the connection to ClickHouse Keeper. +- `task-path` — The path to the ClickHouse Keeper node. This node is used for syncing `clickhouse-copier` processes and storing tasks. Tasks are stored in `$task-path/description`. +- `task-file` — Optional path to file with task configuration for initial upload to ClickHouse Keeper. - `task-upload-force` — Force upload `task-file` even if node already exists. - `base-dir` — The path to logs and auxiliary files. When it starts, `clickhouse-copier` creates `clickhouse-copier_YYYYMMHHSS_` subdirectories in `$base-dir`. 
If this parameter is omitted, the directories are created in the directory where `clickhouse-copier` was launched. -## Format of Zookeeper.xml {#format-of-zookeeper-xml} +## Format of keeper.xml {#format-of-zookeeper-xml} ``` xml From 93e0e72fcbea5fea5cdb890fd369a4a4fa319cdd Mon Sep 17 00:00:00 2001 From: Dan Roscigno Date: Fri, 20 May 2022 17:43:23 -0400 Subject: [PATCH 2/6] typo --- docs/en/operations/system-tables/zookeeper.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/operations/system-tables/zookeeper.md b/docs/en/operations/system-tables/zookeeper.md index 31625b0fed6..ec67d2780e3 100644 --- a/docs/en/operations/system-tables/zookeeper.md +++ b/docs/en/operations/system-tables/zookeeper.md @@ -1,6 +1,6 @@ # zookeeper {#system-zookeeper} -The table does not exist unless ClickHouse Keeper or ZooKeeper is configured. Th `system.zookeeper` exposes data from the Keeper cluster defined in the config. +The table does not exist unless ClickHouse Keeper or ZooKeeper is configured. The `system.zookeeper` table exposes data from the Keeper cluster defined in the config. The query must either have a ‘path =’ condition or a `path IN` condition set with the `WHERE` clause as shown below. This corresponds to the path of the children that you want to get data for. The query `SELECT * FROM system.zookeeper WHERE path = '/clickhouse'` outputs data for all children on the `/clickhouse` node. 
From 49b0fde46c4a3f04229d08213397b491620ff80b Mon Sep 17 00:00:00 2001 From: Dan Roscigno Date: Mon, 23 May 2022 09:43:26 -0400 Subject: [PATCH 3/6] add ClickHouse Keeper to docs --- .../operations/settings/merge-tree-settings.md | 8 ++++---- docs/en/operations/settings/settings.md | 6 +++--- .../system-tables/distributed_ddl_queue.md | 2 +- docs/en/operations/system-tables/mutations.md | 2 +- docs/en/operations/system-tables/replicas.md | 16 ++++++++-------- .../system-tables/replication_queue.md | 6 +++--- 6 files changed, 20 insertions(+), 20 deletions(-) diff --git a/docs/en/operations/settings/merge-tree-settings.md b/docs/en/operations/settings/merge-tree-settings.md index 01177d6aae4..b672da83441 100644 --- a/docs/en/operations/settings/merge-tree-settings.md +++ b/docs/en/operations/settings/merge-tree-settings.md @@ -114,7 +114,7 @@ A large number of parts in a table reduces performance of ClickHouse queries and ## replicated_deduplication_window {#replicated-deduplication-window} -The number of most recently inserted blocks for which Zookeeper stores hash sums to check for duplicates. +The number of most recently inserted blocks for which ClickHouse Keeper stores hash sums to check for duplicates. Possible values: @@ -123,7 +123,7 @@ Possible values: Default value: 100. -The `Insert` command creates one or more blocks (parts). When inserting into Replicated tables, ClickHouse for [insert deduplication](../../engines/table-engines/mergetree-family/replication/) writes the hash sums of the created parts into Zookeeper. Hash sums are stored only for the most recent `replicated_deduplication_window` blocks. The oldest hash sums are removed from Zookeeper. +The `Insert` command creates one or more blocks (parts). For [insert deduplication](../../engines/table-engines/mergetree-family/replication/), when writing into replicated tables, ClickHouse writes the hash sums of the created parts into ClickHouse Keeper. 
Hash sums are stored only for the most recent `replicated_deduplication_window` blocks. The oldest hash sums are removed from ClickHouse Keeper. A large number of `replicated_deduplication_window` slows down `Inserts` because it needs to compare more entries. The hash sum is calculated from the composition of the field names and types and the data of the inserted part (stream of bytes). @@ -142,7 +142,7 @@ A deduplication mechanism is used, similar to replicated tables (see [replicated ## replicated_deduplication_window_seconds {#replicated-deduplication-window-seconds} -The number of seconds after which the hash sums of the inserted blocks are removed from Zookeeper. +The number of seconds after which the hash sums of the inserted blocks are removed from ClickHouse Keeper. Possible values: @@ -150,7 +150,7 @@ Possible values: Default value: 604800 (1 week). -Similar to [replicated_deduplication_window](#replicated-deduplication-window), `replicated_deduplication_window_seconds` specifies how long to store hash sums of blocks for insert deduplication. Hash sums older than `replicated_deduplication_window_seconds` are removed from Zookeeper, even if they are less than ` replicated_deduplication_window`. +Similar to [replicated_deduplication_window](#replicated-deduplication-window), `replicated_deduplication_window_seconds` specifies how long to store hash sums of blocks for insert deduplication. Hash sums older than `replicated_deduplication_window_seconds` are removed from ClickHouse Keeper, even if they are less than `replicated_deduplication_window`. 
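The sliding deduplication window described above can be sketched as follows. This is an illustrative model only, not ClickHouse source: the class name is invented and the hash function is a stand-in for whatever ClickHouse actually uses over the part's field names, types, and data:

```python
import hashlib
from collections import OrderedDict

class DedupWindow:
    """Keep hash sums of the N most recent blocks; older sums fall out."""

    def __init__(self, window: int = 100):  # default mirrors the setting
        self.window = window
        self._hashes = OrderedDict()

    def insert(self, block: bytes) -> bool:
        h = hashlib.sha256(block).hexdigest()
        if h in self._hashes:
            return False  # duplicate of a recent block: the insert is skipped
        self._hashes[h] = True
        if len(self._hashes) > self.window:
            self._hashes.popitem(last=False)  # evict the oldest hash sum
        return True
```

Note that once a block's hash sum ages out of the window, re-inserting the same block is accepted again, which is exactly why a larger window catches more duplicates but makes each insert compare more entries.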
## replicated_fetches_http_connection_timeout {#replicated_fetches_http_connection_timeout} diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md index 76fbc5f239d..1249c3d7a95 100644 --- a/docs/en/operations/settings/settings.md +++ b/docs/en/operations/settings/settings.md @@ -1838,7 +1838,7 @@ Usage By default, deduplication is not performed for materialized views but is done upstream, in the source table. If an INSERTed block is skipped due to deduplication in the source table, there will be no insertion into attached materialized views. This behaviour exists to enable the insertion of highly aggregated data into materialized views, for cases where inserted blocks are the same after materialized view aggregation but derived from different INSERTs into the source table. -At the same time, this behaviour “breaks” `INSERT` idempotency. If an `INSERT` into the main table was successful and `INSERT` into a materialized view failed (e.g. because of communication failure with Zookeeper) a client will get an error and can retry the operation. However, the materialized view won’t receive the second insert because it will be discarded by deduplication in the main (source) table. The setting `deduplicate_blocks_in_dependent_materialized_views` allows for changing this behaviour. On retry, a materialized view will receive the repeat insert and will perform a deduplication check by itself, +At the same time, this behaviour “breaks” `INSERT` idempotency. If an `INSERT` into the main table was successful and `INSERT` into a materialized view failed (e.g. because of communication failure with ClickHouse Keeper) a client will get an error and can retry the operation. However, the materialized view won’t receive the second insert because it will be discarded by deduplication in the main (source) table. The setting `deduplicate_blocks_in_dependent_materialized_views` allows for changing this behaviour. 
On retry, a materialized view will receive the repeat insert and will perform a deduplication check by itself, ignoring the check result for the source table, and will insert rows lost because of the first failure. ## insert_deduplication_token {#insert_deduplication_token} @@ -2459,7 +2459,7 @@ Default value: 0. ## merge_selecting_sleep_ms {#merge_selecting_sleep_ms} -Sleep time for merge selecting when no part is selected. A lower setting triggers selecting tasks in `background_schedule_pool` frequently, which results in a large number of requests to Zookeeper in large-scale clusters. +Sleep time for merge selecting when no part is selected. A lower setting triggers selecting tasks in `background_schedule_pool` frequently, which results in a large number of requests to ClickHouse Keeper in large-scale clusters. Possible values: @@ -2607,7 +2607,7 @@ Default value: 128. ## background_fetches_pool_size {#background_fetches_pool_size} -Sets the number of threads performing background fetches for [replicated](../../engines/table-engines/mergetree-family/replication.md) tables. This setting is applied at the ClickHouse server start and can’t be changed in a user session. For production usage with frequent small insertions or slow ZooKeeper cluster is recommended to use default value. +Sets the number of threads performing background fetches for [replicated](../../engines/table-engines/mergetree-family/replication.md) tables. This setting is applied at the ClickHouse server start and can’t be changed in a user session. For production usage with frequent small insertions or a slow ZooKeeper cluster, it is recommended to use the default value. 
Possible values: diff --git a/docs/en/operations/system-tables/distributed_ddl_queue.md b/docs/en/operations/system-tables/distributed_ddl_queue.md index 0597972197d..ac2663bba19 100644 --- a/docs/en/operations/system-tables/distributed_ddl_queue.md +++ b/docs/en/operations/system-tables/distributed_ddl_queue.md @@ -15,7 +15,7 @@ Columns: - `query_start_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Query start time. - `query_finish_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Query finish time. - `query_duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Duration of query execution (in milliseconds). -- `exception_code` ([Enum8](../../sql-reference/data-types/enum.md)) — Exception code from [ZooKeeper](../../operations/tips.md#zookeeper). +- `exception_code` ([Enum8](../../sql-reference/data-types/enum.md)) — Exception code from [ClickHouse Keeper](../../operations/tips.md#zookeeper). **Example** diff --git a/docs/en/operations/system-tables/mutations.md b/docs/en/operations/system-tables/mutations.md index 507146d93de..6d74a0de9c3 100644 --- a/docs/en/operations/system-tables/mutations.md +++ b/docs/en/operations/system-tables/mutations.md @@ -8,7 +8,7 @@ Columns: - `table` ([String](../../sql-reference/data-types/string.md)) — The name of the table to which the mutation was applied. -- `mutation_id` ([String](../../sql-reference/data-types/string.md)) — The ID of the mutation. For replicated tables these IDs correspond to znode names in the `/mutations/` directory in ZooKeeper. For non-replicated tables the IDs correspond to file names in the data directory of the table. +- `mutation_id` ([String](../../sql-reference/data-types/string.md)) — The ID of the mutation. For replicated tables these IDs correspond to znode names in the `/mutations/` directory in ClickHouse Keeper or ZooKeeper. For non-replicated tables the IDs correspond to file names in the data directory of the table. 
- `command` ([String](../../sql-reference/data-types/string.md)) — The mutation command string (the part of the query after `ALTER TABLE [db.]table`). diff --git a/docs/en/operations/system-tables/replicas.md b/docs/en/operations/system-tables/replicas.md index 6ec0f184e15..c65b0d294b0 100644 --- a/docs/en/operations/system-tables/replicas.md +++ b/docs/en/operations/system-tables/replicas.md @@ -62,13 +62,13 @@ Columns: Note that writes can be performed to any replica that is available and has a session in ZK, regardless of whether it is a leader. - `can_become_leader` (`UInt8`) - Whether the replica can be a leader. - `is_readonly` (`UInt8`) - Whether the replica is in read-only mode. - This mode is turned on if the config does not have sections with ZooKeeper, if an unknown error occurred when reinitializing sessions in ZooKeeper, and during session reinitialization in ZooKeeper. -- `is_session_expired` (`UInt8`) - the session with ZooKeeper has expired. Basically the same as `is_readonly`. + This mode is turned on if the config does not have sections with ClickHouse Keeper, if an unknown error occurred when reinitializing sessions in ClickHouse Keeper, and during session reinitialization in ClickHouse Keeper. +- `is_session_expired` (`UInt8`) - the session with ClickHouse Keeper has expired. Basically the same as `is_readonly`. - `future_parts` (`UInt32`) - The number of data parts that will appear as the result of INSERTs or merges that haven’t been done yet. - `parts_to_check` (`UInt32`) - The number of data parts in the queue for verification. A part is put in the verification queue if there is suspicion that it might be damaged. -- `zookeeper_path` (`String`) - Path to table data in ZooKeeper. -- `replica_name` (`String`) - Replica name in ZooKeeper. Different replicas of the same table have different names. -- `replica_path` (`String`) - Path to replica data in ZooKeeper. The same as concatenating ‘zookeeper_path/replicas/replica_path’. 
+- `zookeeper_path` (`String`) - Path to table data in ClickHouse Keeper. +- `replica_name` (`String`) - Replica name in ClickHouse Keeper. Different replicas of the same table have different names. +- `replica_path` (`String`) - Path to replica data in ClickHouse Keeper. The same as concatenating ‘zookeeper_path/replicas/replica_path’. - `columns_version` (`Int32`) - Version number of the table structure. Indicates how many times ALTER was performed. If replicas have different versions, it means some replicas haven’t made all of the ALTERs yet. - `queue_size` (`UInt32`) - Size of the queue for operations waiting to be performed. Operations include inserting blocks of data, merges, and certain other actions. It usually coincides with `future_parts`. - `inserts_in_queue` (`UInt32`) - Number of inserts of blocks of data that need to be made. Insertions are usually replicated fairly quickly. If this number is large, it means something is wrong. @@ -86,12 +86,12 @@ The next 4 columns have a non-zero value only where there is an active session w - `last_queue_update` (`DateTime`) - When the queue was updated last time. - `absolute_delay` (`UInt64`) - How big lag in seconds the current replica has. - `total_replicas` (`UInt8`) - The total number of known replicas of this table. -- `active_replicas` (`UInt8`) - The number of replicas of this table that have a session in ZooKeeper (i.e., the number of functioning replicas). +- `active_replicas` (`UInt8`) - The number of replicas of this table that have a session in ClickHouse Keeper (i.e., the number of functioning replicas). - `last_queue_update_exception` (`String`) - When the queue contains broken entries. Especially important when ClickHouse breaks backward compatibility between versions and log entries written by newer versions aren't parseable by old versions. -- `zookeeper_exception` (`String`) - The last exception message, got if the error happened when fetching the info from ZooKeeper. 
+- `zookeeper_exception` (`String`) - The last exception message, received if an error occurred when fetching the info from ClickHouse Keeper. - `replica_is_active` ([Map(String, UInt8)](../../sql-reference/data-types/map.md)) — Map between replica name and is replica active. -If you request all the columns, the table may work a bit slowly, since several reads from ZooKeeper are made for each row. +If you request all the columns, the table may work a bit slowly, since several reads from ClickHouse Keeper are made for each row. If you do not request the last 4 columns (log_max_index, log_pointer, total_replicas, active_replicas), the table works quickly. For example, you can check that everything is working correctly like this: diff --git a/docs/en/operations/system-tables/replication_queue.md b/docs/en/operations/system-tables/replication_queue.md index a8a51162dae..834f4a04757 100644 --- a/docs/en/operations/system-tables/replication_queue.md +++ b/docs/en/operations/system-tables/replication_queue.md @@ -1,6 +1,6 @@ # replication_queue {#system_tables-replication_queue} -Contains information about tasks from replication queues stored in ZooKeeper for tables in the `ReplicatedMergeTree` family. +Contains information about tasks from replication queues stored in ClickHouse Keeper, or ZooKeeper, for tables in the `ReplicatedMergeTree` family. Columns: @@ -8,11 +8,11 @@ Columns: - `table` ([String](../../sql-reference/data-types/string.md)) — Name of the table. -- `replica_name` ([String](../../sql-reference/data-types/string.md)) — Replica name in ZooKeeper. Different replicas of the same table have different names. +- `replica_name` ([String](../../sql-reference/data-types/string.md)) — Replica name in ClickHouse Keeper. Different replicas of the same table have different names. - `position` ([UInt32](../../sql-reference/data-types/int-uint.md)) — Position of the task in the queue. 
-- `node_name` ([String](../../sql-reference/data-types/string.md)) — Node name in ZooKeeper.
+- `node_name` ([String](../../sql-reference/data-types/string.md)) — Node name in ClickHouse Keeper.

 - `type` ([String](../../sql-reference/data-types/string.md)) — Type of the task in the queue, one of:

From b9bd3cb49f9aa5042ef78aaa33136e81dd057908 Mon Sep 17 00:00:00 2001
From: Dan Roscigno
Date: Mon, 23 May 2022 10:06:16 -0400
Subject: [PATCH 4/6] update tips for ClickHouse Keeper

---
 docs/en/operations/tips.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/en/operations/tips.md b/docs/en/operations/tips.md
index 26dc59d72ba..a0a0391fb09 100644
--- a/docs/en/operations/tips.md
+++ b/docs/en/operations/tips.md
@@ -128,7 +128,7 @@ You should never use manually written scripts to transfer data between different

 If you want to divide an existing ZooKeeper cluster into two, the correct way is to increase the number of its replicas and then reconfigure it as two independent clusters.

-Do not run ZooKeeper on the same servers as ClickHouse. Because ZooKeeper is very sensitive for latency and ClickHouse may utilize all available system resources.
+You can run ClickHouse Keeper on the same server as ClickHouse, but do not run ZooKeeper on the same servers as ClickHouse, because ZooKeeper is very sensitive to latency and ClickHouse may utilize all available system resources.

 You can have ZooKeeper observers in an ensemble but ClickHouse servers should not interact with observers.
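The `system.replicas` columns documented in the patches above (queue sizes, replica counts, session state) are typically combined into a health check. A minimal sketch in Python; the thresholds and the plain-dict row shape are illustrative assumptions, not part of the official documentation:

```python
# Sketch of a replica health check over system.replicas-style rows.
# Column names follow the documentation above; the numeric thresholds
# are illustrative assumptions, not documented limits.

def replica_needs_attention(row: dict) -> bool:
    """Return True if a system.replicas row looks unhealthy."""
    return (
        row.get("queue_size", 0) > 20              # large backlog of pending operations
        or row.get("inserts_in_queue", 0) > 10     # inserts are usually replicated quickly
        or row.get("total_replicas", 0) < 2        # no redundancy left
        # fewer active sessions in Keeper than known replicas => some replica is down
        or row.get("active_replicas", 0) < row.get("total_replicas", 0)
    )

healthy = {"queue_size": 1, "inserts_in_queue": 0, "total_replicas": 3, "active_replicas": 3}
lagging = {"queue_size": 50, "inserts_in_queue": 12, "total_replicas": 3, "active_replicas": 2}

print(replica_needs_attention(healthy))  # False
print(replica_needs_attention(lagging))  # True
```

In practice the same predicate is usually expressed as a `WHERE` clause directly against `system.replicas`, so only unhealthy rows are returned.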
From 8dced1af03cead91e76600e789e604f5050378ef Mon Sep 17 00:00:00 2001
From: Dan Roscigno
Date: Tue, 24 May 2022 07:29:52 -0400
Subject: [PATCH 5/6] Update docs/en/operations/system-tables/mutations.md

Co-authored-by: Antonio Andelic
---
 docs/en/operations/system-tables/mutations.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/en/operations/system-tables/mutations.md b/docs/en/operations/system-tables/mutations.md
index 6d74a0de9c3..2878a19a1e7 100644
--- a/docs/en/operations/system-tables/mutations.md
+++ b/docs/en/operations/system-tables/mutations.md
@@ -8,7 +8,7 @@ Columns:

 - `table` ([String](../../sql-reference/data-types/string.md)) — The name of the table to which the mutation was applied.

-- `mutation_id` ([String](../../sql-reference/data-types/string.md)) — The ID of the mutation. For replicated tables these IDs correspond to znode names in the `/mutations/` directory in ClickHouse Keeper or ZooKeeper. For non-replicated tables the IDs correspond to file names in the data directory of the table.
+- `mutation_id` ([String](../../sql-reference/data-types/string.md)) — The ID of the mutation. For replicated tables these IDs correspond to znode names in the `/mutations/` directory in ClickHouse Keeper. For non-replicated tables the IDs correspond to file names in the data directory of the table.

 - `command` ([String](../../sql-reference/data-types/string.md)) — The mutation command string (the part of the query after `ALTER TABLE [db.]table`).
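As the `mutation_id` description above notes, a replicated table keeps one znode per mutation under its `/mutations/` directory in Keeper. A minimal sketch of composing that znode path; the helper name and the example paths are illustrative assumptions:

```python
# Sketch: derive the Keeper znode path for a mutation of a replicated table.
# The table's zookeeper_path and the mutation_id format shown here are
# examples, not values guaranteed by the documentation.

def mutation_znode_path(zookeeper_path: str, mutation_id: str) -> str:
    # Each mutation is a znode named after its mutation_id inside the
    # table's /mutations/ directory.
    return f"{zookeeper_path.rstrip('/')}/mutations/{mutation_id}"

print(mutation_znode_path("/clickhouse/tables/01/visits", "0000000003"))
# -> /clickhouse/tables/01/visits/mutations/0000000003
```

For non-replicated tables the same IDs are file names in the table's data directory instead, so no Keeper path exists.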
From 620ab399c99ecce92848d1c3101cbd60572393ad Mon Sep 17 00:00:00 2001
From: alesapin
Date: Wed, 25 May 2022 20:23:24 +0200
Subject: [PATCH 6/6] Update docs/en/operations/clickhouse-keeper.md

---
 docs/en/operations/clickhouse-keeper.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/en/operations/clickhouse-keeper.md b/docs/en/operations/clickhouse-keeper.md
index 7dbe0601343..e4d10967bc8 100644
--- a/docs/en/operations/clickhouse-keeper.md
+++ b/docs/en/operations/clickhouse-keeper.md
@@ -13,7 +13,7 @@ ZooKeeper is one of the first well-known open-source coordination systems. It's

 By default, ClickHouse Keeper provides the same guarantees as ZooKeeper (linearizable writes, non-linearizable reads). It has a compatible client-server protocol, so any standard ZooKeeper client can be used to interact with ClickHouse Keeper. Snapshots and logs have an incompatible format with ZooKeeper, but the `clickhouse-keeper-converter` tool enables the conversion of ZooKeeper data to ClickHouse Keeper snapshots. The interserver protocol in ClickHouse Keeper is also incompatible with ZooKeeper so a mixed ZooKeeper / ClickHouse Keeper cluster is impossible.

-ClickHouse Keeper supports Access Control Lists (ACLs) the same way as [ZooKeeper](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl) does. ClickHouse Keeper supports the same set of permissions and has the identical built-in schemes: `world`, `auth`, `digest`, `host` and `ip`. The digest authentication scheme uses the pair `username:password`, the password is encoded in Base64.
+ClickHouse Keeper supports Access Control Lists (ACLs) the same way as [ZooKeeper](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl) does. ClickHouse Keeper supports the same set of permissions and has the identical built-in schemes: `world`, `auth` and `digest`. The digest authentication scheme uses the pair `username:password`; the password is encoded in Base64.

 :::note
 External integrations are not supported
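The digest scheme described above follows ZooKeeper's convention, in which clients store the ACL identity as `username:Base64(SHA-1(username:password))`. A minimal sketch of that derivation, modeled on ZooKeeper's `DigestAuthenticationProvider`; treat the exact stored form as an assumption about ZooKeeper-compatible clients, not a guarantee about ClickHouse Keeper internals:

```python
import base64
import hashlib

def digest_acl_id(username: str, password: str) -> str:
    # ZooKeeper-style digest ACL id: "<user>:<base64(sha1('user:password'))>"
    raw = f"{username}:{password}".encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{digest}"

acl_id = digest_acl_id("user1", "password1")
# SHA-1 yields 20 bytes, so the Base64 part is always 28 characters.
print(acl_id)
```

The `auth` scheme, by contrast, sends the plain `username:password` pair on the session and lets the server associate the hashed identity with created znodes.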