Style and Docs update

2024-09-20 00:30:49 +00:00 · 2022-04-22 13:04:54 +00:00 · 2022-04-22 13:04:54 +00:00 · e475849761
commit e475849761
parent f95a63a7f0
7 changed files with 1345 additions and 51 deletions
--- a/docs/en/operations/server-configuration-parameters/settings.md
+++ b/docs/en/operations/server-configuration-parameters/settings.md
@ -23,7 +23,7 @@ Default value: 3600.

 Data compression settings for [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md)-engine tables.

-:::warning    
+:::warning
 Don’t use it if you have just started using ClickHouse.
 :::

@ -886,6 +886,92 @@ Default value: `10000`.
 <thread_pool_queue_size>12000</thread_pool_queue_size>
 ```

+## background_pool_size {#background_pool_size}
+
+Sets the number of threads performing background merges and mutations for tables with MergeTree engines. For backward compatibility, this setting can also be applied at server startup from the `default` profile configuration. You can only increase the number of threads at runtime; to lower it, you have to restart the server. By adjusting this setting, you manage CPU and disk load. A smaller pool size uses less CPU and disk resources, but background processes advance more slowly, which might eventually impact query performance.
+
+Before changing it, please also take a look at related MergeTree settings, such as `number_of_free_entries_in_pool_to_lower_max_size_of_merge` and `number_of_free_entries_in_pool_to_execute_mutation`.
+
+Possible values:
+
+-   Any positive integer.
+
+Default value: 16.
+
+**Example**
+
+```xml
+<background_pool_size>16</background_pool_size>
+```
+
+## background_merges_mutations_concurrency_ratio {#background_merges_mutations_concurrency_ratio}
+
+Sets the ratio between the number of threads and the number of background merges and mutations that can be executed concurrently. For example, if the ratio equals 2 and
+`background_pool_size` is set to 16, then ClickHouse can execute 32 background merges concurrently. This is possible because background operations can be suspended and postponed, which is needed to give small merges higher execution priority. You can only increase this ratio at runtime; to lower it, you have to restart the server.
+As with the `background_pool_size` setting, `background_merges_mutations_concurrency_ratio` can be applied from the `default` profile for backward compatibility.
+
+Possible values:
+
+-   Any positive integer.
+
+Default value: 2.
+
+**Example**
+
+```xml
+<background_merges_mutations_concurrency_ratio>3</background_merges_mutations_concurrency_ratio>
+```
+
+## background_move_pool_size {#background_move_pool_size}
+
+Sets the number of threads performing background moves for tables with MergeTree engines. It can be increased at runtime and, for backward compatibility, can be applied at server startup from the `default` profile.
+
+Possible values:
+
+-   Any positive integer.
+
+Default value: 8.
+
+**Example**
+
+```xml
+<background_move_pool_size>36</background_move_pool_size>
+```
+
+## background_fetches_pool_size {#background_fetches_pool_size}
+
+Sets the number of threads performing background fetches for tables with ReplicatedMergeTree engines. It can be increased at runtime and, for backward compatibility, can be applied at server startup from the `default` profile.
+
+Possible values:
+
+-   Any positive integer.
+
+Default value: 8.
+
+**Example**
+
+```xml
+<background_fetches_pool_size>36</background_fetches_pool_size>
+```
+
+## background_common_pool_size {#background_common_pool_size}
+
+Sets the number of threads performing non-specialized background operations, such as cleaning the filesystem, for tables with MergeTree engines. It can be increased at runtime and, for backward compatibility, can be applied at server startup from the `default` profile.
+
+Possible values:
+
+-   Any positive integer.
+
+Default value: 8.
+
+**Example**
+
+```xml
+<background_common_pool_size>36</background_common_pool_size>
+```
+
+
+
 ## merge_tree {#server_configuration_parameters-merge_tree}

 Fine tuning for tables in the [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md).
@ -1628,3 +1714,13 @@ Possible values:

 Default value: `10000`.

+## global_memory_usage_overcommit_max_wait_microseconds {#global_memory_usage_overcommit_max_wait_microseconds}
+
+Sets the maximum waiting time, in microseconds, for the global overcommit tracker.
+
+Possible values:
+
+-   Positive integer.
+
+Default value: `0`.
+
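
Reviewer note: the `background_merges_mutations_concurrency_ratio` text added in this file describes a simple product: with a pool of 16 threads and a ratio of 2, up to 32 background merges can be scheduled concurrently. The sketch below only restates that arithmetic. `maxConcurrentTasks` is a hypothetical helper introduced for illustration and is not part of ClickHouse.

```cpp
#include <cstddef>

// Hypothetical helper (not ClickHouse code): upper bound on concurrently
// scheduled background merges and mutations, computed as
// background_pool_size * background_merges_mutations_concurrency_ratio.
constexpr std::size_t maxConcurrentTasks(std::size_t pool_size, std::size_t ratio)
{
    return pool_size * ratio;
}

// The example from the documentation text: 16 threads with ratio 2.
static_assert(maxConcurrentTasks(16, 2) == 32, "16 threads * ratio 2 = 32 tasks");
```

Under the same assumption, the XML example values in the hunk (`background_pool_size` 16, ratio 3) would give an upper bound of 48.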
--- a/docs/en/operations/settings/settings.md
+++ b/docs/en/operations/settings/settings.md
@ -1814,7 +1814,7 @@ ignoring check result for the source table, and will insert rows lost because of

 ## insert_deduplication_token {#insert_deduplication_token}

-The setting allows a user to provide own deduplication semantic in MergeTree/ReplicatedMergeTree  
+The setting allows a user to provide own deduplication semantic in MergeTree/ReplicatedMergeTree
 For example, by providing a unique value for the setting in each INSERT statement,
 user can avoid the same inserted data being deduplicated.

@ -1840,7 +1840,7 @@ INSERT INTO test_table Values SETTINGS insert_deduplication_token = 'test' (1);
 -- the next insert won't be deduplicated because insert_deduplication_token is different
 INSERT INTO test_table Values SETTINGS insert_deduplication_token = 'test1' (1);

-- the next insert will be deduplicated because insert_deduplication_token 
+-- the next insert will be deduplicated because insert_deduplication_token
 -- is the same as one of the previous
 INSERT INTO test_table Values SETTINGS insert_deduplication_token = 'test' (2);

@ -2427,17 +2427,6 @@ Possible values:

 Default value: 0.

-## background_pool_size {#background_pool_size}
-
-Sets the number of threads performing background operations in table engines (for example, merges in [MergeTree engine](../../engines/table-engines/mergetree-family/index.md) tables). This setting is applied from the `default` profile at the ClickHouse server start and can’t be changed in a user session. By adjusting this setting, you manage CPU and disk load. Smaller pool size utilizes less CPU and disk resources, but background processes advance slower which might eventually impact query performance.
-
-Before changing it, please also take a look at related [MergeTree settings](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-merge_tree), such as `number_of_free_entries_in_pool_to_lower_max_size_of_merge` and `number_of_free_entries_in_pool_to_execute_mutation`.
-
-Possible values:
-
-   Any positive integer.
-
-Default value: 16.

 ## merge_selecting_sleep_ms {#merge_selecting_sleep_ms}

--- a/docs/ja/operations/settings/settings.md
+++ b/docs/ja/operations/settings/settings.md
--- a/docs/ru/operations/settings/settings.md
+++ b/docs/ru/operations/settings/settings.md
@ -391,7 +391,7 @@ INSERT INTO test VALUES (lower('Hello')), (lower('world')), (lower('INSERT')), (

 ## input_format_tsv_enum_as_number {#settings-input_format_tsv_enum_as_number}

-Включает или отключает парсинг значений перечислений как порядковых номеров. 
+Включает или отключает парсинг значений перечислений как порядковых номеров.

 Если режим включен, то во входящих данных в формате `TCV` значения перечисления (тип `ENUM`) всегда трактуются как порядковые номера, а не как элементы перечисления. Эту настройку рекомендуется включать для оптимизации парсинга, если данные типа `ENUM` содержат только порядковые номера, а не сами элементы перечисления.

@ -1176,9 +1176,9 @@ SELECT type, query FROM system.query_log WHERE log_comment = 'log_comment test'

 Может быть использована для ограничения скорости сети при репликации данных для добавления или замены новых узлов.

-    :::note 
-    60000000 байт/с примерно соответствует 457 Мбит/с (60000000 / 1024 / 1024 * 8). 
-    :::
+!!! note "Note"
+    60000000 байт/с примерно соответствует 457 Мбит/с (60000000 / 1024 / 1024 * 8).
+
 ## max_replicated_sends_network_bandwidth_for_server {#max_replicated_sends_network_bandwidth_for_server}

 Ограничивает максимальную скорость обмена данными в сети (в байтах в секунду) для [репликационных](../../engines/table-engines/mergetree-family/replication.md) отправок. Применяется только при запуске сервера. Можно также ограничить скорость для конкретной таблицы с помощью настройки [max_replicated_sends_network_bandwidth](../../operations/settings/merge-tree-settings.md#max_replicated_sends_network_bandwidth).
@ -1196,9 +1196,9 @@ SELECT type, query FROM system.query_log WHERE log_comment = 'log_comment test'

 Может быть использована для ограничения скорости сети при репликации данных для добавления или замены новых узлов.

-    :::note 
-    60000000 байт/с примерно соответствует 457 Мбит/с (60000000 / 1024 / 1024 * 8). 
-    :::
+!!! note "Note"
+    60000000 байт/с примерно соответствует 457 Мбит/с (60000000 / 1024 / 1024 * 8).
+
 ## connect_timeout_with_failover_ms {#connect-timeout-with-failover-ms}

 Таймаут в миллисекундах на соединение с удалённым сервером, для движка таблиц Distributed, если используются секции shard и replica в описании кластера.
@ -1419,13 +1419,13 @@ load_balancing = round_robin

 Значение по умолчанию: `1`.

-**См. также** 
+**См. также**

 -   [min_count_to_compile_aggregate_expression](#min_count_to_compile_aggregate_expression)

 ## min_count_to_compile_aggregate_expression {#min_count_to_compile_aggregate_expression}

-Минимальное количество вызовов агрегатной функции с одинаковым выражением, при котором функция будет компилироваться в нативный код в ходе выполнения запроса. Работает только если включена настройка [compile_aggregate_expressions](#compile_aggregate_expressions).  
+Минимальное количество вызовов агрегатной функции с одинаковым выражением, при котором функция будет компилироваться в нативный код в ходе выполнения запроса. Работает только если включена настройка [compile_aggregate_expressions](#compile_aggregate_expressions).

 Возможные значения:

@ -1554,7 +1554,7 @@ SELECT area/period FROM account_orders FORMAT JSON;

 ## input_format_csv_enum_as_number {#settings-input_format_csv_enum_as_number}

-Включает или отключает парсинг значений перечислений как порядковых номеров. 
+Включает или отключает парсинг значений перечислений как порядковых номеров.
 Если режим включен, то во входящих данных в формате `CSV` значения перечисления (тип `ENUM`) всегда трактуются как порядковые номера, а не как элементы перечисления. Эту настройку рекомендуется включать для оптимизации парсинга, если данные типа `ENUM` содержат только порядковые номера, а не сами элементы перечисления.

 Возможные значения:
@ -1761,11 +1761,11 @@ SETTINGS non_replicated_deduplication_window = 100;

 INSERT INTO test_table Values SETTINGS insert_deduplication_token = 'test' (1);

-- следующая вставка не будет дедуплицирована, потому что insert_deduplication_token отличается 
+-- следующая вставка не будет дедуплицирована, потому что insert_deduplication_token отличается
 INSERT INTO test_table Values SETTINGS insert_deduplication_token = 'test1' (1);

 -- следующая вставка будет дедуплицирована, потому что insert_deduplication_token
-- тот же самый, что и один из предыдущих 
+-- тот же самый, что и один из предыдущих
 INSERT INTO test_table Values SETTINGS insert_deduplication_token = 'test' (2);

 SELECT * FROM test_table
@ -1868,7 +1868,7 @@ SELECT * FROM test_table

 ## distributed_push_down_limit {#distributed-push-down-limit}

-Включает или отключает [LIMIT](#limit), применяемый к каждому шарду по отдельности. 
+Включает или отключает [LIMIT](#limit), применяемый к каждому шарду по отдельности.

 Это позволяет избежать:
 - отправки дополнительных строк по сети;
@ -1993,7 +1993,7 @@ SELECT * FROM test_table

   - 0 — оптимизация отключена.
   - 1 — оптимизация включена.
-   
+
 Значение по умолчанию: `1`.

 См. также:
@ -2205,16 +2205,6 @@ SELECT * FROM test_table

 Значение по умолчанию: 32768 (32 KiB)

-## background_pool_size {#background_pool_size}
-
-Задает количество потоков для выполнения фоновых операций в движках таблиц (например, слияния в таблицах c движком [MergeTree](../../engines/table-engines/mergetree-family/index.md)). Настройка применяется при запуске сервера ClickHouse и не может быть изменена во пользовательском сеансе. Настройка позволяет управлять загрузкой процессора и диска. Чем меньше пул, тем ниже нагрузка на CPU и диск, при этом фоновые процессы работают с меньшей интенсивностью, что в конечном итоге может повлиять на производительность запросов, потому что сервер будет обрабатывать больше кусков.
-
-Допустимые значения:
-
-   Положительное целое число.
-
-Значение по умолчанию: 16.
-
 ## merge_selecting_sleep_ms {#merge_selecting_sleep_ms}

 Время ожидания для слияния выборки, если ни один кусок не выбран. Снижение времени ожидания приводит к частому выбору задач в пуле `background_schedule_pool` и увеличению количества запросов к Zookeeper в крупных кластерах.
@ -3679,7 +3669,7 @@ SETTINGS index_granularity = 8192 │

 ## max_hyperscan_regexp_length {#max-hyperscan-regexp-length}

-Задает максимальную длину каждого регулярного выражения в [hyperscan-функциях](../../sql-reference/functions/string-search-functions.md#multimatchanyhaystack-pattern1-pattern2-patternn)  поиска множественных совпадений в строке. 
+Задает максимальную длину каждого регулярного выражения в [hyperscan-функциях](../../sql-reference/functions/string-search-functions.md#multimatchanyhaystack-pattern1-pattern2-patternn)  поиска множественных совпадений в строке.

 Возможные значения:

--- a/docs/zh/operations/settings/settings.md
+++ b/docs/zh/operations/settings/settings.md
@ -1242,16 +1242,6 @@ ClickHouse生成异常

 默认值：空

-## background_pool_size {#background_pool_size}
-
-设置在表引擎中执行后台操作的线程数（例如，合并 [MergeTree引擎](../../engines/table-engines/mergetree-family/index.md) 表）。 此设置在ClickHouse服务器启动时应用，不能在用户会话中更改。 通过调整此设置，您可以管理CPU和磁盘负载。 较小的池大小使用较少的CPU和磁盘资源，但后台进程推进速度较慢，最终可能会影响查询性能。
-
-可能的值:
-
-   任何正整数。
-
-默认值：16。
-
 [原始文章](https://clickhouse.com/docs/en/operations/settings/settings/) <!-- hide -->

 ## transform_null_in {#transform_null_in}
--- a/programs/server/Server.cpp
+++ b/programs/server/Server.cpp
@ -685,7 +685,7 @@ int Server::main(const std::vector<std::string> & /*args*/)
            std::vector<ProtocolServerMetrics> metrics;
            metrics.reserve(servers_to_start_before_tables.size());
            for (const auto & server : servers_to_start_before_tables)
-                metrics.emplace_back(ProtocolServerMetrics{server.getPortName(), server.currentThreads()});
+            metrics.emplace_back(ProtocolServerMetrics{server.getPortName(), server.currentThreads()});

            std::lock_guard lock(servers_lock);
            for (const auto & server : servers)
--- a/src/Storages/MergeTree/MergeTreeBackgroundExecutor.h
+++ b/src/Storages/MergeTree/MergeTreeBackgroundExecutor.h
@ -190,7 +190,7 @@ public:

    /// Handler for hot-reloading
    /// Supports only increasing the number of threads and tasks, because
-    /// implemeting tasks eviction will definitely be too error-prone and buggy.
+    /// implementing tasks eviction will definitely be too error-prone and buggy.
    void increaseThreadsAndMaxTasksCount(size_t new_threads_count, size_t new_max_tasks_count);

    bool trySchedule(ExecutableTaskPtr task);
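
Reviewer note: the comment on `increaseThreadsAndMaxTasksCount` says the hot-reload handler supports only increasing the number of threads and tasks, which matches the "increase at runtime, restart to lower" wording in the new docs. Below is a minimal sketch of that grow-only contract. The class `GrowOnlyLimits` and its method `tryIncrease` are hypothetical names for illustration; this is not the `MergeTreeBackgroundExecutor` implementation, and how the real handler reacts to a smaller value is not shown in this diff.

```cpp
#include <cstddef>

// Hypothetical stand-in for an executor whose limits may only grow at runtime.
class GrowOnlyLimits
{
public:
    GrowOnlyLimits(std::size_t threads, std::size_t max_tasks)
        : threads_count(threads), max_tasks_count(max_tasks)
    {
    }

    // Applies the new limits only if neither value shrinks.
    // Returns false and leaves the current limits untouched otherwise
    // (an assumption of this sketch; a restart would be needed to lower them).
    bool tryIncrease(std::size_t new_threads_count, std::size_t new_max_tasks_count)
    {
        if (new_threads_count < threads_count || new_max_tasks_count < max_tasks_count)
            return false;

        threads_count = new_threads_count;
        max_tasks_count = new_max_tasks_count;
        return true;
    }

    std::size_t threads() const { return threads_count; }
    std::size_t maxTasks() const { return max_tasks_count; }

private:
    std::size_t threads_count;
    std::size_t max_tasks_count;
};
```

Rejecting a decrease outright keeps the sketch free of any task-eviction logic, which is the part the header comment calls error-prone.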