From c7ed73ea27ea2da8516401cabb0711ab3d5bb5a0 Mon Sep 17 00:00:00 2001 From: ogorbacheva Date: Thu, 31 Jan 2019 15:23:18 +0300 Subject: [PATCH] fix settings default values (#4204) --- .../en/operations/server_settings/settings.md | 7 ++--- .../operations/settings/query_complexity.md | 2 +- docs/en/operations/settings/settings.md | 30 +++++-------------- .../ru/operations/server_settings/settings.md | 2 +- docs/ru/operations/settings/settings.md | 25 +++++----------- .../zh/operations/server_settings/settings.md | 5 ++-- .../operations/settings/query_complexity.md | 2 +- docs/zh/operations/settings/settings.md | 27 +++++------------ 8 files changed, 31 insertions(+), 69 deletions(-) diff --git a/docs/en/operations/server_settings/settings.md b/docs/en/operations/server_settings/settings.md index fe4330fafe4..451e3059972 100644 --- a/docs/en/operations/server_settings/settings.md +++ b/docs/en/operations/server_settings/settings.md @@ -262,12 +262,12 @@ Useful for breaking away from a specific network interface. ## keep_alive_timeout -The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 10 seconds +The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 3 seconds. **Example** ```xml -10 +3 ``` @@ -326,8 +326,7 @@ Keys: - user_syslog — Required setting if you want to write to the syslog. - address — The host[:порт] of syslogd. If omitted, the local daemon is used. - hostname — Optional. The name of the host that logs are sent from. -- facility — [The syslog facility keyword](https://en.wikipedia.org/wiki/Syslog#Facility) -in uppercase letters with the "LOG_" prefix: (``LOG_USER``, ``LOG_DAEMON``, ``LOG_LOCAL3``, and so on). +- facility — [The syslog facility keyword](https://en.wikipedia.org/wiki/Syslog#Facility) in uppercase letters with the "LOG_" prefix: (``LOG_USER``, ``LOG_DAEMON``, ``LOG_LOCAL3``, and so on). Default value: ``LOG_USER`` if ``address`` is specified, ``LOG_DAEMON otherwise.`` - format – Message format. Possible values: ``bsd`` and ``syslog.`` diff --git a/docs/en/operations/settings/query_complexity.md b/docs/en/operations/settings/query_complexity.md index af982e243ec..4c28b53b161 100644 --- a/docs/en/operations/settings/query_complexity.md +++ b/docs/en/operations/settings/query_complexity.md @@ -144,7 +144,7 @@ At this time, it isn't checked during parsing, but only after parsing the query. ## max_ast_elements Maximum number of elements in a query syntactic tree. If exceeded, an exception is thrown. -In the same way as the previous setting, it is checked only after parsing the query. By default, 10,000. +In the same way as the previous setting, it is checked only after parsing the query. By default, 50,000. ## max_rows_in_set diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md index c3a99080627..836a13baeb0 100644 --- a/docs/en/operations/settings/settings.md +++ b/docs/en/operations/settings/settings.md @@ -111,7 +111,7 @@ Blocks the size of `max_block_size` are not always loaded from the table. If it Used for the same purpose as `max_block_size`, but it sets the recommended block size in bytes by adapting it to the number of rows in the block. However, the block size cannot be more than `max_block_size` rows. -Disabled by default (set to 0). It only works when reading from MergeTree engines. +By default: 1,000,000. It only works when reading from MergeTree engines. ## merge_tree_uniform_read_distribution {#setting-merge_tree_uniform_read_distribution} @@ -192,7 +192,7 @@ Disables lagging replicas for distributed queries. See "[Replication](../../oper Sets the time in seconds. If a replica lags more than the set value, this replica is not used. -Default value: 0 (off). +Default value: 300. Used when performing `SELECT` from a distributed table that points to replicated tables. @@ -205,7 +205,7 @@ The maximum number of query processing threads This parameter applies to threads that perform the same stages of the query processing pipeline in parallel. For example, if reading from a table, evaluating expressions with functions, filtering with WHERE and pre-aggregating for GROUP BY can all be done in parallel using at least 'max_threads' number of threads, then 'max_threads' are used. -By default, 8. +By default, 2. If less than one SELECT query is normally run on a server at a time, set this parameter to a value slightly less than the actual number of processor cores. @@ -246,11 +246,7 @@ The interval in microseconds for checking whether request execution has been can By default, 100,000 (check for canceling and send progress ten times per second). -## connect_timeout - -## receive_timeout - -## send_timeout +## connect_timeout, receive_timeout, send_timeout Timeouts in seconds on the socket used for communicating with the client. @@ -266,7 +262,7 @@ By default, 10. The maximum number of simultaneous connections with remote servers for distributed processing of a single query to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster. -By default, 100. +By default, 1024. The following parameters are only used when creating Distributed tables (and when launching a server), so there is no reason to change them at runtime. @@ -274,7 +270,7 @@ The following parameters are only used when creating Distributed tables (and whe The maximum number of simultaneous connections with remote servers for distributed processing of all queries to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster. -By default, 128. +By default, 1024. ## connect_timeout_with_failover_ms @@ -294,10 +290,9 @@ By default, 3. Whether to count extreme values (the minimums and maximums in columns of a query result). Accepts 0 or 1. By default, 0 (disabled). For more information, see the section "Extreme values". - ## use_uncompressed_cache {#setting-use_uncompressed_cache} -Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 0 (disabled). +Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 1 (enabled). The uncompressed cache (only for tables in the MergeTree family) allows significantly reducing latency and increasing throughput when working with a large number of short queries. Enable this setting for users who send frequent short requests. Also pay attention to the [uncompressed_cache_size](../server_settings/settings.md#server-settings-uncompressed_cache_size) configuration parameter (only set in the config file) – the size of uncompressed cache blocks. By default, it is 8 GiB. The uncompressed cache is filled in as needed; the least-used data is automatically deleted. For queries that read at least a somewhat large volume of data (one million rows or more), the uncompressed cache is disabled automatically in order to save space for truly small queries. So you can keep the 'use_uncompressed_cache' setting always set to 1. @@ -358,16 +353,9 @@ See the section "WITH TOTALS modifier". ## totals_auto_threshold -The threshold for ` totals_mode = 'auto'`. +The threshold for `totals_mode = 'auto'`. See the section "WITH TOTALS modifier". -## default_sample - -Floating-point number from 0 to 1. By default, 1. -Allows you to set the default sampling ratio for all SELECT queries. -(For tables that do not support sampling, it throws an exception.) -If set to 1, sampling is not performed by default. - ## max_parallel_replicas The maximum number of replicas for each shard when executing a query. @@ -403,14 +391,12 @@ If the value is true, integers appear in quotes when using JSON\* Int64 and UInt The character interpreted as a delimiter in the CSV data. By default, the delimiter is `,`. - ## join_use_nulls Affects the behavior of [JOIN](../../query_language/select.md). With `join_use_nulls=1,` `JOIN` behaves like in standard SQL, i.e. if empty cells appear when merging, the type of the corresponding field is converted to [Nullable](../../data_types/nullable.md#data_type-nullable), and empty cells are filled with [NULL](../../query_language/syntax.md). - ## insert_quorum Enables quorum writes. diff --git a/docs/ru/operations/server_settings/settings.md b/docs/ru/operations/server_settings/settings.md index 75008f875d5..50e8ea0ec75 100644 --- a/docs/ru/operations/server_settings/settings.md +++ b/docs/ru/operations/server_settings/settings.md @@ -268,7 +268,7 @@ ClickHouse проверит условия `min_part_size` и `min_part_size_rat **Пример** ```xml -10 +3 ``` diff --git a/docs/ru/operations/settings/settings.md b/docs/ru/operations/settings/settings.md index 169dc6c0823..7f3cc3c9c77 100644 --- a/docs/ru/operations/settings/settings.md +++ b/docs/ru/operations/settings/settings.md @@ -93,7 +93,7 @@ ClickHouse применяет настройку в тех случаях, ко Служит для тех же целей что и `max_block_size`, но задает реккомедуемый размер блоков в байтах, выбирая адаптивное количество строк в блоке. При этом размер блока не может быть более `max_block_size` строк. -По умолчанию выключен (равен 0), работает только при чтении из MergeTree-движков. +Значение по умолчанию: 1,000,000. Работает только при чтении из MergeTree-движков. ## log_queries @@ -124,7 +124,7 @@ ClickHouse применяет настройку в тех случаях, ко Устанавливает время в секундах. Если оставание реплики больше установленного значения, то реплика не используется. -Значение по умолчанию: 0 (отключено). +Значение по умолчанию: 300. Используется при выполнении `SELECT` из распределенной таблицы, которая указывает на реплицированные таблицы. @@ -136,7 +136,7 @@ ClickHouse применяет настройку в тех случаях, ко Этот параметр относится к потокам, которые выполняют параллельно одни стадии конвейера выполнения запроса. Например, если чтение из таблицы, вычисление выражений с функциями, фильтрацию с помощью WHERE и предварительную агрегацию для GROUP BY можно делать параллельно с использованием как минимум max_threads потоков, то будет использовано max_threads потоков. -По умолчанию - 8. +По умолчанию - 2. Если на сервере обычно исполняется менее одного запроса SELECT одновременно, то выставите этот параметр в значение чуть меньше количества реальных процессорных ядер. @@ -176,11 +176,7 @@ ClickHouse применяет настройку в тех случаях, ко По умолчанию - 100 000 (проверять остановку запроса и отправлять прогресс десять раз в секунду). -## connect_timeout - -## receive_timeout - -## send_timeout +## connect_timeout, receive_timeout, send_timeout Таймауты в секундах на сокет, по которому идёт общение с клиентом. @@ -196,7 +192,7 @@ ClickHouse применяет настройку в тех случаях, ко Максимальное количество одновременных соединений с удалёнными серверами при распределённой обработке одного запроса к одной таблице типа Distributed. Рекомендуется выставлять не меньше, чем количество серверов в кластере. -По умолчанию - 100. +По умолчанию - 1024. Следующие параметры имеют значение только на момент создания таблицы типа Distributed (и при запуске сервера), поэтому их не имеет смысла менять в рантайме. @@ -204,7 +200,7 @@ ClickHouse применяет настройку в тех случаях, ко Максимальное количество одновременных соединений с удалёнными серверами при распределённой обработке всех запросов к одной таблице типа Distributed. Рекомендуется выставлять не меньше, чем количество серверов в кластере. -По умолчанию - 128. +По умолчанию - 1024. ## connect_timeout_with_failover_ms @@ -227,7 +223,7 @@ ClickHouse применяет настройку в тех случаях, ко ## use_uncompressed_cache -Использовать ли кэш разжатых блоков. Принимает 0 или 1. По умолчанию - 0 (выключено). +Использовать ли кэш разжатых блоков. Принимает 0 или 1. По умолчанию - 1 (включено). Кэш разжатых блоков (только для таблиц семейства MergeTree) позволяет существенно уменьшить задержки и увеличить пропускную способность при обработке большого количества коротких запросов. Включите эту настройку для пользователей, от которых идут частые короткие запросы. Также обратите внимание на конфигурационный параметр uncompressed_cache_size (настраивается только в конфигурационном файле) - размер кэша разжатых блоков. По умолчанию - 8 GiB. Кэш разжатых блоков заполняется по мере надобности; наиболее невостребованные данные автоматически удаляются. Для запросов, читающих хоть немного приличный объём данных (миллион строк и больше), кэш разжатых блоков автоматически выключается, чтобы оставить место для действительно мелких запросов. Поэтому, можно держать настройку use_uncompressed_cache всегда выставленной в 1. @@ -288,13 +284,6 @@ ClickHouse применяет настройку в тех случаях, ко Порог для `totals_mode = 'auto'`. Смотрите раздел "Модификатор WITH TOTALS". -## default_sample - -Число с плавающей запятой от 0 до 1. По умолчанию - 1. -Позволяет выставить коэффициент сэмплирования по умолчанию для всех запросов SELECT. -(Для таблиц, не поддерживающих сэмплирование, будет кидаться исключение.) -Если равно 1 - сэмплирование по умолчанию не делается. - ## max_parallel_replicas Максимальное количество используемых реплик каждого шарда при выполнении запроса. diff --git a/docs/zh/operations/server_settings/settings.md b/docs/zh/operations/server_settings/settings.md index 5b86bc068c5..c30ac68525e 100644 --- a/docs/zh/operations/server_settings/settings.md +++ b/docs/zh/operations/server_settings/settings.md @@ -259,15 +259,14 @@ Useful for breaking away from a specific network interface. example.yandex.ru ``` - ## keep_alive_timeout -The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 10 seconds +The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 3 seconds. **Example** ```xml -10 +3 ``` diff --git a/docs/zh/operations/settings/query_complexity.md b/docs/zh/operations/settings/query_complexity.md index eb8e722e887..0250a37685e 100644 --- a/docs/zh/operations/settings/query_complexity.md +++ b/docs/zh/operations/settings/query_complexity.md @@ -152,7 +152,7 @@ At this time, it isn't checked during parsing, but only after parsing the query. ## max_ast_elements Maximum number of elements in a query syntactic tree. If exceeded, an exception is thrown. -In the same way as the previous setting, it is checked only after parsing the query. By default, 10,000. +In the same way as the previous setting, it is checked only after parsing the query. By default, 50,000. ## max_rows_in_set diff --git a/docs/zh/operations/settings/settings.md b/docs/zh/operations/settings/settings.md index 4a40828babb..e6fd9315e86 100644 --- a/docs/zh/operations/settings/settings.md +++ b/docs/zh/operations/settings/settings.md @@ -93,7 +93,7 @@ Blocks the size of `max_block_size` are not always loaded from the table. If it Used for the same purpose as `max_block_size`, but it sets the recommended block size in bytes by adapting it to the number of rows in the block. However, the block size cannot be more than `max_block_size` rows. -Disabled by default (set to 0). It only works when reading from MergeTree engines. +By default: 1,000,000. It only works when reading from MergeTree engines. ## log_queries @@ -124,7 +124,7 @@ Disables lagging replicas for distributed queries. See "[Replication](../../oper Sets the time in seconds. If a replica lags more than the set value, this replica is not used. -Default value: 0 (off). +Default value: 300. Used when performing `SELECT` from a distributed table that points to replicated tables. @@ -137,7 +137,7 @@ The maximum number of query processing threads This parameter applies to threads that perform the same stages of the query processing pipeline in parallel. For example, if reading from a table, evaluating expressions with functions, filtering with WHERE and pre-aggregating for GROUP BY can all be done in parallel using at least 'max_threads' number of threads, then 'max_threads' are used. -By default, 8. +By default, 2. If less than one SELECT query is normally run on a server at a time, set this parameter to a value slightly less than the actual number of processor cores. @@ -178,11 +178,7 @@ The interval in microseconds for checking whether request execution has been can By default, 100,000 (check for canceling and send progress ten times per second). -## connect_timeout - -## receive_timeout - -## send_timeout +## connect_timeout, receive_timeout, send_timeout Timeouts in seconds on the socket used for communicating with the client. @@ -198,7 +194,7 @@ By default, 10. The maximum number of simultaneous connections with remote servers for distributed processing of a single query to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster. -By default, 100. +By default, 1024. The following parameters are only used when creating Distributed tables (and when launching a server), so there is no reason to change them at runtime. @@ -206,7 +202,7 @@ The following parameters are only used when creating Distributed tables (and whe The maximum number of simultaneous connections with remote servers for distributed processing of all queries to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster. -By default, 128. +By default, 1024. ## connect_timeout_with_failover_ms @@ -229,7 +225,7 @@ For more information, see the section "Extreme values". ## use_uncompressed_cache -Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 0 (disabled). +Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 1 (enabled). The uncompressed cache (only for tables in the MergeTree family) allows significantly reducing latency and increasing throughput when working with a large number of short queries. Enable this setting for users who send frequent short requests. Also pay attention to the 'uncompressed_cache_size' configuration parameter (only set in the config file) – the size of uncompressed cache blocks. By default, it is 8 GiB. The uncompressed cache is filled in as needed; the least-used data is automatically deleted. For queries that read at least a somewhat large volume of data (one million rows or more), the uncompressed cache is disabled automatically in order to save space for truly small queries. So you can keep the 'use_uncompressed_cache' setting always set to 1. @@ -290,16 +286,9 @@ See the section "WITH TOTALS modifier". ## totals_auto_threshold -The threshold for ` totals_mode = 'auto'`. +The threshold for `totals_mode = 'auto'`. See the section "WITH TOTALS modifier". -## default_sample - -Floating-point number from 0 to 1. By default, 1. -Allows you to set the default sampling ratio for all SELECT queries. -(For tables that do not support sampling, it throws an exception.) -If set to 1, sampling is not performed by default. - ## max_parallel_replicas The maximum number of replicas for each shard when executing a query.