Merge branch 'master' of github.com:yandex/ClickHouse

2024-11-23 08:02:02 +00:00 · 2019-01-31 16:54:32 +03:00 · 2019-01-31 16:54:32 +03:00 · 1a69ccfa8f
commit 1a69ccfa8f
parent 4675c0bd29 c7ed73ea27
8 changed files with 31 additions and 69 deletions
--- a/docs/en/operations/server_settings/settings.md
+++ b/docs/en/operations/server_settings/settings.md
@@ -262,12 +262,12 @@ Useful for breaking away from a specific network interface.

 ## keep_alive_timeout

-The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 10 seconds
+The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 3 seconds.

 **Example**

 ```xml
-<keep_alive_timeout>10</keep_alive_timeout>
+<keep_alive_timeout>3</keep_alive_timeout>
 ```


@@ -326,8 +326,7 @@ Keys:
 - user_syslog — Required setting if you want to write to the syslog.
 - address — The host[:порт] of syslogd. If omitted, the local daemon is used.
 - hostname — Optional. The name of the host that logs are sent from.
- facility — [The syslog facility keyword](https://en.wikipedia.org/wiki/Syslog#Facility)
-in uppercase letters with the "LOG_" prefix: (``LOG_USER``, ``LOG_DAEMON``, ``LOG_LOCAL3``, and so on).
+- facility — [The syslog facility keyword](https://en.wikipedia.org/wiki/Syslog#Facility) in uppercase letters with the "LOG_" prefix: (``LOG_USER``, ``LOG_DAEMON``, ``LOG_LOCAL3``, and so on).
 Default value: ``LOG_USER`` if ``address`` is specified, ``LOG_DAEMON otherwise.``
 - format – Message format. Possible values: ``bsd`` and ``syslog.``

--- a/docs/en/operations/settings/query_complexity.md
+++ b/docs/en/operations/settings/query_complexity.md
@@ -144,7 +144,7 @@ At this time, it isn't checked during parsing, but only after parsing the query.
 ## max_ast_elements

 Maximum number of elements in a query syntactic tree. If exceeded, an exception is thrown.
-In the same way as the previous setting, it is checked only after parsing the query. By default, 10,000.
+In the same way as the previous setting, it is checked only after parsing the query. By default, 50,000.

 ## max_rows_in_set

--- a/docs/en/operations/settings/settings.md
+++ b/docs/en/operations/settings/settings.md
@@ -111,7 +111,7 @@ Blocks the size of `max_block_size` are not always loaded from the table. If it

 Used for the same purpose as `max_block_size`, but it sets the recommended block size in bytes by adapting it to the number of rows in the block.
 However, the block size cannot be more than `max_block_size` rows.
-Disabled by default (set to 0). It only works when reading from MergeTree engines.
+By default: 1,000,000. It only works when reading from MergeTree engines.

 ## merge_tree_uniform_read_distribution {#setting-merge_tree_uniform_read_distribution}

@@ -192,7 +192,7 @@ Disables lagging replicas for distributed queries. See "[Replication](../../oper

 Sets the time in seconds. If a replica lags more than the set value, this replica is not used.

-Default value: 0 (off).
+Default value: 300.

 Used when performing `SELECT` from a distributed table that points to replicated tables.

@@ -205,7 +205,7 @@ The maximum number of query processing threads
 This parameter applies to threads that perform the same stages of the query processing pipeline in parallel.
 For example, if reading from a table, evaluating expressions with functions, filtering with WHERE and pre-aggregating for GROUP BY can all be done in parallel using at least 'max_threads' number of threads, then 'max_threads' are used.

-By default, 8.
+By default, 2.

 If less than one SELECT query is normally run on a server at a time, set this parameter to a value slightly less than the actual number of processor cores.

@@ -246,11 +246,7 @@ The interval in microseconds for checking whether request execution has been can

 By default, 100,000 (check for canceling and send progress ten times per second).

-## connect_timeout
-
-## receive_timeout
-
-## send_timeout
+## connect_timeout, receive_timeout, send_timeout

 Timeouts in seconds on the socket used for communicating with the client.

@@ -266,7 +262,7 @@ By default, 10.

 The maximum number of simultaneous connections with remote servers for distributed processing of a single query to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster.

-By default, 100.
+By default, 1024.

 The following parameters are only used when creating Distributed tables (and when launching a server), so there is no reason to change them at runtime.

@@ -274,7 +270,7 @@ The following parameters are only used when creating Distributed tables (and whe

 The maximum number of simultaneous connections with remote servers for distributed processing of all queries to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster.

-By default, 128.
+By default, 1024.

 ## connect_timeout_with_failover_ms

@@ -294,10 +290,9 @@ By default, 3.
 Whether to count extreme values (the minimums and maximums in columns of a query result). Accepts 0 or 1. By default, 0 (disabled).
 For more information, see the section "Extreme values".

-
 ## use_uncompressed_cache {#setting-use_uncompressed_cache}

-Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 0 (disabled).
+Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 1 (enabled).
 The uncompressed cache (only for tables in the MergeTree family) allows significantly reducing latency and increasing throughput when working with a large number of short queries. Enable this setting for users who send frequent short requests. Also pay attention to the [uncompressed_cache_size](../server_settings/settings.md#server-settings-uncompressed_cache_size) configuration parameter (only set in the config file) – the size of uncompressed cache blocks. By default, it is 8 GiB. The uncompressed cache is filled in as needed; the least-used data is automatically deleted.

 For queries that read at least a somewhat large volume of data (one million rows or more), the uncompressed cache is disabled automatically in order to save space for truly small queries. So you can keep the 'use_uncompressed_cache' setting always set to 1.
@@ -358,16 +353,9 @@ See the section "WITH TOTALS modifier".

 ## totals_auto_threshold

-The threshold for ` totals_mode = 'auto'`.
+The threshold for `totals_mode = 'auto'`.
 See the section "WITH TOTALS modifier".

-## default_sample
-
-Floating-point number from 0 to 1. By default, 1.
-Allows you to set the default sampling ratio for all SELECT queries.
-(For tables that do not support sampling, it throws an exception.)
-If set to 1, sampling is not performed by default.
-
 ## max_parallel_replicas

 The maximum number of replicas for each shard when executing a query.
@@ -403,14 +391,12 @@ If the value is true, integers appear in quotes when using JSON\* Int64 and UInt

 The character interpreted as a delimiter in the CSV data. By default, the delimiter is `,`.

-
 ## join_use_nulls

 Affects the behavior of [JOIN](../../query_language/select.md).

 With `join_use_nulls=1,` `JOIN` behaves like in standard SQL, i.e. if empty cells appear when merging, the type of the corresponding field is converted to [Nullable](../../data_types/nullable.md#data_type-nullable), and empty cells are filled with [NULL](../../query_language/syntax.md).

-
 ## insert_quorum

 Enables quorum writes.
--- a/docs/ru/operations/server_settings/settings.md
+++ b/docs/ru/operations/server_settings/settings.md
@@ -268,7 +268,7 @@ ClickHouse проверит условия `min_part_size` и `min_part_size_rat
 **Пример**

 ```xml
-<keep_alive_timeout>10</keep_alive_timeout>
+<keep_alive_timeout>3</keep_alive_timeout>
 ```


--- a/docs/ru/operations/settings/settings.md
+++ b/docs/ru/operations/settings/settings.md
@@ -93,7 +93,7 @@ ClickHouse применяет настройку в тех случаях, ко

 Служит для тех же целей что и `max_block_size`, но задает реккомедуемый размер блоков в байтах, выбирая адаптивное количество строк в блоке.
 При этом размер блока не может быть более `max_block_size` строк.
-По умолчанию выключен (равен 0), работает только при чтении из MergeTree-движков.
+Значение по умолчанию: 1,000,000. Работает только при чтении из MergeTree-движков.


 ## log_queries
@@ -124,7 +124,7 @@ ClickHouse применяет настройку в тех случаях, ко

 Устанавливает время в секундах. Если оставание реплики больше установленного значения, то реплика не используется.

-Значение по умолчанию: 0 (отключено).
+Значение по умолчанию: 300.

 Используется при выполнении `SELECT` из распределенной таблицы, которая указывает на реплицированные таблицы.

@@ -136,7 +136,7 @@ ClickHouse применяет настройку в тех случаях, ко
 Этот параметр относится к потокам, которые выполняют параллельно одни стадии конвейера выполнения запроса.
 Например, если чтение из таблицы, вычисление выражений с функциями, фильтрацию с помощью WHERE и предварительную агрегацию для GROUP BY можно делать параллельно с использованием как минимум max_threads потоков, то будет использовано max_threads потоков.

-По умолчанию - 8.
+По умолчанию - 2.

 Если на сервере обычно исполняется менее одного запроса SELECT одновременно, то выставите этот параметр в значение чуть меньше количества реальных процессорных ядер.

@@ -176,11 +176,7 @@ ClickHouse применяет настройку в тех случаях, ко

 По умолчанию - 100 000 (проверять остановку запроса и отправлять прогресс десять раз в секунду).

-## connect_timeout
-
-## receive_timeout
-
-## send_timeout
+## connect_timeout, receive_timeout, send_timeout

 Таймауты в секундах на сокет, по которому идёт общение с клиентом.

@@ -196,7 +192,7 @@ ClickHouse применяет настройку в тех случаях, ко

 Максимальное количество одновременных соединений с удалёнными серверами при распределённой обработке одного запроса к одной таблице типа Distributed. Рекомендуется выставлять не меньше, чем количество серверов в кластере.

-По умолчанию - 100.
+По умолчанию - 1024.

 Следующие параметры имеют значение только на момент создания таблицы типа Distributed (и при запуске сервера), поэтому их не имеет смысла менять в рантайме.

@@ -204,7 +200,7 @@ ClickHouse применяет настройку в тех случаях, ко

 Максимальное количество одновременных соединений с удалёнными серверами при распределённой обработке всех запросов к одной таблице типа Distributed. Рекомендуется выставлять не меньше, чем количество серверов в кластере.

-По умолчанию - 128.
+По умолчанию - 1024.

 ## connect_timeout_with_failover_ms

@@ -227,7 +223,7 @@ ClickHouse применяет настройку в тех случаях, ко

 ## use_uncompressed_cache

-Использовать ли кэш разжатых блоков. Принимает 0 или 1. По умолчанию - 0 (выключено).
+Использовать ли кэш разжатых блоков. Принимает 0 или 1. По умолчанию - 1 (включено).
 Кэш разжатых блоков (только для таблиц семейства MergeTree) позволяет существенно уменьшить задержки и увеличить пропускную способность при обработке большого количества коротких запросов. Включите эту настройку для пользователей, от которых идут частые короткие запросы. Также обратите внимание на конфигурационный параметр uncompressed_cache_size (настраивается только в конфигурационном файле) - размер кэша разжатых блоков. По умолчанию - 8 GiB. Кэш разжатых блоков заполняется по мере надобности; наиболее невостребованные данные автоматически удаляются.

 Для запросов, читающих хоть немного приличный объём данных (миллион строк и больше), кэш разжатых блоков автоматически выключается, чтобы оставить место для действительно мелких запросов. Поэтому, можно держать настройку use_uncompressed_cache всегда выставленной в 1.
@@ -288,13 +284,6 @@ ClickHouse применяет настройку в тех случаях, ко
 Порог для `totals_mode = 'auto'`.
 Смотрите раздел "Модификатор WITH TOTALS".

-## default_sample
-
-Число с плавающей запятой от 0 до 1. По умолчанию - 1.
-Позволяет выставить коэффициент сэмплирования по умолчанию для всех запросов SELECT.
-(Для таблиц, не поддерживающих сэмплирование, будет кидаться исключение.)
-Если равно 1 - сэмплирование по умолчанию не делается.
-
 ## max_parallel_replicas

 Максимальное количество используемых реплик каждого шарда при выполнении запроса.
--- a/docs/zh/operations/server_settings/settings.md
+++ b/docs/zh/operations/server_settings/settings.md
@@ -259,15 +259,14 @@ Useful for breaking away from a specific network interface.
 <interserver_http_host>example.yandex.ru</interserver_http_host>
 ```

-
 ## keep_alive_timeout

-The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 10 seconds
+The number of seconds that ClickHouse waits for incoming requests before closing the connection. Defaults to 3 seconds.

 **Example**

 ```xml
-<keep_alive_timeout>10</keep_alive_timeout>
+<keep_alive_timeout>3</keep_alive_timeout>
 ```


--- a/docs/zh/operations/settings/query_complexity.md
+++ b/docs/zh/operations/settings/query_complexity.md
@@ -152,7 +152,7 @@ At this time, it isn't checked during parsing, but only after parsing the query.
 ## max_ast_elements

 Maximum number of elements in a query syntactic tree. If exceeded, an exception is thrown.
-In the same way as the previous setting, it is checked only after parsing the query. By default, 10,000.
+In the same way as the previous setting, it is checked only after parsing the query. By default, 50,000.

 ## max_rows_in_set

--- a/docs/zh/operations/settings/settings.md
+++ b/docs/zh/operations/settings/settings.md
@@ -93,7 +93,7 @@ Blocks the size of `max_block_size` are not always loaded from the table. If it

 Used for the same purpose as `max_block_size`, but it sets the recommended block size in bytes by adapting it to the number of rows in the block.
 However, the block size cannot be more than `max_block_size` rows.
-Disabled by default (set to 0). It only works when reading from MergeTree engines.
+By default: 1,000,000. It only works when reading from MergeTree engines.


 ## log_queries
@@ -124,7 +124,7 @@ Disables lagging replicas for distributed queries. See "[Replication](../../oper

 Sets the time in seconds. If a replica lags more than the set value, this replica is not used.

-Default value: 0 (off).
+Default value: 300.

 Used when performing `SELECT` from a distributed table that points to replicated tables.

@@ -137,7 +137,7 @@ The maximum number of query processing threads
 This parameter applies to threads that perform the same stages of the query processing pipeline in parallel.
 For example, if reading from a table, evaluating expressions with functions, filtering with WHERE and pre-aggregating for GROUP BY can all be done in parallel using at least 'max_threads' number of threads, then 'max_threads' are used.

-By default, 8.
+By default, 2.

 If less than one SELECT query is normally run on a server at a time, set this parameter to a value slightly less than the actual number of processor cores.

@@ -178,11 +178,7 @@ The interval in microseconds for checking whether request execution has been can

 By default, 100,000 (check for canceling and send progress ten times per second).

-## connect_timeout
-
-## receive_timeout
-
-## send_timeout
+## connect_timeout, receive_timeout, send_timeout

 Timeouts in seconds on the socket used for communicating with the client.

@@ -198,7 +194,7 @@ By default, 10.

 The maximum number of simultaneous connections with remote servers for distributed processing of a single query to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster.

-By default, 100.
+By default, 1024.

 The following parameters are only used when creating Distributed tables (and when launching a server), so there is no reason to change them at runtime.

@@ -206,7 +202,7 @@ The following parameters are only used when creating Distributed tables (and whe

 The maximum number of simultaneous connections with remote servers for distributed processing of all queries to a single Distributed table. We recommend setting a value no less than the number of servers in the cluster.

-By default, 128.
+By default, 1024.

 ## connect_timeout_with_failover_ms

@@ -229,7 +225,7 @@ For more information, see the section "Extreme values".

 ## use_uncompressed_cache

-Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 0 (disabled).
+Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 1 (enabled).
 The uncompressed cache (only for tables in the MergeTree family) allows significantly reducing latency and increasing throughput when working with a large number of short queries. Enable this setting for users who send frequent short requests. Also pay attention to the 'uncompressed_cache_size' configuration parameter (only set in the config file) – the size of uncompressed cache blocks. By default, it is 8 GiB. The uncompressed cache is filled in as needed; the least-used data is automatically deleted.

 For queries that read at least a somewhat large volume of data (one million rows or more), the uncompressed cache is disabled automatically in order to save space for truly small queries. So you can keep the 'use_uncompressed_cache' setting always set to 1.
@@ -290,16 +286,9 @@ See the section "WITH TOTALS modifier".

 ## totals_auto_threshold

-The threshold for ` totals_mode = 'auto'`.
+The threshold for `totals_mode = 'auto'`.
 See the section "WITH TOTALS modifier".

-## default_sample
-
-Floating-point number from 0 to 1. By default, 1.
-Allows you to set the default sampling ratio for all SELECT queries.
-(For tables that do not support sampling, it throws an exception.)
-If set to 1, sampling is not performed by default.
-
 ## max_parallel_replicas

 The maximum number of replicas for each shard when executing a query.