diff --git a/CHANGELOG.md b/CHANGELOG.md
index d2cc3e51997..e2c777b3bcf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,148 @@
+## ClickHouse release 21.2
+
+### ClickHouse release v21.2.2.8-stable, 2021-02-07
+
+#### Backward Incompatible Change
+
+* Bitwise functions (`bitAnd`, `bitOr`, etc.) are forbidden for floating point arguments. Now you have to cast explicitly to integer (see the sketch after this list). [#19853](https://github.com/ClickHouse/ClickHouse/pull/19853) ([Azat Khuzhin](https://github.com/azat)).
+* Forbid `lcm`/`gcd` for floats. [#19532](https://github.com/ClickHouse/ClickHouse/pull/19532) ([Azat Khuzhin](https://github.com/azat)).
+* Fix memory tracking for `OPTIMIZE TABLE`/merges; account query memory limits and sampling for `OPTIMIZE TABLE`/merges. [#18772](https://github.com/ClickHouse/ClickHouse/pull/18772) ([Azat Khuzhin](https://github.com/azat)).
+* Disallow floating point column as partition key, see [#18421](https://github.com/ClickHouse/ClickHouse/issues/18421#event-4147046255). [#18464](https://github.com/ClickHouse/ClickHouse/pull/18464) ([hexiaoting](https://github.com/hexiaoting)).
+* Excessive parentheses in type definitions are no longer supported, for example: `Array((UInt8))`.
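+
+A quick illustration of the bitwise-function change above (a sketch; the exact error message depends on the build):
+
+``` sql
+SELECT bitAnd(1.5, 1);           -- now throws an exception: float arguments are forbidden
+SELECT bitAnd(toUInt8(1.5), 1);  -- explicit cast to integer works
+```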
+
+#### New Feature
+
+* Added `PostgreSQL` table engine (both select/insert, with support for multidimensional arrays), also as table function. Added `PostgreSQL` dictionary source. Added `PostgreSQL` database engine. [#18554](https://github.com/ClickHouse/ClickHouse/pull/18554) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Data type `Nested` now supports arbitrary levels of nesting. Introduced subcolumns of complex types, such as `size0` in `Array`, `null` in `Nullable`, names of `Tuple` elements, which can be read without reading the whole column. [#17310](https://github.com/ClickHouse/ClickHouse/pull/17310) ([Anton Popov](https://github.com/CurtizJ)).
+* Added `Nullable` support for `FlatDictionary`, `HashedDictionary`, `ComplexKeyHashedDictionary`, `DirectDictionary`, `ComplexKeyDirectDictionary`, `RangeHashedDictionary`. [#18236](https://github.com/ClickHouse/ClickHouse/pull/18236) ([Maksim Kita](https://github.com/kitaisreal)).
+* Adds a new table called `system.distributed_ddl_queue` that displays the queries in the DDL worker queue. [#17656](https://github.com/ClickHouse/ClickHouse/pull/17656) ([Bharat Nallan](https://github.com/bharatnc)).
+* Added support for mapping LDAP group names, and attribute values in general, to local roles for users from LDAP user directories. [#17211](https://github.com/ClickHouse/ClickHouse/pull/17211) ([Denis Glazachev](https://github.com/traceon)).
+* Support insert into table function `cluster`, and for both table functions `remote` and `cluster`, support distributing data across nodes by specifying a sharding key. Close [#16752](https://github.com/ClickHouse/ClickHouse/issues/16752). [#18264](https://github.com/ClickHouse/ClickHouse/pull/18264) ([flynn](https://github.com/ucasFL)).
+* Add function `decodeXMLComponent` to decode characters for XML. Example: `SELECT decodeXMLComponent('Hello,"world"!')` [#17659](https://github.com/ClickHouse/ClickHouse/issues/17659). [#18542](https://github.com/ClickHouse/ClickHouse/pull/18542) ([nauta](https://github.com/nautaa)).
+* Added functions `parseDateTimeBestEffortUSOrZero`, `parseDateTimeBestEffortUSOrNull`. [#19712](https://github.com/ClickHouse/ClickHouse/pull/19712) ([Maksim Kita](https://github.com/kitaisreal)).
+* Add `sign` math function. [#19527](https://github.com/ClickHouse/ClickHouse/pull/19527) ([flynn](https://github.com/ucasFL)).
+* Add information about used features (functions, table engines, etc.) into `system.query_log`. [#18495](https://github.com/ClickHouse/ClickHouse/issues/18495). [#19371](https://github.com/ClickHouse/ClickHouse/pull/19371) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Function `formatDateTime` supports the `%Q` modifier to format a date to its quarter. [#19224](https://github.com/ClickHouse/ClickHouse/pull/19224) ([Jianmei Zhang](https://github.com/zhangjmruc)).
+* Support MetaKey+Enter hotkey binding in play UI. [#19012](https://github.com/ClickHouse/ClickHouse/pull/19012) ([sundyli](https://github.com/sundy-li)).
+* Add three functions for the map data type (see the sketch after this list): 1. `mapContains(map, key)` checks whether the map keys include the second parameter `key`; 2. `mapKeys(map)` returns all the keys in Array format; 3. `mapValues(map)` returns all the values in Array format. [#18788](https://github.com/ClickHouse/ClickHouse/pull/18788) ([hexiaoting](https://github.com/hexiaoting)).
+* Add `log_comment` setting related to [#18494](https://github.com/ClickHouse/ClickHouse/issues/18494). [#18549](https://github.com/ClickHouse/ClickHouse/pull/18549) ([Zijie Lu](https://github.com/TszKitLo40)).
+* Add support of tuple argument to `argMin` and `argMax` functions. [#17359](https://github.com/ClickHouse/ClickHouse/pull/17359) ([Ildus Kurbangaliev](https://github.com/ildus)).
+* Support `EXISTS VIEW` syntax. [#18552](https://github.com/ClickHouse/ClickHouse/pull/18552) ([Du Chuan](https://github.com/spongedu)).
+* Add `SELECT ALL` syntax. closes [#18706](https://github.com/ClickHouse/ClickHouse/issues/18706). [#18723](https://github.com/ClickHouse/ClickHouse/pull/18723) ([flynn](https://github.com/ucasFL)).
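+
+A minimal sketch of the new map functions (assuming the experimental `Map` data type is enabled in this version):
+
+``` sql
+SET allow_experimental_map_type = 1;
+
+WITH map('a', 1, 'b', 2) AS m
+SELECT mapContains(m, 'a'), mapKeys(m), mapValues(m);
+-- 1    ['a','b']    [1,2]
+```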
+
+#### Performance Improvement
+
+* Faster parts removal by lowering the number of `stat` syscalls. This returns an optimization that existed a while ago. Safer interface of `IDisk`. This closes [#19065](https://github.com/ClickHouse/ClickHouse/issues/19065). [#19086](https://github.com/ClickHouse/ClickHouse/pull/19086) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Aliases declared in `WITH` statement are properly used in index analysis. Queries like `WITH column AS alias SELECT ... WHERE alias = ...` may use index now. [#18896](https://github.com/ClickHouse/ClickHouse/pull/18896) ([Amos Bird](https://github.com/amosbird)).
+* Add `optimize_alias_column_prediction` (on by default), which will: respect aliased columns in `WHERE` during partition pruning and when skipping data using secondary indexes; respect aliased columns in `WHERE` for trivial count queries with `optimize_trivial_count`; respect aliased columns in `GROUP BY`/`ORDER BY` for `optimize_aggregation_in_order`/`optimize_read_in_order`. [#16995](https://github.com/ClickHouse/ClickHouse/pull/16995) ([sundyli](https://github.com/sundy-li)).
+* Speed up aggregate function `sum`. Improvement only visible on synthetic benchmarks and not very practical. [#19216](https://github.com/ClickHouse/ClickHouse/pull/19216) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Update libc++ and use another ABI to provide better performance. [#18914](https://github.com/ClickHouse/ClickHouse/pull/18914) ([Danila Kutenin](https://github.com/danlark1)).
+* Rewrite `sumIf()` and `sum(if())` functions to the `countIf()` function when logically equivalent (see the sketch after this list). [#17041](https://github.com/ClickHouse/ClickHouse/pull/17041) ([flynn](https://github.com/ucasFL)).
+* Use a connection pool for S3 connections, controlled by the `s3_max_connections` setting. [#13405](https://github.com/ClickHouse/ClickHouse/pull/13405) ([Vladimir Chebotarev](https://github.com/excitoon)).
+* Add support for zstd long option for better compression of string columns to save space. [#17184](https://github.com/ClickHouse/ClickHouse/pull/17184) ([ygrek](https://github.com/ygrek)).
+* Slightly improve server latency by removing access to configuration on every connection. [#19863](https://github.com/ClickHouse/ClickHouse/pull/19863) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Reduce lock contention for multiple layers of the `Buffer` engine. [#19379](https://github.com/ClickHouse/ClickHouse/pull/19379) ([Azat Khuzhin](https://github.com/azat)).
+* Support splitting `Filter` step of query plan into `Expression + Filter` pair. Together with `Expression + Expression` merging optimization ([#17458](https://github.com/ClickHouse/ClickHouse/issues/17458)) it may delay execution for some expressions after `Filter` step. [#19253](https://github.com/ClickHouse/ClickHouse/pull/19253) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
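+
+A sketch of the `sumIf`/`countIf` rewrite (the name of the controlling setting, `optimize_rewrite_sum_if_to_count_if`, is an assumption here):
+
+``` sql
+-- Summing a constant 1 under a condition is logically a count,
+-- so the optimizer can rewrite the first query into the second:
+SELECT sumIf(1, number % 2 = 0) FROM numbers(10);
+SELECT countIf(number % 2 = 0) FROM numbers(10);  -- both return 5
+```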
+
+#### Improvement
+
+* `SELECT count() FROM table` now can be executed if at least one column of the `table` can be selected. This PR fixes [#10639](https://github.com/ClickHouse/ClickHouse/issues/10639). [#18233](https://github.com/ClickHouse/ClickHouse/pull/18233) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Set charset to `utf8mb4` when interacting with remote MySQL servers. Fixes [#19795](https://github.com/ClickHouse/ClickHouse/issues/19795). [#19800](https://github.com/ClickHouse/ClickHouse/pull/19800) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* `S3` table function now supports `auto` compression mode (autodetect). This closes [#18754](https://github.com/ClickHouse/ClickHouse/issues/18754). [#19793](https://github.com/ClickHouse/ClickHouse/pull/19793) ([Vladimir Chebotarev](https://github.com/excitoon)).
+* Correctly output infinite arguments for `formatReadableTimeDelta` function. In previous versions, there was an implicit conversion to an implementation-specific integer value. [#19791](https://github.com/ClickHouse/ClickHouse/pull/19791) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Table function `S3` will use global region if the region can't be determined exactly. This closes [#10998](https://github.com/ClickHouse/ClickHouse/issues/10998). [#19750](https://github.com/ClickHouse/ClickHouse/pull/19750) ([Vladimir Chebotarev](https://github.com/excitoon)).
+* In distributed queries, if the setting `async_socket_for_remote` is enabled, it was possible to get a stack overflow (at least in debug build configuration) if a very deeply nested data type is used in a table (e.g. `Array(Array(Array(...more...)))`). This fixes [#19108](https://github.com/ClickHouse/ClickHouse/issues/19108). This change introduces minor backward incompatibility: excessive parentheses in type definitions are no longer supported, for example: `Array((UInt8))`. [#19736](https://github.com/ClickHouse/ClickHouse/pull/19736) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add separate pool for message brokers (RabbitMQ and Kafka). [#19722](https://github.com/ClickHouse/ClickHouse/pull/19722) ([Azat Khuzhin](https://github.com/azat)).
+* Fix rare `max_number_of_merges_with_ttl_in_pool` limit overrun (more merges with TTL can be assigned) for non-replicated MergeTree. [#19708](https://github.com/ClickHouse/ClickHouse/pull/19708) ([alesapin](https://github.com/alesapin)).
+* Dictionary: better error message during attribute parsing. [#19678](https://github.com/ClickHouse/ClickHouse/pull/19678) ([Maksim Kita](https://github.com/kitaisreal)).
+* Add an option to disable validation of checksums on reading. It should never be used in production. Please do not expect any benefits from disabling it. It may only be used for experiments and benchmarks. The setting is only applicable to tables of the MergeTree family. Checksums are always validated for other table engines and when receiving data over the network. In my observations there is no performance difference, or it is less than 0.5%. [#19588](https://github.com/ClickHouse/ClickHouse/pull/19588) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Support constant result in function `multiIf`. [#19533](https://github.com/ClickHouse/ClickHouse/pull/19533) ([Maksim Kita](https://github.com/kitaisreal)).
+* Enable the functions `length`/`empty`/`notEmpty` for the `Map` data type; they operate on the number of keys in the Map. [#19530](https://github.com/ClickHouse/ClickHouse/pull/19530) ([taiyang-li](https://github.com/taiyang-li)).
+* Add `--reconnect` option to `clickhouse-benchmark`. When this option is specified, it will reconnect before every request. This is needed for testing. [#19872](https://github.com/ClickHouse/ClickHouse/pull/19872) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Support using the new location of `.debug` file. This fixes [#19348](https://github.com/ClickHouse/ClickHouse/issues/19348). [#19520](https://github.com/ClickHouse/ClickHouse/pull/19520) ([Amos Bird](https://github.com/amosbird)).
+* The `toIPv6` function now parses `IPv4` addresses (see the sketch after this list). [#19518](https://github.com/ClickHouse/ClickHouse/pull/19518) ([Bharat Nallan](https://github.com/bharatnc)).
+* Add `http_referer` field to `system.query_log`, `system.processes`, etc. This closes [#19389](https://github.com/ClickHouse/ClickHouse/issues/19389). [#19390](https://github.com/ClickHouse/ClickHouse/pull/19390) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Improve MySQL compatibility by making more functions case insensitive and adding aliases. [#19387](https://github.com/ClickHouse/ClickHouse/pull/19387) ([Daniil Kondratyev](https://github.com/dankondr)).
+* Add metrics for MergeTree parts (Wide/Compact/InMemory) types. [#19381](https://github.com/ClickHouse/ClickHouse/pull/19381) ([Azat Khuzhin](https://github.com/azat)).
+* Allow docker to be executed with arbitrary uid. [#19374](https://github.com/ClickHouse/ClickHouse/pull/19374) ([filimonov](https://github.com/filimonov)).
+* Fix wrong alignment of values of `IPv4` data type in Pretty formats. They were aligned to the right, not to the left. This closes [#19184](https://github.com/ClickHouse/ClickHouse/issues/19184). [#19339](https://github.com/ClickHouse/ClickHouse/pull/19339) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Allow changing `max_server_memory_usage` without restart. This closes [#18154](https://github.com/ClickHouse/ClickHouse/issues/18154). [#19186](https://github.com/ClickHouse/ClickHouse/pull/19186) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* The exception when function `bar` is called with certain NaN argument may be slightly misleading in previous versions. This fixes [#19088](https://github.com/ClickHouse/ClickHouse/issues/19088). [#19107](https://github.com/ClickHouse/ClickHouse/pull/19107) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server images. [#19096](https://github.com/ClickHouse/ClickHouse/pull/19096) ([filimonov](https://github.com/filimonov)).
+* Fixed `PeekableReadBuffer: Memory limit exceed` error when inserting data with huge strings. Fixes [#18690](https://github.com/ClickHouse/ClickHouse/issues/18690). [#18979](https://github.com/ClickHouse/ClickHouse/pull/18979) ([tavplubix](https://github.com/tavplubix)).
+* Docker image: several improvements for clickhouse-server entrypoint. [#18954](https://github.com/ClickHouse/ClickHouse/pull/18954) ([filimonov](https://github.com/filimonov)).
+* Add `normalizeQueryKeepNames` and `normalizedQueryHashKeepNames` to normalize queries without masking long names with `?`. This helps better analyze complex query logs. [#18910](https://github.com/ClickHouse/ClickHouse/pull/18910) ([Amos Bird](https://github.com/amosbird)).
+* Check the per-block checksum of the distributed batch on the sender before sending (without reading the file twice; the checksums will be verified while reading). This avoids the INSERT getting stuck on the receiver (on a truncated .bin file on the sender). Avoid reading .bin files twice for batched INSERT (it was required to calculate rows/bytes to take squashing into account; now this information is included in the header and backward compatibility is preserved). [#18853](https://github.com/ClickHouse/ClickHouse/pull/18853) ([Azat Khuzhin](https://github.com/azat)).
+* Fix issues with RIGHT and FULL JOIN of tables with aggregate function states. In previous versions exception about `cloneResized` method was thrown. [#18818](https://github.com/ClickHouse/ClickHouse/pull/18818) ([templarzq](https://github.com/templarzq)).
+* Added prefix-based S3 endpoint settings. [#18812](https://github.com/ClickHouse/ClickHouse/pull/18812) ([Vladimir Chebotarev](https://github.com/excitoon)).
+* Add [UInt8, UInt16, UInt32, UInt64] argument types support for `bitmapTransform`, `bitmapSubsetInRange`, `bitmapSubsetLimit`, `bitmapContains` functions. This closes [#18713](https://github.com/ClickHouse/ClickHouse/issues/18713). [#18791](https://github.com/ClickHouse/ClickHouse/pull/18791) ([sundyli](https://github.com/sundy-li)).
+* Allow CTE (Common Table Expressions) to be further aliased. Propagate CSE (Common Subexpressions Elimination) to subqueries in the same level when `enable_global_with_statement = 1`. This fixes [#17378](https://github.com/ClickHouse/ClickHouse/issues/17378). This fixes https://github.com/ClickHouse/ClickHouse/pull/16575#issuecomment-753416235. [#18684](https://github.com/ClickHouse/ClickHouse/pull/18684) ([Amos Bird](https://github.com/amosbird)).
+* Update librdkafka to v1.6.0-RC2. Fixes [#18668](https://github.com/ClickHouse/ClickHouse/issues/18668). [#18671](https://github.com/ClickHouse/ClickHouse/pull/18671) ([filimonov](https://github.com/filimonov)).
+* In case of unexpected exceptions automatically restart background thread which is responsible for execution of distributed DDL queries. Fixes [#17991](https://github.com/ClickHouse/ClickHouse/issues/17991). [#18285](https://github.com/ClickHouse/ClickHouse/pull/18285) ([徐炘](https://github.com/weeds085490)).
+* Updated AWS C++ SDK in order to utilize global regions in S3. [#17870](https://github.com/ClickHouse/ClickHouse/pull/17870) ([Vladimir Chebotarev](https://github.com/excitoon)).
+* Added support for `WITH ... [AND] [PERIODIC] REFRESH [interval_in_sec]` clause when creating `LIVE VIEW` tables. [#14822](https://github.com/ClickHouse/ClickHouse/pull/14822) ([vzakaznikov](https://github.com/vzakaznikov)).
+* Restrict `MODIFY TTL` queries for `MergeTree` tables created in old syntax. Previously the query succeeded, but actually it had no effect. [#19064](https://github.com/ClickHouse/ClickHouse/pull/19064) ([Anton Popov](https://github.com/CurtizJ)).
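+
+A sketch of the `toIPv6` improvement above (an IPv4 input is mapped to its IPv4-mapped IPv6 form):
+
+``` sql
+SELECT toIPv6('127.0.0.1');  -- ::ffff:127.0.0.1
+```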
+
+#### Bug Fix
+
+* Fix index analysis of binary functions with constant argument which leads to wrong query results. This fixes [#18364](https://github.com/ClickHouse/ClickHouse/issues/18364). [#18373](https://github.com/ClickHouse/ClickHouse/pull/18373) ([Amos Bird](https://github.com/amosbird)).
+* Fix starting the server with tables having default expressions containing dictGet(). Allow getting return type of dictGet() without loading dictionary. [#19805](https://github.com/ClickHouse/ClickHouse/pull/19805) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Fix server crash after query with `if` function with `Tuple` type of then/else branches result. `Tuple` type must contain `Array` or another complex type. Fixes [#18356](https://github.com/ClickHouse/ClickHouse/issues/18356). [#20133](https://github.com/ClickHouse/ClickHouse/pull/20133) ([alesapin](https://github.com/alesapin)).
+* `MaterializeMySQL` (experimental feature): Fix replication for statements that update several tables. [#20066](https://github.com/ClickHouse/ClickHouse/pull/20066) ([Håvard Kvålen](https://github.com/havardk)).
+* Prevent "Connection refused" in docker during initialization script execution. [#20012](https://github.com/ClickHouse/ClickHouse/pull/20012) ([filimonov](https://github.com/filimonov)).
+* `EmbeddedRocksDB` is an experimental storage. Fix the issue with lack of proper type checking. Simplified code. This closes [#19967](https://github.com/ClickHouse/ClickHouse/issues/19967). [#19972](https://github.com/ClickHouse/ClickHouse/pull/19972) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix a segfault in function `fromModifiedJulianDay` when the argument type is `Nullable(T)` for any integral types other than Int32. [#19959](https://github.com/ClickHouse/ClickHouse/pull/19959) ([PHO](https://github.com/depressed-pho)).
+* The function `greatCircleAngle` returned inaccurate results in previous versions. This closes [#19769](https://github.com/ClickHouse/ClickHouse/issues/19769). [#19789](https://github.com/ClickHouse/ClickHouse/pull/19789) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption. Fixes [#19593](https://github.com/ClickHouse/ClickHouse/issues/19593). [#19702](https://github.com/ClickHouse/ClickHouse/pull/19702) ([alesapin](https://github.com/alesapin)).
+* Background thread which executes `ON CLUSTER` queries might hang waiting for dropped replicated table to do something. It's fixed. [#19684](https://github.com/ClickHouse/ClickHouse/pull/19684) ([yiguolei](https://github.com/yiguolei)).
+* Fix wrong deserialization of columns description. The bug made INSERT into a table with a column named `\` impossible. [#19479](https://github.com/ClickHouse/ClickHouse/pull/19479) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Mark distributed batch as broken in case of empty data block in one of files. [#19449](https://github.com/ClickHouse/ClickHouse/pull/19449) ([Azat Khuzhin](https://github.com/azat)).
+* Fixed a very rare bug that might cause a mutation to hang after `DROP/DETACH/REPLACE/MOVE PARTITION`. It was partially fixed by [#15537](https://github.com/ClickHouse/ClickHouse/issues/15537) for most cases. [#19443](https://github.com/ClickHouse/ClickHouse/pull/19443) ([tavplubix](https://github.com/tavplubix)).
+* Fix possible error `Extremes transform was already added to pipeline`. Fixes [#14100](https://github.com/ClickHouse/ClickHouse/issues/14100). [#19430](https://github.com/ClickHouse/ClickHouse/pull/19430) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix default value in join types with non-zero default (e.g. some Enums). Closes [#18197](https://github.com/ClickHouse/ClickHouse/issues/18197). [#19360](https://github.com/ClickHouse/ClickHouse/pull/19360) ([vdimir](https://github.com/vdimir)).
+* Do not mark file for distributed send as broken on EOF. [#19290](https://github.com/ClickHouse/ClickHouse/pull/19290) ([Azat Khuzhin](https://github.com/azat)).
+* Fix leaking of pipe fd for `async_socket_for_remote`. [#19153](https://github.com/ClickHouse/ClickHouse/pull/19153) ([Azat Khuzhin](https://github.com/azat)).
+* Fix infinite reading from file in `ORC` format (was introduced in [#10580](https://github.com/ClickHouse/ClickHouse/issues/10580)). Fixes [#19095](https://github.com/ClickHouse/ClickHouse/issues/19095). [#19134](https://github.com/ClickHouse/ClickHouse/pull/19134) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix issue in merge tree data writer which can lead to marks with bigger size than fixed granularity size. Fixes [#18913](https://github.com/ClickHouse/ClickHouse/issues/18913). [#19123](https://github.com/ClickHouse/ClickHouse/pull/19123) ([alesapin](https://github.com/alesapin)).
+* Fix a startup bug when clickhouse was not able to read the compression codec from `LowCardinality(Nullable(...))` and threw the exception `Attempt to read after EOF`. Fixes [#18340](https://github.com/ClickHouse/ClickHouse/issues/18340). [#19101](https://github.com/ClickHouse/ClickHouse/pull/19101) ([alesapin](https://github.com/alesapin)).
+* Simplify the implementation of `tupleHammingDistance`. Support for tuples of any equal length. Fixes [#19029](https://github.com/ClickHouse/ClickHouse/issues/19029). [#19084](https://github.com/ClickHouse/ClickHouse/pull/19084) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Make sure `groupUniqArray` returns correct type for argument of Enum type. This closes [#17875](https://github.com/ClickHouse/ClickHouse/issues/17875). [#19019](https://github.com/ClickHouse/ClickHouse/pull/19019) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix possible error `Expected single dictionary argument for function` if use function `ignore` with `LowCardinality` argument. Fixes [#14275](https://github.com/ClickHouse/ClickHouse/issues/14275). [#19016](https://github.com/ClickHouse/ClickHouse/pull/19016) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix inserting of `LowCardinality` column to table with `TinyLog` engine. Fixes [#18629](https://github.com/ClickHouse/ClickHouse/issues/18629). [#19010](https://github.com/ClickHouse/ClickHouse/pull/19010) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix minor issue in JOIN: Join tries to materialize const columns, but our code waits for them in other places. [#18982](https://github.com/ClickHouse/ClickHouse/pull/18982) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Disable `optimize_move_functions_out_of_any` because optimization is not always correct. This closes [#18051](https://github.com/ClickHouse/ClickHouse/issues/18051). This closes [#18973](https://github.com/ClickHouse/ClickHouse/issues/18973). [#18981](https://github.com/ClickHouse/ClickHouse/pull/18981) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix possible exception `QueryPipeline stream: different number of columns` caused by merging of query plan's `Expression` steps. Fixes [#18190](https://github.com/ClickHouse/ClickHouse/issues/18190). [#18980](https://github.com/ClickHouse/ClickHouse/pull/18980) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fixed very rare deadlock at shutdown. [#18977](https://github.com/ClickHouse/ClickHouse/pull/18977) ([tavplubix](https://github.com/tavplubix)).
+* Fixed rare crashes when server run out of memory. [#18976](https://github.com/ClickHouse/ClickHouse/pull/18976) ([tavplubix](https://github.com/tavplubix)).
+* Fix incorrect behavior when `ALTER TABLE ... DROP PART 'part_name'` query removes all deduplication blocks for the whole partition. Fixes [#18874](https://github.com/ClickHouse/ClickHouse/issues/18874). [#18969](https://github.com/ClickHouse/ClickHouse/pull/18969) ([alesapin](https://github.com/alesapin)).
+* Fixed issue [#18894](https://github.com/ClickHouse/ClickHouse/issues/18894): added a check to avoid an exception when a long column alias (`table.column` style, usually auto-generated by BI tools like Looker) equals a long table name. [#18968](https://github.com/ClickHouse/ClickHouse/pull/18968) ([Daniel Qin](https://github.com/mathfool)).
+* Fix error `Task was not found in task queue` (possible only for remote queries, with `async_socket_for_remote = 1`). [#18964](https://github.com/ClickHouse/ClickHouse/pull/18964) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix a bug when a mutation with some escaped text (like `ALTER ... UPDATE e = CAST('foo', 'Enum8(\'foo\' = 1')`) was serialized incorrectly. Fixes [#18878](https://github.com/ClickHouse/ClickHouse/issues/18878). [#18944](https://github.com/ClickHouse/ClickHouse/pull/18944) ([alesapin](https://github.com/alesapin)).
+* ATTACH PARTITION will reset mutations. [#18804](https://github.com/ClickHouse/ClickHouse/issues/18804). [#18935](https://github.com/ClickHouse/ClickHouse/pull/18935) ([fastio](https://github.com/fastio)).
+* Fix issue with `bitmapOrCardinality` that may lead to nullptr dereference. This closes [#18911](https://github.com/ClickHouse/ClickHouse/issues/18911). [#18912](https://github.com/ClickHouse/ClickHouse/pull/18912) ([sundyli](https://github.com/sundy-li)).
+* Fixed `Attempt to read after eof` error when trying to `CAST` `NULL` from `Nullable(String)` to `Nullable(Decimal(P, S))`. Now the function `CAST` returns `NULL` when it cannot parse a decimal from a nullable string (see the sketch after this list). Fixes [#7690](https://github.com/ClickHouse/ClickHouse/issues/7690). [#18718](https://github.com/ClickHouse/ClickHouse/pull/18718) ([Winter Zhang](https://github.com/zhang2014)).
+* Fix a data type conversion issue for the MySQL engine. [#18124](https://github.com/ClickHouse/ClickHouse/pull/18124) ([bo zeng](https://github.com/mis98zb)).
+* Fix clickhouse-client abort exception while executing only `select`. [#19790](https://github.com/ClickHouse/ClickHouse/pull/19790) ([taiyang-li](https://github.com/taiyang-li)).
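+
+A sketch of the `CAST` fix above (previously this threw `Attempt to read after eof`; now it returns `NULL`):
+
+``` sql
+SELECT CAST(CAST(NULL AS Nullable(String)) AS Nullable(Decimal(10, 2)));  -- NULL
+```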
+
+
+#### Build/Testing/Packaging Improvement
+
+* Run [SQLancer](https://twitter.com/RiggerManuel/status/1352345625480884228) (logical SQL fuzzer) in CI. [#19006](https://github.com/ClickHouse/ClickHouse/pull/19006) ([Ilya Yatsishin](https://github.com/qoega)).
+* Query Fuzzer will fuzz newly added tests more extensively. This closes [#18916](https://github.com/ClickHouse/ClickHouse/issues/18916). [#19185](https://github.com/ClickHouse/ClickHouse/pull/19185) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Integrate with [Big List of Naughty Strings](https://github.com/minimaxir/big-list-of-naughty-strings/) for better fuzzing. [#19480](https://github.com/ClickHouse/ClickHouse/pull/19480) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add integration tests run with MSan. [#18974](https://github.com/ClickHouse/ClickHouse/pull/18974) ([alesapin](https://github.com/alesapin)).
+* Fixed MemorySanitizer errors in cyrus-sasl and musl. [#19821](https://github.com/ClickHouse/ClickHouse/pull/19821) ([Ilya Yatsishin](https://github.com/qoega)).
+* Insufficient arguments check in the `positionCaseInsensitiveUTF8` function triggered address sanitizer. [#19720](https://github.com/ClickHouse/ClickHouse/pull/19720) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Remove --project-directory for docker-compose in integration test. Fix logs formatting from docker container. [#19706](https://github.com/ClickHouse/ClickHouse/pull/19706) ([Ilya Yatsishin](https://github.com/qoega)).
+* Made generation of macros.xml easier for integration tests. No more excessive logging from dicttoxml. The dicttoxml project has not been active for 5+ years. [#19697](https://github.com/ClickHouse/ClickHouse/pull/19697) ([Ilya Yatsishin](https://github.com/qoega)).
+* Allow explicitly enabling or disabling the watchdog via the environment variable `CLICKHOUSE_WATCHDOG_ENABLE`. By default it is enabled if the server is not attached to a terminal. [#19522](https://github.com/ClickHouse/ClickHouse/pull/19522) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Allow building ClickHouse with Kafka support on arm64. [#19369](https://github.com/ClickHouse/ClickHouse/pull/19369) ([filimonov](https://github.com/filimonov)).
+* Allow building librdkafka without ssl. [#19337](https://github.com/ClickHouse/ClickHouse/pull/19337) ([filimonov](https://github.com/filimonov)).
+* Restore Kafka input in FreeBSD builds. [#18924](https://github.com/ClickHouse/ClickHouse/pull/18924) ([Alexandre Snarskii](https://github.com/snar)).
+* Fix potential nullptr dereference in table function `VALUES`. [#19357](https://github.com/ClickHouse/ClickHouse/pull/19357) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Avoid UBSan reports in `arrayElement` function, `substring` and `arraySum`. Fixes [#19305](https://github.com/ClickHouse/ClickHouse/issues/19305). Fixes [#19287](https://github.com/ClickHouse/ClickHouse/issues/19287). This closes [#19336](https://github.com/ClickHouse/ClickHouse/issues/19336). [#19347](https://github.com/ClickHouse/ClickHouse/pull/19347) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+
+
## ClickHouse release 21.1
### ClickHouse release v21.1.3.32-stable, 2021-02-03
diff --git a/README.md b/README.md
index 8e114d5abe9..53778c79bef 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Tutorial](https://clickhouse.tech/docs/en/getting_started/tutorial/) shows how to set up and query small ClickHouse cluster.
* [Documentation](https://clickhouse.tech/docs/en/) provides more in-depth information.
* [YouTube channel](https://www.youtube.com/c/ClickHouseDB) has a lot of content about ClickHouse in video format.
-* [Slack](https://join.slack.com/t/clickhousedb/shared_invite/zt-d2zxkf9e-XyxDa_ucfPxzuH4SJIm~Ng) and [Telegram](https://telegram.me/clickhouse_en) allow to chat with ClickHouse users in real-time.
+* [Slack](https://join.slack.com/t/clickhousedb/shared_invite/zt-ly9m4w1x-6j7x5Ts_pQZqrctAbRZ3cg) and [Telegram](https://telegram.me/clickhouse_en) allow to chat with ClickHouse users in real-time.
* [Blog](https://clickhouse.yandex/blog/en/) contains various ClickHouse-related articles, as well as announcements and reports about events.
* [Code Browser](https://clickhouse.tech/codebrowser/html_report/ClickHouse/index.html) with syntax highlight and navigation.
* [Yandex.Messenger channel](https://yandex.ru/chat/#/join/20e380d9-c7be-4123-ab06-e95fb946975e) shares announcements and useful links in Russian.
diff --git a/contrib/base64-cmake/CMakeLists.txt b/contrib/base64-cmake/CMakeLists.txt
index 63b4e324d29..a295ee45b84 100644
--- a/contrib/base64-cmake/CMakeLists.txt
+++ b/contrib/base64-cmake/CMakeLists.txt
@@ -11,7 +11,7 @@ endif ()
target_compile_options(base64_scalar PRIVATE -falign-loops)
if (ARCH_AMD64)
- target_compile_options(base64_ssse3 PRIVATE -mssse3 -falign-loops)
+ target_compile_options(base64_ssse3 PRIVATE -mno-avx -mno-avx2 -mssse3 -falign-loops)
target_compile_options(base64_avx PRIVATE -falign-loops -mavx)
target_compile_options(base64_avx2 PRIVATE -falign-loops -mavx2)
else ()
diff --git a/contrib/hyperscan-cmake/CMakeLists.txt b/contrib/hyperscan-cmake/CMakeLists.txt
index c44214cded8..75c45ff7bf5 100644
--- a/contrib/hyperscan-cmake/CMakeLists.txt
+++ b/contrib/hyperscan-cmake/CMakeLists.txt
@@ -252,6 +252,7 @@ if (NOT EXTERNAL_HYPERSCAN_LIBRARY_FOUND)
target_compile_definitions (hyperscan PUBLIC USE_HYPERSCAN=1)
target_compile_options (hyperscan
PRIVATE -g0 # Library has too much debug information
+ -mno-avx -mno-avx2 # The library is using dynamic dispatch and is confused if AVX is enabled globally
-march=corei7 -O2 -fno-strict-aliasing -fno-omit-frame-pointer -fvisibility=hidden # The options from original build system
-fno-sanitize=undefined # Assume the library takes care of itself
)
diff --git a/contrib/poco b/contrib/poco
index e11f3c97157..fbaaba4a02e 160000
--- a/contrib/poco
+++ b/contrib/poco
@@ -1 +1 @@
-Subproject commit e11f3c971570cf6a31006cd21cadf41a259c360a
+Subproject commit fbaaba4a02e29987b8c584747a496c79528f125f
diff --git a/docker/test/fasttest/run.sh b/docker/test/fasttest/run.sh
index 17cec7ae286..bb29959acd2 100755
--- a/docker/test/fasttest/run.sh
+++ b/docker/test/fasttest/run.sh
@@ -319,6 +319,7 @@ function run_tests
# In fasttest, ENABLE_LIBRARIES=0, so rocksdb engine is not enabled by default
01504_rocksdb
+ 01686_rocksdb
# Look at DistributedFilesToInsert, so cannot run in parallel.
01460_DistributedFilesToInsert
diff --git a/docker/test/integration/runner/Dockerfile b/docker/test/integration/runner/Dockerfile
index f353931f0a0..502dc3736b2 100644
--- a/docker/test/integration/runner/Dockerfile
+++ b/docker/test/integration/runner/Dockerfile
@@ -61,7 +61,7 @@ RUN python3 -m pip install \
aerospike \
avro \
cassandra-driver \
- confluent-kafka \
+ confluent-kafka==1.5.0 \
dict2xml \
dicttoxml \
docker \
diff --git a/docs/en/introduction/adopters.md b/docs/en/introduction/adopters.md
index 707a05b63e5..c7230f2f080 100644
--- a/docs/en/introduction/adopters.md
+++ b/docs/en/introduction/adopters.md
@@ -46,7 +46,7 @@ toc_title: Adopters
| Exness | Trading | Metrics, Logging | — | — | [Talk in Russian, May 2019](https://youtu.be/_rpU-TvSfZ8?t=3215) |
| FastNetMon | DDoS Protection | Main Product | | — | [Official website](https://fastnetmon.com/docs-fnm-advanced/fastnetmon-advanced-traffic-persistency/) |
| Flipkart | e-Commerce | — | — | — | [Talk in English, July 2020](https://youtu.be/GMiXCMFDMow?t=239) |
-| FunCorp | Games | | — | — | [Article](https://www.altinity.com/blog/migrating-from-redshift-to-clickhouse) |
+| FunCorp | Games | | — | 14 bn records/day as of Jan 2021 | [Article](https://www.altinity.com/blog/migrating-from-redshift-to-clickhouse) |
| Geniee | Ad network | Main product | — | — | [Blog post in Japanese, July 2017](https://tech.geniee.co.jp/entry/2017/07/20/160100) |
| Genotek | Bioinformatics | Main product | — | — | [Video, August 2020](https://youtu.be/v3KyZbz9lEE) |
| HUYA | Video Streaming | Analytics | — | — | [Slides in Chinese, October 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup19/7.%20ClickHouse万亿数据分析实践%20李本旺(sundy-li)%20虎牙.pdf) |
@@ -74,6 +74,7 @@ toc_title: Adopters
| NOC Project | Network Monitoring | Analytics | Main Product | — | [Official Website](https://getnoc.com/features/big-data/) |
| Nuna Inc. | Health Data Analytics | — | — | — | [Talk in English, July 2020](https://youtu.be/GMiXCMFDMow?t=170) |
| OneAPM | Monitorings and Data Analysis | Main product | — | — | [Slides in Chinese, October 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup19/8.%20clickhouse在OneAPM的应用%20杜龙.pdf) |
+| Panelbear | Analytics | Monitoring and Analytics | — | — | [Tech Stack, November 2020](https://panelbear.com/blog/tech-stack/) |
| Percent 百分点 | Analytics | Main Product | — | — | [Slides in Chinese, June 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup24/4.%20ClickHouse万亿数据双中心的设计与实践%20.pdf) |
| Percona | Performance analysis | Percona Monitoring and Management | — | — | [Official website, Mar 2020](https://www.percona.com/blog/2020/03/30/advanced-query-analysis-in-percona-monitoring-and-management-with-direct-clickhouse-access/) |
| Plausible | Analytics | Main Product | — | — | [Blog post, June 2020](https://twitter.com/PlausibleHQ/status/1273889629087969280) |
diff --git a/docs/en/operations/quotas.md b/docs/en/operations/quotas.md
index c637ef03f71..56c3eaf6455 100644
--- a/docs/en/operations/quotas.md
+++ b/docs/en/operations/quotas.md
@@ -29,6 +29,8 @@ Let’s look at the section of the ‘users.xml’ file that defines quotas.
    <queries>0</queries>
+    <query_selects>0</query_selects>
+    <query_inserts>0</query_inserts>
    <errors>0</errors>
    <result_rows>0</result_rows>
    <read_rows>0</read_rows>
@@ -48,6 +50,8 @@ The resource consumption calculated for each interval is output to the server lo
    <duration>3600</duration>
    <queries>1000</queries>
+    <query_selects>100</query_selects>
+    <query_inserts>100</query_inserts>
    <errors>100</errors>
    <result_rows>1000000000</result_rows>
    <read_rows>100000000000</read_rows>
@@ -58,6 +62,8 @@ The resource consumption calculated for each interval is output to the server lo
    <duration>86400</duration>
    <queries>10000</queries>
+    <query_selects>10000</query_selects>
+    <query_inserts>10000</query_inserts>
    <errors>1000</errors>
    <result_rows>5000000000</result_rows>
    <read_rows>500000000000</read_rows>
@@ -74,6 +80,10 @@ Here are the amounts that can be restricted:
`queries` – The total number of requests.
+`query_selects` – The total number of select requests.
+
+`query_inserts` – The total number of insert requests.
+
`errors` – The number of queries that threw an exception.
`result_rows` – The total number of rows given as a result.
diff --git a/docs/en/operations/system-tables/part_log.md b/docs/en/operations/system-tables/part_log.md
index 9aa95b1a493..08269a2dc48 100644
--- a/docs/en/operations/system-tables/part_log.md
+++ b/docs/en/operations/system-tables/part_log.md
@@ -6,29 +6,62 @@ This table contains information about events that occurred with [data parts](../
The `system.part_log` table contains the following columns:
-- `event_type` (Enum) — Type of the event that occurred with the data part. Can have one of the following values:
+- `query_id` ([String](../../sql-reference/data-types/string.md)) — Identifier of the `INSERT` query that created this data part.
+- `event_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Type of the event that occurred with the data part. Can have one of the following values:
- `NEW_PART` — Inserting of a new data part.
- `MERGE_PARTS` — Merging of data parts.
- `DOWNLOAD_PART` — Downloading a data part.
- `REMOVE_PART` — Removing or detaching a data part using [DETACH PARTITION](../../sql-reference/statements/alter/partition.md#alter_detach-partition).
- `MUTATE_PART` — Mutating of a data part.
- `MOVE_PART` — Moving the data part from the one disk to another one.
-- `event_date` (Date) — Event date.
-- `event_time` (DateTime) — Event time.
-- `duration_ms` (UInt64) — Duration.
-- `database` (String) — Name of the database the data part is in.
-- `table` (String) — Name of the table the data part is in.
-- `part_name` (String) — Name of the data part.
-- `partition_id` (String) — ID of the partition that the data part was inserted to. The column takes the ‘all’ value if the partitioning is by `tuple()`.
-- `rows` (UInt64) — The number of rows in the data part.
-- `size_in_bytes` (UInt64) — Size of the data part in bytes.
-- `merged_from` (Array(String)) — An array of names of the parts which the current part was made up from (after the merge).
-- `bytes_uncompressed` (UInt64) — Size of uncompressed bytes.
-- `read_rows` (UInt64) — The number of rows was read during the merge.
-- `read_bytes` (UInt64) — The number of bytes was read during the merge.
-- `error` (UInt16) — The code number of the occurred error.
-- `exception` (String) — Text message of the occurred error.
+- `event_date` ([Date](../../sql-reference/data-types/date.md)) — Event date.
+- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Event time.
+- `duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Duration.
+- `database` ([String](../../sql-reference/data-types/string.md)) — Name of the database the data part is in.
+- `table` ([String](../../sql-reference/data-types/string.md)) — Name of the table the data part is in.
+- `part_name` ([String](../../sql-reference/data-types/string.md)) — Name of the data part.
+- `partition_id` ([String](../../sql-reference/data-types/string.md)) — ID of the partition that the data part was inserted to. The column takes the `all` value if the partitioning is by `tuple()`.
+- `path_on_disk` ([String](../../sql-reference/data-types/string.md)) — Absolute path to the folder with data part files.
+- `rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows in the data part.
+- `size_in_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Size of the data part in bytes.
+- `merged_from` ([Array(String)](../../sql-reference/data-types/array.md)) — An array of names of the parts which the current part was made up from (after the merge).
+- `bytes_uncompressed` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Size of uncompressed bytes.
+- `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows read during the merge.
+- `read_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of bytes read during the merge.
+- `peak_memory_usage` ([Int64](../../sql-reference/data-types/int-uint.md)) — The maximum difference between the amount of allocated and freed memory in the context of this thread.
+- `error` ([UInt16](../../sql-reference/data-types/int-uint.md)) — The code number of the occurred error.
+- `exception` ([String](../../sql-reference/data-types/string.md)) — Text message of the occurred error.
The `system.part_log` table is created after the first inserting data to the `MergeTree` table.
+**Example**
+
+``` sql
+SELECT * FROM system.part_log LIMIT 1 FORMAT Vertical;
+```
+
+``` text
+Row 1:
+──────
+query_id: 983ad9c7-28d5-4ae1-844e-603116b7de31
+event_type: NewPart
+event_date: 2021-02-02
+event_time: 2021-02-02 11:14:28
+duration_ms: 35
+database: default
+table: log_mt_2
+part_name: all_1_1_0
+partition_id: all
+path_on_disk: db/data/default/log_mt_2/all_1_1_0/
+rows: 115418
+size_in_bytes: 1074311
+merged_from: []
+bytes_uncompressed: 0
+read_rows: 0
+read_bytes: 0
+peak_memory_usage: 0
+error: 0
+exception:
+```
+
[Original article](https://clickhouse.tech/docs/en/operations/system_tables/part_log)
diff --git a/docs/en/operations/system-tables/quota_limits.md b/docs/en/operations/system-tables/quota_limits.md
index 065296f5df3..c2dcb4db34d 100644
--- a/docs/en/operations/system-tables/quota_limits.md
+++ b/docs/en/operations/system-tables/quota_limits.md
@@ -9,6 +9,8 @@ Columns:
- `0` — Interval is not randomized.
- `1` — Interval is randomized.
- `max_queries` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of queries.
+- `max_query_selects` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of select queries.
+- `max_query_inserts` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of insert queries.
- `max_errors` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of errors.
- `max_result_rows` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of result rows.
- `max_result_bytes` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of RAM volume in bytes used to store a queries result.
diff --git a/docs/en/operations/system-tables/quota_usage.md b/docs/en/operations/system-tables/quota_usage.md
index 0eb59fd6453..17af9ad9a30 100644
--- a/docs/en/operations/system-tables/quota_usage.md
+++ b/docs/en/operations/system-tables/quota_usage.md
@@ -9,6 +9,8 @@ Columns:
- `end_time`([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — End time for calculating resource consumption.
- `duration` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Length of the time interval for calculating resource consumption, in seconds.
- `queries` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of requests on this interval.
+- `query_selects` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of select requests on this interval.
+- `query_inserts` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of insert requests on this interval.
- `max_queries` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of requests.
- `errors` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The number of queries that threw an exception.
- `max_errors` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of errors.
diff --git a/docs/en/operations/system-tables/quotas_usage.md b/docs/en/operations/system-tables/quotas_usage.md
index ed6be820b26..31aafd3e697 100644
--- a/docs/en/operations/system-tables/quotas_usage.md
+++ b/docs/en/operations/system-tables/quotas_usage.md
@@ -11,6 +11,10 @@ Columns:
- `duration` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt32](../../sql-reference/data-types/int-uint.md))) — Length of the time interval for calculating resource consumption, in seconds.
- `queries` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of requests in this interval.
- `max_queries` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of requests.
+- `query_selects` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of select requests in this interval.
+- `max_query_selects` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of select requests.
+- `query_inserts` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of insert requests in this interval.
+- `max_query_inserts` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of insert requests.
- `errors` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The number of queries that threw an exception.
- `max_errors` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — Maximum number of errors.
- `result_rows` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The total number of rows given as a result.
diff --git a/docs/en/operations/system-tables/zookeeper.md b/docs/en/operations/system-tables/zookeeper.md
index ddb4d305964..82ace5e81dc 100644
--- a/docs/en/operations/system-tables/zookeeper.md
+++ b/docs/en/operations/system-tables/zookeeper.md
@@ -1,12 +1,16 @@
# system.zookeeper {#system-zookeeper}
The table does not exist if ZooKeeper is not configured. Allows reading data from the ZooKeeper cluster defined in the config.
-The query must have a ‘path’ equality condition in the WHERE clause. This is the path in ZooKeeper for the children that you want to get data for.
+The query must have either a `path =` condition or a `path IN` condition set in the `WHERE` clause, as shown below. This corresponds to the path of the children in ZooKeeper that you want to get data for.
The query `SELECT * FROM system.zookeeper WHERE path = '/clickhouse'` outputs data for all children on the `/clickhouse` node.
To output data for all root nodes, write path = ‘/’.
If the path specified in ‘path’ doesn’t exist, an exception will be thrown.
+The query `SELECT * FROM system.zookeeper WHERE path IN ('/', '/clickhouse')` outputs data for all children on the `/` and `/clickhouse` nodes.
+If any path in the specified collection does not exist, an exception will be thrown.
+It can be used to run a batch of ZooKeeper path queries, as in the sketch below.
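+
+A minimal sketch of such a batch query (the returned columns are described below):
+
+``` sql
+SELECT name, path FROM system.zookeeper WHERE path IN ('/', '/clickhouse');
+```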
+
Columns:
- `name` (String) — The name of the node.
diff --git a/docs/en/sql-reference/aggregate-functions/reference/argmax.md b/docs/en/sql-reference/aggregate-functions/reference/argmax.md
index 35e87d49e60..9899c731ce9 100644
--- a/docs/en/sql-reference/aggregate-functions/reference/argmax.md
+++ b/docs/en/sql-reference/aggregate-functions/reference/argmax.md
@@ -4,13 +4,42 @@ toc_priority: 106
# argMax {#agg-function-argmax}
-Syntax: `argMax(arg, val)` or `argMax(tuple(arg, val))`
+Calculates the `arg` value for a maximum `val` value. If there are several different values of `arg` for maximum values of `val`, returns the first of these values encountered.
-Calculates the `arg` value for a maximum `val` value. If there are several different values of `arg` for maximum values of `val`, the first of these values encountered is output.
+Tuple version of this function will return the tuple with the maximum `val` value. It is convenient for use with [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md).
-Tuple version of this function will return the tuple with the maximum `val` value. It is convinient for use with `SimpleAggregateFunction`.
+**Syntax**
-**Example:**
+``` sql
+argMax(arg, val)
+```
+
+or
+
+``` sql
+argMax(tuple(arg, val))
+```
+
+**Parameters**
+
+- `arg` — Argument.
+- `val` — Value.
+
+**Returned value**
+
+- `arg` value that corresponds to the maximum `val` value.
+
+Type: matches `arg` type.
+
+For a tuple in the input:
+
+- Tuple `(arg, val)`, where `val` is the maximum value and `arg` is a corresponding value.
+
+Type: [Tuple](../../../sql-reference/data-types/tuple.md).
+
+**Example**
+
+Input table:
``` text
┌─user─────┬─salary─┐
@@ -20,12 +49,18 @@ Tuple version of this function will return the tuple with the maximum `val` valu
└──────────┴────────┘
```
+Query:
+
``` sql
-SELECT argMax(user, salary), argMax(tuple(user, salary)) FROM salary
+SELECT argMax(user, salary), argMax(tuple(user, salary)) FROM salary;
```
+Result:
+
``` text
┌─argMax(user, salary)─┬─argMax(tuple(user, salary))─┐
│ director │ ('director',5000) │
└──────────────────────┴─────────────────────────────┘
```
+
+[Original article](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/argmax/)
diff --git a/docs/en/sql-reference/aggregate-functions/reference/argmin.md b/docs/en/sql-reference/aggregate-functions/reference/argmin.md
index 72c9bce6817..2fe9a313260 100644
--- a/docs/en/sql-reference/aggregate-functions/reference/argmin.md
+++ b/docs/en/sql-reference/aggregate-functions/reference/argmin.md
@@ -4,13 +4,42 @@ toc_priority: 105
# argMin {#agg-function-argmin}
-Syntax: `argMin(arg, val)` or `argMin(tuple(arg, val))`
+Calculates the `arg` value for a minimum `val` value. If there are several different values of `arg` for minimum values of `val`, returns the first of these values encountered.
-Calculates the `arg` value for a minimal `val` value. If there are several different values of `arg` for minimal values of `val`, the first of these values encountered is output.
+Tuple version of this function will return the tuple with the minimum `val` value. It is convenient for use with [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md).
-Tuple version of this function will return the tuple with the minimal `val` value. It is convinient for use with `SimpleAggregateFunction`.
+**Syntax**
-**Example:**
+``` sql
+argMin(arg, val)
+```
+
+or
+
+``` sql
+argMin(tuple(arg, val))
+```
+
+**Parameters**
+
+- `arg` — Argument.
+- `val` — Value.
+
+**Returned value**
+
+- `arg` value that corresponds to the minimum `val` value.
+
+Type: matches `arg` type.
+
+For a tuple in the input:
+
+- Tuple `(arg, val)`, where `val` is the minimum value and `arg` is a corresponding value.
+
+Type: [Tuple](../../../sql-reference/data-types/tuple.md).
+
+**Example**
+
+Input table:
``` text
┌─user─────┬─salary─┐
@@ -20,12 +49,18 @@ Tuple version of this function will return the tuple with the minimal `val` valu
└──────────┴────────┘
```
+Query:
+
``` sql
-SELECT argMin(user, salary), argMin(tuple(user, salary)) FROM salary
+SELECT argMin(user, salary), argMin(tuple(user, salary)) FROM salary;
```
+Result:
+
``` text
┌─argMin(user, salary)─┬─argMin(tuple(user, salary))─┐
│ worker │ ('worker',1000) │
└──────────────────────┴─────────────────────────────┘
```
+
+[Original article](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/argmin/)
diff --git a/docs/en/sql-reference/aggregate-functions/reference/mannwhitneyutest.md b/docs/en/sql-reference/aggregate-functions/reference/mannwhitneyutest.md
new file mode 100644
index 00000000000..012df7052aa
--- /dev/null
+++ b/docs/en/sql-reference/aggregate-functions/reference/mannwhitneyutest.md
@@ -0,0 +1,71 @@
+---
+toc_priority: 310
+toc_title: mannWhitneyUTest
+---
+
+# mannWhitneyUTest {#mannwhitneyutest}
+
+Applies the Mann-Whitney rank test to samples from two populations.
+
+**Syntax**
+
+``` sql
+mannWhitneyUTest[(alternative[, continuity_correction])](sample_data, sample_index)
+```
+
+Values of both samples are in the `sample_data` column. If `sample_index` equals 0, then the value in that row belongs to the sample from the first population. Otherwise it belongs to the sample from the second population.
+The null hypothesis is that the two populations are stochastically equal. One-sided hypotheses can also be tested. This test does not assume that the data are normally distributed.
+
+**Parameters**
+
+- `alternative` — alternative hypothesis. (Optional, default: `'two-sided'`.) [String](../../../sql-reference/data-types/string.md).
+ - `'two-sided'`;
+ - `'greater'`;
+ - `'less'`.
+- `continuity_correction` — If not 0, continuity correction in the normal approximation for the p-value is applied. (Optional, default: 1.) [UInt64](../../../sql-reference/data-types/int-uint.md).
+- `sample_data` — sample data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) or [Decimal](../../../sql-reference/data-types/decimal.md).
+- `sample_index` — sample index. [Integer](../../../sql-reference/data-types/int-uint.md).
+
+
+**Returned values**
+
+[Tuple](../../../sql-reference/data-types/tuple.md) with two elements:
+- calculated U-statistic. [Float64](../../../sql-reference/data-types/float.md).
+- calculated p-value. [Float64](../../../sql-reference/data-types/float.md).
+
+
+**Example**
+
+Input table:
+
+``` text
+┌─sample_data─┬─sample_index─┐
+│ 10 │ 0 │
+│ 11 │ 0 │
+│ 12 │ 0 │
+│ 1 │ 1 │
+│ 2 │ 1 │
+│ 3 │ 1 │
+└─────────────┴──────────────┘
+```
+
+Query:
+
+``` sql
+SELECT mannWhitneyUTest('greater')(sample_data, sample_index) FROM mww_ttest;
+```
+
+Result:
+
+``` text
+┌─mannWhitneyUTest('greater')(sample_data, sample_index)─┐
+│ (9,0.04042779918503192) │
+└────────────────────────────────────────────────────────┘
+```
+
+**See Also**
+
+- [Mann–Whitney U test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test)
+- [Stochastic ordering](https://en.wikipedia.org/wiki/Stochastic_ordering)
+
+[Original article](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/mannwhitneyutest/)
diff --git a/docs/en/sql-reference/aggregate-functions/reference/quantiletimingweighted.md b/docs/en/sql-reference/aggregate-functions/reference/quantiletimingweighted.md
index 0f8606986c8..817cd831d85 100644
--- a/docs/en/sql-reference/aggregate-functions/reference/quantiletimingweighted.md
+++ b/docs/en/sql-reference/aggregate-functions/reference/quantiletimingweighted.md
@@ -79,6 +79,40 @@ Result:
└───────────────────────────────────────────────┘
```
+
+# quantilesTimingWeighted {#quantilestimingweighted}
+
+Same as `quantileTimingWeighted`, but accepts several quantile levels as parameters and returns an array filled with the values of those quantiles.
+
+**Example**
+
+Input table:
+
+``` text
+┌─response_time─┬─weight─┐
+│ 68 │ 1 │
+│ 104 │ 2 │
+│ 112 │ 3 │
+│ 126 │ 2 │
+│ 138 │ 1 │
+│ 162 │ 1 │
+└───────────────┴────────┘
+```
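+
+To try the example end to end, a hypothetical setup for the `t` table used below (the engine and column types are assumptions):
+
+``` sql
+CREATE TABLE t (response_time UInt32, weight UInt64) ENGINE = Memory;
+INSERT INTO t VALUES (68, 1), (104, 2), (112, 3), (126, 2), (138, 1), (162, 1);
+```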
+
+Query:
+
+``` sql
+SELECT quantilesTimingWeighted(0.5, 0.99)(response_time, weight) FROM t;
+```
+
+Result:
+
+``` text
+┌─quantilesTimingWeighted(0.5, 0.99)(response_time, weight)─┐
+│ [112,162] │
+└───────────────────────────────────────────────────────────┘
+```
+
**See Also**
- [median](../../../sql-reference/aggregate-functions/reference/median.md#median)
diff --git a/docs/en/sql-reference/aggregate-functions/reference/studentttest.md b/docs/en/sql-reference/aggregate-functions/reference/studentttest.md
new file mode 100644
index 00000000000..f868e976039
--- /dev/null
+++ b/docs/en/sql-reference/aggregate-functions/reference/studentttest.md
@@ -0,0 +1,65 @@
+---
+toc_priority: 300
+toc_title: studentTTest
+---
+
+# studentTTest {#studentttest}
+
+Applies Student's t-test to samples from two populations.
+
+**Syntax**
+
+``` sql
+studentTTest(sample_data, sample_index)
+```
+
+Values of both samples are in the `sample_data` column. If `sample_index` equals 0, the value in that row belongs to the sample from the first population. Otherwise it belongs to the sample from the second population.
+The null hypothesis is that the means of the populations are equal. A normal distribution with equal variances is assumed.
+
+**Parameters**
+
+- `sample_data` — sample data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) or [Decimal](../../../sql-reference/data-types/decimal.md).
+- `sample_index` — sample index. [Integer](../../../sql-reference/data-types/int-uint.md).
+
+**Returned values**
+
+[Tuple](../../../sql-reference/data-types/tuple.md) with two elements:
+- calculated t-statistic. [Float64](../../../sql-reference/data-types/float.md).
+- calculated p-value. [Float64](../../../sql-reference/data-types/float.md).
+
+
+**Example**
+
+Input table:
+
+``` text
+┌─sample_data─┬─sample_index─┐
+│ 20.3 │ 0 │
+│ 21.1 │ 0 │
+│ 21.9 │ 1 │
+│ 21.7 │ 0 │
+│ 19.9 │ 1 │
+│ 21.8 │ 1 │
+└─────────────┴──────────────┘
+```
+
+Query:
+
+``` sql
+SELECT studentTTest(sample_data, sample_index) FROM student_ttest;
+```
+
+Result:
+
+``` text
+┌─studentTTest(sample_data, sample_index)───┐
+│ (-0.21739130434783777,0.8385421208415731) │
+└───────────────────────────────────────────┘
+```
+
+**See Also**
+
+- [Student's t-test](https://en.wikipedia.org/wiki/Student%27s_t-test)
+- [welchTTest function](welchttest.md#welchttest)
+
+[Original article](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/studentttest/)
diff --git a/docs/en/sql-reference/aggregate-functions/reference/welchttest.md b/docs/en/sql-reference/aggregate-functions/reference/welchttest.md
new file mode 100644
index 00000000000..3fe1c9d58b9
--- /dev/null
+++ b/docs/en/sql-reference/aggregate-functions/reference/welchttest.md
@@ -0,0 +1,65 @@
+---
+toc_priority: 301
+toc_title: welchTTest
+---
+
+# welchTTest {#welchttest}
+
+Applies Welch's t-test to samples from two populations.
+
+**Syntax**
+
+``` sql
+welchTTest(sample_data, sample_index)
+```
+
+Values of both samples are in the `sample_data` column. If `sample_index` equals 0, the value in that row belongs to the sample from the first population. Otherwise it belongs to the sample from the second population.
+The null hypothesis is that the means of the populations are equal. A normal distribution is assumed. The populations may have unequal variance.
+
+**Parameters**
+
+- `sample_data` — sample data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) or [Decimal](../../../sql-reference/data-types/decimal.md).
+- `sample_index` — sample index. [Integer](../../../sql-reference/data-types/int-uint.md).
+
+**Returned values**
+
+[Tuple](../../../sql-reference/data-types/tuple.md) with two elements:
+- calculated t-statistic. [Float64](../../../sql-reference/data-types/float.md).
+- calculated p-value. [Float64](../../../sql-reference/data-types/float.md).
+
+
+**Example**
+
+Input table:
+
+``` text
+┌─sample_data─┬─sample_index─┐
+│ 20.3 │ 0 │
+│ 22.1 │ 0 │
+│ 21.9 │ 0 │
+│ 18.9 │ 1 │
+│ 20.3 │ 1 │
+│ 19 │ 1 │
+└─────────────┴──────────────┘
+```
+
+Query:
+
+``` sql
+SELECT welchTTest(sample_data, sample_index) FROM welch_ttest;
+```
+
+Result:
+
+``` text
+┌─welchTTest(sample_data, sample_index)─────┐
+│ (2.7988719532211235,0.051807360348581945) │
+└───────────────────────────────────────────┘
+```
+
+**See Also**
+
+- [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test)
+- [studentTTest function](studentttest.md#studentttest)
+
+[Original article](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/welchTTest/)
diff --git a/docs/en/sql-reference/data-types/array.md b/docs/en/sql-reference/data-types/array.md
index 48957498d63..41e35aaa96f 100644
--- a/docs/en/sql-reference/data-types/array.md
+++ b/docs/en/sql-reference/data-types/array.md
@@ -45,6 +45,8 @@ SELECT [1, 2] AS x, toTypeName(x)
## Working with Data Types {#working-with-data-types}
+The maximum size of an array is limited to one million elements.
+
When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable) or literal [NULL](../../sql-reference/syntax.md#null-literal) values, the type of an array element also becomes [Nullable](../../sql-reference/data-types/nullable.md).
If ClickHouse couldn’t determine the data type, it generates an exception. For instance, this happens when trying to create an array with strings and numbers simultaneously (`SELECT array(1, 'a')`).
diff --git a/docs/en/sql-reference/functions/array-functions.md b/docs/en/sql-reference/functions/array-functions.md
index dc7727bdfd8..d5b357795d7 100644
--- a/docs/en/sql-reference/functions/array-functions.md
+++ b/docs/en/sql-reference/functions/array-functions.md
@@ -1288,73 +1288,226 @@ Returns the index of the first element in the `arr1` array for which `func` retu
Note that the `arrayFirstIndex` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
-## arrayMin(\[func,\] arr1, …) {#array-min}
+## arrayMin {#array-min}
-Returns the min of the `func` values. If the function is omitted, it just returns the min of the array elements.
+Returns the minimum of elements in the source array.
+
+If the `func` function is specified, returns the minimum of elements converted by this function.
Note that the `arrayMin` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
-Examples:
+**Syntax**
+
```sql
-SELECT arrayMin([1, 2, 4]) AS res
+arrayMin([func,] arr)
+```
+
+**Parameters**
+
+- `func` — Function. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — Array. [Array](../../sql-reference/data-types/array.md).
+
+**Returned value**
+
+- The minimum of function values (or the array minimum).
+
+Type: if `func` is specified, matches `func` return value type, otherwise matches the type of the array elements.
+
+**Examples**
+
+Query:
+
+```sql
+SELECT arrayMin([1, 2, 4]) AS res;
+```
+
+Result:
+
+```text
┌─res─┐
│ 1 │
└─────┘
+```
+
+Query:
+
-SELECT arrayMin(x -> (-x), [1, 2, 4]) AS res
+```sql
+SELECT arrayMin(x -> (-x), [1, 2, 4]) AS res;
+```
+
+Result:
+
+```text
┌─res─┐
│ -4 │
└─────┘
```
-## arrayMax(\[func,\] arr1, …) {#array-max}
+## arrayMax {#array-max}
-Returns the max of the `func` values. If the function is omitted, it just returns the max of the array elements.
+Returns the maximum of elements in the source array.
+
+If the `func` function is specified, returns the maximum of elements converted by this function.
Note that the `arrayMax` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
-Examples:
+**Syntax**
+
```sql
-SELECT arrayMax([1, 2, 4]) AS res
+arrayMax([func,] arr)
+```
+
+**Parameters**
+
+- `func` — Function. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — Array. [Array](../../sql-reference/data-types/array.md).
+
+**Returned value**
+
+- The maximum of function values (or the array maximum).
+
+Type: if `func` is specified, matches `func` return value type, otherwise matches the type of the array elements.
+
+**Examples**
+
+Query:
+
+```sql
+SELECT arrayMax([1, 2, 4]) AS res;
+```
+
+Result:
+
+```text
┌─res─┐
│ 4 │
└─────┘
+```
+
+Query:
+
-SELECT arrayMax(x -> (-x), [1, 2, 4]) AS res
+```sql
+SELECT arrayMax(x -> (-x), [1, 2, 4]) AS res;
+```
+
+Result:
+
+```text
┌─res─┐
│ -1 │
└─────┘
```
-## arraySum(\[func,\] arr1, …) {#array-sum}
+## arraySum {#array-sum}
-Returns the sum of the `func` values. If the function is omitted, it just returns the sum of the array elements.
+Returns the sum of elements in the source array.
+
+If the `func` function is specified, returns the sum of elements converted by this function.
Note that the `arraySum` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
-Examples:
+**Syntax**
+
```sql
-SELECT arraySum([2,3]) AS res
+arraySum([func,] arr)
+```
+
+**Parameters**
+
+- `func` — Function. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — Array. [Array](../../sql-reference/data-types/array.md).
+
+**Returned value**
+
+- The sum of the function values (or the array sum).
+
+Type: for decimal numbers in the source array (or for converted values, if `func` is specified) — [Decimal128](../../sql-reference/data-types/decimal.md), for floating point numbers — [Float64](../../sql-reference/data-types/float.md), for unsigned integers — [UInt64](../../sql-reference/data-types/int-uint.md), and for signed integers — [Int64](../../sql-reference/data-types/int-uint.md).
+
+**Examples**
+
+Query:
+
+```sql
+SELECT arraySum([2, 3]) AS res;
+```
+
+Result:
+
+```text
┌─res─┐
│ 5 │
└─────┘
+```
+
+Query:
+
-SELECT arraySum(x -> x*x, [2, 3]) AS res
+```sql
+SELECT arraySum(x -> x*x, [2, 3]) AS res;
+```
+
+Result:
+
+```text
┌─res─┐
│ 13 │
└─────┘
```
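+
+As a quick check of the return-type rule above, the result can be inspected with `toTypeName`; for unsigned integer input the sum is expected to be `UInt64` (a sketch, output shown per the rule stated above):
+
+```sql
+SELECT arraySum([2, 3]) AS res, toTypeName(res) AS type;
+```
+
+```text
+┌─res─┬─type───┐
+│   5 │ UInt64 │
+└─────┴────────┘
+```
+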
+## arrayAvg {#array-avg}
-## arrayAvg(\[func,\] arr1, …) {#array-avg}
+Returns the average of elements in the source array.
-Returns the average of the `func` values. If the function is omitted, it just returns the average of the array elements.
+If the `func` function is specified, returns the average of elements converted by this function.
Note that the `arrayAvg` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+**Syntax**
+
+```sql
+arrayAvg([func,] arr)
+```
+
+**Parameters**
+
+- `func` — Function. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — Array. [Array](../../sql-reference/data-types/array.md).
+
+**Returned value**
+
+- The average of function values (or the array average).
+
+Type: [Float64](../../sql-reference/data-types/float.md).
+
+**Examples**
+
+Query:
+
+```sql
+SELECT arrayAvg([1, 2, 4]) AS res;
+```
+
+Result:
+
+```text
+┌────────────────res─┐
+│ 2.3333333333333335 │
+└────────────────────┘
+```
+
+Query:
+
+```sql
+SELECT arrayAvg(x -> (x * x), [2, 4]) AS res;
+```
+
+Result:
+
+```text
+┌─res─┐
+│ 10 │
+└─────┘
+```
+
## arrayCumSum(\[func,\] arr1, …) {#arraycumsumfunc-arr1}
Returns an array of partial sums of elements in the source array (a running sum). If the `func` function is specified, then the values of the array elements are converted by this function before summing.
diff --git a/docs/en/sql-reference/functions/date-time-functions.md b/docs/en/sql-reference/functions/date-time-functions.md
index 9de780fb596..4a73bdb2546 100644
--- a/docs/en/sql-reference/functions/date-time-functions.md
+++ b/docs/en/sql-reference/functions/date-time-functions.md
@@ -380,7 +380,7 @@ Alias: `dateTrunc`.
**Parameters**
-- `unit` — Part of date. [String](../syntax.md#syntax-string-literal).
+- `unit` — The type of interval to truncate the result to. [String Literal](../syntax.md#syntax-string-literal).
Possible values:
- `second`
@@ -435,6 +435,201 @@ Result:
- [toStartOfInterval](#tostartofintervaltime-or-data-interval-x-unit-time-zone)
+## date\_add {#date_add}
+
+Adds the specified date/time interval to the provided date.
+
+**Syntax**
+
+``` sql
+date_add(unit, value, date)
+```
+
+Aliases: `dateAdd`, `DATE_ADD`.
+
+**Parameters**
+
+- `unit` — The type of interval to add. [String](../../sql-reference/data-types/string.md).
+
+ Supported values: second, minute, hour, day, week, month, quarter, year.
+- `value` — Value in the specified unit. [Int](../../sql-reference/data-types/int-uint.md).
+- `date` — [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
+
+
+**Returned value**
+
+Returns Date or DateTime with `value` expressed in `unit` added to `date`.
+
+**Example**
+
+Query:
+
+```sql
+SELECT date_add(YEAR, 3, toDate('2018-01-01'));
+```
+
+Result:
+
+```text
+┌─plus(toDate('2018-01-01'), toIntervalYear(3))─┐
+│ 2021-01-01 │
+└───────────────────────────────────────────────┘
+```
+
+## date\_diff {#date_diff}
+
+Returns the difference between two Date or DateTime values.
+
+**Syntax**
+
+``` sql
+date_diff('unit', startdate, enddate, [timezone])
+```
+
+Aliases: `dateDiff`, `DATE_DIFF`.
+
+**Parameters**
+
+- `unit` — The type of interval for the result. [String](../../sql-reference/data-types/string.md).
+
+ Supported values: second, minute, hour, day, week, month, quarter, year.
+
+- `startdate` — The first time value to subtract (the subtrahend). [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
+
+- `enddate` — The second time value to subtract from (the minuend). [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
+
+- `timezone` — Optional parameter. If specified, it is applied to both `startdate` and `enddate`. If not specified, timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified.
+
+**Returned value**
+
+Difference between `enddate` and `startdate` expressed in `unit`.
+
+Type: `int`.
+
+**Example**
+
+Query:
+
+``` sql
+SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
+```
+
+Result:
+
+``` text
+┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
+│ 25 │
+└────────────────────────────────────────────────────────────────────────────────────────┘
+```
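+
+Note that `date_diff` counts crossed unit boundaries rather than complete elapsed units. For example, the following hypothetical query is expected to return `1`, because a year boundary is crossed even though only one day has passed:
+
+``` sql
+SELECT dateDiff('year', toDate('2021-12-31'), toDate('2022-01-01'));
+```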
+
+## date\_sub {#date_sub}
+
+Subtracts a time/date interval from the provided date.
+
+**Syntax**
+
+``` sql
+date_sub(unit, value, date)
+```
+
+Aliases: `dateSub`, `DATE_SUB`.
+
+**Parameters**
+
+- `unit` — The type of interval to subtract. [String](../../sql-reference/data-types/string.md).
+
+ Supported values: second, minute, hour, day, week, month, quarter, year.
+- `value` — Value in the specified unit. [Int](../../sql-reference/data-types/int-uint.md).
+- `date` — [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md) to subtract value from.
+
+**Returned value**
+
+Returns Date or DateTime with `value` expressed in `unit` subtracted from `date`.
+
+**Example**
+
+Query:
+
+``` sql
+SELECT date_sub(YEAR, 3, toDate('2018-01-01'));
+```
+
+Result:
+
+``` text
+┌─minus(toDate('2018-01-01'), toIntervalYear(3))─┐
+│ 2015-01-01 │
+└────────────────────────────────────────────────┘
+```
+
+## timestamp\_add {#timestamp_add}
+
+Adds the specified time interval to the provided date or date with time value.
+
+**Syntax**
+
+``` sql
+timestamp_add(date, INTERVAL value unit)
+```
+
+Aliases: `timeStampAdd`, `TIMESTAMP_ADD`.
+
+**Parameters**
+
+- `date` — Date or date with time. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
+- `value` — Value in the specified unit. [Int](../../sql-reference/data-types/int-uint.md).
+- `unit` — The type of interval to add. [String](../../sql-reference/data-types/string.md).
+
+ Supported values: second, minute, hour, day, week, month, quarter, year.
+
+**Returned value**
+
+Returns Date or DateTime with the specified `value` expressed in `unit` added to `date`.
+
+**Example**
+
+Query:
+
+```sql
+SELECT timestamp_add(toDate('2018-01-01'), INTERVAL 3 MONTH);
+```
+
+Result:
+
+```text
+┌─plus(toDate('2018-01-01'), toIntervalMonth(3))─┐
+│ 2018-04-01 │
+└────────────────────────────────────────────────┘
+```
+
+## timestamp\_sub {#timestamp_sub}
+
+Subtracts the specified time interval from the provided date or date with time value.
+
+**Syntax**
+
+``` sql
+timestamp_sub(unit, value, date)
+```
+
+Aliases: `timeStampSub`, `TIMESTAMP_SUB`.
+
+**Parameters**
+
+- `unit` — The type of interval to subtract. [String](../../sql-reference/data-types/string.md).
+
+ Supported values: second, minute, hour, day, week, month, quarter, year.
+- `value` — Value in the specified unit. [Int](../../sql-reference/data-types/int-uint.md).
+- `date` — [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
+
+**Returned value**
+
+Returns Date or DateTime with the specified `value` expressed in `unit` subtracted from `date`.
+
+**Example**
+
+Query:
+
+```sql
+SELECT timestamp_sub(MONTH, 5, toDateTime('2018-12-18 01:02:03'));
+```
+
+Result:
+
+```text
+┌─minus(toDateTime('2018-12-18 01:02:03'), toIntervalMonth(5))─┐
+│ 2018-07-18 01:02:03 │
+└──────────────────────────────────────────────────────────────┘
+```
+
## now {#now}
Returns the current date and time.
@@ -550,50 +745,6 @@ SELECT
└──────────────────────────┴───────────────────────────────┘
```
-## dateDiff {#datediff}
-
-Returns the difference between two Date or DateTime values.
-
-**Syntax**
-
-``` sql
-dateDiff('unit', startdate, enddate, [timezone])
-```
-
-**Parameters**
-
-- `unit` — Time unit, in which the returned value is expressed. [String](../../sql-reference/syntax.md#syntax-string-literal).
-
- Supported values: second, minute, hour, day, week, month, quarter, year.
-
-- `startdate` — The first time value to compare. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
-
-- `enddate` — The second time value to compare. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md).
-
-- `timezone` — Optional parameter. If specified, it is applied to both `startdate` and `enddate`. If not specified, timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified.
-
-**Returned value**
-
-Difference between `startdate` and `enddate` expressed in `unit`.
-
-Type: `int`.
-
-**Example**
-
-Query:
-
-``` sql
-SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
-```
-
-Result:
-
-``` text
-┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
-│ 25 │
-└────────────────────────────────────────────────────────────────────────────────────────┘
-```
-
## timeSlots(StartTime, Duration,\[, Size\]) {#timeslotsstarttime-duration-size}
For a time interval starting at ‘StartTime’ and continuing for ‘Duration’ seconds, it returns an array of moments in time, consisting of points from this interval rounded down to the ‘Size’ in seconds. ‘Size’ is an optional parameter: a constant UInt32, set to 1800 by default.
diff --git a/docs/en/sql-reference/statements/alter/quota.md b/docs/en/sql-reference/statements/alter/quota.md
index 905c57503fc..a43b5255598 100644
--- a/docs/en/sql-reference/statements/alter/quota.md
+++ b/docs/en/sql-reference/statements/alter/quota.md
@@ -5,7 +5,7 @@ toc_title: QUOTA
# ALTER QUOTA {#alter-quota-statement}
-Changes [quotas](../../../operations/access-rights.md#quotas-management).
+Changes quotas.
Syntax:
@@ -14,13 +14,13 @@ ALTER QUOTA [IF EXISTS] name [ON CLUSTER cluster_name]
[RENAME TO new_name]
[KEYED BY {user_name | ip_address | client_key | client_key,user_name | client_key,ip_address} | NOT KEYED]
[FOR [RANDOMIZED] INTERVAL number {second | minute | hour | day | week | month | quarter | year}
- {MAX { {queries | errors | result_rows | result_bytes | read_rows | read_bytes | execution_time} = number } [,...] |
+ {MAX { {queries | query_selects | query_inserts | errors | result_rows | result_bytes | read_rows | read_bytes | execution_time} = number } [,...] |
NO LIMITS | TRACKING ONLY} [,...]]
[TO {role [,...] | ALL | ALL EXCEPT role [,...]}]
```
Keys `user_name`, `ip_address`, `client_key`, `client_key, user_name` and `client_key, ip_address` correspond to the fields in the [system.quotas](../../../operations/system-tables/quotas.md) table.
-Parameters `queries`, `errors`, `result_rows`, `result_bytes`, `read_rows`, `read_bytes`, `execution_time` correspond to the fields in the [system.quotas_usage](../../../operations/system-tables/quotas_usage.md) table.
+Parameters `queries`, `query_selects`, `query_inserts`, `errors`, `result_rows`, `result_bytes`, `read_rows`, `read_bytes`, `execution_time` correspond to the fields in the [system.quotas_usage](../../../operations/system-tables/quotas_usage.md) table.
`ON CLUSTER` clause allows creating quotas on a cluster, see [Distributed DDL](../../../sql-reference/distributed-ddl.md).
diff --git a/docs/en/sql-reference/statements/create/quota.md b/docs/en/sql-reference/statements/create/quota.md
index ec980af921f..71416abf588 100644
--- a/docs/en/sql-reference/statements/create/quota.md
+++ b/docs/en/sql-reference/statements/create/quota.md
@@ -13,14 +13,14 @@ Syntax:
CREATE QUOTA [IF NOT EXISTS | OR REPLACE] name [ON CLUSTER cluster_name]
[KEYED BY {user_name | ip_address | client_key | client_key,user_name | client_key,ip_address} | NOT KEYED]
[FOR [RANDOMIZED] INTERVAL number {second | minute | hour | day | week | month | quarter | year}
- {MAX { {queries | errors | result_rows | result_bytes | read_rows | read_bytes | execution_time} = number } [,...] |
+ {MAX { {queries | query_selects | query_inserts | errors | result_rows | result_bytes | read_rows | read_bytes | execution_time} = number } [,...] |
NO LIMITS | TRACKING ONLY} [,...]]
[TO {role [,...] | ALL | ALL EXCEPT role [,...]}]
```
-Keys `user_name`, `ip_address`, `client_key`, `client_key, user_name` and `client_key, ip_address` correspond to the fields in the [system.quotas](../../../operations/system-tables/quotas.md) table.
+Keys `user_name`, `ip_address`, `client_key`, `client_key, user_name` and `client_key, ip_address` correspond to the fields in the [system.quotas](../../../operations/system-tables/quotas.md) table.
-Parameters `queries`, `errors`, `result_rows`, `result_bytes`, `read_rows`, `read_bytes`, `execution_time` correspond to the fields in the [system.quotas_usage](../../../operations/system-tables/quotas_usage.md) table.
+Parameters `queries`, `query_selects`, `query_inserts`, `errors`, `result_rows`, `result_bytes`, `read_rows`, `read_bytes`, `execution_time` correspond to the fields in the [system.quotas_usage](../../../operations/system-tables/quotas_usage.md) table.
`ON CLUSTER` clause allows creating quotas on a cluster, see [Distributed DDL](../../../sql-reference/distributed-ddl.md).
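+
+For instance, a hypothetical quota using the new counters (the name and limits are illustrative):
+
+``` sql
+CREATE QUOTA IF NOT EXISTS qX FOR INTERVAL 1 hour
+    MAX query_selects = 100, query_inserts = 10
+    TO CURRENT_USER;
+```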
diff --git a/docs/en/sql-reference/window-functions/index.md b/docs/en/sql-reference/window-functions/index.md
index a79328ade32..5a6f13226a5 100644
--- a/docs/en/sql-reference/window-functions/index.md
+++ b/docs/en/sql-reference/window-functions/index.md
@@ -1,9 +1,14 @@
-# [development] Window Functions
+---
+toc_priority: 62
+toc_title: Window Functions
+---
+
+# [experimental] Window Functions
!!! warning "Warning"
This is an experimental feature that is currently in development and is not ready
for general use. It will change in unpredictable backwards-incompatible ways in
-the future releases.
+the future releases. Set `allow_experimental_window_functions = 1` to enable it.
ClickHouse currently supports calculation of aggregate functions over a window.
Pure window functions such as `rank`, `lag`, `lead` and so on are not yet supported.
@@ -11,9 +16,7 @@ Pure window functions such as `rank`, `lag`, `lead` and so on are not yet suppor
The window can be specified either with an `OVER` clause or with a separate
`WINDOW` clause.
-Only two variants of frame are supported, `ROWS` and `RANGE`. The only supported
-frame boundaries are `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.
-
+Only two variants of frame are supported, `ROWS` and `RANGE`. Offsets for the `RANGE` frame are not yet supported.
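+
+A minimal sketch of an aggregate over a window, using the `numbers` table function so it is self-contained:
+
+```sql
+SET allow_experimental_window_functions = 1;
+
+SELECT
+    number,
+    sum(number) OVER (ORDER BY number ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
+FROM numbers(5);
+```
+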
## References
@@ -28,6 +31,7 @@ https://github.com/ClickHouse/ClickHouse/blob/master/tests/performance/window_fu
https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/01591_window_functions.sql
### Postgres Docs
+https://www.postgresql.org/docs/current/sql-select.html#SQL-WINDOW
https://www.postgresql.org/docs/devel/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
https://www.postgresql.org/docs/devel/functions-window.html
https://www.postgresql.org/docs/devel/tutorial-window.html
diff --git a/docs/ru/development/style.md b/docs/ru/development/style.md
index 4d71dca46a7..1b211259bbb 100644
--- a/docs/ru/development/style.md
+++ b/docs/ru/development/style.md
@@ -714,6 +714,7 @@ auto s = std::string{"Hello"};
### Пользовательская ошибка {#error-messages-user-error}
Такая ошибка вызвана действиями пользователя (неверный синтаксис запроса) или конфигурацией внешних систем (кончилось место на диске). Предполагается, что пользователь может устранить её самостоятельно. Для этого в сообщении об ошибке должна содержаться следующая информация:
+
* что произошло. Это должно объясняться в пользовательских терминах (`Function pow() is not supported for data type UInt128`), а не загадочными конструкциями из кода (`runtime overload resolution failed in DB::BinaryOperationBuilder::Impl, UInt128, Int8>::kaboongleFastPath()`).
* почему/где/когда -- любой контекст, который помогает отладить проблему. Представьте, как бы её отлаживали вы (программировать и пользоваться отладчиком нельзя).
* что можно предпринять для устранения ошибки. Здесь можно перечислить типичные причины проблемы, настройки, влияющие на это поведение, и так далее.
diff --git a/docs/ru/operations/system-tables/part_log.md b/docs/ru/operations/system-tables/part_log.md
index 255ece76ee2..bba4fda6135 100644
--- a/docs/ru/operations/system-tables/part_log.md
+++ b/docs/ru/operations/system-tables/part_log.md
@@ -6,29 +6,62 @@
Столбцы:
-- `event_type` (Enum) — тип события. Столбец может содержать одно из следующих значений:
+- `query_id` ([String](../../sql-reference/data-types/string.md)) — идентификатор запроса `INSERT`, создавшего этот кусок.
+- `event_type` ([Enum8](../../sql-reference/data-types/enum.md)) — тип события. Столбец может содержать одно из следующих значений:
- `NEW_PART` — вставка нового куска.
- `MERGE_PARTS` — слияние кусков.
- `DOWNLOAD_PART` — загрузка с реплики.
- `REMOVE_PART` — удаление или отсоединение из таблицы с помощью [DETACH PARTITION](../../sql-reference/statements/alter/partition.md#alter_detach-partition).
- `MUTATE_PART` — изменение куска.
- `MOVE_PART` — перемещение куска между дисками.
-- `event_date` (Date) — дата события.
-- `event_time` (DateTime) — время события.
-- `duration_ms` (UInt64) — длительность.
-- `database` (String) — имя базы данных, в которой находится кусок.
-- `table` (String) — имя таблицы, в которой находится кусок.
-- `part_name` (String) — имя куска.
-- `partition_id` (String) — идентификатор партиции, в которую был добавлен кусок. В столбце будет значение ‘all’, если таблица партициируется по выражению `tuple()`.
-- `rows` (UInt64) — число строк в куске.
-- `size_in_bytes` (UInt64) — размер куска данных в байтах.
-- `merged_from` (Array(String)) — массив имён кусков, из которых образован текущий кусок в результате слияния (также столбец заполняется в случае скачивания уже смерженного куска).
-- `bytes_uncompressed` (UInt64) — количество прочитанных разжатых байт.
-- `read_rows` (UInt64) — сколько было прочитано строк при слиянии кусков.
-- `read_bytes` (UInt64) — сколько было прочитано байт при слиянии кусков.
-- `error` (UInt16) — код ошибки, возникшей при текущем событии.
-- `exception` (String) — текст ошибки.
+- `event_date` ([Date](../../sql-reference/data-types/date.md)) — дата события.
+- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — время события.
+- `duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md)) — длительность.
+- `database` ([String](../../sql-reference/data-types/string.md)) — имя базы данных, в которой находится кусок.
+- `table` ([String](../../sql-reference/data-types/string.md)) — имя таблицы, в которой находится кусок.
+- `part_name` ([String](../../sql-reference/data-types/string.md)) — имя куска.
+- `partition_id` ([String](../../sql-reference/data-types/string.md)) — идентификатор партиции, в которую был добавлен кусок. В столбце будет значение `all`, если таблица партициируется по выражению `tuple()`.
+- `path_on_disk` ([String](../../sql-reference/data-types/string.md)) — абсолютный путь к папке с файлами кусков данных.
+- `rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — число строк в куске.
+- `size_in_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — размер куска данных в байтах.
+- `merged_from` ([Array(String)](../../sql-reference/data-types/array.md)) — массив имён кусков, из которых образован текущий кусок в результате слияния (также столбец заполняется в случае скачивания уже смерженного куска).
+- `bytes_uncompressed` ([UInt64](../../sql-reference/data-types/int-uint.md)) — количество прочитанных несжатых байт.
+- `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — сколько было прочитано строк при слиянии кусков.
+- `read_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — сколько было прочитано байт при слиянии кусков.
+- `peak_memory_usage` ([Int64](../../sql-reference/data-types/int-uint.md)) — максимальная разница между выделенной и освобождённой памятью в контексте потока.
+- `error` ([UInt16](../../sql-reference/data-types/int-uint.md)) — код ошибки, возникшей при текущем событии.
+- `exception` ([String](../../sql-reference/data-types/string.md)) — текст ошибки.
Системная таблица `system.part_log` будет создана после первой вставки данных в таблицу `MergeTree`.
+**Пример**
+
+``` sql
+SELECT * FROM system.part_log LIMIT 1 FORMAT Vertical;
+```
+
+``` text
+Row 1:
+──────
+query_id: 983ad9c7-28d5-4ae1-844e-603116b7de31
+event_type: NewPart
+event_date: 2021-02-02
+event_time: 2021-02-02 11:14:28
+duration_ms: 35
+database: default
+table: log_mt_2
+part_name: all_1_1_0
+partition_id: all
+path_on_disk: db/data/default/log_mt_2/all_1_1_0/
+rows: 115418
+size_in_bytes: 1074311
+merged_from: []
+bytes_uncompressed: 0
+read_rows: 0
+read_bytes: 0
+peak_memory_usage: 0
+error: 0
+exception:
+```
+
[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/part_log)
diff --git a/docs/ru/sql-reference/aggregate-functions/reference/argmax.md b/docs/ru/sql-reference/aggregate-functions/reference/argmax.md
index 97edd5773c8..f44e65831a9 100644
--- a/docs/ru/sql-reference/aggregate-functions/reference/argmax.md
+++ b/docs/ru/sql-reference/aggregate-functions/reference/argmax.md
@@ -4,8 +4,63 @@ toc_priority: 106
# argMax {#agg-function-argmax}
-Синтаксис: `argMax(arg, val)`
+Вычисляет значение `arg` при максимальном значении `val`. Если есть несколько разных значений `arg` для максимальных значений `val`, возвращает первое попавшееся из таких значений.
-Вычисляет значение arg при максимальном значении val. Если есть несколько разных значений arg для максимальных значений val, то выдаётся первое попавшееся из таких значений.
+Если функции передан кортеж, то будет выведен кортеж с максимальным значением `val`. Удобно использовать для работы с [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md).
-[Оригинальная статья](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/argmax/)
+**Синтаксис**
+
+``` sql
+argMax(arg, val)
+```
+
+или
+
+``` sql
+argMax(tuple(arg, val))
+```
+
+**Параметры**
+
+- `arg` — аргумент.
+- `val` — значение.
+
+**Возвращаемое значение**
+
+- Значение `arg`, соответствующее максимальному значению `val`.
+
+Тип: соответствует типу `arg`.
+
+Если передан кортеж:
+
+- Кортеж `(arg, val)` с максимальным значением `val` и соответствующим ему `arg`.
+
+Тип: [Tuple](../../../sql-reference/data-types/tuple.md).
+
+**Пример**
+
+Исходная таблица:
+
+``` text
+┌─user─────┬─salary─┐
+│ director │ 5000 │
+│ manager │ 3000 │
+│ worker │ 1000 │
+└──────────┴────────┘
+```
+
+Запрос:
+
+``` sql
+SELECT argMax(user, salary), argMax(tuple(user, salary)) FROM salary;
+```
+
+Результат:
+
+``` text
+┌─argMax(user, salary)─┬─argMax(tuple(user, salary))─┐
+│ director │ ('director',5000) │
+└──────────────────────┴─────────────────────────────┘
+```
+
+[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/aggregate-functions/reference/argmax/)
diff --git a/docs/ru/sql-reference/aggregate-functions/reference/argmin.md b/docs/ru/sql-reference/aggregate-functions/reference/argmin.md
index 58161cd226a..8c25b79f92a 100644
--- a/docs/ru/sql-reference/aggregate-functions/reference/argmin.md
+++ b/docs/ru/sql-reference/aggregate-functions/reference/argmin.md
@@ -4,11 +4,42 @@ toc_priority: 105
# argMin {#agg-function-argmin}
-Синтаксис: `argMin(arg, val)`
+Вычисляет значение `arg` при минимальном значении `val`. Если есть несколько разных значений `arg` для минимальных значений `val`, возвращает первое попавшееся из таких значений.
-Вычисляет значение arg при минимальном значении val. Если есть несколько разных значений arg для минимальных значений val, то выдаётся первое попавшееся из таких значений.
+Если функции передан кортеж, то будет выведен кортеж с минимальным значением `val`. Удобно использовать для работы с [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md).
-**Пример:**
+**Синтаксис**
+
+``` sql
+argMin(arg, val)
+```
+
+или
+
+``` sql
+argMin(tuple(arg, val))
+```
+
+**Параметры**
+
+- `arg` — аргумент.
+- `val` — значение.
+
+**Возвращаемое значение**
+
+- Значение `arg`, соответствующее минимальному значению `val`.
+
+Тип: соответствует типу `arg`.
+
+Если передан кортеж:
+
+- Кортеж `(arg, val)` с минимальным значением `val` и соответствующим ему `arg`.
+
+Тип: [Tuple](../../../sql-reference/data-types/tuple.md).
+
+**Пример**
+
+Исходная таблица:
``` text
┌─user─────┬─salary─┐
@@ -18,14 +49,18 @@ toc_priority: 105
└──────────┴────────┘
```
+Запрос:
+
``` sql
-SELECT argMin(user, salary) FROM salary
+SELECT argMin(user, salary), argMin(tuple(user, salary)) FROM salary;
```
+Результат:
+
``` text
-┌─argMin(user, salary)─┐
-│ worker │
-└──────────────────────┘
+┌─argMin(user, salary)─┬─argMin(tuple(user, salary))─┐
+│ worker │ ('worker',1000) │
+└──────────────────────┴─────────────────────────────┘
```
-[Оригинальная статья](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/argmin/)
+[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/aggregate-functions/reference/argmin/)
diff --git a/docs/ru/sql-reference/aggregate-functions/reference/mannwhitneyutest.md b/docs/ru/sql-reference/aggregate-functions/reference/mannwhitneyutest.md
new file mode 100644
index 00000000000..fb73fff5f00
--- /dev/null
+++ b/docs/ru/sql-reference/aggregate-functions/reference/mannwhitneyutest.md
@@ -0,0 +1,71 @@
+---
+toc_priority: 310
+toc_title: mannWhitneyUTest
+---
+
+# mannWhitneyUTest {#mannwhitneyutest}
+
+Вычисляет U-критерий Манна — Уитни для выборок из двух генеральных совокупностей.
+
+**Синтаксис**
+
+``` sql
+mannWhitneyUTest[(alternative[, continuity_correction])](sample_data, sample_index)
+```
+
+Значения выборок берутся из столбца `sample_data`. Если `sample_index` равно 0, то значение из этой строки принадлежит первой выборке. Во всех остальных случаях значение принадлежит второй выборке.
+Проверяется нулевая гипотеза, что генеральные совокупности стохастически равны. Наряду с двусторонней гипотезой могут быть проверены и односторонние.
+Для применения U-критерия Манна — Уитни закон распределения генеральных совокупностей не обязан быть нормальным.
+
+**Параметры**
+
+- `alternative` — альтернативная гипотеза. (Необязательный параметр, по умолчанию: `'two-sided'`.) [String](../../../sql-reference/data-types/string.md).
+ - `'two-sided'`;
+ - `'greater'`;
+ - `'less'`.
+- `continuity_correction` — если не 0, то при вычислении p-значения применяется коррекция непрерывности. (Необязательный параметр, по умолчанию: 1.) [UInt64](../../../sql-reference/data-types/int-uint.md).
+- `sample_data` — данные выборок. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) или [Decimal](../../../sql-reference/data-types/decimal.md).
+- `sample_index` — индексы выборок. [Integer](../../../sql-reference/data-types/int-uint.md).
+
+
+**Возвращаемые значения**
+
+[Кортеж](../../../sql-reference/data-types/tuple.md) с двумя элементами:
+- вычисленное значение критерия Манна — Уитни. [Float64](../../../sql-reference/data-types/float.md).
+- вычисленное p-значение. [Float64](../../../sql-reference/data-types/float.md).
+
+
+**Пример**
+
+Таблица:
+
+``` text
+┌─sample_data─┬─sample_index─┐
+│ 10 │ 0 │
+│ 11 │ 0 │
+│ 12 │ 0 │
+│ 1 │ 1 │
+│ 2 │ 1 │
+│ 3 │ 1 │
+└─────────────┴──────────────┘
+```
+
+Запрос:
+
+``` sql
+SELECT mannWhitneyUTest('greater')(sample_data, sample_index) FROM mww_ttest;
+```
+
+Результат:
+
+``` text
+┌─mannWhitneyUTest('greater')(sample_data, sample_index)─┐
+│ (9,0.04042779918503192) │
+└────────────────────────────────────────────────────────┘
+```
+
+**Смотрите также**
+
+- [U-критерий Манна — Уитни](https://ru.wikipedia.org/wiki/U-%D0%BA%D1%80%D0%B8%D1%82%D0%B5%D1%80%D0%B8%D0%B9_%D0%9C%D0%B0%D0%BD%D0%BD%D0%B0_%E2%80%94_%D0%A3%D0%B8%D1%82%D0%BD%D0%B8)
+
+[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/aggregate-functions/reference/mannwhitneyutest/)
diff --git a/docs/ru/sql-reference/aggregate-functions/reference/studentttest.md b/docs/ru/sql-reference/aggregate-functions/reference/studentttest.md
new file mode 100644
index 00000000000..5361e06c5e2
--- /dev/null
+++ b/docs/ru/sql-reference/aggregate-functions/reference/studentttest.md
@@ -0,0 +1,65 @@
+---
+toc_priority: 300
+toc_title: studentTTest
+---
+
+# studentTTest {#studentttest}
+
+Вычисляет t-критерий Стьюдента для выборок из двух генеральных совокупностей.
+
+**Синтаксис**
+
+``` sql
+studentTTest(sample_data, sample_index)
+```
+
+Значения выборок берутся из столбца `sample_data`. Если `sample_index` равно 0, то значение из этой строки принадлежит первой выборке. Во всех остальных случаях значение принадлежит второй выборке.
+Проверяется нулевая гипотеза, что средние значения генеральных совокупностей совпадают. Для применения t-критерия Стьюдента распределение в генеральных совокупностях должно быть нормальным и дисперсии должны совпадать.
+
+**Параметры**
+
+- `sample_data` — данные выборок. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) или [Decimal](../../../sql-reference/data-types/decimal.md).
+- `sample_index` — индексы выборок. [Integer](../../../sql-reference/data-types/int-uint.md).
+
+**Возвращаемые значения**
+
+[Кортеж](../../../sql-reference/data-types/tuple.md) с двумя элементами:
+- вычисленное значение критерия Стьюдента. [Float64](../../../sql-reference/data-types/float.md).
+- вычисленное p-значение. [Float64](../../../sql-reference/data-types/float.md).
+
+
+**Пример**
+
+Таблица:
+
+``` text
+┌─sample_data─┬─sample_index─┐
+│ 20.3 │ 0 │
+│ 21.1 │ 0 │
+│ 21.9 │ 1 │
+│ 21.7 │ 0 │
+│ 19.9 │ 1 │
+│ 21.8 │ 1 │
+└─────────────┴──────────────┘
+```
+
+Запрос:
+
+``` sql
+SELECT studentTTest(sample_data, sample_index) FROM student_ttest;
+```
+
+Результат:
+
+``` text
+┌─studentTTest(sample_data, sample_index)───┐
+│ (-0.21739130434783777,0.8385421208415731) │
+└───────────────────────────────────────────┘
+```
+
+**Смотрите также**
+
+- [t-критерий Стьюдента](https://ru.wikipedia.org/wiki/T-%D0%BA%D1%80%D0%B8%D1%82%D0%B5%D1%80%D0%B8%D0%B9_%D0%A1%D1%82%D1%8C%D1%8E%D0%B4%D0%B5%D0%BD%D1%82%D0%B0)
+- [welchTTest](welchttest.md#welchttest)
+
+[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/aggregate-functions/reference/studentttest/)
diff --git a/docs/ru/sql-reference/aggregate-functions/reference/welchttest.md b/docs/ru/sql-reference/aggregate-functions/reference/welchttest.md
new file mode 100644
index 00000000000..1f36b2d04ee
--- /dev/null
+++ b/docs/ru/sql-reference/aggregate-functions/reference/welchttest.md
@@ -0,0 +1,65 @@
+---
+toc_priority: 301
+toc_title: welchTTest
+---
+
+# welchTTest {#welchttest}
+
+Вычисляет t-критерий Уэлча для выборок из двух генеральных совокупностей.
+
+**Синтаксис**
+
+``` sql
+welchTTest(sample_data, sample_index)
+```
+
+Значения выборок берутся из столбца `sample_data`. Если `sample_index` равно 0, то значение из этой строки принадлежит первой выборке. Во всех остальных случаях значение принадлежит второй выборке.
+Проверяется нулевая гипотеза, что средние значения генеральных совокупностей совпадают. Для применения t-критерия Уэлча распределение в генеральных совокупностях должно быть нормальным. Дисперсии могут не совпадать.
+
+**Параметры**
+
+- `sample_data` — данные выборок. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) или [Decimal](../../../sql-reference/data-types/decimal.md).
+- `sample_index` — индексы выборок. [Integer](../../../sql-reference/data-types/int-uint.md).
+
+**Возвращаемые значения**
+
+[Кортеж](../../../sql-reference/data-types/tuple.md) с двумя элементами:
+- вычисленное значение критерия Уэлча. [Float64](../../../sql-reference/data-types/float.md).
+- вычисленное p-значение. [Float64](../../../sql-reference/data-types/float.md).
+
+
+**Пример**
+
+Таблица:
+
+``` text
+┌─sample_data─┬─sample_index─┐
+│ 20.3 │ 0 │
+│ 22.1 │ 0 │
+│ 21.9 │ 0 │
+│ 18.9 │ 1 │
+│ 20.3 │ 1 │
+│ 19 │ 1 │
+└─────────────┴──────────────┘
+```
+
+Запрос:
+
+``` sql
+SELECT welchTTest(sample_data, sample_index) FROM welch_ttest;
+```
+
+Результат:
+
+``` text
+┌─welchTTest(sample_data, sample_index)─────┐
+│ (2.7988719532211235,0.051807360348581945) │
+└───────────────────────────────────────────┘
+```
+
+**Смотрите также**
+
+- [t-критерий Уэлча](https://ru.wikipedia.org/wiki/T-%D0%BA%D1%80%D0%B8%D1%82%D0%B5%D1%80%D0%B8%D0%B9_%D0%A3%D1%8D%D0%BB%D1%87%D0%B0)
+- [studentTTest](studentttest.md#studentttest)
+
+[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/aggregate-functions/reference/welchTTest/)
diff --git a/docs/ru/sql-reference/data-types/array.md b/docs/ru/sql-reference/data-types/array.md
index 906246b66ee..86a23ed041b 100644
--- a/docs/ru/sql-reference/data-types/array.md
+++ b/docs/ru/sql-reference/data-types/array.md
@@ -47,6 +47,8 @@ SELECT [1, 2] AS x, toTypeName(x)
## Особенности работы с типами данных {#osobennosti-raboty-s-tipami-dannykh}
+Максимальный размер массива ограничен одним миллионом элементов.
+
При создании массива «на лету» ClickHouse автоматически определяет тип аргументов как наиболее узкий тип данных, в котором можно хранить все перечисленные аргументы. Если среди аргументов есть [NULL](../../sql-reference/data-types/array.md#null-literal) или аргумент типа [Nullable](nullable.md#data_type-nullable), то тип элементов массива — [Nullable](nullable.md).
Если ClickHouse не смог подобрать тип данных, то он сгенерирует исключение. Это произойдёт, например, при попытке создать массив одновременно со строками и числами `SELECT array(1, 'a')`.
diff --git a/docs/ru/sql-reference/functions/array-functions.md b/docs/ru/sql-reference/functions/array-functions.md
index 015d14b9de5..80057e6f0e0 100644
--- a/docs/ru/sql-reference/functions/array-functions.md
+++ b/docs/ru/sql-reference/functions/array-functions.md
@@ -1135,11 +1135,225 @@ SELECT
Функция `arrayFirstIndex` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей нужно передать лямбда-функцию, и этот аргумент не может быть опущен.
-## arraySum(\[func,\] arr1, …) {#array-sum}
+## arrayMin {#array-min}
-Возвращает сумму значений функции `func`. Если функция не указана - просто возвращает сумму элементов массива.
+Возвращает значение минимального элемента в исходном массиве.
-Функция `arraySum` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) - в качестве первого аргумента ей можно передать лямбда-функцию.
+Если передана функция `func`, возвращается минимум из элементов массива, преобразованных этой функцией.
+
+Функция `arrayMin` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию.
+
+**Синтаксис**
+
+```sql
+arrayMin([func,] arr)
+```
+
+**Параметры**
+
+- `func` — функция. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — массив. [Array](../../sql-reference/data-types/array.md).
+
+**Возвращаемое значение**
+
+- Минимальное значение функции (или минимальный элемент массива).
+
+Тип: если передана `func`, соответствует типу ее возвращаемого значения, иначе соответствует типу элементов массива.
+
+**Примеры**
+
+Запрос:
+
+```sql
+SELECT arrayMin([1, 2, 4]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ 1 │
+└─────┘
+```
+
+Запрос:
+
+```sql
+SELECT arrayMin(x -> (-x), [1, 2, 4]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ -4 │
+└─────┘
+```
+
+## arrayMax {#array-max}
+
+Возвращает значение максимального элемента в исходном массиве.
+
+Если передана функция `func`, возвращается максимум из элементов массива, преобразованных этой функцией.
+
+Функция `arrayMax` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию.
+
+**Синтаксис**
+
+```sql
+arrayMax([func,] arr)
+```
+
+**Параметры**
+
+- `func` — функция. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — массив. [Array](../../sql-reference/data-types/array.md).
+
+**Возвращаемое значение**
+
+- Максимальное значение функции (или максимальный элемент массива).
+
+Тип: если передана `func`, соответствует типу ее возвращаемого значения, иначе соответствует типу элементов массива.
+
+**Примеры**
+
+Запрос:
+
+```sql
+SELECT arrayMax([1, 2, 4]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ 4 │
+└─────┘
+```
+
+Запрос:
+
+```sql
+SELECT arrayMax(x -> (-x), [1, 2, 4]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ -1 │
+└─────┘
+```
+
+## arraySum {#array-sum}
+
+Возвращает сумму элементов в исходном массиве.
+
+Если передана функция `func`, возвращается сумма элементов массива, преобразованных этой функцией.
+
+Функция `arraySum` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию.
+
+**Синтаксис**
+
+```sql
+arraySum([func,] arr)
+```
+
+**Параметры**
+
+- `func` — функция. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — массив. [Array](../../sql-reference/data-types/array.md).
+
+**Возвращаемое значение**
+
+- Сумма значений функции (или сумма элементов массива).
+
+Тип: для Decimal чисел в исходном массиве (если функция `func` была передана, то для чисел, преобразованных ею) — [Decimal128](../../sql-reference/data-types/decimal.md), для чисел с плавающей точкой — [Float64](../../sql-reference/data-types/float.md), для беззнаковых целых чисел — [UInt64](../../sql-reference/data-types/int-uint.md), для целых чисел со знаком — [Int64](../../sql-reference/data-types/int-uint.md).
+
+**Примеры**
+
+Запрос:
+
+```sql
+SELECT arraySum([2, 3]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ 5 │
+└─────┘
+```
+
+Запрос:
+
+```sql
+SELECT arraySum(x -> x*x, [2, 3]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ 13 │
+└─────┘
+```
+
+## arrayAvg {#array-avg}
+
+Возвращает среднее значение элементов в исходном массиве.
+
+Если передана функция `func`, возвращается среднее значение элементов массива, преобразованных этой функцией.
+
+Функция `arrayAvg` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию.
+
+**Синтаксис**
+
+```sql
+arrayAvg([func,] arr)
+```
+
+**Параметры**
+
+- `func` — функция. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
+- `arr` — массив. [Array](../../sql-reference/data-types/array.md).
+
+**Возвращаемое значение**
+
+- Среднее значение функции (или среднее значение элементов массива).
+
+Тип: [Float64](../../sql-reference/data-types/float.md).
+
+**Примеры**
+
+Запрос:
+
+```sql
+SELECT arrayAvg([1, 2, 4]) AS res;
+```
+
+Результат:
+
+```text
+┌────────────────res─┐
+│ 2.3333333333333335 │
+└────────────────────┘
+```
+
+Запрос:
+
+```sql
+SELECT arrayAvg(x -> (x * x), [2, 4]) AS res;
+```
+
+Результат:
+
+```text
+┌─res─┐
+│ 10 │
+└─────┘
+```
## arrayCumSum(\[func,\] arr1, …) {#arraycumsumfunc-arr1}
diff --git a/docs/zh/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md b/docs/zh/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
index 7a0a42fa47c..3b89da9f595 100644
--- a/docs/zh/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
+++ b/docs/zh/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
@@ -37,7 +37,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
VersionedCollapsingMergeTree(sign, version)
```
-- `sign` — 指定行类型的列名: `1` 是一个 “state” 行, `-1` 是一个 “cancel” 划
+- `sign` — 指定行类型的列名: `1` 是一个 “state” 行, `-1` 是一个 “cancel” 行
列数据类型应为 `Int8`.
diff --git a/docs/zh/operations/system-tables/zookeeper.md b/docs/zh/operations/system-tables/zookeeper.md
index b66e5262df3..f7e816ccee6 100644
--- a/docs/zh/operations/system-tables/zookeeper.md
+++ b/docs/zh/operations/system-tables/zookeeper.md
@@ -6,12 +6,16 @@ machine_translated_rev: 5decc73b5dc60054f19087d3690c4eb99446a6c3
# 系统。动物园管理员 {#system-zookeeper}
如果未配置ZooKeeper,则表不存在。 允许从配置中定义的ZooKeeper集群读取数据。
-查询必须具有 ‘path’ WHERE子句中的平等条件。 这是ZooKeeper中您想要获取数据的孩子的路径。
+查询必须具有 ‘path’ WHERE子句中的相等条件或者在某个集合中的条件。 这是ZooKeeper中您想要获取数据的孩子的路径。
查询 `SELECT * FROM system.zookeeper WHERE path = '/clickhouse'` 输出对所有孩子的数据 `/clickhouse` 节点。
要输出所有根节点的数据,write path= ‘/’.
如果在指定的路径 ‘path’ 不存在,将引发异常。
+查询`SELECT * FROM system.zookeeper WHERE path IN ('/', '/clickhouse')` 输出`/` 和 `/clickhouse`节点上所有子节点的数据。
+如果在指定的 ‘path’ 集合中有不存在的路径,将引发异常。
+它可以用来做一批ZooKeeper路径查询。
+
列:
- `name` (String) — The name of the node.
diff --git a/programs/client/Client.cpp b/programs/client/Client.cpp
index 9a8b580407a..e41f780e99a 100644
--- a/programs/client/Client.cpp
+++ b/programs/client/Client.cpp
@@ -1719,7 +1719,7 @@ private:
}
// Remember where the data ended. We use this info later to determine
// where the next query begins.
- parsed_insert_query->end = data_in.buffer().begin() + data_in.count();
+ parsed_insert_query->end = parsed_insert_query->data + data_in.count();
}
else if (!is_interactive)
{
@@ -1900,6 +1900,9 @@ private:
switch (packet.type)
{
+ case Protocol::Server::PartUUIDs:
+ return true;
+
case Protocol::Server::Data:
if (!cancelled)
onData(packet.block);
diff --git a/programs/client/QueryFuzzer.cpp b/programs/client/QueryFuzzer.cpp
index ae0de450a10..05c20434820 100644
--- a/programs/client/QueryFuzzer.cpp
+++ b/programs/client/QueryFuzzer.cpp
@@ -325,6 +325,51 @@ void QueryFuzzer::fuzzColumnLikeExpressionList(IAST * ast)
// the generic recursion into IAST.children.
}
+void QueryFuzzer::fuzzWindowFrame(WindowFrame & frame)
+{
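+ // Mutate one aspect of the window frame with small probability (5 cases out of 40); otherwise leave it unchanged.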
+ switch (fuzz_rand() % 40)
+ {
+ case 0:
+ {
+ const auto r = fuzz_rand() % 3;
+ frame.type = r == 0 ? WindowFrame::FrameType::Rows
+ : r == 1 ? WindowFrame::FrameType::Range
+ : WindowFrame::FrameType::Groups;
+ break;
+ }
+ case 1:
+ {
+ const auto r = fuzz_rand() % 3;
+ frame.begin_type = r == 0 ? WindowFrame::BoundaryType::Unbounded
+ : r == 1 ? WindowFrame::BoundaryType::Current
+ : WindowFrame::BoundaryType::Offset;
+ break;
+ }
+ case 2:
+ {
+ const auto r = fuzz_rand() % 3;
+ frame.end_type = r == 0 ? WindowFrame::BoundaryType::Unbounded
+ : r == 1 ? WindowFrame::BoundaryType::Current
+ : WindowFrame::BoundaryType::Offset;
+ break;
+ }
+ case 3:
+ {
+ frame.begin_offset = getRandomField(0).get<Int64>();
+ break;
+ }
+ case 4:
+ {
+ frame.end_offset = getRandomField(0).get<Int64>();
+ break;
+ }
+ default:
+ break;
+ }
+
+ frame.is_default = (frame == WindowFrame{});
+}
+
void QueryFuzzer::fuzz(ASTs & asts)
{
for (auto & ast : asts)
@@ -409,6 +454,7 @@ void QueryFuzzer::fuzz(ASTPtr & ast)
auto & def = fn->window_definition->as<ASTWindowDefinition &>();
fuzzColumnLikeExpressionList(def.partition_by.get());
fuzzOrderByList(def.order_by.get());
+ fuzzWindowFrame(def.frame);
}
fuzz(fn->children);
@@ -421,6 +467,23 @@ void QueryFuzzer::fuzz(ASTPtr & ast)
fuzz(select->children);
}
+ /*
+ * The time to fuzz the settings has not yet come.
* Apparently we don't have any infrastructure to validate the values of
+ * the settings, and the first query with max_block_size = -1 breaks
+ * because of overflows here and there.
+ *//*
* else if (auto * set = typeid_cast<ASTSetQuery *>(ast.get()))
+ * {
+ * for (auto & c : set->changes)
+ * {
+ * if (fuzz_rand() % 50 == 0)
+ * {
+ * c.value = fuzzField(c.value);
+ * }
+ * }
+ * }
+ */
else if (auto * literal = typeid_cast<ASTLiteral *>(ast.get()))
{
// There is a caveat with fuzzing the children: many ASTs also keep the
diff --git a/programs/client/QueryFuzzer.h b/programs/client/QueryFuzzer.h
index e9d3f150283..38714205967 100644
--- a/programs/client/QueryFuzzer.h
+++ b/programs/client/QueryFuzzer.h
@@ -14,6 +14,7 @@ namespace DB
class ASTExpressionList;
class ASTOrderByElement;
+struct WindowFrame;
/*
* This is an AST-based query fuzzer that makes random modifications to query
@@ -65,6 +66,7 @@ struct QueryFuzzer
void fuzzOrderByElement(ASTOrderByElement * elem);
void fuzzOrderByList(IAST * ast);
void fuzzColumnLikeExpressionList(IAST * ast);
+ void fuzzWindowFrame(WindowFrame & frame);
void fuzz(ASTs & asts);
void fuzz(ASTPtr & ast);
void collectFuzzInfoMain(const ASTPtr ast);
diff --git a/src/Access/Quota.h b/src/Access/Quota.h
index b636e83ec40..430bdca29b0 100644
--- a/src/Access/Quota.h
+++ b/src/Access/Quota.h
@@ -31,6 +31,8 @@ struct Quota : public IAccessEntity
enum ResourceType
{
QUERIES, /// Number of queries.
+ QUERY_SELECTS, /// Number of select queries.
+ QUERY_INSERTS, /// Number of inserts queries.
ERRORS, /// Number of queries with exceptions.
RESULT_ROWS, /// Number of rows returned as result.
RESULT_BYTES, /// Number of bytes returned as result.
@@ -152,6 +154,16 @@ inline const Quota::ResourceTypeInfo & Quota::ResourceTypeInfo::get(ResourceType
static const auto info = make_info("QUERIES", 1);
return info;
}
+ case Quota::QUERY_SELECTS:
+ {
+ static const auto info = make_info("QUERY_SELECTS", 1);
+ return info;
+ }
+ case Quota::QUERY_INSERTS:
+ {
+ static const auto info = make_info("QUERY_INSERTS", 1);
+ return info;
+ }
case Quota::ERRORS:
{
static const auto info = make_info("ERRORS", 1);
diff --git a/src/AggregateFunctions/AggregateFunctionMannWhitney.h b/src/AggregateFunctions/AggregateFunctionMannWhitney.h
index 403f628a9ff..1451536d519 100644
--- a/src/AggregateFunctions/AggregateFunctionMannWhitney.h
+++ b/src/AggregateFunctions/AggregateFunctionMannWhitney.h
@@ -147,7 +147,7 @@ public:
}
if (params[0].getType() != Field::Types::String)
- throw Exception("Aggregate function " + getName() + " require require first parameter to be a String", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
+ throw Exception("Aggregate function " + getName() + " require first parameter to be a String", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
auto param = params[0].get<String>();
if (param == "two-sided")
@@ -158,13 +158,13 @@ public:
alternative = Alternative::Greater;
else
throw Exception("Unknown parameter in aggregate function " + getName() +
- ". It must be one of: 'two sided', 'less', 'greater'", ErrorCodes::BAD_ARGUMENTS);
+ ". It must be one of: 'two-sided', 'less', 'greater'", ErrorCodes::BAD_ARGUMENTS);
if (params.size() != 2)
return;
if (params[1].getType() != Field::Types::UInt64)
- throw Exception("Aggregate function " + getName() + " require require second parameter to be a UInt64", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
+ throw Exception("Aggregate function " + getName() + " require second parameter to be a UInt64", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
continuity_correction = static_cast<bool>(params[1].get<UInt64>());
}
diff --git a/src/AggregateFunctions/AggregateFunctionWindowFunnel.h b/src/AggregateFunctions/AggregateFunctionWindowFunnel.h
index de8f0f1e2e9..c765024507e 100644
--- a/src/AggregateFunctions/AggregateFunctionWindowFunnel.h
+++ b/src/AggregateFunctions/AggregateFunctionWindowFunnel.h
@@ -149,7 +149,6 @@ private:
UInt8 strict_order; // When the 'strict_order' is set, it doesn't allow interventions of other events.
// In the case of 'A->B->D->C', it stops finding 'A->B->C' at the 'D' and the max event level is 2.
-
// Loop through the entire events_list, update the event timestamp value
// The level path must be 1---2---3---...---check_events_size, find the max event level that satisfied the path in the sliding window.
// If found, returns the max event level, else return 0.
diff --git a/src/AggregateFunctions/QuantileTiming.h b/src/AggregateFunctions/QuantileTiming.h
index 6070f264ad6..dd6d923a5a0 100644
--- a/src/AggregateFunctions/QuantileTiming.h
+++ b/src/AggregateFunctions/QuantileTiming.h
@@ -32,6 +32,8 @@ namespace ErrorCodes
* - a histogram (that is, value -> number), consisting of two parts
* -- for values from 0 to 1023 - in increments of 1;
* -- for values from 1024 to 30,000 - in increments of 16;
+ *
+ * NOTE: 64-bit integer weight can overflow, see also QuantileExactWeighted.h::get()
*/
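The two-part histogram described in this comment boils down to a value-to-slot mapping; a standalone sketch of that mapping (the constant names are illustrative, not the ones used in this file):

#include <cassert>
#include <cstddef>

static constexpr size_t small_threshold = 1024;  /// below this: slot step of 1
static constexpr size_t big_threshold = 30000;   /// up to this: slot step of 16

static size_t valueToSlot(size_t value)
{
    if (value < small_threshold)
        return value;
    if (value <= big_threshold)
        return small_threshold + (value - small_threshold) / 16;
    return small_threshold + (big_threshold - small_threshold) / 16;  /// clamp
}

int main()
{
    assert(valueToSlot(7) == 7);        /// exact below 1024
    assert(valueToSlot(1024) == 1024);  /// first coarse slot
    assert(valueToSlot(1039) == 1024);  /// 16 consecutive values share a slot
    assert(valueToSlot(1040) == 1025);
}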
#define TINY_MAX_ELEMS 31
@@ -396,9 +398,9 @@ namespace detail
/// Get the value of the `level` quantile. The level must be between 0 and 1.
UInt16 get(double level) const
{
- UInt64 pos = std::ceil(count * level);
+ double pos = std::ceil(count * level);
- UInt64 accumulated = 0;
+ double accumulated = 0;
Iterator it(*this);
while (it.isValid())
@@ -422,9 +424,9 @@ namespace detail
const auto * indices_end = indices + size;
const auto * index = indices;
- UInt64 pos = std::ceil(count * levels[*index]);
+ double pos = std::ceil(count * levels[*index]);
- UInt64 accumulated = 0;
+ double accumulated = 0;
Iterator it(*this);
while (it.isValid())
diff --git a/src/Client/Connection.cpp b/src/Client/Connection.cpp
index 65b15a46955..e38a6b240a6 100644
--- a/src/Client/Connection.cpp
+++ b/src/Client/Connection.cpp
@@ -542,6 +542,12 @@ void Connection::sendData(const Block & block, const String & name, bool scalar)
throttler->add(out->count() - prev_bytes);
}
+void Connection::sendIgnoredPartUUIDs(const std::vector<UUID> & uuids)
+{
+ writeVarUInt(Protocol::Client::IgnoredPartUUIDs, *out);
+ writeVectorBinary(uuids, *out);
+ out->next();
+}
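The packet layout written here is a varint tag followed by the vector payload (`writeVectorBinary` itself writes a varint element count, then the elements). A sketch of the varint encoding that `writeVarUInt` performs, reimplemented standalone for illustration:

#include <cassert>
#include <cstdint>
#include <vector>

/// LEB128-style encoding: 7 bits per byte, high bit set on all but the last byte.
static void writeVarUIntSketch(uint64_t x, std::vector<uint8_t> & out)
{
    while (x >= 0x80)
    {
        out.push_back(static_cast<uint8_t>(x) | 0x80);
        x >>= 7;
    }
    out.push_back(static_cast<uint8_t>(x));
}

int main()
{
    std::vector<uint8_t> buffer;
    writeVarUIntSketch(300, buffer);
    assert((buffer == std::vector<uint8_t>{0xAC, 0x02}));
}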
void Connection::sendPreparedData(ReadBuffer & input, size_t size, const String & name)
{
@@ -798,6 +804,10 @@ Packet Connection::receivePacket(std::function async_
case Protocol::Server::EndOfStream:
return res;
+ case Protocol::Server::PartUUIDs:
+ readVectorBinary(res.part_uuids, *in);
+ return res;
+
default:
/// In unknown state, disconnect - to not leave unsynchronised connection.
disconnect();
diff --git a/src/Client/Connection.h b/src/Client/Connection.h
index 83e8f3ba206..2d24b143d7a 100644
--- a/src/Client/Connection.h
+++ b/src/Client/Connection.h
@@ -66,6 +66,7 @@ struct Packet
std::vector<String> multistring_message;
Progress progress;
BlockStreamProfileInfo profile_info;
+ std::vector<UUID> part_uuids;
Packet() : type(Protocol::Server::Hello) {}
};
@@ -157,6 +158,8 @@ public:
void sendScalarsData(Scalars & data);
/// Send all contents of external (temporary) tables.
void sendExternalTablesData(ExternalTablesData & data);
+ /// Send parts' UUIDs to exclude them from query processing
+ void sendIgnoredPartUUIDs(const std::vector<UUID> & uuids);
/// Send prepared block of data (serialized and, if need, compressed), that will be read from 'input'.
/// You could pass size of serialized/compressed block.
diff --git a/src/Client/MultiplexedConnections.cpp b/src/Client/MultiplexedConnections.cpp
index ed7aad0a515..c50dd7b6454 100644
--- a/src/Client/MultiplexedConnections.cpp
+++ b/src/Client/MultiplexedConnections.cpp
@@ -140,6 +140,21 @@ void MultiplexedConnections::sendQuery(
sent_query = true;
}
+void MultiplexedConnections::sendIgnoredPartUUIDs(const std::vector<UUID> & uuids)
+{
+ std::lock_guard lock(cancel_mutex);
+
+ if (sent_query)
+ throw Exception("Cannot send uuids after query is sent.", ErrorCodes::LOGICAL_ERROR);
+
+ for (ReplicaState & state : replica_states)
+ {
+ Connection * connection = state.connection;
+ if (connection != nullptr)
+ connection->sendIgnoredPartUUIDs(uuids);
+ }
+}
+
Packet MultiplexedConnections::receivePacket()
{
std::lock_guard lock(cancel_mutex);
@@ -195,6 +210,7 @@ Packet MultiplexedConnections::drain()
switch (packet.type)
{
+ case Protocol::Server::PartUUIDs:
case Protocol::Server::Data:
case Protocol::Server::Progress:
case Protocol::Server::ProfileInfo:
@@ -253,6 +269,7 @@ Packet MultiplexedConnections::receivePacketUnlocked(std::function async_
diff --git a/src/Client/MultiplexedConnections.h b/src/Client/MultiplexedConnections.h
+ void sendIgnoredPartUUIDs(const std::vector<UUID> & uuids);
+
/** On each replica, read and skip all packets to EndOfStream or Exception.
* Returns EndOfStream if no exception has been received. Otherwise
* returns the last received packet of type Exception.
diff --git a/src/Columns/ColumnAggregateFunction.cpp b/src/Columns/ColumnAggregateFunction.cpp
index d0a5e120a07..9562dc647c9 100644
--- a/src/Columns/ColumnAggregateFunction.cpp
+++ b/src/Columns/ColumnAggregateFunction.cpp
@@ -75,8 +75,28 @@ void ColumnAggregateFunction::set(const AggregateFunctionPtr & func_)
ColumnAggregateFunction::~ColumnAggregateFunction()
{
if (!func->hasTrivialDestructor() && !src)
- for (auto * val : data)
- func->destroy(val);
+ {
+ if (copiedDataInfo.empty())
+ {
+ for (auto * val : data)
+ {
+ func->destroy(val);
+ }
+ }
+ else
+ {
+ size_t pos;
+ for (Map::iterator it = copiedDataInfo.begin(), it_end = copiedDataInfo.end(); it != it_end; ++it)
+ {
+ pos = it->getValue().second;
+ if (data[pos] != nullptr)
+ {
+ func->destroy(data[pos]);
+ data[pos] = nullptr;
+ }
+ }
+ }
+ }
}
void ColumnAggregateFunction::addArena(ConstArenaPtr arena_)
@@ -455,14 +475,37 @@ void ColumnAggregateFunction::insertFrom(const IColumn & from, size_t n)
/// (only as a whole, see comment above).
ensureOwnership();
insertDefault();
- insertMergeFrom(from, n);
+ insertCopyFrom(assert_cast<const ColumnAggregateFunction &>(from).data[n]);
}
void ColumnAggregateFunction::insertFrom(ConstAggregateDataPtr place)
{
ensureOwnership();
insertDefault();
- insertMergeFrom(place);
+ insertCopyFrom(place);
+}
+
+void ColumnAggregateFunction::insertCopyFrom(ConstAggregateDataPtr place)
+{
+ Map::LookupResult result;
+ result = copiedDataInfo.find(place);
+ if (result == nullptr)
+ {
+ copiedDataInfo[place] = data.size() - 1;
+ func->merge(data.back(), place, &createOrGetArena());
+ }
+ else
+ {
+ size_t pos = result->getValue().second;
+ if (pos != data.size() - 1)
+ {
+ data[data.size() - 1] = data[pos];
+ }
+ else /// inserting the same data into the same pos: merge them.
+ {
+ func->merge(data.back(), place, &createOrGetArena());
+ }
+ }
}
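The point of `insertCopyFrom` replacing `insertMergeFrom` here is that repeated inserts from the same source state can alias one already-copied state instead of merging again, and the destructor above then destroys each unique state exactly once. A toy model of that bookkeeping, with `std::unordered_map` and opaque pointers standing in for the real `HashMap` and aggregation states:

#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct ColumnSketch
{
    std::vector<const void *> data;                   /// per-row state pointers
    std::unordered_map<const void *, size_t> copied;  /// source -> first row using it

    void insertCopyFrom(const void * place)
    {
        data.push_back(nullptr);                      /// insertDefault()
        auto [it, inserted] = copied.try_emplace(place, data.size() - 1);
        if (inserted)
            data.back() = place;                      /// first time: copy (merge) once
        else
            data.back() = data[it->second];           /// reuse the earlier row's state
    }
};

int main()
{
    int state = 42;
    ColumnSketch column;
    column.insertCopyFrom(&state);
    column.insertCopyFrom(&state);             /// no second merge happens
    assert(column.data[0] == column.data[1]);  /// both rows alias one state
}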
void ColumnAggregateFunction::insertMergeFrom(ConstAggregateDataPtr place)
@@ -697,5 +740,4 @@ MutableColumnPtr ColumnAggregateFunction::cloneResized(size_t size) const
return cloned_col;
}
}
-
}
diff --git a/src/Columns/ColumnAggregateFunction.h b/src/Columns/ColumnAggregateFunction.h
index cd45cf583a0..a1aa9e29a39 100644
--- a/src/Columns/ColumnAggregateFunction.h
+++ b/src/Columns/ColumnAggregateFunction.h
@@ -13,6 +13,8 @@
#include
+#include
+
namespace DB
{
@@ -82,6 +84,17 @@ private:
/// Name of the type to distinguish different aggregation states.
String type_string;
+ /// Records of copied source states, used to avoid copying the same data twice.
+ /// key: src pointer, val: pos in current column.
+ using Map = HashMap<
+ ConstAggregateDataPtr,
+ size_t,
+ DefaultHash<ConstAggregateDataPtr>,
+ HashTableGrower<3>,
+ HashTableAllocatorWithStackMemory<sizeof(std::pair<ConstAggregateDataPtr, size_t>) * (1 << 3)>>;
+
+ Map copiedDataInfo;
+
ColumnAggregateFunction() {}
/// Create a new column that has another column as a source.
@@ -140,6 +153,8 @@ public:
void insertFrom(ConstAggregateDataPtr place);
+ void insertCopyFrom(ConstAggregateDataPtr place);
+
/// Merge state at last row with specified state in another column.
void insertMergeFrom(ConstAggregateDataPtr place);
diff --git a/src/Columns/ColumnsNumber.h b/src/Columns/ColumnsNumber.h
index 96ce2bd6d6f..17a28e617c3 100644
--- a/src/Columns/ColumnsNumber.h
+++ b/src/Columns/ColumnsNumber.h
@@ -26,4 +26,6 @@ using ColumnInt256 = ColumnVector<Int256>;
using ColumnFloat32 = ColumnVector<Float32>;
using ColumnFloat64 = ColumnVector<Float64>;
+using ColumnUUID = ColumnVector<UUID>;
+
}
diff --git a/src/Common/CurrentThread.h b/src/Common/CurrentThread.h
index 876cbd8a66b..7ab57ea7fab 100644
--- a/src/Common/CurrentThread.h
+++ b/src/Common/CurrentThread.h
@@ -63,9 +63,6 @@ public:
/// Call from master thread as soon as possible (e.g. when thread accepted connection)
static void initializeQuery();
- /// Sets query_context for current thread group
- static void attachQueryContext(Context & query_context);
-
/// You must call one of these methods when create a query child thread:
/// Add current thread to a group associated with the thread group
static void attachTo(const ThreadGroupStatusPtr & thread_group);
@@ -99,6 +96,10 @@ public:
private:
static void defaultThreadDeleter();
+
+ /// Sets query_context for current thread group
+ /// Can be used only through QueryScope
+ static void attachQueryContext(Context & query_context);
};
}
diff --git a/src/Common/ErrorCodes.cpp b/src/Common/ErrorCodes.cpp
index a2cd65137c0..cf758691cec 100644
--- a/src/Common/ErrorCodes.cpp
+++ b/src/Common/ErrorCodes.cpp
@@ -533,11 +533,13 @@
M(564, INTERSERVER_SCHEME_DOESNT_MATCH) \
M(565, TOO_MANY_PARTITIONS) \
M(566, CANNOT_RMDIR) \
+ M(567, DUPLICATED_PART_UUIDS) \
\
M(999, KEEPER_EXCEPTION) \
M(1000, POCO_EXCEPTION) \
M(1001, STD_EXCEPTION) \
- M(1002, UNKNOWN_EXCEPTION)
+ M(1002, UNKNOWN_EXCEPTION) \
+ M(1003, INVALID_SHARD_ID)
/* See END */
diff --git a/src/Common/HashTable/HashMap.h b/src/Common/HashTable/HashMap.h
index e09f60c4294..99dc5414107 100644
--- a/src/Common/HashTable/HashMap.h
+++ b/src/Common/HashTable/HashMap.h
@@ -109,6 +109,11 @@ struct HashMapCell
DB::assertChar(',', rb);
DB::readDoubleQuoted(value.second, rb);
}
+
+ static bool constexpr need_to_notify_cell_during_move = false;
+
+ static void move(HashMapCell * /* old_location */, HashMapCell * /* new_location */) {}
+
};
template
diff --git a/src/Common/HashTable/HashTable.h b/src/Common/HashTable/HashTable.h
index 15fa09490e6..9b6bb0a1be4 100644
--- a/src/Common/HashTable/HashTable.h
+++ b/src/Common/HashTable/HashTable.h
@@ -69,11 +69,16 @@ namespace ZeroTraits
{
template <typename T>
-bool check(const T x) { return x == 0; }
+inline bool check(const T x) { return x == 0; }
template <typename T>
-void set(T & x) { x = 0; }
+inline void set(T & x) { x = 0; }
+template <>
+inline bool check(const char * x) { return x == nullptr; }
+
+template <>
+inline void set(const char *& x){ x = nullptr; }
}
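These specializations define the "empty cell" marker that the hash table uses for `const char *` keys: `check` tests for it, `set` writes it, and `nullptr` plays the role that `0` plays for integers. A usage sketch of that contract (assuming the header is on the include path; not a test from this patch):

#include <cassert>
#include <Common/HashTable/HashTable.h>

int main()
{
    const char * key = "abc";
    assert(!ZeroTraits::check(key));  /// occupied: not the zero key
    ZeroTraits::set(key);             /// mark the cell as empty
    assert(key == nullptr);           /// nullptr is the zero marker
}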
@@ -204,6 +209,13 @@ struct HashTableCell
/// Deserialization, in binary and text form.
void read(DB::ReadBuffer & rb) { DB::readBinary(key, rb); }
void readText(DB::ReadBuffer & rb) { DB::readDoubleQuoted(key, rb); }
+
+ /// Called when the cell pointer is moved during erase, reinsert or resize operations
+
+ static constexpr bool need_to_notify_cell_during_move = false;
+
+ static void move(HashTableCell * /* old_location */, HashTableCell * /* new_location */) {}
+
};
/**
@@ -334,6 +346,32 @@ struct ZeroValueStorage
};
+template <bool enable, typename Allocator, typename Cell>
+struct AllocatorBufferDeleter;
+
+template <typename Allocator, typename Cell>
+struct AllocatorBufferDeleter<false, Allocator, Cell>
+{
+ AllocatorBufferDeleter(Allocator &, size_t) {}
+
+ void operator()(Cell *) const {}
+
+};
+
+template <typename Allocator, typename Cell>
+struct AllocatorBufferDeleter<true, Allocator, Cell>
+{
+ AllocatorBufferDeleter(Allocator & allocator_, size_t size_)
+ : allocator(allocator_)
+ , size(size_) {}
+
+ void operator()(Cell * buffer) const { allocator.free(buffer, size); }
+
+ Allocator & allocator;
+ size_t size;
+};
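This pair of specializations selects a `unique_ptr` deleter at compile time: a no-op when `realloc` will consume the old buffer anyway, a real free when the buffer must be kept alive for move notifications. The same pattern in miniature, with illustrative names and plain `malloc`/`free`:

#include <cstdlib>
#include <memory>

template <bool owns>
struct MaybeFreeDeleter
{
    void operator()(int * ptr) const
    {
        if constexpr (owns)
            std::free(ptr);  /// owning case: release the kept-alive buffer
        /// non-owning case: ownership was handed elsewhere, do nothing
    }
};

int main()
{
    int * buffer = static_cast<int *>(std::malloc(16 * sizeof(int)));
    std::unique_ptr<int, MaybeFreeDeleter<true>> guard(buffer);
    /// 'guard' frees the buffer at scope exit; MaybeFreeDeleter<false> would not.
}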
+
+
// The HashTable
template
<
@@ -427,7 +465,6 @@ protected:
}
}
-
/// Increase the size of the buffer.
void resize(size_t for_num_elems = 0, size_t for_buf_size = 0)
{
@@ -460,7 +497,24 @@ protected:
new_grower.increaseSize();
/// Expand the space.
- buf = reinterpret_cast<Cell *>(Allocator::realloc(buf, getBufferSizeInBytes(), new_grower.bufSize() * sizeof(Cell)));
+
+ size_t old_buffer_size = getBufferSizeInBytes();
+
+ /** If the cell requires notification during move, we need to temporarily keep the old buffer,
+ * because realloc does not guarantee that the reallocated buffer will have the same base address
+ */
+ using Deleter = AllocatorBufferDeleter<Cell::need_to_notify_cell_during_move, Allocator, Cell>;
+ Deleter buffer_deleter(*this, old_buffer_size);
+ std::unique_ptr<Cell, Deleter> old_buffer(buf, buffer_deleter);
+
+ if constexpr (Cell::need_to_notify_cell_during_move)
+ {
+ buf = reinterpret_cast<Cell *>(Allocator::alloc(new_grower.bufSize() * sizeof(Cell)));
+ memcpy(reinterpret_cast<void *>(buf), reinterpret_cast<void *>(old_buffer.get()), old_buffer_size);
+ }
+ else
+ buf = reinterpret_cast<Cell *>(Allocator::realloc(buf, old_buffer_size, new_grower.bufSize() * sizeof(Cell)));
+
grower = new_grower;
/** Now some items may need to be moved to a new location.
@@ -470,7 +524,12 @@ protected:
size_t i = 0;
for (; i < old_size; ++i)
if (!buf[i].isZero(*this))
- reinsert(buf[i], buf[i].getHash(*this));
+ {
+ size_t updated_place_value = reinsert(buf[i], buf[i].getHash(*this));
+
+ if constexpr (Cell::need_to_notify_cell_during_move)
+ Cell::move(&(old_buffer.get())[i], &buf[updated_place_value]);
+ }
/** There is also a special case:
* if the element was to be at the end of the old buffer, [ x]
@@ -481,7 +540,13 @@ protected:
* process tail from the collision resolution chain immediately after it [ o x ]
*/
for (; !buf[i].isZero(*this); ++i)
- reinsert(buf[i], buf[i].getHash(*this));
+ {
+ size_t updated_place_value = reinsert(buf[i], buf[i].getHash(*this));
+
+ if constexpr (Cell::need_to_notify_cell_during_move)
+ if (&buf[i] != &buf[updated_place_value])
+ Cell::move(&buf[i], &buf[updated_place_value]);
+ }
#ifdef DBMS_HASH_MAP_DEBUG_RESIZES
watch.stop();
@@ -495,20 +560,20 @@ protected:
/** Paste into the new buffer the value that was in the old buffer.
* Used when increasing the buffer size.
*/
- void reinsert(Cell & x, size_t hash_value)
+ size_t reinsert(Cell & x, size_t hash_value)
{
size_t place_value = grower.place(hash_value);
/// If the element is in its place.
if (&x == &buf[place_value])
- return;
+ return place_value;
/// Compute a new location, taking into account the collision resolution chain.
place_value = findCell(Cell::getKey(x.getValue()), hash_value, place_value);
/// If the item remains in its place in the old collision resolution chain.
if (!buf[place_value].isZero(*this))
- return;
+ return place_value;
/// Copy to a new location and zero the old one.
x.setHash(hash_value);
@@ -516,6 +581,7 @@ protected:
x.setZero();
/// Then the elements that previously were in collision with this can move to the old place.
+ return place_value;
}
@@ -881,7 +947,11 @@ public:
/// Reinsert node pointed to by iterator
void ALWAYS_INLINE reinsert(iterator & it, size_t hash_value)
{
- reinsert(*it.getPtr(), hash_value);
+ size_t place_value = reinsert(*it.getPtr(), hash_value);
+
+ if constexpr (Cell::need_to_notify_cell_during_move)
+ if (it.getPtr() != &buf[place_value])
+ Cell::move(it.getPtr(), &buf[place_value]);
}
@@ -958,8 +1028,14 @@ public:
return const_cast<std::decay_t<decltype(*this)> *>(this)->find(x, hash_value);
}
- std::enable_if_t<Grower::performs_linear_probing_with_single_step, void>
+ std::enable_if_t<Grower::performs_linear_probing_with_single_step, bool>
ALWAYS_INLINE erase(const Key & x)
+ {
+ return erase(x, hash(x));
+ }
+
+ std::enable_if_t<Grower::performs_linear_probing_with_single_step, bool>
+ ALWAYS_INLINE erase(const Key & x, size_t hash_value)
{
/** Deletion from open addressing hash table without tombstones
*
@@ -977,21 +1053,19 @@ public:
{
--m_size;
this->clearHasZero();
+ return true;
}
else
{
- return;
+ return false;
}
}
- size_t hash_value = hash(x);
size_t erased_key_position = findCell(x, hash_value, grower.place(hash_value));
/// Key is not found
if (buf[erased_key_position].isZero(*this))
- {
- return;
- }
+ return false;
/// We need to guarantee loop termination because there will be empty position
assert(m_size < grower.bufSize());
@@ -1056,12 +1130,18 @@ public:
/// Move the element to the freed place
memcpy(static_cast(&buf[erased_key_position]), static_cast(&buf[next_position]), sizeof(Cell));
+
+ if constexpr (Cell::need_to_notify_cell_during_move)
+ Cell::move(&buf[next_position], &buf[erased_key_position]);
+
/// Now we have another freed place
erased_key_position = next_position;
}
buf[erased_key_position].setZero();
--m_size;
+
+ return true;
}
bool ALWAYS_INLINE has(const Key & x) const
diff --git a/src/Common/HashTable/LRUHashMap.h b/src/Common/HashTable/LRUHashMap.h
new file mode 100644
index 00000000000..292006f2438
--- /dev/null
+++ b/src/Common/HashTable/LRUHashMap.h
@@ -0,0 +1,244 @@
+#pragma once
+
+#include
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+
+template <typename TKey, typename TValue, typename Hash, bool save_hash_in_cell>
+struct LRUHashMapCell :
+ public std::conditional_t<save_hash_in_cell, HashMapCellWithSavedHash<TKey, TValue, Hash>,
+ HashMapCell<TKey, TValue, Hash>>
+{
+public:
+ using Key = TKey;
+
+ using Base = std::conditional_t<save_hash_in_cell, HashMapCellWithSavedHash<TKey, TValue, Hash>,
+ HashMapCell<TKey, TValue, Hash>>;
+
+ using Mapped = typename Base::Mapped;
+ using State = typename Base::State;
+
+ using mapped_type = Mapped;
+ using key_type = Key;
+
+ using Base::Base;
+
+ static bool constexpr need_to_notify_cell_during_move = true;
+
+ static void move(LRUHashMapCell * __restrict old_location, LRUHashMapCell * __restrict new_location)
+ {
+ /** We update the new location's prev and next pointers, because during a hash table resize
+ * they can already have been updated by the move of another cell.
+ */
+
+ new_location->prev = old_location->prev;
+ new_location->next = old_location->next;
+
+ LRUHashMapCell * prev = new_location->prev;
+ LRUHashMapCell * next = new_location->next;
+
+ /// Update the previous node's next pointer and the next node's prev pointer to point to the new location
+
+ if (prev)
+ prev->next = new_location;
+
+ if (next)
+ next->prev = new_location;
+ }
+
+private:
+ template <typename, typename, typename, bool>
+ friend class LRUHashMapCellNodeTraits;
+
+ LRUHashMapCell * next = nullptr;
+ LRUHashMapCell * prev = nullptr;
+};
+
+template <typename Key, typename Value, typename Hash, bool save_hash_in_cell>
+struct LRUHashMapCellNodeTraits
+{
+ using node = LRUHashMapCell<Key, Value, Hash, save_hash_in_cell>;
+ using node_ptr = LRUHashMapCell<Key, Value, Hash, save_hash_in_cell> *;
+ using const_node_ptr = const LRUHashMapCell<Key, Value, Hash, save_hash_in_cell> *;
+
+ static node * get_next(const node * ptr) { return ptr->next; }
+ static void set_next(node * __restrict ptr, node * __restrict next) { ptr->next = next; }
+ static node * get_previous(const node * ptr) { return ptr->prev; }
+ static void set_previous(node * __restrict ptr, node * __restrict prev) { ptr->prev = prev; }
+};
+
+template <typename TKey, typename TValue, typename Hash, bool save_hash_in_cells>
+class LRUHashMapImpl :
+ private HashMapTable<
+ TKey,
+ LRUHashMapCell<TKey, TValue, Hash, save_hash_in_cells>,
+ Hash,
+ HashTableGrower<>,
+ HashTableAllocator>
+{
+ using Base = HashMapTable<
+ TKey,
+ LRUHashMapCell<TKey, TValue, Hash, save_hash_in_cells>,
+ Hash,
+ HashTableGrower<>,
+ HashTableAllocator>;
+public:
+ using Key = TKey;
+ using Value = TValue;
+
+ using Cell = LRUHashMapCell<TKey, TValue, Hash, save_hash_in_cells>;
+
+ using LRUHashMapCellIntrusiveValueTraits =
+ boost::intrusive::trivial_value_traits<
+ LRUHashMapCellNodeTraits<TKey, TValue, Hash, save_hash_in_cells>,
+ boost::intrusive::link_mode_type::normal_link>;
+
+ using LRUList = boost::intrusive::list<
+ Cell,
+ boost::intrusive::value_traits<LRUHashMapCellIntrusiveValueTraits>,
+ boost::intrusive::constant_time_size<false>>;
+
+ using iterator = typename LRUList::iterator;
+ using const_iterator = typename LRUList::const_iterator;
+ using reverse_iterator = typename LRUList::reverse_iterator;
+ using const_reverse_iterator = typename LRUList::const_reverse_iterator;
+
+ LRUHashMapImpl(size_t max_size_, bool preallocate_max_size_in_hash_map = false)
+ : Base(preallocate_max_size_in_hash_map ? max_size_ : 32)
+ , max_size(max_size_)
+ {
+ assert(max_size > 0);
+ }
+
+ std::pair<Cell *, bool> insert(const Key & key, const Value & value)
+ {
+ return emplace(key, value);
+ }
+
+ std::pair<Cell *, bool> insert(const Key & key, Value && value)
+ {
+ return emplace(key, std::move(value));
+ }
+
+ template <typename ... Args>
+ std::pair<Cell *, bool> emplace(const Key & key, Args &&... args)
+ {
+ size_t hash_value = Base::hash(key);
+
+ Cell * it = Base::find(key, hash_value);
+
+ if (it)
+ {
+ /// Cell contains the element: return it and move it to the end of the LRU list
+ lru_list.splice(lru_list.end(), lru_list, lru_list.iterator_to(*it));
+ return std::make_pair(it, false);
+ }
+
+ if (size() == max_size)
+ {
+ /// Erase least recently used element from front of the list
+ Cell & node = lru_list.front();
+
+ const Key & element_to_remove_key = node.getKey();
+ size_t key_hash = node.getHash(*this);
+
+ lru_list.pop_front();
+
+ [[maybe_unused]] bool erased = Base::erase(element_to_remove_key, key_hash);
+ assert(erased);
+ }
+
+ [[maybe_unused]] bool inserted;
+
+ /// Insert the value: first try the zero-key storage; if the key is non-zero, insert into the buffer
+ if (!Base::emplaceIfZero(key, it, inserted, hash_value))
+ Base::emplaceNonZero(key, it, inserted, hash_value);
+
+ assert(inserted);
+
+ new (&it->getMapped()) Value(std::forward<Args>(args)...);
+
+ /// Put cell to the end of lru list
+ lru_list.insert(lru_list.end(), *it);
+
+ return std::make_pair(it, true);
+ }
+
+ using Base::find;
+
+ Value & get(const Key & key)
+ {
+ auto it = Base::find(key);
+ assert(it);
+
+ Value & value = it->getMapped();
+
+ /// Put cell to the end of lru list
+ lru_list.splice(lru_list.end(), lru_list, lru_list.iterator_to(*it));
+
+ return value;
+ }
+
+ const Value & get(const Key & key) const
+ {
+ return const_cast<LRUHashMapImpl *>(this)->get(key);
+ }
+
+ bool contains(const Key & key) const
+ {
+ return Base::has(key);
+ }
+
+ bool erase(const Key & key)
+ {
+ auto hash = Base::hash(key);
+ auto it = Base::find(key, hash);
+
+ if (!it)
+ return false;
+
+ lru_list.erase(lru_list.iterator_to(*it));
+
+ return Base::erase(key, hash);
+ }
+
+ void clear()
+ {
+ lru_list.clear();
+ Base::clear();
+ }
+
+ using Base::size;
+
+ size_t getMaxSize() const { return max_size; }
+
+ iterator begin() { return lru_list.begin(); }
+ const_iterator begin() const { return lru_list.cbegin(); }
+ iterator end() { return lru_list.end(); }
+ const_iterator end() const { return lru_list.cend(); }
+
+ reverse_iterator rbegin() { return lru_list.rbegin(); }
+ const_reverse_iterator rbegin() const { return lru_list.crbegin(); }
+ reverse_iterator rend() { return lru_list.rend(); }
+ const_reverse_iterator rend() const { return lru_list.crend(); }
+
+private:
+ size_t max_size;
+ LRUList lru_list;
+};
+
+template <typename Key, typename Value, typename Hash = DefaultHash<Key>>
+using LRUHashMap = LRUHashMapImpl<Key, Value, Hash, false>;
+
+template <typename Key, typename Value, typename Hash = DefaultHash<Key>>
+using LRUHashMapWithSavedHash = LRUHashMapImpl<Key, Value, Hash, true>;
diff --git a/src/Common/StackTrace.h b/src/Common/StackTrace.h
index 3ae4b964838..b2e14a01f03 100644
--- a/src/Common/StackTrace.h
+++ b/src/Common/StackTrace.h
@@ -34,7 +34,15 @@ public:
std::optional file;
std::optional line;
};
- static constexpr size_t capacity = 32;
+
+ static constexpr size_t capacity =
+#ifndef NDEBUG
+ /* The stacks are normally larger in debug version due to less inlining. */
+ 64
+#else
+ 32
+#endif
+ ;
using FramePointers = std::array;
using Frames = std::array;
diff --git a/src/Common/ThreadProfileEvents.cpp b/src/Common/ThreadProfileEvents.cpp
index e6336baecda..327178c92ff 100644
--- a/src/Common/ThreadProfileEvents.cpp
+++ b/src/Common/ThreadProfileEvents.cpp
@@ -68,7 +68,7 @@ TasksStatsCounters::TasksStatsCounters(const UInt64 tid, const MetricsProvider p
case MetricsProvider::Netlink:
stats_getter = [metrics_provider = std::make_shared(), tid]()
{
- ::taskstats result;
+ ::taskstats result{};
metrics_provider->getStat(result, tid);
return result;
};
@@ -76,7 +76,7 @@ TasksStatsCounters::TasksStatsCounters(const UInt64 tid, const MetricsProvider p
case MetricsProvider::Procfs:
stats_getter = [metrics_provider = std::make_shared(tid)]()
{
- ::taskstats result;
+ ::taskstats result{};
metrics_provider->getTaskStats(result);
return result;
};
diff --git a/src/Common/ThreadStatus.cpp b/src/Common/ThreadStatus.cpp
index 5105fff03b2..8c01ed2d46f 100644
--- a/src/Common/ThreadStatus.cpp
+++ b/src/Common/ThreadStatus.cpp
@@ -99,6 +99,11 @@ ThreadStatus::~ThreadStatus()
/// We've already allocated a little bit more than the limit and cannot track it in the thread memory tracker or its parent.
}
+#if !defined(ARCADIA_BUILD)
+ /// It may cause a segfault if query_context was destroyed but not detached
+ assert((!query_context && query_id.empty()) || (query_context && query_id == query_context->getCurrentQueryId()));
+#endif
+
if (deleter)
deleter();
current_thread = nullptr;
diff --git a/src/Common/ThreadStatus.h b/src/Common/ThreadStatus.h
index 1be1f2cd4df..dc5f09c5f3d 100644
--- a/src/Common/ThreadStatus.h
+++ b/src/Common/ThreadStatus.h
@@ -201,7 +201,7 @@ public:
void setFatalErrorCallback(std::function callback);
void onFatalError();
- /// Sets query context for current thread and its thread group
+ /// Sets query context for current master thread and its thread group
/// NOTE: query_context have to be alive until detachQuery() is called
void attachQueryContext(Context & query_context);
diff --git a/src/Common/tests/CMakeLists.txt b/src/Common/tests/CMakeLists.txt
index cb36e2b97d2..2dd56e862f0 100644
--- a/src/Common/tests/CMakeLists.txt
+++ b/src/Common/tests/CMakeLists.txt
@@ -38,6 +38,9 @@ target_link_libraries (arena_with_free_lists PRIVATE dbms)
add_executable (pod_array pod_array.cpp)
target_link_libraries (pod_array PRIVATE clickhouse_common_io)
+add_executable (lru_hash_map_perf lru_hash_map_perf.cpp)
+target_link_libraries (lru_hash_map_perf PRIVATE clickhouse_common_io)
+
add_executable (thread_creation_latency thread_creation_latency.cpp)
target_link_libraries (thread_creation_latency PRIVATE clickhouse_common_io)
diff --git a/src/Common/tests/gtest_lru_hash_map.cpp b/src/Common/tests/gtest_lru_hash_map.cpp
new file mode 100644
index 00000000000..562ee667b7b
--- /dev/null
+++ b/src/Common/tests/gtest_lru_hash_map.cpp
@@ -0,0 +1,161 @@
+#include
+#include
+
+#include
+
+#include
+
+template <typename LRUHashMap>
+std::vector<typename LRUHashMap::Key> convertToVector(const LRUHashMap & map)
+{
+ std::vector<typename LRUHashMap::Key> result;
+ result.reserve(map.size());
+
+ for (auto & node: map)
+ result.emplace_back(node.getKey());
+
+ return result;
+}
+
+void testInsert(size_t elements_to_insert_size, size_t map_size)
+{
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(map_size);
+
+ std::vector<int> expected;
+
+ for (size_t i = 0; i < elements_to_insert_size; ++i)
+ map.insert(i, i);
+
+ for (size_t i = elements_to_insert_size - map_size; i < elements_to_insert_size; ++i)
+ expected.emplace_back(i);
+
+ std::vector<int> actual = convertToVector(map);
+ ASSERT_EQ(map.size(), actual.size());
+ ASSERT_EQ(actual, expected);
+}
+
+TEST(LRUHashMap, Insert)
+{
+ {
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(3);
+
+ map.emplace(1, 1);
+ map.insert(2, 2);
+ int v = 3;
+ map.insert(3, v);
+ map.emplace(4, 4);
+
+ std::vector<int> expected = { 2, 3, 4 };
+ std::vector<int> actual = convertToVector(map);
+
+ ASSERT_EQ(actual, expected);
+ }
+
+ testInsert(1200000, 1200000);
+ testInsert(10, 5);
+ testInsert(1200000, 2);
+ testInsert(1200000, 1);
+}
+
+TEST(LRUHashMap, GetModify)
+{
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(3);
+
+ map.emplace(1, 1);
+ map.emplace(2, 2);
+ map.emplace(3, 3);
+
+ map.get(3) = 4;
+
+ std::vector<int> expected = { 1, 2, 4 };
+ std::vector<int> actual;
+ actual.reserve(map.size());
+
+ for (auto & node : map)
+ actual.emplace_back(node.getMapped());
+
+ ASSERT_EQ(actual, expected);
+}
+
+TEST(LRUHashMap, SetRecentKeyToTop)
+{
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(3);
+
+ map.emplace(1, 1);
+ map.emplace(2, 2);
+ map.emplace(3, 3);
+ map.emplace(1, 4);
+
+ std::vector<int> expected = { 2, 3, 1 };
+ std::vector<int> actual = convertToVector(map);
+
+ ASSERT_EQ(actual, expected);
+}
+
+TEST(LRUHashMap, GetRecentKeyToTop)
+{
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(3);
+
+ map.emplace(1, 1);
+ map.emplace(2, 2);
+ map.emplace(3, 3);
+ map.get(1);
+
+ std::vector<int> expected = { 2, 3, 1 };
+ std::vector<int> actual = convertToVector(map);
+
+ ASSERT_EQ(actual, expected);
+}
+
+TEST(LRUHashMap, Contains)
+{
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(3);
+
+ map.emplace(1, 1);
+ map.emplace(2, 2);
+ map.emplace(3, 3);
+
+ ASSERT_TRUE(map.contains(1));
+ ASSERT_TRUE(map.contains(2));
+ ASSERT_TRUE(map.contains(3));
+ ASSERT_EQ(map.size(), 3);
+
+ map.erase(1);
+ map.erase(2);
+ map.erase(3);
+
+ ASSERT_EQ(map.size(), 0);
+ ASSERT_FALSE(map.contains(1));
+ ASSERT_FALSE(map.contains(2));
+ ASSERT_FALSE(map.contains(3));
+}
+
+TEST(LRUHashMap, Clear)
+{
+ using LRUHashMap = LRUHashMap<int, int>;
+
+ LRUHashMap map(3);
+
+ map.emplace(1, 1);
+ map.emplace(2, 2);
+ map.emplace(3, 3);
+ map.clear();
+
+ std::vector<int> expected = {};
+ std::vector<int> actual = convertToVector(map);
+
+ ASSERT_EQ(actual, expected);
+ ASSERT_EQ(map.size(), 0);
+}
diff --git a/src/Common/tests/lru_hash_map_perf.cpp b/src/Common/tests/lru_hash_map_perf.cpp
new file mode 100644
index 00000000000..14beff3f7da
--- /dev/null
+++ b/src/Common/tests/lru_hash_map_perf.cpp
@@ -0,0 +1,244 @@
+#include
+#include
+#include