Merge branch 'master' into set-join-storage-tables-tsan

Alexey Milovidov 2020-12-29 20:47:22 +03:00
commit 5034d934b1
387 changed files with 10513 additions and 3038 deletions

.gitmodules vendored

@ -212,3 +212,6 @@
[submodule "contrib/boringssl"]
path = contrib/boringssl
url = https://github.com/ClickHouse-Extras/boringssl.git
[submodule "contrib/NuRaft"]
path = contrib/NuRaft
url = https://github.com/eBay/NuRaft.git


@ -1,5 +1,19 @@
### ClickHouse release 20.12
### ClickHouse release v20.12.4.5-stable, 2020-12-24
#### Bug Fix
* Fixed issue when the `clickhouse-odbc-bridge` process is unreachable by the server on machines with a dual IPv4/IPv6 stack, and fixed issue when ODBC dictionary updates are performed using malformed queries and/or cause crashes. Possibly closes [#14489](https://github.com/ClickHouse/ClickHouse/issues/14489). [#18278](https://github.com/ClickHouse/ClickHouse/pull/18278) ([Denis Glazachev](https://github.com/traceon)).
* Fixed key comparison between Enum and Int types. This fixes [#17989](https://github.com/ClickHouse/ClickHouse/issues/17989). [#18214](https://github.com/ClickHouse/ClickHouse/pull/18214) ([Amos Bird](https://github.com/amosbird)).
* Fixed a crash during unique key conversion in the `MaterializeMySQL` database engine. This fixes [#18186](https://github.com/ClickHouse/ClickHouse/issues/18186) and fixes [#16372](https://github.com/ClickHouse/ClickHouse/issues/16372). [#18211](https://github.com/ClickHouse/ClickHouse/pull/18211) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `std::out_of_range: basic_string` in S3 URL parsing. [#18059](https://github.com/ClickHouse/ClickHouse/pull/18059) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Fixed the issue when some tables were not synchronized to ClickHouse from MySQL because conversion of MySQL prefix indexes wasn't supported by MaterializeMySQL. This fixes [#15187](https://github.com/ClickHouse/ClickHouse/issues/15187) and fixes [#17912](https://github.com/ClickHouse/ClickHouse/issues/17912). [#17944](https://github.com/ClickHouse/ClickHouse/pull/17944) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed the issue when query optimization was producing a wrong result if the query contains `ARRAY JOIN`. [#17887](https://github.com/ClickHouse/ClickHouse/pull/17887) ([sundyli](https://github.com/sundy-li)).
* Fixed possible segfault in `topK` aggregate function. This closes [#17404](https://github.com/ClickHouse/ClickHouse/issues/17404). [#17845](https://github.com/ClickHouse/ClickHouse/pull/17845) ([Maksim Kita](https://github.com/kitaisreal)).
* Fixed empty `system.stack_trace` table when server is running in daemon mode. [#17630](https://github.com/ClickHouse/ClickHouse/pull/17630) ([Amos Bird](https://github.com/amosbird)).
### ClickHouse release v20.12.3.3-stable, 2020-12-13
#### Backward Incompatible Change
@ -123,6 +137,52 @@
## ClickHouse release 20.11
### ClickHouse release v20.11.6.6-stable, 2020-12-24
#### Bug Fix
* Fixed issue when the `clickhouse-odbc-bridge` process is unreachable by the server on machines with a dual `IPv4/IPv6` stack, and fixed issue when ODBC dictionary updates are performed using malformed queries and/or cause crashes. This possibly closes [#14489](https://github.com/ClickHouse/ClickHouse/issues/14489). [#18278](https://github.com/ClickHouse/ClickHouse/pull/18278) ([Denis Glazachev](https://github.com/traceon)).
* Fixed key comparison between Enum and Int types. This fixes [#17989](https://github.com/ClickHouse/ClickHouse/issues/17989). [#18214](https://github.com/ClickHouse/ClickHouse/pull/18214) ([Amos Bird](https://github.com/amosbird)).
* Fixed a crash during unique key conversion in the `MaterializeMySQL` database engine. This fixes [#18186](https://github.com/ClickHouse/ClickHouse/issues/18186) and fixes [#16372](https://github.com/ClickHouse/ClickHouse/issues/16372). [#18211](https://github.com/ClickHouse/ClickHouse/pull/18211) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `std::out_of_range: basic_string` in S3 URL parsing. [#18059](https://github.com/ClickHouse/ClickHouse/pull/18059) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Fixed the issue when some tables were not synchronized to ClickHouse from MySQL because conversion of MySQL prefix indexes wasn't supported by MaterializeMySQL. This fixes [#15187](https://github.com/ClickHouse/ClickHouse/issues/15187) and fixes [#17912](https://github.com/ClickHouse/ClickHouse/issues/17912). [#17944](https://github.com/ClickHouse/ClickHouse/pull/17944) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed the issue when query optimization was producing a wrong result if the query contains `ARRAY JOIN`. [#17887](https://github.com/ClickHouse/ClickHouse/pull/17887) ([sundyli](https://github.com/sundy-li)).
* Fixed possible segfault in `topK` aggregate function. This closes [#17404](https://github.com/ClickHouse/ClickHouse/issues/17404). [#17845](https://github.com/ClickHouse/ClickHouse/pull/17845) ([Maksim Kita](https://github.com/kitaisreal)).
* Do not restore parts from WAL if `in_memory_parts_enable_wal` is disabled. [#17802](https://github.com/ClickHouse/ClickHouse/pull/17802) ([detailyang](https://github.com/detailyang)).
* Fixed problem when ClickHouse fails to resume connection to MySQL servers. [#17681](https://github.com/ClickHouse/ClickHouse/pull/17681) ([Alexander Kazakov](https://github.com/Akazz)).
* Fixed inconsistent behaviour of `optimize_trivial_count_query` with partition predicate. [#17644](https://github.com/ClickHouse/ClickHouse/pull/17644) ([Azat Khuzhin](https://github.com/azat)).
* Fixed empty `system.stack_trace` table when server is running in daemon mode. [#17630](https://github.com/ClickHouse/ClickHouse/pull/17630) ([Amos Bird](https://github.com/amosbird)).
* Fixed the behaviour when the exception `fmt::v7::format_error` could be logged in the background for MergeTree tables. This fixes [#17613](https://github.com/ClickHouse/ClickHouse/issues/17613). [#17615](https://github.com/ClickHouse/ClickHouse/pull/17615) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the behaviour when `clickhouse-client` is used in interactive mode with multiline queries and a single-line comment was erroneously extended to the end of the query. This fixes [#13654](https://github.com/ClickHouse/ClickHouse/issues/13654). [#17565](https://github.com/ClickHouse/ClickHouse/pull/17565) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the issue when the server could stop accepting connections in very rare cases. [#17542](https://github.com/ClickHouse/ClickHouse/pull/17542) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `ALTER` query hang when the corresponding mutation was killed on a different replica. This fixes [#16953](https://github.com/ClickHouse/ClickHouse/issues/16953). [#17499](https://github.com/ClickHouse/ClickHouse/pull/17499) ([alesapin](https://github.com/alesapin)).
* Fixed a bug when mark cache size was underestimated by ClickHouse. It may happen when there are a lot of tiny files with marks. [#17496](https://github.com/ClickHouse/ClickHouse/pull/17496) ([alesapin](https://github.com/alesapin)).
* Fixed `ORDER BY` with enabled setting `optimize_redundant_functions_in_order_by`. [#17471](https://github.com/ClickHouse/ClickHouse/pull/17471) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed duplicates after `DISTINCT` which were possible because of incorrect optimization. This fixes [#17294](https://github.com/ClickHouse/ClickHouse/issues/17294). [#17296](https://github.com/ClickHouse/ClickHouse/pull/17296) ([li chengxiang](https://github.com/chengxianglibra)). [#17439](https://github.com/ClickHouse/ClickHouse/pull/17439) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed crash while reading from `JOIN` table with `LowCardinality` types. This fixes [#17228](https://github.com/ClickHouse/ClickHouse/issues/17228). [#17397](https://github.com/ClickHouse/ClickHouse/pull/17397) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed set index invalidation when there are const columns in the subquery. This fixes [#17246](https://github.com/ClickHouse/ClickHouse/issues/17246). [#17249](https://github.com/ClickHouse/ClickHouse/pull/17249) ([Amos Bird](https://github.com/amosbird)).
* Fixed possible wrong index analysis when the types of the index comparison are different. This fixes [#17122](https://github.com/ClickHouse/ClickHouse/issues/17122). [#17145](https://github.com/ClickHouse/ClickHouse/pull/17145) ([Amos Bird](https://github.com/amosbird)).
* Fixed `ColumnConst` comparison which leads to a crash. This fixes [#17088](https://github.com/ClickHouse/ClickHouse/issues/17088). [#17135](https://github.com/ClickHouse/ClickHouse/pull/17135) ([Amos Bird](https://github.com/amosbird)).
* Fixed bug when `ON CLUSTER` queries may hang forever for non-leader `ReplicatedMergeTreeTables`. [#17089](https://github.com/ClickHouse/ClickHouse/pull/17089) ([alesapin](https://github.com/alesapin)).
* Fixed fuzzer-found bug in function `fuzzBits`. This fixes [#16980](https://github.com/ClickHouse/ClickHouse/issues/16980). [#17051](https://github.com/ClickHouse/ClickHouse/pull/17051) ([hexiaoting](https://github.com/hexiaoting)).
* Avoid unnecessary network errors for remote queries which may be cancelled during execution, like queries with `LIMIT`. [#17006](https://github.com/ClickHouse/ClickHouse/pull/17006) ([Azat Khuzhin](https://github.com/azat)).
* Fixed wrong result in big integers (128, 256 bit) when casting from double. [#16986](https://github.com/ClickHouse/ClickHouse/pull/16986) ([Mike](https://github.com/myrrc)).
* Reresolve the IP of the `format_avro_schema_registry_url` in case of errors. [#16985](https://github.com/ClickHouse/ClickHouse/pull/16985) ([filimonov](https://github.com/filimonov)).
* Fixed possible server crash after `ALTER TABLE ... MODIFY COLUMN ... NewType` when a `SELECT` has a `WHERE` expression on the altered column and the alter hasn't finished yet. [#16968](https://github.com/ClickHouse/ClickHouse/pull/16968) ([Amos Bird](https://github.com/amosbird)).
* Blame info was not calculated correctly in `clickhouse-git-import`. [#16959](https://github.com/ClickHouse/ClickHouse/pull/16959) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `ORDER BY` optimization with monotonous functions. Fixes [#16107](https://github.com/ClickHouse/ClickHouse/issues/16107). [#16956](https://github.com/ClickHouse/ClickHouse/pull/16956) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed optimization of `GROUP BY` with enabled setting `optimize_aggregators_of_group_by_keys` and joins. This fixes [#12604](https://github.com/ClickHouse/ClickHouse/issues/12604). [#16951](https://github.com/ClickHouse/ClickHouse/pull/16951) ([Anton Popov](https://github.com/CurtizJ)).
* Install script should always create subdirs in config folders. This is only relevant for Docker build with custom config. [#16936](https://github.com/ClickHouse/ClickHouse/pull/16936) ([filimonov](https://github.com/filimonov)).
* Fixed possible error `Illegal type of argument` for queries with `ORDER BY`. This fixes [#16580](https://github.com/ClickHouse/ClickHouse/issues/16580). [#16928](https://github.com/ClickHouse/ClickHouse/pull/16928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Abort multipart upload if no data was written to WriteBufferFromS3. [#16840](https://github.com/ClickHouse/ClickHouse/pull/16840) ([Pavel Kovalenko](https://github.com/Jokser)).
* Fixed crash when using `any` without any arguments. This fixes [#16803](https://github.com/ClickHouse/ClickHouse/issues/16803). [#16826](https://github.com/ClickHouse/ClickHouse/pull/16826) ([Amos Bird](https://github.com/amosbird)).
* Fixed the behaviour when ClickHouse used to always return 0 instead of the number of affected rows for `INSERT` queries via MySQL protocol. This fixes [#16605](https://github.com/ClickHouse/ClickHouse/issues/16605). [#16715](https://github.com/ClickHouse/ClickHouse/pull/16715) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed uncontrolled growth of TDigest. [#16680](https://github.com/ClickHouse/ClickHouse/pull/16680) ([hrissan](https://github.com/hrissan)).
* Fixed remote query failure when using the suffix `if` in an aggregate function. This fixes [#16574](https://github.com/ClickHouse/ClickHouse/issues/16574) and fixes [#16231](https://github.com/ClickHouse/ClickHouse/issues/16231). [#16610](https://github.com/ClickHouse/ClickHouse/pull/16610) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed inconsistent behavior caused by `select_sequential_consistency` for optimized trivial count query and system.tables. [#16309](https://github.com/ClickHouse/ClickHouse/pull/16309) ([Hao Chen](https://github.com/haoch)).
* Throw an error when a `ColumnTransformer` is used to replace a non-existent column. [#16183](https://github.com/ClickHouse/ClickHouse/pull/16183) ([hexiaoting](https://github.com/hexiaoting)).
### ClickHouse release v20.11.3.3-stable, 2020-11-13
#### Bug Fix
@ -252,6 +312,46 @@
## ClickHouse release 20.10
### ClickHouse release v20.10.7.4-stable, 2020-12-24
#### Bug Fix
* Fixed issue when the `clickhouse-odbc-bridge` process is unreachable by the server on machines with a dual `IPv4/IPv6` stack, and fixed issue when ODBC dictionary updates are performed using malformed queries and/or cause crashes. This possibly closes [#14489](https://github.com/ClickHouse/ClickHouse/issues/14489). [#18278](https://github.com/ClickHouse/ClickHouse/pull/18278) ([Denis Glazachev](https://github.com/traceon)).
* Fixed key comparison between Enum and Int types. This fixes [#17989](https://github.com/ClickHouse/ClickHouse/issues/17989). [#18214](https://github.com/ClickHouse/ClickHouse/pull/18214) ([Amos Bird](https://github.com/amosbird)).
* Fixed a crash during unique key conversion in the `MaterializeMySQL` database engine. This fixes [#18186](https://github.com/ClickHouse/ClickHouse/issues/18186) and fixes [#16372](https://github.com/ClickHouse/ClickHouse/issues/16372). [#18211](https://github.com/ClickHouse/ClickHouse/pull/18211) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `std::out_of_range: basic_string` in S3 URL parsing. [#18059](https://github.com/ClickHouse/ClickHouse/pull/18059) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Fixed the issue when some tables were not synchronized to ClickHouse from MySQL because conversion of MySQL prefix indexes wasn't supported by MaterializeMySQL. This fixes [#15187](https://github.com/ClickHouse/ClickHouse/issues/15187) and fixes [#17912](https://github.com/ClickHouse/ClickHouse/issues/17912). [#17944](https://github.com/ClickHouse/ClickHouse/pull/17944) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed possible segfault in `topK` aggregate function. This closes [#17404](https://github.com/ClickHouse/ClickHouse/issues/17404). [#17845](https://github.com/ClickHouse/ClickHouse/pull/17845) ([Maksim Kita](https://github.com/kitaisreal)).
* Do not restore parts from `WAL` if `in_memory_parts_enable_wal` is disabled. [#17802](https://github.com/ClickHouse/ClickHouse/pull/17802) ([detailyang](https://github.com/detailyang)).
* Fixed problem when ClickHouse fails to resume connection to MySQL servers. [#17681](https://github.com/ClickHouse/ClickHouse/pull/17681) ([Alexander Kazakov](https://github.com/Akazz)).
* Fixed empty `system.stack_trace` table when server is running in daemon mode. [#17630](https://github.com/ClickHouse/ClickHouse/pull/17630) ([Amos Bird](https://github.com/amosbird)).
* Fixed the behaviour when `clickhouse-client` is used in interactive mode with multiline queries and a single-line comment was erroneously extended to the end of the query. This fixes [#13654](https://github.com/ClickHouse/ClickHouse/issues/13654). [#17565](https://github.com/ClickHouse/ClickHouse/pull/17565) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the issue when the server could stop accepting connections in very rare cases. [#17542](https://github.com/ClickHouse/ClickHouse/pull/17542) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `ALTER` query hang when the corresponding mutation was killed on a different replica. This fixes [#16953](https://github.com/ClickHouse/ClickHouse/issues/16953). [#17499](https://github.com/ClickHouse/ClickHouse/pull/17499) ([alesapin](https://github.com/alesapin)).
* Fixed a bug when mark cache size was underestimated by ClickHouse. It may happen when there are a lot of tiny files with marks. [#17496](https://github.com/ClickHouse/ClickHouse/pull/17496) ([alesapin](https://github.com/alesapin)).
* Fixed `ORDER BY` with enabled setting `optimize_redundant_functions_in_order_by`. [#17471](https://github.com/ClickHouse/ClickHouse/pull/17471) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed duplicates after `DISTINCT` which were possible because of incorrect optimization. Fixes [#17294](https://github.com/ClickHouse/ClickHouse/issues/17294). [#17296](https://github.com/ClickHouse/ClickHouse/pull/17296) ([li chengxiang](https://github.com/chengxianglibra)). [#17439](https://github.com/ClickHouse/ClickHouse/pull/17439) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed crash while reading from `JOIN` table with `LowCardinality` types. This fixes [#17228](https://github.com/ClickHouse/ClickHouse/issues/17228). [#17397](https://github.com/ClickHouse/ClickHouse/pull/17397) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed set index invalidation when there are const columns in the subquery. This fixes [#17246](https://github.com/ClickHouse/ClickHouse/issues/17246). [#17249](https://github.com/ClickHouse/ClickHouse/pull/17249) ([Amos Bird](https://github.com/amosbird)).
* Fixed `ColumnConst` comparison which leads to a crash. This fixes [#17088](https://github.com/ClickHouse/ClickHouse/issues/17088). [#17135](https://github.com/ClickHouse/ClickHouse/pull/17135) ([Amos Bird](https://github.com/amosbird)).
* Fixed bug when `ON CLUSTER` queries may hang forever for non-leader `ReplicatedMergeTreeTables`. [#17089](https://github.com/ClickHouse/ClickHouse/pull/17089) ([alesapin](https://github.com/alesapin)).
* Fixed fuzzer-found bug in function `fuzzBits`. This fixes [#16980](https://github.com/ClickHouse/ClickHouse/issues/16980). [#17051](https://github.com/ClickHouse/ClickHouse/pull/17051) ([hexiaoting](https://github.com/hexiaoting)).
* Avoid unnecessary network errors for remote queries which may be cancelled during execution, like queries with `LIMIT`. [#17006](https://github.com/ClickHouse/ClickHouse/pull/17006) ([Azat Khuzhin](https://github.com/azat)).
* Fixed wrong result in big integers (128, 256 bit) when casting from double. [#16986](https://github.com/ClickHouse/ClickHouse/pull/16986) ([Mike](https://github.com/myrrc)).
* Reresolve the IP of the `format_avro_schema_registry_url` in case of errors. [#16985](https://github.com/ClickHouse/ClickHouse/pull/16985) ([filimonov](https://github.com/filimonov)).
* Fixed possible server crash after `ALTER TABLE ... MODIFY COLUMN ... NewType` when a `SELECT` has a `WHERE` expression on the altered column and the alter hasn't finished yet. [#16968](https://github.com/ClickHouse/ClickHouse/pull/16968) ([Amos Bird](https://github.com/amosbird)).
* Blame info was not calculated correctly in `clickhouse-git-import`. [#16959](https://github.com/ClickHouse/ClickHouse/pull/16959) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `ORDER BY` optimization with monotonous functions. This fixes [#16107](https://github.com/ClickHouse/ClickHouse/issues/16107). [#16956](https://github.com/ClickHouse/ClickHouse/pull/16956) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed optimization of `GROUP BY` with enabled setting `optimize_aggregators_of_group_by_keys` and joins. This fixes [#12604](https://github.com/ClickHouse/ClickHouse/issues/12604). [#16951](https://github.com/ClickHouse/ClickHouse/pull/16951) ([Anton Popov](https://github.com/CurtizJ)).
* Install script should always create subdirs in config folders. This is only relevant for Docker build with custom config. [#16936](https://github.com/ClickHouse/ClickHouse/pull/16936) ([filimonov](https://github.com/filimonov)).
* Fixed possible error `Illegal type of argument` for queries with `ORDER BY`. This fixes [#16580](https://github.com/ClickHouse/ClickHouse/issues/16580). [#16928](https://github.com/ClickHouse/ClickHouse/pull/16928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Abort multipart upload if no data was written to `WriteBufferFromS3`. [#16840](https://github.com/ClickHouse/ClickHouse/pull/16840) ([Pavel Kovalenko](https://github.com/Jokser)).
* Fixed crash when using `any` without any arguments. This fixes [#16803](https://github.com/ClickHouse/ClickHouse/issues/16803). [#16826](https://github.com/ClickHouse/ClickHouse/pull/16826) ([Amos Bird](https://github.com/amosbird)).
* Fixed the behaviour when ClickHouse used to always return 0 instead of the number of affected rows for `INSERT` queries via MySQL protocol. This fixes [#16605](https://github.com/ClickHouse/ClickHouse/issues/16605). [#16715](https://github.com/ClickHouse/ClickHouse/pull/16715) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed uncontrolled growth of `TDigest`. [#16680](https://github.com/ClickHouse/ClickHouse/pull/16680) ([hrissan](https://github.com/hrissan)).
* Fixed remote query failure when using the suffix `if` in an aggregate function. This fixes [#16574](https://github.com/ClickHouse/ClickHouse/issues/16574) and fixes [#16231](https://github.com/ClickHouse/ClickHouse/issues/16231). [#16610](https://github.com/ClickHouse/ClickHouse/pull/16610) ([Winter Zhang](https://github.com/zhang2014)).
### ClickHouse release v20.10.4.1-stable, 2020-11-13
#### Bug Fix
@ -639,6 +739,31 @@
## ClickHouse release 20.8
### ClickHouse release v20.8.10.13-lts, 2020-12-24
#### Bug Fix
* When server log rotation was configured using `logger.size` parameter with numeric value larger than 2^32, the logs were not rotated properly. [#17905](https://github.com/ClickHouse/ClickHouse/pull/17905) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fixed incorrect initialization of `max_compress_block_size` in MergeTreeWriterSettings with `min_compress_block_size`. [#17833](https://github.com/ClickHouse/ClickHouse/pull/17833) ([flynn](https://github.com/ucasFL)).
* Fixed problem when ClickHouse fails to resume connection to MySQL servers. [#17681](https://github.com/ClickHouse/ClickHouse/pull/17681) ([Alexander Kazakov](https://github.com/Akazz)).
* Fixed `ALTER` query hang when the corresponding mutation was killed on a different replica. This fixes [#16953](https://github.com/ClickHouse/ClickHouse/issues/16953). [#17499](https://github.com/ClickHouse/ClickHouse/pull/17499) ([alesapin](https://github.com/alesapin)).
* Fixed a bug when mark cache size was underestimated by ClickHouse. It may happen when there are a lot of tiny files with marks. [#17496](https://github.com/ClickHouse/ClickHouse/pull/17496) ([alesapin](https://github.com/alesapin)).
* Fixed `ORDER BY` with enabled setting `optimize_redundant_functions_in_order_by`. [#17471](https://github.com/ClickHouse/ClickHouse/pull/17471) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed `ColumnConst` comparison which leads to a crash. This fixes [#17088](https://github.com/ClickHouse/ClickHouse/issues/17088). [#17135](https://github.com/ClickHouse/ClickHouse/pull/17135) ([Amos Bird](https://github.com/amosbird)).
* Fixed bug when `ON CLUSTER` queries may hang forever for non-leader ReplicatedMergeTreeTables. [#17089](https://github.com/ClickHouse/ClickHouse/pull/17089) ([alesapin](https://github.com/alesapin)).
* Avoid unnecessary network errors for remote queries which may be cancelled during execution, like queries with `LIMIT`. [#17006](https://github.com/ClickHouse/ClickHouse/pull/17006) ([Azat Khuzhin](https://github.com/azat)).
* Reresolve the IP of the `format_avro_schema_registry_url` in case of errors. [#16985](https://github.com/ClickHouse/ClickHouse/pull/16985) ([filimonov](https://github.com/filimonov)).
* Fixed possible server crash after `ALTER TABLE ... MODIFY COLUMN ... NewType` when a `SELECT` has a `WHERE` expression on the altered column and the alter hasn't finished yet. [#16968](https://github.com/ClickHouse/ClickHouse/pull/16968) ([Amos Bird](https://github.com/amosbird)).
* Install script should always create subdirs in config folders. This is only relevant for Docker build with custom config. [#16936](https://github.com/ClickHouse/ClickHouse/pull/16936) ([filimonov](https://github.com/filimonov)).
* Fixed possible error `Illegal type of argument` for queries with `ORDER BY`. Fixes [#16580](https://github.com/ClickHouse/ClickHouse/issues/16580). [#16928](https://github.com/ClickHouse/ClickHouse/pull/16928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Abort multipart upload if no data was written to WriteBufferFromS3. [#16840](https://github.com/ClickHouse/ClickHouse/pull/16840) ([Pavel Kovalenko](https://github.com/Jokser)).
* Fixed crash when using `any` without any arguments. This fixes [#16803](https://github.com/ClickHouse/ClickHouse/issues/16803). [#16826](https://github.com/ClickHouse/ClickHouse/pull/16826) ([Amos Bird](https://github.com/amosbird)).
* Fixed `IN` operator over several columns and tuples with enabled `transform_null_in` setting. Fixes [#15310](https://github.com/ClickHouse/ClickHouse/issues/15310). [#16722](https://github.com/ClickHouse/ClickHouse/pull/16722) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed inconsistent behaviour of `optimize_read_in_order/optimize_aggregation_in_order` with `max_threads > 0` and an expression in `ORDER BY`. [#16637](https://github.com/ClickHouse/ClickHouse/pull/16637) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the issue when query optimization was producing a wrong result if the query contains `ARRAY JOIN`. [#17887](https://github.com/ClickHouse/ClickHouse/pull/17887) ([sundyli](https://github.com/sundy-li)).
* Queries finish faster in case of exception. Cancel execution on remote replicas if an exception happens. [#15578](https://github.com/ClickHouse/ClickHouse/pull/15578) ([Azat Khuzhin](https://github.com/azat)).
### ClickHouse release v20.8.6.6-lts, 2020-11-13
#### Bug Fix


@ -468,6 +468,7 @@ include (cmake/find/rapidjson.cmake)
include (cmake/find/fastops.cmake)
include (cmake/find/odbc.cmake)
include (cmake/find/rocksdb.cmake)
include (cmake/find/nuraft.cmake)
if(NOT USE_INTERNAL_PARQUET_LIBRARY)


@ -61,6 +61,20 @@
# endif
#endif
#if defined(ADDRESS_SANITIZER)
# define BOOST_USE_ASAN 1
# define BOOST_USE_UCONTEXT 1
#endif
#if defined(THREAD_SANITIZER)
# define BOOST_USE_TSAN 1
# define BOOST_USE_UCONTEXT 1
#endif
#if defined(ARCADIA_BUILD) && defined(BOOST_USE_UCONTEXT)
# undef BOOST_USE_UCONTEXT
#endif
/// TODO: Strangely enough, there is no way to detect UB sanitizer.
/// Explicitly allow undefined behaviour for certain functions. Use it as a function attribute.


@ -4,6 +4,11 @@
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/resource.h>
#if defined(__linux__)
#include <sys/prctl.h>
#endif
#include <fcntl.h>
#include <errno.h>
#include <string.h>
@ -12,7 +17,6 @@
#include <unistd.h>
#include <typeinfo>
#include <sys/resource.h>
#include <iostream>
#include <fstream>
#include <sstream>
@ -22,7 +26,6 @@
#include <Poco/Observer.h>
#include <Poco/AutoPtr.h>
#include <Poco/PatternFormatter.h>
#include <Poco/TaskManager.h>
#include <Poco/File.h>
#include <Poco/Path.h>
#include <Poco/Message.h>
@ -470,7 +473,6 @@ BaseDaemon::~BaseDaemon()
void BaseDaemon::terminate()
{
getTaskManager().cancelAll();
if (::raise(SIGTERM) != 0)
throw Poco::SystemException("cannot terminate process");
}
@ -478,22 +480,11 @@ void BaseDaemon::terminate()
void BaseDaemon::kill()
{
dumpCoverageReportIfPossible();
pid.reset();
pid_file.reset();
if (::raise(SIGKILL) != 0)
throw Poco::SystemException("cannot kill process");
}
void BaseDaemon::sleep(double seconds)
{
wakeup_event.reset();
wakeup_event.tryWait(seconds * 1000);
}
void BaseDaemon::wakeup()
{
wakeup_event.set();
}
std::string BaseDaemon::getDefaultCorePath() const
{
return "/opt/cores/";
@ -564,7 +555,6 @@ void BaseDaemon::initialize(Application & self)
{
closeFDs();
task_manager = std::make_unique<Poco::TaskManager>();
ServerApplication::initialize(self);
/// now highest priority (lowest value) is PRIO_APPLICATION = -100, we want higher!
@ -648,10 +638,6 @@ void BaseDaemon::initialize(Application & self)
throw Poco::OpenFileException("Cannot attach stdout to " + stdout_path);
}
/// Create pid file.
if (config().has("pid"))
pid.emplace(config().getString("pid"), DB::StatusFile::write_pid);
/// Change path for logging.
if (!log_path.empty())
{
@ -667,9 +653,17 @@ void BaseDaemon::initialize(Application & self)
throw Poco::Exception("Cannot change directory to /tmp");
}
// sensitive data masking rules are not used here
/// sensitive data masking rules are not used here
buildLoggers(config(), logger(), self.commandName());
/// After initialized loggers but before initialized signal handling.
if (should_setup_watchdog)
setupWatchdog();
/// Create pid file.
if (config().has("pid"))
pid_file.emplace(config().getString("pid"), DB::StatusFile::write_pid);
if (is_daemon)
{
/** Change working directory to the directory to write core dumps.
@ -704,54 +698,71 @@ void BaseDaemon::initialize(Application & self)
}
static void addSignalHandler(const std::vector<int> & signals, signal_function handler, std::vector<int> * out_handled_signals)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_sigaction = handler;
sa.sa_flags = SA_SIGINFO;
#if defined(OS_DARWIN)
sigemptyset(&sa.sa_mask);
for (auto signal : signals)
sigaddset(&sa.sa_mask, signal);
#else
if (sigemptyset(&sa.sa_mask))
throw Poco::Exception("Cannot set signal handler.");
for (auto signal : signals)
if (sigaddset(&sa.sa_mask, signal))
throw Poco::Exception("Cannot set signal handler.");
#endif
for (auto signal : signals)
if (sigaction(signal, &sa, nullptr))
throw Poco::Exception("Cannot set signal handler.");
if (out_handled_signals)
std::copy(signals.begin(), signals.end(), std::back_inserter(*out_handled_signals));
};
static void blockSignals(const std::vector<int> & signals)
{
sigset_t sig_set;
#if defined(OS_DARWIN)
sigemptyset(&sig_set);
for (auto signal : signals)
sigaddset(&sig_set, signal);
#else
if (sigemptyset(&sig_set))
throw Poco::Exception("Cannot block signal.");
for (auto signal : signals)
if (sigaddset(&sig_set, signal))
throw Poco::Exception("Cannot block signal.");
#endif
if (pthread_sigmask(SIG_BLOCK, &sig_set, nullptr))
throw Poco::Exception("Cannot block signal.");
};
void BaseDaemon::initializeTerminationAndSignalProcessing()
{
SentryWriter::initialize(config());
std::set_terminate(terminate_handler);
/// We want to avoid SIGPIPE when working with sockets and pipes, and just handle return value/errno instead.
{
sigset_t sig_set;
if (sigemptyset(&sig_set) || sigaddset(&sig_set, SIGPIPE) || pthread_sigmask(SIG_BLOCK, &sig_set, nullptr))
throw Poco::Exception("Cannot block signal.");
}
blockSignals({SIGPIPE});
/// Setup signal handlers.
auto add_signal_handler =
[this](const std::vector<int> & signals, signal_function handler)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_sigaction = handler;
sa.sa_flags = SA_SIGINFO;
{
#if defined(OS_DARWIN)
sigemptyset(&sa.sa_mask);
for (auto signal : signals)
sigaddset(&sa.sa_mask, signal);
#else
if (sigemptyset(&sa.sa_mask))
throw Poco::Exception("Cannot set signal handler.");
for (auto signal : signals)
if (sigaddset(&sa.sa_mask, signal))
throw Poco::Exception("Cannot set signal handler.");
#endif
for (auto signal : signals)
if (sigaction(signal, &sa, nullptr))
throw Poco::Exception("Cannot set signal handler.");
std::copy(signals.begin(), signals.end(), std::back_inserter(handled_signals));
}
};
/// SIGTSTP is added for debugging purposes, to output a stack trace of any running thread at any time.
add_signal_handler({SIGABRT, SIGSEGV, SIGILL, SIGBUS, SIGSYS, SIGFPE, SIGPIPE, SIGTSTP}, signalHandler);
add_signal_handler({SIGHUP, SIGUSR1}, closeLogsSignalHandler);
add_signal_handler({SIGINT, SIGQUIT, SIGTERM}, terminateRequestedSignalHandler);
addSignalHandler({SIGABRT, SIGSEGV, SIGILL, SIGBUS, SIGSYS, SIGFPE, SIGPIPE, SIGTSTP}, signalHandler, &handled_signals);
addSignalHandler({SIGHUP, SIGUSR1}, closeLogsSignalHandler, &handled_signals);
addSignalHandler({SIGINT, SIGQUIT, SIGTERM}, terminateRequestedSignalHandler, &handled_signals);
#if defined(SANITIZER)
__sanitizer_set_death_callback(sanitizerDeathCallback);
@ -786,23 +797,6 @@ void BaseDaemon::logRevision() const
+ ", PID " + std::to_string(getpid()));
}
/// Makes server shutdown if at least one Poco::Task have failed.
void BaseDaemon::exitOnTaskError()
{
Poco::Observer<BaseDaemon, Poco::TaskFailedNotification> obs(*this, &BaseDaemon::handleNotification);
getTaskManager().addObserver(obs);
}
/// Used for exitOnTaskError()
void BaseDaemon::handleNotification(Poco::TaskFailedNotification *_tfn)
{
task_failed = true;
Poco::AutoPtr<Poco::TaskFailedNotification> fn(_tfn);
Poco::Logger * lg = &(logger());
LOG_ERROR(lg, "Task '{}' failed. Daemon is shutting down. Reason - {}", fn->task()->name(), fn->reason().displayText());
ServerApplication::terminate();
}
void BaseDaemon::defineOptions(Poco::Util::OptionSet & new_options)
{
new_options.addOption(
@ -863,13 +857,144 @@ void BaseDaemon::onInterruptSignals(int signal_id)
if (sigint_signals_counter >= 2)
{
LOG_INFO(&logger(), "Received second signal Interrupt. Immediately terminate.");
kill();
call_default_signal_handler(signal_id);
/// If the above did not help.
_exit(128 + signal_id);
}
}
void BaseDaemon::waitForTerminationRequest()
{
/// NOTE: as we already process signals via pipe, we don't have to block them with sigprocmask in threads
std::unique_lock<std::mutex> lock(signal_handler_mutex);
signal_event.wait(lock, [this](){ return terminate_signals_counter > 0; });
}
void BaseDaemon::shouldSetupWatchdog(char * argv0_)
{
should_setup_watchdog = true;
argv0 = argv0_;
}
void BaseDaemon::setupWatchdog()
{
/// Initialize in advance to avoid double initialization in forked processes.
DateLUT::instance();
std::string original_process_name;
if (argv0)
original_process_name = argv0;
while (true)
{
static pid_t pid = -1;
pid = fork();
if (-1 == pid)
throw Poco::Exception("Cannot fork");
if (0 == pid)
{
logger().information("Forked a child process to watch");
#if defined(__linux__)
if (0 != prctl(PR_SET_PDEATHSIG, SIGKILL))
logger().warning("Cannot do prctl to ask termination with parent.");
#endif
return;
}
/// Change short thread name and process name.
setThreadName("clckhouse-watch"); /// 15 characters
if (argv0)
{
const char * new_process_name = "clickhouse-watchdog";
memset(argv0, 0, original_process_name.size());
memcpy(argv0, new_process_name, std::min(strlen(new_process_name), original_process_name.size()));
}
logger().information(fmt::format("Will watch for the process with pid {}", pid));
/// Forward signals to the child process.
addSignalHandler(
{SIGHUP, SIGUSR1, SIGINT, SIGQUIT, SIGTERM},
[](int sig, siginfo_t *, void *)
{
/// Forward all signals except INT as it can be sent by the terminal to the process group when the user presses Ctrl+C,
/// and we process double delivery of this signal as immediate termination.
if (sig == SIGINT)
return;
const char * error_message = "Cannot forward signal to the child process.\n";
if (0 != ::kill(pid, sig))
{
auto res = write(STDERR_FILENO, error_message, strlen(error_message));
(void)res;
}
},
nullptr);
int status = 0;
do
{
if (-1 != waitpid(pid, &status, WUNTRACED | WCONTINUED) || errno == ECHILD)
{
if (WIFSTOPPED(status))
logger().warning(fmt::format("Child process was stopped by signal {}.", WSTOPSIG(status)));
else if (WIFCONTINUED(status))
logger().warning(fmt::format("Child process was continued."));
else
break;
}
else if (errno != EINTR)
throw Poco::Exception("Cannot waitpid, errno: " + std::string(strerror(errno)));
} while (true);
if (errno == ECHILD)
{
logger().information("Child process no longer exists.");
_exit(status);
}
if (WIFEXITED(status))
{
logger().information(fmt::format("Child process exited normally with code {}.", WEXITSTATUS(status)));
_exit(status);
}
if (WIFSIGNALED(status))
{
int sig = WTERMSIG(status);
if (sig == SIGKILL)
{
logger().fatal(fmt::format("Child process was terminated by signal {} (KILL)."
" If it is not done by 'forcestop' command or manually,"
" the possible cause is OOM Killer (see 'dmesg' and look at the '/var/log/kern.log' for the details).", sig));
}
else
{
logger().fatal(fmt::format("Child process was terminated by signal {}.", sig));
if (sig == SIGINT || sig == SIGTERM || sig == SIGQUIT)
_exit(status);
}
}
else
{
logger().fatal("Child process was not exited normally by unknown reason.");
}
/// Automatic restart is not enabled but you can play with it.
#if 1
_exit(status);
#else
logger().information("Will restart.");
if (argv0)
memcpy(argv0, original_process_name.c_str(), original_process_name.size());
#endif
}
}
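
The hunk above implements the watchdog by forking: the child does the real work while the parent waits on it and reports how it terminated. As a reading aid, a minimal standalone sketch of the same fork-and-waitpid pattern (an illustrative simplification under assumed POSIX semantics, not the actual ClickHouse code):

#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <csignal>
#include <cstdio>
#if defined(__linux__)
#include <sys/prctl.h>
#endif

int main()
{
    pid_t pid = fork();
    if (pid == -1)
        return perror("fork"), 1;

    if (pid == 0)
    {
#if defined(__linux__)
        /// Ask the kernel to SIGKILL the child if the watchdog dies first.
        prctl(PR_SET_PDEATHSIG, SIGKILL);
#endif
        /// ... the real server work would happen here ...
        sleep(2);
        _exit(0);
    }

    /// The parent is the watchdog: wait for the child and report how it ended.
    int status = 0;
    while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
        ;
    if (WIFEXITED(status))
        printf("Child exited with code %d.\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("Child was terminated by signal %d.\n", WTERMSIG(status));
    return 0;
}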


@ -12,7 +12,6 @@
#include <chrono>
#include <Poco/Process.h>
#include <Poco/ThreadPool.h>
#include <Poco/TaskNotification.h>
#include <Poco/Util/Application.h>
#include <Poco/Util/ServerApplication.h>
#include <Poco/Net/SocketAddress.h>
@ -26,9 +25,6 @@
#include <loggers/Loggers.h>
namespace Poco { class TaskManager; }
/// \brief Base class for applications that can run as daemons.
///
/// \code
@ -52,31 +48,26 @@ public:
BaseDaemon();
~BaseDaemon() override;
/// Loads the configuration and "builds" loggers for writing to files
/// Load configuration, prepare loggers, etc.
void initialize(Poco::Util::Application &) override;
/// Reads the configuration
void reloadConfiguration();
/// Defines a command line parameter
/// Process command line parameters
void defineOptions(Poco::Util::OptionSet & new_options) override;
/// Makes the daemon terminate if at least one task has failed
void exitOnTaskError();
/// Graceful shutdown
static void terminate();
/// Daemon shutdown ("soft")
void terminate();
/// Daemon shutdown ("hard")
/// Forceful shutdown
void kill();
/// Has a termination signal been received?
/// Cancellation request has been received.
bool isCancelled() const
{
return is_cancelled;
}
/// Get a reference to the daemon instance
static BaseDaemon & instance()
{
return dynamic_cast<BaseDaemon &>(Poco::Util::Application::instance());
@ -85,12 +76,6 @@ public:
/// return none if daemon doesn't exist, reference to the daemon otherwise
static std::optional<std::reference_wrapper<BaseDaemon>> tryGetInstance() { return tryGetInstance<BaseDaemon>(); }
/// Sleeps for the given number of seconds or until a wakeup event
void sleep(double seconds);
/// Wake up
void wakeup();
/// In Graphite, path components (folders) are separated by a dot.
/// We use a path of the form root_path.hostname_yandex_ru.key
/// root_path is one_min by default
@ -131,24 +116,23 @@ public:
/// also doesn't close global internal pipes for signal handling
static void closeFDs();
/// If this method is called after initialization and before run,
/// will fork child process and setup watchdog that will print diagnostic info, if the child terminates.
/// argv0 is needed to change process name (consequently, it is needed for scripts involving "pgrep", "pidof" to work correctly).
void shouldSetupWatchdog(char * argv0_);
protected:
/// Returns the application's TaskManager
/// all task_manager methods should be called from a single thread,
/// otherwise a deadlock is possible, because joinAll runs under a lock and every method also takes the lock
Poco::TaskManager & getTaskManager() { return *task_manager; }
virtual void logRevision() const;
/// Used by exitOnTaskError()
void handleNotification(Poco::TaskFailedNotification *);
/// thread safe
virtual void handleSignal(int signal_id);
/// initialize termination process and signal handlers
virtual void initializeTerminationAndSignalProcessing();
/// the pipe-based implementation of termination signal handling does not require blocking the signal with sigprocmask in all threads
/// fork the main process and watch if it was killed
void setupWatchdog();
void waitForTerminationRequest()
#if defined(POCO_CLICKHOUSE_PATCH) || POCO_VERSION >= 0x02000000 // in old upstream poco not virtual
override
@ -162,21 +146,13 @@ protected:
virtual std::string getDefaultCorePath() const;
std::unique_ptr<Poco::TaskManager> task_manager;
std::optional<DB::StatusFile> pid;
std::optional<DB::StatusFile> pid_file;
std::atomic_bool is_cancelled{false};
/// The flag is set by a message from a Task (on abnormal termination).
bool task_failed = false;
bool log_to_console = false;
/// An event to wake up during waiting
Poco::Event wakeup_event;
/// A thread that receives the HUP/USR1 signal to close logs.
/// A thread that acts on HUP and USR1 signal (close logs).
Poco::Thread signal_listener_thread;
std::unique_ptr<Poco::Runnable> signal_listener;
@ -194,6 +170,9 @@ protected:
String build_id_info;
std::vector<int> handled_signals;
bool should_setup_watchdog = false;
char * argv0 = nullptr;
};


@ -38,7 +38,7 @@
* = log(6.3*5.3) + lgamma(5.3)
* = log(6.3*5.3*4.3*3.3*2.3) + lgamma(2.3)
* 2. Polynomial approximation of lgamma around its
* minimun ymin=1.461632144968362245 to maintain monotonicity.
* minimum ymin=1.461632144968362245 to maintain monotonicity.
* On [ymin-0.23, ymin+0.27] (i.e., [1.23164,1.73163]), use
* Let z = x-ymin;
* lgamma(x) = -1.214862905358496078218 + z^2*poly(z)


@ -202,7 +202,7 @@ long double powl(long double x, long double y)
volatile long double z=0;
long double w=0, W=0, Wa=0, Wb=0, ya=0, yb=0, u=0;
/* make sure no invalid exception is raised by nan comparision */
/* make sure no invalid exception is raised by nan comparison */
if (isnan(x)) {
if (!isnan(y) && y == 0.0)
return 1.0;


@ -0,0 +1,17 @@
#include <sys/timerfd.h>
#include "syscall.h"
int timerfd_create(int clockid, int flags)
{
return syscall(SYS_timerfd_create, clockid, flags);
}
int timerfd_settime(int fd, int flags, const struct itimerspec *new, struct itimerspec *old)
{
return syscall(SYS_timerfd_settime, fd, flags, new, old);
}
int timerfd_gettime(int fd, struct itimerspec *cur)
{
return syscall(SYS_timerfd_gettime, fd, cur);
}
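
These wrappers route the timerfd calls through the generic syscall mechanism, since musl did not ship them. For context, a minimal sketch of how the resulting API is typically used (illustrative only, not part of this commit):

#include <sys/timerfd.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>

int main()
{
    /// A timer that first fires after 1 second, then every second.
    int fd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (fd == -1)
        return perror("timerfd_create"), 1;

    struct itimerspec spec = {};
    spec.it_value.tv_sec = 1;     /// initial expiration
    spec.it_interval.tv_sec = 1;  /// period
    if (timerfd_settime(fd, 0, &spec, nullptr) == -1)
        return perror("timerfd_settime"), 1;

    /// Each read() yields the number of expirations since the last read.
    uint64_t expirations = 0;
    if (read(fd, &expirations, sizeof(expirations)) == (ssize_t) sizeof(expirations))
        printf("Timer expired %llu time(s).\n", (unsigned long long) expirations);

    close(fd);
    return 0;
}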


@ -129,7 +129,7 @@ using namespace pcg_extras;
*
* default_multiplier<uint32_t>::multiplier()
*
* gives you the default multipler for 32-bit integers. We use the name
* gives you the default multiplier for 32-bit integers. We use the name
* of the constant and not a generic word like value to allow these classes
* to be used as mixins.
*/


@ -31,8 +31,20 @@ if (CCACHE_FOUND AND NOT COMPILER_MATCHES_CCACHE)
if (CCACHE_VERSION VERSION_GREATER "3.2.0" OR NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
message(STATUS "Using ${CCACHE_FOUND} ${CCACHE_VERSION}")
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE ${CCACHE_FOUND})
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK ${CCACHE_FOUND})
# ccache 4+ respects SOURCE_DATE_EPOCH (it always includes it in the hash
# of the manifest), and debian will extract it from d/changelog, which
# makes the ccache cache unusable
#
# FIXME: once sloppiness is introduced for this, this can be removed.
if (CCACHE_VERSION VERSION_GREATER "4.0")
message(STATUS "Ignore SOURCE_DATE_EPOCH for ccache")
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE "env -u SOURCE_DATE_EPOCH ${CCACHE_FOUND}")
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK "env -u SOURCE_DATE_EPOCH ${CCACHE_FOUND}")
else()
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE ${CCACHE_FOUND})
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK ${CCACHE_FOUND})
endif()
else ()
message(${RECONFIGURE_MESSAGE_LEVEL} "Not using ${CCACHE_FOUND} ${CCACHE_VERSION} bug: https://bugzilla.samba.org/show_bug.cgi?id=8118")
endif ()

cmake/find/nuraft.cmake Normal file

@ -0,0 +1,24 @@
option(ENABLE_NURAFT "Enable NuRaft" ${ENABLE_LIBRARIES})
if (NOT ENABLE_NURAFT)
return()
endif()
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/NuRaft/CMakeLists.txt")
message (WARNING "submodule contrib/NuRaft is missing. to fix try run: \n git submodule update --init --recursive")
message (${RECONFIGURE_MESSAGE_LEVEL} "Can't find internal NuRaft library")
set (USE_NURAFT 0)
return()
endif ()
if (NOT OS_FREEBSD)
set (USE_NURAFT 1)
set (NURAFT_LIBRARY nuraft)
set (NURAFT_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/NuRaft/include")
message (STATUS "Using NuRaft=${USE_NURAFT}: ${NURAFT_INCLUDE_DIR} : ${NURAFT_LIBRARY}")
else()
set (USE_NURAFT 0)
message (STATUS "Using internal NuRaft library on FreeBSD is not supported")
endif()


@ -1,8 +1,6 @@
option (USE_INTERNAL_POCO_LIBRARY "Use internal Poco library" ON)
if (USE_INTERNAL_POCO_LIBRARY)
set (LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/poco)
else ()
if (NOT USE_INTERNAL_POCO_LIBRARY)
find_path (ROOT_DIR NAMES Foundation/include/Poco/Poco.h include/Poco/Poco.h)
if (NOT ROOT_DIR)
message (${RECONFIGURE_MESSAGE_LEVEL} "Can't find system poco")


@ -36,15 +36,17 @@ if (NOT PARALLEL_LINK_JOBS AND AVAILABLE_PHYSICAL_MEMORY AND MAX_LINKER_MEMORY)
endif ()
# ThinLTO provides its own parallel linking
# (which is enabled only for RELWITHDEBINFO)
#
# But use 2 parallel jobs, since:
# - this is what llvm does
# - and I've verified that lld-11 does not use all available CPU time (in peak) while linking one binary
if (ENABLE_THINLTO AND PARALLEL_LINK_JOBS GREATER 2)
if (CMAKE_BUILD_TYPE_UC STREQUAL "RELWITHDEBINFO" AND ENABLE_THINLTO AND PARALLEL_LINK_JOBS GREATER 2)
message(STATUS "ThinLTO provides its own parallel linking - limiting parallel link jobs to 2.")
set (PARALLEL_LINK_JOBS 2)
endif()
if (PARALLEL_LINK_JOBS AND (NOT NUMBER_OF_LOGICAL_CORES OR PARALLEL_COMPILE_JOBS LESS NUMBER_OF_LOGICAL_CORES))
if (PARALLEL_LINK_JOBS AND (NOT NUMBER_OF_LOGICAL_CORES OR PARALLEL_LINK_JOBS LESS NUMBER_OF_LOGICAL_CORES))
set(CMAKE_JOB_POOL_LINK link_job_pool${CMAKE_CURRENT_SOURCE_DIR})
string (REGEX REPLACE "[^a-zA-Z0-9]+" "_" CMAKE_JOB_POOL_LINK ${CMAKE_JOB_POOL_LINK})
set_property(GLOBAL APPEND PROPERTY JOB_POOLS ${CMAKE_JOB_POOL_LINK}=${PARALLEL_LINK_JOBS})


@ -15,10 +15,6 @@ if (COMPILER_GCC)
elseif (COMPILER_CLANG)
# Require minimum version of clang/apple-clang
if (CMAKE_CXX_COMPILER_ID MATCHES "AppleClang")
# If you are developer you can figure out what exact versions of AppleClang are Ok,
# remove the following line and commit changes below.
message (FATAL_ERROR "AppleClang is not supported, you should install clang from brew.")
# AppleClang 10.0.1 (Xcode 10.2) corresponds to LLVM/Clang upstream version 7.0.0
# AppleClang 11.0.0 (Xcode 11.0) corresponds to LLVM/Clang upstream version 8.0.0
set (XCODE_MINIMUM_VERSION 10.2)
@ -89,4 +85,4 @@ if (ARCH_PPC64LE)
if (COMPILER_CLANG OR (COMPILER_GCC AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 8))
message(FATAL_ERROR "Only gcc-8 or higher is supported for powerpc architecture")
endif ()
endif ()
endif ()


@ -14,6 +14,13 @@ unset (_current_dir_name)
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -w")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -w")
if (WITH_COVERAGE)
set (WITHOUT_COVERAGE_LIST ${WITHOUT_COVERAGE})
separate_arguments(WITHOUT_COVERAGE_LIST)
# disable coverage for contrib files and build with optimisations
add_compile_options(-O3 -DNDEBUG -finline-functions -finline-hint-functions ${WITHOUT_COVERAGE_LIST})
endif()
if (SANITIZE STREQUAL "undefined")
# 3rd-party libraries usually not intended to work with UBSan.
add_compile_options(-fno-sanitize=undefined)
@ -300,5 +307,9 @@ if (USE_INTERNAL_ROCKSDB_LIBRARY)
add_subdirectory(rocksdb-cmake)
endif()
if (USE_NURAFT)
add_subdirectory(nuraft-cmake)
endif()
add_subdirectory(fast_float)

contrib/NuRaft vendored Submodule

@ -0,0 +1 @@
Subproject commit 410bd149da84cdde60b4436b02b738749f4e87e1

contrib/boost vendored

@ -1 +1 @@
Subproject commit 0b98b443aa7bb77d65efd7b23b3b8c8a0ab5f1f3
Subproject commit 8e259cd2a6b60d75dd17e73432f11bb7b9351bb1


@ -11,10 +11,13 @@ if (NOT USE_INTERNAL_BOOST_LIBRARY)
iostreams
program_options
regex
context
coroutine
)
if(Boost_INCLUDE_DIR AND Boost_FILESYSTEM_LIBRARY AND Boost_FILESYSTEM_LIBRARY AND
Boost_PROGRAM_OPTIONS_LIBRARY AND Boost_REGEX_LIBRARY AND Boost_SYSTEM_LIBRARY)
Boost_PROGRAM_OPTIONS_LIBRARY AND Boost_REGEX_LIBRARY AND Boost_SYSTEM_LIBRARY AND Boost_CONTEXT_LIBRARY AND
Boost_COROUTINE_LIBRARY)
set(EXTERNAL_BOOST_FOUND 1)
@ -27,18 +30,24 @@ if (NOT USE_INTERNAL_BOOST_LIBRARY)
add_library (_boost_program_options INTERFACE)
add_library (_boost_regex INTERFACE)
add_library (_boost_system INTERFACE)
add_library (_boost_context INTERFACE)
add_library (_boost_coroutine INTERFACE)
target_link_libraries (_boost_filesystem INTERFACE ${Boost_FILESYSTEM_LIBRARY})
target_link_libraries (_boost_iostreams INTERFACE ${Boost_IOSTREAMS_LIBRARY})
target_link_libraries (_boost_program_options INTERFACE ${Boost_PROGRAM_OPTIONS_LIBRARY})
target_link_libraries (_boost_regex INTERFACE ${Boost_REGEX_LIBRARY})
target_link_libraries (_boost_system INTERFACE ${Boost_SYSTEM_LIBRARY})
target_link_libraries (_boost_context INTERFACE ${Boost_CONTEXT_LIBRARY})
target_link_libraries (_boost_coroutine INTERFACE ${Boost_COROUTINE_LIBRARY})
add_library (boost::filesystem ALIAS _boost_filesystem)
add_library (boost::iostreams ALIAS _boost_iostreams)
add_library (boost::program_options ALIAS _boost_program_options)
add_library (boost::regex ALIAS _boost_regex)
add_library (boost::system ALIAS _boost_system)
add_library (boost::context ALIAS _boost_context)
add_library (boost::coroutine ALIAS _boost_coroutine)
else()
set(EXTERNAL_BOOST_FOUND 0)
message (${RECONFIGURE_MESSAGE_LEVEL} "Can't find system boost")
@ -72,6 +81,10 @@ if (NOT EXTERNAL_BOOST_FOUND)
add_library (boost::headers_only ALIAS _boost_headers_only)
target_include_directories (_boost_headers_only SYSTEM BEFORE INTERFACE ${LIBRARY_DIR})
# asio
target_compile_definitions (_boost_headers_only INTERFACE BOOST_ASIO_STANDALONE=1)
# iostreams
set (SRCS_IOSTREAMS
@ -142,4 +155,69 @@ if (NOT EXTERNAL_BOOST_FOUND)
add_library (_boost_system ${SRCS_SYSTEM})
add_library (boost::system ALIAS _boost_system)
target_include_directories (_boost_system PRIVATE ${LIBRARY_DIR})
# context
enable_language(ASM)
SET(ASM_OPTIONS "-x assembler-with-cpp")
if (SANITIZE AND (SANITIZE STREQUAL "address" OR SANITIZE STREQUAL "thread"))
add_compile_definitions(BOOST_USE_UCONTEXT)
if (SANITIZE STREQUAL "address")
add_compile_definitions(BOOST_USE_ASAN)
elseif (SANITIZE STREQUAL "thread")
add_compile_definitions(BOOST_USE_TSAN)
endif()
set (SRCS_CONTEXT
${LIBRARY_DIR}/libs/context/src/fiber.cpp
${LIBRARY_DIR}/libs/context/src/continuation.cpp
${LIBRARY_DIR}/libs/context/src/dummy.cpp
${LIBRARY_DIR}/libs/context/src/execution_context.cpp
${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
)
elseif (ARCH_ARM)
set (SRCS_CONTEXT
${LIBRARY_DIR}/libs/context/src/asm/jump_arm64_aapcs_elf_gas.S
${LIBRARY_DIR}/libs/context/src/asm/make_arm64_aapcs_elf_gas.S
${LIBRARY_DIR}/libs/context/src/asm/ontop_arm64_aapcs_elf_gas.S
${LIBRARY_DIR}/libs/context/src/dummy.cpp
${LIBRARY_DIR}/libs/context/src/execution_context.cpp
${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
)
elseif(OS_DARWIN)
set (SRCS_CONTEXT
${LIBRARY_DIR}/libs/context/src/asm/jump_x86_64_sysv_macho_gas.S
${LIBRARY_DIR}/libs/context/src/asm/make_x86_64_sysv_macho_gas.S
${LIBRARY_DIR}/libs/context/src/asm/ontop_x86_64_sysv_macho_gas.S
${LIBRARY_DIR}/libs/context/src/dummy.cpp
${LIBRARY_DIR}/libs/context/src/execution_context.cpp
${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
)
else()
set (SRCS_CONTEXT
${LIBRARY_DIR}/libs/context/src/asm/jump_x86_64_sysv_elf_gas.S
${LIBRARY_DIR}/libs/context/src/asm/make_x86_64_sysv_elf_gas.S
${LIBRARY_DIR}/libs/context/src/asm/ontop_x86_64_sysv_elf_gas.S
${LIBRARY_DIR}/libs/context/src/dummy.cpp
${LIBRARY_DIR}/libs/context/src/execution_context.cpp
${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
)
endif()
add_library (_boost_context ${SRCS_CONTEXT})
add_library (boost::context ALIAS _boost_context)
target_include_directories (_boost_context PRIVATE ${LIBRARY_DIR})
# coroutine
set (SRCS_COROUTINE
${LIBRARY_DIR}/libs/coroutine/detail/coroutine_context.cpp
${LIBRARY_DIR}/libs/coroutine/exceptions.cpp
${LIBRARY_DIR}/libs/coroutine/posix/stack_traits.cpp
)
add_library (_boost_coroutine ${SRCS_COROUTINE})
add_library (boost::coroutine ALIAS _boost_coroutine)
target_include_directories (_boost_coroutine PRIVATE ${LIBRARY_DIR})
target_link_libraries(_boost_coroutine PRIVATE _boost_context)
endif ()

contrib/cctz vendored

@ -1 +1 @@
Subproject commit 260ba195ef6c489968bae8c88c62a67cdac5ff9d
Subproject commit c0f1bcb97fd2782f7c3f972fadd5aad5affac4b8

contrib/croaring vendored

@ -1 +1 @@
Subproject commit 5f20740ec0de5e153e8f4cb2ab91814e8b291a14
Subproject commit d8402939b5c9fc134fd4fcf058fe0f7006d2b129


@ -23,3 +23,4 @@ add_library(roaring ${SRCS})
target_include_directories(roaring PRIVATE ${LIBRARY_DIR}/include/roaring)
target_include_directories(roaring SYSTEM BEFORE PUBLIC ${LIBRARY_DIR}/include)
target_include_directories(roaring SYSTEM BEFORE PUBLIC ${LIBRARY_DIR}/cpp)


@ -54,6 +54,26 @@ else ()
set(CARES_SHARED ON CACHE BOOL "" FORCE)
endif ()
# Disable looking for libnsl on platforms that have gethostbyname in glibc
#
# c-ares searches for gethostbyname in the libnsl library; however, the
# version shipped with gRPC does this wrong [1], since it uses
# CHECK_LIBRARY_EXISTS(), which will return TRUE even if the function exists in
# another dependent library. The upstream already contains correct macro [2],
# but it is not included in gRPC (even upstream gRPC, not the one that is
# shipped with clickhouse).
#
# [1]: https://github.com/c-ares/c-ares/blob/e982924acee7f7313b4baa4ee5ec000c5e373c30/CMakeLists.txt#L125
# [2]: https://github.com/c-ares/c-ares/blob/44fbc813685a1fa8aa3f27fcd7544faf612d376a/CMakeLists.txt#L146
#
# And because of that, if you for some reason have libnsl [3] installed, clickhouse will
# refuse to start without it, even though it is a completely different library.
#
# [3]: https://packages.debian.org/bullseye/libnsl2
if (NOT CMAKE_SYSTEM_NAME STREQUAL "SunOS")
set(HAVE_LIBNSL OFF CACHE BOOL "" FORCE)
endif()
# We don't want to build C# extensions.
set(gRPC_BUILD_CSHARP_EXT OFF)


@ -0,0 +1,45 @@
set(LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/NuRaft)
set(SRCS
${LIBRARY_DIR}/src/handle_priority.cxx
${LIBRARY_DIR}/src/buffer_serializer.cxx
${LIBRARY_DIR}/src/peer.cxx
${LIBRARY_DIR}/src/global_mgr.cxx
${LIBRARY_DIR}/src/buffer.cxx
${LIBRARY_DIR}/src/asio_service.cxx
${LIBRARY_DIR}/src/handle_client_request.cxx
${LIBRARY_DIR}/src/raft_server.cxx
${LIBRARY_DIR}/src/snapshot.cxx
${LIBRARY_DIR}/src/handle_commit.cxx
${LIBRARY_DIR}/src/error_code.cxx
${LIBRARY_DIR}/src/crc32.cxx
${LIBRARY_DIR}/src/handle_snapshot_sync.cxx
${LIBRARY_DIR}/src/stat_mgr.cxx
${LIBRARY_DIR}/src/handle_join_leave.cxx
${LIBRARY_DIR}/src/handle_user_cmd.cxx
${LIBRARY_DIR}/src/handle_custom_notification.cxx
${LIBRARY_DIR}/src/handle_vote.cxx
${LIBRARY_DIR}/src/launcher.cxx
${LIBRARY_DIR}/src/srv_config.cxx
${LIBRARY_DIR}/src/snapshot_sync_req.cxx
${LIBRARY_DIR}/src/handle_timeout.cxx
${LIBRARY_DIR}/src/handle_append_entries.cxx
${LIBRARY_DIR}/src/cluster_config.cxx
)
add_library(nuraft ${SRCS})
target_compile_definitions(nuraft PRIVATE USE_BOOST_ASIO=1 BOOST_ASIO_STANDALONE=1)
target_include_directories (nuraft SYSTEM PRIVATE ${LIBRARY_DIR}/include/libnuraft)
# NuRaft for some reason includes "asio.h" directly without the "boost/" prefix.
target_include_directories (nuraft SYSTEM PRIVATE ${ClickHouse_SOURCE_DIR}/contrib/boost/boost)
target_link_libraries (nuraft PRIVATE boost::headers_only boost::coroutine)
if(OPENSSL_SSL_LIBRARY AND OPENSSL_CRYPTO_LIBRARY)
target_link_libraries (nuraft PRIVATE ${OPENSSL_SSL_LIBRARY} ${OPENSSL_CRYPTO_LIBRARY})
endif()
target_include_directories (nuraft SYSTEM PUBLIC ${LIBRARY_DIR}/include)

contrib/poco vendored

@ -1 +1 @@
Subproject commit 08974cc024b2e748f5b1d45415396706b3521d0f
Subproject commit 2c32e17c7dfee1f8bf24227b697cdef5fddf0823


@ -1,3 +1,5 @@
set (LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/poco)
add_subdirectory (Crypto)
add_subdirectory (Data)
add_subdirectory (Data/ODBC)


@ -30,7 +30,7 @@ if (ENABLE_SSL)
target_compile_options (_poco_crypto PRIVATE -Wno-newline-eof)
target_include_directories (_poco_crypto SYSTEM PUBLIC ${LIBRARY_DIR}/Crypto/include)
target_link_libraries (_poco_crypto PUBLIC Poco::Foundation ssl)
target_link_libraries (_poco_crypto PUBLIC Poco::Foundation ssl crypto)
else ()
add_library (Poco::Crypto UNKNOWN IMPORTED GLOBAL)


@ -2,12 +2,6 @@
set(ROCKSDB_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/rocksdb")
list(APPEND CMAKE_MODULE_PATH "${ROCKSDB_SOURCE_DIR}/cmake/modules/")
find_program(CCACHE_FOUND ccache)
if(CCACHE_FOUND)
set_property(GLOBAL PROPERTY RULE_LAUNCH_COMPILE ccache)
set_property(GLOBAL PROPERTY RULE_LAUNCH_LINK ccache)
endif(CCACHE_FOUND)
if (SANITIZE STREQUAL "undefined")
set(WITH_UBSAN ON)
elseif (SANITIZE STREQUAL "address")

2
debian/compat vendored
View File

@ -1 +1 @@
8
10

2
debian/rules vendored
View File

@ -5,7 +5,7 @@
export DH_VERBOSE=1
# -pie only for static mode
export DEB_BUILD_MAINT_OPTIONS=hardening=+all,-pie
export DEB_BUILD_MAINT_OPTIONS=hardening=-all
# because copy_headers.sh have hardcoded path to build/include_directories.txt
BUILDDIR = obj-$(DEB_HOST_GNU_TYPE)

View File

@ -17,10 +17,6 @@
"name": "yandex/clickhouse-unbundled-builder",
"dependent": []
},
"docker/test/coverage": {
"name": "yandex/clickhouse-coverage",
"dependent": []
},
"docker/test/compatibility/centos": {
"name": "yandex/clickhouse-test-old-centos",
"dependent": []
@ -48,27 +44,22 @@
"docker/test/stateless": {
"name": "yandex/clickhouse-stateless-test",
"dependent": [
"docker/test/stateful"
"docker/test/stateful",
"docker/test/coverage"
]
},
"docker/test/stateless_pytest": {
"name": "yandex/clickhouse-stateless-pytest",
"dependent": []
},
"docker/test/stateless_with_coverage": {
"name": "yandex/clickhouse-stateless-test-with-coverage",
"dependent": [
"docker/test/stateful_with_coverage"
]
},
"docker/test/stateful": {
"name": "yandex/clickhouse-stateful-test",
"dependent": [
"docker/test/stress"
]
},
"docker/test/stateful_with_coverage": {
"name": "yandex/clickhouse-stateful-test-with-coverage",
"docker/test/coverage": {
"name": "yandex/clickhouse-test-coverage",
"dependent": []
},
"docker/test/unit": {
@ -143,9 +134,7 @@
"name": "yandex/clickhouse-test-base",
"dependent": [
"docker/test/stateless",
"docker/test/stateless_with_coverage",
"docker/test/stateless_pytest",
"docker/test/coverage"
"docker/test/stateless_pytest"
]
},
"docker/packager/unbundled": {

View File

@ -21,6 +21,8 @@ RUN apt-get update \
libboost-thread-dev \
libboost-iostreams-dev \
libboost-regex-dev \
libboost-context-dev \
libboost-coroutine-dev \
zlib1g-dev \
liblz4-dev \
libdouble-conversion-dev \

View File

@ -1,16 +1,18 @@
# docker build -t yandex/clickhouse-coverage .
ARG PARENT_TAG=latest
FROM yandex/clickhouse-test-base:${PARENT_TAG}
# docker build -t yandex/clickhouse-test-coverage .
FROM yandex/clickhouse-stateless-test
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get install --yes --no-install-recommends \
cmake
COPY s3downloader /s3downloader
COPY run.sh /run.sh
ENV DATASETS="hits visits"
ENV COVERAGE_DIR=/coverage_reports
ENV SOURCE_DIR=/build
ENV OUTPUT_DIR=/output
ENV IGNORE='.*contrib.*'
RUN apt-get update && apt-get install cmake --yes --no-install-recommends
CMD mkdir -p /build/obj-x86_64-linux-gnu && cd /build/obj-x86_64-linux-gnu && CC=clang-11 CXX=clang++-11 cmake .. && cd /; \
dpkg -i /package_folder/clickhouse-common-static_*.deb; \
llvm-profdata-11 merge -sparse ${COVERAGE_DIR}/* -o clickhouse.profdata && \
llvm-cov-11 export /usr/bin/clickhouse -instr-profile=clickhouse.profdata -j=16 -format=lcov -skip-functions -ignore-filename-regex $IGNORE > output.lcov && \
genhtml output.lcov --ignore-errors source --output-directory ${OUTPUT_DIR}
CMD ["/bin/bash", "/run.sh"]

112
docker/test/coverage/run.sh Executable file
View File

@ -0,0 +1,112 @@
#!/bin/bash
kill_clickhouse () {
echo "clickhouse pids $(pgrep -u clickhouse)" | ts '%Y-%m-%d %H:%M:%S'
pkill -f "clickhouse-server" 2>/dev/null
for _ in {1..120}
do
if ! pkill -0 -f "clickhouse-server" ; then break ; fi
echo "ClickHouse still alive" | ts '%Y-%m-%d %H:%M:%S'
sleep 1
done
if pkill -0 -f "clickhouse-server"
then
pstree -apgT
jobs
echo "Failed to kill the ClickHouse server" | ts '%Y-%m-%d %H:%M:%S'
return 1
fi
}
start_clickhouse () {
LLVM_PROFILE_FILE='server_%h_%p_%m.profraw' sudo -Eu clickhouse /usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml &
counter=0
until clickhouse-client --query "SELECT 1"
do
if [ "$counter" -gt 120 ]
then
echo "Cannot start clickhouse-server"
cat /var/log/clickhouse-server/stdout.log
tail -n1000 /var/log/clickhouse-server/stderr.log
tail -n1000 /var/log/clickhouse-server/clickhouse-server.log
break
fi
sleep 0.5
counter=$((counter + 1))
done
}
chmod 777 /
dpkg -i package_folder/clickhouse-common-static_*.deb; \
dpkg -i package_folder/clickhouse-common-static-dbg_*.deb; \
dpkg -i package_folder/clickhouse-server_*.deb; \
dpkg -i package_folder/clickhouse-client_*.deb; \
dpkg -i package_folder/clickhouse-test_*.deb
mkdir -p /var/lib/clickhouse
mkdir -p /var/log/clickhouse-server
chmod 777 -R /var/log/clickhouse-server/
# install test configs
/usr/share/clickhouse-test/config/install.sh
start_clickhouse
# shellcheck disable=SC2086 # No quotes because I want to split it into words.
if ! /s3downloader --dataset-names $DATASETS; then
echo "Cannot download datatsets"
exit 1
fi
chmod 777 -R /var/lib/clickhouse
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "SHOW DATABASES"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "ATTACH DATABASE datasets ENGINE = Ordinary"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "CREATE DATABASE test"
kill_clickhouse
start_clickhouse
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "SHOW TABLES FROM datasets"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "SHOW TABLES FROM test"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "RENAME TABLE datasets.hits_v1 TO test.hits"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "RENAME TABLE datasets.visits_v1 TO test.visits"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-client --query "SHOW TABLES FROM test"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-test -j 8 --testname --shard --zookeeper --print-time --use-skip-list 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee /test_result.txt
readarray -t FAILED_TESTS < <(awk '/FAIL|TIMEOUT|ERROR/ { print substr($3, 1, length($3)-1) }' "/test_result.txt")
kill_clickhouse
sleep 3
if [[ -n "${FAILED_TESTS[*]}" ]]
then
# Clean the data so that there is no interference from the previous test run.
rm -rf /var/lib/clickhouse/{{meta,}data,user_files} ||:
start_clickhouse
echo "Going to run again: ${FAILED_TESTS[*]}"
LLVM_PROFILE_FILE='client_coverage_%5m.profraw' clickhouse-test --order=random --testname --shard --zookeeper --use-skip-list "${FAILED_TESTS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee -a /test_result.txt
else
echo "No failed tests"
fi
mkdir -p $COVERAGE_DIR
mv /*.profraw $COVERAGE_DIR
mkdir -p $SOURCE_DIR/obj-x86_64-linux-gnu
cd $SOURCE_DIR/obj-x86_64-linux-gnu && CC=clang-11 CXX=clang++-11 cmake .. && cd /
llvm-profdata-11 merge -sparse ${COVERAGE_DIR}/* -o clickhouse.profdata
llvm-cov-11 export /usr/bin/clickhouse -instr-profile=clickhouse.profdata -j=16 -format=lcov -skip-functions -ignore-filename-regex $IGNORE > output.lcov
genhtml output.lcov --ignore-errors source --output-directory ${OUTPUT_DIR}

View File

@ -1,15 +0,0 @@
# docker build -t yandex/clickhouse-stateful-test-with-coverage .
FROM yandex/clickhouse-stateless-test-with-coverage
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get install --yes --no-install-recommends \
python3-requests procps psmisc
COPY s3downloader /s3downloader
COPY run.sh /run.sh
ENV DATASETS="hits visits"
CMD ["/bin/bash", "/run.sh"]

View File

@ -1,98 +0,0 @@
#!/bin/bash
kill_clickhouse () {
echo "clickhouse pids $(pgrep -u clickhouse)" | ts '%Y-%m-%d %H:%M:%S'
pkill -f "clickhouse-server" 2>/dev/null
for _ in {1..120}
do
if ! pkill -0 -f "clickhouse-server" ; then break ; fi
echo "ClickHouse still alive" | ts '%Y-%m-%d %H:%M:%S'
sleep 1
done
if pkill -0 -f "clickhouse-server"
then
pstree -apgT
jobs
echo "Failed to kill the ClickHouse server" | ts '%Y-%m-%d %H:%M:%S'
return 1
fi
}
start_clickhouse () {
LLVM_PROFILE_FILE='server_%h_%p_%m.profraw' sudo -Eu clickhouse /usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml &
counter=0
until clickhouse-client --query "SELECT 1"
do
if [ "$counter" -gt 120 ]
then
echo "Cannot start clickhouse-server"
cat /var/log/clickhouse-server/stdout.log
tail -n1000 /var/log/clickhouse-server/stderr.log
tail -n1000 /var/log/clickhouse-server/clickhouse-server.log
break
fi
sleep 0.5
counter=$((counter + 1))
done
}
chmod 777 /
dpkg -i package_folder/clickhouse-common-static_*.deb; \
dpkg -i package_folder/clickhouse-common-static-dbg_*.deb; \
dpkg -i package_folder/clickhouse-server_*.deb; \
dpkg -i package_folder/clickhouse-client_*.deb; \
dpkg -i package_folder/clickhouse-test_*.deb
mkdir -p /var/lib/clickhouse
mkdir -p /var/log/clickhouse-server
chmod 777 -R /var/log/clickhouse-server/
# install test configs
/usr/share/clickhouse-test/config/install.sh
start_clickhouse
# shellcheck disable=SC2086 # No quotes because I want to split it into words.
if ! /s3downloader --dataset-names $DATASETS; then
echo "Cannot download datatsets"
exit 1
fi
chmod 777 -R /var/lib/clickhouse
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW DATABASES"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "ATTACH DATABASE datasets ENGINE = Ordinary"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "CREATE DATABASE test"
kill_clickhouse
start_clickhouse
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW TABLES FROM datasets"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW TABLES FROM test"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "RENAME TABLE datasets.hits_v1 TO test.hits"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "RENAME TABLE datasets.visits_v1 TO test.visits"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW TABLES FROM test"
if grep -q -- "--use-skip-list" /usr/bin/clickhouse-test; then
SKIP_LIST_OPT="--use-skip-list"
fi
# We can have several additional options, so we pass them as an array because
# it's more idiomatically correct.
read -ra ADDITIONAL_OPTIONS <<< "${ADDITIONAL_OPTIONS:-}"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-test --testname --shard --zookeeper --no-stateless --hung-check --print-time "$SKIP_LIST_OPT" "${ADDITIONAL_OPTIONS[@]}" "$SKIP_TESTS_OPTION" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee test_output/test_result.txt
kill_clickhouse
sleep 3
cp /*.profraw /profraw ||:

View File

@ -12,7 +12,32 @@ dpkg -i package_folder/clickhouse-test_*.deb
# install test configs
/usr/share/clickhouse-test/config/install.sh
service clickhouse-server start && sleep 5
# For the flaky check we also enable the thread fuzzer
if [ "$NUM_TRIES" -gt "1" ]; then
export THREAD_FUZZER_CPU_TIME_PERIOD_US=1000
export THREAD_FUZZER_SLEEP_PROBABILITY=0.1
export THREAD_FUZZER_SLEEP_TIME_US=100000
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_MIGRATE_PROBABILITY=1
export THREAD_FUZZER_pthread_mutex_lock_AFTER_MIGRATE_PROBABILITY=1
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_MIGRATE_PROBABILITY=1
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_MIGRATE_PROBABILITY=1
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_SLEEP_PROBABILITY=0.001
export THREAD_FUZZER_pthread_mutex_lock_AFTER_SLEEP_PROBABILITY=0.001
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_SLEEP_PROBABILITY=0.001
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_SLEEP_PROBABILITY=0.001
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_SLEEP_TIME_US=10000
export THREAD_FUZZER_pthread_mutex_lock_AFTER_SLEEP_TIME_US=10000
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_SLEEP_TIME_US=10000
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_SLEEP_TIME_US=10000
# simplest way to forward env variables to the server
sudo -E -u clickhouse /usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml --daemon
sleep 5
else
service clickhouse-server start && sleep 5
fi
if grep -q -- "--use-skip-list" /usr/bin/clickhouse-test; then
SKIP_LIST_OPT="--use-skip-list"

View File

@ -1,47 +0,0 @@
# docker build -t yandex/clickhouse-stateless-test-with-coverage .
# TODO: that can be based on yandex/clickhouse-stateless-test (llvm version and CMD differ)
FROM yandex/clickhouse-test-base
ARG odbc_driver_url="https://github.com/ClickHouse/clickhouse-odbc/releases/download/v1.1.4.20200302/clickhouse-odbc-1.1.4-Linux.tar.gz"
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get install --yes --no-install-recommends \
bash \
tzdata \
fakeroot \
debhelper \
expect \
python3 \
python3-lxml \
python3-termcolor \
python3-requests \
sudo \
openssl \
ncdu \
netcat-openbsd \
telnet \
tree \
moreutils \
brotli \
zstd \
gdb \
lsof \
unixodbc \
wget \
qemu-user-static \
procps \
psmisc
RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
&& odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
&& odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \
&& rm -rf /tmp/clickhouse-odbc-tmp
ENV TZ=Europe/Moscow
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
COPY run.sh /run.sh
CMD ["/bin/bash", "/run.sh"]

View File

@ -1,75 +0,0 @@
#!/bin/bash
kill_clickhouse () {
echo "clickhouse pids $(pgrep -u clickhouse)" | ts '%Y-%m-%d %H:%M:%S'
pkill -f "clickhouse-server" 2>/dev/null
for _ in {1..120}
do
if ! pkill -0 -f "clickhouse-server" ; then break ; fi
echo "ClickHouse still alive" | ts '%Y-%m-%d %H:%M:%S'
sleep 1
done
if pkill -0 -f "clickhouse-server"
then
pstree -apgT
jobs
echo "Failed to kill the ClickHouse server" | ts '%Y-%m-%d %H:%M:%S'
return 1
fi
}
start_clickhouse () {
LLVM_PROFILE_FILE='server_%h_%p_%m.profraw' sudo -Eu clickhouse /usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml &
counter=0
until clickhouse-client --query "SELECT 1"
do
if [ "$counter" -gt 120 ]
then
echo "Cannot start clickhouse-server"
cat /var/log/clickhouse-server/stdout.log
tail -n1000 /var/log/clickhouse-server/stderr.log
tail -n1000 /var/log/clickhouse-server/clickhouse-server.log
break
fi
sleep 0.5
counter=$((counter + 1))
done
}
chmod 777 /
dpkg -i package_folder/clickhouse-common-static_*.deb; \
dpkg -i package_folder/clickhouse-common-static-dbg_*.deb; \
dpkg -i package_folder/clickhouse-server_*.deb; \
dpkg -i package_folder/clickhouse-client_*.deb; \
dpkg -i package_folder/clickhouse-test_*.deb
mkdir -p /var/lib/clickhouse
mkdir -p /var/log/clickhouse-server
chmod 777 -R /var/lib/clickhouse
chmod 777 -R /var/log/clickhouse-server/
# install test configs
/usr/share/clickhouse-test/config/install.sh
start_clickhouse
if grep -q -- "--use-skip-list" /usr/bin/clickhouse-test; then
SKIP_LIST_OPT="--use-skip-list"
fi
# We can have several additional options, so we pass them as an array because
# it's more idiomatically correct.
read -ra ADDITIONAL_OPTIONS <<< "${ADDITIONAL_OPTIONS:-}"
LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-test --testname --shard --zookeeper --hung-check --print-time "$SKIP_LIST_OPT" "${ADDITIONAL_OPTIONS[@]}" "$SKIP_TESTS_OPTION" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee test_output/test_result.txt
kill_clickhouse
sleep 3
cp /*.profraw /profraw ||:

View File

@ -4,5 +4,8 @@ FROM ubuntu:20.04
RUN apt-get update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes shellcheck libxml2-utils git python3-pip && pip3 install codespell
CMD cd /ClickHouse/utils/check-style && ./check-style -n | tee /test_output/style_output.txt && \
CMD cd /ClickHouse/utils/check-style && \
./check-style -n | tee /test_output/style_output.txt && \
./check-typos | tee /test_output/typos_output.txt && \
./check-whitespaces -n | tee /test_output/whitespaces_output.txt && \
./check-duplicate-includes.sh | tee /test_output/duplicate_output.txt

View File

@ -20,7 +20,8 @@ toc_title: Cloud
## Altinity.Cloud {#altinity.cloud}
[Altinity.Cloud](https://altinity.com/cloud-database/) is a fully managed ClickHouse-as-a-Service for the Amazon public cloud.
[Altinity.Cloud](https://altinity.com/cloud-database/) is a fully managed ClickHouse-as-a-Service for the Amazon public cloud.
- Fast deployment of ClickHouse clusters on Amazon resources
- Easy scale-out/scale-in as well as vertical scaling of nodes
- Isolated per-tenant VPCs with public endpoint or VPC peering

View File

@ -281,8 +281,9 @@ After this, you can launch the server, create a `MergeTree` table, move the data
If the data in ZooKeeper was lost or damaged, you can save data by moving it to an unreplicated table as described above.
**See also**
**See Also**
- [background_schedule_pool_size](../../../operations/settings/settings.md#background_schedule_pool_size)
- [execute_merges_on_single_replica_time_threshold](../../../operations/settings/settings.md#execute-merges-on-single-replica-time-threshold)
[Original article](https://clickhouse.tech/docs/en/operations/table_engines/replication/) <!--hide-->

View File

@ -87,12 +87,13 @@ Format a query as usual, then place the values that you want to pass from the ap
```
- `name` — Placeholder identifier. In the console client it should be used in app parameters as `--param_<name> = value`.
- `data type` — [Data type](../sql-reference/data-types/index.md) of the app parameter value. For example, a data structure like `(integer, ('string', integer))` can have the `Tuple(UInt8, Tuple(String, UInt8))` data type (you can also use another [integer](../sql-reference/data-types/int-uint.md) types).
- `data type` — [Data type](../sql-reference/data-types/index.md) of the app parameter value. For example, a data structure like `(integer, ('string', integer))` can have the `Tuple(UInt8, Tuple(String, UInt8))` data type (you can also use other [integer](../sql-reference/data-types/int-uint.md) types). It is also possible to pass table, database, or column names as a parameter; in that case, use the `Identifier` data type.
#### Example {#example}
``` bash
$ clickhouse-client --param_tuple_in_tuple="(10, ('dt', 10))" -q "SELECT * FROM table WHERE val = {tuple_in_tuple:Tuple(UInt8, Tuple(String, UInt8))}"
$ clickhouse-client --param_tbl="numbers" --param_db="system" --param_col="number" --query "SELECT {col:Identifier} FROM {db:Identifier}.{tbl:Identifier} LIMIT 10"
```
## Configuring {#interfaces_cli_configuration}

View File

@ -58,6 +58,7 @@ The supported formats are:
| [XML](#xml) | ✗ | ✔ |
| [CapnProto](#capnproto) | ✔ | ✗ |
| [LineAsString](#lineasstring) | ✔ | ✗ |
| [Regexp](#data-format-regexp) | ✔ | ✗ |
| [RawBLOB](#rawblob) | ✔ | ✔ |
You can control some format processing parameters with the ClickHouse settings. For more information read the [Settings](../operations/settings/settings.md) section.
@ -1323,6 +1324,91 @@ $ cat filename.orc | clickhouse-client --query="INSERT INTO some_table FORMAT OR
To exchange data with Hadoop, you can use [HDFS table engine](../engines/table-engines/integrations/hdfs.md).
## LineAsString {#lineasstring}
In this format, every line of input data is interpreted as a single string value. This format can only be parsed for table with a single field of type [String](../sql-reference/data-types/string.md). The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#default) or [MATERIALIZED](../sql-reference/statements/create/table.md#materialized), or omitted.
**Example**
Query:
``` sql
DROP TABLE IF EXISTS line_as_string;
CREATE TABLE line_as_string (field String) ENGINE = Memory;
INSERT INTO line_as_string FORMAT LineAsString "I love apple", "I love banana", "I love orange";
SELECT * FROM line_as_string;
```
Result:
``` text
┌─field─────────────────────────────────────────────┐
│ "I love apple", "I love banana", "I love orange"; │
└───────────────────────────────────────────────────┘
```
## Regexp {#data-format-regexp}
Each line of imported data is parsed according to the regular expression.
When working with the `Regexp` format, you can use the following settings:
- `format_regexp` — [String](../sql-reference/data-types/string.md). Contains regular expression in the [re2](https://github.com/google/re2/wiki/Syntax) format.
- `format_regexp_escaping_rule` — [String](../sql-reference/data-types/string.md). The following escaping rules are supported:
- CSV (similarly to [CSV](#csv))
- JSON (similarly to [JSONEachRow](#jsoneachrow))
- Escaped (similarly to [TSV](#tabseparated))
- Quoted (similarly to [Values](#data-format-values))
- Raw (extracts subpatterns as a whole, no escaping rules)
- `format_regexp_skip_unmatched` — [UInt8](../sql-reference/data-types/int-uint.md). Defines whether an exception is thrown when the `format_regexp` expression does not match the imported data. Can be set to `0` or `1`.
**Usage**
The regular expression from the `format_regexp` setting is applied to every line of imported data. The number of subpatterns in the regular expression must be equal to the number of columns in the imported dataset.
Lines of the imported data must be separated by the newline character `'\n'` or the DOS-style newline `"\r\n"`.
The content of every matched subpattern is parsed with the method of the corresponding data type, according to the `format_regexp_escaping_rule` setting.
If the regular expression does not match the line and `format_regexp_skip_unmatched` is set to `1`, the line is silently skipped. If `format_regexp_skip_unmatched` is set to `0`, an exception is thrown.
**Example**
Consider the file data.tsv:
```text
id: 1 array: [1,2,3] string: str1 date: 2020-01-01
id: 2 array: [1,2,3] string: str2 date: 2020-01-02
id: 3 array: [1,2,3] string: str3 date: 2020-01-03
```
and the table:
```sql
CREATE TABLE imp_regex_table (id UInt32, array Array(UInt32), string String, date Date) ENGINE = Memory;
```
Import command:
```bash
$ cat data.tsv | clickhouse-client --query "INSERT INTO imp_regex_table FORMAT Regexp SETTINGS format_regexp='id: (.+?) array: (.+?) string: (.+?) date: (.+?)', format_regexp_escaping_rule='Escaped', format_regexp_skip_unmatched=0;"
```
Query:
```sql
SELECT * FROM imp_regex_table;
```
Result:
```text
┌─id─┬─array───┬─string─┬───────date─┐
│ 1 │ [1,2,3] │ str1 │ 2020-01-01 │
│ 2 │ [1,2,3] │ str2 │ 2020-01-02 │
│ 3 │ [1,2,3] │ str3 │ 2020-01-03 │
└────┴─────────┴────────┴────────────┘
```
## Format Schema {#formatschema}
The file name containing the format schema is set by the setting `format_schema`.
@ -1348,29 +1434,6 @@ Limitations:
- In case of parsing error `JSONEachRow` skips all data until the new line (or EOF), so rows must be delimited by `\n` to count errors correctly.
- `Template` and `CustomSeparated` use delimiter after the last column and delimiter between rows to find the beginning of next row, so skipping errors works only if at least one of them is not empty.
## LineAsString {#lineasstring}
In this format, a sequence of string objects separated by a newline character is interpreted as a single value. This format can only be parsed for table with a single field of type [String](../sql-reference/data-types/string.md). The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#default) or [MATERIALIZED](../sql-reference/statements/create/table.md#materialized), or omitted.
**Example**
Query:
``` sql
DROP TABLE IF EXISTS line_as_string;
CREATE TABLE line_as_string (field String) ENGINE = Memory;
INSERT INTO line_as_string FORMAT LineAsString "I love apple", "I love banana", "I love orange";
SELECT * FROM line_as_string;
```
Result:
``` text
┌─field─────────────────────────────────────────────┐
│ "I love apple", "I love banana", "I love orange"; │
└───────────────────────────────────────────────────┘
```
## RawBLOB {#rawblob}
In this format, all input data is read to a single value. It is possible to parse only a table with a single field of type [String](../sql-reference/data-types/string.md) or similar.

View File

@ -80,18 +80,27 @@ List of prefixes for [custom settings](../../operations/settings/index.md#custom
- [Custom settings](../../operations/settings/index.md#custom_settings)
## core_dump
## core_dump {#server_configuration_parameters-core_dump}
Configures soft limit for core dump file size.
Possible values:
- Positive integer.
Default value: `1073741824` (1 GB).
!!! info "Note"
Hard limit is configured via system tools
**Example**
Configures soft limit for core dump file size, one gigabyte by default.
```xml
<core_dump>
<size_limit>1073741824</size_limit>
</core_dump>
```
(Hard limit is configured via system tools)
## default_database {#default-database}
The default database.
@ -431,7 +440,7 @@ Limits total RAM usage by the ClickHouse server.
Possible values:
- Positive integer.
- 0 (auto).
- 0 — Auto.
Default value: `0`.

View File

@ -1312,7 +1312,7 @@ See also:
Write to a quorum timeout in milliseconds. If the timeout has passed and no write has taken place yet, ClickHouse will generate an exception and the client must repeat the query to write the same block to the same or any other replica.
Default value: 600000 milliseconds (ten minutes).
Default value: 600 000 milliseconds (ten minutes).
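For example, to raise the exception after only one minute (a minimal sketch; the value is illustrative):

``` sql
SET insert_quorum_timeout = 60000; -- 60 000 ms = 1 minute
```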
See also:
@ -1855,6 +1855,18 @@ Default value: `0`.
- [Distributed Table Engine](../../engines/table-engines/special/distributed.md#distributed)
- [Managing Distributed Tables](../../sql-reference/statements/system.md#query-language-system-distributed)
## insert_distributed_one_random_shard {#insert_distributed_one_random_shard}
Enables or disables random shard insertion into a [Distributed](../../engines/table-engines/special/distributed.md#distributed) table when there is no distributed key.
By default, when inserting data into a `Distributed` table with more than one shard, the ClickHouse server rejects any insertion request if there is no distributed key. When `insert_distributed_one_random_shard = 1`, insertions are allowed and data is forwarded randomly among all shards.
Possible values:
- 0 — Insertion is rejected if there are multiple shards and no distributed key is given.
- 1 — Insertion is done randomly among all available shards when no distributed key is given.
Default value: `0`.
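A minimal sketch of the behavior (the cluster, database, and table names are hypothetical):

``` sql
-- A Distributed table defined without a sharding key:
CREATE TABLE dist_events AS default.events
ENGINE = Distributed('my_cluster', 'default', 'events');

-- With more than one shard, this insert is rejected by default.
-- Enabling the setting forwards each inserted block to a random shard.
SET insert_distributed_one_random_shard = 1;
INSERT INTO dist_events VALUES (1, 'hello');
```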
## use_compact_format_in_distributed_parts_names {#use_compact_format_in_distributed_parts_names}
@ -2458,4 +2470,34 @@ Possible values:
Default value: `0`.
## data_type_default_nullable {#data_type_default_nullable}
Allows data types without explicit [NULL or NOT NULL](../../sql-reference/statements/create/table.md#null-modifiers) modifiers in column definitions to be [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable) by default.
Possible values:
- 1 — The data types in column definitions are set to `Nullable` by default.
- 0 — The data types in column definitions are not set to `Nullable` by default.
Default value: `0`.
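A minimal sketch (the table name is hypothetical):

``` sql
SET data_type_default_nullable = 1;

-- With the setting enabled, `x` is created as Nullable(Int32)
-- even though no NULL modifier is written.
CREATE TABLE nullable_by_default (x Int32) ENGINE = Memory;
```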
## execute_merges_on_single_replica_time_threshold {#execute-merges-on-single-replica-time-threshold}
Enables special logic to perform merges on replicas.
Possible values:
- Positive integer (in seconds).
- 0 — Special merges logic is not used. Merges happen in the usual way on all the replicas.
Default value: `0`.
**Usage**
Selects one replica to perform the merge on. Sets the time threshold from the start of the merge. Other replicas wait for the merge to finish, then download the result. If the time threshold passes and the selected replica does not perform the merge, then the merge is performed on other replicas as usual.
High values for that threshold may lead to replication delays.
It can be useful when merges are CPU-bound rather than IO-bound (heavy data compression, calculating aggregate functions or default expressions that require a large amount of computation, or just a very high number of tiny merges).
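A sketch of enabling the threshold, assuming it can be applied per table in the `SETTINGS` clause (the ZooKeeper path, table name, and value are hypothetical):

``` sql
CREATE TABLE replicated_demo (d Date, x UInt64)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/replicated_demo', '{replica}')
ORDER BY x
-- One replica gets 30 seconds to start the merge before the others fall back
-- to performing it themselves.
SETTINGS execute_merges_on_single_replica_time_threshold = 30;
```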
[Original article](https://clickhouse.tech/docs/en/operations/settings/settings/) <!-- hide -->

View File

@ -1,6 +1,6 @@
# system.distribution_queue {#system_tables-distribution_queue}
Contains information about local files that are in the queue to be sent to the shards. This local files contain new parts that are created by inserting new data into the Distributed table in asynchronous mode.
Contains information about local files that are in the queue to be sent to the shards. These local files contain new parts that are created by inserting new data into the Distributed table in asynchronous mode.
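A quick way to inspect what is still queued (a minimal sketch):

``` sql
SELECT * FROM system.distribution_queue FORMAT Vertical;
```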
Columns:

View File

@ -406,6 +406,14 @@ Example:
<null_value>??</null_value>
</attribute>
...
</structure>
<layout>
<ip_trie>
<!-- Key attribute `prefix` can be retrieved via dictGetString. -->
<!-- This option increases memory usage. -->
<access_to_key_from_attributes>true</access_to_key_from_attributes>
</ip_trie>
</layout>
```
or
@ -435,6 +443,6 @@ dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
Data is stored in a `trie`. It must completely fit into RAM.
Data must completely fit into RAM.
[Original article](https://clickhouse.tech/docs/en/query_language/dicts/external_dicts_dict_layout/) <!--hide-->

View File

@ -1172,7 +1172,104 @@ Result:
## finalizeAggregation {#function-finalizeaggregation}
Takes state of aggregate function. Returns result of aggregation (finalized state).
Takes a state of an aggregate function. Returns the result of aggregation (or the finalized state when using the [-State](../../sql-reference/aggregate-functions/combinators.md#agg-functions-combinator-state) combinator).
**Syntax**
``` sql
finalizeAggregation(state)
```
**Parameters**
- `state` — State of aggregation. [AggregateFunction](../../sql-reference/data-types/aggregatefunction.md#data-type-aggregatefunction).
**Returned value(s)**
- The value(s) that were aggregated.
Type: matches the type of the aggregated value(s).
**Examples**
Query:
```sql
SELECT finalizeAggregation(( SELECT countState(number) FROM numbers(10)));
```
Result:
```text
┌─finalizeAggregation(_subquery16)─┐
│ 10 │
└──────────────────────────────────┘
```
Query:
```sql
SELECT finalizeAggregation(( SELECT sumState(number) FROM numbers(10)));
```
Result:
```text
┌─finalizeAggregation(_subquery20)─┐
│ 45 │
└──────────────────────────────────┘
```
Note that `NULL` values are ignored.
Query:
```sql
SELECT finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]));
```
Result:
```text
┌─finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]))─┐
│ 2 │
└────────────────────────────────────────────────────────────┘
```
Combined example:
Query:
```sql
WITH initializeAggregation('sumState', number) AS one_row_sum_state
SELECT
number,
finalizeAggregation(one_row_sum_state) AS one_row_sum,
runningAccumulate(one_row_sum_state) AS cumulative_sum
FROM numbers(10);
```
Result:
```text
┌─number─┬─one_row_sum─┬─cumulative_sum─┐
│ 0 │ 0 │ 0 │
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 3 │
│ 3 │ 3 │ 6 │
│ 4 │ 4 │ 10 │
│ 5 │ 5 │ 15 │
│ 6 │ 6 │ 21 │
│ 7 │ 7 │ 28 │
│ 8 │ 8 │ 36 │
│ 9 │ 9 │ 45 │
└────────┴─────────────┴────────────────┘
```
**See Also**
- [arrayReduce](../../sql-reference/functions/array-functions.md#arrayreduce)
- [initializeAggregation](../../sql-reference/aggregate-functions/reference/initializeAggregation.md)
## runningAccumulate {#runningaccumulate}
@ -1677,4 +1774,44 @@ Result:
UNSUPPORTED_METHOD
```
## tcpPort {#tcpPort}
Returns the [native interface](../../interfaces/tcp.md) TCP port number that this server listens on.
**Syntax**
``` sql
tcpPort()
```
**Parameters**
- None.
**Returned value**
- The TCP port number.
Type: [UInt16](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT tcpPort();
```
Result:
``` text
┌─tcpPort()─┐
│ 9000 │
└───────────┘
```
**See Also**
- [tcp_port](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-tcp_port)
[Original article](https://clickhouse.tech/docs/en/query_language/functions/other_functions/) <!--hide-->

View File

@ -558,4 +558,46 @@ Result:
└─────┘
```
## encodeXMLComponent {#encode-xml-component}
Escapes characters so that a string can be placed into an XML text node or attribute.
The following five predefined XML entities are replaced: `<`, `&`, `>`, `"`, `'`.
**Syntax**
``` sql
encodeXMLComponent(x)
```
**Parameters**
- `x` — The sequence of characters. [String](../../sql-reference/data-types/string.md).
**Returned value(s)**
- The escaped sequence of characters.
Type: [String](../../sql-reference/data-types/string.md).
**Example**
Query:
``` sql
SELECT encodeXMLComponent('Hello, "world"!');
SELECT encodeXMLComponent('<123>');
SELECT encodeXMLComponent('&clickhouse');
SELECT encodeXMLComponent('\'foo\'');
```
Result:
``` text
Hello, &quot;world&quot;!
&lt;123&gt;
&amp;clickhouse
&apos;foo&apos;
```
[Original article](https://clickhouse.tech/docs/en/query_language/functions/string_functions/) <!--hide-->

View File

@ -400,7 +400,8 @@ Result:
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
**See also**
**See Also**
- [extractAllGroupsVertical](#extractallgroups-vertical)
## extractAllGroupsVertical {#extractallgroups-vertical}
@ -440,7 +441,8 @@ Result:
└────────────────────────────────────────────────────────────────────────────────────────┘
```
**See also**
**See Also**
- [extractAllGroupsHorizontal](#extractallgroups-horizontal)
## like(haystack, pattern), haystack LIKE pattern operator {#function-like}
@ -590,8 +592,55 @@ Result:
└───────────────────────────────┘
```
[Original article](https://clickhouse.tech/docs/en/query_language/functions/string_search_functions/) <!--hide-->
## countMatches(haystack, pattern) {#countmatcheshaystack-pattern}
Returns the number of regular expression matches for a `pattern` in a `haystack`.
**Syntax**
``` sql
countMatches(haystack, pattern)
```
**Parameters**
- `haystack` — The string to search in. [String](../../sql-reference/syntax.md#syntax-string-literal).
- `pattern` — The regular expression with [re2 syntax](https://github.com/google/re2/wiki/Syntax). [String](../../sql-reference/data-types/string.md).
**Returned value**
- The number of matches.
Type: [UInt64](../../sql-reference/data-types/int-uint.md).
**Examples**
Query:
``` sql
SELECT countMatches('foobar.com', 'o+');
```
Result:
``` text
┌─countMatches('foobar.com', 'o+')─┐
│ 2 │
└──────────────────────────────────┘
```
Query:
``` sql
SELECT countMatches('aaaa', 'aa');
```
Result:
``` text
┌─countMatches('aaaa', 'aa')────┐
│ 2 │
└───────────────────────────────┘
```
[Original article](https://clickhouse.tech/docs/en/query_language/functions/string_search_functions/) <!--hide-->

View File

@ -16,8 +16,8 @@ By default, tables are created only on the current server. Distributed DDL queri
``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [compression_codec] [TTL expr1],
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [compression_codec] [TTL expr2],
name1 [type1] [NULL|NOT NULL] [DEFAULT|MATERIALIZED|ALIAS expr1] [compression_codec] [TTL expr1],
name2 [type2] [NULL|NOT NULL] [DEFAULT|MATERIALIZED|ALIAS expr2] [compression_codec] [TTL expr2],
...
) ENGINE = engine
```
@ -57,6 +57,14 @@ In all cases, if `IF NOT EXISTS` is specified, the query won't return an error
There can be other clauses after the `ENGINE` clause in the query. See detailed documentation on how to create tables in the descriptions of [table engines](../../../engines/table-engines/index.md#table_engines).
## NULL Or NOT NULL Modifiers {#null-modifiers}
`NULL` and `NOT NULL` modifiers after the data type in a column definition allow or disallow it to be [Nullable](../../../sql-reference/data-types/nullable.md#data_type-nullable).
If the type is not `Nullable` and `NULL` is specified, it is treated as `Nullable`; if `NOT NULL` is specified, it is not. For example, `INT NULL` is the same as `Nullable(INT)`. If the type is `Nullable` and the `NULL` or `NOT NULL` modifier is specified, an exception is thrown.
See also [data_type_default_nullable](../../../operations/settings/settings.md#data_type_default_nullable) setting.
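A minimal sketch of both modifiers (the table name is hypothetical):

``` sql
CREATE TABLE null_modifiers_demo
(
    a Int32 NULL,     -- stored as Nullable(Int32)
    b Int32 NOT NULL  -- stays a plain Int32
) ENGINE = Memory;
```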
## Default Values {#create-default-values}
The column description can specify an expression for a default value, in one of the following ways: `DEFAULT expr`, `MATERIALIZED expr`, `ALIAS expr`.

View File

@ -367,6 +367,12 @@ Example:
<null_value>??</null_value>
</attribute>
...
</structure>
<layout>
<ip_trie>
<access_to_key_from_attributes>true</access_to_key_from_attributes>
</ip_trie>
</layout>
```
or
@ -396,6 +402,6 @@ dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
Data is stored in a `trie`. It must completely fit into RAM.
Data must completely fit into RAM.
[Original article](https://clickhouse.tech/docs/en/query_language/dicts/external_dicts_dict_layout/) <!--hide-->

View File

@ -362,6 +362,12 @@ LAYOUT(DIRECT())
<null_value>??</null_value>
</attribute>
...
</structure>
<layout>
<ip_trie>
<access_to_key_from_attributes>true</access_to_key_from_attributes>
</ip_trie>
</layout>
```
or

View File

@ -12,10 +12,21 @@ toc_title: "\u041f\u043e\u0441\u0442\u0430\u0432\u0449\u0438\u043a\u0438\u0020\u
[Yandex Managed Service for ClickHouse](https://cloud.yandex.ru/services/managed-clickhouse?utm_source=referrals&utm_medium=clickhouseofficialsite&utm_campaign=link3) provides the following key features:
- Fully managed ZooKeeper service for [ClickHouse replication](../engines/table-engines/mergetree-family/replication.md)
- Choice of storage type
- Replicas in different availability zones
- Encryption and isolation
- Automated maintenance
- fully managed ZooKeeper service for [ClickHouse replication](../engines/table-engines/mergetree-family/replication.md)
- choice of storage type
- replicas in different availability zones
- encryption and isolation
- automated maintenance
## Altinity.Cloud {#altinity.cloud}
[Altinity.Cloud](https://altinity.com/cloud-database/) is a fully managed ClickHouse-as-a-Service for the Amazon public cloud.
- fast deployment of ClickHouse clusters on Amazon resources.
- easy scale-out/scale-in as well as vertical scaling of nodes.
- isolated per-tenant VPCs with a public endpoint or VPC peering.
- configurable storage types and volumes
- cross-AZ scaling for performance and high availability
- built-in monitoring and SQL query editor
{## [Original article](https://clickhouse.tech/docs/ru/commercial/cloud/) ##}

View File

@ -246,4 +246,9 @@ $ sudo -u clickhouse touch /var/lib/clickhouse/flags/force_restore_data
If the data in ZooKeeper was lost or damaged, you can save the data by moving it to an unreplicated table as described above.
**See Also**
- [background_schedule_pool_size](../../../operations/settings/settings.md#background_schedule_pool_size)
- [execute_merges_on_single_replica_time_threshold](../../../operations/settings/settings.md#execute-merges-on-single-replica-time-threshold)
[Original article](https://clickhouse.tech/docs/ru/operations/table_engines/replication/) <!--hide-->

View File

@ -93,12 +93,13 @@ clickhouse-client --param_parName="[1, 2]" -q "SELECT * FROM table WHERE a = {p
```
- `name` — The substitution identifier. In the console client it should be used as part of the parameter name `--param_<name> = value`.
- `data type` — The [data type](../sql-reference/data-types/index.md) of the value. For example, a data structure like `(integer, ('string', integer))` can have the `Tuple(UInt8, Tuple(String, UInt8))` data type (the [integer](../sql-reference/data-types/int-uint.md) type can also be different).
- `data type` — The [data type](../sql-reference/data-types/index.md) of the value. For example, a data structure like `(integer, ('string', integer))` can have the `Tuple(UInt8, Tuple(String, UInt8))` data type (the [integer](../sql-reference/data-types/int-uint.md) type can also be different). A column, table, or database name can also be passed as a parameter; in that case, the `Identifier` data type is used.
#### Example {#primer}
``` bash
$ clickhouse-client --param_tuple_in_tuple="(10, ('dt', 10))" -q "SELECT * FROM table WHERE val = {tuple_in_tuple:Tuple(UInt8, Tuple(String, UInt8))}"
$ clickhouse-client --param_tbl="numbers" --param_db="system" --param_col="number" --query "SELECT {col:Identifier} FROM {db:Identifier}.{tbl:Identifier} LIMIT 10"
```
## Configuring {#interfaces_cli_configuration}

View File

@ -56,6 +56,7 @@ ClickHouse can accept (`INSERT`) and return (`SELECT
| [XML](#xml) | ✗ | ✔ |
| [CapnProto](#capnproto) | ✔ | ✗ |
| [LineAsString](#lineasstring) | ✔ | ✗ |
| [Regexp](#data-format-regexp) | ✔ | ✗ |
| [RawBLOB](#rawblob) | ✔ | ✔ |
You can control some format processing parameters with the ClickHouse settings. For more information read the [Settings](../operations/settings/settings.md) section.
@ -1213,22 +1214,10 @@ $ cat filename.orc | clickhouse-client --query="INSERT INTO some_table FORMAT OR
To exchange data with Hadoop, you can use the [HDFS table engine](../engines/table-engines/integrations/hdfs.md).
## Format Schema {#formatschema}
The file name containing the format schema is set by the setting `format_schema`. A schema is required when the `Cap'n Proto` or `Protobuf` format is used.
The schema is a file name plus the name of a type in that file, separated by a colon, e.g. `schemafile.proto:MessageType`.
If the file has the standard extension for the format (e.g. `.proto` for `Protobuf`),
the extension can be omitted and the schema written as `schemafile:MessageType`.
If the [client](../interfaces/cli.md) is used in [interactive mode](../interfaces/cli.md#cli_usage), the schema path can be absolute or relative
to the current directory on the client. If the client is used in [batch mode](../interfaces/cli.md#cli_usage), only a relative path is allowed, for security reasons.
If the [HTTP interface](../interfaces/http.md) is used for data input/output, the schema file must be located on the server in the directory
specified by the [format_schema_path](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-format_schema_path) server configuration parameter.
## LineAsString {#lineasstring}
In this format, a sequence of string objects separated by a newline character is interpreted as a single value. Only a table with a single field of type [String](../sql-reference/data-types/string.md) can be parsed this way. The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#create-default-values) or [MATERIALIZED](../sql-reference/statements/create/table.md#create-default-values), or omitted.
In this format, every line of imported data is interpreted as a single string value. Only a table with a single field of type [String](../sql-reference/data-types/string.md) can be parsed this way. The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#create-default-values) or [MATERIALIZED](../sql-reference/statements/create/table.md#create-default-values), or omitted.
**Example**
@ -1249,6 +1238,89 @@ SELECT * FROM line_as_string;
└───────────────────────────────────────────────────┘
```
## Regexp {#data-format-regexp}
Each line of imported data is parsed according to the regular expression.
When working with the `Regexp` format, you can use the following settings:
- `format_regexp` — [String](../sql-reference/data-types/string.md). Contains a regular expression in the [re2](https://github.com/google/re2/wiki/Syntax) format.
- `format_regexp_escaping_rule` — [String](../sql-reference/data-types/string.md). The escaping rule. The following rules are supported:
- CSV (as in [CSV](#csv))
- JSON (as in [JSONEachRow](#jsoneachrow))
- Escaped (as in [TSV](#tabseparated))
- Quoted (as in [Values](#data-format-values))
- Raw (data is imported as is, without escaping)
- `format_regexp_skip_unmatched` — [UInt8](../sql-reference/data-types/int-uint.md). Defines whether an exception is thrown when the imported data does not match the `format_regexp` expression. Can be set to `0` or `1`.
**Usage**
The regular expression (pattern) from the `format_regexp` setting is applied to every line of imported data. The number of subpatterns in the pattern must match the number of columns in the imported data.
Lines of the imported data must be separated by the newline character `'\n'` or the DOS-style newline `"\r\n"`.
The content of every matched subpattern is parsed according to the rule specified in the `format_regexp_escaping_rule` setting.
If a line of the imported data does not match the regular expression and `format_regexp_skip_unmatched` is set to `1`, the line is silently skipped. If `format_regexp_skip_unmatched` is set to `0`, an exception is thrown.
**Example**
Consider the file data.tsv:
```text
id: 1 array: [1,2,3] string: str1 date: 2020-01-01
id: 2 array: [1,2,3] string: str2 date: 2020-01-02
id: 3 array: [1,2,3] string: str3 date: 2020-01-03
```
and the table:
```sql
CREATE TABLE imp_regex_table (id UInt32, array Array(UInt32), string String, date Date) ENGINE = Memory;
```
Import command:
```bash
$ cat data.tsv | clickhouse-client --query "INSERT INTO imp_regex_table FORMAT Regexp SETTINGS format_regexp='id: (.+?) array: (.+?) string: (.+?) date: (.+?)', format_regexp_escaping_rule='Escaped', format_regexp_skip_unmatched=0;"
```
Query:
```sql
SELECT * FROM imp_regex_table;
```
Result:
```text
┌─id─┬─array───┬─string─┬───────date─┐
│ 1 │ [1,2,3] │ str1 │ 2020-01-01 │
│ 2 │ [1,2,3] │ str2 │ 2020-01-02 │
│ 3 │ [1,2,3] │ str3 │ 2020-01-03 │
└────┴─────────┴────────┴────────────┘
```
## Format Schema {#formatschema}
The file name containing the format schema is set by the setting `format_schema`. A schema is required when the `Cap'n Proto` or `Protobuf` format is used.
The schema is a file name plus the name of a type in that file, separated by a colon, e.g. `schemafile.proto:MessageType`.
If the file has the standard extension for the format (e.g. `.proto` for `Protobuf`),
the extension can be omitted and the schema written as `schemafile:MessageType`.
If the [client](../interfaces/cli.md) is used in [interactive mode](../interfaces/cli.md#cli_usage), the schema path can be absolute or relative
to the current directory on the client. If the client is used in [batch mode](../interfaces/cli.md#cli_usage), only a relative path is allowed, for security reasons.
If the [HTTP interface](../interfaces/http.md) is used for data input/output, the schema file must be located on the server in the directory
specified by the [format_schema_path](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-format_schema_path) server configuration parameter.
## Skipping Errors {#skippingerrors}
Some formats, such as `CSV`, `TabSeparated`, `TSKV`, `JSONEachRow`, `Template`, `CustomSeparated`, and `Protobuf`, can skip lines that break the rules and would otherwise cause a parsing error; processing of the imported data then continues with the next line. See the [input_format_allow_errors_num](../operations/settings/settings.md#input-format-allow-errors-num) and
[input_format_allow_errors_ratio](../operations/settings/settings.md#input-format-allow-errors-ratio) settings.
Limitations:
- In the `JSONEachRow` format, on error all data up to the end of the current line (or the end of the file) is skipped, so rows must be delimited by `\n` for errors to be counted correctly.
- The `Template` and `CustomSeparated` formats use a delimiter after the last column and a delimiter between rows, so skipping errors works only if at least one of the delimiters is not empty.
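For example, to tolerate a small number of malformed rows during import (a minimal sketch; the values are illustrative):

``` sql
SET input_format_allow_errors_num = 100;    -- absolute number of tolerated errors
SET input_format_allow_errors_ratio = 0.01; -- or a fraction of all rows
```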
## RawBLOB {#rawblob}
In this format, all input data is read into a single value. Only a table with a single field of type [String](../sql-reference/data-types/string.md) or similar can be parsed.

View File

@ -80,6 +80,27 @@ ClickHouse checks the conditions for `min_part_size` and `min_part
- [Custom settings](../../operations/settings/index.md#custom_settings)
## core_dump {#server_configuration_parameters-core_dump}
Configures a soft limit for the core dump file size.
Possible values:
- Positive integer.
Default value: `1073741824` (1 GB).
!!! info "Note"
The hard limit is configured via system tools.
**Example**
```xml
<core_dump>
<size_limit>1073741824</size_limit>
</core_dump>
```
## default\_database {#default-database}
The default database.
@ -420,7 +441,7 @@ ClickHouse checks the conditions for `min_part_size` and `min_part
Possible values:
- Positive integer.
- 0 — The amount of memory used is not limited.
- 0 — Auto.
Default value: `0`.

View File

@ -1258,7 +1258,7 @@ ClickHouse generates an exception
Write to a quorum timeout in milliseconds. If the timeout has passed and no write has taken place yet, ClickHouse will generate an exception and the client must repeat the query to write the same block to the same or any other replica.
Default value: 600000 milliseconds (10 minutes).
Default value: 600 000 milliseconds (10 minutes).
See also:
@ -2324,4 +2324,23 @@ SELECT number FROM numbers(3) FORMAT JSONEachRow;
Default value: `0`.
## execute_merges_on_single_replica_time_threshold {#execute-merges-on-single-replica-time-threshold}
Enables special logic to perform merges on replicas.
Possible values:
- Positive integer (in seconds).
- 0 — The special merge logic is not used. Merges happen in the usual way on all the replicas.
Default value: `0`.
**Usage**
One replica is selected to perform the merge. A time threshold from the start of the merge is set. The other replicas wait for the merge to finish and then download the result. If the threshold passes and the selected replica does not perform the merge, the merge is performed on the other replicas as usual.
High values for this setting may lead to replication delays.
This setting is useful when merge speed is limited by CPU rather than by IO (heavy data compression, calculating aggregate functions or default expressions that require a large amount of computation, or just a very high number of tiny merges).
[Original article](https://clickhouse.tech/docs/ru/operations/settings/settings/) <!--hide-->

View File

@ -116,12 +116,14 @@ FROM dt
## See Also {#see-also}
- [Type conversion functions](../../sql-reference/data-types/datetime.md)
- [Functions for working with dates and times](../../sql-reference/data-types/datetime.md)
- [Functions for working with arrays](../../sql-reference/data-types/datetime.md)
- [The `date_time_input_format` setting](../../operations/settings/settings.md#settings-date_time_input_format)
- [The `timezone` server configuration parameter](../../sql-reference/data-types/datetime.md#server_configuration_parameters-timezone)
- [Operators for working with dates and times](../../sql-reference/data-types/datetime.md#operators-datetime)
- [Type conversion functions](../../sql-reference/functions/type-conversion-functions.md)
- [Functions for working with dates and times](../../sql-reference/functions/date-time-functions.md)
- [Functions for working with arrays](../../sql-reference/functions/array-functions.md)
- [The `date_time_input_format` setting](../../operations/settings/settings/#settings-date_time_input_format)
- [The `date_time_output_format` setting](../../operations/settings/settings/)
- [The `timezone` server configuration parameter](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone)
- [Operators for working with dates and times](../../sql-reference/operators/index.md#operators-datetime)
- [The `Date` data type](date.md)
- [The `DateTime64` data type](datetime64.md)
[Original article](https://clickhouse.tech/docs/ru/data_types/datetime/) <!--hide-->

View File

@ -92,11 +92,12 @@ FROM dt
## See Also {#see-also}
- [Type conversion functions](../../sql-reference/data-types/datetime64.md)
- [Functions for working with dates and times](../../sql-reference/data-types/datetime64.md)
- [Functions for working with arrays](../../sql-reference/data-types/datetime64.md)
- [Type conversion functions](../../sql-reference/functions/type-conversion-functions.md)
- [Functions for working with dates and times](../../sql-reference/functions/date-time-functions.md)
- [Functions for working with arrays](../../sql-reference/functions/array-functions.md)
- [The `date_time_input_format` setting](../../operations/settings/settings.md#settings-date_time_input_format)
- [The `timezone` server configuration parameter](../../sql-reference/data-types/datetime64.md#server_configuration_parameters-timezone)
- [Operators for working with dates and times](../../sql-reference/data-types/datetime64.md#operators-datetime)
- [The `date_time_output_format` setting](../../operations/settings/settings.md)
- [The `timezone` server configuration parameter](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone)
- [Operators for working with dates and times](../../sql-reference/operators/index.md#operators-datetime)
- [The `Date` data type](date.md)
- [The `DateTime` data type](datetime.md)

View File

@ -404,6 +404,14 @@ LAYOUT(DIRECT())
<null_value>??</null_value>
</attribute>
...
</structure>
<layout>
<ip_trie>
<!-- Key attribute `prefix` can be retrieved via dictGetString. -->
<!-- This option increases memory usage. -->
<access_to_key_from_attributes>true</access_to_key_from_attributes>
</ip_trie>
</layout>
```
or
@ -433,6 +441,6 @@ dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
No other types are supported. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
Data is stored in a bitwise `trie`; it must completely fit into RAM.
Data must completely fit into RAM.
[Original article](https://clickhouse.tech/docs/ru/query_language/dicts/external_dicts_dict_layout/) <!--hide-->

View File

@ -810,7 +810,7 @@ SELECT
└──────────────┴───────────┘
```
## arrayReduce (#arrayreduce}
## arrayReduce {#arrayreduce}
Applies an aggregate function to array elements and returns its result. The name of the aggregation function is passed as a string in single quotes, e.g. `'max'`, `'sum'`. When using parametric aggregate functions, the parameter is specified after the function name in parentheses, e.g. `'uniqUpTo(6)'`.

View File

@ -1104,7 +1104,104 @@ SELECT formatReadableSize(filesystemCapacity()) AS "Capacity", toTypeName(filesy
## finalizeAggregation {#function-finalizeaggregation}
Takes a state of an aggregate function. Returns the result of aggregation.
Takes a state of an aggregate function. Returns the result of aggregation (or the finalized state when using the [-State](../../sql-reference/aggregate-functions/combinators.md#state) combinator).
**Syntax**
``` sql
finalizeAggregation(state)
```
**Parameters**
- `state` — State of aggregation. [AggregateFunction](../../sql-reference/data-types/aggregatefunction.md#data-type-aggregatefunction).
**Returned values**
- The values that were aggregated.
Type: matches the type of the aggregated values.
**Examples**
Query:
```sql
SELECT finalizeAggregation(( SELECT countState(number) FROM numbers(10)));
```
Result:
```text
┌─finalizeAggregation(_subquery16)─┐
│ 10 │
└──────────────────────────────────┘
```
Query:
```sql
SELECT finalizeAggregation(( SELECT sumState(number) FROM numbers(10)));
```
Result:
```text
┌─finalizeAggregation(_subquery20)─┐
│ 45 │
└──────────────────────────────────┘
```
Note that `NULL` values are ignored.
Query:
```sql
SELECT finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]));
```
Result:
```text
┌─finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]))─┐
│ 2 │
└────────────────────────────────────────────────────────────┘
```
Combined example:

Query:
```sql
WITH initializeAggregation('sumState', number) AS one_row_sum_state
SELECT
number,
finalizeAggregation(one_row_sum_state) AS one_row_sum,
runningAccumulate(one_row_sum_state) AS cumulative_sum
FROM numbers(10);
```
Result:
```text
┌─number─┬─one_row_sum─┬─cumulative_sum─┐
│ 0 │ 0 │ 0 │
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 3 │
│ 3 │ 3 │ 6 │
│ 4 │ 4 │ 10 │
│ 5 │ 5 │ 15 │
│ 6 │ 6 │ 21 │
│ 7 │ 7 │ 28 │
│ 8 │ 8 │ 36 │
│ 9 │ 9 │ 45 │
└────────┴─────────────┴────────────────┘
```
**See also**
- [arrayReduce](../../sql-reference/functions/array-functions.md#arrayreduce)
- [initializeAggregation](../../sql-reference/aggregate-functions/reference/initializeAggregation.md)
## runningAccumulate {#runningaccumulate}
@ -1589,4 +1686,44 @@ SELECT countDigits(toDecimal32(1, 9)), countDigits(toDecimal32(-1, 9)),
10 10 19 19 39 39
```
## tcpPort {#tcpPort}
Returns the number of the TCP port that the server listens on for the [native protocol](../../interfaces/tcp.md).

**Syntax**
``` sql
tcpPort()
```
**Parameters**

- None.

**Returned value**

- The TCP port number.

Type: [UInt16](../../sql-reference/data-types/int-uint.md).

**Example**
Query:
``` sql
SELECT tcpPort();
```
Result:
``` text
┌─tcpPort()─┐
│ 9000 │
└───────────┘
```
**See also**
- [tcp_port](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-tcp_port)
[Original article](https://clickhouse.tech/docs/ru/query_language/functions/other_functions/) <!--hide-->

View File

@ -521,5 +521,56 @@ SELECT * FROM Months WHERE ilike(name, '%j%')
!!! note "Note"
    For the UTF-8 case we use the trigram distance. The n-gram distance computation is not entirely fair: we use 2-byte hashes for hashing the n-grams and then compute the (non-)symmetric difference between the hash tables, so collisions may occur. For case-insensitive UTF-8 we do not use a fair `tolower` function: we zero out the 5th bit (zero-based numbering) of each code point byte, and also the first bit of the zero byte if there is more than one byte; this works for Latin and almost all Cyrillic letters.
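To see the approximate case folding from this note in action, one of this page's n-gram distance functions can be used (a hedged sketch: for strings that differ only in ASCII case the distance should come out as 0):

``` sql
SELECT ngramDistanceCaseInsensitiveUTF8('ClickHouse', 'CLICKHOUSE') AS dist;
```

`dist` is expected to be 0, since the strings are equal up to ASCII case.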
## countMatches(haystack, pattern) {#countmatcheshaystack-pattern}
Returns the number of matches of the regular expression `pattern` in the string `haystack`.

**Syntax**
``` sql
countMatches(haystack, pattern)
```
**Parameters**

- `haystack` — The string to search in. [String](../../sql-reference/syntax.md#syntax-string-literal).
- `pattern` — The regular expression, built according to the [re2](https://github.com/google/re2/wiki/Syntax) syntax. [String](../../sql-reference/data-types/string.md).

**Returned value**

- The number of matches.

Type: [UInt64](../../sql-reference/data-types/int-uint.md).
**Examples**

Query:
``` sql
SELECT countMatches('foobar.com', 'o+');
```
Result:
``` text
┌─countMatches('foobar.com', 'o+')─┐
│ 2 │
└──────────────────────────────────┘
```
Query:
``` sql
SELECT countMatches('aaaa', 'aa');
```
Result:
``` text
┌─countMatches('aaaa', 'aa')────┐
│ 2 │
└───────────────────────────────┘
```
[Original article](https://clickhouse.tech/docs/ru/query_language/functions/string_search_functions/) <!--hide-->

View File

@ -23,7 +23,7 @@ mapAdd(Tuple(Array, Array), Tuple(Array, Array) [, ...])
**Returned value**

- Returns one [tuple](../../sql-reference/data-types/tuple.md#tuplet1-t2), where the first array contains the sorted keys and the second array contains the values.
**Example**
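The hunk ends before this page's example; as an illustrative sketch (mirroring the query the English version of this page uses), note how the value type is promoted in `toTypeName`:

``` sql
SELECT mapAdd(([toUInt8(1), 2], [1, 1]), ([toUInt8(1), 2], [1, 1])) AS res, toTypeName(res) AS type;
```

``` text
┌─res───────────┬─type───────────────────────────────┐
│ ([1,2],[2,2]) │ Tuple(Array(UInt8), Array(UInt64)) │
└───────────────┴────────────────────────────────────┘
```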

View File

@ -1,78 +0,0 @@
#!/usr/bin/env python3
import subprocess
import requests
import os
import time
FNAME_START = "+++"
CLOUDFLARE_URL = "https://api.cloudflare.com/client/v4/zones/4fc6fb1d46e87851605aa7fa69ca6fe0/purge_cache"
# we have changes in revision and commit sha on all pages
# so such changes have to be ignored
MIN_CHANGED_WORDS = 4
def collect_changed_files():
proc = subprocess.Popen("git diff HEAD~1 --word-diff=porcelain | grep -e '^+[^+]\|^\-[^\-]\|^\+\+\+'", stdout=subprocess.PIPE, shell=True)
changed_files = []
current_file_name = ""
changed_words = []
while True:
line = proc.stdout.readline().decode("utf-8").strip()
if not line:
break
if FNAME_START in line:
if changed_words:
if len(changed_words) > MIN_CHANGED_WORDS:
changed_files.append(current_file_name)
changed_words = []
current_file_name = line[6:]
else:
changed_words.append(line)
return changed_files
def filter_and_transform_changed_files(changed_files, base_domain):
result = []
for f in changed_files:
if f.endswith(".html"):
result.append(base_domain + f.replace("index.html", ""))
return result
def convert_to_dicts(changed_files, batch_size):
result = []
current_batch = {"files": []}
for f in changed_files:
if len(current_batch["files"]) >= batch_size:
result.append(current_batch)
current_batch = {"files": []}
current_batch["files"].append(f)
if current_batch["files"]:
result.append(current_batch)
return result
def post_data(prepared_batches, token):
headers = {"Authorization": "Bearer {}".format(token)}
for batch in prepared_batches:
print(("Pugring cache for", ", ".join(batch["files"])))
response = requests.post(CLOUDFLARE_URL, json=batch, headers=headers)
response.raise_for_status()
time.sleep(3)
if __name__ == "__main__":
token = os.getenv("CLOUDFLARE_TOKEN")
if not token:
raise Exception("Env variable CLOUDFLARE_TOKEN is empty")
base_domain = os.getenv("BASE_DOMAIN", "https://content.clickhouse.tech/")
changed_files = collect_changed_files()
print(("Found", len(changed_files), "changed files"))
filtered_files = filter_and_transform_changed_files(changed_files, base_domain)
print(("Files rest after filtering", len(filtered_files)))
prepared_batches = convert_to_dicts(filtered_files, 25)
post_data(prepared_batches, token)

View File

@ -32,12 +32,14 @@ then
git add ".nojekyll"
# Push to GitHub rewriting the existing contents.
git commit --quiet -m "Add new release at $(date)"
git push --force origin master
if [[ ! -z "${CLOUDFLARE_TOKEN}" ]]
then
sleep 1m
# https://api.cloudflare.com/#zone-purge-files-by-cache-tags,-host-or-prefix
POST_DATA='{"hosts":["content.clickhouse.tech"]}'
curl -X POST "https://api.cloudflare.com/client/v4/zones/4fc6fb1d46e87851605aa7fa69ca6fe0/purge_cache" -H "Authorization: Bearer ${CLOUDFLARE_TOKEN}" -H "Content-Type:application/json" --data "${POST_DATA}"
fi
fi

File diff suppressed because it is too large

View File

@ -1,51 +1,66 @@
---
toc_priority: 26
toc_title: Client Libraries
---

# Client Libraries from Third-Party Developers {#client-libraries-from-third-party-developers}

!!! warning "Disclaimer"
    Yandex does **not** maintain the libraries listed below and hasn't done any extensive testing to ensure their quality.

- Python
    - [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm)
    - [clickhouse-driver](https://github.com/mymarilyn/clickhouse-driver)
    - [clickhouse-client](https://github.com/yurial/clickhouse-client)
    - [aiochclient](https://github.com/maximdanilchenko/aiochclient)
- PHP
    - [smi2/phpclickhouse](https://packagist.org/packages/smi2/phpClickHouse)
    - [8bitov/clickhouse-php-client](https://packagist.org/packages/8bitov/clickhouse-php-client)
    - [bozerkins/clickhouse-client](https://packagist.org/packages/bozerkins/clickhouse-client)
    - [simpod/clickhouse-client](https://packagist.org/packages/simpod/clickhouse-client)
    - [seva-code/php-click-house-client](https://packagist.org/packages/seva-code/php-click-house-client)
    - [SeasClick C++ client](https://github.com/SeasX/SeasClick)
    - [one-ck](https://github.com/lizhichao/one-ck)
    - [glushkovds/phpclickhouse-laravel](https://packagist.org/packages/glushkovds/phpclickhouse-laravel)
- Go
    - [clickhouse](https://github.com/kshvakov/clickhouse/)
    - [go-clickhouse](https://github.com/roistat/go-clickhouse)
    - [mailrugo-clickhouse](https://github.com/mailru/go-clickhouse)
    - [golang-clickhouse](https://github.com/leprosus/golang-clickhouse)
- Swift
    - [ClickHouseNIO](https://github.com/patrick-zippenfenig/ClickHouseNIO)
    - [ClickHouseVapor ORM](https://github.com/patrick-zippenfenig/ClickHouseVapor)
- NodeJs
    - [clickhouse (NodeJs)](https://github.com/TimonKK/clickhouse)
    - [node-clickhouse](https://github.com/apla/node-clickhouse)
- Perl
    - [perl-DBD-ClickHouse](https://github.com/elcamlost/perl-DBD-ClickHouse)
    - [HTTP-ClickHouse](https://metacpan.org/release/HTTP-ClickHouse)
    - [AnyEvent-ClickHouse](https://metacpan.org/release/AnyEvent-ClickHouse)
- Ruby
    - [ClickHouse (Ruby)](https://github.com/shlima/click_house)
    - [clickhouse-activerecord](https://github.com/PNixx/clickhouse-activerecord)
- R
    - [clickhouse-r](https://github.com/hannesmuehleisen/clickhouse-r)
    - [RClickHouse](https://github.com/IMSMWU/RClickHouse)
- Java
    - [clickhouse-client-java](https://github.com/VirtusAI/clickhouse-client-java)
    - [clickhouse-client](https://github.com/Ecwid/clickhouse-client)
- Scala
    - [clickhouse-scala-client](https://github.com/crobox/clickhouse-scala-client)
- Kotlin
    - [AORM](https://github.com/TanVD/AORM)
- C#
    - [Octonica.ClickHouseClient](https://github.com/Octonica/ClickHouseClient)
    - [ClickHouse.Ado](https://github.com/killwort/ClickHouse-Net)
    - [ClickHouse.Client](https://github.com/DarkWanderer/ClickHouse.Client)
    - [ClickHouse.Net](https://github.com/ilyabreev/ClickHouse.Net)
- Elixir
    - [clickhousex](https://github.com/appodeal/clickhousex/)
    - [pillar](https://github.com/sofakingworld/pillar)
- Nim
    - [nim-clickhouse](https://github.com/leonardoce/nim-clickhouse)
- Haskell
    - [hdbc-clickhouse](https://github.com/zaneli/hdbc-clickhouse)

[Original article](https://clickhouse.tech/docs/en/interfaces/third-party/client_libraries/) <!--hide-->

View File

@ -1,8 +1,16 @@
---
toc_folder_title: Third-Party
toc_priority: 24
---

# Third-Party Interfaces {#third-party-interfaces}

This is a collection of links to third-party tools that provide some sort of interface to ClickHouse. It can be a visual interface, a command-line interface, or an API:

- [Client libraries](../../interfaces/third-party/client-libraries.md)
- [Integrations](../../interfaces/third-party/integrations.md)
- [GUI](../../interfaces/third-party/gui.md)
- [Proxies](../../interfaces/third-party/proxy.md)

!!! note "Note"
    Generic tools that support a common API, such as [ODBC](../../interfaces/odbc.md) or [JDBC](../../interfaces/jdbc.md), usually work with ClickHouse as well, but they are not listed here because there are far too many of them.

View File

@ -1,100 +1,108 @@
---
toc_priority: 27
toc_title: Third-Party Integrations
---

# Integration Libraries from Third-Party Developers {#integration-libraries-from-third-party-developers}

!!! warning "Disclaimer"
    Yandex does **not** maintain the libraries listed below and hasn't done any extensive testing to ensure their quality.

## Infrastructure Products {#infrastructure-products}

- Relational database management systems
    - [MySQL](https://www.mysql.com)
        - [mysql2ch](https://github.com/long2ice/mysql2ch)
        - [ProxySQL](https://github.com/sysown/proxysql/wiki/ClickHouse-Support)
        - [clickhouse-mysql-data-reader](https://github.com/Altinity/clickhouse-mysql-data-reader)
        - [horgh-replicator](https://github.com/larsnovikov/horgh-replicator)
    - [PostgreSQL](https://www.postgresql.org)
        - [clickhousedb_fdw](https://github.com/Percona-Lab/clickhousedb_fdw)
        - [infi.clickhouse_fdw](https://github.com/Infinidat/infi.clickhouse_fdw) (uses [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm))
        - [pg2ch](https://github.com/mkabilov/pg2ch)
        - [clickhouse_fdw](https://github.com/adjust/clickhouse_fdw)
    - [MSSQL](https://en.wikipedia.org/wiki/Microsoft_SQL_Server)
        - [ClickHouseMigrator](https://github.com/zlzforever/ClickHouseMigrator)
- Message queues
    - [Kafka](https://kafka.apache.org)
        - [clickhouse_sinker](https://github.com/housepower/clickhouse_sinker) (uses [Go client](https://github.com/ClickHouse/clickhouse-go/))
        - [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- Stream processing
    - [Flink](https://flink.apache.org)
        - [flink-clickhouse-sink](https://github.com/ivi-ru/flink-clickhouse-sink)
- Object storages
    - [S3](https://en.wikipedia.org/wiki/Amazon_S3)
        - [clickhouse-backup](https://github.com/AlexAkulov/clickhouse-backup)
- Container orchestration
    - [Kubernetes](https://kubernetes.io)
        - [clickhouse-operator](https://github.com/Altinity/clickhouse-operator)
- Configuration management
    - [puppet](https://puppet.com)
        - [innogames/clickhouse](https://forge.puppet.com/innogames/clickhouse)
        - [mfedotov/clickhouse](https://forge.puppet.com/mfedotov/clickhouse)
- Monitoring
    - [Graphite](https://graphiteapp.org)
        - [graphouse](https://github.com/yandex/graphouse)
        - [carbon-clickhouse](https://github.com/lomik/carbon-clickhouse) +
        - [graphite-clickhouse](https://github.com/lomik/graphite-clickhouse)
        - [graphite-ch-optimizer](https://github.com/innogames/graphite-ch-optimizer) - optimizes staled partitions in [\*GraphiteMergeTree](../../engines/table-engines/mergetree-family/graphitemergetree.md#graphitemergetree) if rules from the [rollup configuration](../../engines/table-engines/mergetree-family/graphitemergetree.md#rollup-configuration) could be applied
    - [Grafana](https://grafana.com/)
        - [clickhouse-grafana](https://github.com/Vertamedia/clickhouse-grafana)
    - [Prometheus](https://prometheus.io/)
        - [clickhouse_exporter](https://github.com/f1yegor/clickhouse_exporter)
        - [PromHouse](https://github.com/Percona-Lab/PromHouse)
        - [clickhouse_exporter](https://github.com/hot-wifi/clickhouse_exporter) (uses [Go client](https://github.com/kshvakov/clickhouse/))
    - [Nagios](https://www.nagios.org/)
        - [check_clickhouse](https://github.com/exogroup/check_clickhouse/)
        - [check_clickhouse.py](https://github.com/innogames/igmonplugins/blob/master/src/check_clickhouse.py)
    - [Zabbix](https://www.zabbix.com)
        - [clickhouse-zabbix-template](https://github.com/Altinity/clickhouse-zabbix-template)
    - [Sematext](https://sematext.com/)
        - [clickhouse integration](https://github.com/sematext/sematext-agent-integrations/tree/master/clickhouse)
- Logging
    - [rsyslog](https://www.rsyslog.com/)
        - [omclickhouse](https://www.rsyslog.com/doc/master/configuration/modules/omclickhouse.html)
    - [fluentd](https://www.fluentd.org)
        - [loghouse](https://github.com/flant/loghouse) (for [Kubernetes](https://kubernetes.io))
    - [logagent](https://www.sematext.com/logagent)
        - [logagent output-plugin-clickhouse](https://sematext.com/docs/logagent/output-plugin-clickhouse/)
- Geo
    - [MaxMind](https://dev.maxmind.com/geoip/)
        - [clickhouse-maxmind-geoip](https://github.com/AlexeyKupershtokh/clickhouse-maxmind-geoip)

## Programming Language Ecosystems {#programming-language-ecosystems}

- Python
    - [SQLAlchemy](https://www.sqlalchemy.org)
        - [sqlalchemy-clickhouse](https://github.com/cloudflare/sqlalchemy-clickhouse) (uses [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm))
    - [pandas](https://pandas.pydata.org)
        - [pandahouse](https://github.com/kszucs/pandahouse)
- PHP
    - [Doctrine](https://www.doctrine-project.org/)
        - [dbal-clickhouse](https://packagist.org/packages/friendsofdoctrine/dbal-clickhouse)
- R
    - [dplyr](https://db.rstudio.com/dplyr/)
        - [RClickHouse](https://github.com/IMSMWU/RClickHouse) (uses [clickhouse-cpp](https://github.com/artpaul/clickhouse-cpp))
- Java
    - [Hadoop](http://hadoop.apache.org)
        - [clickhouse-hdfs-loader](https://github.com/jaykelin/clickhouse-hdfs-loader) (uses [JDBC](../../sql-reference/table-functions/jdbc.md))
- Scala
    - [Akka](https://akka.io)
        - [clickhouse-scala-client](https://github.com/crobox/clickhouse-scala-client)
- C#
    - [ADO.NET](https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/ado-net-overview)
        - [ClickHouse.Ado](https://github.com/killwort/ClickHouse-Net)
        - [ClickHouse.Client](https://github.com/DarkWanderer/ClickHouse.Client)
        - [ClickHouse.Net](https://github.com/ilyabreev/ClickHouse.Net)
        - [ClickHouse.Net.Migrations](https://github.com/ilyabreev/ClickHouse.Net.Migrations)
- Elixir
    - [Ecto](https://github.com/elixir-ecto/ecto)
        - [clickhouse_ecto](https://github.com/appodeal/clickhouse_ecto)
- Ruby
    - [Ruby on Rails](https://rubyonrails.org/)
        - [activecube](https://github.com/bitquery/activecube)
        - [ActiveRecord](https://github.com/PNixx/clickhouse-activerecord)
    - [GraphQL](https://github.com/graphql)
        - [activecube-graphql](https://github.com/bitquery/activecube-graphql)

[Original article](https://clickhouse.tech/docs/en/interfaces/third-party/integrations/) <!--hide-->

View File

@ -1,37 +1,44 @@
---
toc_priority: 29
toc_title: Third-Party Proxies
---

# Proxy Servers from Third-Party Developers {#proxy-servers-from-third-party-developers}

## chproxy {#chproxy}

[chproxy](https://github.com/Vertamedia/chproxy) is an HTTP proxy and load balancer for the ClickHouse database.

Features:

- Per-user routing and response caching.
- Flexible limits.
- Automatic SSL certificate renewal.

Implemented in Go.

## KittenHouse {#kittenhouse}

[KittenHouse](https://github.com/VKCOM/kittenhouse) is designed to be a local proxy between ClickHouse and an application server, for cases when it's impossible or inconvenient to buffer INSERT data on the application side.

Features:

- In-memory and on-disk data buffering.
- Per-table routing.
- Load balancing and health checks.

Implemented in Go.

## ClickHouse-Bulk {#clickhouse-bulk}

[ClickHouse-Bulk](https://github.com/nikepan/clickhouse-bulk) is a simple ClickHouse insert collector.

Features:

- Groups requests and sends them by threshold or interval.
- Multiple remote servers.
- Basic authentication.

Implemented in Go.

[Original article](https://clickhouse.tech/docs/en/interfaces/third-party/proxy/) <!--hide-->

View File

@ -362,6 +362,12 @@ LAYOUT(DIRECT())
<null_value>??</null_value>
</attribute>
...
</structure>
<layout>
<ip_trie>
<access_to_key_from_attributes>true</access_to_key_from_attributes>
</ip_trie>
</layout>
```
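As on the Russian page earlier in this diff, enabling `access_to_key_from_attributes` makes the key attribute readable through `dictGetString`; the same example expression applies:

``` sql
dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
```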

View File

@ -1,8 +1,8 @@
---
toc_folder_title: What's New
toc_priority: 82
---

# What's New in ClickHouse? {#whats-new-in-clickhouse}

For released versions, there is a [roadmap](../whats-new/roadmap.md) and a detailed [changelog](../whats-new/changelog/index.md).

View File

@ -1,17 +1,10 @@
---
toc_priority: 74
toc_title: Roadmap
---

# Roadmap {#roadmap}

The roadmap for 2021 has been published for open discussion [here](https://github.com/ClickHouse/ClickHouse/issues/17623).

{## [Original article](https://clickhouse.tech/docs/en/roadmap/) ##}

View File

@ -1,41 +1,74 @@
---
toc_priority: 76
toc_title: Security Changelog
---

## Fixed in ClickHouse Release 19.14.3.3, 2019-09-10 {#fixed-in-clickhouse-release-19-14-3-3-2019-09-10}

### CVE-2019-15024 {#cve-2019-15024}

An attacker that has write access to ZooKeeper and can run a custom server available from the network where ClickHouse runs can create a custom-built malicious server that acts as a ClickHouse replica and registers it in ZooKeeper. When another replica fetches a data part from the malicious replica, it can force clickhouse-server to write to an arbitrary path on the filesystem.

Credits: Eldar Zaitov of the Yandex information security team

### CVE-2019-16535 {#cve-2019-16535}

An OOB read, OOB write and integer underflow in decompression algorithms can be used to achieve RCE or DoS via the native protocol.

Credits: Eldar Zaitov of the Yandex information security team

### CVE-2019-16536 {#cve-2019-16536}

A stack overflow leading to DoS could be triggered by a malicious authenticated client.

Credits: Eldar Zaitov of the Yandex information security team

## Fixed in ClickHouse Release 19.13.6.1, 2019-09-20 {#fixed-in-clickhouse-release-19-13-6-1-2019-09-20}

### CVE-2019-18657 {#cve-2019-18657}

The `url` table function had a vulnerability that allowed the attacker to inject arbitrary HTTP headers into the request.

Credits: [Nikita Tikhomirov](https://github.com/NSTikhomirov)

## Fixed in ClickHouse Release 18.12.13, 2018-09-10 {#fixed-in-clickhouse-release-18-12-13-2018-09-10}

### CVE-2018-14672 {#cve-2018-14672}

Functions for loading CatBoost models allowed path traversal and reading arbitrary files through error messages.

Credits: Andrey Krasichkov of the Yandex information security team

## Fixed in ClickHouse Release 18.10.3, 2018-08-13 {#fixed-in-clickhouse-release-18-10-3-2018-08-13}

### CVE-2018-14671 {#cve-2018-14671}

unixODBC allowed loading arbitrary shared objects from the file system, which led to a remote code execution vulnerability.

Credits: Andrey Krasichkov and Evgeny Sidorov of the Yandex information security team

## Fixed in ClickHouse Release 1.1.54388, 2018-06-28 {#fixed-in-clickhouse-release-1-1-54388-2018-06-28}

### CVE-2018-14668 {#cve-2018-14668}

The `remote` table function allowed arbitrary symbols in the `user`, `password` and `default_database` fields, which led to cross-protocol request forgery attacks.

Credits: Andrey Krasichkov of the Yandex information security team

## Fixed in ClickHouse Release 1.1.54390, 2018-07-06 {#fixed-in-clickhouse-release-1-1-54390-2018-07-06}

### CVE-2018-14669 {#cve-2018-14669}

The ClickHouse MySQL client had the `LOAD DATA LOCAL INFILE` functionality enabled, which allowed a malicious MySQL database to read arbitrary files from the connected ClickHouse server.

Credits: Andrey Krasichkov and Evgeny Sidorov of the Yandex information security team

## Fixed in ClickHouse Release 1.1.54131, 2017-01-10 {#fixed-in-clickhouse-release-1-1-54131-2017-01-10}

### CVE-2018-14670 {#cve-2018-14670}

An incorrect configuration in the deb package could lead to unauthorized use of the database.

Credits: the UK's National Cyber Security Centre (NCSC)

{## [Original article](https://clickhouse.tech/docs/en/security_changelog/) ##}

View File

@ -949,6 +949,11 @@ private:
TestHint test_hint(test_mode, all_queries_text);
if (test_hint.clientError() || test_hint.serverError())
processTextAsSingleQuery("SET send_logs_level = 'none'");
// Echo all queries if asked; makes for a more readable reference
// file.
if (test_hint.echoQueries())
echo_queries = true;
}
/// Several queries separated by ';'.

View File

@ -14,6 +14,7 @@
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTInsertQuery.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTOrderByElement.h>
#include <Parsers/ASTQueryWithOutput.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
@ -28,6 +29,11 @@
namespace DB
{
namespace ErrorCodes
{
extern const int TOO_DEEP_RECURSION;
}
Field QueryFuzzer::getRandomField(int type)
{
switch (type)
@ -205,14 +211,88 @@ void QueryFuzzer::replaceWithTableLike(ASTPtr & ast)
ast = new_ast;
}
void QueryFuzzer::fuzzOrderByElement(ASTOrderByElement * elem)
{
switch (fuzz_rand() % 10)
{
case 0:
elem->direction = -1;
break;
case 1:
elem->direction = 1;
break;
case 2:
elem->nulls_direction = -1;
elem->nulls_direction_was_explicitly_specified = true;
break;
case 3:
elem->nulls_direction = 1;
elem->nulls_direction_was_explicitly_specified = true;
break;
case 4:
elem->nulls_direction = elem->direction;
elem->nulls_direction_was_explicitly_specified = false;
break;
default:
// do nothing
break;
}
}
void QueryFuzzer::fuzzOrderByList(IAST * ast)
{
if (!ast)
{
return;
}
auto * list = assert_cast<ASTExpressionList *>(ast);
// Remove element
if (fuzz_rand() % 50 == 0 && list->children.size() > 1)
{
// Don't remove last element -- this leads to questionable
// constructs such as empty select.
list->children.erase(list->children.begin()
+ fuzz_rand() % list->children.size());
}
// Add element
if (fuzz_rand() % 50 == 0)
{
auto pos = list->children.empty()
? list->children.begin()
: list->children.begin() + fuzz_rand() % list->children.size();
auto col = getRandomColumnLike();
if (col)
{
auto elem = std::make_shared<ASTOrderByElement>();
elem->children.push_back(col);
elem->direction = 1;
elem->nulls_direction = 1;
elem->nulls_direction_was_explicitly_specified = false;
elem->with_fill = false;
list->children.insert(pos, elem);
}
else
{
fprintf(stderr, "no random col!\n");
}
}
// We don't have to recurse here to fuzz the children, this is handled by
// the generic recursion into IAST.children.
}
void QueryFuzzer::fuzzColumnLikeExpressionList(IAST * ast)
{
if (!ast)
{
return;
}
auto * impl = assert_cast<ASTExpressionList *>(ast);
// Remove element
if (fuzz_rand() % 50 == 0 && impl->children.size() > 1)
@ -252,11 +332,44 @@ void QueryFuzzer::fuzz(ASTs & asts)
}
}
struct ScopedIncrement
{
size_t & counter;
explicit ScopedIncrement(size_t & counter_) : counter(counter_) { ++counter; }
~ScopedIncrement() { --counter; }
};
void QueryFuzzer::fuzz(ASTPtr & ast)
{
if (!ast)
return;
// Check for exceeding max depth.
ScopedIncrement depth_increment(current_ast_depth);
if (current_ast_depth > 500)
{
// The AST is too deep (see the comment for current_ast_depth). Throw
// an exception to fail fast and not use this query as an etalon, or we'll
// end up in a very slow and useless loop. It also makes sense to set it
// lower than the default max parse depth on the server (1000), so that
// we don't get the useless error about parse depth from the server either.
throw Exception(ErrorCodes::TOO_DEEP_RECURSION,
"AST depth exceeded while fuzzing ({})", current_ast_depth);
}
// Check for loops.
auto [_, inserted] = debug_visited_nodes.insert(ast.get());
if (!inserted)
{
fmt::print(stderr, "The AST node '{}' was already visited before."
" Depth {}, {} visited nodes, current top AST:\n{}\n",
static_cast<void *>(ast.get()), current_ast_depth,
debug_visited_nodes.size(), (*debug_top_ast)->dumpTree());
assert(false);
}
// The fuzzing.
if (auto * with_union = typeid_cast<ASTSelectWithUnionQuery *>(ast.get()))
{
fuzz(with_union->list_of_selects);
@ -281,17 +394,28 @@ void QueryFuzzer::fuzz(ASTPtr & ast)
{
fuzz(expr_list->children);
}
else if (auto * order_by_element = typeid_cast<ASTOrderByElement *>(ast.get()))
{
fuzzOrderByElement(order_by_element);
}
else if (auto * fn = typeid_cast<ASTFunction *>(ast.get()))
{
fuzzColumnLikeExpressionList(fn->arguments.get());
fuzzColumnLikeExpressionList(fn->parameters.get());
if (fn->is_window_function)
{
fuzzColumnLikeExpressionList(fn->window_partition_by.get());
fuzzOrderByList(fn->window_order_by.get());
}
fuzz(fn->children);
}
else if (auto * select = typeid_cast<ASTSelectQuery *>(ast.get()))
{
fuzzColumnLikeExpressionList(select->select().get());
fuzzColumnLikeExpressionList(select->groupBy().get());
fuzzOrderByList(select->orderBy().get());
fuzz(select->children);
}
@ -416,6 +540,10 @@ void QueryFuzzer::collectFuzzInfoRecurse(const ASTPtr ast)
void QueryFuzzer::fuzzMain(ASTPtr & ast)
{
current_ast_depth = 0;
debug_visited_nodes.clear();
debug_top_ast = &ast;
collectFuzzInfoMain(ast);
fuzz(ast);

View File

@ -12,6 +12,9 @@
namespace DB
{
class ASTExpressionList;
class ASTOrderByElement;
/*
* This is an AST-based query fuzzer that makes random modifications to query
* AST, changing numbers, list of columns, functions, etc. It remembers part of
@ -23,6 +26,13 @@ struct QueryFuzzer
{
pcg64 fuzz_rand{randomSeed()};
// We add elements to expression lists with fixed probability. Some elements
// are so large, that the expected number of elements we add to them is
// one or higher, hence this process might never finish. Put some limit on the
// total depth of AST to prevent this.
// This field is reset for each fuzzMain() call.
size_t current_ast_depth = 0;
// These arrays hold parts of queries that we can substitute into the query
// we are currently fuzzing. We add some part from each new query we are asked
// to fuzz, and keep this state between queries, so the fuzzing output becomes
@ -36,6 +46,12 @@ struct QueryFuzzer
std::unordered_map<std::string, ASTPtr> table_like_map;
std::vector<ASTPtr> table_like;
// Some debug fields for detecting problematic ASTs with loops.
// These are reset for each fuzzMain call.
std::unordered_set<const IAST *> debug_visited_nodes;
ASTPtr * debug_top_ast;
// This is the only function you have to call -- it will modify the passed
// ASTPtr to point to new AST with some random changes.
void fuzzMain(ASTPtr & ast);
@ -46,7 +62,9 @@ struct QueryFuzzer
ASTPtr getRandomColumnLike();
void replaceWithColumnLike(ASTPtr & ast);
void replaceWithTableLike(ASTPtr & ast);
void fuzzOrderByElement(ASTOrderByElement * elem);
void fuzzOrderByList(IAST * ast);
void fuzzColumnLikeExpressionList(IAST * ast);
void fuzz(ASTs & asts);
void fuzz(ASTPtr & ast);
void collectFuzzInfoMain(const ASTPtr ast);

View File

@ -19,6 +19,7 @@ namespace ErrorCodes
/// Checks expected server and client error codes in testmode.
/// To enable it, add a special comment after the query: "-- { serverError 60 }" or "-- { clientError 20 }".
/// You can also enable echoing of all queries by writing "-- { echo }".
class TestHint
{
public:
@ -84,12 +85,14 @@ public:
int serverError() const { return server_error; }
int clientError() const { return client_error; }
bool echoQueries() const { return echo; }
private:
bool enabled = false;
const String & query;
int server_error = 0;
int client_error = 0;
bool echo = false;
void parse(const String & hint)
{
@ -107,6 +110,8 @@ private:
ss >> server_error;
else if (item == "clientError")
ss >> client_error;
else if (item == "echo")
echo = true;
}
}
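Putting the three hints together, a testmode input file could look like the following (a sketch built only from the hint forms named in the header comment above; error code 60 is the code that comment itself uses as an example):

``` sql
-- { echo }
SELECT 1;
SELECT * FROM no_such_table; -- { serverError 60 }
```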

View File

@ -155,7 +155,7 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
}
catch (...)
{
std::cerr << getCurrentExceptionMessage(true) << '\n';
return getCurrentExceptionCode();
}

View File

@ -142,7 +142,7 @@ int mainEntryClickHouseFormat(int argc, char ** argv)
}
catch (...)
{
std::cerr << getCurrentExceptionMessage(true) << '\n';
return getCurrentExceptionCode();
}

View File

@ -4,6 +4,7 @@
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <pwd.h>
#include <unistd.h>
@ -103,6 +104,12 @@ namespace CurrentMetrics
int mainEntryClickHouseServer(int argc, char ** argv)
{
DB::Server app;
    /// Do not fork a separate process from the watchdog if we are attached to a terminal.
    /// Otherwise it breaks gdb usage.
if (argc > 0 && !isatty(STDIN_FILENO) && !isatty(STDOUT_FILENO) && !isatty(STDERR_FILENO))
app.shouldSetupWatchdog(argv[0]);
try
{
return app.run(argc, argv);
@ -366,6 +373,7 @@ void checkForUsersNotInMainConfig(
int Server::main(const std::vector<std::string> & /*args*/)
{
Poco::Logger * log = &logger();
UseSSL use_ssl;
MainThreadStatus::getInstance();

View File

@ -127,10 +127,10 @@ public:
void insertResultInto(AggregateDataPtr place, IColumn & to, Arena *) const override
{
if constexpr (IsDecimalNumber<Numerator> || IsDecimalNumber<Denominator>)
assert_cast<ColumnVector<Float64> &>(to).getData().push_back(
this->data(place).divideIfAnyDecimal(num_scale, denom_scale));
else
assert_cast<ColumnVector<Float64> &>(to).getData().push_back(this->data(place).divide());
}
private:
UInt32 num_scale;

View File

@ -107,7 +107,7 @@ public:
/// TODO Do positions need to be 1-based for this function?
size_t position = columns[1]->getUInt(row_num);
/// If position is larger than size to which array will be cut - simply ignore value.
if (length_to_resize && position >= length_to_resize)
return;

View File

@ -1,11 +1,10 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/FactoryHelpers.h>
#include <AggregateFunctions/Helpers.h>
#include <DataTypes/DataTypeAggregateFunction.h>
#include "registerAggregateFunctions.h"
// TODO include this last because of a broken roaring header. See the comment inside.
#include <AggregateFunctions/AggregateFunctionGroupBitmap.h>
namespace DB
@ -17,46 +16,54 @@ namespace ErrorCodes
namespace
{
template <template <typename> class Data>
AggregateFunctionPtr createAggregateFunctionBitmap(const std::string & name, const DataTypes & argument_types, const Array & parameters)
{
    assertNoParameters(name, parameters);
    assertUnary(name, argument_types);

    if (!argument_types[0]->canBeUsedInBitOperations())
        throw Exception(
            "The type " + argument_types[0]->getName() + " of argument for aggregate function " + name
                + " is illegal, because it cannot be used in Bitmap operations",
            ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

    AggregateFunctionPtr res(createWithUnsignedIntegerType<AggregateFunctionBitmap, Data>(*argument_types[0], argument_types[0]));

    if (!res)
        throw Exception(
            "Illegal type " + argument_types[0]->getName() + " of argument for aggregate function " + name,
            ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

    return res;
}

// Additional aggregate functions to manipulate bitmaps.
template <template <typename, typename> class AggregateFunctionTemplate>
AggregateFunctionPtr createAggregateFunctionBitmapL2(const std::string & name, const DataTypes & argument_types, const Array & parameters)
{
    assertNoParameters(name, parameters);
    assertUnary(name, argument_types);

    DataTypePtr argument_type_ptr = argument_types[0];
    WhichDataType which(*argument_type_ptr);
    if (which.idx != TypeIndex::AggregateFunction)
        throw Exception(
            "Illegal type " + argument_types[0]->getName() + " of argument for aggregate function " + name,
            ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

    const DataTypeAggregateFunction & datatype_aggfunc = dynamic_cast<const DataTypeAggregateFunction &>(*argument_type_ptr);
    AggregateFunctionPtr aggfunc = datatype_aggfunc.getFunction();
    argument_type_ptr = aggfunc->getArgumentTypes()[0];

    AggregateFunctionPtr res(createWithUnsignedIntegerType<AggregateFunctionTemplate, AggregateFunctionGroupBitmapData>(
        *argument_type_ptr, argument_type_ptr));

    if (!res)
        throw Exception(
            "Illegal type " + argument_types[0]->getName() + " of argument for aggregate function " + name,
            ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

    return res;
}
}
void registerAggregateFunctionsBitmap(AggregateFunctionFactory & factory)

View File

@ -1,31 +1,26 @@
#pragma once
#include <AggregateFunctions/IAggregateFunction.h>
#include <Columns/ColumnAggregateFunction.h>
#include <Columns/ColumnVector.h>
#include <DataTypes/DataTypesNumber.h>
#include <Common/assert_cast.h>
// TODO include this last because of a broken roaring header. See the comment inside.
#include <AggregateFunctions/AggregateFunctionGroupBitmapData.h>
namespace DB
{
/// Counts bitmap operation on numbers.
template <typename T, typename Data>
class AggregateFunctionBitmap final : public IAggregateFunctionDataHelper<Data, AggregateFunctionBitmap<T, Data>>
{
public:
    AggregateFunctionBitmap(const DataTypePtr & type) : IAggregateFunctionDataHelper<Data, AggregateFunctionBitmap<T, Data>>({type}, {}) { }
String getName() const override { return Data::name(); }
    DataTypePtr getReturnType() const override { return std::make_shared<DataTypeNumber<T>>(); }
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
{
@ -37,15 +32,9 @@ public:
this->data(place).rbs.merge(this->data(rhs).rbs);
}
    void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override { this->data(place).rbs.write(buf); }
    void deserialize(AggregateDataPtr place, ReadBuffer & buf, Arena *) const override { this->data(place).rbs.read(buf); }
void insertResultInto(AggregateDataPtr place, IColumn & to, Arena *) const override
{
@ -59,23 +48,22 @@ class AggregateFunctionBitmapL2 final : public IAggregateFunctionDataHelper<Data
{
public:
AggregateFunctionBitmapL2(const DataTypePtr & type)
        : IAggregateFunctionDataHelper<Data, AggregateFunctionBitmapL2<T, Data, Policy>>({type}, {})
    {
    }
String getName() const override { return Data::name(); }
    DataTypePtr getReturnType() const override { return std::make_shared<DataTypeNumber<T>>(); }
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
{
Data & data_lhs = this->data(place);
const Data & data_rhs = this->data(assert_cast<const ColumnAggregateFunction &>(*columns[0]).getData()[row_num]);
        if (!data_lhs.init)
{
            data_lhs.init = true;
            data_lhs.rbs.merge(data_rhs.rbs);
}
else
{
@ -88,13 +76,13 @@ public:
Data & data_lhs = this->data(place);
const Data & data_rhs = this->data(rhs);
        if (!data_rhs.init)
return;
        if (!data_lhs.init)
{
            data_lhs.init = true;
            data_lhs.rbs.merge(data_rhs.rbs);
}
else
{
@ -102,15 +90,9 @@ public:
}
}
    void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override { this->data(place).rbs.write(buf); }
    void deserialize(AggregateDataPtr place, ReadBuffer & buf, Arena *) const override { this->data(place).rbs.read(buf); }
void insertResultInto(AggregateDataPtr place, IColumn & to, Arena *) const override
{
@ -122,39 +104,30 @@ template <typename Data>
class BitmapAndPolicy
{
public:
    static void apply(Data & lhs, const Data & rhs) { lhs.rbs.rb_and(rhs.rbs); }
};
template <typename Data>
class BitmapOrPolicy
{
public:
    static void apply(Data & lhs, const Data & rhs) { lhs.rbs.rb_or(rhs.rbs); }
};
template <typename Data>
class BitmapXorPolicy
{
public:
    static void apply(Data & lhs, const Data & rhs) { lhs.rbs.rb_xor(rhs.rbs); }
};
template <typename T, typename Data>
using AggregateFunctionBitmapL2And = AggregateFunctionBitmapL2<T, Data, BitmapAndPolicy<Data>>;
template <typename T, typename Data>
using AggregateFunctionBitmapL2Or = AggregateFunctionBitmapL2<T, Data, BitmapOrPolicy<Data>>;
template <typename T, typename Data>
using AggregateFunctionBitmapL2Xor = AggregateFunctionBitmapL2<T, Data, BitmapXorPolicy<Data>>;
}

View File

@ -11,30 +11,39 @@
// garbage that breaks the build (e.g. it changes _POSIX_C_SOURCE).
// TODO: find out what it is. On github, they have proper interface headers like
// this one: https://github.com/RoaringBitmap/CRoaring/blob/master/include/roaring/roaring.h
#include <roaring.hh>
#include <roaring64map.hh>
namespace DB
{
enum BitmapKind
{
Small = 0,
Bitmap = 1
};
/**
* For a small number of values - an array of fixed size "on the stack".
  * For large, roaring bitmap is allocated.
* For a description of the roaring_bitmap_t, see: https://github.com/RoaringBitmap/CRoaring
*/
template <typename T, UInt8 small_set_size>
class RoaringBitmapWithSmallSet : private boost::noncopyable
{
private:
    SmallSet<T, small_set_size> small;
    using ValueBuffer = std::vector<T>;
    using RoaringBitmap = std::conditional_t<sizeof(T) >= 8, roaring::Roaring64Map, roaring::Roaring>;
    using Value = std::conditional_t<sizeof(T) >= 8, UInt64, UInt32>;
    std::shared_ptr<RoaringBitmap> rb = nullptr;

    void toLarge()
    {
        rb = std::make_shared<RoaringBitmap>();
        for (const auto & x : small)
            rb->add(static_cast<Value>(x.getValue()));
small.clear();
}
public:
@ -42,12 +51,6 @@ public:
bool isSmall() const { return rb == nullptr; }
void add(T value)
{
if (isSmall())
@ -59,19 +62,22 @@ public:
                else
                {
                    toLarge();
                    rb->add(static_cast<Value>(value));
                }
            }
        }
        else
        {
            rb->add(static_cast<Value>(value));
        }
}
UInt64 size() const
{
        if (isSmall())
            return small.size();
        else
            return rb->cardinality();
}
void merge(const RoaringBitmapWithSmallSet & r1)
@ -81,7 +87,7 @@ public:
if (isSmall())
toLarge();
            *rb |= *r1.rb;
}
else
{
@ -92,50 +98,49 @@ public:
void read(DB::ReadBuffer & in)
{
        UInt8 kind;
        readBinary(kind, in);

        if (BitmapKind::Small == kind)
        {
            small.read(in);
        }
        else if (BitmapKind::Bitmap == kind)
{
size_t size;
readVarUInt(size, in);
std::unique_ptr<char[]> buf(new char[size]);
in.readStrict(buf.get(), size);
rb = std::make_shared<RoaringBitmap>(RoaringBitmap::read(buf.get()));
}
}
void write(DB::WriteBuffer & out) const
{
        UInt8 kind = isLarge() ? BitmapKind::Bitmap : BitmapKind::Small;
        writeBinary(kind, out);

        if (BitmapKind::Small == kind)
        {
            small.write(out);
        }
else if (BitmapKind::Bitmap == kind)
{
auto size = rb->getSizeInBytes();
writeVarUInt(size, out);
std::unique_ptr<char[]> buf(new char[size]);
rb->write(buf.get());
out.write(buf.get(), size);
}
}
/**
     * Get a new RoaringBitmap from elements of small
     */
    std::shared_ptr<RoaringBitmap> getNewRoaringBitmapFromSmall() const
    {
        std::shared_ptr<RoaringBitmap> ret = std::make_shared<RoaringBitmap>();
        for (const auto & x : small)
            ret->add(static_cast<Value>(x.getValue()));
        return ret;
}
/**
@ -162,8 +167,10 @@ public:
else if (isSmall() && r1.isLarge())
{
for (const auto & x : small)
            {
                if (r1.rb->contains(static_cast<Value>(x.getValue())))
                    buffer.push_back(x.getValue());
            }
// Clear out the original values
small.clear();
@ -175,10 +182,8 @@ public:
}
else
{
            std::shared_ptr<RoaringBitmap> new_rb = r1.isSmall() ? r1.getNewRoaringBitmapFromSmall() : r1.rb;
            *rb &= *new_rb;
}
}
@ -194,10 +199,9 @@ public:
{
if (isSmall())
toLarge();
        std::shared_ptr<RoaringBitmap> new_rb = r1.isSmall() ? r1.getNewRoaringBitmapFromSmall() : r1.rb;
        *rb ^= *new_rb;
}
/**
@ -224,8 +228,10 @@ public:
else if (isSmall() && r1.isLarge())
{
for (const auto & x : small)
            {
                if (!r1.rb->contains(static_cast<Value>(x.getValue())))
                    buffer.push_back(x.getValue());
            }
// Clear out the original values
small.clear();
@ -237,10 +243,8 @@ public:
}
else
{
            std::shared_ptr<RoaringBitmap> new_rb = r1.isSmall() ? r1.getNewRoaringBitmapFromSmall() : r1.rb;
            *rb -= *new_rb;
}
}
@ -249,27 +253,27 @@ public:
*/
UInt64 rb_and_cardinality(const RoaringBitmapWithSmallSet & r1) const
{
        UInt64 ret = 0;
if (isSmall() && r1.isSmall())
{
for (const auto & x : small)
if (r1.small.find(x.getValue()) != r1.small.end())
                    ++ret;
}
else if (isSmall() && r1.isLarge())
{
for (const auto & x : small)
            {
                if (r1.rb->contains(static_cast<Value>(x.getValue())))
                    ++ret;
            }
}
else
{
            std::shared_ptr<RoaringBitmap> new_rb = r1.isSmall() ? r1.getNewRoaringBitmapFromSmall() : r1.rb;
            ret = (*rb & *new_rb).cardinality();
}
        return ret;
}
/**
@ -311,11 +315,9 @@ public:
{
if (isSmall())
toLarge();
        std::shared_ptr<RoaringBitmap> new_rb = r1.isSmall() ? r1.getNewRoaringBitmapFromSmall() : r1.rb;
        return *rb == *new_rb;
}
/**
@ -335,18 +337,25 @@ public:
else
{
for (const auto & x : small)
{
if (r1.rb->contains(static_cast<Value>(x.getValue())))
return 1;
}
}
}
else if (r1.isSmall())
{
for (const auto & x : r1.small)
{
if (rb->contains(static_cast<Value>(x.getValue())))
return 1;
}
}
else
{
if ((*rb & *r1.rb).cardinality() > 0)
return 1;
}
return 0;
}
@ -369,8 +378,9 @@ public:
{
UInt64 r1_size = r1.size();
// A bigger set can not be a subset of ours.
if (r1_size > small.size())
                return 0;
// This is a rare case with a small number of elements on
// both sides: r1 was promoted to large for some reason and
@ -379,38 +389,48 @@ public:
// r1_size + number of not found elements, if this sum becomes
// greater then r1 is not a subset.
for (const auto & x : small)
{
if (!r1.rb->contains(static_cast<Value>(x.getValue())) && ++r1_size > small.size())
return 0;
}
}
}
else if (r1.isSmall())
{
for (const auto & x : r1.small)
{
if (!rb->contains(static_cast<Value>(x.getValue())))
return 0;
}
}
else
{
if (!r1.rb->isSubset(*rb))
return 0;
}
return 1;
}
/**
* Check whether this bitmap contains the argument.
*/
    UInt8 rb_contains(UInt64 x) const
    {
        if (isSmall())
return small.find(x) != small.end();
else
return rb->contains(x);
}
/**
* Remove value
*/
    void rb_remove(UInt64 x)
{
if (isSmall())
toLarge();
        rb->remove(x);
}
/**
@ -419,47 +439,46 @@ public:
* range_end - range_start.
* Areas outside the range are passed through unchanged.
*/
    void rb_flip(UInt64 begin, UInt64 end)
{
if (isSmall())
toLarge();
        rb->flip(begin, end);
}
/**
     * returns the number of integers that are smaller or equal to x.
*/
    UInt64 rb_rank(UInt64 x)
{
if (isSmall())
toLarge();
        return rb->rank(x);
}
/**
* Convert elements to integer array, return number of elements
*/
template <typename Element>
    UInt64 rb_to_array(PaddedPODArray<Element> & res) const
{
UInt64 count = 0;
if (isSmall())
{
for (const auto & x : small)
{
                res.emplace_back(x.getValue());
                ++count;
}
}
else
{
            for (auto it = rb->begin(); it != rb->end(); ++it)
for (auto it = rb->begin(); it != rb->end(); ++it)
{
                res.emplace_back(*it);
                ++count;
}
}
return count;
@ -468,7 +487,7 @@ public:
/**
* Return new set with specified range (not include the range_end)
*/
    UInt64 rb_range(UInt64 range_start, UInt64 range_end, RoaringBitmapWithSmallSet & r1) const
{
UInt64 count = 0;
if (range_start >= range_end)
@ -487,14 +506,18 @@ public:
}
else
{
            for (auto it = rb->begin(); it != rb->end(); ++it)
for (auto it = rb->begin(); it != rb->end(); ++it)
{
if (*it < range_start)
continue;
if (*it < range_end)
{
r1.add(*it);
++count;
}
else
break;
}
}
return count;
@ -503,9 +526,11 @@ public:
/**
* Return new set of the smallest `limit` values in set which is no less than `range_start`.
*/
    UInt64 rb_limit(UInt64 range_start, UInt64 limit, RoaringBitmapWithSmallSet & r1) const
    {
if (limit == 0)
return 0;
if (isSmall())
{
std::vector<T> answer;
@ -517,68 +542,72 @@ public:
answer.push_back(val);
}
}
if (limit < answer.size())
{
std::nth_element(answer.begin(), answer.begin() + limit, answer.end());
answer.resize(limit);
}
for (const auto & elem : answer)
r1.add(elem);
return answer.size();
}
else
{
UInt64 count = 0;
for (auto it = rb->begin(); it != rb->end(); ++it)
{
if (*it < range_start)
continue;
if (count < limit)
{
r1.add(*it);
++count;
}
else
break;
}
return count;
}
}
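    /// Editorial note: std::nth_element selects the smallest `limit` values in
    /// O(n) on average instead of fully sorting; r1 is itself a set, so the
    /// selected values do not need to be inserted in sorted order.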
    UInt64 rb_min() const
    {
        if (isSmall())
        {
            if (small.empty())
                return 0;
            auto min_val = std::numeric_limits<std::make_unsigned_t<T>>::max();
            for (const auto & x : small)
            {
                auto val = x.getValue();
                if (val < min_val)
                    min_val = val;
            }
            return min_val;
        }

        return rb->minimum();
}
    UInt64 rb_max() const
    {
        if (isSmall())
        {
            if (small.empty())
                return 0;
            auto max_val = std::numeric_limits<std::make_unsigned_t<T>>::min();
            for (const auto & x : small)
            {
                auto val = x.getValue();
                if (val > max_val)
                    max_val = val;
            }
            return max_val;
        }

        return rb->maximum();
}
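    /// Editorial caveat, assuming CRoaring conventions: an empty small set
    /// returns 0 from both rb_min and rb_max, while on an empty roaring
    /// bitmap minimum() returns UINT32_MAX and maximum() returns 0, so the
    /// two representations disagree for rb_min on empty input.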
    @ -588,22 +617,23 @@ public:

    /**
     * Replace values from `from_vals` with the corresponding `to_vals`.
     */
    void rb_replace(const UInt64 * from_vals, const UInt64 * to_vals, size_t num)
{
if (isSmall())
toLarge();
for (size_t i = 0; i < num; ++i)
{
if (from_vals[i] == to_vals[i])
continue;
            bool changed = rb->removeChecked(from_vals[i]);
            if (changed)
                rb->add(to_vals[i]);
}
}
};
template <typename T>
struct AggregateFunctionGroupBitmapData
{
    // If false, all bitmap operations will be treated as merge to initialize the state
    bool init = false;
RoaringBitmapWithSmallSet<T, 32> rbs;
static const char * name() { return "groupBitmap"; }
};


@ -1,5 +1,6 @@
#pragma once
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/IAggregateFunction.h>
#include <DataTypes/DataTypeCustomSimpleAggregateFunction.h>
#include <DataTypes/DataTypeFactory.h>
@ -32,10 +33,17 @@ public:
DataTypePtr getReturnType() const override
{
DataTypeCustomSimpleAggregateFunction::checkSupportedFunctions(nested_func);
// Need to make a clone because it'll be customized.
auto storage_type = DataTypeFactory::instance().get(nested_func->getReturnType()->getName());
// Need to make a new function with promoted argument types because SimpleAggregates requires arg_type = return_type.
AggregateFunctionProperties properties;
auto function
= AggregateFunctionFactory::instance().get(nested_func->getName(), {storage_type}, nested_func->getParameters(), properties);
DataTypeCustomNamePtr custom_name
        = std::make_unique<DataTypeCustomSimpleAggregateFunction>(function, DataTypes{nested_func->getReturnType()}, params);
storage_type->setCustomization(std::make_unique<DataTypeCustomDesc>(std::move(custom_name), nullptr));
return storage_type;
}
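    // Editorial illustration with an assumed example: for sumSimpleState(x)
    // where x is UInt8, the nested sum(UInt8) returns UInt64, so storage_type
    // is UInt64 and the customized name is built from sum(UInt64), satisfying
    // the arg_type == return_type requirement.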


@ -104,9 +104,12 @@ public:
return false;
}
/// Inserts results into a column. This method might modify the state (e.g.
/// sort an array), so must be called once, from single thread. The state
/// must remain valid though, and the subsequent calls to add/merge/
/// insertResultInto must work correctly. This kind of call sequence occurs
/// in `runningAccumulate`, or when calculating an aggregate function as a
/// window function.
virtual void insertResultInto(AggregateDataPtr place, IColumn & to, Arena * arena) const = 0;
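    /// Editorial sketch of such a call sequence (illustrative names, not
    /// actual engine code):
    ///
    ///     func->add(place, columns, 0, arena);
    ///     func->insertResultInto(place, to, arena);  /// running value after row 0
    ///     func->add(place, columns, 1, arena);       /// the state must still accept updates
    ///     func->insertResultInto(place, to, arena);  /// running value after row 1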
/// Used for machine learning methods. Predict result from trained model.


@ -307,6 +307,9 @@ if (USE_KRB5)
dbms_target_link_libraries(PRIVATE ${KRB5_LIBRARY})
endif()
if (USE_NURAFT)
dbms_target_link_libraries(PRIVATE ${NURAFT_LIBRARY})
endif()
if(RE2_INCLUDE_DIR)
target_include_directories(clickhouse_common_io SYSTEM BEFORE PUBLIC ${RE2_INCLUDE_DIR})
@ -436,6 +439,8 @@ if (USE_ROCKSDB)
dbms_target_include_directories(SYSTEM BEFORE PUBLIC ${ROCKSDB_INCLUDE_DIR})
endif()
dbms_target_link_libraries(PRIVATE _boost_context)
if (ENABLE_TESTS AND USE_GTEST)
macro (grep_gtest_sources BASE_DIR DST_VAR)
    # Could match files that are not in tests/ directories


@ -742,8 +742,11 @@ std::optional<UInt64> Connection::checkPacket(size_t timeout_microseconds)
}
Packet Connection::receivePacket(std::function<void(Poco::Net::Socket &)> async_callback)
{
in->setAsyncCallback(std::move(async_callback));
SCOPE_EXIT(in->setAsyncCallback({}));
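    /// Editorial note: the SCOPE_EXIT guard clears the callback on every exit
    /// path, including exceptions, so a later synchronous read on this buffer
    /// cannot fire a stale callback.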
try
{
Packet res;


@ -18,6 +18,7 @@
#include <DataStreams/BlockStreamProfileInfo.h>
#include <IO/ConnectionTimeouts.h>
#include <IO/ReadBufferFromPocoSocket.h>
#include <Interpreters/TablesStatus.h>
@ -171,7 +172,8 @@ public:
std::optional<UInt64> checkPacket(size_t timeout_microseconds = 0);
    /// Receive packet from server.
    /// Each time the read blocks and async_callback is set, it will be called. You can poll the socket inside it.
    Packet receivePacket(std::function<void(Poco::Net::Socket &)> async_callback = {});
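    /// Editorial sketch of a hypothetical caller (illustrative only):
    ///
    ///     Packet packet = connection.receivePacket([](Poco::Net::Socket & socket)
    ///     {
    ///         /// e.g. poll socket.impl()->sockfd() and yield the fiber here
    ///     });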
/// If not connected yet, or if connection is broken - then connect. If cannot connect - throw an exception.
void forceConnected(const ConnectionTimeouts & timeouts);
@ -226,7 +228,7 @@ private:
String server_display_name;
std::unique_ptr<Poco::Net::StreamSocket> socket;
std::shared_ptr<ReadBufferFromPocoSocket> in;
std::shared_ptr<WriteBuffer> out;
std::optional<UInt64> last_input_packet_type;


@ -237,7 +237,7 @@ std::string MultiplexedConnections::dumpAddressesUnlocked() const
return buf.str();
}
Packet MultiplexedConnections::receivePacketUnlocked(std::function<void(Poco::Net::Socket &)> async_callback)
{
if (!sent_query)
throw Exception("Cannot receive packets: no query sent.", ErrorCodes::LOGICAL_ERROR);
@ -249,7 +249,7 @@ Packet MultiplexedConnections::receivePacketUnlocked()
if (current_connection == nullptr)
throw Exception("Logical error: no available replica", ErrorCodes::NO_AVAILABLE_REPLICA);
Packet packet = current_connection->receivePacket(std::move(async_callback));
switch (packet.type)
{


@ -69,7 +69,7 @@ public:
private:
/// Internal version of `receivePacket` function without locking.
Packet receivePacketUnlocked(std::function<void(Poco::Net::Socket &)> async_callback = {});
/// Internal version of `dumpAddresses` function without locking.
std::string dumpAddressesUnlocked() const;
@ -105,6 +105,8 @@ private:
/// A mutex for the sendCancel function to execute safely
/// in separate thread.
mutable std::mutex cancel_mutex;
friend class RemoteQueryExecutorReadContext;
};
}


@ -97,7 +97,7 @@ public:
/// If preprocessed_dir is empty - calculate from loaded_config.path + /preprocessed_configs/
void savePreprocessedConfig(const LoadedConfig & loaded_config, std::string preprocessed_dir);
/// Set path of main config.xml. It will be cut from all configs placed to preprocessed_configs/
static void setConfigPath(const std::string & config_path);
public:


@ -87,7 +87,7 @@ public:
{
/// A more understandable error message.
if (e.code() == DB::ErrorCodes::CANNOT_READ_ALL_DATA || e.code() == DB::ErrorCodes::ATTEMPT_TO_READ_AFTER_EOF)
throw DB::Exception("File " + path + " is empty. You must fill it manually with appropriate value.", e.code());
throw DB::ParsingException("File " + path + " is empty. You must fill it manually with appropriate value.", e.code());
else
throw;
}
