Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into master

Daria Mozhaeva 2020-12-11 21:24:59 +03:00
commit 76f9d5a285
151 changed files with 21143 additions and 1283 deletions


@@ -15,7 +15,7 @@
* Restrict to use of non-comparable data types (like `AggregateFunction`) in keys (Sorting key, Primary key, Partition key, and so on). [#16601](https://github.com/ClickHouse/ClickHouse/pull/16601) ([alesapin](https://github.com/alesapin)).
* Remove `ANALYZE` and `AST` queries, and make the setting `enable_debug_queries` obsolete since now it is the part of full featured `EXPLAIN` query. [#16536](https://github.com/ClickHouse/ClickHouse/pull/16536) ([Ivan](https://github.com/abyss7)).
* Aggregate functions `boundingRatio`, `rankCorr`, `retention`, `timeSeriesGroupSum`, `timeSeriesGroupRateSum`, `windowFunnel` were erroneously made case-insensitive. Now their names are made case sensitive as designed. Only functions that are specified in SQL standard or made for compatibility with other DBMS or functions similar to those should be case-insensitive. [#16407](https://github.com/ClickHouse/ClickHouse/pull/16407) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Make `rankCorr` function return nan on insufficient data https://github.com/ClickHouse/ClickHouse/issues/16124. [#16135](https://github.com/ClickHouse/ClickHouse/pull/16135) ([hexiaoting](https://github.com/hexiaoting)).
* Make `rankCorr` function return nan on insufficient data [#16124](https://github.com/ClickHouse/ClickHouse/issues/16124). [#16135](https://github.com/ClickHouse/ClickHouse/pull/16135) ([hexiaoting](https://github.com/hexiaoting)).
* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version).
#### New Feature
@@ -33,7 +33,7 @@
* Now we can provide identifiers via query parameters. And these parameters can be used as table objects or columns. [#16594](https://github.com/ClickHouse/ClickHouse/pull/16594) ([Amos Bird](https://github.com/amosbird)).
* Added big integers (UInt256, Int128, Int256) and UUID data types support for MergeTree BloomFilter index. Big integers is an experimental feature. [#16642](https://github.com/ClickHouse/ClickHouse/pull/16642) ([Maksim Kita](https://github.com/kitaisreal)).
* Add `farmFingerprint64` function (non-cryptographic string hashing). [#16570](https://github.com/ClickHouse/ClickHouse/pull/16570) ([Jacob Hayes](https://github.com/JacobHayes)).
* Add `log_queries_min_query_duration_ms`, only queries slower then the value of this setting will go to `query_log`/`query_thread_log` (i.e. something like `slow_query_log` in mysql). [#16529](https://github.com/ClickHouse/ClickHouse/pull/16529) ([Azat Khuzhin](https://github.com/azat)).
* Add `log_queries_min_query_duration_ms`, only queries slower than the value of this setting will go to `query_log`/`query_thread_log` (i.e. something like `slow_query_log` in mysql). [#16529](https://github.com/ClickHouse/ClickHouse/pull/16529) ([Azat Khuzhin](https://github.com/azat)).
* Ability to create a docker image on the top of `Alpine`. Uses precompiled binary and glibc components from ubuntu 20.04. [#16479](https://github.com/ClickHouse/ClickHouse/pull/16479) ([filimonov](https://github.com/filimonov)).
* Added `toUUIDOrNull`, `toUUIDOrZero` cast functions. [#16337](https://github.com/ClickHouse/ClickHouse/pull/16337) ([Maksim Kita](https://github.com/kitaisreal)).
* Add `max_concurrent_queries_for_all_users` setting, see [#6636](https://github.com/ClickHouse/ClickHouse/issues/6636) for use cases. [#16154](https://github.com/ClickHouse/ClickHouse/pull/16154) ([nvartolomei](https://github.com/nvartolomei)).
@@ -178,7 +178,7 @@
* Add `JSONStrings` format which output data in arrays of strings. [#14333](https://github.com/ClickHouse/ClickHouse/pull/14333) ([hcz](https://github.com/hczhcz)).
* Add support for "Raw" column format for `Regexp` format. It allows to simply extract subpatterns as a whole without any escaping rules. [#15363](https://github.com/ClickHouse/ClickHouse/pull/15363) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Allow configurable `NULL` representation for `TSV` output format. It is controlled by the setting `output_format_tsv_null_representation` which is `\N` by default. This closes [#9375](https://github.com/ClickHouse/ClickHouse/issues/9375). Note that the setting only controls output format and `\N` is the only supported `NULL` representation for `TSV` input format. [#14586](https://github.com/ClickHouse/ClickHouse/pull/14586) ([Kruglov Pavel](https://github.com/Avogar)).
* Support Decimal data type for `MaterializedMySQL`. `MaterializedMySQL` is an experimental feature. [#14535](https://github.com/ClickHouse/ClickHouse/pull/14535) ([Winter Zhang](https://github.com/zhang2014)).
* Support Decimal data type for `MaterializeMySQL`. `MaterializeMySQL` is an experimental feature. [#14535](https://github.com/ClickHouse/ClickHouse/pull/14535) ([Winter Zhang](https://github.com/zhang2014)).
* Add new feature: `SHOW DATABASES LIKE 'xxx'`. [#14521](https://github.com/ClickHouse/ClickHouse/pull/14521) ([hexiaoting](https://github.com/hexiaoting)).
* Added a script to import (arbitrary) git repository to ClickHouse as a sample dataset. [#14471](https://github.com/ClickHouse/ClickHouse/pull/14471) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Now insert statements can have asterisk (or variants) with column transformers in the column list. [#14453](https://github.com/ClickHouse/ClickHouse/pull/14453) ([Amos Bird](https://github.com/amosbird)).
@@ -200,18 +200,18 @@
* Fix a very wrong code in TwoLevelStringHashTable implementation, which might lead to memory leak. [#16264](https://github.com/ClickHouse/ClickHouse/pull/16264) ([Amos Bird](https://github.com/amosbird)).
* Fix segfault in some cases of wrong aggregation in lambdas. [#16082](https://github.com/ClickHouse/ClickHouse/pull/16082) ([Anton Popov](https://github.com/CurtizJ)).
* Fix `ALTER MODIFY ... ORDER BY` query hang for `ReplicatedVersionedCollapsingMergeTree`. This fixes [#15980](https://github.com/ClickHouse/ClickHouse/issues/15980). [#16011](https://github.com/ClickHouse/ClickHouse/pull/16011) ([alesapin](https://github.com/alesapin)).
* `MaterializedMySQL` (experimental feature): Fix collate name & charset name parser and support `length = 0` for string type. [#16008](https://github.com/ClickHouse/ClickHouse/pull/16008) ([Winter Zhang](https://github.com/zhang2014)).
* `MaterializeMySQL` (experimental feature): Fix collate name & charset name parser and support `length = 0` for string type. [#16008](https://github.com/ClickHouse/ClickHouse/pull/16008) ([Winter Zhang](https://github.com/zhang2014)).
* Allow to use `direct` layout for dictionaries with complex keys. [#16007](https://github.com/ClickHouse/ClickHouse/pull/16007) ([Anton Popov](https://github.com/CurtizJ)).
* Prevent replica hang for 5-10 mins when replication error happens after a period of inactivity. [#15987](https://github.com/ClickHouse/ClickHouse/pull/15987) ([filimonov](https://github.com/filimonov)).
* Fix rare segfaults when inserting into or selecting from MaterializedView and concurrently dropping target table (for Atomic database engine). [#15984](https://github.com/ClickHouse/ClickHouse/pull/15984) ([tavplubix](https://github.com/tavplubix)).
* Fix ambiguity in parsing of settings profiles: `CREATE USER ... SETTINGS profile readonly` is now considered as using a profile named `readonly`, not a setting named `profile` with the readonly constraint. This fixes [#15628](https://github.com/ClickHouse/ClickHouse/issues/15628). [#15982](https://github.com/ClickHouse/ClickHouse/pull/15982) ([Vitaly Baranov](https://github.com/vitlibar)).
* `MaterializedMySQL` (experimental feature): Fix crash on create database failure. [#15954](https://github.com/ClickHouse/ClickHouse/pull/15954) ([Winter Zhang](https://github.com/zhang2014)).
* `MaterializeMySQL` (experimental feature): Fix crash on create database failure. [#15954](https://github.com/ClickHouse/ClickHouse/pull/15954) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `DROP TABLE IF EXISTS` failure with `Table ... doesn't exist` error when table is concurrently renamed (for Atomic database engine). Fixed rare deadlock when concurrently executing some DDL queries with multiple tables (like `DROP DATABASE` and `RENAME TABLE`) - Fixed `DROP/DETACH DATABASE` failure with `Table ... doesn't exist` when concurrently executing `DROP/DETACH TABLE`. [#15934](https://github.com/ClickHouse/ClickHouse/pull/15934) ([tavplubix](https://github.com/tavplubix)).
* Fix incorrect empty result for query from `Distributed` table if query has `WHERE`, `PREWHERE` and `GLOBAL IN`. Fixes [#15792](https://github.com/ClickHouse/ClickHouse/issues/15792). [#15933](https://github.com/ClickHouse/ClickHouse/pull/15933) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixes [#12513](https://github.com/ClickHouse/ClickHouse/issues/12513): difference expressions with same alias when query is reanalyzed. [#15886](https://github.com/ClickHouse/ClickHouse/pull/15886) ([Winter Zhang](https://github.com/zhang2014)).
* Fix possible very rare deadlocks in RBAC implementation. [#15875](https://github.com/ClickHouse/ClickHouse/pull/15875) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix exception `Block structure mismatch` in `SELECT ... ORDER BY DESC` queries which were executed after `ALTER MODIFY COLUMN` query. Fixes [#15800](https://github.com/ClickHouse/ClickHouse/issues/15800). [#15852](https://github.com/ClickHouse/ClickHouse/pull/15852) ([alesapin](https://github.com/alesapin)).
* `MaterializedMySQL` (experimental feature): Fix `select count()` inaccuracy. [#15767](https://github.com/ClickHouse/ClickHouse/pull/15767) ([tavplubix](https://github.com/tavplubix)).
* `MaterializeMySQL` (experimental feature): Fix `select count()` inaccuracy. [#15767](https://github.com/ClickHouse/ClickHouse/pull/15767) ([tavplubix](https://github.com/tavplubix)).
* Fix some cases of queries, in which only virtual columns are selected. Previously `Not found column _nothing in block` exception may be thrown. Fixes [#12298](https://github.com/ClickHouse/ClickHouse/issues/12298). [#15756](https://github.com/ClickHouse/ClickHouse/pull/15756) ([Anton Popov](https://github.com/CurtizJ)).
* Fix drop of materialized view with inner table in Atomic database (hangs all subsequent DROP TABLE due to hang of the worker thread, due to recursive DROP TABLE for inner table of MV). [#15743](https://github.com/ClickHouse/ClickHouse/pull/15743) ([Azat Khuzhin](https://github.com/azat)).
* Possibility to move part to another disk/volume if the first attempt was failed. [#15723](https://github.com/ClickHouse/ClickHouse/pull/15723) ([Pavel Kovalenko](https://github.com/Jokser)).
@@ -243,37 +243,37 @@
* Fix hang of queries with a lot of subqueries to same table of `MySQL` engine. Previously, if there were more than 16 subqueries to same `MySQL` table in query, it hang forever. [#15299](https://github.com/ClickHouse/ClickHouse/pull/15299) ([Anton Popov](https://github.com/CurtizJ)).
* Fix MSan report in QueryLog. Uninitialized memory can be used for the field `memory_usage`. [#15258](https://github.com/ClickHouse/ClickHouse/pull/15258) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix 'Unknown identifier' in GROUP BY when query has JOIN over Merge table. [#15242](https://github.com/ClickHouse/ClickHouse/pull/15242) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix instance crash when using `joinGet` with `LowCardinality` types. This fixes https://github.com/ClickHouse/ClickHouse/issues/15214. [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix instance crash when using `joinGet` with `LowCardinality` types. This fixes [#15214](https://github.com/ClickHouse/ClickHouse/issues/15214). [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix bug in table engine `Buffer` which doesn't allow to insert data of new structure into `Buffer` after `ALTER` query. Fixes [#15117](https://github.com/ClickHouse/ClickHouse/issues/15117). [#15192](https://github.com/ClickHouse/ClickHouse/pull/15192) ([alesapin](https://github.com/alesapin)).
* Adjust Decimal field size in MySQL column definition packet. [#15152](https://github.com/ClickHouse/ClickHouse/pull/15152) ([maqroll](https://github.com/maqroll)).
* Fixes `Data compressed with different methods` in `join_algorithm='auto'`. Keep LowCardinality as type for left table join key in `join_algorithm='partial_merge'`. [#15088](https://github.com/ClickHouse/ClickHouse/pull/15088) ([Artem Zuikov](https://github.com/4ertus2)).
* Update `jemalloc` to fix `percpu_arena` with affinity mask. [#15035](https://github.com/ClickHouse/ClickHouse/pull/15035) ([Azat Khuzhin](https://github.com/azat)). [#14957](https://github.com/ClickHouse/ClickHouse/pull/14957) ([Azat Khuzhin](https://github.com/azat)).
* We already use padded comparison between String and FixedString (https://github.com/ClickHouse/ClickHouse/blob/master/src/Functions/FunctionsComparison.h#L333). This PR applies the same logic to field comparison which corrects the usage of FixedString as primary keys. This fixes https://github.com/ClickHouse/ClickHouse/issues/14908. [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* We already use padded comparison between String and FixedString (https://github.com/ClickHouse/ClickHouse/blob/master/src/Functions/FunctionsComparison.h#L333). This PR applies the same logic to field comparison which corrects the usage of FixedString as primary keys. This fixes [#14908](https://github.com/ClickHouse/ClickHouse/issues/14908). [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* If function `bar` was called with specifically crafted arguments, buffer overflow was possible. This closes [#13926](https://github.com/ClickHouse/ClickHouse/issues/13926). [#15028](https://github.com/ClickHouse/ClickHouse/pull/15028) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `Cannot rename ... errno: 22, strerror: Invalid argument` error on DDL query execution in Atomic database when running clickhouse-server in Docker on Mac OS. [#15024](https://github.com/ClickHouse/ClickHouse/pull/15024) ([tavplubix](https://github.com/tavplubix)).
* Fix crash in RIGHT or FULL JOIN with join_algorithm='auto' when memory limit exceeded and we should change HashJoin with MergeJoin. [#15002](https://github.com/ClickHouse/ClickHouse/pull/15002) ([Artem Zuikov](https://github.com/4ertus2)).
* Now settings `number_of_free_entries_in_pool_to_execute_mutation` and `number_of_free_entries_in_pool_to_lower_max_size_of_merge` can be equal to `background_pool_size`. [#14975](https://github.com/ClickHouse/ClickHouse/pull/14975) ([alesapin](https://github.com/alesapin)).
* Fix to make predicate push down work when subquery contains `finalizeAggregation` function. Fixes [#14847](https://github.com/ClickHouse/ClickHouse/issues/14847). [#14937](https://github.com/ClickHouse/ClickHouse/pull/14937) ([filimonov](https://github.com/filimonov)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes https://github.com/ClickHouse/ClickHouse/issues/14923. [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* `MaterializedMySQL` (experimental feature): Fixed `.metadata.tmp File exists` error. [#14898](https://github.com/ClickHouse/ClickHouse/pull/14898) ([Winter Zhang](https://github.com/zhang2014)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes [#14923](https://github.com/ClickHouse/ClickHouse/issues/14923). [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* `MaterializeMySQL` (experimental feature): Fixed `.metadata.tmp File exists` error. [#14898](https://github.com/ClickHouse/ClickHouse/pull/14898) ([Winter Zhang](https://github.com/zhang2014)).
* Fix the issue when some invocations of `extractAllGroups` function may trigger "Memory limit exceeded" error. This fixes [#13383](https://github.com/ClickHouse/ClickHouse/issues/13383). [#14889](https://github.com/ClickHouse/ClickHouse/pull/14889) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix SIGSEGV for an attempt to INSERT into StorageFile with file descriptor. [#14887](https://github.com/ClickHouse/ClickHouse/pull/14887) ([Azat Khuzhin](https://github.com/azat)).
* Fixed segfault in `cache` dictionary [#14837](https://github.com/ClickHouse/ClickHouse/issues/14837). [#14879](https://github.com/ClickHouse/ClickHouse/pull/14879) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* `MaterializedMySQL` (experimental feature): Fixed bug in parsing MySQL binlog events, which causes `Attempt to read after eof` and `Packet payload is not fully read` in `MaterializeMySQL` database engine. [#14852](https://github.com/ClickHouse/ClickHouse/pull/14852) ([Winter Zhang](https://github.com/zhang2014)).
* `MaterializeMySQL` (experimental feature): Fixed bug in parsing MySQL binlog events, which causes `Attempt to read after eof` and `Packet payload is not fully read` in `MaterializeMySQL` database engine. [#14852](https://github.com/ClickHouse/ClickHouse/pull/14852) ([Winter Zhang](https://github.com/zhang2014)).
* Fix rare error in `SELECT` queries when the queried column has `DEFAULT` expression which depends on the other column which also has `DEFAULT` and not present in select query and not exists on disk. Partially fixes [#14531](https://github.com/ClickHouse/ClickHouse/issues/14531). [#14845](https://github.com/ClickHouse/ClickHouse/pull/14845) ([alesapin](https://github.com/alesapin)).
* Fix a problem where the server may get stuck on startup while talking to ZooKeeper, if the configuration files have to be fetched from ZK (using the `from_zk` include option). This fixes [#14814](https://github.com/ClickHouse/ClickHouse/issues/14814). [#14843](https://github.com/ClickHouse/ClickHouse/pull/14843) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix wrong monotonicity detection for shrunk `Int -> Int` cast of signed types. It might lead to incorrect query result. This bug is unveiled in [#14513](https://github.com/ClickHouse/ClickHouse/issues/14513). [#14783](https://github.com/ClickHouse/ClickHouse/pull/14783) ([Amos Bird](https://github.com/amosbird)).
* `Replace` column transformer should replace identifiers with cloned ASTs. This fixes https://github.com/ClickHouse/ClickHouse/issues/14695 . [#14734](https://github.com/ClickHouse/ClickHouse/pull/14734) ([Amos Bird](https://github.com/amosbird)).
* `Replace` column transformer should replace identifiers with cloned ASTs. This fixes [#14695](https://github.com/ClickHouse/ClickHouse/issues/14695) . [#14734](https://github.com/ClickHouse/ClickHouse/pull/14734) ([Amos Bird](https://github.com/amosbird)).
* Fixed missed default database name in metadata of materialized view when executing `ALTER ... MODIFY QUERY`. [#14664](https://github.com/ClickHouse/ClickHouse/pull/14664) ([tavplubix](https://github.com/tavplubix)).
* Fix bug when `ALTER UPDATE` mutation with `Nullable` column in assignment expression and constant value (like `UPDATE x = 42`) leads to incorrect value in column or segfault. Fixes [#13634](https://github.com/ClickHouse/ClickHouse/issues/13634), [#14045](https://github.com/ClickHouse/ClickHouse/issues/14045). [#14646](https://github.com/ClickHouse/ClickHouse/pull/14646) ([alesapin](https://github.com/alesapin)).
* Fix wrong Decimal multiplication result caused wrong decimal scale of result column. [#14603](https://github.com/ClickHouse/ClickHouse/pull/14603) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix function `has` with `LowCardinality` of `Nullable`. [#14591](https://github.com/ClickHouse/ClickHouse/pull/14591) ([Mike](https://github.com/myrrc)).
* Cleanup data directory after Zookeeper exceptions during CreateQuery for StorageReplicatedMergeTree Engine. [#14563](https://github.com/ClickHouse/ClickHouse/pull/14563) ([Bharat Nallan](https://github.com/bharatnc)).
* Fix rare segfaults in functions with combinator `-Resample`, which could appear in result of overflow with very large parameters. [#14562](https://github.com/ClickHouse/ClickHouse/pull/14562) ([Anton Popov](https://github.com/CurtizJ)).
* Fix a bug when converting `Nullable(String)` to Enum. Introduced by https://github.com/ClickHouse/ClickHouse/pull/12745. This fixes https://github.com/ClickHouse/ClickHouse/issues/14435. [#14530](https://github.com/ClickHouse/ClickHouse/pull/14530) ([Amos Bird](https://github.com/amosbird)).
* Fix a bug when converting `Nullable(String)` to Enum. Introduced by [#12745](https://github.com/ClickHouse/ClickHouse/pull/12745). This fixes [#14435](https://github.com/ClickHouse/ClickHouse/issues/14435). [#14530](https://github.com/ClickHouse/ClickHouse/pull/14530) ([Amos Bird](https://github.com/amosbird)).
* Fixed the incorrect sorting order of `Nullable` column. This fixes [#14344](https://github.com/ClickHouse/ClickHouse/issues/14344). [#14495](https://github.com/ClickHouse/ClickHouse/pull/14495) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix `currentDatabase()` function cannot be used in `ON CLUSTER` ddl query. [#14211](https://github.com/ClickHouse/ClickHouse/pull/14211) ([Winter Zhang](https://github.com/zhang2014)).
* `MaterializedMySQL` (experimental feature): Fixed `Packet payload is not fully read` error in `MaterializeMySQL` database engine. [#14696](https://github.com/ClickHouse/ClickHouse/pull/14696) ([BohuTANG](https://github.com/BohuTANG)).
* `MaterializeMySQL` (experimental feature): Fixed `Packet payload is not fully read` error in `MaterializeMySQL` database engine. [#14696](https://github.com/ClickHouse/ClickHouse/pull/14696) ([BohuTANG](https://github.com/BohuTANG)).
#### Improvement
@@ -308,7 +308,7 @@
* Add an option to skip access checks for `DiskS3`. `s3` disk is an experimental feature. [#14497](https://github.com/ClickHouse/ClickHouse/pull/14497) ([Pavel Kovalenko](https://github.com/Jokser)).
* Speed up server shutdown process if there are ongoing S3 requests. [#14496](https://github.com/ClickHouse/ClickHouse/pull/14496) ([Pavel Kovalenko](https://github.com/Jokser)).
* `SYSTEM RELOAD CONFIG` now throws an exception if failed to reload and continues using the previous users.xml. The background periodic reloading also continues using the previous users.xml if failed to reload. [#14492](https://github.com/ClickHouse/ClickHouse/pull/14492) ([Vitaly Baranov](https://github.com/vitlibar)).
* For INSERTs with inline data in VALUES format in the script mode of `clickhouse-client`, support semicolon as the data terminator, in addition to the new line. Closes https://github.com/ClickHouse/ClickHouse/issues/12288. [#13192](https://github.com/ClickHouse/ClickHouse/pull/13192) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* For INSERTs with inline data in VALUES format in the script mode of `clickhouse-client`, support semicolon as the data terminator, in addition to the new line. Closes [#12288](https://github.com/ClickHouse/ClickHouse/issues/12288). [#13192](https://github.com/ClickHouse/ClickHouse/pull/13192) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Support custom codecs in compact parts. [#12183](https://github.com/ClickHouse/ClickHouse/pull/12183) ([Anton Popov](https://github.com/CurtizJ)).
#### Performance Improvement
@@ -320,7 +320,7 @@
* Improve performance of 256-bit types using (u)int64_t as base type for wide integers. Original wide integers use 8-bit types as base. [#14859](https://github.com/ClickHouse/ClickHouse/pull/14859) ([Artem Zuikov](https://github.com/4ertus2)).
* Explicitly use a temporary disk to store vertical merge temporary data. [#15639](https://github.com/ClickHouse/ClickHouse/pull/15639) ([Grigory Pervakov](https://github.com/GrigoryPervakov)).
* Use one S3 DeleteObjects request instead of multiple DeleteObject in a loop. No any functionality changes, so covered by existing tests like integration/test_log_family_s3. [#15238](https://github.com/ClickHouse/ClickHouse/pull/15238) ([ianton-ru](https://github.com/ianton-ru)).
* Fix `DateTime <op> DateTime` mistakenly choosing the slow generic implementation. This fixes https://github.com/ClickHouse/ClickHouse/issues/15153. [#15178](https://github.com/ClickHouse/ClickHouse/pull/15178) ([Amos Bird](https://github.com/amosbird)).
* Fix `DateTime <op> DateTime` mistakenly choosing the slow generic implementation. This fixes [#15153](https://github.com/ClickHouse/ClickHouse/issues/15153). [#15178](https://github.com/ClickHouse/ClickHouse/pull/15178) ([Amos Bird](https://github.com/amosbird)).
* Improve performance of GROUP BY key of type `FixedString`. [#15034](https://github.com/ClickHouse/ClickHouse/pull/15034) ([Amos Bird](https://github.com/amosbird)).
* Only `mlock` code segment when starting clickhouse-server. In previous versions, all mapped regions were locked in memory, including debug info. Debug info is usually splitted to a separate file but if it isn't, it led to +2..3 GiB memory usage. [#14929](https://github.com/ClickHouse/ClickHouse/pull/14929) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* ClickHouse binary become smaller due to link time optimization.
@@ -387,7 +387,7 @@
* Allow to use direct layout for dictionaries with complex keys. [#16007](https://github.com/ClickHouse/ClickHouse/pull/16007) ([Anton Popov](https://github.com/CurtizJ)).
* Prevent replica hang for 5-10 mins when replication error happens after a period of inactivity. [#15987](https://github.com/ClickHouse/ClickHouse/pull/15987) ([filimonov](https://github.com/filimonov)).
* Fix rare segfaults when inserting into or selecting from MaterializedView and concurrently dropping target table (for Atomic database engine). [#15984](https://github.com/ClickHouse/ClickHouse/pull/15984) ([tavplubix](https://github.com/tavplubix)).
* Fix ambiguity in parsing of settings profiles: `CREATE USER ... SETTINGS profile readonly` is now considered as using a profile named `readonly`, not a setting named `profile` with the readonly constraint. This fixes https://github.com/ClickHouse/ClickHouse/issues/15628. [#15982](https://github.com/ClickHouse/ClickHouse/pull/15982) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix ambiguity in parsing of settings profiles: `CREATE USER ... SETTINGS profile readonly` is now considered as using a profile named `readonly`, not a setting named `profile` with the readonly constraint. This fixes [#15628](https://github.com/ClickHouse/ClickHouse/issues/15628). [#15982](https://github.com/ClickHouse/ClickHouse/pull/15982) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix a crash when database creation fails. [#15954](https://github.com/ClickHouse/ClickHouse/pull/15954) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `DROP TABLE IF EXISTS` failure with `Table ... doesn't exist` error when table is concurrently renamed (for Atomic database engine). Fixed rare deadlock when concurrently executing some DDL queries with multiple tables (like `DROP DATABASE` and `RENAME TABLE`) Fixed `DROP/DETACH DATABASE` failure with `Table ... doesn't exist` when concurrently executing `DROP/DETACH TABLE`. [#15934](https://github.com/ClickHouse/ClickHouse/pull/15934) ([tavplubix](https://github.com/tavplubix)).
* Fix incorrect empty result for query from `Distributed` table if query has `WHERE`, `PREWHERE` and `GLOBAL IN`. Fixes [#15792](https://github.com/ClickHouse/ClickHouse/issues/15792). [#15933](https://github.com/ClickHouse/ClickHouse/pull/15933) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@@ -398,7 +398,7 @@
* Fixed too low default value of `max_replicated_logs_to_keep` setting, which might cause replicas to become lost too often. Improve lost replica recovery process by choosing the most up-to-date replica to clone. Also do not remove old parts from lost replica, detach them instead. [#15701](https://github.com/ClickHouse/ClickHouse/pull/15701) ([tavplubix](https://github.com/tavplubix)).
* Fix error `Cannot add simple transform to empty Pipe` which happened while reading from `Buffer` table which has different structure than destination table. It was possible if destination table returned empty result for query. Fixes [#15529](https://github.com/ClickHouse/ClickHouse/issues/15529). [#15662](https://github.com/ClickHouse/ClickHouse/pull/15662) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed bug with globs in S3 table function, region from URL was not applied to S3 client configuration. [#15646](https://github.com/ClickHouse/ClickHouse/pull/15646) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Decrement the `ReadonlyReplica` metric when detaching read-only tables. This fixes https://github.com/ClickHouse/ClickHouse/issues/15598. [#15592](https://github.com/ClickHouse/ClickHouse/pull/15592) ([sundyli](https://github.com/sundy-li)).
* Decrement the `ReadonlyReplica` metric when detaching read-only tables. This fixes [#15598](https://github.com/ClickHouse/ClickHouse/issues/15598). [#15592](https://github.com/ClickHouse/ClickHouse/pull/15592) ([sundyli](https://github.com/sundy-li)).
* Throw an error when a single parameter is passed to ReplicatedMergeTree instead of ignoring it. [#15516](https://github.com/ClickHouse/ClickHouse/pull/15516) ([nvartolomei](https://github.com/nvartolomei)).
#### Improvement
@@ -422,11 +422,11 @@
* Fix `Missing columns` errors when selecting columns which absent in data, but depend on other columns which also absent in data. Fixes [#15530](https://github.com/ClickHouse/ClickHouse/issues/15530). [#15532](https://github.com/ClickHouse/ClickHouse/pull/15532) ([alesapin](https://github.com/alesapin)).
* Fix bug with event subscription in DDLWorker which rarely may lead to query hangs in `ON CLUSTER`. Introduced in [#13450](https://github.com/ClickHouse/ClickHouse/issues/13450). [#15477](https://github.com/ClickHouse/ClickHouse/pull/15477) ([alesapin](https://github.com/alesapin)).
* Report proper error when the second argument of `boundingRatio` aggregate function has a wrong type. [#15407](https://github.com/ClickHouse/ClickHouse/pull/15407) ([detailyang](https://github.com/detailyang)).
* Fix bug where queries like SELECT toStartOfDay(today()) fail complaining about empty time_zone argument. [#15319](https://github.com/ClickHouse/ClickHouse/pull/15319) ([Bharat Nallan](https://github.com/bharatnc)).
* Fix bug where queries like `SELECT toStartOfDay(today())` fail complaining about empty time_zone argument. [#15319](https://github.com/ClickHouse/ClickHouse/pull/15319) ([Bharat Nallan](https://github.com/bharatnc)).
* Fix race condition during MergeTree table rename and background cleanup. [#15304](https://github.com/ClickHouse/ClickHouse/pull/15304) ([alesapin](https://github.com/alesapin)).
* Fix rare race condition on server startup when system.logs are enabled. [#15300](https://github.com/ClickHouse/ClickHouse/pull/15300) ([alesapin](https://github.com/alesapin)).
* Fix MSan report in QueryLog. Uninitialized memory can be used for the field `memory_usage`. [#15258](https://github.com/ClickHouse/ClickHouse/pull/15258) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix instance crash when using joinGet with LowCardinality types. This fixes https://github.com/ClickHouse/ClickHouse/issues/15214. [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix instance crash when using joinGet with LowCardinality types. This fixes [#15214](https://github.com/ClickHouse/ClickHouse/issues/15214). [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix bug in table engine `Buffer` which prevents inserting data of new structure into `Buffer` after `ALTER` query. Fixes [#15117](https://github.com/ClickHouse/ClickHouse/issues/15117). [#15192](https://github.com/ClickHouse/ClickHouse/pull/15192) ([alesapin](https://github.com/alesapin)).
* Adjust decimals field size in mysql column definition packet. [#15152](https://github.com/ClickHouse/ClickHouse/pull/15152) ([maqroll](https://github.com/maqroll)).
* Fixed `Cannot rename ... errno: 22, strerror: Invalid argument` error on DDL query execution in Atomic database when running clickhouse-server in docker on Mac OS. [#15024](https://github.com/ClickHouse/ClickHouse/pull/15024) ([tavplubix](https://github.com/tavplubix)).
@ -455,10 +455,10 @@
* Fix bug when `ALTER UPDATE` mutation with Nullable column in assignment expression and constant value (like `UPDATE x = 42`) leads to incorrect value in column or segfault. Fixes [#13634](https://github.com/ClickHouse/ClickHouse/issues/13634), [#14045](https://github.com/ClickHouse/ClickHouse/issues/14045). [#14646](https://github.com/ClickHouse/ClickHouse/pull/14646) ([alesapin](https://github.com/alesapin)).
* Fix wrong Decimal multiplication result caused by wrong decimal scale of result column. [#14603](https://github.com/ClickHouse/ClickHouse/pull/14603) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed the incorrect sorting order of `Nullable` column. This fixes [#14344](https://github.com/ClickHouse/ClickHouse/issues/14344). [#14495](https://github.com/ClickHouse/ClickHouse/pull/14495) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed inconsistent comparison with primary key of type `FixedString` on index analysis if they're compared with a string of smaller size. This fixes https://github.com/ClickHouse/ClickHouse/issues/14908. [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* Fixed inconsistent comparison with primary key of type `FixedString` on index analysis if they're compared with a string of smaller size. This fixes [#14908](https://github.com/ClickHouse/ClickHouse/issues/14908). [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* Fix bug which leads to wrong merges assignment if table has partitions with a single part. [#14444](https://github.com/ClickHouse/ClickHouse/pull/14444) ([alesapin](https://github.com/alesapin)).
* If function `bar` was called with specifically crafted arguments, buffer overflow was possible. This closes [#13926](https://github.com/ClickHouse/ClickHouse/issues/13926). [#15028](https://github.com/ClickHouse/ClickHouse/pull/15028) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes https://github.com/ClickHouse/ClickHouse/issues/14923. [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes [#14923](https://github.com/ClickHouse/ClickHouse/issues/14923). [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fixed `.metadata.tmp File exists` error when using `MaterializeMySQL` database engine. [#14898](https://github.com/ClickHouse/ClickHouse/pull/14898) ([Winter Zhang](https://github.com/zhang2014)).
* Fix the issue when some invocations of `extractAllGroups` function may trigger "Memory limit exceeded" error. This fixes [#13383](https://github.com/ClickHouse/ClickHouse/issues/13383). [#14889](https://github.com/ClickHouse/ClickHouse/pull/14889) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix SIGSEGV for an attempt to INSERT into StorageFile(fd). [#14887](https://github.com/ClickHouse/ClickHouse/pull/14887) ([Azat Khuzhin](https://github.com/azat)).
@ -501,7 +501,7 @@
#### Performance Improvement
* Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed with GROUP BY sharding_key (under optimize_skip_unused_shards and optimize_distributed_group_by_sharding_key). [#10373](https://github.com/ClickHouse/ClickHouse/pull/10373) ([Azat Khuzhin](https://github.com/azat)).
* Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed with GROUP BY sharding_key (under `optimize_skip_unused_shards` and `optimize_distributed_group_by_sharding_key`). [#10373](https://github.com/ClickHouse/ClickHouse/pull/10373) ([Azat Khuzhin](https://github.com/azat)).
* Creating sets for multiple `JOIN` and `IN` in parallel. It may slightly improve performance for queries with several different `IN subquery` expressions. [#14412](https://github.com/ClickHouse/ClickHouse/pull/14412) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Improve Kafka engine performance by providing independent thread for each consumer. Separate thread pool for streaming engines (like Kafka). [#13939](https://github.com/ClickHouse/ClickHouse/pull/13939) ([fastio](https://github.com/fastio)).
@ -579,15 +579,15 @@
* Fix race condition during MergeTree table rename and background cleanup. [#15304](https://github.com/ClickHouse/ClickHouse/pull/15304) ([alesapin](https://github.com/alesapin)).
* Fix rare race condition on server startup when system.logs are enabled. [#15300](https://github.com/ClickHouse/ClickHouse/pull/15300) ([alesapin](https://github.com/alesapin)).
* Fix MSan report in QueryLog. Uninitialized memory can be used for the field `memory_usage`. [#15258](https://github.com/ClickHouse/ClickHouse/pull/15258) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix instance crash when using joinGet with LowCardinality types. This fixes https://github.com/ClickHouse/ClickHouse/issues/15214. [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix instance crash when using joinGet with LowCardinality types. This fixes [#15214](https://github.com/ClickHouse/ClickHouse/issues/15214). [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix bug in table engine `Buffer` which prevents inserting data of new structure into `Buffer` after `ALTER` query. Fixes [#15117](https://github.com/ClickHouse/ClickHouse/issues/15117). [#15192](https://github.com/ClickHouse/ClickHouse/pull/15192) ([alesapin](https://github.com/alesapin)).
* Adjust decimals field size in mysql column definition packet. [#15152](https://github.com/ClickHouse/ClickHouse/pull/15152) ([maqroll](https://github.com/maqroll)).
* We already use padded comparison between String and FixedString (https://github.com/ClickHouse/ClickHouse/blob/master/src/Functions/FunctionsComparison.h#L333). This PR applies the same logic to field comparison which corrects the usage of FixedString as primary keys. This fixes https://github.com/ClickHouse/ClickHouse/issues/14908. [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* If function `bar` was called with specifically crafter arguments, buffer overflow was possible. This closes [#13926](https://github.com/ClickHouse/ClickHouse/issues/13926). [#15028](https://github.com/ClickHouse/ClickHouse/pull/15028) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* We already use padded comparison between String and FixedString (https://github.com/ClickHouse/ClickHouse/blob/master/src/Functions/FunctionsComparison.h#L333). This PR applies the same logic to field comparison which corrects the usage of FixedString as primary keys. This fixes [#14908](https://github.com/ClickHouse/ClickHouse/issues/14908). [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* If function `bar` was called with specifically crafted arguments, buffer overflow was possible. This closes [#13926](https://github.com/ClickHouse/ClickHouse/issues/13926). [#15028](https://github.com/ClickHouse/ClickHouse/pull/15028) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `Cannot rename ... errno: 22, strerror: Invalid argument` error on DDL query execution in Atomic database when running clickhouse-server in docker on Mac OS. [#15024](https://github.com/ClickHouse/ClickHouse/pull/15024) ([tavplubix](https://github.com/tavplubix)).
* Now settings `number_of_free_entries_in_pool_to_execute_mutation` and `number_of_free_entries_in_pool_to_lower_max_size_of_merge` can be equal to `background_pool_size`. [#14975](https://github.com/ClickHouse/ClickHouse/pull/14975) ([alesapin](https://github.com/alesapin)).
* Fix to make predicate push down work when subquery contains finalizeAggregation function. Fixes [#14847](https://github.com/ClickHouse/ClickHouse/issues/14847). [#14937](https://github.com/ClickHouse/ClickHouse/pull/14937) ([filimonov](https://github.com/filimonov)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes https://github.com/ClickHouse/ClickHouse/issues/14923. [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes [#14923](https://github.com/ClickHouse/ClickHouse/issues/14923). [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fixed `.metadata.tmp File exists` error when using `MaterializeMySQL` database engine. [#14898](https://github.com/ClickHouse/ClickHouse/pull/14898) ([Winter Zhang](https://github.com/zhang2014)).
* Fix a problem where the server may get stuck on startup while talking to ZooKeeper, if the configuration files have to be fetched from ZK (using the `from_zk` include option). This fixes [#14814](https://github.com/ClickHouse/ClickHouse/issues/14814). [#14843](https://github.com/ClickHouse/ClickHouse/pull/14843) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix wrong monotonicity detection for shrunk `Int -> Int` cast of signed types. It might lead to incorrect query result. This bug is unveiled in [#14513](https://github.com/ClickHouse/ClickHouse/issues/14513). [#14783](https://github.com/ClickHouse/ClickHouse/pull/14783) ([Amos Bird](https://github.com/amosbird)).
@ -647,16 +647,16 @@
* Fix visible data clobbering by progress bar in client in interactive mode. This fixes [#12562](https://github.com/ClickHouse/ClickHouse/issues/12562) and [#13369](https://github.com/ClickHouse/ClickHouse/issues/13369) and [#13584](https://github.com/ClickHouse/ClickHouse/issues/13584) and fixes [#12964](https://github.com/ClickHouse/ClickHouse/issues/12964). [#13691](https://github.com/ClickHouse/ClickHouse/pull/13691) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed incorrect sorting order of `LowCardinality` column when sorting by multiple columns. This fixes [#13958](https://github.com/ClickHouse/ClickHouse/issues/13958). [#14223](https://github.com/ClickHouse/ClickHouse/pull/14223) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Check for array size overflow in `topK` aggregate function. Without this check the user may send a query with carefully crafter parameters that will lead to server crash. This closes [#14452](https://github.com/ClickHouse/ClickHouse/issues/14452). [#14467](https://github.com/ClickHouse/ClickHouse/pull/14467) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Check for array size overflow in `topK` aggregate function. Without this check the user may send a query with carefully crafted parameters that will lead to server crash. This closes [#14452](https://github.com/ClickHouse/ClickHouse/issues/14452). [#14467](https://github.com/ClickHouse/ClickHouse/pull/14467) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix bug which can lead to wrong merges assignment if table has partitions with a single part. [#14444](https://github.com/ClickHouse/ClickHouse/pull/14444) ([alesapin](https://github.com/alesapin)).
* Stop query execution if an exception happened in `PipelineExecutor` itself. This could prevent a rare possible query hang. Continuation of [#14334](https://github.com/ClickHouse/ClickHouse/issues/14334). [#14402](https://github.com/ClickHouse/ClickHouse/pull/14402) [#14334](https://github.com/ClickHouse/ClickHouse/pull/14334) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix crash during `ALTER` query for table which was created `AS table_function`. Fixes [#14212](https://github.com/ClickHouse/ClickHouse/issues/14212). [#14326](https://github.com/ClickHouse/ClickHouse/pull/14326) ([alesapin](https://github.com/alesapin)).
* Fix exception during ALTER LIVE VIEW query with REFRESH command. Live view is an experimental feature. [#14320](https://github.com/ClickHouse/ClickHouse/pull/14320) ([Bharat Nallan](https://github.com/bharatnc)).
* Fix QueryPlan lifetime (for EXPLAIN PIPELINE graph=1) for queries with nested interpreter. [#14315](https://github.com/ClickHouse/ClickHouse/pull/14315) ([Azat Khuzhin](https://github.com/azat)).
* Fix segfault in `clickhouse-odbc-bridge` during schema fetch from some external sources. This PR fixes https://github.com/ClickHouse/ClickHouse/issues/13861. [#14267](https://github.com/ClickHouse/ClickHouse/pull/14267) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash in mark inclusion search introduced in https://github.com/ClickHouse/ClickHouse/pull/12277. [#14225](https://github.com/ClickHouse/ClickHouse/pull/14225) ([Amos Bird](https://github.com/amosbird)).
* Fix segfault in `clickhouse-odbc-bridge` during schema fetch from some external sources. This PR fixes [#13861](https://github.com/ClickHouse/ClickHouse/issues/13861). [#14267](https://github.com/ClickHouse/ClickHouse/pull/14267) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash in mark inclusion search introduced in [#12277](https://github.com/ClickHouse/ClickHouse/pull/12277). [#14225](https://github.com/ClickHouse/ClickHouse/pull/14225) ([Amos Bird](https://github.com/amosbird)).
* Fix creation of tables with named tuples. This fixes [#13027](https://github.com/ClickHouse/ClickHouse/issues/13027). [#14143](https://github.com/ClickHouse/ClickHouse/pull/14143) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix formatting of minimal negative decimal numbers. This fixes https://github.com/ClickHouse/ClickHouse/issues/14111. [#14119](https://github.com/ClickHouse/ClickHouse/pull/14119) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix formatting of minimal negative decimal numbers. This fixes [#14111](https://github.com/ClickHouse/ClickHouse/issues/14111). [#14119](https://github.com/ClickHouse/ClickHouse/pull/14119) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix `DistributedFilesToInsert` metric (zeroed when it should not). [#14095](https://github.com/ClickHouse/ClickHouse/pull/14095) ([Azat Khuzhin](https://github.com/azat)).
* Fix `pointInPolygon` with const 2d array as polygon. [#14079](https://github.com/ClickHouse/ClickHouse/pull/14079) ([Alexey Ilyukhov](https://github.com/livace)).
* Fixed wrong mount point in extra info for `Poco::Exception: no space left on device`. [#14050](https://github.com/ClickHouse/ClickHouse/pull/14050) ([tavplubix](https://github.com/tavplubix)).
@ -685,10 +685,10 @@
* Fix wrong code in function `netloc`. This fixes [#13335](https://github.com/ClickHouse/ClickHouse/issues/13335). [#13446](https://github.com/ClickHouse/ClickHouse/pull/13446) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible race in `StorageMemory`. [#13416](https://github.com/ClickHouse/ClickHouse/pull/13416) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix missing or excessive headers in `TSV/CSVWithNames` formats in HTTP protocol. This fixes [#12504](https://github.com/ClickHouse/ClickHouse/issues/12504). [#13343](https://github.com/ClickHouse/ClickHouse/pull/13343) ([Azat Khuzhin](https://github.com/azat)).
* Fix parsing row policies from users.xml when names of databases or tables contain dots. This fixes https://github.com/ClickHouse/ClickHouse/issues/5779, https://github.com/ClickHouse/ClickHouse/issues/12527. [#13199](https://github.com/ClickHouse/ClickHouse/pull/13199) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix parsing row policies from users.xml when names of databases or tables contain dots. This fixes [#5779](https://github.com/ClickHouse/ClickHouse/issues/5779), [#12527](https://github.com/ClickHouse/ClickHouse/issues/12527). [#13199](https://github.com/ClickHouse/ClickHouse/pull/13199) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix access to `redis` dictionary after connection was dropped once. It may happen with `cache` and `direct` dictionary layouts. [#13082](https://github.com/ClickHouse/ClickHouse/pull/13082) ([Anton Popov](https://github.com/CurtizJ)).
* Removed wrong auth access check when using ClickHouseDictionarySource to query remote tables. [#12756](https://github.com/ClickHouse/ClickHouse/pull/12756) ([sundyli](https://github.com/sundy-li)).
* Properly distinguish subqueries in some cases for common subexpression elimination. https://github.com/ClickHouse/ClickHouse/issues/8333. [#8367](https://github.com/ClickHouse/ClickHouse/pull/8367) ([Amos Bird](https://github.com/amosbird)).
* Properly distinguish subqueries in some cases for common subexpression elimination. [#8333](https://github.com/ClickHouse/ClickHouse/issues/8333). [#8367](https://github.com/ClickHouse/ClickHouse/pull/8367) ([Amos Bird](https://github.com/amosbird)).
#### Improvement
@ -756,7 +756,7 @@
* Updating LDAP user authentication suite to check that it works with RBAC. [#13656](https://github.com/ClickHouse/ClickHouse/pull/13656) ([vzakaznikov](https://github.com/vzakaznikov)).
* Removed `-DENABLE_CURL_CLIENT` for `contrib/aws`. [#13628](https://github.com/ClickHouse/ClickHouse/pull/13628) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Increasing health-check timeouts for ClickHouse nodes and adding support to dump docker-compose logs if unhealthy containers found. [#13612](https://github.com/ClickHouse/ClickHouse/pull/13612) ([vzakaznikov](https://github.com/vzakaznikov)).
* Make sure https://github.com/ClickHouse/ClickHouse/issues/10977 is invalid. [#13539](https://github.com/ClickHouse/ClickHouse/pull/13539) ([Amos Bird](https://github.com/amosbird)).
* Make sure [#10977](https://github.com/ClickHouse/ClickHouse/issues/10977) is invalid. [#13539](https://github.com/ClickHouse/ClickHouse/pull/13539) ([Amos Bird](https://github.com/amosbird)).
* Skip PRs from robot-clickhouse. [#13489](https://github.com/ClickHouse/ClickHouse/pull/13489) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Move Dockerfiles from integration tests to `docker/test` directory. docker_compose files are available in `runner` docker container. Docker images are built in CI and not in integration tests. [#13448](https://github.com/ClickHouse/ClickHouse/pull/13448) ([Ilya Yatsishin](https://github.com/qoega)).
@ -788,7 +788,7 @@
* Add `FROM_UNIXTIME` function for compatibility with MySQL, related to [#12149](https://github.com/ClickHouse/ClickHouse/issues/12149). [#12484](https://github.com/ClickHouse/ClickHouse/pull/12484) ([flynn](https://github.com/ucasFL)).
* Allow Nullable types as keys in MergeTree tables if `allow_nullable_key` table setting is enabled. Closes [#5319](https://github.com/ClickHouse/ClickHouse/issues/5319). [#12433](https://github.com/ClickHouse/ClickHouse/pull/12433) ([Amos Bird](https://github.com/amosbird)).
* Integration with [COS](https://intl.cloud.tencent.com/product/cos). [#12386](https://github.com/ClickHouse/ClickHouse/pull/12386) ([fastio](https://github.com/fastio)).
* Add mapAdd and mapSubtract functions for adding/subtracting key-mapped values. [#11735](https://github.com/ClickHouse/ClickHouse/pull/11735) ([Ildus Kurbangaliev](https://github.com/ildus)).
* Add `mapAdd` and `mapSubtract` functions for adding/subtracting key-mapped values. [#11735](https://github.com/ClickHouse/ClickHouse/pull/11735) ([Ildus Kurbangaliev](https://github.com/ildus)).
#### Bug Fix
@ -1071,7 +1071,7 @@
* Improved performance of 'ORDER BY' and 'GROUP BY' by prefix of sorting key (enabled with `optimize_aggregation_in_order` setting, disabled by default). [#11696](https://github.com/ClickHouse/ClickHouse/pull/11696) ([Anton Popov](https://github.com/CurtizJ)).
* Removed injective functions inside `uniq*()` if `set optimize_injective_functions_inside_uniq=1`. [#12337](https://github.com/ClickHouse/ClickHouse/pull/12337) ([Ruslan Kamalov](https://github.com/kamalov-ruslan)).
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
* Index not used for IN operator with literals, performance regression introduced around v19.3. This fixes [#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
* Implemented single part uploads for DiskS3 (experimental feature). [#12026](https://github.com/ClickHouse/ClickHouse/pull/12026) ([Vladimir Chebotarev](https://github.com/excitoon)).
#### Experimental Feature
@ -1133,7 +1133,7 @@
#### Performance Improvement
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
* Index not used for IN operator with literals, performance regression introduced around v19.3. This fixes [#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
#### Build/Testing/Packaging Improvement
@ -1213,7 +1213,7 @@
* Fix wrong result of comparison of FixedString with constant String. This fixes [#11393](https://github.com/ClickHouse/ClickHouse/issues/11393). This bug appeared in version 20.4. [#11828](https://github.com/ClickHouse/ClickHouse/pull/11828) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix wrong result for `if` with NULLs in condition. [#11807](https://github.com/ClickHouse/ClickHouse/pull/11807) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix using too many threads for queries. [#11788](https://github.com/ClickHouse/ClickHouse/pull/11788) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed `Scalar doesn't exist` exception when using `WITH <scalar subquery> ...` in `SELECT ... FROM merge_tree_table ...` https://github.com/ClickHouse/ClickHouse/issues/11621. [#11767](https://github.com/ClickHouse/ClickHouse/pull/11767) ([Amos Bird](https://github.com/amosbird)).
* Fixed `Scalar doesn't exist` exception when using `WITH <scalar subquery> ...` in `SELECT ... FROM merge_tree_table ...` [#11621](https://github.com/ClickHouse/ClickHouse/issues/11621). [#11767](https://github.com/ClickHouse/ClickHouse/pull/11767) ([Amos Bird](https://github.com/amosbird)).
* Fix unexpected behaviour of queries like `SELECT *, xyz.*` which succeeded while an error was expected. [#11753](https://github.com/ClickHouse/ClickHouse/pull/11753) ([hexiaoting](https://github.com/hexiaoting)).
* Now replicated fetches will be cancelled during metadata alter. [#11744](https://github.com/ClickHouse/ClickHouse/pull/11744) ([alesapin](https://github.com/alesapin)).
* Parse metadata stored in zookeeper before checking for equality. [#11739](https://github.com/ClickHouse/ClickHouse/pull/11739) ([Azat Khuzhin](https://github.com/azat)).
@ -1264,8 +1264,8 @@
* Fix potential uninitialized memory in conversion. Example: `SELECT toIntervalSecond(now64())`. [#11311](https://github.com/ClickHouse/ClickHouse/pull/11311) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix the issue when index analysis cannot work if a table has Array column in primary key and if a query is filtering by this column with `empty` or `notEmpty` functions. This fixes [#11286](https://github.com/ClickHouse/ClickHouse/issues/11286). [#11303](https://github.com/ClickHouse/ClickHouse/pull/11303) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix bug when query speed estimation can be incorrect and the limit of `min_execution_speed` may not work or work incorrectly if the query is throttled by `max_network_bandwidth`, `max_execution_speed` or `priority` settings. Change the default value of `timeout_before_checking_execution_speed` to non-zero, because otherwise the settings `min_execution_speed` and `max_execution_speed` have no effect. This fixes [#11297](https://github.com/ClickHouse/ClickHouse/issues/11297). This fixes [#5732](https://github.com/ClickHouse/ClickHouse/issues/5732). This fixes [#6228](https://github.com/ClickHouse/ClickHouse/issues/6228). Usability improvement: avoid concatenation of exception message with progress bar in `clickhouse-client`. [#11296](https://github.com/ClickHouse/ClickHouse/pull/11296) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash when `SET DEFAULT ROLE` is called with wrong arguments. This fixes https://github.com/ClickHouse/ClickHouse/issues/10586. [#11278](https://github.com/ClickHouse/ClickHouse/pull/11278) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash while reading malformed data in `Protobuf` format. This fixes https://github.com/ClickHouse/ClickHouse/issues/5957, fixes https://github.com/ClickHouse/ClickHouse/issues/11203. [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash when `SET DEFAULT ROLE` is called with wrong arguments. This fixes [#10586](https://github.com/ClickHouse/ClickHouse/issues/10586). [#11278](https://github.com/ClickHouse/ClickHouse/pull/11278) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash while reading malformed data in `Protobuf` format. This fixes [#5957](https://github.com/ClickHouse/ClickHouse/issues/5957), fixes [#11203](https://github.com/ClickHouse/ClickHouse/issues/11203). [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed a bug when `cache` dictionary could return default value instead of normal (when there are only expired keys). This affects only string fields. [#11233](https://github.com/ClickHouse/ClickHouse/pull/11233) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix error `Block structure mismatch in QueryPipeline` while reading from `VIEW` with constants in inner query. Fixes [#11181](https://github.com/ClickHouse/ClickHouse/issues/11181). [#11205](https://github.com/ClickHouse/ClickHouse/pull/11205) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix possible exception `Invalid status for associated output`. [#11200](https://github.com/ClickHouse/ClickHouse/pull/11200) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -1331,7 +1331,7 @@
* Fix error `the BloomFilter false positive must be a double number between 0 and 1` [#10551](https://github.com/ClickHouse/ClickHouse/issues/10551). [#10569](https://github.com/ClickHouse/ClickHouse/pull/10569) ([Winter Zhang](https://github.com/zhang2014)).
* Fix SELECT of column ALIAS whose default expression type differs from the column type. [#10563](https://github.com/ClickHouse/ClickHouse/pull/10563) ([Azat Khuzhin](https://github.com/azat)).
* Implemented comparison between DateTime64 and String values (just like for DateTime). [#10560](https://github.com/ClickHouse/ClickHouse/pull/10560) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix index corruption, which may accur in some cases after merge compact parts into another compact part. [#10531](https://github.com/ClickHouse/ClickHouse/pull/10531) ([Anton Popov](https://github.com/CurtizJ)).
* Fix index corruption, which may occur in some cases after merge compact parts into another compact part. [#10531](https://github.com/ClickHouse/ClickHouse/pull/10531) ([Anton Popov](https://github.com/CurtizJ)).
* Disable GROUP BY sharding_key optimization by default (`optimize_distributed_group_by_sharding_key` had been introduced and turned off by default, due to trickery of sharding_key analyzing, simple example is `if` in sharding key) and fix it for WITH ROLLUP/CUBE/TOTALS. [#10516](https://github.com/ClickHouse/ClickHouse/pull/10516) ([Azat Khuzhin](https://github.com/azat)).
* Fixes: [#10263](https://github.com/ClickHouse/ClickHouse/issues/10263) (after that PR dist send via INSERT had been postponing on each INSERT) Fixes: [#8756](https://github.com/ClickHouse/ClickHouse/issues/8756) (that PR breaks distributed sends with all of the following conditions met (unlikely setup for now I guess): `internal_replication == false`, multiple local shards (activates the hardlinking code) and `distributed_storage_policy` (makes `link(2)` fails on `EXDEV`)). [#10486](https://github.com/ClickHouse/ClickHouse/pull/10486) ([Azat Khuzhin](https://github.com/azat)).
* Fixed error with "max_rows_to_sort" limit. [#10268](https://github.com/ClickHouse/ClickHouse/pull/10268) ([alexey-milovidov](https://github.com/alexey-milovidov)).
@ -1488,7 +1488,7 @@
* Lower memory usage in tests. [#10617](https://github.com/ClickHouse/ClickHouse/pull/10617) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixing hard coded timeouts in new live view tests. [#10604](https://github.com/ClickHouse/ClickHouse/pull/10604) ([vzakaznikov](https://github.com/vzakaznikov)).
* Increasing timeout when opening a client in tests/queries/0_stateless/helpers/client.py. [#10599](https://github.com/ClickHouse/ClickHouse/pull/10599) ([vzakaznikov](https://github.com/vzakaznikov)).
* Enable ThinLTO for clang builds, continuation of https://github.com/ClickHouse/ClickHouse/pull/10435. [#10585](https://github.com/ClickHouse/ClickHouse/pull/10585) ([Amos Bird](https://github.com/amosbird)).
* Enable ThinLTO for clang builds, continuation of [#10435](https://github.com/ClickHouse/ClickHouse/pull/10435). [#10585](https://github.com/ClickHouse/ClickHouse/pull/10585) ([Amos Bird](https://github.com/amosbird)).
* Adding fuzzers and preparing for oss-fuzz integration. [#10546](https://github.com/ClickHouse/ClickHouse/pull/10546) ([kyprizel](https://github.com/kyprizel)).
* Fix FreeBSD build. [#10150](https://github.com/ClickHouse/ClickHouse/pull/10150) ([Ivan](https://github.com/abyss7)).
* Add new build for query tests using pytest framework. [#10039](https://github.com/ClickHouse/ClickHouse/pull/10039) ([Ivan](https://github.com/abyss7)).
@ -1563,7 +1563,7 @@
#### Performance Improvement
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
* Index not used for IN operator with literals, performance regression introduced around v19.3. This fixes [#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
#### Build/Testing/Packaging Improvement
@ -1617,7 +1617,7 @@
* Fix the error `Data compressed with different methods` that can happen if `min_bytes_to_use_direct_io` is enabled and PREWHERE is active and using SAMPLE or high number of threads. This fixes [#11539](https://github.com/ClickHouse/ClickHouse/issues/11539). [#11540](https://github.com/ClickHouse/ClickHouse/pull/11540) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix return compressed size for codecs. [#11448](https://github.com/ClickHouse/ClickHouse/pull/11448) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix server crash when a column has compression codec with non-literal arguments. Fixes [#11365](https://github.com/ClickHouse/ClickHouse/issues/11365). [#11431](https://github.com/ClickHouse/ClickHouse/pull/11431) ([alesapin](https://github.com/alesapin)).
* Fix pointInPolygon with nan as point. Fixes https://github.com/ClickHouse/ClickHouse/issues/11375. [#11421](https://github.com/ClickHouse/ClickHouse/pull/11421) ([Alexey Ilyukhov](https://github.com/livace)).
* Fix pointInPolygon with nan as point. Fixes [#11375](https://github.com/ClickHouse/ClickHouse/issues/11375). [#11421](https://github.com/ClickHouse/ClickHouse/pull/11421) ([Alexey Ilyukhov](https://github.com/livace)).
* Fix potential uninitialized memory read in MergeTree shutdown if table was not created successfully. [#11420](https://github.com/ClickHouse/ClickHouse/pull/11420) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed geohashesInBox with arguments outside of latitude/longitude range. [#11403](https://github.com/ClickHouse/ClickHouse/pull/11403) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix possible `Pipeline stuck` error for queries with external sort and limit. Fixes [#11359](https://github.com/ClickHouse/ClickHouse/issues/11359). [#11366](https://github.com/ClickHouse/ClickHouse/pull/11366) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -1633,8 +1633,8 @@
* Fix potential uninitialized memory in conversion. Example: `SELECT toIntervalSecond(now64())`. [#11311](https://github.com/ClickHouse/ClickHouse/pull/11311) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix the issue when index analysis cannot work if a table has Array column in primary key and if a query is filtering by this column with `empty` or `notEmpty` functions. This fixes [#11286](https://github.com/ClickHouse/ClickHouse/issues/11286). [#11303](https://github.com/ClickHouse/ClickHouse/pull/11303) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix bug when query speed estimation can be incorrect and the limit of `min_execution_speed` may not work or work incorrectly if the query is throttled by `max_network_bandwidth`, `max_execution_speed` or `priority` settings. Change the default value of `timeout_before_checking_execution_speed` to non-zero, because otherwise the settings `min_execution_speed` and `max_execution_speed` have no effect. This fixes [#11297](https://github.com/ClickHouse/ClickHouse/issues/11297). This fixes [#5732](https://github.com/ClickHouse/ClickHouse/issues/5732). This fixes [#6228](https://github.com/ClickHouse/ClickHouse/issues/6228). Usability improvement: avoid concatenation of exception message with progress bar in `clickhouse-client`. [#11296](https://github.com/ClickHouse/ClickHouse/pull/11296) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash when SET DEFAULT ROLE is called with wrong arguments. This fixes https://github.com/ClickHouse/ClickHouse/issues/10586. [#11278](https://github.com/ClickHouse/ClickHouse/pull/11278) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash while reading malformed data in Protobuf format. This fixes https://github.com/ClickHouse/ClickHouse/issues/5957, fixes https://github.com/ClickHouse/ClickHouse/issues/11203. [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash when SET DEFAULT ROLE is called with wrong arguments. This fixes [#10586](https://github.com/ClickHouse/ClickHouse/issues/10586). [#11278](https://github.com/ClickHouse/ClickHouse/pull/11278) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash while reading malformed data in Protobuf format. This fixes [#5957](https://github.com/ClickHouse/ClickHouse/issues/5957), fixes [#11203](https://github.com/ClickHouse/ClickHouse/issues/11203). [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed a bug when cache-dictionary could return default value instead of normal (when there are only expired keys). This affects only string fields. [#11233](https://github.com/ClickHouse/ClickHouse/pull/11233) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix error `Block structure mismatch in QueryPipeline` while reading from `VIEW` with constants in inner query. Fixes [#11181](https://github.com/ClickHouse/ClickHouse/issues/11181). [#11205](https://github.com/ClickHouse/ClickHouse/pull/11205) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix possible exception `Invalid status for associated output`. [#11200](https://github.com/ClickHouse/ClickHouse/pull/11200) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -1679,7 +1679,7 @@ No changes compared to v20.4.3.16-stable.
* Now constraints are updated if the column participating in `CONSTRAINT` expression was renamed. Fixes [#10844](https://github.com/ClickHouse/ClickHouse/issues/10844). [#10847](https://github.com/ClickHouse/ClickHouse/pull/10847) ([alesapin](https://github.com/alesapin)).
* Fixed potential read of uninitialized memory in cache-dictionary. [#10834](https://github.com/ClickHouse/ClickHouse/pull/10834) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed columns order after `Block::sortColumns()`. [#10826](https://github.com/ClickHouse/ClickHouse/pull/10826) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the issue with `ODBC` bridge when no quoting of identifiers is requested. Fixes [#7984] (https://github.com/ClickHouse/ClickHouse/issues/7984). [#10821](https://github.com/ClickHouse/ClickHouse/pull/10821) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the issue with `ODBC` bridge when no quoting of identifiers is requested. Fixes [#7984](https://github.com/ClickHouse/ClickHouse/issues/7984). [#10821](https://github.com/ClickHouse/ClickHouse/pull/10821) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `UBSan` and `MSan` report in `DateLUT`. [#10798](https://github.com/ClickHouse/ClickHouse/pull/10798) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed incorrect type conversion in key conditions. Fixes [#6287](https://github.com/ClickHouse/ClickHouse/issues/6287). [#10791](https://github.com/ClickHouse/ClickHouse/pull/10791) ([Andrew Onyshchuk](https://github.com/oandrew)).
* Fixed `parallel_view_processing` behavior. Now all insertions into `MATERIALIZED VIEW` without exception should be finished if exception happened. Fixes [#10241](https://github.com/ClickHouse/ClickHouse/issues/10241). [#10757](https://github.com/ClickHouse/ClickHouse/pull/10757) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -1707,15 +1707,15 @@ No changes compared to v20.4.3.16-stable.
#### New Feature
* Add support for secured connection from ClickHouse to Zookeeper [#10184](https://github.com/ClickHouse/ClickHouse/pull/10184) ([Konstantin Lebedev](https://github.com/xzkostyan))
* Support custom HTTP handlers. See ISSUES-5436 for description. [#7572](https://github.com/ClickHouse/ClickHouse/pull/7572) ([Winter Zhang](https://github.com/zhang2014))
* Support custom HTTP handlers. See [#5436](https://github.com/ClickHouse/ClickHouse/issues/5436) for description. [#7572](https://github.com/ClickHouse/ClickHouse/pull/7572) ([Winter Zhang](https://github.com/zhang2014))
* Add MessagePack Input/Output format. [#9889](https://github.com/ClickHouse/ClickHouse/pull/9889) ([Kruglov Pavel](https://github.com/Avogar))
* Add Regexp input format. [#9196](https://github.com/ClickHouse/ClickHouse/pull/9196) ([Kruglov Pavel](https://github.com/Avogar))
* Added output format `Markdown` for embedding tables in markdown documents. [#10317](https://github.com/ClickHouse/ClickHouse/pull/10317) ([Kruglov Pavel](https://github.com/Avogar))
* Added support for custom settings section in dictionaries. Also fixes issue [#2829](https://github.com/ClickHouse/ClickHouse/issues/2829). [#10137](https://github.com/ClickHouse/ClickHouse/pull/10137) ([Artem Streltsov](https://github.com/kekekekule))
* Added custom settings support in DDL-queries for CREATE DICTIONARY [#10465](https://github.com/ClickHouse/ClickHouse/pull/10465) ([Artem Streltsov](https://github.com/kekekekule))
* Added custom settings support in DDL-queries for `CREATE DICTIONARY` [#10465](https://github.com/ClickHouse/ClickHouse/pull/10465) ([Artem Streltsov](https://github.com/kekekekule))
* Add simple server-wide memory profiler that will collect allocation contexts when server memory usage becomes higher than the next allocation threshold. [#10444](https://github.com/ClickHouse/ClickHouse/pull/10444) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add setting `always_fetch_merged_part` which restricts a replica from merging parts by itself and makes it always prefer downloading from other replicas. [#10379](https://github.com/ClickHouse/ClickHouse/pull/10379) ([alesapin](https://github.com/alesapin))
* Add function JSONExtractKeysAndValuesRaw which extracts raw data from JSON objects [#10378](https://github.com/ClickHouse/ClickHouse/pull/10378) ([hcz](https://github.com/hczhcz))
* Add function `JSONExtractKeysAndValuesRaw` which extracts raw data from JSON objects [#10378](https://github.com/ClickHouse/ClickHouse/pull/10378) ([hcz](https://github.com/hczhcz))
* Add memory usage from OS to `system.asynchronous_metrics`. [#10361](https://github.com/ClickHouse/ClickHouse/pull/10361) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added generic variants for functions `least` and `greatest`. Now they work with arbitrary number of arguments of arbitrary types. This fixes [#4767](https://github.com/ClickHouse/ClickHouse/issues/4767) [#10318](https://github.com/ClickHouse/ClickHouse/pull/10318) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Now ClickHouse controls timeouts of dictionary sources on its side. Two new settings were added to cache dictionary configuration: `strict_max_lifetime_seconds`, which is `max_lifetime` by default, and `query_wait_timeout_milliseconds`, which is one minute by default. The first setting is also useful with the `allow_read_expired_keys` setting (to forbid reading very expired keys). [#10337](https://github.com/ClickHouse/ClickHouse/pull/10337) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
@ -1728,7 +1728,7 @@ No changes compared to v20.4.3.16-stable.
* Add ability to query Distributed over Distributed (w/o `distributed_group_by_no_merge`) ... [#9923](https://github.com/ClickHouse/ClickHouse/pull/9923) ([Azat Khuzhin](https://github.com/azat))
* Add function `arrayReduceInRanges` which aggregates array elements in given ranges. [#9598](https://github.com/ClickHouse/ClickHouse/pull/9598) ([hcz](https://github.com/hczhcz))
* Add Dictionary Status on prometheus exporter. [#9622](https://github.com/ClickHouse/ClickHouse/pull/9622) ([Guillaume Tassery](https://github.com/YiuRULE))
* Add function arrayAUC [#8698](https://github.com/ClickHouse/ClickHouse/pull/8698) ([taiyang-li](https://github.com/taiyang-li))
* Add function `arrayAUC` [#8698](https://github.com/ClickHouse/ClickHouse/pull/8698) ([taiyang-li](https://github.com/taiyang-li))
* Support `DROP VIEW` statement for better TPC-H compatibility. [#9831](https://github.com/ClickHouse/ClickHouse/pull/9831) ([Amos Bird](https://github.com/amosbird))
* Add `strict_order` option to `windowFunnel()` [#9773](https://github.com/ClickHouse/ClickHouse/pull/9773) ([achimbab](https://github.com/achimbab))
* Support `DATE` and `TIMESTAMP` SQL operators, e.g. `SELECT date '2001-01-01'` [#9691](https://github.com/ClickHouse/ClickHouse/pull/9691) ([Artem Zuikov](https://github.com/4ertus2))
@ -1932,7 +1932,7 @@ No changes compared to v20.4.3.16-stable.
* Move integration tests docker files to docker/ directory. [#10335](https://github.com/ClickHouse/ClickHouse/pull/10335) ([Ilya Yatsishin](https://github.com/qoega))
* Allow to use `clang-10` in CI. It ensures that [#10238](https://github.com/ClickHouse/ClickHouse/issues/10238) is fixed. [#10384](https://github.com/ClickHouse/ClickHouse/pull/10384) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Update OpenSSL to upstream master. Fixed the issue when TLS connections may fail with the message `OpenSSL SSL_read: error:14094438:SSL routines:ssl3_read_bytes:tlsv1 alert internal error` and `SSL Exception: error:2400006E:random number generator::error retrieving entropy`. The issue was present in version 20.1. [#8956](https://github.com/ClickHouse/ClickHouse/pull/8956) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix clang-10 build. https://github.com/ClickHouse/ClickHouse/issues/10238 [#10370](https://github.com/ClickHouse/ClickHouse/pull/10370) ([Amos Bird](https://github.com/amosbird))
* Fix clang-10 build. [#10238](https://github.com/ClickHouse/ClickHouse/issues/10238) [#10370](https://github.com/ClickHouse/ClickHouse/pull/10370) ([Amos Bird](https://github.com/amosbird))
* Add performance test for [Parallel INSERT for materialized view](https://github.com/ClickHouse/ClickHouse/pull/10052). [#10345](https://github.com/ClickHouse/ClickHouse/pull/10345) ([vxider](https://github.com/Vxider))
* Fix flaky test `test_settings_constraints_distributed.test_insert_clamps_settings`. [#10346](https://github.com/ClickHouse/ClickHouse/pull/10346) ([Vitaly Baranov](https://github.com/vitlibar))
* Add util to test results upload in CI ClickHouse [#10330](https://github.com/ClickHouse/ClickHouse/pull/10330) ([Ilya Yatsishin](https://github.com/qoega))
@ -2106,7 +2106,7 @@ No changes compared to v20.4.3.16-stable.
#### Performance Improvement
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
* Index not used for IN operator with literals, performance regression introduced around v19.3. This fixes [#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
### ClickHouse release v20.3.12.112-lts 2020-06-25
@ -2148,7 +2148,7 @@ No changes compared to v20.4.3.16-stable.
* Fix the error `Data compressed with different methods` that can happen if `min_bytes_to_use_direct_io` is enabled and PREWHERE is active and using SAMPLE or high number of threads. This fixes [#11539](https://github.com/ClickHouse/ClickHouse/issues/11539). [#11540](https://github.com/ClickHouse/ClickHouse/pull/11540) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix return compressed size for codecs. [#11448](https://github.com/ClickHouse/ClickHouse/pull/11448) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix server crash when a column has compression codec with non-literal arguments. Fixes [#11365](https://github.com/ClickHouse/ClickHouse/issues/11365). [#11431](https://github.com/ClickHouse/ClickHouse/pull/11431) ([alesapin](https://github.com/alesapin)).
* Fix pointInPolygon with nan as point. Fixes https://github.com/ClickHouse/ClickHouse/issues/11375. [#11421](https://github.com/ClickHouse/ClickHouse/pull/11421) ([Alexey Ilyukhov](https://github.com/livace)).
* Fix pointInPolygon with nan as point. Fixes [#11375](https://github.com/ClickHouse/ClickHouse/issues/11375). [#11421](https://github.com/ClickHouse/ClickHouse/pull/11421) ([Alexey Ilyukhov](https://github.com/livace)).
* Fix crash in JOIN over LowCardinality(T) and Nullable(T). [#11380](https://github.com/ClickHouse/ClickHouse/issues/11380). [#11414](https://github.com/ClickHouse/ClickHouse/pull/11414) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix error code for wrong `USING` key. [#11373](https://github.com/ClickHouse/ClickHouse/issues/11373). [#11404](https://github.com/ClickHouse/ClickHouse/pull/11404) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed geohashesInBox with arguments outside of latitude/longitude range. [#11403](https://github.com/ClickHouse/ClickHouse/pull/11403) ([Vasily Nemkov](https://github.com/Enmk)).
@ -2165,7 +2165,7 @@ No changes compared to v20.4.3.16-stable.
* Fix potential uninitialized memory in conversion. Example: `SELECT toIntervalSecond(now64())`. [#11311](https://github.com/ClickHouse/ClickHouse/pull/11311) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix the issue when index analysis cannot work if a table has Array column in primary key and if a query is filtering by this column with `empty` or `notEmpty` functions. This fixes [#11286](https://github.com/ClickHouse/ClickHouse/issues/11286). [#11303](https://github.com/ClickHouse/ClickHouse/pull/11303) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix bug when query speed estimation can be incorrect and the limit of `min_execution_speed` may not work or work incorrectly if the query is throttled by `max_network_bandwidth`, `max_execution_speed` or `priority` settings. Change the default value of `timeout_before_checking_execution_speed` to non-zero, because otherwise the settings `min_execution_speed` and `max_execution_speed` have no effect. This fixes [#11297](https://github.com/ClickHouse/ClickHouse/issues/11297). This fixes [#5732](https://github.com/ClickHouse/ClickHouse/issues/5732). This fixes [#6228](https://github.com/ClickHouse/ClickHouse/issues/6228). Usability improvement: avoid concatenation of exception message with progress bar in `clickhouse-client`. [#11296](https://github.com/ClickHouse/ClickHouse/pull/11296) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash while reading malformed data in Protobuf format. This fixes https://github.com/ClickHouse/ClickHouse/issues/5957, fixes https://github.com/ClickHouse/ClickHouse/issues/11203. [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash while reading malformed data in Protobuf format. This fixes [#5957](https://github.com/ClickHouse/ClickHouse/issues/5957), fixes [#11203](https://github.com/ClickHouse/ClickHouse/issues/11203). [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed a bug when cache-dictionary could return default value instead of normal (when there are only expired keys). This affects only string fields. [#11233](https://github.com/ClickHouse/ClickHouse/pull/11233) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix error `Block structure mismatch in QueryPipeline` while reading from `VIEW` with constants in inner query. Fixes [#11181](https://github.com/ClickHouse/ClickHouse/issues/11181). [#11205](https://github.com/ClickHouse/ClickHouse/pull/11205) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix possible exception `Invalid status for associated output`. [#11200](https://github.com/ClickHouse/ClickHouse/pull/11200) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -2196,7 +2196,7 @@ No changes compared to v20.4.3.16-stable.
* Fixed `SIGSEGV` in `StringHashTable` if such a key does not exist. [#10870](https://github.com/ClickHouse/ClickHouse/pull/10870) ([Azat Khuzhin](https://github.com/azat)).
* Fixed bug in `ReplicatedMergeTree` which might cause some `ALTER` or `OPTIMIZE` query to hang waiting for some replica after it becomes inactive. [#10849](https://github.com/ClickHouse/ClickHouse/pull/10849) ([tavplubix](https://github.com/tavplubix)).
* Fixed columns order after `Block::sortColumns()`. [#10826](https://github.com/ClickHouse/ClickHouse/pull/10826) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the issue with `ODBC` bridge when no quoting of identifiers is requested. Fixes [#7984] (https://github.com/ClickHouse/ClickHouse/issues/7984). [#10821](https://github.com/ClickHouse/ClickHouse/pull/10821) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the issue with `ODBC` bridge when no quoting of identifiers is requested. Fixes [#7984](https://github.com/ClickHouse/ClickHouse/issues/7984). [#10821](https://github.com/ClickHouse/ClickHouse/pull/10821) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `UBSan` and `MSan` report in `DateLUT`. [#10798](https://github.com/ClickHouse/ClickHouse/pull/10798) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed incorrect type conversion in key conditions. Fixes [#6287](https://github.com/ClickHouse/ClickHouse/issues/6287). [#10791](https://github.com/ClickHouse/ClickHouse/pull/10791) ([Andrew Onyshchuk](https://github.com/oandrew))
* Fixed `parallel_view_processing` behavior. Now all insertions into `MATERIALIZED VIEW` without exception should be finished if exception happened. Fixes [#10241](https://github.com/ClickHouse/ClickHouse/issues/10241). [#10757](https://github.com/ClickHouse/ClickHouse/pull/10757) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -2215,7 +2215,7 @@ No changes compared to v20.4.3.16-stable.
* Fixed incorrect scalar results inside inner query of `MATERIALIZED VIEW` in case if this query contained dependent table. [#10603](https://github.com/ClickHouse/ClickHouse/pull/10603) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed `SELECT` of column `ALIAS` which default expression type different from column type. [#10563](https://github.com/ClickHouse/ClickHouse/pull/10563) ([Azat Khuzhin](https://github.com/azat)).
* Implemented comparison between DateTime64 and String values. [#10560](https://github.com/ClickHouse/ClickHouse/pull/10560) ([Vasily Nemkov](https://github.com/Enmk)).
* Fixed index corruption, which may accur in some cases after merge compact parts into another compact part. [#10531](https://github.com/ClickHouse/ClickHouse/pull/10531) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed index corruption, which may occur in some cases after merge compact parts into another compact part. [#10531](https://github.com/ClickHouse/ClickHouse/pull/10531) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed the situation, when mutation finished all parts, but hung up in `is_done=0`. [#10526](https://github.com/ClickHouse/ClickHouse/pull/10526) ([alesapin](https://github.com/alesapin)).
* Fixed overflow at beginning of unix epoch for timezones with fractional offset from `UTC`. This fixes [#9335](https://github.com/ClickHouse/ClickHouse/issues/9335). [#10513](https://github.com/ClickHouse/ClickHouse/pull/10513) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed improper shutdown of `Distributed` storage. [#10491](https://github.com/ClickHouse/ClickHouse/pull/10491) ([Azat Khuzhin](https://github.com/azat)).
@ -2225,14 +2225,14 @@ No changes compared to v20.4.3.16-stable.
#### Build/Testing/Packaging Improvement
* Fix UBSan report in LZ4 library. [#10631](https://github.com/ClickHouse/ClickHouse/pull/10631) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix clang-10 build. https://github.com/ClickHouse/ClickHouse/issues/10238. [#10370](https://github.com/ClickHouse/ClickHouse/pull/10370) ([Amos Bird](https://github.com/amosbird)).
* Fix clang-10 build. [#10238](https://github.com/ClickHouse/ClickHouse/issues/10238). [#10370](https://github.com/ClickHouse/ClickHouse/pull/10370) ([Amos Bird](https://github.com/amosbird)).
* Added failing tests about `max_rows_to_sort` setting. [#10268](https://github.com/ClickHouse/ClickHouse/pull/10268) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Added some improvements in printing diagnostic info in input formats. Fixes [#10204](https://github.com/ClickHouse/ClickHouse/issues/10204). [#10418](https://github.com/ClickHouse/ClickHouse/pull/10418) ([tavplubix](https://github.com/tavplubix)).
* Added CA certificates to clickhouse-server docker image. [#10476](https://github.com/ClickHouse/ClickHouse/pull/10476) ([filimonov](https://github.com/filimonov)).
#### Bug fix
* #10551. [#10569](https://github.com/ClickHouse/ClickHouse/pull/10569) ([Winter Zhang](https://github.com/zhang2014)).
* Fix error `the BloomFilter false positive must be a double number between 0 and 1` [#10551](https://github.com/ClickHouse/ClickHouse/issues/10551). [#10569](https://github.com/ClickHouse/ClickHouse/pull/10569) ([Winter Zhang](https://github.com/zhang2014)).
### ClickHouse release v20.3.8.53, 2020-04-23
@ -2424,7 +2424,7 @@ No changes compared to v20.4.3.16-stable.
* Fixed the behaviour of `match` and `extract` functions when haystack has zero bytes. The behaviour was wrong when haystack was constant. This fixes [#9160](https://github.com/ClickHouse/ClickHouse/issues/9160) [#9163](https://github.com/ClickHouse/ClickHouse/pull/9163) ([alexey-milovidov](https://github.com/alexey-milovidov)) [#9345](https://github.com/ClickHouse/ClickHouse/pull/9345) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Avoid throwing from destructor in Apache Avro 3rd-party library. [#9066](https://github.com/ClickHouse/ClickHouse/pull/9066) ([Andrew Onyshchuk](https://github.com/oandrew))
* Don't commit a batch polled from `Kafka` partially as it can lead to holes in data. [#8876](https://github.com/ClickHouse/ClickHouse/pull/8876) ([filimonov](https://github.com/filimonov))
* Fix `joinGet` with nullable return types. https://github.com/ClickHouse/ClickHouse/issues/8919 [#9014](https://github.com/ClickHouse/ClickHouse/pull/9014) ([Amos Bird](https://github.com/amosbird))
* Fix `joinGet` with nullable return types. [#8919](https://github.com/ClickHouse/ClickHouse/issues/8919) [#9014](https://github.com/ClickHouse/ClickHouse/pull/9014) ([Amos Bird](https://github.com/amosbird))
* Fix data incompatibility when compressed with `T64` codec. [#9016](https://github.com/ClickHouse/ClickHouse/pull/9016) ([Artem Zuikov](https://github.com/4ertus2)) Fix data type ids in `T64` compression codec that leads to wrong (de)compression in affected versions. [#9033](https://github.com/ClickHouse/ClickHouse/pull/9033) ([Artem Zuikov](https://github.com/4ertus2))
* Add setting `enable_early_constant_folding` and disable it in some cases that lead to errors. [#9010](https://github.com/ClickHouse/ClickHouse/pull/9010) ([Artem Zuikov](https://github.com/4ertus2))
* Fix pushdown predicate optimizer with VIEW and enable the test [#9011](https://github.com/ClickHouse/ClickHouse/pull/9011) ([Winter Zhang](https://github.com/zhang2014))
@ -2626,7 +2626,7 @@ No changes compared to v20.4.3.16-stable.
* Fix the error `Data compressed with different methods` that can happen if `min_bytes_to_use_direct_io` is enabled and PREWHERE is active and using SAMPLE or high number of threads. This fixes [#11539](https://github.com/ClickHouse/ClickHouse/issues/11539). [#11540](https://github.com/ClickHouse/ClickHouse/pull/11540) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix return compressed size for codecs. [#11448](https://github.com/ClickHouse/ClickHouse/pull/11448) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix server crash when a column has compression codec with non-literal arguments. Fixes [#11365](https://github.com/ClickHouse/ClickHouse/issues/11365). [#11431](https://github.com/ClickHouse/ClickHouse/pull/11431) ([alesapin](https://github.com/alesapin)).
* Fix pointInPolygon with nan as point. Fixes https://github.com/ClickHouse/ClickHouse/issues/11375. [#11421](https://github.com/ClickHouse/ClickHouse/pull/11421) ([Alexey Ilyukhov](https://github.com/livace)).
* Fix pointInPolygon with nan as point. Fixes [#11375](https://github.com/ClickHouse/ClickHouse/issues/11375). [#11421](https://github.com/ClickHouse/ClickHouse/pull/11421) ([Alexey Ilyukhov](https://github.com/livace)).
* Fixed geohashesInBox with arguments outside of latitude/longitude range. [#11403](https://github.com/ClickHouse/ClickHouse/pull/11403) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix possible `Pipeline stuck` error for queries with external sort and limit. Fixes [#11359](https://github.com/ClickHouse/ClickHouse/issues/11359). [#11366](https://github.com/ClickHouse/ClickHouse/pull/11366) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix crash in `quantilesExactWeightedArray`. [#11337](https://github.com/ClickHouse/ClickHouse/pull/11337) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
@ -2636,7 +2636,7 @@ No changes compared to v20.4.3.16-stable.
* Fix potential uninitialized memory in conversion. Example: `SELECT toIntervalSecond(now64())`. [#11311](https://github.com/ClickHouse/ClickHouse/pull/11311) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix the issue when index analysis cannot work if a table has Array column in primary key and if a query is filtering by this column with `empty` or `notEmpty` functions. This fixes [#11286](https://github.com/ClickHouse/ClickHouse/issues/11286). [#11303](https://github.com/ClickHouse/ClickHouse/pull/11303) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix bug when query speed estimation can be incorrect and the limit of `min_execution_speed` may not work or work incorrectly if the query is throttled by `max_network_bandwidth`, `max_execution_speed` or `priority` settings. Change the default value of `timeout_before_checking_execution_speed` to non-zero, because otherwise the settings `min_execution_speed` and `max_execution_speed` have no effect. This fixes [#11297](https://github.com/ClickHouse/ClickHouse/issues/11297). This fixes [#5732](https://github.com/ClickHouse/ClickHouse/issues/5732). This fixes [#6228](https://github.com/ClickHouse/ClickHouse/issues/6228). Usability improvement: avoid concatenation of exception message with progress bar in `clickhouse-client`. [#11296](https://github.com/ClickHouse/ClickHouse/pull/11296) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash while reading malformed data in Protobuf format. This fixes https://github.com/ClickHouse/ClickHouse/issues/5957, fixes https://github.com/ClickHouse/ClickHouse/issues/11203. [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash while reading malformed data in Protobuf format. This fixes [#5957](https://github.com/ClickHouse/ClickHouse/issues/5957), fixes [#11203](https://github.com/ClickHouse/ClickHouse/issues/11203). [#11258](https://github.com/ClickHouse/ClickHouse/pull/11258) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix possible error `Cannot capture column` for higher-order functions with `Array(Array(LowCardinality))` captured argument. [#11185](https://github.com/ClickHouse/ClickHouse/pull/11185) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* If data skipping index is dependent on columns that are going to be modified during background merge (for SummingMergeTree, AggregatingMergeTree as well as for TTL GROUP BY), it was calculated incorrectly. This issue is fixed by moving index calculation after merge so the index is calculated on merged data. [#11162](https://github.com/ClickHouse/ClickHouse/pull/11162) ([Azat Khuzhin](https://github.com/azat)).
* Remove logging from mutation finalization task if nothing was finalized. [#11109](https://github.com/ClickHouse/ClickHouse/pull/11109) ([alesapin](https://github.com/alesapin)).
@ -2914,7 +2914,7 @@ No changes compared to v20.4.3.16-stable.
* Several improvements to ClickHouse grammar in `.g4` file. [#8294](https://github.com/ClickHouse/ClickHouse/pull/8294) ([taiyang-li](https://github.com/taiyang-li))
* Fix bug that leads to crashes in `JOIN`s with tables with engine `Join`. This fixes [#7556](https://github.com/ClickHouse/ClickHouse/issues/7556) [#8254](https://github.com/ClickHouse/ClickHouse/issues/8254) [#7915](https://github.com/ClickHouse/ClickHouse/issues/7915) [#8100](https://github.com/ClickHouse/ClickHouse/issues/8100). [#8298](https://github.com/ClickHouse/ClickHouse/pull/8298) ([Artem Zuikov](https://github.com/4ertus2))
* Fix redundant dictionaries reload on `CREATE DATABASE`. [#7916](https://github.com/ClickHouse/ClickHouse/pull/7916) ([Azat Khuzhin](https://github.com/azat))
* Limit maximum number of streams for read from `StorageFile` and `StorageHDFS`. Fixes https://github.com/ClickHouse/ClickHouse/issues/7650. [#7981](https://github.com/ClickHouse/ClickHouse/pull/7981) ([alesapin](https://github.com/alesapin))
* Limit maximum number of streams for read from `StorageFile` and `StorageHDFS`. Fixes [#7650](https://github.com/ClickHouse/ClickHouse/issues/7650). [#7981](https://github.com/ClickHouse/ClickHouse/pull/7981) ([alesapin](https://github.com/alesapin))
* Fix bug in `ALTER ... MODIFY ... CODEC` query, when the user specifies both a default expression and a codec. Fixes [#8593](https://github.com/ClickHouse/ClickHouse/issues/8593). [#8614](https://github.com/ClickHouse/ClickHouse/pull/8614) ([alesapin](https://github.com/alesapin))
* Fix error in background merge of columns with `SimpleAggregateFunction(LowCardinality)` type. [#8613](https://github.com/ClickHouse/ClickHouse/pull/8613) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed type check in function `toDateTime64`. [#8375](https://github.com/ClickHouse/ClickHouse/pull/8375) ([Vasily Nemkov](https://github.com/Enmk))
@ -2998,7 +2998,7 @@ No changes compared to v20.4.3.16-stable.
* Added check for extra parts of `MergeTree` at different disks, in order to not allow to miss data parts at undefined disks. [#8118](https://github.com/ClickHouse/ClickHouse/pull/8118) ([Vladimir Chebotarev](https://github.com/excitoon))
* Enable SSL support for Mac client and server. [#8297](https://github.com/ClickHouse/ClickHouse/pull/8297) ([Ivan](https://github.com/abyss7))
* Now ClickHouse can work as MySQL federated server (see https://dev.mysql.com/doc/refman/5.7/en/federated-create-server.html). [#7717](https://github.com/ClickHouse/ClickHouse/pull/7717) ([Maxim Fedotov](https://github.com/MaxFedotov))
* `clickhouse-client` now only enable `bracketed-paste` when multiquery is on and multiline is off. This fixes (#7757)[https://github.com/ClickHouse/ClickHouse/issues/7757]. [#7761](https://github.com/ClickHouse/ClickHouse/pull/7761) ([Amos Bird](https://github.com/amosbird))
* `clickhouse-client` now only enable `bracketed-paste` when multiquery is on and multiline is off. This fixes [#7757](https://github.com/ClickHouse/ClickHouse/issues/7757). [#7761](https://github.com/ClickHouse/ClickHouse/pull/7761) ([Amos Bird](https://github.com/amosbird))
* Support `Array(Decimal)` in `if` function. [#7721](https://github.com/ClickHouse/ClickHouse/pull/7721) ([Artem Zuikov](https://github.com/4ertus2))
* Support Decimals in `arrayDifference`, `arrayCumSum` and `arrayCumSumNegative` functions. [#7724](https://github.com/ClickHouse/ClickHouse/pull/7724) ([Artem Zuikov](https://github.com/4ertus2))
* Added `lifetime` column to `system.dictionaries` table. [#6820](https://github.com/ClickHouse/ClickHouse/issues/6820) [#7727](https://github.com/ClickHouse/ClickHouse/pull/7727) ([kekekekule](https://github.com/kekekekule))

View File

@ -223,8 +223,8 @@ if (ARCH_NATIVE)
set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native")
endif ()
if (UNBUNDLED AND (COMPILER_GCC OR COMPILER_CLANG))
# to make numeric_limits<__int128> works for unbundled build
if (COMPILER_GCC OR COMPILER_CLANG)
# to make numeric_limits<__int128> works with GCC
set (_CXX_STANDARD "-std=gnu++2a")
else()
set (_CXX_STANDARD "-std=c++2a")

View File

@ -58,8 +58,7 @@ public:
using signed_base_type = int64_t;
// ctors
integer() = default;
constexpr integer() noexcept;
template <typename T>
constexpr integer(T rhs) noexcept;
template <typename T>

View File

@ -916,6 +916,11 @@ public:
// Members
template <size_t Bits, typename Signed>
constexpr integer<Bits, Signed>::integer() noexcept
: items{}
{}
template <size_t Bits, typename Signed>
template <typename T>
constexpr integer<Bits, Signed>::integer(T rhs) noexcept

View File

@ -761,7 +761,7 @@ void BaseDaemon::initializeTerminationAndSignalProcessing()
static KillingErrorHandler killing_error_handler;
Poco::ErrorHandler::set(&killing_error_handler);
signal_pipe.setNonBlocking();
signal_pipe.setNonBlockingWrite();
signal_pipe.tryIncreaseSize(1 << 20);
signal_listener = std::make_unique<SignalListener>(*this);

View File

@ -22,4 +22,12 @@ ResultBase::~ResultBase()
mysql_free_result(res);
}
std::string ResultBase::getFieldName(size_t n) const
{
if (num_fields <= n)
throw Exception(std::string("Unknown column position ") + std::to_string(n));
return fields[n].name;
}
}

View File

@ -31,6 +31,8 @@ public:
MYSQL_RES * getRes() { return res; }
const Query * getQuery() const { return query; }
std::string getFieldName(size_t n) const;
virtual ~ResultBase();
protected:

View File

@ -26,6 +26,7 @@ add_subdirectory (boost-cmake)
add_subdirectory (cctz-cmake)
add_subdirectory (consistent-hashing-sumbur)
add_subdirectory (consistent-hashing)
add_subdirectory (dragonbox-cmake)
add_subdirectory (FastMemcpy)
add_subdirectory (hyperscan-cmake)
add_subdirectory (jemalloc-cmake)
@ -322,5 +323,5 @@ if (USE_INTERNAL_ROCKSDB_LIBRARY)
add_subdirectory(rocksdb-cmake)
endif()
add_subdirectory(dragonbox)
add_subdirectory(fast_float)

View File

@ -0,0 +1,5 @@
set(LIBRARY_DIR "${ClickHouse_SOURCE_DIR}/contrib/dragonbox")
add_library(dragonbox_to_chars "${LIBRARY_DIR}/source/dragonbox_to_chars.cpp")
target_include_directories(dragonbox_to_chars SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}/include/")

2
contrib/poco vendored

@ -1 +1 @@
Subproject commit b5523bb9b4bc4239640cbfec4d734be8b8585639
Subproject commit 08974cc024b2e748f5b1d45415396706b3521d0f

View File

@ -64,7 +64,14 @@ function stop_server
function start_server
{
set -m # Spawn server in its own process groups
clickhouse-server --config-file="$FASTTEST_DATA/config.xml" -- --path "$FASTTEST_DATA" --user_files_path "$FASTTEST_DATA/user_files" &>> "$FASTTEST_OUTPUT/server.log" &
local opts=(
--config-file="$FASTTEST_DATA/config.xml"
--
--path "$FASTTEST_DATA"
--user_files_path "$FASTTEST_DATA/user_files"
--top_level_domains_path "$FASTTEST_DATA/top_level_domains"
)
clickhouse-server "${opts[@]}" &>> "$FASTTEST_OUTPUT/server.log" &
server_pid=$!
set +m

View File

@ -53,4 +53,3 @@ COPY * /
CMD ["bash", "-c", "node=$((RANDOM % $(numactl --hardware | sed -n 's/^.*available:\\(.*\\)nodes.*$/\\1/p'))); echo Will bind to NUMA node $node; numactl --cpunodebind=$node --membind=$node /entrypoint.sh"]
# docker run --network=host --volume <workspace>:/workspace --volume=<output>:/output -e PR_TO_TEST=<> -e SHA_TO_TEST=<> yandex/clickhouse-performance-comparison

View File

@ -55,6 +55,7 @@ function configure
# server *config* directives overrides
--path db0
--user_files_path db0/user_files
--top_level_domains_path /top_level_domains
--tcp_port $LEFT_SERVER_PORT
)
left/clickhouse-server "${setup_left_server_opts[@]}" &> setup-server-log.log &
@ -102,6 +103,7 @@ function restart
# server *config* directives overrides
--path left/db
--user_files_path left/db/user_files
--top_level_domains_path /top_level_domains
--tcp_port $LEFT_SERVER_PORT
)
left/clickhouse-server "${left_server_opts[@]}" &>> left-server-log.log &
@ -116,6 +118,7 @@ function restart
# server *config* directives overrides
--path right/db
--user_files_path right/db/user_files
--top_level_domains_path /top_level_domains
--tcp_port $RIGHT_SERVER_PORT
)
right/clickhouse-server "${right_server_opts[@]}" &>> right-server-log.log &

View File

@ -0,0 +1,5 @@
<yandex>
<top_level_domains_lists>
<public_suffix_list>public_suffix_list.dat</public_suffix_list>
</top_level_domains_lists>
</yandex>

File diff suppressed because it is too large

View File

@ -177,8 +177,6 @@ When you `INSERT` a bunch of data into `MergeTree`, that bunch is sorted by prim
`MergeTree` is not an LSM tree because it doesn't contain “memtable” and “log”: inserted data is written directly to the filesystem. This makes it suitable only to INSERT data in batches, not by individual row and not very frequently: about once per second is ok, but a thousand times a second is not. We did it this way for simplicity's sake, and because we are already inserting data in batches in our applications.
> MergeTree tables can only have one (primary) index: there aren't any secondary indices. It would be nice to allow multiple physical representations under one logical table, for example, to store data in more than one physical order or even to allow representations with pre-aggregated data along with original data.
There are MergeTree engines that are doing additional work during background merges. Examples are `CollapsingMergeTree` and `AggregatingMergeTree`. This could be treated as special support for updates. Keep in mind that these are not real updates because users usually have no control over the time when background merges are executed, and data in a `MergeTree` table is almost always stored in more than one part, not in completely merged form.
## Replication {#replication}

View File

@ -0,0 +1,11 @@
---
toc_priority: 11
toc_title: GitHub Events
---
# GitHub Events Dataset
The dataset contains all events on GitHub from 2011 to Dec 6, 2020, 3.1 billion records in total. The download size is 75 GB, and it requires up to 200 GB of disk space if stored in a table with lz4 compression.
The full dataset description, insights, download instructions and interactive queries are posted [here](https://github-sql.github.io/explorer/).

View File

@ -1,6 +1,6 @@
---
toc_folder_title: Example Datasets
toc_priority: 14
toc_priority: 10
toc_title: Introduction
---
@ -18,4 +18,4 @@ The list of documented datasets:
- [New York Taxi Data](../../getting-started/example-datasets/nyc-taxi.md)
- [OnTime](../../getting-started/example-datasets/ontime.md)
[Original article](https://clickhouse.tech/docs/en/getting_started/example_datasets) <!--hide-->
[Original article](https://clickhouse.tech/docs/en/getting_started/example_datasets) <!--hide-->

View File

@ -58,6 +58,7 @@ The supported formats are:
| [XML](#xml) | ✗ | ✔ |
| [CapnProto](#capnproto) | ✔ | ✗ |
| [LineAsString](#lineasstring) | ✔ | ✗ |
| [RawBLOB](#rawblob) | ✔ | ✔ |
You can control some format processing parameters with the ClickHouse settings. For more information read the [Settings](../operations/settings/settings.md) section.
@ -1370,4 +1371,45 @@ Result:
└───────────────────────────────────────────────────┘
```
## RawBLOB {#rawblob}
In this format, all input data is read to a single value. It is possible to parse only a table with a single field of type [String](../sql-reference/data-types/string.md) or similar.
The result is output in binary format without delimiters and escaping. If more than one value is output, the format is ambiguous, and it will be impossible to read the data back.
Below is a comparison of the formats `RawBLOB` and [TabSeparatedRaw](#tabseparatedraw).
`RawBLOB`:
- data is output in binary format, no escaping;
- there are no delimiters between values;
- no newline at the end of each value.
[TabSeparatedRaw](#tabseparatedraw):
- data is output without escaping;
- the rows contain values separated by tabs;
- there is a line feed after the last value in every row.
The following is a comparison of the `RawBLOB` and [RowBinary](#rowbinary) formats.
`RawBLOB`:
- String fields are output without being prefixed by length.
`RowBinary`:
- String fields are represented as length in varint format (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by the bytes of the string.
When empty data is passed to the `RawBLOB` input, ClickHouse throws an exception:
``` text
Code: 108. DB::Exception: No data to insert
```
**Example**
``` bash
$ clickhouse-client --query "CREATE TABLE {some_table} (a String) ENGINE = Memory;"
$ cat {filename} | clickhouse-client --query="INSERT INTO {some_table} FORMAT RawBLOB"
$ clickhouse-client --query "SELECT * FROM {some_table} FORMAT RawBLOB" | md5sum
```
Result:
``` text
f9725a22f9191e064120d718e26862a9 -
```
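To make the comparison with `RowBinary` concrete, the following Python sketch encodes one string value both ways. It is an illustration only and does not call ClickHouse; the helper names `leb128`, `row_binary_string` and `raw_blob` are assumptions made for this example.

```python
def leb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (varint)."""
    if n < 0:
        raise ValueError("unsigned LEB128 cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def row_binary_string(value: bytes) -> bytes:
    """RowBinary-style String: varint length, then the raw bytes."""
    return leb128(len(value)) + value


def raw_blob(value: bytes) -> bytes:
    """RawBLOB-style String: the raw bytes only, no length, no delimiter."""
    return value


data = b"hello"
print(row_binary_string(data))  # b'\x05hello'
print(raw_blob(data))           # b'hello'
```

Since `raw_blob(b"a") + raw_blob(b"bc")` equals `raw_blob(b"ab") + raw_blob(b"c")`, several values written one after another cannot be split back into the original values, which is the ambiguity mentioned above.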
[Original article](https://clickhouse.tech/docs/en/interfaces/formats/) <!--hide-->

View File

@ -2360,10 +2360,41 @@ Default value: `1`.
## output_format_tsv_null_representation {#output_format_tsv_null_representation}
Allows configurable `NULL` representation for [TSV](../../interfaces/formats.md#tabseparated) output format. The setting only controls output format and `\N` is the only supported `NULL` representation for TSV input format.
Defines the representation of `NULL` for the [TSV](../../interfaces/formats.md#tabseparated) output format. The user can set any string as the value, for example, `My NULL`.
Default value: `\N`.
**Examples**
Query
```sql
SELECT * FROM tsv_custom_null FORMAT TSV;
```
Result
```text
788
\N
\N
```
Query
```sql
SET output_format_tsv_null_representation = 'My NULL';
SELECT * FROM tsv_custom_null FORMAT TSV;
```
Result
```text
788
My NULL
My NULL
```
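The effect of the setting can be sketched in a few lines of Python. This is a simplified model, not the ClickHouse implementation: `tsv_field` is a hypothetical helper, `None` stands in for SQL `NULL`, and escaping of special characters is left out.

```python
def tsv_field(value, null_representation=r"\N"):
    """Render one field for TSV output; None stands in for SQL NULL."""
    if value is None:
        return null_representation
    return str(value)


rows = [788, None, None]
# Default representation.
print("\n".join(tsv_field(v) for v in rows))
# Custom representation, as with the SET statement in the example above.
print("\n".join(tsv_field(v, "My NULL") for v in rows))
```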
## output_format_json_array_of_rows {#output-format-json-array-of-rows}
Enables the ability to output all rows as a JSON array in the [JSONEachRow](../../interfaces/formats.md#jsoneachrow) format.

View File

@ -0,0 +1,42 @@
# system.distribution_queue {#system_tables-distribution_queue}
Contains information about local files that are in the queue to be sent to the shards. These local files contain new parts that are created by inserting new data into the Distributed table in asynchronous mode.
Columns:
- `database` ([String](../../sql-reference/data-types/string.md)) — Name of the database.
- `table` ([String](../../sql-reference/data-types/string.md)) — Name of the table.
- `data_path` ([String](../../sql-reference/data-types/string.md)) — Path to the folder with local files.
- `is_blocked` ([UInt8](../../sql-reference/data-types/int-uint.md)) — Flag indicates whether sending local files to the server is blocked.
- `error_count` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of errors.
- `data_files` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of local files in a folder.
- `data_compressed_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Size of compressed data in local files, in bytes.
- `last_exception` ([String](../../sql-reference/data-types/string.md)) — Text message about the last error that occurred (if any).
**Example**
``` sql
SELECT * FROM system.distribution_queue LIMIT 1 FORMAT Vertical;
```
``` text
Row 1:
──────
database: default
table: dist
data_path: ./store/268/268bc070-3aad-4b1a-9cf2-4987580161af/default@127%2E0%2E0%2E2:9000/
is_blocked: 1
error_count: 0
data_files: 1
data_compressed_bytes: 499
last_exception:
```
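The last component of `data_path` in the sample output names the destination, with the dots written as `%2E`. Assuming this is ordinary percent-encoding (an assumption based only on the sample row above), it can be decoded for reading with the Python standard library:

```python
from urllib.parse import unquote

data_path = "./store/268/268bc070-3aad-4b1a-9cf2-4987580161af/default@127%2E0%2E0%2E2:9000/"

# Take the last path component and undo the percent-escaping.
destination = unquote(data_path.rstrip("/").rsplit("/", 1)[-1])
print(destination)  # default@127.0.0.2:9000
```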
[Original article](https://clickhouse.tech/docs/en/operations/system_tables/distribution_queue) <!--hide-->

View File

@ -93,6 +93,8 @@ Setting fields:
- `path` The absolute path to the file.
- `format` The file format. All the formats described in “[Formats](../../../interfaces/formats.md#formats)” are supported.
When a dictionary with a FILE source is created via a DDL command (`CREATE DICTIONARY ...`), the source file has to be located in the `user_files` directory, to prevent DB users from accessing arbitrary files on the ClickHouse node.
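The kind of check this restriction implies can be sketched in Python as follows. This is not the ClickHouse implementation; `is_inside` and the directory path are hypothetical and only show how a path can be confined to a base directory.

```python
import os


def is_inside(base_dir: str, candidate: str) -> bool:
    """Return True if candidate resolves to a path under base_dir."""
    base = os.path.realpath(base_dir)
    # Resolve "..", symlinks and absolute candidates before comparing.
    target = os.path.realpath(os.path.join(base, candidate))
    return os.path.commonpath([base, target]) == base


user_files = "/var/lib/clickhouse/user_files"  # hypothetical location
print(is_inside(user_files, "os.tsv"))            # True
print(is_inside(user_files, "../../etc/passwd"))  # False
```

Resolving the path before the comparison matters: a plain string prefix test would accept `user_files/../../etc/passwd`.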
## Executable File {#dicts-external_dicts_dict_sources-executable}
Working with executable files depends on [how the dictionary is stored in memory](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-layout.md). If the dictionary is stored using `cache` and `complex_key_cache`, ClickHouse requests the necessary keys by sending a request to the executable file's STDIN. Otherwise, ClickHouse starts the executable file and treats its output as dictionary data.
@ -108,17 +110,13 @@ Example of settings:
</source>
```
or
``` sql
SOURCE(EXECUTABLE(command 'cat /opt/dictionaries/os.tsv' format 'TabSeparated'))
```
Setting fields:
- `command` The absolute path to the executable file, or the file name (if the program directory is written to `PATH`).
- `format` The file format. All the formats described in “[Formats](../../../interfaces/formats.md#formats)” are supported.
This dictionary source can be configured only via XML configuration. Creating dictionaries with an executable source via DDL is disabled; otherwise, a DB user would be able to execute an arbitrary binary on the ClickHouse node.
## Http(s) {#dicts-external_dicts_dict_sources-http}
Working with an HTTP(s) server depends on [how the dictionary is stored in memory](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-layout.md). If the dictionary is stored using `cache` and `complex_key_cache`, ClickHouse requests the necessary keys by sending a request via the `POST` method.
@ -169,6 +167,8 @@ Setting fields:
- `name` Identifier name used for the header sent in the request.
- `value` Value set for a specific identifier name.
When creating a dictionary using the DDL command (`CREATE DICTIONARY ...`), remote hosts for HTTP dictionaries are checked against the contents of the `remote_url_allow_hosts` section of the config, to prevent database users from accessing arbitrary HTTP servers.
## ODBC {#dicts-external_dicts_dict_sources-odbc}
You can use this method to connect any database that has an ODBC driver.

View File

@ -366,7 +366,7 @@ SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(d
└────────────┴───────────┴───────────┴───────────┘
```
## date_trunc {#date_trunc}
## date\_trunc {#date_trunc}
Truncates date and time data to the specified part of date.
@ -435,7 +435,7 @@ Result:
- [toStartOfInterval](#tostartofintervaltime-or-data-interval-x-unit-time-zone)
# now {#now}
## now {#now}
Returns the current date and time.
@ -662,7 +662,7 @@ Result:
[Original article](https://clickhouse.tech/docs/en/query_language/functions/date_time_functions/) <!--hide-->
## FROM_UNIXTIME
## FROM\_UNIXTIME {#fromunixfime}
When there is only a single argument of integer type, it acts in the same way as `toDateTime` and returns a value of [DateTime](../../sql-reference/data-types/datetime.md) type.
@ -692,3 +692,147 @@ SELECT FROM_UNIXTIME(1234334543, '%Y-%m-%d %R:%S') AS DateTime
│ 2009-02-11 14:42:23 │
└─────────────────────┘
```
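The formatted result depends on the time zone in effect. As a cross-check outside ClickHouse, the Python sketch below formats the same timestamp in UTC and in UTC+8; the UTC+8 zone is an assumption inferred from the sample result above, not something stated in the source.

```python
from datetime import datetime, timedelta, timezone

ts = 1234334543
utc_plus_8 = timezone(timedelta(hours=8))  # assumed zone, inferred from the sample

utc_text = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
local_text = datetime.fromtimestamp(ts, tz=utc_plus_8).strftime("%Y-%m-%d %H:%M:%S")

print(utc_text)    # 2009-02-11 06:42:23
print(local_text)  # 2009-02-11 14:42:23
```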
## toModifiedJulianDay {#tomodifiedjulianday}
Converts a [Proleptic Gregorian calendar](https://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar) date in text form `YYYY-MM-DD` to a [Modified Julian Day](https://en.wikipedia.org/wiki/Julian_day#Variants) number in Int32. This function supports dates from `0000-01-01` to `9999-12-31`. It raises an exception if the argument cannot be parsed as a date, or the date is invalid.
**Syntax**
``` sql
toModifiedJulianDay(date)
```
**Parameters**
- `date` — Date in text form. [String](../../sql-reference/data-types/string.md) or [FixedString](../../sql-reference/data-types/fixedstring.md).
**Returned value**
- Modified Julian Day number.
Type: [Int32](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT toModifiedJulianDay('2020-01-01');
```
Result:
``` text
┌─toModifiedJulianDay('2020-01-01')─┐
│ 58849 │
└───────────────────────────────────┘
```
## toModifiedJulianDayOrNull {#tomodifiedjuliandayornull}
Similar to [toModifiedJulianDay()](#tomodifiedjulianday), but instead of raising exceptions it returns `NULL`.
**Syntax**
``` sql
toModifiedJulianDayOrNull(date)
```
**Parameters**
- `date` — Date in text form. [String](../../sql-reference/data-types/string.md) or [FixedString](../../sql-reference/data-types/fixedstring.md).
**Returned value**
- Modified Julian Day number.
Type: [Nullable(Int32)](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT toModifiedJulianDayOrNull('2020-01-01');
```
Result:
``` text
┌─toModifiedJulianDayOrNull('2020-01-01')─┐
│ 58849 │
└─────────────────────────────────────────┘
```
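The conversion can be reproduced outside ClickHouse with Python's `datetime.date`, which also uses the proleptic Gregorian calendar. The sketch below is not the ClickHouse implementation, and the function names are made up for the example. Note that Python dates start at year 1, so `0000-01-01` cannot be represented here.

```python
from datetime import date
from typing import Optional

MJD_OFFSET = 678576  # date(1858, 11, 17).toordinal(), the day with MJD 0


def to_modified_julian_day(text: str) -> int:
    """Parse 'YYYY-MM-DD' and return the Modified Julian Day number."""
    return date.fromisoformat(text).toordinal() - MJD_OFFSET


def to_modified_julian_day_or_none(text: str) -> Optional[int]:
    """Same conversion, but return None instead of raising on bad input."""
    try:
        return to_modified_julian_day(text)
    except ValueError:
        return None


print(to_modified_julian_day("2020-01-01"))          # 58849
print(to_modified_julian_day_or_none("2020-02-30"))  # None
```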
## fromModifiedJulianDay {#frommodifiedjulianday}
Converts a [Modified Julian Day](https://en.wikipedia.org/wiki/Julian_day#Variants) number to a [Proleptic Gregorian calendar](https://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar) date in text form `YYYY-MM-DD`. This function supports day number from `-678941` to `2973119` (which represent 0000-01-01 and 9999-12-31 respectively). It raises an exception if the day number is outside of the supported range.
**Syntax**
``` sql
fromModifiedJulianDay(day)
```
**Parameters**
- `day` — Modified Julian Day number. [Any integral types](../../sql-reference/data-types/int-uint.md).
**Returned value**
- Date in text form.
Type: [String](../../sql-reference/data-types/string.md)
**Example**
Query:
``` sql
SELECT fromModifiedJulianDay(58849);
```
Result:
``` text
┌─fromModifiedJulianDay(58849)─┐
│ 2020-01-01 │
└──────────────────────────────┘
```
## fromModifiedJulianDayOrNull {#frommodifiedjuliandayornull}
Similar to [fromModifiedJulianDay()](#frommodifiedjulianday), but instead of raising exceptions it returns `NULL`.
**Syntax**
``` sql
fromModifiedJulianDayOrNull(day)
```
**Parameters**
- `day` — Modified Julian Day number. [Any integral types](../../sql-reference/data-types/int-uint.md).
**Returned value**
- Date in text form.
Type: [Nullable(String)](../../sql-reference/data-types/string.md)
**Example**
Query:
``` sql
SELECT fromModifiedJulianDayOrNull(58849);
```
Result:
``` text
┌─fromModifiedJulianDayOrNull(58849)─┐
│ 2020-01-01 │
└────────────────────────────────────┘
```
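The reverse conversion can be sketched in Python in the same way. Again this is only an illustration with made-up function names; the range Python accepts (years 1 to 9999) differs from the range documented above.

```python
from datetime import date, timedelta
from typing import Optional

MJD_EPOCH = date(1858, 11, 17)  # Modified Julian Day 0


def from_modified_julian_day(day: int) -> str:
    """Return the 'YYYY-MM-DD' date for a Modified Julian Day number."""
    return (MJD_EPOCH + timedelta(days=day)).isoformat()


def from_modified_julian_day_or_none(day: int) -> Optional[str]:
    """Same conversion, but return None when the day is out of range."""
    try:
        return from_modified_julian_day(day)
    except OverflowError:
        return None


print(from_modified_julian_day(58849))            # 2020-01-01
print(from_modified_julian_day_or_none(10 ** 7))  # None
```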

View File

@ -111,4 +111,306 @@ Accepts a numeric argument and returns a UInt64 number close to 2 to the power o
Accepts a numeric argument and returns a UInt64 number close to 10 to the power of x.
## cosh(x) {#coshx}
[Hyperbolic cosine](https://in.mathworks.com/help/matlab/ref/cosh.html).
**Syntax**
``` sql
cosh(x)
```
**Parameters**
- `x` — The angle, in radians. Values from the interval: `-∞ < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- Values from the interval: `1 <= cosh(x) < +∞`.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT cosh(0);
```
Result:
``` text
┌─cosh(0)──┐
│ 1 │
└──────────┘
```
## acosh(x) {#acoshx}
[Inverse hyperbolic cosine](https://www.mathworks.com/help/matlab/ref/acosh.html).
**Syntax**
``` sql
acosh(x)
```
**Parameters**
- `x` — Hyperbolic cosine of angle. Values from the interval: `1 <= x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- The angle, in radians. Values from the interval: `0 <= acosh(x) < +∞`.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT acosh(1);
```
Result:
``` text
┌─acosh(1)─┐
│ 0 │
└──────────┘
```
**See Also**
- [cosh(x)](../../sql-reference/functions/math-functions.md#coshx)
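The value ranges stated for `cosh` and `acosh` have a practical consequence: since `cosh(x)` is even and `acosh` only returns non-negative values, a round trip loses the sign of `x`. A short check with Python's `math` module (used here only as an independent reference, not as ClickHouse code):

```python
import math

for x in (-2.0, 0.0, 2.0):
    y = math.cosh(x)      # always >= 1
    back = math.acosh(y)  # always >= 0, so the sign of x is lost
    print(x, y >= 1.0, math.isclose(back, abs(x), abs_tol=1e-12))
```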
## sinh(x) {#sinhx}
[Hyperbolic sine](https://www.mathworks.com/help/matlab/ref/sinh.html).
**Syntax**
``` sql
sinh(x)
```
**Parameters**
- `x` — The angle, in radians. Values from the interval: `-∞ < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- Values from the interval: `-∞ < sinh(x) < +∞`.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT sinh(0);
```
Result:
``` text
┌─sinh(0)──┐
│ 0 │
└──────────┘
```
## asinh(x) {#asinhx}
[Inverse hyperbolic sine](https://www.mathworks.com/help/matlab/ref/asinh.html).
**Syntax**
``` sql
asinh(x)
```
**Parameters**
- `x` — Hyperbolic sine of angle. Values from the interval: `-∞ < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- The angle, in radians. Values from the interval: `-∞ < asinh(x) < +∞`.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT asinh(0);
```
Result:
``` text
┌─asinh(0)─┐
│ 0 │
└──────────┘
```
**See Also**
- [sinh(x)](../../sql-reference/functions/math-functions.md#sinhx)
## atanh(x) {#atanhx}
[Inverse hyperbolic tangent](https://www.mathworks.com/help/matlab/ref/atanh.html).
**Syntax**
``` sql
atanh(x)
```
**Parameters**
- `x` — Hyperbolic tangent of angle. Values from the interval: `-1 < x < 1`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- The angle, in radians. Values from the interval: `-∞ < atanh(x) < +∞`.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT atanh(0);
```
Result:
``` text
┌─atanh(0)─┐
│ 0 │
└──────────┘
```
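The domain restriction follows from the closed form `atanh(x) = 0.5 * ln((1 + x) / (1 - x))`, which is only defined for `-1 < x < 1`. The Python sketch below checks the closed form against the standard library; `atanh_via_log` is a name made up for this example, and raising an error outside the domain is a choice of the sketch, not a statement about ClickHouse behavior.

```python
import math


def atanh_via_log(x: float) -> float:
    """atanh(x) = 0.5 * ln((1 + x) / (1 - x)), defined for -1 < x < 1."""
    if not -1.0 < x < 1.0:
        raise ValueError("atanh is defined only for -1 < x < 1")
    return 0.5 * math.log((1.0 + x) / (1.0 - x))


x = 0.5
print(math.isclose(atanh_via_log(x), math.atanh(x)))  # True
print(math.isclose(math.tanh(math.atanh(x)), x))      # True
```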
## atan2(y, x) {#atan2yx}
The [function](https://en.wikipedia.org/wiki/Atan2) calculates the angle in the Euclidean plane, given in radians, between the positive x axis and the ray to the point `(x, y) ≠ (0, 0)`.
**Syntax**
``` sql
atan2(y, x)
```
**Parameters**
- `y` — y-coordinate of the point through which the ray passes. [Float64](../../sql-reference/data-types/float.md#float32-float64).
- `x` — x-coordinate of the point through which the ray passes. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- The angle `θ` such that `−π < θ ≤ π`, in radians.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT atan2(1, 1);
```
Result:
``` text
┌────────atan2(1, 1)─┐
│ 0.7853981633974483 │
└────────────────────┘
```
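The reason for taking `y` and `x` separately, instead of computing `atan(y / x)`, is that the ratio alone does not say which quadrant the point is in. A small comparison using Python's `math` module as a reference:

```python
import math

for y, x in [(1, 1), (-1, -1), (1, -1), (0, -1)]:
    naive = math.atan(y / x)  # same ratio for opposite quadrants; breaks for x == 0
    full = math.atan2(y, x)   # uses the signs of both arguments
    print((y, x), round(naive, 4), round(full, 4))
```

For `(1, 1)` and `(-1, -1)` the ratio is the same, so `atan` returns the same angle for both, while `atan2` returns `π/4` and `-3π/4` respectively.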
## hypot(x, y) {#hypotxy}
Calculates the length of the hypotenuse of a right-angle triangle. The [function](https://en.wikipedia.org/wiki/Hypot) avoids problems that occur when squaring very large or very small numbers.
**Syntax**
``` sql
hypot(x, y)
```
**Parameters**
- `x` — The first cathetus of a right-angle triangle. [Float64](../../sql-reference/data-types/float.md#float32-float64).
- `y` — The second cathetus of a right-angle triangle. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- The length of the hypotenuse of a right-angle triangle.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT hypot(1, 1);
```
Result:
``` text
┌────────hypot(1, 1)─┐
│ 1.4142135623730951 │
└────────────────────┘
```
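The overflow problem mentioned above is easy to reproduce. In the Python sketch below (standard `math` module, used as a reference for IEEE 754 double behavior), squaring `1e200` overflows even though the true hypotenuse is representable:

```python
import math

x = y = 1e200
naive = math.sqrt(x * x + y * y)  # x * x overflows to inf
safe = math.hypot(x, y)           # avoids the intermediate overflow

print(naive)  # inf
print(safe)   # about 1.414e200
```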
## log1p(x) {#log1px}
Calculates `log(1+x)`. The [function](https://en.wikipedia.org/wiki/Natural_logarithm#lnp1) `log1p(x)` is more accurate than `log(1+x)` for small values of x.
**Syntax**
``` sql
log1p(x)
```
**Parameters**
- `x` — Values from the interval: `-1 < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Returned value**
- Values from the interval: `-∞ < log1p(x) < +∞`.
Type: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Example**
Query:
``` sql
SELECT log1p(0);
```
Result:
``` text
┌─log1p(0)─┐
│ 0 │
└──────────┘
```
**See Also**
- [log(x)](../../sql-reference/functions/math-functions.md#logx-lnx)
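The accuracy difference shows up as soon as `x` is smaller than the spacing of doubles around 1. A minimal check with Python's `math` module, used here as a reference for double-precision behavior:

```python
import math

x = 1e-17
naive = math.log(1 + x)  # 1 + x rounds to exactly 1.0, so the result is 0.0
exact = math.log1p(x)    # approximately 1e-17

print(naive)
print(exact)
```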
[Original article](https://clickhouse.tech/docs/en/query_language/functions/math_functions/) <!--hide-->

View File

@ -131,6 +131,40 @@ For example:
- `cutToFirstSignificantSubdomain('www.tr') = 'www.tr'`.
- `cutToFirstSignificantSubdomain('tr') = ''`.
### cutToFirstSignificantSubdomainCustom {#cuttofirstsignificantsubdomaincustom}
Same as `cutToFirstSignificantSubdomain` but accepts a custom TLD list name. It is useful if:
- you need a fresh TLD list,
- or you have a custom one.
Configuration example:
```xml
<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
<!-- https://publicsuffix.org/list/public_suffix_list.dat -->
<public_suffix_list>public_suffix_list.dat</public_suffix_list>
<!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>
```
Example:
- `cutToFirstSignificantSubdomainCustom('https://news.yandex.com.tr/', 'public_suffix_list') = 'yandex.com.tr'`.
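The lookup against a custom list can be modeled roughly as "find the longest listed suffix of the host and keep one more label". The Python sketch below is a simplified model under that assumption, not the ClickHouse implementation: the function name and the in-memory `custom_list` are made up, the wildcard and exception rules of the Public Suffix List are ignored, and returning an empty string when no suffix matches is a choice of the sketch.

```python
from urllib.parse import urlsplit


def cut_to_first_significant_subdomain(url: str, suffixes: set) -> str:
    """Return the longest known suffix of the host plus one more label."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Candidates are tried from the longest suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            # A bare suffix has no significant subdomain.
            return ".".join(labels[i - 1:]) if i > 0 else ""
    return ""


custom_list = {"tr", "com.tr", "com"}  # stand-in for a loaded list file
print(cut_to_first_significant_subdomain("https://news.yandex.com.tr/", custom_list))
# yandex.com.tr
```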
### cutToFirstSignificantSubdomainCustomWithWWW {#cuttofirstsignificantsubdomaincustomwithwww}
Same as `cutToFirstSignificantSubdomainWithWWW` but accepts a custom TLD list name.
### firstSignificantSubdomainCustom {#firstsignificantsubdomaincustom}
Same as `firstSignificantSubdomain` but accepts a custom TLD list name.
### port(URL\[, default_port = 0\]) {#port}
Returns the port or `default_port` if there is no port in the URL (or in case of validation error).

View File

@ -53,7 +53,7 @@ KILL MUTATION [ON CLUSTER cluster]
Tries to cancel and remove [mutations](../../sql-reference/statements/alter/index.md#alter-mutations) that are currently executing. Mutations to cancel are selected from the [`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations) table using the filter specified by the `WHERE` clause of the `KILL` query.
A test query (`TEST`) only checks the user's rights and displays a list of queries to stop.
A test query (`TEST`) only checks the user's rights and displays a list of mutations to stop.
Examples:

View File

@ -9,6 +9,7 @@ ClickHouse can accept (`INSERT`) and return (`SELECT
The supported formats and whether they can be used in `INSERT` and `SELECT` queries are listed in the table below.
| Format | INSERT | SELECT |
|-----------------------------------------------------------------------------------------|--------|--------|
| [TabSeparated](#tabseparated) | ✔ | ✔ |
@ -56,6 +57,7 @@ ClickHouse can accept (`INSERT`) and return (`SELECT
| [XML](#xml) | ✗ | ✔ |
| [CapnProto](#capnproto) | ✔ | ✗ |
| [LineAsString](#lineasstring) | ✔ | ✗ |
| [RawBLOB](#rawblob) | ✔ | ✔ |
You can control some format processing parameters with the ClickHouse settings. For more information read the [Settings](../operations/settings/settings.md) section.
@ -1248,4 +1250,45 @@ SELECT * FROM line_as_string;
└───────────────────────────────────────────────────┘
```
## RawBLOB {#rawblob}
In this format, all input data is read to a single value. It is possible to parse only a table with a single field of type [String](../sql-reference/data-types/string.md) or similar.
The result is output in binary format without delimiters and escaping. If more than one value is output, the format is ambiguous, and it will be impossible to read the data back.
Below is a comparison of the formats `RawBLOB` and [TabSeparatedRaw](#tabseparatedraw).
`RawBLOB`:
- data is output in binary format, no escaping;
- there are no delimiters between values;
- no newline at the end of each value.
[TabSeparatedRaw](#tabseparatedraw):
- data is output without escaping;
- the rows contain values separated by tabs;
- there is a line feed after the last value in every row.
The following is a comparison of the `RawBLOB` and [RowBinary](#rowbinary) formats.
`RawBLOB`:
- String fields are output without being prefixed by length.
`RowBinary`:
- String fields are represented as length in varint format (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by the bytes of the string.
When empty data is passed to the `RawBLOB` input, ClickHouse throws an exception:
``` text
Code: 108. DB::Exception: No data to insert
```
**Example**
``` bash
$ clickhouse-client --query "CREATE TABLE {some_table} (a String) ENGINE = Memory;"
$ cat {filename} | clickhouse-client --query="INSERT INTO {some_table} FORMAT RawBLOB"
$ clickhouse-client --query "SELECT * FROM {some_table} FORMAT RawBLOB" | md5sum
```
Result:
``` text
f9725a22f9191e064120d718e26862a9 -
```
[Original article](https://clickhouse.tech/docs/ru/interfaces/formats/) <!--hide-->

View File

@ -297,7 +297,7 @@ FORMAT Null;
**See also**
- [JOIN clause](../../sql-reference/statements/select/join.md#select-join)
- [Движоy таблиц Join](../../engines/table-engines/special/join.md)
- [Движок таблиц Join](../../engines/table-engines/special/join.md)
## max_partitions_per_insert_block {#max-partitions-per-insert-block}

View File

@ -2231,10 +2231,41 @@ SELECT CAST(toNullable(toInt32(0)) AS Int32) as x, toTypeName(x);
## output_format_tsv_null_representation {#output_format_tsv_null_representation}
Позволяет настраивать представление `NULL` для формата выходных данных [TSV](../../interfaces/formats.md#tabseparated). Настройка управляет форматом выходных данных, `\N` является единственным поддерживаемым представлением для формата входных данных TSV.
Определяет представление `NULL` для формата выходных данных [TSV](../../interfaces/formats.md#tabseparated). Пользователь может установить в качестве значения любую строку.
Значение по умолчанию: `\N`.
**Примеры**
Запрос
```sql
SELECT * FROM tsv_custom_null FORMAT TSV;
```
Результат
```text
788
\N
\N
```
Запрос
```sql
SET output_format_tsv_null_representation = 'My NULL';
SELECT * FROM tsv_custom_null FORMAT TSV;
```
Результат
```text
788
My NULL
My NULL
```
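The effect of the setting can be modelled with a small Python sketch. This is not ClickHouse code: `render_tsv` is a hypothetical helper, and it ignores TSV escaping of tabs and newlines. It only shows that the chosen string replaces every `NULL` on output.

```python
def render_tsv(rows, null_representation=r"\N"):
    """Render rows as tab-separated lines, substituting a string for None."""
    lines = []
    for row in rows:
        cells = [null_representation if v is None else str(v) for v in row]
        lines.append("\t".join(cells))
    return "\n".join(lines) + "\n"


rows = [[788], [None], [None]]
default_output = render_tsv(rows)            # uses \N
custom_output = render_tsv(rows, "My NULL")  # uses the custom string
```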
## output_format_json_array_of_rows {#output-format-json-array-of-rows}
Позволяет выводить все строки в виде массива JSON в формате [JSONEachRow](../../interfaces/formats.md#jsoneachrow).

View File

@ -63,10 +63,18 @@ int32samoa: 1546300800
Переводит дату или дату-с-временем в число типа UInt16, содержащее номер года (AD).
## toQuarter {#toquarter}
Переводит дату или дату-с-временем в число типа UInt8, содержащее номер квартала.
## toMonth {#tomonth}
Переводит дату или дату-с-временем в число типа UInt8, содержащее номер месяца (1-12).
## toDayOfYear {#todayofyear}
Переводит дату или дату-с-временем в число типа UInt16, содержащее номер дня года (1-366).
## toDayOfMonth {#todayofmonth}
Переводит дату или дату-с-временем в число типа UInt8, содержащее номер дня в месяце (1-31).
@ -128,6 +136,22 @@ SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp
Округляет дату или дату-с-временем вниз до первого дня года.
Возвращается дата.
## toStartOfISOYear {#tostartofisoyear}
Округляет дату или дату-с-временем вниз до первого дня ISO года. Возвращается дата.
The start of an ISO year differs from the start of a regular year because, according to [ISO 8601:1988](https://en.wikipedia.org/wiki/ISO_8601), the first week of a year is the week that has four or more days in that year.
1 January 2017 was a Sunday, so the first ISO week of 2017 began on Monday, 2 January. Therefore 1 January 2017 belongs to ISO year 2016, which began on 2016-01-04.
```sql
SELECT toStartOfISOYear(toDate('2017-01-01')) AS ISOYear20170101;
```
```text
┌─ISOYear20170101─┐
│ 2016-01-04 │
└─────────────────┘
```
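The same date arithmetic can be cross-checked with Python's standard library. This is a sketch for verification only; `start_of_iso_year` is a hypothetical helper name, and `date.fromisocalendar` requires Python 3.8 or later.

```python
from datetime import date


def start_of_iso_year(d: date) -> date:
    """Return the Monday of ISO week 1 of the ISO year that contains d."""
    iso_year = d.isocalendar()[0]
    return date.fromisocalendar(iso_year, 1, 1)


result_jan_1 = start_of_iso_year(date(2017, 1, 1))  # still ISO year 2016
result_jan_2 = start_of_iso_year(date(2017, 1, 2))  # first day of ISO year 2017
```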
## toStartOfQuarter {#tostartofquarter}
Округляет дату или дату-с-временем вниз до первого дня квартала.
@ -147,6 +171,12 @@ SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp
Округляет дату или дату-с-временем вниз до ближайшего понедельника.
Возвращается дата.
## toStartOfWeek(t[,mode]) {#tostartofweek}
Rounds a date or date with time down to the nearest Sunday or Monday, depending on mode.
Возвращается дата.
Аргумент mode работает точно так же, как аргумент mode [toWeek()](#toweek). Если аргумент mode опущен, то используется режим 0.
## toStartOfDay {#tostartofday}
Округляет дату-с-временем вниз до начала дня. Возвращается дата-с-временем.
@ -243,6 +273,10 @@ WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64 SELECT toStartOfSecond(d
Переводит дату-с-временем или дату в номер года, начиная с некоторого фиксированного момента в прошлом.
## toRelativeQuarterNum {#torelativequarternum}
Переводит дату-с-временем или дату в номер квартала, начиная с некоторого фиксированного момента в прошлом.
## toRelativeMonthNum {#torelativemonthnum}
Переводит дату-с-временем или дату в номер месяца, начиная с некоторого фиксированного момента в прошлом.
@ -267,6 +301,102 @@ WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64 SELECT toStartOfSecond(d
Переводит дату-с-временем в номер секунды, начиная с некоторого фиксированного момента в прошлом.
## toISOYear {#toisoyear}
Переводит дату-с-временем или дату в число типа UInt16, содержащее номер ISO года. ISO год отличается от обычного года, потому что в соответствии с [ISO 8601:1988](https://en.wikipedia.org/wiki/ISO_8601) ISO год начинается необязательно первого января.
Пример:
```sql
SELECT
toDate('2017-01-01') AS date,
toYear(date),
toISOYear(date)
```
```text
┌───────date─┬─toYear(toDate('2017-01-01'))─┬─toISOYear(toDate('2017-01-01'))─┐
│ 2017-01-01 │ 2017 │ 2016 │
└────────────┴──────────────────────────────┴─────────────────────────────────┘
```
## toISOWeek {#toisoweek}
Переводит дату-с-временем или дату в число типа UInt8, содержащее номер ISO недели.
The start of an ISO year differs from the start of a regular year because, according to [ISO 8601:1988](https://en.wikipedia.org/wiki/ISO_8601), the first week of a year is the week that has four or more days in that year.
1 January 2017 was a Sunday, so the first ISO week of 2017 began on Monday, 2 January. Therefore 1 January 2017 falls in the last week of 2016.
```sql
SELECT
toISOWeek(toDate('2017-01-01')) AS ISOWeek20170101,
toISOWeek(toDate('2017-01-02')) AS ISOWeek20170102
```
```text
┌─ISOWeek20170101─┬─ISOWeek20170102─┐
│ 52 │ 1 │
└─────────────────┴─────────────────┘
```
## toWeek(date\[, mode\]\[, timezone\]) {#toweek}
Converts a date or date with time to a UInt8 number containing the week number. The second argument, mode, specifies whether the week starts on Sunday or Monday and whether the return value is in the range 0 to 53 or 1 to 53. If mode is omitted, mode 0 is used.
`toISOWeek()` is equivalent to `toWeek(date,3)`.
Description of the modes:
| Mode | First day of week | Range | Week 1 is the first week … |
| ---- | ----------------- | ----- | -------------------------- |
| 0 | Sunday | 0-53 | with a Sunday in this year |
| 1 | Monday | 0-53 | with 4 or more days this year |
| 2 | Sunday | 1-53 | with a Sunday in this year |
| 3 | Monday | 1-53 | with 4 or more days this year |
| 4 | Sunday | 0-53 | with 4 or more days this year |
| 5 | Monday | 0-53 | with a Monday in this year |
| 6 | Sunday | 1-53 | with 4 or more days this year |
| 7 | Monday | 1-53 | with a Monday in this year |
| 8 | Sunday | 1-53 | contains January 1 |
| 9 | Monday | 1-53 | contains January 1 |
For modes described as "with 4 or more days this year", weeks are numbered according to ISO 8601:1988:
- If the week containing January 1 has 4 or more days in the new year, it is week 1.
- Otherwise, it is the last week of the previous year, and the next week is week 1.
For modes described as "contains January 1", week 1 is the week that contains January 1. It does not matter how many days of the new year that week contains, even if it contains only one.
**Пример**
```sql
SELECT toDate('2016-12-27') AS date, toWeek(date) AS week0, toWeek(date,1) AS week1, toWeek(date,9) AS week9;
```
```text
┌───────date─┬─week0─┬─week1─┬─week9─┐
│ 2016-12-27 │ 52 │ 52 │ 1 │
└────────────┴───────┴───────┴───────┘
```
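Three of the modes can be reproduced in Python to check the example above. This is a sketch under assumptions: the function names are hypothetical, only modes 0, 3, and 9 are covered, and the mode 9 logic is derived from the "contains January 1" description rather than from the ClickHouse source.

```python
from datetime import date, timedelta


def week_mode0(d: date) -> int:
    """Sunday-first, range 0-53: days before the first Sunday are week 0."""
    return int(d.strftime("%U"))


def week_mode3(d: date) -> int:
    """Monday-first, range 1-53: the ISO 8601 week number."""
    return d.isocalendar()[1]


def week_mode9(d: date) -> int:
    """Monday-first, range 1-53: week 1 is the week containing January 1."""
    monday = d - timedelta(days=d.weekday())
    sunday = monday + timedelta(days=6)
    # If the week spills into the next year, it contains that year's January 1.
    year = sunday.year
    jan1 = date(year, 1, 1)
    first_monday = jan1 - timedelta(days=jan1.weekday())
    return (monday - first_monday).days // 7 + 1


sample = date(2016, 12, 27)
weeks = (week_mode0(sample), week_mode3(sample), week_mode9(sample))
```

For 2016-12-27 the week runs from 26 December to 1 January, so under mode 9 it already counts as week 1 of 2017.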
## toYearWeek(date[,mode]) {#toyearweek}
Возвращает год и неделю для даты. Год в результате может отличаться от года в аргументе даты для первой и последней недели года.
Аргумент mode работает точно так же, как аргумент mode [toWeek()](#toweek). Если mode не задан, используется режим 0.
`toISOYear()` is equivalent to `intDiv(toYearWeek(date,3),100)`.
**Пример**
```sql
SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(date,1) AS yearWeek1, toYearWeek(date,9) AS yearWeek9;
```
```text
┌───────date─┬─yearWeek0─┬─yearWeek1─┬─yearWeek9─┐
│ 2016-12-27 │ 201652 │ 201652 │ 201701 │
└────────────┴───────────┴───────────┴───────────┘
```
## date_trunc {#date_trunc}
Отсекает от даты и времени части, меньшие чем указанная часть.

View File

@ -103,4 +103,306 @@ SELECT erf(3 / sqrt(2))
Принимает два числовых аргумента x и y. Возвращает число типа Float64, близкое к x в степени y.
## cosh(x) {#coshx}
[Гиперболический косинус](https://help.scilab.org/docs/5.4.0/ru_RU/cosh.html).
**Синтаксис**
``` sql
cosh(x)
```
**Параметры**
- `x` — угол в радианах. Значения из интервала: `-∞ < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Значения из интервала: `1 <= cosh(x) < +∞`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT cosh(0);
```
Результат:
``` text
┌─cosh(0)──┐
│ 1 │
└──────────┘
```
## acosh(x) {#acoshx}
[Обратный гиперболический косинус](https://help.scilab.org/docs/5.4.0/ru_RU/acosh.html).
**Синтаксис**
``` sql
acosh(x)
```
**Параметры**
- `x` — гиперболический косинус угла. Значения из интервала: `1 <= x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Угол в радианах. Значения из интервала: `0 <= acosh(x) < +∞`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT acosh(1);
```
Результат:
``` text
┌─acosh(1)─┐
│ 0 │
└──────────┘
```
**Смотрите также**
- [cosh(x)](../../sql-reference/functions/math-functions.md#coshx)
## sinh(x) {#sinhx}
[Гиперболический синус](https://help.scilab.org/docs/5.4.0/ru_RU/sinh.html).
**Синтаксис**
``` sql
sinh(x)
```
**Параметры**
- `x` — угол в радианах. Значения из интервала: `-∞ < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Значения из интервала: `-∞ < sinh(x) < +∞`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT sinh(0);
```
Результат:
``` text
┌─sinh(0)──┐
│ 0 │
└──────────┘
```
## asinh(x) {#asinhx}
[Обратный гиперболический синус](https://help.scilab.org/docs/5.4.0/ru_RU/asinh.html).
**Синтаксис**
``` sql
asinh(x)
```
**Параметры**
- `x` — гиперболический синус угла. Значения из интервала: `-∞ < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Угол в радианах. Значения из интервала: `-∞ < asinh(x) < +∞`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT asinh(0);
```
Результат:
``` text
┌─asinh(0)─┐
│ 0 │
└──────────┘
```
**Смотрите также**
- [sinh(x)](../../sql-reference/functions/math-functions.md#sinhx)
## atanh(x) {#atanhx}
[Обратный гиперболический тангенс](https://help.scilab.org/docs/5.4.0/ru_RU/atanh.html).
**Синтаксис**
``` sql
atanh(x)
```
**Параметры**
- `x` — hyperbolic tangent of the angle. Values from the interval: `-1 < x < 1`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Угол в радианах. Значения из интервала: `-∞ < atanh(x) < +∞`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT atanh(0);
```
Результат:
``` text
┌─atanh(0)─┐
│ 0 │
└──────────┘
```
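The domain restriction follows from the closed form `atanh(x) = 0.5 * ln((1 + x) / (1 - x))`. The Python sketch below uses a hypothetical helper, `atanh_via_log`, to check that identity against the standard library. It says nothing about how ClickHouse handles arguments outside the interval.

```python
import math


def atanh_via_log(x: float) -> float:
    """Inverse hyperbolic tangent via its logarithmic closed form."""
    if not -1.0 < x < 1.0:
        raise ValueError("atanh is defined only for -1 < x < 1")
    return 0.5 * math.log((1.0 + x) / (1.0 - x))


value = atanh_via_log(0.5)
round_trip = math.tanh(atanh_via_log(0.3))  # should recover 0.3
```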
## atan2(y, x) {#atan2yx}
[Функция](https://msoffice-prowork.com/ref/excel/excelfunc/math/atan2/) вычисляет угол в радианах между положительной осью x и линией, проведенной из начала координат в точку `(x, y) ≠ (0, 0)`.
**Синтаксис**
``` sql
atan2(y, x)
```
**Параметры**
- `y` — координата y точки, в которую проведена линия. [Float64](../../sql-reference/data-types/float.md#float32-float64).
- `x` — координата х точки, в которую проведена линия. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Угол `θ` в радианах из интервала: `−π < θ ≤ π`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT atan2(1, 1);
```
Результат:
``` text
┌────────atan2(1, 1)─┐
│ 0.7853981633974483 │
└────────────────────┘
```
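The reason for taking two arguments is that the signs of `y` and `x` select the quadrant, which a single ratio `y / x` cannot do. A small Python illustration using `math.atan2`, which follows the same mathematical definition (variable names are illustrative):

```python
import math

first_quadrant = math.atan2(1, 1)    # point (1, 1)
third_quadrant = math.atan2(-1, -1)  # point (-1, -1)
naive_third = math.atan(-1 / -1)     # the ratio loses the quadrant
negative_x_axis = math.atan2(0, -1)  # point (-1, 0), upper bound of the range
```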
## hypot(x, y) {#hypotxy}
Вычисляет длину гипотенузы прямоугольного треугольника. При использовании этой [функции](https://php.ru/manual/function.hypot.html) не возникает проблем при возведении в квадрат очень больших или очень малых чисел.
**Синтаксис**
``` sql
hypot(x, y)
```
**Параметры**
- `x` — первый катет прямоугольного треугольника. [Float64](../../sql-reference/data-types/float.md#float32-float64).
- `y` — второй катет прямоугольного треугольника. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Длина гипотенузы прямоугольного треугольника.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT hypot(1, 1);
```
Результат:
``` text
┌────────hypot(1, 1)─┐
│ 1.4142135623730951 │
└────────────────────┘
```
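The note about very large and very small numbers can be demonstrated with Python's `math.hypot`, used here as a stand-in for the same numerical idea. The helper `naive_hypot` is hypothetical and shows the straightforward formula that overflows or underflows.

```python
import math


def naive_hypot(x: float, y: float) -> float:
    """Direct formula: the intermediate squares can overflow or underflow."""
    return math.sqrt(x * x + y * y)


big_naive = naive_hypot(1e200, 1e200)      # 1e400 is not representable
big_safe = math.hypot(1e200, 1e200)
small_naive = naive_hypot(1e-200, 1e-200)  # squares underflow to 0.0
small_safe = math.hypot(1e-200, 1e-200)
```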
## log1p(x) {#log1px}
Вычисляет `log(1+x)`. [Функция](https://help.scilab.org/docs/6.0.1/ru_RU/log1p.html) `log1p(x)` является более точной, чем функция `log(1+x)` для малых значений x.
**Синтаксис**
``` sql
log1p(x)
```
**Параметры**
- `x` — значения из интервала: `-1 < x < +∞`. [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Возвращаемое значение**
- Значения из интервала: `-∞ < log1p(x) < +∞`.
Тип: [Float64](../../sql-reference/data-types/float.md#float32-float64).
**Пример**
Запрос:
``` sql
SELECT log1p(0);
```
Результат:
``` text
┌─log1p(0)─┐
│ 0 │
└──────────┘
```
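The accuracy claim for small `x` is easy to reproduce with Python's `math` module, which exposes the same pair of functions. In double precision, `1 + x` rounds to exactly `1` for a tiny `x`, so the ordinary logarithm loses the value entirely.

```python
import math

x = 1e-20
via_log = math.log(1 + x)  # 1 + x rounds to 1.0, so the result is 0.0
via_log1p = math.log1p(x)  # keeps the small value
```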
**Смотрите также**
- [log(x)](../../sql-reference/functions/math-functions.md#logx)
[Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/math_functions/) <!--hide-->

View File

@ -12,6 +12,7 @@ toc_title: SYSTEM
- [DROP MARK CACHE](#query_language-system-drop-mark-cache)
- [DROP UNCOMPRESSED CACHE](#query_language-system-drop-uncompressed-cache)
- [DROP COMPILED EXPRESSION CACHE](#query_language-system-drop-compiled-expression-cache)
- [DROP REPLICA](#query_language-system-drop-replica)
- [FLUSH LOGS](#query_language-system-flush_logs)
- [RELOAD CONFIG](#query_language-system-reload-config)
- [SHUTDOWN](#query_language-system-shutdown)
@ -66,6 +67,24 @@ SELECT name, status FROM system.dictionaries;
Сбрасывает кеш «засечек» (`mark cache`). Используется при разработке ClickHouse и тестах производительности.
## DROP REPLICA {#query_language-system-drop-replica}
Мертвые реплики можно удалить, используя следующий синтаксис:
``` sql
SYSTEM DROP REPLICA 'replica_name' FROM TABLE database.table;
SYSTEM DROP REPLICA 'replica_name' FROM DATABASE database;
SYSTEM DROP REPLICA 'replica_name';
SYSTEM DROP REPLICA 'replica_name' FROM ZKPATH '/path/to/table/in/zk';
```
Removes the replica path from ZooKeeper. This is useful when the replica is dead and its metadata cannot be removed from ZooKeeper with `DROP TABLE` because the table no longer exists. `DROP REPLICA` can drop only an inactive or stale replica; it cannot drop the local replica, use `DROP TABLE` for that. `DROP REPLICA` does not drop any tables and does not remove any data or metadata from disk.
The first command removes the metadata of the `'replica_name'` replica for the `database.table` table.
The second command removes the metadata of the `'replica_name'` replica for all tables in the `database` database.
The third command removes the metadata of the `'replica_name'` replica for all tables that exist on the local server (the list of tables is generated from the local replica).
The fourth command is useful for removing the metadata of a dead replica when all other replicas of the table have already been dropped, so the table's ZooKeeper path must be specified explicitly. The ZooKeeper path is the first argument of the `ReplicatedMergeTree` engine at table creation.
## DROP UNCOMPRESSED CACHE {#query_language-system-drop-uncompressed-cache}
Сбрасывает кеш не сжатых данных. Используется при разработке ClickHouse и тестах производительности.

View File

@ -46,7 +46,7 @@ ClickHouse是一个用于联机分析(OLAP)的列式数据库管理系统(DBMS)
- 处理单个查询时需要高吞吐量(每台服务器每秒可达数十亿行)
- 事务不是必须的
- 对数据一致性要求低
- 每个查询有一个大表。除了他以外,其他的都很小。
- 每个查询有一个大表。除了他以外,其他的都很小。
- 查询结果明显小于源数据。换句话说数据经过过滤或聚合因此结果适合于单个服务器的RAM中
很容易可以看出OLAP场景与其他通常业务场景(例如,OLTP或K/V)有很大的不同, 因此想要使用OLTP或Key-Value数据库去高效的处理分析查询场景并不是非常完美的适用方案。例如使用OLAP数据库去处理分析请求通常要优于使用MongoDB或Redis去处理分析请求。

View File

@ -20,7 +20,37 @@ SELECT
## toTimeZone {#totimezone}
将Date或DateTime转换为指定的时区。
将Date或DateTime转换为指定的时区。 时区是Date/DateTime类型的属性。 表字段或结果集的列的内部值(秒数)不会更改,列的类型会更改,并且其字符串表示形式也会相应更改。
```sql
SELECT
toDateTime('2019-01-01 00:00:00', 'UTC') AS time_utc,
toTypeName(time_utc) AS type_utc,
toInt32(time_utc) AS int32utc,
toTimeZone(time_utc, 'Asia/Yekaterinburg') AS time_yekat,
toTypeName(time_yekat) AS type_yekat,
toInt32(time_yekat) AS int32yekat,
toTimeZone(time_utc, 'US/Samoa') AS time_samoa,
toTypeName(time_samoa) AS type_samoa,
toInt32(time_samoa) AS int32samoa
FORMAT Vertical;
```
```text
Row 1:
──────
time_utc: 2019-01-01 00:00:00
type_utc: DateTime('UTC')
int32utc: 1546300800
time_yekat: 2019-01-01 05:00:00
type_yekat: DateTime('Asia/Yekaterinburg')
int32yekat: 1546300800
time_samoa: 2018-12-31 13:00:00
type_samoa: DateTime('US/Samoa')
int32samoa: 1546300800
```
`toTimeZone(time_utc, 'Asia/Yekaterinburg')` converts the `DateTime('UTC')` type to `DateTime('Asia/Yekaterinburg')`. The internal value (Unix timestamp) 1546300800 stays the same, but the string representation (the result of the toString() function) changes from `time_utc: 2019-01-01 00:00:00` to `time_yekat: 2019-01-01 05:00:00`.
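The same behaviour, one stored instant with several renderings, can be reproduced in Python. This sketch uses fixed UTC offsets as stand-ins for the named zones: it assumes UTC+5 for Asia/Yekaterinburg and UTC-11 for US/Samoa on that date, which is what the output above implies.

```python
from datetime import datetime, timedelta, timezone

UNIX_TS = 1546300800  # the internal value from the example

zones = {
    "UTC": timezone.utc,
    "UTC+5": timezone(timedelta(hours=5)),     # assumed Yekaterinburg offset
    "UTC-11": timezone(timedelta(hours=-11)),  # assumed Samoa offset
}

# The string form depends on the zone attached to the value.
rendered = {
    name: datetime.fromtimestamp(UNIX_TS, tz).strftime("%Y-%m-%d %H:%M:%S")
    for name, tz in zones.items()
}

# The underlying number of seconds is identical in every zone.
timestamps = {
    int(datetime.fromtimestamp(UNIX_TS, tz).timestamp()) for tz in zones.values()
}
```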
## toYear {#toyear}
@ -34,15 +64,15 @@ SELECT
Converts a Date or DateTime to a UInt8 number containing the month number (1-12).
## 今天一年 {#todayofyear}
## toDayOfYear {#todayofyear}
Converts a Date or DateTime to a UInt16 number containing the day of the year (1-366).
## 今天月 {#todayofmonth}
## toDayOfMonth {#todayofmonth}
Converts a Date or DateTime to a UInt8 number containing the day of the month (1-31).
## 今天一周 {#todayofweek}
## toDayOfWeek {#todayofweek}
Converts a Date or DateTime to a UInt8 number containing the day of the week (Monday is 1, Sunday is 7).
@ -55,31 +85,61 @@ SELECT
Converts a DateTime to a UInt8 number containing the minute of the hour (0-59).
## {#tosecond}
## toSecond {#tosecond}
Converts a DateTime to a UInt8 number containing the second of the minute (0-59).
Leap seconds are not counted.
## toUnixTimestamp {#tounixtimestamp}
## toUnixTimestamp {#to-unix-timestamp}
将DateTime转换为unix时间戳。
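For a DateTime argument: converts the value to a UInt32 number, the Unix timestamp (https://en.wikipedia.org/wiki/Unix_time).
For a String argument: converts the input string to a datetime according to the time zone (the optional second argument; the server time zone is used by default) and returns the corresponding Unix timestamp.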
## 开始一年 {#tostartofyear}
**语法**
``` sql
toUnixTimestamp(datetime)
toUnixTimestamp(str, [timezone])
```
**返回值**
- 返回 unix timestamp.
类型: `UInt32`.
**示例**
查询:
``` sql
SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp
```
结果:
``` text
┌─unix_timestamp─┐
│ 1509836867 │
└────────────────┘
```
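The result can be cross-checked with Python. The sketch assumes a fixed UTC+9 offset for `Asia/Tokyo`, which has no daylight saving time; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

TOKYO = timezone(timedelta(hours=9))  # assumed fixed offset for Asia/Tokyo

# Interpret the wall-clock string in Tokyo time, then take seconds since epoch.
local = datetime(2017, 11, 5, 8, 7, 47, tzinfo=TOKYO)
unix_timestamp = int(local.timestamp())
```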
## toStartOfYear {#tostartofyear}
将Date或DateTime向前取整到本年的第一天。
返回Date类型。
## 今年开始 {#tostartofisoyear}
## toStartOfISOYear {#tostartofisoyear}
将Date或DateTime向前取整到ISO本年的第一天。
返回Date类型。
## 四分之一开始 {#tostartofquarter}
## toStartOfQuarter {#tostartofquarter}
将Date或DateTime向前取整到本季度的第一天。
返回Date类型。
## 到月份开始 {#tostartofmonth}
## toStartOfMonth {#tostartofmonth}
将Date或DateTime向前取整到本月的第一天。
返回Date类型。
@ -92,27 +152,90 @@ SELECT
将Date或DateTime向前取整到本周的星期一。
返回Date类型。
## 今天开始 {#tostartofday}
## toStartOfWeek(t\[,mode\]) {#tostartofweek}
将DateTime向前取整到当日的开始。
按mode将Date或DateTime向前取整到最近的星期日或星期一。
返回Date类型。
mode参数的工作方式与toWeek()的mode参数完全相同。 对于单参数语法mode使用默认值0。
## 开始一小时 {#tostartofhour}
## toStartOfDay {#tostartofday}
将DateTime向前取整到今天的开始。
## toStartOfHour {#tostartofhour}
将DateTime向前取整到当前小时的开始。
## to startofminute {#tostartofminute}
## toStartOfMinute {#tostartofminute}
将DateTime向前取整到当前分钟的开始。
## to startoffiveminute {#tostartoffiveminute}
## toStartOfSecond {#tostartofsecond}
将DateTime向前取整到当前秒数的开始。
**语法**
``` sql
toStartOfSecond(value[, timezone])
```
**参数**
- `value` — 时间和日期[DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — 返回值的[Timezone](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (可选参数)。 如果未指定将使用 `value` 参数的时区。 [String](../../sql-reference/data-types/string.md)。
**返回值**
- 输入值毫秒部分为零。
类型: [DateTime64](../../sql-reference/data-types/datetime64.md).
**示例**
不指定时区查询:
``` sql
WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64
SELECT toStartOfSecond(dt64);
```
结果:
``` text
┌───toStartOfSecond(dt64)─┐
│ 2020-01-01 10:20:30.000 │
└─────────────────────────┘
```
指定时区查询:
``` sql
WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64
SELECT toStartOfSecond(dt64, 'Europe/Moscow');
```
结果:
``` text
┌─toStartOfSecond(dt64, 'Europe/Moscow')─┐
│ 2020-01-01 13:20:30.000 │
└────────────────────────────────────────┘
```
**参考**
- [Timezone](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) 服务器配置选项。
## toStartOfFiveMinute {#tostartoffiveminute}
将DateTime以五分钟为单位向前取整到最接近的时间点。
## 开始分钟 {#tostartoftenminutes}
## toStartOfTenMinutes {#tostartoftenminutes}
将DateTime以十分钟为单位向前取整到最接近的时间点。
## 开始几分钟 {#tostartoffifteenminutes}
## toStartOfFifteenMinutes {#tostartoffifteenminutes}
将DateTime以十五分钟为单位向前取整到最接近的时间点。
@ -168,31 +291,214 @@ SELECT
将Date或DateTime转换为包含ISO周数的UInt8类型的编号。
## 现在 {#now}
## toWeek(date\[,mode\]) {#toweekdatemode}
不接受任何参数并在请求执行时的某一刻返回当前时间DateTime
此函数返回一个常量,即时请求需要很长时间能够完成。
返回Date或DateTime的周数。两个参数形式可以指定星期是从星期日还是星期一开始以及返回值应在0到53还是从1到53的范围内。如果省略了mode参数则默认 模式为0。
`toISOWeek()`是一个兼容函数,等效于`toWeek(date,3)`。
下表描述了mode参数的工作方式。
## 今天 {#today}
| Mode | First day of week | Range | Week 1 is the first week … |
|------|-------------------|-------|-------------------------------|
| 0 | Sunday | 0-53 | with a Sunday in this year |
| 1 | Monday | 0-53 | with 4 or more days this year |
| 2 | Sunday | 1-53 | with a Sunday in this year |
| 3 | Monday | 1-53 | with 4 or more days this year |
| 4 | Sunday | 0-53 | with 4 or more days this year |
| 5 | Monday | 0-53 | with a Monday in this year |
| 6 | Sunday | 1-53 | with 4 or more days this year |
| 7 | Monday | 1-53 | with a Monday in this year |
| 8 | Sunday | 1-53 | contains January 1 |
| 9 | Monday | 1-53 | contains January 1 |
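For mode values described as "with 4 or more days this year", weeks are numbered according to ISO 8601:1988:
- If the week containing January 1 has 4 or more days in the new year, it is week 1.
- Otherwise, it is the last week of the previous year, and the next week is week 1.
For mode values described as "contains January 1", the week that contains January 1 is week 1 of that year.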
``` sql
toWeek(date[, mode][, Timezone])
```
**参数**
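- `date` — Date or DateTime.
- `mode` — Optional parameter. The value range is \[0,9\]; the default is 0.
- `Timezone` — Optional parameter. It behaves like the time zone argument of the other date and time conversion functions.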
**示例**
``` sql
SELECT toDate('2016-12-27') AS date, toWeek(date) AS week0, toWeek(date,1) AS week1, toWeek(date,9) AS week9;
```
``` text
┌───────date─┬─week0─┬─week1─┬─week9─┐
│ 2016-12-27 │ 52 │ 52 │ 1 │
└────────────┴───────┴───────┴───────┘
```
## toYearWeek(date\[,mode\]) {#toyearweekdatemode}
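Returns the year and week for a Date. The year in the result may differ from the year of the Date argument for the first and the last week of the year.
The mode argument works exactly like the mode argument of toWeek(). For the single-argument syntax, mode 0 is used.
`toISOYear()` is a compatibility function that is equivalent to `intDiv(toYearWeek(date,3),100)`.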
**示例**
``` sql
SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(date,1) AS yearWeek1, toYearWeek(date,9) AS yearWeek9;
```
``` text
┌───────date─┬─yearWeek0─┬─yearWeek1─┬─yearWeek9─┐
│ 2016-12-27 │ 201652 │ 201652 │ 201701 │
└────────────┴───────────┴───────────┴───────────┘
```
## date_trunc {#date_trunc}
将Date或DateTime按指定的单位向前取整到最接近的时间点。
**语法**
``` sql
date_trunc(unit, value[, timezone])
```
别名: `dateTrunc`.
**参数**
- `unit` — 单位. [String](../syntax.md#syntax-string-literal).
可选值:
- `second`
- `minute`
- `hour`
- `day`
- `week`
- `month`
- `quarter`
- `year`
- `value` — [DateTime](../../sql-reference/data-types/datetime.md) 或者 [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) 返回值的时区(可选值)。如果未指定将使用`value`的时区。 [String](../../sql-reference/data-types/string.md).
**返回值**
- 按指定的单位向前取整后的DateTime。
类型: [Datetime](../../sql-reference/data-types/datetime.md).
**示例**
不指定时区查询:
``` sql
SELECT now(), date_trunc('hour', now());
```
结果:
``` text
┌───────────────now()─┬─date_trunc('hour', now())─┐
│ 2020-09-28 10:40:45 │ 2020-09-28 10:00:00 │
└─────────────────────┴───────────────────────────┘
```
指定时区查询:
```sql
SELECT now(), date_trunc('hour', now(), 'Europe/Moscow');
```
结果:
```text
┌───────────────now()─┬─date_trunc('hour', now(), 'Europe/Moscow')─┐
│ 2020-09-28 10:46:26 │ 2020-09-28 13:00:00 │
└─────────────────────┴────────────────────────────────────────────┘
```
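Truncation to a unit amounts to resetting every smaller field. The Python sketch below illustrates this for most of the listed units. The `truncate` helper is hypothetical, it ignores time zones, and it leaves out `week` because the first day of the week is not stated in this section.

```python
from datetime import datetime


def truncate(value: datetime, unit: str) -> datetime:
    """Round a datetime down to the start of the given unit."""
    if unit == "second":
        return value.replace(microsecond=0)
    if unit == "minute":
        return value.replace(second=0, microsecond=0)
    if unit == "hour":
        return value.replace(minute=0, second=0, microsecond=0)
    midnight = value.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return midnight
    if unit == "month":
        return midnight.replace(day=1)
    if unit == "quarter":
        first_month = 3 * ((value.month - 1) // 3) + 1
        return midnight.replace(month=first_month, day=1)
    if unit == "year":
        return midnight.replace(month=1, day=1)
    raise ValueError(f"unsupported unit: {unit}")


moment = datetime(2020, 9, 28, 10, 40, 45)
by_hour = truncate(moment, "hour")
by_quarter = truncate(moment, "quarter")
```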
**参考**
- [toStartOfInterval](#tostartofintervaltime-or-data-interval-x-unit-time-zone)
## now {#now}
返回当前日期和时间。
**语法**
``` sql
now([timezone])
```
**参数**
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) for the returned value (optional). [String](../../sql-reference/data-types/string.md).
**返回值**
- 当前日期和时间。
类型: [Datetime](../../sql-reference/data-types/datetime.md).
**示例**
不指定时区查询:
``` sql
SELECT now();
```
结果:
``` text
┌───────────────now()─┐
│ 2020-10-17 07:42:09 │
└─────────────────────┘
```
指定时区查询:
``` sql
SELECT now('Europe/Moscow');
```
结果:
``` text
┌─now('Europe/Moscow')─┐
│ 2020-10-17 10:42:23 │
└──────────────────────┘
```
## today {#today}
不接受任何参数并在请求执行时的某一刻返回当前日期(Date)。
其功能与toDatenow相同。
Equivalent to `toDate(now())`.
## 昨天 {#yesterday}
## yesterday {#yesterday}
不接受任何参数并在请求执行时的某一刻返回昨天的日期(Date)。
其功能与today - 1相同。
其功能与today() - 1相同。
## 时隙 {#timeslot}
## timeSlot {#timeslot}
将时间向前取整半小时。
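This function is used in Yandex.Metrica: half an hour is the minimum amount of time for splitting a session into two if a tracking tag shows consecutive pageviews of a single user that differ in time by strictly more than this amount. This means that tuples (tag ID, user ID, time slot) can be used to search for the pageviews included in the corresponding session.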
## toyyymm {#toyyyymm}
## toYYYYMM {#toyyyymm}
Converts a Date or DateTime to a UInt32 number containing the year and month number (YYYY \* 100 + MM).
## toyyymmdd {#toyyyymmdd}
## toYYYYMMDD {#toyyyymmdd}
Converts a Date or DateTime to a UInt32 number containing the year, month, and day number (YYYY \* 10000 + MM \* 100 + DD).
@ -200,7 +506,7 @@ SELECT
Converts a Date or DateTime to a UInt64 number containing the year, month, day, and time (YYYY \* 10000000000 + MM \* 100000000 + DD \* 1000000 + hh \* 10000 + mm \* 100 + ss).
## 隆隆隆隆路虏脢,,陇,貌,垄拢卢虏禄quar陇,貌路,隆拢脳枚脢虏,麓脢,脱,,,录,禄庐戮,utes, {#addyears-addmonths-addweeks-adddays-addhours-addminutes-addseconds-addquarters}
## addYears, addMonths, addWeeks, addDays, addHours, addMinutes, addSeconds, addQuarters {#addyears-addmonths-addweeks-adddays-addhours-addminutes-addseconds-addquarters}
函数将一段时间间隔添加到Date/DateTime然后返回Date/DateTime。例如
@ -234,59 +540,145 @@ SELECT
│ 2018-01-01 │ 2018-01-01 00:00:00 │
└──────────────────────────┴───────────────────────────────┘
## dateDiff(unit,t1,t2,\[时区\]) {#datediffunit-t1-t2-timezone}
## dateDiff {#datediff}
返回unit为单位表示的两个时间之间的差异例如`'hours'`。 t1t2可以是Date或DateTime如果指定timezone它将应用于两个参数。如果不是则使用来自数据类型t1t2的时区。如果时区不相同则结果将是未定义的
返回两个Date或DateTime类型之间的时差
支持的单位值:
**语法**
| 单位 |
|------|
| 第二 |
| 分钟 |
| 小时 |
| 日 |
| 周 |
| 月 |
| 季 |
| 年 |
``` sql
dateDiff('unit', startdate, enddate, [timezone])
```
## 时隙(开始时间,持续时间,\[,大小\]) {#timeslotsstarttime-duration-size}
**参数**
- `unit` — 返回结果的时间单位。 [String](../../sql-reference/syntax.md#syntax-string-literal).
支持的时间单位: second, minute, hour, day, week, month, quarter, year.
- `startdate` — 第一个待比较值。 [Date](../../sql-reference/data-types/date.md) 或 [DateTime](../../sql-reference/data-types/datetime.md).
- `enddate` — 第二个待比较值。 [Date](../../sql-reference/data-types/date.md) 或 [DateTime](../../sql-reference/data-types/datetime.md).
- `timezone` — 可选参数。 如果指定了,则同时适用于`startdate`和`enddate`。如果不指定,则使用`startdate`和`enddate`的时区。如果两个时区不一致,则结果不可预料。
**返回值**
以`unit`为单位的`startdate`和`enddate`之间的时差。
类型: `int`.
**示例**
查询:
``` sql
SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
```
结果:
``` text
┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 25 │
└────────────────────────────────────────────────────────────────────────────────────────┘
```
## timeSlots(StartTime, Duration,\[, Size\]) {#timeslotsstarttime-duration-size}
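Returns an array of time points that covers the interval from `StartTime` to `StartTime + Duration` seconds, with each point rounded down to a step of `Size` seconds. `Size` is an optional parameter that defaults to 1800.
For example, `timeSlots(toDateTime('2012-01-01 12:20:00'), 600) = [toDateTime('2012-01-01 12:00:00'), toDateTime('2012-01-01 12:30:00')]`.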
这对于搜索在相应会话中综合浏览量是非常有用的。
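The rounding rule can be sketched in Python. The `time_slots` helper is hypothetical, treats naive datetimes as UTC, and rounds both ends of the interval down to a multiple of the step, which reproduces the example above.

```python
from datetime import datetime, timedelta
from typing import List

EPOCH = datetime(1970, 1, 1)


def time_slots(start: datetime, duration_s: int, size_s: int = 1800) -> List[datetime]:
    """Return slot starts covering [start, start + duration_s], step size_s."""
    start_s = int((start - EPOCH).total_seconds())
    first = start_s // size_s * size_s                # round the start down
    last = (start_s + duration_s) // size_s * size_s  # round the end down
    return [EPOCH + timedelta(seconds=s) for s in range(first, last + 1, size_s)]


slots = time_slots(datetime(2012, 1, 1, 12, 20, 0), 600)
```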
## formatDateTime时间格式\[,时区\]) {#formatdatetimetime-format-timezone}
## formatDateTime {#formatdatetime}
函数根据给定的格式字符串来格式化时间。请注意:格式字符串必须是常量表达式,例如:单个结果列不能有多种格式字符串。
支持的格式修饰符:
(«Example» 列是对`2018-01-02 22:33:44`的格式化结果)
**语法**
| 修饰符 | 产品描述 | 示例 |
|--------|-------------------------------------------|------------|
| %C | 年除以100并截断为整数(00-99) | 20 |
| %d | 月中的一天零填充01-31) | 02 |
| %D | 短MM/DD/YY日期相当于%m/%d/%y | 01/02/2018 |
| %e | 月中的一天空格填充1-31) | 2 |
| %F | 短YYYY-MM-DD日期相当于%Y-%m-%d | 2018-01-02 |
| %H | 24小时格式00-23) | 22 |
| %I | 小时12h格式01-12) | 10 |
| %j | 一年(001-366) | 002 |
| %m | 月份为十进制数01-12) | 01 |
| %M | 分钟(00-59) | 33 |
| %n | 换行符(") | |
| %p | AM或PM指定 | PM |
| %R | 24小时HH:MM时间相当于%H:%M | 22:33 |
| %S | 第二(00-59) | 44 |
| %t | 水平制表符() | |
| %T | ISO8601时间格式(HH:MM:SS),相当于%H:%M:%S | 22:33:44 |
| %u | ISO8601平日as编号星期一为1(1-7) | 2 |
| %V | ISO8601周编号(01-53) | 01 |
| %w | 周日为十进制数周日为0(0-6) | 2 |
| %y | 年份最后两位数字00-99) | 18 |
| %Y | 年 | 2018 |
| %% | %符号 | % |
``` sql
formatDateTime(Time, Format[, Timezone])
```
[来源文章](https://clickhouse.tech/docs/en/query_language/functions/date_time_functions/) <!--hide-->
**返回值**
根据指定格式返回的日期和时间。
**支持的格式修饰符**
Use format modifiers to specify the pattern of the resulting string. The "Example" column shows the formatting result for `2018-01-02 22:33:44`.
| Modifier | Description | Example |
|----------|-------------|---------|
| %C | year divided by 100 and truncated to integer (00-99) | 20 |
| %d | day of the month, zero-padded (01-31) | 02 |
| %D | short MM/DD/YY date, equivalent to %m/%d/%y | 01/02/18 |
| %e | day of the month, space-padded (1-31) | 2 |
| %F | short YYYY-MM-DD date, equivalent to %Y-%m-%d | 2018-01-02 |
| %G | four-digit year for the ISO week number, calculated from the week-based year [defined by ISO 8601](https://en.wikipedia.org/wiki/ISO_8601#Week_dates); normally useful only with %V | 2018 |
| %g | two-digit year, consistent with ISO 8601, abbreviated from the four-digit notation | 18 |
| %H | hour in 24h format (00-23) | 22 |
| %I | hour in 12h format (01-12) | 10 |
| %j | day of the year (001-366) | 002 |
| %m | month as a decimal number (01-12) | 01 |
| %M | minute (00-59) | 33 |
| %n | new-line character | |
| %p | AM or PM designation | PM |
| %R | 24-hour HH:MM time, equivalent to %H:%M | 22:33 |
| %S | second (00-59) | 44 |
| %t | horizontal-tab character | |
| %T | ISO 8601 time format (HH:MM:SS), equivalent to %H:%M:%S | 22:33:44 |
| %u | ISO 8601 weekday as a number with Monday as 1 (1-7) | 2 |
| %V | ISO 8601 week number (01-53) | 01 |
| %w | weekday as a decimal number with Sunday as 0 (0-6) | 2 |
| %y | year, last two digits (00-99) | 18 |
| %Y | year | 2018 |
| %% | a % sign | % |
**示例**
查询:
``` sql
SELECT formatDateTime(toDate('2010-01-04'), '%g')
```
结果:
```
┌─formatDateTime(toDate('2010-01-04'), '%g')─┐
│ 10 │
└────────────────────────────────────────────┘
```
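Several of these modifiers also exist in C-style `strftime`, so a few example values can be cross-checked in Python. This is a sketch only: it covers only modifiers that are portable in Python, and it derives the two-digit ISO year from `isocalendar()` instead of relying on `%g`.

```python
from datetime import date, datetime

sample = datetime(2018, 1, 2, 22, 33, 44)

formatted = {
    "%m/%d/%y": sample.strftime("%m/%d/%y"),  # same pattern as %D
    "%H:%M": sample.strftime("%H:%M"),        # same pattern as %R
    "%I": sample.strftime("%I"),              # hour in 12h format
    "%j": sample.strftime("%j"),              # day of the year
    "%y": sample.strftime("%y"),              # last two digits of the year
}

# Two-digit ISO week-based year for the example date 2010-01-04.
iso_year_short = date(2010, 1, 4).isocalendar()[0] % 100
```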
[Original article](https://clickhouse.tech/docs/en/query_language/functions/date_time_functions/) <!--hide-->
## FROM_UNIXTIME
当只有单个整数类型的参数时,它的作用与`toDateTime`相同,并返回[DateTime](../../sql-reference/data-types/datetime.md)类型。
例如:
```sql
SELECT FROM_UNIXTIME(423543535)
```
```text
┌─FROM_UNIXTIME(423543535)─┐
│ 1983-06-04 10:58:55 │
└──────────────────────────┘
```
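When there are two arguments, the first an integer or a DateTime and the second a constant format string, it works in the same way as `formatDateTime` and returns the `String` type.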
例如:
```sql
SELECT FROM_UNIXTIME(1234334543, '%Y-%m-%d %R:%S') AS DateTime
```
```text
┌─DateTime────────────┐
│ 2009-02-11 14:42:23 │
└─────────────────────┘
```
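The rendered values depend on the time zone in which the timestamp is interpreted. The Python sketch below reproduces both outputs above under the assumption that the examples were produced in a UTC+8 zone. That offset is inferred from the numbers and is not stated in the text; `from_unixtime` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

ASSUMED_SERVER_TZ = timezone(timedelta(hours=8))  # inferred, not documented


def from_unixtime(ts: int, fmt=None, tz=ASSUMED_SERVER_TZ):
    """Return a datetime for one argument, or a formatted string for two."""
    moment = datetime.fromtimestamp(ts, tz)
    return moment if fmt is None else moment.strftime(fmt)


PATTERN = "%Y-%m-%d %H:%M:%S"  # written out in full instead of %R:%S
as_datetime = from_unixtime(423543535)
as_string = from_unixtime(1234334543, PATTERN)
utc_string = from_unixtime(1234334543, PATTERN, timezone.utc)
```

In UTC the second value would read `2009-02-11 06:42:23`, so a different server time zone gives a different string for the same timestamp.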

View File

@ -18,7 +18,7 @@ toc_title: "\u81EA\u7701"
- 设置 [allow_introspection_functions](../../operations/settings/settings.md#settings-allow_introspection_functions) 设置为1。
For security reasons introspection functions are disabled by default.
出于安全考虑,内省函数默认是关闭的。
ClickHouse将探查器报告保存到 [trace_log](../../operations/system-tables/trace_log.md#system_tables-trace_log) 系统表. 确保正确配置了表和探查器。
@ -36,17 +36,17 @@ addressToLine(address_of_binary_instruction)
**参数**
- `address_of_binary_instruction` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Address of instruction in a running process.
- `address_of_binary_instruction` ([UInt64](../../sql-reference/data-types/int-uint.md)) — 正在运行进程的指令地址。
**返回值**
- 源代码文件名和此文件中用冒号分隔的行号。
- 源代码文件名和行号(用冒号分隔的行号)
For example, `/build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp:199`, where `199` is a line number.
For example, `/build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp:199`, where `199` is a line number.
- 二进制文件的名称,如果函数找不到调试信息。
- 如果函数找不到调试信息,返回二进制文件的名称
- 空字符串,如果地址无效。
- 如果地址无效,返回空字符串
类型: [字符串](../../sql-reference/data-types/string.md).
@ -132,7 +132,7 @@ addressToSymbol(address_of_binary_instruction)
**返回值**
- 来自ClickHouse对象文件的符号。
- 空字符串,如果地址无效。
- 如果地址无效,返回空字符串
类型: [字符串](../../sql-reference/data-types/string.md).

View File

@ -41,25 +41,25 @@ CHECK TABLE [db.]name
`CHECK TABLE` 查询支持下表引擎:
- [日志](../../engines/table-engines/log-family/log.md)
- [Log](../../engines/table-engines/log-family/log.md)
- [TinyLog](../../engines/table-engines/log-family/tinylog.md)
- [StripeLog](../../engines/table-engines/log-family/stripelog.md)
- [梅树家族](../../engines/table-engines/mergetree-family/mergetree.md)
- [MergeTree 家族](../../engines/table-engines/mergetree-family/mergetree.md)
使用另一个表引擎对表执行会导致异常。
对其他不支持的表引擎的表执行会导致异常。
从发动机 `*Log` 家庭不提供故障自动数据恢复。 使用 `CHECK TABLE` 查询及时跟踪数据丢失。
来自 `*Log` 家族的引擎不提供故障自动数据恢复。 使用 `CHECK TABLE` 查询及时跟踪数据丢失。
`MergeTree` 家庭发动机 `CHECK TABLE` 查询显示本地服务器上表的每个单独数据部分的检查状态。
对于 `MergeTree` 家族引擎 `CHECK TABLE` 查询显示本地服务器上表的每个单独数据部分的检查状态。
**如果数据已损坏**
如果表已损坏,则可以将未损坏的数据复制到另一个表。 要做到这一点:
1. 创建具有与损坏的表相同结构的新表。 要执行此操作,请执行查询 `CREATE TABLE <new_table_name> AS <damaged_table_name>`.
2. 设置 [max_threads](../../operations/settings/settings.md#settings-max_threads) 值为1以在单个线程中处理下一个查询。 要执行此操作,请运行查询 `SET max_threads = 1`.
1. 创建一个与损坏的表结构相同的新表。 要做到这一点,请执行查询 `CREATE TABLE <new_table_name> AS <damaged_table_name>`.
2. 将 [max_threads](../../operations/settings/settings.md#settings-max_threads) 值设置为1以在单个线程中处理下一个查询。 要这样做,请运行查询 `SET max_threads = 1`.
3. 执行查询 `INSERT INTO <new_table_name> SELECT * FROM <damaged_table_name>`. 此请求将未损坏的数据从损坏的表复制到另一个表。 只有损坏部分之前的数据才会被复制。
4. 重新启动 `clickhouse-client` 要重置 `max_threads`值。
4. 重新启动 `clickhouse-client` 以重置 `max_threads` 值。
## DESCRIBE TABLE {#misc-describe-table}
@ -67,57 +67,65 @@ CHECK TABLE [db.]name
DESC|DESCRIBE TABLE [db.]table [INTO OUTFILE filename] [FORMAT format]
```
返回以下内容 `String` 类型列:
返回以下 `String` 类型列:
- `name`Column name.
- `type`Column type.
- `default_type`Clause that is used in [默认表达式](create.md#create-default-values) (`DEFAULT`, `MATERIALIZED``ALIAS`). 如果未指定默认表达式则Column包含一个空字符串。
- `default_expression`Value specified in the `DEFAULT` 条款
- `comment_expression`Comment text.
- `name`列名。
- `type`列的类型。
- `default_type` — [默认表达式](create.md#create-default-values) (`DEFAULT`, `MATERIALIZED``ALIAS`)中使用的子句。 如果没有指定默认表达式,则列包含一个空字符串。
- `default_expression``DEFAULT` 子句中指定的值。
- `comment_expression`注释。
嵌套的数据结构输出 “expanded” 格式。 每列分别显示,名称后面有一个点
嵌套数据结构以 “expanded” 格式输出。 每列分别显示,列名后加点号
## DETACH {#detach}
删除有关 name 表从服务器。 服务器停止了解表的存在。
从服务器中删除有关 name 表的信息。 服务器停止了解该表的存在。
``` sql
DETACH TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
这不会删除表的数据或元数据。 在下一次服务器启动时,服务器将读取元数据并再次查找有关表的信息。
同样,一个 “detached” 表可以使用重新连接 `ATTACH` 查询(系统表除外,它们没有为它们存储元数据)。
没有 `DETACH DATABASE` 查询。
这不会删除表的数据或元数据。 在下一次服务器启动时,服务器将读取元数据并再次查找该表。
同样,可以使用 `ATTACH` 查询重新连接一个 “detached” 的表(系统表除外,没有为它们存储元数据)。
## DROP {#drop}
此查询有两种类型: `DROP DATABASE``DROP TABLE`.
删除已经存在的实体。如果指定 `IF EXISTS` 则如果实体不存在,则不返回错误。
## DROP DATABASE {#drop-database}
删除 `db` 数据库中的所有表,然后删除 `db` 数据库本身。
语法:
``` sql
DROP DATABASE [IF EXISTS] db [ON CLUSTER cluster]
```
## DROP TABLE {#drop-table}
删除内部的所有表 db 数据库,然后删除 db 数据库本身。
如果 `IF EXISTS` 如果数据库不存在,则不会返回错误。
删除表。
语法:
``` sql
DROP [TEMPORARY] TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
删除表。
如果 `IF EXISTS` 如果表不存在或数据库不存在,则不会返回错误。
DROP DICTIONARY [IF EXISTS] [db.]name
## DROP DICTIONARY {#drop-dictionary}
删除字典。
如果 `IF EXISTS` 如果表不存在或数据库不存在,则不会返回错误。
语法:
``` sql
DROP DICTIONARY [IF EXISTS] [db.]name
```
## DROP USER {#drop-user-statement}
删除用户。
### 语法 {#drop-user-syntax}
语法:
``` sql
DROP USER [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
@ -129,7 +137,7 @@ DROP USER [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
已删除的角色将从授予该角色的所有实体撤销。
### 语法 {#drop-role-syntax}
语法:
``` sql
DROP ROLE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
@ -141,7 +149,7 @@ DROP ROLE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
已删除行策略将从分配该策略的所有实体撤销。
### 语法 {#drop-row-policy-syntax}
语法:
``` sql
DROP [ROW] POLICY [IF EXISTS] name [,...] ON [database.]table [,...] [ON CLUSTER cluster_name]
@ -153,7 +161,7 @@ DROP [ROW] POLICY [IF EXISTS] name [,...] ON [database.]table [,...] [ON CLUSTER
已删除的配额将从分配该配额的所有实体撤销。
### 语法 {#drop-quota-syntax}
语法:
``` sql
DROP QUOTA [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
@ -165,12 +173,22 @@ DROP QUOTA [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
已删除的settings配置将从分配该settings配置的所有实体撤销。
### 语法 {#drop-settings-profile-syntax}
语法:
``` sql
DROP [SETTINGS] PROFILE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP VIEW {#drop-view}
删除视图。视图也可以通过 `DROP TABLE` 删除,但是 `DROP VIEW` 检查 `[db.]name` 是视图。
语法:
``` sql
DROP VIEW [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
## EXISTS {#exists-statement}
``` sql
@ -189,7 +207,7 @@ KILL QUERY [ON CLUSTER cluster]
```
尝试强制终止当前正在运行的查询。
要终止的查询是从系统中选择的。使用在定义的标准进程表 `WHERE` 《公约》条款 `KILL` 查询
要终止的查询是使用 `KILL` 查询的 `WHERE` 子句定义的标准从system.processes表中选择的
例:
@ -206,13 +224,13 @@ KILL QUERY WHERE user='username' SYNC
默认情况下,使用异步版本的查询 (`ASYNC`),不等待确认查询已停止。
同步版本 (`SYNC`)等待所有查询停止,并在停止时显示有关每个进程的信息。
响应包含 `kill_status` 列,可以采用以下值:
响应包含 `kill_status` 列,该列可以采用以下值:
1. finished The query was terminated successfully.
2. waiting Waiting for the query to end after sending it a signal to terminate.
3. The other values explain why the query can't be stopped.
1. finished 查询已成功终止。
2. waiting 发送查询信号终止后,等待查询结束。
3. 其他值解释为什么查询不能停止。
测试查询 (`TEST`)仅检查用户的权限并显示要停止的查询列表。
测试查询 (`TEST`)仅检查用户的权限并显示要停止的查询列表。
## KILL MUTATION {#kill-mutation}
@ -223,9 +241,9 @@ KILL MUTATION [ON CLUSTER cluster]
[FORMAT format]
```
尝试取消和删除 [突变](alter.md#alter-mutations) 当前正在执行。 要取消的突变选自 [`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations) 表使用由指定的过滤器 `WHERE` 《公约》条款 `KILL` 查询
尝试取消和删除当前正在执行的 [mutations](alter.md#alter-mutations) 。 要取消的mutation是使用 `KILL` 查询的WHERE子句指定的过滤器从[`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations) 表中选择的
测试查询 (`TEST`)仅检查用户的权限并显示要停止的查询列表。
测试查询 (`TEST`)仅检查用户的权限并显示要停止的mutations列表。
例:
@ -237,9 +255,9 @@ KILL MUTATION WHERE database = 'default' AND table = 'table'
KILL MUTATION WHERE database = 'default' AND table = 'table' AND mutation_id = 'mutation_3.txt'
```
The query is useful when a mutation is stuck and cannot finish (e.g. if some function in the mutation query throws an exception when applied to the data contained in the table).
当mutation卡住且无法完成时该查询是有用的(例如当mutation查询中的某些函数在应用于表中包含的数据时抛出异常)。
已经由突变所做的更改不会回滚。
Mutation已经做的更改不会回滚。
## OPTIMIZE {#misc_operations-optimize}
@ -247,19 +265,19 @@ The query is useful when a mutation is stuck and cannot finish (e.g. if some fu
OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition | PARTITION ID 'partition_id'] [FINAL] [DEDUPLICATE]
```
此查询尝试使用来自表引擎的表初始化表的数据部分的非计划合并 [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) 家人
此查询尝试初始化 [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md)家族的表引擎的表中未计划合并数据部分。
`OPTMIZE` 查询也支持 [MaterializedView](../../engines/table-engines/special/materializedview.md) 和 [缓冲区](../../engines/table-engines/special/buffer.md) 引擎 不支持其他表引擎。
`OPTMIZE` 查询也支持 [MaterializedView](../../engines/table-engines/special/materializedview.md) 和 [Buffer](../../engines/table-engines/special/buffer.md) 引擎 不支持其他表引擎。
`OPTIMIZE`使用 [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md) 表引擎的家族ClickHouse创建合并任务并等待在所有节点上执行(如果 `replication_alter_partitions_sync` 设置已启用)。
`OPTIMIZE` 与 [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md) 家族的表引擎一起使用时ClickHouse将创建一个合并任务并等待所有节点上的执行(如果 `replication_alter_partitions_sync` 设置已启用)。
- 如果 `OPTIMIZE` 出于任何原因不执行合并,它不通知客户端。 要启用通知,请使用 [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) 设置。
- 如果您指定 `PARTITION`,仅优化指定的分区。 [如何设置分区表达式](alter.md#alter-how-to-specify-part-expr).
- 如果您指定 `FINAL`,即使所有数据已经在一个部分中,也会执行优化。
- 如果您指定 `DEDUPLICATE`然后完全相同的行将被重复数据删除所有列进行比较这仅适用于MergeTree引擎。
- 如果您指定 `DEDUPLICATE`则将对完全相同的行进行重复数据删除所有列进行比较这仅适用于MergeTree引擎。
!!! warning "警告"
`OPTIMIZE` 无法修复 “Too many parts” 错误
`OPTIMIZE` 无法修复 “Too many parts” 错误
## RENAME {#misc_operations-rename}
@ -270,6 +288,7 @@ RENAME TABLE [db11.]name11 TO [db12.]name12, [db21.]name21 TO [db22.]name22, ...
```
所有表都在全局锁定下重命名。 重命名表是一个轻型操作。 如果您在TO之后指定了另一个数据库则表将被移动到此数据库。 但是,包含数据库的目录必须位于同一文件系统中(否则,将返回错误)。
如果您在一个查询中重命名多个表,这是一个非原子操作,它可能被部分执行,其他会话中的查询可能会接收错误 Table ... doesn't exist ...。
## SET {#query-set}
@ -277,9 +296,9 @@ RENAME TABLE [db11.]name11 TO [db12.]name12, [db21.]name21 TO [db22.]name22, ...
SET param = value
```
分配 `value``param` [设置](../../operations/settings/index.md) 对于当前会话。 你不能改变 [服务器设置](../../operations/server-configuration-parameters/index.md) 这边
为当前会话的 [设置](../../operations/settings/index.md) `param` 分配值 `value`。 您不能以这种方式更改 [服务器设置](../../operations/server-configuration-parameters/index.md)。
您还可以在单个查询中设置指定设置配置文件中的所有值。
您还可以在单个查询中从指定的设置配置文件中设置所有值。
``` sql
SET profile = 'profile-name-from-the-settings-file'
@ -291,8 +310,6 @@ SET profile = 'profile-name-from-the-settings-file'
激活当前用户的角色。
### 语法 {#set-role-syntax}
``` sql
SET ROLE {DEFAULT | NONE | role [,...] | ALL | ALL EXCEPT role [,...]}
```
@ -301,15 +318,13 @@ SET ROLE {DEFAULT | NONE | role [,...] | ALL | ALL EXCEPT role [,...]}
将默认角色设置为用户。
默认角色在用户登录时自动激活。 您只能将以前授予的角色设置为默认值。 如果未向用户授予角色ClickHouse将引发异常。
### 语法 {#set-default-role-syntax}
默认角色在用户登录时自动激活。 您只能将以前授予的角色设置为默认值。 如果角色没有授予用户ClickHouse会抛出异常。
``` sql
SET DEFAULT ROLE {NONE | role [,...] | ALL | ALL EXCEPT role [,...]} TO {user|CURRENT_USER} [,...]
```
### 例 {#set-default-role-examples}
### 例 {#set-default-role-examples}
为用户设置多个默认角色:
@ -317,19 +332,19 @@ SET DEFAULT ROLE {NONE | role [,...] | ALL | ALL EXCEPT role [,...]} TO {user|CU
SET DEFAULT ROLE role1, role2, ... TO user
```
将所有授予的角色设置为用户的默认:
将所有授予的角色设置为用户的默认角色:
``` sql
SET DEFAULT ROLE ALL TO user
```
从用户清除默认角色:
清除用户的默认角色:
``` sql
SET DEFAULT ROLE NONE TO user
```
将所有授予的角色设置为默认角色,其中一些角色除外:
将所有授予的角色设置为默认角色,其中一些角色除外:
``` sql
SET DEFAULT ROLE ALL EXCEPT role1, role2 TO user
@ -341,9 +356,9 @@ SET DEFAULT ROLE ALL EXCEPT role1, role2 TO user
TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
从表中删除所有数据。 当条款 `IF EXISTS` 如果该表不存在,则查询返回错误。
从表中删除所有数据。 当省略 `IF EXISTS`子句时,如果该表不存在,则查询返回错误。
`TRUNCATE` 查询不支持 [查看](../../engines/table-engines/special/view.md), [文件](../../engines/table-engines/special/file.md), [URL](../../engines/table-engines/special/url.md) 和 [Null](../../engines/table-engines/special/null.md) 表引擎.
`TRUNCATE` 查询不支持 [View](../../engines/table-engines/special/view.md), [File](../../engines/table-engines/special/file.md), [URL](../../engines/table-engines/special/url.md) 和 [Null](../../engines/table-engines/special/null.md) 表引擎.
## USE {#use}


@ -34,6 +34,7 @@
#include <Common/ThreadStatus.h>
#include <Common/getMappedArea.h>
#include <Common/remapExecutable.h>
#include <Common/TLDListsHolder.h>
#include <IO/HTTPCommon.h>
#include <IO/UseSSL.h>
#include <Interpreters/AsynchronousMetrics.h>
@ -57,6 +58,7 @@
#include <Disks/registerDisks.h>
#include <Common/Config/ConfigReloader.h>
#include <Server/HTTPHandlerFactory.h>
#include <Server/TestKeeperTCPHandlerFactory.h>
#include "MetricsTransmitter.h"
#include <Common/StatusFile.h>
#include <Server/TCPHandlerFactory.h>
@ -186,6 +188,85 @@ static std::string getUserName(uid_t user_id)
return toString(user_id);
}

Poco::Net::SocketAddress makeSocketAddress(const std::string & host, UInt16 port, Poco::Logger * log)
{
    Poco::Net::SocketAddress socket_address;
    try
    {
        socket_address = Poco::Net::SocketAddress(host, port);
    }
    catch (const Poco::Net::DNSException & e)
    {
        const auto code = e.code();
        if (code == EAI_FAMILY
#if defined(EAI_ADDRFAMILY)
            || code == EAI_ADDRFAMILY
#endif
            )
        {
            LOG_ERROR(log, "Cannot resolve listen_host ({}), error {}: {}. "
                "If it is an IPv6 address and your host has disabled IPv6, then consider to "
                "specify IPv4 address to listen in <listen_host> element of configuration "
                "file. Example: <listen_host>0.0.0.0</listen_host>",
                host, e.code(), e.message());
        }

        throw;
    }
    return socket_address;
}

Poco::Net::SocketAddress Server::socketBindListen(Poco::Net::ServerSocket & socket, const std::string & host, UInt16 port, [[maybe_unused]] bool secure) const
{
    auto address = makeSocketAddress(host, port, &logger());
#if !defined(POCO_CLICKHOUSE_PATCH) || POCO_VERSION < 0x01090100
    if (secure)
        /// Bug in old (<1.9.1) poco, listen() after bind() with reusePort param will fail because have no implementation in SecureServerSocketImpl
        /// https://github.com/pocoproject/poco/pull/2257
        socket.bind(address, /* reuseAddress = */ true);
    else
#endif
#if POCO_VERSION < 0x01080000
        socket.bind(address, /* reuseAddress = */ true);
#else
        socket.bind(address, /* reuseAddress = */ true, /* reusePort = */ config().getBool("listen_reuse_port", false));
#endif

    socket.listen(/* backlog = */ config().getUInt("listen_backlog", 64));

    return address;
}

void Server::createServer(const std::string & listen_host, const char * port_name, bool listen_try, CreateServerFunc && func) const
{
    /// For testing purposes, user may omit tcp_port or http_port or https_port in configuration file.
    if (!config().has(port_name))
        return;

    auto port = config().getInt(port_name);
    try
    {
        func(port);
    }
    catch (const Poco::Exception &)
    {
        std::string message = "Listen [" + listen_host + "]:" + std::to_string(port) + " failed: " + getCurrentExceptionMessage(false);

        if (listen_try)
        {
            LOG_WARNING(&logger(), "{}. If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to "
                "specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration "
                "file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> ."
                " Example for disabled IPv4: <listen_host>::</listen_host>",
                message);
        }
        else
        {
            throw Exception{message, ErrorCodes::NETWORK_ERROR};
        }
    }
}

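Note on `createServer`: its control flow can be sketched without Poco. The block below is a minimal, self-contained illustration and not the server code. `Config` (a plain map), the `bool` return value, and `std::runtime_error` standing in for `Exception{message, ErrorCodes::NETWORK_ERROR}` are all assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the server configuration: port name -> port number.
using Config = std::map<std::string, int>;

// Returns true if the callback ran successfully; false if the port is not
// configured or if the failure was tolerated because listen_try is set.
bool createServer(const Config & config, const std::string & listen_host,
                  const std::string & port_name, bool listen_try,
                  const std::function<void(int)> & func)
{
    assert(static_cast<bool>(func));  // the callback must be callable

    auto it = config.find(port_name);
    if (it == config.end())
        return false;  // port omitted in the configuration: nothing to start

    try
    {
        func(it->second);
        return true;
    }
    catch (const std::exception & e)
    {
        std::string message = "Listen [" + listen_host + "]:" + std::to_string(it->second)
            + " failed: " + e.what();

        if (!listen_try)
            throw std::runtime_error(message);  // strict mode: abort startup

        std::cerr << message << '\n';  // tolerant mode: warn and continue
        return false;
    }
}
```

In this sketch, with `listen_try` set a failing callback is reported and skipped, so the remaining listeners can still be started. Without it, the first failure propagates to the caller.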
void Server::uninitialize()
{
logger().information("shutting down");
@ -399,27 +480,6 @@ int Server::main(const std::vector<std::string> & /*args*/)
StatusFile status{path + "status", StatusFile::write_full_info};
SCOPE_EXIT({
/** Ask to cancel background jobs all table engines,
* and also query_log.
* It is important to do early, not in destructor of Context, because
* table engines could use Context on destroy.
*/
LOG_INFO(log, "Shutting down storages.");
global_context->shutdown();
LOG_DEBUG(log, "Shut down storages.");
/** Explicitly destroy Context. It is more convenient than in destructor of Server, because logger is still available.
* At this moment, no one could own shared part of Context.
*/
global_context_ptr = nullptr;
global_context.reset();
shared_context.reset();
LOG_DEBUG(log, "Destroyed global context.");
});
/// Try to increase limit on number of open files.
{
rlimit rlim;
@ -483,6 +543,12 @@ int Server::main(const std::vector<std::string> & /*args*/)
        Poco::File(dictionaries_lib_path).createDirectories();
    }

    /// top_level_domains_lists
    {
        const std::string & top_level_domains_path = config().getString("top_level_domains_path", path + "top_level_domains/") + "/";
        TLDListsHolder::getInstance().parseConfig(top_level_domains_path, config());
    }

    {
        Poco::File(path + "data/").createDirectories();
        Poco::File(path + "metadata/").createDirectories();
@ -675,6 +741,71 @@ int Server::main(const std::vector<std::string> & /*args*/)
    total_memory_tracker.setDescription("(total)");
    total_memory_tracker.setMetric(CurrentMetrics::MemoryTracking);

    Poco::Timespan keep_alive_timeout(config().getUInt("keep_alive_timeout", 10), 0);

    Poco::ThreadPool server_pool(3, config().getUInt("max_connections", 1024));
    Poco::Net::HTTPServerParams::Ptr http_params = new Poco::Net::HTTPServerParams;
    http_params->setTimeout(settings.http_receive_timeout);
    http_params->setKeepAliveTimeout(keep_alive_timeout);

    std::vector<ProtocolServerAdapter> servers_to_start_before_tables;

    std::vector<std::string> listen_hosts = DB::getMultipleValuesFromConfig(config(), "", "listen_host");

    bool listen_try = config().getBool("listen_try", false);
    if (listen_hosts.empty())
    {
        listen_hosts.emplace_back("::1");
        listen_hosts.emplace_back("127.0.0.1");
        listen_try = true;
    }

    for (const auto & listen_host : listen_hosts)
    {
        /// TCP TestKeeper
        createServer(listen_host, "test_keeper_server.tcp_port", listen_try, [&](UInt16 port)
        {
            Poco::Net::ServerSocket socket;
            auto address = socketBindListen(socket, listen_host, port);
            socket.setReceiveTimeout(settings.receive_timeout);
            socket.setSendTimeout(settings.send_timeout);
            servers_to_start_before_tables.emplace_back(std::make_unique<Poco::Net::TCPServer>(
                new TestKeeperTCPHandlerFactory(*this),
                server_pool,
                socket,
                new Poco::Net::TCPServerParams));

            LOG_INFO(log, "Listening for connections to fake zookeeper (tcp): {}", address.toString());
        });
    }

    for (auto & server : servers_to_start_before_tables)
        server.start();

    SCOPE_EXIT({
        /** Ask to cancel background jobs all table engines,
          *  and also query_log.
          * It is important to do early, not in destructor of Context, because
          *  table engines could use Context on destroy.
          */
        LOG_INFO(log, "Shutting down storages.");

        global_context->shutdown();

        LOG_DEBUG(log, "Shut down storages.");

        for (auto & server : servers_to_start_before_tables)
            server.stop();

        /** Explicitly destroy Context. It is more convenient than in destructor of Server, because logger is still available.
          * At this moment, no one could own shared part of Context.
          */
        global_context_ptr = nullptr;
        global_context.reset();
        shared_context.reset();
        LOG_DEBUG(log, "Destroyed global context.");
    });

    /// Set current database name before loading tables and databases because
    /// system logs may copy global context.
    global_context->setCurrentDatabaseNameInGlobalContext(default_database);
@ -804,75 +935,8 @@ int Server::main(const std::vector<std::string> & /*args*/)
LOG_INFO(log, "TaskStats is not implemented for this OS. IO accounting will be disabled.");
#endif
std::vector<ProtocolServerAdapter> servers;
{
Poco::Timespan keep_alive_timeout(config().getUInt("keep_alive_timeout", 10), 0);
Poco::ThreadPool server_pool(3, config().getUInt("max_connections", 1024));
Poco::Net::HTTPServerParams::Ptr http_params = new Poco::Net::HTTPServerParams;
http_params->setTimeout(settings.http_receive_timeout);
http_params->setKeepAliveTimeout(keep_alive_timeout);
std::vector<ProtocolServerAdapter> servers;
std::vector<std::string> listen_hosts = DB::getMultipleValuesFromConfig(config(), "", "listen_host");
bool listen_try = config().getBool("listen_try", false);
if (listen_hosts.empty())
{
listen_hosts.emplace_back("::1");
listen_hosts.emplace_back("127.0.0.1");
listen_try = true;
}
auto make_socket_address = [&](const std::string & host, UInt16 port)
{
Poco::Net::SocketAddress socket_address;
try
{
socket_address = Poco::Net::SocketAddress(host, port);
}
catch (const Poco::Net::DNSException & e)
{
const auto code = e.code();
if (code == EAI_FAMILY
#if defined(EAI_ADDRFAMILY)
|| code == EAI_ADDRFAMILY
#endif
)
{
LOG_ERROR(log, "Cannot resolve listen_host ({}), error {}: {}. "
"If it is an IPv6 address and your host has disabled IPv6, then consider to "
"specify IPv4 address to listen in <listen_host> element of configuration "
"file. Example: <listen_host>0.0.0.0</listen_host>",
host, e.code(), e.message());
}
throw;
}
return socket_address;
};
auto socket_bind_listen = [&](auto & socket, const std::string & host, UInt16 port, [[maybe_unused]] bool secure = false)
{
auto address = make_socket_address(host, port);
#if !defined(POCO_CLICKHOUSE_PATCH) || POCO_VERSION < 0x01090100
if (secure)
/// Bug in old (<1.9.1) poco, listen() after bind() with reusePort param will fail because have no implementation in SecureServerSocketImpl
/// https://github.com/pocoproject/poco/pull/2257
socket.bind(address, /* reuseAddress = */ true);
else
#endif
#if POCO_VERSION < 0x01080000
socket.bind(address, /* reuseAddress = */ true);
#else
socket.bind(address, /* reuseAddress = */ true, /* reusePort = */ config().getBool("listen_reuse_port", false));
#endif
socket.listen(/* backlog = */ config().getUInt("listen_backlog", 64));
return address;
};
/// This object will periodically calculate some metrics.
AsynchronousMetrics async_metrics(*global_context,
config().getUInt("asynchronous_metrics_update_period_s", 60));
@ -880,41 +944,11 @@ int Server::main(const std::vector<std::string> & /*args*/)
for (const auto & listen_host : listen_hosts)
{
auto create_server = [&](const char * port_name, auto && func)
{
/// For testing purposes, user may omit tcp_port or http_port or https_port in configuration file.
if (!config().has(port_name))
return;
auto port = config().getInt(port_name);
try
{
func(port);
}
catch (const Poco::Exception &)
{
std::string message = "Listen [" + listen_host + "]:" + std::to_string(port) + " failed: " + getCurrentExceptionMessage(false);
if (listen_try)
{
LOG_WARNING(log, "{}. If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to "
"specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration "
"file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> ."
" Example for disabled IPv4: <listen_host>::</listen_host>",
message);
}
else
{
throw Exception{message, ErrorCodes::NETWORK_ERROR};
}
}
};
/// HTTP
create_server("http_port", [&](UInt16 port)
createServer(listen_host, "http_port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port);
auto address = socketBindListen(socket, listen_host, port);
socket.setReceiveTimeout(settings.http_receive_timeout);
socket.setSendTimeout(settings.http_send_timeout);
@ -925,11 +959,11 @@ int Server::main(const std::vector<std::string> & /*args*/)
});
/// HTTPS
create_server("https_port", [&](UInt16 port)
createServer(listen_host, "https_port", listen_try, [&](UInt16 port)
{
#if USE_SSL
Poco::Net::SecureServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port, /* secure = */ true);
auto address = socketBindListen(socket, listen_host, port, /* secure = */ true);
socket.setReceiveTimeout(settings.http_receive_timeout);
socket.setSendTimeout(settings.http_send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::HTTPServer>(
@ -944,10 +978,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
});
/// TCP
create_server("tcp_port", [&](UInt16 port)
createServer(listen_host, "tcp_port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port);
auto address = socketBindListen(socket, listen_host, port);
socket.setReceiveTimeout(settings.receive_timeout);
socket.setSendTimeout(settings.send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::TCPServer>(
@ -960,10 +994,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
});
/// TCP with PROXY protocol, see https://github.com/wolfeidau/proxyv2/blob/master/docs/proxy-protocol.txt
create_server("tcp_with_proxy_port", [&](UInt16 port)
createServer(listen_host, "tcp_with_proxy_port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port);
auto address = socketBindListen(socket, listen_host, port);
socket.setReceiveTimeout(settings.receive_timeout);
socket.setSendTimeout(settings.send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::TCPServer>(
@ -976,11 +1010,11 @@ int Server::main(const std::vector<std::string> & /*args*/)
});
/// TCP with SSL
create_server("tcp_port_secure", [&](UInt16 port)
createServer(listen_host, "tcp_port_secure", listen_try, [&](UInt16 port)
{
#if USE_SSL
Poco::Net::SecureServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port, /* secure = */ true);
auto address = socketBindListen(socket, listen_host, port, /* secure = */ true);
socket.setReceiveTimeout(settings.receive_timeout);
socket.setSendTimeout(settings.send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::TCPServer>(
@ -997,10 +1031,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
});
/// Interserver IO HTTP
create_server("interserver_http_port", [&](UInt16 port)
createServer(listen_host, "interserver_http_port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port);
auto address = socketBindListen(socket, listen_host, port);
socket.setReceiveTimeout(settings.http_receive_timeout);
socket.setSendTimeout(settings.http_send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::HTTPServer>(
@ -1009,11 +1043,11 @@ int Server::main(const std::vector<std::string> & /*args*/)
LOG_INFO(log, "Listening for replica communication (interserver): http://{}", address.toString());
});
create_server("interserver_https_port", [&](UInt16 port)
createServer(listen_host, "interserver_https_port", listen_try, [&](UInt16 port)
{
#if USE_SSL
Poco::Net::SecureServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port, /* secure = */ true);
auto address = socketBindListen(socket, listen_host, port, /* secure = */ true);
socket.setReceiveTimeout(settings.http_receive_timeout);
socket.setSendTimeout(settings.http_send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::HTTPServer>(
@ -1027,10 +1061,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
#endif
});
create_server("mysql_port", [&](UInt16 port)
createServer(listen_host, "mysql_port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port, /* secure = */ true);
auto address = socketBindListen(socket, listen_host, port, /* secure = */ true);
socket.setReceiveTimeout(Poco::Timespan());
socket.setSendTimeout(settings.send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::TCPServer>(
@ -1042,10 +1076,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
LOG_INFO(log, "Listening for MySQL compatibility protocol: {}", address.toString());
});
create_server("postgresql_port", [&](UInt16 port)
createServer(listen_host, "postgresql_port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port, /* secure = */ true);
auto address = socketBindListen(socket, listen_host, port, /* secure = */ true);
socket.setReceiveTimeout(Poco::Timespan());
socket.setSendTimeout(settings.send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::TCPServer>(
@ -1058,19 +1092,19 @@ int Server::main(const std::vector<std::string> & /*args*/)
});
#if USE_GRPC
create_server("grpc_port", [&](UInt16 port)
createServer(listen_host, "grpc_port", listen_try, [&](UInt16 port)
{
Poco::Net::SocketAddress server_address(listen_host, port);
servers.emplace_back(std::make_unique<GRPCServer>(*this, make_socket_address(listen_host, port)));
servers.emplace_back(std::make_unique<GRPCServer>(*this, makeSocketAddress(listen_host, port, log)));
LOG_INFO(log, "Listening for gRPC protocol: " + server_address.toString());
});
#endif
/// Prometheus (if defined and not setup yet with http_port)
create_server("prometheus.port", [&](UInt16 port)
createServer(listen_host, "prometheus.port", listen_try, [&](UInt16 port)
{
Poco::Net::ServerSocket socket;
auto address = socket_bind_listen(socket, listen_host, port);
auto address = socketBindListen(socket, listen_host, port);
socket.setReceiveTimeout(settings.http_receive_timeout);
socket.setSendTimeout(settings.http_send_timeout);
servers.emplace_back(std::make_unique<Poco::Net::HTTPServer>(
@ -1094,6 +1128,7 @@ int Server::main(const std::vector<std::string> & /*args*/)
int level = level_str.empty() ? INT_MAX : Poco::Logger::parseLevel(level_str);
setTextLog(global_context->getTextLog(), level);
}
buildLoggers(config(), logger());
main_config_reloader->start();
@ -1140,7 +1175,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
{
current_connections = 0;
for (auto & server : servers)
{
server.stop();
current_connections += server.currentConnections();
}
if (!current_connections)
break;
sleep_current_ms += sleep_one_ms;


@ -14,6 +14,13 @@
* 3. Interserver HTTP - for replication.
*/

namespace Poco
{
    namespace Net
    {
        class ServerSocket;
    }
}

namespace DB
{
@ -57,6 +64,13 @@ protected:
private:
    Context * global_context_ptr = nullptr;
private:

    Poco::Net::SocketAddress socketBindListen(Poco::Net::ServerSocket & socket, const std::string & host, UInt16 port, [[maybe_unused]] bool secure = false) const;

    using CreateServerFunc = std::function<void(UInt16)>;
    void createServer(const std::string & listen_host, const char * port_name, bool listen_try, CreateServerFunc && func) const;
};
}


@ -4,4 +4,5 @@
<user_files_path replace="replace">./user_files/</user_files_path>
<format_schema_path replace="replace">./format_schemas/</format_schema_path>
<access_control_path replace="replace">./access/</access_control_path>
<top_level_domains_path replace="replace">./top_level_domains/</top_level_domains_path>
</yandex>


@ -724,6 +724,19 @@
<!-- <path_to_regions_names_files>/opt/geo/</path_to_regions_names_files> -->
<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<!-- Custom TLD lists.
Format: <name>/path/to/file</name>
Changes will not be applied w/o server restart.
Path to the list is under top_level_domains_path (see above).
-->
<top_level_domains_lists>
<!--
<public_suffix_list>/path/to/public_suffix_list.dat</public_suffix_list>
-->
</top_level_domains_lists>
<!-- Configuration of external dictionaries. See:
https://clickhouse.tech/docs/en/sql-reference/dictionaries/external-dictionaries/external-dicts
-->


@ -528,6 +528,7 @@
M(559, INVALID_GRPC_QUERY_INFO) \
M(560, ZSTD_ENCODER_FAILED) \
M(561, ZSTD_DECODER_FAILED) \
M(562, TLD_LIST_NOT_FOUND) \
\
M(999, KEEPER_EXCEPTION) \
M(1000, POCO_EXCEPTION) \


@ -0,0 +1,101 @@
#pragma once

#include <Common/HashTable/HashSet.h>
#include <Common/HashTable/HashTableAllocator.h>
#include <Common/HashTable/StringHashTable.h>

template <typename Key>
struct StringHashSetCell : public HashTableCell<Key, StringHashTableHash, HashTableNoState>
{
    using Base = HashTableCell<Key, StringHashTableHash, HashTableNoState>;
    using Base::Base;

    VoidMapped void_map;
    VoidMapped & getMapped() { return void_map; }
    const VoidMapped & getMapped() const { return void_map; }

    static constexpr bool need_zero_value_storage = false;
};

template <>
struct StringHashSetCell<StringKey16> : public HashTableCell<StringKey16, StringHashTableHash, HashTableNoState>
{
    using Base = HashTableCell<StringKey16, StringHashTableHash, HashTableNoState>;
    using Base::Base;

    VoidMapped void_map;
    VoidMapped & getMapped() { return void_map; }
    const VoidMapped & getMapped() const { return void_map; }

    static constexpr bool need_zero_value_storage = false;

    bool isZero(const HashTableNoState & state) const { return isZero(this->key, state); }
    // Zero means unoccupied cells in hash table. Use key with last word = 0 as
    // zero keys, because such keys are unrepresentable (no way to encode length).
    static bool isZero(const StringKey16 & key_, const HashTableNoState &)
    { return key_.high == 0; }
    void setZero() { this->key.high = 0; }
};

template <>
struct StringHashSetCell<StringKey24> : public HashTableCell<StringKey24, StringHashTableHash, HashTableNoState>
{
    using Base = HashTableCell<StringKey24, StringHashTableHash, HashTableNoState>;
    using Base::Base;

    VoidMapped void_map;
    VoidMapped & getMapped() { return void_map; }
    const VoidMapped & getMapped() const { return void_map; }

    static constexpr bool need_zero_value_storage = false;

    bool isZero(const HashTableNoState & state) const { return isZero(this->key, state); }
    // Zero means unoccupied cells in hash table. Use key with last word = 0 as
    // zero keys, because such keys are unrepresentable (no way to encode length).
    static bool isZero(const StringKey24 & key_, const HashTableNoState &)
    { return key_.c == 0; }
    void setZero() { this->key.c = 0; }
};

template <>
struct StringHashSetCell<StringRef> : public HashSetCellWithSavedHash<StringRef, StringHashTableHash, HashTableNoState>
{
    using Base = HashSetCellWithSavedHash<StringRef, StringHashTableHash, HashTableNoState>;
    using Base::Base;

    VoidMapped void_map;
    VoidMapped & getMapped() { return void_map; }
    const VoidMapped & getMapped() const { return void_map; }

    static constexpr bool need_zero_value_storage = false;
};

template <typename Allocator>
struct StringHashSetSubMaps
{
    using T0 = StringHashTableEmpty<StringHashSetCell<StringRef>>;
    using T1 = HashSetTable<StringKey8, StringHashSetCell<StringKey8>, StringHashTableHash, StringHashTableGrower<>, Allocator>;
    using T2 = HashSetTable<StringKey16, StringHashSetCell<StringKey16>, StringHashTableHash, StringHashTableGrower<>, Allocator>;
    using T3 = HashSetTable<StringKey24, StringHashSetCell<StringKey24>, StringHashTableHash, StringHashTableGrower<>, Allocator>;
    using Ts = HashSetTable<StringRef, StringHashSetCell<StringRef>, StringHashTableHash, StringHashTableGrower<>, Allocator>;
};
template <typename Allocator = HashTableAllocator>
class StringHashSet : public StringHashTable<StringHashSetSubMaps<Allocator>>
{
public:
using Key = StringRef;
using Base = StringHashTable<StringHashSetSubMaps<Allocator>>;
using Self = StringHashSet;
using LookupResult = typename Base::LookupResult;
using Base::Base;
template <typename KeyHolder>
void ALWAYS_INLINE emplace(KeyHolder && key_holder, bool & inserted)
{
LookupResult it;
Base::emplace(key_holder, it, inserted);
}
};

View File

@ -212,7 +212,7 @@ public:
using LookupResult = StringHashTableLookupResult<typename cell_type::mapped_type>;
using ConstLookupResult = StringHashTableLookupResult<const typename cell_type::mapped_type>;
StringHashTable() {}
StringHashTable() = default;
StringHashTable(size_t reserve_for_num_elements)
: m1{reserve_for_num_elements / 4}
@ -222,8 +222,15 @@ public:
{
}
StringHashTable(StringHashTable && rhs) { *this = std::move(rhs); }
~StringHashTable() {}
StringHashTable(StringHashTable && rhs)
: m1(std::move(rhs.m1))
, m2(std::move(rhs.m2))
, m3(std::move(rhs.m3))
, ms(std::move(rhs.ms))
{
}
~StringHashTable() = default;
public:
// Dispatch is written in a way that maximizes the performance:

View File

@ -70,7 +70,7 @@ LazyPipeFDs::~LazyPipeFDs()
}
void LazyPipeFDs::setNonBlocking()
void LazyPipeFDs::setNonBlockingWrite()
{
int flags = fcntl(fds_rw[1], F_GETFL, 0);
if (-1 == flags)
@ -79,6 +79,21 @@ void LazyPipeFDs::setNonBlocking()
throwFromErrno("Cannot set non-blocking mode of pipe", ErrorCodes::CANNOT_FCNTL);
}
void LazyPipeFDs::setNonBlockingRead()
{
int flags = fcntl(fds_rw[0], F_GETFL, 0);
if (-1 == flags)
throwFromErrno("Cannot get file status flags of pipe", ErrorCodes::CANNOT_FCNTL);
if (-1 == fcntl(fds_rw[0], F_SETFL, flags | O_NONBLOCK))
throwFromErrno("Cannot set non-blocking mode of pipe", ErrorCodes::CANNOT_FCNTL);
}
void LazyPipeFDs::setNonBlockingReadWrite()
{
setNonBlockingRead();
setNonBlockingWrite();
}
void LazyPipeFDs::tryIncreaseSize(int desired_size)
{
#if defined(OS_LINUX)

View File

@ -17,7 +17,12 @@ struct LazyPipeFDs
void open();
void close();
void setNonBlocking();
/// Set O_NONBLOCK on the write end, the read end, or both ends of the pipe, preserving existing flags.
/// Throws an exception if fcntl was not successful.
void setNonBlockingWrite();
void setNonBlockingRead();
void setNonBlockingReadWrite();
void tryIncreaseSize(int desired_size);
~LazyPipeFDs();
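
Review note: the get-then-set `fcntl` pattern used by `setNonBlockingRead()` and `setNonBlockingWrite()` can be sketched with plain POSIX calls. This is a minimal illustration, not the ClickHouse code: `setNonBlockingFd` and `isNonBlocking` are hypothetical helper names, and `std::system_error` stands in for `throwFromErrno`.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <system_error>
#include <unistd.h>

// Hypothetical helper (not part of ClickHouse): add O_NONBLOCK to a descriptor
// while keeping whatever file status flags are already set on it.
void setNonBlockingFd(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        throw std::system_error(errno, std::generic_category(), "Cannot get file status flags");

    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        throw std::system_error(errno, std::generic_category(), "Cannot set non-blocking mode");
}

// Hypothetical helper: report whether O_NONBLOCK is currently set on fd.
bool isNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        throw std::system_error(errno, std::generic_category(), "Cannot get file status flags");
    return (flags & O_NONBLOCK) != 0;
}
```

The two ends of a pipe are separate open file descriptions, so setting the flag on one end leaves the other end blocking. That is presumably why the commit splits the old `setNonBlocking()` into per-end methods.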

View File

@ -0,0 +1,106 @@
#include <Common/TLDListsHolder.h>
#include <Common/StringUtils/StringUtils.h>
#include <common/logger_useful.h>
#include <IO/ReadBufferFromFile.h>
#include <string_view>
#include <unordered_set>
namespace DB
{
namespace ErrorCodes
{
extern const int TLD_LIST_NOT_FOUND;
}
///
/// TLDList
///
TLDList::TLDList(size_t size)
: tld_container(size)
, pool(std::make_unique<Arena>(10 << 20))
{}
bool TLDList::insert(const StringRef & host)
{
bool inserted;
tld_container.emplace(DB::ArenaKeyHolder{host, *pool}, inserted);
return inserted;
}
bool TLDList::has(const StringRef & host) const
{
return tld_container.has(host);
}
///
/// TLDListsHolder
///
TLDListsHolder & TLDListsHolder::getInstance()
{
static TLDListsHolder instance;
return instance;
}
TLDListsHolder::TLDListsHolder() = default;
void TLDListsHolder::parseConfig(const std::string & top_level_domains_path, const Poco::Util::AbstractConfiguration & config)
{
Poco::Util::AbstractConfiguration::Keys config_keys;
config.keys("top_level_domains_lists", config_keys);
Poco::Logger * log = &Poco::Logger::get("TLDListsHolder");
for (const auto & key : config_keys)
{
const std::string & path = top_level_domains_path + config.getString("top_level_domains_lists." + key);
LOG_TRACE(log, "{} loading from {}", key, path);
size_t hosts = parseAndAddTldList(key, path);
LOG_INFO(log, "{} was added ({} hosts)", key, hosts);
}
}
size_t TLDListsHolder::parseAndAddTldList(const std::string & name, const std::string & path)
{
std::unordered_set<std::string> tld_list_tmp;
ReadBufferFromFile in(path);
while (!in.eof())
{
char * newline = find_first_symbols<'\n'>(in.position(), in.buffer().end());
if (newline >= in.buffer().end())
break;
std::string_view line(in.position(), newline - in.position());
in.position() = newline + 1;
/// Skip comments
if (line.size() >= 2 && line[0] == '/' && line[1] == '/')
continue;
trim(line);
/// Skip empty line
if (line.empty())
continue;
tld_list_tmp.emplace(line);
}
TLDList tld_list(tld_list_tmp.size());
for (const auto & host : tld_list_tmp)
{
StringRef host_ref{host.data(), host.size()};
tld_list.insert(host_ref);
}
size_t tld_list_size = tld_list.size();
std::lock_guard<std::mutex> lock(tld_lists_map_mutex);
tld_lists_map.insert(std::make_pair(name, std::move(tld_list)));
return tld_list_size;
}
const TLDList & TLDListsHolder::getTldList(const std::string & name)
{
std::lock_guard<std::mutex> lock(tld_lists_map_mutex);
auto it = tld_lists_map.find(name);
if (it == tld_lists_map.end())
throw Exception(ErrorCodes::TLD_LIST_NOT_FOUND, "TLD list {} does not exist", name);
return it->second;
}
}
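
Review note: the line filtering done by `parseAndAddTldList` (skip `//` comments, skip empty lines, deduplicate the rest) can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the code above: `parseTldLines` and `trimView` are hypothetical names, the input is a `std::istream` rather than `ReadBufferFromFile`, and the line is trimmed before the comment check, so an indented comment is skipped as well.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical stand-in for trim(): strips ASCII whitespace from both ends.
std::string_view trimView(std::string_view s)
{
    constexpr std::string_view whitespace = " \t\r\n";
    auto begin = s.find_first_not_of(whitespace);
    if (begin == std::string_view::npos)
        return {};
    auto end = s.find_last_not_of(whitespace);
    return s.substr(begin, end - begin + 1);
}

// Hypothetical helper: collect unique entries from a public-suffix-style list.
std::unordered_set<std::string> parseTldLines(std::istream & in)
{
    std::unordered_set<std::string> result;
    std::string raw;
    while (std::getline(in, raw))
    {
        std::string_view line = trimView(raw);

        // Skip empty lines.
        if (line.empty())
            continue;

        // Skip comments.
        if (line.size() >= 2 && line[0] == '/' && line[1] == '/')
            continue;

        result.emplace(line);
    }
    return result;
}
```

Unlike the loop above, `std::getline` also yields a final line that has no trailing newline.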

View File

@ -0,0 +1,65 @@
#pragma once
#include <common/defines.h>
#include <common/StringRef.h>
#include <Common/HashTable/StringHashSet.h>
#include <Common/Arena.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <mutex>
#include <string>
#include <unordered_map>
namespace DB
{
/// Custom TLD List
///
/// Unlike tldLookup (which uses gperf) this one uses plain StringHashSet.
class TLDList
{
public:
using Container = StringHashSet<>;
TLDList(size_t size);
/// Returns true if the element was inserted, i.e. tld_container did not already contain it.
bool insert(const StringRef & host);
/// Check whether such a TLD is in the list.
bool has(const StringRef & host) const;
size_t size() const { return tld_container.size(); }
private:
Container tld_container;
std::unique_ptr<Arena> pool;
};
class TLDListsHolder
{
public:
using Map = std::unordered_map<std::string, TLDList>;
static TLDListsHolder & getInstance();
/// Parse the "top_level_domains_lists" section
/// and add each list found there.
void parseConfig(const std::string & top_level_domains_path, const Poco::Util::AbstractConfiguration & config);
/// Parse file and add it as a Set to the list of TLDs
/// - "//" -- comment,
/// - empty lines will be ignored.
///
/// Example: https://publicsuffix.org/list/public_suffix_list.dat
///
/// Return size of the list.
size_t parseAndAddTldList(const std::string & name, const std::string & path);
/// Throws TLD_LIST_NOT_FOUND if list does not exist
const TLDList & getTldList(const std::string & name);
protected:
TLDListsHolder();
std::mutex tld_lists_map_mutex;
Map tld_lists_map;
};
}

View File

@ -36,7 +36,7 @@ TraceCollector::TraceCollector(std::shared_ptr<TraceLog> trace_log_)
/** Turn write end of pipe to non-blocking mode to avoid deadlocks
* when QueryProfiler is invoked under locks and TraceCollector cannot pull data from pipe.
*/
pipe.setNonBlocking();
pipe.setNonBlockingWrite();
pipe.tryIncreaseSize(1 << 20);
thread = ThreadFromGlobalPool(&TraceCollector::run, this);

View File

@ -31,7 +31,6 @@ using Undo = std::function<void()>;
struct TestKeeperRequest : virtual Request
{
virtual bool isMutable() const { return false; }
virtual ResponsePtr createResponse() const = 0;
virtual std::pair<ResponsePtr, Undo> process(TestKeeper::Container & container, int64_t zxid) const = 0;
virtual void processWatches(TestKeeper::Watches & /*watches*/, TestKeeper::Watches & /*list_watches*/) const {}
@ -85,7 +84,6 @@ struct TestKeeperRemoveRequest final : RemoveRequest, TestKeeperRequest
{
TestKeeperRemoveRequest() = default;
explicit TestKeeperRemoveRequest(const RemoveRequest & base) : RemoveRequest(base) {}
bool isMutable() const override { return true; }
ResponsePtr createResponse() const override;
std::pair<ResponsePtr, Undo> process(TestKeeper::Container & container, int64_t zxid) const override;
@ -112,7 +110,6 @@ struct TestKeeperSetRequest final : SetRequest, TestKeeperRequest
{
TestKeeperSetRequest() = default;
explicit TestKeeperSetRequest(const SetRequest & base) : SetRequest(base) {}
bool isMutable() const override { return true; }
ResponsePtr createResponse() const override;
std::pair<ResponsePtr, Undo> process(TestKeeper::Container & container, int64_t zxid) const override;

View File

@ -125,8 +125,6 @@ private:
Watches watches;
Watches list_watches; /// Watches for 'list' request (watches on children).
void createWatchCallBack(const String & path);
using RequestsQueue = ConcurrentBoundedQueue<RequestInfo>;
RequestsQueue requests_queue{1};

View File

@ -0,0 +1,806 @@
#include <Common/ZooKeeper/TestKeeperStorage.h>
#include <Common/ZooKeeper/IKeeper.h>
#include <Common/setThreadName.h>
#include <mutex>
#include <functional>
#include <common/logger_useful.h>
#include <Common/StringUtils/StringUtils.h>
#include <sstream>
#include <iomanip>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int TIMEOUT_EXCEEDED;
extern const int BAD_ARGUMENTS;
}
}
namespace zkutil
{
using namespace DB;
static String parentPath(const String & path)
{
auto rslash_pos = path.rfind('/');
if (rslash_pos > 0)
return path.substr(0, rslash_pos);
return "/";
}
static String baseName(const String & path)
{
auto rslash_pos = path.rfind('/');
return path.substr(rslash_pos + 1);
}
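
Review note: the edge cases of the two path helpers are easy to misread, so here is a standalone copy with `std::string` in place of `String` (an assumption made only so that the sketch compiles outside ClickHouse). The logic is copied from the functions above.

```cpp
#include <cassert>
#include <string>

// Same logic as the static helper above, using std::string.
std::string parentPath(const std::string & path)
{
    auto rslash_pos = path.rfind('/');
    if (rslash_pos > 0)
        return path.substr(0, rslash_pos);
    return "/";
}

// Same logic as the static helper above, using std::string.
std::string baseName(const std::string & path)
{
    auto rslash_pos = path.rfind('/');
    return path.substr(rslash_pos + 1);
}
```

For `/a/b` the parent is `/a` and the base name is `b`. A first-level node such as `/a` has parent `/`. If the path contains no slash at all, `rfind` returns `npos`, so `parentPath` returns the path unchanged. The helpers therefore rely on callers passing absolute paths.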
static void processWatchesImpl(const String & path, TestKeeperStorage::Watches & watches, TestKeeperStorage::Watches & list_watches, Coordination::Event event_type)
{
auto it = watches.find(path);
if (it != watches.end())
{
std::shared_ptr<Coordination::ZooKeeperWatchResponse> watch_response = std::make_shared<Coordination::ZooKeeperWatchResponse>();
watch_response->path = path;
watch_response->xid = -1;
watch_response->zxid = -1;
watch_response->type = event_type;
watch_response->state = Coordination::State::CONNECTED;
for (auto & watcher : it->second)
if (watcher.watch_callback)
watcher.watch_callback(watch_response);
watches.erase(it);
}
auto parent_path = parentPath(path);
it = list_watches.find(parent_path);
if (it != list_watches.end())
{
std::shared_ptr<Coordination::ZooKeeperWatchResponse> watch_list_response = std::make_shared<Coordination::ZooKeeperWatchResponse>();
watch_list_response->path = parent_path;
watch_list_response->xid = -1;
watch_list_response->zxid = -1;
watch_list_response->type = Coordination::Event::CHILD;
watch_list_response->state = Coordination::State::CONNECTED;
for (auto & watcher : it->second)
if (watcher.watch_callback)
watcher.watch_callback(watch_list_response);
list_watches.erase(it);
}
}
TestKeeperStorage::TestKeeperStorage()
{
container.emplace("/", Node());
processing_thread = ThreadFromGlobalPool([this] { processingThread(); });
}
using Undo = std::function<void()>;
struct TestKeeperStorageRequest
{
Coordination::ZooKeeperRequestPtr zk_request;
explicit TestKeeperStorageRequest(const Coordination::ZooKeeperRequestPtr & zk_request_)
: zk_request(zk_request_)
{}
virtual std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & ephemerals, int64_t zxid, int64_t session_id) const = 0;
virtual void processWatches(TestKeeperStorage::Watches & /*watches*/, TestKeeperStorage::Watches & /*list_watches*/) const {}
virtual ~TestKeeperStorageRequest() = default;
};
struct TestKeeperStorageHeartbeatRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & /* container */, TestKeeperStorage::Ephemerals & /* ephemerals */, int64_t /* zxid */, int64_t /* session_id */) const override
{
return {zk_request->makeResponse(), {}};
}
};
struct TestKeeperStorageCreateRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
void processWatches(TestKeeperStorage::Watches & watches, TestKeeperStorage::Watches & list_watches) const override
{
processWatchesImpl(zk_request->getPath(), watches, list_watches, Coordination::Event::CREATED);
}
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & ephemerals, int64_t zxid, int64_t session_id) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Undo undo;
Coordination::ZooKeeperCreateResponse & response = dynamic_cast<Coordination::ZooKeeperCreateResponse &>(*response_ptr);
Coordination::ZooKeeperCreateRequest & request = dynamic_cast<Coordination::ZooKeeperCreateRequest &>(*zk_request);
if (container.count(request.path))
{
response.error = Coordination::Error::ZNODEEXISTS;
}
else
{
auto it = container.find(parentPath(request.path));
if (it == container.end())
{
response.error = Coordination::Error::ZNONODE;
}
else if (it->second.is_ephemeral)
{
response.error = Coordination::Error::ZNOCHILDRENFOREPHEMERALS;
}
else
{
TestKeeperStorage::Node created_node;
created_node.seq_num = 0;
created_node.stat.czxid = zxid;
created_node.stat.mzxid = zxid;
created_node.stat.ctime = std::chrono::system_clock::now().time_since_epoch() / std::chrono::milliseconds(1);
created_node.stat.mtime = created_node.stat.ctime;
created_node.stat.numChildren = 0;
created_node.stat.dataLength = request.data.length();
created_node.data = request.data;
created_node.is_ephemeral = request.is_ephemeral;
created_node.is_sequental = request.is_sequential;
std::string path_created = request.path;
if (request.is_sequential)
{
auto seq_num = it->second.seq_num;
std::stringstream seq_num_str; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
seq_num_str.exceptions(std::ios::failbit);
seq_num_str << std::setw(10) << std::setfill('0') << seq_num;
path_created += seq_num_str.str();
}
/// Increment sequential number even if node is not sequential
++it->second.seq_num;
response.path_created = path_created;
container.emplace(path_created, std::move(created_node));
if (request.is_ephemeral)
ephemerals[session_id].emplace(path_created);
undo = [&container, &ephemerals, session_id, path_created, is_ephemeral = request.is_ephemeral, parent_path = it->first]
{
container.erase(path_created);
if (is_ephemeral)
ephemerals[session_id].erase(path_created);
auto & undo_parent = container.at(parent_path);
--undo_parent.stat.cversion;
--undo_parent.stat.numChildren;
--undo_parent.seq_num;
};
++it->second.stat.cversion;
++it->second.stat.numChildren;
response.error = Coordination::Error::ZOK;
}
}
return { response_ptr, undo };
}
};
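
Review note: the sequential-node naming used in `TestKeeperStorageCreateRequest::process` appends the parent's counter as a zero-padded, 10-digit decimal suffix. A minimal sketch follows. `makeSequentialPath` is a hypothetical name, and the counter type `int32_t` is an assumption, since the `Node` definition is not part of this hunk.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: append a zero-padded 10-digit sequence number to a path.
std::string makeSequentialPath(const std::string & path, int32_t seq_num)
{
    std::ostringstream out;
    out << std::setw(10) << std::setfill('0') << seq_num;
    return path + out.str();
}
```

For example, a request for `/queue/item-` with the parent's counter at 42 would create `/queue/item-0000000042`. Because the counter is incremented for non-sequential children too, the suffixes of consecutive sequential nodes are not necessarily contiguous.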
struct TestKeeperStorageGetRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & /* ephemerals */, int64_t /* zxid */, int64_t /* session_id */) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperGetResponse & response = dynamic_cast<Coordination::ZooKeeperGetResponse &>(*response_ptr);
Coordination::ZooKeeperGetRequest & request = dynamic_cast<Coordination::ZooKeeperGetRequest &>(*zk_request);
auto it = container.find(request.path);
if (it == container.end())
{
response.error = Coordination::Error::ZNONODE;
}
else
{
response.stat = it->second.stat;
response.data = it->second.data;
response.error = Coordination::Error::ZOK;
}
return { response_ptr, {} };
}
};
struct TestKeeperStorageRemoveRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & ephemerals, int64_t /*zxid*/, int64_t session_id) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperRemoveResponse & response = dynamic_cast<Coordination::ZooKeeperRemoveResponse &>(*response_ptr);
Coordination::ZooKeeperRemoveRequest & request = dynamic_cast<Coordination::ZooKeeperRemoveRequest &>(*zk_request);
Undo undo;
auto it = container.find(request.path);
if (it == container.end())
{
response.error = Coordination::Error::ZNONODE;
}
else if (request.version != -1 && request.version != it->second.stat.version)
{
response.error = Coordination::Error::ZBADVERSION;
}
else if (it->second.stat.numChildren)
{
response.error = Coordination::Error::ZNOTEMPTY;
}
else
{
auto prev_node = it->second;
if (prev_node.is_ephemeral)
ephemerals[session_id].erase(request.path);
container.erase(it);
auto & parent = container.at(parentPath(request.path));
--parent.stat.numChildren;
++parent.stat.cversion;
response.error = Coordination::Error::ZOK;
undo = [prev_node, &container, &ephemerals, session_id, path = request.path]
{
if (prev_node.is_ephemeral)
ephemerals[session_id].emplace(path);
container.emplace(path, prev_node);
auto & undo_parent = container.at(parentPath(path));
++undo_parent.stat.numChildren;
--undo_parent.stat.cversion;
};
}
return { response_ptr, undo };
}
void processWatches(TestKeeperStorage::Watches & watches, TestKeeperStorage::Watches & list_watches) const override
{
processWatchesImpl(zk_request->getPath(), watches, list_watches, Coordination::Event::DELETED);
}
};
struct TestKeeperStorageExistsRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & /* ephemerals */, int64_t /*zxid*/, int64_t /* session_id */) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperExistsResponse & response = dynamic_cast<Coordination::ZooKeeperExistsResponse &>(*response_ptr);
Coordination::ZooKeeperExistsRequest & request = dynamic_cast<Coordination::ZooKeeperExistsRequest &>(*zk_request);
auto it = container.find(request.path);
if (it != container.end())
{
response.stat = it->second.stat;
response.error = Coordination::Error::ZOK;
}
else
{
response.error = Coordination::Error::ZNONODE;
}
return { response_ptr, {} };
}
};
struct TestKeeperStorageSetRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & /* ephemerals */, int64_t zxid, int64_t /* session_id */) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperSetResponse & response = dynamic_cast<Coordination::ZooKeeperSetResponse &>(*response_ptr);
Coordination::ZooKeeperSetRequest & request = dynamic_cast<Coordination::ZooKeeperSetRequest &>(*zk_request);
Undo undo;
auto it = container.find(request.path);
if (it == container.end())
{
response.error = Coordination::Error::ZNONODE;
}
else if (request.version == -1 || request.version == it->second.stat.version)
{
auto prev_node = it->second;
it->second.data = request.data;
++it->second.stat.version;
it->second.stat.mzxid = zxid;
it->second.stat.mtime = std::chrono::system_clock::now().time_since_epoch() / std::chrono::milliseconds(1);
it->second.stat.dataLength = request.data.length();
++container.at(parentPath(request.path)).stat.cversion;
response.stat = it->second.stat;
response.error = Coordination::Error::ZOK;
undo = [prev_node, &container, path = request.path]
{
container.at(path) = prev_node;
--container.at(parentPath(path)).stat.cversion;
};
}
else
{
response.error = Coordination::Error::ZBADVERSION;
}
return { response_ptr, undo };
}
void processWatches(TestKeeperStorage::Watches & watches, TestKeeperStorage::Watches & list_watches) const override
{
processWatchesImpl(zk_request->getPath(), watches, list_watches, Coordination::Event::CHANGED);
}
};
struct TestKeeperStorageListRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & /* ephemerals */, int64_t /*zxid*/, int64_t /*session_id*/) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperListResponse & response = dynamic_cast<Coordination::ZooKeeperListResponse &>(*response_ptr);
Coordination::ZooKeeperListRequest & request = dynamic_cast<Coordination::ZooKeeperListRequest &>(*zk_request);
auto it = container.find(request.path);
if (it == container.end())
{
response.error = Coordination::Error::ZNONODE;
}
else
{
auto path_prefix = request.path;
if (path_prefix.empty())
throw DB::Exception("Logical error: path cannot be empty", ErrorCodes::LOGICAL_ERROR);
if (path_prefix.back() != '/')
path_prefix += '/';
/// Fairly inefficient.
for (auto child_it = container.upper_bound(path_prefix);
child_it != container.end() && startsWith(child_it->first, path_prefix);
++child_it)
{
if (parentPath(child_it->first) == request.path)
response.names.emplace_back(baseName(child_it->first));
}
response.stat = it->second.stat;
response.error = Coordination::Error::ZOK;
}
return { response_ptr, {} };
}
};
struct TestKeeperStorageCheckRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & /* ephemerals */, int64_t /*zxid*/, int64_t /*session_id*/) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperCheckResponse & response = dynamic_cast<Coordination::ZooKeeperCheckResponse &>(*response_ptr);
Coordination::ZooKeeperCheckRequest & request = dynamic_cast<Coordination::ZooKeeperCheckRequest &>(*zk_request);
auto it = container.find(request.path);
if (it == container.end())
{
response.error = Coordination::Error::ZNONODE;
}
else if (request.version != -1 && request.version != it->second.stat.version)
{
response.error = Coordination::Error::ZBADVERSION;
}
else
{
response.error = Coordination::Error::ZOK;
}
return { response_ptr, {} };
}
};
struct TestKeeperStorageMultiRequest final : public TestKeeperStorageRequest
{
std::vector<TestKeeperStorageRequestPtr> concrete_requests;
explicit TestKeeperStorageMultiRequest(const Coordination::ZooKeeperRequestPtr & zk_request_)
: TestKeeperStorageRequest(zk_request_)
{
Coordination::ZooKeeperMultiRequest & request = dynamic_cast<Coordination::ZooKeeperMultiRequest &>(*zk_request);
concrete_requests.reserve(request.requests.size());
for (const auto & sub_request : request.requests)
{
auto sub_zk_request = std::dynamic_pointer_cast<Coordination::ZooKeeperRequest>(sub_request);
if (sub_zk_request->getOpNum() == Coordination::OpNum::Create)
{
concrete_requests.push_back(std::make_shared<TestKeeperStorageCreateRequest>(sub_zk_request));
}
else if (sub_zk_request->getOpNum() == Coordination::OpNum::Remove)
{
concrete_requests.push_back(std::make_shared<TestKeeperStorageRemoveRequest>(sub_zk_request));
}
else if (sub_zk_request->getOpNum() == Coordination::OpNum::Set)
{
concrete_requests.push_back(std::make_shared<TestKeeperStorageSetRequest>(sub_zk_request));
}
else if (sub_zk_request->getOpNum() == Coordination::OpNum::Check)
{
concrete_requests.push_back(std::make_shared<TestKeeperStorageCheckRequest>(sub_zk_request));
}
else
throw DB::Exception(ErrorCodes::BAD_ARGUMENTS, "Illegal command as part of multi ZooKeeper request {}", sub_zk_request->getOpNum());
}
}
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container & container, TestKeeperStorage::Ephemerals & ephemerals, int64_t zxid, int64_t session_id) const override
{
Coordination::ZooKeeperResponsePtr response_ptr = zk_request->makeResponse();
Coordination::ZooKeeperMultiResponse & response = dynamic_cast<Coordination::ZooKeeperMultiResponse &>(*response_ptr);
std::vector<Undo> undo_actions;
try
{
size_t i = 0;
for (const auto & concrete_request : concrete_requests)
{
auto [ cur_response, undo_action ] = concrete_request->process(container, ephemerals, zxid, session_id);
response.responses[i] = cur_response;
if (cur_response->error != Coordination::Error::ZOK)
{
for (size_t j = 0; j <= i; ++j)
{
auto response_error = response.responses[j]->error;
response.responses[j] = std::make_shared<Coordination::ZooKeeperErrorResponse>();
response.responses[j]->error = response_error;
}
for (size_t j = i + 1; j < response.responses.size(); ++j)
{
response.responses[j] = std::make_shared<Coordination::ZooKeeperErrorResponse>();
response.responses[j]->error = Coordination::Error::ZRUNTIMEINCONSISTENCY;
}
for (auto it = undo_actions.rbegin(); it != undo_actions.rend(); ++it)
if (*it)
(*it)();
return { response_ptr, {} };
}
else
undo_actions.emplace_back(std::move(undo_action));
++i;
}
response.error = Coordination::Error::ZOK;
return { response_ptr, {} };
}
catch (...)
{
for (auto it = undo_actions.rbegin(); it != undo_actions.rend(); ++it)
if (*it)
(*it)();
throw;
}
}
void processWatches(TestKeeperStorage::Watches & watches, TestKeeperStorage::Watches & list_watches) const override
{
for (const auto & generic_request : concrete_requests)
generic_request->processWatches(watches, list_watches);
}
};
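
Review note: the all-or-nothing behaviour of `TestKeeperStorageMultiRequest::process` rests on collecting one undo closure per successful sub-request and running them in reverse order on the first failure. The sketch below shows that pattern on a plain `std::map`. `Store`, `Op`, `makeInsert` and `applyAll` are hypothetical names introduced for illustration, and the exception path (`catch (...)`) is omitted.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Store = std::map<std::string, int>;
using Undo = std::function<void()>;
using Op = std::function<std::pair<bool, Undo>(Store &)>;

// Hypothetical operation: insert a key, failing if it already exists.
// On success it returns a closure that reverts the insertion.
Op makeInsert(std::string key, int value)
{
    return [key = std::move(key), value](Store & store) -> std::pair<bool, Undo>
    {
        if (!store.emplace(key, value).second)
            return {false, {}};
        return {true, [&store, key] { store.erase(key); }};
    };
}

// Apply all operations, or none: on the first failure, run the collected
// undo actions in reverse order and report failure.
bool applyAll(Store & store, const std::vector<Op> & ops)
{
    std::vector<Undo> undo_actions;
    for (const auto & op : ops)
    {
        auto [ok, undo] = op(store);
        if (!ok)
        {
            for (auto it = undo_actions.rbegin(); it != undo_actions.rend(); ++it)
                if (*it)
                    (*it)();
            return false;
        }
        undo_actions.emplace_back(std::move(undo));
    }
    return true;
}
```

Running the undo actions in reverse order matters when later operations depend on earlier ones, for example removing a child that was created before its parent's counters are restored.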
struct TestKeeperStorageCloseRequest final : public TestKeeperStorageRequest
{
using TestKeeperStorageRequest::TestKeeperStorageRequest;
std::pair<Coordination::ZooKeeperResponsePtr, Undo> process(TestKeeperStorage::Container &, TestKeeperStorage::Ephemerals &, int64_t, int64_t) const override
{
throw DB::Exception("Called process on close request", ErrorCodes::LOGICAL_ERROR);
}
};
void TestKeeperStorage::processingThread()
{
setThreadName("TestKeeperSProc");
try
{
while (!shutdown)
{
RequestInfo info;
UInt64 max_wait = UInt64(operation_timeout.totalMilliseconds());
if (requests_queue.tryPop(info, max_wait))
{
if (shutdown)
break;
auto zk_request = info.request->zk_request;
if (zk_request->getOpNum() == Coordination::OpNum::Close)
{
auto it = ephemerals.find(info.session_id);
if (it != ephemerals.end())
{
for (const auto & ephemeral_path : it->second)
{
container.erase(ephemeral_path);
processWatchesImpl(ephemeral_path, watches, list_watches, Coordination::Event::DELETED);
}
ephemerals.erase(it);
}
clearDeadWatches(info.session_id);
/// Finish connection
auto response = std::make_shared<Coordination::ZooKeeperCloseResponse>();
response->xid = zk_request->xid;
response->zxid = getZXID();
info.response_callback(response);
}
else
{
auto [response, _] = info.request->process(container, ephemerals, zxid, info.session_id);
if (info.watch_callback)
{
if (response->error == Coordination::Error::ZOK)
{
auto & watches_type = zk_request->getOpNum() == Coordination::OpNum::List || zk_request->getOpNum() == Coordination::OpNum::SimpleList
? list_watches
: watches;
watches_type[zk_request->getPath()].emplace_back(Watcher{info.session_id, info.watch_callback});
sessions_and_watchers[info.session_id].emplace(zk_request->getPath());
}
else if (response->error == Coordination::Error::ZNONODE && zk_request->getOpNum() == Coordination::OpNum::Exists)
{
watches[zk_request->getPath()].emplace_back(Watcher{info.session_id, info.watch_callback});
sessions_and_watchers[info.session_id].emplace(zk_request->getPath());
}
else
{
std::shared_ptr<Coordination::ZooKeeperWatchResponse> watch_response = std::make_shared<Coordination::ZooKeeperWatchResponse>();
watch_response->path = zk_request->getPath();
watch_response->xid = -1;
watch_response->error = response->error;
watch_response->type = Coordination::Event::NOTWATCHING;
info.watch_callback(watch_response);
}
}
if (response->error == Coordination::Error::ZOK)
info.request->processWatches(watches, list_watches);
response->xid = zk_request->xid;
response->zxid = getZXID();
info.response_callback(response);
}
}
}
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
finalize();
}
}
void TestKeeperStorage::finalize()
{
{
std::lock_guard lock(push_request_mutex);
if (shutdown)
return;
shutdown = true;
if (processing_thread.joinable())
processing_thread.join();
}
try
{
{
auto finish_watch = [] (const auto & watch_pair)
{
Coordination::ZooKeeperWatchResponse response;
response.type = Coordination::SESSION;
response.state = Coordination::EXPIRED_SESSION;
response.error = Coordination::Error::ZSESSIONEXPIRED;
for (auto & watcher : watch_pair.second)
{
if (watcher.watch_callback)
{
try
{
watcher.watch_callback(std::make_shared<Coordination::ZooKeeperWatchResponse>(response));
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
}
}
}
};
for (auto & path_watch : watches)
finish_watch(path_watch);
watches.clear();
for (auto & path_watch : list_watches)
finish_watch(path_watch);
list_watches.clear();
sessions_and_watchers.clear();
}
RequestInfo info;
while (requests_queue.tryPop(info))
{
auto response = info.request->zk_request->makeResponse();
response->error = Coordination::Error::ZSESSIONEXPIRED;
try
{
info.response_callback(response);
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
}
}
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
}
}
class TestKeeperWrapperFactory final : private boost::noncopyable
{
public:
using Creator = std::function<TestKeeperStorageRequestPtr(const Coordination::ZooKeeperRequestPtr &)>;
using OpNumToRequest = std::unordered_map<Coordination::OpNum, Creator>;
static TestKeeperWrapperFactory & instance()
{
static TestKeeperWrapperFactory factory;
return factory;
}
TestKeeperStorageRequestPtr get(const Coordination::ZooKeeperRequestPtr & zk_request) const
{
auto it = op_num_to_request.find(zk_request->getOpNum());
if (it == op_num_to_request.end())
throw DB::Exception("Unknown operation type " + toString(zk_request->getOpNum()), ErrorCodes::LOGICAL_ERROR);
return it->second(zk_request);
}
void registerRequest(Coordination::OpNum op_num, Creator creator)
{
if (!op_num_to_request.try_emplace(op_num, creator).second)
throw DB::Exception(ErrorCodes::LOGICAL_ERROR, "Request with op num {} already registered", op_num);
}
private:
OpNumToRequest op_num_to_request;
TestKeeperWrapperFactory();
};
template<Coordination::OpNum num, typename RequestT>
void registerTestKeeperRequestWrapper(TestKeeperWrapperFactory & factory)
{
factory.registerRequest(num, [] (const Coordination::ZooKeeperRequestPtr & zk_request) { return std::make_shared<RequestT>(zk_request); });
}
TestKeeperWrapperFactory::TestKeeperWrapperFactory()
{
registerTestKeeperRequestWrapper<Coordination::OpNum::Heartbeat, TestKeeperStorageHeartbeatRequest>(*this);
//registerTestKeeperRequestWrapper<Coordination::OpNum::Auth, TestKeeperStorageAuthRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Close, TestKeeperStorageCloseRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Create, TestKeeperStorageCreateRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Remove, TestKeeperStorageRemoveRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Exists, TestKeeperStorageExistsRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Get, TestKeeperStorageGetRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Set, TestKeeperStorageSetRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::List, TestKeeperStorageListRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::SimpleList, TestKeeperStorageListRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Check, TestKeeperStorageCheckRequest>(*this);
registerTestKeeperRequestWrapper<Coordination::OpNum::Multi, TestKeeperStorageMultiRequest>(*this);
}
void TestKeeperStorage::putRequest(const Coordination::ZooKeeperRequestPtr & request, int64_t session_id, ResponseCallback callback)
{
TestKeeperStorageRequestPtr storage_request = TestKeeperWrapperFactory::instance().get(request);
RequestInfo request_info;
request_info.time = clock::now();
request_info.request = storage_request;
request_info.session_id = session_id;
request_info.response_callback = callback;
/// Put close requests without timeouts
auto timeout = request->getOpNum() == Coordination::OpNum::Close ? 0 : operation_timeout.totalMilliseconds();
std::lock_guard lock(push_request_mutex);
if (!requests_queue.tryPush(std::move(request_info), timeout))
throw Exception("Cannot push request to queue within operation timeout", ErrorCodes::TIMEOUT_EXCEEDED);
}
void TestKeeperStorage::putRequest(const Coordination::ZooKeeperRequestPtr & request, int64_t session_id, ResponseCallback callback, ResponseCallback watch_callback)
{
TestKeeperStorageRequestPtr storage_request = TestKeeperWrapperFactory::instance().get(request);
RequestInfo request_info;
request_info.time = clock::now();
request_info.request = storage_request;
request_info.session_id = session_id;
request_info.response_callback = callback;
if (request->has_watch)
request_info.watch_callback = watch_callback;
/// Put close requests without timeouts
auto timeout = request->getOpNum() == Coordination::OpNum::Close ? 0 : operation_timeout.totalMilliseconds();
std::lock_guard lock(push_request_mutex);
if (!requests_queue.tryPush(std::move(request_info), timeout))
throw Exception("Cannot push request to queue within operation timeout", ErrorCodes::TIMEOUT_EXCEEDED);
}
TestKeeperStorage::~TestKeeperStorage()
{
try
{
finalize();
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
}
}
void TestKeeperStorage::clearDeadWatches(int64_t session_id)
{
auto watches_it = sessions_and_watchers.find(session_id);
if (watches_it != sessions_and_watchers.end())
{
for (const auto & watch_path : watches_it->second)
{
auto watch = watches.find(watch_path);
if (watch != watches.end())
{
auto & watches_for_path = watch->second;
for (auto w_it = watches_for_path.begin(); w_it != watches_for_path.end();)
{
if (w_it->session_id == session_id)
w_it = watches_for_path.erase(w_it);
else
++w_it;
}
if (watches_for_path.empty())
watches.erase(watch);
}
}
sessions_and_watchers.erase(watches_it);
}
}
}
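
The cleanup in `clearDeadWatches` above relies on the erase-while-iterating idiom: `erase` returns an iterator to the next element, so the loop advances manually only when nothing was erased. A minimal standalone sketch of that idiom follows. The reduced `Watcher` (session id only, no callback) and the helper name `removeSessionWatches` are assumptions made for illustration, not part of the commit.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for the real Watcher; the callback is omitted.
struct Watcher
{
    int64_t session_id;
};

using Watches = std::map<std::string, std::vector<Watcher>>;

// Hypothetical helper: drops every watcher owned by session_id on one path
// and removes the path entry once no watchers remain.
void removeSessionWatches(Watches & watches, const std::string & path, int64_t session_id)
{
    auto it = watches.find(path);
    if (it == watches.end())
        return;

    auto & for_path = it->second;
    for (auto w = for_path.begin(); w != for_path.end();)
    {
        if (w->session_id == session_id)
            w = for_path.erase(w);   // erase returns the next valid iterator
        else
            ++w;
    }

    // Drop the now-empty path so lookups do not find stale entries.
    if (for_path.empty())
        watches.erase(it);
}
```

Incrementing the iterator after an `erase` on the same element would be undefined behavior, which is why the loop header has no increment expression.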


@ -0,0 +1,104 @@
#pragma once
#include <Common/ThreadPool.h>
#include <Common/ZooKeeper/IKeeper.h>
#include <Common/ConcurrentBoundedQueue.h>
#include <Common/ZooKeeper/ZooKeeperCommon.h>
#include <future>
#include <unordered_map>
#include <unordered_set>
namespace zkutil
{
using namespace DB;
struct TestKeeperStorageRequest;
using TestKeeperStorageRequestPtr = std::shared_ptr<TestKeeperStorageRequest>;
using ResponseCallback = std::function<void(const Coordination::ZooKeeperResponsePtr &)>;
class TestKeeperStorage
{
public:
Poco::Timespan operation_timeout{0, Coordination::DEFAULT_OPERATION_TIMEOUT_MS * 1000};
std::atomic<int64_t> session_id_counter{0};
struct Node
{
String data;
Coordination::ACLs acls;
bool is_ephemeral = false;
bool is_sequental = false;
Coordination::Stat stat{};
int32_t seq_num = 0;
};
struct Watcher
{
int64_t session_id;
ResponseCallback watch_callback;
};
using Container = std::map<std::string, Node>;
using Ephemerals = std::unordered_map<int64_t, std::unordered_set<String>>;
using SessionAndWatcher = std::unordered_map<int64_t, std::unordered_set<String>>;
using WatchCallbacks = std::vector<Watcher>;
using Watches = std::map<String /* path, relative to root_path */, WatchCallbacks>;
Container container;
Ephemerals ephemerals;
SessionAndWatcher sessions_and_watchers;
std::atomic<int64_t> zxid{0};
std::atomic<bool> shutdown{false};
Watches watches;
Watches list_watches; /// Watches for 'list' request (watches on children).
using clock = std::chrono::steady_clock;
struct RequestInfo
{
TestKeeperStorageRequestPtr request;
ResponseCallback response_callback;
ResponseCallback watch_callback;
clock::time_point time;
int64_t session_id;
};
std::mutex push_request_mutex;
using RequestsQueue = ConcurrentBoundedQueue<RequestInfo>;
RequestsQueue requests_queue{1};
void finalize();
ThreadFromGlobalPool processing_thread;
void processingThread();
void clearDeadWatches(int64_t session_id);
public:
using AsyncResponse = std::future<Coordination::ZooKeeperResponsePtr>;
TestKeeperStorage();
~TestKeeperStorage();
struct ResponsePair
{
AsyncResponse response;
std::optional<AsyncResponse> watch_response;
};
void putRequest(const Coordination::ZooKeeperRequestPtr & request, int64_t session_id, ResponseCallback callback);
void putRequest(const Coordination::ZooKeeperRequestPtr & request, int64_t session_id, ResponseCallback callback, ResponseCallback watch_callback);
int64_t getSessionID()
{
return session_id_counter.fetch_add(1);
}
int64_t getZXID()
{
return zxid.fetch_add(1);
}
};
}
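
`getSessionID` and `getZXID` above both call `fetch_add(1)`, which returns the value held before the increment, so the first identifier handed out is 0. A small sketch of that behavior, using a hypothetical `IdGenerator` class that is not part of the commit:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the session_id_counter / zxid members.
class IdGenerator
{
public:
    // fetch_add returns the previous value, so ids start at 0,
    // and concurrent callers never receive the same id.
    int64_t next() { return counter.fetch_add(1); }

private:
    std::atomic<int64_t> counter{0};
};
```

If identifiers should start at 1 instead, the counter could be initialized to 1 or the result incremented before returning; the code in this commit does neither.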


@ -129,8 +129,8 @@ struct ZooKeeperArgs
std::vector<std::string> hosts_strings;
-session_timeout_ms = DEFAULT_SESSION_TIMEOUT;
-operation_timeout_ms = DEFAULT_OPERATION_TIMEOUT;
+session_timeout_ms = Coordination::DEFAULT_SESSION_TIMEOUT_MS;
+operation_timeout_ms = Coordination::DEFAULT_OPERATION_TIMEOUT_MS;
implementation = "zookeeper";
for (const auto & key : keys)
{


@ -11,6 +11,7 @@
#include <Common/ProfileEvents.h>
#include <Common/CurrentMetrics.h>
#include <Common/ZooKeeper/IKeeper.h>
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <unistd.h>
@ -28,9 +29,6 @@ namespace CurrentMetrics
namespace zkutil
{
-const UInt32 DEFAULT_SESSION_TIMEOUT = 30000;
-const UInt32 DEFAULT_OPERATION_TIMEOUT = 10000;
/// Preferred size of multi() command (in number of ops)
constexpr size_t MULTI_BATCH_SIZE = 100;
@ -53,8 +51,8 @@ public:
using Ptr = std::shared_ptr<ZooKeeper>;
ZooKeeper(const std::string & hosts_, const std::string & identity_ = "",
-int32_t session_timeout_ms_ = DEFAULT_SESSION_TIMEOUT,
-int32_t operation_timeout_ms_ = DEFAULT_OPERATION_TIMEOUT,
+int32_t session_timeout_ms_ = Coordination::DEFAULT_SESSION_TIMEOUT_MS,
+int32_t operation_timeout_ms_ = Coordination::DEFAULT_OPERATION_TIMEOUT_MS,
const std::string & chroot_ = "",
const std::string & implementation_ = "zookeeper");


@ -0,0 +1,481 @@
#include <Common/ZooKeeper/ZooKeeperCommon.h>
#include <Common/ZooKeeper/ZooKeeperIO.h>
#include <IO/WriteHelpers.h>
#include <IO/WriteBufferFromString.h>
#include <IO/Operators.h>
#include <IO/ReadHelpers.h>
#include <common/logger_useful.h>
#include <array>
namespace Coordination
{
using namespace DB;
void ZooKeeperResponse::write(WriteBuffer & out) const
{
/// Extra copy into a temporary buffer, needed to compute the length prefix.
WriteBufferFromOwnString buf;
Coordination::write(xid, buf);
Coordination::write(zxid, buf);
Coordination::write(error, buf);
if (error == Error::ZOK)
writeImpl(buf);
Coordination::write(buf.str(), out);
out.next();
}
void ZooKeeperRequest::write(WriteBuffer & out) const
{
/// Extra copy into a temporary buffer, needed to compute the length prefix.
WriteBufferFromOwnString buf;
Coordination::write(xid, buf);
Coordination::write(getOpNum(), buf);
writeImpl(buf);
Coordination::write(buf.str(), out);
out.next();
}
void ZooKeeperWatchResponse::readImpl(ReadBuffer & in)
{
Coordination::read(type, in);
Coordination::read(state, in);
Coordination::read(path, in);
}
void ZooKeeperWatchResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(type, out);
Coordination::write(state, out);
Coordination::write(path, out);
}
void ZooKeeperAuthRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(type, out);
Coordination::write(scheme, out);
Coordination::write(data, out);
}
void ZooKeeperAuthRequest::readImpl(ReadBuffer & in)
{
Coordination::read(type, in);
Coordination::read(scheme, in);
Coordination::read(data, in);
}
void ZooKeeperCreateRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(data, out);
Coordination::write(acls, out);
int32_t flags = 0;
if (is_ephemeral)
flags |= 1;
if (is_sequential)
flags |= 2;
Coordination::write(flags, out);
}
void ZooKeeperCreateRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(data, in);
Coordination::read(acls, in);
int32_t flags = 0;
Coordination::read(flags, in);
if (flags & 1)
is_ephemeral = true;
if (flags & 2)
is_sequential = true;
}
void ZooKeeperCreateResponse::readImpl(ReadBuffer & in)
{
Coordination::read(path_created, in);
}
void ZooKeeperCreateResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(path_created, out);
}
void ZooKeeperRemoveRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(version, out);
}
void ZooKeeperRemoveRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(version, in);
}
void ZooKeeperExistsRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(has_watch, out);
}
void ZooKeeperExistsRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(has_watch, in);
}
void ZooKeeperExistsResponse::readImpl(ReadBuffer & in)
{
Coordination::read(stat, in);
}
void ZooKeeperExistsResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(stat, out);
}
void ZooKeeperGetRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(has_watch, out);
}
void ZooKeeperGetRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(has_watch, in);
}
void ZooKeeperGetResponse::readImpl(ReadBuffer & in)
{
Coordination::read(data, in);
Coordination::read(stat, in);
}
void ZooKeeperGetResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(data, out);
Coordination::write(stat, out);
}
void ZooKeeperSetRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(data, out);
Coordination::write(version, out);
}
void ZooKeeperSetRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(data, in);
Coordination::read(version, in);
}
void ZooKeeperSetResponse::readImpl(ReadBuffer & in)
{
Coordination::read(stat, in);
}
void ZooKeeperSetResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(stat, out);
}
void ZooKeeperListRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(has_watch, out);
}
void ZooKeeperListRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(has_watch, in);
}
void ZooKeeperListResponse::readImpl(ReadBuffer & in)
{
Coordination::read(names, in);
Coordination::read(stat, in);
}
void ZooKeeperListResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(names, out);
Coordination::write(stat, out);
}
void ZooKeeperCheckRequest::writeImpl(WriteBuffer & out) const
{
Coordination::write(path, out);
Coordination::write(version, out);
}
void ZooKeeperCheckRequest::readImpl(ReadBuffer & in)
{
Coordination::read(path, in);
Coordination::read(version, in);
}
void ZooKeeperErrorResponse::readImpl(ReadBuffer & in)
{
Coordination::Error read_error;
Coordination::read(read_error, in);
if (read_error != error)
throw Exception(fmt::format("Error code in ErrorResponse ({}) doesn't match error code in header ({})", read_error, error),
Error::ZMARSHALLINGERROR);
}
void ZooKeeperErrorResponse::writeImpl(WriteBuffer & out) const
{
Coordination::write(error, out);
}
ZooKeeperMultiRequest::ZooKeeperMultiRequest(const Requests & generic_requests, const ACLs & default_acls)
{
/// Convert nested Requests to ZooKeeperRequests.
/// Note that deep copy is required to avoid modifying path in presence of chroot prefix.
requests.reserve(generic_requests.size());
for (const auto & generic_request : generic_requests)
{
if (const auto * concrete_request_create = dynamic_cast<const CreateRequest *>(generic_request.get()))
{
auto create = std::make_shared<ZooKeeperCreateRequest>(*concrete_request_create);
if (create->acls.empty())
create->acls = default_acls;
requests.push_back(create);
}
else if (const auto * concrete_request_remove = dynamic_cast<const RemoveRequest *>(generic_request.get()))
{
requests.push_back(std::make_shared<ZooKeeperRemoveRequest>(*concrete_request_remove));
}
else if (const auto * concrete_request_set = dynamic_cast<const SetRequest *>(generic_request.get()))
{
requests.push_back(std::make_shared<ZooKeeperSetRequest>(*concrete_request_set));
}
else if (const auto * concrete_request_check = dynamic_cast<const CheckRequest *>(generic_request.get()))
{
requests.push_back(std::make_shared<ZooKeeperCheckRequest>(*concrete_request_check));
}
else
throw Exception("Illegal command as part of multi ZooKeeper request", Error::ZBADARGUMENTS);
}
}
void ZooKeeperMultiRequest::writeImpl(WriteBuffer & out) const
{
for (const auto & request : requests)
{
const auto & zk_request = dynamic_cast<const ZooKeeperRequest &>(*request);
bool done = false;
int32_t error = -1;
Coordination::write(zk_request.getOpNum(), out);
Coordination::write(done, out);
Coordination::write(error, out);
zk_request.writeImpl(out);
}
OpNum op_num = OpNum::Error;
bool done = true;
int32_t error = -1;
Coordination::write(op_num, out);
Coordination::write(done, out);
Coordination::write(error, out);
}
void ZooKeeperMultiRequest::readImpl(ReadBuffer & in)
{
while (true)
{
OpNum op_num;
bool done;
int32_t error;
Coordination::read(op_num, in);
Coordination::read(done, in);
Coordination::read(error, in);
if (done)
{
if (op_num != OpNum::Error)
throw Exception("Unexpected op_num received at the end of results for multi transaction", Error::ZMARSHALLINGERROR);
if (error != -1)
throw Exception("Unexpected error value received at the end of results for multi transaction", Error::ZMARSHALLINGERROR);
break;
}
ZooKeeperRequestPtr request = ZooKeeperRequestFactory::instance().get(op_num);
request->readImpl(in);
requests.push_back(request);
if (in.eof())
throw Exception("Not enough results received for multi transaction", Error::ZMARSHALLINGERROR);
}
}
void ZooKeeperMultiResponse::readImpl(ReadBuffer & in)
{
for (auto & response : responses)
{
OpNum op_num;
bool done;
Error op_error;
Coordination::read(op_num, in);
Coordination::read(done, in);
Coordination::read(op_error, in);
if (done)
throw Exception("Not enough results received for multi transaction", Error::ZMARSHALLINGERROR);
/// op_num == -1 is special for multi transaction.
/// For an unknown reason, the error code is duplicated in the header and in the response body.
if (op_num == OpNum::Error)
response = std::make_shared<ZooKeeperErrorResponse>();
if (op_error != Error::ZOK)
{
response->error = op_error;
/// Set error for whole transaction.
/// If some operations fail, ZK sends the global error as zero and then sends details about each operation.
/// It sets the real error code for the first failed operation and the special "runtime inconsistency" code for the other operations.
if (error == Error::ZOK && op_error != Error::ZRUNTIMEINCONSISTENCY)
error = op_error;
}
if (op_error == Error::ZOK || op_num == OpNum::Error)
dynamic_cast<ZooKeeperResponse &>(*response).readImpl(in);
}
/// Footer.
{
OpNum op_num;
bool done;
int32_t error_read;
Coordination::read(op_num, in);
Coordination::read(done, in);
Coordination::read(error_read, in);
if (!done)
throw Exception("Too many results received for multi transaction", Error::ZMARSHALLINGERROR);
if (op_num != OpNum::Error)
throw Exception("Unexpected op_num received at the end of results for multi transaction", Error::ZMARSHALLINGERROR);
if (error_read != -1)
throw Exception("Unexpected error value received at the end of results for multi transaction", Error::ZMARSHALLINGERROR);
}
}
void ZooKeeperMultiResponse::writeImpl(WriteBuffer & out) const
{
for (const auto & response : responses)
{
const ZooKeeperResponse & zk_response = dynamic_cast<const ZooKeeperResponse &>(*response);
OpNum op_num = zk_response.getOpNum();
bool done = false;
Error op_error = zk_response.error;
Coordination::write(op_num, out);
Coordination::write(done, out);
Coordination::write(op_error, out);
if (op_error == Error::ZOK || op_num == OpNum::Error)
zk_response.writeImpl(out);
}
/// Footer.
{
OpNum op_num = OpNum::Error;
bool done = true;
int32_t error_read = -1;
Coordination::write(op_num, out);
Coordination::write(done, out);
Coordination::write(error_read, out);
}
}
ZooKeeperResponsePtr ZooKeeperHeartbeatRequest::makeResponse() const { return std::make_shared<ZooKeeperHeartbeatResponse>(); }
ZooKeeperResponsePtr ZooKeeperAuthRequest::makeResponse() const { return std::make_shared<ZooKeeperAuthResponse>(); }
ZooKeeperResponsePtr ZooKeeperCreateRequest::makeResponse() const { return std::make_shared<ZooKeeperCreateResponse>(); }
ZooKeeperResponsePtr ZooKeeperRemoveRequest::makeResponse() const { return std::make_shared<ZooKeeperRemoveResponse>(); }
ZooKeeperResponsePtr ZooKeeperExistsRequest::makeResponse() const { return std::make_shared<ZooKeeperExistsResponse>(); }
ZooKeeperResponsePtr ZooKeeperGetRequest::makeResponse() const { return std::make_shared<ZooKeeperGetResponse>(); }
ZooKeeperResponsePtr ZooKeeperSetRequest::makeResponse() const { return std::make_shared<ZooKeeperSetResponse>(); }
ZooKeeperResponsePtr ZooKeeperListRequest::makeResponse() const { return std::make_shared<ZooKeeperListResponse>(); }
ZooKeeperResponsePtr ZooKeeperCheckRequest::makeResponse() const { return std::make_shared<ZooKeeperCheckResponse>(); }
ZooKeeperResponsePtr ZooKeeperMultiRequest::makeResponse() const { return std::make_shared<ZooKeeperMultiResponse>(requests); }
ZooKeeperResponsePtr ZooKeeperCloseRequest::makeResponse() const { return std::make_shared<ZooKeeperCloseResponse>(); }
void ZooKeeperRequestFactory::registerRequest(OpNum op_num, Creator creator)
{
if (!op_num_to_request.try_emplace(op_num, creator).second)
throw Coordination::Exception("Request type " + toString(op_num) + " already registered", Coordination::Error::ZRUNTIMEINCONSISTENCY);
}
std::shared_ptr<ZooKeeperRequest> ZooKeeperRequest::read(ReadBuffer & in)
{
XID xid;
OpNum op_num;
Coordination::read(xid, in);
Coordination::read(op_num, in);
auto request = ZooKeeperRequestFactory::instance().get(op_num);
request->xid = xid;
request->readImpl(in);
return request;
}
ZooKeeperRequestPtr ZooKeeperRequestFactory::get(OpNum op_num) const
{
auto it = op_num_to_request.find(op_num);
if (it == op_num_to_request.end())
throw Exception("Unknown operation type " + toString(op_num), Error::ZBADARGUMENTS);
return it->second();
}
ZooKeeperRequestFactory & ZooKeeperRequestFactory::instance()
{
static ZooKeeperRequestFactory factory;
return factory;
}
template<OpNum num, typename RequestT>
void registerZooKeeperRequest(ZooKeeperRequestFactory & factory)
{
factory.registerRequest(num, [] { return std::make_shared<RequestT>(); });
}
ZooKeeperRequestFactory::ZooKeeperRequestFactory()
{
registerZooKeeperRequest<OpNum::Heartbeat, ZooKeeperHeartbeatRequest>(*this);
registerZooKeeperRequest<OpNum::Auth, ZooKeeperAuthRequest>(*this);
registerZooKeeperRequest<OpNum::Close, ZooKeeperCloseRequest>(*this);
registerZooKeeperRequest<OpNum::Create, ZooKeeperCreateRequest>(*this);
registerZooKeeperRequest<OpNum::Remove, ZooKeeperRemoveRequest>(*this);
registerZooKeeperRequest<OpNum::Exists, ZooKeeperExistsRequest>(*this);
registerZooKeeperRequest<OpNum::Get, ZooKeeperGetRequest>(*this);
registerZooKeeperRequest<OpNum::Set, ZooKeeperSetRequest>(*this);
registerZooKeeperRequest<OpNum::SimpleList, ZooKeeperSimpleListRequest>(*this);
registerZooKeeperRequest<OpNum::List, ZooKeeperListRequest>(*this);
registerZooKeeperRequest<OpNum::Check, ZooKeeperCheckRequest>(*this);
registerZooKeeperRequest<OpNum::Multi, ZooKeeperMultiRequest>(*this);
}
}
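
The multi request and response bodies above share one framing: every operation is preceded by an `(op_num, done, error)` triple with `done = false`, and the sequence ends with a footer triple whose `op_num` is `OpNum::Error` (-1), `done` is `true`, and `error` is -1. The sketch below models only that framing. The `OpHeader` struct and both helper functions are hypothetical, and the per-operation payloads that the real code serializes after each header are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Simplified header that precedes every operation in a multi body.
// Real requests also carry a serialized payload after each header; omitted here.
struct OpHeader
{
    int32_t op_num;
    bool done;
    int32_t error;
};

constexpr int32_t OP_ERROR = -1;  // same numeric value as OpNum::Error

// Hypothetical helper: one header per operation, then the terminating footer.
std::vector<OpHeader> frameMulti(const std::vector<int32_t> & op_nums)
{
    std::vector<OpHeader> framed;
    framed.reserve(op_nums.size() + 1);
    for (int32_t op_num : op_nums)
        framed.push_back({op_num, false, -1});
    framed.push_back({OP_ERROR, true, -1});  // footer: done = true
    return framed;
}

// Hypothetical helper: walks headers until the footer and returns the operation count.
std::size_t countOperations(const std::vector<OpHeader> & framed)
{
    std::size_t count = 0;
    for (const auto & header : framed)
    {
        if (header.done)
        {
            if (header.op_num != OP_ERROR || header.error != -1)
                throw std::runtime_error("Malformed footer in multi body");
            return count;
        }
        ++count;
    }
    throw std::runtime_error("Multi body ended without a footer");
}
```

Because the footer is the only element with `done = true`, a reader does not need to know the operation count in advance, which matches how `ZooKeeperMultiRequest::readImpl` loops until it sees the footer.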


@ -0,0 +1,338 @@
#pragma once
#include <Common/ZooKeeper/IKeeper.h>
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <boost/noncopyable.hpp>
#include <IO/ReadBuffer.h>
#include <IO/WriteBuffer.h>
#include <map>
#include <unordered_map>
#include <mutex>
#include <chrono>
#include <vector>
#include <memory>
#include <thread>
#include <atomic>
#include <cstdint>
#include <optional>
#include <functional>
namespace Coordination
{
struct ZooKeeperResponse : virtual Response
{
XID xid = 0;
int64_t zxid;
virtual ~ZooKeeperResponse() override = default;
virtual void readImpl(ReadBuffer &) = 0;
virtual void writeImpl(WriteBuffer &) const = 0;
void write(WriteBuffer & out) const;
virtual OpNum getOpNum() const = 0;
};
using ZooKeeperResponsePtr = std::shared_ptr<ZooKeeperResponse>;
/// Exposed in header file for Yandex.Metrica code.
struct ZooKeeperRequest : virtual Request
{
XID xid = 0;
bool has_watch = false;
/// If the request was not sent and an error occurs, we can be sure that it has not been processed by the server.
/// If the request was sent but no response arrived before the error, we cannot know whether it was processed.
bool probably_sent = false;
ZooKeeperRequest() = default;
ZooKeeperRequest(const ZooKeeperRequest &) = default;
virtual ~ZooKeeperRequest() override = default;
virtual OpNum getOpNum() const = 0;
/// Writes length, xid, op_num, then the rest.
void write(WriteBuffer & out) const;
virtual void writeImpl(WriteBuffer &) const = 0;
virtual void readImpl(ReadBuffer &) = 0;
static std::shared_ptr<ZooKeeperRequest> read(ReadBuffer & in);
virtual ZooKeeperResponsePtr makeResponse() const = 0;
};
using ZooKeeperRequestPtr = std::shared_ptr<ZooKeeperRequest>;
struct ZooKeeperHeartbeatRequest final : ZooKeeperRequest
{
String getPath() const override { return {}; }
OpNum getOpNum() const override { return OpNum::Heartbeat; }
void writeImpl(WriteBuffer &) const override {}
void readImpl(ReadBuffer &) override {}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperHeartbeatResponse final : ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
void writeImpl(WriteBuffer &) const override {}
OpNum getOpNum() const override { return OpNum::Heartbeat; }
};
struct ZooKeeperWatchResponse final : WatchResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override
{
throw Exception("OpNum for watch response doesn't exist", Error::ZRUNTIMEINCONSISTENCY);
}
};
struct ZooKeeperAuthRequest final : ZooKeeperRequest
{
int32_t type = 0; /// ignored by the server
String scheme;
String data;
String getPath() const override { return {}; }
OpNum getOpNum() const override { return OpNum::Auth; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperAuthResponse final : ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
void writeImpl(WriteBuffer &) const override {}
OpNum getOpNum() const override { return OpNum::Auth; }
};
struct ZooKeeperCloseRequest final : ZooKeeperRequest
{
String getPath() const override { return {}; }
OpNum getOpNum() const override { return OpNum::Close; }
void writeImpl(WriteBuffer &) const override {}
void readImpl(ReadBuffer &) override {}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperCloseResponse final : ZooKeeperResponse
{
void readImpl(ReadBuffer &) override
{
throw Exception("Received response for close request", Error::ZRUNTIMEINCONSISTENCY);
}
void writeImpl(WriteBuffer &) const override {}
OpNum getOpNum() const override { return OpNum::Close; }
};
struct ZooKeeperCreateRequest final : public CreateRequest, ZooKeeperRequest
{
ZooKeeperCreateRequest() = default;
explicit ZooKeeperCreateRequest(const CreateRequest & base) : CreateRequest(base) {}
OpNum getOpNum() const override { return OpNum::Create; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperCreateResponse final : CreateResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override { return OpNum::Create; }
};
struct ZooKeeperRemoveRequest final : RemoveRequest, ZooKeeperRequest
{
ZooKeeperRemoveRequest() = default;
explicit ZooKeeperRemoveRequest(const RemoveRequest & base) : RemoveRequest(base) {}
OpNum getOpNum() const override { return OpNum::Remove; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperRemoveResponse final : RemoveResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
void writeImpl(WriteBuffer &) const override {}
OpNum getOpNum() const override { return OpNum::Remove; }
};
struct ZooKeeperExistsRequest final : ExistsRequest, ZooKeeperRequest
{
OpNum getOpNum() const override { return OpNum::Exists; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperExistsResponse final : ExistsResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override { return OpNum::Exists; }
};
struct ZooKeeperGetRequest final : GetRequest, ZooKeeperRequest
{
OpNum getOpNum() const override { return OpNum::Get; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperGetResponse final : GetResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override { return OpNum::Get; }
};
struct ZooKeeperSetRequest final : SetRequest, ZooKeeperRequest
{
ZooKeeperSetRequest() = default;
explicit ZooKeeperSetRequest(const SetRequest & base) : SetRequest(base) {}
OpNum getOpNum() const override { return OpNum::Set; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperSetResponse final : SetResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override { return OpNum::Set; }
};
struct ZooKeeperListRequest : ListRequest, ZooKeeperRequest
{
OpNum getOpNum() const override { return OpNum::List; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperSimpleListRequest final : ZooKeeperListRequest
{
OpNum getOpNum() const override { return OpNum::SimpleList; }
};
struct ZooKeeperListResponse : ListResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override { return OpNum::List; }
};
struct ZooKeeperSimpleListResponse final : ZooKeeperListResponse
{
OpNum getOpNum() const override { return OpNum::SimpleList; }
};
struct ZooKeeperCheckRequest final : CheckRequest, ZooKeeperRequest
{
ZooKeeperCheckRequest() = default;
explicit ZooKeeperCheckRequest(const CheckRequest & base) : CheckRequest(base) {}
OpNum getOpNum() const override { return OpNum::Check; }
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperCheckResponse final : CheckResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
void writeImpl(WriteBuffer &) const override {}
OpNum getOpNum() const override { return OpNum::Check; }
};
/// This response may be received only as an element of responses in MultiResponse.
struct ZooKeeperErrorResponse final : ErrorResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
OpNum getOpNum() const override { return OpNum::Error; }
};
struct ZooKeeperMultiRequest final : MultiRequest, ZooKeeperRequest
{
OpNum getOpNum() const override { return OpNum::Multi; }
ZooKeeperMultiRequest() = default;
ZooKeeperMultiRequest(const Requests & generic_requests, const ACLs & default_acls);
void writeImpl(WriteBuffer & out) const override;
void readImpl(ReadBuffer & in) override;
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperMultiResponse final : MultiResponse, ZooKeeperResponse
{
OpNum getOpNum() const override { return OpNum::Multi; }
explicit ZooKeeperMultiResponse(const Requests & requests)
{
responses.reserve(requests.size());
for (const auto & request : requests)
responses.emplace_back(dynamic_cast<const ZooKeeperRequest &>(*request).makeResponse());
}
explicit ZooKeeperMultiResponse(const Responses & responses_)
{
responses = responses_;
}
void readImpl(ReadBuffer & in) override;
void writeImpl(WriteBuffer & out) const override;
};
class ZooKeeperRequestFactory final : private boost::noncopyable
{
public:
using Creator = std::function<ZooKeeperRequestPtr()>;
using OpNumToRequest = std::unordered_map<OpNum, Creator>;
static ZooKeeperRequestFactory & instance();
ZooKeeperRequestPtr get(OpNum op_num) const;
void registerRequest(OpNum op_num, Creator creator);
private:
OpNumToRequest op_num_to_request;
ZooKeeperRequestFactory();
};
}
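
`ZooKeeperRequestFactory` above maps an operation code to a creator function and uses `try_emplace` to reject duplicate registrations. The following is a reduced sketch of the same registry pattern. The `Request` and `GetRequest` types, the `int32_t` key, and the standard exception types are simplifications assumed for illustration; the real factory uses `OpNum` keys and `Coordination::Exception`.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal request hierarchy used only for this illustration.
struct Request
{
    virtual ~Request() = default;
    virtual std::string name() const = 0;
};

struct GetRequest final : Request
{
    std::string name() const override { return "Get"; }
};

class RequestFactory
{
public:
    using Creator = std::function<std::shared_ptr<Request>()>;

    // try_emplace leaves the map untouched when the key already exists,
    // which makes a duplicate registration easy to detect.
    void registerRequest(int32_t op_num, Creator creator)
    {
        if (!creators.try_emplace(op_num, std::move(creator)).second)
            throw std::logic_error("Request type already registered");
    }

    // Builds a fresh request object for a known operation code.
    std::shared_ptr<Request> get(int32_t op_num) const
    {
        auto it = creators.find(op_num);
        if (it == creators.end())
            throw std::invalid_argument("Unknown operation type");
        return it->second();
    }

private:
    std::unordered_map<int32_t, Creator> creators;
};
```

A caller would register each type once, for example `factory.registerRequest(4, [] { return std::make_shared<GetRequest>(); });`, and then call `factory.get(4)` for every incoming request with that code.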


@ -0,0 +1,67 @@
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <Common/ZooKeeper/IKeeper.h>
#include <unordered_set>
namespace Coordination
{
static const std::unordered_set<int32_t> VALID_OPERATIONS =
{
static_cast<int32_t>(OpNum::Close),
static_cast<int32_t>(OpNum::Error),
static_cast<int32_t>(OpNum::Create),
static_cast<int32_t>(OpNum::Remove),
static_cast<int32_t>(OpNum::Exists),
static_cast<int32_t>(OpNum::Get),
static_cast<int32_t>(OpNum::Set),
static_cast<int32_t>(OpNum::SimpleList),
static_cast<int32_t>(OpNum::Heartbeat),
static_cast<int32_t>(OpNum::List),
static_cast<int32_t>(OpNum::Check),
static_cast<int32_t>(OpNum::Multi),
static_cast<int32_t>(OpNum::Auth),
};
std::string toString(OpNum op_num)
{
switch (op_num)
{
case OpNum::Close:
return "Close";
case OpNum::Error:
return "Error";
case OpNum::Create:
return "Create";
case OpNum::Remove:
return "Remove";
case OpNum::Exists:
return "Exists";
case OpNum::Get:
return "Get";
case OpNum::Set:
return "Set";
case OpNum::SimpleList:
return "SimpleList";
case OpNum::List:
return "List";
case OpNum::Check:
return "Check";
case OpNum::Multi:
return "Multi";
case OpNum::Heartbeat:
return "Heartbeat";
case OpNum::Auth:
return "Auth";
}
int32_t raw_op = static_cast<int32_t>(op_num);
throw Exception("Operation " + std::to_string(raw_op) + " is unknown", Error::ZUNIMPLEMENTED);
}
OpNum getOpNum(int32_t raw_op_num)
{
if (!VALID_OPERATIONS.count(raw_op_num))
throw Exception("Operation " + std::to_string(raw_op_num) + " is unknown", Error::ZUNIMPLEMENTED);
return static_cast<OpNum>(raw_op_num);
}
}
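
`getOpNum` above checks a raw integer against a set of known values before casting it to the enum, because a `static_cast` alone would accept any number read from the wire. The sketch below shows the same check in a non-throwing form. The reduced `Op` enum and the `parseOp` function are hypothetical and assume C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_set>

// Reduced, hypothetical subset of the operation codes.
enum class Op : int32_t
{
    Create = 1,
    Get = 4,
    Multi = 14,
};

// Hypothetical non-throwing variant: returns std::nullopt for unknown codes
// instead of raising an exception.
std::optional<Op> parseOp(int32_t raw)
{
    static const std::unordered_set<int32_t> valid = {
        static_cast<int32_t>(Op::Create),
        static_cast<int32_t>(Op::Get),
        static_cast<int32_t>(Op::Multi),
    };
    if (valid.count(raw) == 0)
        return std::nullopt;
    return static_cast<Op>(raw);
}
```

Whether to throw or to return an empty optional is a design choice; the commit throws `ZUNIMPLEMENTED`, which fits a protocol parser that cannot continue after an unknown code.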


@ -0,0 +1,49 @@
#pragma once
#include <string>
#include <cstdint>
namespace Coordination
{
using XID = int32_t;
static constexpr XID WATCH_XID = -1;
static constexpr XID PING_XID = -2;
static constexpr XID AUTH_XID = -4;
static constexpr XID CLOSE_XID = 0x7FFFFFFF;
enum class OpNum : int32_t
{
Close = -11,
Error = -1,
Create = 1,
Remove = 2,
Exists = 3,
Get = 4,
Set = 5,
SimpleList = 8,
Heartbeat = 11,
List = 12,
Check = 13,
Multi = 14,
Auth = 100,
};
std::string toString(OpNum op_num);
OpNum getOpNum(int32_t raw_op_num);
static constexpr int32_t ZOOKEEPER_PROTOCOL_VERSION = 0;
static constexpr int32_t CLIENT_HANDSHAKE_LENGTH = 44;
static constexpr int32_t CLIENT_HANDSHAKE_LENGTH_WITH_READONLY = 45;
static constexpr int32_t SERVER_HANDSHAKE_LENGTH = 36;
static constexpr int32_t PASSWORD_LENGTH = 16;
/// ZooKeeper has a 1 MB node size and serialization limit by default,
/// but it can be raised, so we allow a much larger limit on our side.
static constexpr int32_t MAX_STRING_OR_ARRAY_SIZE = 1 << 28; /// 256 MiB
static constexpr int32_t DEFAULT_SESSION_TIMEOUT_MS = 30000;
static constexpr int32_t DEFAULT_OPERATION_TIMEOUT_MS = 10000;
}
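
`MAX_STRING_OR_ARRAY_SIZE` above bounds length-prefixed fields on the wire: integers are written big-endian, and a string is a 32-bit length followed by its bytes, with a length of -1 standing for a NULL value. The sketch below illustrates that encoding over a plain byte vector. The `Bytes` alias and all four helper functions are hypothetical, use explicit shifts instead of `__builtin_bswap32`, and do not use ClickHouse's `ReadBuffer`/`WriteBuffer`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Appends a 32-bit integer in big-endian (network) byte order.
void writeInt32(int32_t x, Bytes & out)
{
    auto u = static_cast<uint32_t>(x);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<uint8_t>(u >> shift));
}

// Reads a big-endian 32-bit integer starting at pos and advances pos.
int32_t readInt32(const Bytes & in, std::size_t & pos)
{
    if (pos > in.size() || in.size() - pos < 4)
        throw std::runtime_error("Truncated integer");
    uint32_t u = 0;
    for (int i = 0; i < 4; ++i)
        u = (u << 8) | in[pos++];
    return static_cast<int32_t>(u);
}

// Writes the length prefix followed by the raw bytes of the string.
void writeString(const std::string & s, Bytes & out)
{
    writeInt32(static_cast<int32_t>(s.size()), out);
    out.insert(out.end(), s.begin(), s.end());
}

// Reads a length-prefixed string; a length of -1 is treated as an empty string.
std::string readString(const Bytes & in, std::size_t & pos, int32_t max_size)
{
    int32_t size = readInt32(in, pos);
    if (size == -1)
        return {};
    if (size < 0 || size > max_size)
        throw std::runtime_error("Bad string size");
    if (in.size() - pos < static_cast<std::size_t>(size))
        throw std::runtime_error("Truncated string");
    std::string s(in.begin() + pos, in.begin() + pos + size);
    pos += static_cast<std::size_t>(size);
    return s;
}
```

Checking the declared size against a limit before allocating is what keeps a corrupted or hostile length prefix from triggering a huge `resize`.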


@ -0,0 +1,140 @@
#include <Common/ZooKeeper/ZooKeeperIO.h>
namespace Coordination
{
void write(int64_t x, WriteBuffer & out)
{
x = __builtin_bswap64(x);
writeBinary(x, out);
}
void write(int32_t x, WriteBuffer & out)
{
x = __builtin_bswap32(x);
writeBinary(x, out);
}
void write(OpNum x, WriteBuffer & out)
{
write(static_cast<int32_t>(x), out);
}
void write(bool x, WriteBuffer & out)
{
writeBinary(x, out);
}
void write(const std::string & s, WriteBuffer & out)
{
write(int32_t(s.size()), out);
out.write(s.data(), s.size());
}
void write(const ACL & acl, WriteBuffer & out)
{
write(acl.permissions, out);
write(acl.scheme, out);
write(acl.id, out);
}
void write(const Stat & stat, WriteBuffer & out)
{
write(stat.czxid, out);
write(stat.mzxid, out);
write(stat.ctime, out);
write(stat.mtime, out);
write(stat.version, out);
write(stat.cversion, out);
write(stat.aversion, out);
write(stat.ephemeralOwner, out);
write(stat.dataLength, out);
write(stat.numChildren, out);
write(stat.pzxid, out);
}
void write(const Error & x, WriteBuffer & out)
{
write(static_cast<int32_t>(x), out);
}
void read(int64_t & x, ReadBuffer & in)
{
readBinary(x, in);
x = __builtin_bswap64(x);
}
void read(int32_t & x, ReadBuffer & in)
{
readBinary(x, in);
x = __builtin_bswap32(x);
}
void read(OpNum & x, ReadBuffer & in)
{
int32_t raw_op_num;
read(raw_op_num, in);
x = getOpNum(raw_op_num);
}
void read(bool & x, ReadBuffer & in)
{
readBinary(x, in);
}
void read(int8_t & x, ReadBuffer & in)
{
readBinary(x, in);
}
void read(std::string & s, ReadBuffer & in)
{
int32_t size = 0;
read(size, in);
if (size == -1)
{
/// A size of -1 means that the ZooKeeper node has a NULL value. We treat it as an empty string.
s.clear();
return;
}
if (size < 0)
throw Exception("Negative size while reading string from ZooKeeper", Error::ZMARSHALLINGERROR);
if (size > MAX_STRING_OR_ARRAY_SIZE)
throw Exception("Too large string size while reading from ZooKeeper", Error::ZMARSHALLINGERROR);
s.resize(size);
in.read(s.data(), size);
}
void read(ACL & acl, ReadBuffer & in)
{
read(acl.permissions, in);
read(acl.scheme, in);
read(acl.id, in);
}
void read(Stat & stat, ReadBuffer & in)
{
read(stat.czxid, in);
read(stat.mzxid, in);
read(stat.ctime, in);
read(stat.mtime, in);
read(stat.version, in);
read(stat.cversion, in);
read(stat.aversion, in);
read(stat.ephemeralOwner, in);
read(stat.dataLength, in);
read(stat.numChildren, in);
read(stat.pzxid, in);
}
void read(Error & x, ReadBuffer & in)
{
int32_t code;
read(code, in);
x = Coordination::Error(code);
}
}
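
The wire format implemented above (big-endian integers, length-prefixed strings, a length of -1 for NULL) can be sketched without `WriteBuffer` and `ReadBuffer`. This is an illustration only: it uses a `std::string` as the buffer, shifts bytes explicitly instead of calling `__builtin_bswap32` so that it does not depend on host endianness, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

namespace io_sketch
{
// Append a 32-bit integer in big-endian (network) byte order.
inline void writeInt32(int32_t x, std::string & out)
{
    const uint32_t u = static_cast<uint32_t>(x);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((u >> shift) & 0xFF));
}

// Read a big-endian 32-bit integer at position pos and advance pos.
inline int32_t readInt32(const std::string & in, size_t & pos)
{
    if (pos > in.size() || in.size() - pos < 4)
        throw std::runtime_error("Unexpected end of buffer");
    uint32_t u = 0;
    for (int i = 0; i < 4; ++i)
        u = (u << 8) | static_cast<unsigned char>(in[pos++]);
    return static_cast<int32_t>(u);
}

// A string is a 32-bit length followed by the raw bytes.
inline void writeString(const std::string & s, std::string & out)
{
    writeInt32(static_cast<int32_t>(s.size()), out);
    out.append(s);
}

// A length of -1 denotes a NULL value and is read as an empty string;
// any other negative length is rejected.
inline std::string readString(const std::string & in, size_t & pos)
{
    const int32_t size = readInt32(in, pos);
    if (size == -1)
        return {};
    if (size < 0)
        throw std::runtime_error("Negative size while reading string");
    if (static_cast<size_t>(size) > in.size() - pos)
        throw std::runtime_error("String is longer than the remaining buffer");
    std::string s = in.substr(pos, static_cast<size_t>(size));
    pos += static_cast<size_t>(size);
    return s;
}
}
```

Writing the bytes by shifting avoids the little-endian assumption that the byte-swap builtins carry, at the cost of a small loop per integer.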

View File

@ -0,0 +1,74 @@
#pragma once
#include <IO/WriteHelpers.h>
#include <IO/ReadHelpers.h>
#include <IO/Operators.h>
#include <Common/ZooKeeper/IKeeper.h>
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <cstdint>
#include <vector>
#include <array>
namespace Coordination
{
using namespace DB;
void write(int64_t x, WriteBuffer & out);
void write(int32_t x, WriteBuffer & out);
void write(OpNum x, WriteBuffer & out);
void write(bool x, WriteBuffer & out);
void write(const std::string & s, WriteBuffer & out);
void write(const ACL & acl, WriteBuffer & out);
void write(const Stat & stat, WriteBuffer & out);
void write(const Error & x, WriteBuffer & out);
template <size_t N>
void write(const std::array<char, N> s, WriteBuffer & out)
{
write(int32_t(N), out);
out.write(s.data(), N);
}
template <typename T>
void write(const std::vector<T> & arr, WriteBuffer & out)
{
write(int32_t(arr.size()), out);
for (const auto & elem : arr)
write(elem, out);
}
void read(int64_t & x, ReadBuffer & in);
void read(int32_t & x, ReadBuffer & in);
void read(OpNum & x, ReadBuffer & in);
void read(bool & x, ReadBuffer & in);
void read(int8_t & x, ReadBuffer & in);
void read(std::string & s, ReadBuffer & in);
void read(ACL & acl, ReadBuffer & in);
void read(Stat & stat, ReadBuffer & in);
void read(Error & x, ReadBuffer & in);
template <size_t N>
void read(std::array<char, N> & s, ReadBuffer & in)
{
int32_t size = 0;
read(size, in);
if (size != N)
throw Exception("Unexpected array size while reading from ZooKeeper", Error::ZMARSHALLINGERROR);
in.read(s.data(), N);
}
template <typename T>
void read(std::vector<T> & arr, ReadBuffer & in)
{
int32_t size = 0;
read(size, in);
if (size < 0)
throw Exception("Negative size while reading array from ZooKeeper", Error::ZMARSHALLINGERROR);
if (size > MAX_STRING_OR_ARRAY_SIZE)
throw Exception("Too large array size while reading from ZooKeeper", Error::ZMARSHALLINGERROR);
arr.resize(size);
for (auto & elem : arr)
read(elem, in);
}
}
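
The `std::vector` templates above serialize a 32-bit element count followed by each element, and reject negative or oversized counts when reading. A minimal self-contained sketch of that pattern follows, assuming a `std::string` buffer in place of `WriteBuffer`/`ReadBuffer`; `MAX_ELEMENTS` and the function names are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

namespace array_sketch
{
constexpr int32_t MAX_ELEMENTS = 1 << 28;

// Big-endian 32-bit integer, as in the scalar overloads.
inline void write(int32_t x, std::string & out)
{
    const uint32_t u = static_cast<uint32_t>(x);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((u >> shift) & 0xFF));
}

inline void read(int32_t & x, const std::string & in, size_t & pos)
{
    if (pos > in.size() || in.size() - pos < 4)
        throw std::runtime_error("Unexpected end of buffer");
    uint32_t u = 0;
    for (int i = 0; i < 4; ++i)
        u = (u << 8) | static_cast<unsigned char>(in[pos++]);
    x = static_cast<int32_t>(u);
}

// Element count first, then every element through its own overload.
template <typename T>
void write(const std::vector<T> & arr, std::string & out)
{
    write(static_cast<int32_t>(arr.size()), out);
    for (const auto & elem : arr)
        write(elem, out);
}

// Validate the count before resizing, so corrupt input cannot
// request a negative or unbounded allocation.
template <typename T>
void read(std::vector<T> & arr, const std::string & in, size_t & pos)
{
    int32_t size = 0;
    read(size, in, pos);
    if (size < 0)
        throw std::runtime_error("Negative size while reading array");
    if (size > MAX_ELEMENTS)
        throw std::runtime_error("Too large array size");
    arr.resize(static_cast<size_t>(size));
    for (auto & elem : arr)
        read(elem, in, pos);
}
}
```

Because the element calls resolve through overloading, the same two templates would also cover nested vectors or any element type that has its own `write`/`read` pair declared before them.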

View File

@ -2,11 +2,12 @@
#include <Common/Exception.h>
#include <Common/ProfileEvents.h>
#include <Common/setThreadName.h>
#include <Common/ZooKeeper/ZooKeeperIO.h>
#include <IO/WriteHelpers.h>
#include <IO/ReadHelpers.h>
#include <IO/Operators.h>
#include <IO/WriteBufferFromString.h>
#include <common/logger_useful.h>
#if !defined(ARCADIA_BUILD)
# include <Common/config.h>
@ -19,11 +20,6 @@
#include <array>
/// ZooKeeper has a 1 MB node size and serialization limit by default,
/// but it can be raised, so we use a larger limit on our side.
#define MAX_STRING_OR_ARRAY_SIZE (1 << 28) /// 256 MiB
namespace ProfileEvents
{
extern const Event ZooKeeperInit;
@ -266,137 +262,6 @@ namespace Coordination
using namespace DB;
/// Assuming a little-endian host.
static void write(int64_t x, WriteBuffer & out)
{
x = __builtin_bswap64(x);
writeBinary(x, out);
}
static void write(int32_t x, WriteBuffer & out)
{
x = __builtin_bswap32(x);
writeBinary(x, out);
}
static void write(bool x, WriteBuffer & out)
{
writeBinary(x, out);
}
static void write(const String & s, WriteBuffer & out)
{
write(int32_t(s.size()), out);
out.write(s.data(), s.size());
}
template <size_t N> void write(std::array<char, N> s, WriteBuffer & out)
{
write(int32_t(N), out);
out.write(s.data(), N);
}
template <typename T> void write(const std::vector<T> & arr, WriteBuffer & out)
{
write(int32_t(arr.size()), out);
for (const auto & elem : arr)
write(elem, out);
}
static void write(const ACL & acl, WriteBuffer & out)
{
write(acl.permissions, out);
write(acl.scheme, out);
write(acl.id, out);
}
static void read(int64_t & x, ReadBuffer & in)
{
readBinary(x, in);
x = __builtin_bswap64(x);
}
static void read(int32_t & x, ReadBuffer & in)
{
readBinary(x, in);
x = __builtin_bswap32(x);
}
static void read(Error & x, ReadBuffer & in)
{
int32_t code;
read(code, in);
x = Error(code);
}
static void read(bool & x, ReadBuffer & in)
{
readBinary(x, in);
}
static void read(String & s, ReadBuffer & in)
{
int32_t size = 0;
read(size, in);
if (size == -1)
{
/// A size of -1 means that the ZooKeeper node has a NULL value. We treat it as an empty string.
s.clear();
return;
}
if (size < 0)
throw Exception("Negative size while reading string from ZooKeeper", Error::ZMARSHALLINGERROR);
if (size > MAX_STRING_OR_ARRAY_SIZE)
throw Exception("Too large string size while reading from ZooKeeper", Error::ZMARSHALLINGERROR);
s.resize(size);
in.read(s.data(), size);
}
template <size_t N> void read(std::array<char, N> & s, ReadBuffer & in)
{
int32_t size = 0;
read(size, in);
if (size != N)
throw Exception("Unexpected array size while reading from ZooKeeper", Error::ZMARSHALLINGERROR);
in.read(s.data(), N);
}
static void read(Stat & stat, ReadBuffer & in)
{
read(stat.czxid, in);
read(stat.mzxid, in);
read(stat.ctime, in);
read(stat.mtime, in);
read(stat.version, in);
read(stat.cversion, in);
read(stat.aversion, in);
read(stat.ephemeralOwner, in);
read(stat.dataLength, in);
read(stat.numChildren, in);
read(stat.pzxid, in);
}
template <typename T> void read(std::vector<T> & arr, ReadBuffer & in)
{
int32_t size = 0;
read(size, in);
if (size < 0)
throw Exception("Negative size while reading array from ZooKeeper", Error::ZMARSHALLINGERROR);
if (size > MAX_STRING_OR_ARRAY_SIZE)
throw Exception("Too large array size while reading from ZooKeeper", Error::ZMARSHALLINGERROR);
arr.resize(size);
for (auto & elem : arr)
read(elem, in);
}
template <typename T>
void ZooKeeper::write(const T & x)
{
@ -409,19 +274,6 @@ void ZooKeeper::read(T & x)
Coordination::read(x, *in);
}
void ZooKeeperRequest::write(WriteBuffer & out) const
{
/// Excessive copy to calculate length.
WriteBufferFromOwnString buf;
Coordination::write(xid, buf);
Coordination::write(getOpNum(), buf);
writeImpl(buf);
Coordination::write(buf.str(), out);
out.next();
}
static void removeRootPath(String & path, const String & root_path)
{
if (root_path.empty())
@ -433,394 +285,6 @@ static void removeRootPath(String & path, const String & root_path)
path = path.substr(root_path.size());
}
struct ZooKeeperResponse : virtual Response
{
virtual ~ZooKeeperResponse() override = default;
virtual void readImpl(ReadBuffer &) = 0;
};
struct ZooKeeperHeartbeatRequest final : ZooKeeperRequest
{
String getPath() const override { return {}; }
ZooKeeper::OpNum getOpNum() const override { return 11; }
void writeImpl(WriteBuffer &) const override {}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperHeartbeatResponse final : ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
};
struct ZooKeeperWatchResponse final : WatchResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::read(type, in);
Coordination::read(state, in);
Coordination::read(path, in);
}
};
struct ZooKeeperAuthRequest final : ZooKeeperRequest
{
int32_t type = 0; /// ignored by the server
String scheme;
String data;
String getPath() const override { return {}; }
ZooKeeper::OpNum getOpNum() const override { return 100; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(type, out);
Coordination::write(scheme, out);
Coordination::write(data, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperAuthResponse final : ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
};
struct ZooKeeperCloseRequest final : ZooKeeperRequest
{
String getPath() const override { return {}; }
ZooKeeper::OpNum getOpNum() const override { return -11; }
void writeImpl(WriteBuffer &) const override {}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperCloseResponse final : ZooKeeperResponse
{
void readImpl(ReadBuffer &) override
{
throw Exception("Received response for close request", Error::ZRUNTIMEINCONSISTENCY);
}
};
struct ZooKeeperCreateRequest final : CreateRequest, ZooKeeperRequest
{
ZooKeeperCreateRequest() = default;
explicit ZooKeeperCreateRequest(const CreateRequest & base) : CreateRequest(base) {}
ZooKeeper::OpNum getOpNum() const override { return 1; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(data, out);
Coordination::write(acls, out);
int32_t flags = 0;
if (is_ephemeral)
flags |= 1;
if (is_sequential)
flags |= 2;
Coordination::write(flags, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperCreateResponse final : CreateResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::read(path_created, in);
}
};
struct ZooKeeperRemoveRequest final : RemoveRequest, ZooKeeperRequest
{
ZooKeeperRemoveRequest() = default;
explicit ZooKeeperRemoveRequest(const RemoveRequest & base) : RemoveRequest(base) {}
ZooKeeper::OpNum getOpNum() const override { return 2; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(version, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperRemoveResponse final : RemoveResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
};
struct ZooKeeperExistsRequest final : ExistsRequest, ZooKeeperRequest
{
ZooKeeper::OpNum getOpNum() const override { return 3; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(has_watch, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperExistsResponse final : ExistsResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::read(stat, in);
}
};
struct ZooKeeperGetRequest final : GetRequest, ZooKeeperRequest
{
ZooKeeper::OpNum getOpNum() const override { return 4; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(has_watch, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperGetResponse final : GetResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::read(data, in);
Coordination::read(stat, in);
}
};
struct ZooKeeperSetRequest final : SetRequest, ZooKeeperRequest
{
ZooKeeperSetRequest() = default;
explicit ZooKeeperSetRequest(const SetRequest & base) : SetRequest(base) {}
ZooKeeper::OpNum getOpNum() const override { return 5; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(data, out);
Coordination::write(version, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperSetResponse final : SetResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::read(stat, in);
}
};
struct ZooKeeperListRequest final : ListRequest, ZooKeeperRequest
{
ZooKeeper::OpNum getOpNum() const override { return 12; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(has_watch, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperListResponse final : ListResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::read(names, in);
Coordination::read(stat, in);
}
};
struct ZooKeeperCheckRequest final : CheckRequest, ZooKeeperRequest
{
ZooKeeperCheckRequest() = default;
explicit ZooKeeperCheckRequest(const CheckRequest & base) : CheckRequest(base) {}
ZooKeeper::OpNum getOpNum() const override { return 13; }
void writeImpl(WriteBuffer & out) const override
{
Coordination::write(path, out);
Coordination::write(version, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperCheckResponse final : CheckResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer &) override {}
};
/// This response may be received only as an element of responses in MultiResponse.
struct ZooKeeperErrorResponse final : ErrorResponse, ZooKeeperResponse
{
void readImpl(ReadBuffer & in) override
{
Coordination::Error read_error;
Coordination::read(read_error, in);
if (read_error != error)
throw Exception(fmt::format("Error code in ErrorResponse ({}) doesn't match error code in header ({})", read_error, error),
Error::ZMARSHALLINGERROR);
}
};
struct ZooKeeperMultiRequest final : MultiRequest, ZooKeeperRequest
{
ZooKeeper::OpNum getOpNum() const override { return 14; }
ZooKeeperMultiRequest(const Requests & generic_requests, const ACLs & default_acls)
{
/// Convert nested Requests to ZooKeeperRequests.
/// Note that deep copy is required to avoid modifying path in presence of chroot prefix.
requests.reserve(generic_requests.size());
for (const auto & generic_request : generic_requests)
{
if (const auto * concrete_request_create = dynamic_cast<const CreateRequest *>(generic_request.get()))
{
auto create = std::make_shared<ZooKeeperCreateRequest>(*concrete_request_create);
if (create->acls.empty())
create->acls = default_acls;
requests.push_back(create);
}
else if (const auto * concrete_request_remove = dynamic_cast<const RemoveRequest *>(generic_request.get()))
{
requests.push_back(std::make_shared<ZooKeeperRemoveRequest>(*concrete_request_remove));
}
else if (const auto * concrete_request_set = dynamic_cast<const SetRequest *>(generic_request.get()))
{
requests.push_back(std::make_shared<ZooKeeperSetRequest>(*concrete_request_set));
}
else if (const auto * concrete_request_check = dynamic_cast<const CheckRequest *>(generic_request.get()))
{
requests.push_back(std::make_shared<ZooKeeperCheckRequest>(*concrete_request_check));
}
else
throw Exception("Illegal command as part of multi ZooKeeper request", Error::ZBADARGUMENTS);
}
}
void writeImpl(WriteBuffer & out) const override
{
for (const auto & request : requests)
{
const auto & zk_request = dynamic_cast<const ZooKeeperRequest &>(*request);
bool done = false;
int32_t error = -1;
Coordination::write(zk_request.getOpNum(), out);
Coordination::write(done, out);
Coordination::write(error, out);
zk_request.writeImpl(out);
}
ZooKeeper::OpNum op_num = -1;
bool done = true;
int32_t error = -1;
Coordination::write(op_num, out);
Coordination::write(done, out);
Coordination::write(error, out);
}
ZooKeeperResponsePtr makeResponse() const override;
};
struct ZooKeeperMultiResponse final : MultiResponse, ZooKeeperResponse
{
explicit ZooKeeperMultiResponse(const Requests & requests)
{
responses.reserve(requests.size());
for (const auto & request : requests)
responses.emplace_back(dynamic_cast<const ZooKeeperRequest &>(*request).makeResponse());
}
void readImpl(ReadBuffer & in) override
{
for (auto & response : responses)
{
ZooKeeper::OpNum op_num;
bool done;
Error op_error;
Coordination::read(op_num, in);
Coordination::read(done, in);
Coordination::read(op_error, in);
if (done)
throw Exception("Not enough results received for multi transaction", Error::ZMARSHALLINGERROR);
/// op_num == -1 is special for multi transaction.
/// For an unknown reason, the error code is duplicated in the header and in the response body.
if (op_num == -1)
response = std::make_shared<ZooKeeperErrorResponse>();
if (op_error != Error::ZOK)
{
response->error = op_error;
/// Set error for whole transaction.
/// If some operations fail, ZK sends the global error as zero and then sends details about each operation.
/// It sets the error code for the first failed operation and the special "runtime inconsistency" code for the other operations.
if (error == Error::ZOK && op_error != Error::ZRUNTIMEINCONSISTENCY)
error = op_error;
}
if (op_error == Error::ZOK || op_num == -1)
dynamic_cast<ZooKeeperResponse &>(*response).readImpl(in);
}
/// Footer.
{
ZooKeeper::OpNum op_num;
bool done;
int32_t error_read;
Coordination::read(op_num, in);
Coordination::read(done, in);
Coordination::read(error_read, in);
if (!done)
throw Exception("Too many results received for multi transaction", Error::ZMARSHALLINGERROR);
if (op_num != -1)
throw Exception("Unexpected op_num received at the end of results for multi transaction", Error::ZMARSHALLINGERROR);
if (error_read != -1)
throw Exception("Unexpected error value received at the end of results for multi transaction", Error::ZMARSHALLINGERROR);
}
}
};
ZooKeeperResponsePtr ZooKeeperHeartbeatRequest::makeResponse() const { return std::make_shared<ZooKeeperHeartbeatResponse>(); }
ZooKeeperResponsePtr ZooKeeperAuthRequest::makeResponse() const { return std::make_shared<ZooKeeperAuthResponse>(); }
ZooKeeperResponsePtr ZooKeeperCreateRequest::makeResponse() const { return std::make_shared<ZooKeeperCreateResponse>(); }
ZooKeeperResponsePtr ZooKeeperRemoveRequest::makeResponse() const { return std::make_shared<ZooKeeperRemoveResponse>(); }
ZooKeeperResponsePtr ZooKeeperExistsRequest::makeResponse() const { return std::make_shared<ZooKeeperExistsResponse>(); }
ZooKeeperResponsePtr ZooKeeperGetRequest::makeResponse() const { return std::make_shared<ZooKeeperGetResponse>(); }
ZooKeeperResponsePtr ZooKeeperSetRequest::makeResponse() const { return std::make_shared<ZooKeeperSetResponse>(); }
ZooKeeperResponsePtr ZooKeeperListRequest::makeResponse() const { return std::make_shared<ZooKeeperListResponse>(); }
ZooKeeperResponsePtr ZooKeeperCheckRequest::makeResponse() const { return std::make_shared<ZooKeeperCheckResponse>(); }
ZooKeeperResponsePtr ZooKeeperMultiRequest::makeResponse() const { return std::make_shared<ZooKeeperMultiResponse>(requests); }
ZooKeeperResponsePtr ZooKeeperCloseRequest::makeResponse() const { return std::make_shared<ZooKeeperCloseResponse>(); }
static constexpr int32_t protocol_version = 0;
static constexpr ZooKeeper::XID watch_xid = -1;
static constexpr ZooKeeper::XID ping_xid = -2;
static constexpr ZooKeeper::XID auth_xid = -4;
static constexpr ZooKeeper::XID close_xid = 0x7FFFFFFF;
ZooKeeper::~ZooKeeper()
{
try
@ -995,7 +459,7 @@ void ZooKeeper::sendHandshake()
std::array<char, passwd_len> passwd {};
write(handshake_length);
write(protocol_version);
write(ZOOKEEPER_PROTOCOL_VERSION);
write(last_zxid_seen);
write(timeout);
write(previous_session_id);
@ -1010,16 +474,15 @@ void ZooKeeper::receiveHandshake()
int32_t handshake_length;
int32_t protocol_version_read;
int32_t timeout;
constexpr int32_t passwd_len = 16;
std::array<char, passwd_len> passwd;
std::array<char, PASSWORD_LENGTH> passwd;
read(handshake_length);
if (handshake_length != 36)
throw Exception("Unexpected handshake length received: " + toString(handshake_length), Error::ZMARSHALLINGERROR);
if (handshake_length != SERVER_HANDSHAKE_LENGTH)
throw Exception("Unexpected handshake length received: " + DB::toString(handshake_length), Error::ZMARSHALLINGERROR);
read(protocol_version_read);
if (protocol_version_read != protocol_version)
throw Exception("Unexpected protocol version: " + toString(protocol_version_read), Error::ZMARSHALLINGERROR);
if (protocol_version_read != ZOOKEEPER_PROTOCOL_VERSION)
throw Exception("Unexpected protocol version: " + DB::toString(protocol_version_read), Error::ZMARSHALLINGERROR);
read(timeout);
if (timeout != session_timeout.totalMilliseconds())
@ -1036,7 +499,7 @@ void ZooKeeper::sendAuth(const String & scheme, const String & data)
ZooKeeperAuthRequest request;
request.scheme = scheme;
request.data = data;
request.xid = auth_xid;
request.xid = AUTH_XID;
request.write(*out);
int32_t length;
@ -1050,17 +513,17 @@ void ZooKeeper::sendAuth(const String & scheme, const String & data)
read(zxid);
read(err);
if (read_xid != auth_xid)
throw Exception("Unexpected event received in reply to auth request: " + toString(read_xid),
if (read_xid != AUTH_XID)
throw Exception("Unexpected event received in reply to auth request: " + DB::toString(read_xid),
Error::ZMARSHALLINGERROR);
int32_t actual_length = in->count() - count_before_event;
if (length != actual_length)
throw Exception("Response length doesn't match. Expected: " + toString(length) + ", actual: " + toString(actual_length),
throw Exception("Response length doesn't match. Expected: " + DB::toString(length) + ", actual: " + DB::toString(actual_length),
Error::ZMARSHALLINGERROR);
if (err != Error::ZOK)
throw Exception("Error received in reply to auth request. Code: " + toString(int32_t(err)) + ". Message: " + String(errorMessage(err)),
throw Exception("Error received in reply to auth request. Code: " + DB::toString(int32_t(err)) + ". Message: " + String(errorMessage(err)),
Error::ZMARSHALLINGERROR);
}
@ -1093,7 +556,7 @@ void ZooKeeper::sendThread()
/// After we popped element from the queue, we must register callbacks (even in the case when expired == true right now),
/// because they must not be lost (callbacks must be called because the user will wait for them).
if (info.request->xid != close_xid)
if (info.request->xid != CLOSE_XID)
{
CurrentMetrics::add(CurrentMetrics::ZooKeeperRequest);
std::lock_guard lock(operations_mutex);
@ -1107,7 +570,9 @@ void ZooKeeper::sendThread()
}
if (expired)
{
break;
}
info.request->addRootPath(root_path);
@ -1115,7 +580,7 @@ void ZooKeeper::sendThread()
info.request->write(*out);
/// We sent close request, exit
if (info.request->xid == close_xid)
if (info.request->xid == CLOSE_XID)
break;
}
}
@ -1125,7 +590,7 @@ void ZooKeeper::sendThread()
prev_heartbeat_time = clock::now();
ZooKeeperHeartbeatRequest request;
request.xid = ping_xid;
request.xid = PING_XID;
request.write(*out);
}
@ -1179,7 +644,9 @@ void ZooKeeper::receiveThread()
else
{
if (earliest_operation)
throw Exception("Operation timeout (no response) for path: " + earliest_operation->request->getPath(), Error::ZOPERATIONTIMEOUT);
{
throw Exception("Operation timeout (no response) for request " + toString(earliest_operation->request->getOpNum()) + " for path: " + earliest_operation->request->getPath(), Error::ZOPERATIONTIMEOUT);
}
waited += max_wait;
if (waited >= session_timeout.totalMicroseconds())
throw Exception("Nothing is received in session timeout", Error::ZOPERATIONTIMEOUT);
@ -1213,14 +680,14 @@ void ZooKeeper::receiveEvent()
RequestInfo request_info;
ZooKeeperResponsePtr response;
if (xid == ping_xid)
if (xid == PING_XID)
{
if (err != Error::ZOK)
throw Exception("Received error in heartbeat response: " + String(errorMessage(err)), Error::ZRUNTIMEINCONSISTENCY);
response = std::make_shared<ZooKeeperHeartbeatResponse>();
}
else if (xid == watch_xid)
else if (xid == WATCH_XID)
{
ProfileEvents::increment(ProfileEvents::ZooKeeperWatchResponse);
response = std::make_shared<ZooKeeperWatchResponse>();
@ -1261,7 +728,7 @@ void ZooKeeper::receiveEvent()
auto it = operations.find(xid);
if (it == operations.end())
throw Exception("Received response for unknown xid", Error::ZRUNTIMEINCONSISTENCY);
throw Exception("Received response for unknown xid " + DB::toString(xid), Error::ZRUNTIMEINCONSISTENCY);
/// After this point, we must invoke callback, that we've grabbed from 'operations'.
/// Invariant: all callbacks are invoked either in case of success or in case of error.
@ -1282,13 +749,14 @@ void ZooKeeper::receiveEvent()
response = request_info.request->makeResponse();
if (err != Error::ZOK)
{
response->error = err;
}
else
{
response->readImpl(*in);
response->removeRootPath(root_path);
}
/// Instead of setting the watch in sendEvent, set it in receiveEvent, because the response needs to be checked first.
/// The watch shouldn't be set if the node does not exist and will never exist, as with sequential ephemeral nodes.
/// By using getData() instead of exists(), a watch won't be set if the node doesn't exist.
@ -1298,7 +766,7 @@ void ZooKeeper::receiveEvent()
/// 3 (OpNum::Exists) indicates the ZooKeeperExistsRequest.
// For exists, we set the watch both when the node exists and when it does not.
// For other cases like getData, we only set the watch when the node exists.
if (request_info.request->getOpNum() == 3)
if (request_info.request->getOpNum() == OpNum::Exists)
add_watch = (response->error == Error::ZOK || response->error == Error::ZNONODE);
else
add_watch = response->error == Error::ZOK;
@ -1315,7 +783,7 @@ void ZooKeeper::receiveEvent()
int32_t actual_length = in->count() - count_before_event;
if (length != actual_length)
throw Exception("Response length doesn't match. Expected: " + toString(length) + ", actual: " + toString(actual_length), Error::ZMARSHALLINGERROR);
throw Exception("Response length doesn't match. Expected: " + DB::toString(length) + ", actual: " + DB::toString(actual_length), Error::ZMARSHALLINGERROR);
}
catch (...)
{
@ -1508,7 +976,7 @@ void ZooKeeper::pushRequest(RequestInfo && info)
if (!info.request->xid)
{
info.request->xid = next_xid.fetch_add(1);
if (info.request->xid == close_xid)
if (info.request->xid == CLOSE_XID)
throw Exception("xid equal to close_xid", Error::ZSESSIONEXPIRED);
if (info.request->xid < 0)
throw Exception("XID overflow", Error::ZSESSIONEXPIRED);
@ -1688,7 +1156,7 @@ void ZooKeeper::multi(
void ZooKeeper::close()
{
ZooKeeperCloseRequest request;
request.xid = close_xid;
request.xid = CLOSE_XID;
RequestInfo request_info;
request_info.request = std::make_shared<ZooKeeperCloseRequest>(std::move(request));
@ -1699,5 +1167,4 @@ void ZooKeeper::close()
ProfileEvents::increment(ProfileEvents::ZooKeeperClose);
}
}
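
The multi-request framing that this file previously contained in `ZooKeeperMultiRequest::writeImpl` puts a small header (op_num, done flag, error code) before each sub-request and closes the list with a footer of `-1`, `true`, `-1`. The sketch below illustrates only that framing: it assumes the sub-request bodies are already serialized, uses a `std::string` as the buffer, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace multi_sketch
{
// Big-endian 32-bit integer.
inline void writeInt32(int32_t x, std::string & out)
{
    const uint32_t u = static_cast<uint32_t>(x);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((u >> shift) & 0xFF));
}

// Nine bytes: op_num (4), done flag (1), error code (4).
inline void writeMultiHeader(int32_t op_num, bool done, int32_t error, std::string & out)
{
    writeInt32(op_num, out);
    out.push_back(static_cast<char>(done ? 1 : 0));
    writeInt32(error, out);
}

// Each pair is (op_num, already serialized request body); both are illustrative inputs.
inline std::string frameMulti(const std::vector<std::pair<int32_t, std::string>> & ops)
{
    std::string out;
    for (const auto & op : ops)
    {
        // Per-operation header: done = false, error = -1.
        writeMultiHeader(op.first, false, -1, out);
        out.append(op.second);
    }
    // Footer: op_num = -1 and done = true mark the end of the list.
    writeMultiHeader(-1, true, -1, out);
    return out;
}
}
```

The reader side mirrors this: it expects `done == false` for every sub-response and treats a footer with any other op_num or error value as a marshalling error.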

View File

@ -5,6 +5,7 @@
#include <Common/CurrentMetrics.h>
#include <Common/ThreadPool.h>
#include <Common/ZooKeeper/IKeeper.h>
#include <Common/ZooKeeper/ZooKeeperCommon.h>
#include <IO/ReadBuffer.h>
#include <IO/WriteBuffer.h>
@ -85,9 +86,6 @@ namespace Coordination
using namespace DB;
struct ZooKeeperRequest;
/** Usage scenario: look at the documentation for IKeeper class.
*/
class ZooKeeper : public IKeeper
@ -101,9 +99,6 @@ public:
using Nodes = std::vector<Node>;
using XID = int32_t;
using OpNum = int32_t;
/** Connection to nodes is performed in order. If you want, shuffle them manually.
* Operation timeout couldn't be greater than session timeout.
* Operation timeout applies independently for network read, network write, waiting for events and synchronization.
@ -196,7 +191,7 @@ private:
struct RequestInfo
{
std::shared_ptr<ZooKeeperRequest> request;
ZooKeeperRequestPtr request;
ResponseCallback callback;
WatchCallback watch;
clock::time_point time;
@ -249,31 +244,4 @@ private:
CurrentMetrics::Increment active_session_metric_increment{CurrentMetrics::ZooKeeperSession};
};
struct ZooKeeperResponse;
using ZooKeeperResponsePtr = std::shared_ptr<ZooKeeperResponse>;
/// Exposed in header file for Yandex.Metrica code.
struct ZooKeeperRequest : virtual Request
{
ZooKeeper::XID xid = 0;
bool has_watch = false;
/// If the request was not sent and an error happens, we are definitely sure that it has not been processed by the server.
/// If the request was sent, we didn't get the response, and an error happens, then we cannot be sure whether it was processed or not.
bool probably_sent = false;
ZooKeeperRequest() = default;
ZooKeeperRequest(const ZooKeeperRequest &) = default;
virtual ~ZooKeeperRequest() override = default;
virtual ZooKeeper::OpNum getOpNum() const = 0;
/// Writes length, xid, op_num, then the rest.
void write(WriteBuffer & out) const;
virtual void writeImpl(WriteBuffer &) const = 0;
virtual ZooKeeperResponsePtr makeResponse() const = 0;
};
}

View File

@ -68,6 +68,7 @@ SRCS(
StringUtils/StringUtils.cpp
StudentTTest.cpp
SymbolIndex.cpp
TLDListsHolder.cpp
TaskStatsInfoGetter.cpp
TerminalSize.cpp
ThreadFuzzer.cpp
@ -80,7 +81,11 @@ SRCS(
WeakHash.cpp
ZooKeeper/IKeeper.cpp
ZooKeeper/TestKeeper.cpp
ZooKeeper/TestKeeperStorage.cpp
ZooKeeper/ZooKeeper.cpp
ZooKeeper/ZooKeeperCommon.cpp
ZooKeeper/ZooKeeperConstants.cpp
ZooKeeper/ZooKeeperIO.cpp
ZooKeeper/ZooKeeperImpl.cpp
ZooKeeper/ZooKeeperNodeCache.cpp
checkStackSize.cpp

View File

@ -36,7 +36,7 @@ static std::unordered_map<String, String> fetchTablesCreateQuery(
MySQLBlockInputStream show_create_table(
connection, "SHOW CREATE TABLE " + backQuoteIfNeed(database_name) + "." + backQuoteIfNeed(fetch_table_name),
show_create_table_header, DEFAULT_BLOCK_SIZE);
show_create_table_header, DEFAULT_BLOCK_SIZE, false, true);
Block create_query_block = show_create_table.read();
if (!create_query_block || create_query_block.rows() != 1)
@ -77,7 +77,7 @@ void MaterializeMetadata::fetchMasterStatus(mysqlxx::PoolWithFailover::Entry & c
{std::make_shared<DataTypeString>(), "Executed_Gtid_Set"},
};
MySQLBlockInputStream input(connection, "SHOW MASTER STATUS;", header, DEFAULT_BLOCK_SIZE);
MySQLBlockInputStream input(connection, "SHOW MASTER STATUS;", header, DEFAULT_BLOCK_SIZE, false, true);
Block master_status = input.read();
if (!master_status || master_status.rows() != 1)
@ -99,7 +99,7 @@ void MaterializeMetadata::fetchMasterVariablesValue(const mysqlxx::PoolWithFailo
};
const String & fetch_query = "SHOW VARIABLES WHERE Variable_name = 'binlog_checksum'";
MySQLBlockInputStream variables_input(connection, fetch_query, variables_header, DEFAULT_BLOCK_SIZE);
MySQLBlockInputStream variables_input(connection, fetch_query, variables_header, DEFAULT_BLOCK_SIZE, false, true);
while (Block variables_block = variables_input.read())
{
@ -114,23 +114,6 @@ void MaterializeMetadata::fetchMasterVariablesValue(const mysqlxx::PoolWithFailo
}
}
static Block getShowMasterLogHeader(const String & mysql_version)
{
if (startsWith(mysql_version, "5."))
{
return Block {
{std::make_shared<DataTypeString>(), "Log_name"},
{std::make_shared<DataTypeUInt64>(), "File_size"}
};
}
return Block {
{std::make_shared<DataTypeString>(), "Log_name"},
{std::make_shared<DataTypeUInt64>(), "File_size"},
{std::make_shared<DataTypeString>(), "Encrypted"}
};
}
static bool checkSyncUserPrivImpl(const mysqlxx::PoolWithFailover::Entry & connection, WriteBuffer & out)
{
Block sync_user_privs_header
@ -174,9 +157,14 @@ static void checkSyncUserPriv(const mysqlxx::PoolWithFailover::Entry & connectio
"But the SYNC USER grant query is: " + out.str(), ErrorCodes::SYNC_MYSQL_USER_ACCESS_ERROR);
}
bool MaterializeMetadata::checkBinlogFileExists(const mysqlxx::PoolWithFailover::Entry & connection, const String & mysql_version) const
bool MaterializeMetadata::checkBinlogFileExists(const mysqlxx::PoolWithFailover::Entry & connection) const
{
MySQLBlockInputStream input(connection, "SHOW MASTER LOGS", getShowMasterLogHeader(mysql_version), DEFAULT_BLOCK_SIZE);
Block logs_header {
{std::make_shared<DataTypeString>(), "Log_name"},
{std::make_shared<DataTypeUInt64>(), "File_size"}
};
MySQLBlockInputStream input(connection, "SHOW MASTER LOGS", logs_header, DEFAULT_BLOCK_SIZE, false, true);
while (Block block = input.read())
{
@ -233,7 +221,7 @@ void MaterializeMetadata::transaction(const MySQLReplication::Position & positio
MaterializeMetadata::MaterializeMetadata(
mysqlxx::PoolWithFailover::Entry & connection, const String & path_,
const String & database, bool & opened_transaction, const String & mysql_version)
const String & database, bool & opened_transaction)
: persistent_path(path_)
{
checkSyncUserPriv(connection);
@ -251,7 +239,7 @@ MaterializeMetadata::MaterializeMetadata(
assertString("\nData Version:\t", in);
readIntText(data_version, in);
if (checkBinlogFileExists(connection, mysql_version))
if (checkBinlogFileExists(connection))
return;
}


@ -41,13 +41,13 @@ struct MaterializeMetadata
void fetchMasterVariablesValue(const mysqlxx::PoolWithFailover::Entry & connection);
bool checkBinlogFileExists(const mysqlxx::PoolWithFailover::Entry & connection, const String & mysql_version) const;
bool checkBinlogFileExists(const mysqlxx::PoolWithFailover::Entry & connection) const;
void transaction(const MySQLReplication::Position & position, const std::function<void()> & fun);
MaterializeMetadata(
mysqlxx::PoolWithFailover::Entry & connection, const String & path
, const String & database, bool & opened_transaction, const String & mysql_version);
, const String & database, bool & opened_transaction);
};
}


@ -93,7 +93,7 @@ MaterializeMySQLSyncThread::~MaterializeMySQLSyncThread()
}
}
static String checkVariableAndGetVersion(const mysqlxx::Pool::Entry & connection)
static void checkMySQLVariables(const mysqlxx::Pool::Entry & connection)
{
Block variables_header{
{std::make_shared<DataTypeString>(), "Variable_name"},
@ -106,7 +106,7 @@ static String checkVariableAndGetVersion(const mysqlxx::Pool::Entry & connection
"OR (Variable_name = 'binlog_row_image' AND upper(Value) = 'FULL') "
"OR (Variable_name = 'default_authentication_plugin' AND upper(Value) = 'MYSQL_NATIVE_PASSWORD');";
MySQLBlockInputStream variables_input(connection, check_query, variables_header, DEFAULT_BLOCK_SIZE);
MySQLBlockInputStream variables_input(connection, check_query, variables_header, DEFAULT_BLOCK_SIZE, false, true);
Block variables_block = variables_input.read();
if (!variables_block || variables_block.rows() != 4)
@ -140,15 +140,6 @@ static String checkVariableAndGetVersion(const mysqlxx::Pool::Entry & connection
throw Exception(error_message.str(), ErrorCodes::ILLEGAL_MYSQL_VARIABLE);
}
Block version_header{{std::make_shared<DataTypeString>(), "version"}};
MySQLBlockInputStream version_input(connection, "SELECT version() AS version;", version_header, DEFAULT_BLOCK_SIZE);
Block version_block = version_input.read();
if (!version_block || version_block.rows() != 1)
throw Exception("LOGICAL ERROR: cannot get mysql version.", ErrorCodes::LOGICAL_ERROR);
return version_block.getByPosition(0).column->getDataAt(0).toString();
}
MaterializeMySQLSyncThread::MaterializeMySQLSyncThread(
@ -160,13 +151,13 @@ MaterializeMySQLSyncThread::MaterializeMySQLSyncThread(
query_prefix = "EXTERNAL DDL FROM MySQL(" + backQuoteIfNeed(database_name) + ", " + backQuoteIfNeed(mysql_database_name) + ") ";
}
void MaterializeMySQLSyncThread::synchronization(const String & mysql_version)
void MaterializeMySQLSyncThread::synchronization()
{
setThreadName(MYSQL_BACKGROUND_THREAD_NAME);
try
{
if (std::optional<MaterializeMetadata> metadata = prepareSynchronized(mysql_version))
if (std::optional<MaterializeMetadata> metadata = prepareSynchronized())
{
Stopwatch watch;
Buffers buffers(database_name);
@ -217,10 +208,8 @@ void MaterializeMySQLSyncThread::startSynchronization()
{
try
{
const auto & mysql_server_version = checkVariableAndGetVersion(pool.get());
background_thread_pool = std::make_unique<ThreadFromGlobalPool>(
[this, mysql_server_version = mysql_server_version]() { synchronization(mysql_server_version); });
checkMySQLVariables(pool.get());
background_thread_pool = std::make_unique<ThreadFromGlobalPool>([this]() { synchronization(); });
}
catch (...)
{
@ -246,15 +235,24 @@ void MaterializeMySQLSyncThread::startSynchronization()
static inline void cleanOutdatedTables(const String & database_name, const Context & context)
{
auto ddl_guard = DatabaseCatalog::instance().getDDLGuard(database_name, "");
const DatabasePtr & clean_database = DatabaseCatalog::instance().getDatabase(database_name);
for (auto iterator = clean_database->getTablesIterator(context); iterator->isValid(); iterator->next())
String cleaning_table_name;
try
{
Context query_context = createQueryContext(context);
String comment = "Materialize MySQL step 1: execute MySQL DDL for dump data";
String table_name = backQuoteIfNeed(database_name) + "." + backQuoteIfNeed(iterator->name());
tryToExecuteQuery(" DROP TABLE " + table_name, query_context, database_name, comment);
auto ddl_guard = DatabaseCatalog::instance().getDDLGuard(database_name, "");
const DatabasePtr & clean_database = DatabaseCatalog::instance().getDatabase(database_name);
for (auto iterator = clean_database->getTablesIterator(context); iterator->isValid(); iterator->next())
{
Context query_context = createQueryContext(context);
String comment = "Materialize MySQL step 1: execute MySQL DDL for dump data";
cleaning_table_name = backQuoteIfNeed(database_name) + "." + backQuoteIfNeed(iterator->name());
tryToExecuteQuery(" DROP TABLE " + cleaning_table_name, query_context, database_name, comment);
}
}
catch (Exception & exception)
{
exception.addMessage("While executing " + (cleaning_table_name.empty() ? "cleanOutdatedTables" : cleaning_table_name));
throw;
}
}
@ -295,24 +293,32 @@ static inline void dumpDataForTables(
auto iterator = master_info.need_dumping_tables.begin();
for (; iterator != master_info.need_dumping_tables.end() && !is_cancelled(); ++iterator)
{
const auto & table_name = iterator->first;
Context query_context = createQueryContext(context);
String comment = "Materialize MySQL step 1: execute MySQL DDL for dump data";
tryToExecuteQuery(query_prefix + " " + iterator->second, query_context, database_name, comment); /// create table.
try
{
const auto & table_name = iterator->first;
Context query_context = createQueryContext(context);
String comment = "Materialize MySQL step 1: execute MySQL DDL for dump data";
tryToExecuteQuery(query_prefix + " " + iterator->second, query_context, database_name, comment); /// create table.
auto out = std::make_shared<CountingBlockOutputStream>(getTableOutput(database_name, table_name, query_context));
MySQLBlockInputStream input(
connection, "SELECT * FROM " + backQuoteIfNeed(mysql_database_name) + "." + backQuoteIfNeed(table_name),
out->getHeader(), DEFAULT_BLOCK_SIZE);
auto out = std::make_shared<CountingBlockOutputStream>(getTableOutput(database_name, table_name, query_context));
MySQLBlockInputStream input(
connection, "SELECT * FROM " + backQuoteIfNeed(mysql_database_name) + "." + backQuoteIfNeed(table_name),
out->getHeader(), DEFAULT_BLOCK_SIZE);
Stopwatch watch;
copyData(input, *out, is_cancelled);
const Progress & progress = out->getProgress();
LOG_INFO(&Poco::Logger::get("MaterializeMySQLSyncThread(" + database_name + ")"),
"Materialize MySQL step 1: dump {}, {} rows, {} in {} sec., {} rows/sec., {}/sec."
, table_name, formatReadableQuantity(progress.written_rows), formatReadableSizeWithBinarySuffix(progress.written_bytes)
, watch.elapsedSeconds(), formatReadableQuantity(static_cast<size_t>(progress.written_rows / watch.elapsedSeconds()))
, formatReadableSizeWithBinarySuffix(static_cast<size_t>(progress.written_bytes / watch.elapsedSeconds())));
Stopwatch watch;
copyData(input, *out, is_cancelled);
const Progress & progress = out->getProgress();
LOG_INFO(&Poco::Logger::get("MaterializeMySQLSyncThread(" + database_name + ")"),
"Materialize MySQL step 1: dump {}, {} rows, {} in {} sec., {} rows/sec., {}/sec."
, table_name, formatReadableQuantity(progress.written_rows), formatReadableSizeWithBinarySuffix(progress.written_bytes)
, watch.elapsedSeconds(), formatReadableQuantity(static_cast<size_t>(progress.written_rows / watch.elapsedSeconds()))
, formatReadableSizeWithBinarySuffix(static_cast<size_t>(progress.written_bytes / watch.elapsedSeconds())));
}
catch (Exception & exception)
{
exception.addMessage("While executing dump MySQL {}.{} table data.", mysql_database_name, iterator->first);
throw;
}
}
}
@ -324,7 +330,7 @@ static inline UInt32 randomNumber()
return dist6(rng);
}
std::optional<MaterializeMetadata> MaterializeMySQLSyncThread::prepareSynchronized(const String & mysql_version)
std::optional<MaterializeMetadata> MaterializeMySQLSyncThread::prepareSynchronized()
{
bool opened_transaction = false;
mysqlxx::PoolWithFailover::Entry connection;
@ -337,8 +343,7 @@ std::optional<MaterializeMetadata> MaterializeMySQLSyncThread::prepareSynchroniz
opened_transaction = false;
MaterializeMetadata metadata(
connection, getDatabase(database_name).getMetadataPath() + "/.metadata",
mysql_database_name, opened_transaction, mysql_version);
connection, getDatabase(database_name).getMetadataPath() + "/.metadata", mysql_database_name, opened_transaction);
if (!metadata.need_dumping_tables.empty())
{
@ -685,6 +690,8 @@ void MaterializeMySQLSyncThread::executeDDLAtomic(const QueryEvent & query_event
}
catch (Exception & exception)
{
exception.addMessage("While executing MYSQL_QUERY_EVENT. The query: " + query_event.query);
tryLogCurrentException(log);
/// If some DDL query was not successfully parsed and executed


@ -95,11 +95,11 @@ private:
BufferAndSortingColumnsPtr getTableDataBuffer(const String & table, const Context & context);
};
void synchronization(const String & mysql_version);
void synchronization();
bool isCancelled() { return sync_quit.load(std::memory_order_relaxed); }
std::optional<MaterializeMetadata> prepareSynchronized(const String & mysql_version);
std::optional<MaterializeMetadata> prepareSynchronized();
void flushBuffersData(Buffers & buffers, MaterializeMetadata & metadata);


@ -3,6 +3,7 @@
#include "Disks/DiskFactory.h"
#include <random>
#include <optional>
#include <utility>
#include <IO/ReadBufferFromFile.h>
#include <IO/ReadBufferFromS3.h>
@ -326,11 +327,19 @@ namespace
const String & bucket_,
Metadata metadata_,
const String & s3_path_,
std::optional<DiskS3::ObjectMetadata> object_metadata_,
bool is_multipart,
size_t min_upload_part_size,
size_t buf_size_)
: WriteBufferFromFileBase(buf_size_, nullptr, 0)
, impl(WriteBufferFromS3(client_ptr_, bucket_, metadata_.s3_root_path + s3_path_, min_upload_part_size, is_multipart, buf_size_))
, impl(WriteBufferFromS3(
client_ptr_,
bucket_,
metadata_.s3_root_path + s3_path_,
min_upload_part_size,
is_multipart,
std::move(object_metadata_),
buf_size_))
, metadata(std::move(metadata_))
, s3_path(s3_path_)
{
@ -522,7 +531,8 @@ DiskS3::DiskS3(
String metadata_path_,
size_t min_upload_part_size_,
size_t min_multi_part_upload_size_,
size_t min_bytes_for_seek_)
size_t min_bytes_for_seek_,
bool send_metadata_)
: IDisk(std::make_unique<AsyncExecutor>())
, name(std::move(name_))
, client(std::move(client_))
@ -533,6 +543,7 @@ DiskS3::DiskS3(
, min_upload_part_size(min_upload_part_size_)
, min_multi_part_upload_size(min_multi_part_upload_size_)
, min_bytes_for_seek(min_bytes_for_seek_)
, send_metadata(send_metadata_)
{
}
@ -653,6 +664,7 @@ std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path,
}
/// Path to store new S3 object.
auto s3_path = getRandomName();
auto object_metadata = createObjectMetadata(path);
bool is_multipart = estimated_size >= min_multi_part_upload_size;
if (!exist || mode == WriteMode::Rewrite)
{
@ -664,9 +676,9 @@ std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path,
/// Save empty metadata to disk to have ability to get file size while buffer is not finalized.
metadata.save();
LOG_DEBUG(&Poco::Logger::get("DiskS3"), "Write to file by path: {} New S3 path: {}", backQuote(metadata_path + path), s3_root_path + s3_path);
LOG_DEBUG(&Poco::Logger::get("DiskS3"), "Write to file by path: {}. New S3 path: {}", backQuote(metadata_path + path), s3_root_path + s3_path);
return std::make_unique<WriteIndirectBufferFromS3>(client, bucket, metadata, s3_path, is_multipart, min_upload_part_size, buf_size);
return std::make_unique<WriteIndirectBufferFromS3>(client, bucket, metadata, s3_path, object_metadata, is_multipart, min_upload_part_size, buf_size);
}
else
{
@ -675,7 +687,7 @@ std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path,
LOG_DEBUG(&Poco::Logger::get("DiskS3"), "Append to file by path: {}. New S3 path: {}. Existing S3 objects: {}.",
backQuote(metadata_path + path), s3_root_path + s3_path, metadata.s3_objects.size());
return std::make_unique<WriteIndirectBufferFromS3>(client, bucket, metadata, s3_path, is_multipart, min_upload_part_size, buf_size);
return std::make_unique<WriteIndirectBufferFromS3>(client, bucket, metadata, s3_path, object_metadata, is_multipart, min_upload_part_size, buf_size);
}
}
@ -847,4 +859,12 @@ void DiskS3::shutdown()
client->DisableRequestProcessing();
}
std::optional<DiskS3::ObjectMetadata> DiskS3::createObjectMetadata(const String & path) const
{
if (send_metadata)
return (DiskS3::ObjectMetadata){{"path", path}};
return {};
}
}
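The `createObjectMetadata` hunk above attaches a `path` entry to each uploaded S3 object only when `send_object_metadata` is enabled. A minimal standalone sketch of that optional-metadata pattern follows. It assumes a free function that takes the flag as a parameter (in the diff it is a `DiskS3` member reading `send_metadata`), and it uses plain brace initialization rather than the cast form in the hunk.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

/// Same alias as DiskS3::ObjectMetadata in the diff.
using ObjectMetadata = std::map<std::string, std::string>;

/// Hypothetical free-function variant of DiskS3::createObjectMetadata:
/// returns a one-entry map when metadata sending is enabled, nothing otherwise.
std::optional<ObjectMetadata> createObjectMetadata(bool send_metadata, const std::string & path)
{
    if (send_metadata)
        return ObjectMetadata{{"path", path}};
    return std::nullopt;
}
```

Returning `std::optional` lets the write buffer skip the metadata step entirely when the option is off, instead of sending an empty map.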


@ -19,6 +19,8 @@ namespace DB
class DiskS3 : public IDisk
{
public:
using ObjectMetadata = std::map<std::string, std::string>;
friend class DiskS3Reservation;
class AwsS3KeyKeeper;
@ -32,7 +34,8 @@ public:
String metadata_path_,
size_t min_upload_part_size_,
size_t min_multi_part_upload_size_,
size_t min_bytes_for_seek_);
size_t min_bytes_for_seek_,
bool send_metadata_);
const String & getName() const override { return name; }
@ -116,6 +119,7 @@ private:
void removeMeta(const String & path, AwsS3KeyKeeper & keys);
void removeMetaRecursive(const String & path, AwsS3KeyKeeper & keys);
void removeAws(const AwsS3KeyKeeper & keys);
std::optional<ObjectMetadata> createObjectMetadata(const String & path) const;
private:
const String name;
@ -127,6 +131,7 @@ private:
size_t min_upload_part_size;
size_t min_multi_part_upload_size;
size_t min_bytes_for_seek;
bool send_metadata;
UInt64 reserved_bytes = 0;
UInt64 reservation_count = 0;


@ -134,6 +134,7 @@ void registerDiskS3(DiskFactory & factory)
uri.is_virtual_hosted_style,
config.getString(config_prefix + ".access_key_id", ""),
config.getString(config_prefix + ".secret_access_key", ""),
config.getBool(config_prefix + ".use_environment_credentials", config.getBool("s3.use_environment_credentials", false)),
context.getRemoteHostFilter(),
context.getGlobalContext().getSettingsRef().s3_max_redirects);
@ -148,7 +149,8 @@ void registerDiskS3(DiskFactory & factory)
metadata_path,
context.getSettingsRef().s3_min_upload_part_size,
config.getUInt64(config_prefix + ".min_multi_part_upload_size", 10 * 1024 * 1024),
config.getUInt64(config_prefix + ".min_bytes_for_seek", 1024 * 1024));
config.getUInt64(config_prefix + ".min_bytes_for_seek", 1024 * 1024),
config.getBool(config_prefix + ".send_object_metadata", false));
/// This code is used only to check access to the corresponding disk.
if (!config.getBool(config_prefix + ".skip_access_check", false))


@ -12,6 +12,7 @@
# include <DataTypes/DataTypeNullable.h>
# include <IO/ReadHelpers.h>
# include <IO/WriteHelpers.h>
# include <IO/Operators.h>
# include <Common/assert_cast.h>
# include <ext/range.h>
# include "MySQLBlockInputStream.h"
@ -37,17 +38,15 @@ MySQLBlockInputStream::MySQLBlockInputStream(
const std::string & query_str,
const Block & sample_block,
const UInt64 max_block_size_,
const bool auto_close_)
const bool auto_close_,
const bool fetch_by_name_)
: connection{std::make_unique<Connection>(entry, query_str)}
, max_block_size{max_block_size_}
, auto_close{auto_close_}
, fetch_by_name(fetch_by_name_)
{
if (sample_block.columns() != connection->result.getNumFields())
throw Exception{"mysqlxx::UseQueryResult contains " + toString(connection->result.getNumFields()) + " columns while "
+ toString(sample_block.columns()) + " expected",
ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH};
description.init(sample_block);
initPositionMappingFromQueryResultStructure();
}
@ -135,24 +134,25 @@ Block MySQLBlockInputStream::readImpl()
size_t num_rows = 0;
while (row)
{
for (const auto idx : ext::range(0, row.size()))
for (size_t index = 0; index < position_mapping.size(); ++index)
{
const auto value = row[idx];
const auto & sample = description.sample_block.getByPosition(idx);
const auto value = row[position_mapping[index]];
const auto & sample = description.sample_block.getByPosition(index);
if (!value.isNull())
{
if (description.types[idx].second)
if (description.types[index].second)
{
ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*columns[idx]);
ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*columns[index]);
const auto & data_type = assert_cast<const DataTypeNullable &>(*sample.type);
insertValue(*data_type.getNestedType(), column_nullable.getNestedColumn(), description.types[idx].first, value);
insertValue(*data_type.getNestedType(), column_nullable.getNestedColumn(), description.types[index].first, value);
column_nullable.getNullMapData().emplace_back(0);
}
else
insertValue(*sample.type, *columns[idx], description.types[idx].first, value);
insertValue(*sample.type, *columns[index], description.types[index].first, value);
}
else
insertDefaultValue(*columns[idx], *sample.column);
insertDefaultValue(*columns[index], *sample.column);
}
++num_rows;
@ -167,20 +167,70 @@ Block MySQLBlockInputStream::readImpl()
MySQLBlockInputStream::MySQLBlockInputStream(
const Block & sample_block_,
UInt64 max_block_size_,
bool auto_close_)
bool auto_close_,
bool fetch_by_name_)
: max_block_size(max_block_size_)
, auto_close(auto_close_)
, fetch_by_name(fetch_by_name_)
{
description.init(sample_block_);
}
void MySQLBlockInputStream::initPositionMappingFromQueryResultStructure()
{
position_mapping.resize(description.sample_block.columns());
if (!fetch_by_name)
{
if (description.sample_block.columns() != connection->result.getNumFields())
throw Exception{"mysqlxx::UseQueryResult contains " + toString(connection->result.getNumFields()) + " columns while "
+ toString(description.sample_block.columns()) + " expected", ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH};
for (const auto idx : ext::range(0, connection->result.getNumFields()))
position_mapping[idx] = idx;
}
else
{
const auto & sample_names = description.sample_block.getNames();
std::unordered_set<std::string> missing_names(sample_names.begin(), sample_names.end());
size_t fields_size = connection->result.getNumFields();
for (const size_t & idx : ext::range(0, fields_size))
{
const auto & field_name = connection->result.getFieldName(idx);
if (description.sample_block.has(field_name))
{
const auto & position = description.sample_block.getPositionByName(field_name);
position_mapping[position] = idx;
missing_names.erase(field_name);
}
}
if (!missing_names.empty())
{
WriteBufferFromOwnString exception_message;
for (auto iter = missing_names.begin(); iter != missing_names.end(); ++iter)
{
if (iter != missing_names.begin())
exception_message << ", ";
exception_message << *iter;
}
throw Exception("mysqlxx::UseQueryResult must contain the " + exception_message.str() + " columns.",
ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH);
}
}
}
MySQLLazyBlockInputStream::MySQLLazyBlockInputStream(
mysqlxx::Pool & pool_,
const std::string & query_str_,
const Block & sample_block_,
const UInt64 max_block_size_,
const bool auto_close_)
: MySQLBlockInputStream(sample_block_, max_block_size_, auto_close_)
const bool auto_close_,
const bool fetch_by_name_)
: MySQLBlockInputStream(sample_block_, max_block_size_, auto_close_, fetch_by_name_)
, pool(pool_)
, query_str(query_str_)
{
@ -189,10 +239,7 @@ MySQLLazyBlockInputStream::MySQLLazyBlockInputStream(
void MySQLLazyBlockInputStream::readPrefix()
{
connection = std::make_unique<Connection>(pool.get(), query_str);
if (description.sample_block.columns() != connection->result.getNumFields())
throw Exception{"mysqlxx::UseQueryResult contains " + toString(connection->result.getNumFields()) + " columns while "
+ toString(description.sample_block.columns()) + " expected",
ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH};
initPositionMappingFromQueryResultStructure();
}
}
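With `fetch_by_name` enabled, `initPositionMappingFromQueryResultStructure` matches expected columns to result fields by name, so extra fields (such as `Encrypted` in `SHOW MASTER LOGS` on newer MySQL versions) are ignored. The sketch below illustrates that mapping on plain string vectors. `buildPositionMapping` is a hypothetical helper, not a ClickHouse function, and it throws `std::runtime_error` instead of `DB::Exception`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

/// Hypothetical helper: for every expected column, find the index of the
/// result field with the same name. Extra result fields are ignored.
std::vector<size_t> buildPositionMapping(
    const std::vector<std::string> & expected_names,
    const std::vector<std::string> & result_fields)
{
    /// If a field name repeats, the last occurrence wins, as in the diff.
    std::unordered_map<std::string, size_t> field_index;
    for (size_t i = 0; i < result_fields.size(); ++i)
        field_index[result_fields[i]] = i;

    std::vector<size_t> mapping(expected_names.size());
    std::string missing;
    for (size_t pos = 0; pos < expected_names.size(); ++pos)
    {
        auto it = field_index.find(expected_names[pos]);
        if (it != field_index.end())
            mapping[pos] = it->second;
        else
            missing += (missing.empty() ? "" : ", ") + expected_names[pos];
    }

    if (!missing.empty())
        throw std::runtime_error("Result must contain the " + missing + " columns.");
    return mapping;
}
```

Reading a row then becomes `row[mapping[index]]` for each expected column `index`, which is what the rewritten loop in `readImpl` does with `position_mapping`.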


@ -20,15 +20,17 @@ public:
const std::string & query_str,
const Block & sample_block,
const UInt64 max_block_size_,
const bool auto_close_ = false);
const bool auto_close_ = false,
const bool fetch_by_name_ = false);
String getName() const override { return "MySQL"; }
Block getHeader() const override { return description.sample_block.cloneEmpty(); }
protected:
MySQLBlockInputStream(const Block & sample_block_, UInt64 max_block_size_, bool auto_close_);
MySQLBlockInputStream(const Block & sample_block_, UInt64 max_block_size_, bool auto_close_, bool fetch_by_name_);
Block readImpl() override;
void initPositionMappingFromQueryResultStructure();
struct Connection
{
@ -43,6 +45,8 @@ protected:
const UInt64 max_block_size;
const bool auto_close;
const bool fetch_by_name;
std::vector<size_t> position_mapping;
ExternalResultDescription description;
};
@ -56,7 +60,8 @@ public:
const std::string & query_str_,
const Block & sample_block_,
const UInt64 max_block_size_,
const bool auto_close_ = false);
const bool auto_close_ = false,
const bool fetch_by_name_ = false);
private:
void readPrefix() override;


@ -0,0 +1,409 @@
#pragma once
#include <Common/Exception.h>
#include <Core/Types.h>
#include <IO/ReadBuffer.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteHelpers.h>
#include <cstdint>
namespace DB
{
namespace ErrorCodes
{
extern const int CANNOT_PARSE_INPUT_ASSERTION_FAILED;
extern const int CANNOT_PARSE_DATE;
extern const int CANNOT_FORMAT_DATETIME;
extern const int LOGICAL_ERROR;
}
/** Proleptic Gregorian calendar date. YearT is an integral type
* which should be at least 32 bits wide, and should preferably
* be signed.
*/
template <typename YearT = int32_t>
class GregorianDate
{
public:
GregorianDate() = delete;
/** Construct from date in text form 'YYYY-MM-DD' by reading from
* ReadBuffer.
*/
GregorianDate(ReadBuffer & in);
/** Construct from Modified Julian Day. The type T is an
* integral type which should be at least 32 bits wide, and
* should preferably be signed.
*/
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> * = nullptr>
GregorianDate(T mjd);
/** Convert to Modified Julian Day. The type T is an integral type
* which should be at least 32 bits wide, and should preferably
* be signed.
*/
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> * = nullptr>
T toModifiedJulianDay() const;
/** Write the date in text form 'YYYY-MM-DD' to a buffer.
*/
void write(WriteBuffer & buf) const;
/** Convert to a string in text form 'YYYY-MM-DD'.
*/
std::string toString() const;
YearT year() const noexcept
{
return year_;
}
uint8_t month() const noexcept
{
return month_;
}
uint8_t day_of_month() const noexcept
{
return day_of_month_;
}
private:
YearT year_;
uint8_t month_;
uint8_t day_of_month_;
};
/** ISO 8601 Ordinal Date. YearT is an integral type which should
* be at least 32 bits wide, and should preferably be signed.
*/
template <typename YearT = int32_t>
class OrdinalDate
{
public:
OrdinalDate(YearT year, uint16_t day_of_year);
/** Construct from Modified Julian Day. The type T is an
* integral type which should be at least 32 bits wide, and
* should preferably be signed.
*/
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> * = nullptr>
OrdinalDate(T mjd);
/** Convert to Modified Julian Day. The type T is an integral
* type which should be at least 32 bits wide, and should
* preferably be signed.
*/
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> * = nullptr>
T toModifiedJulianDay() const noexcept;
YearT year() const noexcept
{
return year_;
}
uint16_t dayOfYear() const noexcept
{
return day_of_year_;
}
private:
YearT year_;
uint16_t day_of_year_;
};
class MonthDay
{
public:
/** Construct from month and day. */
MonthDay(uint8_t month, uint8_t day_of_month);
/** Construct month and day from a day of year in the Gregorian
* or Julian calendar.
*/
MonthDay(bool is_leap_year, uint16_t day_of_year);
/** Convert month and day in Gregorian or Julian calendars to
* day of year.
*/
uint16_t dayOfYear(bool is_leap_year) const;
uint8_t month() const noexcept
{
return month_;
}
uint8_t day_of_month() const noexcept
{
return day_of_month_;
}
private:
uint8_t month_;
uint8_t day_of_month_;
};
}
/* Implementation */
namespace gd
{
using namespace DB;
template <typename YearT>
static inline constexpr bool is_leap_year(YearT year)
{
return (year % 4 == 0) && ((year % 400 == 0) || (year % 100 != 0));
}
static inline constexpr uint8_t monthLength(bool is_leap_year, uint8_t month)
{
switch (month)
{
case 1: return 31;
case 2: return is_leap_year ? 29 : 28;
case 3: return 31;
case 4: return 30;
case 5: return 31;
case 6: return 30;
case 7: return 31;
case 8: return 31;
case 9: return 30;
case 10: return 31;
case 11: return 30;
case 12: return 31;
default:
std::terminate();
}
}
/** Integer division truncated toward negative infinity.
*/
template <typename I, typename J>
static inline constexpr I div(I x, J y)
{
const auto y_ = static_cast<I>(y);
if (x > 0 && y_ < 0)
return ((x - 1) / y_) - 1;
else if (x < 0 && y_ > 0)
return ((x + 1) / y_) - 1;
else
return x / y_;
}
/** Integer modulus, satisfying div(x, y)*y + mod(x, y) == x.
*/
template <typename I, typename J>
static inline constexpr I mod(I x, J y)
{
const auto y_ = static_cast<I>(y);
const auto r = x % y_;
if ((x > 0 && y_ < 0) || (x < 0 && y_ > 0))
return r == 0 ? static_cast<I>(0) : r + y_;
else
return r;
}
/** Like std::min(), but the type of operands may differ.
*/
template <typename I, typename J>
static inline constexpr I min(I x, J y)
{
const auto y_ = static_cast<I>(y);
return x < y_ ? x : y_;
}
static inline char readDigit(ReadBuffer & in)
{
char c;
if (!in.read(c))
throw Exception(
"Cannot parse input: expected a digit at the end of stream",
ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED);
else if (c < '0' || c > '9')
throw Exception(
"Cannot read input: expected a digit but got something else",
ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED);
else
return c - '0';
}
}
namespace DB
{
template <typename YearT>
GregorianDate<YearT>::GregorianDate(ReadBuffer & in)
{
year_ = gd::readDigit(in) * 1000
+ gd::readDigit(in) * 100
+ gd::readDigit(in) * 10
+ gd::readDigit(in);
assertChar('-', in);
month_ = gd::readDigit(in) * 10
+ gd::readDigit(in);
assertChar('-', in);
day_of_month_ = gd::readDigit(in) * 10
+ gd::readDigit(in);
assertEOF(in);
if (month_ < 1 || month_ > 12 || day_of_month_ < 1 || day_of_month_ > gd::monthLength(gd::is_leap_year(year_), month_))
throw Exception("Invalid date: " + toString(), ErrorCodes::CANNOT_PARSE_DATE);
}
template <typename YearT>
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> *>
GregorianDate<YearT>::GregorianDate(T mjd)
{
const OrdinalDate<YearT> ord(mjd);
const MonthDay md(gd::is_leap_year(ord.year()), ord.dayOfYear());
year_ = ord.year();
month_ = md.month();
day_of_month_ = md.day_of_month();
}
template <typename YearT>
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> *>
T GregorianDate<YearT>::toModifiedJulianDay() const
{
const MonthDay md(month_, day_of_month_);
const auto day_of_year = md.dayOfYear(gd::is_leap_year(year_));
const OrdinalDate<YearT> ord(year_, day_of_year);
return ord.template toModifiedJulianDay<T>();
}
template <typename YearT>
void GregorianDate<YearT>::write(WriteBuffer & buf) const
{
if (year_ < 0 || year_ > 9999)
{
throw Exception(
"Impossible to stringify: year too big or small: " + DB::toString(year_),
ErrorCodes::CANNOT_FORMAT_DATETIME);
}
else
{
auto y = year_;
writeChar('0' + y / 1000, buf); y %= 1000;
writeChar('0' + y / 100, buf); y %= 100;
writeChar('0' + y / 10, buf); y %= 10;
writeChar('0' + y , buf);
writeChar('-', buf);
auto m = month_;
writeChar('0' + m / 10, buf); m %= 10;
writeChar('0' + m , buf);
writeChar('-', buf);
auto d = day_of_month_;
writeChar('0' + d / 10, buf); d %= 10;
writeChar('0' + d , buf);
}
}
template <typename YearT>
std::string GregorianDate<YearT>::toString() const
{
WriteBufferFromOwnString buf;
write(buf);
return buf.str();
}
template <typename YearT>
OrdinalDate<YearT>::OrdinalDate(YearT year, uint16_t day_of_year)
: year_(year)
, day_of_year_(day_of_year)
{
if (day_of_year < 1 || day_of_year > (gd::is_leap_year(year) ? 366 : 365))
{
throw Exception(
"Invalid ordinal date: " + toString(year) + "-" + toString(day_of_year),
ErrorCodes::LOGICAL_ERROR);
}
}
template <typename YearT>
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> *>
OrdinalDate<YearT>::OrdinalDate(T mjd)
{
const auto a = mjd + 678575;
const auto quad_cent = gd::div(a, 146097);
const auto b = gd::mod(a, 146097);
const auto cent = gd::min(gd::div(b, 36524), 3);
const auto c = b - cent * 36524;
const auto quad = gd::div(c, 1461);
const auto d = gd::mod(c, 1461);
const auto y = gd::min(gd::div(d, 365), 3);
day_of_year_ = d - y * 365 + 1;
year_ = quad_cent * 400 + cent * 100 + quad * 4 + y + 1;
}
template <typename YearT>
template <typename T, std::enable_if_t<wide::IntegralConcept<T>()> *>
T OrdinalDate<YearT>::toModifiedJulianDay() const noexcept
{
const auto y = year_ - 1;
return day_of_year_
+ 365 * y
+ gd::div(y, 4)
- gd::div(y, 100)
+ gd::div(y, 400)
- 678576;
}
inline MonthDay::MonthDay(uint8_t month, uint8_t day_of_month)
: month_(month)
, day_of_month_(day_of_month)
{
if (month < 1 || month > 12)
throw Exception(
"Invalid month: " + DB::toString(month),
ErrorCodes::LOGICAL_ERROR);
/* We can't validate day_of_month here, because we don't know if
* it's a leap year. */
}
inline MonthDay::MonthDay(bool is_leap_year, uint16_t day_of_year)
{
if (day_of_year < 1 || day_of_year > (is_leap_year ? 366 : 365))
throw Exception(
std::string("Invalid day of year: ") +
(is_leap_year ? "leap, " : "non-leap, ") + DB::toString(day_of_year),
ErrorCodes::LOGICAL_ERROR);
month_ = 1;
uint16_t d = day_of_year;
while (true)
{
const auto len = gd::monthLength(is_leap_year, month_);
if (d <= len)
break;
month_++;
d -= len;
}
day_of_month_ = d;
}
inline uint16_t MonthDay::dayOfYear(bool is_leap_year) const
{
if (day_of_month_ < 1 || day_of_month_ > gd::monthLength(is_leap_year, month_))
{
throw Exception(
std::string("Invalid day of month: ") +
(is_leap_year ? "leap, " : "non-leap, ") + DB::toString(month_) +
"-" + DB::toString(day_of_month_),
ErrorCodes::LOGICAL_ERROR);
}
const auto k = month_ <= 2 ? 0 : is_leap_year ? -1 : -2;
return (367 * month_ - 362) / 12 + k + day_of_month_;
}
}
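The header above converts a Gregorian date to a Modified Julian Day in two steps: month and day to day of year, then ordinal date to MJD using floored division. The following is a compact sketch of the same arithmetic, with the constants and closed-form expressions copied from `MonthDay::dayOfYear` and `OrdinalDate::toModifiedJulianDay`. The free functions and their names (`floorDiv`, `floorMod`, `isLeapYear`) are illustrative assumptions, there is no input validation, and `floorDiv` assumes a positive divisor.

```cpp
#include <cassert>
#include <cstdint>

/// Floored division: rounds toward negative infinity. Assumes y > 0.
constexpr int64_t floorDiv(int64_t x, int64_t y)
{
    return x / y - ((x % y != 0 && x < 0) ? 1 : 0);
}

/// Modulus paired with floorDiv: floorDiv(x, y) * y + floorMod(x, y) == x.
constexpr int64_t floorMod(int64_t x, int64_t y)
{
    return x - floorDiv(x, y) * y;
}

constexpr bool isLeapYear(int64_t year)
{
    return year % 4 == 0 && (year % 400 == 0 || year % 100 != 0);
}

/// 1-based day of year, using the closed form from MonthDay::dayOfYear.
constexpr int64_t dayOfYear(int64_t year, int64_t month, int64_t day)
{
    const int64_t k = month <= 2 ? 0 : (isLeapYear(year) ? -1 : -2);
    return (367 * month - 362) / 12 + k + day;
}

/// Proleptic Gregorian date to Modified Julian Day (MJD 0 is 1858-11-17).
constexpr int64_t toModifiedJulianDay(int64_t year, int64_t month, int64_t day)
{
    const int64_t y = year - 1;
    return dayOfYear(year, month, day) + 365 * y
        + floorDiv(y, 4) - floorDiv(y, 100) + floorDiv(y, 400) - 678576;
}
```

Floored rather than truncating division keeps the leap-day terms correct when `y` is zero or negative, which is the reason the header defines its own `gd::div` and `gd::mod`.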


@ -7,12 +7,27 @@
namespace DB
{
struct FirstSignificantSubdomainDefaultLookup
{
bool operator()(const char *src, size_t len) const
{
return tldLookup::isValid(src, len);
}
};
template <bool without_www>
struct ExtractFirstSignificantSubdomain
{
static size_t getReserveLengthForElement() { return 10; }
static void execute(const Pos data, const size_t size, Pos & res_data, size_t & res_size, Pos * out_domain_end = nullptr)
{
FirstSignificantSubdomainDefaultLookup lookup;
return execute(lookup, data, size, res_data, res_size, out_domain_end);
}
template <class Lookup>
static void execute(const Lookup & lookup, const Pos data, const size_t size, Pos & res_data, size_t & res_size, Pos * out_domain_end = nullptr)
{
res_data = data;
res_size = 0;
@ -65,7 +80,7 @@ struct ExtractFirstSignificantSubdomain
end_of_level_domain = end;
}
if (tldLookup::isValid(last_3_periods[1] + 1, end_of_level_domain - last_3_periods[1] - 1) != nullptr)
if (lookup(last_3_periods[1] + 1, end_of_level_domain - last_3_periods[1] - 1))
{
res_data += last_3_periods[2] + 1 - begin;
res_size = last_3_periods[1] - last_3_periods[2] - 1;


@ -0,0 +1,112 @@
#pragma once
#include <Functions/FunctionFactory.h>
#include <Functions/URL/FunctionsURL.h>
#include <Functions/FunctionHelpers.h>
#include <DataTypes/DataTypeString.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnFixedString.h>
#include <Common/TLDListsHolder.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
struct FirstSignificantSubdomainCustomtLookup
{
const TLDList & tld_list;
FirstSignificantSubdomainCustomtLookup(const std::string & tld_list_name)
: tld_list(TLDListsHolder::getInstance().getTldList(tld_list_name))
{
}
bool operator()(const char *pos, size_t len) const
{
return tld_list.has(StringRef{pos, len});
}
};
template <typename Extractor, typename Name>
class FunctionCutToFirstSignificantSubdomainCustomImpl : public IFunction
{
public:
static constexpr auto name = Name::name;
static FunctionPtr create(const Context &) { return std::make_shared<FunctionCutToFirstSignificantSubdomainCustomImpl>(); }
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 2; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (!isString(arguments[0].type))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type {} of first argument of function {}. Must be String.",
arguments[0].type->getName(), getName());
if (!isString(arguments[1].type))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type {} of second argument (TLD_list_name) of function {}. Must be String.",
arguments[1].type->getName(), getName());
const auto * column = arguments[1].column.get();
if (!column || !checkAndGetColumnConstStringOrFixedString(column))
throw Exception(ErrorCodes::ILLEGAL_COLUMN,
"The second argument of function {} should be a constant string with the name of the custom TLD list",
getName());
return arguments[0].type;
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & /*result_type*/, size_t /*input_rows_count*/) const override
{
const ColumnConst * column_tld_list_name = checkAndGetColumnConstStringOrFixedString(arguments[1].column.get());
FirstSignificantSubdomainCustomtLookup tld_lookup(column_tld_list_name->getValue<String>());
/// FIXME: convertToFullColumnIfConst() is suboptimal
auto column = arguments[0].column->convertToFullColumnIfConst();
if (const ColumnString * col = checkAndGetColumn<ColumnString>(*column))
{
auto col_res = ColumnString::create();
vector(tld_lookup, col->getChars(), col->getOffsets(), col_res->getChars(), col_res->getOffsets());
return col_res;
}
else
throw Exception(
"Illegal column " + arguments[0].column->getName() + " of argument of function " + getName(),
ErrorCodes::ILLEGAL_COLUMN);
}
static void vector(FirstSignificantSubdomainCustomtLookup & tld_lookup,
const ColumnString::Chars & data, const ColumnString::Offsets & offsets,
ColumnString::Chars & res_data, ColumnString::Offsets & res_offsets)
{
size_t size = offsets.size();
res_offsets.resize(size);
res_data.reserve(size * Extractor::getReserveLengthForElement());
size_t prev_offset = 0;
size_t res_offset = 0;
/// Matched part.
Pos start;
size_t length;
for (size_t i = 0; i < size; ++i)
{
Extractor::execute(tld_lookup, reinterpret_cast<const char *>(&data[prev_offset]), offsets[i] - prev_offset - 1, start, length);
res_data.resize(res_data.size() + length + 1);
memcpySmallAllowReadWriteOverflow15(&res_data[res_offset], start, length);
res_offset += length + 1;
res_data[res_offset - 1] = 0;
res_offsets[i] = res_offset;
prev_offset = offsets[i];
}
}
};
}
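The change above turns the TLD check into a functor passed to a templated `execute`, so the same extraction logic can run against either the built-in gperf table or a named custom list. Below is a simplified standalone sketch of that idea. It assumes a bare host name rather than a full URL, and `SetLookup` and the free function `firstSignificantSubdomain` are hypothetical stand-ins, not the ClickHouse classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical custom lookup: a plain set of multi-label suffixes such as "co.uk".
// It has the same call shape as the lookup functors in the diff: (pointer, length) -> bool.
struct SetLookup
{
    std::unordered_set<std::string> suffixes;

    bool operator()(const char * pos, size_t len) const
    {
        return suffixes.count(std::string(pos, len)) != 0;
    }
};

// Simplified extractor (assumption: `host` is a bare host name, no scheme, port or path).
template <class Lookup>
std::string firstSignificantSubdomain(const Lookup & lookup, const std::string & host)
{
    const size_t npos = std::string::npos;

    // Positions of the last three dots, most recent first.
    size_t dots[3] = {npos, npos, npos};
    for (size_t i = 0; i < host.size(); ++i)
    {
        if (host[i] == '.')
        {
            dots[2] = dots[1];
            dots[1] = dots[0];
            dots[0] = i;
        }
    }

    if (dots[0] == npos)
        return "";                          // no dot at all
    if (dots[1] == npos)
        return host.substr(0, dots[0]);     // "example.com" -> "example"

    const size_t third_start = dots[2] == npos ? 0 : dots[2] + 1;
    const size_t suffix_start = dots[1] + 1;

    // If the last two labels form a known suffix, the significant label is one level up.
    if (lookup(host.data() + suffix_start, host.size() - suffix_start))
        return host.substr(third_start, dots[1] - third_start);

    return host.substr(suffix_start, dots[0] - suffix_start);
}
```

With `"co.uk"` in the set, `news.example.co.uk` yields `example`; with an empty set the same host yields `co`, which is the kind of difference a custom TLD list is meant to control.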


@ -1,6 +1,6 @@
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionStringToString.h>
#include "firstSignificantSubdomain.h"
#include "ExtractFirstSignificantSubdomain.h"
namespace DB


@ -0,0 +1,43 @@
#include <Functions/FunctionFactory.h>
#include "ExtractFirstSignificantSubdomain.h"
#include "FirstSignificantSubdomainCustomImpl.h"
namespace DB
{
template <bool without_www>
struct CutToFirstSignificantSubdomainCustom
{
static size_t getReserveLengthForElement() { return 15; }
static void execute(FirstSignificantSubdomainCustomtLookup & tld_lookup, const Pos data, const size_t size, Pos & res_data, size_t & res_size)
{
res_data = data;
res_size = 0;
Pos tmp_data;
size_t tmp_length;
Pos domain_end;
ExtractFirstSignificantSubdomain<without_www>::execute(tld_lookup, data, size, tmp_data, tmp_length, &domain_end);
if (tmp_length == 0)
return;
res_data = tmp_data;
res_size = domain_end - tmp_data;
}
};
struct NameCutToFirstSignificantSubdomainCustom { static constexpr auto name = "cutToFirstSignificantSubdomainCustom"; };
using FunctionCutToFirstSignificantSubdomainCustom = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<true>, NameCutToFirstSignificantSubdomainCustom>;
struct NameCutToFirstSignificantSubdomainCustomWithWWW { static constexpr auto name = "cutToFirstSignificantSubdomainCustomWithWWW"; };
using FunctionCutToFirstSignificantSubdomainCustomWithWWW = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<false>, NameCutToFirstSignificantSubdomainCustomWithWWW>;
void registerFunctionCutToFirstSignificantSubdomainCustom(FunctionFactory & factory)
{
factory.registerFunction<FunctionCutToFirstSignificantSubdomainCustom>();
factory.registerFunction<FunctionCutToFirstSignificantSubdomainCustomWithWWW>();
}
}
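Extractors such as `CutToFirstSignificantSubdomainCustom` are driven row by row by the `vector` helper, which relies on the string column layout visible in the diff: all rows share one byte array, each row is followed by a zero byte, and `offsets[i]` points just past the terminator of row `i`. A minimal sketch of that layout follows. `StringColumn`, `appendRow`, `rowAt`, and `transformRows` are hypothetical names for illustration, and the sketch copies rows into `std::string` instead of using the in-place `memcpy` of the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of a string column.
struct StringColumn
{
    std::vector<char> chars;       // row bytes, each row followed by '\0'
    std::vector<size_t> offsets;   // offsets[i] = position just past row i's terminator
};

void appendRow(StringColumn & column, const std::string & value)
{
    column.chars.insert(column.chars.end(), value.begin(), value.end());
    column.chars.push_back('\0');
    column.offsets.push_back(column.chars.size());
}

std::string rowAt(const StringColumn & column, size_t i)
{
    const size_t begin = i == 0 ? 0 : column.offsets[i - 1];
    const size_t size = column.offsets[i] - begin - 1;   // exclude the terminator
    return std::string(column.chars.data() + begin, size);
}

// Apply a per-row extractor and build the result column in the same layout.
template <class Extractor>
StringColumn transformRows(const StringColumn & input, Extractor extract)
{
    StringColumn result;
    for (size_t i = 0; i < input.offsets.size(); ++i)
        appendRow(result, extract(rowAt(input, i)));
    return result;
}
```

The `- 1` when computing a row's size corresponds to `offsets[i] - prev_offset - 1` in the loop above: the terminator is stored but is not part of the value.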


@ -1,12 +1,13 @@
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionStringToString.h>
#include "firstSignificantSubdomain.h"
#include "ExtractFirstSignificantSubdomain.h"
namespace DB
{
struct NameFirstSignificantSubdomain { static constexpr auto name = "firstSignificantSubdomain"; };
using FunctionFirstSignificantSubdomain = FunctionStringToString<ExtractSubstringImpl<ExtractFirstSignificantSubdomain<true>>, NameFirstSignificantSubdomain>;
void registerFunctionFirstSignificantSubdomain(FunctionFactory & factory)


@ -0,0 +1,18 @@
#include <Functions/FunctionFactory.h>
#include "ExtractFirstSignificantSubdomain.h"
#include "FirstSignificantSubdomainCustomImpl.h"
namespace DB
{
struct NameFirstSignificantSubdomainCustom { static constexpr auto name = "firstSignificantSubdomainCustom"; };
using FunctionFirstSignificantSubdomainCustom = FunctionCutToFirstSignificantSubdomainCustomImpl<ExtractFirstSignificantSubdomain<true>, NameFirstSignificantSubdomainCustom>;
void registerFunctionFirstSignificantSubdomainCustom(FunctionFactory & factory)
{
factory.registerFunction<FunctionFirstSignificantSubdomainCustom>();
}
}


@ -7,6 +7,7 @@ void registerFunctionProtocol(FunctionFactory & factory);
void registerFunctionDomain(FunctionFactory & factory);
void registerFunctionDomainWithoutWWW(FunctionFactory & factory);
void registerFunctionFirstSignificantSubdomain(FunctionFactory & factory);
void registerFunctionFirstSignificantSubdomainCustom(FunctionFactory & factory);
void registerFunctionTopLevelDomain(FunctionFactory & factory);
void registerFunctionPort(FunctionFactory & factory);
void registerFunctionPath(FunctionFactory & factory);
@ -20,6 +21,7 @@ void registerFunctionExtractURLParameterNames(FunctionFactory & factory);
void registerFunctionURLHierarchy(FunctionFactory & factory);
void registerFunctionURLPathHierarchy(FunctionFactory & factory);
void registerFunctionCutToFirstSignificantSubdomain(FunctionFactory & factory);
void registerFunctionCutToFirstSignificantSubdomainCustom(FunctionFactory & factory);
void registerFunctionCutWWW(FunctionFactory & factory);
void registerFunctionCutQueryString(FunctionFactory & factory);
void registerFunctionCutFragment(FunctionFactory & factory);
@ -34,6 +36,7 @@ void registerFunctionsURL(FunctionFactory & factory)
registerFunctionDomain(factory);
registerFunctionDomainWithoutWWW(factory);
registerFunctionFirstSignificantSubdomain(factory);
registerFunctionFirstSignificantSubdomainCustom(factory);
registerFunctionTopLevelDomain(factory);
registerFunctionPort(factory);
registerFunctionPath(factory);
@ -47,6 +50,7 @@ void registerFunctionsURL(FunctionFactory & factory)
registerFunctionURLHierarchy(factory);
registerFunctionURLPathHierarchy(factory);
registerFunctionCutToFirstSignificantSubdomain(factory);
registerFunctionCutToFirstSignificantSubdomainCustom(factory);
registerFunctionCutWWW(factory);
registerFunctionCutQueryString(factory);
registerFunctionCutFragment(factory);


@ -1,5 +1,7 @@
#pragma once
#include <cstdlib>
// Definition of the class generated by gperf from gperf/tldLookup.gperf
class TopLevelDomainLookupHash
{


@ -0,0 +1,237 @@
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnsNumber.h>
#include <Core/callOnTypeIndex.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/IDataType.h>
#include <Functions/IFunctionImpl.h>
#include <Functions/FunctionFactory.h>
#include <Functions/GregorianDate.h>
#include <IO/WriteBufferFromVector.h>
#include <IO/WriteHelpers.h>
namespace DB
{
namespace ErrorCodes
{
extern const int CANNOT_FORMAT_DATETIME;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
template <typename Name, typename FromDataType, bool nullOnErrors>
class ExecutableFunctionFromModifiedJulianDay : public IExecutableFunctionImpl
{
public:
String getName() const override
{
return Name::name;
}
ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override
{
using ColVecType = typename FromDataType::ColumnType;
const ColVecType * col_from = checkAndGetColumn<ColVecType>(arguments[0].column.get());
const typename ColVecType::Container & vec_from = col_from->getData();
auto col_to = ColumnString::create();
ColumnString::Chars & data_to = col_to->getChars();
ColumnString::Offsets & offsets_to = col_to->getOffsets();
data_to.resize(input_rows_count * (strlen("YYYY-MM-DD") + 1));
offsets_to.resize(input_rows_count);
ColumnUInt8::MutablePtr col_null_map_to;
ColumnUInt8::Container * vec_null_map_to [[maybe_unused]] = nullptr;
if constexpr (nullOnErrors)
{
col_null_map_to = ColumnUInt8::create(input_rows_count);
vec_null_map_to = &col_null_map_to->getData();
}
WriteBufferFromVector<ColumnString::Chars> write_buffer(data_to);
for (size_t i = 0; i < input_rows_count; ++i)
{
if constexpr (nullOnErrors)
{
try
{
const GregorianDate<> gd(vec_from[i]);
gd.write(write_buffer);
(*vec_null_map_to)[i] = false;
}
catch (const Exception & e)
{
if (e.code() == ErrorCodes::CANNOT_FORMAT_DATETIME)
(*vec_null_map_to)[i] = true;
else
throw;
}
writeChar(0, write_buffer);
offsets_to[i] = write_buffer.count();
}
else
{
const GregorianDate<> gd(vec_from[i]);
gd.write(write_buffer);
writeChar(0, write_buffer);
offsets_to[i] = write_buffer.count();
}
}
write_buffer.finalize();
if constexpr (nullOnErrors)
return ColumnNullable::create(std::move(col_to), std::move(col_null_map_to));
else
return col_to;
}
bool useDefaultImplementationForConstants() const override
{
return true;
}
};
template <typename Name, typename FromDataType, bool nullOnErrors>
class FunctionBaseFromModifiedJulianDay : public IFunctionBaseImpl
{
public:
explicit FunctionBaseFromModifiedJulianDay(DataTypes argument_types_, DataTypePtr return_type_)
: argument_types(std::move(argument_types_))
, return_type(std::move(return_type_)) {}
String getName() const override
{
return Name::name;
}
const DataTypes & getArgumentTypes() const override
{
return argument_types;
}
const DataTypePtr & getResultType() const override
{
return return_type;
}
ExecutableFunctionImplPtr prepare(const ColumnsWithTypeAndName &) const override
{
return std::make_unique<ExecutableFunctionFromModifiedJulianDay<Name, FromDataType, nullOnErrors>>();
}
bool isInjective(const ColumnsWithTypeAndName &) const override
{
return true;
}
bool hasInformationAboutMonotonicity() const override
{
return true;
}
Monotonicity getMonotonicityForRange(const IDataType &, const Field &, const Field &) const override
{
return Monotonicity(
true, // is_monotonic
true, // is_positive
true); // is_always_monotonic
}
private:
DataTypes argument_types;
DataTypePtr return_type;
};
template <typename Name, bool nullOnErrors>
class FromModifiedJulianDayOverloadResolver : public IFunctionOverloadResolverImpl
{
public:
static constexpr auto name = Name::name;
static FunctionOverloadResolverImplPtr create(const Context &)
{
return std::make_unique<FromModifiedJulianDayOverloadResolver<Name, nullOnErrors>>();
}
String getName() const override
{
return Name::name;
}
FunctionBaseImplPtr build(const ColumnsWithTypeAndName & arguments, const DataTypePtr & return_type) const override
{
const DataTypePtr & from_type = arguments[0].type;
DataTypes argument_types = { from_type };
FunctionBaseImplPtr base;
auto call = [&](const auto & types) -> bool
{
using Types = std::decay_t<decltype(types)>;
using FromIntType = typename Types::RightType;
using FromDataType = DataTypeNumber<FromIntType>;
base = std::make_unique<FunctionBaseFromModifiedJulianDay<Name, FromDataType, nullOnErrors>>(argument_types, return_type);
return true;
};
bool built = callOnBasicType<void, true, false, false, false>(from_type->getTypeId(), call);
if (built)
return base;
/* When the argument is a NULL constant, the resulting
* function base will not be actually called but it
* will still be inspected. Returning a NULL pointer
* here causes a SEGV. So we must somehow create a
* dummy implementation and return it.
*/
if (WhichDataType(from_type).isNullable()) // Nullable(Nothing)
return std::make_unique<FunctionBaseFromModifiedJulianDay<Name, DataTypeInt32, nullOnErrors>>(argument_types, return_type);
else
// Should not happen.
throw Exception(
"The argument of function " + getName() + " must be integral", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
DataTypePtr getReturnType(const DataTypes & arguments) const override
{
if (!isInteger(arguments[0]))
{
throw Exception(
"The argument of function " + getName() + " must be integral", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
DataTypePtr base_type = std::make_shared<DataTypeString>();
if constexpr (nullOnErrors)
return std::make_shared<DataTypeNullable>(base_type);
else
return base_type;
}
size_t getNumberOfArguments() const override
{
return 1;
}
bool isInjective(const ColumnsWithTypeAndName &) const override
{
return true;
}
};
struct NameFromModifiedJulianDay
{
static constexpr auto name = "fromModifiedJulianDay";
};
struct NameFromModifiedJulianDayOrNull
{
static constexpr auto name = "fromModifiedJulianDayOrNull";
};
void registerFunctionFromModifiedJulianDay(FunctionFactory & factory)
{
factory.registerFunction<FromModifiedJulianDayOverloadResolver<NameFromModifiedJulianDay, false>>();
factory.registerFunction<FromModifiedJulianDayOverloadResolver<NameFromModifiedJulianDayOrNull, true>>();
}
}
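The conversion itself lives in `GregorianDate`, which is not shown in this part of the diff. As a rough illustration of what `fromModifiedJulianDay` and `toModifiedJulianDay` compute, here is a standalone sketch that uses a different, well-known technique (Howard Hinnant's days-from-civil arithmetic) rather than the ClickHouse implementation. It relies on two facts: Modified Julian Day 0 is 1858-11-17, and 1970-01-01 is Modified Julian Day 40587. All names here are hypothetical, and formatting of years outside 0000..9999 is not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

struct CivilDate
{
    int64_t year;
    unsigned month;
    unsigned day;
};

// Days since 1970-01-01 in the proleptic Gregorian calendar.
inline int64_t daysFromCivil(int64_t y, unsigned m, unsigned d)
{
    y -= m <= 2;                                                        // treat Jan/Feb as months of the previous year
    const int64_t era = (y >= 0 ? y : y - 399) / 400;                   // 400-year cycle
    const unsigned yoe = static_cast<unsigned>(y - era * 400);          // [0, 399]
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1; // [0, 365], year starts on 1 March
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;         // [0, 146096]
    return era * 146097 + static_cast<int64_t>(doe) - 719468;
}

inline int64_t toModifiedJulianDay(int64_t year, unsigned month, unsigned day)
{
    return daysFromCivil(year, month, day) + 40587;
}

inline CivilDate fromModifiedJulianDay(int64_t mjd)
{
    int64_t z = mjd - 40587 + 719468;                                   // days since 0000-03-01
    const int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const unsigned mp = (5 * doy + 2) / 153;                            // month index, March = 0
    const unsigned day = doy - (153 * mp + 2) / 5 + 1;
    const unsigned month = mp < 10 ? mp + 3 : mp - 9;
    const int64_t year = static_cast<int64_t>(yoe) + era * 400 + (month <= 2);
    return {year, month, day};
}

// Formats as YYYY-MM-DD (assumption: year is within 0000..9999).
inline std::string formatDate(const CivilDate & date)
{
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%04d-%02u-%02u", static_cast<int>(date.year), date.month, date.day);
    return buffer;
}
```

Because each day number maps to exactly one date and larger day numbers map to later dates, the mapping is injective and monotonic, which is what `isInjective` and `getMonotonicityForRange` declare above.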


@ -18,6 +18,7 @@ void registerFunctionToMonday(FunctionFactory &);
void registerFunctionToISOWeek(FunctionFactory &);
void registerFunctionToISOYear(FunctionFactory &);
void registerFunctionToCustomWeek(FunctionFactory &);
void registerFunctionToModifiedJulianDay(FunctionFactory &);
void registerFunctionToStartOfMonth(FunctionFactory &);
void registerFunctionToStartOfQuarter(FunctionFactory &);
void registerFunctionToStartOfYear(FunctionFactory &);
@ -65,6 +66,7 @@ void registerFunctionSubtractYears(FunctionFactory &);
void registerFunctionDateDiff(FunctionFactory &);
void registerFunctionToTimeZone(FunctionFactory &);
void registerFunctionFormatDateTime(FunctionFactory &);
void registerFunctionFromModifiedJulianDay(FunctionFactory &);
void registerFunctionDateTrunc(FunctionFactory &);
void registerFunctionsDateTime(FunctionFactory & factory)
@ -83,6 +85,7 @@ void registerFunctionsDateTime(FunctionFactory & factory)
registerFunctionToISOWeek(factory);
registerFunctionToISOYear(factory);
registerFunctionToCustomWeek(factory);
registerFunctionToModifiedJulianDay(factory);
registerFunctionToStartOfMonth(factory);
registerFunctionToStartOfQuarter(factory);
registerFunctionToStartOfYear(factory);
@ -131,6 +134,7 @@ void registerFunctionsDateTime(FunctionFactory & factory)
registerFunctionDateDiff(factory);
registerFunctionToTimeZone(factory);
registerFunctionFormatDateTime(factory);
registerFunctionFromModifiedJulianDay(factory);
registerFunctionDateTrunc(factory);
}


@ -0,0 +1,234 @@
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnFixedString.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnsNumber.h>
#include <DataTypes/IDataType.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeNullable.h>
#include <Functions/IFunctionImpl.h>
#include <Functions/FunctionFactory.h>
#include <Functions/GregorianDate.h>
#include <IO/ReadBufferFromMemory.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int CANNOT_PARSE_INPUT_ASSERTION_FAILED;
extern const int CANNOT_PARSE_DATE;
}
template <typename Name, typename ToDataType, bool nullOnErrors>
class ExecutableFunctionToModifiedJulianDay : public IExecutableFunctionImpl
{
public:
String getName() const override
{
return Name::name;
}
ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override
{
const IColumn * col_from = arguments[0].column.get();
const ColumnString * col_from_string = checkAndGetColumn<ColumnString>(col_from);
const ColumnFixedString * col_from_fixed_string = checkAndGetColumn<ColumnFixedString>(col_from);
const ColumnString::Chars * chars = nullptr;
const IColumn::Offsets * offsets = nullptr;
size_t fixed_string_size = 0;
if (col_from_string)
{
chars = &col_from_string->getChars();
offsets = &col_from_string->getOffsets();
}
else if (col_from_fixed_string)
{
chars = &col_from_fixed_string->getChars();
fixed_string_size = col_from_fixed_string->getN();
}
else
{
throw Exception("Illegal column " + col_from->getName()
+ " of first argument of function " + Name::name,
ErrorCodes::ILLEGAL_COLUMN);
}
using ColVecTo = typename ToDataType::ColumnType;
typename ColVecTo::MutablePtr col_to = ColVecTo::create(input_rows_count);
typename ColVecTo::Container & vec_to = col_to->getData();
ColumnUInt8::MutablePtr col_null_map_to;
ColumnUInt8::Container * vec_null_map_to [[maybe_unused]] = nullptr;
if constexpr (nullOnErrors)
{
col_null_map_to = ColumnUInt8::create(input_rows_count);
vec_null_map_to = &col_null_map_to->getData();
}
size_t current_offset = 0;
for (size_t i = 0; i < input_rows_count; ++i)
{
const size_t next_offset = offsets ? (*offsets)[i] : current_offset + fixed_string_size;
const size_t string_size = offsets ? next_offset - current_offset - 1 : fixed_string_size;
ReadBufferFromMemory read_buffer(&(*chars)[current_offset], string_size);
current_offset = next_offset;
if constexpr (nullOnErrors)
{
try
{
const GregorianDate<> date(read_buffer);
vec_to[i] = date.toModifiedJulianDay<typename ToDataType::FieldType>();
(*vec_null_map_to)[i] = false;
}
catch (const Exception & e)
{
if (e.code() == ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED || e.code() == ErrorCodes::CANNOT_PARSE_DATE)
(*vec_null_map_to)[i] = true;
else
throw;
}
}
else
{
const GregorianDate<> date(read_buffer);
vec_to[i] = date.toModifiedJulianDay<typename ToDataType::FieldType>();
}
}
if constexpr (nullOnErrors)
return ColumnNullable::create(std::move(col_to), std::move(col_null_map_to));
else
return col_to;
}
bool useDefaultImplementationForConstants() const override
{
return true;
}
};
template <typename Name, typename ToDataType, bool nullOnErrors>
class FunctionBaseToModifiedJulianDay : public IFunctionBaseImpl
{
public:
explicit FunctionBaseToModifiedJulianDay(DataTypes argument_types_, DataTypePtr return_type_)
: argument_types(std::move(argument_types_))
, return_type(std::move(return_type_)) {}
String getName() const override
{
return Name::name;
}
const DataTypes & getArgumentTypes() const override
{
return argument_types;
}
const DataTypePtr & getResultType() const override
{
return return_type;
}
ExecutableFunctionImplPtr prepare(const ColumnsWithTypeAndName &) const override
{
return std::make_unique<ExecutableFunctionToModifiedJulianDay<Name, ToDataType, nullOnErrors>>();
}
bool isInjective(const ColumnsWithTypeAndName &) const override
{
return true;
}
bool hasInformationAboutMonotonicity() const override
{
return true;
}
Monotonicity getMonotonicityForRange(const IDataType &, const Field &, const Field &) const override
{
return Monotonicity(
true, // is_monotonic
true, // is_positive
true); // is_always_monotonic
}
private:
DataTypes argument_types;
DataTypePtr return_type;
};
template <typename Name, typename ToDataType, bool nullOnErrors>
class ToModifiedJulianDayOverloadResolver : public IFunctionOverloadResolverImpl
{
public:
static constexpr auto name = Name::name;
static FunctionOverloadResolverImplPtr create(const Context &)
{
return std::make_unique<ToModifiedJulianDayOverloadResolver<Name, ToDataType, nullOnErrors>>();
}
String getName() const override
{
return Name::name;
}
FunctionBaseImplPtr build(const ColumnsWithTypeAndName & arguments, const DataTypePtr & return_type) const override
{
DataTypes argument_types = { arguments[0].type };
return std::make_unique<FunctionBaseToModifiedJulianDay<Name, ToDataType, nullOnErrors>>(argument_types, return_type);
}
DataTypePtr getReturnType(const DataTypes & arguments) const override
{
if (!isStringOrFixedString(arguments[0]))
{
throw Exception(
"The argument of function " + getName() + " must be String or FixedString", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
DataTypePtr base_type = std::make_shared<ToDataType>();
if constexpr (nullOnErrors)
{
return std::make_shared<DataTypeNullable>(base_type);
}
else
{
return base_type;
}
}
size_t getNumberOfArguments() const override
{
return 1;
}
bool isInjective(const ColumnsWithTypeAndName &) const override
{
return true;
}
};
struct NameToModifiedJulianDay
{
static constexpr auto name = "toModifiedJulianDay";
};
struct NameToModifiedJulianDayOrNull
{
static constexpr auto name = "toModifiedJulianDayOrNull";
};
void registerFunctionToModifiedJulianDay(FunctionFactory & factory)
{
factory.registerFunction<ToModifiedJulianDayOverloadResolver<NameToModifiedJulianDay, DataTypeInt32, false>>();
factory.registerFunction<ToModifiedJulianDayOverloadResolver<NameToModifiedJulianDayOrNull, DataTypeInt32, true>>();
}
}
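Both new files use the same pattern for their `...OrNull` variants: a `bool nullOnErrors` template parameter selects, at compile time, between letting a parse error propagate and recording it in a null map. A minimal sketch of that pattern follows. `ParseError`, `parseNumber`, `ConvertedColumn`, and `convertAll` are hypothetical stand-ins for the ClickHouse exception codes, `GregorianDate` parsing, and column types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the specific error codes the real code checks for.
struct ParseError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Hypothetical strict parser: one to nine decimal digits, nothing else.
inline int32_t parseNumber(const std::string & text)
{
    if (text.empty() || text.size() > 9)
        throw ParseError("cannot parse: " + text);

    int32_t value = 0;
    for (char c : text)
    {
        if (c < '0' || c > '9')
            throw ParseError("cannot parse: " + text);
        value = value * 10 + (c - '0');
    }
    return value;
}

struct ConvertedColumn
{
    std::vector<int32_t> values;
    std::vector<uint8_t> null_map;   // filled only in the null-on-errors variant
};

template <bool null_on_errors>
ConvertedColumn convertAll(const std::vector<std::string> & rows)
{
    ConvertedColumn result;
    result.values.resize(rows.size());
    if constexpr (null_on_errors)
        result.null_map.resize(rows.size());

    for (size_t i = 0; i < rows.size(); ++i)
    {
        if constexpr (null_on_errors)
        {
            try
            {
                result.values[i] = parseNumber(rows[i]);
                result.null_map[i] = 0;
            }
            catch (const ParseError &)
            {
                // Only the expected parse failure becomes NULL; anything else propagates.
                result.null_map[i] = 1;
            }
        }
        else
        {
            result.values[i] = parseNumber(rows[i]);
        }
    }
    return result;
}
```

Catching only the expected error type mirrors the `e.code()` checks above: an unrelated failure should still surface instead of silently turning into NULL.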


@ -80,6 +80,7 @@ SRCS(
URL/cutQueryString.cpp
URL/cutQueryStringAndFragment.cpp
URL/cutToFirstSignificantSubdomain.cpp
URL/cutToFirstSignificantSubdomainCustom.cpp
URL/cutURLParameter.cpp
URL/cutWWW.cpp
URL/decodeURLComponent.cpp
@ -89,6 +90,7 @@ SRCS(
URL/extractURLParameterNames.cpp
URL/extractURLParameters.cpp
URL/firstSignificantSubdomain.cpp
URL/firstSignificantSubdomainCustom.cpp
URL/fragment.cpp
URL/netloc.cpp
URL/path.cpp
@ -247,6 +249,7 @@ SRCS(
formatReadableTimeDelta.cpp
formatRow.cpp
formatString.cpp
fromModifiedJulianDay.cpp
fromUnixTimestamp64Micro.cpp
fromUnixTimestamp64Milli.cpp
fromUnixTimestamp64Nano.cpp
@ -455,6 +458,7 @@ SRCS(
toISOYear.cpp
toLowCardinality.cpp
toMinute.cpp
toModifiedJulianDay.cpp
toMonday.cpp
toMonth.cpp
toNullable.cpp


@ -7,6 +7,10 @@
# include <Storages/StorageS3Settings.h>
# include <aws/core/auth/AWSCredentialsProvider.h>
# include <aws/core/auth/AWSCredentialsProviderChain.h>
# include <aws/core/auth/STSCredentialsProvider.h>
# include <aws/core/client/DefaultRetryStrategy.h>
# include <aws/core/platform/Environment.h>
# include <aws/core/utils/logging/LogMacros.h>
# include <aws/core/utils/logging/LogSystemInterface.h>
# include <aws/s3/S3Client.h>
@ -85,15 +89,107 @@ private:
std::unordered_map<String, Poco::Logger *> tag_loggers;
};
class S3CredentialsProviderChain : public Aws::Auth::AWSCredentialsProviderChain
{
public:
explicit S3CredentialsProviderChain(const DB::S3::PocoHTTPClientConfiguration & configuration, const Aws::Auth::AWSCredentials & credentials, bool use_environment_credentials)
{
if (use_environment_credentials)
{
const DB::RemoteHostFilter & remote_host_filter = configuration.remote_host_filter;
const unsigned int s3_max_redirects = configuration.s3_max_redirects;
static const char AWS_ECS_CONTAINER_CREDENTIALS_RELATIVE_URI[] = "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI";
static const char AWS_ECS_CONTAINER_CREDENTIALS_FULL_URI[] = "AWS_CONTAINER_CREDENTIALS_FULL_URI";
static const char AWS_ECS_CONTAINER_AUTHORIZATION_TOKEN[] = "AWS_CONTAINER_AUTHORIZATION_TOKEN";
static const char AWS_EC2_METADATA_DISABLED[] = "AWS_EC2_METADATA_DISABLED";
auto * logger = &Poco::Logger::get("S3CredentialsProviderChain");
/// The only difference from DefaultAWSCredentialsProviderChain::DefaultAWSCredentialsProviderChain()
/// is that this chain uses custom ClientConfiguration.
AddProvider(std::make_shared<Aws::Auth::EnvironmentAWSCredentialsProvider>());
AddProvider(std::make_shared<Aws::Auth::ProfileConfigFileAWSCredentialsProvider>());
AddProvider(std::make_shared<Aws::Auth::STSAssumeRoleWebIdentityCredentialsProvider>());
/// ECS TaskRole credentials are only available when the corresponding environment variable is set.
const auto relative_uri = Aws::Environment::GetEnv(AWS_ECS_CONTAINER_CREDENTIALS_RELATIVE_URI);
LOG_DEBUG(logger, "The environment variable value {} is {}", AWS_ECS_CONTAINER_CREDENTIALS_RELATIVE_URI,
relative_uri);
const auto absolute_uri = Aws::Environment::GetEnv(AWS_ECS_CONTAINER_CREDENTIALS_FULL_URI);
LOG_DEBUG(logger, "The environment variable value {} is {}", AWS_ECS_CONTAINER_CREDENTIALS_FULL_URI,
absolute_uri);
const auto ec2_metadata_disabled = Aws::Environment::GetEnv(AWS_EC2_METADATA_DISABLED);
LOG_DEBUG(logger, "The environment variable value {} is {}", AWS_EC2_METADATA_DISABLED,
ec2_metadata_disabled);
if (!relative_uri.empty())
{
AddProvider(std::make_shared<Aws::Auth::TaskRoleCredentialsProvider>(relative_uri.c_str()));
LOG_INFO(logger, "Added ECS metadata service credentials provider with relative path: [{}] to the provider chain.",
relative_uri);
}
else if (!absolute_uri.empty())
{
const auto token = Aws::Environment::GetEnv(AWS_ECS_CONTAINER_AUTHORIZATION_TOKEN);
AddProvider(std::make_shared<Aws::Auth::TaskRoleCredentialsProvider>(absolute_uri.c_str(), token.c_str()));
/// DO NOT log the value of the authorization token for security purposes.
LOG_INFO(logger, "Added ECS credentials provider with URI: [{}] to the provider chain with a{} authorization token.",
absolute_uri, token.empty() ? "n empty" : " non-empty");
}
else if (Aws::Utils::StringUtils::ToLower(ec2_metadata_disabled.c_str()) != "true")
{
Aws::Client::ClientConfiguration aws_client_configuration;
/// See MakeDefaultHttpResourceClientConfiguration().
/// This is part of EC2 metadata client, but unfortunately it can't be accessed from outside
/// of contrib/aws/aws-cpp-sdk-core/source/internal/AWSHttpResourceClient.cpp
aws_client_configuration.maxConnections = 2;
aws_client_configuration.scheme = Aws::Http::Scheme::HTTP;
/// Explicitly set the proxy settings to empty/zero to avoid relying on defaults that could potentially change
/// in the future.
aws_client_configuration.proxyHost = "";
aws_client_configuration.proxyUserName = "";
aws_client_configuration.proxyPassword = "";
aws_client_configuration.proxyPort = 0;
/// EC2MetadataService throttles by delaying the response, so the service client should set a large read timeout.
/// The EC2MetadataService delay is on the order of seconds, so it only makes sense to retry after a couple of seconds.
aws_client_configuration.connectTimeoutMs = 1000;
aws_client_configuration.requestTimeoutMs = 1000;
aws_client_configuration.retryStrategy = std::make_shared<Aws::Client::DefaultRetryStrategy>(1, 1000);
DB::S3::PocoHTTPClientConfiguration client_configuration(aws_client_configuration, remote_host_filter, s3_max_redirects);
auto ec2_metadata_client = std::make_shared<Aws::Internal::EC2MetadataClient>(client_configuration);
auto config_loader = std::make_shared<Aws::Config::EC2InstanceProfileConfigLoader>(ec2_metadata_client);
AddProvider(std::make_shared<Aws::Auth::InstanceProfileCredentialsProvider>(config_loader));
LOG_INFO(logger, "Added EC2 metadata service credentials provider to the provider chain.");
}
}
AddProvider(std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(credentials));
}
};
class S3AuthSigner : public Aws::Client::AWSAuthV4Signer
{
public:
S3AuthSigner(
const Aws::Client::ClientConfiguration & client_configuration,
const Aws::Auth::AWSCredentials & credentials,
const DB::HeaderCollection & headers_)
const DB::HeaderCollection & headers_,
bool use_environment_credentials)
: Aws::Client::AWSAuthV4Signer(
std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(credentials),
std::make_shared<S3CredentialsProviderChain>(
static_cast<const DB::S3::PocoHTTPClientConfiguration &>(client_configuration),
credentials,
use_environment_credentials),
"s3",
client_configuration.region,
Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never,
@ -164,6 +260,7 @@ namespace S3
bool is_virtual_hosted_style,
const String & access_key_id,
const String & secret_access_key,
bool use_environment_credentials,
const RemoteHostFilter & remote_host_filter,
unsigned int s3_max_redirects)
{
@ -172,7 +269,13 @@ namespace S3
if (!endpoint.empty())
cfg.endpointOverride = endpoint;
return create(cfg, is_virtual_hosted_style, access_key_id, secret_access_key, remote_host_filter, s3_max_redirects);
return create(cfg,
is_virtual_hosted_style,
access_key_id,
secret_access_key,
use_environment_credentials,
remote_host_filter,
s3_max_redirects);
}
std::shared_ptr<Aws::S3::S3Client> ClientFactory::create( // NOLINT
@ -180,6 +283,7 @@ namespace S3
bool is_virtual_hosted_style,
const String & access_key_id,
const String & secret_access_key,
bool use_environment_credentials,
const RemoteHostFilter & remote_host_filter,
unsigned int s3_max_redirects)
{
@ -190,7 +294,10 @@ namespace S3
client_configuration.updateSchemeAndRegion();
return std::make_shared<Aws::S3::S3Client>(
credentials, // Aws credentials.
std::make_shared<S3CredentialsProviderChain>(
client_configuration,
credentials,
use_environment_credentials), // AWS credentials provider.
std::move(client_configuration), // Client configuration.
Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never, // Sign policy.
is_virtual_hosted_style || cfg.endpointOverride.empty() // Use virtual addressing if endpoint is not specified.
@ -203,6 +310,7 @@ namespace S3
const String & access_key_id,
const String & secret_access_key,
HeaderCollection headers,
bool use_environment_credentials,
const RemoteHostFilter & remote_host_filter,
unsigned int s3_max_redirects)
{
@ -214,8 +322,10 @@ namespace S3
client_configuration.updateSchemeAndRegion();
Aws::Auth::AWSCredentials credentials(access_key_id, secret_access_key);
auto auth_signer = std::make_shared<S3AuthSigner>(client_configuration, std::move(credentials), std::move(headers), use_environment_credentials);
return std::make_shared<Aws::S3::S3Client>(
std::make_shared<S3AuthSigner>(client_configuration, std::move(credentials), std::move(headers)),
std::move(auth_signer),
std::move(client_configuration), // Client configuration.
is_virtual_hosted_style || client_configuration.endpointOverride.empty() // Use virtual addressing only if endpoint is not specified.
);
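The ordering inside `S3CredentialsProviderChain` matters: environment-style providers are added first and only when `use_environment_credentials` is set, while the explicitly configured keys are always added last as a fallback. The sketch below models that ordering without the AWS SDK. It assumes the chain returns the first non-empty credentials it finds, and `Credentials`, `ProviderChain`, and `makeChain` are hypothetical types for illustration only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Credentials
{
    std::string access_key_id;
    std::string secret_access_key;

    bool empty() const { return access_key_id.empty() || secret_access_key.empty(); }
};

class ProviderChain
{
public:
    using Provider = std::function<Credentials()>;

    void addProvider(Provider provider) { providers.push_back(std::move(provider)); }

    // Assumption: the first provider that yields non-empty credentials wins.
    Credentials resolve() const
    {
        for (const auto & provider : providers)
        {
            Credentials found = provider();
            if (!found.empty())
                return found;
        }
        return {};
    }

private:
    std::vector<Provider> providers;
};

// Same ordering as the constructor in the diff: environment-style sources first
// (only when enabled), explicitly configured keys last.
inline ProviderChain makeChain(
    bool use_environment_credentials,
    ProviderChain::Provider environment_source,
    Credentials explicit_keys)
{
    ProviderChain chain;
    if (use_environment_credentials)
        chain.addProvider(std::move(environment_source));
    chain.addProvider([explicit_keys] { return explicit_keys; });
    return chain;
}
```

Under that assumption, enabling the flag lets credentials found in the environment take precedence over the configured keys, and the configured keys are still used when the environment provides nothing.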


@ -36,6 +36,7 @@ public:
bool is_virtual_hosted_style,
const String & access_key_id,
const String & secret_access_key,
bool use_environment_credentials,
const RemoteHostFilter & remote_host_filter,
unsigned int s3_max_redirects);
@ -44,6 +45,7 @@ public:
bool is_virtual_hosted_style,
const String & access_key_id,
const String & secret_access_key,
bool use_environment_credentials,
const RemoteHostFilter & remote_host_filter,
unsigned int s3_max_redirects);
@ -53,6 +55,7 @@ public:
const String & access_key_id,
const String & secret_access_key,
HeaderCollection headers,
bool use_environment_credentials,
const RemoteHostFilter & remote_host_filter,
unsigned int s3_max_redirects);


@ -43,11 +43,13 @@ WriteBufferFromS3::WriteBufferFromS3(
const String & key_,
size_t minimum_upload_part_size_,
bool is_multipart_,
std::optional<std::map<String, String>> object_metadata_,
size_t buffer_size_)
: BufferWithOwnMemory<WriteBuffer>(buffer_size_, nullptr, 0)
, is_multipart(is_multipart_)
, bucket(bucket_)
, key(key_)
, object_metadata(std::move(object_metadata_))
, client_ptr(std::move(client_ptr_))
, minimum_upload_part_size{minimum_upload_part_size_}
, temporary_buffer{std::make_unique<WriteBufferFromOwnString>()}
@ -116,6 +118,8 @@ void WriteBufferFromS3::initiate()
Aws::S3::Model::CreateMultipartUploadRequest req;
req.SetBucket(bucket);
req.SetKey(key);
if (object_metadata.has_value())
req.SetMetadata(object_metadata.value());
auto outcome = client_ptr->CreateMultipartUpload(req);
@ -217,6 +221,8 @@ void WriteBufferFromS3::complete()
Aws::S3::Model::PutObjectRequest req;
req.SetBucket(bucket);
req.SetKey(key);
if (object_metadata.has_value())
req.SetMetadata(object_metadata.value());
/// This could be improved using an adapter to WriteBuffer.
const std::shared_ptr<Aws::IOStream> input_data = Aws::MakeShared<Aws::StringStream>("temporary buffer", temporary_buffer->str());


@ -28,6 +28,7 @@ private:
String bucket;
String key;
std::optional<std::map<String, String>> object_metadata;
std::shared_ptr<Aws::S3::S3Client> client_ptr;
size_t minimum_upload_part_size;
std::unique_ptr<WriteBufferFromOwnString> temporary_buffer;
@ -47,6 +48,7 @@ public:
const String & key_,
size_t minimum_upload_part_size_,
bool is_multipart,
std::optional<std::map<String, String>> object_metadata_ = std::nullopt,
size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE);
void nextImpl() override;


@ -29,7 +29,12 @@
#include <IO/DoubleConverter.h>
#include <IO/WriteBufferFromString.h>
#include <dragonbox/dragonbox_to_chars.h>
/// There is no dragonbox in Arcadia
#if !defined(ARCADIA_BUILD)
# include <dragonbox/dragonbox_to_chars.h>
#else
# include <ryu/ryu.h>
#endif
#include <Formats/FormatSettings.h>
@ -228,14 +233,22 @@ inline size_t writeFloatTextFastPath(T x, char * buffer)
if (DecomposedFloat64(x).is_inside_int64())
result = itoa(Int64(x), buffer) - buffer;
else
#if !defined(ARCADIA_BUILD)
result = jkj::dragonbox::to_chars_n(x, buffer) - buffer;
#else
result = d2s_buffered_n(x, buffer);
#endif
}
else
{
if (DecomposedFloat32(x).is_inside_int32())
result = itoa(Int32(x), buffer) - buffer;
else
#if !defined(ARCADIA_BUILD)
result = jkj::dragonbox::to_chars_n(x, buffer) - buffer;
#else
result = f2s_buffered_n(x, buffer);
#endif
}
if (result <= 0)


@ -12,6 +12,7 @@
#include <Common/Stopwatch.h>
#include <Common/formatReadable.h>
#include <Common/thread_local_rng.h>
#include <Common/ZooKeeper/TestKeeperStorage.h>
#include <Compression/ICompressionCodec.h>
#include <Core/BackgroundSchedulePool.h>
#include <Formats/FormatFactory.h>
@ -304,6 +305,8 @@ struct ContextShared
mutable zkutil::ZooKeeperPtr zookeeper; /// Client for ZooKeeper.
ConfigurationPtr zookeeper_config; /// Stores zookeeper configs
mutable std::mutex test_keeper_storage_mutex;
mutable std::shared_ptr<zkutil::TestKeeperStorage> test_keeper_storage;
mutable std::mutex auxiliary_zookeepers_mutex;
mutable std::map<String, zkutil::ZooKeeperPtr> auxiliary_zookeepers; /// Map for auxiliary ZooKeeper clients.
ConfigurationPtr auxiliary_zookeepers_config; /// Stores auxiliary zookeepers configs
@ -442,6 +445,8 @@ struct ContextShared
/// Stop trace collector if any
trace_collector.reset();
/// Stop test_keeper storage
test_keeper_storage.reset();
}
bool hasTraceCollector() const
@ -1505,6 +1510,15 @@ zkutil::ZooKeeperPtr Context::getZooKeeper() const
return shared->zookeeper;
}
std::shared_ptr<zkutil::TestKeeperStorage> & Context::getTestKeeperStorage() const
{
std::lock_guard lock(shared->test_keeper_storage_mutex);
if (!shared->test_keeper_storage)
shared->test_keeper_storage = std::make_shared<zkutil::TestKeeperStorage>();
return shared->test_keeper_storage;
}
zkutil::ZooKeeperPtr Context::getAuxiliaryZooKeeper(const String & name) const
{
std::lock_guard lock(shared->auxiliary_zookeepers_mutex);
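
The `getTestKeeperStorage()` hunk above creates the storage lazily under a dedicated mutex and drops it again on shutdown. A minimal standalone sketch of that pattern follows. `Storage` and `SharedState` are hypothetical stand-ins, not ClickHouse types, and unlike the diff this sketch returns the `shared_ptr` by value so that no reference to the guarded member escapes the lock; that difference is an assumption of the sketch, not a statement about the original code.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for zkutil::TestKeeperStorage.
struct Storage
{
    int next_session_id = 1;
};

// Hypothetical stand-in for ContextShared: owns the lazily created storage.
class SharedState
{
public:
    // Creates the storage on first use; later calls return the same object.
    std::shared_ptr<Storage> getStorage() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!storage)
            storage = std::make_shared<Storage>();
        return storage; // Copy made while the lock is held.
    }

    // Mirrors `test_keeper_storage.reset()` in the shutdown path.
    void shutdown()
    {
        std::lock_guard<std::mutex> lock(mutex);
        storage.reset();
    }

private:
    mutable std::mutex mutex;
    mutable std::shared_ptr<Storage> storage;
};
```

Callers that already hold a copy keep the old object alive after `shutdown()`, and the next `getStorage()` call builds a fresh one.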


@ -40,6 +40,7 @@ namespace Poco
namespace zkutil
{
class ZooKeeper;
class TestKeeperStorage;
}
@ -494,6 +495,9 @@ public:
/// Same as above but return a zookeeper connection from auxiliary_zookeepers configuration entry.
std::shared_ptr<zkutil::ZooKeeper> getAuxiliaryZooKeeper(const String & name) const;
std::shared_ptr<zkutil::TestKeeperStorage> & getTestKeeperStorage() const;
/// Set auxiliary zookeepers configuration at server starting or configuration reloading.
void reloadAuxiliaryZooKeepersConfigIfChanged(const ConfigurationPtr & config);
/// Has ready or expired ZooKeeper


@ -164,20 +164,47 @@ static inline std::tuple<NamesAndTypesList, NamesAndTypesList, NamesAndTypesList
if (indices_define && !indices_define->children.empty())
{
NameSet columns_name_set;
const Names & columns_name = columns.getNames();
columns_name_set.insert(columns_name.begin(), columns_name.end());
const auto & remove_prefix_key = [&](const ASTPtr & node) -> ASTPtr
{
auto res = std::make_shared<ASTExpressionList>();
for (const auto & index_expression : node->children)
{
res->children.emplace_back(index_expression);
if (const auto & function = index_expression->as<ASTFunction>())
{
/// column_name(int64 literal)
if (columns_name_set.count(function->name) && function->arguments->children.size() == 1)
{
const auto & prefix_limit = function->arguments->children[0]->as<ASTLiteral>();
if (prefix_limit && isInt64FieldType(prefix_limit->value.getType()))
res->children.back() = std::make_shared<ASTIdentifier>(function->name);
}
}
}
return res;
};
for (const auto & declare_index_ast : indices_define->children)
{
const auto & declare_index = declare_index_ast->as<MySQLParser::ASTDeclareIndex>();
const auto & index_columns = remove_prefix_key(declare_index->index_columns);
/// flatten
if (startsWith(declare_index->index_type, "KEY_"))
keys->arguments->children.insert(keys->arguments->children.end(),
declare_index->index_columns->children.begin(), declare_index->index_columns->children.end());
index_columns->children.begin(), index_columns->children.end());
else if (startsWith(declare_index->index_type, "UNIQUE_"))
unique_keys->arguments->children.insert(unique_keys->arguments->children.end(),
declare_index->index_columns->children.begin(), declare_index->index_columns->children.end());
index_columns->children.begin(), index_columns->children.end());
if (startsWith(declare_index->index_type, "PRIMARY_KEY_"))
primary_keys->arguments->children.insert(primary_keys->arguments->children.end(),
declare_index->index_columns->children.begin(), declare_index->index_columns->children.end());
index_columns->children.begin(), index_columns->children.end());
}
}
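
The `remove_prefix_key` lambda above turns a MySQL prefix index such as `prefix_key(2)` back into the plain column `prefix_key`. The sketch below illustrates the same rule on a simplified model. `IndexExpression` and `removePrefixKey` are hypothetical names, and the real code works on `ASTFunction`, `ASTLiteral` and `ASTIdentifier` nodes rather than on this struct.

```cpp
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an index expression in the AST.
struct IndexExpression
{
    std::string name;          // Identifier name or function name.
    bool is_function = false;  // True for `name(args...)`.
    // One entry per argument: the value if it is an integer literal,
    // std::nullopt for any other kind of argument.
    std::vector<std::optional<std::int64_t>> arguments;
};

// Replaces `column(N)` with `column` when `column` is a known column name
// and N is a single integer literal; everything else is copied unchanged.
std::vector<IndexExpression> removePrefixKey(
    const std::set<std::string> & column_names,
    const std::vector<IndexExpression> & index_columns)
{
    std::vector<IndexExpression> res;
    for (const auto & expression : index_columns)
    {
        res.push_back(expression);
        if (expression.is_function
            && column_names.count(expression.name)
            && expression.arguments.size() == 1
            && expression.arguments[0].has_value())
        {
            res.back() = IndexExpression{expression.name, false, {}};
        }
    }
    return res;
}
```

A function call whose name is not a column, or whose argument is not an integer literal, is left alone, which matches the guard conditions in the hunk.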


@ -184,3 +184,14 @@ TEST(MySQLCreateRewritten, RewrittenQueryWithPrimaryKey)
"ReplacingMergeTree(_version) PARTITION BY intDiv(key_2, 4294967) ORDER BY (key_1, key_2)");
}
TEST(MySQLCreateRewritten, RewrittenQueryWithPrefixKey)
{
tryRegisterFunctions();
const auto & context_holder = getContext();
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` int NOT NULL PRIMARY KEY, `prefix_key` varchar(200) NOT NULL, KEY prefix_key_index(prefix_key(2))) ENGINE=InnoDB DEFAULT CHARSET=utf8", context_holder.context)),
"CREATE TABLE test_database.test_table_1 (`key` Int32, `prefix_key` String, `_sign` Int8() MATERIALIZED 1, `_version` UInt64() MATERIALIZED 1) ENGINE = "
"ReplacingMergeTree(_version) PARTITION BY intDiv(key, 4294967) ORDER BY (key, prefix_key)");
}


@ -125,12 +125,60 @@ struct CustomizeAggregateFunctionsSuffixData
{
auto properties = instance.tryGetProperties(func.name);
if (properties && !properties->returns_default_when_only_null)
func.name = func.name + customized_func_suffix;
{
func.name += customized_func_suffix;
}
}
}
};
// Used to rewrite aggregate functions with the -OrNull suffix in some cases; for example, sumIfOrNull should be rewritten to sumOrNullIf
struct CustomizeAggregateFunctionsMoveSuffixData
{
using TypeToVisit = ASTFunction;
const String & customized_func_suffix;
String moveSuffixAhead(const String & name) const
{
auto prefix = name.substr(0, name.size() - customized_func_suffix.size());
auto prefix_size = prefix.size();
if (endsWith(prefix, "MergeState"))
return prefix.substr(0, prefix_size - 10) + customized_func_suffix + "MergeState";
if (endsWith(prefix, "Merge"))
return prefix.substr(0, prefix_size - 5) + customized_func_suffix + "Merge";
if (endsWith(prefix, "State"))
return prefix.substr(0, prefix_size - 5) + customized_func_suffix + "State";
if (endsWith(prefix, "If"))
return prefix.substr(0, prefix_size - 2) + customized_func_suffix + "If";
return name;
}
void visit(ASTFunction & func, ASTPtr &) const
{
const auto & instance = AggregateFunctionFactory::instance();
if (instance.isAggregateFunctionName(func.name))
{
if (endsWith(func.name, customized_func_suffix))
{
auto properties = instance.tryGetProperties(func.name);
if (properties && !properties->returns_default_when_only_null)
{
func.name = moveSuffixAhead(func.name);
}
}
}
}
};
using CustomizeAggregateFunctionsOrNullVisitor = InDepthNodeVisitor<OneTypeMatcher<CustomizeAggregateFunctionsSuffixData>, true>;
using CustomizeAggregateFunctionsMoveOrNullVisitor = InDepthNodeVisitor<OneTypeMatcher<CustomizeAggregateFunctionsMoveSuffixData>, true>;
/// Translate qualified names such as db.table.column, table.column, table_alias.column to names' normal form.
/// Expand asterisks and qualified asterisks with column names.
@ -753,6 +801,10 @@ void TreeRewriter::normalize(ASTPtr & query, Aliases & aliases, const Settings &
CustomizeAggregateFunctionsOrNullVisitor(data_or_null).visit(query);
}
/// Move the -OrNull suffix ahead of other combinators; this must run after the -OrNull suffix has been added
CustomizeAggregateFunctionsMoveOrNullVisitor::Data data_or_null{"OrNull"};
CustomizeAggregateFunctionsMoveOrNullVisitor(data_or_null).visit(query);
/// Creates a dictionary `aliases`: alias -> ASTPtr
QueryAliasesVisitor(aliases).visit(query);
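
The suffix-moving rule in `moveSuffixAhead` above can be tried in isolation. The sketch below is a `std::string`-only re-implementation under a few assumptions: it is a free function, the suffix is passed as a parameter, the local `endsWith` helper is written here rather than taken from ClickHouse, and the `AggregateFunctionFactory` checks from `visit()` are left out.

```cpp
#include <string>
#include <string_view>

// Local helper for this sketch; ClickHouse has its own endsWith.
static bool endsWith(std::string_view s, std::string_view suffix)
{
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Moves `suffix` (for example "OrNull") in front of a trailing combinator:
// sumIfOrNull -> sumOrNullIf. Names without such a combinator are unchanged.
std::string moveSuffixAhead(const std::string & name, const std::string & suffix)
{
    if (!endsWith(name, suffix))
        return name;

    const std::string prefix = name.substr(0, name.size() - suffix.size());

    // "MergeState" must be tested before "State", as in the original order.
    static constexpr std::string_view combinators[] = {"MergeState", "Merge", "State", "If"};
    for (std::string_view combinator : combinators)
    {
        if (endsWith(prefix, combinator))
            return prefix.substr(0, prefix.size() - combinator.size())
                + suffix + std::string(combinator);
    }
    return name;
}
```

For example, `moveSuffixAhead("uniqMergeStateOrNull", "OrNull")` yields `uniqOrNullMergeState`, while `sumOrNull` is returned as is.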


@ -29,6 +29,9 @@ target_link_libraries (string_hash_map PRIVATE dbms)
add_executable (string_hash_map_aggregation string_hash_map.cpp)
target_link_libraries (string_hash_map_aggregation PRIVATE dbms)
add_executable (string_hash_set string_hash_set.cpp)
target_link_libraries (string_hash_set PRIVATE dbms)
add_executable (two_level_hash_map two_level_hash_map.cpp)
target_include_directories (two_level_hash_map SYSTEM BEFORE PRIVATE ${SPARSEHASH_INCLUDE_DIR})
target_link_libraries (two_level_hash_map PRIVATE dbms)


@ -0,0 +1,83 @@
#include <iomanip>
#include <iostream>
#include <vector>
#include <Compression/CompressedReadBuffer.h>
#include <common/types.h>
#include <IO/ReadBufferFromFile.h>
#include <IO/ReadHelpers.h>
#include <Interpreters/AggregationCommon.h>
#include <Common/HashTable/HashSet.h>
#include <Common/HashTable/HashTableKeyHolder.h>
#include <Common/HashTable/StringHashSet.h>
#include <Common/Stopwatch.h>
#include <common/StringRef.h>
/// NOTE: see string_hash_map.cpp for usage example
template <typename Set>
void NO_INLINE bench(const std::vector<StringRef> & data, DB::Arena & pool, const char * name)
{
std::cerr << "method " << name << std::endl;
for (auto t = 0ul; t < 7; ++t)
{
Stopwatch watch;
Set set;
typename Set::LookupResult it;
bool inserted;
for (const auto & value : data)
{
if constexpr (std::is_same_v<StringHashSet<>, Set>)
set.emplace(DB::ArenaKeyHolder{value, pool}, inserted);
else
set.emplace(DB::ArenaKeyHolder{value, pool}, it, inserted);
}
watch.stop();
std::cerr << "arena-memory " << pool.size() + set.getBufferSizeInBytes() << std::endl;
std::cerr << "single-run " << std::setprecision(3)
<< watch.elapsedSeconds() << std::endl;
}
}
int main(int argc, char ** argv)
{
if (argc < 3)
{
std::cerr << "Usage: program n m\n";
return 1;
}
size_t n = std::stol(argv[1]);
size_t m = std::stol(argv[2]);
DB::Arena pool(128 * 1024 * 1024);
std::vector<StringRef> data(n);
std::cerr << "sizeof(Key) = " << sizeof(StringRef) << std::endl;
{
Stopwatch watch;
DB::ReadBufferFromFileDescriptor in1(STDIN_FILENO);
DB::CompressedReadBuffer in2(in1);
std::string tmp;
for (size_t i = 0; i < n && !in2.eof(); ++i)
{
DB::readStringBinary(tmp, in2);
data[i] = StringRef(pool.insert(tmp.data(), tmp.size()), tmp.size());
}
watch.stop();
std::cerr << std::fixed << std::setprecision(2) << "Vector. Size: " << n << ", elapsed: " << watch.elapsedSeconds() << " ("
<< n / watch.elapsedSeconds() << " elem/sec.)" << std::endl;
}
if (!m || m == 1)
bench<StringHashSet<>>(data, pool, "StringHashSet");
if (!m || m == 2)
bench<HashSetWithSavedHash<StringRef>>(data, pool, "HashSetWithSavedHash");
if (!m || m == 3)
bench<HashSet<StringRef>>(data, pool, "HashSet");
return 0;
}


@ -41,6 +41,7 @@ public:
try
{
LOG_TRACE(log, "TCP Request. Address: {}", socket.peerAddress().toString());
return new TCPHandler(server, socket, parse_proxy_protocol);
}
catch (const Poco::Net::NetException &)


@ -0,0 +1,452 @@
#include <Server/TestKeeperTCPHandler.h>
#include <Common/ZooKeeper/ZooKeeperIO.h>
#include <Core/Types.h>
#include <IO/WriteBufferFromPocoSocket.h>
#include <IO/ReadBufferFromPocoSocket.h>
#include <Poco/Net/NetException.h>
#include <Common/CurrentThread.h>
#include <Common/Stopwatch.h>
#include <Common/NetException.h>
#include <Common/setThreadName.h>
#include <common/logger_useful.h>
#include <chrono>
#include <Common/PipeFDs.h>
#include <Poco/Util/AbstractConfiguration.h>
#ifdef POCO_HAVE_FD_EPOLL
#include <sys/epoll.h>
#else
#include <poll.h>
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int SYSTEM_ERROR;
extern const int UNEXPECTED_PACKET_FROM_CLIENT;
}
static constexpr UInt8 RESPONSE_BYTE = 1;
static constexpr UInt8 WATCH_RESPONSE_BYTE = 2;
struct SocketInterruptablePollWrapper
{
int sockfd;
PipeFDs pipe;
#if defined(POCO_HAVE_FD_EPOLL)
int epollfd;
epoll_event socket_event{};
epoll_event pipe_event{};
#endif
using PollStatus = size_t;
static constexpr PollStatus TIMEOUT = 0x0;
static constexpr PollStatus HAS_REQUEST = 0x1;
static constexpr PollStatus HAS_RESPONSE = 0x2;
static constexpr PollStatus HAS_WATCH_RESPONSE = 0x4;
static constexpr PollStatus ERROR = 0x8;
using InterruptCallback = std::function<void()>;
explicit SocketInterruptablePollWrapper(const Poco::Net::StreamSocket & poco_socket_)
: sockfd(poco_socket_.impl()->sockfd())
{
pipe.setNonBlockingReadWrite();
#if defined(POCO_HAVE_FD_EPOLL)
epollfd = epoll_create(2);
if (epollfd < 0)
throwFromErrno("Cannot epoll_create", ErrorCodes::SYSTEM_ERROR);
socket_event.events = EPOLLIN | EPOLLERR;
socket_event.data.fd = sockfd;
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, sockfd, &socket_event) < 0)
{
::close(epollfd);
throwFromErrno("Cannot insert socket into epoll queue", ErrorCodes::SYSTEM_ERROR);
}
pipe_event.events = EPOLLIN | EPOLLERR;
pipe_event.data.fd = pipe.fds_rw[0];
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, pipe.fds_rw[0], &pipe_event) < 0)
{
::close(epollfd);
throwFromErrno("Cannot insert pipe into epoll queue", ErrorCodes::SYSTEM_ERROR);
}
#endif
}
int getResponseFD() const
{
return pipe.fds_rw[1];
}
PollStatus poll(Poco::Timespan remaining_time)
{
std::array<int, 2> outputs = {-1, -1};
#if defined(POCO_HAVE_FD_EPOLL)
int rc;
epoll_event evout[2];
memset(evout, 0, sizeof(evout));
do
{
Poco::Timestamp start;
rc = epoll_wait(epollfd, evout, 2, remaining_time.totalMilliseconds());
if (rc < 0 && errno == EINTR)
{
Poco::Timestamp end;
Poco::Timespan waited = end - start;
if (waited < remaining_time)
remaining_time -= waited;
else
remaining_time = 0;
}
}
while (rc < 0 && errno == EINTR);
if (rc >= 1 && evout[0].events & EPOLLIN)
outputs[0] = evout[0].data.fd;
if (rc == 2 && evout[1].events & EPOLLIN)
outputs[1] = evout[1].data.fd;
#else
pollfd poll_buf[2];
poll_buf[0].fd = sockfd;
poll_buf[0].events = POLLIN;
poll_buf[1].fd = pipe.fds_rw[0];
poll_buf[1].events = POLLIN;
int rc;
do
{
Poco::Timestamp start;
rc = ::poll(poll_buf, 2, remaining_time.totalMilliseconds());
if (rc < 0 && errno == POCO_EINTR)
{
Poco::Timestamp end;
Poco::Timespan waited = end - start;
if (waited < remaining_time)
remaining_time -= waited;
else
remaining_time = 0;
}
}
while (rc < 0 && errno == POCO_EINTR);
if (rc >= 1 && poll_buf[0].revents & POLLIN)
outputs[0] = sockfd;
if (rc == 2 && poll_buf[1].revents & POLLIN)
outputs[1] = pipe.fds_rw[0];
#endif
PollStatus result = TIMEOUT;
if (rc < 0)
{
return ERROR;
}
else if (rc == 0)
{
return result;
}
else
{
for (auto fd : outputs)
{
if (fd != -1)
{
if (fd == sockfd)
result |= HAS_REQUEST;
else
{
int read_result;
do
{
UInt8 byte;
read_result = read(pipe.fds_rw[0], &byte, sizeof(byte));
if (read_result > 0)
{
if (byte == WATCH_RESPONSE_BYTE)
result |= HAS_WATCH_RESPONSE;
else if (byte == RESPONSE_BYTE)
result |= HAS_RESPONSE;
else
throw Exception("Unexpected byte received from signaling pipe", ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT);
}
}
while (read_result > 0 || (read_result < 0 && errno == EINTR));
if (read_result < 0 && errno != EAGAIN)
throwFromErrno("Got error reading from pipe", ErrorCodes::SYSTEM_ERROR);
}
}
}
}
return result;
}
#if defined(POCO_HAVE_FD_EPOLL)
~SocketInterruptablePollWrapper()
{
::close(epollfd);
}
#endif
};
TestKeeperTCPHandler::TestKeeperTCPHandler(IServer & server_, const Poco::Net::StreamSocket & socket_)
: Poco::Net::TCPServerConnection(socket_)
, server(server_)
, log(&Poco::Logger::get("TestKeeperTCPHandler"))
, global_context(server.context())
, test_keeper_storage(global_context.getTestKeeperStorage())
, operation_timeout(0, global_context.getConfigRef().getUInt("test_keeper_server.operation_timeout_ms", Coordination::DEFAULT_OPERATION_TIMEOUT_MS) * 1000)
, session_timeout(0, global_context.getConfigRef().getUInt("test_keeper_server.session_timeout_ms", Coordination::DEFAULT_SESSION_TIMEOUT_MS) * 1000)
, session_id(test_keeper_storage->getSessionID())
, poll_wrapper(std::make_unique<SocketInterruptablePollWrapper>(socket_))
{
}
void TestKeeperTCPHandler::sendHandshake()
{
Coordination::write(Coordination::SERVER_HANDSHAKE_LENGTH, *out);
Coordination::write(Coordination::ZOOKEEPER_PROTOCOL_VERSION, *out);
Coordination::write(Coordination::DEFAULT_SESSION_TIMEOUT_MS, *out);
Coordination::write(session_id, *out);
std::array<char, Coordination::PASSWORD_LENGTH> passwd{};
Coordination::write(passwd, *out);
out->next();
}
void TestKeeperTCPHandler::run()
{
runImpl();
}
void TestKeeperTCPHandler::receiveHandshake()
{
int32_t handshake_length;
int32_t protocol_version;
int64_t last_zxid_seen;
int32_t timeout;
int64_t previous_session_id = 0; /// We don't support session restore. So previous session_id is always zero.
std::array<char, Coordination::PASSWORD_LENGTH> passwd {};
Coordination::read(handshake_length, *in);
if (handshake_length != Coordination::CLIENT_HANDSHAKE_LENGTH && handshake_length != Coordination::CLIENT_HANDSHAKE_LENGTH_WITH_READONLY)
throw Exception("Unexpected handshake length received: " + toString(handshake_length), ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT);
Coordination::read(protocol_version, *in);
if (protocol_version != Coordination::ZOOKEEPER_PROTOCOL_VERSION)
throw Exception("Unexpected protocol version: " + toString(protocol_version), ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT);
Coordination::read(last_zxid_seen, *in);
if (last_zxid_seen != 0)
throw Exception("Non zero last_zxid_seen is not supported", ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT);
Coordination::read(timeout, *in);
Coordination::read(previous_session_id, *in);
if (previous_session_id != 0)
throw Exception("Non zero previous session id is not supported", ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT);
Coordination::read(passwd, *in);
int8_t readonly;
if (handshake_length == Coordination::CLIENT_HANDSHAKE_LENGTH_WITH_READONLY)
Coordination::read(readonly, *in);
}
void TestKeeperTCPHandler::runImpl()
{
setThreadName("TstKprHandler");
ThreadStatus thread_status;
auto global_receive_timeout = global_context.getSettingsRef().receive_timeout;
auto global_send_timeout = global_context.getSettingsRef().send_timeout;
socket().setReceiveTimeout(global_receive_timeout);
socket().setSendTimeout(global_send_timeout);
socket().setNoDelay(true);
in = std::make_shared<ReadBufferFromPocoSocket>(socket());
out = std::make_shared<WriteBufferFromPocoSocket>(socket());
if (in->eof())
{
LOG_WARNING(log, "Client has not sent any data.");
return;
}
try
{
receiveHandshake();
}
catch (const Exception & e) /// Typical for an incorrect username, password, or address.
{
LOG_WARNING(log, "Cannot receive handshake {}", e.displayText());
return;
}
sendHandshake();
session_stopwatch.start();
bool close_received = false;
try
{
while (true)
{
using namespace std::chrono_literals;
auto state = poll_wrapper->poll(session_timeout);
if (state & SocketInterruptablePollWrapper::HAS_REQUEST)
{
do
{
Coordination::OpNum received_op = receiveRequest();
if (received_op == Coordination::OpNum::Close)
{
LOG_DEBUG(log, "Received close request for session #{}", session_id);
if (responses.back().wait_for(std::chrono::microseconds(operation_timeout.totalMicroseconds())) != std::future_status::ready)
{
LOG_DEBUG(log, "Cannot send close for session #{}", session_id);
}
else
{
LOG_DEBUG(log, "Sent close for session #{}", session_id);
responses.back().get()->write(*out);
}
close_received = true;
break;
}
else if (received_op == Coordination::OpNum::Heartbeat)
{
LOG_TRACE(log, "Received heartbeat for session #{}", session_id);
session_stopwatch.restart();
}
}
while (in->available());
}
if (close_received)
break;
if (state & SocketInterruptablePollWrapper::HAS_RESPONSE)
{
while (!responses.empty())
{
if (responses.front().wait_for(0s) != std::future_status::ready)
break;
auto response = responses.front().get();
response->write(*out);
responses.pop();
}
}
if (state & SocketInterruptablePollWrapper::HAS_WATCH_RESPONSE)
{
for (auto it = watch_responses.begin(); it != watch_responses.end();)
{
if (it->wait_for(0s) == std::future_status::ready)
{
auto response = it->get();
if (response->error == Coordination::Error::ZOK)
response->write(*out);
it = watch_responses.erase(it);
}
else
{
++it;
}
}
}
if (state == SocketInterruptablePollWrapper::ERROR)
{
throw Exception("Exception happened while reading from socket", ErrorCodes::SYSTEM_ERROR);
}
if (session_stopwatch.elapsedMicroseconds() > static_cast<UInt64>(session_timeout.totalMicroseconds()))
{
LOG_DEBUG(log, "Session #{} expired", session_id);
auto response = putCloseRequest();
if (response.wait_for(std::chrono::microseconds(operation_timeout.totalMicroseconds())) != std::future_status::ready)
LOG_DEBUG(log, "Cannot send close for expired session #{}", session_id);
else
response.get()->write(*out);
break;
}
}
}
catch (const Exception & ex)
{
LOG_INFO(log, "Got exception processing session #{}: {}", session_id, getExceptionMessage(ex, true));
auto response = putCloseRequest();
if (response.wait_for(std::chrono::microseconds(operation_timeout.totalMicroseconds())) != std::future_status::ready)
LOG_DEBUG(log, "Cannot send close for session #{}", session_id);
else
response.get()->write(*out);
}
}
zkutil::TestKeeperStorage::AsyncResponse TestKeeperTCPHandler::putCloseRequest()
{
Coordination::ZooKeeperRequestPtr request = Coordination::ZooKeeperRequestFactory::instance().get(Coordination::OpNum::Close);
request->xid = Coordination::CLOSE_XID;
auto promise = std::make_shared<std::promise<Coordination::ZooKeeperResponsePtr>>();
zkutil::ResponseCallback callback = [promise] (const Coordination::ZooKeeperResponsePtr & response)
{
promise->set_value(response);
};
test_keeper_storage->putRequest(request, session_id, callback);
return promise->get_future();
}
Coordination::OpNum TestKeeperTCPHandler::receiveRequest()
{
int32_t length;
Coordination::read(length, *in);
int32_t xid;
Coordination::read(xid, *in);
Coordination::OpNum opnum;
Coordination::read(opnum, *in);
Coordination::ZooKeeperRequestPtr request = Coordination::ZooKeeperRequestFactory::instance().get(opnum);
request->xid = xid;
request->readImpl(*in);
int response_fd = poll_wrapper->getResponseFD();
auto promise = std::make_shared<std::promise<Coordination::ZooKeeperResponsePtr>>();
zkutil::ResponseCallback callback = [response_fd, promise] (const Coordination::ZooKeeperResponsePtr & response)
{
promise->set_value(response);
[[maybe_unused]] int result = write(response_fd, &RESPONSE_BYTE, sizeof(RESPONSE_BYTE));
};
if (request->has_watch)
{
auto watch_promise = std::make_shared<std::promise<Coordination::ZooKeeperResponsePtr>>();
zkutil::ResponseCallback watch_callback = [response_fd, watch_promise] (const Coordination::ZooKeeperResponsePtr & response)
{
watch_promise->set_value(response);
[[maybe_unused]] int result = write(response_fd, &WATCH_RESPONSE_BYTE, sizeof(WATCH_RESPONSE_BYTE));
};
test_keeper_storage->putRequest(request, session_id, callback, watch_callback);
responses.push(promise->get_future());
watch_responses.emplace_back(watch_promise->get_future());
}
else
{
test_keeper_storage->putRequest(request, session_id, callback);
responses.push(promise->get_future());
}
return opnum;
}
}
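
`SocketInterruptablePollWrapper` above relies on a signalling pipe: response callbacks write one byte to the pipe, and the handler thread wakes up from `poll`/`epoll_wait` and drains it. The sketch below shows only the pipe half of that idea with plain POSIX `poll`. The function names and the `PipeStatus` flags are hypothetical, the client socket is omitted, and the `EINTR` retry does not shrink the timeout the way the original loop does.

```cpp
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

// Signal bytes mirror RESPONSE_BYTE / WATCH_RESPONSE_BYTE from the diff.
constexpr std::uint8_t RESPONSE_BYTE = 1;
constexpr std::uint8_t WATCH_RESPONSE_BYTE = 2;

// Hypothetical status flags for this sketch.
enum PipeStatus : unsigned
{
    kTimeout = 0x0,
    kHasResponse = 0x2,
    kHasWatchResponse = 0x4,
    kError = 0x8,
};

// Creates a pipe with both ends non-blocking; returns false on failure.
bool makeSignalPipe(int fds[2])
{
    if (::pipe(fds) != 0)
        return false;
    for (int i = 0; i < 2; ++i)
    {
        int flags = ::fcntl(fds[i], F_GETFL);
        if (flags < 0 || ::fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return false;
    }
    return true;
}

// Producer side: wakes up the poller with a one-byte message.
bool signalPipe(int write_fd, std::uint8_t byte)
{
    return ::write(write_fd, &byte, sizeof(byte)) == 1;
}

// Consumer side: waits up to timeout_ms, then drains every pending byte
// and reports which kinds of responses were signalled.
unsigned pollSignalPipe(int read_fd, int timeout_ms)
{
    pollfd pfd{};
    pfd.fd = read_fd;
    pfd.events = POLLIN;

    int rc;
    do
        rc = ::poll(&pfd, 1, timeout_ms);
    while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return kError;
    if (rc == 0)
        return kTimeout;

    unsigned result = kTimeout;
    ssize_t n;
    do
    {
        std::uint8_t byte = 0;
        n = ::read(read_fd, &byte, sizeof(byte));
        if (n > 0)
        {
            if (byte == RESPONSE_BYTE)
                result |= kHasResponse;
            else if (byte == WATCH_RESPONSE_BYTE)
                result |= kHasWatchResponse;
            else
                return kError; // Unknown byte on the signalling pipe.
        }
    }
    while (n > 0 || (n < 0 && errno == EINTR));

    // EAGAIN just means the non-blocking pipe is empty again.
    if (n < 0 && errno != EAGAIN)
        return kError;
    return result;
}
```

Draining the pipe completely on each wake-up means several signals can collapse into one status word, which is why the handler then checks its future queues instead of counting bytes.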


@ -0,0 +1,52 @@
#pragma once
#include <Poco/Net/TCPServerConnection.h>
#include "IServer.h"
#include <Common/Stopwatch.h>
#include <Interpreters/Context.h>
#include <Common/ZooKeeper/ZooKeeperCommon.h>
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <Common/ZooKeeper/TestKeeperStorage.h>
#include <IO/WriteBufferFromPocoSocket.h>
#include <IO/ReadBufferFromPocoSocket.h>
#include <future>
namespace DB
{
struct SocketInterruptablePollWrapper;
using SocketInterruptablePollWrapperPtr = std::unique_ptr<SocketInterruptablePollWrapper>;
class TestKeeperTCPHandler : public Poco::Net::TCPServerConnection
{
public:
TestKeeperTCPHandler(IServer & server_, const Poco::Net::StreamSocket & socket_);
void run() override;
private:
IServer & server;
Poco::Logger * log;
Context global_context;
std::shared_ptr<zkutil::TestKeeperStorage> test_keeper_storage;
Poco::Timespan operation_timeout;
Poco::Timespan session_timeout;
int64_t session_id;
Stopwatch session_stopwatch;
SocketInterruptablePollWrapperPtr poll_wrapper;
std::queue<zkutil::TestKeeperStorage::AsyncResponse> responses;
std::vector<zkutil::TestKeeperStorage::AsyncResponse> watch_responses;
/// Streams for reading/writing from/to client connection socket.
std::shared_ptr<ReadBufferFromPocoSocket> in;
std::shared_ptr<WriteBufferFromPocoSocket> out;
void runImpl();
void sendHandshake();
void receiveHandshake();
Coordination::OpNum receiveRequest();
zkutil::TestKeeperStorage::AsyncResponse putCloseRequest();
};
}
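
The `responses` queue declared above holds one future per request, and the handler writes out only the ready futures at the head of the queue, so responses leave in request order. A minimal sketch of that promise/callback bridge follows. `Response` is a hypothetical stand-in for `Coordination::ZooKeeperResponsePtr`, and `makeAsyncResponse` and `drainReady` are names invented for this example.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Coordination::ZooKeeperResponsePtr.
using Response = std::string;
using ResponseCallback = std::function<void(const Response &)>;

// Returns a future plus the callback that fulfils it. The promise lives in a
// shared_ptr so the copyable std::function can own it, as in the diff.
std::pair<std::future<Response>, ResponseCallback> makeAsyncResponse()
{
    auto promise = std::make_shared<std::promise<Response>>();
    std::future<Response> future = promise->get_future();
    ResponseCallback callback = [promise](const Response & response)
    {
        promise->set_value(response);
    };
    return {std::move(future), std::move(callback)};
}

// Pops responses from the front while they are ready, preserving order.
// Stops at the first pending future even if later ones are already done.
std::vector<Response> drainReady(std::queue<std::future<Response>> & responses)
{
    std::vector<Response> ready;
    while (!responses.empty())
    {
        if (responses.front().wait_for(std::chrono::seconds(0)) != std::future_status::ready)
            break;
        ready.push_back(responses.front().get());
        responses.pop();
    }
    return ready;
}
```

If the second request completes before the first, `drainReady` returns nothing until the first callback fires, after which both responses come out in request order.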

Some files were not shown because too many files have changed in this diff.