diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 257040c68b7..8e502c0b36f 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -1,4 +1,2 @@ -dbms/* @ClickHouse/core-assigner -utils/* @ClickHouse/core-assigner docs/* @ClickHouse/docs docs/zh/* @ClickHouse/docs-zh diff --git a/CHANGELOG.md b/CHANGELOG.md index a6757c38898..f456a56f1be 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,8 @@ -## ClickHouse release v19.17.6.36, 2019-12-27 +## ClickHouse release v19.17 -### Bug Fix +### ClickHouse release v19.17.6.36, 2019-12-27 + +#### Bug Fix * Fixed potential buffer overflow in decompress. A malicious user could pass fabricated compressed data that caused a read past the end of the buffer. This issue was found by Eldar Zaitov from the Yandex information security team. [#8404](https://github.com/ClickHouse/ClickHouse/pull/8404) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed possible server crash (`std::terminate`) when the server cannot send or write data in JSON or XML format with values of String data type (that require UTF-8 validation) or when compressing result data with the Brotli algorithm or in some other rare cases. [#8384](https://github.com/ClickHouse/ClickHouse/pull/8384) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed dictionaries with a source from a ClickHouse `VIEW`: now reading such dictionaries doesn't cause the error `There is no query`. [#8351](https://github.com/ClickHouse/ClickHouse/pull/8351) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) @@ -32,13 +34,12 @@ next request would interpret this info as the beginning of the next query causin * Now an exception will be thrown in case of using WITH TIES alongside LIMIT BY. And now it's possible to use TOP with LIMIT BY. [#7637](https://github.com/ClickHouse/ClickHouse/pull/7637) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)) * Fix dictionary reload if it has `invalidate_query`, which stopped updates and some exception on previous update tries. [#8029](https://github.com/ClickHouse/ClickHouse/pull/8029) ([alesapin](https://github.com/alesapin)) +### ClickHouse release v19.17.4.11, 2019-11-22 -## ClickHouse release v19.17.4.11, 2019-11-22 - -### Backward Incompatible Change +#### Backward Incompatible Change * Using a column instead of AST to store scalar subquery results for better performance. The setting `enable_scalar_subquery_optimization` was added in 19.17 and was enabled by default. It leads to errors like [this](https://github.com/ClickHouse/ClickHouse/issues/7851) during upgrade to 19.17.2 or 19.17.3 from previous versions. This setting was disabled by default in 19.17.4 to make it possible to upgrade from 19.16 and older versions without errors. [#7392](https://github.com/ClickHouse/ClickHouse/pull/7392) ([Amos Bird](https://github.com/amosbird)) -### New Feature +#### New Feature * Add the ability to create dictionaries with DDL queries. [#7360](https://github.com/ClickHouse/ClickHouse/pull/7360) ([alesapin](https://github.com/alesapin)) * Make the `bloom_filter` index type support `LowCardinality` and `Nullable` [#7363](https://github.com/ClickHouse/ClickHouse/issues/7363) [#7561](https://github.com/ClickHouse/ClickHouse/pull/7561) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) * Add function `isValidJSON` to check that the passed string is valid JSON. [#5910](https://github.com/ClickHouse/ClickHouse/issues/5910) [#7293](https://github.com/ClickHouse/ClickHouse/pull/7293) ([Vdimir](https://github.com/Vdimir))
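For example (an illustration of ours, not part of the original entry), `isValidJSON` returns `1` or `0` instead of throwing on malformed input:

```sql
SELECT isValidJSON('{"a": 1, "b": [1, 2, 3]}'); -- 1
SELECT isValidJSON('{"a": 1, "b": [1, 2, 3');   -- 0: unbalanced bracket
SELECT isValidJSON('plain text');               -- 0
```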
@@ -51,10 +52,10 @@ next request would interpret this info as the beginning of the next query causin * Implemented `javaHashUTF16LE()` function [#7651](https://github.com/ClickHouse/ClickHouse/pull/7651) ([achimbab](https://github.com/achimbab)) * Add `_shard_num` virtual column for the Distributed engine [#7624](https://github.com/ClickHouse/ClickHouse/pull/7624) ([Azat Khuzhin](https://github.com/azat)) -### Experimental Feature +#### Experimental Feature * Support for processors (new query execution pipeline) in `MergeTree`. [#7181](https://github.com/ClickHouse/ClickHouse/pull/7181) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) -### Bug Fix +#### Bug Fix * Fix incorrect float parsing in `Values` [#7817](https://github.com/ClickHouse/ClickHouse/issues/7817) [#7870](https://github.com/ClickHouse/ClickHouse/pull/7870) ([tavplubix](https://github.com/tavplubix)) * Fix rare deadlock which can happen when trace_log is enabled. [#7838](https://github.com/ClickHouse/ClickHouse/pull/7838) ([filimonov](https://github.com/filimonov)) * Prevent message duplication when a producing Kafka table has any MVs selecting from it [#7265](https://github.com/ClickHouse/ClickHouse/pull/7265) ([Ivan](https://github.com/abyss7)) @@ -78,7 +79,7 @@ next request would interpret this info as the beginning of the next query causin * Fixed an exception in case of using 1 argument while defining S3, URL and HDFS storages. [#7618](https://github.com/ClickHouse/ClickHouse/pull/7618) ([Vladimir Chebotarev](https://github.com/excitoon)) * Fix scope of the InterpreterSelectQuery for views with query [#7601](https://github.com/ClickHouse/ClickHouse/pull/7601) ([Azat Khuzhin](https://github.com/azat)) -### Improvement +#### Improvement * `Nullable` columns are recognized and NULL values are handled correctly by the ODBC bridge [#7402](https://github.com/ClickHouse/ClickHouse/pull/7402) ([Vasily Nemkov](https://github.com/Enmk)) * Write current batch for distributed send atomically [#7600](https://github.com/ClickHouse/ClickHouse/pull/7600) ([Azat Khuzhin](https://github.com/azat)) * Throw an exception if we cannot detect the table for a column name in a query. [#7358](https://github.com/ClickHouse/ClickHouse/pull/7358) ([Artem Zuikov](https://github.com/4ertus2)) @@ -90,14 +91,14 @@ next request would interpret this info as the beginning of the next query causin * Better `Null` format for the TCP handler, so that it's possible to use `select ignore() from table format Null` for perf measurement via clickhouse-client [#7606](https://github.com/ClickHouse/ClickHouse/pull/7606) ([Amos Bird](https://github.com/amosbird)) * Queries like `CREATE TABLE ... AS (SELECT (1, 2))` are parsed correctly [#7542](https://github.com/ClickHouse/ClickHouse/pull/7542) ([hcz](https://github.com/hczhcz)) -### Performance Improvement +#### Performance Improvement * The performance of aggregation over short string keys is improved. [#6243](https://github.com/ClickHouse/ClickHouse/pull/6243) ([Alexander Kuzmenkov](https://github.com/akuzm), [Amos Bird](https://github.com/amosbird)) * Run another pass of syntax/expression analysis to get potential optimizations after constant predicates are folded. [#7497](https://github.com/ClickHouse/ClickHouse/pull/7497) ([Amos Bird](https://github.com/amosbird))
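The `_shard_num` virtual column mentioned above makes it easy to see how rows are spread across shards. A minimal sketch of ours, where `dist_table` is a hypothetical table with `ENGINE = Distributed(...)`:

```sql
SELECT _shard_num, count() AS rows
FROM dist_table
GROUP BY _shard_num
ORDER BY _shard_num;
```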
* Use storage meta info to evaluate trivial `SELECT count() FROM table;` [#7510](https://github.com/ClickHouse/ClickHouse/pull/7510) ([Amos Bird](https://github.com/amosbird), [alexey-milovidov](https://github.com/alexey-milovidov)) * Vectorize processing `arrayReduce` similar to Aggregator `addBatch`. [#7608](https://github.com/ClickHouse/ClickHouse/pull/7608) ([Amos Bird](https://github.com/amosbird)) * Minor improvements in performance of `Kafka` consumption [#7475](https://github.com/ClickHouse/ClickHouse/pull/7475) ([Ivan](https://github.com/abyss7)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Add support for cross-compiling to the CPU architecture AARCH64. Refactor packager script. [#7370](https://github.com/ClickHouse/ClickHouse/pull/7370) [#7539](https://github.com/ClickHouse/ClickHouse/pull/7539) ([Ivan](https://github.com/abyss7)) * Unpack darwin-x86_64 and linux-aarch64 toolchains into mounted Docker volume when building packages [#7534](https://github.com/ClickHouse/ClickHouse/pull/7534) ([Ivan](https://github.com/abyss7)) * Update Docker Image for Binary Packager [#7474](https://github.com/ClickHouse/ClickHouse/pull/7474) ([Ivan](https://github.com/abyss7)) @@ -108,13 +109,14 @@ next request would interpret this info as the beginning of the next query causin * Remove hardcoded paths in `unwind` target [#7460](https://github.com/ClickHouse/ClickHouse/pull/7460) ([Konstantin Podshumok](https://github.com/podshumok)) * Allow to use mysql format without ssl [#7524](https://github.com/ClickHouse/ClickHouse/pull/7524) ([proller](https://github.com/proller)) -### Other +#### Other * Added ANTLR4 grammar for ClickHouse SQL dialect [#7595](https://github.com/ClickHouse/ClickHouse/issues/7595) [#7596](https://github.com/ClickHouse/ClickHouse/pull/7596) ([alexey-milovidov](https://github.com/alexey-milovidov)) +## ClickHouse release v19.16 -## ClickHouse release v19.16.2.2, 2019-10-30 +### ClickHouse release v19.16.2.2, 2019-10-30 -### Backward Incompatible Change +#### Backward Incompatible Change * Add missing arity validation for count/countIf. [#7095](https://github.com/ClickHouse/ClickHouse/issues/7095) [#7298](https://github.com/ClickHouse/ClickHouse/pull/7298) ([Vdimir](https://github.com/Vdimir)) @@ -125,7 +127,7 @@ Zuikov](https://github.com/4ertus2)) [#7118](https://github.com/ClickHouse/ClickHouse/pull/7118) ([tavplubix](https://github.com/tavplubix)) -### New Feature +#### New Feature * Introduce uniqCombined64() to calculate cardinality greater than UINT_MAX. [#7213](https://github.com/ClickHouse/ClickHouse/pull/7213), [#7222](https://github.com/ClickHouse/ClickHouse/pull/7222) ([Azat @@ -166,7 +168,7 @@ Yu](https://github.com/yuzhichang)) * Support Redis as a source of external dictionary. [#4361](https://github.com/ClickHouse/ClickHouse/pull/4361) [#6962](https://github.com/ClickHouse/ClickHouse/pull/6962) ([comunodi](https://github.com/comunodi), [Anton Popov](https://github.com/CurtizJ)) -### Bug Fix +#### Bug Fix * Fix wrong query result if it has `WHERE IN (SELECT ...)` section and `optimize_read_in_order` is used. [#7371](https://github.com/ClickHouse/ClickHouse/pull/7371) ([Anton Popov](https://github.com/CurtizJ))
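A quick illustration (ours) of the `uniqCombined64()` function introduced above; it behaves like `uniqCombined()` but keeps full precision for cardinalities beyond `UINT_MAX`:

```sql
SELECT uniqCombined64(number) FROM numbers(10000000); -- approximate distinct count
```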
@@ -207,7 +209,7 @@ Kochetov](https://github.com/KochetovNicolai)) [#7271](https://github.com/ClickHouse/ClickHouse/pull/7271) ([vzakaznikov](https://github.com/vzakaznikov)) -### Improvement +#### Improvement * Add a message when a queue_wait_max_ms wait takes place. [#7390](https://github.com/ClickHouse/ClickHouse/pull/7390) ([Azat Khuzhin](https://github.com/azat)) @@ -252,7 +254,7 @@ Khuzhin](https://github.com/azat)) memory). Load data back when needed. [#7186](https://github.com/ClickHouse/ClickHouse/pull/7186) ([Artem Zuikov](https://github.com/4ertus2)) -### Performance Improvement +#### Performance Improvement * Speed up joinGet with const arguments by avoiding data duplication. [#7359](https://github.com/ClickHouse/ClickHouse/pull/7359) ([Amos Bird](https://github.com/amosbird)) @@ -262,7 +264,7 @@ Bird](https://github.com/amosbird)) [#6781](https://github.com/ClickHouse/ClickHouse/pull/6781) ([tavplubix](https://github.com/tavplubix)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Disable some contribs for cross-compilation to Mac OS. [#7101](https://github.com/ClickHouse/ClickHouse/pull/7101) ([Ivan](https://github.com/abyss7)) * Add missing linking with PocoXML for clickhouse_common_io. @@ -330,7 +332,7 @@ Bird](https://github.com/amosbird)) [#7063](https://github.com/ClickHouse/ClickHouse/pull/7063) ([proller](https://github.com/proller)) -### Code cleanup +#### Code cleanup * Generalize configuration repository to prepare for DDL for Dictionaries. [#7155](https://github.com/ClickHouse/ClickHouse/pull/7155) ([alesapin](https://github.com/alesapin)) * Parser for dictionaries DDL without any semantics. @@ -364,9 +366,11 @@ fix comments to make obvious that it may throw. [#7350](https://github.com/ClickHouse/ClickHouse/pull/7350) ([tavplubix](https://github.com/tavplubix)) -## ClickHouse release 19.15.4.10, 2019-10-31 +## ClickHouse release 19.15 -### Bug Fix +### ClickHouse release 19.15.4.10, 2019-10-31 + +#### Bug Fix * Added handling of SQL_TINYINT and SQL_BIGINT, and fixed handling of SQL_FLOAT data source types in ODBC Bridge. [#7491](https://github.com/ClickHouse/ClickHouse/pull/7491) ([Denis Glazachev](https://github.com/traceon)) * Allowed to have some parts on destination disk or volume in MOVE PARTITION. @@ -391,9 +395,9 @@ fix comments to make obvious that it may throw. [#7158](https://github.com/ClickHouse/ClickHouse/pull/7158) ([Azat Khuzhin](https://github.com/azat)) * Added example config with macros for tests ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.15.3.6, 2019-10-09 +### ClickHouse release 19.15.3.6, 2019-10-09 -### Bug Fix +#### Bug Fix * Fixed bad_variant in hashed dictionary. ([alesapin](https://github.com/alesapin)) * Fixed a bug with segmentation fault in ATTACH PART query. @@ -405,9 +409,9 @@ fix comments to make obvious that it may throw. * Serialize NULL values correctly in min/max indexes of MergeTree parts. [#7234](https://github.com/ClickHouse/ClickHouse/pull/7234) ([Alexander Kuzmenkov](https://github.com/akuzm)) -## ClickHouse release 19.15.2.2, 2019-10-01 +### ClickHouse release 19.15.2.2, 2019-10-01 -### New Feature +#### New Feature * Tiered storage: support for using multiple storage volumes for tables with MergeTree engine. It's possible to store fresh data on SSD and automatically move old data to HDD. ([example](https://clickhouse.github.io/clickhouse-presentations/meetup30/new_features/#12)). [#4918](https://github.com/ClickHouse/ClickHouse/pull/4918) ([Igr](https://github.com/ObjatieGroba)) [#6489](https://github.com/ClickHouse/ClickHouse/pull/6489) ([alesapin](https://github.com/alesapin))
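A sketch of ours showing how a table opts into tiered storage, under the assumption that a storage policy named `ssd_to_hdd` is declared in the server's storage configuration:

```sql
CREATE TABLE hot_cold
(
    d Date,
    payload String
)
ENGINE = MergeTree
ORDER BY d
SETTINGS storage_policy = 'ssd_to_hdd'; -- policy defined in <storage_configuration>
```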
* Add table function `input` for reading incoming data in `INSERT SELECT` query. [#5450](https://github.com/ClickHouse/ClickHouse/pull/5450) ([palasonic1](https://github.com/palasonic1)) [#6832](https://github.com/ClickHouse/ClickHouse/pull/6832) ([Anton Popov](https://github.com/CurtizJ)) * Add a `sparse_hashed` dictionary layout that is functionally equivalent to the `hashed` layout, but is more memory efficient. It uses about half as much memory at the cost of slower value retrieval. [#6894](https://github.com/ClickHouse/ClickHouse/pull/6894) ([Azat Khuzhin](https://github.com/azat)) @@ -417,11 +421,11 @@ fix comments to make obvious that it may throw. * Add `bitmapMin` and `bitmapMax` functions. [#6970](https://github.com/ClickHouse/ClickHouse/pull/6970) ([Zhichang Yu](https://github.com/yuzhichang)) * Add function `repeat` related to [issue-6648](https://github.com/yandex/ClickHouse/issues/6648) [#6999](https://github.com/ClickHouse/ClickHouse/pull/6999) ([flynn](https://github.com/ucasFL)) -### Experimental Feature +#### Experimental Feature * Implement (in memory) Merge Join variant that does not change the current pipeline. The result is partially sorted by merge key. Set `partial_merge_join = 1` to use this feature. The Merge Join is still in development. [#6940](https://github.com/ClickHouse/ClickHouse/pull/6940) ([Artem Zuikov](https://github.com/4ertus2)) * Add `S3` engine and table function. It is still in development (no authentication support yet). [#5596](https://github.com/ClickHouse/ClickHouse/pull/5596) ([Vladimir Chebotarev](https://github.com/excitoon)) -### Improvement +#### Improvement * Every message read from Kafka is inserted atomically. This resolves almost all known issues with the Kafka engine. [#6950](https://github.com/ClickHouse/ClickHouse/pull/6950) ([Ivan](https://github.com/abyss7)) * Improvements for failover of Distributed queries. Recovery time is shortened; it is also now configurable and can be seen in `system.clusters`. [#6399](https://github.com/ClickHouse/ClickHouse/pull/6399) ([Vasily Nemkov](https://github.com/Enmk)) * Support numeric values for Enums directly in `IN` section. #6766 [#6941](https://github.com/ClickHouse/ClickHouse/pull/6941) ([dimarub2000](https://github.com/dimarub2000)) @@ -432,7 +436,7 @@ fix comments to make obvious that it may throw. * Automatically cast type `T` to `LowCardinality(T)` while inserting data into a column of type `LowCardinality(T)` in Native format via HTTP. [#6891](https://github.com/ClickHouse/ClickHouse/pull/6891) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) * Add ability to use function `hex` without using `reinterpretAsString` for `Float32`, `Float64`. [#7024](https://github.com/ClickHouse/ClickHouse/pull/7024) ([Mikhail Korotov](https://github.com/millb)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Add gdb-index to clickhouse binary with debug info. It will speed up startup time of `gdb`. [#6947](https://github.com/ClickHouse/ClickHouse/pull/6947) ([alesapin](https://github.com/alesapin)) * Speed up deb packaging with patched dpkg-deb which uses `pigz`. [#6960](https://github.com/ClickHouse/ClickHouse/pull/6960) ([alesapin](https://github.com/alesapin))
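Combined with the dictionary DDL added in 19.17 above, the `sparse_hashed` layout can be selected declaratively. A sketch of ours with hypothetical names (`countries`, `countries_src`); the source parameters would need to match a real setup:

```sql
CREATE DICTIONARY countries
(
    id UInt64,
    name String
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 TABLE 'countries_src' DB 'default'))
LAYOUT(SPARSE_HASHED()) -- hashed-compatible, about half the memory
LIFETIME(MIN 300 MAX 600);
```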
* Set `enable_fuzzing = 1` to enable libfuzzer instrumentation of all the project code. [#7042](https://github.com/ClickHouse/ClickHouse/pull/7042) ([kyprizel](https://github.com/kyprizel)) @@ -440,7 +444,7 @@ fix comments to make obvious that it may throw. * Add build with MemorySanitizer to CI. [#7066](https://github.com/ClickHouse/ClickHouse/pull/7066) ([Alexander Kuzmenkov](https://github.com/akuzm)) * Replace `libsparsehash` with `sparsehash-c11` [#6965](https://github.com/ClickHouse/ClickHouse/pull/6965) ([Azat Khuzhin](https://github.com/azat)) -### Bug Fix +#### Bug Fix * Fixed performance degradation of index analysis on complex keys on large tables. This fixes #6924. [#7075](https://github.com/ClickHouse/ClickHouse/pull/7075) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix logical error causing segfaults when selecting from an empty Kafka topic. [#6909](https://github.com/ClickHouse/ClickHouse/pull/6909) ([Ivan](https://github.com/abyss7)) * Fix too early MySQL connection close in `MySQLBlockInputStream.cpp`. [#6882](https://github.com/ClickHouse/ClickHouse/pull/6882) ([Clément Rodriguez](https://github.com/clemrodriguez)) @@ -451,28 +455,29 @@ fix comments to make obvious that it may throw. * Fix `Unknown identifier` error in ORDER BY and GROUP BY with multiple JOINs [#7022](https://github.com/ClickHouse/ClickHouse/pull/7022) ([Artem Zuikov](https://github.com/4ertus2)) * Fixed `MSan` warning while executing function with `LowCardinality` argument. [#7062](https://github.com/ClickHouse/ClickHouse/pull/7062) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) -### Backward Incompatible Change +#### Backward Incompatible Change * Changed serialization format of bitmap* aggregate function states to improve performance. Serialized states of bitmap* from previous versions cannot be read. [#6908](https://github.com/ClickHouse/ClickHouse/pull/6908) ([Zhichang Yu](https://github.com/yuzhichang)) -## ClickHouse release 19.14.7.15, 2019-10-02 +## ClickHouse release 19.14 +### ClickHouse release 19.14.7.15, 2019-10-02 -### Bug Fix +#### Bug Fix * This release also contains all bug fixes from 19.11.12.69. * Fixed compatibility for distributed queries between 19.14 and earlier versions. This fixes [#7068](https://github.com/ClickHouse/ClickHouse/issues/7068). [#7069](https://github.com/ClickHouse/ClickHouse/pull/7069) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.14.6.12, 2019-09-19 +### ClickHouse release 19.14.6.12, 2019-09-19 -### Bug Fix +#### Bug Fix * Fix for function `arrayEnumerateUniqRanked` with empty arrays in params. [#6928](https://github.com/ClickHouse/ClickHouse/pull/6928) ([proller](https://github.com/proller)) * Fixed subquery name in queries with `ARRAY JOIN` and `GLOBAL IN subquery` with alias. Use subquery alias for external table name if it is specified. [#6934](https://github.com/ClickHouse/ClickHouse/pull/6934) ([Ivan](https://github.com/abyss7)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Fix [flapping](https://clickhouse-test-reports.s3.yandex.net/6944/aab95fd5175a513413c7395a73a82044bdafb906/functional_stateless_tests_(debug).html) test `00715_fetch_merged_or_mutated_part_zookeeper` by rewriting it as a shell script, because it needs to wait for mutations to apply. [#6977](https://github.com/ClickHouse/ClickHouse/pull/6977) ([Alexander Kazakov](https://github.com/Akazz))
* Fixed UBSan and MemSan failure in function `groupUniqArray` with an empty array argument. It was caused by placing an empty `PaddedPODArray` into the hash table zero cell, because the constructor for the zero cell value was not called. [#6937](https://github.com/ClickHouse/ClickHouse/pull/6937) ([Amos Bird](https://github.com/amosbird)) -## ClickHouse release 19.14.3.3, 2019-09-10 +### ClickHouse release 19.14.3.3, 2019-09-10 -### New Feature +#### New Feature * `WITH FILL` modifier for `ORDER BY`. (continuation of [#5069](https://github.com/ClickHouse/ClickHouse/issues/5069)) [#6610](https://github.com/ClickHouse/ClickHouse/pull/6610) ([Anton Popov](https://github.com/CurtizJ)) * `WITH TIES` modifier for `LIMIT`. (continuation of [#5069](https://github.com/ClickHouse/ClickHouse/issues/5069)) [#6610](https://github.com/ClickHouse/ClickHouse/pull/6610) ([Anton Popov](https://github.com/CurtizJ)) * Parse unquoted `NULL` literal as NULL (if setting `format_csv_unquoted_null_literal_as_null=1`). Initialize null fields with default values if the data type of this field is not nullable (if setting `input_format_null_as_default=1`). [#5990](https://github.com/ClickHouse/ClickHouse/issues/5990) [#6055](https://github.com/ClickHouse/ClickHouse/pull/6055) ([tavplubix](https://github.com/tavplubix)) @@ -498,11 +503,11 @@ fix comments to make obvious that it may throw. * Added support for `_partition` and `_timestamp` virtual columns to Kafka engine. [#6400](https://github.com/ClickHouse/ClickHouse/pull/6400) ([Ivan](https://github.com/abyss7)) * Possibility to remove sensitive data from `query_log`, server logs, process list with regexp-based rules. [#5710](https://github.com/ClickHouse/ClickHouse/pull/5710) ([filimonov](https://github.com/filimonov)) -### Experimental Feature +#### Experimental Feature * Input and output data format `Template`. It allows specifying a custom format string for input and output. [#4354](https://github.com/ClickHouse/ClickHouse/issues/4354) [#6727](https://github.com/ClickHouse/ClickHouse/pull/6727) ([tavplubix](https://github.com/tavplubix)) * Implementation of `LIVE VIEW` tables that were originally proposed in [#2898](https://github.com/ClickHouse/ClickHouse/pull/2898), prepared in [#3925](https://github.com/ClickHouse/ClickHouse/issues/3925), and then updated in [#5541](https://github.com/ClickHouse/ClickHouse/issues/5541). See [#5541](https://github.com/ClickHouse/ClickHouse/issues/5541) for detailed description. [#5541](https://github.com/ClickHouse/ClickHouse/issues/5541) ([vzakaznikov](https://github.com/vzakaznikov)) [#6425](https://github.com/ClickHouse/ClickHouse/pull/6425) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) [#6656](https://github.com/ClickHouse/ClickHouse/pull/6656) ([vzakaznikov](https://github.com/vzakaznikov)) Note that the `LIVE VIEW` feature may be removed in future versions. -### Bug Fix +#### Bug Fix * This release also contains all bug fixes from 19.13 and 19.11. * Fix segmentation fault when the table has skip indices and vertical merge happens. [#6723](https://github.com/ClickHouse/ClickHouse/pull/6723) ([alesapin](https://github.com/alesapin)) * Fix per-column TTL with non-trivial column defaults. Previously, in case of a forced TTL merge with `OPTIMIZE ... FINAL` query, expired values were replaced by type defaults instead of user-specified column defaults. [#6796](https://github.com/ClickHouse/ClickHouse/pull/6796) ([Anton Popov](https://github.com/CurtizJ))
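For context (an illustration of ours, not from the changelog), a per-column TTL with a user-specified default looks like this; after the fix, a forced TTL merge should produce `'expired'` rather than the type default `''`:

```sql
CREATE TABLE events
(
    d Date,
    payload String DEFAULT 'expired' TTL d + INTERVAL 30 DAY
)
ENGINE = MergeTree
ORDER BY d;

OPTIMIZE TABLE events FINAL; -- forces the TTL merge
```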
@@ -540,7 +545,7 @@ fix comments to make obvious that it may throw. * Fix bug with writing secondary indices marks with adaptive granularity. [#6126](https://github.com/ClickHouse/ClickHouse/pull/6126) ([alesapin](https://github.com/alesapin)) * Fix initialization order during server startup. Since `StorageMergeTree::background_task_handle` is initialized in `startup()`, the `MergeTreeBlockOutputStream::write()` may try to use it before initialization. Just check if it is initialized. [#6080](https://github.com/ClickHouse/ClickHouse/pull/6080) ([Ivan](https://github.com/abyss7)) * Clearing the data buffer from the previous read operation that was completed with an error. [#6026](https://github.com/ClickHouse/ClickHouse/pull/6026) ([Nikolay](https://github.com/bopohaa)) -* Fix bug with enabling adaptive granularity when creating a new replica for Replicated*MergeTree table. [#6394](https://github.com/ClickHouse/ClickHouse/issues/6394) [#6452](https://github.com/ClickHouse/ClickHouse/pull/6452) ([alesapin](https://github.com/alesapin)) +* Fix bug with enabling adaptive granularity when creating a new replica for Replicated\*MergeTree table. [#6394](https://github.com/ClickHouse/ClickHouse/issues/6394) [#6452](https://github.com/ClickHouse/ClickHouse/pull/6452) ([alesapin](https://github.com/alesapin)) * Fixed possible crash during server startup in case an exception happened in `libunwind` during an exception at access to the uninitialized `ThreadStatus` structure. [#6456](https://github.com/ClickHouse/ClickHouse/pull/6456) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)) * Fix crash in `yandexConsistentHash` function. Found by fuzz test. [#6304](https://github.com/ClickHouse/ClickHouse/issues/6304) [#6305](https://github.com/ClickHouse/ClickHouse/pull/6305) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed the possibility of hanging queries when the server is overloaded and the global thread pool becomes nearly full. This has a higher chance of happening on clusters with a large number of shards (hundreds), because distributed queries allocate a thread per connection to each shard. For example, this issue may reproduce if a cluster of 330 shards is processing 30 concurrent distributed queries. This issue affects all versions starting from 19.2. [#6301](https://github.com/ClickHouse/ClickHouse/pull/6301) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -560,11 +565,11 @@ fix comments to make obvious that it may throw. * Typo in the error message ( is -> are ). [#6839](https://github.com/ClickHouse/ClickHouse/pull/6839) ([Denis Zhuravlev](https://github.com/den-crane)) * Fixed an error while parsing a column list from a string if the type contained a comma (this issue was relevant for `File`, `URL`, `HDFS` storages) [#6217](https://github.com/ClickHouse/ClickHouse/issues/6217). [#6209](https://github.com/ClickHouse/ClickHouse/pull/6209) ([dimarub2000](https://github.com/dimarub2000)) -### Security Fix +#### Security Fix * This release also contains all security fixes from 19.13 and 19.11. * Fixed the possibility of a fabricated query causing a server crash due to stack overflow in the SQL parser. Fixed the possibility of stack overflow in Merge and Distributed tables, materialized views and conditions for row-level security that involve subqueries.
[#6433](https://github.com/ClickHouse/ClickHouse/pull/6433) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Improvement +#### Improvement * Correct implementation of ternary logic for `AND/OR`. [#6048](https://github.com/ClickHouse/ClickHouse/pull/6048) ([Alexander Kazakov](https://github.com/Akazz)) * Now values and rows with expired TTL will be removed after `OPTIMIZE ... FINAL` query from old parts without TTL infos or with outdated TTL infos, e.g. after `ALTER ... MODIFY TTL` query. Added queries `SYSTEM STOP/START TTL MERGES` to disallow/allow assigning merges with TTL and filtering expired values in all merges. [#6274](https://github.com/ClickHouse/ClickHouse/pull/6274) ([Anton Popov](https://github.com/CurtizJ)) * Possibility to change the location of the ClickHouse history file for the client using the `CLICKHOUSE_HISTORY_FILE` env variable. [#6840](https://github.com/ClickHouse/ClickHouse/pull/6840) ([filimonov](https://github.com/filimonov)) @@ -625,7 +630,7 @@ fix comments to make obvious that it may throw. * `MergeTree` now has an additional option `ttl_only_drop_parts` (disabled by default) to avoid partial pruning of parts, so that they are dropped completely when all the rows in a part are expired. [#6191](https://github.com/ClickHouse/ClickHouse/pull/6191) ([Sergi Vladykin](https://github.com/svladykin)) * Type checks for set index functions. Throw an exception if a function got a wrong type. This fixes the fuzz test with UBSan. [#6511](https://github.com/ClickHouse/ClickHouse/pull/6511) ([Nikita Vasilev](https://github.com/nikvas0)) -### Performance Improvement +#### Performance Improvement * Optimize queries with `ORDER BY expressions` clause, where `expressions` have a coinciding prefix with the sorting key in `MergeTree` tables. This optimization is controlled by the `optimize_read_in_order` setting. [#6054](https://github.com/ClickHouse/ClickHouse/pull/6054) [#6629](https://github.com/ClickHouse/ClickHouse/pull/6629) ([Anton Popov](https://github.com/CurtizJ)) * Allow to use multiple threads during parts loading and removal. [#6372](https://github.com/ClickHouse/ClickHouse/issues/6372) [#6074](https://github.com/ClickHouse/ClickHouse/issues/6074) [#6438](https://github.com/ClickHouse/ClickHouse/pull/6438) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Implemented batch variant of updating aggregate function states. It may lead to performance benefits. [#6435](https://github.com/ClickHouse/ClickHouse/pull/6435) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -635,7 +640,7 @@ fix comments to make obvious that it may throw. * Pre-fault pages when allocating memory with `mmap()`. [#6667](https://github.com/ClickHouse/ClickHouse/pull/6667) ([akuzm](https://github.com/akuzm)) * Fix performance bug in `Decimal` comparison. [#6380](https://github.com/ClickHouse/ClickHouse/pull/6380) ([Artem Zuikov](https://github.com/4ertus2)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Remove Compiler (runtime template instantiation) because we've won over its performance. [#6646](https://github.com/ClickHouse/ClickHouse/pull/6646) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Added performance test to show degradation of performance in gcc-9 in a more isolated way. [#6302](https://github.com/ClickHouse/ClickHouse/pull/6302) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Added table function `numbers_mt`, which is a multithreaded version of `numbers`. Updated performance tests with hash functions. [#6554](https://github.com/ClickHouse/ClickHouse/pull/6554) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
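A quick way (our example) to compare `numbers_mt` with the single-threaded `numbers` is a large scan:

```sql
SELECT max(number) FROM numbers(500000000);    -- single-threaded
SELECT max(number) FROM numbers_mt(500000000); -- multithreaded, typically faster
```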
@@ -712,19 +717,19 @@ fix comments to make obvious that it may throw. * Fix "splitted" build. [#6618](https://github.com/ClickHouse/ClickHouse/pull/6618) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Other build fixes: [#6186](https://github.com/ClickHouse/ClickHouse/pull/6186) ([Amos Bird](https://github.com/amosbird)) [#6486](https://github.com/ClickHouse/ClickHouse/pull/6486) [#6348](https://github.com/ClickHouse/ClickHouse/pull/6348) ([vxider](https://github.com/Vxider)) [#6744](https://github.com/ClickHouse/ClickHouse/pull/6744) ([Ivan](https://github.com/abyss7)) [#6016](https://github.com/ClickHouse/ClickHouse/pull/6016) [#6421](https://github.com/ClickHouse/ClickHouse/pull/6421) [#6491](https://github.com/ClickHouse/ClickHouse/pull/6491) ([proller](https://github.com/proller)) -### Backward Incompatible Change +#### Backward Incompatible Change * Removed rarely used table function `catBoostPool` and storage `CatBoostPool`. If you have used this table function, please write an email to `clickhouse-feedback@yandex-team.com`. Note that CatBoost integration remains and will be supported. [#6279](https://github.com/ClickHouse/ClickHouse/pull/6279) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Disable `ANY RIGHT JOIN` and `ANY FULL JOIN` by default. Set `any_join_distinct_right_table_keys` setting to enable them. [#5126](https://github.com/ClickHouse/ClickHouse/issues/5126) [#6351](https://github.com/ClickHouse/ClickHouse/pull/6351) ([Artem Zuikov](https://github.com/4ertus2)) -## ClickHouse release 19.13.6.51, 2019-10-02 +## ClickHouse release 19.13 +### ClickHouse release 19.13.6.51, 2019-10-02 -### Bug Fix +#### Bug Fix * This release also contains all bug fixes from 19.11.12.69. +### ClickHouse release 19.13.5.44, 2019-09-20 -## ClickHouse release 19.13.5.44, 2019-09-20 - -### Bug Fix +#### Bug Fix * This release also contains all bug fixes from 19.14.6.12. * Fixed possible inconsistent state of a table while executing a `DROP` query for a replicated table while ZooKeeper is not accessible. [#6045](https://github.com/ClickHouse/ClickHouse/issues/6045) [#6413](https://github.com/ClickHouse/ClickHouse/pull/6413) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)) * Fix for data race in StorageMerge [#6717](https://github.com/ClickHouse/ClickHouse/pull/6717) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -736,9 +741,9 @@ fix comments to make obvious that it may throw. * Fixed parsing of `AggregateFunction` values embedded in query. [#6575](https://github.com/ClickHouse/ClickHouse/issues/6575) [#6773](https://github.com/ClickHouse/ClickHouse/pull/6773) ([Zhichang Yu](https://github.com/yuzhichang)) * Fixed wrong behaviour of the `trim` family of functions. [#6647](https://github.com/ClickHouse/ClickHouse/pull/6647) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.13.4.32, 2019-09-10 +### ClickHouse release 19.13.4.32, 2019-09-10 -### Bug Fix +#### Bug Fix * This release also contains all security fixes from 19.11.9.52 and 19.11.10.54. * Fixed data race in `system.parts` table and `ALTER` query. [#6245](https://github.com/ClickHouse/ClickHouse/issues/6245) [#6513](https://github.com/ClickHouse/ClickHouse/pull/6513) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed mismatched header in streams that happened when reading from an empty distributed table with sample and prewhere. [#6167](https://github.com/ClickHouse/ClickHouse/issues/6167) ([Lixiang Qian](https://github.com/fancyqlx)) [#6823](https://github.com/ClickHouse/ClickHouse/pull/6823) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
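The `trim` family mentioned above includes `trim`, `trimLeft`, `trimRight` and the SQL-standard form. A couple of illustrative queries of ours:

```sql
SELECT trim(BOTH 'x' FROM 'xxclickhousexx');        -- 'clickhouse'
SELECT trimLeft('  padded'), trimRight('padded  '); -- 'padded', 'padded'
```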
@@ -749,29 +754,83 @@ fix comments to make obvious that it may throw. * Query transformation for `MySQL`, `ODBC`, `JDBC` table functions now works properly for `SELECT WHERE` queries with multiple `AND` expressions. [#6381](https://github.com/ClickHouse/ClickHouse/issues/6381) [#6676](https://github.com/ClickHouse/ClickHouse/pull/6676) ([dimarub2000](https://github.com/dimarub2000)) * Added previous declaration checks for MySQL 8 integration. [#6569](https://github.com/ClickHouse/ClickHouse/pull/6569) ([Rafael David Tinoco](https://github.com/rafaeldtinoco)) -### Security Fix +#### Security Fix * Fix two vulnerabilities in codecs in decompression phase (a malicious user can fabricate compressed data that will lead to buffer overflow in decompression). [#6670](https://github.com/ClickHouse/ClickHouse/pull/6670) ([Artem Zuikov](https://github.com/4ertus2)) -## ClickHouse release 19.11.13.74, 2019-11-01 -### Bug Fix +### ClickHouse release 19.13.3.26, 2019-08-22 + +#### Bug Fix +* Fix `ALTER TABLE ... UPDATE` query for tables with `enable_mixed_granularity_parts=1`. [#6543](https://github.com/ClickHouse/ClickHouse/pull/6543) ([alesapin](https://github.com/alesapin)) +* Fix NPE when using IN clause with a subquery with a tuple. [#6125](https://github.com/ClickHouse/ClickHouse/issues/6125) [#6550](https://github.com/ClickHouse/ClickHouse/pull/6550) ([tavplubix](https://github.com/tavplubix)) +* Fixed an issue that if a stale replica becomes alive, it may still have data parts that were removed by DROP PARTITION. [#6522](https://github.com/ClickHouse/ClickHouse/issues/6522) [#6523](https://github.com/ClickHouse/ClickHouse/pull/6523) ([tavplubix](https://github.com/tavplubix)) +* Fixed issue with parsing CSV [#6426](https://github.com/ClickHouse/ClickHouse/issues/6426) [#6559](https://github.com/ClickHouse/ClickHouse/pull/6559) ([tavplubix](https://github.com/tavplubix)) +* Fixed data race in system.parts table and ALTER query. This fixes [#6245](https://github.com/ClickHouse/ClickHouse/issues/6245). [#6513](https://github.com/ClickHouse/ClickHouse/pull/6513) ([alexey-milovidov](https://github.com/alexey-milovidov)) +* Fixed wrong code in mutations that may lead to memory corruption. Fixed a segfault with read of address `0x14c0` that may happen due to concurrent `DROP TABLE` and `SELECT` from `system.parts` or `system.parts_columns`. Fixed race condition in preparation of mutation queries. Fixed deadlock caused by `OPTIMIZE` of Replicated tables and concurrent modification operations like ALTERs. [#6514](https://github.com/ClickHouse/ClickHouse/pull/6514) ([alexey-milovidov](https://github.com/alexey-milovidov)) +* Fixed possible data loss after `ALTER DELETE` query on a table with a skipping index. [#6224](https://github.com/ClickHouse/ClickHouse/issues/6224) [#6282](https://github.com/ClickHouse/ClickHouse/pull/6282) ([Nikita Vasilev](https://github.com/nikvas0)) + +#### Security Fix +* If the attacker has write access to ZooKeeper and is able to run a custom server available from the network where ClickHouse runs, they can create a custom-built malicious server that will act as a ClickHouse replica and register it in ZooKeeper. When another replica fetches a data part from the malicious replica, it can force clickhouse-server to write to an arbitrary path on the filesystem.
Found by Eldar Zaitov, information security team at Yandex. [#6247](https://github.com/ClickHouse/ClickHouse/pull/6247) ([alexey-milovidov](https://github.com/alexey-milovidov)) + +### ClickHouse release 19.13.2.19, 2019-08-14 + +#### New Feature +* Sampling profiler on query level. [Example](https://gist.github.com/alexey-milovidov/92758583dd41c24c360fdb8d6a4da194). [#4247](https://github.com/ClickHouse/ClickHouse/issues/4247) ([laplab](https://github.com/laplab)) [#6124](https://github.com/ClickHouse/ClickHouse/pull/6124) ([alexey-milovidov](https://github.com/alexey-milovidov)) [#6250](https://github.com/ClickHouse/ClickHouse/pull/6250) [#6283](https://github.com/ClickHouse/ClickHouse/pull/6283) [#6386](https://github.com/ClickHouse/ClickHouse/pull/6386) +* Allow to specify a list of columns with `COLUMNS('regexp')` expression that works like a more sophisticated variant of the `*` asterisk. [#5951](https://github.com/ClickHouse/ClickHouse/pull/5951) ([mfridental](https://github.com/mfridental)), ([alexey-milovidov](https://github.com/alexey-milovidov)) +* `CREATE TABLE AS table_function()` is now possible [#6057](https://github.com/ClickHouse/ClickHouse/pull/6057) ([dimarub2000](https://github.com/dimarub2000)) +* Adam optimizer for stochastic gradient descent is used by default in `stochasticLinearRegression()` and `stochasticLogisticRegression()` aggregate functions, because it shows good quality with almost no tuning. [#6000](https://github.com/ClickHouse/ClickHouse/pull/6000) ([Quid37](https://github.com/Quid37)) +* Added functions for working with the custom week number [#5212](https://github.com/ClickHouse/ClickHouse/pull/5212) ([Andy Yang](https://github.com/andyyzh)) +* `RENAME` queries now work with all storages. [#5953](https://github.com/ClickHouse/ClickHouse/pull/5953) ([Ivan](https://github.com/abyss7)) +* Now the client receives logs from the server at any desired level by setting `send_logs_level`, regardless of the log level specified in the server settings. [#5964](https://github.com/ClickHouse/ClickHouse/pull/5964) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)) + +#### Backward Incompatible Change +* The setting `input_format_defaults_for_omitted_fields` is enabled by default. Inserts in Distributed tables need this setting to be the same across the cluster (you need to set it before a rolling update). It enables calculation of complex default expressions for omitted fields in `JSONEachRow` and `CSV*` formats. It should be the expected behavior but may lead to a negligible performance difference. [#6043](https://github.com/ClickHouse/ClickHouse/pull/6043) ([Artem Zuikov](https://github.com/4ertus2)), [#5625](https://github.com/ClickHouse/ClickHouse/pull/5625) ([akuzm](https://github.com/akuzm)) + +#### Experimental features +* New query processing pipeline. Use the `experimental_use_processors=1` option to enable it. Use it at your own risk. [#4914](https://github.com/ClickHouse/ClickHouse/pull/4914) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) + +#### Bug Fix +* Kafka integration has been fixed in this version. +* Fixed `DoubleDelta` encoding of `Int64` for large `DoubleDelta` values, improved `DoubleDelta` encoding for random data for `Int32`. [#5998](https://github.com/ClickHouse/ClickHouse/pull/5998) ([Vasily Nemkov](https://github.com/Enmk)) +* Fixed overestimation of `max_rows_to_read` if the setting `merge_tree_uniform_read_distribution` is set to 0. [#6019](https://github.com/ClickHouse/ClickHouse/pull/6019) ([alexey-milovidov](https://github.com/alexey-milovidov))
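For reference (our example; `hits` is a hypothetical table), both settings from the entry above can be applied per query:

```sql
SELECT count()
FROM hits
SETTINGS merge_tree_uniform_read_distribution = 0, max_rows_to_read = 100000000;
```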
+ +#### Improvement +* Throw an exception if a `config.d` file doesn't have the same root element as the config file [#6123](https://github.com/ClickHouse/ClickHouse/pull/6123) ([dimarub2000](https://github.com/dimarub2000)) + +#### Performance Improvement +* Optimize `count()`. Now it uses the smallest column (if possible). [#6028](https://github.com/ClickHouse/ClickHouse/pull/6028) ([Amos Bird](https://github.com/amosbird)) + +#### Build/Testing/Packaging Improvement +* Report memory usage in performance tests. [#5899](https://github.com/ClickHouse/ClickHouse/pull/5899) ([akuzm](https://github.com/akuzm)) +* Fix build with external `libcxx` [#6010](https://github.com/ClickHouse/ClickHouse/pull/6010) ([Ivan](https://github.com/abyss7)) +* Fix shared build with `rdkafka` library [#6101](https://github.com/ClickHouse/ClickHouse/pull/6101) ([Ivan](https://github.com/abyss7)) + +## ClickHouse release 19.11 + +### ClickHouse release 19.11.13.74, 2019-11-01 + +#### Bug Fix * Fixed rare crash in `ALTER MODIFY COLUMN` and vertical merge when one of merged/altered parts is empty (0 rows). [#6780](https://github.com/ClickHouse/ClickHouse/pull/6780) ([alesapin](https://github.com/alesapin)) * Manual update of `SIMDJSON`. This fixes possible flooding of stderr files with bogus JSON diagnostic messages. [#7548](https://github.com/ClickHouse/ClickHouse/pull/7548) ([Alexander Kazakov](https://github.com/Akazz)) * Fixed bug with `mrk` file extension for mutations ([alesapin](https://github.com/alesapin)) -## ClickHouse release 19.11.12.69, 2019-10-02 +### ClickHouse release 19.11.12.69, 2019-10-02 -### Bug Fix +#### Bug Fix * Fixed performance degradation of index analysis on complex keys on large tables. This fixes [#6924](https://github.com/ClickHouse/ClickHouse/issues/6924). [#7075](https://github.com/ClickHouse/ClickHouse/pull/7075) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Avoid rare SIGSEGV while sending data in tables with Distributed engine (`Failed to send batch: file with index XXXXX is absent`). [#7032](https://github.com/ClickHouse/ClickHouse/pull/7032) ([Azat Khuzhin](https://github.com/azat)) * Fix `Unknown identifier` with multiple joins. This fixes [#5254](https://github.com/ClickHouse/ClickHouse/issues/5254). [#7022](https://github.com/ClickHouse/ClickHouse/pull/7022) ([Artem Zuikov](https://github.com/4ertus2)) -## ClickHouse release 19.11.10.54, 2019-09-10 +### ClickHouse release 19.11.11.57, 2019-09-13 +* Fix logical error causing segfaults when selecting from an empty Kafka topic. [#6902](https://github.com/ClickHouse/ClickHouse/issues/6902) [#6909](https://github.com/ClickHouse/ClickHouse/pull/6909) ([Ivan](https://github.com/abyss7)) +* Fix for function `arrayEnumerateUniqRanked` with empty arrays in params. [#6928](https://github.com/ClickHouse/ClickHouse/pull/6928) ([proller](https://github.com/proller)) -### Bug Fix +### ClickHouse release 19.11.10.54, 2019-09-10 + +#### Bug Fix * Store offsets for Kafka messages manually to be able to commit them all at once for all partitions. Fixes potential duplication in the "one consumer - many partitions" scenario. [#6872](https://github.com/ClickHouse/ClickHouse/pull/6872) ([Ivan](https://github.com/abyss7)) -## ClickHouse release 19.11.9.52, 2019-09-6 +### ClickHouse release 19.11.9.52, 2019-09-06 * Improve error handling in cache dictionaries. 
[#6737](https://github.com/ClickHouse/ClickHouse/pull/6737) ([Vitaly Baranov](https://github.com/vitlibar)) * Fixed bug in function `arrayEnumerateUniqRanked`. [#6779](https://github.com/ClickHouse/ClickHouse/pull/6779) ([proller](https://github.com/proller)) * Fix `JSONExtract` function while extracting a `Tuple` from JSON. [#6718](https://github.com/ClickHouse/ClickHouse/pull/6718) ([Vitaly Baranov](https://github.com/vitlibar)) @@ -784,63 +843,12 @@ fix comments to make obvious that it may throw. * Fixed error with processing "timezone" in server configuration file. [#6709](https://github.com/ClickHouse/ClickHouse/pull/6709) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix kafka tests. [#6805](https://github.com/ClickHouse/ClickHouse/pull/6805) ([Ivan](https://github.com/abyss7)) -### Security Fix +#### Security Fix * If the attacker has write access to ZooKeeper and is able to run custom server available from the network where ClickHouse runs, it can create custom-built malicious server that will act as ClickHouse replica and register it in ZooKeeper. When another replica will fetch data part from malicious replica, it can force clickhouse-server to write to arbitrary path on filesystem. Found by Eldar Zaitov, information security team at Yandex. [#6247](https://github.com/ClickHouse/ClickHouse/pull/6247) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.13.3.26, 2019-08-22 +### ClickHouse release 19.11.8.46, 2019-08-22 -### Bug Fix -* Fix `ALTER TABLE ... UPDATE` query for tables with `enable_mixed_granularity_parts=1`. [#6543](https://github.com/ClickHouse/ClickHouse/pull/6543) ([alesapin](https://github.com/alesapin)) -* Fix NPE when using IN clause with a subquery with a tuple. [#6125](https://github.com/ClickHouse/ClickHouse/issues/6125) [#6550](https://github.com/ClickHouse/ClickHouse/pull/6550) ([tavplubix](https://github.com/tavplubix)) -* Fixed an issue that if a stale replica becomes alive, it may still have data parts that were removed by DROP PARTITION. [#6522](https://github.com/ClickHouse/ClickHouse/issues/6522) [#6523](https://github.com/ClickHouse/ClickHouse/pull/6523) ([tavplubix](https://github.com/tavplubix)) -* Fixed issue with parsing CSV [#6426](https://github.com/ClickHouse/ClickHouse/issues/6426) [#6559](https://github.com/ClickHouse/ClickHouse/pull/6559) ([tavplubix](https://github.com/tavplubix)) -* Fixed data race in system.parts table and ALTER query. This fixes [#6245](https://github.com/ClickHouse/ClickHouse/issues/6245). [#6513](https://github.com/ClickHouse/ClickHouse/pull/6513) ([alexey-milovidov](https://github.com/alexey-milovidov)) -* Fixed wrong code in mutations that may lead to memory corruption. Fixed segfault with read of address `0x14c0` that may happed due to concurrent `DROP TABLE` and `SELECT` from `system.parts` or `system.parts_columns`. Fixed race condition in preparation of mutation queries. Fixed deadlock caused by `OPTIMIZE` of Replicated tables and concurrent modification operations like ALTERs. [#6514](https://github.com/ClickHouse/ClickHouse/pull/6514) ([alexey-milovidov](https://github.com/alexey-milovidov)) -* Fixed possible data loss after `ALTER DELETE` query on table with skipping index. 
[#6224](https://github.com/ClickHouse/ClickHouse/issues/6224) [#6282](https://github.com/ClickHouse/ClickHouse/pull/6282) ([Nikita Vasilev](https://github.com/nikvas0)) - -### Security Fix -* If the attacker has write access to ZooKeeper and is able to run custom server available from the network where ClickHouse run, it can create custom-built malicious server that will act as ClickHouse replica and register it in ZooKeeper. When another replica will fetch data part from malicious replica, it can force clickhouse-server to write to arbitrary path on filesystem. Found by Eldar Zaitov, information security team at Yandex. [#6247](https://github.com/ClickHouse/ClickHouse/pull/6247) ([alexey-milovidov](https://github.com/alexey-milovidov)) - -## ClickHouse release 19.13.2.19, 2019-08-14 - -### New Feature -* Sampling profiler on query level. [Example](https://gist.github.com/alexey-milovidov/92758583dd41c24c360fdb8d6a4da194). [#4247](https://github.com/ClickHouse/ClickHouse/issues/4247) ([laplab](https://github.com/laplab)) [#6124](https://github.com/ClickHouse/ClickHouse/pull/6124) ([alexey-milovidov](https://github.com/alexey-milovidov)) [#6250](https://github.com/ClickHouse/ClickHouse/pull/6250) [#6283](https://github.com/ClickHouse/ClickHouse/pull/6283) [#6386](https://github.com/ClickHouse/ClickHouse/pull/6386) -* Allow to specify a list of columns with `COLUMNS('regexp')` expression that works like a more sophisticated variant of `*` asterisk. [#5951](https://github.com/ClickHouse/ClickHouse/pull/5951) ([mfridental](https://github.com/mfridental)), ([alexey-milovidov](https://github.com/alexey-milovidov)) -* `CREATE TABLE AS table_function()` is now possible [#6057](https://github.com/ClickHouse/ClickHouse/pull/6057) ([dimarub2000](https://github.com/dimarub2000)) -* Adam optimizer for stochastic gradient descent is used by default in `stochasticLinearRegression()` and `stochasticLogisticRegression()` aggregate functions, because it shows good quality without almost any tuning. [#6000](https://github.com/ClickHouse/ClickHouse/pull/6000) ([Quid37](https://github.com/Quid37)) -* Added functions for working with the сustom week number [#5212](https://github.com/ClickHouse/ClickHouse/pull/5212) ([Andy Yang](https://github.com/andyyzh)) -* `RENAME` queries now work with all storages. [#5953](https://github.com/ClickHouse/ClickHouse/pull/5953) ([Ivan](https://github.com/abyss7)) -* Now client receive logs from server with any desired level by setting `send_logs_level` regardless to the log level specified in server settings. [#5964](https://github.com/ClickHouse/ClickHouse/pull/5964) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)) - -### Backward Incompatible Change -* The setting `input_format_defaults_for_omitted_fields` is enabled by default. Inserts in Distributed tables need this setting to be the same on cluster (you need to set it before rolling update). It enables calculation of complex default expressions for omitted fields in `JSONEachRow` and `CSV*` formats. It should be the expected behavior but may lead to negligible performance difference. [#6043](https://github.com/ClickHouse/ClickHouse/pull/6043) ([Artem Zuikov](https://github.com/4ertus2)), [#5625](https://github.com/ClickHouse/ClickHouse/pull/5625) ([akuzm](https://github.com/akuzm)) - -### Experimental features -* New query processing pipeline. Use `experimental_use_processors=1` option to enable it. Use for your own trouble. 
[#4914](https://github.com/ClickHouse/ClickHouse/pull/4914) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) - -### Bug Fix -* Kafka integration has been fixed in this version. -* Fixed `DoubleDelta` encoding of `Int64` for large `DoubleDelta` values, improved `DoubleDelta` encoding for random data for `Int32`. [#5998](https://github.com/ClickHouse/ClickHouse/pull/5998) ([Vasily Nemkov](https://github.com/Enmk)) -* Fixed overestimation of `max_rows_to_read` if the setting `merge_tree_uniform_read_distribution` is set to 0. [#6019](https://github.com/ClickHouse/ClickHouse/pull/6019) ([alexey-milovidov](https://github.com/alexey-milovidov)) - -### Improvement -* Throws an exception if `config.d` file doesn't have the corresponding root element as the config file [#6123](https://github.com/ClickHouse/ClickHouse/pull/6123) ([dimarub2000](https://github.com/dimarub2000)) - -### Performance Improvement -* Optimize `count()`. Now it uses the smallest column (if possible). [#6028](https://github.com/ClickHouse/ClickHouse/pull/6028) ([Amos Bird](https://github.com/amosbird)) - -### Build/Testing/Packaging Improvement -* Report memory usage in performance tests. [#5899](https://github.com/ClickHouse/ClickHouse/pull/5899) ([akuzm](https://github.com/akuzm)) -* Fix build with external `libcxx` [#6010](https://github.com/ClickHouse/ClickHouse/pull/6010) ([Ivan](https://github.com/abyss7)) -* Fix shared build with `rdkafka` library [#6101](https://github.com/ClickHouse/ClickHouse/pull/6101) ([Ivan](https://github.com/abyss7)) - -## ClickHouse release 19.11.11.57, 2019-09-13 -* Fix logical error causing segfaults when selecting from Kafka empty topic. [#6902](https://github.com/ClickHouse/ClickHouse/issues/6902) [#6909](https://github.com/ClickHouse/ClickHouse/pull/6909) ([Ivan](https://github.com/abyss7)) -* Fix for function `АrrayEnumerateUniqRanked` with empty arrays in params. [#6928](https://github.com/ClickHouse/ClickHouse/pull/6928) ([proller](https://github.com/proller)) - -## ClickHouse release 19.11.8.46, 2019-08-22 - -### Bug Fix +#### Bug Fix * Fix `ALTER TABLE ... UPDATE` query for tables with `enable_mixed_granularity_parts=1`. [#6543](https://github.com/ClickHouse/ClickHouse/pull/6543) ([alesapin](https://github.com/alesapin)) * Fix NPE when using IN clause with a subquery with a tuple. [#6125](https://github.com/ClickHouse/ClickHouse/issues/6125) [#6550](https://github.com/ClickHouse/ClickHouse/pull/6550) ([tavplubix](https://github.com/tavplubix)) * Fixed an issue that if a stale replica becomes alive, it may still have data parts that were removed by DROP PARTITION. [#6522](https://github.com/ClickHouse/ClickHouse/issues/6522) [#6523](https://github.com/ClickHouse/ClickHouse/pull/6523) ([tavplubix](https://github.com/tavplubix)) @@ -848,9 +856,9 @@ fix comments to make obvious that it may throw. * Fixed data race in system.parts table and ALTER query. This fixes [#6245](https://github.com/ClickHouse/ClickHouse/issues/6245). [#6513](https://github.com/ClickHouse/ClickHouse/pull/6513) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed wrong code in mutations that may lead to memory corruption. Fixed segfault with read of address `0x14c0` that may happed due to concurrent `DROP TABLE` and `SELECT` from `system.parts` or `system.parts_columns`. Fixed race condition in preparation of mutation queries. Fixed deadlock caused by `OPTIMIZE` of Replicated tables and concurrent modification operations like ALTERs. 
[#6514](https://github.com/ClickHouse/ClickHouse/pull/6514) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.11.7.40, 2019-08-14 +### ClickHouse release 19.11.7.40, 2019-08-14 -### Bug fix +#### Bug fix * Kafka integration has been fixed in this version. * Fix segfault when using `arrayReduce` for constant arguments. [#6326](https://github.com/ClickHouse/ClickHouse/pull/6326) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed `toFloat()` monotonicity. [#6374](https://github.com/ClickHouse/ClickHouse/pull/6374) ([dimarub2000](https://github.com/dimarub2000)) @@ -865,12 +873,12 @@ fix comments to make obvious that it may throw. * Fixed the possibility of a fabricated query to cause server crash due to stack overflow in SQL parser and possibility of stack overflow in `Merge` and `Distributed` tables [#6433](https://github.com/ClickHouse/ClickHouse/pull/6433) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed Gorilla encoding error on small sequences. [#6444](https://github.com/ClickHouse/ClickHouse/pull/6444) ([Enmk](https://github.com/Enmk)) -### Improvement +#### Improvement * Allow user to override `poll_interval` and `idle_connection_timeout` settings on connection. [#6230](https://github.com/ClickHouse/ClickHouse/pull/6230) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.11.5.28, 2019-08-05 +### ClickHouse release 19.11.5.28, 2019-08-05 -### Bug fix +#### Bug fix * Fixed the possibility of hanging queries when server is overloaded. [#6301](https://github.com/ClickHouse/ClickHouse/pull/6301) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix FPE in yandexConsistentHash function. This fixes [#6304](https://github.com/ClickHouse/ClickHouse/issues/6304). [#6126](https://github.com/ClickHouse/ClickHouse/pull/6126) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed bug in conversion of `LowCardinality` types in `AggregateFunctionFactory`. This fixes [#6257](https://github.com/ClickHouse/ClickHouse/issues/6257). [#6281](https://github.com/ClickHouse/ClickHouse/pull/6281) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) @@ -878,12 +886,12 @@ fix comments to make obvious that it may throw. * Fix rare bug with incompatible stream headers in queries to `Distributed` table over `MergeTree` table when part of `WHERE` moves to `PREWHERE`. [#6236](https://github.com/ClickHouse/ClickHouse/pull/6236) ([alesapin](https://github.com/alesapin)) * Fixed overflow in integer division of signed type to unsigned type. This fixes [#6214](https://github.com/ClickHouse/ClickHouse/issues/6214). [#6233](https://github.com/ClickHouse/ClickHouse/pull/6233) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Backward Incompatible Change +#### Backward Incompatible Change * `Kafka` still broken. -## ClickHouse release 19.11.4.24, 2019-08-01 +### ClickHouse release 19.11.4.24, 2019-08-01 -### Bug Fix +#### Bug Fix * Fix bug with writing secondary indices marks with adaptive granularity. [#6126](https://github.com/ClickHouse/ClickHouse/pull/6126) ([alesapin](https://github.com/alesapin)) * Fix `WITH ROLLUP` and `WITH CUBE` modifiers of `GROUP BY` with two-level aggregation. [#6225](https://github.com/ClickHouse/ClickHouse/pull/6225) ([Anton Popov](https://github.com/CurtizJ)) * Fixed hang in `JSONExtractRaw` function. 
@@ -898,18 +906,18 @@ fix comments to make obvious that it may throw. * Clearing the Kafka data buffer from the previous read operation that was completed with an error [#6026](https://github.com/ClickHouse/ClickHouse/pull/6026) ([Nikolay](https://github.com/bopohaa)) Note that Kafka is broken in this version. * Since `StorageMergeTree::background_task_handle` is initialized in `startup()` the `MergeTreeBlockOutputStream::write()` may try to use it before initialization. Just check if it is initialized. [#6080](https://github.com/ClickHouse/ClickHouse/pull/6080) ([Ivan](https://github.com/abyss7)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Added official `rpm` packages. [#5740](https://github.com/ClickHouse/ClickHouse/pull/5740) ([proller](https://github.com/proller)) ([alesapin](https://github.com/alesapin)) * Add an ability to build `.rpm` and `.tgz` packages with `packager` script. [#5769](https://github.com/ClickHouse/ClickHouse/pull/5769) ([alesapin](https://github.com/alesapin)) * Fixes for "Arcadia" build system. [#6223](https://github.com/ClickHouse/ClickHouse/pull/6223) ([proller](https://github.com/proller)) -### Backward Incompatible Change +#### Backward Incompatible Change * `Kafka` is broken in this version. -## ClickHouse release 19.11.3.11, 2019-07-18 +### ClickHouse release 19.11.3.11, 2019-07-18 -### New Feature +#### New Feature * Added support for prepared statements. [#5331](https://github.com/ClickHouse/ClickHouse/pull/5331/) ([Alexander](https://github.com/sanych73)) [#5630](https://github.com/ClickHouse/ClickHouse/pull/5630) ([alexey-milovidov](https://github.com/alexey-milovidov)) * `DoubleDelta` and `Gorilla` column codecs (see the usage sketch below) [#5600](https://github.com/ClickHouse/ClickHouse/pull/5600) ([Vasily Nemkov](https://github.com/Enmk)) * Added `os_thread_priority` setting that allows to control the "nice" value of query processing threads that is used by OS to adjust dynamic scheduling priority. It requires `CAP_SYS_NICE` capabilities to work. This implements [#5858](https://github.com/ClickHouse/ClickHouse/issues/5858) [#5909](https://github.com/ClickHouse/ClickHouse/pull/5909) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -919,7 +927,7 @@ fix comments to make obvious that it may throw. * Add synonym `arrayFlatten` <-> `flatten` [#5764](https://github.com/ClickHouse/ClickHouse/pull/5764) ([hcz](https://github.com/hczhcz)) * Integrate H3 function `geoToH3` from Uber. [#4724](https://github.com/ClickHouse/ClickHouse/pull/4724) ([Remen Ivan](https://github.com/BHYCHIK)) [#5805](https://github.com/ClickHouse/ClickHouse/pull/5805) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Bug Fix +#### Bug Fix * Implement DNS cache with asynchronous update. Separate thread resolves all hosts and updates DNS cache with period (setting `dns_cache_update_period`). It should help when the IP addresses of hosts change frequently. [#5857](https://github.com/ClickHouse/ClickHouse/pull/5857) ([Anton Popov](https://github.com/CurtizJ)) * Fix segfault in `Delta` codec which affects columns with values less than 32 bits size. The bug led to random memory corruption. [#5786](https://github.com/ClickHouse/ClickHouse/pull/5786) ([alesapin](https://github.com/alesapin)) * Fix segfault in TTL merge with non-physical columns in block. [#5819](https://github.com/ClickHouse/ClickHouse/pull/5819) ([Anton Popov](https://github.com/CurtizJ))
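As referenced in the `DoubleDelta` and `Gorilla` entry above, a minimal sketch of how the new codecs can be attached to columns; the table and column names are hypothetical:

```sql
-- Hypothetical time-series table: DoubleDelta suits slowly growing integer
-- sequences such as timestamps, Gorilla suits floating-point gauge values.
CREATE TABLE sensor_readings
(
    ts DateTime CODEC(DoubleDelta),
    value Float64 CODEC(Gorilla)
)
ENGINE = MergeTree()
ORDER BY ts;
```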
@@ -946,7 +954,7 @@ fix comments to make obvious that it may throw. * Fix shutdown of SystemLogs [#5802](https://github.com/ClickHouse/ClickHouse/pull/5802) ([Anton Popov](https://github.com/CurtizJ)) * Fix hanging when condition in invalidate_query depends on a dictionary. [#6011](https://github.com/ClickHouse/ClickHouse/pull/6011) ([Vitaly Baranov](https://github.com/vitlibar)) -### Improvement +#### Improvement * Allow unresolvable addresses in cluster configuration. They will be considered unavailable, and ClickHouse will try to resolve them at every connection attempt. This is especially useful for Kubernetes. This fixes [#5714](https://github.com/ClickHouse/ClickHouse/issues/5714) [#5924](https://github.com/ClickHouse/ClickHouse/pull/5924) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Close idle TCP connections (with one hour timeout by default). This is especially important for large clusters with multiple distributed tables on every server, because every server can possibly keep a connection pool to every other server, and after peak query concurrency, connections will stall. This fixes [#5879](https://github.com/ClickHouse/ClickHouse/issues/5879) [#5880](https://github.com/ClickHouse/ClickHouse/pull/5880) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Better quality of `topK` function. Changed the SpaceSaving set behavior to remove the last element if the new element has a bigger weight. [#5833](https://github.com/ClickHouse/ClickHouse/issues/5833) [#5850](https://github.com/ClickHouse/ClickHouse/pull/5850) ([Guillaume Tassery](https://github.com/YiuRULE)) @@ -967,10 +975,10 @@ fix comments to make obvious that it may throw. * Update default value of `max_ast_elements` parameter [#5933](https://github.com/ClickHouse/ClickHouse/pull/5933) ([Artem Konovalov](https://github.com/izebit)) * Added a notion of obsolete settings. The obsolete setting `allow_experimental_low_cardinality_type` can be used with no effect. [0f15c01c6802f7ce1a1494c12c846be8c98944cd](https://github.com/ClickHouse/ClickHouse/commit/0f15c01c6802f7ce1a1494c12c846be8c98944cd) [Alexey Milovidov](https://github.com/alexey-milovidov) -### Performance Improvement +#### Performance Improvement * Increase number of streams to SELECT from Merge table for more uniform distribution of threads. Added setting `max_streams_multiplier_for_merge_tables`. This fixes [#5797](https://github.com/ClickHouse/ClickHouse/issues/5797) [#5915](https://github.com/ClickHouse/ClickHouse/pull/5915) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Add a backward compatibility test for client-server interaction with different versions of clickhouse. [#5868](https://github.com/ClickHouse/ClickHouse/pull/5868) ([alesapin](https://github.com/alesapin)) * Test coverage information in every commit and pull request. [#5896](https://github.com/ClickHouse/ClickHouse/pull/5896) ([alesapin](https://github.com/alesapin)) * Cooperate with address sanitizer to support our custom allocators (`Arena` and `ArenaWithFreeLists`) for better debugging of "use-after-free" errors. [#5728](https://github.com/ClickHouse/ClickHouse/pull/5728) ([akuzm](https://github.com/akuzm)) @@ -1007,22 +1015,23 @@
* Performance test concerning the new JIT feature with bigger dataset, as requested here [#5263](https://github.com/ClickHouse/ClickHouse/issues/5263) [#5887](https://github.com/ClickHouse/ClickHouse/pull/5887) ([Guillaume Tassery](https://github.com/YiuRULE)) * Run stateful tests in stress test [12693e568722f11e19859742f56428455501fd2a](https://github.com/ClickHouse/ClickHouse/commit/12693e568722f11e19859742f56428455501fd2a) ([alesapin](https://github.com/alesapin)) -### Backward Incompatible Change +#### Backward Incompatible Change * `Kafka` is broken in this version. * Enable `adaptive_index_granularity` = 10MB by default for new `MergeTree` tables. If you created new MergeTree tables on version 19.11+, downgrading to versions prior to 19.6 will be impossible. [#5628](https://github.com/ClickHouse/ClickHouse/pull/5628) ([alesapin](https://github.com/alesapin)) * Removed obsolete undocumented embedded dictionaries that were used by Yandex.Metrica. The functions `OSIn`, `SEIn`, `OSToRoot`, `SEToRoot`, `OSHierarchy`, `SEHierarchy` are no longer available. If you are using these functions, write email to clickhouse-feedback@yandex-team.com. Note: at the last moment we decided to keep these functions for a while. [#5780](https://github.com/ClickHouse/ClickHouse/pull/5780) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.10.1.5, 2019-07-12 +## ClickHouse release 19.10 +### ClickHouse release 19.10.1.5, 2019-07-12 -### New Feature +#### New Feature * Add new column codec: `T64`. Made for (U)IntX/EnumX/Date(Time)/DecimalX columns. It should be good for columns with constant or small range values. The codec itself allows enlarging or shrinking the data type without re-compression. [#5557](https://github.com/ClickHouse/ClickHouse/pull/5557) ([Artem Zuikov](https://github.com/4ertus2)) * Add database engine `MySQL` that allows viewing all the tables in a remote MySQL server [#5599](https://github.com/ClickHouse/ClickHouse/pull/5599) ([Winter Zhang](https://github.com/zhang2014)) * `bitmapContains` implementation. It's 2x faster than `bitmapHasAny` if the second bitmap contains one element. [#5535](https://github.com/ClickHouse/ClickHouse/pull/5535) ([Zhichang Yu](https://github.com/yuzhichang)) * Support for `crc32` function (with behaviour exactly as in MySQL or PHP). Do not use it if you need a hash function. [#5661](https://github.com/ClickHouse/ClickHouse/pull/5661) ([Remen Ivan](https://github.com/BHYCHIK)) * Implemented `SYSTEM START/STOP DISTRIBUTED SENDS` queries to control asynchronous inserts into `Distributed` tables (see the sketch below). [#4935](https://github.com/ClickHouse/ClickHouse/pull/4935) ([Winter Zhang](https://github.com/zhang2014)) -### Bug Fix +#### Bug Fix * Ignore query execution limits and max parts size for merge limits while executing mutations. [#5659](https://github.com/ClickHouse/ClickHouse/pull/5659) ([Anton Popov](https://github.com/CurtizJ)) * Fix bug which may lead to deduplication of normal blocks (extremely rare) and insertion of duplicate blocks (more often). [#5549](https://github.com/ClickHouse/ClickHouse/pull/5549) ([alesapin](https://github.com/alesapin)) * Fix of function `arrayEnumerateUniqRanked` for arguments with empty arrays [#5559](https://github.com/ClickHouse/ClickHouse/pull/5559) ([proller](https://github.com/proller))
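A minimal sketch of the new `SYSTEM ... DISTRIBUTED SENDS` control referenced above; the database and table names are placeholders:

```sql
-- Pause background sends for a hypothetical Distributed table,
-- then resume them once maintenance is over.
SYSTEM STOP DISTRIBUTED SENDS default.hits_distributed;
SYSTEM START DISTRIBUTED SENDS default.hits_distributed;
```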
@@ -1032,7 +1041,7 @@ fix comments to make obvious that it may throw. * Fix Float to Decimal convert overflow [#5607](https://github.com/ClickHouse/ClickHouse/pull/5607) ([coraxster](https://github.com/coraxster)) * Flush buffer when `WriteBufferFromHDFS`'s destructor is called. This fixes writing into `HDFS`. [#5684](https://github.com/ClickHouse/ClickHouse/pull/5684) ([Xindong Peng](https://github.com/eejoin)) -### Improvement +#### Improvement * Treat empty cells in `CSV` as default values when the setting `input_format_defaults_for_omitted_fields` is enabled. [#5625](https://github.com/ClickHouse/ClickHouse/pull/5625) ([akuzm](https://github.com/akuzm)) * Non-blocking loading of external dictionaries. [#5567](https://github.com/ClickHouse/ClickHouse/pull/5567) ([Vitaly Baranov](https://github.com/vitlibar)) * Network timeouts can be dynamically changed for already established connections according to the settings. [#4558](https://github.com/ClickHouse/ClickHouse/pull/4558) ([Konstantin Podshumok](https://github.com/podshumok)) @@ -1043,21 +1052,22 @@ fix comments to make obvious that it may throw. * Support `` section in `clickhouse-local` config file. [#5540](https://github.com/ClickHouse/ClickHouse/pull/5540) ([proller](https://github.com/proller)) * Allow to run queries with `remote` table function in `clickhouse-local` [#5627](https://github.com/ClickHouse/ClickHouse/pull/5627) ([proller](https://github.com/proller)) -### Performance Improvement +#### Performance Improvement * Add the possibility to write the final mark at the end of MergeTree columns. It allows to avoid useless reads for keys that are out of table data range. It is enabled only if adaptive index granularity is in use. [#5624](https://github.com/ClickHouse/ClickHouse/pull/5624) ([alesapin](https://github.com/alesapin)) * Improved performance of MergeTree tables on very slow filesystems by reducing number of `stat` syscalls. [#5648](https://github.com/ClickHouse/ClickHouse/pull/5648) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed performance degradation in reading from MergeTree tables that was introduced in version 19.6. Fixes #5631. [#5633](https://github.com/ClickHouse/ClickHouse/pull/5633) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Implemented `TestKeeper` as an implementation of ZooKeeper interface used for testing [#5643](https://github.com/ClickHouse/ClickHouse/pull/5643) ([alexey-milovidov](https://github.com/alexey-milovidov)) ([levushkin aleksej](https://github.com/alexey-milovidov)) * From now on `.sql` tests can be run isolated by server, in parallel, with random database. It allows to run them faster, add new tests with custom server configurations, and be sure that different tests don't affect each other. [#5554](https://github.com/ClickHouse/ClickHouse/pull/5554) ([Ivan](https://github.com/abyss7)) * Remove `` and `` from performance tests [#5672](https://github.com/ClickHouse/ClickHouse/pull/5672) ([Olga Khvostikova](https://github.com/stavrolia)) * Fixed "select_format" performance test for `Pretty` formats [#5642](https://github.com/ClickHouse/ClickHouse/pull/5642) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.9.3.31, 2019-07-05 +## ClickHouse release 19.9 +### ClickHouse release 19.9.3.31, 2019-07-05 -### Bug Fix +#### Bug Fix * Fix segfault in Delta codec which affects columns with values less than 32 bits size. The bug led to random memory corruption.
[#5786](https://github.com/ClickHouse/ClickHouse/pull/5786) ([alesapin](https://github.com/alesapin)) * Fix rare bug in checking of part with LowCardinality column. [#5832](https://github.com/ClickHouse/ClickHouse/pull/5832) ([alesapin](https://github.com/alesapin)) * Fix segfault in TTL merge with non-physical columns in block. [#5819](https://github.com/ClickHouse/ClickHouse/pull/5819) ([Anton Popov](https://github.com/CurtizJ)) @@ -1067,26 +1077,21 @@ fix comments to make obvious that it may throw. * Fix race condition which caused some queries to not appear in query_log immediately after a SYSTEM FLUSH LOGS query. [#5685](https://github.com/ClickHouse/ClickHouse/pull/5685) ([Anton Popov](https://github.com/CurtizJ)) * Added missing support for constant arguments to `evalMLModel` function. [#5820](https://github.com/ClickHouse/ClickHouse/pull/5820) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.7.5.29, 2019-07-05 +### ClickHouse release 19.9.2.4, 2019-06-24 -### Bug Fix -* Fix performance regression in some queries with JOIN. [#5192](https://github.com/ClickHouse/ClickHouse/pull/5192) ([Winter Zhang](https://github.com/zhang2014)) - -## ClickHouse release 19.9.2.4, 2019-06-24 - -### New Feature +#### New Feature * Print information about frozen parts in `system.parts` table. [#5471](https://github.com/ClickHouse/ClickHouse/pull/5471) ([proller](https://github.com/proller)) * Ask for the client password on clickhouse-client start on a tty if it is not set in the arguments [#5092](https://github.com/ClickHouse/ClickHouse/pull/5092) ([proller](https://github.com/proller)) * Implement `dictGet` and `dictGetOrDefault` functions for Decimal types (see the sketch below). [#5394](https://github.com/ClickHouse/ClickHouse/pull/5394) ([Artem Zuikov](https://github.com/4ertus2)) -### Improvement +#### Improvement * Debian init: Add service stop timeout [#5522](https://github.com/ClickHouse/ClickHouse/pull/5522) ([proller](https://github.com/proller)) * Add setting forbidden by default to create table with suspicious types for LowCardinality [#5448](https://github.com/ClickHouse/ClickHouse/pull/5448) ([Olga Khvostikova](https://github.com/stavrolia)) * Regression functions return model weights when not used as State in function `evalMLMethod`. [#5411](https://github.com/ClickHouse/ClickHouse/pull/5411) ([Quid37](https://github.com/Quid37)) * Rename and improve regression methods. [#5492](https://github.com/ClickHouse/ClickHouse/pull/5492) ([Quid37](https://github.com/Quid37)) * Clearer interfaces of string searchers. [#5586](https://github.com/ClickHouse/ClickHouse/pull/5586) ([Danila Kutenin](https://github.com/danlark1)) -### Bug Fix +#### Bug Fix * Fix potential data loss in Kafka [#5445](https://github.com/ClickHouse/ClickHouse/pull/5445) ([Ivan](https://github.com/abyss7)) * Fix potential infinite loop in `PrettySpace` format when called with zero columns [#5560](https://github.com/ClickHouse/ClickHouse/pull/5560) ([Olga Khvostikova](https://github.com/stavrolia)) * Fixed UInt32 overflow bug in linear models. Allow eval ML model for non-const model argument. [#5516](https://github.com/ClickHouse/ClickHouse/pull/5516) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) @@ -1106,7 +1111,7 @@ fix comments to make obvious that it may throw. * Throw an exception on wrong integers in `dictGetT` functions instead of crash. [#5446](https://github.com/ClickHouse/ClickHouse/pull/5446) ([Artem Zuikov](https://github.com/4ertus2))
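As referenced in the Decimal `dictGet` entry above, a minimal sketch of the call shape; the dictionary name `prices`, its attribute `price`, and the keys are hypothetical:

```sql
-- Look up a Decimal attribute by key; dictGetOrDefault returns the supplied
-- default value when the key is absent from the dictionary.
SELECT
    dictGet('prices', 'price', toUInt64(1)) AS price_for_key_1,
    dictGetOrDefault('prices', 'price', toUInt64(999), toDecimal64(0, 2)) AS price_or_default;
```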
* Fix wrong element_count and load_factor for hashed dictionary in `system.dictionaries` table. [#5440](https://github.com/ClickHouse/ClickHouse/pull/5440) ([Azat Khuzhin](https://github.com/azat)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Fixed build without `Brotli` HTTP compression support (`ENABLE_BROTLI=OFF` cmake variable). [#5521](https://github.com/ClickHouse/ClickHouse/pull/5521) ([Anton Yuzhaninov](https://github.com/citrin)) * Include roaring.h as roaring/roaring.h [#5523](https://github.com/ClickHouse/ClickHouse/pull/5523) ([Orivej Desh](https://github.com/orivej)) * Fix gcc9 warnings in hyperscan (#line directive is evil!) [#5546](https://github.com/ClickHouse/ClickHouse/pull/5546) ([Danila Kutenin](https://github.com/danlark1)) @@ -1121,9 +1126,10 @@ fix comments to make obvious that it may throw. * Fix build clickhouse as submodule [#5574](https://github.com/ClickHouse/ClickHouse/pull/5574) ([proller](https://github.com/proller)) * Improve JSONExtract performance tests [#5444](https://github.com/ClickHouse/ClickHouse/pull/5444) ([Vitaly Baranov](https://github.com/vitlibar)) -## ClickHouse release 19.8.3.8, 2019-06-11 +## ClickHouse release 19.8 +### ClickHouse release 19.8.3.8, 2019-06-11 -### New Features +#### New Features * Added functions to work with JSON [#4686](https://github.com/ClickHouse/ClickHouse/pull/4686) ([hcz](https://github.com/hczhcz)) [#5124](https://github.com/ClickHouse/ClickHouse/pull/5124). ([Vitaly Baranov](https://github.com/vitlibar)) * Add a function basename, with a similar behaviour to a basename function, which exists in a lot of languages (`os.path.basename` in Python, `basename` in PHP, etc.). It works with both a UNIX-like path and a Windows path. [#5136](https://github.com/ClickHouse/ClickHouse/pull/5136) ([Guillaume Tassery](https://github.com/YiuRULE)) * Added `LIMIT n, m BY` or `LIMIT m OFFSET n BY` syntax to set offset of n for LIMIT BY clause (see the sketch below). [#5138](https://github.com/ClickHouse/ClickHouse/pull/5138) ([Anton Popov](https://github.com/CurtizJ)) @@ -1144,7 +1150,7 @@ fix comments to make obvious that it may throw. * Added functions `IPv4CIDRtoIPv4Range` and `IPv6CIDRtoIPv6Range` to calculate the lower and higher bounds for an IP in the subnet using a CIDR. [#5095](https://github.com/ClickHouse/ClickHouse/pull/5095) ([Guillaume Tassery](https://github.com/YiuRULE)) * Add an X-ClickHouse-Summary header when we send a query using HTTP with enabled setting `send_progress_in_http_headers`. Return the usual information of X-ClickHouse-Progress, with additional information like how many rows and bytes were inserted in the query. [#5116](https://github.com/ClickHouse/ClickHouse/pull/5116) ([Guillaume Tassery](https://github.com/YiuRULE)) -### Improvements +#### Improvements * Added `max_parts_in_total` setting for MergeTree family of tables (default: 100 000) that prevents unsafe specification of partition key #5166. [#5171](https://github.com/ClickHouse/ClickHouse/pull/5171) ([alexey-milovidov](https://github.com/alexey-milovidov)) * `clickhouse-obfuscator`: derive seed for individual columns by combining initial seed with column name, not column position. This is intended to transform datasets with multiple related tables, so that tables will remain JOINable after transformation. [#5178](https://github.com/ClickHouse/ClickHouse/pull/5178) ([alexey-milovidov](https://github.com/alexey-milovidov))
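A minimal sketch of the `LIMIT ... BY` offset syntax referenced above; the table and columns are hypothetical:

```sql
-- For each domain, skip the single most-visited page and return the next two.
SELECT domain, page, hits
FROM page_stats
ORDER BY domain ASC, hits DESC
LIMIT 1, 2 BY domain;
```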
* Added functions `JSONExtractRaw`, `JSONExtractKeyAndValues`. Renamed functions `jsonExtract` to `JSONExtract`. When something goes wrong these functions return the corresponding values, not `NULL`. Modified function `JSONExtract`, now it gets the return type from its last parameter and doesn't inject nullables. Implemented fallback to RapidJSON in case AVX2 instructions are not available. Simdjson library updated to a new version. [#5235](https://github.com/ClickHouse/ClickHouse/pull/5235) ([Vitaly Baranov](https://github.com/vitlibar)) @@ -1168,7 +1174,7 @@ It allows to set commit mode: after every batch of messages is handled, or after * Respect query settings in asynchronous INSERTs into Distributed tables. [#4936](https://github.com/ClickHouse/ClickHouse/pull/4936) ([TCeason](https://github.com/TCeason)) * Renamed functions `leastSqr` to `simpleLinearRegression`, `LinearRegression` to `linearRegression`, `LogisticRegression` to `logisticRegression`. [#5391](https://github.com/ClickHouse/ClickHouse/pull/5391) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) -### Performance Improvements +#### Performance Improvements * Parallelize processing of parts of non-replicated MergeTree tables in ALTER MODIFY query. [#4639](https://github.com/ClickHouse/ClickHouse/pull/4639) ([Ivan Kush](https://github.com/IvanKush)) * Optimizations in regular expressions extraction. [#5193](https://github.com/ClickHouse/ClickHouse/pull/5193) [#5191](https://github.com/ClickHouse/ClickHouse/pull/5191) ([Danila Kutenin](https://github.com/danlark1)) * Do not add right join key column to join result if it's used only in join on section. [#5260](https://github.com/ClickHouse/ClickHouse/pull/5260) ([Artem Zuikov](https://github.com/4ertus2)) @@ -1178,7 +1184,7 @@ It allows to set commit mode: after every batch of messages is handled, or after * Upgrade our LZ4 implementation with reference one to have faster decompression. [#5070](https://github.com/ClickHouse/ClickHouse/pull/5070) ([Danila Kutenin](https://github.com/danlark1)) * Implemented MSD radix sort (based on kxsort), and partial sorting. [#5129](https://github.com/ClickHouse/ClickHouse/pull/5129) ([Evgenii Pravda](https://github.com/kvinty)) -### Bug Fixes +#### Bug Fixes * Fix push require columns with join [#5192](https://github.com/ClickHouse/ClickHouse/pull/5192) ([Winter Zhang](https://github.com/zhang2014)) * Fixed a bug where, when ClickHouse was run by systemd, the command `sudo service clickhouse-server forcerestart` did not work as expected. [#5204](https://github.com/ClickHouse/ClickHouse/pull/5204) ([proller](https://github.com/proller)) * Fix http error codes in DataPartsExchange (interserver http server on 9009 port always returned code 200, even on errors). [#5216](https://github.com/ClickHouse/ClickHouse/pull/5216) ([proller](https://github.com/proller)) @@ -1189,7 +1195,7 @@ It allows to set commit mode: after every batch of messages is handled, or after * Fix `retention` function. Now all conditions that are satisfied in a row of data are added to the data state. [#5119](https://github.com/ClickHouse/ClickHouse/pull/5119) ([小路](https://github.com/nicelulu)) * Fix result type for `quantileExact` with Decimals. [#5304](https://github.com/ClickHouse/ClickHouse/pull/5304) ([Artem Zuikov](https://github.com/4ertus2)) -### Documentation +#### Documentation * Translate documentation for `CollapsingMergeTree` to Chinese.
[#5168](https://github.com/ClickHouse/ClickHouse/pull/5168) ([张风啸](https://github.com/AlexZFX)) * Translate some documentation about table engines to Chinese. [#5134](https://github.com/ClickHouse/ClickHouse/pull/5134) @@ -1197,7 +1203,7 @@ ([never lee](https://github.com/neverlee)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Fix some sanitizer reports that show probable use-after-free. [#5139](https://github.com/ClickHouse/ClickHouse/pull/5139) [#5143](https://github.com/ClickHouse/ClickHouse/pull/5143) [#5393](https://github.com/ClickHouse/ClickHouse/pull/5393) ([Ivan](https://github.com/abyss7)) * Move performance tests out of separate directories for convenience. [#5158](https://github.com/ClickHouse/ClickHouse/pull/5158) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix incorrect performance tests. [#5255](https://github.com/ClickHouse/ClickHouse/pull/5255) ([alesapin](https://github.com/alesapin)) @@ -1206,12 +1212,19 @@ It allows to set commit mode: after every batch of messages is handled, or after * Add a small instruction on how to write performance tests. [#5408](https://github.com/ClickHouse/ClickHouse/pull/5408) ([alesapin](https://github.com/alesapin)) * Add ability to make substitutions in create, fill and drop query in performance tests [#5367](https://github.com/ClickHouse/ClickHouse/pull/5367) ([Olga Khvostikova](https://github.com/stavrolia)) -## ClickHouse release 19.7.5.27, 2019-06-09 +## ClickHouse release 19.7 -### New features +### ClickHouse release 19.7.5.29, 2019-07-05 + +#### Bug Fix +* Fix performance regression in some queries with JOIN. [#5192](https://github.com/ClickHouse/ClickHouse/pull/5192) ([Winter Zhang](https://github.com/zhang2014)) + +### ClickHouse release 19.7.5.27, 2019-06-09 + +#### New Features * Added bitmap-related functions `bitmapHasAny` and `bitmapHasAll` analogous to `hasAny` and `hasAll` functions for arrays. [#5279](https://github.com/ClickHouse/ClickHouse/pull/5279) ([Sergi Vladykin](https://github.com/svladykin)) -### Bug Fixes +#### Bug Fixes * Fix segfault on `minmax` INDEX with Null value. [#5246](https://github.com/ClickHouse/ClickHouse/pull/5246) ([Nikita Vasilev](https://github.com/nikvas0)) * Mark all input columns in LIMIT BY as required output. It fixes 'Not found column' error in some distributed queries. [#5407](https://github.com/ClickHouse/ClickHouse/pull/5407) ([Constantin S. Pan](https://github.com/kvap)) * Fix "Column '0' already exists" error in `SELECT .. PREWHERE` on column with DEFAULT [#5397](https://github.com/ClickHouse/ClickHouse/pull/5397) ([proller](https://github.com/proller)) @@ -1232,9 +1245,9 @@ It allows to set commit mode: after every batch of messages is handled, or after did not process it, but already get list of children, will terminate the DDLWorker thread. [#5489](https://github.com/ClickHouse/ClickHouse/pull/5489) ([Azat Khuzhin](https://github.com/azat)) * Fix INSERT into Distributed() table with MATERIALIZED column. [#5429](https://github.com/ClickHouse/ClickHouse/pull/5429) ([Azat Khuzhin](https://github.com/azat)) -## ClickHouse release 19.7.3.9, 2019-05-30 +### ClickHouse release 19.7.3.9, 2019-05-30 -### New Features +#### New Features * Allow to limit the range of a setting that can be specified by user. These constraints can be set up in user settings profile.
[#4931](https://github.com/ClickHouse/ClickHouse/pull/4931) ([Vitaly @@ -1250,7 +1263,7 @@ Tassery](https://github.com/YiuRULE)) [#5081](https://github.com/ClickHouse/ClickHouse/pull/5081) ([Alexander](https://github.com/Akazz)) -### Bug Fixes +#### Bug Fixes * Crash with uncompressed_cache + JOIN during merge (#5197) [#5133](https://github.com/ClickHouse/ClickHouse/pull/5133) ([Danila Kutenin](https://github.com/danlark1)) @@ -1262,14 +1275,14 @@ Kutenin](https://github.com/danlark1)) ([Ivan](https://github.com/abyss7)) * Fixed very rare data race condition that could happen when executing a query with UNION ALL involving at least two SELECTs from system.columns, system.tables, system.parts, system.parts_tables or tables of Merge family and performing ALTER of columns of the related tables concurrently. [#5189](https://github.com/ClickHouse/ClickHouse/pull/5189) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Performance Improvements +#### Performance Improvements * Use radix sort for sorting by single numeric column in `ORDER BY` without `LIMIT`. [#5106](https://github.com/ClickHouse/ClickHouse/pull/5106), [#4439](https://github.com/ClickHouse/ClickHouse/pull/4439) ([Evgenii Pravda](https://github.com/kvinty), [alexey-milovidov](https://github.com/alexey-milovidov)) -### Documentation +#### Documentation * Translate documentation for some table engines to Chinese. [#5107](https://github.com/ClickHouse/ClickHouse/pull/5107), [#5094](https://github.com/ClickHouse/ClickHouse/pull/5094), @@ -1278,7 +1291,7 @@ Kutenin](https://github.com/danlark1)) [#5068](https://github.com/ClickHouse/ClickHouse/pull/5068) ([never lee](https://github.com/neverlee)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Print UTF-8 characters properly in `clickhouse-test`. [#5084](https://github.com/ClickHouse/ClickHouse/pull/5084) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -1294,9 +1307,10 @@ lee](https://github.com/neverlee)) [#5110](https://github.com/ClickHouse/ClickHouse/pull/5110) ([proller](https://github.com/proller)) -## ClickHouse release 19.6.3.18, 2019-06-13 +## ClickHouse release 19.6 +### ClickHouse release 19.6.3.18, 2019-06-13 -### Bug Fixes +#### Bug Fixes * Fixed IN condition pushdown for queries from table functions `mysql` and `odbc` and corresponding table engines. This fixes #3540 and #2384. [#5313](https://github.com/ClickHouse/ClickHouse/pull/5313) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix deadlock in Zookeeper. [#5297](https://github.com/ClickHouse/ClickHouse/pull/5297) ([github1youlc](https://github.com/github1youlc)) * Allow quoted decimals in CSV. [#5284](https://github.com/ClickHouse/ClickHouse/pull/5284) ([Artem Zuikov](https://github.com/4ertus2) @@ -1304,18 +1318,18 @@ lee](https://github.com/neverlee)) * Fix data race in rename query. [#5247](https://github.com/ClickHouse/ClickHouse/pull/5247) ([Winter Zhang](https://github.com/zhang2014)) * Temporarily disable LFAlloc. Usage of LFAlloc might lead to a lot of MAP_FAILED in allocating UncompressedCache and, as a result, to crashes of queries on highly loaded servers. [cfdba93](https://github.com/ClickHouse/ClickHouse/commit/cfdba938ce22f16efeec504f7f90206a515b1280) ([Danila Kutenin](https://github.com/danlark1)) -## ClickHouse release 19.6.2.11, 2019-05-13 +### ClickHouse release 19.6.2.11, 2019-05-13 -### New Features +#### New Features * TTL expressions for columns and tables. [#4212](https://github.com/ClickHouse/ClickHouse/pull/4212) ([Anton Popov](https://github.com/CurtizJ))
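A minimal sketch of the new TTL expressions from the entry above; the table and column names are hypothetical:

```sql
-- Rows of `events` expire a month after `created_at`; the heavyweight
-- `raw_payload` column alone is dropped after one week.
CREATE TABLE events
(
    created_at DateTime,
    message String,
    raw_payload String TTL created_at + INTERVAL 1 WEEK
)
ENGINE = MergeTree()
ORDER BY created_at
TTL created_at + INTERVAL 1 MONTH;
```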
* Added support for `brotli` compression for HTTP responses (Accept-Encoding: br) [#4388](https://github.com/ClickHouse/ClickHouse/pull/4388) ([Mikhail](https://github.com/fandyushin)) * Added new function `isValidUTF8` for checking whether a set of bytes is correctly UTF-8 encoded. [#4934](https://github.com/ClickHouse/ClickHouse/pull/4934) ([Danila Kutenin](https://github.com/danlark1)) * Add new load balancing policy `first_or_random` which sends queries to the first specified host and, if it's inaccessible, sends queries to random hosts of the shard. Useful for cross-replication topology setups. [#5012](https://github.com/ClickHouse/ClickHouse/pull/5012) ([nvartolomei](https://github.com/nvartolomei)) -### Experimental Features +#### Experimental Features * Add setting `index_granularity_bytes` (adaptive index granularity) for MergeTree* tables family. [#4826](https://github.com/ClickHouse/ClickHouse/pull/4826) ([alesapin](https://github.com/alesapin)) -### Improvements +#### Improvements * Added support for non-constant and negative size and length arguments for function `substringUTF8`. [#4989](https://github.com/ClickHouse/ClickHouse/pull/4989) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Disable push-down to right table in left join, left table in right join, and both tables in full join. This fixes wrong JOIN results in some cases. [#4846](https://github.com/ClickHouse/ClickHouse/pull/4846) ([Ivan](https://github.com/abyss7)) * `clickhouse-copier`: auto upload task configuration from `--task-file` option [#4876](https://github.com/ClickHouse/ClickHouse/pull/4876) ([proller](https://github.com/proller)) @@ -1323,13 +1337,13 @@ lee](https://github.com/neverlee)) * Support asterisks and qualified asterisks for multiple joins without subqueries [#4898](https://github.com/ClickHouse/ClickHouse/pull/4898) ([Artem Zuikov](https://github.com/4ertus2)) * Make missing column error message more user-friendly. [#4915](https://github.com/ClickHouse/ClickHouse/pull/4915) ([Artem Zuikov](https://github.com/4ertus2)) -### Performance Improvements +#### Performance Improvements * Significant speedup of ASOF JOIN [#4924](https://github.com/ClickHouse/ClickHouse/pull/4924) ([Martijn Bakker](https://github.com/Gladdy)) -### Backward Incompatible Changes +#### Backward Incompatible Changes * HTTP header `Query-Id` was renamed to `X-ClickHouse-Query-Id` for consistency. [#4972](https://github.com/ClickHouse/ClickHouse/pull/4972) ([Mikhail](https://github.com/fandyushin)) -### Bug Fixes +#### Bug Fixes * Fixed potential null pointer dereference in `clickhouse-copier`. [#4900](https://github.com/ClickHouse/ClickHouse/pull/4900) ([proller](https://github.com/proller)) * Fixed error on query with JOIN + ARRAY JOIN [#4938](https://github.com/ClickHouse/ClickHouse/pull/4938) ([Artem Zuikov](https://github.com/4ertus2)) * Fixed hanging on start of the server when a dictionary depends on another dictionary via a database with engine=Dictionary.
[#4962](https://github.com/ClickHouse/ClickHouse/pull/4962) ([Vitaly Baranov](https://github.com/vitlibar)) @@ -1337,7 +1351,7 @@ lee](https://github.com/neverlee)) * Fix potentially wrong result for `SELECT DISTINCT` with `JOIN` [#5001](https://github.com/ClickHouse/ClickHouse/pull/5001) ([Artem Zuikov](https://github.com/4ertus2)) * Fixed very rare data race condition that could happen when executing a query with UNION ALL involving at least two SELECTs from system.columns, system.tables, system.parts, system.parts_tables or tables of Merge family and performing ALTER of columns of the related tables concurrently. [#5189](https://github.com/ClickHouse/ClickHouse/pull/5189) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Fixed test failures when running clickhouse-server on different host [#4713](https://github.com/ClickHouse/ClickHouse/pull/4713) ([Vasily Nemkov](https://github.com/Enmk)) * clickhouse-test: Disable color control sequences in a non-tty environment. [#4937](https://github.com/ClickHouse/ClickHouse/pull/4937) ([alesapin](https://github.com/alesapin)) * clickhouse-test: Allow using any test database (remove `test.` qualification where possible) [#5008](https://github.com/ClickHouse/ClickHouse/pull/5008) ([proller](https://github.com/proller)) @@ -1346,24 +1360,25 @@ lee](https://github.com/neverlee)) * Python util to help with backports and changelogs. [#4949](https://github.com/ClickHouse/ClickHouse/pull/4949) ([Ivan](https://github.com/abyss7)) -## ClickHouse release 19.5.4.22, 2019-05-13 +## ClickHouse release 19.5 +### ClickHouse release 19.5.4.22, 2019-05-13 -### Bug fixes +#### Bug Fixes * Fixed possible crash in bitmap* functions [#5220](https://github.com/ClickHouse/ClickHouse/pull/5220) [#5228](https://github.com/ClickHouse/ClickHouse/pull/5228) ([Andy Yang](https://github.com/andyyzh)) * Fixed very rare data race condition that could happen when executing a query with UNION ALL involving at least two SELECTs from system.columns, system.tables, system.parts, system.parts_tables or tables of Merge family and performing ALTER of columns of the related tables concurrently. [#5189](https://github.com/ClickHouse/ClickHouse/pull/5189) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed error `Set for IN is not created yet` in case of using a single LowCardinality column in the left part of IN. This error happened if LowCardinality column was the part of primary key. #5031 [#5154](https://github.com/ClickHouse/ClickHouse/pull/5154) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) * Modification of retention function: If a row satisfies both the first and NTH condition, only the first satisfied condition is added to the data state. Now all conditions that are satisfied in a row of data are added to the data state. [#5119](https://github.com/ClickHouse/ClickHouse/pull/5119) ([小路](https://github.com/nicelulu)) -## ClickHouse release 19.5.3.8, 2019-04-18 +### ClickHouse release 19.5.3.8, 2019-04-18 -### Bug fixes +#### Bug Fixes * Fixed type of setting `max_partitions_per_insert_block` from boolean to UInt64. [#5028](https://github.com/ClickHouse/ClickHouse/pull/5028) ([Mohammad Hossein Sekhavat](https://github.com/mhsekhavat))
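Since the fix above turns `max_partitions_per_insert_block` into an integer threshold, its usage looks like this (100 is the default value named for this setting in the 19.5.2.6 notes below):

```sql
-- Raise an exception when a single inserted block would touch
-- more than 100 partitions; 0 disables the limit (not recommended).
SET max_partitions_per_insert_block = 100;
```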
-## ClickHouse release 19.5.2.6, 2019-04-15 +### ClickHouse release 19.5.2.6, 2019-04-15 -### New Features +#### New Features * [Hyperscan](https://github.com/intel/hyperscan) multiple regular expression matching was added (functions `multiMatchAny`, `multiMatchAnyIndex`, `multiFuzzyMatchAny`, `multiFuzzyMatchAnyIndex`). [#4780](https://github.com/ClickHouse/ClickHouse/pull/4780), [#4841](https://github.com/ClickHouse/ClickHouse/pull/4841) ([Danila Kutenin](https://github.com/danlark1)) * `multiSearchFirstPosition` function was added. [#4780](https://github.com/ClickHouse/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1)) @@ -1372,7 +1387,7 @@ lee](https://github.com/neverlee)) * Added `ASOF JOIN` which allows to run queries that join to the most recent value known (see the sketch below). [#4774](https://github.com/ClickHouse/ClickHouse/pull/4774) [#4867](https://github.com/ClickHouse/ClickHouse/pull/4867) [#4863](https://github.com/ClickHouse/ClickHouse/pull/4863) [#4875](https://github.com/ClickHouse/ClickHouse/pull/4875) ([Martijn Bakker](https://github.com/Gladdy), [Artem Zuikov](https://github.com/4ertus2)) * Rewrite multiple `COMMA JOIN` to `CROSS JOIN`. Then rewrite them to `INNER JOIN` if possible. [#4661](https://github.com/ClickHouse/ClickHouse/pull/4661) ([Artem Zuikov](https://github.com/4ertus2)) -### Improvement +#### Improvement * `topK` and `topKWeighted` now support custom `loadFactor` (fixes issue [#4252](https://github.com/ClickHouse/ClickHouse/issues/4252)). [#4634](https://github.com/ClickHouse/ClickHouse/pull/4634) ([Kirill Danshin](https://github.com/kirillDanshin)) * Allow to use `parallel_replicas_count > 1` even for tables without sampling (the setting is simply ignored for them). In previous versions it led to an exception. [#4637](https://github.com/ClickHouse/ClickHouse/pull/4637) ([Alexey Elymanov](https://github.com/digitalist)) @@ -1389,7 +1404,7 @@ lee](https://github.com/neverlee)) * Improved data skipping indices calculation. [#4640](https://github.com/ClickHouse/ClickHouse/pull/4640) ([Nikita Vasilev](https://github.com/nikvas0)) * Keep ordinary, `DEFAULT`, `MATERIALIZED` and `ALIAS` columns in a single list (fixes issue [#2867](https://github.com/ClickHouse/ClickHouse/issues/2867)). [#4707](https://github.com/ClickHouse/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn)) -### Bug Fix +#### Bug Fix * Avoid `std::terminate` in case of memory allocation failure. Now `std::bad_alloc` exception is thrown as expected. [#4665](https://github.com/ClickHouse/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixes capnproto reading from buffer. Sometimes files weren't loaded successfully over HTTP. [#4674](https://github.com/ClickHouse/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs)) @@ -1429,19 +1444,19 @@ lee](https://github.com/neverlee)) * Fix function `toISOWeek` result for year 1970. [#4988](https://github.com/ClickHouse/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix `DROP`, `TRUNCATE` and `OPTIMIZE` queries duplication, when executed on `ON CLUSTER` for `ReplicatedMergeTree*` tables family. [#4991](https://github.com/ClickHouse/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin))
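A minimal sketch of the `ASOF JOIN` feature referenced above; `trades` and `quotes` are hypothetical tables, and in this early form the last `USING` column is the inexact-match key:

```sql
-- Each trade is matched with the most recent quote whose `ts`
-- is less than or equal to the trade's `ts`.
SELECT symbol, price, bid
FROM trades
ASOF JOIN quotes USING (symbol, ts);
```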
-### Backward Incompatible Change +#### Backward Incompatible Change * Rename setting `insert_sample_with_metadata` to setting `input_format_defaults_for_omitted_fields`. [#4771](https://github.com/ClickHouse/ClickHouse/pull/4771) ([Artem Zuikov](https://github.com/4ertus2)) * Added setting `max_partitions_per_insert_block` (with value 100 by default). If an inserted block contains a larger number of partitions, an exception is thrown. Set it to 0 if you want to remove the limit (not recommended). [#4845](https://github.com/ClickHouse/ClickHouse/pull/4845) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Multi-search functions were renamed (`multiPosition` to `multiSearchAllPositions`, `multiSearch` to `multiSearchAny`, `firstMatch` to `multiSearchFirstIndex`). [#4780](https://github.com/ClickHouse/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1)) -### Performance Improvement +#### Performance Improvement * Optimize Volnitsky searcher by inlining, giving about 5-10% search improvement for queries with many needles or many similar bigrams. [#4862](https://github.com/ClickHouse/ClickHouse/pull/4862) ([Danila Kutenin](https://github.com/danlark1)) * Fix performance issue when setting `use_uncompressed_cache` is greater than zero, which appeared when all read data was contained in the cache. [#4913](https://github.com/ClickHouse/ClickHouse/pull/4913) ([alesapin](https://github.com/alesapin)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Hardening debug build: more granular memory mappings and ASLR; add memory protection for mark cache and index. This allows to find more memory stomping bugs in case when ASan and MSan cannot do it. [#4632](https://github.com/ClickHouse/ClickHouse/pull/4632) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Add support for cmake variables `ENABLE_PROTOBUF`, `ENABLE_PARQUET` and `ENABLE_BROTLI` which allows to enable/disable the above features (same as we can do for librdkafka, mysql, etc). [#4669](https://github.com/ClickHouse/ClickHouse/pull/4669) ([Silviu Caragea](https://github.com/silviucpp)) @@ -1456,9 +1471,10 @@ lee](https://github.com/neverlee)) * Disable usage of `mremap` when compiled with Thread Sanitizer. Surprisingly enough, TSan does not intercept `mremap` (though it does intercept `mmap`, `munmap`) that leads to false positives. Fixed TSan report in stateful tests. [#4859](https://github.com/ClickHouse/ClickHouse/pull/4859) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Add test checking using format schema via HTTP interface. [#4864](https://github.com/ClickHouse/ClickHouse/pull/4864) ([Vitaly Baranov](https://github.com/vitlibar)) -## ClickHouse release 19.4.4.33, 2019-04-17 +## ClickHouse release 19.4 +### ClickHouse release 19.4.4.33, 2019-04-17 -### Bug Fixes +#### Bug Fixes * Avoid `std::terminate` in case of memory allocation failure. Now `std::bad_alloc` exception is thrown as expected. [#4665](https://github.com/ClickHouse/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixes capnproto reading from buffer. Sometimes files weren't loaded successfully over HTTP. [#4674](https://github.com/ClickHouse/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs)) @@ -1493,34 +1509,34 @@ lee](https://github.com/neverlee)) * Fix function `toISOWeek` result for year 1970.
[#4988](https://github.com/ClickHouse/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix `DROP`, `TRUNCATE` and `OPTIMIZE` queries duplication, when executed on `ON CLUSTER` for `ReplicatedMergeTree*` tables family. [#4991](https://github.com/ClickHouse/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin)) -### Improvements +#### Improvements * Keep ordinary, `DEFAULT`, `MATERIALIZED` and `ALIAS` columns in a single list (fixes issue [#2867](https://github.com/ClickHouse/ClickHouse/issues/2867)). [#4707](https://github.com/ClickHouse/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn)) -## ClickHouse release 19.4.3.11, 2019-04-02 +### ClickHouse release 19.4.3.11, 2019-04-02 -### Bug Fixes +#### Bug Fixes * Fix crash in `FULL/RIGHT JOIN` when joining on nullable vs not nullable columns. [#4855](https://github.com/ClickHouse/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2)) * Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/ClickHouse/ClickHouse/pull/4835) ([proller](https://github.com/proller)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Add a way to launch clickhouse-server image from a custom user. [#4753](https://github.com/ClickHouse/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid)) -## ClickHouse release 19.4.2.7, 2019-03-30 +### ClickHouse release 19.4.2.7, 2019-03-30 -### Bug Fixes +#### Bug Fixes * Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/ClickHouse/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) -## ClickHouse release 19.4.1.3, 2019-03-19 +### ClickHouse release 19.4.1.3, 2019-03-19 -### Bug Fixes +#### Bug Fixes * Fixed remote queries which contain both `LIMIT BY` and `LIMIT`. Previously, if `LIMIT BY` and `LIMIT` were used for remote query, `LIMIT` could happen before `LIMIT BY`, which led to an overly filtered result. [#4708](https://github.com/ClickHouse/ClickHouse/pull/4708) ([Constantin S. Pan](https://github.com/kvap)) -## ClickHouse release 19.4.0.49, 2019-03-09 +### ClickHouse release 19.4.0.49, 2019-03-09 -### New Features +#### New Features * Added full support for `Protobuf` format (input and output, nested data structures). [#4174](https://github.com/ClickHouse/ClickHouse/pull/4174) [#4493](https://github.com/ClickHouse/ClickHouse/pull/4493) ([Vitaly Baranov](https://github.com/vitlibar)) * Added bitmap functions with Roaring Bitmaps (see the sketch below). [#4207](https://github.com/ClickHouse/ClickHouse/pull/4207) ([Andy Yang](https://github.com/andyyzh)) [#4568](https://github.com/ClickHouse/ClickHouse/pull/4568) ([Vitaly Baranov](https://github.com/vitlibar)) * Parquet format support. [#4448](https://github.com/ClickHouse/ClickHouse/pull/4448) ([proller](https://github.com/proller)) @@ -1531,7 +1547,7 @@ lee](https://github.com/neverlee)) * Added functions `arrayEnumerateDenseRanked` and `arrayEnumerateUniqRanked` (it's like `arrayEnumerateUniq` but allows to fine-tune array depth to look inside multidimensional arrays). [#4475](https://github.com/ClickHouse/ClickHouse/pull/4475) ([proller](https://github.com/proller)) [#4601](https://github.com/ClickHouse/ClickHouse/pull/4601) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Multiple JOINs with some restrictions: no asterisks, no complex aliases in ON/WHERE/GROUP BY/... [#4462](https://github.com/ClickHouse/ClickHouse/pull/4462) ([Artem Zuikov](https://github.com/4ertus2))
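A minimal sketch of the Roaring Bitmap functions referenced above; the arrays are arbitrary illustrations:

```sql
-- Build two bitmaps, intersect them, and convert the result back to an array.
SELECT bitmapToArray(bitmapAnd(bitmapBuild([1, 2, 3]), bitmapBuild([3, 4, 5]))) AS intersection;
```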
-### Bug Fixes +#### Bug Fixes * This release also contains all bug fixes from 19.3 and 19.1. * Fixed bug in data skipping indices: order of granules after INSERT was incorrect. [#4407](https://github.com/ClickHouse/ClickHouse/pull/4407) ([Nikita Vasilev](https://github.com/nikvas0)) * Fixed `set` index for `Nullable` and `LowCardinality` columns. Before this fix, a `set` index with a `Nullable` or `LowCardinality` column led to the error `Data type must be deserialized with multiple streams` while selecting. [#4594](https://github.com/ClickHouse/ClickHouse/pull/4594) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) @@ -1553,19 +1569,19 @@ lee](https://github.com/neverlee)) * Fix lambda function with predicate optimizer. [#4408](https://github.com/ClickHouse/ClickHouse/pull/4408) ([Winter Zhang](https://github.com/zhang2014)) * Multiple fixes for multiple JOINs. [#4595](https://github.com/ClickHouse/ClickHouse/pull/4595) ([Artem Zuikov](https://github.com/4ertus2)) -### Improvements +#### Improvements * Support aliases in JOIN ON section for right table columns. [#4412](https://github.com/ClickHouse/ClickHouse/pull/4412) ([Artem Zuikov](https://github.com/4ertus2)) * Results of multiple JOINs need correct result names to be used in subselects. Replace flat aliases with source names in result. [#4474](https://github.com/ClickHouse/ClickHouse/pull/4474) ([Artem Zuikov](https://github.com/4ertus2)) * Improve push-down logic for joined statements. [#4387](https://github.com/ClickHouse/ClickHouse/pull/4387) ([Ivan](https://github.com/abyss7)) -### Performance Improvements +#### Performance Improvements * Improved heuristics of "move to PREWHERE" optimization. [#4405](https://github.com/ClickHouse/ClickHouse/pull/4405) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Use proper lookup tables that use HashTable's API for 8-bit and 16-bit keys. [#4536](https://github.com/ClickHouse/ClickHouse/pull/4536) ([Amos Bird](https://github.com/amosbird)) * Improved performance of string comparison. [#4564](https://github.com/ClickHouse/ClickHouse/pull/4564) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Cleanup distributed DDL queue in a separate thread so that it doesn't slow down the main loop that processes distributed DDL tasks. [#4502](https://github.com/ClickHouse/ClickHouse/pull/4502) ([Alex Zatelepin](https://github.com/ztlpn)) * When `min_bytes_to_use_direct_io` is set to 1, not every file was opened with O_DIRECT mode because the data size to read was sometimes underestimated by the size of one compressed block. [#4526](https://github.com/ClickHouse/ClickHouse/pull/4526) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Added support for clang-9 [#4604](https://github.com/ClickHouse/ClickHouse/pull/4604) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fix wrong `__asm__` instructions (again) [#4621](https://github.com/ClickHouse/ClickHouse/pull/4621) ([Konstantin Podshumok](https://github.com/podshumok)) * Add ability to specify settings for `clickhouse-performance-test` from command line. [#4437](https://github.com/ClickHouse/ClickHouse/pull/4437) ([alesapin](https://github.com/alesapin)) @@ -1577,29 +1593,30 @@ lee](https://github.com/neverlee)) * Fix compilation on Mac.
[#4371](https://github.com/ClickHouse/ClickHouse/pull/4371) ([Vitaly Baranov](https://github.com/vitlibar)) * Build fixes for FreeBSD and various unusual build configurations. [#4444](https://github.com/ClickHouse/ClickHouse/pull/4444) ([proller](https://github.com/proller)) -## ClickHouse release 19.3.9.1, 2019-04-02 +## ClickHouse release 19.3 +### ClickHouse release 19.3.9.1, 2019-04-02 -### Bug Fixes +#### Bug Fixes * Fix crash in `FULL/RIGHT JOIN` when joining on nullable vs not nullable columns. [#4855](https://github.com/ClickHouse/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2)) * Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/ClickHouse/ClickHouse/pull/4835) ([proller](https://github.com/proller)) * Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/ClickHouse/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) -### Build/Testing/Packaging Improvement +#### Build/Testing/Packaging Improvement * Add a way to launch clickhouse-server image from a custom user [#4753](https://github.com/ClickHouse/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid)) -## ClickHouse release 19.3.7, 2019-03-12 +### ClickHouse release 19.3.7, 2019-03-12 -### Bug fixes +#### Bug Fixes * Fixed error in #3920. This error manifests itself as random cache corruption (messages `Unknown codec family code`, `Cannot seek through file`) and segfaults. This bug first appeared in version 19.1 and is present in versions up to 19.1.10 and 19.3.6. [#4623](https://github.com/ClickHouse/ClickHouse/pull/4623) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.3.6, 2019-03-02 +### ClickHouse release 19.3.6, 2019-03-02 -### Bug fixes +#### Bug Fixes * When there are more than 1000 threads in a thread pool, `std::terminate` may happen on thread exit. [Azat Khuzhin](https://github.com/azat) [#4485](https://github.com/ClickHouse/ClickHouse/pull/4485) [#4505](https://github.com/ClickHouse/ClickHouse/pull/4505) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Now it's possible to create `ReplicatedMergeTree*` tables with comments on columns without defaults and tables with columns codecs without comments and defaults. Also fix comparison of codecs. [#4523](https://github.com/ClickHouse/ClickHouse/pull/4523) ([alesapin](https://github.com/alesapin)) @@ -1608,7 +1625,7 @@ lee](https://github.com/neverlee)) * Fixed hangup on server shutdown if distributed DDLs were used. [#4472](https://github.com/ClickHouse/ClickHouse/pull/4472) ([Alex Zatelepin](https://github.com/ztlpn)) * Incorrect column numbers were printed in the error message about text format parsing for columns with numbers greater than 10. [#4484](https://github.com/ClickHouse/ClickHouse/pull/4484) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Fixed build with AVX enabled. [#4527](https://github.com/ClickHouse/ClickHouse/pull/4527) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Enable extended accounting and IO accounting based on a known-good kernel version instead of the kernel under which it is compiled. [#4541](https://github.com/ClickHouse/ClickHouse/pull/4541) ([nvartolomei](https://github.com/nvartolomei)) @@ -1616,33 +1633,33 @@ lee](https://github.com/neverlee)) * Removed the `inline` tags of `void readBinary(...)` in `Field.cpp`.
Also merged redundant `namespace DB` blocks. [#4530](https://github.com/ClickHouse/ClickHouse/pull/4530) ([hcz](https://github.com/hczhcz)) -## ClickHouse release 19.3.5, 2019-02-21 +### ClickHouse release 19.3.5, 2019-02-21 -### Bug fixes +#### Bug Fixes * Fixed a bug with processing of large HTTP insert queries. [#4454](https://github.com/ClickHouse/ClickHouse/pull/4454) ([alesapin](https://github.com/alesapin)) * Fixed backward incompatibility with old versions due to wrong implementation of `send_logs_level` setting. [#4445](https://github.com/ClickHouse/ClickHouse/pull/4445) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed backward incompatibility of table function `remote` introduced with column comments. [#4446](https://github.com/ClickHouse/ClickHouse/pull/4446) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.3.4, 2019-02-16 +### ClickHouse release 19.3.4, 2019-02-16 -### Improvements +#### Improvements * Table index size is not accounted for memory limits when doing `ATTACH TABLE` query. Avoided the possibility that a table cannot be attached after being detached. [#4396](https://github.com/ClickHouse/ClickHouse/pull/4396) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Slightly raised the limit on max string and array size received from ZooKeeper. It allows to continue to work with increased size of `CLIENT_JVMFLAGS=-Djute.maxbuffer=...` on ZooKeeper. [#4398](https://github.com/ClickHouse/ClickHouse/pull/4398) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Allow to repair an abandoned replica even if it already has a huge number of nodes in its queue. [#4399](https://github.com/ClickHouse/ClickHouse/pull/4399) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Add one required argument to `SET` index (max stored rows number). [#4386](https://github.com/ClickHouse/ClickHouse/pull/4386) ([Nikita Vasilev](https://github.com/nikvas0)) -### Bug Fixes +#### Bug Fixes * Fixed `WITH ROLLUP` result for group by single `LowCardinality` key. [#4384](https://github.com/ClickHouse/ClickHouse/pull/4384) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) * Fixed bug in the set index (dropping a granule if it contains more than `max_rows` rows). [#4386](https://github.com/ClickHouse/ClickHouse/pull/4386) ([Nikita Vasilev](https://github.com/nikvas0)) * A lot of FreeBSD build fixes. [#4397](https://github.com/ClickHouse/ClickHouse/pull/4397) ([proller](https://github.com/proller)) * Fixed aliases substitution in queries with subquery containing same alias (issue [#4110](https://github.com/ClickHouse/ClickHouse/issues/4110)). [#4351](https://github.com/ClickHouse/ClickHouse/pull/4351) ([Artem Zuikov](https://github.com/4ertus2)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Add ability to run `clickhouse-server` for stateless tests in docker image. [#4347](https://github.com/ClickHouse/ClickHouse/pull/4347) ([Vasily Nemkov](https://github.com/Enmk)) -## ClickHouse release 19.3.3, 2019-02-13 +### ClickHouse release 19.3.3, 2019-02-13 -### New Features +#### New Features * Added the `KILL MUTATION` statement that allows removing mutations that are stuck for some reason (see the example below). Added `latest_failed_part`, `latest_fail_time`, `latest_fail_reason` fields to the `system.mutations` table for easier troubleshooting. [#4287](https://github.com/ClickHouse/ClickHouse/pull/4287) ([Alex Zatelepin](https://github.com/ztlpn)) * Added aggregate function `entropy` which computes Shannon entropy. [#4238](https://github.com/ClickHouse/ClickHouse/pull/4238) ([Quid37](https://github.com/Quid37))
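A minimal sketch of the `KILL MUTATION` statement referenced above, together with the new troubleshooting fields; the database, table, and mutation id are placeholders:

```sql
-- Find mutations that are stuck, then remove one of them.
SELECT mutation_id, latest_fail_reason
FROM system.mutations
WHERE database = 'default' AND table = 'hits' AND is_done = 0;

KILL MUTATION WHERE database = 'default' AND table = 'hits' AND mutation_id = 'mutation_42.txt';
```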
[#4238](https://github.com/ClickHouse/ClickHouse/pull/4238) ([Quid37](https://github.com/Quid37)) * Added the ability to send queries `INSERT INTO tbl VALUES (....` to the server without splitting them into `query` and `data` parts. [#4301](https://github.com/ClickHouse/ClickHouse/pull/4301) ([alesapin](https://github.com/alesapin)) @@ -1662,11 +1679,11 @@ lee](https://github.com/neverlee)) * Added hints when a user makes a typo in a function name or a type in the command-line client. [#4239](https://github.com/ClickHouse/ClickHouse/pull/4239) ([Danila Kutenin](https://github.com/danlark1)) * Added `Query-Id` to the server's HTTP response headers. [#4231](https://github.com/ClickHouse/ClickHouse/pull/4231) ([Mikhail](https://github.com/fandyushin)) -### Experimental features +#### Experimental features * Added `minmax` and `set` data skipping indices for the MergeTree family of table engines. [#4143](https://github.com/ClickHouse/ClickHouse/pull/4143) ([Nikita Vasilev](https://github.com/nikvas0)) * Added conversion of `CROSS JOIN` to `INNER JOIN` if possible. [#4221](https://github.com/ClickHouse/ClickHouse/pull/4221) [#4266](https://github.com/ClickHouse/ClickHouse/pull/4266) ([Artem Zuikov](https://github.com/4ertus2)) -### Bug Fixes +#### Bug Fixes * Fixed `Not found column` for duplicate columns in the `JOIN ON` section. [#4279](https://github.com/ClickHouse/ClickHouse/pull/4279) ([Artem Zuikov](https://github.com/4ertus2)) * Made the `START REPLICATED SENDS` command start replicated sends. [#4229](https://github.com/ClickHouse/ClickHouse/pull/4229) ([nvartolomei](https://github.com/nvartolomei)) * Fixed execution of aggregate functions with `Array(LowCardinality)` arguments. [#4055](https://github.com/ClickHouse/ClickHouse/pull/4055) ([KochetovNicolai](https://github.com/KochetovNicolai)) @@ -1696,7 +1713,7 @@ lee](https://github.com/neverlee)) * Fixed installation of the package with a missing /etc/clickhouse-server/config.xml. [#4343](https://github.com/ClickHouse/ClickHouse/pull/4343) ([proller](https://github.com/proller)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Debian package: correct the /etc/clickhouse-server/preprocessed link according to the config. [#4205](https://github.com/ClickHouse/ClickHouse/pull/4205) ([proller](https://github.com/proller)) * Various build fixes for FreeBSD. [#4225](https://github.com/ClickHouse/ClickHouse/pull/4225) ([proller](https://github.com/proller)) * Added the ability to create, fill, and drop tables in perftest. [#4220](https://github.com/ClickHouse/ClickHouse/pull/4220) ([alesapin](https://github.com/alesapin)) @@ -1716,17 +1733,17 @@ lee](https://github.com/neverlee)) * Added a check for SSE and AVX instructions at startup. [#4234](https://github.com/ClickHouse/ClickHouse/pull/4234) ([Igr](https://github.com/igron99)) * The init script now waits for the server to start. [#4281](https://github.com/ClickHouse/ClickHouse/pull/4281) ([proller](https://github.com/proller)) -### Backward Incompatible Changes +#### Backward Incompatible Changes * Removed the `allow_experimental_low_cardinality_type` setting. `LowCardinality` data types are production-ready. [#4323](https://github.com/ClickHouse/ClickHouse/pull/4323) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Reduced the mark cache size and uncompressed cache size according to the available memory amount. [#4240](https://github.com/ClickHouse/ClickHouse/pull/4240) ([Lopatin Konstantin](https://github.com/k-lopatin)) * Added the keyword `INDEX` in the `CREATE TABLE` query. 
A column with the name `index` must be quoted with backticks or double quotes: `` `index` ``. [#4143](https://github.com/ClickHouse/ClickHouse/pull/4143) ([Nikita Vasilev](https://github.com/nikvas0)) * `sumMap` now promotes the result type instead of overflowing. The old `sumMap` behavior can be obtained by using the `sumMapWithOverflow` function. [#4151](https://github.com/ClickHouse/ClickHouse/pull/4151) ([Léo Ercolanelli](https://github.com/ercolanelli-leo)) -### Performance Improvements +#### Performance Improvements * `std::sort` replaced by `pdqsort` for queries without `LIMIT`. [#4236](https://github.com/ClickHouse/ClickHouse/pull/4236) ([Evgenii Pravda](https://github.com/kvinty)) * The server now reuses threads from a global thread pool. This affects performance in some corner cases. [#4150](https://github.com/ClickHouse/ClickHouse/pull/4150) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Improvements +#### Improvements * Implemented AIO support for FreeBSD. [#4305](https://github.com/ClickHouse/ClickHouse/pull/4305) ([urgordeadbeef](https://github.com/urgordeadbeef)) * `SELECT * FROM a JOIN b USING a, b` now returns the `a` and `b` columns only from the left table. [#4141](https://github.com/ClickHouse/ClickHouse/pull/4141) ([Artem Zuikov](https://github.com/4ertus2)) * Allow the `-C` option of the client to work like the `-c` option. [#4232](https://github.com/ClickHouse/ClickHouse/pull/4232) ([syominsergey](https://github.com/syominsergey)) @@ -1742,34 +1759,37 @@ lee](https://github.com/neverlee)) * Added info about the `replicated_can_become_leader` setting to `system.replicas` and added logging if the replica won't try to become the leader. [#4379](https://github.com/ClickHouse/ClickHouse/pull/4379) ([Alex Zatelepin](https://github.com/ztlpn)) -## ClickHouse release 19.1.14, 2019-03-14 +## ClickHouse release 19.1 +### ClickHouse release 19.1.14, 2019-03-14 * Fixed the error `Column ... queried more than once` that may happen if the setting `asterisk_left_columns_only` is set to 1 in case of using `GLOBAL JOIN` with `SELECT *` (rare case). The issue does not exist in 19.3 and newer. [6bac7d8d](https://github.com/ClickHouse/ClickHouse/pull/4692/commits/6bac7d8d11a9b0d6de0b32b53c47eb2f6f8e7062) ([Artem Zuikov](https://github.com/4ertus2)) -## ClickHouse release 19.1.13, 2019-03-12 +### ClickHouse release 19.1.13, 2019-03-12 This release contains exactly the same set of patches as 19.3.7. -## ClickHouse release 19.1.10, 2019-03-03 +### ClickHouse release 19.1.10, 2019-03-03 This release contains exactly the same set of patches as 19.3.6. -## ClickHouse release 19.1.9, 2019-02-21 +### ClickHouse release 19.1.9, 2019-02-21 -### Bug fixes +#### Bug fixes * Fixed backward incompatibility with old versions due to a wrong implementation of the `send_logs_level` setting. [#4445](https://github.com/ClickHouse/ClickHouse/pull/4445) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed backward incompatibility of the table function `remote` introduced with column comments. [#4446](https://github.com/ClickHouse/ClickHouse/pull/4446) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.1.8, 2019-02-16 +### ClickHouse release 19.1.8, 2019-02-16 -### Bug Fixes +#### Bug Fixes * Fixed installation of the package with a missing /etc/clickhouse-server/config.xml. 
[#4343](https://github.com/ClickHouse/ClickHouse/pull/4343) ([proller](https://github.com/proller)) -## ClickHouse release 19.1.7, 2019-02-15 +### ClickHouse release 19.1.7, 2019-02-15 -### Bug Fixes +#### Bug Fixes * Correctly return the right type and properly handle locks in the `joinGet` function. [#4153](https://github.com/ClickHouse/ClickHouse/pull/4153) ([Amos Bird](https://github.com/amosbird)) * Fixed an error that occurred when system logs were re-created at server shutdown. [#4254](https://github.com/ClickHouse/ClickHouse/pull/4254) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Fixed an error: if there is a database with the `Dictionary` engine, all dictionaries are forced to load at server startup, and if there is a dictionary with a ClickHouse source from localhost, the dictionary cannot load. [#4255](https://github.com/ClickHouse/ClickHouse/pull/4255) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -1795,9 +1815,9 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed a bug with incorrect `Date` and `DateTime` comparison. [#4237](https://github.com/ClickHouse/ClickHouse/pull/4237) ([valexey](https://github.com/valexey)) * Fixed an incorrect result when `Date` and `DateTime` arguments are used in branches of the conditional operator (the function `if`). Added a generic case for the function `if`. [#4243](https://github.com/ClickHouse/ClickHouse/pull/4243) ([alexey-milovidov](https://github.com/alexey-milovidov)) -## ClickHouse release 19.1.6, 2019-01-24 +### ClickHouse release 19.1.6, 2019-01-24 -### New Features +#### New Features * Custom per-column compression codecs for tables. [#3899](https://github.com/ClickHouse/ClickHouse/pull/3899) [#4111](https://github.com/ClickHouse/ClickHouse/pull/4111) ([alesapin](https://github.com/alesapin), [Winter Zhang](https://github.com/zhang2014), [Anatoly](https://github.com/Sindbag)) * Added the compression codec `Delta`. [#4052](https://github.com/ClickHouse/ClickHouse/pull/4052) ([alesapin](https://github.com/alesapin)) @@ -1815,12 +1835,12 @@ This release contains exactly the same set of patches as 19.3.6. * Added the table function `remoteSecure`. The function works like `remote` but uses a secure connection. [#4088](https://github.com/ClickHouse/ClickHouse/pull/4088) ([proller](https://github.com/proller)) -### Experimental features +#### Experimental features * Added multiple JOINs emulation (the `allow_experimental_multiple_joins_emulation` setting). [#3946](https://github.com/ClickHouse/ClickHouse/pull/3946) ([Artem Zuikov](https://github.com/4ertus2)) -### Bug Fixes +#### Bug Fixes * Made the `compiled_expression_cache_size` setting limited by default to lower memory consumption. [#4041](https://github.com/ClickHouse/ClickHouse/pull/4041) ([alesapin](https://github.com/alesapin)) * Fixed a bug that led to hangups in threads that perform ALTERs of Replicated tables and in the thread that updates configuration from ZooKeeper. [#2947](https://github.com/ClickHouse/ClickHouse/issues/2947) [#3891](https://github.com/ClickHouse/ClickHouse/issues/3891) [#3934](https://github.com/ClickHouse/ClickHouse/pull/3934) ([Alex Zatelepin](https://github.com/ztlpn)) @@ -1849,7 +1869,7 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed column aliases for queries with `JOIN ON` syntax and distributed tables. [#3980](https://github.com/ClickHouse/ClickHouse/pull/3980) ([Winter Zhang](https://github.com/zhang2014)) * Fixed an error in the internal implementation of `quantileTDigest` (found by Artem Vakhrushev). 
This error never happens in ClickHouse and was relevant only for those who use the ClickHouse codebase as a library directly. [#3935](https://github.com/ClickHouse/ClickHouse/pull/3935) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Improvements +#### Improvements * Support for `IF NOT EXISTS` in `ALTER TABLE ADD COLUMN` statements along with `IF EXISTS` in `DROP/MODIFY/CLEAR/COMMENT COLUMN`. [#3900](https://github.com/ClickHouse/ClickHouse/pull/3900) ([Boris Granveaud](https://github.com/bgranvea)) * Function `parseDateTimeBestEffort`: support for formats `DD.MM.YYYY`, `DD.MM.YY`, `DD-MM-YYYY`, `DD-Mon-YYYY`, `DD/Month/YYYY` and similar. [#3922](https://github.com/ClickHouse/ClickHouse/pull/3922) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -1864,7 +1884,7 @@ This release contains exactly the same set of patches as 19.3.6. * Added a check that the `SET send_logs_level = 'value'` query accepts an appropriate value. [#3873](https://github.com/ClickHouse/ClickHouse/pull/3873) ([Sabyanin Maxim](https://github.com/s-mx)) * Fixed the data type check in type conversion functions. [#3896](https://github.com/ClickHouse/ClickHouse/pull/3896) ([Winter Zhang](https://github.com/zhang2014)) -### Performance Improvements +#### Performance Improvements * Added a MergeTree setting `use_minimalistic_part_header_in_zookeeper`. If enabled, Replicated tables will store compact part metadata in a single part znode. This can dramatically reduce ZooKeeper snapshot size (especially if the tables have a lot of columns). Note that after enabling this setting you will not be able to downgrade to a version that doesn't support it. [#3960](https://github.com/ClickHouse/ClickHouse/pull/3960) ([Alex Zatelepin](https://github.com/ztlpn)) * Added a DFA-based implementation for the functions `sequenceMatch` and `sequenceCount` in case the pattern doesn't contain time. [#4004](https://github.com/ClickHouse/ClickHouse/pull/4004) ([Léo Ercolanelli](https://github.com/ercolanelli-leo)) @@ -1872,13 +1892,13 @@ This release contains exactly the same set of patches as 19.3.6. * Zero left padding of PODArray so that the -1 element is always valid and zeroed. It's used for branchless calculation of offsets. [#3920](https://github.com/ClickHouse/ClickHouse/pull/3920) ([Amos Bird](https://github.com/amosbird)) * Reverted the `jemalloc` version which led to performance degradation. [#4018](https://github.com/ClickHouse/ClickHouse/pull/4018) ([alexey-milovidov](https://github.com/alexey-milovidov)) -### Backward Incompatible Changes +#### Backward Incompatible Changes * Removed the undocumented feature `ALTER MODIFY PRIMARY KEY` because it was superseded by the `ALTER MODIFY ORDER BY` command. [#3887](https://github.com/ClickHouse/ClickHouse/pull/3887) ([Alex Zatelepin](https://github.com/ztlpn)) * Removed the function `shardByHash`. [#3833](https://github.com/ClickHouse/ClickHouse/pull/3833) ([alexey-milovidov](https://github.com/alexey-milovidov)) * Forbid using scalar subqueries with a result of type `AggregateFunction`. [#3865](https://github.com/ClickHouse/ClickHouse/pull/3865) ([Ivan](https://github.com/abyss7)) -### Build/Testing/Packaging Improvements +#### Build/Testing/Packaging Improvements * Added support for PowerPC (`ppc64le`) build. [#4132](https://github.com/ClickHouse/ClickHouse/pull/4132) ([Danila Kutenin](https://github.com/danlark1)) * Stateful functional tests are run on a publicly available dataset. 
[#3969](https://github.com/ClickHouse/ClickHouse/pull/3969) ([alexey-milovidov](https://github.com/alexey-milovidov)) @@ -1906,24 +1926,25 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed typos in comments. [#4089](https://github.com/ClickHouse/ClickHouse/pull/4089) ([Evgenii Pravda](https://github.com/kvinty)) -## ClickHouse release 18.16.1, 2018-12-21 +## ClickHouse release 18.16 +### ClickHouse release 18.16.1, 2018-12-21 -### Bug fixes: +#### Bug fixes: * Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/ClickHouse/ClickHouse/issues/3825), [#3829](https://github.com/ClickHouse/ClickHouse/issues/3829) * JIT compilation of aggregate functions now works with LowCardinality columns. [#3838](https://github.com/ClickHouse/ClickHouse/issues/3838) -### Improvements: +#### Improvements: * Added the `low_cardinality_allow_in_native_format` setting (enabled by default). When disabled, LowCardinality columns will be converted to ordinary columns for SELECT queries and ordinary columns will be expected for INSERT queries. [#3879](https://github.com/ClickHouse/ClickHouse/pull/3879) -### Build improvements: +#### Build improvements: * Fixes for builds on macOS and ARM. -## ClickHouse release 18.16.0, 2018-12-14 +### ClickHouse release 18.16.0, 2018-12-14 -### New features: +#### New features: * `DEFAULT` expressions are evaluated for missing fields when loading data in semi-structured input formats (`JSONEachRow`, `TSKV`). The feature is enabled with the `insert_sample_with_metadata` setting. [#3555](https://github.com/ClickHouse/ClickHouse/pull/3555) * The `ALTER TABLE` query now has the `MODIFY ORDER BY` action for changing the sorting key when adding or removing a table column. This is useful for tables in the `MergeTree` family that perform additional tasks when merging based on this sorting key, such as `SummingMergeTree`, `AggregatingMergeTree`, and so on. [#3581](https://github.com/ClickHouse/ClickHouse/pull/3581) [#3755](https://github.com/ClickHouse/ClickHouse/pull/3755) @@ -1942,7 +1963,7 @@ This release contains exactly the same set of patches as 19.3.6. * Added the `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, and `is_in_sampling_key` columns to the `system.columns` table. [#3609](https://github.com/ClickHouse/ClickHouse/pull/3609) * Added the `min_time` and `max_time` columns to the `system.parts` table. These columns are populated when the partitioning key is an expression consisting of `DateTime` columns. [Emmanuel Donin de Rosière](https://github.com/ClickHouse/ClickHouse/pull/3800) -### Bug fixes: +#### Bug fixes: * Fixes and performance improvements for the `LowCardinality` data type. `GROUP BY` using `LowCardinality(Nullable(...))`. Getting the values of `extremes`. Processing high-order functions. `LEFT ARRAY JOIN`. Distributed `GROUP BY`. Functions that return `Array`. Execution of `ORDER BY`. Writing to `Distributed` tables (nicelulu). Backward compatibility for `INSERT` queries from old clients that implement the `Native` protocol. Support for `LowCardinality` for `JOIN`. Improved performance when working in a single stream. 
[#3823](https://github.com/ClickHouse/ClickHouse/pull/3823) [#3803](https://github.com/ClickHouse/ClickHouse/pull/3803) [#3799](https://github.com/ClickHouse/ClickHouse/pull/3799) [#3769](https://github.com/ClickHouse/ClickHouse/pull/3769) [#3744](https://github.com/ClickHouse/ClickHouse/pull/3744) [#3681](https://github.com/ClickHouse/ClickHouse/pull/3681) [#3651](https://github.com/ClickHouse/ClickHouse/pull/3651) [#3649](https://github.com/ClickHouse/ClickHouse/pull/3649) [#3641](https://github.com/ClickHouse/ClickHouse/pull/3641) [#3632](https://github.com/ClickHouse/ClickHouse/pull/3632) [#3568](https://github.com/ClickHouse/ClickHouse/pull/3568) [#3523](https://github.com/ClickHouse/ClickHouse/pull/3523) [#3518](https://github.com/ClickHouse/ClickHouse/pull/3518) * Fixed how the `select_sequential_consistency` option works. Previously, when this setting was enabled, an incomplete result was sometimes returned after beginning to write to a new partition. [#2863](https://github.com/ClickHouse/ClickHouse/pull/2863) @@ -1969,7 +1990,7 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed a race condition when reading from `Buffer` tables and simultaneously performing `ALTER` or `DROP` on the target tables. [#3719](https://github.com/ClickHouse/ClickHouse/pull/3719) * Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/ClickHouse/ClickHouse/pull/3788) -### Improvements: +#### Improvements: * The server does not write the processed configuration files to the `/etc/clickhouse-server/` directory. Instead, it saves them in the `preprocessed_configs` directory inside `path`. This means that the `/etc/clickhouse-server/` directory doesn't have write access for the `clickhouse` user, which improves security. [#2443](https://github.com/ClickHouse/ClickHouse/pull/2443) * The `min_merge_bytes_to_use_direct_io` option is set to 10 GiB by default. A merge that forms large parts of tables from the MergeTree family will be performed in `O_DIRECT` mode, which prevents excessive page cache eviction. [#3504](https://github.com/ClickHouse/ClickHouse/pull/3504) @@ -2001,7 +2022,7 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed the behavior of stateful functions like `rowNumberInAllBlocks`. They previously output a result that was one number larger due to starting during query analysis. [Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/3729) * If the `force_restore_data` file can't be deleted, an error message is displayed. [Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/3794) -### Build improvements: +#### Build improvements: * Updated the `jemalloc` library, which fixes a potential memory leak. [Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/3557) * Profiling with `jemalloc` is enabled by default for debug builds. [2cc82f5c](https://github.com/ClickHouse/ClickHouse/commit/2cc82f5cbe266421cd4c1165286c2c47e5ffcb15) @@ -2012,95 +2033,96 @@ This release contains exactly the same set of patches as 19.3.6. * For a Docker image, added support for initializing databases using files in the `/docker-entrypoint-initdb.d` directory. [Konstantin Lebedev](https://github.com/ClickHouse/ClickHouse/pull/3695) * Fixes for builds on ARM. [#3709](https://github.com/ClickHouse/ClickHouse/pull/3709) -### Backward incompatible changes: +#### Backward incompatible changes: * Removed the ability to compare the `Date` type with a number. 
Instead of `toDate('2018-12-18') = 17883`, you must use explicit type conversion `= toDate(17883)` [#3687](https://github.com/ClickHouse/ClickHouse/pull/3687) -## ClickHouse release 18.14.19, 2018-12-19 +## ClickHouse release 18.14 +### ClickHouse release 18.14.19, 2018-12-19 -### Bug fixes: +#### Bug fixes: * Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/ClickHouse/ClickHouse/issues/3825), [#3829](https://github.com/ClickHouse/ClickHouse/issues/3829) * Databases are correctly specified when executing DDL `ON CLUSTER` queries. [#3460](https://github.com/ClickHouse/ClickHouse/pull/3460) * Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/ClickHouse/ClickHouse/pull/3788) -### Build improvements: +#### Build improvements: * Fixes for builds on ARM. -## ClickHouse release 18.14.18, 2018-12-04 +### ClickHouse release 18.14.18, 2018-12-04 -### Bug fixes: +#### Bug fixes: * Fixed an error in the `dictGet...` function for dictionaries of type `range` if one of the arguments is constant and the other is not. [#3751](https://github.com/ClickHouse/ClickHouse/pull/3751) * Fixed an error that caused the messages `netlink: '...': attribute type 1 has an invalid length` to be printed in the Linux kernel log; this happened only on sufficiently recent Linux kernel versions. [#3749](https://github.com/ClickHouse/ClickHouse/pull/3749) * Fixed a segfault in the function `empty` for an argument of `FixedString` type. [Daniel, Dao Quang Minh](https://github.com/ClickHouse/ClickHouse/pull/3703) * Fixed excessive memory allocation when using a large value of the `max_query_size` setting (a memory chunk of `max_query_size` bytes was preallocated at once). [#3720](https://github.com/ClickHouse/ClickHouse/pull/3720) -### Build changes: +#### Build changes: * Fixed build with LLVM/Clang libraries of version 7 from the OS packages (these libraries are used for runtime query compilation). [#3582](https://github.com/ClickHouse/ClickHouse/pull/3582) -## ClickHouse release 18.14.17, 2018-11-30 +### ClickHouse release 18.14.17, 2018-11-30 -### Bug fixes: +#### Bug fixes: * Fixed cases when the ODBC bridge process did not terminate with the main server process. [#3642](https://github.com/ClickHouse/ClickHouse/pull/3642) * Fixed synchronous insertion into the `Distributed` table with a column list that differs from the column list of the remote table. [#3673](https://github.com/ClickHouse/ClickHouse/pull/3673) * Fixed a rare race condition that could lead to a crash when dropping a MergeTree table. [#3643](https://github.com/ClickHouse/ClickHouse/pull/3643) * Fixed a query deadlock in the case when query thread creation fails with the `Resource temporarily unavailable` error. [#3643](https://github.com/ClickHouse/ClickHouse/pull/3643) * Fixed parsing of the `ENGINE` clause when the `CREATE AS table` syntax was used and the `ENGINE` clause was specified before the `AS table` (the error resulted in ignoring the specified engine). [#3692](https://github.com/ClickHouse/ClickHouse/pull/3692) -## ClickHouse release 18.14.15, 2018-11-21 +### ClickHouse release 18.14.15, 2018-11-21 -### Bug fixes: +#### Bug fixes: * The size of a memory chunk was overestimated while deserializing a column of type `Array(String)`, which led to "Memory limit exceeded" errors. The issue appeared in version 18.12.13. 
[#3589](https://github.com/ClickHouse/ClickHouse/issues/3589) -## ClickHouse release 18.14.14, 2018-11-20 +### ClickHouse release 18.14.14, 2018-11-20 -### Bug fixes: +#### Bug fixes: * Fixed `ON CLUSTER` queries when the cluster is configured as secure (the `<secure>` flag). [#3599](https://github.com/ClickHouse/ClickHouse/pull/3599) -### Build changes: +#### Build changes: * Fixed build problems (llvm-7 from the system, macOS) [#3582](https://github.com/ClickHouse/ClickHouse/pull/3582) -## ClickHouse release 18.14.13, 2018-11-08 +### ClickHouse release 18.14.13, 2018-11-08 -### Bug fixes: +#### Bug fixes: * Fixed the `Block structure mismatch in MergingSorted stream` error. [#3162](https://github.com/ClickHouse/ClickHouse/issues/3162) * Fixed `ON CLUSTER` queries in the case when secure connections were turned on in the cluster config (the `<secure>` flag). [#3465](https://github.com/ClickHouse/ClickHouse/pull/3465) * Fixed an error in queries that used `SAMPLE`, `PREWHERE` and alias columns. [#3543](https://github.com/ClickHouse/ClickHouse/pull/3543) * Fixed a rare `unknown compression method` error when the `min_bytes_to_use_direct_io` setting was enabled. [#3544](https://github.com/ClickHouse/ClickHouse/pull/3544) -### Performance improvements: +#### Performance improvements: * Fixed a performance regression of queries with `GROUP BY` of columns of UInt16 or Date type when executing on AMD EPYC processors. [Igor Lapko](https://github.com/ClickHouse/ClickHouse/pull/3512) * Fixed a performance regression of queries that process long strings. [#3530](https://github.com/ClickHouse/ClickHouse/pull/3530) -### Build improvements: +#### Build improvements: * Improvements for simplifying the Arcadia build. [#3475](https://github.com/ClickHouse/ClickHouse/pull/3475), [#3535](https://github.com/ClickHouse/ClickHouse/pull/3535) -## ClickHouse release 18.14.12, 2018-11-02 +### ClickHouse release 18.14.12, 2018-11-02 -### Bug fixes: +#### Bug fixes: * Fixed a crash on joining two unnamed subqueries. [#3505](https://github.com/ClickHouse/ClickHouse/pull/3505) * Fixed generating incorrect queries (with an empty `WHERE` clause) when querying external databases. [hotid](https://github.com/ClickHouse/ClickHouse/pull/3477) * Fixed using an incorrect timeout value in ODBC dictionaries. [Marek Vavruša](https://github.com/ClickHouse/ClickHouse/pull/3511) -## ClickHouse release 18.14.11, 2018-10-29 +### ClickHouse release 18.14.11, 2018-10-29 -### Bug fixes: +#### Bug fixes: * Fixed the error `Block structure mismatch in UNION stream: different number of columns` in LIMIT queries. [#2156](https://github.com/ClickHouse/ClickHouse/issues/2156) * Fixed errors when merging data in tables containing arrays inside Nested structures. [#3397](https://github.com/ClickHouse/ClickHouse/pull/3397) * Fixed incorrect query results if the `merge_tree_uniform_read_distribution` setting is disabled (it is enabled by default). [#3429](https://github.com/ClickHouse/ClickHouse/pull/3429) * Fixed an error on inserts to a Distributed table in Native format. [#3411](https://github.com/ClickHouse/ClickHouse/issues/3411) -## ClickHouse release 18.14.10, 2018-10-23 +### ClickHouse release 18.14.10, 2018-10-23 * The `compile_expressions` setting (JIT compilation of expressions) is disabled by default. [#3410](https://github.com/ClickHouse/ClickHouse/pull/3410) * The `enable_optimize_predicate_expression` setting is disabled by default. 
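Both settings remain available and can be turned back on per session if the old behavior is preferred; a minimal sketch:

```sql
-- Re-enable the two features that 18.14.10 turned off by default.
SET compile_expressions = 1;                   -- JIT compilation of expressions
SET enable_optimize_predicate_expression = 1;  -- predicate optimization
```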
-## ClickHouse release 18.14.9, 2018-10-16 +### ClickHouse release 18.14.9, 2018-10-16 -### New features: +#### New features: * The `WITH CUBE` modifier for `GROUP BY` (the alternative syntax `GROUP BY CUBE(...)` is also available). [#3172](https://github.com/ClickHouse/ClickHouse/pull/3172) * Added the `formatDateTime` function. [Alexandr Krasheninnikov](https://github.com/ClickHouse/ClickHouse/pull/2770) @@ -2113,12 +2135,12 @@ This release contains exactly the same set of patches as 19.3.6. * Now you can use pre-defined `database` and `table` macros when declaring `Replicated` tables. [#3251](https://github.com/ClickHouse/ClickHouse/pull/3251) * Added the ability to read `Decimal` type values in engineering notation (indicating powers of ten). [#3153](https://github.com/ClickHouse/ClickHouse/pull/3153) -### Experimental features: +#### Experimental features: * Optimization of the `GROUP BY` clause for `LowCardinality` data types. [#3138](https://github.com/ClickHouse/ClickHouse/pull/3138) * Optimized calculation of expressions for `LowCardinality` data types. [#3200](https://github.com/ClickHouse/ClickHouse/pull/3200) -### Improvements: +#### Improvements: * Significantly reduced memory consumption for queries with `ORDER BY` and `LIMIT`. See the `max_bytes_before_remerge_sort` setting. [#3205](https://github.com/ClickHouse/ClickHouse/pull/3205) * In the absence of `JOIN` (`LEFT`, `INNER`, ...), `INNER JOIN` is assumed. [#3147](https://github.com/ClickHouse/ClickHouse/pull/3147) @@ -2149,7 +2171,7 @@ This release contains exactly the same set of patches as 19.3.6. * Reduced the number of `open` and `close` system calls when reading from a `MergeTree` table. [#3283](https://github.com/ClickHouse/ClickHouse/pull/3283) * A `TRUNCATE TABLE` query can be executed on any replica (the query is passed to the leader replica). [Kirill Shvakov](https://github.com/ClickHouse/ClickHouse/pull/3375) -### Bug fixes: +#### Bug fixes: * Fixed an issue with `Dictionary` tables for `range_hashed` dictionaries. This error occurred in version 18.12.17. [#1702](https://github.com/ClickHouse/ClickHouse/pull/1702) * Fixed an error when loading `range_hashed` dictionaries (the message `Unsupported type Nullable (...)`). This error occurred in version 18.12.17. [#3362](https://github.com/ClickHouse/ClickHouse/pull/3362) @@ -2185,13 +2207,15 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed a segfault that could occur in rare cases after an optimization that replaced AND chains of equality evaluations with the corresponding IN expression. [liuyimin-bytedance](https://github.com/ClickHouse/ClickHouse/pull/3339) * Minor corrections to `clickhouse-benchmark`: previously, client information was not sent to the server; now the number of queries executed is calculated more accurately when shutting down and for limiting the number of iterations. [#3351](https://github.com/ClickHouse/ClickHouse/pull/3351) [#3352](https://github.com/ClickHouse/ClickHouse/pull/3352) -### Backward incompatible changes: +#### Backward incompatible changes: * Removed the `allow_experimental_decimal_type` option. The `Decimal` data type is now available for use by default. 
[#3329](https://github.com/ClickHouse/ClickHouse/pull/3329) -## ClickHouse release 18.12.17, 2018-09-16 +## ClickHouse release 18.12 -### New features: +### ClickHouse release 18.12.17, 2018-09-16 + +#### New features: * `invalidate_query` (the ability to specify a query to check whether an external dictionary needs to be updated) is implemented for the `clickhouse` source. [#3126](https://github.com/ClickHouse/ClickHouse/pull/3126) * Added the ability to use `UInt*`, `Int*`, and `DateTime` data types (along with the `Date` type) as a `range_hashed` external dictionary key that defines the boundaries of ranges. Now `NULL` can be used to designate an open range. [Vasily Nemkov](https://github.com/ClickHouse/ClickHouse/pull/3123) @@ -2199,32 +2223,32 @@ This release contains exactly the same set of patches as 19.3.6. * The `Decimal` type now supports mathematical functions (`exp`, `sin` and so on.) [#3129](https://github.com/ClickHouse/ClickHouse/pull/3129) * The `system.part_log` table now has the `partition_id` column. [#3089](https://github.com/ClickHouse/ClickHouse/pull/3089) -### Bug fixes: +#### Bug fixes: * `Merge` now works correctly on `Distributed` tables. [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/3159) * Fixed incompatibility (unnecessary dependency on the `glibc` version) that made it impossible to run ClickHouse on `Ubuntu Precise` and older versions. The incompatibility arose in version 18.12.13. [#3130](https://github.com/ClickHouse/ClickHouse/pull/3130) * Fixed errors in the `enable_optimize_predicate_expression` setting. [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/3107) * Fixed a minor issue with backwards compatibility that appeared when working with a cluster of replicas on versions earlier than 18.12.13 and simultaneously creating a new replica of a table on a server with a newer version (shown in the message `Can not clone replica, because the ... updated to new ClickHouse version`, which is logical, but shouldn't happen). [#3122](https://github.com/ClickHouse/ClickHouse/pull/3122) -### Backward incompatible changes: +#### Backward incompatible changes: * The `enable_optimize_predicate_expression` option is enabled by default (which is rather optimistic). If query analysis errors occur that are related to searching for the column names, set `enable_optimize_predicate_expression` to 0. [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/3107) -## ClickHouse release 18.12.14, 2018-09-13 +### ClickHouse release 18.12.14, 2018-09-13 -### New features: +#### New features: * Added support for `ALTER UPDATE` queries. [#3035](https://github.com/ClickHouse/ClickHouse/pull/3035) * Added the `allow_ddl` option, which restricts the user's access to DDL queries. [#3104](https://github.com/ClickHouse/ClickHouse/pull/3104) * Added the `min_merge_bytes_to_use_direct_io` option for `MergeTree` engines, which allows you to set a threshold for the total size of the merge (when above the threshold, data part files will be handled using O_DIRECT). [#3117](https://github.com/ClickHouse/ClickHouse/pull/3117) * The `system.merges` system table now contains the `partition_id` column. [#3099](https://github.com/ClickHouse/ClickHouse/pull/3099) -### Improvements +#### Improvements * If a data part remains unchanged during mutation, it isn't downloaded by replicas. [#3103](https://github.com/ClickHouse/ClickHouse/pull/3103) * Autocomplete is available for names of settings when working with `clickhouse-client`. 
[#3106](https://github.com/ClickHouse/ClickHouse/pull/3106) -### Bug fixes: +#### Bug fixes: * Added a check for the sizes of arrays that are elements of `Nested` type fields when inserting. [#3118](https://github.com/ClickHouse/ClickHouse/pull/3118) * Fixed an error updating external dictionaries with the `ODBC` source and `hashed` storage. This error occurred in version 18.12.13. @@ -2232,9 +2256,9 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed an error in aggregate functions for arrays that can have `NULL` elements. [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/3097) -## ClickHouse release 18.12.13, 2018-09-10 +### ClickHouse release 18.12.13, 2018-09-10 -### New features: +#### New features: * Added the `DECIMAL(digits, scale)` data type (`Decimal32(scale)`, `Decimal64(scale)`, `Decimal128(scale)`). To enable it, use the setting `allow_experimental_decimal_type`. [#2846](https://github.com/ClickHouse/ClickHouse/pull/2846) [#2970](https://github.com/ClickHouse/ClickHouse/pull/2970) [#3008](https://github.com/ClickHouse/ClickHouse/pull/3008) [#3047](https://github.com/ClickHouse/ClickHouse/pull/3047) * New `WITH ROLLUP` modifier for `GROUP BY` (alternative syntax: `GROUP BY ROLLUP(...)`). [#2948](https://github.com/ClickHouse/ClickHouse/pull/2948) @@ -2258,12 +2282,12 @@ This release contains exactly the same set of patches as 19.3.6. * Now you can add (merge) states of aggregate functions by using the plus operator, and multiply the states of aggregate functions by a nonnegative constant. [#3062](https://github.com/ClickHouse/ClickHouse/pull/3062) [#3034](https://github.com/ClickHouse/ClickHouse/pull/3034) * Tables in the MergeTree family now have the virtual column `_partition_id`. [#3089](https://github.com/ClickHouse/ClickHouse/pull/3089) -### Experimental features: +#### Experimental features: * Added the `LowCardinality(T)` data type. This data type automatically creates a local dictionary of values and allows data processing without unpacking the dictionary. [#2830](https://github.com/ClickHouse/ClickHouse/pull/2830) * Added a cache of JIT-compiled functions and a counter for the number of uses before compiling. To JIT compile expressions, enable the `compile_expressions` setting. [#2990](https://github.com/ClickHouse/ClickHouse/pull/2990) [#3077](https://github.com/ClickHouse/ClickHouse/pull/3077) -### Improvements: +#### Improvements: * Fixed the problem with unlimited accumulation of the replication log when there are abandoned replicas. Added an effective recovery mode for replicas with a long lag. * Improved performance of `GROUP BY` with multiple aggregation fields when one of them is string and the others are fixed length. @@ -2292,7 +2316,7 @@ This release contains exactly the same set of patches as 19.3.6. * Added randomization when running the cleanup thread periodically for `ReplicatedMergeTree` tables in order to avoid periodic load spikes when there are a very large number of `ReplicatedMergeTree` tables. * Support for `ATTACH TABLE ... ON CLUSTER` queries. [#3025](https://github.com/ClickHouse/ClickHouse/pull/3025) -### Bug fixes: +#### Bug fixes: * Fixed an issue with `Dictionary` tables (throws the `Size of offsets doesn't match size of column` or `Unknown compression method` exception). This bug appeared in version 18.10.3. 
[#2913](https://github.com/ClickHouse/ClickHouse/issues/2913) * Fixed a bug when merging `CollapsingMergeTree` tables if one of the data parts is empty (these parts are formed during merge or `ALTER DELETE` if all data was deleted), and the `vertical` algorithm was used for the merge. [#3049](https://github.com/ClickHouse/ClickHouse/pull/3049) @@ -2316,17 +2340,17 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed incorrect code for adding nested data structures in a `SummingMergeTree`. * When allocating memory for states of aggregate functions, alignment is correctly taken into account, which makes it possible to use operations that require alignment when implementing states of aggregate functions. [chenxing-xc](https://github.com/ClickHouse/ClickHouse/pull/2808) -### Security fix: +#### Security fix: * Safe use of ODBC data sources. Interaction with ODBC drivers uses a separate `clickhouse-odbc-bridge` process. Errors in third-party ODBC drivers no longer cause problems with server stability or vulnerabilities. [#2828](https://github.com/ClickHouse/ClickHouse/pull/2828) [#2879](https://github.com/ClickHouse/ClickHouse/pull/2879) [#2886](https://github.com/ClickHouse/ClickHouse/pull/2886) [#2893](https://github.com/ClickHouse/ClickHouse/pull/2893) [#2921](https://github.com/ClickHouse/ClickHouse/pull/2921) * Fixed incorrect validation of the file path in the `catBoostPool` table function. [#2894](https://github.com/ClickHouse/ClickHouse/pull/2894) * The contents of system tables (`tables`, `databases`, `parts`, `columns`, `parts_columns`, `merges`, `mutations`, `replicas`, and `replication_queue`) are filtered according to the user's configured access to databases (`allow_databases`). [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/2856) -### Backward incompatible changes: +#### Backward incompatible changes: * In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level. -### Build changes: +#### Build changes: * Most integration tests can now be run by commit. * Code style checks can also be run by commit. @@ -2335,16 +2359,18 @@ This release contains exactly the same set of patches as 19.3.6. * Debugging the build uses the `jemalloc` debug option. * The interface of the library for interacting with ZooKeeper is declared abstract. [#2950](https://github.com/ClickHouse/ClickHouse/pull/2950) -## ClickHouse release 18.10.3, 2018-08-13 +## ClickHouse release 18.10 -### New features: +### ClickHouse release 18.10.3, 2018-08-13 + +#### New features: * HTTPS can be used for replication. [#2760](https://github.com/ClickHouse/ClickHouse/pull/2760) * Added the functions `murmurHash2_64`, `murmurHash3_32`, `murmurHash3_64`, and `murmurHash3_128` in addition to the existing `murmurHash2_32`. [#2791](https://github.com/ClickHouse/ClickHouse/pull/2791) * Support for Nullable types in the ClickHouse ODBC driver (`ODBCDriver2` output format). [#2834](https://github.com/ClickHouse/ClickHouse/pull/2834) * Support for `UUID` in the key columns. -### Improvements: +#### Improvements: * Clusters can be removed without restarting the server when they are deleted from the config files. [#2777](https://github.com/ClickHouse/ClickHouse/pull/2777) * External dictionaries can be removed without restarting the server when they are removed from config files. 
[#2779](https://github.com/ClickHouse/ClickHouse/pull/2779) @@ -2360,7 +2386,7 @@ This release contains exactly the same set of patches as 19.3.6. * Added the `prefer_localhost_replica` setting for disabling the preference for a local replica and going to a local replica without inter-process interaction. [#2832](https://github.com/ClickHouse/ClickHouse/pull/2832) * The `quantileExact` aggregate function returns `nan` in the case of aggregation on an empty `Float32` or `Float64` set. [Sundy Li](https://github.com/ClickHouse/ClickHouse/pull/2855) -### Bug fixes: +#### Bug fixes: * Removed unnecessary escaping of the connection string parameters for ODBC, which made it impossible to establish a connection. This error occurred in version 18.6.0. * Fixed the logic for processing `REPLACE PARTITION` commands in the replication queue. If there are two `REPLACE` commands for the same partition, the incorrect logic could cause one of them to remain in the replication queue and not be executed. [#2814](https://github.com/ClickHouse/ClickHouse/pull/2814) @@ -2371,11 +2397,11 @@ This release contains exactly the same set of patches as 19.3.6. * Fixed incorrect clickhouse-client response code in case of a query error. * Fixed incorrect behavior of materialized views containing DISTINCT. [#2795](https://github.com/ClickHouse/ClickHouse/issues/2795) -### Backward incompatible changes +#### Backward incompatible changes * Removed support for CHECK TABLE queries for Distributed tables. -### Build changes: +#### Build changes: * The allocator has been replaced: `jemalloc` is now used instead of `tcmalloc`. In some scenarios, this increases speed up to 20%. However, there are queries that have slowed by up to 20%. Memory consumption has been reduced by approximately 10% in some scenarios, with improved stability. With highly competitive loads, CPU usage in userspace and in system shows just a slight increase. [#2773](https://github.com/ClickHouse/ClickHouse/pull/2773) * Use of libressl from a submodule. [#1983](https://github.com/ClickHouse/ClickHouse/pull/1983) [#2807](https://github.com/ClickHouse/ClickHouse/pull/2807) @@ -2383,37 +2409,43 @@ This release contains exactly the same set of patches as 19.3.6. * Use of mariadb-connector-c from a submodule. [#2785](https://github.com/ClickHouse/ClickHouse/pull/2785) * Added functional test files to the repository that depend on the availability of test data (for the time being, without the test data itself). -## ClickHouse release 18.6.0, 2018-08-02 +## ClickHouse release 18.6 -### New features: +### ClickHouse release 18.6.0, 2018-08-02 + +#### New features: * Added support for ON expressions for the JOIN ON syntax: `JOIN ON Expr([table.]column ...) = Expr([table.]column, ...) [AND Expr([table.]column, ...) = Expr([table.]column, ...) ...]` The expression must be a chain of equalities joined by the AND operator. Each side of the equality can be an arbitrary expression over the columns of one of the tables. The use of fully qualified column names is supported (`table.name`, `database.table.name`, `table_alias.name`, `subquery_alias.name`) for the right table. [#2742](https://github.com/ClickHouse/ClickHouse/pull/2742) * HTTPS can be enabled for replication. [#2760](https://github.com/ClickHouse/ClickHouse/pull/2760) -### Improvements: +#### Improvements: * The server passes the patch component of its version to the client. Data about the patch version component is in `system.processes` and `query_log`. 
[#2646](https://github.com/ClickHouse/ClickHouse/pull/2646) -## ClickHouse release 18.5.1, 2018-07-31 +## ClickHouse release 18.5 -### New features: +### ClickHouse release 18.5.1, 2018-07-31 + +#### New features: * Added the hash function `murmurHash2_32` [#2756](https://github.com/ClickHouse/ClickHouse/pull/2756). -### Improvements: +#### Improvements: * Now you can use the `from_env` [#2741](https://github.com/ClickHouse/ClickHouse/pull/2741) attribute to set values in config files from environment variables. * Added case-insensitive versions of the `coalesce`, `ifNull`, and `nullIf` functions [#2752](https://github.com/ClickHouse/ClickHouse/pull/2752). -### Bug fixes: +#### Bug fixes: * Fixed a possible bug when starting a replica [#2759](https://github.com/ClickHouse/ClickHouse/pull/2759). -## ClickHouse release 18.4.0, 2018-07-28 +## ClickHouse release 18.4 -### New features: +### ClickHouse release 18.4.0, 2018-07-28 + +#### New features: * Added system tables: `formats`, `data_type_families`, `aggregate_function_combinators`, `table_functions`, `table_engines`, `collations` [#2721](https://github.com/ClickHouse/ClickHouse/pull/2721). * Added the ability to use a table function instead of a table as an argument of a `remote` or `cluster` table function [#2708](https://github.com/ClickHouse/ClickHouse/pull/2708). @@ -2421,26 +2453,28 @@ The expression must be a chain of equalities joined by the AND operator. Each si * The `has` function now allows searching for a numeric value in an array of `Enum` values [Maxim Khrisanfov](https://github.com/ClickHouse/ClickHouse/pull/2699). * Support for adding arbitrary message separators when reading from `Kafka` [Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/2701). -### Improvements: +#### Improvements: * The `ALTER TABLE t DELETE WHERE` query does not rewrite data parts that were not affected by the WHERE condition [#2694](https://github.com/ClickHouse/ClickHouse/pull/2694). * The `use_minimalistic_checksums_in_zookeeper` option for `ReplicatedMergeTree` tables is enabled by default. This setting was added in version 1.1.54378, 2018-04-16. Versions that are older than 1.1.54378 can no longer be installed. * Support for running `KILL` and `OPTIMIZE` queries that specify `ON CLUSTER` [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/2689). -### Bug fixes: +#### Bug fixes: * Fixed the error `Column ... is not under an aggregate function and not in GROUP BY` for aggregation with an IN expression. This bug appeared in version 18.1.0. ([bbdd780b](https://github.com/ClickHouse/ClickHouse/commit/bbdd780be0be06a0f336775941cdd536878dd2c2)) * Fixed a bug in the `windowFunnel` aggregate function [Winter Zhang](https://github.com/ClickHouse/ClickHouse/pull/2735). * Fixed a bug in the `anyHeavy` aggregate function ([a2101df2](https://github.com/ClickHouse/ClickHouse/commit/a2101df25a6a0fba99aa71f8793d762af2b801ee)) * Fixed a server crash when using the `countArray()` aggregate function. -### Backward incompatible changes: +#### Backward incompatible changes: * Parameters for the `Kafka` engine were changed from `Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format[, kafka_schema, kafka_num_consumers])` to `Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format[, kafka_row_delimiter, kafka_schema, kafka_num_consumers])`. 
If your tables use the `kafka_schema` or `kafka_num_consumers` parameters, you have to manually edit the metadata files `path/metadata/database/table.sql` and add the `kafka_row_delimiter` parameter with the value `''`. -## ClickHouse release 18.1.0, 2018-07-23 +## ClickHouse release 18.1 -### New features: +### ClickHouse release 18.1.0, 2018-07-23 + +#### New features: * Support for the `ALTER TABLE t DELETE WHERE` query for non-replicated MergeTree tables ([#2634](https://github.com/ClickHouse/ClickHouse/pull/2634)). * Support for arbitrary types for the `uniq*` family of aggregate functions ([#2010](https://github.com/ClickHouse/ClickHouse/issues/2010)). @@ -2449,13 +2483,13 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Added the `arrayDistinct` function ([#2670](https://github.com/ClickHouse/ClickHouse/pull/2670)). * The SummingMergeTree engine can now work with AggregateFunction type columns ([Constantin S. Pan](https://github.com/ClickHouse/ClickHouse/pull/2566)). -### Improvements: +#### Improvements: * Changed the numbering scheme for release versions. Now the first part contains the year of release (A.D., Moscow timezone, minus 2000), the second part contains the number for major changes (increases for most releases), and the third part is the patch version. Releases are still backward compatible, unless otherwise stated in the changelog. * Faster conversions of floating-point numbers to a string ([Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/2664)). * If some rows were skipped during an insert due to parsing errors (this is possible with the `input_allow_errors_num` and `input_allow_errors_ratio` settings enabled), the number of skipped rows is now written to the server log ([Leonardo Cecchi](https://github.com/ClickHouse/ClickHouse/pull/2669)). -### Bug fixes: +#### Bug fixes: * Fixed the TRUNCATE command for temporary tables ([Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/2624)). * Fixed a rare deadlock in the ZooKeeper client library that occurred when there was a network error while reading the response ([c315200](https://github.com/ClickHouse/ClickHouse/commit/c315200e64b87e44bdf740707fc857d1fdf7e947)). @@ -2466,18 +2500,20 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Fixed incompatibility between servers with different versions in distributed queries that use a `CAST` function that isn't in uppercase letters ([fe8c4d6](https://github.com/ClickHouse/ClickHouse/commit/fe8c4d64e434cacd4ceef34faa9005129f2190a5)). * Added missing quoting of identifiers for queries to an external DBMS ([#2635](https://github.com/ClickHouse/ClickHouse/issues/2635)). -### Backward incompatible changes: +#### Backward incompatible changes: * Converting a string containing the number zero to DateTime does not work. Example: `SELECT toDateTime('0')`. This is also the reason that `DateTime DEFAULT '0'` does not work in tables, as well as `0` in dictionaries. Solution: replace `0` with `0000-00-00 00:00:00`. -## ClickHouse release 1.1.54394, 2018-07-12 +## ClickHouse release 1.1 -### New features: +### ClickHouse release 1.1.54394, 2018-07-12 + +#### New features: * Added the `histogram` aggregate function ([Mikhail Surin](https://github.com/ClickHouse/ClickHouse/pull/2521)). * Now `OPTIMIZE TABLE ... FINAL` can be used without specifying partitions for `ReplicatedMergeTree` ([Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/2600)).
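A short sketch of the two 1.1.54394 additions above; the `hits` table and `duration_ms` column are hypothetical:

```sql
-- histogram(N)(x) builds an adaptive histogram with up to N bins.
SELECT histogram(10)(duration_ms) FROM hits;

-- FINAL no longer requires a PARTITION clause for ReplicatedMergeTree tables.
OPTIMIZE TABLE hits FINAL;
```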
-### Bug fixes: +#### Bug fixes: * Fixed a problem with a very small timeout for sockets (one second) for reading and writing when sending and downloading replicated data, which made it impossible to download larger parts if there is a load on the network or disk (it resulted in cyclical attempts to download parts). This error occurred in version 1.1.54388. * Fixed issues when using chroot in ZooKeeper if you inserted duplicate data blocks in the table. @@ -2486,15 +2522,15 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Fixed how an empty `TinyLog` table works after inserting an empty data block ([#2563](https://github.com/ClickHouse/ClickHouse/issues/2563)). * The `system.zookeeper` table works if the value of the node in ZooKeeper is NULL. -## ClickHouse release 1.1.54390, 2018-07-06 +### ClickHouse release 1.1.54390, 2018-07-06 -### New features: +#### New features: * Queries can be sent in `multipart/form-data` format (in the `query` field), which is useful if external data is also sent for query processing ([Olga Hvostikova](https://github.com/ClickHouse/ClickHouse/pull/2490)). * Added the ability to enable or disable processing single or double quotes when reading data in CSV format. You can configure this in the `format_csv_allow_single_quotes` and `format_csv_allow_double_quotes` settings ([Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/2574)). * Now `OPTIMIZE TABLE ... FINAL` can be used without specifying the partition for non-replicated variants of `MergeTree` ([Amos Bird](https://github.com/ClickHouse/ClickHouse/pull/2599)). -### Improvements: +#### Improvements: * Improved performance, reduced memory consumption, and correct memory consumption tracking with use of the IN operator when a table index could be used ([#2584](https://github.com/ClickHouse/ClickHouse/pull/2584)). * Removed redundant checking of checksums when adding a data part. This is important when there are a large number of replicas, because in these cases the total number of checks was equal to N^2. @@ -2504,7 +2540,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Faster selection of data parts for merging in `ReplicatedMergeTree` tables. Faster recovery of the ZooKeeper session ([#2597](https://github.com/ClickHouse/ClickHouse/pull/2597)). * The `format_version.txt` file for `MergeTree` tables is re-created if it is missing, which makes sense if ClickHouse is launched after copying the directory structure without files ([Ciprian Hacman](https://github.com/ClickHouse/ClickHouse/pull/2593)). -### Bug fixes: +#### Bug fixes: * Fixed a bug when working with ZooKeeper that could make it impossible to recover the session and readonly states of tables before restarting the server. * Fixed a bug when working with ZooKeeper that could result in old nodes not being deleted if the session is interrupted. @@ -2514,13 +2550,13 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Fixed switching to the default database when reconnecting the client ([#2583](https://github.com/ClickHouse/ClickHouse/pull/2583)). * Fixed a bug that occurred when the `use_index_for_in_with_subqueries` setting was disabled. -### Security fix: +#### Security fix: * Sending files is no longer possible when connected to MySQL (`LOAD DATA LOCAL INFILE`). 
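As a usage note for the CSV quoting settings added in 1.1.54390 above, a minimal sketch:

```sql
SET format_csv_allow_single_quotes = 0;  -- treat ' as an ordinary character
SET format_csv_allow_double_quotes = 1;  -- keep " as a field quote
```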
-## ClickHouse release 1.1.54388, 2018-06-28 +### ClickHouse release 1.1.54388, 2018-06-28 -### New features: +#### New features: * Support for the `ALTER TABLE t DELETE WHERE` query for replicated tables. Added the `system.mutations` table to track the progress of such queries. * Support for the `ALTER TABLE t [REPLACE|ATTACH] PARTITION` query for \*MergeTree tables. @@ -2538,12 +2574,12 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Added the `date_time_input_format` setting. If you switch this setting to `'best_effort'`, DateTime values will be read in a wide range of formats. * Added the `clickhouse-obfuscator` utility for data obfuscation. Usage example: publishing data used in performance tests. -### Experimental features: +#### Experimental features: * Added the ability to calculate `and` arguments only where they are needed ([Anastasia Tsarkova](https://github.com/ClickHouse/ClickHouse/pull/2272)). * JIT compilation to native code is now available for some expressions ([pyos](https://github.com/ClickHouse/ClickHouse/pull/2277)). -### Bug fixes: +#### Bug fixes: * Duplicates no longer appear for a query with `DISTINCT` and `ORDER BY`. * Queries with `ARRAY JOIN` and `arrayFilter` no longer return an incorrect result. @@ -2565,7 +2601,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Fixed SSRF in the `remote()` table function. * Fixed the exit behavior of `clickhouse-client` in multiline mode ([#2510](https://github.com/ClickHouse/ClickHouse/issues/2510)). -### Improvements: +#### Improvements: * Background tasks in replicated tables are now performed in a thread pool instead of in separate threads ([Silviu Caragea](https://github.com/ClickHouse/ClickHouse/pull/1722)). * Improved LZ4 compression performance. @@ -2578,7 +2614,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * When calculating the number of available CPU cores, limits on cgroups are now taken into account ([Atri Sharma](https://github.com/ClickHouse/ClickHouse/pull/2325)). * Added chown for config directories in the systemd config file ([Mikhail Shiryaev](https://github.com/ClickHouse/ClickHouse/pull/2421)). -### Build changes: +#### Build changes: * The gcc8 compiler can be used for builds. * Added the ability to build llvm from a submodule. @@ -2589,41 +2625,41 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Added the ability to use the libtinfo library instead of libtermcap ([Georgy Kondratiev](https://github.com/ClickHouse/ClickHouse/pull/2519)). * Fixed a header file conflict in Fedora Rawhide ([#2520](https://github.com/ClickHouse/ClickHouse/issues/2520)). -### Backward incompatible changes: +#### Backward incompatible changes: * Removed escaping in `Vertical` and `Pretty*` formats and deleted the `VerticalRaw` format. * If servers with version 1.1.54388 (or newer) and servers with an older version are used simultaneously in a distributed query and the query has the `cast(x, 'Type')` expression without the `AS` keyword and doesn't have the word `cast` in uppercase, an exception will be thrown with a message like `Not found column cast(0, 'UInt8') in block`. Solution: Update the server on the entire cluster. -## ClickHouse release 1.1.54385, 2018-06-01 +### ClickHouse release 1.1.54385, 2018-06-01 -### Bug fixes: +#### Bug fixes: * Fixed an error that in some cases caused ZooKeeper operations to block. 
-## ClickHouse release 1.1.54383, 2018-05-22 +### ClickHouse release 1.1.54383, 2018-05-22 -### Bug fixes: +#### Bug fixes: * Fixed a slowdown of the replication queue if a table has many replicas. -## ClickHouse release 1.1.54381, 2018-05-14 +### ClickHouse release 1.1.54381, 2018-05-14 -### Bug fixes: +#### Bug fixes: * Fixed a leak of ZooKeeper nodes when ClickHouse loses its connection to the ZooKeeper server. -## ClickHouse release 1.1.54380, 2018-04-21 +### ClickHouse release 1.1.54380, 2018-04-21 -### New features: +#### New features: * Added the table function `file(path, format, structure)`. An example reading bytes from `/dev/urandom`: `ln -s /dev/urandom /var/lib/clickhouse/user_files/random` followed by `clickhouse-client -q "SELECT * FROM file('random', 'RowBinary', 'd UInt8') LIMIT 10"`. -### Improvements: +#### Improvements: * Subqueries can be wrapped in `()` brackets to enhance query readability. For example: `(SELECT 1) UNION ALL (SELECT 1)`. * Simple `SELECT` queries from the `system.processes` table are not included in the `max_concurrent_queries` limit. -### Bug fixes: +#### Bug fixes: * Fixed incorrect behavior of the `IN` operator when selecting from a `MATERIALIZED VIEW`. * Fixed incorrect filtering by partition index in expressions like `partition_key_column IN (...)`. @@ -2632,13 +2668,13 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Fixed freezing of `KILL QUERY`. * Fixed an error in the ZooKeeper client library which led to loss of watches, freezing of the distributed DDL queue, and slowdowns in the replication queue if a non-empty `chroot` prefix is used in the ZooKeeper configuration. -### Backward incompatible changes: +#### Backward incompatible changes: * Removed support for expressions like `(a, b) IN (SELECT (a, b))` (you can use the equivalent expression `(a, b) IN (SELECT a, b)`). In previous releases, these expressions led to undetermined `WHERE` filtering or caused errors. -## ClickHouse release 1.1.54378, 2018-04-16 +### ClickHouse release 1.1.54378, 2018-04-16 -### New features: +#### New features: * Logging level can be changed without restarting the server. * Added the `SHOW CREATE DATABASE` query. @@ -2652,7 +2688,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Multiple comma-separated `topics` can be specified for the `Kafka` engine (Tobias Adamson). * When a query is stopped by `KILL QUERY` or `replace_running_query`, the client receives the `Query was canceled` exception instead of an incomplete result. -### Improvements: +#### Improvements: * `ALTER TABLE ... DROP/DETACH PARTITION` queries are run at the front of the replication queue. * `SELECT ... FINAL` and `OPTIMIZE ... FINAL` can be used even when the table has a single data part. @@ -2663,7 +2699,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * More robust crash recovery for asynchronous insertion into `Distributed` tables. * The return type of the `countEqual` function changed from `UInt32` to `UInt64` (谢磊). -### Bug fixes: +#### Bug fixes: * Fixed an error with `IN` when the left side of the expression is `Nullable`. * Correct results are now returned when using tuples with `IN` when some of the tuple components are in the table index. @@ -2679,30 +2715,30 @@ The expression must be a chain of equalities joined by the AND operator. Each si * `SummingMergeTree` now works correctly for summation of nested data structures with a composite key.
* Fixed the possibility of a race condition when choosing the leader for `ReplicatedMergeTree` tables. -### Build changes: +#### Build changes: * The build supports `ninja` instead of `make` and uses `ninja` by default for building releases. * Renamed packages: `clickhouse-server-base` to `clickhouse-common-static`; `clickhouse-server-common` to `clickhouse-server`; `clickhouse-common-dbg` to `clickhouse-common-static-dbg`. To install, use `clickhouse-server clickhouse-client`. Packages with the old names are still provided in the repositories for backward compatibility. -### Backward incompatible changes: +#### Backward incompatible changes: * Removed the special interpretation of an IN expression if an array is specified on the left side. Previously, the expression `arr IN (set)` was interpreted as "at least one `arr` element belongs to the `set`". To get the same behavior in the new version, write `arrayExists(x -> x IN (set), arr)`. * Disabled the incorrect use of the socket option `SO_REUSEPORT`, which was incorrectly enabled by default in the Poco library. Note that on Linux there is no longer any reason to simultaneously specify the addresses `::` and `0.0.0.0` for listen – use just `::`, which allows listening for connections over both IPv4 and IPv6 (with the default kernel config settings). You can also revert to the behavior from previous versions by specifying `<listen_reuse_port>1</listen_reuse_port>` in the config. -## ClickHouse release 1.1.54370, 2018-03-16 +### ClickHouse release 1.1.54370, 2018-03-16 -### New features: +#### New features: * Added the `system.macros` table and auto updating of macros when the config file is changed. * Added the `SYSTEM RELOAD CONFIG` query. * Added the `maxIntersections(left_col, right_col)` aggregate function, which returns the maximum number of simultaneously intersecting intervals `[left; right]`. The `maxIntersectionsPosition(left, right)` function returns the beginning of the "maximum" interval. ([Michael Furmur](https://github.com/ClickHouse/ClickHouse/pull/2012)). -### Improvements: +#### Improvements: * When inserting data in a `Replicated` table, fewer requests are made to `ZooKeeper` (and most of the user-level errors have disappeared from the `ZooKeeper` log). * Added the ability to create aliases for data sets. Example: `WITH (1, 2, 3) AS set SELECT number IN set FROM system.numbers LIMIT 10`. -### Bug fixes: +#### Bug fixes: * Fixed the `Illegal PREWHERE` error when reading from `Merge` tables over `Distributed` tables. * Added fixes that allow you to start clickhouse-server in IPv4-only Docker containers. @@ -2716,9 +2752,9 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Restored the behavior for queries like `SELECT * FROM remote('server2', default.table) WHERE col IN (SELECT col2 FROM default.table)` when the right side of the `IN` should use a remote `default.table` instead of a local one. This behavior was broken in version 1.1.54358. * Removed extraneous error-level logging of `Not found column ... in block`. -## Clickhouse Release 1.1.54362, 2018-03-11 +### ClickHouse release 1.1.54362, 2018-03-11 -### New features: +#### New features: * Aggregation without `GROUP BY` for an empty set (such as `SELECT count(*) FROM table WHERE 0`) now returns a result with one row with null values for aggregate functions, in compliance with the SQL standard. To restore the old behavior (return an empty result), set `empty_result_for_aggregation_by_empty_set` to 1. * Added type conversion for `UNION ALL`.
Different alias names are allowed in `SELECT` positions in `UNION ALL`, in compliance with the SQL standard. @@ -2756,7 +2792,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Added the `odbc_default_field_size` option, which allows you to extend the maximum size of the value loaded from an ODBC source (by default, it is 1024). * The `system.processes` table and `SHOW PROCESSLIST` now have the `is_cancelled` and `peak_memory_usage` columns. -### Improvements: +#### Improvements: * Limits and quotas on the result are no longer applied to intermediate data for `INSERT SELECT` queries or for `SELECT` subqueries. * Fewer false triggers of `force_restore_data` when checking the status of `Replicated` tables when the server starts. @@ -2772,7 +2808,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si * `Enum` values can be used in `min`, `max`, `sum` and some other functions. In these cases, it uses the corresponding numeric values. This feature was previously available but was lost in the release 1.1.54337. * Added `max_expanded_ast_elements` to restrict the size of the AST after recursively expanding aliases. -### Bug fixes: +#### Bug fixes: * Fixed cases when unnecessary columns were removed from subqueries in error, or not removed from subqueries containing `UNION ALL`. * Fixed a bug in merges for `ReplacingMergeTree` tables. @@ -2800,19 +2836,19 @@ The expression must be a chain of equalities joined by the AND operator. Each si * Prohibited the use of queries with `UNION ALL` in a `MATERIALIZED VIEW`. * Fixed an error during initialization of the `part_log` system table when the server starts (by default, `part_log` is disabled). -### Backward incompatible changes: +#### Backward incompatible changes: * Removed the `distributed_ddl_allow_replicated_alter` option. This behavior is enabled by default. * Removed the `strict_insert_defaults` setting. If you were using this functionality, write to `clickhouse-feedback@yandex-team.com`. * Removed the `UnsortedMergeTree` engine. -## Clickhouse Release 1.1.54343, 2018-02-05 +### Clickhouse Release 1.1.54343, 2018-02-05 * Added macros support for defining cluster names in distributed DDL queries and constructors of Distributed tables: `CREATE TABLE distr ON CLUSTER '{cluster}' (...) ENGINE = Distributed('{cluster}', 'db', 'table')`. * Now queries like `SELECT ... FROM table WHERE expr IN (subquery)` are processed using the `table` index. * Improved processing of duplicates when inserting to Replicated tables, so they no longer slow down execution of the replication queue. -## Clickhouse Release 1.1.54342, 2018-01-22 +### Clickhouse Release 1.1.54342, 2018-01-22 This release contains bug fixes for the previous release 1.1.54337: @@ -2824,9 +2860,9 @@ This release contains bug fixes for the previous release 1.1.54337: * Buffer tables now work correctly when MATERIALIZED columns are present in the destination table (by zhang2014). * Fixed a bug in implementation of NULL. -## Clickhouse Release 1.1.54337, 2018-01-18 +### Clickhouse Release 1.1.54337, 2018-01-18 -### New features: +#### New features: * Added support for storage of multi-dimensional arrays and tuples (`Tuple` data type) in tables. * Support for table functions for `DESCRIBE` and `INSERT` queries. Added support for subqueries in `DESCRIBE`. Examples: `DESC TABLE remote('host', default.hits)`; `DESC TABLE (SELECT 1)`; `INSERT INTO TABLE FUNCTION remote('host', default.hits)`. 
Support for `INSERT INTO TABLE` in addition to `INSERT INTO`. @@ -2857,7 +2893,7 @@ This release contains bug fixes for the previous release 1.1.54337: * Added the `--silent` option for the `clickhouse-local` tool. It suppresses printing query execution info in stderr. * Added support for reading values of type `Date` from text in a format where the month and/or day of the month is specified using a single digit instead of two digits (Amos Bird). -### Performance optimizations: +#### Performance optimizations: * Improved performance of aggregate functions `min`, `max`, `any`, `anyLast`, `anyHeavy`, `argMin`, `argMax` from string arguments. * Improved performance of the functions `isInfinite`, `isFinite`, `isNaN`, `roundToExp2`. @@ -2866,7 +2902,7 @@ This release contains bug fixes for the previous release 1.1.54337: * Lowered memory usage for `JOIN` in the case when the left and right parts have columns with identical names that are not contained in `USING` . * Improved performance of aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr` by reducing computational stability. The old functions are available under the names `varSampStable`, `varPopStable`, `stddevSampStable`, `stddevPopStable`, `covarSampStable`, `covarPopStable`, `corrStable`. -### Bug fixes: +#### Bug fixes: * Fixed data deduplication after running a `DROP` or `DETACH PARTITION` query. In the previous version, dropping a partition and inserting the same data again was not working because inserted blocks were considered duplicates. * Fixed a bug that could lead to incorrect interpretation of the `WHERE` clause for ` CREATE MATERIALIZED VIEW` queries with `POPULATE` . @@ -2905,7 +2941,7 @@ This release contains bug fixes for the previous release 1.1.54337: * Fixed the ` SYSTEM DROP DNS CACHE` query: the cache was flushed but addresses of cluster nodes were not updated. * Fixed the behavior of ` MATERIALIZED VIEW` after executing ` DETACH TABLE` for the table under the view (Marek Vavruša). -### Build improvements: +#### Build improvements: * The `pbuilder` tool is used for builds. The build process is almost completely independent of the build host environment. * A single build is used for different OS versions. Packages and binaries have been made compatible with a wide range of Linux systems. @@ -2919,7 +2955,7 @@ This release contains bug fixes for the previous release 1.1.54337: * Removed usage of GNU extensions from the code. Enabled the `-Wextra` option. When building with `clang` the default is `libc++` instead of `libstdc++`. * Extracted `clickhouse_parsers` and `clickhouse_common_io` libraries to speed up builds of various tools. -### Backward incompatible changes: +#### Backward incompatible changes: * The format for marks in `Log` type tables that contain `Nullable` columns was changed in a backward incompatible way. If you have these tables, you should convert them to the `TinyLog` type before starting up the new server version. To do this, replace `ENGINE = Log` with `ENGINE = TinyLog` in the corresponding `.sql` file in the `metadata` directory. If your table doesn't have `Nullable` columns or if the type of your table is not `Log`, then you don't need to do anything. * Removed the `experimental_allow_extended_storage_definition_syntax` setting. Now this feature is enabled by default. 
@@ -2930,18 +2966,18 @@ This release contains bug fixes for the previous release 1.1.54337: * In previous server versions there was an undocumented feature: if an aggregate function depends on parameters, you can still specify it without parameters in the AggregateFunction data type. Example: `AggregateFunction(quantiles, UInt64)` instead of `AggregateFunction(quantiles(0.5, 0.9), UInt64)`. This feature was lost. Although it was undocumented, we plan to support it again in future releases. * Enum data types cannot be used in min/max aggregate functions. This ability will be returned in the next release. -### Please note when upgrading: +#### Please note when upgrading: * When doing a rolling update on a cluster, at the point when some of the replicas are running the old version of ClickHouse and some are running the new version, replication is temporarily stopped and the message `unknown parameter 'shard'` appears in the log. Replication will continue after all replicas of the cluster are updated. * If different versions of ClickHouse are running on the cluster servers, it is possible that distributed queries using the following functions will have incorrect results: `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr`. You should update all cluster nodes. -## ClickHouse release 1.1.54327, 2017-12-21 +### ClickHouse release 1.1.54327, 2017-12-21 This release contains bug fixes for the previous release 1.1.54318: * Fixed a bug with a possible race condition in replication that could lead to data loss. This issue affects versions 1.1.54310 and 1.1.54318. If you use one of these versions with Replicated tables, the update is strongly recommended. This issue shows up in the logs as Warning messages like `Part ... from own log doesn't exist.` The issue is relevant even if you don't see these messages in logs. -## ClickHouse release 1.1.54318, 2017-11-30 +### ClickHouse release 1.1.54318, 2017-11-30 This release contains bug fixes for the previous release 1.1.54310: @@ -2951,9 +2987,9 @@ This release contains bug fixes for the previous release 1.1.54310: * Fixed an issue that was causing the replication queue to stop running. * Fixed rotation and archiving of server logs. -## ClickHouse release 1.1.54310, 2017-11-01 +### ClickHouse release 1.1.54310, 2017-11-01 -### New features: +#### New features: * Custom partitioning key for the MergeTree family of table engines (see the sketch after the bug-fix list below). * [Kafka](https://clickhouse.yandex/docs/en/operations/table_engines/kafka/) table engine. @@ -2970,13 +3006,13 @@ This release contains bug fixes for the previous release 1.1.54310: * Added support for the Cap'n Proto input format. * You can now customize compression level when using the zstd algorithm. -### Backward incompatible changes: +#### Backward incompatible changes: * Creation of temporary tables with an engine other than Memory is not allowed. * Explicit creation of tables with the View or MaterializedView engine is not allowed. * During table creation, a new check verifies that the sampling key expression is included in the primary key. -### Bug fixes: +#### Bug fixes: * Fixed hangups when synchronously inserting into a Distributed table. * Fixed nonatomic adding and removing of parts in Replicated tables. @@ -2987,17 +3023,17 @@ This release contains bug fixes for the previous release 1.1.54310: * Fixed hangups when the disk volume containing server logs is full. * Fixed an overflow in the toRelativeWeekNum function for the first week of the Unix epoch.
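A sketch of the custom partitioning key mentioned in the feature list above (hypothetical table; before this release, MergeTree tables were always partitioned by month on a mandatory Date column):

```sql
-- An explicit monthly partitioning expression replaces the implicit Date-based scheme.
CREATE TABLE events
(
    event_date Date,
    user_id UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id);
```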
-### Build improvements: +#### Build improvements: * Several third-party libraries (notably Poco) were updated and converted to git submodules. -## ClickHouse release 1.1.54304, 2017-10-19 +### ClickHouse release 1.1.54304, 2017-10-19 -### New features: +#### New features: * TLS support in the native protocol (to enable, set `tcp_ssl_port` in `config.xml` ). -### Bug fixes: +#### Bug fixes: * `ALTER` for replicated tables now tries to start running as soon as possible. * Fixed crashing when reading data with the setting `preferred_block_size_bytes=0.` @@ -3011,9 +3047,9 @@ This release contains bug fixes for the previous release 1.1.54310: * Users are updated correctly with invalid `users.xml` * Correct handling when an executable dictionary returns a non-zero response code. -## ClickHouse release 1.1.54292, 2017-09-20 +### ClickHouse release 1.1.54292, 2017-09-20 -### New features: +#### New features: * Added the `pointInPolygon` function for working with coordinates on a coordinate plane. * Added the `sumMap` aggregate function for calculating the sum of arrays, similar to `SummingMergeTree`. @@ -3021,7 +3057,7 @@ This release contains bug fixes for the previous release 1.1.54310: * The ClickHouse executable file is now less dependent on the libc version. The same ClickHouse executable file can run on a wide variety of Linux systems. There is still a dependency when using compiled queries (with the setting ` compile = 1` , which is not used by default). * Reduced the time needed for dynamic compilation of queries. -### Bug fixes: +#### Bug fixes: * Fixed an error that sometimes produced ` part ... intersects previous part` messages and weakened replica consistency. * Fixed an error that caused the server to lock up if ZooKeeper was unavailable during shutdown. @@ -3030,9 +3066,9 @@ This release contains bug fixes for the previous release 1.1.54310: * Fixed an error in the concat function that occurred if the first column in a block has the Array type. * Progress is now displayed correctly in the system.merges table. -## ClickHouse release 1.1.54289, 2017-09-13 +### ClickHouse release 1.1.54289, 2017-09-13 -### New features: +#### New features: * `SYSTEM` queries for server administration: `SYSTEM RELOAD DICTIONARY`, `SYSTEM RELOAD DICTIONARIES`, `SYSTEM DROP DNS CACHE`, `SYSTEM SHUTDOWN`, `SYSTEM KILL`. * Added functions for working with arrays: `concat`, `arraySlice`, `arrayPushBack`, `arrayPushFront`, `arrayPopBack`, `arrayPopFront`. @@ -3048,7 +3084,7 @@ This release contains bug fixes for the previous release 1.1.54310: * Option to set `umask` in the config file. * Improved performance for queries with `DISTINCT` . -### Bug fixes: +#### Bug fixes: * Improved the process for deleting old nodes in ZooKeeper. Previously, old nodes sometimes didn't get deleted if there were very frequent inserts, which caused the server to be slow to shut down, among other things. * Fixed randomization when choosing hosts for the connection to ZooKeeper. @@ -3062,21 +3098,21 @@ This release contains bug fixes for the previous release 1.1.54310: * Resolved the appearance of zombie processes when using a dictionary with an `executable` source. * Fixed segfault for the HEAD query. -### Improved workflow for developing and assembling ClickHouse: +#### Improved workflow for developing and assembling ClickHouse: * You can use `pbuilder` to build ClickHouse. * You can use `libc++` instead of `libstdc++` for builds on Linux. 
* Added instructions for using static code analysis tools: `Coverage`, `clang-tidy`, `cppcheck`. -### Please note when upgrading: +#### Please note when upgrading: * There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT queries will fail with the message "Merges are processing significantly slower than inserts." Use the `SELECT * FROM system.merges` query to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting. To do this, go to the `<merge_tree>` section in config.xml, set `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server. -## ClickHouse release 1.1.54284, 2017-08-29 +### ClickHouse release 1.1.54284, 2017-08-29 * This is a bugfix release for the previous 1.1.54282 release. It fixes leaks in the parts directory in ZooKeeper. -## ClickHouse release 1.1.54282, 2017-08-23 +### ClickHouse release 1.1.54282, 2017-08-23 This release contains bug fixes for the previous release 1.1.54276: @@ -3084,9 +3120,9 @@ * Fixed parsing when inserting in RowBinary format if input data starts with ';'. * Fixed errors during runtime compilation of certain aggregate functions (e.g. `groupArray()`). -## Clickhouse Release 1.1.54276, 2017-08-16 +### ClickHouse release 1.1.54276, 2017-08-16 -### New features: +#### New features: * Added an optional WITH section for a SELECT query. Example query: `WITH 1+1 AS a SELECT a, a*a` (a combined sketch of `WITH` and `groupArray(max_size)` follows this release's backward incompatible changes below). * INSERT can be performed synchronously in a Distributed table: OK is returned only after all the data is saved on all the shards. This is activated by the setting `insert_distributed_sync=1`. @@ -3097,7 +3133,7 @@ This release contains bug fixes for the previous release 1.1.54276: * Added support for non-constant arguments and negative offsets in the function `substring(str, pos, len)`. * Added the `max_size` parameter for the `groupArray(max_size)(column)` aggregate function, and optimized its performance. -### Main changes: +#### Main changes: * Security improvements: all server files are created with 0640 permissions (can be changed via the `<umask>` config parameter). * Improved error messages for queries with invalid syntax. @@ -3105,11 +3141,11 @@ This release contains bug fixes for the previous release 1.1.54276: * Significantly increased the performance of data merges for the ReplacingMergeTree engine. * Improved performance for asynchronous inserts from a Distributed table by combining multiple source inserts. To enable this functionality, use the setting `distributed_directory_monitor_batch_inserts=1`. -### Backward incompatible changes: +#### Backward incompatible changes: * Changed the binary format of aggregate states of `groupArray(array_column)` functions for arrays.
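The combined example promised above — the optional `WITH` section and the capped `groupArray(max_size)(column)` form from 1.1.54276, in one hypothetical query:

```sql
-- WITH binds a scalar alias; groupArray(3) keeps at most 3 elements per group.
WITH 10 AS factor
SELECT groupArray(3)(number * factor)
FROM (SELECT number FROM system.numbers LIMIT 5);
```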
-### Complete list of changes: +#### Complete list of changes: * Added the `output_format_json_quote_denormals` setting, which enables outputting nan and inf values in JSON format. * Optimized stream allocation when reading from a Distributed table. @@ -3128,7 +3164,7 @@ This release contains bug fixes for the previous release 1.1.54276: * It is possible to connect to MySQL through a socket in the file system. * The `system.parts` table has a new column with information about the size of marks, in bytes. -### Bug fixes: +#### Bug fixes: * Distributed tables using a Merge table now work correctly for a SELECT query with a condition on the `_table` field. * Fixed a rare race condition in ReplicatedMergeTree when checking data parts. @@ -3152,15 +3188,15 @@ This release contains bug fixes for the previous release 1.1.54276: * Fixed the "Cannot mremap" error when using arrays in IN and JOIN clauses with more than 2 billion elements. * Fixed the failover for dictionaries with MySQL as the source. -### Improved workflow for developing and assembling ClickHouse: +#### Improved workflow for developing and assembling ClickHouse: * Builds can be assembled in Arcadia. * You can use gcc 7 to compile ClickHouse. * Parallel builds using ccache+distcc are faster now. -## ClickHouse release 1.1.54245, 2017-07-04 +### ClickHouse release 1.1.54245, 2017-07-04 -### New features: +#### New features: * Distributed DDL (for example, `CREATE TABLE ON CLUSTER`). * The replicated query `ALTER TABLE CLEAR COLUMN IN PARTITION`. @@ -3172,16 +3208,16 @@ This release contains bug fixes for the previous release 1.1.54276: * Sessions in the HTTP interface. * The OPTIMIZE query for a Replicated table can run not only on the leader. -### Backward incompatible changes: +#### Backward incompatible changes: * Removed SET GLOBAL. -### Minor changes: +#### Minor changes: * Now after an alert is triggered, the log prints the full stack trace. * Relaxed the verification of the number of damaged/extra data parts at startup (there were too many false positives). -### Bug fixes: +#### Bug fixes: * Fixed a bad connection "sticking" when inserting into a Distributed table. * GLOBAL IN now works for a query from a Merge table that looks at a Distributed table. diff --git a/CMakeLists.txt b/CMakeLists.txt index 623b6ac9966..7c8ccb6e17c 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -95,6 +95,8 @@ if (CMAKE_GENERATOR STREQUAL "Ninja") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fdiagnostics-color=always") endif () +include (cmake/add_warning.cmake) + if (NOT MSVC) set (COMMON_WARNING_FLAGS "${COMMON_WARNING_FLAGS} -Wall") # -Werror is also added inside directories with our own code. endif () @@ -224,8 +226,8 @@ else () set(NOT_UNBUNDLED 1) endif () -# Using system libs can cause lot of warnings in includes (on macro expansion). -if (UNBUNDLED OR NOT (OS_LINUX OR APPLE) OR ARCH_32) +# Using system libs can cause a lot of warnings in includes (on macro expansion). +if (UNBUNDLED OR NOT (OS_LINUX OR OS_DARWIN) OR ARCH_32) option (NO_WERROR "Disable -Werror compiler option" ON) endif () diff --git a/README.md b/README.md index a545c91886f..21498a22912 100644 --- a/README.md +++ b/README.md @@ -11,3 +11,7 @@ ClickHouse is an open-source column-oriented database management system that all * [Blog](https://clickhouse.yandex/blog/en/) contains various ClickHouse-related articles, as well as announcements and reports about events. * [Contacts](https://clickhouse.yandex/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://forms.yandex.com/surveys/meet-yandex-clickhouse-team/) to meet Yandex ClickHouse team in person. + +## Upcoming Events + +* [ClickHouse Meetup in San Francisco](https://www.eventbrite.com/e/clickhouse-february-meetup-registration-88496227599) on February 5. diff --git a/cmake/add_warning.cmake b/cmake/add_warning.cmake new file mode 100644 index 00000000000..a7a5f0f035e --- /dev/null +++ b/cmake/add_warning.cmake @@ -0,0 +1,18 @@ +include (CheckCXXSourceCompiles) + +# Try to add -Wflag if compiler supports it +macro (add_warning flag) + string (REPLACE "-" "_" underscored_flag ${flag}) + string (REPLACE "+" "x" underscored_flag ${underscored_flag}) + check_cxx_compiler_flag("-W${flag}" SUPPORTS_FLAG_${underscored_flag}) + if (SUPPORTS_FLAG_${underscored_flag}) + set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -W${flag}") + else () + message (WARNING "Flag -W${flag} is unsupported") + endif () +endmacro () + +# Try to add -Wno flag if compiler supports it +macro (no_warning flag) + add_warning(no-${flag}) +endmacro () diff --git a/cmake/darwin/default_libs.cmake b/cmake/darwin/default_libs.cmake index 6010ea0f5de..7b57e63f4ee 100644 --- a/cmake/darwin/default_libs.cmake +++ b/cmake/darwin/default_libs.cmake @@ -13,12 +13,12 @@ set(CMAKE_C_STANDARD_LIBRARIES ${DEFAULT_LIBS}) # Minimal supported SDK version -set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mmacosx-version-min=10.14") -set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmacosx-version-min=10.14") -set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} -mmacosx-version-min=10.14") +set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mmacosx-version-min=10.15") +set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmacosx-version-min=10.15") +set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} -mmacosx-version-min=10.15") -set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -mmacosx-version-min=10.14") -set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -mmacosx-version-min=10.14") +set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -mmacosx-version-min=10.15") +set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -mmacosx-version-min=10.15") # Global libraries diff --git a/contrib/googletest b/contrib/googletest index d175c8bf823..703bd9caab5 160000 --- a/contrib/googletest +++ b/contrib/googletest @@ -1 +1 @@ -Subproject commit d175c8bf823e709d570772b038757fadf63bc632 +Subproject commit 703bd9caab50b139428cea1aaff9974ebee5742e diff --git a/contrib/libcxx-cmake/CMakeLists.txt b/contrib/libcxx-cmake/CMakeLists.txt index 3d7447b7bf0..a56efae9bf2 100644 --- a/contrib/libcxx-cmake/CMakeLists.txt +++ b/contrib/libcxx-cmake/CMakeLists.txt @@ -54,13 +54,12 @@ endif () target_compile_options(cxx PUBLIC $<$:-nostdinc++>) -check_cxx_compiler_flag(-Wreserved-id-macro HAVE_WARNING_RESERVED_ID_MACRO) -if (HAVE_WARNING_RESERVED_ID_MACRO) + +if (SUPPORTS_FLAG_no_reserved_id_macro) target_compile_options(cxx PUBLIC -Wno-reserved-id-macro) endif () -check_cxx_compiler_flag(-Wctad-maybe-unsupported HAVE_WARNING_CTAD_MAYBE_UNSUPPORTED) -if (HAVE_WARNING_CTAD_MAYBE_UNSUPPORTED) +if (SUPPORTS_FLAG_no_ctad_maybe_unsupported) target_compile_options(cxx PUBLIC -Wno-ctad-maybe-unsupported) endif () diff --git a/dbms/CMakeLists.txt b/dbms/CMakeLists.txt index e0c8b7da37a..5194c292f3b 100644 --- a/dbms/CMakeLists.txt +++ b/dbms/CMakeLists.txt @@ -45,36 +45,75 @@ endif () option (WEVERYTHING "Enables -Weverything option with some exceptions. This is intended for exploration of new compiler warnings that may be found to be useful. Only makes sense for clang." 
ON) -if (CMAKE_CXX_COMPILER_ID STREQUAL "Clang") - set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wpedantic -Wno-vla-extension -Wno-zero-length-array -Wno-gnu-anonymous-struct -Wno-nested-anon-types") +if (COMPILER_CLANG) + add_warning(pedantic) + no_warning(gnu-anonymous-struct) + no_warning(nested-anon-types) + no_warning(vla-extension) + no_warning(zero-length-array) - set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wshadow -Wshadow-uncaptured-local -Wextra-semi -Wcomma -Winconsistent-missing-destructor-override -Wunused-exception-parameter -Wcovered-switch-default -Wold-style-cast -Wrange-loop-analysis -Wunused-member-function -Wunreachable-code -Wunreachable-code-return -Wnewline-eof -Wembedded-directive -Wgnu-case-range -Wunused-macros -Wconditional-uninitialized -Wdeprecated -Wundef -Wreserved-id-macro -Wredundant-parens -Wzero-as-null-pointer-constant") + add_warning(comma) + add_warning(conditional-uninitialized) + add_warning(covered-switch-default) + add_warning(deprecated) + add_warning(embedded-directive) + add_warning(empty-init-stmt) # linux-only + add_warning(extra-semi-stmt) # linux-only + add_warning(extra-semi) + add_warning(gnu-case-range) + add_warning(inconsistent-missing-destructor-override) + add_warning(newline-eof) + add_warning(old-style-cast) + add_warning(range-loop-analysis) + add_warning(redundant-parens) + add_warning(reserved-id-macro) + add_warning(shadow-field) # clang 8+ + add_warning(shadow-uncaptured-local) + add_warning(shadow) + add_warning(string-plus-int) # clang 8+ + add_warning(undef) + add_warning(unreachable-code-return) + add_warning(unreachable-code) + add_warning(unused-exception-parameter) + add_warning(unused-macros) + add_warning(unused-member-function) + add_warning(zero-as-null-pointer-constant) if (WEVERYTHING) - set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-padded -Wno-switch-enum -Wno-deprecated-dynamic-exception-spec -Wno-float-equal -Wno-weak-vtables -Wno-shift-sign-overflow -Wno-sign-conversion -Wno-conversion -Wno-exit-time-destructors -Wno-undefined-func-template -Wno-documentation-unknown-command -Wno-missing-variable-declarations -Wno-unused-template -Wno-global-constructors -Wno-c99-extensions -Wno-missing-prototypes -Wno-weak-template-vtables -Wno-zero-length-array -Wno-gnu-anonymous-struct -Wno-nested-anon-types -Wno-double-promotion -Wno-disabled-macro-expansion -Wno-vla-extension -Wno-vla -Wno-packed") + add_warning(everything) + no_warning(c++98-compat-pedantic) + no_warning(c++98-compat) + no_warning(c99-extensions) + no_warning(conversion) + no_warning(ctad-maybe-unsupported) # clang 9+, linux-only + no_warning(deprecated-dynamic-exception-spec) + no_warning(disabled-macro-expansion) + no_warning(documentation-unknown-command) + no_warning(double-promotion) + no_warning(exit-time-destructors) + no_warning(float-equal) + no_warning(global-constructors) + no_warning(gnu-anonymous-struct) + no_warning(missing-prototypes) + no_warning(missing-variable-declarations) + no_warning(nested-anon-types) + no_warning(packed) + no_warning(padded) + no_warning(return-std-move-in-c++11) # clang 7+ + no_warning(shift-sign-overflow) + no_warning(sign-conversion) + no_warning(switch-enum) + no_warning(undefined-func-template) + no_warning(unused-template) + no_warning(vla-extension) + no_warning(vla) + no_warning(weak-template-vtables) + no_warning(weak-vtables) + no_warning(zero-length-array) # TODO Enable conversion, sign-conversion, double-promotion warnings. 
endif () - - if (NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 7) - if (WEVERYTHING) - set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-return-std-move-in-c++11") - endif () - endif () - - if (NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 8) - set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wshadow-field -Wstring-plus-int") - if(NOT APPLE) - set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wextra-semi-stmt -Wempty-init-stmt") - endif() - endif () - - if (NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 9) - if (WEVERYTHING AND NOT APPLE) - set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-ctad-maybe-unsupported") - endif () - endif () -elseif (CMAKE_CXX_COMPILER_ID STREQUAL "GNU") +elseif (COMPILER_GCC) # Add compiler options only to c++ compiler function(add_cxx_compile_options option) add_compile_options("$<$,CXX>:${option}>") @@ -156,7 +195,7 @@ if (USE_DEBUG_HELPERS) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${INCLUDE_DEBUG_HELPERS}") endif () -if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU") +if (COMPILER_GCC) # If we leave this optimization enabled, gcc-7 replaces a pair of SSE intrinsics (16 byte load, store) with a call to memcpy. # It leads to slow code. This is compiler bug. It looks like this: # diff --git a/dbms/programs/benchmark/Benchmark.cpp b/dbms/programs/benchmark/Benchmark.cpp index 6f08475f934..1c98ecd1333 100644 --- a/dbms/programs/benchmark/Benchmark.cpp +++ b/dbms/programs/benchmark/Benchmark.cpp @@ -254,7 +254,7 @@ private: if (interrupt_listener.check()) { - std::cout << "Stopping launch of queries. SIGINT recieved.\n"; + std::cout << "Stopping launch of queries. SIGINT received.\n"; return false; } diff --git a/dbms/programs/client/Client.cpp b/dbms/programs/client/Client.cpp index a38906c9620..1c4902c48f6 100644 --- a/dbms/programs/client/Client.cpp +++ b/dbms/programs/client/Client.cpp @@ -98,7 +98,7 @@ namespace ErrorCodes extern const int UNKNOWN_PACKET_FROM_SERVER; extern const int UNEXPECTED_PACKET_FROM_SERVER; extern const int CLIENT_OUTPUT_FORMAT_SPECIFIED; - extern const int LOGICAL_ERROR; + extern const int CANNOT_SET_SIGNAL_HANDLER; extern const int CANNOT_READLINE; extern const int SYSTEM_ERROR; extern const int INVALID_USAGE_OF_INPUT; diff --git a/dbms/programs/compressor/Compressor.cpp b/dbms/programs/compressor/Compressor.cpp index 9c4699b610a..98a3055da28 100644 --- a/dbms/programs/compressor/Compressor.cpp +++ b/dbms/programs/compressor/Compressor.cpp @@ -70,7 +70,7 @@ int mainEntryClickHouseCompressor(int argc, char ** argv) ("hc", "use LZ4HC instead of LZ4") ("zstd", "use ZSTD instead of LZ4") ("codec", boost::program_options::value>()->multitoken(), "use codecs combination instead of LZ4") - ("level", boost::program_options::value(), "compression level for codecs spicified via flags") + ("level", boost::program_options::value(), "compression level for codecs specified via flags") ("none", "use no compression instead of LZ4") ("stat", "print block statistics of compressed data") ; diff --git a/dbms/programs/copier/ClusterCopier.cpp b/dbms/programs/copier/ClusterCopier.cpp index a095e99fe22..2c6b16a7ae4 100644 --- a/dbms/programs/copier/ClusterCopier.cpp +++ b/dbms/programs/copier/ClusterCopier.cpp @@ -2430,7 +2430,7 @@ void ClusterCopierApp::defineOptions(Poco::Util::OptionSet & options) .argument("copy-fault-probability").binding("copy-fault-probability")); options.addOption(Poco::Util::Option("log-level", "", "sets log level") .argument("log-level").binding("log-level")); - options.addOption(Poco::Util::Option("base-dir", "", "base directory for copiers, consequitive copier 
launches will populate /base-dir/launch_id/* directories") + options.addOption(Poco::Util::Option("base-dir", "", "base directory for copiers, consecutive copier launches will populate /base-dir/launch_id/* directories") .argument("base-dir").binding("base-dir")); using Me = std::decay_t; diff --git a/dbms/programs/local/LocalServer.cpp b/dbms/programs/local/LocalServer.cpp index ca45217cb97..cac561117b4 100644 --- a/dbms/programs/local/LocalServer.cpp +++ b/dbms/programs/local/LocalServer.cpp @@ -164,7 +164,7 @@ try setupUsers(); /// Limit on total number of concurrently executing queries. - /// Threre are no need for concurrent threads, override max_concurrent_queries. + /// There is no need for concurrent threads, override max_concurrent_queries. context->getProcessList().setMaxSize(0); /// Size of cache for uncompressed blocks. Zero means disabled. @@ -182,7 +182,7 @@ try context->setDefaultProfiles(config()); /** Init dummy default DB - * NOTE: We force using isolated default database to avoid conflicts with default database from server enviroment + * NOTE: We force using isolated default database to avoid conflicts with default database from server environment * Otherwise, metadata of temporary File(format, EXPLICIT_PATH) tables will pollute metadata/ directory; * if such tables will not be dropped, clickhouse-server will not be able to load them due to security reasons. */ diff --git a/dbms/programs/obfuscator/Obfuscator.cpp b/dbms/programs/obfuscator/Obfuscator.cpp index f267acc1f01..49c6fabd435 100644 --- a/dbms/programs/obfuscator/Obfuscator.cpp +++ b/dbms/programs/obfuscator/Obfuscator.cpp @@ -40,7 +40,7 @@ #include -static const char * documantation = R"( +static const char * documentation = R"( Simple tool for table data obfuscation. It reads input table and produces output table, that retain some properties of input, but contains different data. 
@@ -979,7 +979,7 @@ try || !options.count("input-format") || !options.count("output-format")) { - std::cout << documantation << "\n" + std::cout << documentation << "\n" << "\nUsage: " << argv[0] << " [options] < in > out\n" << "\nInput must be seekable file (it will be read twice).\n" << "\n" << description << "\n" diff --git a/dbms/programs/odbc-bridge/MainHandler.cpp b/dbms/programs/odbc-bridge/MainHandler.cpp index 074aaedd7ce..3ae5f49f24b 100644 --- a/dbms/programs/odbc-bridge/MainHandler.cpp +++ b/dbms/programs/odbc-bridge/MainHandler.cpp @@ -138,7 +138,7 @@ void ODBCHandler::handleRequest(Poco::Net::HTTPServerRequest & request, Poco::Ne { auto message = getCurrentExceptionMessage(true); response.setStatusAndReason( - Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); // can't call process_error, bacause of too soon response sending + Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); // can't call process_error, because of too soon response sending writeStringBinary(message, out); tryLogCurrentException(log); } diff --git a/dbms/programs/odbc-bridge/ODBCBridge.cpp b/dbms/programs/odbc-bridge/ODBCBridge.cpp index a99e9fcf2c6..565ee5602ca 100644 --- a/dbms/programs/odbc-bridge/ODBCBridge.cpp +++ b/dbms/programs/odbc-bridge/ODBCBridge.cpp @@ -88,7 +88,7 @@ void ODBCBridge::defineOptions(Poco::Util::OptionSet & options) options.addOption( Poco::Util::Option("listen-host", "", "hostname to listen, default localhost").argument("listen-host").binding("listen-host")); options.addOption( - Poco::Util::Option("http-timeout", "", "http timout for socket, default 1800").argument("http-timeout").binding("http-timeout")); + Poco::Util::Option("http-timeout", "", "http timeout for socket, default 1800").argument("http-timeout").binding("http-timeout")); options.addOption(Poco::Util::Option("max-server-connections", "", "max connections to server, default 1024") .argument("max-server-connections") diff --git a/dbms/programs/performance-test/PerformanceTest.cpp b/dbms/programs/performance-test/PerformanceTest.cpp index e1550780b15..e2c5c0d8741 100644 --- a/dbms/programs/performance-test/PerformanceTest.cpp +++ b/dbms/programs/performance-test/PerformanceTest.cpp @@ -315,7 +315,7 @@ void PerformanceTest::runQueries( stop_conditions.reportIterations(iteration); if (stop_conditions.areFulfilled()) { - LOG_INFO(log, "Stop conditions fullfilled"); + LOG_INFO(log, "Stop conditions fulfilled"); break; } diff --git a/dbms/programs/performance-test/PerformanceTestSuite.cpp b/dbms/programs/performance-test/PerformanceTestSuite.cpp index 66ef8eb51c0..fbf43bfa29f 100644 --- a/dbms/programs/performance-test/PerformanceTestSuite.cpp +++ b/dbms/programs/performance-test/PerformanceTestSuite.cpp @@ -200,7 +200,7 @@ private: if (current.checkPreconditions()) { - LOG_INFO(log, "Preconditions for test '" << info.test_name << "' are fullfilled"); + LOG_INFO(log, "Preconditions for test '" << info.test_name << "' are fulfilled"); LOG_INFO( log, "Preparing for run, have " << info.create_and_fill_queries.size() << " create and fill queries"); @@ -219,7 +219,7 @@ private: return {report_builder->buildFullReport(info, result, query_indexes[info.path]), current.checkSIGINT()}; } else - LOG_INFO(log, "Preconditions for test '" << info.test_name << "' are not fullfilled, skip run"); + LOG_INFO(log, "Preconditions for test '" << info.test_name << "' are not fulfilled, skip run"); return {"", current.checkSIGINT()}; } @@ -361,8 +361,8 @@ try po::notify(options); Poco::AutoPtr formatter(new Poco::PatternFormatter("%Y.%m.%d 
%H:%M:%S.%F <%p> %s: %t")); - Poco::AutoPtr console_chanel(new Poco::ConsoleChannel); - Poco::AutoPtr channel(new Poco::FormattingChannel(formatter, console_chanel)); + Poco::AutoPtr console_channel(new Poco::ConsoleChannel); + Poco::AutoPtr channel(new Poco::FormattingChannel(formatter, console_channel)); Poco::Logger::root().setLevel(options["log-level"].as()); Poco::Logger::root().setChannel(channel); diff --git a/dbms/programs/performance-test/ReportBuilder.cpp b/dbms/programs/performance-test/ReportBuilder.cpp index f95aa025095..c95b4d56a1e 100644 --- a/dbms/programs/performance-test/ReportBuilder.cpp +++ b/dbms/programs/performance-test/ReportBuilder.cpp @@ -117,7 +117,7 @@ std::string ReportBuilder::buildFullReport( if (isASCIIString(statistics.exception)) runJSON.set("exception", jsonString(statistics.exception, settings), false); else - runJSON.set("exception", "Some exception occured with non ASCII message. This may produce invalid JSON. Try reproduce locally."); + runJSON.set("exception", "Some exception occurred with non ASCII message. This may produce invalid JSON. Try reproduce locally."); } if (test_info.exec_type == ExecutionType::Loop) diff --git a/dbms/programs/server/InterserverIOHTTPHandler.cpp b/dbms/programs/server/InterserverIOHTTPHandler.cpp index 27e4c7041c4..5302302bb5b 100644 --- a/dbms/programs/server/InterserverIOHTTPHandler.cpp +++ b/dbms/programs/server/InterserverIOHTTPHandler.cpp @@ -28,23 +28,23 @@ std::pair InterserverIOHTTPHandler::checkAuthentication(Poco::Net: if (config.has("interserver_http_credentials.user")) { if (!request.hasCredentials()) - return {"Server requires HTTP Basic authentification, but client doesn't provide it", false}; + return {"Server requires HTTP Basic authentication, but client doesn't provide it", false}; String scheme, info; request.getCredentials(scheme, info); if (scheme != "Basic") - return {"Server requires HTTP Basic authentification but client provides another method", false}; + return {"Server requires HTTP Basic authentication but client provides another method", false}; String user = config.getString("interserver_http_credentials.user"); String password = config.getString("interserver_http_credentials.password", ""); Poco::Net::HTTPBasicCredentials credentials(info); if (std::make_pair(user, password) != std::make_pair(credentials.getUsername(), credentials.getPassword())) - return {"Incorrect user or password in HTTP Basic authentification", false}; + return {"Incorrect user or password in HTTP Basic authentication", false}; } else if (request.hasCredentials()) { - return {"Client requires HTTP Basic authentification, but server doesn't provide it", false}; + return {"Client requires HTTP Basic authentication, but server doesn't provide it", false}; } return {"", true}; } @@ -99,7 +99,7 @@ void InterserverIOHTTPHandler::handleRequest(Poco::Net::HTTPServerRequest & requ response.setStatusAndReason(Poco::Net::HTTPServerResponse::HTTP_UNAUTHORIZED); if (!response.sent()) writeString(message, *used_output.out); - LOG_WARNING(log, "Query processing failed request: '" << request.getURI() << "' authentification failed"); + LOG_WARNING(log, "Query processing failed request: '" << request.getURI() << "' authentication failed"); } } catch (Exception & e) diff --git a/dbms/programs/server/PrometheusRequestHandler.h b/dbms/programs/server/PrometheusRequestHandler.h index 439a01c7d6f..d3d1dee88b1 100644 --- a/dbms/programs/server/PrometheusRequestHandler.h +++ b/dbms/programs/server/PrometheusRequestHandler.h @@ -31,7 +31,7 @@ 
public: template -class PrometeusRequestHandlerFactory : public Poco::Net::HTTPRequestHandlerFactory +class PrometheusRequestHandlerFactory : public Poco::Net::HTTPRequestHandlerFactory { private: IServer & server; @@ -39,7 +39,7 @@ private: PrometheusMetricsWriter metrics_writer; public: - PrometeusRequestHandlerFactory(IServer & server_, const AsynchronousMetrics & async_metrics_) + PrometheusRequestHandlerFactory(IServer & server_, const AsynchronousMetrics & async_metrics_) : server(server_) , endpoint_path(server_.config().getString("prometheus.endpoint", "/metrics")) , metrics_writer(server_.config(), "prometheus", async_metrics_) @@ -56,6 +56,6 @@ public: } }; -using PrometeusHandlerFactory = PrometeusRequestHandlerFactory; +using PrometheusHandlerFactory = PrometheusRequestHandlerFactory; } diff --git a/dbms/programs/server/Server.cpp b/dbms/programs/server/Server.cpp index bb08abf2161..b2f335df2a0 100644 --- a/dbms/programs/server/Server.cpp +++ b/dbms/programs/server/Server.cpp @@ -554,8 +554,8 @@ int Server::main(const std::vector & /*args*/) /// /// It also cannot work with sanitizers. /// Sanitizers are using quick "frame walking" stack unwinding (this implies -fno-omit-frame-pointer) - /// And they do unwiding frequently (on every malloc/free, thread/mutex operations, etc). - /// They change %rbp during unwinding and it confuses libunwind if signal comes during sanitizer unwiding + /// And they do unwinding frequently (on every malloc/free, thread/mutex operations, etc). + /// They change %rbp during unwinding and it confuses libunwind if signal comes during sanitizer unwinding /// and query profiler decide to unwind stack with libunwind at this moment. /// /// Symptoms: you'll get silent Segmentation Fault - without sanitizer message and without usual ClickHouse diagnostics. @@ -724,7 +724,7 @@ int Server::main(const std::vector & /*args*/) socket.setSendTimeout(settings.http_send_timeout); auto handler_factory = createDefaultHandlerFatory(*this, "HTTPHandler-factory"); if (config().has("prometheus") && config().getInt("prometheus.port", 0) == 0) - handler_factory->addHandler(async_metrics); + handler_factory->addHandler(async_metrics); servers.emplace_back(std::make_unique( handler_factory, @@ -854,7 +854,7 @@ int Server::main(const std::vector & /*args*/) socket.setReceiveTimeout(settings.http_receive_timeout); socket.setSendTimeout(settings.http_send_timeout); auto handler_factory = new HTTPRequestHandlerFactoryMain(*this, "PrometheusHandler-factory"); - handler_factory->addHandler(async_metrics); + handler_factory->addHandler(async_metrics); servers.emplace_back(std::make_unique( handler_factory, server_pool, diff --git a/dbms/programs/server/config.d/part_log.xml b/dbms/programs/server/config.d/part_log.xml new file mode 100644 index 00000000000..35add3c6cc1 --- /dev/null +++ b/dbms/programs/server/config.d/part_log.xml @@ -0,0 +1,7 @@
+<yandex>
+    <part_log>
+        <database>system</database>
+        <table>part_log</table>
+        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
+    </part_log>
+</yandex>
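With the `part_log` configuration added above, part-level events (inserts of new parts, merges, removals) are recorded in the `system.part_log` table. A quick way to inspect recent activity (a sketch; the selected columns are the commonly available ones):

```sql
-- Most recent part operations recorded by part_log.
SELECT event_type, database, table, part_name, rows
FROM system.part_log
ORDER BY event_time DESC
LIMIT 10;
```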
diff --git a/dbms/programs/server/users.xml b/dbms/programs/server/users.xml index 0058ee51184..87e6c406b0a 100644 --- a/dbms/programs/server/users.xml +++ b/dbms/programs/server/users.xml @@ -83,30 +83,7 @@ default - - - - - - diff --git a/dbms/src/Access/RowPolicyContextFactory.cpp b/dbms/src/Access/RowPolicyContextFactory.cpp index e458f06ca94..77e5056e206 100644 --- a/dbms/src/Access/RowPolicyContextFactory.cpp +++ b/dbms/src/Access/RowPolicyContextFactory.cpp @@ -101,9 +101,6 @@ namespace public: void add(const ASTPtr & condition, bool is_restrictive) { - if (!condition) - return; - if (is_restrictive) restrictions.push_back(condition); else @@ -139,29 +136,32 @@ void RowPolicyContextFactory::PolicyInfo::setPolicy(const RowPolicyPtr & policy_ for (auto index : ext::range_with_static_cast(0, MAX_CONDITION_INDEX)) { + parsed_conditions[index] = nullptr; const String & condition = policy->conditions[index]; + if (condition.empty()) + continue; + auto previous_range = std::pair(std::begin(policy->conditions), std::begin(policy->conditions) + index); auto previous_it = std::find(previous_range.first, previous_range.second, condition); if (previous_it != previous_range.second) { /// The condition is already parsed before. parsed_conditions[index] = parsed_conditions[previous_it - previous_range.first]; + continue; } - else + + /// Try to parse the condition. + try { - /// Try to parse the condition. - try - { - ParserExpression parser; - parsed_conditions[index] = parseQuery(parser, condition, 0); - } - catch (...) - { - tryLogCurrentException( - &Poco::Logger::get("RowPolicy"), - String("Could not parse the condition ") + RowPolicy::conditionIndexToString(index) + " of row policy " - + backQuote(policy->getFullName())); - } + ParserExpression parser; + parsed_conditions[index] = parseQuery(parser, condition, 0); + } + catch (...) + { + tryLogCurrentException( + &Poco::Logger::get("RowPolicy"), + String("Could not parse the condition ") + RowPolicy::conditionIndexToString(index) + " of row policy " + + backQuote(policy->getFullName())); } } } @@ -290,7 +290,8 @@ void RowPolicyContextFactory::mixConditionsForContext(RowPolicyContext & context auto & mixers = map_of_mixers[std::pair{policy.getDatabase(), policy.getTableName()}]; mixers.policy_ids.push_back(policy_id); for (auto index : ext::range(0, MAX_CONDITION_INDEX)) - mixers.mixers[index].add(info.parsed_conditions[index], policy.isRestrictive()); + if (info.parsed_conditions[index]) + mixers.mixers[index].add(info.parsed_conditions[index], policy.isRestrictive()); } } diff --git a/dbms/src/Access/UsersConfigAccessStorage.cpp b/dbms/src/Access/UsersConfigAccessStorage.cpp index c9671afaca1..033e8f557b7 100644 --- a/dbms/src/Access/UsersConfigAccessStorage.cpp +++ b/dbms/src/Access/UsersConfigAccessStorage.cpp @@ -135,13 +135,25 @@ namespace for (const String & database : databases) { const String database_config = databases_config + "." + database; - Poco::Util::AbstractConfiguration::Keys table_names; - config.keys(database_config, table_names); + Poco::Util::AbstractConfiguration::Keys keys_in_database_config; + config.keys(database_config, keys_in_database_config); /// Read table properties - for (const String & table_name : table_names) + for (const String & key_in_database_config : keys_in_database_config) { - const auto filter_config = database_config + "." + table_name + ".filter"; + String table_name = key_in_database_config; + String filter_config = database_config + "." 
+ table_name + ".filter"; + + if (key_in_database_config.starts_with("table[")) + { + const auto table_name_config = database_config + "." + table_name + "[@name]"; + if (config.has(table_name_config)) + { + table_name = config.getString(table_name_config); + filter_config = database_config + ".table[@name='" + table_name + "']"; + } + } + if (config.has(filter_config)) { try diff --git a/dbms/src/AggregateFunctions/AggregateFunctionForEach.h b/dbms/src/AggregateFunctions/AggregateFunctionForEach.h index 8f47a2de018..1b32eaeac46 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionForEach.h +++ b/dbms/src/AggregateFunctions/AggregateFunctionForEach.h @@ -54,7 +54,7 @@ private: { AggregateFunctionForEachData & state = data(place); - /// Ensure we have aggreate states for new_size elements, allocate + /// Ensure we have aggregate states for new_size elements, allocate /// from arena if needed. When reallocating, we can't copy the /// states to new buffer with memcpy, because they may contain pointers /// to themselves. In particular, this happens when a state contains diff --git a/dbms/src/Client/Connection.cpp b/dbms/src/Client/Connection.cpp index 75bac5c0cb2..b3f1e341f80 100644 --- a/dbms/src/Client/Connection.cpp +++ b/dbms/src/Client/Connection.cpp @@ -774,9 +774,7 @@ std::unique_ptr Connection::receiveException() { //LOG_TRACE(log_wrapper.get(), "Receiving exception"); - Exception e; - readException(e, *in, "Received from " + getDescription()); - return std::unique_ptr{ e.clone() }; + return std::make_unique(readException(*in, "Received from " + getDescription())); } diff --git a/dbms/src/Common/COW.h b/dbms/src/Common/COW.h index d8152af8356..b3d23a459ea 100644 --- a/dbms/src/Common/COW.h +++ b/dbms/src/Common/COW.h @@ -15,7 +15,7 @@ private: friend class COW; - /// Leave all constructors in private section. They will be avaliable through 'create' method. + /// Leave all constructors in private section. They will be available through 'create' method. Column(); /// Provide 'clone' method. It can be virtual if you want polymorphic behaviour. diff --git a/dbms/src/Common/Exception.cpp b/dbms/src/Common/Exception.cpp index a7a3d58c04f..25da9674e4d 100644 --- a/dbms/src/Common/Exception.cpp +++ b/dbms/src/Common/Exception.cpp @@ -23,6 +23,7 @@ namespace ErrorCodes extern const int UNKNOWN_EXCEPTION; extern const int CANNOT_TRUNCATE_FILE; extern const int NOT_IMPLEMENTED; + extern const int LOGICAL_ERROR; } @@ -33,6 +34,8 @@ Exception::Exception() Exception::Exception(const std::string & msg, int code) : Poco::Exception(msg, code) { + // In debug builds, treat LOGICAL_ERROR as an assertion failure. 
+ assert(code != ErrorCodes::LOGICAL_ERROR); } Exception::Exception(CreateFromPocoTag, const Poco::Exception & exc) diff --git a/dbms/src/Common/ZooKeeper/tests/gtest_zkutil_test_multi_exception.cpp b/dbms/src/Common/ZooKeeper/tests/gtest_zkutil_test_multi_exception.cpp index 2f332c7039d..37240175c04 100644 --- a/dbms/src/Common/ZooKeeper/tests/gtest_zkutil_test_multi_exception.cpp +++ b/dbms/src/Common/ZooKeeper/tests/gtest_zkutil_test_multi_exception.cpp @@ -85,7 +85,7 @@ TEST(zkutil, multi_async) ops.clear(); auto res = fut.get(); - ASSERT_TRUE(res.error == Coordination::ZOK); + ASSERT_EQ(res.error, Coordination::ZOK); ASSERT_EQ(res.responses.size(), 2); } @@ -121,7 +121,7 @@ TEST(zkutil, multi_async) ops.clear(); auto res = fut.get(); - ASSERT_TRUE(res.error == Coordination::ZNODEEXISTS); + ASSERT_EQ(res.error, Coordination::ZNODEEXISTS); ASSERT_EQ(res.responses.size(), 2); } } diff --git a/dbms/src/Compression/CompressionCodecT64.cpp b/dbms/src/Compression/CompressionCodecT64.cpp index af55b6ec512..b7fed59fb92 100644 --- a/dbms/src/Compression/CompressionCodecT64.cpp +++ b/dbms/src/Compression/CompressionCodecT64.cpp @@ -516,13 +516,13 @@ UInt32 CompressionCodecT64::doCompressData(const char * src, UInt32 src_size, ch break; } - throw Exception("Connot compress with T64", ErrorCodes::CANNOT_COMPRESS); + throw Exception("Cannot compress with T64", ErrorCodes::CANNOT_COMPRESS); } void CompressionCodecT64::doDecompressData(const char * src, UInt32 src_size, char * dst, UInt32 uncompressed_size) const { if (!src_size) - throw Exception("Connot decompress with T64", ErrorCodes::CANNOT_DECOMPRESS); + throw Exception("Cannot decompress with T64", ErrorCodes::CANNOT_DECOMPRESS); UInt8 cookie = unalignedLoad(src); src += 1; @@ -553,7 +553,7 @@ void CompressionCodecT64::doDecompressData(const char * src, UInt32 src_size, ch break; } - throw Exception("Connot decompress with T64", ErrorCodes::CANNOT_DECOMPRESS); + throw Exception("Cannot decompress with T64", ErrorCodes::CANNOT_DECOMPRESS); } void CompressionCodecT64::useInfoAboutType(DataTypePtr data_type) diff --git a/dbms/src/Compression/tests/gtest_compressionCodec.cpp b/dbms/src/Compression/tests/gtest_compressionCodec.cpp index 95bef3b691e..8693ce86c2a 100644 --- a/dbms/src/Compression/tests/gtest_compressionCodec.cpp +++ b/dbms/src/Compression/tests/gtest_compressionCodec.cpp @@ -301,10 +301,6 @@ struct Codec : codec_statement(std::move(codec_statement_)), expected_compression_ratio(expected_compression_ratio_) {} - - Codec() - : Codec(std::string()) - {} }; @@ -314,23 +310,12 @@ struct CodecTestSequence std::vector serialized_data; DataTypePtr data_type; - CodecTestSequence() - : name(), - serialized_data(), - data_type() - {} - CodecTestSequence(std::string name_, std::vector serialized_data_, DataTypePtr data_type_) : name(name_), serialized_data(serialized_data_), data_type(data_type_) {} - CodecTestSequence(const CodecTestSequence &) = default; - CodecTestSequence & operator=(const CodecTestSequence &) = default; - CodecTestSequence(CodecTestSequence &&) = default; - CodecTestSequence & operator=(CodecTestSequence &&) = default; - CodecTestSequence & append(const CodecTestSequence & other) { assert(data_type->equals(*other.data_type)); @@ -819,24 +804,6 @@ std::vector generatePyramidOfSequences(const size_t sequences return sequences; }; -// Just as if all sequences from generatePyramidOfSequences were appended to one-by-one to the first one. 
-template -CodecTestSequence generatePyramidSequence(const size_t sequences_count, Generator && generator, const char* generator_name) -{ - CodecTestSequence sequence; - sequence.data_type = makeDataType(); - sequence.serialized_data.reserve(sequences_count * sequences_count * sizeof(T)); - - for (size_t i = 1; i < sequences_count; ++i) - { - std::string name = generator_name + std::string(" from 0 to ") + std::to_string(i); - sequence.append(generateSeq(std::forward(generator), name.c_str(), 0, i)); - } - - return sequence; -}; - - // helper macro to produce human-friendly sequence name from generator #define G(generator) generator, #generator @@ -853,17 +820,17 @@ const auto DefaultCodecsToTest = ::testing::Values( // test cases /////////////////////////////////////////////////////////////////////////////////////////////////// -INSTANTIATE_TEST_CASE_P(Simple, +INSTANTIATE_TEST_SUITE_P(Simple, CodecTest, ::testing::Combine( DefaultCodecsToTest, ::testing::Values( makeSeq(1, 2, 3, 5, 7, 11, 13, 17, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SmallSequences, +INSTANTIATE_TEST_SUITE_P(SmallSequences, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -877,10 +844,10 @@ INSTANTIATE_TEST_CASE_P(SmallSequences, + generatePyramidOfSequences(42, G(SequentialGenerator(1))) + generatePyramidOfSequences(42, G(SequentialGenerator(1))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(Mixed, +INSTANTIATE_TEST_SUITE_P(Mixed, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -894,10 +861,10 @@ INSTANTIATE_TEST_CASE_P(Mixed, generateSeq(G(MinMaxGenerator()), 1, 5) + generateSeq(G(SequentialGenerator(1)), 1, 1001), generateSeq(G(MinMaxGenerator()), 1, 5) + generateSeq(G(SequentialGenerator(1)), 1, 1001) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SameValueInt, +INSTANTIATE_TEST_SUITE_P(SameValueInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -911,10 +878,10 @@ INSTANTIATE_TEST_CASE_P(SameValueInt, generateSeq(G(SameValueGenerator(1000))), generateSeq(G(SameValueGenerator(1000))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SameNegativeValueInt, +INSTANTIATE_TEST_SUITE_P(SameNegativeValueInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -928,10 +895,10 @@ INSTANTIATE_TEST_CASE_P(SameNegativeValueInt, generateSeq(G(SameValueGenerator(-1000))), generateSeq(G(SameValueGenerator(-1000))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SameValueFloat, +INSTANTIATE_TEST_SUITE_P(SameValueFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -942,10 +909,10 @@ INSTANTIATE_TEST_CASE_P(SameValueFloat, generateSeq(G(SameValueGenerator(M_E))), generateSeq(G(SameValueGenerator(M_E))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SameNegativeValueFloat, +INSTANTIATE_TEST_SUITE_P(SameNegativeValueFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -956,10 +923,10 @@ INSTANTIATE_TEST_CASE_P(SameNegativeValueFloat, generateSeq(G(SameValueGenerator(-1 * M_E))), generateSeq(G(SameValueGenerator(-1 * M_E))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SequentialInt, +INSTANTIATE_TEST_SUITE_P(SequentialInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -973,12 +940,12 @@ INSTANTIATE_TEST_CASE_P(SequentialInt, generateSeq(G(SequentialGenerator(1))), generateSeq(G(SequentialGenerator(1))) ) - ), + ) ); // -1, -2, -3, ... etc for signed // 0xFF, 0xFE, 0xFD, ... 
for unsigned -INSTANTIATE_TEST_CASE_P(SequentialReverseInt, +INSTANTIATE_TEST_SUITE_P(SequentialReverseInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -992,10 +959,10 @@ INSTANTIATE_TEST_CASE_P(SequentialReverseInt, generateSeq(G(SequentialGenerator(-1))), generateSeq(G(SequentialGenerator(-1))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SequentialFloat, +INSTANTIATE_TEST_SUITE_P(SequentialFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -1006,10 +973,10 @@ INSTANTIATE_TEST_CASE_P(SequentialFloat, generateSeq(G(SequentialGenerator(M_E))), generateSeq(G(SequentialGenerator(M_E))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(SequentialReverseFloat, +INSTANTIATE_TEST_SUITE_P(SequentialReverseFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -1020,10 +987,10 @@ INSTANTIATE_TEST_CASE_P(SequentialReverseFloat, generateSeq(G(SequentialGenerator(-1 * M_E))), generateSeq(G(SequentialGenerator(-1 * M_E))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(MonotonicInt, +INSTANTIATE_TEST_SUITE_P(MonotonicInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -1037,10 +1004,10 @@ INSTANTIATE_TEST_CASE_P(MonotonicInt, generateSeq(G(MonotonicGenerator(1, 5))), generateSeq(G(MonotonicGenerator(1, 5))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(MonotonicReverseInt, +INSTANTIATE_TEST_SUITE_P(MonotonicReverseInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -1054,10 +1021,10 @@ INSTANTIATE_TEST_CASE_P(MonotonicReverseInt, generateSeq(G(MonotonicGenerator(-1, 5))), generateSeq(G(MonotonicGenerator(-1, 5))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(MonotonicFloat, +INSTANTIATE_TEST_SUITE_P(MonotonicFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -1067,10 +1034,10 @@ INSTANTIATE_TEST_CASE_P(MonotonicFloat, generateSeq(G(MonotonicGenerator(M_E, 5))), generateSeq(G(MonotonicGenerator(M_E, 5))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(MonotonicReverseFloat, +INSTANTIATE_TEST_SUITE_P(MonotonicReverseFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -1080,10 +1047,10 @@ INSTANTIATE_TEST_CASE_P(MonotonicReverseFloat, generateSeq(G(MonotonicGenerator(-1 * M_E, 5))), generateSeq(G(MonotonicGenerator(-1 * M_E, 5))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(RandomInt, +INSTANTIATE_TEST_SUITE_P(RandomInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -1093,10 +1060,10 @@ INSTANTIATE_TEST_CASE_P(RandomInt, generateSeq(G(RandomGenerator(0, 0, 1000'000'000))), generateSeq(G(RandomGenerator(0, 0, 1000'000'000))) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(RandomishInt, +INSTANTIATE_TEST_SUITE_P(RandomishInt, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -1108,10 +1075,10 @@ INSTANTIATE_TEST_CASE_P(RandomishInt, generateSeq(G(RandomishGenerator)), generateSeq(G(RandomishGenerator)) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(RandomishFloat, +INSTANTIATE_TEST_SUITE_P(RandomishFloat, CodecTest, ::testing::Combine( DefaultCodecsToTest, @@ -1119,11 +1086,11 @@ INSTANTIATE_TEST_CASE_P(RandomishFloat, generateSeq(G(RandomishGenerator)), generateSeq(G(RandomishGenerator)) ) - ), + ) ); // Double delta overflow case, deltas are out of bounds for target type -INSTANTIATE_TEST_CASE_P(OverflowInt, +INSTANTIATE_TEST_SUITE_P(OverflowInt, CodecTest, ::testing::Combine( ::testing::Values( @@ -1136,10 +1103,10 @@ INSTANTIATE_TEST_CASE_P(OverflowInt, generateSeq(G(MinMaxGenerator())), generateSeq(G(MinMaxGenerator())) ) - ), + ) ); -INSTANTIATE_TEST_CASE_P(OverflowFloat, +INSTANTIATE_TEST_SUITE_P(OverflowFloat, CodecTest, ::testing::Combine( ::testing::Values( @@ -1152,7 +1119,7 @@ 
INSTANTIATE_TEST_CASE_P(OverflowFloat, generateSeq(G(FFand0Generator())), generateSeq(G(FFand0Generator())) ) - ), + ) ); template @@ -1189,7 +1156,7 @@ auto DDCompatibilityTestSequence() #define BIN_STR(x) std::string{x, sizeof(x) - 1} -INSTANTIATE_TEST_CASE_P(DoubleDelta, +INSTANTIATE_TEST_SUITE_P(DoubleDelta, CodecTest_Compatibility, ::testing::Combine( ::testing::Values(Codec("DoubleDelta")), @@ -1227,7 +1194,7 @@ INSTANTIATE_TEST_CASE_P(DoubleDelta, BIN_STR("\x94\xd4\x00\x00\x00\x98\x01\x00\x00\x08\x00\x33\x00\x00\x00\x2a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x00\x00\x70\x0d\x7a\x00\x02\x80\x7b\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00\xfc\x00\x00\x00\x04\x00\x06\xbe\x4f\xbf\xff\xd6\x0c\xff\x00\x00\x00\x01\x00\x00\x00\x03\xf8\x00\x00\x00\x08\x00\x00\x00\x0f\xc0\x00\x00\x00\x3f\xff\xff\xff\xfb\xff\xff\xff\xfb\xe0\x00\x00\x01\xc0\x00\x00\x06\x9f\x80\x00\x00\x0a\x00\x00\x00\x34\xf3\xff\xff\xff\xe7\x9f\xff\xff\xff\x7e\x00\x00\x00\x00\xff\xff\xff\xfd\xf0\x00\x00\x00\x07\xff\xff\xff\xf0") }, }) - ), + ) ); template @@ -1263,7 +1230,7 @@ auto GCompatibilityTestSequence() return generateSeq(G(PrimesWithMultiplierGenerator(intExp10(sizeof(ValueType)))), 0, 42); } -INSTANTIATE_TEST_CASE_P(Gorilla, +INSTANTIATE_TEST_SUITE_P(Gorilla, CodecTest_Compatibility, ::testing::Combine( ::testing::Values(Codec("Gorilla")), @@ -1301,14 +1268,31 @@ INSTANTIATE_TEST_CASE_P(Gorilla, BIN_STR("\x95\x91\x00\x00\x00\x50\x01\x00\x00\x08\x00\x2a\x00\x00\x00\x00\xc2\xeb\x0b\x00\x00\x00\x00\xe3\x2b\xa0\xa6\x19\x85\x98\xdc\x45\x74\x74\x43\xc2\x57\x41\x4c\x6e\x42\x79\xd9\x8f\x88\xa5\x05\xf3\xf1\x94\xa3\x62\x1e\x02\xdf\x05\x10\xf1\x15\x97\x35\x2a\x50\x71\x0f\x09\x6c\x89\xf7\x65\x1d\x11\xb7\xcc\x7d\x0b\x70\xc1\x86\x88\x48\x47\x87\xb6\x32\x26\xa7\x86\x87\x88\xd3\x93\x3d\xfc\x28\x68\x85\x05\x0b\x13\xc6\x5f\xd4\x70\xe1\x5e\x76\xf1\x9f\xf3\x33\x2a\x14\x14\x5e\x40\xc1\x5c\x28\x3f\xec\x43\x03\x05\x11\x91\xe8\xeb\x8e\x0a\x0e\x27\x21\x55\xcb\x39\xbc\x6a\xff\x11\x5d\x81\xa0\xa6\x10") }, }) - ), + ) ); // These 'tests' try to measure performance of encoding and decoding and hence only make sense to be run locally, // also they require pretty big data to run against and generating this data slows down startup of unit test process. // So un-comment only at your discretion. -//INSTANTIATE_TEST_CASE_P(DoubleDelta, +// Just as if all sequences from generatePyramidOfSequences were appended one-by-one to the first one.
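The test-file churn above and below is the googletest migration from INSTANTIATE_TEST_CASE_P to INSTANTIATE_TEST_SUITE_P; the newer macro also stops tolerating the dangling comma that the old variadic form accepted, which is why every "),"" before the closing ");" loses its comma. A hedged sketch with an invented parameter type (link against gtest_main):

#include <gtest/gtest.h>

class CodecTest : public ::testing::TestWithParam<int>
{
};

TEST_P(CodecTest, TranscodingRoundTrip)
{
    EXPECT_GE(GetParam(), 0);   // placeholder body
}

// Deprecated spelling; the old macro took an optional extra argument, so a
// dangling comma was tolerated (and this file carried one):
//
//   INSTANTIATE_TEST_CASE_P(Simple, CodecTest, ::testing::Values(1, 2, 3),);
//
// Current spelling, no trailing comma allowed:
INSTANTIATE_TEST_SUITE_P(Simple, CodecTest, ::testing::Values(1, 2, 3));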
+//template +//CodecTestSequence generatePyramidSequence(const size_t sequences_count, Generator && generator, const char* generator_name) +//{ +// CodecTestSequence sequence; +// sequence.data_type = makeDataType(); +// sequence.serialized_data.reserve(sequences_count * sequences_count * sizeof(T)); +// +// for (size_t i = 1; i < sequences_count; ++i) +// { +// std::string name = generator_name + std::string(" from 0 to ") + std::to_string(i); +// sequence.append(generateSeq(std::forward(generator), name.c_str(), 0, i)); +// } +// +// return sequence; +//}; + +//INSTANTIATE_TEST_SUITE_P(DoubleDelta, // CodecTest_Performance, // ::testing::Combine( // ::testing::Values(Codec("DoubleDelta")), @@ -1325,7 +1309,7 @@ INSTANTIATE_TEST_CASE_P(Gorilla, // ), //); -//INSTANTIATE_TEST_CASE_P(Gorilla, +//INSTANTIATE_TEST_SUITE_P(Gorilla, // CodecTest_Performance, // ::testing::Combine( // ::testing::Values(Codec("Gorilla")), diff --git a/dbms/src/Core/tests/gtest_DecimalFunctions.cpp b/dbms/src/Core/tests/gtest_DecimalFunctions.cpp index d03be3ff3b8..fc304446057 100644 --- a/dbms/src/Core/tests/gtest_DecimalFunctions.cpp +++ b/dbms/src/Core/tests/gtest_DecimalFunctions.cpp @@ -120,7 +120,7 @@ TEST_P(DecimalUtilsSplitAndCombineTest, getFractionalPart_Decimal128) } // Intentionally small values that fit into 32-bit in order to cover Decimal32, Decimal64 and Decimal128 with single set of data. -INSTANTIATE_TEST_CASE_P(Basic, +INSTANTIATE_TEST_SUITE_P(Basic, DecimalUtilsSplitAndCombineTest, ::testing::ValuesIn(std::initializer_list{ { @@ -168,5 +168,5 @@ INSTANTIATE_TEST_CASE_P(Basic, 89 } } - } -),); + }) +); diff --git a/dbms/src/DataStreams/AddingDefaultsBlockInputStream.cpp b/dbms/src/DataStreams/AddingDefaultsBlockInputStream.cpp index f6b6b290428..112afe61183 100644 --- a/dbms/src/DataStreams/AddingDefaultsBlockInputStream.cpp +++ b/dbms/src/DataStreams/AddingDefaultsBlockInputStream.cpp @@ -62,6 +62,9 @@ Block AddingDefaultsBlockInputStream::readImpl() if (evaluate_block.has(column.first)) evaluate_block.erase(column.first); + if (!evaluate_block.columns()) + evaluate_block.insert({ColumnConst::create(ColumnUInt8::create(1, 0), res.rows()), std::make_shared(), "_dummy"}); + evaluateMissingDefaults(evaluate_block, header.getNamesAndTypesList(), column_defaults, context, false); std::unordered_map mixed_columns; diff --git a/dbms/src/DataStreams/AggregatingSortedBlockInputStream.cpp b/dbms/src/DataStreams/AggregatingSortedBlockInputStream.cpp index d23d93e7e5c..03da7f7c528 100644 --- a/dbms/src/DataStreams/AggregatingSortedBlockInputStream.cpp +++ b/dbms/src/DataStreams/AggregatingSortedBlockInputStream.cpp @@ -60,8 +60,6 @@ AggregatingSortedBlockInputStream::AggregatingSortedBlockInputStream( const BlockInputStreams & inputs_, const SortDescription & description_, size_t max_block_size_) : MergingSortedBlockInputStream(inputs_, description_, max_block_size_) { - ColumnNumbers positions; - /// Fill in the column numbers that need to be aggregated. 
for (size_t i = 0; i < num_columns; ++i) { @@ -96,7 +94,7 @@ AggregatingSortedBlockInputStream::AggregatingSortedBlockInputStream( columns_to_simple_aggregate.emplace_back(std::move(desc)); if (recursiveRemoveLowCardinality(column.type).get() != column.type.get()) - positions.emplace_back(i); + converted_lc_columns.emplace_back(i); } else { @@ -105,10 +103,12 @@ AggregatingSortedBlockInputStream::AggregatingSortedBlockInputStream( } } - if (!positions.empty()) + result_header = header; + + if (!converted_lc_columns.empty()) { for (auto & input : children) - input = std::make_shared(input, positions); + input = std::make_shared(input, converted_lc_columns); header = children.at(0)->getHeader(); } @@ -134,7 +134,15 @@ Block AggregatingSortedBlockInputStream::readImpl() columns_to_aggregate[i] = typeid_cast(merged_columns[column_numbers_to_aggregate[i]].get()); merge(merged_columns, queue_without_collation); - return header.cloneWithColumns(std::move(merged_columns)); + + for (auto & pos : converted_lc_columns) + { + auto & from_type = header.getByPosition(pos).type; + auto & to_type = result_header.getByPosition(pos).type; + merged_columns[pos] = (*recursiveTypeConversion(std::move(merged_columns[pos]), from_type, to_type)).mutate(); + } + + return result_header.cloneWithColumns(std::move(merged_columns)); } diff --git a/dbms/src/DataStreams/AggregatingSortedBlockInputStream.h b/dbms/src/DataStreams/AggregatingSortedBlockInputStream.h index 6ef1259e458..b0387dbcf2b 100644 --- a/dbms/src/DataStreams/AggregatingSortedBlockInputStream.h +++ b/dbms/src/DataStreams/AggregatingSortedBlockInputStream.h @@ -31,6 +31,8 @@ public: bool isSortedOutput() const override { return true; } + Block getHeader() const override { return result_header; } + protected: /// Can return 1 more records than max_block_size. Block readImpl() override; @@ -52,6 +54,9 @@ private: SharedBlockRowRef current_key; /// The current primary key. SharedBlockRowRef next_key; /// The primary key of the next row. + Block result_header; + ColumnNumbers converted_lc_columns; + /** We support two different cursors - with Collation and without. * Templates are used instead of polymorphic SortCursor and calls to virtual functions. 
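The AggregatingSortedBlockInputStream change above keeps the positions of LowCardinality columns that were converted to ordinary columns on input (converted_lc_columns), preserves the original header as result_header behind a getHeader() override, and converts those positions back in every block readImpl() returns. A toy sketch of that bookkeeping, using plain type-name strings in place of real Block and Column types:

#include <iostream>
#include <string>
#include <vector>

/// Toy model: a header is just a list of type names. LowCardinality wrappers
/// are stripped for the internal merge logic; callers keep seeing the
/// original header and get the columns converted back on output.
struct MergingStream
{
    std::vector<std::string> header;           /// working header (stripped)
    std::vector<std::string> result_header;    /// header advertised to callers
    std::vector<size_t> converted_lc_columns;  /// positions to convert back

    explicit MergingStream(const std::vector<std::string> & input_header)
        : header(input_header), result_header(input_header)
    {
        const std::string prefix = "LowCardinality(";
        for (size_t i = 0; i < header.size(); ++i)
        {
            if (header[i].compare(0, prefix.size(), prefix) == 0)
            {
                converted_lc_columns.push_back(i);
                header[i] = header[i].substr(prefix.size(), header[i].size() - prefix.size() - 1);
            }
        }
    }

    const std::vector<std::string> & getHeader() const { return result_header; }
};

int main()
{
    MergingStream stream({"UInt64", "LowCardinality(String)"});
    std::cout << "merged as " << stream.header[1]
              << ", returned as " << stream.getHeader()[1] << "\n";
}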
*/ diff --git a/dbms/src/DataTypes/DataTypeFactory.cpp b/dbms/src/DataTypes/DataTypeFactory.cpp index 3a0a25b1715..c073f00a0b7 100644 --- a/dbms/src/DataTypes/DataTypeFactory.cpp +++ b/dbms/src/DataTypes/DataTypeFactory.cpp @@ -35,7 +35,7 @@ DataTypePtr DataTypeFactory::get(const ASTPtr & ast) const if (const auto * func = ast->as()) { if (func->parameters) - throw Exception("Data type cannot have multiple parenthesed parameters.", ErrorCodes::ILLEGAL_SYNTAX_FOR_DATA_TYPE); + throw Exception("Data type cannot have multiple parenthesized parameters.", ErrorCodes::ILLEGAL_SYNTAX_FOR_DATA_TYPE); return get(func->name, func->arguments); } diff --git a/dbms/src/DataTypes/tests/gtest_data_type_get_common_type.cpp b/dbms/src/DataTypes/tests/gtest_data_type_get_common_type.cpp index 2ae1c335387..602320f5fca 100644 --- a/dbms/src/DataTypes/tests/gtest_data_type_get_common_type.cpp +++ b/dbms/src/DataTypes/tests/gtest_data_type_get_common_type.cpp @@ -104,7 +104,7 @@ TEST_P(MostSubtypeTest, getLeastSupertype) } } -INSTANTIATE_TEST_CASE_P(data_type, +INSTANTIATE_TEST_SUITE_P(data_type, LeastSuperTypeTest, ::testing::ValuesIn( std::initializer_list{ @@ -159,10 +159,10 @@ INSTANTIATE_TEST_CASE_P(data_type, {"Tuple(Int64,Int8) Tuple(UInt64)", nullptr}, {"Array(Int64) Array(String)", nullptr}, } - ), + ) ); -INSTANTIATE_TEST_CASE_P(data_type, +INSTANTIATE_TEST_SUITE_P(data_type, MostSubtypeTest, ::testing::ValuesIn( std::initializer_list{ @@ -210,5 +210,6 @@ INSTANTIATE_TEST_CASE_P(data_type, {"Int8 String", nullptr}, {"Nothing", nullptr}, {"FixedString(16) FixedString(8) String", nullptr}, - }), + } + ) ); diff --git a/dbms/src/Databases/DatabaseWithDictionaries.h b/dbms/src/Databases/DatabaseWithDictionaries.h index c16e11f24c5..5ec37bdbb1a 100644 --- a/dbms/src/Databases/DatabaseWithDictionaries.h +++ b/dbms/src/Databases/DatabaseWithDictionaries.h @@ -32,7 +32,7 @@ public: protected: DatabaseWithDictionaries(const String & name, const String & metadata_path_, const String & logger) - : DatabaseOnDisk(name, metadata_path_, logger) {} + : DatabaseOnDisk(name, metadata_path_, logger) {} void attachToExternalDictionariesLoader(Context & context); void detachFromExternalDictionariesLoader(); diff --git a/dbms/src/Dictionaries/DictionaryBlockInputStream.h b/dbms/src/Dictionaries/DictionaryBlockInputStream.h index 09b9ec8d4af..c683ef0e9cc 100644 --- a/dbms/src/Dictionaries/DictionaryBlockInputStream.h +++ b/dbms/src/Dictionaries/DictionaryBlockInputStream.h @@ -43,7 +43,7 @@ public: using GetColumnsFunction = std::function & attributes)>; // Used to separate key columns format for storage and view. - // Calls get_key_columns_function to get key column for dictionary get fuction call + // Calls get_key_columns_function to get key column for dictionary get function call // and get_view_columns_function to get key representation. 
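One background note for the GatherUtils/Sources.h fix a little further below: ClickHouse string columns store every element followed by a terminating zero byte, with cumulative offsets pointing one past each terminator, so StringSource::getElementSize has to subtract that extra byte; the old code over-reported every string's length by one. A self-contained sketch of the layout (the data here is invented):

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    /// Two elements, "ab" and "xyz", stored with terminating zero bytes;
    /// offsets are cumulative and point one past each terminator.
    std::vector<char> chars = {'a', 'b', '\0', 'x', 'y', 'z', '\0'};
    std::vector<uint64_t> offsets = {3, 7};

    uint64_t prev_offset = 0;
    for (size_t row = 0; row < offsets.size(); ++row)
    {
        /// getElementSize: the terminating zero is not part of the element.
        uint64_t element_size = offsets[row] - prev_offset - 1;
        std::cout << "row " << row << ": \""
                  << std::string(chars.begin() + prev_offset,
                                 chars.begin() + prev_offset + element_size)
                  << "\" (" << element_size << " bytes)\n";
        prev_offset = offsets[row];
    }
}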
// Now used in trie dictionary, where columns are stored as ip and mask, and are showed as string DictionaryBlockInputStream( diff --git a/dbms/src/Functions/FunctionsHashing.h b/dbms/src/Functions/FunctionsHashing.h index bf36e1999e2..fd52bbeb316 100644 --- a/dbms/src/Functions/FunctionsHashing.h +++ b/dbms/src/Functions/FunctionsHashing.h @@ -371,7 +371,7 @@ struct JavaHashUTF16LEImpl } if (size % 2 != 0) - throw Exception("Arguments for javaHashUTF16LE must be in the form of UTF-16", ErrorCodes::LOGICAL_ERROR); + throw Exception("Arguments for javaHashUTF16LE must be in the form of UTF-16", ErrorCodes::BAD_ARGUMENTS); UInt32 h = 0; for (size_t i = 0; i < size; i += 2) diff --git a/dbms/src/Functions/GatherUtils/Sources.h b/dbms/src/Functions/GatherUtils/Sources.h index c21a6fc523c..7c881bba0c5 100644 --- a/dbms/src/Functions/GatherUtils/Sources.h +++ b/dbms/src/Functions/GatherUtils/Sources.h @@ -238,7 +238,7 @@ struct StringSource size_t getElementSize() const { - return offsets[row_num] - prev_offset; + return offsets[row_num] - prev_offset - 1; } Slice getWhole() const diff --git a/dbms/src/Functions/Regexps.h b/dbms/src/Functions/Regexps.h index 9a8366fb543..5d93e823419 100644 --- a/dbms/src/Functions/Regexps.h +++ b/dbms/src/Functions/Regexps.h @@ -36,6 +36,7 @@ namespace ErrorCodes { extern const int CANNOT_ALLOCATE_MEMORY; extern const int LOGICAL_ERROR; + extern const int BAD_ARGUMENTS; } namespace Regexps @@ -205,7 +206,7 @@ namespace MultiRegexps else throw Exception( "Pattern '" + str_patterns[error->expression] + "' failed with error '" + String(error->message), - ErrorCodes::LOGICAL_ERROR); + ErrorCodes::BAD_ARGUMENTS); } ProfileEvents::increment(ProfileEvents::RegexpCreated); diff --git a/dbms/src/Functions/formatString.h b/dbms/src/Functions/formatString.h index c1f9b6d3783..3b08d313c4d 100644 --- a/dbms/src/Functions/formatString.h +++ b/dbms/src/Functions/formatString.h @@ -18,6 +18,7 @@ namespace DB namespace ErrorCodes { extern const int LOGICAL_ERROR; + extern const int BAD_ARGUMENTS; } struct FormatImpl @@ -45,11 +46,11 @@ struct FormatImpl for (UInt64 pos = l; pos < r; pos++) { if (!isNumericASCII(description[pos])) - throw Exception("Not a number in curly braces at position " + std::to_string(pos), ErrorCodes::LOGICAL_ERROR); + throw Exception("Not a number in curly braces at position " + std::to_string(pos), ErrorCodes::BAD_ARGUMENTS); res = res * 10 + description[pos] - '0'; if (res >= argument_threshold) throw Exception( - "Too big number for arguments, must be at most " + std::to_string(argument_threshold), ErrorCodes::LOGICAL_ERROR); + "Too big number for arguments, must be at most " + std::to_string(argument_threshold), ErrorCodes::BAD_ARGUMENTS); } } @@ -114,7 +115,7 @@ struct FormatImpl } if (is_open_curly) - throw Exception("Two open curly braces without close one at position " + std::to_string(i), ErrorCodes::LOGICAL_ERROR); + throw Exception("Two open curly braces without close one at position " + std::to_string(i), ErrorCodes::BAD_ARGUMENTS); String to_add = String(pattern.data() + start_pos, i - start_pos); double_brace_removal(to_add); @@ -137,7 +138,7 @@ struct FormatImpl } if (!is_open_curly) - throw Exception("Closed curly brace without open one at position " + std::to_string(i), ErrorCodes::LOGICAL_ERROR); + throw Exception("Closed curly brace without open one at position " + std::to_string(i), ErrorCodes::BAD_ARGUMENTS); is_open_curly = false; @@ -145,17 +146,17 @@ struct FormatImpl { if (is_plain_numbering && !*is_plain_numbering) throw 
Exception( - "Cannot switch from automatic field numbering to manual field specification", ErrorCodes::LOGICAL_ERROR); + "Cannot switch from automatic field numbering to manual field specification", ErrorCodes::BAD_ARGUMENTS); is_plain_numbering = true; if (index_if_plain >= argument_number) - throw Exception("Argument is too big for formatting", ErrorCodes::LOGICAL_ERROR); + throw Exception("Argument is too big for formatting", ErrorCodes::BAD_ARGUMENTS); *index_positions_ptr = index_if_plain++; } else { if (is_plain_numbering && *is_plain_numbering) throw Exception( - "Cannot switch from automatic field numbering to manual field specification", ErrorCodes::LOGICAL_ERROR); + "Cannot switch from automatic field numbering to manual field specification", ErrorCodes::BAD_ARGUMENTS); is_plain_numbering = false; UInt64 arg; @@ -163,7 +164,7 @@ struct FormatImpl if (arg >= argument_number) throw Exception( - "Argument is too big for formatting. Note that indexing starts from zero", ErrorCodes::LOGICAL_ERROR); + "Argument is too big for formatting. Note that indexing starts from zero", ErrorCodes::BAD_ARGUMENTS); *index_positions_ptr = arg; } @@ -183,7 +184,7 @@ struct FormatImpl } if (is_open_curly) - throw Exception("Last open curly brace is not closed", ErrorCodes::LOGICAL_ERROR); + throw Exception("Last open curly brace is not closed", ErrorCodes::BAD_ARGUMENTS); String to_add = String(pattern.data() + start_pos, pattern.size() - start_pos); double_brace_removal(to_add); diff --git a/dbms/src/Functions/neighbor.cpp b/dbms/src/Functions/neighbor.cpp index 0253aed65d3..c37a3313a80 100644 --- a/dbms/src/Functions/neighbor.cpp +++ b/dbms/src/Functions/neighbor.cpp @@ -40,6 +40,8 @@ public: bool isVariadic() const override { return true; } + bool isStateful() const override { return true; } + bool isDeterministic() const override { return false; } bool isDeterministicInScopeOfQuery() const override { return false; } diff --git a/dbms/src/IO/ReadHelpers.cpp b/dbms/src/IO/ReadHelpers.cpp index 9ad6cf72171..eba724f2193 100644 --- a/dbms/src/IO/ReadHelpers.cpp +++ b/dbms/src/IO/ReadHelpers.cpp @@ -959,7 +959,7 @@ void skipJSONField(ReadBuffer & buf, const StringRef & name_of_field) } -void readException(Exception & e, ReadBuffer & buf, const String & additional_message) +Exception readException(ReadBuffer & buf, const String & additional_message) { int code = 0; String name; @@ -986,14 +986,12 @@ void readException(Exception & e, ReadBuffer & buf, const String & additional_me if (!stack_trace.empty()) out << " Stack trace:\n\n" << stack_trace; - e = Exception(out.str(), code); + return Exception(out.str(), code); } void readAndThrowException(ReadBuffer & buf, const String & additional_message) { - Exception e; - readException(e, buf, additional_message); - e.rethrow(); + readException(buf, additional_message).rethrow(); } diff --git a/dbms/src/IO/ReadHelpers.h b/dbms/src/IO/ReadHelpers.h index 7e5b5ce804f..fc8e444330c 100644 --- a/dbms/src/IO/ReadHelpers.h +++ b/dbms/src/IO/ReadHelpers.h @@ -930,7 +930,7 @@ void skipJSONField(ReadBuffer & buf, const StringRef & name_of_field); * (type is cut to base class, 'message' replaced by 'displayText', and stack trace is appended to 'message') * Some additional message could be appended to exception (example: you could add information about from where it was received). 
*/ -void readException(Exception & e, ReadBuffer & buf, const String & additional_message = ""); +Exception readException(ReadBuffer & buf, const String & additional_message = ""); void readAndThrowException(ReadBuffer & buf, const String & additional_message = ""); diff --git a/dbms/src/IO/tests/gtest_DateTime64_parsing_and_writing.cpp b/dbms/src/IO/tests/gtest_DateTime64_parsing_and_writing.cpp index 08ca5dc88ee..04fdb6f4a34 100644 --- a/dbms/src/IO/tests/gtest_DateTime64_parsing_and_writing.cpp +++ b/dbms/src/IO/tests/gtest_DateTime64_parsing_and_writing.cpp @@ -79,7 +79,7 @@ TEST_P(DateTime64StringParseBestEffortTest, parse) // YYYY-MM-DD HH:MM:SS.NNNNNNNNN -INSTANTIATE_TEST_CASE_P(Basic, +INSTANTIATE_TEST_SUITE_P(Basic, DateTime64StringParseTest, ::testing::ValuesIn(std::initializer_list{ { @@ -130,10 +130,10 @@ INSTANTIATE_TEST_CASE_P(Basic, 1568650817'1ULL, 1 } - }), + }) ); -INSTANTIATE_TEST_CASE_P(BestEffort, +INSTANTIATE_TEST_SUITE_P(BestEffort, DateTime64StringParseBestEffortTest, ::testing::ValuesIn(std::initializer_list{ { @@ -142,13 +142,13 @@ INSTANTIATE_TEST_CASE_P(BestEffort, 1568650817'123456ULL, 6 } - }), + }) ); // TODO: add negative test cases for invalid strings, verifying that error is reported properly -INSTANTIATE_TEST_CASE_P(Basic, +INSTANTIATE_TEST_SUITE_P(Basic, DateTime64StringWriteTest, ::testing::ValuesIn(std::initializer_list{ { @@ -181,6 +181,6 @@ INSTANTIATE_TEST_CASE_P(Basic, 1568650817'001ULL, 3 } - }), + }) ); diff --git a/dbms/src/IO/tests/gtest_bit_io.cpp b/dbms/src/IO/tests/gtest_bit_io.cpp index 994e08214cc..57fe45ca1a6 100644 --- a/dbms/src/IO/tests/gtest_bit_io.cpp +++ b/dbms/src/IO/tests/gtest_bit_io.cpp @@ -177,7 +177,7 @@ TEST_P(BitIO, WriteAndRead) } } -INSTANTIATE_TEST_CASE_P(Simple, +INSTANTIATE_TEST_SUITE_P(Simple, BitIO, ::testing::ValuesIn(std::initializer_list{ { @@ -221,7 +221,7 @@ INSTANTIATE_TEST_CASE_P(Simple, "10101001 10111010 11101111 10101111 10111010 11101011 10101001 00000000 " // 256 "10101111 10111010 11101011 10101001 00001111 11110000 00001110 11111111 " // 320 } - }), + }) ); TestCaseParameter primes_case(UInt8 repeat_times, UInt64 pattern) @@ -241,12 +241,13 @@ TestCaseParameter primes_case(UInt8 repeat_times, UInt64 pattern) return TestCaseParameter(test_data); } -INSTANTIATE_TEST_CASE_P(Primes, - BitIO, - ::testing::Values( - primes_case(11, 0xFFFFFFFFFFFFFFFFULL), - primes_case(11, BIT_PATTERN) -),); +INSTANTIATE_TEST_SUITE_P(Primes, + BitIO, + ::testing::Values( + primes_case(11, 0xFFFFFFFFFFFFFFFFULL), + primes_case(11, BIT_PATTERN) + ) +); TEST(BitHelpers, maskLowBits) { diff --git a/dbms/src/Interpreters/AddDefaultDatabaseVisitor.h b/dbms/src/Interpreters/AddDefaultDatabaseVisitor.h index acb2eb8246d..8ca22cb94a9 100644 --- a/dbms/src/Interpreters/AddDefaultDatabaseVisitor.h +++ b/dbms/src/Interpreters/AddDefaultDatabaseVisitor.h @@ -17,7 +17,7 @@ namespace DB { /// Visitors consist of functions with unified interface 'void visit(Casted & x, ASTPtr & y)', there x is y, successfully casted to Casted. -/// Both types and fuction could have const specifiers. The second argument is used by visitor to replaces AST node (y) if needed. +/// Both types and function could have const specifiers. The second argument is used by visitor to replaces AST node (y) if needed. /// Visits AST nodes, add default database to tables if not set. There's different logic for DDLs and selects. 
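The ReadHelpers refactor above replaces the out-parameter style, where the caller default-constructs an Exception for readException to overwrite, with a factory that returns the Exception by value; together with the Connection::receiveException change earlier, this keeps default-constructed exceptions out of the receive path. A hedged sketch of the before/after shape, with a stand-in Exception and a string standing in for the ReadBuffer:

#include <stdexcept>
#include <string>

struct Exception : std::runtime_error
{
    Exception(const std::string & msg, int code_) : std::runtime_error(msg), code(code_) {}
    int code;
    [[noreturn]] void rethrow() const { throw *this; }
};

/// After the refactor: decode and construct in one step, return by value.
/// (The "decoding" is faked; the real code reads code, name, message and
/// stack trace from the buffer.)
Exception readException(const std::string & wire_payload, const std::string & additional_message)
{
    const int code = 1002;   /// pretend this was read from the buffer
    return Exception(additional_message + ": " + wire_payload, code);
}

/// The old shape for comparison:
///     void readException(Exception & e, ReadBuffer & buf, const String & msg);
/// which forced callers to default-construct an Exception just to overwrite it.
void readAndThrowException(const std::string & wire_payload, const std::string & additional_message)
{
    readException(wire_payload, additional_message).rethrow();
}

int main()
{
    try { readAndThrowException("DB::Exception payload", "Received from 127.0.0.1:9000"); }
    catch (const Exception & e) { return e.code == 1002 ? 0 : 1; }
}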
class AddDefaultDatabaseVisitor diff --git a/dbms/src/Interpreters/Aggregator.cpp b/dbms/src/Interpreters/Aggregator.cpp index 8320f5dc70a..8b2bff5414e 100644 --- a/dbms/src/Interpreters/Aggregator.cpp +++ b/dbms/src/Interpreters/Aggregator.cpp @@ -157,7 +157,7 @@ Aggregator::Aggregator(const Params & params_) total_size_of_aggregate_states = 0; all_aggregates_has_trivial_destructor = true; - // aggreate_states will be aligned as below: + // aggregate_states will be aligned as below: // |<-- state_1 -->|<-- pad_1 -->|<-- state_2 -->|<-- pad_2 -->| ..... // // pad_N will be used to match alignment requirement for each next state. @@ -168,7 +168,7 @@ Aggregator::Aggregator(const Params & params_) total_size_of_aggregate_states += params.aggregates[i].function->sizeOfData(); - // aggreate states are aligned based on maximum requirement + // aggregate states are aligned based on maximum requirement align_aggregate_states = std::max(align_aggregate_states, params.aggregates[i].function->alignOfData()); // If not the last aggregate_state, we need pad it so that next aggregate_state will be aligned. diff --git a/dbms/src/Interpreters/BloomFilter.cpp b/dbms/src/Interpreters/BloomFilter.cpp index 709dd7fbddf..a6a5bedb8a8 100644 --- a/dbms/src/Interpreters/BloomFilter.cpp +++ b/dbms/src/Interpreters/BloomFilter.cpp @@ -96,7 +96,7 @@ DataTypePtr BloomFilter::getPrimitiveType(const DataTypePtr & data_type) if (!typeid_cast(array_type->getNestedType().get())) return getPrimitiveType(array_type->getNestedType()); else - throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::LOGICAL_ERROR); + throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::BAD_ARGUMENTS); } if (const auto * nullable_type = typeid_cast(data_type.get())) diff --git a/dbms/src/Interpreters/BloomFilterHash.h b/dbms/src/Interpreters/BloomFilterHash.h index 77bd5cc7ffd..e7411433781 100644 --- a/dbms/src/Interpreters/BloomFilterHash.h +++ b/dbms/src/Interpreters/BloomFilterHash.h @@ -23,6 +23,7 @@ namespace DB namespace ErrorCodes { extern const int ILLEGAL_COLUMN; + extern const int BAD_ARGUMENTS; } struct BloomFilterHash @@ -33,45 +34,64 @@ struct BloomFilterHash 15033938188484401405ULL, 18286745649494826751ULL, 6852245486148412312ULL, 8886056245089344681ULL, 10151472371158292780ULL }; - static ColumnPtr hashWithField(const IDataType * data_type, const Field & field) + template + static UInt64 getNumberTypeHash(const Field & field) { - WhichDataType which(data_type); - UInt64 hash = 0; - bool unexpected_type = false; + /// For negative values, we convert through the concrete type to make sure the sign ends up in the right place + return field.isNull() ?
intHash64(0) : intHash64(ext::bit_cast(FieldType(field.safeGet()))); + } - if (field.isNull()) - { - if (which.isInt() || which.isUInt() || which.isEnum() || which.isDateOrDateTime() || which.isFloat()) - hash = intHash64(0); - else if (which.isString()) - hash = CityHash_v1_0_2::CityHash64("", 0); - else if (which.isFixedString()) - { - const auto * fixed_string_type = typeid_cast(data_type); - const std::vector value(fixed_string_type->getN(), 0); - hash = CityHash_v1_0_2::CityHash64(value.data(), value.size()); - } - else - unexpected_type = true; - } - else if (which.isUInt() || which.isDateOrDateTime()) - hash = intHash64(field.safeGet()); - else if (which.isInt() || which.isEnum()) - hash = intHash64(ext::bit_cast(field.safeGet())); - else if (which.isFloat32() || which.isFloat64()) - hash = intHash64(ext::bit_cast(field.safeGet())); - else if (which.isString() || which.isFixedString()) + static UInt64 getStringTypeHash(const Field & field) + { + if (!field.isNull()) { const auto & value = field.safeGet(); - hash = CityHash_v1_0_2::CityHash64(value.data(), value.size()); + return CityHash_v1_0_2::CityHash64(value.data(), value.size()); } - else - unexpected_type = true; - if (unexpected_type) - throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::LOGICAL_ERROR); + return CityHash_v1_0_2::CityHash64("", 0); + } - return ColumnConst::create(ColumnUInt64::create(1, hash), 1); + static UInt64 getFixedStringTypeHash(const Field & field, const IDataType * type) + { + if (!field.isNull()) + { + const auto & value = field.safeGet(); + return CityHash_v1_0_2::CityHash64(value.data(), value.size()); + } + + const auto * fixed_string_type = typeid_cast(type); + const std::vector value(fixed_string_type->getN(), 0); + return CityHash_v1_0_2::CityHash64(value.data(), value.size()); + } + + static ColumnPtr hashWithField(const IDataType * data_type, const Field & field) + { + const auto & build_hash_column = [&](const UInt64 & hash) -> ColumnPtr + { + return ColumnConst::create(ColumnUInt64::create(1, hash), 1); + }; + + + WhichDataType which(data_type); + + if (which.isUInt8()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isUInt16()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isUInt32()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isUInt64()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isInt8()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isInt16()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isInt32()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isInt64()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isEnum8()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isEnum16()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isDate()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isDateTime()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isFloat32()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isFloat64()) return build_hash_column(getNumberTypeHash(field)); + else if (which.isString()) return build_hash_column(getStringTypeHash(field)); + else if (which.isFixedString()) return build_hash_column(getFixedStringTypeHash(field, data_type)); + else throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", 
ErrorCodes::BAD_ARGUMENTS); } static ColumnPtr hashWithColumn(const DataTypePtr & data_type, const ColumnPtr & column, size_t pos, size_t limit) @@ -82,7 +102,7 @@ struct BloomFilterHash const auto * array_col = typeid_cast(column.get()); if (checkAndGetColumn(array_col->getData())) - throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::LOGICAL_ERROR); + throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::BAD_ARGUMENTS); const auto & offsets = array_col->getOffsets(); limit = offsets[pos + limit - 1] - offsets[pos - 1]; /// PaddedPODArray allows access on index -1. @@ -127,7 +147,7 @@ struct BloomFilterHash else if (which.isFloat64()) getNumberTypeHash(column, vec, pos); else if (which.isString()) getStringTypeHash(column, vec, pos); else if (which.isFixedString()) getStringTypeHash(column, vec, pos); - else throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::LOGICAL_ERROR); + else throw Exception("Unexpected type " + data_type->getName() + " of bloom filter index.", ErrorCodes::BAD_ARGUMENTS); } template diff --git a/dbms/src/Interpreters/Cluster.cpp b/dbms/src/Interpreters/Cluster.cpp index 2c75bd821fe..71bd89b2b6f 100644 --- a/dbms/src/Interpreters/Cluster.cpp +++ b/dbms/src/Interpreters/Cluster.cpp @@ -10,6 +10,7 @@ #include #include #include +#include namespace DB { @@ -449,18 +450,64 @@ void Cluster::initMisc() } } +std::unique_ptr Cluster::getClusterWithReplicasAsShards(const Settings & settings) const +{ + return std::unique_ptr{ new Cluster(ReplicasAsShardsTag{}, *this, settings)}; +} std::unique_ptr Cluster::getClusterWithSingleShard(size_t index) const { - return std::unique_ptr{ new Cluster(*this, {index}) }; + return std::unique_ptr{ new Cluster(SubclusterTag{}, *this, {index}) }; } std::unique_ptr Cluster::getClusterWithMultipleShards(const std::vector & indices) const { - return std::unique_ptr{ new Cluster(*this, indices) }; + return std::unique_ptr{ new Cluster(SubclusterTag{}, *this, indices) }; } -Cluster::Cluster(const Cluster & from, const std::vector & indices) +Cluster::Cluster(Cluster::ReplicasAsShardsTag, const Cluster & from, const Settings & settings) + : shards_info{}, addresses_with_failover{} +{ + if (from.addresses_with_failover.empty()) + throw Exception("Cluster is empty", ErrorCodes::LOGICAL_ERROR); + + std::set> unique_hosts; + for (size_t shard_index : ext::range(0, from.shards_info.size())) + { + const auto & replicas = from.addresses_with_failover[shard_index]; + for (const auto & address : replicas) + { + if (!unique_hosts.emplace(address.host_name, address.port).second) + continue; /// Duplicate host, skip. 
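Stepping back to the BloomFilterHash rewrite above: the flag-driven if/else in hashWithField becomes a set of small helpers dispatched per concrete type, so every numeric value is first converted to its column's exact width and only then bit-cast and hashed, which keeps the sign of negative values where it belongs. A simplified sketch of the numeric path (the mixer below is a stand-in, not ClickHouse's intHash64, and little-endian layout is assumed):

#include <cstdint>
#include <cstring>
#include <iostream>

/// Stand-in 64-bit mixer (ClickHouse has its own intHash64).
uint64_t intHash64(uint64_t x)
{
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/// Zero-extend the value's bytes into a UInt64, in the spirit of ext::bit_cast.
template <typename T>
uint64_t bitCastToUInt64(T value)
{
    static_assert(sizeof(T) <= sizeof(uint64_t), "type wider than 64 bits");
    uint64_t out = 0;
    std::memcpy(&out, &value, sizeof(value));
    return out;
}

/// Convert the generic field value to the column's concrete type before
/// hashing, so Int8(-1) hashes the single byte 0xFF rather than a
/// sign-extended 64-bit pattern.
template <typename FieldType>
uint64_t getNumberTypeHash(int64_t raw_field_value)
{
    return intHash64(bitCastToUInt64(FieldType(raw_field_value)));
}

int main()
{
    std::cout << std::hex
              << getNumberTypeHash<int8_t>(-1) << "\n"
              << getNumberTypeHash<int64_t>(-1) << "\n";   /// different hashes
}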
+ + ShardInfo info; + if (address.is_local) + info.local_addresses.push_back(address); + + ConnectionPoolPtr pool = std::make_shared( + settings.distributed_connections_pool_size, + address.host_name, + address.port, + address.default_database, + address.user, + address.password, + "server", + address.compression, + address.secure); + + info.pool = std::make_shared(ConnectionPoolPtrs{pool}, settings.load_balancing); + info.per_replica_pools = {std::move(pool)}; + + addresses_with_failover.emplace_back(Addresses{address}); + shards_info.emplace_back(std::move(info)); + } + } + + initMisc(); +} + + +Cluster::Cluster(Cluster::SubclusterTag, const Cluster & from, const std::vector & indices) : shards_info{} { for (size_t index : indices) diff --git a/dbms/src/Interpreters/Cluster.h b/dbms/src/Interpreters/Cluster.h index e778c9bcf6f..ef12f9fe78f 100644 --- a/dbms/src/Interpreters/Cluster.h +++ b/dbms/src/Interpreters/Cluster.h @@ -26,7 +26,7 @@ public: const String & username, const String & password, UInt16 clickhouse_port, bool treat_local_as_remote, bool secure = false); - Cluster(const Cluster &) = delete; + Cluster(const Cluster &)= delete; Cluster & operator=(const Cluster &) = delete; /// is used to set a limit on the size of the timeout @@ -148,6 +148,9 @@ public: /// Get a subcluster consisting of one or multiple shards - indexes by count (from 0) of the shard of this cluster. std::unique_ptr getClusterWithMultipleShards(const std::vector & indices) const; + /// Get a new Cluster that contains all servers (all shards with all replicas) from existing cluster as independent shards. + std::unique_ptr getClusterWithReplicasAsShards(const Settings & settings) const; + private: using SlotToShard = std::vector; SlotToShard slot_to_shard; @@ -159,7 +162,12 @@ private: void initMisc(); /// For getClusterWithMultipleShards implementation. - Cluster(const Cluster & from, const std::vector & indices); + struct SubclusterTag {}; + Cluster(SubclusterTag, const Cluster & from, const std::vector & indices); + + /// For getClusterWithReplicasAsShards implementation + struct ReplicasAsShardsTag {}; + Cluster(ReplicasAsShardsTag, const Cluster & from, const Settings & settings); String hash_of_addresses; /// Description of the cluster shards. diff --git a/dbms/src/Interpreters/Context.cpp b/dbms/src/Interpreters/Context.cpp index 69ceb24e570..724cdd10f69 100644 --- a/dbms/src/Interpreters/Context.cpp +++ b/dbms/src/Interpreters/Context.cpp @@ -111,7 +111,7 @@ struct ContextShared mutable std::mutex embedded_dictionaries_mutex; mutable std::mutex external_dictionaries_mutex; mutable std::mutex external_models_mutex; - /// Separate mutex for re-initialization of zookeer session. This operation could take a long time and must not interfere with another operations. + /// Separate mutex for re-initialization of zookeeper session. This operation could take a long time and must not interfere with another operations. mutable std::mutex zookeeper_mutex; mutable zkutil::ZooKeeperPtr zookeeper; /// Client for ZooKeeper. 
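In Cluster.h above, the private transforming constructors are disambiguated with empty tag structs (SubclusterTag, ReplicasAsShardsTag): the copy constructor is deleted, each derived cluster is built by a private tagged constructor, and the tag names the construction mode at every call site. A minimal illustration of the idiom with invented members:

#include <cstddef>
#include <memory>
#include <vector>

class Cluster
{
public:
    Cluster() = default;
    Cluster(const Cluster &) = delete;            /// Cluster is not copyable...
    Cluster & operator=(const Cluster &) = delete;

    /// ...so derived clusters are handed out through unique_ptr, and the
    /// private constructors are reached via tag dispatch.
    std::unique_ptr<Cluster> getClusterWithSingleShard(size_t index) const
    {
        return std::unique_ptr<Cluster>(new Cluster(SubclusterTag{}, *this, {index}));
    }

    std::unique_ptr<Cluster> getClusterWithReplicasAsShards() const
    {
        return std::unique_ptr<Cluster>(new Cluster(ReplicasAsShardsTag{}, *this));
    }

    size_t num_shards = 1;

private:
    /// Empty tag types name the construction mode at the call site.
    struct SubclusterTag {};
    struct ReplicasAsShardsTag {};

    Cluster(SubclusterTag, const Cluster &, const std::vector<size_t> & indices)
        : num_shards(indices.size()) {}

    Cluster(ReplicasAsShardsTag, const Cluster & from)
        : num_shards(from.num_shards) {}
};

int main()
{
    Cluster c;
    return c.getClusterWithSingleShard(0)->num_shards == 1 ? 0 : 1;
}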
@@ -191,7 +191,7 @@ struct ContextShared /// Clusters for distributed tables /// Initialized on demand (on distributed storages initialization) since Settings should be initialized std::unique_ptr clusters; - ConfigurationPtr clusters_config; /// Soteres updated configs + ConfigurationPtr clusters_config; /// Stores updated configs mutable std::mutex clusters_mutex; /// Guards clusters and clusters_config #if USE_EMBEDDED_COMPILER @@ -922,21 +922,21 @@ StoragePtr Context::tryGetExternalTable(const String & table_name) const StoragePtr Context::getTable(const String & database_name, const String & table_name) const { - Exception exc; + std::optional exc; auto res = getTableImpl(database_name, table_name, &exc); if (!res) - throw exc; + throw *exc; return res; } StoragePtr Context::tryGetTable(const String & database_name, const String & table_name) const { - return getTableImpl(database_name, table_name, nullptr); + return getTableImpl(database_name, table_name, {}); } -StoragePtr Context::getTableImpl(const String & database_name, const String & table_name, Exception * exception) const +StoragePtr Context::getTableImpl(const String & database_name, const String & table_name, std::optional * exception) const { String db; DatabasePtr database; @@ -958,7 +958,7 @@ StoragePtr Context::getTableImpl(const String & database_name, const String & ta if (shared->databases.end() == it) { if (exception) - *exception = Exception("Database " + backQuoteIfNeed(db) + " doesn't exist", ErrorCodes::UNKNOWN_DATABASE); + exception->emplace("Database " + backQuoteIfNeed(db) + " doesn't exist", ErrorCodes::UNKNOWN_DATABASE); return {}; } @@ -969,7 +969,7 @@ StoragePtr Context::getTableImpl(const String & database_name, const String & ta if (!table) { if (exception) - *exception = Exception("Table " + backQuoteIfNeed(db) + "." + backQuoteIfNeed(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE); + exception->emplace("Table " + backQuoteIfNeed(db) + "." 
+ backQuoteIfNeed(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE); return {}; } diff --git a/dbms/src/Interpreters/Context.h b/dbms/src/Interpreters/Context.h index b0a4b5bb580..e1466713088 100644 --- a/dbms/src/Interpreters/Context.h +++ b/dbms/src/Interpreters/Context.h @@ -589,7 +589,7 @@ private: EmbeddedDictionaries & getEmbeddedDictionariesImpl(bool throw_on_error) const; - StoragePtr getTableImpl(const String & database_name, const String & table_name, Exception * exception) const; + StoragePtr getTableImpl(const String & database_name, const String & table_name, std::optional * exception) const; SessionKey getSessionKey(const String & session_id) const; diff --git a/dbms/src/Interpreters/CrossToInnerJoinVisitor.cpp b/dbms/src/Interpreters/CrossToInnerJoinVisitor.cpp index 61e57c4d490..596819dcde9 100644 --- a/dbms/src/Interpreters/CrossToInnerJoinVisitor.cpp +++ b/dbms/src/Interpreters/CrossToInnerJoinVisitor.cpp @@ -105,7 +105,7 @@ public: if (node.name == NameAnd::name) { if (!node.arguments || node.arguments->children.empty()) - throw Exception("Logical error: function requires argiment", ErrorCodes::LOGICAL_ERROR); + throw Exception("Logical error: function requires argument", ErrorCodes::LOGICAL_ERROR); for (auto & child : node.arguments->children) { diff --git a/dbms/src/Interpreters/DDLWorker.cpp b/dbms/src/Interpreters/DDLWorker.cpp index 3077290e3fe..32004e4e564 100644 --- a/dbms/src/Interpreters/DDLWorker.cpp +++ b/dbms/src/Interpreters/DDLWorker.cpp @@ -238,7 +238,7 @@ DDLWorker::DDLWorker(const std::string & zk_root_dir, Context & context_, const if (context.getSettingsRef().readonly) { LOG_WARNING(log, "Distributed DDL worker is run with readonly settings, it will not be able to execute DDL queries" - << " Set apropriate system_profile or distributed_ddl.profile to fix this."); + << " Set appropriate system_profile or distributed_ddl.profile to fix this."); } host_fqdn = getFQDNOrHostName(); @@ -825,7 +825,7 @@ void DDLWorker::cleanupQueue(Int64 current_time_seconds, const ZooKeeperPtr & zo if (!zookeeper->exists(node_path, &stat)) continue; - /// Delete node if its lifetmie is expired (according to task_max_lifetime parameter) + /// Delete node if its lifetime is expired (according to task_max_lifetime parameter) constexpr UInt64 zookeeper_time_resolution = 1000; Int64 zookeeper_time_seconds = stat.ctime / zookeeper_time_resolution; bool node_lifetime_is_expired = zookeeper_time_seconds + task_max_lifetime < current_time_seconds; diff --git a/dbms/src/Interpreters/DatabaseAndTableWithAlias.h b/dbms/src/Interpreters/DatabaseAndTableWithAlias.h index 3567a351b14..ad1f747b6fd 100644 --- a/dbms/src/Interpreters/DatabaseAndTableWithAlias.h +++ b/dbms/src/Interpreters/DatabaseAndTableWithAlias.h @@ -72,4 +72,6 @@ private: std::vector getDatabaseAndTables(const ASTSelectQuery & select_query, const String & current_database); std::optional getDatabaseAndTable(const ASTSelectQuery & select, size_t table_number); +using TablesWithColumnNames = std::vector; + } diff --git a/dbms/src/Interpreters/ExpressionActions.cpp b/dbms/src/Interpreters/ExpressionActions.cpp index b9beb1569fb..793dc7a8a3d 100644 --- a/dbms/src/Interpreters/ExpressionActions.cpp +++ b/dbms/src/Interpreters/ExpressionActions.cpp @@ -954,7 +954,7 @@ void ExpressionActions::finalize(const Names & output_columns) /// remote table (doesn't know anything about it). 
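The Context::getTable change above swaps the Exception out-parameter for std::optional<Exception> *: the callee emplaces the error in place only when lookup fails, instead of assigning over a default-constructed Exception, and the caller throws the contained value. A compact sketch with a stand-in storage type and an illustrative error code:

#include <memory>
#include <optional>
#include <stdexcept>
#include <string>

struct Exception : std::runtime_error
{
    Exception(const std::string & msg, int code_) : std::runtime_error(msg), code(code_) {}
    int code;
};

struct IStorage {};
using StoragePtr = std::shared_ptr<IStorage>;

/// Failure is reported by emplacing into *exception (when provided); nothing
/// is constructed on the success path. 60 mimics an UNKNOWN_TABLE-style code.
StoragePtr getTableImpl(const std::string & table_name, std::optional<Exception> * exception)
{
    if (table_name != "known_table")
    {
        if (exception)
            exception->emplace("Table " + table_name + " doesn't exist.", 60);
        return {};
    }
    return std::make_shared<IStorage>();
}

StoragePtr getTable(const std::string & table_name)
{
    std::optional<Exception> exc;
    auto res = getTableImpl(table_name, &exc);
    if (!res)
        throw *exc;   /// only ever constructed on failure
    return res;
}

StoragePtr tryGetTable(const std::string & table_name)
{
    return getTableImpl(table_name, nullptr);
}

int main()
{
    return (tryGetTable("missing") == nullptr && getTable("known_table") != nullptr) ? 0 : 1;
}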
/// /// If we have combination of two previous cases, our heuristic from (1) can choose absolutely different columns, - /// so generated streams with these actions will have different headers. To avoid this we addionaly rename our "redundant" column + /// so generated streams with these actions will have different headers. To avoid this we additionally rename our "redundant" column /// to DUMMY_COLUMN_NAME with help of COPY_COLUMN action and consequent remove of original column. /// It doesn't affect any logic, but all streams will have same "redundant" column in header called "_dummy". diff --git a/dbms/src/Interpreters/ExpressionAnalyzer.cpp b/dbms/src/Interpreters/ExpressionAnalyzer.cpp index 59dff858cf0..950edba96f2 100644 --- a/dbms/src/Interpreters/ExpressionAnalyzer.cpp +++ b/dbms/src/Interpreters/ExpressionAnalyzer.cpp @@ -26,7 +26,6 @@ #include #include #include -#include #include #include #include @@ -287,7 +286,7 @@ SetPtr SelectQueryExpressionAnalyzer::isPlainStorageSetInSubquery(const ASTPtr & } -/// Perfomance optimisation for IN() if storage supports it. +/// Performance optimisation for IN() if storage supports it. void SelectQueryExpressionAnalyzer::makeSetsForIndex(const ASTPtr & node) { if (!node || !storage() || !storage()->supportsIndexForIn()) diff --git a/dbms/src/Interpreters/ExternalDictionariesLoader.cpp b/dbms/src/Interpreters/ExternalDictionariesLoader.cpp index d5f995a8db3..ec71abd4a51 100644 --- a/dbms/src/Interpreters/ExternalDictionariesLoader.cpp +++ b/dbms/src/Interpreters/ExternalDictionariesLoader.cpp @@ -19,7 +19,7 @@ ExternalLoader::LoadablePtr ExternalDictionariesLoader::create( const std::string & name, const Poco::Util::AbstractConfiguration & config, const std::string & key_in_config, const std::string & repository_name) const { - /// For dictionaries from databases (created with DDL qureies) we have to perform + /// For dictionaries from databases (created with DDL queries) we have to perform /// additional checks, so we identify them here. 
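The new ExtractExpressionInfoVisitor files in the hunks below fold what used to be two single-purpose visitors (ExtractFunctionDataVisitor and FindIdentifierBestTableVisitor, both deleted further down) into one matcher that gathers, in a single pass, whether an expression contains arrayJoin, an aggregate, or a stateful function, plus which tables its identifiers reference; the neighbor() function above now reports isStateful() = true precisely so this analysis can refuse to move such expressions around. A generic single-pass visitor over a toy AST, sketching the same accumulate-into-Data shape (the function-registry checks are faked with name comparisons):

#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct Node
{
    std::string name;                            /// function or identifier name
    bool is_function = false;
    std::vector<std::shared_ptr<Node>> children;
};

struct ExpressionInfo
{
    bool is_array_join = false;
    bool is_stateful_function = false;
    bool is_aggregate_function = false;
};

/// One recursive pass accumulates every property into the shared data
/// struct, mirroring ExpressionInfoMatcher::visit.
void visit(const Node & node, ExpressionInfo & data)
{
    if (node.is_function)
    {
        if (node.name == "arrayJoin")
            data.is_array_join = true;
        else if (node.name == "sum" || node.name == "count")   /// pretend registry lookup
            data.is_aggregate_function = true;
        else if (node.name == "neighbor")                      /// pretend isStateful() check
            data.is_stateful_function = true;
    }
    for (const auto & child : node.children)
        visit(*child, data);
}

int main()
{
    Node expr{"plus", true, {std::make_shared<Node>(Node{"neighbor", true, {}}),
                             std::make_shared<Node>(Node{"x", false, {}})}};
    ExpressionInfo info;
    visit(expr, info);
    std::cout << "stateful: " << info.is_stateful_function << "\n";
}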
bool dictionary_from_database = !repository_name.empty(); return DictionaryFactory::instance().create(name, config, key_in_config, context, dictionary_from_database); diff --git a/dbms/src/Interpreters/ExternalLoader.cpp b/dbms/src/Interpreters/ExternalLoader.cpp index 4b907b521e9..e9cfe602437 100644 --- a/dbms/src/Interpreters/ExternalLoader.cpp +++ b/dbms/src/Interpreters/ExternalLoader.cpp @@ -609,7 +609,7 @@ public: { try { - /// Maybe alredy true, if we have an exception + /// Maybe already true, if we have an exception if (!should_update_flag) should_update_flag = object->isModified(); } diff --git a/dbms/src/Interpreters/ExtractExpressionInfoVisitor.cpp b/dbms/src/Interpreters/ExtractExpressionInfoVisitor.cpp new file mode 100644 index 00000000000..1240b6a09d6 --- /dev/null +++ b/dbms/src/Interpreters/ExtractExpressionInfoVisitor.cpp @@ -0,0 +1,79 @@ +#include +#include +#include +#include +#include + + +namespace DB +{ + +void ExpressionInfoMatcher::visit(const ASTPtr & ast, Data & data) +{ + if (const auto * function = ast->as()) + visit(*function, ast, data); + else if (const auto * identifier = ast->as()) + visit(*identifier, ast, data); +} + +void ExpressionInfoMatcher::visit(const ASTFunction & ast_function, const ASTPtr &, Data & data) +{ + if (ast_function.name == "arrayJoin") + data.is_array_join = true; + else if (AggregateFunctionFactory::instance().isAggregateFunctionName(ast_function.name)) + data.is_aggregate_function = true; + else + { + const auto & function = FunctionFactory::instance().tryGet(ast_function.name, data.context); + + /// Skip lambda, tuple and other special functions + if (function && function->isStateful()) + data.is_stateful_function = true; + } +} + +void ExpressionInfoMatcher::visit(const ASTIdentifier & identifier, const ASTPtr &, Data & data) +{ + if (!identifier.compound()) + { + for (size_t index = 0; index < data.tables.size(); ++index) + { + const auto & columns = data.tables[index].columns; + + // TODO: make sure no collision ever happens + if (std::find(columns.begin(), columns.end(), identifier.name) != columns.end()) + { + data.unique_reference_tables_pos.emplace(index); + break; + } + } + } + else + { + size_t best_table_pos = 0; + if (IdentifierSemantic::chooseTable(identifier, data.tables, best_table_pos)) + data.unique_reference_tables_pos.emplace(best_table_pos); + } +} + +bool ExpressionInfoMatcher::needChildVisit(const ASTPtr & node, const ASTPtr &) +{ + return !node->as(); +} + +bool hasStatefulFunction(const ASTPtr & node, const Context & context) +{ + for (const auto & select_expression : node->children) + { + ExpressionInfoVisitor::Data expression_info{.context = context, .tables = {}}; + ExpressionInfoVisitor(expression_info).visit(select_expression); + + if (expression_info.is_stateful_function) + return true; + } + + return false; +} + +} + diff --git a/dbms/src/Interpreters/ExtractExpressionInfoVisitor.h b/dbms/src/Interpreters/ExtractExpressionInfoVisitor.h new file mode 100644 index 00000000000..65d23057e52 --- /dev/null +++ b/dbms/src/Interpreters/ExtractExpressionInfoVisitor.h @@ -0,0 +1,40 @@ +#pragma once + +#include +#include +#include +#include +#include + +namespace DB +{ + +class Context; + +struct ExpressionInfoMatcher +{ + struct Data + { + const Context & context; + const std::vector & tables; + + bool is_array_join = false; + bool is_stateful_function = false; + bool is_aggregate_function = false; + std::unordered_set unique_reference_tables_pos = {}; + }; + + static void visit(const ASTPtr & ast, Data & 
data); + + static bool needChildVisit(const ASTPtr & node, const ASTPtr &); + + static void visit(const ASTFunction & ast_function, const ASTPtr &, Data & data); + + static void visit(const ASTIdentifier & identifier, const ASTPtr &, Data & data); +}; + +using ExpressionInfoVisitor = ConstInDepthNodeVisitor; + +bool hasStatefulFunction(const ASTPtr & node, const Context & context); + +} diff --git a/dbms/src/Interpreters/ExtractFunctionDataVisitor.cpp b/dbms/src/Interpreters/ExtractFunctionDataVisitor.cpp deleted file mode 100644 index d7a0d9001d5..00000000000 --- a/dbms/src/Interpreters/ExtractFunctionDataVisitor.cpp +++ /dev/null @@ -1,16 +0,0 @@ -#include -#include - - -namespace DB -{ - -void ExtractFunctionData::visit(ASTFunction & function, ASTPtr &) -{ - if (AggregateFunctionFactory::instance().isAggregateFunctionName(function.name)) - aggregate_functions.emplace_back(&function); - else - functions.emplace_back(&function); -} - -} diff --git a/dbms/src/Interpreters/ExtractFunctionDataVisitor.h b/dbms/src/Interpreters/ExtractFunctionDataVisitor.h deleted file mode 100644 index ed3dbb868c4..00000000000 --- a/dbms/src/Interpreters/ExtractFunctionDataVisitor.h +++ /dev/null @@ -1,25 +0,0 @@ -#pragma once - -#include -#include -#include -#include -#include - -namespace DB -{ - -struct ExtractFunctionData -{ - using TypeToVisit = ASTFunction; - - std::vector functions; - std::vector aggregate_functions; - - void visit(ASTFunction & identifier, ASTPtr &); -}; - -using ExtractFunctionMatcher = OneTypeMatcher; -using ExtractFunctionVisitor = InDepthNodeVisitor; - -} diff --git a/dbms/src/Interpreters/FindIdentifierBestTableVisitor.cpp b/dbms/src/Interpreters/FindIdentifierBestTableVisitor.cpp deleted file mode 100644 index 56897ec15c7..00000000000 --- a/dbms/src/Interpreters/FindIdentifierBestTableVisitor.cpp +++ /dev/null @@ -1,40 +0,0 @@ -#include -#include - - -namespace DB -{ - -FindIdentifierBestTableData::FindIdentifierBestTableData(const std::vector & tables_) - : tables(tables_) -{ -} - -void FindIdentifierBestTableData::visit(ASTIdentifier & identifier, ASTPtr &) -{ - const DatabaseAndTableWithAlias * best_table = nullptr; - - if (!identifier.compound()) - { - for (const auto & table_names : tables) - { - auto & columns = table_names.columns; - if (std::find(columns.begin(), columns.end(), identifier.name) != columns.end()) - { - // TODO: make sure no collision ever happens - if (!best_table) - best_table = &table_names.table; - } - } - } - else - { - size_t best_table_pos = 0; - if (IdentifierSemantic::chooseTable(identifier, tables, best_table_pos)) - best_table = &tables[best_table_pos].table; - } - - identifier_table.emplace_back(&identifier, best_table); -} - -} diff --git a/dbms/src/Interpreters/FindIdentifierBestTableVisitor.h b/dbms/src/Interpreters/FindIdentifierBestTableVisitor.h deleted file mode 100644 index 498ee60ab0b..00000000000 --- a/dbms/src/Interpreters/FindIdentifierBestTableVisitor.h +++ /dev/null @@ -1,27 +0,0 @@ -#pragma once - -#include -#include -#include -#include - -namespace DB -{ - -struct FindIdentifierBestTableData -{ - using TypeToVisit = ASTIdentifier; - using IdentifierWithTable = std::pair; - - const std::vector & tables; - std::vector identifier_table; - - FindIdentifierBestTableData(const std::vector & tables_); - - void visit(ASTIdentifier & identifier, ASTPtr &); -}; - -using FindIdentifierBestTableMatcher = OneTypeMatcher; -using FindIdentifierBestTableVisitor = InDepthNodeVisitor; - -} diff --git a/dbms/src/Interpreters/InDepthNodeVisitor.h 
b/dbms/src/Interpreters/InDepthNodeVisitor.h index 18b84b11b24..7bb4f5e4d54 100644 --- a/dbms/src/Interpreters/InDepthNodeVisitor.h +++ b/dbms/src/Interpreters/InDepthNodeVisitor.h @@ -59,7 +59,13 @@ public: using Data = Data_; using TypeToVisit = typename Data::TypeToVisit; - static bool needChildVisit(const ASTPtr &, const ASTPtr &) { return visit_children; } + static bool needChildVisit(const ASTPtr & node, const ASTPtr &) + { + if (node && node->as()) + return visit_children; + + return true; + } static void visit(T & ast, Data & data) { diff --git a/dbms/src/Interpreters/InterpreterSelectQuery.cpp b/dbms/src/Interpreters/InterpreterSelectQuery.cpp index f5971d7edbf..b147c5f4887 100644 --- a/dbms/src/Interpreters/InterpreterSelectQuery.cpp +++ b/dbms/src/Interpreters/InterpreterSelectQuery.cpp @@ -503,28 +503,31 @@ Block InterpreterSelectQuery::getSampleBlockImpl() /// Do all AST changes here, because actions from analysis_result will be used later in readImpl. - /// PREWHERE optimization. - /// Turn off, if the table filter (row-level security) is applied. - if (storage && !context->getRowPolicy()->getCondition(storage->getDatabaseName(), storage->getTableName(), RowPolicy::SELECT_FILTER)) + if (storage) { query_analyzer->makeSetsForIndex(query.where()); query_analyzer->makeSetsForIndex(query.prewhere()); - auto optimize_prewhere = [&](auto & merge_tree) + /// PREWHERE optimization. + /// Turn off, if the table filter (row-level security) is applied. + if (!context->getRowPolicy()->getCondition(storage->getDatabaseName(), storage->getTableName(), RowPolicy::SELECT_FILTER)) { - SelectQueryInfo current_info; - current_info.query = query_ptr; - current_info.syntax_analyzer_result = syntax_analyzer_result; - current_info.sets = query_analyzer->getPreparedSets(); + auto optimize_prewhere = [&](auto & merge_tree) + { + SelectQueryInfo current_info; + current_info.query = query_ptr; + current_info.syntax_analyzer_result = syntax_analyzer_result; + current_info.sets = query_analyzer->getPreparedSets(); - /// Try transferring some condition from WHERE to PREWHERE if enabled and viable - if (settings.optimize_move_to_prewhere && query.where() && !query.prewhere() && !query.final()) - MergeTreeWhereOptimizer{current_info, *context, merge_tree, - syntax_analyzer_result->requiredSourceColumns(), log}; - }; + /// Try transferring some condition from WHERE to PREWHERE if enabled and viable + if (settings.optimize_move_to_prewhere && query.where() && !query.prewhere() && !query.final()) + MergeTreeWhereOptimizer{current_info, *context, merge_tree, + syntax_analyzer_result->requiredSourceColumns(), log}; + }; - if (const auto * merge_tree_data = dynamic_cast(storage.get())) - optimize_prewhere(*merge_tree_data); + if (const auto * merge_tree_data = dynamic_cast(storage.get())) + optimize_prewhere(*merge_tree_data); + } } if (storage && !options.only_analyze) @@ -1180,7 +1183,6 @@ void InterpreterSelectQuery::executeImpl(TPipeline & pipeline, const BlockInputS if (expressions.second_stage) { bool need_second_distinct_pass = false; - bool need_merge_streams = false; if (expressions.need_aggregate) { @@ -1216,7 +1218,7 @@ void InterpreterSelectQuery::executeImpl(TPipeline & pipeline, const BlockInputS } else if (query.group_by_with_totals || query.group_by_with_rollup || query.group_by_with_cube) - throw Exception("WITH TOTALS, ROLLUP or CUBE are not supported without aggregation", ErrorCodes::LOGICAL_ERROR); + throw Exception("WITH TOTALS, ROLLUP or CUBE are not supported without aggregation", 
ErrorCodes::NOT_IMPLEMENTED); need_second_distinct_pass = query.distinct && pipeline.hasMixedStreams(); @@ -1241,13 +1243,11 @@ void InterpreterSelectQuery::executeImpl(TPipeline & pipeline, const BlockInputS executePreLimit(pipeline); } - if (need_second_distinct_pass - || query.limitLength() - || query.limitBy() - || pipeline.hasDelayedStream()) - { - need_merge_streams = true; - } + bool need_merge_streams = need_second_distinct_pass || query.limitLength() || query.limitBy(); + + if constexpr (!pipeline_with_processors) + if (pipeline.hasDelayedStream()) + need_merge_streams = true; if (need_merge_streams) { @@ -1933,7 +1933,7 @@ void InterpreterSelectQuery::executeAggregation(QueryPipeline & pipeline, const * 1. Parallel aggregation is done, and the results should be merged in parallel. * 2. An aggregation is done with store of temporary data on the disk, and they need to be merged in a memory efficient way. */ - bool allow_to_use_two_level_group_by = pipeline.getNumMainStreams() > 1 || settings.max_bytes_before_external_group_by != 0; + bool allow_to_use_two_level_group_by = pipeline.getNumStreams() > 1 || settings.max_bytes_before_external_group_by != 0; Aggregator::Params params(header_before_aggregation, keys, aggregates, overflow_row, settings.max_rows_to_group_by, settings.group_by_overflow_mode, @@ -1947,12 +1947,12 @@ void InterpreterSelectQuery::executeAggregation(QueryPipeline & pipeline, const pipeline.dropTotalsIfHas(); /// If there are several sources, then we perform parallel aggregation - if (pipeline.getNumMainStreams() > 1) + if (pipeline.getNumStreams() > 1) { /// Add resize transform to uniformly distribute data between aggregating streams. - pipeline.resize(pipeline.getNumMainStreams(), true); + pipeline.resize(pipeline.getNumStreams(), true); - auto many_data = std::make_shared(pipeline.getNumMainStreams()); + auto many_data = std::make_shared(pipeline.getNumStreams()); auto merge_threads = settings.aggregation_memory_efficient_merge_threads ? static_cast(settings.aggregation_memory_efficient_merge_threads) : static_cast(settings.max_threads); @@ -2351,9 +2351,6 @@ void InterpreterSelectQuery::executeOrder(QueryPipeline & pipeline, InputSorting return std::make_shared(header, output_order_descr, limit, do_count_rows); }); - /// If there are several streams, we merge them into one - pipeline.resize(1); - /// Merge the sorted blocks. 
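
// A self-contained sketch of the sort-then-merge shape this hunk moves to
// (plain ints stand in for blocks, vectors for streams; not the ClickHouse
// API): every stream is sorted independently, then a single k-way merge
// combines the sorted streams, which is what executeMergeSorted does here.
#include <algorithm>
#include <functional>
#include <iostream>
#include <queue>
#include <tuple>
#include <vector>

std::vector<int> mergeSortedStreams(std::vector<std::vector<int>> streams)
{
    // Sort every stream on its own, as the per-stream sorting transform would.
    for (auto & s : streams)
        std::sort(s.begin(), s.end());

    // Min-heap of (value, stream index, offset) merges the streams in one pass.
    using Item = std::tuple<int, size_t, size_t>;
    std::priority_queue<Item, std::vector<Item>, std::greater<>> heap;
    for (size_t i = 0; i < streams.size(); ++i)
        if (!streams[i].empty())
            heap.emplace(streams[i][0], i, 0);

    std::vector<int> result;
    while (!heap.empty())
    {
        auto [value, stream, offset] = heap.top();
        heap.pop();
        result.push_back(value);
        if (offset + 1 < streams[stream].size())
            heap.emplace(streams[stream][offset + 1], stream, offset + 1);
    }
    return result;
}

int main()
{
    for (int x : mergeSortedStreams({{3, 1}, {4, 2}, {5, 0}}))
        std::cout << x << ' ';   // prints: 0 1 2 3 4 5
}
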
pipeline.addSimpleTransform([&](const Block & header, QueryPipeline::StreamType stream_type) -> ProcessorPtr { @@ -2362,11 +2359,12 @@ void InterpreterSelectQuery::executeOrder(QueryPipeline & pipeline, InputSorting return std::make_shared( header, output_order_descr, settings.max_block_size, limit, - settings.max_bytes_before_remerge_sort, + settings.max_bytes_before_remerge_sort / pipeline.getNumStreams(), settings.max_bytes_before_external_sort, context->getTemporaryPath(), settings.min_free_disk_space_for_temporary_data); }); - pipeline.enableQuotaForCurrentStreams(); + /// If there are several streams, we merge them into one + executeMergeSorted(pipeline, output_order_descr, limit); } @@ -2807,11 +2805,7 @@ void InterpreterSelectQuery::executeSubqueriesInSetsAndJoins(Pipeline & pipeline void InterpreterSelectQuery::executeSubqueriesInSetsAndJoins(QueryPipeline & pipeline, SubqueriesForSets & subqueries_for_sets) { if (query_info.input_sorting_info) - { - if (pipeline.hasDelayedStream()) - throw Exception("Using read in order optimization, but has delayed stream in pipeline", ErrorCodes::LOGICAL_ERROR); executeMergeSorted(pipeline, query_info.input_sorting_info->order_key_prefix_descr, 0); - } const Settings & settings = context->getSettingsRef(); @@ -2828,7 +2822,7 @@ void InterpreterSelectQuery::unifyStreams(Pipeline & pipeline, Block header) { /// Unify streams in case they have different headers. - /// TODO: remove previos addition of _dummy column. + /// TODO: remove previous addition of _dummy column. if (header.columns() > 1 && header.has("_dummy")) header.erase("_dummy"); diff --git a/dbms/src/Interpreters/Join.cpp b/dbms/src/Interpreters/Join.cpp index 793b74ff890..2cce5d7f51d 100644 --- a/dbms/src/Interpreters/Join.cpp +++ b/dbms/src/Interpreters/Join.cpp @@ -306,7 +306,7 @@ size_t Join::getTotalByteCount() const void Join::setSampleBlock(const Block & block) { - /// You have to restore this lock if you call the fuction outside of ctor. + /// You have to restore this lock if you call the function outside of ctor. 
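
// A toy sketch of the "use the key only once" bookkeeping that the
// joinRightColumns hunks just below rely on (an int -> string map stands in
// for the hash join table; setUsedOnce is modelled by a plain bool):
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

int main()
{
    struct Mapped { std::string value; bool used = false; };
    std::unordered_map<int, Mapped> right = {{1, {"a"}}, {2, {"b"}}};

    std::vector<int> left_keys = {1, 1, 2, 3};
    for (int key : left_keys)
    {
        auto it = right.find(key);
        if (it == right.end())
            continue;               // no match: row filtered out
        if (it->second.used)
            continue;               // ANY semantics: this key already matched once
        it->second.used = true;     // the "setUsedOnce" step
        std::cout << key << " -> " << it->second.value << '\n';
    }
    // Prints only "1 -> a" and "2 -> b": the second row with key 1 is skipped.
}
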
//std::unique_lock lock(rwlock); LOG_DEBUG(log, "setSampleBlock: " << block.dumpStructure()); @@ -778,7 +778,7 @@ NO_INLINE IColumn::Filter joinRightColumns(const Map & map, AddedColumns & added } else if constexpr ((is_any_join || is_semi_join) && right) { - /// Use first appered left key + it needs left columns replication + /// Use first appeared left key + it needs left columns replication if (mapped.setUsedOnce()) { setUsed(filter, i); @@ -787,7 +787,7 @@ NO_INLINE IColumn::Filter joinRightColumns(const Map & map, AddedColumns & added } else if constexpr (is_any_join && KIND == ASTTableJoin::Kind::Inner) { - /// Use first appered left key only + /// Use first appeared left key only if (mapped.setUsedOnce()) { setUsed(filter, i); diff --git a/dbms/src/Interpreters/MergeJoin.cpp b/dbms/src/Interpreters/MergeJoin.cpp index 45b9ac86bf6..f301de17bc5 100644 --- a/dbms/src/Interpreters/MergeJoin.cpp +++ b/dbms/src/Interpreters/MergeJoin.cpp @@ -527,7 +527,7 @@ void MergeJoin::mergeFlushedRightBlocks() lsm->merge(callback); flushed_right_blocks.swap(lsm->sorted_files.front()); - /// Get memory limit or aproximate it from row limit and bytes per row factor + /// Get memory limit or approximate it from row limit and bytes per row factor UInt64 memory_limit = size_limits.max_bytes; UInt64 rows_limit = size_limits.max_rows; if (!memory_limit && rows_limit) diff --git a/dbms/src/Interpreters/MergeJoin.h b/dbms/src/Interpreters/MergeJoin.h index 9c844dcfd66..960ca31153d 100644 --- a/dbms/src/Interpreters/MergeJoin.h +++ b/dbms/src/Interpreters/MergeJoin.h @@ -56,7 +56,7 @@ public: private: /// There're two size limits for right-hand table: max_rows_in_join, max_bytes_in_join. - /// max_bytes is prefered. If it isn't set we aproximate it as (max_rows * bytes/row). + /// max_bytes is preferred. If it isn't set we approximate it as (max_rows * bytes/row). struct BlockByteWeight { size_t operator()(const Block & block) const { return block.bytes(); } diff --git a/dbms/src/Interpreters/OptimizeIfChains.cpp b/dbms/src/Interpreters/OptimizeIfChains.cpp index d440b204d54..27085fe3e53 100644 --- a/dbms/src/Interpreters/OptimizeIfChains.cpp +++ b/dbms/src/Interpreters/OptimizeIfChains.cpp @@ -68,7 +68,7 @@ ASTs OptimizeIfChainsVisitor::ifChain(const ASTPtr & child) const auto * else_arg = function_args->children[2]->as(); - /// Recursively collect arguments from the innermost if ("head-resursion"). + /// Recursively collect arguments from the innermost if ("head-recursion"). /// Arguments will be returned in reverse order.
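
// A toy sketch of that head recursion (a minimal Node type stands in for
// ClickHouse's AST): collect if(c1, t1, if(c2, t2, e)) from the innermost
// else outwards, then reverse into the flat multiIf argument list.
#include <algorithm>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct Node
{
    std::string name;                         // "if", or a leaf like "c1"
    std::vector<std::shared_ptr<Node>> args;  // for "if": condition, then, else
};

// Returns the multiIf argument list in reverse order: recurse into the
// else branch first, exactly as the head recursion above does.
void collectChain(const std::shared_ptr<Node> & node, std::vector<std::string> & out)
{
    if (node->name == "if" && node->args.size() == 3)
    {
        collectChain(node->args[2], out);   // innermost else first
        out.push_back(node->args[1]->name); // then-branch
        out.push_back(node->args[0]->name); // condition
    }
    else
        out.push_back(node->name);          // final else value
}

int main()
{
    auto leaf = [](std::string s) { return std::make_shared<Node>(Node{std::move(s), {}}); };
    // if(c1, t1, if(c2, t2, e))
    auto chain = std::make_shared<Node>(Node{"if",
        {leaf("c1"), leaf("t1"),
         std::make_shared<Node>(Node{"if", {leaf("c2"), leaf("t2"), leaf("e")}})}});

    std::vector<std::string> args;
    collectChain(chain, args);
    std::reverse(args.begin(), args.end());

    std::cout << "multiIf(";                 // prints: multiIf(c1, t1, c2, t2, e)
    for (size_t i = 0; i < args.size(); ++i)
        std::cout << (i ? ", " : "") << args[i];
    std::cout << ")\n";
}
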
if (else_arg && else_arg->name == "if") diff --git a/dbms/src/Interpreters/PredicateExpressionsOptimizer.cpp b/dbms/src/Interpreters/PredicateExpressionsOptimizer.cpp index 050ee637d18..9927091874c 100644 --- a/dbms/src/Interpreters/PredicateExpressionsOptimizer.cpp +++ b/dbms/src/Interpreters/PredicateExpressionsOptimizer.cpp @@ -1,32 +1,13 @@ -#include - -#include -#include #include -#include -#include -#include + #include #include -#include #include -#include -#include #include -#include -#include -#include -#include #include -#include -#include -#include -#include -#include -#include -#include #include -#include +#include +#include namespace DB @@ -38,155 +19,105 @@ namespace ErrorCodes extern const int UNKNOWN_ELEMENT_IN_AST; } -namespace -{ - -constexpr auto and_function_name = "and"; - -String qualifiedName(ASTIdentifier * identifier, const String & prefix) -{ - if (identifier->isShort()) - return prefix + identifier->getAliasOrColumnName(); - return identifier->getAliasOrColumnName(); -} - -} - PredicateExpressionsOptimizer::PredicateExpressionsOptimizer( - ASTSelectQuery * ast_select_, ExtractedSettings && settings_, const Context & context_) - : ast_select(ast_select_), settings(settings_), context(context_) + const Context & context_, const TablesWithColumnNames & tables_with_columns_, const Settings & settings_) + : context(context_), tables_with_columns(tables_with_columns_), settings(settings_) { } -bool PredicateExpressionsOptimizer::optimize() +bool PredicateExpressionsOptimizer::optimize(ASTSelectQuery & select_query) { - if (!settings.enable_optimize_predicate_expression || !ast_select || !ast_select->tables() || ast_select->tables()->children.empty()) + if (!settings.enable_optimize_predicate_expression) return false; - if (!ast_select->where() && !ast_select->prewhere()) + if (select_query.having() && (!select_query.group_by_with_cube && !select_query.group_by_with_rollup && !select_query.group_by_with_totals)) + tryMovePredicatesFromHavingToWhere(select_query); + + if (!select_query.tables() || select_query.tables()->children.empty()) return false; - if (ast_select->array_join_expression_list()) + if ((!select_query.where() && !select_query.prewhere()) || select_query.array_join_expression_list()) return false; - SubqueriesProjectionColumns all_subquery_projection_columns = getAllSubqueryProjectionColumns(); + const auto & tables_predicates = extractTablesPredicates(select_query.where(), select_query.prewhere()); - bool is_rewrite_subqueries = false; - if (!all_subquery_projection_columns.empty()) - { - is_rewrite_subqueries |= optimizeImpl(ast_select->where(), all_subquery_projection_columns, OptimizeKind::PUSH_TO_WHERE); - is_rewrite_subqueries |= optimizeImpl(ast_select->prewhere(), all_subquery_projection_columns, OptimizeKind::PUSH_TO_PREWHERE); - } + if (!tables_predicates.empty()) + return tryRewritePredicatesToTables(select_query.refTables()->children, tables_predicates); - return is_rewrite_subqueries; + return false; } -bool PredicateExpressionsOptimizer::optimizeImpl( - const ASTPtr & outer_expression, const SubqueriesProjectionColumns & subqueries_projection_columns, OptimizeKind expression_kind) +static ASTs splitConjunctionPredicate(const std::initializer_list & predicates) { - /// split predicate with `and` - std::vector outer_predicate_expressions = splitConjunctionPredicate(outer_expression); + std::vector res; - std::vector table_expressions = getTableExpressions(*ast_select); - std::vector tables_with_columns = 
getDatabaseAndTablesWithColumnNames(table_expressions, context); - - bool is_rewrite_subquery = false; - for (auto & outer_predicate : outer_predicate_expressions) + auto remove_expression_at_index = [&res] (const size_t index) { - if (isArrayJoinFunction(outer_predicate)) + if (index < res.size() - 1) + std::swap(res[index], res.back()); + res.pop_back(); + }; + + for (const auto & predicate : predicates) + { + if (!predicate) continue; - auto outer_predicate_dependencies = getDependenciesAndQualifiers(outer_predicate, tables_with_columns); + res.emplace_back(predicate); - /// TODO: remove origin expression - for (const auto & [subquery, projection_columns] : subqueries_projection_columns) + for (size_t idx = 0; idx < res.size();) { - OptimizeKind optimize_kind = OptimizeKind::NONE; - if (allowPushDown(subquery, outer_predicate, projection_columns, outer_predicate_dependencies, optimize_kind)) + const auto & expression = res.at(idx); + + if (const auto * function = expression->as(); function && function->name == "and") { - if (optimize_kind == OptimizeKind::NONE) - optimize_kind = expression_kind; + for (auto & child : function->arguments->children) + res.emplace_back(child); - ASTPtr inner_predicate = outer_predicate->clone(); - cleanExpressionAlias(inner_predicate); /// clears the alias name contained in the outer predicate + remove_expression_at_index(idx); + continue; + } + ++idx; + } + } - std::vector inner_predicate_dependencies = - getDependenciesAndQualifiers(inner_predicate, tables_with_columns); + return res; +} - setNewAliasesForInnerPredicate(projection_columns, inner_predicate_dependencies); +std::vector PredicateExpressionsOptimizer::extractTablesPredicates(const ASTPtr & where, const ASTPtr & prewhere) +{ + std::vector tables_predicates(tables_with_columns.size()); - switch (optimize_kind) - { - case OptimizeKind::NONE: continue; - case OptimizeKind::PUSH_TO_WHERE: - is_rewrite_subquery |= optimizeExpression(inner_predicate, subquery, ASTSelectQuery::Expression::WHERE); - continue; - case OptimizeKind::PUSH_TO_HAVING: - is_rewrite_subquery |= optimizeExpression(inner_predicate, subquery, ASTSelectQuery::Expression::HAVING); - continue; - case OptimizeKind::PUSH_TO_PREWHERE: - is_rewrite_subquery |= optimizeExpression(inner_predicate, subquery, ASTSelectQuery::Expression::PREWHERE); - continue; - } + for (const auto & predicate_expression : splitConjunctionPredicate({where, prewhere})) + { + ExpressionInfoVisitor::Data expression_info{.context = context, .tables = tables_with_columns}; + ExpressionInfoVisitor(expression_info).visit(predicate_expression); + + if (expression_info.is_stateful_function) + return {}; /// give up the optimization when the predicate contains stateful function + + if (!expression_info.is_array_join) + { + if (expression_info.unique_reference_tables_pos.size() == 1) + tables_predicates[*expression_info.unique_reference_tables_pos.begin()].emplace_back(predicate_expression); + else if (expression_info.unique_reference_tables_pos.size() == 0) + { + for (size_t index = 0; index < tables_predicates.size(); ++index) + tables_predicates[index].emplace_back(predicate_expression); } } } - return is_rewrite_subquery; + + return tables_predicates; /// everything is OK, it can be optimized } -bool PredicateExpressionsOptimizer::allowPushDown( - const ASTSelectQuery * subquery, - const ASTPtr &, - const std::vector & projection_columns, - const std::vector & dependencies, - OptimizeKind & optimize_kind) +bool 
PredicateExpressionsOptimizer::tryRewritePredicatesToTables(ASTs & tables_element, const std::vector & tables_predicates) { - if (!subquery - || (!settings.enable_optimize_predicate_expression_to_final_subquery && subquery->final()) - || subquery->limitBy() || subquery->limitLength() - || subquery->with() || subquery->withFill()) - return false; - else + bool is_rewrite_tables = false; + + for (size_t index = tables_element.size(); index > 0; --index) { - ASTPtr expr_list = ast_select->select(); - ExtractFunctionVisitor::Data extract_data; - ExtractFunctionVisitor(extract_data).visit(expr_list); - - for (const auto & subquery_function : extract_data.functions) - { - const auto & function = FunctionFactory::instance().tryGet(subquery_function->name, context); - - /// Skip lambda, tuple and other special functions - if (function && function->isStateful()) - return false; - } - } - - const auto * ast_join = ast_select->join(); - const ASTTableExpression * left_table_expr = nullptr; - const ASTTableExpression * right_table_expr = nullptr; - const ASTSelectQuery * left_subquery = nullptr; - const ASTSelectQuery * right_subquery = nullptr; - - if (ast_join) - { - left_table_expr = ast_select - ->tables()->as() - ->children[0]->as() - ->table_expression->as(); - right_table_expr = ast_select - ->tables()->as() - ->children[1]->as() - ->table_expression->as(); - - if (left_table_expr && left_table_expr->subquery) - left_subquery = left_table_expr->subquery - ->children[0]->as() - ->list_of_selects->children[0]->as(); - if (right_table_expr && right_table_expr->subquery) - right_subquery = right_table_expr->subquery - ->children[0]->as() - ->list_of_selects->children[0]->as(); + size_t table_pos = index - 1; /// NOTE: the syntactic way of pushdown has limitations and should be partially disabled in case of JOINs. /// Let's take a look at the query: @@ -201,326 +132,84 @@ bool PredicateExpressionsOptimizer::allowPushDown( /// It happens because the not-matching columns are replaced with a global default values on JOIN. /// Same is true for RIGHT JOIN and FULL JOIN. 
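
// A small standalone demonstration of why that limitation exists (toy
// tables, assumed schema): pushing the predicate b.x = 0 into the right
// side of a LEFT JOIN would lose rows that only pass because non-matched
// columns are filled with default values.
#include <iostream>
#include <map>
#include <vector>

int main()
{
    std::vector<int> a = {1, 2};        // keys of the left table
    std::map<int, int> b = {{1, 7}};    // key -> x in the right table

    // After-join filtering: key 2 has no match, so b.x defaults to 0 and passes.
    for (int key : a)
    {
        int x = b.count(key) ? b[key] : 0;   // LEFT JOIN default value
        if (x == 0)
            std::cout << "kept after join: key=" << key << '\n';   // key=2
    }

    // Pushed-down filtering: b has no row with x == 0, so nothing would match,
    // yet key 2 must still be kept. Hence the optimizer skips the right side
    // of LEFT JOIN and both sides of FULL JOIN.
}
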
- /// Check right side for LEFT'o'FULL JOIN - if (isLeftOrFull(ast_join->table_join->as()->kind) && right_subquery == subquery) - return false; - - /// Check left side for RIGHT'o'FULL JOIN - if (isRightOrFull(ast_join->table_join->as()->kind) && left_subquery == subquery) - return false; - } - - return checkDependencies(projection_columns, dependencies, optimize_kind); -} - -bool PredicateExpressionsOptimizer::checkDependencies( - const std::vector & projection_columns, - const std::vector & dependencies, - OptimizeKind & optimize_kind) -{ - for (const auto & [identifier, prefix] : dependencies) - { - bool is_found = false; - String qualified_name = qualifiedName(identifier, prefix); - - for (const auto & [ast, alias] : projection_columns) + if (const auto & table_element = tables_element[table_pos]->as()) { - if (alias == qualified_name) - { - is_found = true; - ASTPtr projection_column = ast; - ExtractFunctionVisitor::Data extract_data; - ExtractFunctionVisitor(extract_data).visit(projection_column); + if (table_element->table_join && isLeft(table_element->table_join->as()->kind)) + continue; /// Skip right table optimization - if (!extract_data.aggregate_functions.empty()) - optimize_kind = OptimizeKind::PUSH_TO_HAVING; - } - } + if (table_element->table_join && isFull(table_element->table_join->as()->kind)) + break; /// Skip left and right table optimization - if (!is_found) - return false; - } + is_rewrite_tables |= tryRewritePredicatesToTable(tables_element[table_pos], tables_predicates[table_pos], + tables_with_columns[table_pos].columns); - return true; -} - -std::vector PredicateExpressionsOptimizer::splitConjunctionPredicate(const ASTPtr & predicate_expression) -{ - std::vector predicate_expressions; - - if (predicate_expression) - { - predicate_expressions.emplace_back(predicate_expression); - - auto remove_expression_at_index = [&predicate_expressions] (const size_t index) - { - if (index < predicate_expressions.size() - 1) - std::swap(predicate_expressions[index], predicate_expressions.back()); - predicate_expressions.pop_back(); - }; - - for (size_t idx = 0; idx < predicate_expressions.size();) - { - const auto expression = predicate_expressions.at(idx); - - if (const auto * function = expression->as()) - { - if (function->name == and_function_name) - { - for (auto & child : function->arguments->children) - predicate_expressions.emplace_back(child); - - remove_expression_at_index(idx); - continue; - } - } - ++idx; + if (table_element->table_join && isRight(table_element->table_join->as()->kind)) + break; /// Skip left table optimization } } - return predicate_expressions; + + return is_rewrite_tables; } -std::vector -PredicateExpressionsOptimizer::getDependenciesAndQualifiers(ASTPtr & expression, std::vector & tables) +bool PredicateExpressionsOptimizer::tryRewritePredicatesToTable(ASTPtr & table_element, const ASTs & table_predicates, const Names & table_column) const { - FindIdentifierBestTableVisitor::Data find_data(tables); - FindIdentifierBestTableVisitor(find_data).visit(expression); - - std::vector dependencies; - - for (const auto & [identifier, table] : find_data.identifier_table) + if (!table_predicates.empty()) { - String table_alias; - if (table) - table_alias = table->getQualifiedNamePrefix(); + auto optimize_final = settings.enable_optimize_predicate_expression_to_final_subquery; + PredicateRewriteVisitor::Data data(context, table_predicates, table_column, optimize_final); - dependencies.emplace_back(identifier, table_alias); + 
PredicateRewriteVisitor(data).visit(table_element); + return data.is_rewrite; } - return dependencies; -} - -void PredicateExpressionsOptimizer::setNewAliasesForInnerPredicate( - const std::vector & projection_columns, - const std::vector & dependencies) -{ - for (auto & [identifier, prefix] : dependencies) - { - String qualified_name = qualifiedName(identifier, prefix); - - for (auto & [ast, alias] : projection_columns) - { - if (alias == qualified_name) - { - String name; - if (auto * id = ast->as()) - { - name = id->tryGetAlias(); - if (name.empty()) - name = id->shortName(); - } - else - { - if (ast->tryGetAlias().empty()) - ast->setAlias(ast->getColumnName()); - name = ast->getAliasOrColumnName(); - } - - identifier->setShortName(name); - } - } - } -} - -bool PredicateExpressionsOptimizer::isArrayJoinFunction(const ASTPtr & node) -{ - if (const auto * function = node->as()) - { - if (function->name == "arrayJoin") - return true; - } - - for (auto & child : node->children) - if (isArrayJoinFunction(child)) - return true; - return false; } -bool PredicateExpressionsOptimizer::optimizeExpression(const ASTPtr & outer_expression, ASTSelectQuery * subquery, - ASTSelectQuery::Expression expr) +bool PredicateExpressionsOptimizer::tryMovePredicatesFromHavingToWhere(ASTSelectQuery & select_query) { - ASTPtr subquery_expression = subquery->getExpression(expr, false); - subquery_expression = subquery_expression ? makeASTFunction(and_function_name, outer_expression, subquery_expression) : outer_expression; + ASTs where_predicates; + ASTs having_predicates; + + const auto & reduce_predicates = [&](const ASTs & predicates) + { + ASTPtr res = predicates[0]; + for (size_t index = 1; index < predicates.size(); ++index) + res = makeASTFunction("and", res, predicates[index]); + + return res; + }; + + for (const auto & moving_predicate: splitConjunctionPredicate({select_query.having()})) + { + ExpressionInfoVisitor::Data expression_info{.context = context, .tables = {}}; + ExpressionInfoVisitor(expression_info).visit(moving_predicate); + + /// TODO: If there is no group by, where, and prewhere expression, we can push down the stateful function + if (expression_info.is_stateful_function) + return false; + + if (expression_info.is_aggregate_function) + having_predicates.emplace_back(moving_predicate); + else + where_predicates.emplace_back(moving_predicate); + } + + if (having_predicates.empty()) + select_query.setExpression(ASTSelectQuery::Expression::HAVING, {}); + else + { + auto having_predicate = reduce_predicates(having_predicates); + select_query.setExpression(ASTSelectQuery::Expression::HAVING, std::move(having_predicate)); + } + + if (!where_predicates.empty()) + { + auto moved_predicate = reduce_predicates(where_predicates); + moved_predicate = select_query.where() ? 
makeASTFunction("and", select_query.where(), moved_predicate) : moved_predicate; + select_query.setExpression(ASTSelectQuery::Expression::WHERE, std::move(moved_predicate)); + } - subquery->setExpression(expr, std::move(subquery_expression)); return true; } -PredicateExpressionsOptimizer::SubqueriesProjectionColumns PredicateExpressionsOptimizer::getAllSubqueryProjectionColumns() -{ - SubqueriesProjectionColumns projection_columns; - - for (const auto & table_expression : getTableExpressions(*ast_select)) - if (table_expression->subquery) - getSubqueryProjectionColumns(table_expression->subquery, projection_columns); - - return projection_columns; -} - -void PredicateExpressionsOptimizer::getSubqueryProjectionColumns(const ASTPtr & subquery, SubqueriesProjectionColumns & projection_columns) -{ - String qualified_name_prefix = subquery->tryGetAlias(); - if (!qualified_name_prefix.empty()) - qualified_name_prefix += '.'; - - const ASTPtr & subselect = subquery->children[0]; - - ASTs select_with_union_projections; - const auto * select_with_union_query = subselect->as(); - - for (auto & select : select_with_union_query->list_of_selects->children) - { - std::vector subquery_projections; - auto select_projection_columns = getSelectQueryProjectionColumns(select); - - if (!select_projection_columns.empty()) - { - if (select_with_union_projections.empty()) - select_with_union_projections = select_projection_columns; - - for (size_t i = 0; i < select_projection_columns.size(); i++) - subquery_projections.emplace_back(std::pair(select_projection_columns[i], - qualified_name_prefix + select_with_union_projections[i]->getAliasOrColumnName())); - - projection_columns.insert(std::pair(select->as(), subquery_projections)); - } - } -} - -ASTs PredicateExpressionsOptimizer::getSelectQueryProjectionColumns(ASTPtr & ast) -{ - ASTs projection_columns; - auto * select_query = ast->as(); - - /// first should normalize query tree. 
- std::unordered_map aliases; - std::vector tables = getDatabaseAndTables(*select_query, context.getCurrentDatabase()); - - /// TODO: get tables from evaluateAsterisk instead of tablesOnly() to extract asterisks in general way - std::vector tables_with_columns = TranslateQualifiedNamesVisitor::Data::tablesOnly(tables); - TranslateQualifiedNamesVisitor::Data qn_visitor_data({}, std::move(tables_with_columns), false); - TranslateQualifiedNamesVisitor(qn_visitor_data).visit(ast); - - QueryAliasesVisitor::Data query_aliases_data{aliases}; - QueryAliasesVisitor(query_aliases_data).visit(ast); - - MarkTableIdentifiersVisitor::Data mark_tables_data{aliases}; - MarkTableIdentifiersVisitor(mark_tables_data).visit(ast); - - QueryNormalizer::Data normalizer_data(aliases, settings); - QueryNormalizer(normalizer_data).visit(ast); - - for (const auto & projection_column : select_query->select()->children) - { - if (projection_column->as() || projection_column->as() || projection_column->as()) - { - ASTs evaluated_columns = evaluateAsterisk(select_query, projection_column); - - for (const auto & column : evaluated_columns) - projection_columns.emplace_back(column); - - continue; - } - - projection_columns.emplace_back(projection_column); - } - return projection_columns; -} - -ASTs PredicateExpressionsOptimizer::evaluateAsterisk(ASTSelectQuery * select_query, const ASTPtr & asterisk) -{ - /// SELECT *, SELECT dummy, SELECT 1 AS id - if (!select_query->tables() || select_query->tables()->children.empty()) - return {}; - - std::vector tables_expression = getTableExpressions(*select_query); - - if (const auto * qualified_asterisk = asterisk->as()) - { - if (qualified_asterisk->children.size() != 1) - throw Exception("Logical error: qualified asterisk must have exactly one child", ErrorCodes::LOGICAL_ERROR); - - DatabaseAndTableWithAlias ident_db_and_name(qualified_asterisk->children[0]); - - for (auto it = tables_expression.begin(); it != tables_expression.end();) - { - const ASTTableExpression * table_expression = *it; - DatabaseAndTableWithAlias database_and_table_with_alias(*table_expression, context.getCurrentDatabase()); - - if (ident_db_and_name.satisfies(database_and_table_with_alias, true)) - ++it; - else - it = tables_expression.erase(it); /// It's not a required table - } - } - - ASTs projection_columns; - for (auto & table_expression : tables_expression) - { - if (table_expression->subquery) - { - const auto * subquery = table_expression->subquery->as(); - const auto * select_with_union_query = subquery->children[0]->as(); - const auto subquery_projections = getSelectQueryProjectionColumns(select_with_union_query->list_of_selects->children[0]); - projection_columns.insert(projection_columns.end(), subquery_projections.begin(), subquery_projections.end()); - } - else - { - StoragePtr storage; - - if (table_expression->table_function) - { - auto query_context = const_cast(&context.getQueryContext()); - storage = query_context->executeTableFunction(table_expression->table_function); - } - else if (table_expression->database_and_table_name) - { - const auto * database_and_table_ast = table_expression->database_and_table_name->as(); - DatabaseAndTableWithAlias database_and_table_name(*database_and_table_ast); - storage = context.getTable(database_and_table_name.database, database_and_table_name.table); - } - else - throw Exception("Logical error: unexpected table expression", ErrorCodes::LOGICAL_ERROR); - - const auto block = storage->getSampleBlock(); - if (const auto * asterisk_pattern = 
asterisk->as()) - { - for (size_t idx = 0; idx < block.columns(); ++idx) - { - auto & col = block.getByPosition(idx); - if (asterisk_pattern->isColumnMatching(col.name)) - projection_columns.emplace_back(std::make_shared(col.name)); - } - } - else - { - for (size_t idx = 0; idx < block.columns(); ++idx) - projection_columns.emplace_back(std::make_shared(block.getByPosition(idx).name)); - } - } - } - return projection_columns; -} - -void PredicateExpressionsOptimizer::cleanExpressionAlias(ASTPtr & expression) -{ - const auto my_alias = expression->tryGetAlias(); - if (!my_alias.empty()) - expression->setAlias(""); - - for (auto & child : expression->children) - cleanExpressionAlias(child); -} - } diff --git a/dbms/src/Interpreters/PredicateExpressionsOptimizer.h b/dbms/src/Interpreters/PredicateExpressionsOptimizer.h index ca2c8b8766d..da6b98987a6 100644 --- a/dbms/src/Interpreters/PredicateExpressionsOptimizer.h +++ b/dbms/src/Interpreters/PredicateExpressionsOptimizer.h @@ -1,110 +1,53 @@ #pragma once -#include "DatabaseAndTableWithAlias.h" #include -#include +#include namespace DB { -class ASTIdentifier; -class ASTSubquery; class Context; +struct Settings; -/** This class provides functions for Push-Down predicate expressions - * - * The Example: - * - Query before optimization : - * SELECT id_1, name_1 FROM (SELECT id_1, name_1 FROM table_a UNION ALL SELECT id_2, name_2 FROM table_b) - * WHERE id_1 = 1 - * - Query after optimization : - * SELECT id_1, name_1 FROM (SELECT id_1, name_1 FROM table_a WHERE id_1 = 1 UNION ALL SELECT id_2, name_2 FROM table_b WHERE id_2 = 1) - * WHERE id_1 = 1 +/** Predicate optimization based on rewriting ast rules * For more details : https://github.com/ClickHouse/ClickHouse/pull/2015#issuecomment-374283452 + * The optimizer does two different optimizations + * - Move predicates from having to where + * - Push the predicate down from the current query to the having of the subquery */ class PredicateExpressionsOptimizer { - using ProjectionWithAlias = std::pair; - using SubqueriesProjectionColumns = std::map>; - using IdentifierWithQualifier = std::pair; +public: + PredicateExpressionsOptimizer(const Context & context_, const TablesWithColumnNames & tables_with_columns_, const Settings & settings_); + bool optimize(ASTSelectQuery & select_query); + +private: /// Extracts settings, mostly to show which are used and which are not. 
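
// A minimal sketch of this settings-extraction pattern (toy Settings type;
// only the two fields the optimizer reads, as in the struct that follows):
// copying the used settings makes the dependency explicit and avoids
// holding the whole Settings object.
#include <iostream>

struct Settings   // stand-in for the real, much larger settings struct
{
    bool enable_optimize_predicate_expression = true;
    bool enable_optimize_predicate_expression_to_final_subquery = false;
    int max_threads = 8;   // ... many other fields the optimizer never reads
};

struct ExtractedSettings
{
    const bool enable_optimize_predicate_expression;
    const bool enable_optimize_predicate_expression_to_final_subquery;

    template <typename T>
    ExtractedSettings(const T & settings)
        : enable_optimize_predicate_expression(settings.enable_optimize_predicate_expression)
        , enable_optimize_predicate_expression_to_final_subquery(settings.enable_optimize_predicate_expression_to_final_subquery)
    {}
};

int main()
{
    Settings settings;
    ExtractedSettings extracted(settings);
    std::cout << extracted.enable_optimize_predicate_expression << '\n';   // 1
}
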
struct ExtractedSettings { - /// QueryNormalizer settings - const UInt64 max_ast_depth; - const UInt64 max_expanded_ast_elements; - const String count_distinct_implementation; - - /// for PredicateExpressionsOptimizer const bool enable_optimize_predicate_expression; const bool enable_optimize_predicate_expression_to_final_subquery; - const bool join_use_nulls; template ExtractedSettings(const T & settings_) - : max_ast_depth(settings_.max_ast_depth), - max_expanded_ast_elements(settings_.max_expanded_ast_elements), - count_distinct_implementation(settings_.count_distinct_implementation), - enable_optimize_predicate_expression(settings_.enable_optimize_predicate_expression), - enable_optimize_predicate_expression_to_final_subquery(settings_.enable_optimize_predicate_expression_to_final_subquery), - join_use_nulls(settings_.join_use_nulls) + : enable_optimize_predicate_expression(settings_.enable_optimize_predicate_expression), + enable_optimize_predicate_expression_to_final_subquery(settings_.enable_optimize_predicate_expression_to_final_subquery) {} }; -public: - PredicateExpressionsOptimizer(ASTSelectQuery * ast_select_, ExtractedSettings && settings_, const Context & context_); - - bool optimize(); - -private: - ASTSelectQuery * ast_select; - const ExtractedSettings settings; const Context & context; + const std::vector & tables_with_columns; - enum OptimizeKind - { - NONE, - PUSH_TO_PREWHERE, - PUSH_TO_WHERE, - PUSH_TO_HAVING, - }; + const ExtractedSettings settings; - bool isArrayJoinFunction(const ASTPtr & node); + std::vector extractTablesPredicates(const ASTPtr & where, const ASTPtr & prewhere); - std::vector splitConjunctionPredicate(const ASTPtr & predicate_expression); + bool tryRewritePredicatesToTables(ASTs & tables_element, const std::vector & tables_predicates); - std::vector getDependenciesAndQualifiers(ASTPtr & expression, - std::vector & tables_with_aliases); + bool tryRewritePredicatesToTable(ASTPtr & table_element, const ASTs & table_predicates, const Names & table_column) const; - bool optimizeExpression(const ASTPtr & outer_expression, ASTSelectQuery * subquery, ASTSelectQuery::Expression expr); - - bool optimizeImpl(const ASTPtr & outer_expression, const SubqueriesProjectionColumns & subqueries_projection_columns, OptimizeKind optimize_kind); - - bool allowPushDown( - const ASTSelectQuery * subquery, - const ASTPtr & outer_predicate, - const std::vector & subquery_projection_columns, - const std::vector & outer_predicate_dependencies, - OptimizeKind & optimize_kind); - - bool checkDependencies( - const std::vector & projection_columns, - const std::vector & dependencies, - OptimizeKind & optimize_kind); - - void setNewAliasesForInnerPredicate(const std::vector & projection_columns, - const std::vector & inner_predicate_dependencies); - - SubqueriesProjectionColumns getAllSubqueryProjectionColumns(); - - void getSubqueryProjectionColumns(const ASTPtr & subquery, SubqueriesProjectionColumns & all_subquery_projection_columns); - - ASTs getSelectQueryProjectionColumns(ASTPtr & ast); - - ASTs evaluateAsterisk(ASTSelectQuery * select_query, const ASTPtr & asterisk); - - void cleanExpressionAlias(ASTPtr & expression); + bool tryMovePredicatesFromHavingToWhere(ASTSelectQuery & select_query); }; } diff --git a/dbms/src/Interpreters/PredicateRewriteVisitor.cpp b/dbms/src/Interpreters/PredicateRewriteVisitor.cpp new file mode 100644 index 00000000000..6bd16ddc066 --- /dev/null +++ b/dbms/src/Interpreters/PredicateRewriteVisitor.cpp @@ -0,0 +1,119 @@ +#include + +#include 
+#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +namespace DB +{ + +PredicateRewriteVisitorData::PredicateRewriteVisitorData( + const Context & context_, const ASTs & predicates_, const Names & column_names_, bool optimize_final_) + : context(context_), predicates(predicates_), column_names(column_names_), optimize_final(optimize_final_) +{ +} + +void PredicateRewriteVisitorData::visit(ASTSelectWithUnionQuery & union_select_query, ASTPtr &) +{ + auto & internal_select_list = union_select_query.list_of_selects->children; + + if (internal_select_list.size() > 0) + visitFirstInternalSelect(*internal_select_list[0]->as(), internal_select_list[0]); + + for (size_t index = 1; index < internal_select_list.size(); ++index) + visitOtherInternalSelect(*internal_select_list[index]->as(), internal_select_list[index]); +} + +void PredicateRewriteVisitorData::visitFirstInternalSelect(ASTSelectQuery & select_query, ASTPtr &) +{ + is_rewrite |= rewriteSubquery(select_query, column_names, column_names); +} + +void PredicateRewriteVisitorData::visitOtherInternalSelect(ASTSelectQuery & select_query, ASTPtr &) +{ + /// For non first select, its alias has no more significance, so we can set a temporary alias for them + ASTPtr temp_internal_select = select_query.clone(); + ASTSelectQuery * temp_select_query = temp_internal_select->as(); + + size_t alias_index = 0; + for (auto & ref_select : temp_select_query->refSelect()->children) + { + if (!ref_select->as() && !ref_select->as() && !ref_select->as() && + !ref_select->as()) + { + if (const auto & alias = ref_select->tryGetAlias(); alias.empty()) + ref_select->setAlias("--predicate_optimizer_" + toString(alias_index++)); + } + } + + const Names & internal_columns = InterpreterSelectQuery( + temp_internal_select, context, SelectQueryOptions().analyze()).getSampleBlock().getNames(); + + if (rewriteSubquery(*temp_select_query, column_names, internal_columns)) + { + is_rewrite |= true; + select_query.setExpression(ASTSelectQuery::Expression::SELECT, std::move(temp_select_query->refSelect())); + select_query.setExpression(ASTSelectQuery::Expression::HAVING, std::move(temp_select_query->refHaving())); + } +} + +static void cleanAliasAndCollectIdentifiers(ASTPtr & predicate, std::vector & identifiers) +{ + /// Skip WHERE x in (SELECT ...) 
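    /// e.g. for a predicate like `x IN (SELECT y FROM t) AND a = 1` the traversal
    /// below collects only `x` and `a`: identifiers inside the subquery belong to
    /// another scope, so their aliases and names must stay untouched.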
+ if (!predicate->as()) + { + for (auto & children : predicate->children) + cleanAliasAndCollectIdentifiers(children, identifiers); + } + + if (const auto alias = predicate->tryGetAlias(); !alias.empty()) + predicate->setAlias(""); + + if (ASTIdentifier * identifier = predicate->as()) + identifiers.emplace_back(identifier); +} + +bool PredicateRewriteVisitorData::rewriteSubquery(ASTSelectQuery & subquery, const Names & outer_columns, const Names & inner_columns) +{ + if ((!optimize_final && subquery.final()) + || subquery.with() || subquery.withFill() + || subquery.limitBy() || subquery.limitLength() + || hasStatefulFunction(subquery.select(), context)) + return false; + + for (const auto & predicate : predicates) + { + std::vector identifiers; + ASTPtr optimize_predicate = predicate->clone(); + cleanAliasAndCollectIdentifiers(optimize_predicate, identifiers); + + for (size_t index = 0; index < identifiers.size(); ++index) + { + const auto & column_name = identifiers[index]->shortName(); + const auto & outer_column_iterator = std::find(outer_columns.begin(), outer_columns.end(), column_name); + + /// For lambda functions, we can't always find them in the list of columns + /// For example: SELECT * FROM system.one WHERE arrayMap(x -> x, [dummy]) = [0] + if (outer_column_iterator != outer_columns.end()) + identifiers[index]->setShortName(inner_columns[outer_column_iterator - outer_columns.begin()]); + } + + /// We only need to push all the predicates to subquery having + /// The subquery optimizer will move the appropriate predicates from having to where + subquery.setExpression(ASTSelectQuery::Expression::HAVING, + subquery.having() ? makeASTFunction("and", optimize_predicate, subquery.having()) : optimize_predicate); + } + + return true; +} + +} diff --git a/dbms/src/Interpreters/PredicateRewriteVisitor.h b/dbms/src/Interpreters/PredicateRewriteVisitor.h new file mode 100644 index 00000000000..e07df922c15 --- /dev/null +++ b/dbms/src/Interpreters/PredicateRewriteVisitor.h @@ -0,0 +1,36 @@ +#pragma once + +#include +#include +#include +#include + +namespace DB +{ + +class PredicateRewriteVisitorData +{ +public: + bool is_rewrite = false; + using TypeToVisit = ASTSelectWithUnionQuery; + + void visit(ASTSelectWithUnionQuery & union_select_query, ASTPtr &); + + PredicateRewriteVisitorData(const Context & context_, const ASTs & predicates_, const Names & column_names_, bool optimize_final_); + +private: + const Context & context; + const ASTs & predicates; + const Names & column_names; + bool optimize_final; + + void visitFirstInternalSelect(ASTSelectQuery & select_query, ASTPtr &); + + void visitOtherInternalSelect(ASTSelectQuery & select_query, ASTPtr &); + + bool rewriteSubquery(ASTSelectQuery & subquery, const Names & outer_columns, const Names & inner_columns); +}; + +using PredicateRewriteMatcher = OneTypeMatcher; +using PredicateRewriteVisitor = InDepthNodeVisitor; +} diff --git a/dbms/src/Interpreters/SyntaxAnalyzer.cpp b/dbms/src/Interpreters/SyntaxAnalyzer.cpp index 85135c71c6f..a485bd7ad73 100644 --- a/dbms/src/Interpreters/SyntaxAnalyzer.cpp +++ b/dbms/src/Interpreters/SyntaxAnalyzer.cpp @@ -181,7 +181,7 @@ void renameDuplicatedColumns(const ASTSelectQuery * select_query) /// Sometimes we have to calculate more columns in SELECT clause than will be returned from query. /// This is the case when we have DISTINCT or arrayJoin: we require more columns in SELECT even if we need less columns in result. -/// Also we have to remove duplicates in case of GLOBAL subqueries. 
Their results are placed into tables so duplicates are inpossible. +/// Also we have to remove duplicates in case of GLOBAL subqueries. Their results are placed into tables so duplicates are impossible. void removeUnneededColumnsFromSelectClause(const ASTSelectQuery * select_query, const Names & required_result_columns, bool remove_dups) { ASTs & elements = select_query->select()->children; @@ -632,7 +632,7 @@ std::vector getAggregates(const ASTPtr & query) /// After execution, columns will only contain the list of columns needed to read from the table. void SyntaxAnalyzerResult::collectUsedColumns(const ASTPtr & query, const NamesAndTypesList & additional_source_columns) { - /// We caclulate required_source_columns with source_columns modifications and swap them on exit + /// We calculate required_source_columns with source_columns modifications and swap them on exit required_source_columns = source_columns; if (!additional_source_columns.empty()) @@ -652,15 +652,15 @@ void SyntaxAnalyzerResult::collectUsedColumns(const ASTPtr & query, const NamesA if (columns_context.has_table_join) { - NameSet avaliable_columns; + NameSet available_columns; for (const auto & name : source_columns) - avaliable_columns.insert(name.name); + available_columns.insert(name.name); /// Add columns obtained by JOIN (if needed). for (const auto & joined_column : analyzed_join->columnsFromJoinedTable()) { auto & name = joined_column.name; - if (avaliable_columns.count(name)) + if (available_columns.count(name)) continue; if (required.count(name)) @@ -845,12 +845,12 @@ SyntaxAnalyzerResultPtr SyntaxAnalyzer::analyze( { if (storage) { - const ColumnsDescription & starage_columns = storage->getColumns(); - tables_with_columns.emplace_back(DatabaseAndTableWithAlias{}, starage_columns.getOrdinary().getNames()); + const ColumnsDescription & storage_columns = storage->getColumns(); + tables_with_columns.emplace_back(DatabaseAndTableWithAlias{}, storage_columns.getOrdinary().getNames()); auto & table = tables_with_columns.back(); - table.addHiddenColumns(starage_columns.getMaterialized()); - table.addHiddenColumns(starage_columns.getAliases()); - table.addHiddenColumns(starage_columns.getVirtuals()); + table.addHiddenColumns(storage_columns.getMaterialized()); + table.addHiddenColumns(storage_columns.getAliases()); + table.addHiddenColumns(storage_columns.getVirtuals()); } else { @@ -920,6 +920,9 @@ SyntaxAnalyzerResultPtr SyntaxAnalyzer::analyze( if (select_query) { + /// Push the predicate expression down to the subqueries. + result.rewrite_subqueries = PredicateExpressionsOptimizer(context, tables_with_columns, settings).optimize(*select_query); + /// GROUP BY injective function elimination. optimizeGroupBy(select_query, source_columns_set, context); @@ -935,9 +938,6 @@ SyntaxAnalyzerResultPtr SyntaxAnalyzer::analyze( /// array_join_alias_to_name, array_join_result_to_source. getArrayJoinedColumns(query, result, select_query, result.source_columns, source_columns_set); - /// Push the predicate expression down to the subqueries. 
- result.rewrite_subqueries = PredicateExpressionsOptimizer(select_query, settings, context).optimize(); - setJoinStrictness(*select_query, settings.join_default_strictness, settings.any_join_distinct_right_table_keys, result.analyzed_join->table_join); collectJoinedColumns(*result.analyzed_join, *select_query, tables_with_columns, result.aliases); diff --git a/dbms/src/Interpreters/convertFieldToType.cpp b/dbms/src/Interpreters/convertFieldToType.cpp index 05dd1370c3b..4332dd4b95b 100644 --- a/dbms/src/Interpreters/convertFieldToType.cpp +++ b/dbms/src/Interpreters/convertFieldToType.cpp @@ -167,7 +167,7 @@ Field convertFieldToTypeImpl(const Field & src, const IDataType & type, const ID { which_from_type = WhichDataType(*from_type_hint); - // This was added to mitigate converting DateTime64-Field (a typedef to a Decimal64) to DataTypeDate64-compatitable type. + // This was added to mitigate converting DateTime64-Field (a typedef to a Decimal64) to DataTypeDate64-compatible type. if (from_type_hint && from_type_hint->equals(type)) { return src; diff --git a/dbms/src/Interpreters/loadMetadata.cpp b/dbms/src/Interpreters/loadMetadata.cpp index 00090d1d309..de1ed00e74a 100644 --- a/dbms/src/Interpreters/loadMetadata.cpp +++ b/dbms/src/Interpreters/loadMetadata.cpp @@ -118,7 +118,7 @@ void loadMetadata(Context & context) } catch (...) { - tryLogCurrentException("Load metadata", "Can't remove force restore file to enable data santity checks"); + tryLogCurrentException("Load metadata", "Can't remove force restore file to enable data sanity checks"); } } } diff --git a/dbms/src/Processors/DelayedPortsProcessor.cpp b/dbms/src/Processors/DelayedPortsProcessor.cpp new file mode 100644 index 00000000000..672f2645c16 --- /dev/null +++ b/dbms/src/Processors/DelayedPortsProcessor.cpp @@ -0,0 +1,95 @@ +#include + +namespace DB +{ + +DelayedPortsProcessor::DelayedPortsProcessor(const Block & header, size_t num_ports, const PortNumbers & delayed_ports) + : IProcessor(InputPorts(num_ports, header), OutputPorts(num_ports, header)) + , num_delayed(delayed_ports.size()) +{ + port_pairs.resize(num_ports); + + auto input_it = inputs.begin(); + auto output_it = outputs.begin(); + for (size_t i = 0; i < num_ports; ++i) + { + port_pairs[i].input_port = &*input_it; + port_pairs[i].output_port = &*output_it; + ++input_it; + ++output_it; + } + + for (auto & delayed : delayed_ports) + port_pairs[delayed].is_delayed = true; +} + +bool DelayedPortsProcessor::processPair(PortsPair & pair) +{ + auto finish = [&]() + { + if (!pair.is_finished) + { + pair.is_finished = true; + ++num_finished; + } + }; + + if (pair.output_port->isFinished()) + { + pair.input_port->close(); + finish(); + return false; + } + + if (pair.input_port->isFinished()) + { + pair.output_port->finish(); + finish(); + return false; + } + + if (!pair.output_port->canPush()) + return false; + + pair.input_port->setNeeded(); + if (pair.input_port->hasData()) + pair.output_port->pushData(pair.input_port->pullData()); + + return true; +} + +IProcessor::Status DelayedPortsProcessor::prepare(const PortNumbers & updated_inputs, const PortNumbers & updated_outputs) +{ + bool skip_delayed = (num_finished + num_delayed) < port_pairs.size(); + bool need_data = false; + + for (auto & output_number : updated_outputs) + { + if (!skip_delayed || !port_pairs[output_number].is_delayed) + need_data = processPair(port_pairs[output_number]) || need_data; + } + + for (auto & input_number : updated_inputs) + { + if (!skip_delayed || !port_pairs[input_number].is_delayed) + 
need_data = processPair(port_pairs[input_number]) || need_data; + } + + /// If the main streams are finished at the current iteration, start processing the delayed streams. + if (skip_delayed && (num_finished + num_delayed) >= port_pairs.size()) + { + for (auto & pair : port_pairs) + if (pair.is_delayed) + need_data = processPair(pair) || need_data; + } + + if (num_finished == port_pairs.size()) + return Status::Finished; + + if (need_data) + return Status::NeedData; + + return Status::PortFull; +} + +} diff --git a/dbms/src/Processors/DelayedPortsProcessor.h b/dbms/src/Processors/DelayedPortsProcessor.h new file mode 100644 index 00000000000..44dd632f8a8 --- /dev/null +++ b/dbms/src/Processors/DelayedPortsProcessor.h @@ -0,0 +1,37 @@ +#pragma once +#include + +namespace DB +{ + +/// Processor with N inputs and N outputs. Only moves data from i-th input to i-th output as is. +/// Some ports are delayed. Delayed ports are processed after other outputs are all finished. +/// Data between ports is not mixed. It is important because this processor can be used before MergingSortedTransform. +/// Delayed ports appear after joins, when some non-matched data needs to be processed at the end. +class DelayedPortsProcessor : public IProcessor +{ +public: + DelayedPortsProcessor(const Block & header, size_t num_ports, const PortNumbers & delayed_ports); + + String getName() const override { return "DelayedPorts"; } + + Status prepare(const PortNumbers &, const PortNumbers &) override; + +private: + + struct PortsPair + { + InputPort * input_port = nullptr; + OutputPort * output_port = nullptr; + bool is_delayed = false; + bool is_finished = false; + }; + + std::vector port_pairs; + size_t num_delayed; + size_t num_finished = 0; + + bool processPair(PortsPair & pair); +}; + +} diff --git a/dbms/src/Processors/Executors/PipelineExecutor.cpp b/dbms/src/Processors/Executors/PipelineExecutor.cpp index bc0de1fb81d..70cd2e2405f 100644 --- a/dbms/src/Processors/Executors/PipelineExecutor.cpp +++ b/dbms/src/Processors/Executors/PipelineExecutor.cpp @@ -64,13 +64,6 @@ bool PipelineExecutor::addEdges(UInt64 node) throwUnknownProcessor(to_proc, cur, true); UInt64 proc_num = it->second; - - for (auto & edge : edges) - { - if (edge.to == proc_num) - throw Exception("Multiple edges are not allowed for the same processors.", ErrorCodes::LOGICAL_ERROR); - } - auto & edge = edges.emplace_back(proc_num, is_backward, input_port_number, output_port_number, update_list); from_port.setUpdateInfo(&edge.update_info); diff --git a/dbms/src/Processors/QueryPipeline.cpp b/dbms/src/Processors/QueryPipeline.cpp index 13e91ac718d..25abeb6c6d3 100644 --- a/dbms/src/Processors/QueryPipeline.cpp +++ b/dbms/src/Processors/QueryPipeline.cpp @@ -18,6 +18,7 @@ #include #include #include +#include namespace DB { @@ -165,7 +166,6 @@ void QueryPipeline::addSimpleTransformImpl(const TProcessorGetter & getter) for (size_t stream_num = 0; stream_num < streams.size(); ++stream_num) add_transform(streams[stream_num], StreamType::Main, stream_num); - add_transform(delayed_stream_port, StreamType::Main); add_transform(totals_having_port, StreamType::Totals); add_transform(extremes_port, StreamType::Extremes); @@ -185,7 +185,6 @@ void QueryPipeline::addSimpleTransform(const ProcessorGetterWithStreamKind & get void QueryPipeline::addPipe(Processors pipe) { checkInitialized(); - concatDelayedStream(); if (pipe.empty()) throw Exception("Can't add empty processors list to QueryPipeline.", ErrorCodes::LOGICAL_ERROR); @@ -224,41 +223,20 @@ void
QueryPipeline::addDelayedStream(ProcessorPtr source) { checkInitialized(); - if (delayed_stream_port) - throw Exception("QueryPipeline already has stream with non joined data.", ErrorCodes::LOGICAL_ERROR); - checkSource(source, false); assertBlocksHaveEqualStructure(current_header, source->getOutputs().front().getHeader(), "QueryPipeline"); - delayed_stream_port = &source->getOutputs().front(); + IProcessor::PortNumbers delayed_streams = { streams.size() }; + streams.emplace_back(&source->getOutputs().front()); processors.emplace_back(std::move(source)); -} -void QueryPipeline::concatDelayedStream() -{ - if (!delayed_stream_port) - return; - - auto resize = std::make_shared(current_header, getNumMainStreams(), 1); - auto stream = streams.begin(); - for (auto & input : resize->getInputs()) - connect(**(stream++), input); - - auto concat = std::make_shared(current_header, 2); - connect(resize->getOutputs().front(), concat->getInputs().front()); - connect(*delayed_stream_port, concat->getInputs().back()); - - streams = { &concat->getOutputs().front() }; - processors.emplace_back(std::move(resize)); - processors.emplace_back(std::move(concat)); - - delayed_stream_port = nullptr; + auto processor = std::make_shared(current_header, streams.size(), delayed_streams); + addPipe({ std::move(processor) }); } void QueryPipeline::resize(size_t num_streams, bool force) { checkInitialized(); - concatDelayedStream(); if (!force && num_streams == getNumStreams()) return; @@ -443,7 +421,6 @@ void QueryPipeline::unitePipelines( std::vector && pipelines, const Block & common_header, const Context & context) { checkInitialized(); - concatDelayedStream(); addSimpleTransform([&](const Block & header) { @@ -456,7 +433,6 @@ void QueryPipeline::unitePipelines( for (auto & pipeline : pipelines) { pipeline.checkInitialized(); - pipeline.concatDelayedStream(); pipeline.addSimpleTransform([&](const Block & header) { diff --git a/dbms/src/Processors/QueryPipeline.h b/dbms/src/Processors/QueryPipeline.h index c27e570018f..29ebaf22955 100644 --- a/dbms/src/Processors/QueryPipeline.h +++ b/dbms/src/Processors/QueryPipeline.h @@ -57,7 +57,7 @@ public: /// Will read from this stream after all data was read from other streams. void addDelayedStream(ProcessorPtr source); - bool hasDelayedStream() const { return delayed_stream_port; } + /// Check if resize transform was used. (In that case another distinct transform will be added). bool hasMixedStreams() const { return has_resize || hasMoreThanOneStream(); } @@ -69,8 +69,7 @@ public: PipelineExecutorPtr execute(); - size_t getNumStreams() const { return streams.size() + (hasDelayedStream() ? 1 : 0); } - size_t getNumMainStreams() const { return streams.size(); } + size_t getNumStreams() const { return streams.size(); } bool hasMoreThanOneStream() const { return getNumStreams() > 1; } bool hasTotals() const { return totals_having_port != nullptr; } @@ -103,9 +102,6 @@ private: OutputPort * totals_having_port = nullptr; OutputPort * extremes_port = nullptr; - /// Special port for delayed stream. - OutputPort * delayed_stream_port = nullptr; - /// If resize processor was added to pipeline. 
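
// A standalone sketch of the scheduling policy DelayedPortsProcessor
// implements above (plain queues stand in for ports): drain the main
// streams first and open the delayed ones only once everything else has
// finished, so non-matched JOIN rows arrive strictly at the end.
#include <cstdio>
#include <deque>
#include <vector>

struct Stream { std::deque<int> data; bool is_delayed = false; bool finished = false; };

int main()
{
    std::vector<Stream> streams = {{{1, 2}, false}, {{100, 200}, true}, {{3}, false}};

    size_t finished = 0, delayed = 0;
    for (const auto & s : streams)
        delayed += s.is_delayed;

    while (finished < streams.size())
    {
        // Mirrors prepare(): skip delayed streams until the others are done.
        bool skip_delayed = (finished + delayed) < streams.size();
        for (auto & s : streams)
        {
            if (s.finished || (skip_delayed && s.is_delayed))
                continue;
            if (s.data.empty())
            {
                s.finished = true;
                ++finished;
                continue;
            }
            std::printf("%d ", s.data.front());   // "push to output port"
            s.data.pop_front();
        }
    }
    std::printf("\n");   // prints: 1 3 2 100 200 (delayed values come last)
}
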
bool has_resize = false; @@ -126,7 +122,6 @@ private: void checkInitialized(); void checkSource(const ProcessorPtr & source, bool can_have_totals); - void concatDelayedStream(); template void addSimpleTransformImpl(const TProcessorGetter & getter); diff --git a/dbms/src/Storages/AlterCommands.cpp b/dbms/src/Storages/AlterCommands.cpp index c586ea54c98..fc7bf608b17 100644 --- a/dbms/src/Storages/AlterCommands.cpp +++ b/dbms/src/Storages/AlterCommands.cpp @@ -249,6 +249,9 @@ void AlterCommand::apply(StorageInMemoryMetadata & metadata) const /// let's use info about old type if (data_type == nullptr) codec->useInfoAboutType(column.type); + else /// use info about new DataType + codec->useInfoAboutType(data_type); + column.codec = codec; } @@ -316,7 +319,7 @@ void AlterCommand::apply(StorageInMemoryMetadata & metadata) const if (insert_it == metadata.indices.indices.end()) throw Exception("Wrong index name. Cannot find index " + backQuote(after_index_name) + " to insert after.", - ErrorCodes::LOGICAL_ERROR); + ErrorCodes::BAD_ARGUMENTS); ++insert_it; } @@ -338,7 +341,7 @@ void AlterCommand::apply(StorageInMemoryMetadata & metadata) const if (if_exists) return; throw Exception("Wrong index name. Cannot find index " + backQuote(index_name) + " to drop.", - ErrorCodes::LOGICAL_ERROR); + ErrorCodes::BAD_ARGUMENTS); } metadata.indices.indices.erase(erase_it); @@ -378,7 +381,7 @@ void AlterCommand::apply(StorageInMemoryMetadata & metadata) const if (if_exists) return; throw Exception("Wrong constraint name. Cannot find constraint `" + constraint_name + "` to drop.", - ErrorCodes::LOGICAL_ERROR); + ErrorCodes::BAD_ARGUMENTS); } metadata.constraints.constraints.erase(erase_it); } diff --git a/dbms/src/Storages/Distributed/DirectoryMonitor.cpp b/dbms/src/Storages/Distributed/DirectoryMonitor.cpp index 94327d129dd..e33e7c2b001 100644 --- a/dbms/src/Storages/Distributed/DirectoryMonitor.cpp +++ b/dbms/src/Storages/Distributed/DirectoryMonitor.cpp @@ -269,7 +269,7 @@ void StorageDistributedDirectoryMonitor::processFile(const std::string & file_pa Settings insert_settings; std::string insert_query; - readHeader(in, insert_settings, insert_query); + readHeader(in, insert_settings, insert_query, log); RemoteBlockOutputStream remote{*connection, timeouts, insert_query, &insert_settings}; @@ -289,7 +289,7 @@ void StorageDistributedDirectoryMonitor::processFile(const std::string & file_pa } void StorageDistributedDirectoryMonitor::readHeader( - ReadBuffer & in, Settings & insert_settings, std::string & insert_query) const + ReadBuffer & in, Settings & insert_settings, std::string & insert_query, Logger * log) { UInt64 query_size; readVarUInt(query_size, in); @@ -449,7 +449,7 @@ struct StorageDistributedDirectoryMonitor::Batch } ReadBufferFromFile in(file_path->second); - parent.readHeader(in, insert_settings, insert_query); + parent.readHeader(in, insert_settings, insert_query, parent.log); if (first) { @@ -520,6 +520,53 @@ struct StorageDistributedDirectoryMonitor::Batch } }; +class DirectoryMonitorBlockInputStream : public IBlockInputStream +{ +public: + explicit DirectoryMonitorBlockInputStream(const String & file_name) + : in(file_name) + , decompressing_in(in) + , block_in(decompressing_in, ClickHouseRevision::get()) + , log{&Logger::get("DirectoryMonitorBlockInputStream")} + { + Settings insert_settings; + String insert_query; + StorageDistributedDirectoryMonitor::readHeader(in, insert_settings, insert_query, log); + + block_in.readPrefix(); + first_block = block_in.read(); + header = 
first_block.cloneEmpty(); + } + + String getName() const override { return "DirectoryMonitor"; } + +protected: + Block getHeader() const override { return header; } + Block readImpl() override + { + if (first_block) + return std::move(first_block); + + return block_in.read(); + } + + void readSuffix() override { block_in.readSuffix(); } + +private: + ReadBufferFromFile in; + CompressedReadBuffer decompressing_in; + NativeBlockInputStream block_in; + + Block first_block; + Block header; + + Logger * log; +}; + +BlockInputStreamPtr StorageDistributedDirectoryMonitor::createStreamFromFile(const String & file_name) +{ + return std::make_shared(file_name); +} void StorageDistributedDirectoryMonitor::processFilesWithBatching(const std::map & files) { @@ -557,7 +604,7 @@ void StorageDistributedDirectoryMonitor::processFilesWithBatching(const std::map { /// Determine metadata of the current file and check if it is not broken. ReadBufferFromFile in{file_path}; - readHeader(in, insert_settings, insert_query); + readHeader(in, insert_settings, insert_query, log); CompressedReadBuffer decompressing_in(in); NativeBlockInputStream block_in(decompressing_in, ClickHouseRevision::get()); diff --git a/dbms/src/Storages/Distributed/DirectoryMonitor.h b/dbms/src/Storages/Distributed/DirectoryMonitor.h index ec642d93819..4ee77072ee3 100644 --- a/dbms/src/Storages/Distributed/DirectoryMonitor.h +++ b/dbms/src/Storages/Distributed/DirectoryMonitor.h @@ -31,6 +31,8 @@ public: void flushAllData(); void shutdownAndDropAllData(); + + static BlockInputStreamPtr createStreamFromFile(const String & file_name); private: void run(); bool processFiles(); @@ -69,7 +71,9 @@ private: ThreadFromGlobalPool thread{&StorageDistributedDirectoryMonitor::run, this}; /// Read insert query and insert settings for backward compatibility.
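
// A self-contained sketch of the header-plus-cached-first-record pattern
// DirectoryMonitorBlockInputStream uses above (a line-based text stream
// stands in for the Native block format): the first record is pulled
// eagerly so the structure is known up front, then replayed on first read.
#include <iostream>
#include <optional>
#include <sstream>
#include <string>

class RecordStream
{
public:
    explicit RecordStream(std::istream & in_) : in(in_)
    {
        std::getline(in, header);          // "insert query + settings" stand-in
        std::string line;
        if (std::getline(in, line))
            first_record = line;           // cached to expose the structure early
    }

    const std::string & getHeader() const { return header; }

    std::optional<std::string> read()
    {
        if (first_record)                  // replay the cached record exactly once
        {
            auto res = std::move(*first_record);
            first_record.reset();
            return res;
        }
        std::string line;
        if (std::getline(in, line))
            return line;
        return std::nullopt;               // end of stream
    }

private:
    std::istream & in;
    std::string header;
    std::optional<std::string> first_record;
};

int main()
{
    std::istringstream file("INSERT INTO t VALUES\nrow1\nrow2\n");
    RecordStream stream(file);
    std::cout << "header: " << stream.getHeader() << '\n';
    while (auto rec = stream.read())
        std::cout << *rec << '\n';         // row1, then row2
}
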
- void readHeader(ReadBuffer & in, Settings & insert_settings, std::string & insert_query) const; + static void readHeader(ReadBuffer & in, Settings & insert_settings, std::string & insert_query, Logger * log); + + friend class DirectoryMonitorBlockInputStream; }; } diff --git a/dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp b/dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp index 7ed9836e032..24ead5b5832 100644 --- a/dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp +++ b/dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp @@ -182,7 +182,7 @@ void DistributedBlockOutputStream::initWritingJobs(const Block & first_block) } if (num_shards > 1) - shard_jobs.shard_current_block_permuation.reserve(first_block.rows()); + shard_jobs.shard_current_block_permutation.reserve(first_block.rows()); } } @@ -235,7 +235,7 @@ ThreadPool::Job DistributedBlockOutputStream::runWritingJob(DistributedBlockOutp /// Generate current shard block if (num_shards > 1) { - auto & shard_permutation = shard_job.shard_current_block_permuation; + auto & shard_permutation = shard_job.shard_current_block_permutation; size_t num_shard_rows = shard_permutation.size(); for (size_t j = 0; j < current_block.columns(); ++j) @@ -348,10 +348,10 @@ void DistributedBlockOutputStream::writeSync(const Block & block) /// Prepare row numbers for each shard for (size_t shard_index : ext::range(0, num_shards)) - per_shard_jobs[shard_index].shard_current_block_permuation.resize(0); + per_shard_jobs[shard_index].shard_current_block_permutation.resize(0); for (size_t i = 0; i < block.rows(); ++i) - per_shard_jobs[current_selector[i]].shard_current_block_permuation.push_back(i); + per_shard_jobs[current_selector[i]].shard_current_block_permutation.push_back(i); } try diff --git a/dbms/src/Storages/Distributed/DistributedBlockOutputStream.h b/dbms/src/Storages/Distributed/DistributedBlockOutputStream.h index 97297aae434..73fbfa593f8 100644 --- a/dbms/src/Storages/Distributed/DistributedBlockOutputStream.h +++ b/dbms/src/Storages/Distributed/DistributedBlockOutputStream.h @@ -123,7 +123,7 @@ private: struct JobShard { std::list<JobReplica> replicas_jobs; - IColumn::Permutation shard_current_block_permuation; + IColumn::Permutation shard_current_block_permutation; }; std::vector<JobShard> per_shard_jobs; diff --git a/dbms/src/Storages/LiveView/StorageLiveView.cpp b/dbms/src/Storages/LiveView/StorageLiveView.cpp index db410eeb5e4..eae8eaa1d3c 100644 --- a/dbms/src/Storages/LiveView/StorageLiveView.cpp +++ b/dbms/src/Storages/LiveView/StorageLiveView.cpp @@ -1,4 +1,4 @@ -/* iopyright (c) 2018 BlackBerry Limited +/* Copyright (c) 2018 BlackBerry Limited Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
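Taken together, the readHeader and DirectoryMonitorBlockInputStream changes above make a Distributed send file (*.bin) self-describing: a header carrying the insert query and settings, followed by compressed Native-format blocks. A minimal sketch of replaying such a file, assuming access to the now-static readHeader (in the patch it is private and reached only through the friend stream class) and with include paths approximated for this source tree:

    #include <Common/ClickHouseRevision.h>
    #include <Compression/CompressedReadBuffer.h>
    #include <DataStreams/NativeBlockInputStream.h>
    #include <IO/ReadBufferFromFile.h>
    #include <Storages/Distributed/DirectoryMonitor.h>

    using namespace DB;

    /// Sketch: iterate over the blocks stored in one .bin file written by StorageDistributed.
    void replaySendFile(const String & path, Logger * log)
    {
        ReadBufferFromFile in(path);

        Settings insert_settings;
        String insert_query;
        /// Skip the header (insert query + settings) exactly as processFile() does.
        StorageDistributedDirectoryMonitor::readHeader(in, insert_settings, insert_query, log);

        CompressedReadBuffer decompressing_in(in);
        NativeBlockInputStream block_in(decompressing_in, ClickHouseRevision::get());

        block_in.readPrefix();
        while (Block block = block_in.read())
        {
            /// ... process the block; the first block's structure is the stream header
            /// (cf. cloneEmpty() in the constructor above).
        }
        block_in.readSuffix();
    }

This is the capability that the StorageFile changes below expose as file(path, 'Distributed'), and that the new test_distributed_format test drives through clickhouse-local.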
@@ -95,6 +95,66 @@ static void extractDependentTable(ASTPtr & query, String & select_database_name, DB::ErrorCodes::LOGICAL_ERROR); } +MergeableBlocksPtr StorageLiveView::collectMergeableBlocks(const Context & context) +{ + ASTPtr mergeable_query = inner_query; + + if (inner_subquery) + mergeable_query = inner_subquery; + + MergeableBlocksPtr new_mergeable_blocks = std::make_shared(); + BlocksPtrs new_blocks = std::make_shared>(); + BlocksPtr base_blocks = std::make_shared(); + + InterpreterSelectQuery interpreter(mergeable_query->clone(), context, SelectQueryOptions(QueryProcessingStage::WithMergeableState), Names()); + + auto view_mergeable_stream = std::make_shared(interpreter.execute().in); + + while (Block this_block = view_mergeable_stream->read()) + base_blocks->push_back(this_block); + + new_blocks->push_back(base_blocks); + + new_mergeable_blocks->blocks = new_blocks; + new_mergeable_blocks->sample_block = view_mergeable_stream->getHeader(); + + return new_mergeable_blocks; +} + +BlockInputStreams StorageLiveView::blocksToInputStreams(BlocksPtrs blocks, Block & sample_block) +{ + BlockInputStreams streams; + for (auto & blocks_ : *blocks) + { + BlockInputStreamPtr stream = std::make_shared(std::make_shared(blocks_), sample_block); + streams.push_back(std::move(stream)); + } + return streams; +} + +/// Complete query using input streams from mergeable blocks +BlockInputStreamPtr StorageLiveView::completeQuery(BlockInputStreams from) +{ + auto block_context = std::make_unique(global_context); + block_context->makeQueryContext(); + + auto blocks_storage = StorageBlocks::createStorage(database_name, table_name, parent_storage->getColumns(), + std::move(from), QueryProcessingStage::WithMergeableState); + + block_context->addExternalTable(table_name + "_blocks", blocks_storage); + + InterpreterSelectQuery select(inner_blocks_query->clone(), *block_context, StoragePtr(), SelectQueryOptions(QueryProcessingStage::Complete)); + BlockInputStreamPtr data = std::make_shared(select.execute().in); + + /// Squashing is needed here because the view query can generate a lot of blocks + /// even when only one block is inserted into the parent table (e.g. if the query is a GROUP BY + /// and two-level aggregation is triggered). 
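+ /// Illustration (added for exposition, not part of the original patch): with the default min_insert_block_size_rows = 1048576, the squashing stream constructed below folds the many small blocks emitted by such a query back into a few large ones before they reach the live view's readers.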
+ data = std::make_shared( + data, global_context.getSettingsRef().min_insert_block_size_rows, + global_context.getSettingsRef().min_insert_block_size_bytes); + + return data; +} void StorageLiveView::writeIntoLiveView( StorageLiveView & live_view, @@ -102,8 +162,6 @@ void StorageLiveView::writeIntoLiveView( const Context & context) { BlockOutputStreamPtr output = std::make_shared(live_view); - auto block_context = std::make_unique(context.getGlobalContext()); - block_context->makeQueryContext(); /// Check if live view has any readers if not /// just reset blocks to empty and do nothing else @@ -119,54 +177,40 @@ void StorageLiveView::writeIntoLiveView( bool is_block_processed = false; BlockInputStreams from; - BlocksPtrs mergeable_blocks; + MergeableBlocksPtr mergeable_blocks; BlocksPtr new_mergeable_blocks = std::make_shared(); - ASTPtr mergeable_query = live_view.getInnerQuery(); - - if (live_view.getInnerSubQuery()) - mergeable_query = live_view.getInnerSubQuery(); { std::lock_guard lock(live_view.mutex); mergeable_blocks = live_view.getMergeableBlocks(); - if (!mergeable_blocks || mergeable_blocks->size() >= context.getGlobalContext().getSettingsRef().max_live_view_insert_blocks_before_refresh) + if (!mergeable_blocks || mergeable_blocks->blocks->size() >= context.getGlobalContext().getSettingsRef().max_live_view_insert_blocks_before_refresh) { - mergeable_blocks = std::make_shared>(); - BlocksPtr base_mergeable_blocks = std::make_shared(); - InterpreterSelectQuery interpreter(mergeable_query, context, SelectQueryOptions(QueryProcessingStage::WithMergeableState), Names()); - auto view_mergeable_stream = std::make_shared( - interpreter.execute().in); - while (Block this_block = view_mergeable_stream->read()) - base_mergeable_blocks->push_back(this_block); - mergeable_blocks->push_back(base_mergeable_blocks); + mergeable_blocks = live_view.collectMergeableBlocks(context); live_view.setMergeableBlocks(mergeable_blocks); - - /// Create from streams - for (auto & blocks_ : *mergeable_blocks) - { - if (blocks_->empty()) - continue; - auto sample_block = blocks_->front().cloneEmpty(); - BlockInputStreamPtr stream = std::make_shared(std::make_shared(blocks_), sample_block); - from.push_back(std::move(stream)); - } - + from = live_view.blocksToInputStreams(mergeable_blocks->blocks, mergeable_blocks->sample_block); is_block_processed = true; } } - auto parent_storage = context.getTable(live_view.getSelectDatabaseName(), live_view.getSelectTableName()); - if (!is_block_processed) { + ASTPtr mergeable_query = live_view.getInnerQuery(); + + if (live_view.getInnerSubQuery()) + mergeable_query = live_view.getInnerSubQuery(); + BlockInputStreams streams = {std::make_shared(block)}; + auto blocks_storage = StorageBlocks::createStorage(live_view.database_name, live_view.table_name, - parent_storage->getColumns(), std::move(streams), QueryProcessingStage::FetchColumns); + live_view.getParentStorage()->getColumns(), std::move(streams), QueryProcessingStage::FetchColumns); + InterpreterSelectQuery select_block(mergeable_query, context, blocks_storage, QueryProcessingStage::WithMergeableState); + auto data_mergeable_stream = std::make_shared( select_block.execute().in); + while (Block this_block = data_mergeable_stream->read()) new_mergeable_blocks->push_back(this_block); @@ -177,32 +221,12 @@ void StorageLiveView::writeIntoLiveView( std::lock_guard lock(live_view.mutex); mergeable_blocks = live_view.getMergeableBlocks(); - mergeable_blocks->push_back(new_mergeable_blocks); - - /// Create from streams - 
for (auto & blocks_ : *mergeable_blocks) - { - if (blocks_->empty()) - continue; - auto sample_block = blocks_->front().cloneEmpty(); - BlockInputStreamPtr stream = std::make_shared(std::make_shared(blocks_), sample_block); - from.push_back(std::move(stream)); - } + mergeable_blocks->blocks->push_back(new_mergeable_blocks); + from = live_view.blocksToInputStreams(mergeable_blocks->blocks, mergeable_blocks->sample_block); } } - auto blocks_storage = StorageBlocks::createStorage(live_view.database_name, live_view.table_name, parent_storage->getColumns(), std::move(from), QueryProcessingStage::WithMergeableState); - block_context->addExternalTable(live_view.table_name + "_blocks", blocks_storage); - - InterpreterSelectQuery select(live_view.getInnerBlocksQuery(), *block_context, StoragePtr(), SelectQueryOptions(QueryProcessingStage::Complete)); - BlockInputStreamPtr data = std::make_shared(select.execute().in); - - /// Squashing is needed here because the view query can generate a lot of blocks - /// even when only one block is inserted into the parent table (e.g. if the query is a GROUP BY - /// and two-level aggregation is triggered). - data = std::make_shared( - data, context.getGlobalContext().getSettingsRef().min_insert_block_size_rows, context.getGlobalContext().getSettingsRef().min_insert_block_size_bytes); - + BlockInputStreamPtr data = live_view.completeQuery(from); copyData(*data, *output); } @@ -247,6 +271,8 @@ StorageLiveView::StorageLiveView( DatabaseAndTableName(select_database_name, select_table_name), DatabaseAndTableName(database_name, table_name)); + parent_storage = local_context.getTable(select_database_name, select_table_name); + is_temporary = query.temporary; temporary_live_view_timeout = local_context.getSettingsRef().temporary_live_view_timeout.totalSeconds(); @@ -298,36 +324,10 @@ bool StorageLiveView::getNewBlocks() UInt128 key; BlocksPtr new_blocks = std::make_shared(); BlocksMetadataPtr new_blocks_metadata = std::make_shared(); - BlocksPtr new_mergeable_blocks = std::make_shared(); - ASTPtr mergeable_query = inner_query; - if (inner_subquery) - mergeable_query = inner_subquery; - - InterpreterSelectQuery interpreter(mergeable_query->clone(), *live_view_context, SelectQueryOptions(QueryProcessingStage::WithMergeableState), Names()); - auto mergeable_stream = std::make_shared(interpreter.execute().in); - - while (Block block = mergeable_stream->read()) - new_mergeable_blocks->push_back(block); - - auto block_context = std::make_unique(global_context); - block_context->makeQueryContext(); - - mergeable_blocks = std::make_shared>(); - mergeable_blocks->push_back(new_mergeable_blocks); - BlockInputStreamPtr from = std::make_shared(std::make_shared(new_mergeable_blocks), mergeable_stream->getHeader()); - - auto blocks_storage = StorageBlocks::createStorage(database_name, table_name, global_context.getTable(select_database_name, select_table_name)->getColumns(), {from}, QueryProcessingStage::WithMergeableState); - block_context->addExternalTable(table_name + "_blocks", blocks_storage); - - InterpreterSelectQuery select(inner_blocks_query->clone(), *block_context, StoragePtr(), SelectQueryOptions(QueryProcessingStage::Complete)); - BlockInputStreamPtr data = std::make_shared(select.execute().in); - - /// Squashing is needed here because the view query can generate a lot of blocks - /// even when only one block is inserted into the parent table (e.g. if the query is a GROUP BY - /// and two-level aggregation is triggered). 
- data = std::make_shared<SquashingBlockInputStream>( - data, global_context.getSettingsRef().min_insert_block_size_rows, global_context.getSettingsRef().min_insert_block_size_bytes); + mergeable_blocks = collectMergeableBlocks(*live_view_context); + BlockInputStreams from = blocksToInputStreams(mergeable_blocks->blocks, mergeable_blocks->sample_block); + BlockInputStreamPtr data = completeQuery({from}); while (Block block = data->read()) { diff --git a/dbms/src/Storages/LiveView/StorageLiveView.h b/dbms/src/Storages/LiveView/StorageLiveView.h index a5b0f15e879..916406a1dbd 100644 --- a/dbms/src/Storages/LiveView/StorageLiveView.h +++ b/dbms/src/Storages/LiveView/StorageLiveView.h @@ -27,9 +27,16 @@ struct BlocksMetadata { UInt64 version; }; +struct MergeableBlocks +{ + BlocksPtrs blocks; + Block sample_block; +}; + class IAST; using ASTPtr = std::shared_ptr<IAST>; using BlocksMetadataPtr = std::shared_ptr<BlocksMetadata>; +using MergeableBlocksPtr = std::shared_ptr<MergeableBlocks>; class StorageLiveView : public ext::shared_ptr_helper<StorageLiveView>, public IStorage { @@ -45,6 +52,7 @@ public: String getDatabaseName() const override { return database_name; } String getSelectDatabaseName() const { return select_database_name; } String getSelectTableName() const { return select_table_name; } + StoragePtr getParentStorage() const { return parent_storage; } NameAndTypePair getColumn(const String & column_name) const override; bool hasColumn(const String & column_name) const override; @@ -138,8 +146,14 @@ public: unsigned num_streams) override; std::shared_ptr<BlocksPtr> getBlocksPtr() { return blocks_ptr; } - BlocksPtrs getMergeableBlocks() { return mergeable_blocks; } - void setMergeableBlocks(BlocksPtrs blocks) { mergeable_blocks = blocks; } + MergeableBlocksPtr getMergeableBlocks() { return mergeable_blocks; } + + /// Collect mergeable blocks and their sample.
Must be called holding mutex + MergeableBlocksPtr collectMergeableBlocks(const Context & context); + /// Complete query using input streams from mergeable blocks + BlockInputStreamPtr completeQuery(BlockInputStreams from); + + void setMergeableBlocks(MergeableBlocksPtr blocks) { mergeable_blocks = blocks; } std::shared_ptr getActivePtr() { return active_ptr; } /// Read new data blocks that store query result @@ -147,6 +161,9 @@ public: Block getHeader() const; + /// convert blocks to input streams + static BlockInputStreams blocksToInputStreams(BlocksPtrs blocks, Block & sample_block); + static void writeIntoLiveView( StorageLiveView & live_view, const Block & block, @@ -162,6 +179,7 @@ private: ASTPtr inner_blocks_query; /// query over the mergeable blocks to produce final result Context & global_context; std::unique_ptr live_view_context; + StoragePtr parent_storage; bool is_temporary = false; /// Mutex to protect access to sample block @@ -180,7 +198,7 @@ private: std::shared_ptr blocks_ptr; /// Current data blocks metadata std::shared_ptr blocks_metadata_ptr; - BlocksPtrs mergeable_blocks; + MergeableBlocksPtr mergeable_blocks; /// Background thread for temporary tables /// which drops this table if there are no users diff --git a/dbms/src/Storages/MergeTree/MergeTreeData.cpp b/dbms/src/Storages/MergeTree/MergeTreeData.cpp index d4451af3273..c10bd78afec 100644 --- a/dbms/src/Storages/MergeTree/MergeTreeData.cpp +++ b/dbms/src/Storages/MergeTree/MergeTreeData.cpp @@ -3216,18 +3216,20 @@ ReservationPtr MergeTreeData::tryReserveSpace(UInt64 expected_size, SpacePtr spa ReservationPtr MergeTreeData::reserveSpacePreferringTTLRules(UInt64 expected_size, const MergeTreeDataPart::TTLInfos & ttl_infos, - time_t time_of_move) const + time_t time_of_move, + size_t min_volume_index) const { expected_size = std::max(RESERVATION_MIN_ESTIMATION_SIZE, expected_size); - ReservationPtr reservation = tryReserveSpacePreferringTTLRules(expected_size, ttl_infos, time_of_move); + ReservationPtr reservation = tryReserveSpacePreferringTTLRules(expected_size, ttl_infos, time_of_move, min_volume_index); return checkAndReturnReservation(expected_size, std::move(reservation)); } ReservationPtr MergeTreeData::tryReserveSpacePreferringTTLRules(UInt64 expected_size, const MergeTreeDataPart::TTLInfos & ttl_infos, - time_t time_of_move) const + time_t time_of_move, + size_t min_volume_index) const { expected_size = std::max(RESERVATION_MIN_ESTIMATION_SIZE, expected_size); @@ -3253,10 +3255,19 @@ ReservationPtr MergeTreeData::tryReserveSpacePreferringTTLRules(UInt64 expected_ reservation = destination_ptr->reserve(expected_size); if (reservation) return reservation; + else + if (ttl_entry->destination_type == PartDestinationType::VOLUME) + LOG_WARNING(log, "Would like to reserve space on volume '" + << ttl_entry->destination_name << "' by TTL rule of table '" + << log_name << "' but there is not enough space"); + else if (ttl_entry->destination_type == PartDestinationType::DISK) + LOG_WARNING(log, "Would like to reserve space on disk '" + << ttl_entry->destination_name << "' by TTL rule of table '" + << log_name << "' but there is not enough space"); } } - reservation = storage_policy->reserve(expected_size); + reservation = storage_policy->reserve(expected_size, min_volume_index); return reservation; } diff --git a/dbms/src/Storages/MergeTree/MergeTreeData.h b/dbms/src/Storages/MergeTree/MergeTreeData.h index 4fb09277b1e..f458fb3e7d3 100644 --- a/dbms/src/Storages/MergeTree/MergeTreeData.h +++ 
b/dbms/src/Storages/MergeTree/MergeTreeData.h @@ -675,10 +675,12 @@ public: /// Reserves space at least 1MB preferring best destination according to `ttl_infos`. ReservationPtr reserveSpacePreferringTTLRules(UInt64 expected_size, const MergeTreeDataPart::TTLInfos & ttl_infos, - time_t time_of_move) const; + time_t time_of_move, + size_t min_volume_index = 0) const; ReservationPtr tryReserveSpacePreferringTTLRules(UInt64 expected_size, const MergeTreeDataPart::TTLInfos & ttl_infos, - time_t time_of_move) const; + time_t time_of_move, + size_t min_volume_index = 0) const; /// Choose disk with max available free space /// Reserves 0 bytes ReservationPtr makeEmptyReservationOnLargestDisk() { return storage_policy->makeEmptyReservationOnLargestDisk(); } diff --git a/dbms/src/Storages/MergeTree/MergeTreeDataMergerMutator.cpp b/dbms/src/Storages/MergeTree/MergeTreeDataMergerMutator.cpp index 8975535f31b..05db73ce215 100644 --- a/dbms/src/Storages/MergeTree/MergeTreeDataMergerMutator.cpp +++ b/dbms/src/Storages/MergeTree/MergeTreeDataMergerMutator.cpp @@ -169,9 +169,12 @@ UInt64 MergeTreeDataMergerMutator::getMaxSourcePartSizeForMutation() const auto data_settings = data.getSettings(); size_t busy_threads_in_pool = CurrentMetrics::values[CurrentMetrics::BackgroundPoolTask].load(std::memory_order_relaxed); + /// A data part can be stored on only one disk, so take the maximum free space over all disks. + UInt64 disk_space = data.storage_policy->getMaxUnreservedFreeSpace(); + /// Allow mutations only if there are enough threads, leave free threads for merges else if (background_pool_size - busy_threads_in_pool >= data_settings->number_of_free_entries_in_pool_to_execute_mutation) - return static_cast<UInt64>(data.storage_policy->getMaxUnreservedFreeSpace() / DISK_USAGE_COEFFICIENT_TO_RESERVE); + return static_cast<UInt64>(disk_space / DISK_USAGE_COEFFICIENT_TO_RESERVE); return 0; } diff --git a/dbms/src/Storages/MergeTree/MergeTreeIndexFullText.cpp b/dbms/src/Storages/MergeTree/MergeTreeIndexFullText.cpp index da3f1df8130..8041ad4dbe7 100644 --- a/dbms/src/Storages/MergeTree/MergeTreeIndexFullText.cpp +++ b/dbms/src/Storages/MergeTree/MergeTreeIndexFullText.cpp @@ -636,15 +636,16 @@ bool SplitTokenExtractor::next(const char * data, size_t len, size_t * pos, size { if (isASCII(data[*pos]) && !isAlphaNumericASCII(data[*pos])) { + /// Finish current token if any if (*token_len > 0) return true; *token_start = ++*pos; } else { - const size_t sz = UTF8::seqLength(static_cast<UInt8>(data[*pos])); - *pos += sz; - *token_len += sz; + /// Note that a multi-byte UTF-8 sequence consists entirely of non-ASCII bytes. + ++*pos; + ++*token_len; } } return *token_len > 0; diff --git a/dbms/src/Storages/MergeTree/MergeTreePartsMover.h b/dbms/src/Storages/MergeTree/MergeTreePartsMover.h index 0d1228d3591..cdf1bff8f88 100644 --- a/dbms/src/Storages/MergeTree/MergeTreePartsMover.h +++ b/dbms/src/Storages/MergeTree/MergeTreePartsMover.h @@ -49,7 +49,7 @@ public: const AllowedMovingPredicate & can_move, const std::lock_guard<std::mutex> & moving_parts_lock); - /// Copies part to selected reservation in detached folder. Throws exception if part alredy exists. + /// Copies part to selected reservation in detached folder. Throws exception if part already exists. std::shared_ptr<const MergeTreeDataPart> clonePart(const MergeTreeMoveEntry & moving_part) const; /// Replaces cloned part from detached directory into active data parts set.
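The byte-wise loop in the SplitTokenExtractor hunk above works because every byte of a multi-byte UTF-8 sequence has the high bit set, so non-ASCII bytes simply accumulate into the current token. A self-contained sketch of the same splitting rule (an illustration with simplified helpers, not the patched code, which tokenizes incrementally over a raw buffer):

    #include <string>
    #include <vector>

    /// A token is a maximal run of ASCII alphanumerics and/or non-ASCII bytes;
    /// advancing one byte at a time keeps whole UTF-8 characters inside the token.
    static bool isTokenByte(unsigned char c)
    {
        if (c >= 0x80) /// non-ASCII: lead or continuation byte of a UTF-8 sequence
            return true;
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    std::vector<std::string> splitTokens(const std::string & s)
    {
        std::vector<std::string> tokens;
        std::string current;
        for (unsigned char c : s)
        {
            if (isTokenByte(c))
                current.push_back(static_cast<char>(c));
            else if (!current.empty())
            {
                tokens.push_back(current);
                current.clear();
            }
        }
        if (!current.empty())
            tokens.push_back(current);
        return tokens;
    }

For example, splitTokens("привет, мир") yields the two tokens "привет" and "мир", byte-for-byte identical to what the old seqLength-based skipping produced, which is why the change is safe for the tokenbf full-text index.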
diff --git a/dbms/src/Storages/ReadInOrderOptimizer.cpp b/dbms/src/Storages/ReadInOrderOptimizer.cpp index cceaf9af578..667ce095932 100644 --- a/dbms/src/Storages/ReadInOrderOptimizer.cpp +++ b/dbms/src/Storages/ReadInOrderOptimizer.cpp @@ -23,7 +23,7 @@ ReadInOrderOptimizer::ReadInOrderOptimizer( throw Exception("Sizes of sort description and actions are mismatched", ErrorCodes::LOGICAL_ERROR); /// Do not analyze joined columns. - /// They may have aliases and come to descriprion as is. + /// They may have aliases and come to description as is. /// We can mismatch them with order key columns at stage of fetching columns. for (const auto & elem : syntax_result->array_join_result_to_source) forbidden_columns.insert(elem.first); diff --git a/dbms/src/Storages/StorageBuffer.cpp b/dbms/src/Storages/StorageBuffer.cpp index 0433f8848b6..53685d8694d 100644 --- a/dbms/src/Storages/StorageBuffer.cpp +++ b/dbms/src/Storages/StorageBuffer.cpp @@ -438,7 +438,7 @@ void StorageBuffer::startup() if (global_context.getSettingsRef().readonly) { LOG_WARNING(log, "Storage " << getName() << " is run with readonly settings, it will not be able to insert data." - << " Set apropriate system_profile to fix this."); + << " Set appropriate system_profile to fix this."); } flush_thread = ThreadFromGlobalPool(&StorageBuffer::flushThread, this); diff --git a/dbms/src/Storages/StorageFile.cpp b/dbms/src/Storages/StorageFile.cpp index a15d7ba414c..8df0385d55f 100644 --- a/dbms/src/Storages/StorageFile.cpp +++ b/dbms/src/Storages/StorageFile.cpp @@ -31,6 +31,7 @@ #include #include +#include namespace fs = std::filesystem; @@ -155,6 +156,17 @@ StorageFile::StorageFile(const std::string & table_path_, const std::string & us for (const auto & cur_path : paths) checkCreationIsAllowed(args.context, user_files_absolute_path, cur_path); + + if (args.format_name == "Distributed") + { + if (!paths.empty()) + { + auto & first_path = paths[0]; + Block header = StorageDistributedDirectoryMonitor::createStreamFromFile(first_path)->getHeader(); + + setColumns(ColumnsDescription(header.getNamesAndTypesList())); + } + } } StorageFile::StorageFile(const std::string & relative_table_dir_path, CommonArguments args) @@ -172,7 +184,9 @@ StorageFile::StorageFile(CommonArguments args) : table_name(args.table_name), database_name(args.database_name), format_name(args.format_name) , compression_method(args.compression_method), base_path(args.context.getPath()) { - setColumns(args.columns); + if (args.format_name != "Distributed") + setColumns(args.columns); + setConstraints(args.constraints); } @@ -182,8 +196,9 @@ public: StorageFileBlockInputStream(std::shared_ptr storage_, const Context & context, UInt64 max_block_size, std::string file_path, - const CompressionMethod compression_method) - : storage(std::move(storage_)) + const CompressionMethod compression_method, + BlockInputStreamPtr prepared_reader = nullptr) + : storage(std::move(storage_)), reader(std::move(prepared_reader)) { if (storage->use_table_fd) { @@ -211,7 +226,8 @@ public: read_buf = wrapReadBufferWithCompressionMethod(std::make_unique(file_path), compression_method); } - reader = FormatFactory::instance().getInput(storage->format_name, *read_buf, storage->getSampleBlock(), context, max_block_size); + if (!reader) + reader = FormatFactory::instance().getInput(storage->format_name, *read_buf, storage->getSampleBlock(), context, max_block_size); } String getName() const override @@ -268,8 +284,14 @@ BlockInputStreams StorageFile::read( blocks_input.reserve(paths.size()); for 
(const auto & file_path : paths) { - BlockInputStreamPtr cur_block = std::make_shared( - std::static_pointer_cast(shared_from_this()), context, max_block_size, file_path, chooseCompressionMethod(file_path, compression_method)); + BlockInputStreamPtr cur_block; + + if (format_name == "Distributed") + cur_block = StorageDistributedDirectoryMonitor::createStreamFromFile(file_path); + else + cur_block = std::make_shared( + std::static_pointer_cast(shared_from_this()), context, max_block_size, file_path, chooseCompressionMethod(file_path, compression_method)); + blocks_input.push_back(column_defaults.empty() ? cur_block : std::make_shared(cur_block, column_defaults, context)); } return narrowBlockInputStreams(blocks_input, num_streams); @@ -338,7 +360,11 @@ BlockOutputStreamPtr StorageFile::write( const ASTPtr & /*query*/, const Context & context) { - return std::make_shared(*this, chooseCompressionMethod(paths[0], compression_method), context); + if (format_name == "Distributed") + throw Exception("Method write is not implemented for Distributed format", ErrorCodes::NOT_IMPLEMENTED); + + return std::make_shared(*this, + chooseCompressionMethod(paths[0], compression_method), context); } Strings StorageFile::getDataPaths() const diff --git a/dbms/src/Storages/StorageMergeTree.cpp b/dbms/src/Storages/StorageMergeTree.cpp index 9a0583a464a..f857602cdde 100644 --- a/dbms/src/Storages/StorageMergeTree.cpp +++ b/dbms/src/Storages/StorageMergeTree.cpp @@ -328,10 +328,14 @@ public: else { MergeTreeDataPart::TTLInfos ttl_infos; + size_t max_volume_index = 0; for (auto & part_ptr : future_part_.parts) + { ttl_infos.update(part_ptr->ttl_infos); + max_volume_index = std::max(max_volume_index, storage.getStoragePolicy()->getVolumeIndexByDisk(part_ptr->disk)); + } - reserved_space = storage.tryReserveSpacePreferringTTLRules(total_size, ttl_infos, time(nullptr)); + reserved_space = storage.tryReserveSpacePreferringTTLRules(total_size, ttl_infos, time(nullptr), max_volume_index); } if (!reserved_space) { @@ -612,7 +616,13 @@ bool StorageMergeTree::merge( if (!selected) { if (out_disable_reason) - *out_disable_reason = "Cannot select parts for optimization"; + { + if (!out_disable_reason->empty()) + { + *out_disable_reason += ". "; + } + *out_disable_reason += "Cannot select parts for optimization"; + } return false; } @@ -697,9 +707,6 @@ bool StorageMergeTree::tryMutatePart() /// You must call destructor with unlocked `currently_processing_in_background_mutex`. std::optional tagger; { - /// DataPart can be store only at one disk. Get Max of free space at all disks - UInt64 disk_space = storage_policy->getMaxUnreservedFreeSpace(); - std::lock_guard lock(currently_processing_in_background_mutex); if (current_mutations_by_version.empty()) @@ -715,7 +722,7 @@ bool StorageMergeTree::tryMutatePart() if (mutations_begin_it == mutations_end_it) continue; - if (merger_mutator.getMaxSourcePartSizeForMutation() > disk_space) + if (merger_mutator.getMaxSourcePartSizeForMutation() < part->bytes_on_disk) continue; size_t current_ast_elements = 0; @@ -1145,7 +1152,7 @@ void StorageMergeTree::replacePartitionFrom(const StoragePtr & source_table, con if (!canReplacePartition(src_part)) throw Exception( "Cannot replace partition '" + partition_id + "' because part '" + src_part->name + "' has inconsistent granularity with table", - ErrorCodes::LOGICAL_ERROR); + ErrorCodes::BAD_ARGUMENTS); /// This will generate unique name in scope of current server process. 
Int64 temp_index = insert_increment.get(); diff --git a/dbms/src/Storages/StorageReplicatedMergeTree.cpp b/dbms/src/Storages/StorageReplicatedMergeTree.cpp index 45b6cdcebf8..67b9cbd5ca4 100644 --- a/dbms/src/Storages/StorageReplicatedMergeTree.cpp +++ b/dbms/src/Storages/StorageReplicatedMergeTree.cpp @@ -1053,12 +1053,14 @@ bool StorageReplicatedMergeTree::tryExecuteMerge(const LogEntry & entry) /// Can throw an exception while reserving space. MergeTreeDataPart::TTLInfos ttl_infos; + size_t max_volume_index = 0; for (auto & part_ptr : parts) { ttl_infos.update(part_ptr->ttl_infos); + max_volume_index = std::max(max_volume_index, getStoragePolicy()->getVolumeIndexByDisk(part_ptr->disk)); } ReservationPtr reserved_space = reserveSpacePreferringTTLRules(estimated_space_for_merge, - ttl_infos, time(nullptr)); + ttl_infos, time(nullptr), max_volume_index); auto table_lock = lockStructureForShare(false, RWLockImpl::NO_QUERY); @@ -3181,7 +3183,7 @@ bool StorageReplicatedMergeTree::optimize(const ASTPtr & query, const ASTPtr & p { /// NOTE Table lock must not be held while waiting. Some combination of R-W-R locks from different threads will yield to deadlock. for (auto & merge_entry : merge_entries) - waitForAllReplicasToProcessLogEntry(merge_entry); + waitForAllReplicasToProcessLogEntry(merge_entry, false); } return true; @@ -3889,13 +3891,19 @@ StorageReplicatedMergeTree::allocateBlockNumber( } -void StorageReplicatedMergeTree::waitForAllReplicasToProcessLogEntry(const ReplicatedMergeTreeLogEntryData & entry) +void StorageReplicatedMergeTree::waitForAllReplicasToProcessLogEntry(const ReplicatedMergeTreeLogEntryData & entry, bool wait_for_non_active) { LOG_DEBUG(log, "Waiting for all replicas to process " << entry.znode_name); - Strings replicas = getZooKeeper()->getChildren(zookeeper_path + "/replicas"); + auto zookeeper = getZooKeeper(); + Strings replicas = zookeeper->getChildren(zookeeper_path + "/replicas"); for (const String & replica : replicas) - waitForReplicaToProcessLogEntry(replica, entry); + { + if (wait_for_non_active || zookeeper->exists(zookeeper_path + "/replicas/" + replica + "/is_active")) + { + waitForReplicaToProcessLogEntry(replica, entry); + } + } LOG_DEBUG(log, "Finished waiting for all replicas to process " << entry.znode_name); } diff --git a/dbms/src/Storages/StorageReplicatedMergeTree.h b/dbms/src/Storages/StorageReplicatedMergeTree.h index 9c97abdff40..60c2ea0b870 100644 --- a/dbms/src/Storages/StorageReplicatedMergeTree.h +++ b/dbms/src/Storages/StorageReplicatedMergeTree.h @@ -486,7 +486,7 @@ private: * Because it effectively waits for other thread that usually has to also acquire a lock to proceed and this yields deadlock. * TODO: There are wrong usages of this method that are not fixed yet. */ - void waitForAllReplicasToProcessLogEntry(const ReplicatedMergeTreeLogEntryData & entry); + void waitForAllReplicasToProcessLogEntry(const ReplicatedMergeTreeLogEntryData & entry, bool wait_for_non_active = true); /** Wait until the specified replica executes the specified action from the log. * NOTE: See comment about locks above. 
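The new wait_for_non_active flag above keeps OPTIMIZE from blocking forever on replicas that are permanently down: optimize() now passes false, so only replicas currently holding their ephemeral is_active znode are waited on. A toy model of the selection logic (names mirror the hunk; ZooKeeper is stubbed, so this is an illustration rather than the production code):

    #include <functional>
    #include <string>
    #include <vector>

    struct FakeZooKeeper
    {
        std::function<std::vector<std::string>(const std::string &)> getChildren;
        std::function<bool(const std::string &)> exists;
    };

    void waitForAllReplicas(
        FakeZooKeeper & zk,
        const std::string & zookeeper_path,
        bool wait_for_non_active,
        const std::function<void(const std::string &)> & wait_for_replica)
    {
        for (const auto & replica : zk.getChildren(zookeeper_path + "/replicas"))
        {
            /// is_active is ephemeral: it disappears when the replica's session dies.
            if (wait_for_non_active || zk.exists(zookeeper_path + "/replicas/" + replica + "/is_active"))
                wait_for_replica(replica);
        }
    }

A replica can still drop out between the exists() check and the wait, so this is presumably a best-effort fix for OPTIMIZE rather than a strict liveness guarantee, which is why the flag defaults to true for the other call sites.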
diff --git a/dbms/src/Storages/StorageView.cpp b/dbms/src/Storages/StorageView.cpp index 824856dfc4e..5c8543bbb33 100644 --- a/dbms/src/Storages/StorageView.cpp +++ b/dbms/src/Storages/StorageView.cpp @@ -1,6 +1,7 @@ #include #include #include +#include #include #include @@ -23,6 +24,7 @@ namespace ErrorCodes { extern const int INCORRECT_QUERY; extern const int LOGICAL_ERROR; + extern const int ALIAS_REQUIRED; } @@ -62,8 +64,23 @@ BlockInputStreams StorageView::read( replaceTableNameWithSubquery(new_outer_select, new_inner_query); - if (PredicateExpressionsOptimizer(new_outer_select, context.getSettings(), context).optimize()) - current_inner_query = new_inner_query; + /// TODO: remove getTableExpressions and getTablesWithColumns + { + const auto & table_expressions = getTableExpressions(*new_outer_select); + const auto & tables_with_columns = getDatabaseAndTablesWithColumnNames(table_expressions, context); + + auto & settings = context.getSettingsRef(); + if (settings.joined_subquery_requires_alias && tables_with_columns.size() > 1) + { + for (auto & pr : tables_with_columns) + if (pr.table.table.empty() && pr.table.alias.empty()) + throw Exception("Not unique subquery in FROM requires an alias (or joined_subquery_requires_alias=0 to disable restriction).", + ErrorCodes::ALIAS_REQUIRED); + } + + if (PredicateExpressionsOptimizer(context, tables_with_columns, context.getSettings()).optimize(*new_outer_select)) + current_inner_query = new_inner_query; + } } QueryPipeline pipeline; diff --git a/dbms/src/Storages/transformQueryForExternalDatabase.cpp b/dbms/src/Storages/transformQueryForExternalDatabase.cpp index aab240dc070..f98a100637b 100644 --- a/dbms/src/Storages/transformQueryForExternalDatabase.cpp +++ b/dbms/src/Storages/transformQueryForExternalDatabase.cpp @@ -109,7 +109,7 @@ bool isCompatible(const IAST & node) return false; /// A tuple with zero or one elements is represented by a function tuple(x) and is not compatible, - /// but a normal tuple with more than one element is represented as a parenthesed expression (x, y) and is perfectly compatible. + /// but a normal tuple with more than one element is represented as a parenthesized expression (x, y) and is perfectly compatible. 
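/// Illustration (not part of the original comment): `(x, y) IN ((1, 2), (3, 4))` survives this check and is forwarded to the external database verbatim, while a one-element tuple only ever appears as the function call tuple(x) and is rejected below.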
if (name == "tuple" && function->arguments->children.size() <= 1) return false; diff --git a/dbms/src/TableFunctions/ITableFunctionFileLike.cpp b/dbms/src/TableFunctions/ITableFunctionFileLike.cpp index 7b1d342a64a..d7a6d8195e0 100644 --- a/dbms/src/TableFunctions/ITableFunctionFileLike.cpp +++ b/dbms/src/TableFunctions/ITableFunctionFileLike.cpp @@ -32,23 +32,36 @@ StoragePtr ITableFunctionFileLike::executeImpl(const ASTPtr & ast_function, cons ASTs & args = args_func.at(0)->children; - if (args.size() != 3 && args.size() != 4) - throw Exception("Table function '" + getName() + "' requires 3 or 4 arguments: filename, format, structure and compression method (default auto).", - ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); + if (args.size() < 2) + throw Exception("Table function '" + getName() + "' requires at least 2 arguments", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); for (size_t i = 0; i < args.size(); ++i) args[i] = evaluateConstantExpressionOrIdentifierAsLiteral(args[i], context); std::string filename = args[0]->as().value.safeGet(); std::string format = args[1]->as().value.safeGet(); - std::string structure = args[2]->as().value.safeGet(); + + if (args.size() == 2 && getName() == "file") + { + if (format != "Distributed") + throw Exception("Table function '" + getName() + "' allows 2 arguments only for Distributed format.", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); + } + else if (args.size() != 3 && args.size() != 4) + throw Exception("Table function '" + getName() + "' requires 3 or 4 arguments: filename, format, structure and compression method (default auto).", + ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); + + ColumnsDescription columns; std::string compression_method = "auto"; + if (args.size() > 2) + { + auto structure = args[2]->as().value.safeGet(); + columns = parseColumnsListFromString(structure, context); + } + if (args.size() == 4) compression_method = args[3]->as().value.safeGet(); - ColumnsDescription columns = parseColumnsListFromString(structure, context); - /// Create table StoragePtr storage = getStorage(filename, format, columns, const_cast(context), table_name, compression_method); diff --git a/dbms/src/TableFunctions/TableFunctionRemote.cpp b/dbms/src/TableFunctions/TableFunctionRemote.cpp index 87c8989cbe2..033839009bb 100644 --- a/dbms/src/TableFunctions/TableFunctionRemote.cpp +++ b/dbms/src/TableFunctions/TableFunctionRemote.cpp @@ -14,6 +14,7 @@ #include #include #include +#include #include "registerTableFunctions.h" @@ -140,7 +141,10 @@ StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const C if (!cluster_name.empty()) { /// Use an existing cluster from the main config - cluster = context.getCluster(cluster_name); + if (name != "clusterAllReplicas") + cluster = context.getCluster(cluster_name); + else + cluster = context.getCluster(cluster_name)->getClusterWithReplicasAsShards(context.getSettings()); } else { @@ -164,13 +168,22 @@ StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const C { size_t colon = host.find(':'); if (colon == String::npos) - context.getRemoteHostFilter().checkHostAndPort(host, toString((secure ? (maybe_secure_port ? *maybe_secure_port : DBMS_DEFAULT_SECURE_PORT) : context.getTCPPort()))); + context.getRemoteHostFilter().checkHostAndPort( + host, + toString((secure ? (maybe_secure_port ? 
*maybe_secure_port : DBMS_DEFAULT_SECURE_PORT) : context.getTCPPort()))); else context.getRemoteHostFilter().checkHostAndPort(host.substr(0, colon), host.substr(colon + 1)); } } - cluster = std::make_shared(context.getSettings(), names, username, password, (secure ? (maybe_secure_port ? *maybe_secure_port : DBMS_DEFAULT_SECURE_PORT) : context.getTCPPort()), false, secure); + cluster = std::make_shared( + context.getSettings(), + names, + username, + password, + (secure ? (maybe_secure_port ? *maybe_secure_port : DBMS_DEFAULT_SECURE_PORT) : context.getTCPPort()), + false, + secure); } auto structure_remote_table = getStructureOfRemoteTable(*cluster, remote_database, remote_table, context, remote_table_function_ptr); @@ -198,7 +211,7 @@ StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const C TableFunctionRemote::TableFunctionRemote(const std::string & name_, bool secure_) : name{name_}, secure{secure_} { - is_cluster_function = name == "cluster"; + is_cluster_function = (name == "cluster" || name == "clusterAllReplicas"); std::stringstream ss; ss << "Table function '" << name + "' requires from 2 to " << (is_cluster_function ? 3 : 5) << " parameters" @@ -213,6 +226,7 @@ void registerTableFunctionRemote(TableFunctionFactory & factory) factory.registerFunction("remote", [] () -> TableFunctionPtr { return std::make_shared("remote"); }); factory.registerFunction("remoteSecure", [] () -> TableFunctionPtr { return std::make_shared("remote", /* secure = */ true); }); factory.registerFunction("cluster", [] () -> TableFunctionPtr { return std::make_shared("cluster"); }); + factory.registerFunction("clusterAllReplicas", [] () -> TableFunctionPtr { return std::make_shared("clusterAllReplicas"); }); } } diff --git a/dbms/tests/integration/test_cluster_all_replicas/__init__.py b/dbms/tests/integration/test_cluster_all_replicas/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/dbms/tests/integration/test_cluster_all_replicas/configs/remote_servers.xml b/dbms/tests/integration/test_cluster_all_replicas/configs/remote_servers.xml new file mode 100644 index 00000000000..68dcfcc1460 --- /dev/null +++ b/dbms/tests/integration/test_cluster_all_replicas/configs/remote_servers.xml @@ -0,0 +1,16 @@ + + + + + + node1 + 9000 + + + node2 + 9000 + + + + + diff --git a/dbms/tests/integration/test_cluster_all_replicas/test.py b/dbms/tests/integration/test_cluster_all_replicas/test.py new file mode 100644 index 00000000000..0af5693fc75 --- /dev/null +++ b/dbms/tests/integration/test_cluster_all_replicas/test.py @@ -0,0 +1,21 @@ +import pytest + +from helpers.cluster import ClickHouseCluster + +cluster = ClickHouseCluster(__file__) + +node1 = cluster.add_instance('node1', main_configs=['configs/remote_servers.xml'], with_zookeeper=True) +node2 = cluster.add_instance('node2', main_configs=['configs/remote_servers.xml'], with_zookeeper=True) + +@pytest.fixture(scope="module") +def start_cluster(): + try: + cluster.start() + yield cluster + finally: + cluster.shutdown() + + +def test_remote(start_cluster): + assert node1.query('''SELECT hostName() FROM clusterAllReplicas("two_shards", system.one)''') == 'node1\nnode2\n' + assert node1.query('''SELECT hostName() FROM cluster("two_shards", system.one)''') == 'node1\n' diff --git a/dbms/tests/integration/test_cluster_copier/task_no_index.xml b/dbms/tests/integration/test_cluster_copier/task_no_index.xml index c9359aa9278..e81743efc2d 100644 --- a/dbms/tests/integration/test_cluster_copier/task_no_index.xml +++ 
b/dbms/tests/integration/test_cluster_copier/task_no_index.xml @@ -91,7 +91,7 @@ NOTE: In spite of this section is optional (if it is not specified, all partitions will be copied), it is strictly recommended to specify them explicitly. - If you already have some ready paritions on destination cluster they + If you already have some ready partitions on destination cluster they will be removed at the start of the copying since they will be interpeted as unfinished data from the previous copying!!! --> diff --git a/dbms/tests/integration/test_dictionaries_all_layouts_and_sources/http_server.py b/dbms/tests/integration/test_dictionaries_all_layouts_and_sources/http_server.py index 5eb1d3cca64..dd268b3a417 100644 --- a/dbms/tests/integration/test_dictionaries_all_layouts_and_sources/http_server.py +++ b/dbms/tests/integration/test_dictionaries_all_layouts_and_sources/http_server.py @@ -6,7 +6,7 @@ import ssl import csv -# Decorator used to see if authentification works for external dictionary who use a HTTP source. +# Decorator used to see if authentication works for external dictionary who use a HTTP source. def check_auth(fn): def wrapper(req): auth_header = req.headers.get('authorization', None) diff --git a/dbms/tests/integration/test_distributed_format/__init__.py b/dbms/tests/integration/test_distributed_format/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/dbms/tests/integration/test_distributed_format/configs/remote_servers.xml b/dbms/tests/integration/test_distributed_format/configs/remote_servers.xml new file mode 100644 index 00000000000..7d8d64bb78b --- /dev/null +++ b/dbms/tests/integration/test_distributed_format/configs/remote_servers.xml @@ -0,0 +1,12 @@ + + + + + + not_existing + 9000 + + + + + diff --git a/dbms/tests/integration/test_distributed_format/test.py b/dbms/tests/integration/test_distributed_format/test.py new file mode 100644 index 00000000000..2bcc8ab8063 --- /dev/null +++ b/dbms/tests/integration/test_distributed_format/test.py @@ -0,0 +1,59 @@ +import time +import pytest + +from helpers.cluster import ClickHouseCluster +from multiprocessing.dummy import Pool +from helpers.client import QueryRuntimeException, QueryTimeoutExceedException + +from helpers.test_tools import assert_eq_with_retry + + +cluster = ClickHouseCluster(__file__) +node = cluster.add_instance('node', config_dir="configs", main_configs=['configs/remote_servers.xml']) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + yield cluster + + finally: + cluster.shutdown() + + +def test_single_file(started_cluster): + node.query("create table distr_1 (x UInt64, s String) engine = Distributed('test_cluster', database, table)") + node.query("insert into distr_1 values (1, 'a'), (2, 'bb'), (3, 'ccc')") + + query = "select * from file('/var/lib/clickhouse/data/default/distr_1/default@not_existing:9000/1.bin', 'Distributed')" + out = node.exec_in_container(['/usr/bin/clickhouse', 'local', '--stacktrace', '-q', query]) + + assert out == '1\ta\n2\tbb\n3\tccc\n' + + query = "create table t (dummy UInt32) engine = File('Distributed', '/var/lib/clickhouse/data/default/distr_1/default@not_existing:9000/1.bin');" \ + "select * from t" + out = node.exec_in_container(['/usr/bin/clickhouse', 'local', '--stacktrace', '-q', query]) + + assert out == '1\ta\n2\tbb\n3\tccc\n' + + node.query("drop table distr_1") + + +def test_two_files(started_cluster): + node.query("create table distr_2 (x UInt64, s String) engine = Distributed('test_cluster', database, 
table)") + node.query("insert into distr_2 values (0, '_'), (1, 'a')") + node.query("insert into distr_2 values (2, 'bb'), (3, 'ccc')") + + query = "select * from file('/var/lib/clickhouse/data/default/distr_2/default@not_existing:9000/{1,2,3,4}.bin', 'Distributed') order by x" + out = node.exec_in_container(['/usr/bin/clickhouse', 'local', '--stacktrace', '-q', query]) + + assert out == '0\t_\n1\ta\n2\tbb\n3\tccc\n' + + query = "create table t (dummy UInt32) engine = File('Distributed', '/var/lib/clickhouse/data/default/distr_2/default@not_existing:9000/{1,2,3,4}.bin');" \ + "select * from t order by x" + out = node.exec_in_container(['/usr/bin/clickhouse', 'local', '--stacktrace', '-q', query]) + + assert out == '0\t_\n1\ta\n2\tbb\n3\tccc\n' + + node.query("drop table distr_2") diff --git a/dbms/tests/integration/test_row_policy/multiple_tags_with_table_names.xml b/dbms/tests/integration/test_row_policy/multiple_tags_with_table_names.xml new file mode 100644 index 00000000000..87b22047e7e --- /dev/null +++ b/dbms/tests/integration/test_row_policy/multiple_tags_with_table_names.xml @@ -0,0 +1,26 @@ + + + + + + + + + + a = 1 +
+ + + + a + b < 1 or c - d > 5 +
+ + + + c = 1 +
+
+
+
+
+
diff --git a/dbms/tests/integration/test_row_policy/tag_with_table_name.xml b/dbms/tests/integration/test_row_policy/tag_with_table_name.xml new file mode 100644 index 00000000000..4affd2d9038 --- /dev/null +++ b/dbms/tests/integration/test_row_policy/tag_with_table_name.xml @@ -0,0 +1,26 @@ + + + + + + + + + + a = 1 +
+ + + + a + b < 1 or c - d > 5 +
+ + + + c = 1 + +
+
+
+
+
diff --git a/dbms/tests/integration/test_row_policy/test.py b/dbms/tests/integration/test_row_policy/test.py index 421a4b0510c..ae64d1e5a3a 100644 --- a/dbms/tests/integration/test_row_policy/test.py +++ b/dbms/tests/integration/test_row_policy/test.py @@ -28,13 +28,19 @@ def started_cluster(): CREATE TABLE mydb.filtered_table1 (a UInt8, b UInt8) ENGINE MergeTree ORDER BY a; INSERT INTO mydb.filtered_table1 values (0, 0), (0, 1), (1, 0), (1, 1); + CREATE TABLE mydb.table (a UInt8, b UInt8) ENGINE MergeTree ORDER BY a; + INSERT INTO mydb.table values (0, 0), (0, 1), (1, 0), (1, 1); + CREATE TABLE mydb.filtered_table2 (a UInt8, b UInt8, c UInt8, d UInt8) ENGINE MergeTree ORDER BY a; INSERT INTO mydb.filtered_table2 values (0, 0, 0, 0), (1, 2, 3, 4), (4, 3, 2, 1), (0, 0, 6, 0); CREATE TABLE mydb.filtered_table3 (a UInt8, b UInt8, c UInt16 ALIAS a + b) ENGINE MergeTree ORDER BY a; INSERT INTO mydb.filtered_table3 values (0, 0), (0, 1), (1, 0), (1, 1); + + CREATE TABLE mydb.`.filtered_table4` (a UInt8, b UInt8, c UInt16 ALIAS a + b) ENGINE MergeTree ORDER BY a; + INSERT INTO mydb.`.filtered_table4` values (0, 0), (0, 1), (1, 0), (1, 1); ''') - + yield cluster finally: @@ -58,6 +64,7 @@ def test_smoke(): assert instance.query("SELECT a FROM mydb.filtered_table1") == "1\n1\n" assert instance.query("SELECT b FROM mydb.filtered_table1") == "0\n1\n" assert instance.query("SELECT a FROM mydb.filtered_table1 WHERE a = 1") == "1\n1\n" + assert instance.query("SELECT a FROM mydb.filtered_table1 WHERE a IN (1)") == "1\n1\n" assert instance.query("SELECT a = 1 FROM mydb.filtered_table1") == "1\n1\n" assert instance.query("SELECT a FROM mydb.filtered_table3") == "0\n1\n" @@ -88,6 +95,46 @@ def test_prewhere_not_supported(): assert expected_error in instance.query_and_get_error("SELECT * FROM mydb.filtered_table3 PREWHERE 1") +def test_single_table_name(): + copy_policy_xml('tag_with_table_name.xml') + assert instance.query("SELECT * FROM mydb.table") == "1\t0\n1\t1\n" + assert instance.query("SELECT * FROM mydb.filtered_table2") == "0\t0\t0\t0\n0\t0\t6\t0\n" + assert instance.query("SELECT * FROM mydb.filtered_table3") == "0\t1\n1\t0\n" + + assert instance.query("SELECT a FROM mydb.table") == "1\n1\n" + assert instance.query("SELECT b FROM mydb.table") == "0\n1\n" + assert instance.query("SELECT a FROM mydb.table WHERE a = 1") == "1\n1\n" + assert instance.query("SELECT a = 1 FROM mydb.table") == "1\n1\n" + + assert instance.query("SELECT a FROM mydb.filtered_table3") == "0\n1\n" + assert instance.query("SELECT b FROM mydb.filtered_table3") == "1\n0\n" + assert instance.query("SELECT c FROM mydb.filtered_table3") == "1\n1\n" + assert instance.query("SELECT a + b FROM mydb.filtered_table3") == "1\n1\n" + assert instance.query("SELECT a FROM mydb.filtered_table3 WHERE c = 1") == "0\n1\n" + assert instance.query("SELECT c = 1 FROM mydb.filtered_table3") == "1\n1\n" + assert instance.query("SELECT a + b = 1 FROM mydb.filtered_table3") == "1\n1\n" + + +def test_custom_table_name(): + copy_policy_xml('multiple_tags_with_table_names.xml') + assert instance.query("SELECT * FROM mydb.table") == "1\t0\n1\t1\n" + assert instance.query("SELECT * FROM mydb.filtered_table2") == "0\t0\t0\t0\n0\t0\t6\t0\n" + assert instance.query("SELECT * FROM mydb.`.filtered_table4`") == "0\t1\n1\t0\n" + + assert instance.query("SELECT a FROM mydb.table") == "1\n1\n" + assert instance.query("SELECT b FROM mydb.table") == "0\n1\n" + assert instance.query("SELECT a FROM mydb.table WHERE a = 1") == "1\n1\n" + assert instance.query("SELECT 
a = 1 FROM mydb.table") == "1\n1\n" + + assert instance.query("SELECT a FROM mydb.`.filtered_table4`") == "0\n1\n" + assert instance.query("SELECT b FROM mydb.`.filtered_table4`") == "1\n0\n" + assert instance.query("SELECT c FROM mydb.`.filtered_table4`") == "1\n1\n" + assert instance.query("SELECT a + b FROM mydb.`.filtered_table4`") == "1\n1\n" + assert instance.query("SELECT a FROM mydb.`.filtered_table4` WHERE c = 1") == "0\n1\n" + assert instance.query("SELECT c = 1 FROM mydb.`.filtered_table4`") == "1\n1\n" + assert instance.query("SELECT a + b = 1 FROM mydb.`.filtered_table4`") == "1\n1\n" + + def test_change_of_users_xml_changes_row_policies(): copy_policy_xml('normal_filters.xml') assert instance.query("SELECT * FROM mydb.filtered_table1") == "1\t0\n1\t1\n" diff --git a/dbms/tests/performance/codecs_float_insert.xml b/dbms/tests/performance/codecs_float_insert.xml index 2a39dfc48d6..1bbbf6b2d92 100644 --- a/dbms/tests/performance/codecs_float_insert.xml +++ b/dbms/tests/performance/codecs_float_insert.xml @@ -37,14 +37,20 @@ rnd + + num_rows + + 1000000 + + CREATE TABLE IF NOT EXISTS codec_{seq_type}_{type}_{codec} (n {type} CODEC({codec})) ENGINE = MergeTree PARTITION BY tuple() ORDER BY tuple(); - INSERT INTO codec_seq_Float64_{codec} (n) SELECT number/pi() FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_mon_Float64_{codec} (n) SELECT number+sin(number) FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_rnd_Float64_{codec} (n) SELECT (rand() - 4294967295)/pi() FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 + INSERT INTO codec_seq_{type}_{codec} (n) SELECT number/pi() FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_mon_{type}_{codec} (n) SELECT number+sin(number) FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_rnd_{type}_{codec} (n) SELECT (intHash64(number) - 4294967295)/pi() FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 DROP TABLE IF EXISTS codec_{seq_type}_{type}_{codec} diff --git a/dbms/tests/performance/codecs_float_select.xml b/dbms/tests/performance/codecs_float_select.xml index f23b363b914..1d3957c8da9 100644 --- a/dbms/tests/performance/codecs_float_select.xml +++ b/dbms/tests/performance/codecs_float_select.xml @@ -37,18 +37,21 @@ rnd + + num_rows + + 1000000 + + CREATE TABLE IF NOT EXISTS codec_{seq_type}_{type}_{codec} (n {type} CODEC({codec})) ENGINE = MergeTree PARTITION BY tuple() ORDER BY tuple(); - - INSERT INTO codec_seq_Float64_{codec} (n) SELECT number/pi() FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_mon_Float64_{codec} (n) SELECT number+sin(number) FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_rnd_Float64_{codec} (n) SELECT (rand() - 4294967295)/pi() FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 + INSERT INTO codec_seq_{type}_{codec} (n) SELECT number/pi() FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_mon_{type}_{codec} (n) SELECT number+sin(number) FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_rnd_{type}_{codec} (n) SELECT (intHash64(number) - 4294967295)/pi() FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 - - SELECT count(n) FROM codec_{seq_type}_{type}_{codec} WHERE ignore(n) LIMIT 100000 SETTINGS max_threads=1 + SELECT count(n) FROM codec_{seq_type}_{type}_{codec} WHERE ignore(n) == 0 LIMIT {num_rows} SETTINGS max_threads=1 DROP TABLE IF EXISTS 
codec_{seq_type}_{type}_{codec} diff --git a/dbms/tests/performance/codecs_int_insert.xml b/dbms/tests/performance/codecs_int_insert.xml index 742693d49fe..3ee293f9d16 100644 --- a/dbms/tests/performance/codecs_int_insert.xml +++ b/dbms/tests/performance/codecs_int_insert.xml @@ -39,14 +39,20 @@ rnd + + num_rows + + 1000000 + + CREATE TABLE IF NOT EXISTS codec_{seq_type}_{type}_{codec} (n {type} CODEC({codec})) ENGINE = MergeTree PARTITION BY tuple() ORDER BY tuple(); - INSERT INTO codec_seq_UInt64_{codec} (n) SELECT number FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_mon_UInt64_{codec} (n) SELECT number*512+(rand()%512) FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_rnd_UInt64_{codec} (n) SELECT rand() FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 + INSERT INTO codec_seq_{type}_{codec} (n) SELECT number FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_mon_{type}_{codec} (n) SELECT number*512+(intHash64(number)%512) FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_rnd_{type}_{codec} (n) SELECT intHash64(number) FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 DROP TABLE IF EXISTS codec_{seq_type}_{type}_{codec} diff --git a/dbms/tests/performance/codecs_int_select.xml b/dbms/tests/performance/codecs_int_select.xml index 9c007863cd8..40ebfd4d000 100644 --- a/dbms/tests/performance/codecs_int_select.xml +++ b/dbms/tests/performance/codecs_int_select.xml @@ -39,18 +39,21 @@ rnd + + num_rows + + 1000000 + + CREATE TABLE IF NOT EXISTS codec_{seq_type}_{type}_{codec} (n {type} CODEC({codec})) ENGINE = MergeTree PARTITION BY tuple() ORDER BY tuple(); - - INSERT INTO codec_seq_UInt64_{codec} (n) SELECT number FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_mon_UInt64_{codec} (n) SELECT number*512+(rand()%512) FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 - INSERT INTO codec_rnd_UInt64_{codec} (n) SELECT rand() FROM system.numbers LIMIT 100000 SETTINGS max_threads=1 + INSERT INTO codec_seq_{type}_{codec} (n) SELECT number FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_mon_{type}_{codec} (n) SELECT number*512+(intHash64(number)%512) FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 + INSERT INTO codec_rnd_{type}_{codec} (n) SELECT intHash64(number) FROM system.numbers LIMIT {num_rows} SETTINGS max_threads=1 - - SELECT count(n) FROM codec_{seq_type}_{type}_{codec} WHERE ignore(n) LIMIT 100000 SETTINGS max_threads=1 + SELECT count(n) FROM codec_{seq_type}_{type}_{codec} WHERE ignore(n) == 0 LIMIT {num_rows} SETTINGS max_threads=1 DROP TABLE IF EXISTS codec_{seq_type}_{type}_{codec} diff --git a/dbms/tests/queries/0_stateless/00597_push_down_predicate.reference b/dbms/tests/queries/0_stateless/00597_push_down_predicate.reference index f64243e9be7..9fde80689f1 100644 --- a/dbms/tests/queries/0_stateless/00597_push_down_predicate.reference +++ b/dbms/tests/queries/0_stateless/00597_push_down_predicate.reference @@ -13,7 +13,7 @@ SELECT \n a, \n b\nFROM \n(\n SELECT \n 1 AS a, \n 1 AS b -------Need push down------- SELECT toString(value) AS value\nFROM \n(\n SELECT 1 AS value\n) 1 -SELECT id\nFROM \n(\n SELECT 1 AS id\n UNION ALL\n SELECT 2 AS `2`\n WHERE 0\n)\nWHERE id = 1 +SELECT id\nFROM \n(\n SELECT 1 AS id\n UNION ALL\n SELECT 2 AS `--predicate_optimizer_0`\n WHERE 0\n)\nWHERE id = 1 1 SELECT id\nFROM \n(\n SELECT arrayJoin([1, 2, 3]) AS id\n WHERE id = 1\n)\nWHERE id = 1 1 diff 
--git a/dbms/tests/queries/0_stateless/00915_simple_aggregate_function.sql b/dbms/tests/queries/0_stateless/00915_simple_aggregate_function.sql index 037032a84cc..030893e3ea1 100644 --- a/dbms/tests/queries/0_stateless/00915_simple_aggregate_function.sql +++ b/dbms/tests/queries/0_stateless/00915_simple_aggregate_function.sql @@ -35,4 +35,6 @@ insert into simple values(10,'22222222222222222222222222222222222222222222222222 select * from simple final; select toTypeName(nullable_str),toTypeName(low_str),toTypeName(ip),toTypeName(status) from simple limit 1; +optimize table simple final; + drop table simple; diff --git a/dbms/tests/queries/0_stateless/00929_multi_match_edit_distance.sql b/dbms/tests/queries/0_stateless/00929_multi_match_edit_distance.sql index 48b31070204..c0f39a4f201 100644 --- a/dbms/tests/queries/0_stateless/00929_multi_match_edit_distance.sql +++ b/dbms/tests/queries/0_stateless/00929_multi_match_edit_distance.sql @@ -3,8 +3,8 @@ SET send_logs_level = 'none'; select 0 = multiFuzzyMatchAny('abc', 0, ['a1c']) from system.numbers limit 5; select 1 = multiFuzzyMatchAny('abc', 1, ['a1c']) from system.numbers limit 5; select 1 = multiFuzzyMatchAny('abc', 2, ['a1c']) from system.numbers limit 5; -select 1 = multiFuzzyMatchAny('abc', 3, ['a1c']) from system.numbers limit 5; -- { serverError 49 } -select 1 = multiFuzzyMatchAny('abc', 4, ['a1c']) from system.numbers limit 5; -- { serverError 49 } +select 1 = multiFuzzyMatchAny('abc', 3, ['a1c']) from system.numbers limit 5; -- { serverError 36 } +select 1 = multiFuzzyMatchAny('abc', 4, ['a1c']) from system.numbers limit 5; -- { serverError 36 } select 1 = multiFuzzyMatchAny('leftabcright', 1, ['a1c']) from system.numbers limit 5; @@ -14,7 +14,7 @@ select 0 = multiFuzzyMatchAny('halo some wrld', 2, ['^hello.*world$']); select 1 = multiFuzzyMatchAny('halo some wrld', 2, ['^hello.*world$', '^halo.*world$']); select 1 = multiFuzzyMatchAny('halo some wrld', 2, ['^halo.*world$', '^hello.*world$']); select 1 = multiFuzzyMatchAny('halo some wrld', 3, ['^hello.*world$']); -select 1 = multiFuzzyMatchAny('hello some world', 10, ['^hello.*world$']); -- { serverError 49 } +select 1 = multiFuzzyMatchAny('hello some world', 10, ['^hello.*world$']); -- { serverError 36 } select 1 = multiFuzzyMatchAny('hello some world', -1, ['^hello.*world$']); -- { serverError 43 } select 1 = multiFuzzyMatchAny('hello some world', 10000000000, ['^hello.*world$']); -- { serverError 44 } select 1 = multiFuzzyMatchAny('http://hyperscan_is_nice.ru/st', 2, ['http://hyperscan_is_nice.ru/(st\\d\\d$|st\\d\\d\\.|st1[0-4]\\d|st150|st\\d$|gl|rz|ch)']); diff --git a/dbms/tests/queries/0_stateless/00949_format.sql b/dbms/tests/queries/0_stateless/00949_format.sql index 1786a2a3e1e..433ababde9d 100644 --- a/dbms/tests/queries/0_stateless/00949_format.sql +++ b/dbms/tests/queries/0_stateless/00949_format.sql @@ -16,25 +16,25 @@ select 100 = length(format(concat((select arrayStringConcat(arrayMap(x ->'}', ra select format('', 'first'); select concat('third', 'first', 'second')=format('{2}{0}{1}', 'first', 'second', 'third'); -select format('{', ''); -- { serverError 49 } -select format('{{}', ''); -- { serverError 49 } -select format('{ {}', ''); -- { serverError 49 } -select format('}', ''); -- { serverError 49 } +select format('{', ''); -- { serverError 36 } +select format('{{}', ''); -- { serverError 36 } +select format('{ {}', ''); -- { serverError 36 } +select format('}', ''); -- { serverError 36 } select format('{{', ''); -select format('{}}', ''); -- { serverError 49 } 
+select format('{}}', ''); -- { serverError 36 } select format('}}', ''); -select format('{2 }', ''); -- { serverError 49 } -select format('{}{}{}{}{}{} }{}', '', '', '', '', '', '', ''); -- { serverError 49 } -select format('{sometext}', ''); -- { serverError 49 } -select format('{\0sometext}', ''); -- { serverError 49 } -select format('{1023}', ''); -- { serverError 49 } -select format('{10000000000000000000000000000000000000000000000000}', ''); -- { serverError 49 } -select format('{} {0}', '', ''); -- { serverError 49 } -select format('{0} {}', '', ''); -- { serverError 49 } -select format('Hello {} World {} {}{}', 'first', 'second', 'third') from system.numbers limit 2; -- { serverError 49 } -select format('Hello {0} World {1} {2}{3}', 'first', 'second', 'third') from system.numbers limit 2; -- { serverError 49 } +select format('{2 }', ''); -- { serverError 36 } +select format('{}{}{}{}{}{} }{}', '', '', '', '', '', '', ''); -- { serverError 36 } +select format('{sometext}', ''); -- { serverError 36 } +select format('{\0sometext}', ''); -- { serverError 36 } +select format('{1023}', ''); -- { serverError 36 } +select format('{10000000000000000000000000000000000000000000000000}', ''); -- { serverError 36 } +select format('{} {0}', '', ''); -- { serverError 36 } +select format('{0} {}', '', ''); -- { serverError 36 } +select format('Hello {} World {} {}{}', 'first', 'second', 'third') from system.numbers limit 2; -- { serverError 36 } +select format('Hello {0} World {1} {2}{3}', 'first', 'second', 'third') from system.numbers limit 2; -- { serverError 36 } -select 50 = length(format((select arrayStringConcat(arrayMap(x ->'{', range(101)))), '')); -- { serverError 49 } +select 50 = length(format((select arrayStringConcat(arrayMap(x ->'{', range(101)))), '')); -- { serverError 36 } select format('{}{}{}', materialize(toFixedString('a', 1)), materialize(toFixedString('b', 1)), materialize(toFixedString('c', 1))) == 'abc'; select format('{}{}{}', materialize(toFixedString('a', 1)), materialize('b'), materialize(toFixedString('c', 1))) == 'abc'; diff --git a/dbms/tests/queries/0_stateless/01013_totals_without_aggregation.sql b/dbms/tests/queries/0_stateless/01013_totals_without_aggregation.sql index bed393b63d3..584a8994767 100755 --- a/dbms/tests/queries/0_stateless/01013_totals_without_aggregation.sql +++ b/dbms/tests/queries/0_stateless/01013_totals_without_aggregation.sql @@ -1,6 +1,6 @@ SELECT 11 AS n GROUP BY n WITH TOTALS; SELECT 12 AS n GROUP BY n WITH ROLLUP; SELECT 13 AS n GROUP BY n WITH CUBE; -SELECT 1 AS n WITH TOTALS; -- { serverError 49 } -SELECT 1 AS n WITH ROLLUP; -- { serverError 49 } -SELECT 1 AS n WITH CUBE; -- { serverError 49 } +SELECT 1 AS n WITH TOTALS; -- { serverError 48 } +SELECT 1 AS n WITH ROLLUP; -- { serverError 48 } +SELECT 1 AS n WITH CUBE; -- { serverError 48 } diff --git a/dbms/tests/queries/0_stateless/01033_substr_negative_size_arg.reference b/dbms/tests/queries/0_stateless/01033_substr_negative_size_arg.reference index 98c07557034..db3a106ac7f 100644 --- a/dbms/tests/queries/0_stateless/01033_substr_negative_size_arg.reference +++ b/dbms/tests/queries/0_stateless/01033_substr_negative_size_arg.reference @@ -1,8 +1,8 @@ -lickhous -lickhous -lickhous -lickhous -lickhous -lickhous -lickhous -lickhous +lickhou +lickhou +lickhou +lickhou +lickhou +lickhou +lickhou +lickhou diff --git a/dbms/tests/queries/0_stateless/01052_array_reduce_exception.sql b/dbms/tests/queries/0_stateless/01052_array_reduce_exception.sql index 71c030a055c..2bdfc2136a2 100644 --- 
a/dbms/tests/queries/0_stateless/01052_array_reduce_exception.sql +++ b/dbms/tests/queries/0_stateless/01052_array_reduce_exception.sql @@ -1 +1 @@ -SELECT arrayReduce('aggThrow(0.0001)', range(number % 10)) FROM system.numbers; -- { serverError 503 } +SELECT arrayReduce('aggThrow(0.0001)', range(number % 10)) FROM system.numbers FORMAT Null; -- { serverError 503 } diff --git a/dbms/tests/queries/0_stateless/01056_negative_with_bloom_filter.reference b/dbms/tests/queries/0_stateless/01056_negative_with_bloom_filter.reference new file mode 100644 index 00000000000..20c726bd280 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01056_negative_with_bloom_filter.reference @@ -0,0 +1,4 @@ +-1 -1 -1 -1 +-1 -1 -1 -1 +-1 -1 -1 -1 +-1 -1 -1 -1 diff --git a/dbms/tests/queries/0_stateless/01056_negative_with_bloom_filter.sql b/dbms/tests/queries/0_stateless/01056_negative_with_bloom_filter.sql new file mode 100644 index 00000000000..271754b848f --- /dev/null +++ b/dbms/tests/queries/0_stateless/01056_negative_with_bloom_filter.sql @@ -0,0 +1,14 @@ +SET allow_experimental_data_skipping_indices = 1; + +DROP TABLE IF EXISTS test; + +CREATE TABLE test (`int8` Int8, `int16` Int16, `int32` Int32, `int64` Int64, INDEX idx (`int8`, `int16`, `int32`, `int64`) TYPE bloom_filter(0.01) GRANULARITY 8192 ) ENGINE = MergeTree() ORDER BY `int8`; + +INSERT INTO test VALUES (-1, -1, -1, -1); + +SELECT * FROM test WHERE `int8` = -1; +SELECT * FROM test WHERE `int16` = -1; +SELECT * FROM test WHERE `int32` = -1; +SELECT * FROM test WHERE `int64` = -1; + +DROP TABLE IF EXISTS test; diff --git a/dbms/tests/queries/0_stateless/01056_predicate_optimizer_bugs.reference b/dbms/tests/queries/0_stateless/01056_predicate_optimizer_bugs.reference new file mode 100644 index 00000000000..019e95cb359 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01056_predicate_optimizer_bugs.reference @@ -0,0 +1,28 @@ +SELECT \n k, \n v, \n d, \n i\nFROM \n(\n SELECT \n t.1 AS k, \n t.2 AS v, \n runningDifference(v) AS d, \n runningDifference(cityHash64(t.1)) AS i\n FROM \n (\n SELECT arrayJoin([(\'a\', 1), (\'a\', 2), (\'a\', 3), (\'b\', 11), (\'b\', 13), (\'b\', 15)]) AS t\n )\n)\nWHERE i = 0 +a 1 0 0 +a 2 1 0 +a 3 1 0 +b 13 2 0 +b 15 2 0 +SELECT \n co, \n co2, \n co3, \n num\nFROM \n(\n SELECT \n co, \n co2, \n co3, \n count() AS num\n FROM \n (\n SELECT \n 1 AS co, \n 2 AS co2, \n 3 AS co3\n )\n GROUP BY \n co, \n co2, \n co3\n WITH CUBE\n HAVING (co != 0) AND (co2 != 2)\n)\nWHERE (co != 0) AND (co2 != 2) +1 0 3 1 +1 0 0 1 +SELECT alias AS name\nFROM \n(\n SELECT name AS alias\n FROM system.settings\n WHERE alias = \'enable_optimize_predicate_expression\'\n)\nANY INNER JOIN \n(\n SELECT name\n FROM system.settings\n) USING (name)\nWHERE name = \'enable_optimize_predicate_expression\' +enable_optimize_predicate_expression +1 val11 val21 val31 +SELECT ccc\nFROM \n(\n SELECT 1 AS ccc\n WHERE 0\n UNION ALL\n SELECT ccc\n FROM \n (\n SELECT 2 AS ccc\n )\n ANY INNER JOIN \n (\n SELECT 2 AS ccc\n ) USING (ccc)\n WHERE ccc > 1\n)\nWHERE ccc > 1 +2 +SELECT \n ts, \n id, \n id_b, \n b.ts, \n b.id, \n id_c\nFROM \n(\n SELECT \n ts, \n id, \n id_b\n FROM A\n WHERE ts <= toDateTime(\'1970-01-01 03:00:00\')\n) AS a\nALL LEFT JOIN B AS b ON b.id = id_b\nWHERE ts <= toDateTime(\'1970-01-01 03:00:00\') +SELECT \n ts AS `--a.ts`, \n id AS `--a.id`, \n id_b AS `--a.id_b`, \n b.ts AS `--b.ts`, \n b.id AS `--b.id`, \n id_c AS `--b.id_c`\nFROM \n(\n SELECT \n ts, \n id, \n id_b\n FROM A\n WHERE ts <= toDateTime(\'1970-01-01 03:00:00\')\n) AS a\nALL LEFT JOIN B 
AS b ON `--b.id` = `--a.id_b`\nWHERE `--a.ts` <= toDateTime(\'1970-01-01 03:00:00\') +2 3 +3 4 +4 5 +5 0 +2 4 +4 0 +2 3 +4 5 +SELECT dummy\nFROM \n(\n SELECT dummy\n FROM system.one\n WHERE arrayMap(x -> (x + 1), [dummy]) = [1]\n)\nWHERE arrayMap(x -> (x + 1), [dummy]) = [1] +0 +SELECT \n id, \n value, \n value_1\nFROM \n(\n SELECT \n 1 AS id, \n 2 AS value\n)\nALL INNER JOIN \n(\n SELECT \n 1 AS id, \n 3 AS value_1\n) USING (id)\nWHERE arrayMap(x -> ((x + value) + value_1), [1]) = [6] +1 2 3 diff --git a/dbms/tests/queries/0_stateless/01056_predicate_optimizer_bugs.sql b/dbms/tests/queries/0_stateless/01056_predicate_optimizer_bugs.sql new file mode 100644 index 00000000000..e1e185be076 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01056_predicate_optimizer_bugs.sql @@ -0,0 +1,75 @@ +SET enable_debug_queries = 1; +SET enable_optimize_predicate_expression = 1; + +-- https://github.com/ClickHouse/ClickHouse/issues/3885 +-- https://github.com/ClickHouse/ClickHouse/issues/5485 +ANALYZE SELECT k, v, d, i FROM (SELECT t.1 AS k, t.2 AS v, runningDifference(v) AS d, runningDifference(cityHash64(t.1)) AS i FROM ( SELECT arrayJoin([('a', 1), ('a', 2), ('a', 3), ('b', 11), ('b', 13), ('b', 15)]) AS t)) WHERE i = 0; +SELECT k, v, d, i FROM (SELECT t.1 AS k, t.2 AS v, runningDifference(v) AS d, runningDifference(cityHash64(t.1)) AS i FROM ( SELECT arrayJoin([('a', 1), ('a', 2), ('a', 3), ('b', 11), ('b', 13), ('b', 15)]) AS t)) WHERE i = 0; + +-- https://github.com/ClickHouse/ClickHouse/issues/5682 +ANALYZE SELECT co,co2,co3,num FROM ( SELECT co,co2,co3,count() AS num FROM ( SELECT 1 AS co,2 AS co2 ,3 AS co3 ) GROUP BY cube (co,co2,co3) ) WHERE co!=0 AND co2 !=2; +SELECT co,co2,co3,num FROM ( SELECT co,co2,co3,count() AS num FROM ( SELECT 1 AS co,2 AS co2 ,3 AS co3 ) GROUP BY cube (co,co2,co3) ) WHERE co!=0 AND co2 !=2; + +-- https://github.com/ClickHouse/ClickHouse/issues/6734 +ANALYZE SELECT alias AS name FROM ( SELECT name AS alias FROM system.settings ) ANY INNER JOIN ( SELECT name FROM system.settings ) USING (name) WHERE name = 'enable_optimize_predicate_expression'; +SELECT alias AS name FROM ( SELECT name AS alias FROM system.settings ) ANY INNER JOIN ( SELECT name FROM system.settings ) USING (name) WHERE name = 'enable_optimize_predicate_expression'; + +-- https://github.com/ClickHouse/ClickHouse/issues/6767 +DROP TABLE IF EXISTS t1; +DROP TABLE IF EXISTS t2; +DROP TABLE IF EXISTS t3; +DROP TABLE IF EXISTS view1; + +CREATE TABLE t1 (id UInt32, value1 String ) ENGINE ReplacingMergeTree() ORDER BY id; +CREATE TABLE t2 (id UInt32, value2 String ) ENGINE ReplacingMergeTree() ORDER BY id; +CREATE TABLE t3 (id UInt32, value3 String ) ENGINE ReplacingMergeTree() ORDER BY id; + +INSERT INTO t1 (id, value1) VALUES (1, 'val11'); +INSERT INTO t2 (id, value2) VALUES (1, 'val21'); +INSERT INTO t3 (id, value3) VALUES (1, 'val31'); + +CREATE VIEW IF NOT EXISTS view1 AS SELECT t1.id AS id, t1.value1 AS value1, t2.value2 AS value2, t3.value3 AS value3 FROM t1 LEFT JOIN t2 ON t1.id = t2.id LEFT JOIN t3 ON t1.id = t3.id WHERE t1.id > 0; +SELECT * FROM view1 WHERE id = 1; + +DROP TABLE IF EXISTS t1; +DROP TABLE IF EXISTS t2; +DROP TABLE IF EXISTS t3; +DROP TABLE IF EXISTS view1; + +-- https://github.com/ClickHouse/ClickHouse/issues/7136 +ANALYZE SELECT ccc FROM ( SELECT 1 AS ccc UNION ALL SELECT * FROM ( SELECT 2 AS ccc ) ANY INNER JOIN ( SELECT 2 AS ccc ) USING (ccc) ) WHERE ccc > 1; +SELECT ccc FROM ( SELECT 1 AS ccc UNION ALL SELECT * FROM ( SELECT 2 AS ccc ) ANY INNER JOIN ( SELECT 2 AS ccc ) USING 
(ccc) ) WHERE ccc > 1; + +-- https://github.com/ClickHouse/ClickHouse/issues/5674 +-- https://github.com/ClickHouse/ClickHouse/issues/4731 +-- https://github.com/ClickHouse/ClickHouse/issues/4904 +DROP TABLE IF EXISTS A; +DROP TABLE IF EXISTS B; + +CREATE TABLE A (ts DateTime, id String, id_b String) ENGINE = MergeTree PARTITION BY toStartOfHour(ts) ORDER BY (ts,id); +CREATE TABLE B (ts DateTime, id String, id_c String) ENGINE = MergeTree PARTITION BY toStartOfHour(ts) ORDER BY (ts,id); + +ANALYZE SELECT ts, id, id_b, b.ts, b.id, id_c FROM (SELECT ts, id, id_b FROM A) AS a ALL LEFT JOIN B AS b ON b.id = a.id_b WHERE a.ts <= toDateTime('1970-01-01 03:00:00'); +ANALYZE SELECT ts AS `--a.ts`, id AS `--a.id`, id_b AS `--a.id_b`, b.ts AS `--b.ts`, b.id AS `--b.id`, id_c AS `--b.id_c` FROM (SELECT ts, id, id_b FROM A) AS a ALL LEFT JOIN B AS b ON `--b.id` = `--a.id_b` WHERE `--a.ts` <= toDateTime('1970-01-01 03:00:00'); + +DROP TABLE IF EXISTS A; +DROP TABLE IF EXISTS B; + +-- https://github.com/ClickHouse/ClickHouse/issues/7802 +DROP TABLE IF EXISTS test; + +CREATE TABLE test ( A Int32, B Int32 ) ENGINE = Memory(); + +INSERT INTO test VALUES(1, 2)(0, 3)(1, 4)(0, 5); + +SELECT B, neighbor(B, 1) AS next_B FROM (SELECT * FROM test ORDER BY B); +SELECT B, neighbor(B, 1) AS next_B FROM (SELECT * FROM test ORDER BY B) WHERE A == 1; +SELECT B, next_B FROM (SELECT A, B, neighbor(B, 1) AS next_B FROM (SELECT * FROM test ORDER BY B)) WHERE A == 1; + +DROP TABLE IF EXISTS test; + +ANALYZE SELECT * FROM (SELECT * FROM system.one) WHERE arrayMap(x -> x + 1, [dummy]) = [1]; +SELECT * FROM (SELECT * FROM system.one) WHERE arrayMap(x -> x + 1, [dummy]) = [1]; + +ANALYZE SELECT * FROM (SELECT 1 AS id, 2 AS value) INNER JOIN (SELECT 1 AS id, 3 AS value_1) USING id WHERE arrayMap(x -> x + value + value_1, [1]) = [6]; +SELECT * FROM (SELECT 1 AS id, 2 AS value) INNER JOIN (SELECT 1 AS id, 3 AS value_1) USING id WHERE arrayMap(x -> x + value + value_1, [1]) = [6]; diff --git a/dbms/tests/queries/0_stateless/01060_defaults_all_columns.reference b/dbms/tests/queries/0_stateless/01060_defaults_all_columns.reference new file mode 100644 index 00000000000..68b4657ca60 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01060_defaults_all_columns.reference @@ -0,0 +1,4 @@ +1 hello +2 test42 +42 test42 +42 world diff --git a/dbms/tests/queries/0_stateless/01060_defaults_all_columns.sql b/dbms/tests/queries/0_stateless/01060_defaults_all_columns.sql new file mode 100644 index 00000000000..afbb01b8cb2 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01060_defaults_all_columns.sql @@ -0,0 +1,10 @@ +DROP TABLE IF EXISTS defaults_all_columns; + +CREATE TABLE defaults_all_columns (n UInt8 DEFAULT 42, s String DEFAULT concat('test', CAST(n, 'String'))) ENGINE = Memory; + +INSERT INTO defaults_all_columns FORMAT JSONEachRow {"n": 1, "s": "hello"} {}; +INSERT INTO defaults_all_columns FORMAT JSONEachRow {"n": 2}, {"s": "world"}; + +SELECT * FROM defaults_all_columns ORDER BY n, s; + +DROP TABLE defaults_all_columns; diff --git a/dbms/tests/queries/0_stateless/01060_substring_negative_size.reference b/dbms/tests/queries/0_stateless/01060_substring_negative_size.reference new file mode 100644 index 00000000000..b25696dc7d6 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01060_substring_negative_size.reference @@ -0,0 +1,27 @@ +bcdef +bcdef +bcdef +bcdef +- +bcdef +bcdef +bcdef +bcdef +- +bcdef +23456 +bcdef +3456 +bcdef +2345 +bcdef +345 +- +bcdef +23456 +bcdef +3456 +bcdef +2345 +bcdef +345 diff --git 
a/dbms/tests/queries/0_stateless/01060_substring_negative_size.sql b/dbms/tests/queries/0_stateless/01060_substring_negative_size.sql new file mode 100644 index 00000000000..23cab14a6e0 --- /dev/null +++ b/dbms/tests/queries/0_stateless/01060_substring_negative_size.sql @@ -0,0 +1,36 @@ +select substring('abcdefgh', 2, -2); +select substring('abcdefgh', materialize(2), -2); +select substring('abcdefgh', 2, materialize(-2)); +select substring('abcdefgh', materialize(2), materialize(-2)); + +select '-'; + +select substring(cast('abcdefgh' as FixedString(8)), 2, -2); +select substring(cast('abcdefgh' as FixedString(8)), materialize(2), -2); +select substring(cast('abcdefgh' as FixedString(8)), 2, materialize(-2)); +select substring(cast('abcdefgh' as FixedString(8)), materialize(2), materialize(-2)); + +select '-'; + +drop table if exists t; +create table t (s String, l Int8, r Int8) engine = Memory; +insert into t values ('abcdefgh', 2, -2), ('12345678', 3, -3); + +select substring(s, 2, -2) from t; +select substring(s, l, -2) from t; +select substring(s, 2, r) from t; +select substring(s, l, r) from t; + +select '-'; + +drop table if exists t; +create table t (s FixedString(8), l Int8, r Int8) engine = Memory; +insert into t values ('abcdefgh', 2, -2), ('12345678', 3, -3); + +select substring(s, 2, -2) from t; +select substring(s, l, -2) from t; +select substring(s, 2, r) from t; +select substring(s, l, r) from t; + +drop table if exists t; + diff --git a/dbms/tests/queries/0_stateless/01061_alter_codec_with_type.reference b/dbms/tests/queries/0_stateless/01061_alter_codec_with_type.reference new file mode 100644 index 00000000000..836a5f20d7a --- /dev/null +++ b/dbms/tests/queries/0_stateless/01061_alter_codec_with_type.reference @@ -0,0 +1,5 @@ +epoch UInt64 CODEC(Delta(8), LZ4) +_time_dec Float64 +epoch UInt64 toUInt64(_time_dec) CODEC(Delta(8), LZ4) +_time_dec Float64 +1577351080 1577351080 diff --git a/dbms/tests/queries/0_stateless/01061_alter_codec_with_type.sql b/dbms/tests/queries/0_stateless/01061_alter_codec_with_type.sql new file mode 100644 index 00000000000..7f662c7463d --- /dev/null +++ b/dbms/tests/queries/0_stateless/01061_alter_codec_with_type.sql @@ -0,0 +1,19 @@ +DROP TABLE IF EXISTS alter_bug; + +create table alter_bug ( + epoch UInt64 CODEC(Delta,LZ4), + _time_dec Float64 +) Engine = MergeTree ORDER BY (epoch); + + +SELECT name, type, compression_codec FROM system.columns WHERE table='alter_bug' AND database=currentDatabase(); + +ALTER TABLE alter_bug MODIFY COLUMN epoch DEFAULT toUInt64(_time_dec) CODEC(Delta,LZ4); + +SELECT name, type, default_expression, compression_codec FROM system.columns WHERE table='alter_bug' AND database=currentDatabase(); + +INSERT INTO alter_bug(_time_dec) VALUES(1577351080); + +SELECT * FROM alter_bug; + +DROP TABLE IF EXISTS alter_bug; diff --git a/dbms/tests/queries/1_stateful/00151_replace_partition_with_different_granularity.sql b/dbms/tests/queries/1_stateful/00151_replace_partition_with_different_granularity.sql index c907f353768..bea90dade3c 100644 --- a/dbms/tests/queries/1_stateful/00151_replace_partition_with_different_granularity.sql +++ b/dbms/tests/queries/1_stateful/00151_replace_partition_with_different_granularity.sql @@ -24,7 +24,7 @@ CREATE TABLE non_mixed_granularity_adaptive_table AS test.hits; INSERT INTO non_mixed_granularity_adaptive_table SELECT * FROM test.hits LIMIT 10; -ALTER TABLE non_mixed_granularity_adaptive_table REPLACE PARTITION 201403 FROM test.hits; -- { serverError 49 } +ALTER TABLE 
non_mixed_granularity_adaptive_table REPLACE PARTITION 201403 FROM test.hits; -- { serverError 36 } DROP TABLE IF EXISTS non_mixed_granularity_adaptive_table; @@ -35,7 +35,7 @@ CREATE TABLE non_mixed_granularity_non_adaptive_table (`WatchID` UInt64, `JavaEn INSERT INTO non_mixed_granularity_non_adaptive_table SELECT * FROM test.hits LIMIT 10; -- after optimize mixed_granularity_table will have .mrk2 parts -ALTER TABLE non_mixed_granularity_non_adaptive_table REPLACE PARTITION 201403 FROM mixed_granularity_table; -- { serverError 49 } +ALTER TABLE non_mixed_granularity_non_adaptive_table REPLACE PARTITION 201403 FROM mixed_granularity_table; -- { serverError 36 } DROP TABLE IF EXISTS non_mixed_granularity_non_adaptive_table; @@ -46,7 +46,7 @@ CREATE TABLE mixed_granularity_strictly_non_adaptive_table (`WatchID` UInt64, `J INSERT INTO mixed_granularity_strictly_non_adaptive_table SELECT * FROM test.hits LIMIT 10; -ALTER TABLE mixed_granularity_strictly_non_adaptive_table REPLACE PARTITION 201403 FROM mixed_granularity_table; -- { serverError 49 } +ALTER TABLE mixed_granularity_strictly_non_adaptive_table REPLACE PARTITION 201403 FROM mixed_granularity_table; -- { serverError 36 } DROP TABLE IF EXISTS mixed_granularity_table; diff --git a/dbms/tests/queries/bugs/01060_defaults_all_columns.reference b/dbms/tests/queries/bugs/01060_defaults_all_columns.reference new file mode 100644 index 00000000000..7b1fdfb6817 --- /dev/null +++ b/dbms/tests/queries/bugs/01060_defaults_all_columns.reference @@ -0,0 +1,4 @@ +1 hello +2 test2 +42 test42 +42 world diff --git a/debian/clickhouse-server.init b/debian/clickhouse-server.init index 32282756719..8a50298ecd2 100755 --- a/debian/clickhouse-server.init +++ b/debian/clickhouse-server.init @@ -3,8 +3,8 @@ # Provides: clickhouse-server # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 -# Required-Start: -# Required-Stop: +# Required-Start: $network +# Required-Stop: $network # Short-Description: Yandex clickhouse-server daemon ### END INIT INFO @@ -20,7 +20,16 @@ CLICKHOUSE_LOGDIR=/var/log/clickhouse-server CLICKHOUSE_LOGDIR_USER=root CLICKHOUSE_DATADIR_OLD=/opt/clickhouse CLICKHOUSE_DATADIR=/var/lib/clickhouse -LOCALSTATEDIR=/var/lock +if [ -d "/var/lock" ]; then + LOCALSTATEDIR=/var/lock +else + LOCALSTATEDIR=/run/lock +fi + +if [ ! -d "$LOCALSTATEDIR" ]; then + mkdir -p "$LOCALSTATEDIR" +fi + CLICKHOUSE_BINDIR=/usr/bin CLICKHOUSE_CRONFILE=/etc/cron.d/clickhouse-server CLICKHOUSE_CONFIG=$CLICKHOUSE_CONFDIR/config.xml diff --git a/debian/clickhouse-server.postinst b/debian/clickhouse-server.postinst index 4a1f4d9d387..81a9582c063 100644 --- a/debian/clickhouse-server.postinst +++ b/debian/clickhouse-server.postinst @@ -13,19 +13,14 @@ CLICKHOUSE_GENERIC_PROGRAM=${CLICKHOUSE_GENERIC_PROGRAM:=clickhouse} EXTRACT_FROM_CONFIG=${CLICKHOUSE_GENERIC_PROGRAM}-extract-from-config CLICKHOUSE_CONFIG=$CLICKHOUSE_CONFDIR/config.xml -OS=${OS=`lsb_release -is 2>/dev/null ||:`} -[ -z "$OS" ] && [ -f /etc/os-release ] && . /etc/os-release && OS=$ID -[ -z "$OS" ] && [ -f /etc/centos-release ] && OS=centos -[ -z "$OS" ] && OS=`uname -s ||:` - [ -f /usr/share/debconf/confmodule ] && . /usr/share/debconf/confmodule [ -f /etc/default/clickhouse ] && . /etc/default/clickhouse -if [ "$OS" = "rhel" ] || [ "$OS" = "centos" ] || [ "$OS" = "fedora" ] || [ "$OS" = "CentOS" ] || [ "$OS" = "Fedora" ] || [ "$OS" = "ol" ]; then - is_rh=1 +if [ ! 
-f "/etc/debian_version" ]; then + not_deb_os=1 fi -if [ "$1" = configure ] || [ -n "$is_rh" ]; then +if [ "$1" = configure ] || [ -n "$not_deb_os" ]; then if [ -x "/bin/systemctl" ] && [ -f /etc/systemd/system/clickhouse-server.service ] && [ -d /run/systemd/system ]; then # if old rc.d service present - remove it if [ -x "/etc/init.d/clickhouse-server" ] && [ -x "/usr/sbin/update-rc.d" ]; then @@ -48,9 +43,8 @@ if [ "$1" = configure ] || [ -n "$is_rh" ]; then # Make sure the administrative user exists if ! getent passwd ${CLICKHOUSE_USER} > /dev/null; then - if [ -n "$is_rh" ]; then - adduser --system --no-create-home --home /nonexistent \ - --shell /bin/false ${CLICKHOUSE_USER} > /dev/null + if [ -n "$not_deb_os" ]; then + useradd -r -s /bin/false --home-dir /nonexistent ${CLICKHOUSE_USER} > /dev/null else adduser --system --disabled-login --no-create-home --home /nonexistent \ --shell /bin/false --group --gecos "ClickHouse server" ${CLICKHOUSE_USER} > /dev/null @@ -59,12 +53,12 @@ if [ "$1" = configure ] || [ -n "$is_rh" ]; then # if the user was created manually, make sure the group is there as well if ! getent group ${CLICKHOUSE_GROUP} > /dev/null; then - addgroup --system ${CLICKHOUSE_GROUP} > /dev/null + groupadd -r ${CLICKHOUSE_GROUP} > /dev/null fi # make sure user is in the correct group if ! id -Gn ${CLICKHOUSE_USER} | grep -qw ${CLICKHOUSE_USER}; then - adduser ${CLICKHOUSE_USER} ${CLICKHOUSE_GROUP} > /dev/null + usermod -a -G ${CLICKHOUSE_GROUP} ${CLICKHOUSE_USER} > /dev/null fi # check validity of user and group @@ -81,6 +75,9 @@ Please fix this and reinstall this package." >&2 fi if [ -x "$CLICKHOUSE_BINDIR/$EXTRACT_FROM_CONFIG" ] && [ -f "$CLICKHOUSE_CONFIG" ]; then + if [ -z "$SHELL" ]; then + SHELL="/bin/sh" + fi CLICKHOUSE_DATADIR_FROM_CONFIG=$(su -s $SHELL ${CLICKHOUSE_USER} -c "$CLICKHOUSE_BINDIR/$EXTRACT_FROM_CONFIG --config-file=\"$CLICKHOUSE_CONFIG\" --key=path") ||: echo "Path to data directory in ${CLICKHOUSE_CONFIG}: ${CLICKHOUSE_DATADIR_FROM_CONFIG}" fi diff --git a/debian/clickhouse-server.service b/debian/clickhouse-server.service index 4543b304197..b9681f9279e 100644 --- a/debian/clickhouse-server.service +++ b/debian/clickhouse-server.service @@ -1,5 +1,7 @@ [Unit] Description=ClickHouse Server (analytic DBMS for big data) +Requires=network-online.target +After=network-online.target [Service] Type=simple diff --git a/docker/test/performance-comparison/Dockerfile b/docker/test/performance-comparison/Dockerfile index 1e08ec0f521..6c67e724477 100644 --- a/docker/test/performance-comparison/Dockerfile +++ b/docker/test/performance-comparison/Dockerfile @@ -3,15 +3,16 @@ FROM ubuntu:18.04 RUN apt-get update \ && apt-get install --yes --no-install-recommends \ - p7zip-full bash ncdu wget python3 python3-pip python3-dev g++ \ + p7zip-full bash git ncdu wget psmisc python3 python3-pip python3-dev g++ \ && pip3 --no-cache-dir install clickhouse_driver \ && apt-get purge --yes python3-dev g++ \ && apt-get autoremove --yes \ - && apt-get clean + && apt-get clean \ + && rm -rf /var/lib/apt/lists/* COPY * / CMD /entrypoint.sh -# docker run --network=host --volume :/workspace --volume=:/output -e LEFT_PR=<> -e LEFT_SHA=<> -e RIGHT_PR=<> -e RIGHT_SHA=<> yandex/clickhouse-performance-comparison +# docker run --network=host --volume :/workspace --volume=:/output -e PR_TO_TEST=<> -e SHA_TO_TEST=<> yandex/clickhouse-performance-comparison diff --git a/docker/test/performance-comparison/entrypoint.sh b/docker/test/performance-comparison/entrypoint.sh index 
7ef5a9553a0..823832f2881 100755 --- a/docker/test/performance-comparison/entrypoint.sh +++ b/docker/test/performance-comparison/entrypoint.sh @@ -1,8 +1,19 @@ #!/bin/bash +set -ex cd /workspace -../compare.sh $LEFT_PR $LEFT_SHA $RIGHT_PR $RIGHT_SHA > compare.log 2>&1 +# We will compare to the most recent testing tag in master branch, let's find it. +rm -rf ch ||: +git clone --branch master --single-branch --depth 50 --bare https://github.com/ClickHouse/ClickHouse ch +ref_tag=$(cd ch && git describe --match='v*-testing' --abbrev=0 --first-parent master) +echo Reference tag is $ref_tag +# We use annotated tags which have their own shas, so we have to further +# dereference the tag to get the commit it points to, hence the '~0' thing. +ref_sha=$(cd ch && git rev-parse $ref_tag~0) +echo Reference SHA is $ref_sha + +../compare.sh 0 $ref_sha $PR_TO_TEST $SHA_TO_TEST > compare.log 2>&1 7z a /output/output.7z *.log *.tsv cp compare.log /output
diff --git a/docker/test/performance-comparison/perf.py b/docker/test/performance-comparison/perf.py index 5517a71cc44..8d2fe1bc476 100755 --- a/docker/test/performance-comparison/perf.py +++ b/docker/test/performance-comparison/perf.py @@ -7,7 +7,8 @@ import argparse import pprint parser = argparse.ArgumentParser(description='Run performance test.') -parser.add_argument('file', metavar='FILE', type=argparse.FileType('r'), nargs=1, help='test description file') +# Explicitly decode files as UTF-8 because sometimes we have Russian characters in queries, and LANG=C is set. +parser.add_argument('file', metavar='FILE', type=argparse.FileType('r', encoding='utf-8'), nargs=1, help='test description file') args = parser.parse_args() tree = et.parse(args.file[0])
diff --git a/docs/en/getting_started/install.md b/docs/en/getting_started/install.md index e47500fa22f..f1ec980fa70 100644 --- a/docs/en/getting_started/install.md +++ b/docs/en/getting_started/install.md @@ -59,6 +59,35 @@ sudo yum install clickhouse-server clickhouse-client You can also download and install packages manually from here: . +### From tgz archives {#from-tgz-archives} + +It is recommended to use official pre-compiled `tgz` archives for all Linux distributions where installation of `deb` or `rpm` packages is not possible. + +The required version can be downloaded with `curl` or `wget` from the repository https://repo.yandex.ru/clickhouse/tgz/. +After that, the downloaded archives should be unpacked and installed with the installation scripts.
Example for the latest version: +```bash +export LATEST_VERSION=`curl https://api.github.com/repos/ClickHouse/ClickHouse/tags 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1` +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-common-static-$LATEST_VERSION.tgz +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-server-$LATEST_VERSION.tgz +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-client-$LATEST_VERSION.tgz + +tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz +sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh + +tar -xzvf clickhouse-common-static-dbg-$LATEST_VERSION.tgz +sudo clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh + +tar -xzvf clickhouse-server-$LATEST_VERSION.tgz +sudo clickhouse-server-$LATEST_VERSION/install/doinst.sh +sudo /etc/init.d/clickhouse-server start + +tar -xzvf clickhouse-client-$LATEST_VERSION.tgz +sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh +``` + +For production environments it is recommended to use the latest `stable` version. You can find its number on the GitHub page https://github.com/ClickHouse/ClickHouse/tags with the postfix `-stable`. + ### From Docker Image To run ClickHouse inside Docker follow the guide on [Docker Hub](https://hub.docker.com/r/yandex/clickhouse-server/). Those images use official `deb` packages inside.
diff --git a/docs/en/getting_started/tutorial.md b/docs/en/getting_started/tutorial.md index acdd9074beb..bffee808122 100644 --- a/docs/en/getting_started/tutorial.md +++ b/docs/en/getting_started/tutorial.md @@ -444,7 +444,7 @@ SAMPLE BY intHash32(UserID) SETTINGS index_granularity = 8192 ``` -You can execute those queries using interactive mode of `clickhouse-client` (just launch it in terminal without specifying a query in advance) or try some [alternative interface](../interfaces/index.md) if you ant. +You can execute those queries using interactive mode of `clickhouse-client` (just launch it in terminal without specifying a query in advance) or try some [alternative interface](../interfaces/index.md) if you want. As we can see, `hits_v1` uses the [basic MergeTree engine](../operations/table_engines/mergetree.md), while the `visits_v1` uses the [Collapsing](../operations/table_engines/collapsingmergetree.md) variant.
diff --git a/docs/en/guides/apply_catboost_model.md b/docs/en/guides/apply_catboost_model.md index 4665809bfa0..06863bb48f9 100644 --- a/docs/en/guides/apply_catboost_model.md +++ b/docs/en/guides/apply_catboost_model.md @@ -74,7 +74,7 @@ $ clickhouse client ROLE_FAMILY UInt32, ROLE_CODE UInt32 ) -ENGINE = MergeTree() +ENGINE = MergeTree ORDER BY date ``` **3.** Exit from ClickHouse console client: @@ -227,4 +227,4 @@ FROM ``` !!! note "Note" - More info about [avg()](../query_language/agg_functions/reference.md#agg_function-avg) and [log()](../query_language/functions/math_functions.md) functions. \ No newline at end of file + More info about [avg()](../query_language/agg_functions/reference.md#agg_function-avg) and [log()](../query_language/functions/math_functions.md) functions.
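A side note on the `ENGINE = MergeTree()` change just above: when MergeTree is declared without the legacy parameter list, an explicit sorting key via `ORDER BY` is required, so the bare `MergeTree()` form is rejected. A minimal sketch of the corrected DDL form (the table and column names below are illustrative, not taken from the guide):

```sql
-- Modern MergeTree DDL: a sorting key is mandatory;
-- ORDER BY tuple() can be used when no meaningful key exists.
CREATE TABLE example_events
(
    date Date,
    user_id UInt64,
    value Float64
)
ENGINE = MergeTree
ORDER BY date;
```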
diff --git a/docs/en/operations/utils/clickhouse-copier.md b/docs/en/operations/utils/clickhouse-copier.md index 08388aab7db..2a75667cb50 100644 --- a/docs/en/operations/utils/clickhouse-copier.md +++ b/docs/en/operations/utils/clickhouse-copier.md @@ -144,7 +144,7 @@ Parameters: NOTE: In spite of this section is optional (if it is not specified, all partitions will be copied), it is strictly recommended to specify them explicitly. - If you already have some ready paritions on destination cluster they + If you already have some ready partitions on destination cluster they will be removed at the start of the copying since they will be interpeted as unfinished data from the previous copying!!! -->
diff --git a/docs/en/query_language/dicts/external_dicts_dict_sources.md b/docs/en/query_language/dicts/external_dicts_dict_sources.md index 7b8303eb700..ecb1a94e002 100644 --- a/docs/en/query_language/dicts/external_dicts_dict_sources.md +++ b/docs/en/query_language/dicts/external_dicts_dict_sources.md @@ -137,9 +137,9 @@ Setting fields: - `url` – The source URL. - `format` – The file format. All the formats described in "[Formats](../../interfaces/formats.md#formats)" are supported. -- `credentials` – Basic HTTP authentification. Optional parameter. - - `user` – Username required for the authentification. - - `password` – Password required for the authentification. +- `credentials` – Basic HTTP authentication. Optional parameter. + - `user` – Username required for the authentication. + - `password` – Password required for the authentication. - `headers` – All custom HTTP headers entries used for the HTTP request. Optional parameter. - `header` – Single HTTP header entry. - `name` – Identifiant name used for the header send on the request.
diff --git a/docs/en/query_language/functions/conditional_functions.md b/docs/en/query_language/functions/conditional_functions.md index 074df25303f..8b46566af31 100644 --- a/docs/en/query_language/functions/conditional_functions.md +++ b/docs/en/query_language/functions/conditional_functions.md @@ -1,19 +1,66 @@ # Conditional functions -## if(cond, then, else), cond ? operator then : else +## `if` function -Returns `then` if `cond != 0`, or `else` if `cond = 0`. -`cond` must be of type `UInt8`, and `then` and `else` must have the lowest common type. +Syntax: `if(cond, then, else)` -`then` and `else` can be `NULL` +Returns `then` if `cond` is truthy (greater than zero), otherwise returns `else`. + +* `cond` must be of type `UInt8`, and `then` and `else` must have the lowest common type. + +* `then` and `else` can be `NULL` + +**Example:** + +Take this `LEFT_RIGHT` table: + +```sql +SELECT * +FROM LEFT_RIGHT + +┌─left─┬─right─┐ +│ ᴺᵁᴸᴸ │ 4 │ +│ 1 │ 3 │ +│ 2 │ 2 │ +│ 3 │ 1 │ +│ 4 │ ᴺᵁᴸᴸ │ +└──────┴───────┘ +``` +The following query compares `left` and `right` values: + +```sql +SELECT + left, + right, + if(left < right, 'left is smaller than right', 'right is greater or equal than left') AS is_smaller +FROM LEFT_RIGHT +WHERE isNotNull(left) AND isNotNull(right) + +┌─left─┬─right─┬─is_smaller──────────────────────────┐ +│ 1 │ 3 │ left is smaller than right │ +│ 2 │ 2 │ right is greater or equal than left │ +│ 3 │ 1 │ right is greater or equal than left │ +└──────┴───────┴─────────────────────────────────────┘ +``` +Note: `NULL` values are not used in this example; see the [NULL values in conditionals](#null-values-in-conditionals) section. + +## Ternary operator + +It works the same as the `if` function. + +Syntax: `cond ? then : else` + +Returns `then` if `cond` is truthy (greater than zero), otherwise returns `else`. + +* `cond` must be of type `UInt8`, and `then` and `else` must have the lowest common type. + +* `then` and `else` can be `NULL` ## multiIf Allows you to write the [CASE](../operators.md#operator_case) operator more compactly in the query. -```sql -multiIf(cond_1, then_1, cond_2, then_2...else) -``` +Syntax: `multiIf(cond_1, then_1, cond_2, then_2, ..., else)` **Parameters:** @@ -29,22 +76,76 @@ The function returns one of the values `then_N` or `else`, depending on the cond **Example** -Take the table -```text -┌─x─┬────y─┐ -│ 1 │ ᴺᵁᴸᴸ │ -│ 2 │ 3 │ -└───┴──────┘ +Again using the `LEFT_RIGHT` table: +```sql +SELECT + left, + right, + multiIf(left < right, 'left is smaller', left > right, 'left is greater', left = right, 'Both equal', 'Null value') AS result +FROM LEFT_RIGHT + +┌─left─┬─right─┬─result──────────┐ +│ ᴺᵁᴸᴸ │ 4 │ Null value │ +│ 1 │ 3 │ left is smaller │ +│ 2 │ 2 │ Both equal │ +│ 3 │ 1 │ left is greater │ +│ 4 │ ᴺᵁᴸᴸ │ Null value │ +└──────┴───────┴─────────────────┘ +``` +## Using conditional results directly + +Conditionals always evaluate to `0`, `1` or `NULL`, so you can use conditional results directly like this: + +```sql +SELECT left < right AS is_small +FROM LEFT_RIGHT + +┌─is_small─┐ +│ ᴺᵁᴸᴸ │ +│ 1 │ +│ 0 │ +│ 0 │ +│ ᴺᵁᴸᴸ │ +└──────────┘ ``` -Run the query `SELECT multiIf(isNull(y) x, y < 3, y, NULL) FROM t_null`. Result: -```text -┌─multiIf(isNull(y), x, less(y, 3), y, NULL)─┐ -│ 1 │ -│ ᴺᵁᴸᴸ │ -└────────────────────────────────────────────┘ +## NULL values in conditionals + +When `NULL` values are involved in conditionals, the result will also be `NULL`. + +```sql +SELECT + NULL < 1, + 2 < NULL, + NULL < NULL, + NULL = NULL + +┌─less(NULL, 1)─┬─less(2, NULL)─┬─less(NULL, NULL)─┬─equals(NULL, NULL)─┐ +│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ +└───────────────┴───────────────┴──────────────────┴────────────────────┘ ``` +So you should construct your queries carefully if the types are `Nullable`. + +The following example demonstrates this by failing to add an equality condition to `multiIf`. + +```sql +SELECT + left, + right, + multiIf(left < right, 'left is smaller', left > right, 'right is smaller', 'Both equal') AS faulty_result +FROM LEFT_RIGHT + +┌─left─┬─right─┬─faulty_result────┐ +│ ᴺᵁᴸᴸ │ 4 │ Both equal │ +│ 1 │ 3 │ left is smaller │ +│ 2 │ 2 │ Both equal │ +│ 3 │ 1 │ right is smaller │ +│ 4 │ ᴺᵁᴸᴸ │ Both equal │ +└──────┴───────┴──────────────────┘ +``` + + [Original article](https://clickhouse.yandex/docs/en/query_language/functions/conditional_functions/)
diff --git a/docs/en/security_changelog.md b/docs/en/security_changelog.md index 0847300cc19..92b35868f94 100644 --- a/docs/en/security_changelog.md +++ b/docs/en/security_changelog.md @@ -1,6 +1,27 @@ +## Fixed in ClickHouse Release 19.14.3.3, 2019-09-10 + +### CVE-2019-15024 + +An attacker with write access to ZooKeeper who is able to run a custom server available from the network where ClickHouse runs can create a custom-built malicious server that will act as a ClickHouse replica and register it in ZooKeeper. When another replica fetches a data part from the malicious replica, it can force clickhouse-server to write to an arbitrary path on the filesystem. + +Credits: Eldar Zaitov of Yandex Information Security Team + +### CVE-2019-16535 + +An OOB read, OOB write and integer underflow in decompression algorithms can be used to achieve RCE or DoS via the native protocol.
+ +Credits: Eldar Zaitov of Yandex Information Security Team + +### CVE-2019-16536 + +Stack overflow leading to DoS can be triggered by a malicious authenticated client. + +Credits: Eldar Zaitov of Yandex Information Security Team + ## Fixed in ClickHouse Release 19.13.6.1, 2019-09-20 ### CVE-2019-18657 + Table function `url` had the vulnerability allowed the attacker to inject arbitrary HTTP headers in the request. Credits: [Nikita Tikhomirov](https://github.com/NSTikhomirov) @@ -24,6 +45,7 @@ Credits: Andrey Krasichkov and Evgeny Sidorov of Yandex Information Security Tea ## Fixed in ClickHouse Release 1.1.54388, 2018-06-28 ### CVE-2018-14668 + "remote" table function allowed arbitrary symbols in "user", "password" and "default_database" fields which led to Cross Protocol Request Forgery Attacks. Credits: Andrey Krasichkov of Yandex Information Security Team @@ -31,6 +53,7 @@ Credits: Andrey Krasichkov of Yandex Information Security Team ## Fixed in ClickHouse Release 1.1.54390, 2018-07-06 ### CVE-2018-14669 + ClickHouse MySQL client had "LOAD DATA LOCAL INFILE" functionality enabled that allowed a malicious MySQL database read arbitrary files from the connected ClickHouse server. Credits: Andrey Krasichkov and Evgeny Sidorov of Yandex Information Security Team
diff --git a/docs/ru/getting_started/install.md b/docs/ru/getting_started/install.md index 29ccd2b14f4..cd1a04b6192 100644 --- a/docs/ru/getting_started/install.md +++ b/docs/ru/getting_started/install.md @@ -50,7 +50,6 @@ sudo yum-config-manager --add-repo https://repo.yandex.ru/clickhouse/rpm/stable/ Для использования наиболее свежих версий нужно заменить `stable` на `testing` (рекомендуется для тестовых окружений). -Then run these commands to actually install packages: Для, собственно, установки пакетов необходимо выполнить следующие команды: ```bash @@ -59,6 +58,35 @@ sudo yum install clickhouse-server clickhouse-client Также есть возможность установить пакеты вручную, скачав отсюда: . +### Из tgz архивов {#from-tgz-archives} + +Команда ClickHouse в Яндексе рекомендует использовать предкомпилированные бинарники из `tgz` архивов для всех дистрибутивов, где невозможна установка `deb` и `rpm` пакетов. + +Интересующую версию архивов можно скачать вручную с помощью `curl` или `wget` из репозитория https://repo.yandex.ru/clickhouse/tgz/. +После этого архивы нужно распаковать и воспользоваться скриптами установки. Пример установки самой свежей версии: +```bash +export LATEST_VERSION=`curl https://api.github.com/repos/ClickHouse/ClickHouse/tags 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1` +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-common-static-$LATEST_VERSION.tgz +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-server-$LATEST_VERSION.tgz +curl -O https://repo.yandex.ru/clickhouse/tgz/clickhouse-client-$LATEST_VERSION.tgz + +tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz +sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh + +tar -xzvf clickhouse-common-static-dbg-$LATEST_VERSION.tgz +sudo clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh + +tar -xzvf clickhouse-server-$LATEST_VERSION.tgz +sudo clickhouse-server-$LATEST_VERSION/install/doinst.sh +sudo /etc/init.d/clickhouse-server start + +tar -xzvf clickhouse-client-$LATEST_VERSION.tgz +sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh +``` + +Для production окружений рекомендуется использовать последнюю `stable`-версию. Её номер также можно найти на github на вкладке https://github.com/ClickHouse/ClickHouse/tags с постфиксом `-stable`. + ### Из Docker образа {#from-docker-image} Для запуска ClickHouse в Docker нужно следовать инструкции на [Docker Hub](https://hub.docker.com/r/yandex/clickhouse-server/). Внутри образов используются официальные `deb` пакеты.
diff --git a/docs/ru/guides/apply_catboost_model.md b/docs/ru/guides/apply_catboost_model.md index 9f93aacbd22..69aa0faccb2 100644 --- a/docs/ru/guides/apply_catboost_model.md +++ b/docs/ru/guides/apply_catboost_model.md @@ -74,7 +74,7 @@ $ clickhouse client ROLE_FAMILY UInt32, ROLE_CODE UInt32 ) -ENGINE = MergeTree() +ENGINE = MergeTree ORDER BY date ``` **3.** Выйдите из клиента ClickHouse: @@ -227,4 +227,4 @@ FROM ``` !!! note "Примечание" - Подробнее про функции [avg()](../query_language/agg_functions/reference.md#agg_function-avg), [log()](../query_language/functions/math_functions.md). \ No newline at end of file + Подробнее про функции [avg()](../query_language/agg_functions/reference.md#agg_function-avg), [log()](../query_language/functions/math_functions.md).
diff --git a/docs/ru/interfaces/http.md b/docs/ru/interfaces/http.md index 4da101796f1..4d3b831cd5c 100644 --- a/docs/ru/interfaces/http.md +++ b/docs/ru/interfaces/http.md @@ -175,7 +175,7 @@ $ echo 'SELECT number FROM numbers LIMIT 10' | curl 'http://localhost:8123/?data Имя пользователя и пароль могут быть указаны в одном из двух вариантов: -1. С использованием HTTP Basic Authentification. Пример: +1. С использованием HTTP Basic Authentication. Пример: ```bash $ echo 'SELECT 1' | curl 'http://user:password@localhost:8123/' -d @-
diff --git a/docs/ru/operations/utils/clickhouse-copier.md b/docs/ru/operations/utils/clickhouse-copier.md index 9eb5a151a4a..d36de755c59 100644 --- a/docs/ru/operations/utils/clickhouse-copier.md +++ b/docs/ru/operations/utils/clickhouse-copier.md @@ -143,7 +143,7 @@ $ clickhouse-copier copier --daemon --config zookeeper.xml --task-path /task/pat NOTE: In spite of this section is optional (if it is not specified, all partitions will be copied), it is strictly recommended to specify them explicitly. - If you already have some ready paritions on destination cluster they + If you already have some ready partitions on destination cluster they will be removed at the start of the copying since they will be interpeted as unfinished data from the previous copying!!! -->
diff --git a/docs/ru/security_changelog.md b/docs/ru/security_changelog.md index 17ae1eba19d..db742b5f990 100644 --- a/docs/ru/security_changelog.md +++ b/docs/ru/security_changelog.md @@ -1,3 +1,23 @@ +## Исправлено в релизе 19.14.3.3, 2019-09-10 + +### CVE-2019-15024 + +Злоумышленник с доступом на запись к ZooKeeper и возможностью запустить собственный сервер в сети, доступной ClickHouse, может создать вредоносный сервер, который будет вести себя как реплика ClickHouse и зарегистрируется в ZooKeeper. В процессе репликации вредоносный сервер может указать любой путь на файловой системе, в который будут записаны данные. + +Обнаружено благодаря: Эльдару Заитову из Службы Информационной Безопасности Яндекса + +### CVE-2019-16535 + +Интерфейс декомпрессии позволял совершать OOB чтения и записи данных в памяти, а также переполнение целочисленных переменных, что могло приводить к отказу в обслуживании. Также потенциально могло использоваться для удаленного выполнения кода.
+ +Обнаружено благодаря: Эльдару Заитову из Службы Информационной Безопасности Яндекса + +### CVE-2019-16536 + +Аутентифицированный клиент злоумышленника имел возможность вызвать переполнение стека, что могло привести к отказу в обслуживании. + +Обнаружено благодаря: Эльдару Заитову из Службы Информационной Безопасности Яндекса + ## Исправлено в релизе 19.13.6.1 от 20 сентября 2019 ### CVE-2019-18657 @@ -19,7 +39,7 @@ unixODBC позволял указать путь для подключения Обнаружено благодаря: Андрею Красичкову и Евгению Сидорову из Службы Информационной Безопасности Яндекса -## Исправлено в релизе 1.1.54388 от 28 июня 2018 +## Исправлено в релизе 1.1.54388 от 28 июня 2018 ### CVE-2018-14668 Табличная функция "remote" допускала произвольные символы в полях "user", "password" и "default_database", что позволяло производить атаки класса Cross Protocol Request Forgery.
diff --git a/docs/tools/mkdocs-material-theme/assets/javascripts/application.js b/docs/tools/mkdocs-material-theme/assets/javascripts/application.js index 12b85194724..b4aa3032430 100644 --- a/docs/tools/mkdocs-material-theme/assets/javascripts/application.js +++ b/docs/tools/mkdocs-material-theme/assets/javascripts/application.js @@ -1470,7 +1470,7 @@ function defaultClearTimeout () { } ()) function runTimeout(fun) { if (cachedSetTimeout === setTimeout) { - //normal enviroments in sane situations + //normal environments in sane situations return setTimeout(fun, 0); } // if setTimeout wasn't available but was latter defined @@ -1495,7 +1495,7 @@ function runTimeout(fun) { } function runClearTimeout(marker) { if (cachedClearTimeout === clearTimeout) { - //normal enviroments in sane situations + //normal environments in sane situations return clearTimeout(marker); } // if clearTimeout wasn't available but was latter defined @@ -8028,7 +8028,7 @@ lunr.QueryParser.parseBoost = function (parser) { } else if (typeof exports === 'object') { /** * Node. Does not work with strict CommonJS, but - * only CommonJS-like enviroments that support module.exports, + * only CommonJS-like environments that support module.exports, * like Node. */ module.exports = factory()
diff --git a/docs/tools/mkdocs-material-theme/assets/javascripts/lunr/lunr.js b/docs/tools/mkdocs-material-theme/assets/javascripts/lunr/lunr.js index b157aaa31a2..8fdf0630c6a 100644 --- a/docs/tools/mkdocs-material-theme/assets/javascripts/lunr/lunr.js +++ b/docs/tools/mkdocs-material-theme/assets/javascripts/lunr/lunr.js @@ -2968,7 +2968,7 @@ lunr.QueryParser.parseBoost = function (parser) { } else if (typeof exports === 'object') { /** * Node. Does not work with strict CommonJS, but - * only CommonJS-like enviroments that support module.exports, + * only CommonJS-like environments that support module.exports, * like Node. */ module.exports = factory()
diff --git a/docs/zh/operations/utils/clickhouse-copier.md b/docs/zh/operations/utils/clickhouse-copier.md index fac374b4790..cf5ef7cb7a5 100644 --- a/docs/zh/operations/utils/clickhouse-copier.md +++ b/docs/zh/operations/utils/clickhouse-copier.md @@ -142,7 +142,7 @@ Parameters: NOTE: In spite of this section is optional (if it is not specified, all partitions will be copied), it is strictly recommended to specify them explicitly. - If you already have some ready paritions on destination cluster they + If you already have some ready partitions on destination cluster they will be removed at the start of the copying since they will be interpeted as unfinished data from the previous copying!!! -->
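The clickhouse-copier note corrected above is worth acting on before a run: partitions that already exist on the destination cluster are removed when copying starts. One way to check what the destination already holds is to query `system.parts`; a hedged sketch, with `target_db` and `hits` as hypothetical names:

```sql
-- List non-empty partitions of a destination table; these would be
-- dropped by clickhouse-copier if the same partitions are copied over.
SELECT
    partition,
    sum(rows) AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active AND database = 'target_db' AND table = 'hits'
GROUP BY partition
ORDER BY partition;
```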
diff --git a/libs/libdaemon/include/daemon/BaseDaemon.h b/libs/libdaemon/include/daemon/BaseDaemon.h index 462cbb95418..56f7dc5f06b 100644 --- a/libs/libdaemon/include/daemon/BaseDaemon.h +++ b/libs/libdaemon/include/daemon/BaseDaemon.h @@ -168,22 +168,11 @@ protected: { std::string file; - /// Создать объект, не создавая PID файл - PID() {} - - /// Создать объект, создать PID файл - PID(const std::string & file_) { seed(file_); } - - /// Создать PID файл - void seed(const std::string & file_); - - /// Удалить PID файл - void clear(); - - ~PID() { clear(); } + PID(const std::string & file_); + ~PID(); }; - PID pid; + std::optional<PID> pid; std::atomic_bool is_cancelled{false};
diff --git a/libs/libdaemon/src/BaseDaemon.cpp b/libs/libdaemon/src/BaseDaemon.cpp index e93e2ab2f8b..9d9ea6a0a15 100644 --- a/libs/libdaemon/src/BaseDaemon.cpp +++ b/libs/libdaemon/src/BaseDaemon.cpp @@ -18,6 +18,7 @@ #include #include #include +#include #include #include @@ -498,7 +499,7 @@ void BaseDaemon::terminate() void BaseDaemon::kill() { dumpCoverageReportIfPossible(); - pid.clear(); + pid.reset(); if (::raise(SIGKILL) != 0) throw Poco::SystemException("cannot kill process"); } @@ -668,7 +669,7 @@ void BaseDaemon::initialize(Application & self) /// Create pid file. if (config().has("pid")) - pid.seed(config().getString("pid")); + pid.emplace(config().getString("pid")); /// Change path for logging. if (!log_path.empty()) @@ -835,7 +836,7 @@ bool isPidRunning(pid_t pid) return 0; } -void BaseDaemon::PID::seed(const std::string & file_) +BaseDaemon::PID::PID(const std::string & file_) { file = Poco::Path(file_).absolute().toString(); Poco::File poco_file(file); @@ -862,34 +863,28 @@ if (-1 == fd) { - file.clear(); if (EEXIST == errno) throw Poco::Exception("Pid file exists, should not start daemon."); throw Poco::CreateFileException("Cannot create pid file."); } + SCOPE_EXIT({ close(fd); }); + + std::stringstream s; + s << getpid(); + if (static_cast<ssize_t>(s.str().size()) != write(fd, s.str().c_str(), s.str().size())) + throw Poco::Exception("Cannot write to pid file."); +} + +BaseDaemon::PID::~PID() +{ try { - std::stringstream s; - s << getpid(); - if (static_cast<ssize_t>(s.str().size()) != write(fd, s.str().c_str(), s.str().size())) - throw Poco::Exception("Cannot write to pid file."); + Poco::File(file).remove(); } catch (...)
{ - close(fd); - throw; - } - - close(fd); -} - -void BaseDaemon::PID::clear() -{ - if (!file.empty()) - { - Poco::File(file).remove(); - file.clear(); + DB::tryLogCurrentException(__PRETTY_FUNCTION__); } } diff --git a/utils/compressor/CMakeLists.txt b/utils/compressor/CMakeLists.txt index c032054187b..3498640acd1 100644 --- a/utils/compressor/CMakeLists.txt +++ b/utils/compressor/CMakeLists.txt @@ -1,17 +1,5 @@ -find_package (Threads) - -add_executable (zstd_test zstd_test.cpp) -if(ZSTD_LIBRARY) - target_link_libraries(zstd_test PRIVATE ${ZSTD_LIBRARY}) -endif() -target_link_libraries (zstd_test PRIVATE common) - add_executable (mutator mutator.cpp) target_link_libraries(mutator PRIVATE clickhouse_common_io) add_executable (decompress_perf decompress_perf.cpp) target_link_libraries(decompress_perf PRIVATE dbms ${LZ4_LIBRARY}) - -if (NOT USE_INTERNAL_ZSTD_LIBRARY AND ZSTD_INCLUDE_DIR) - target_include_directories (zstd_test BEFORE PRIVATE ${ZSTD_INCLUDE_DIR}) -endif () diff --git a/utils/compressor/zstd_test.cpp b/utils/compressor/zstd_test.cpp deleted file mode 100644 index 01016830e7d..00000000000 --- a/utils/compressor/zstd_test.cpp +++ /dev/null @@ -1,68 +0,0 @@ -#include -#include -#include -#include -#include - - -int main(int argc, char ** argv) -{ - bool compress = argc == 1; - - const size_t size = 1048576; - std::vector src_buf(size); - std::vector dst_buf; - - size_t pos = 0; - while (true) - { - ssize_t read_res = read(STDIN_FILENO, &src_buf[pos], size - pos); - if (read_res < 0) - throw std::runtime_error("Cannot read from stdin"); - if (read_res == 0) - break; - pos += read_res; - } - - src_buf.resize(pos); - - size_t zstd_res; - - if (compress) - { - dst_buf.resize(ZSTD_compressBound(src_buf.size())); - - zstd_res = ZSTD_compress( - &dst_buf[0], - dst_buf.size(), - &src_buf[0], - src_buf.size(), - 1); - } - else - { - dst_buf.resize(size); - - zstd_res = ZSTD_decompress( - &dst_buf[0], - dst_buf.size(), - &src_buf[0], - src_buf.size()); - } - - if (ZSTD_isError(zstd_res)) - throw std::runtime_error(ZSTD_getErrorName(zstd_res)); - - dst_buf.resize(zstd_res); - - pos = 0; - while (pos < dst_buf.size()) - { - ssize_t write_res = write(STDOUT_FILENO, &dst_buf[pos], dst_buf.size()); - if (write_res <= 0) - throw std::runtime_error("Cannot write to stdout"); - pos += write_res; - } - - return 0; -} diff --git a/utils/make_changelog.py b/utils/make_changelog.py index 90d12844c33..808fafa6d92 100755 --- a/utils/make_changelog.py +++ b/utils/make_changelog.py @@ -455,7 +455,7 @@ def make_changelog(new_tag, prev_tag, pull_requests_nums, repo, repo_folder, sta # Remove double whitespaces and trailing whitespaces changelog = re.sub(r' {2,}| +$', r''.format(repo), changelog) - print(changelog) + print(changelog.encode('utf-8')) if __name__ == '__main__': diff --git a/utils/release/release_lib.sh b/utils/release/release_lib.sh index 5aa48a60926..ab395c9ad37 100644 --- a/utils/release/release_lib.sh +++ b/utils/release/release_lib.sh @@ -275,8 +275,39 @@ function make_tgz { PACKAGE_DIR=${PACKAGE_DIR=../} for PACKAGE in clickhouse-server clickhouse-client clickhouse-test clickhouse-common-static clickhouse-common-static-dbg; do - alien --verbose --to-tgz ${PACKAGE_DIR}${PACKAGE}_${VERSION_FULL}_*.deb + alien --verbose --scripts --generate --to-tgz ${PACKAGE_DIR}${PACKAGE}_${VERSION_FULL}_*.deb + PKGDIR="./${PACKAGE}-${VERSION_FULL}" + if [ ! -d "$PKGDIR/install" ]; then + mkdir "$PKGDIR/install" + fi + + if [ ! 
-f "$PKGDIR/install/doinst.sh" ]; then + echo '#!/bin/sh' > "$PKGDIR/install/doinst.sh" + echo 'set -e' >> "$PKGDIR/install/doinst.sh" + fi + + SCRIPT_TEXT=' +SCRIPTPATH="$( cd "$(dirname "$0")" ; pwd -P )" +for filepath in `find $SCRIPTPATH/.. -type f -or -type l | grep -v "\.\./install/"`; do + destpath=${filepath##$SCRIPTPATH/..} + mkdir -p $(dirname "$destpath") + cp -r "$filepath" "$destpath" +done +' + + echo "$SCRIPT_TEXT" | sed -i "2r /dev/stdin" "$PKGDIR/install/doinst.sh" + + chmod +x "$PKGDIR/install/doinst.sh" + + if [ -f "/usr/bin/pigz" ]; then + tar --use-compress-program=pigz -cf "${PACKAGE}-${VERSION_FULL}.tgz" "$PKGDIR" + else + tar -czf "${PACKAGE}-${VERSION_FULL}.tgz" "$PKGDIR" + fi + + rm -r $PKGDIR done + mv clickhouse-*-${VERSION_FULL}.tgz ${PACKAGE_DIR} } diff --git a/website/benchmark.html b/website/benchmark.html index 433a9138237..5240cdeeb73 100644 --- a/website/benchmark.html +++ b/website/benchmark.html @@ -2112,13 +2112,6 @@ function generate_diagram() { - - diff --git a/website/benchmark_hardware.html b/website/benchmark_hardware.html new file mode 100644 index 00000000000..a3f7810447e --- /dev/null +++ b/website/benchmark_hardware.html @@ -0,0 +1,990 @@ + + + + + Performance Comparison Of ClickHouse On Various Hardware + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +

Performance Comparison Of ClickHouse On Various Hardware

+
+ +
+ +
+ +

Relative query processing time (lower is better):

+
+ +
+

Full results:

+ +
+ +
+ +
+Results for Lenovo B580 Laptop are from Ragıp Ünal. 16GB RAM 1600 GHz, 240GB SSD, Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz (2 Core / 4 HT)
+Submit your own results: https://clickhouse.yandex/docs/en/operations/performance_test/ +
+ + + + + diff --git a/website/index.html b/website/index.html index afe8d2abcf5..ec686bceefb 100644 --- a/website/index.html +++ b/website/index.html @@ -398,7 +398,7 @@
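The "relative query processing time" on the new hardware benchmark page is a plain normalization: each machine's time for a query divided by the fastest time recorded for that query, so the best machine scores 1. Expressed as ClickHouse SQL over a hypothetical `benchmark_results(machine String, query_num UInt32, time Float64)` table (all names here are assumptions for illustration):

```sql
-- Relative query processing time (lower is better): normalize each
-- machine's timing by the fastest result for the same query.
SELECT
    machine,
    query_num,
    time / min_time AS relative_time
FROM benchmark_results
INNER JOIN
(
    SELECT query_num, min(time) AS min_time
    FROM benchmark_results
    GROUP BY query_num
) USING (query_num)
ORDER BY query_num, relative_time;
```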
diff --git a/website/index.html b/website/index.html index afe8d2abcf5..ec686bceefb 100644 --- a/website/index.html +++ b/website/index.html @@ -398,7 +398,7 @@
 System requirements: Linux, x86_64 with SSE 4.2.
-Install packages for Ubuntu/Debian or CentOS/RedHat:
+Install packages for Ubuntu/Debian or CentOS/RedHat or Other Linux:
 [surrounding HTML markup did not survive extraction]
@@ -422,6 +422,28 @@ sudo yum install clickhouse-server clickhouse-client
 sudo /etc/init.d/clickhouse-server start
 clickhouse-client
+[a new "tgz" tab with the tgz installation commands was added here; its markup did not survive extraction]
 For other operating systems the easiest way to get started is using
@@ -549,7 +571,7 @@ clickhouse-client
 window.location.host = hostParts[0] + '.' + hostParts[1]; }
- var available_distributives = ['deb', 'rpm'];
+ var available_distributives = ['deb', 'rpm', 'tgz'];
 var selected_distributive = 'deb';
 function refresh_distributives() {