Merge remote-tracking branch 'origin' into integration-explicit-configs

Yatsishin Ilya 2020-09-03 11:49:48 +03:00
commit fc075e65da
78 changed files with 1334 additions and 144 deletions

View File

@ -1,3 +1,196 @@
## ClickHouse release 20.7
### ClickHouse release v20.7.2.30-stable, 2020-08-31
#### Backward Incompatible Change
* Function `modulo` (operator `%`) with at least one floating point number as argument will calculate the remainder of division directly on floating point numbers, without converting both arguments to integers. This makes the behaviour compatible with most other DBMSs. It also applies to the Date and DateTime data types. Added alias `mod` (see the example after this list). This closes [#7323](https://github.com/ClickHouse/ClickHouse/issues/7323). [#12585](https://github.com/ClickHouse/ClickHouse/pull/12585) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Deprecate special printing of zero Date/DateTime values as `0000-00-00` and `0000-00-00 00:00:00`. [#12442](https://github.com/ClickHouse/ClickHouse/pull/12442) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* The function `groupArrayMoving*` was not working for distributed queries. Its result was calculated with an incorrect data type (without promotion to the largest type). The function `groupArrayMovingAvg` was returning an integer number, which was inconsistent with the `avg` function. This fixes [#12568](https://github.com/ClickHouse/ClickHouse/issues/12568). [#12622](https://github.com/ClickHouse/ClickHouse/pull/12622) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add sanity check for MergeTree settings. If the settings are incorrect, the server will refuse to start or to create a table, printing a detailed explanation to the user. [#13153](https://github.com/ClickHouse/ClickHouse/pull/13153) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Protect against cases when the user sets `background_pool_size` to a value lower than `number_of_free_entries_in_pool_to_execute_mutation` or `number_of_free_entries_in_pool_to_lower_max_size_of_merge`. In these cases ALTERs won't work, or the maximum size of a merge will be too limited. An exception explaining what to do is thrown. This closes [#10897](https://github.com/ClickHouse/ClickHouse/issues/10897). [#12728](https://github.com/ClickHouse/ClickHouse/pull/12728) ([alexey-milovidov](https://github.com/alexey-milovidov)).
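A small illustration of the new `modulo` behaviour mentioned above (a hedged sketch; the results shown follow from floating-point arithmetic rather than from output copied off a server):

```sql
SELECT 7.5 % 2;       -- remainder is now computed on Float64 directly: 1.5
SELECT mod(7.5, 2);   -- `mod` is the new alias for `modulo`
-- Before this release both arguments were converted to integers first,
-- so the same expression effectively evaluated as 7 % 2 = 1.
```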
#### New Feature
* Polygon dictionary type that provides efficient "reverse geocoding" lookups, i.e. finding the region containing given coordinates in a dictionary of many polygons (world map). It uses a carefully optimized algorithm with recursive grids to maintain low CPU and memory usage. [#9278](https://github.com/ClickHouse/ClickHouse/pull/9278) ([achulkov2](https://github.com/achulkov2)).
* Added support for LDAP authentication for preconfigured users ("Simple Bind" method). [#11234](https://github.com/ClickHouse/ClickHouse/pull/11234) ([Denis Glazachev](https://github.com/traceon)).
* Introduce setting `alter_partition_verbose_result` which outputs information about touched parts for some types of `ALTER TABLE ... PARTITION ...` queries (currently `ATTACH` and `FREEZE`). Closes [#8076](https://github.com/ClickHouse/ClickHouse/issues/8076). [#13017](https://github.com/ClickHouse/ClickHouse/pull/13017) ([alesapin](https://github.com/alesapin)).
* Add `bayesAB` function for Bayesian A/B testing. [#12327](https://github.com/ClickHouse/ClickHouse/pull/12327) ([achimbab](https://github.com/achimbab)).
* Added `system.crash_log` table into which stack traces for fatal errors are collected. This table should be empty. [#12316](https://github.com/ClickHouse/ClickHouse/pull/12316) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Added HTTP headers `X-ClickHouse-Database` and `X-ClickHouse-Format` which may be used to set the default database and output format. [#12981](https://github.com/ClickHouse/ClickHouse/pull/12981) ([hcz](https://github.com/hczhcz)).
* Add `minMap` and `maxMap` functions support to `SimpleAggregateFunction`. [#12662](https://github.com/ClickHouse/ClickHouse/pull/12662) ([Ildus Kurbangaliev](https://github.com/ildus)).
* Add setting `allow_non_metadata_alters` which restricts execution of `ALTER` queries that modify data on disk. Disabled by default. Closes [#11547](https://github.com/ClickHouse/ClickHouse/issues/11547). [#12635](https://github.com/ClickHouse/ClickHouse/pull/12635) ([alesapin](https://github.com/alesapin)).
* A function `formatRow` is added to support turning arbitrary expressions into a string via a given format. It is useful for manipulating SQL outputs and is quite versatile when combined with the `columns` function. [#12574](https://github.com/ClickHouse/ClickHouse/pull/12574) ([Amos Bird](https://github.com/amosbird)).
* Add `FROM_UNIXTIME` function for compatibility with MySQL, related to [12149](https://github.com/ClickHouse/ClickHouse/issues/12149). [#12484](https://github.com/ClickHouse/ClickHouse/pull/12484) ([flynn](https://github.com/ucasFL)).
* Allow Nullable types as keys in MergeTree tables if the `allow_nullable_key` table setting is enabled. See [#5319](https://github.com/ClickHouse/ClickHouse/issues/5319). [#12433](https://github.com/ClickHouse/ClickHouse/pull/12433) ([Amos Bird](https://github.com/amosbird)).
* Integration with [COS](https://intl.cloud.tencent.com/product/cos). [#12386](https://github.com/ClickHouse/ClickHouse/pull/12386) ([fastio](https://github.com/fastio)).
* Add `mapAdd` and `mapSubtract` functions for adding/subtracting key-mapped values (illustrated after this list). [#11735](https://github.com/ClickHouse/ClickHouse/pull/11735) ([Ildus Kurbangaliev](https://github.com/ildus)).
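A few of the new functions above, sketched with assumed toy values (exact output may depend on server version and session timezone):

```sql
-- MySQL-compatible conversion of a unix timestamp; the two-argument form applies a format string
SELECT FROM_UNIXTIME(1234567890) AS dt,
       FROM_UNIXTIME(1234567890, '%Y-%m-%d') AS formatted;

-- Turn arbitrary expressions into a string in a given format
SELECT formatRow('CSV', number, toString(number)) FROM numbers(3);

-- Sum values of matching keys in (keys, values) pairs
SELECT mapAdd(([1, 2], [10, 10]), ([1, 3], [5, 5]));   -- expected: ([1, 2, 3], [15, 10, 5])
```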
#### Bug Fix
* Fix premature `ON CLUSTER` timeouts for queries that must be executed on a single replica. Fixes [#6704](https://github.com/ClickHouse/ClickHouse/issues/6704), [#7228](https://github.com/ClickHouse/ClickHouse/issues/7228), [#13361](https://github.com/ClickHouse/ClickHouse/issues/13361), [#11884](https://github.com/ClickHouse/ClickHouse/issues/11884). [#13450](https://github.com/ClickHouse/ClickHouse/pull/13450) ([alesapin](https://github.com/alesapin)).
* Fix crash in mark inclusion search introduced in https://github.com/ClickHouse/ClickHouse/pull/12277. [#14225](https://github.com/ClickHouse/ClickHouse/pull/14225) ([Amos Bird](https://github.com/amosbird)).
* Fix race condition in external dictionaries with cache layout which could lead to a server crash. [#12566](https://github.com/ClickHouse/ClickHouse/pull/12566) ([alesapin](https://github.com/alesapin)).
* Fix visible data clobbering by progress bar in client in interactive mode. This fixes [#12562](https://github.com/ClickHouse/ClickHouse/issues/12562) and [#13369](https://github.com/ClickHouse/ClickHouse/issues/13369) and [#13584](https://github.com/ClickHouse/ClickHouse/issues/13584) and fixes [#12964](https://github.com/ClickHouse/ClickHouse/issues/12964). [#13691](https://github.com/ClickHouse/ClickHouse/pull/13691) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed incorrect sorting order for `LowCardinality` columns when ORDER BY multiple columns is used. This fixes [#13958](https://github.com/ClickHouse/ClickHouse/issues/13958). [#14223](https://github.com/ClickHouse/ClickHouse/pull/14223) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Removed hardcoded timeout, which wrongly overruled `query_wait_timeout_milliseconds` setting for cache-dictionary. [#14105](https://github.com/ClickHouse/ClickHouse/pull/14105) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed wrong mount point in extra info for `Poco::Exception: no space left on device`. [#14050](https://github.com/ClickHouse/ClickHouse/pull/14050) ([tavplubix](https://github.com/tavplubix)).
* Fix wrong query optimization of select queries with `DISTINCT` keyword when subqueries also have `DISTINCT` in case `optimize_duplicate_order_by_and_distinct` setting is enabled. [#13925](https://github.com/ClickHouse/ClickHouse/pull/13925) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed potential deadlock when renaming `Distributed` table. [#13922](https://github.com/ClickHouse/ClickHouse/pull/13922) ([tavplubix](https://github.com/tavplubix)).
* Fix incorrect sorting for `FixedString` columns when ORDER BY multiple columns is used. Fixes [#13182](https://github.com/ClickHouse/ClickHouse/issues/13182). [#13887](https://github.com/ClickHouse/ClickHouse/pull/13887) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix potentially lower precision of `topK`/`topKWeighted` aggregations (with non-default parameters). [#13817](https://github.com/ClickHouse/ClickHouse/pull/13817) ([Azat Khuzhin](https://github.com/azat)).
* Fix failure when reading from a MergeTree table with an `INDEX` of type `SET` when it is compared against `NULL`. This fixes [#13686](https://github.com/ClickHouse/ClickHouse/issues/13686). [#13793](https://github.com/ClickHouse/ClickHouse/pull/13793) ([Amos Bird](https://github.com/amosbird)).
* Fix step overflow in function `range()`. [#13790](https://github.com/ClickHouse/ClickHouse/pull/13790) ([Azat Khuzhin](https://github.com/azat)).
* Fixed `Directory not empty` error when concurrently executing `DROP DATABASE` and `CREATE TABLE`. [#13756](https://github.com/ClickHouse/ClickHouse/pull/13756) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add range check for `h3KRing` function. This fixes [#13633](https://github.com/ClickHouse/ClickHouse/issues/13633). [#13752](https://github.com/ClickHouse/ClickHouse/pull/13752) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix race condition between DETACH and background merges. Parts may revive after detach. This is continuation of [#8602](https://github.com/ClickHouse/ClickHouse/issues/8602) that did not fix the issue but introduced a test that started to fail in very rare cases, demonstrating the issue. [#13746](https://github.com/ClickHouse/ClickHouse/pull/13746) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix logging of Settings.Names/Values when `log_queries_min_type` is greater than `QUERY_START`. [#13737](https://github.com/ClickHouse/ClickHouse/pull/13737) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect message in `clickhouse-server.init` while checking user and group. [#13711](https://github.com/ClickHouse/ClickHouse/pull/13711) ([ylchou](https://github.com/ylchou)).
* Do not optimize `any(arrayJoin())` to `arrayJoin()` under `optimize_move_functions_out_of_any`. [#13681](https://github.com/ClickHouse/ClickHouse/pull/13681) ([Azat Khuzhin](https://github.com/azat)).
* Fixed possible deadlock in concurrent `ALTER ... REPLACE/MOVE PARTITION ...` queries. [#13626](https://github.com/ClickHouse/ClickHouse/pull/13626) ([tavplubix](https://github.com/tavplubix)).
* Fixed the behaviour when sometimes cache-dictionary returned default value instead of present value from source. [#13624](https://github.com/ClickHouse/ClickHouse/pull/13624) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix secondary indices corruption in compact parts (compact parts is an experimental feature). [#13538](https://github.com/ClickHouse/ClickHouse/pull/13538) ([Anton Popov](https://github.com/CurtizJ)).
* Fix wrong code in function `netloc`. This fixes [#13335](https://github.com/ClickHouse/ClickHouse/issues/13335). [#13446](https://github.com/ClickHouse/ClickHouse/pull/13446) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix error in `parseDateTimeBestEffort` function when unix timestamp was passed as an argument. This fixes [#13362](https://github.com/ClickHouse/ClickHouse/issues/13362). [#13441](https://github.com/ClickHouse/ClickHouse/pull/13441) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix invalid return type for comparison of tuples with `NULL` elements. Fixes [#12461](https://github.com/ClickHouse/ClickHouse/issues/12461). [#13420](https://github.com/ClickHouse/ClickHouse/pull/13420) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix wrong optimization caused `aggregate function any(x) is found inside another aggregate function in query` error with `SET optimize_move_functions_out_of_any = 1` and aliases inside `any()`. [#13419](https://github.com/ClickHouse/ClickHouse/pull/13419) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix possible race in `StorageMemory`. [#13416](https://github.com/ClickHouse/ClickHouse/pull/13416) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix empty output for the `Arrow` and `Parquet` formats in case a query returns zero rows. This was done because empty output is not valid for these formats. [#13399](https://github.com/ClickHouse/ClickHouse/pull/13399) ([hcz](https://github.com/hczhcz)).
* Fix select queries with constant columns and prefix of primary key in `ORDER BY` clause. [#13396](https://github.com/ClickHouse/ClickHouse/pull/13396) ([Anton Popov](https://github.com/CurtizJ)).
* Fix `PrettyCompactMonoBlock` for clickhouse-local. Fix extremes/totals with `PrettyCompactMonoBlock`. Fixes [#7746](https://github.com/ClickHouse/ClickHouse/issues/7746). [#13394](https://github.com/ClickHouse/ClickHouse/pull/13394) ([Azat Khuzhin](https://github.com/azat)).
* Fixed deadlock in system.text_log. [#12452](https://github.com/ClickHouse/ClickHouse/pull/12452) ([alexey-milovidov](https://github.com/alexey-milovidov)). It is a part of [#12339](https://github.com/ClickHouse/ClickHouse/issues/12339). This fixes [#12325](https://github.com/ClickHouse/ClickHouse/issues/12325). [#13386](https://github.com/ClickHouse/ClickHouse/pull/13386) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed `File(TSVWithNames*)` (header was written multiple times), fixed `clickhouse-local --format CSVWithNames*` (lacks header, broken after [#12197](https://github.com/ClickHouse/ClickHouse/issues/12197)), fixed `clickhouse-local --format CSVWithNames*` with zero rows (lacks header). [#13343](https://github.com/ClickHouse/ClickHouse/pull/13343) ([Azat Khuzhin](https://github.com/azat)).
* Fix segfault when function `groupArrayMovingSum` deserializes empty state. Fixes [#13339](https://github.com/ClickHouse/ClickHouse/issues/13339). [#13341](https://github.com/ClickHouse/ClickHouse/pull/13341) ([alesapin](https://github.com/alesapin)).
* Throw error on `arrayJoin()` function in `JOIN ON` section. [#13330](https://github.com/ClickHouse/ClickHouse/pull/13330) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix crash in `LEFT ASOF JOIN` with `join_use_nulls=1`. [#13291](https://github.com/ClickHouse/ClickHouse/pull/13291) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix possible error `Totals having transform was already added to pipeline` in case of a query from delayed replica. [#13290](https://github.com/ClickHouse/ClickHouse/pull/13290) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* The server may crash if user passed specifically crafted arguments to the function `h3ToChildren`. This fixes [#13275](https://github.com/ClickHouse/ClickHouse/issues/13275). [#13277](https://github.com/ClickHouse/ClickHouse/pull/13277) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix potentially low performance and slightly incorrect result for `uniqExact`, `topK`, `sumDistinct` and similar aggregate functions called on Float types with `NaN` values. It also triggered assert in debug build. This fixes [#12491](https://github.com/ClickHouse/ClickHouse/issues/12491). [#13254](https://github.com/ClickHouse/ClickHouse/pull/13254) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix assertion in KeyCondition when primary key contains expression with monotonic function and query contains comparison with constant whose type is different. This fixes [#12465](https://github.com/ClickHouse/ClickHouse/issues/12465). [#13251](https://github.com/ClickHouse/ClickHouse/pull/13251) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Return passed number for numbers with MSB set in function roundUpToPowerOfTwoOrZero(). It prevents potential errors in case of overflow of array sizes. [#13234](https://github.com/ClickHouse/ClickHouse/pull/13234) ([Azat Khuzhin](https://github.com/azat)).
* Fix function `if` with a nullable constexpr as condition that is not a literal `NULL`. Fixes [#12463](https://github.com/ClickHouse/ClickHouse/issues/12463). [#13226](https://github.com/ClickHouse/ClickHouse/pull/13226) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix assert in the `arrayElement` function in case array elements are Nullable and the array subscript is also Nullable. This fixes [#12172](https://github.com/ClickHouse/ClickHouse/issues/12172). [#13224](https://github.com/ClickHouse/ClickHouse/pull/13224) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix DateTime64 conversion functions with constant argument. [#13205](https://github.com/ClickHouse/ClickHouse/pull/13205) ([Azat Khuzhin](https://github.com/azat)).
* Fix parsing row policies from users.xml when names of databases or tables contain dots. This fixes [#5779](https://github.com/ClickHouse/ClickHouse/issues/5779) and [#12527](https://github.com/ClickHouse/ClickHouse/issues/12527). [#13199](https://github.com/ClickHouse/ClickHouse/pull/13199) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix access to `redis` dictionary after connection was dropped once. It may happen with `cache` and `direct` dictionary layouts. [#13082](https://github.com/ClickHouse/ClickHouse/pull/13082) ([Anton Popov](https://github.com/CurtizJ)).
* Fix wrong index analysis with functions. It could lead to some data parts being skipped when reading from `MergeTree` tables. Fixes [#13060](https://github.com/ClickHouse/ClickHouse/issues/13060). Fixes [#12406](https://github.com/ClickHouse/ClickHouse/issues/12406). [#13081](https://github.com/ClickHouse/ClickHouse/pull/13081) ([Anton Popov](https://github.com/CurtizJ)).
* Fix error `Cannot convert column because it is constant but values of constants are different in source and result` for remote queries which use deterministic functions in scope of query, but not deterministic between queries, like `now()`, `now64()`, `randConstant()`. Fixes [#11327](https://github.com/ClickHouse/ClickHouse/issues/11327). [#13075](https://github.com/ClickHouse/ClickHouse/pull/13075) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix crash which was possible for queries with `ORDER BY` tuple and small `LIMIT`. Fixes [#12623](https://github.com/ClickHouse/ClickHouse/issues/12623). [#13009](https://github.com/ClickHouse/ClickHouse/pull/13009) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix `Block structure mismatch` error for queries with `UNION` and `JOIN`. Fixes [#12602](https://github.com/ClickHouse/ClickHouse/issues/12602). [#12989](https://github.com/ClickHouse/ClickHouse/pull/12989) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Corrected `merge_with_ttl_timeout` logic which did not work well when expiration affected more than one partition over one time interval. (Authored by @excitoon). [#12982](https://github.com/ClickHouse/ClickHouse/pull/12982) ([Alexander Kazakov](https://github.com/Akazz)).
* Fix columns duplication for range hashed dictionary created from DDL query. This fixes [#10605](https://github.com/ClickHouse/ClickHouse/issues/10605). [#12857](https://github.com/ClickHouse/ClickHouse/pull/12857) ([alesapin](https://github.com/alesapin)).
* Fix unnecessary limiting for the number of threads for selects from local replica. [#12840](https://github.com/ClickHouse/ClickHouse/pull/12840) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix rare bug when `ALTER DELETE` and `ALTER MODIFY COLUMN` queries were executed simultaneously as a single mutation. The bug led to an incorrect number of rows in `count.txt` and, as a consequence, incorrect data in the part. Also, fix a small bug with simultaneous `ALTER RENAME COLUMN` and `ALTER ADD COLUMN`. [#12760](https://github.com/ClickHouse/ClickHouse/pull/12760) ([alesapin](https://github.com/alesapin)).
* Fix wrong credentials being used when using the `clickhouse` dictionary source to query remote tables. [#12756](https://github.com/ClickHouse/ClickHouse/pull/12756) ([sundyli](https://github.com/sundy-li)).
* Fix `CAST(Nullable(String), Enum())`. [#12745](https://github.com/ClickHouse/ClickHouse/pull/12745) ([Azat Khuzhin](https://github.com/azat)).
* Fix performance with large tuples, which are interpreted as functions in the `IN` section. This is the case when a user writes `WHERE x IN tuple(1, 2, ...)` instead of `WHERE x IN (1, 2, ...)` for some obscure reason (see the example after this list). [#12700](https://github.com/ClickHouse/ClickHouse/pull/12700) ([Anton Popov](https://github.com/CurtizJ)).
* Fix memory tracking for input_format_parallel_parsing (by attaching thread to group). [#12672](https://github.com/ClickHouse/ClickHouse/pull/12672) ([Azat Khuzhin](https://github.com/azat)).
* Fix wrong optimization `optimize_move_functions_out_of_any=1` in case of `any(func(<lambda>))`. [#12664](https://github.com/ClickHouse/ClickHouse/pull/12664) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed bloom filter index with const expressions. This fixes [#10572](https://github.com/ClickHouse/ClickHouse/issues/10572). [#12659](https://github.com/ClickHouse/ClickHouse/pull/12659) ([Winter Zhang](https://github.com/zhang2014)).
* Fix SIGSEGV in StorageKafka when broker is unavailable (and not only). [#12658](https://github.com/ClickHouse/ClickHouse/pull/12658) ([Azat Khuzhin](https://github.com/azat)).
* Add support for function `if` with `Array(UUID)` arguments. This fixes [#11066](https://github.com/ClickHouse/ClickHouse/issues/11066). [#12648](https://github.com/ClickHouse/ClickHouse/pull/12648) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* `CREATE USER IF NOT EXISTS` now doesn't throw an exception if the user exists. This fixes [#12507](https://github.com/ClickHouse/ClickHouse/issues/12507). [#12646](https://github.com/ClickHouse/ClickHouse/pull/12646) ([Vitaly Baranov](https://github.com/vitlibar)).
* Exception `There is no supertype...` can be thrown during `ALTER ... UPDATE` in unexpected cases (e.g. when subtracting from UInt64 column). This fixes [#7306](https://github.com/ClickHouse/ClickHouse/issues/7306). This fixes [#4165](https://github.com/ClickHouse/ClickHouse/issues/4165). [#12633](https://github.com/ClickHouse/ClickHouse/pull/12633) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible `Pipeline stuck` error for queries with external sorting. Fixes [#12617](https://github.com/ClickHouse/ClickHouse/issues/12617). [#12618](https://github.com/ClickHouse/ClickHouse/pull/12618) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix error `Output of TreeExecutor is not sorted` for `OPTIMIZE DEDUPLICATE`. Fixes [#11572](https://github.com/ClickHouse/ClickHouse/issues/11572). [#12613](https://github.com/ClickHouse/ClickHouse/pull/12613) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix the issue when an alias on the result of function `any` could be lost during query optimization. [#12593](https://github.com/ClickHouse/ClickHouse/pull/12593) ([Anton Popov](https://github.com/CurtizJ)).
* Remove data for Distributed tables (blocks from async INSERTs) on DROP TABLE. [#12556](https://github.com/ClickHouse/ClickHouse/pull/12556) ([Azat Khuzhin](https://github.com/azat)).
* Now ClickHouse will recalculate checksums for parts when file `checksums.txt` is absent. Broken since [#9827](https://github.com/ClickHouse/ClickHouse/issues/9827). [#12545](https://github.com/ClickHouse/ClickHouse/pull/12545) ([alesapin](https://github.com/alesapin)).
* Fix bug which led to broken old parts after `ALTER DELETE` query when `enable_mixed_granularity_parts=1`. Fixes [#12536](https://github.com/ClickHouse/ClickHouse/issues/12536). [#12543](https://github.com/ClickHouse/ClickHouse/pull/12543) ([alesapin](https://github.com/alesapin)).
* Fix race condition in live view tables which could cause data duplication. LIVE VIEW is an experimental feature. [#12519](https://github.com/ClickHouse/ClickHouse/pull/12519) ([vzakaznikov](https://github.com/vzakaznikov)).
* Fix backwards compatibility in binary format of `AggregateFunction(avg, ...)` values. This fixes [#12342](https://github.com/ClickHouse/ClickHouse/issues/12342). [#12486](https://github.com/ClickHouse/ClickHouse/pull/12486) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash in JOIN with dictionary when we are joining over expression of dictionary key: `t JOIN dict ON expr(dict.id) = t.id`. Disable dictionary join optimisation for this case. [#12458](https://github.com/ClickHouse/ClickHouse/pull/12458) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix overflow when very large LIMIT or OFFSET is specified. This fixes [#10470](https://github.com/ClickHouse/ClickHouse/issues/10470). This fixes [#11372](https://github.com/ClickHouse/ClickHouse/issues/11372). [#12427](https://github.com/ClickHouse/ClickHouse/pull/12427) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Kafka: fix SIGSEGV if there is a message with an error in the middle of a batch. [#12302](https://github.com/ClickHouse/ClickHouse/pull/12302) ([Azat Khuzhin](https://github.com/azat)).
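To illustrate the `IN` performance item above, both spellings below are equivalent; the second one is the previously slow tuple form (a minimal sketch):

```sql
SELECT count() FROM numbers(10) WHERE number IN (1, 2, 3);        -- usual literal list
SELECT count() FROM numbers(10) WHERE number IN tuple(1, 2, 3);   -- same result; large tuples were slow before the fix
```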
#### Improvement
* Keep a smaller amount of logs in ZooKeeper. Avoid excessive growth of ZooKeeper nodes in case of offline replicas when there are many servers/tables/inserts. [#13100](https://github.com/ClickHouse/ClickHouse/pull/13100) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Now exceptions are forwarded to the client if an error happens during ALTER or mutation. Closes [#11329](https://github.com/ClickHouse/ClickHouse/issues/11329). [#12666](https://github.com/ClickHouse/ClickHouse/pull/12666) ([alesapin](https://github.com/alesapin)).
* Add `QueryTimeMicroseconds`, `SelectQueryTimeMicroseconds` and `InsertQueryTimeMicroseconds` to `system.events`, along with system.metrics, processes, query_log, etc. [#13028](https://github.com/ClickHouse/ClickHouse/pull/13028) ([ianton-ru](https://github.com/ianton-ru)).
* Added `SelectedRows` and `SelectedBytes` to `system.events`, along with system.metrics, processes, query_log, etc. [#12638](https://github.com/ClickHouse/ClickHouse/pull/12638) ([ianton-ru](https://github.com/ianton-ru)).
* Added `current_database` information to `system.query_log`. [#12652](https://github.com/ClickHouse/ClickHouse/pull/12652) ([Amos Bird](https://github.com/amosbird)).
* Allow `TabSeparatedRaw` as input format. [#12009](https://github.com/ClickHouse/ClickHouse/pull/12009) ([hcz](https://github.com/hczhcz)).
* Now `joinGet` supports multi-key lookup. [#12418](https://github.com/ClickHouse/ClickHouse/pull/12418) ([Amos Bird](https://github.com/amosbird)).
* Allow `*Map` aggregate functions to work on Arrays with NULLs. Fixes [#13157](https://github.com/ClickHouse/ClickHouse/issues/13157). [#13225](https://github.com/ClickHouse/ClickHouse/pull/13225) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid overflow in parsing of DateTime values that would lead to a negative unix timestamp in their timezone (for example, `1970-01-01 00:00:00` in Moscow). Saturate to zero instead. This fixes [#3470](https://github.com/ClickHouse/ClickHouse/issues/3470). This fixes [#4172](https://github.com/ClickHouse/ClickHouse/issues/4172). [#12443](https://github.com/ClickHouse/ClickHouse/pull/12443) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* AvroConfluent: Skip Kafka tombstone records - Support skipping broken records [#13203](https://github.com/ClickHouse/ClickHouse/pull/13203) ([Andrew Onyshchuk](https://github.com/oandrew)).
* Fix wrong error for long queries. It was possible to get a syntax error other than `Max query size exceeded` for a correct query. [#13928](https://github.com/ClickHouse/ClickHouse/pull/13928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix data race in `lgamma` function. This race was caught only in `tsan`, no side effects really happened. [#13842](https://github.com/ClickHouse/ClickHouse/pull/13842) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a 'Week'-interval formatting for ATTACH/ALTER/CREATE QUOTA-statements. [#13417](https://github.com/ClickHouse/ClickHouse/pull/13417) ([vladimir-golovchenko](https://github.com/vladimir-golovchenko)).
* Now broken parts are also reported when encountered in compact part processing. Compact parts is an experimental feature. [#13282](https://github.com/ClickHouse/ClickHouse/pull/13282) ([Amos Bird](https://github.com/amosbird)).
* Fix assert in `geohashesInBox`. This fixes [#12554](https://github.com/ClickHouse/ClickHouse/issues/12554). [#13229](https://github.com/ClickHouse/ClickHouse/pull/13229) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix assert in `parseDateTimeBestEffort`. This fixes [#12649](https://github.com/ClickHouse/ClickHouse/issues/12649). [#13227](https://github.com/ClickHouse/ClickHouse/pull/13227) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Minor optimization in Processors/PipelineExecutor: breaking out of a loop because it makes sense to do so. [#13058](https://github.com/ClickHouse/ClickHouse/pull/13058) ([Mark Papadakis](https://github.com/markpapadakis)).
* Support `TRUNCATE` of a table without the `TABLE` keyword (see the example after this list). [#12653](https://github.com/ClickHouse/ClickHouse/pull/12653) ([Winter Zhang](https://github.com/zhang2014)).
* Fix `EXPLAIN` query format being overwritten by default. This fixes [#12432](https://github.com/ClickHouse/ClickHouse/issues/12432). [#12541](https://github.com/ClickHouse/ClickHouse/pull/12541) ([BohuTANG](https://github.com/BohuTANG)).
* Allow to set JOIN kind and type in a more standard way: `LEFT SEMI JOIN` instead of `SEMI LEFT JOIN`. For now both are correct. [#12520](https://github.com/ClickHouse/ClickHouse/pull/12520) ([Artem Zuikov](https://github.com/4ertus2)).
* Change the default value of `multiple_joins_rewriter_version` to 2. It enables the new multiple joins rewriter that knows about column names. [#12469](https://github.com/ClickHouse/ClickHouse/pull/12469) ([Artem Zuikov](https://github.com/4ertus2)).
* Add several metrics for requests to S3 storages. [#12464](https://github.com/ClickHouse/ClickHouse/pull/12464) ([ianton-ru](https://github.com/ianton-ru)).
* Use correct default secure port for clickhouse-benchmark with `--secure` argument. This fixes [#11044](https://github.com/ClickHouse/ClickHouse/issues/11044). [#12440](https://github.com/ClickHouse/ClickHouse/pull/12440) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Rollback insertion errors in `Log`, `TinyLog`, `StripeLog` engines. In previous versions an insertion error led to an inconsistent table state (this works as documented and is normal for these table engines). This fixes [#12402](https://github.com/ClickHouse/ClickHouse/issues/12402). [#12426](https://github.com/ClickHouse/ClickHouse/pull/12426) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Implement `RENAME DATABASE` and `RENAME DICTIONARY` for `Atomic` database engine - Add implicit `{uuid}` macro, which can be used in ZooKeeper path for `ReplicatedMergeTree`. It works with `CREATE ... ON CLUSTER ...` queries. Set `show_table_uuid_in_table_create_query_if_not_nil` to `true` to use it. - Make `ReplicatedMergeTree` engine arguments optional, `/clickhouse/tables/{uuid}/{shard}/` and `{replica}` are used by default. Closes [#12135](https://github.com/ClickHouse/ClickHouse/issues/12135). - Minor fixes. - These changes break backward compatibility of `Atomic` database engine. Previously created `Atomic` databases must be manually converted to new format. Atomic database is an experimental feature. [#12343](https://github.com/ClickHouse/ClickHouse/pull/12343) ([tavplubix](https://github.com/tavplubix)).
* Separated `AWSAuthV4Signer` into different logger, removed excessive `AWSClient: AWSClient` from log messages. [#12320](https://github.com/ClickHouse/ClickHouse/pull/12320) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Better exception message in disk access storage. [#12625](https://github.com/ClickHouse/ClickHouse/pull/12625) ([alesapin](https://github.com/alesapin)).
* Better exception for function `in` with invalid number of arguments. [#12529](https://github.com/ClickHouse/ClickHouse/pull/12529) ([Anton Popov](https://github.com/CurtizJ)).
* Fix error message about adaptive granularity. [#12624](https://github.com/ClickHouse/ClickHouse/pull/12624) ([alesapin](https://github.com/alesapin)).
* Fix SETTINGS parse after FORMAT. [#12480](https://github.com/ClickHouse/ClickHouse/pull/12480) ([Azat Khuzhin](https://github.com/azat)).
* If a MergeTree table did not contain ORDER BY or PARTITION BY, it was possible to request ALTER to CLEAR all the columns and the ALTER would get stuck. Fixed [#7941](https://github.com/ClickHouse/ClickHouse/issues/7941). [#12382](https://github.com/ClickHouse/ClickHouse/pull/12382) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid re-loading completion from the history file after each query (to avoid history overlaps with other client sessions). [#13086](https://github.com/ClickHouse/ClickHouse/pull/13086) ([Azat Khuzhin](https://github.com/azat)).
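Two of the syntax improvements above, sketched with placeholder table names `t1` and `t2`:

```sql
-- TRUNCATE now works without the TABLE keyword
TRUNCATE t1;

-- Standard-style spelling is now accepted alongside the older SEMI LEFT JOIN form
SELECT t1.id FROM t1 LEFT SEMI JOIN t2 ON t1.id = t2.id;
```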
#### Performance Improvement
* Lower memory usage for some operations up to 2 times. [#12424](https://github.com/ClickHouse/ClickHouse/pull/12424) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Optimize PK lookup for queries that match exact PK range. [#12277](https://github.com/ClickHouse/ClickHouse/pull/12277) ([Ivan Babrou](https://github.com/bobrik)).
* Slightly optimize very short queries with `LowCardinality`. [#14129](https://github.com/ClickHouse/ClickHouse/pull/14129) ([Anton Popov](https://github.com/CurtizJ)).
* Slightly improve performance of aggregation by UInt8/UInt16 keys. [#13091](https://github.com/ClickHouse/ClickHouse/pull/13091) and [#13055](https://github.com/ClickHouse/ClickHouse/pull/13055) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Push down `LIMIT` step for query plan (inside subqueries). [#13016](https://github.com/ClickHouse/ClickHouse/pull/13016) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Parallel primary key lookup and skipping index stages on parts, as described in [#11564](https://github.com/ClickHouse/ClickHouse/issues/11564). [#12589](https://github.com/ClickHouse/ClickHouse/pull/12589) ([Ivan Babrou](https://github.com/bobrik)).
* Convert String-type arguments of the functions `if` and `transform` into Enum if `optimize_if_transform_strings_to_enum = 1` is set. [#12515](https://github.com/ClickHouse/ClickHouse/pull/12515) ([Artem Zuikov](https://github.com/4ertus2)).
* Replace monotonic functions with their argument in `ORDER BY` if `optimize_monotonous_functions_in_order_by = 1` is set. [#12467](https://github.com/ClickHouse/ClickHouse/pull/12467) ([Artem Zuikov](https://github.com/4ertus2)).
* Add an `ORDER BY` optimization that rewrites `ORDER BY x, f(x)` as `ORDER BY x` if `optimize_redundant_functions_in_order_by = 1` is set (see the example after this list). [#12404](https://github.com/ClickHouse/ClickHouse/pull/12404) ([Artem Zuikov](https://github.com/4ertus2)).
* Allow pushdown predicate when subquery contains `WITH` clause. This fixes [#12293](https://github.com/ClickHouse/ClickHouse/issues/12293) [#12663](https://github.com/ClickHouse/ClickHouse/pull/12663) ([Winter Zhang](https://github.com/zhang2014)).
* Improve performance of reading from compact parts. Compact parts is an experimental feature. [#12492](https://github.com/ClickHouse/ClickHouse/pull/12492) ([Anton Popov](https://github.com/CurtizJ)).
* Attempt to implement streaming optimization in `DiskS3`. DiskS3 is an experimental feature. [#12434](https://github.com/ClickHouse/ClickHouse/pull/12434) ([Vladimir Chebotarev](https://github.com/excitoon)).
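A sketch of the `ORDER BY` rewrites listed above; the query text stays as written and the rewrite happens internally once the settings are enabled:

```sql
SET optimize_redundant_functions_in_order_by = 1;
SET optimize_monotonous_functions_in_order_by = 1;

-- exp(x) is redundant once x is already a sort key, so this is internally rewritten to ORDER BY x
SELECT number AS x FROM numbers(5) ORDER BY x, exp(x);
```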
#### Build/Testing/Packaging Improvement
* Use `shellcheck` for sh tests linting. [#13200](https://github.com/ClickHouse/ClickHouse/pull/13200) [#13207](https://github.com/ClickHouse/ClickHouse/pull/13207) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add a script which sets labels for pull requests in the GitHub hook. [#13183](https://github.com/ClickHouse/ClickHouse/pull/13183) ([alesapin](https://github.com/alesapin)).
* Remove some of recursive submodules. See [#13378](https://github.com/ClickHouse/ClickHouse/issues/13378). [#13379](https://github.com/ClickHouse/ClickHouse/pull/13379) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Ensure that all the submodules are from proper URLs. Continuation of [#13379](https://github.com/ClickHouse/ClickHouse/issues/13379). This fixes [#13378](https://github.com/ClickHouse/ClickHouse/issues/13378). [#13397](https://github.com/ClickHouse/ClickHouse/pull/13397) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Added support for user-declared settings, which can be accessed from inside queries. This is needed when ClickHouse engine is used as a component of another system. [#13013](https://github.com/ClickHouse/ClickHouse/pull/13013) ([Vitaly Baranov](https://github.com/vitlibar)).
* Added testing for RBAC functionality of INSERT privilege in TestFlows. Expanded tables on which SELECT is being tested. Added Requirements to match new table engine tests. [#13340](https://github.com/ClickHouse/ClickHouse/pull/13340) ([MyroTk](https://github.com/MyroTk)).
* Fix timeout error during server restart in the stress test. [#13321](https://github.com/ClickHouse/ClickHouse/pull/13321) ([alesapin](https://github.com/alesapin)).
* Now the fast test will wait for the server with retries. [#13284](https://github.com/ClickHouse/ClickHouse/pull/13284) ([alesapin](https://github.com/alesapin)).
* Function `materialize()` (the function for ClickHouse testing) will work for NULL as expected - by transforming it to non-constant column. [#13212](https://github.com/ClickHouse/ClickHouse/pull/13212) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix libunwind build in AArch64. This fixes [#13204](https://github.com/ClickHouse/ClickHouse/issues/13204). [#13208](https://github.com/ClickHouse/ClickHouse/pull/13208) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Even more retries in zkutil gtest to prevent test flakiness. [#13165](https://github.com/ClickHouse/ClickHouse/pull/13165) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Small fixes to the RBAC TestFlows. [#13152](https://github.com/ClickHouse/ClickHouse/pull/13152) ([vzakaznikov](https://github.com/vzakaznikov)).
* Fixing `00960_live_view_watch_events_live.py` test. [#13108](https://github.com/ClickHouse/ClickHouse/pull/13108) ([vzakaznikov](https://github.com/vzakaznikov)).
* Improve cache purge in documentation deploy script. [#13107](https://github.com/ClickHouse/ClickHouse/pull/13107) ([alesapin](https://github.com/alesapin)).
* Rewrote some orphan tests to gtest. Removed useless includes from tests. [#13073](https://github.com/ClickHouse/ClickHouse/pull/13073) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Added tests for RBAC functionality of `SELECT` privilege in TestFlows. [#13061](https://github.com/ClickHouse/ClickHouse/pull/13061) ([Ritaank Tiwari](https://github.com/ritaank)).
* Rerun some tests in fast test check. [#12992](https://github.com/ClickHouse/ClickHouse/pull/12992) ([alesapin](https://github.com/alesapin)).
* Fix MSan error in "rdkafka" library. This closes [#12990](https://github.com/ClickHouse/ClickHouse/issues/12990). Updated `rdkafka` to version 1.5 (master). [#12991](https://github.com/ClickHouse/ClickHouse/pull/12991) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix UBSan report in base64 if tests were run on server with AVX-512. This fixes [#12318](https://github.com/ClickHouse/ClickHouse/issues/12318). Author: @qoega. [#12441](https://github.com/ClickHouse/ClickHouse/pull/12441) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix UBSan report in HDFS library. This closes [#12330](https://github.com/ClickHouse/ClickHouse/issues/12330). [#12453](https://github.com/ClickHouse/ClickHouse/pull/12453) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Check the ability to restore a backup made by an old version using a new version. This closes [#8979](https://github.com/ClickHouse/ClickHouse/issues/8979). [#12959](https://github.com/ClickHouse/ClickHouse/pull/12959) ([alesapin](https://github.com/alesapin)).
* Do not build the helper_container image inside integration tests. Build the docker container in CI and use the pre-built helper_container in integration tests. [#12953](https://github.com/ClickHouse/ClickHouse/pull/12953) ([Ilya Yatsishin](https://github.com/qoega)).
* Add a test for `ALTER TABLE CLEAR COLUMN` query for primary key columns. [#12951](https://github.com/ClickHouse/ClickHouse/pull/12951) ([alesapin](https://github.com/alesapin)).
* Increased timeouts in testflows tests. [#12949](https://github.com/ClickHouse/ClickHouse/pull/12949) ([vzakaznikov](https://github.com/vzakaznikov)).
* Fix build of test under Mac OS X. This closes [#12767](https://github.com/ClickHouse/ClickHouse/issues/12767). [#12772](https://github.com/ClickHouse/ClickHouse/pull/12772) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Connector-ODBC updated to mysql-connector-odbc-8.0.21. [#12739](https://github.com/ClickHouse/ClickHouse/pull/12739) ([Ilya Yatsishin](https://github.com/qoega)).
* Adding RBAC syntax tests in TestFlows. [#12642](https://github.com/ClickHouse/ClickHouse/pull/12642) ([vzakaznikov](https://github.com/vzakaznikov)).
* Improve performance of TestKeeper. This will speedup tests with heavy usage of Replicated tables. [#12505](https://github.com/ClickHouse/ClickHouse/pull/12505) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Now we check that server is able to start after stress tests run. This fixes [#12473](https://github.com/ClickHouse/ClickHouse/issues/12473). [#12496](https://github.com/ClickHouse/ClickHouse/pull/12496) ([alesapin](https://github.com/alesapin)).
* Update fmtlib to master (7.0.1). [#12446](https://github.com/ClickHouse/ClickHouse/pull/12446) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add docker image for fast tests. [#12294](https://github.com/ClickHouse/ClickHouse/pull/12294) ([alesapin](https://github.com/alesapin)).
* Rework configuration paths for integration tests. [#12285](https://github.com/ClickHouse/ClickHouse/pull/12285) ([Ilya Yatsishin](https://github.com/qoega)).
* Add compiler option to control that stack frames are not too large. This will help to run the code in fibers with small stack size. [#11524](https://github.com/ClickHouse/ClickHouse/pull/11524) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Update gitignore-files. [#13447](https://github.com/ClickHouse/ClickHouse/pull/13447) ([vladimir-golovchenko](https://github.com/vladimir-golovchenko)).
## ClickHouse release 20.6
### ClickHouse release v20.6.3.28-stable

View File

@ -78,8 +78,9 @@ if (COMPILER_GCC)
-Wno-deprecated-declarations -Wno-class-memaccess)
elseif (COMPILER_CLANG)
set (SUPPRESS_WARNINGS -Wno-non-virtual-dtor -Wno-sign-compare -Wno-strict-aliasing -Wno-deprecated-declarations)
set (CAPNP_PRIVATE_CXX_FLAGS -fno-char8_t)
endif ()
target_compile_options(kj PRIVATE ${SUPPRESS_WARNINGS})
target_compile_options(capnp PRIVATE ${SUPPRESS_WARNINGS})
target_compile_options(capnpc PRIVATE ${SUPPRESS_WARNINGS})
target_compile_options(kj PRIVATE ${SUPPRESS_WARNINGS} ${CAPNP_PRIVATE_CXX_FLAGS})
target_compile_options(capnp PRIVATE ${SUPPRESS_WARNINGS} ${CAPNP_PRIVATE_CXX_FLAGS})
target_compile_options(capnpc PRIVATE ${SUPPRESS_WARNINGS} ${CAPNP_PRIVATE_CXX_FLAGS})

2
debian/rules vendored
View File

@ -18,7 +18,7 @@ ifeq ($(CCACHE_PREFIX),distcc)
THREADS_COUNT=$(shell distcc -j)
endif
ifeq ($(THREADS_COUNT),)
THREADS_COUNT=$(shell nproc || grep -c ^processor /proc/cpuinfo || sysctl -n hw.ncpu || echo 4)
THREADS_COUNT=$(shell echo $$(( $$(nproc || grep -c ^processor /proc/cpuinfo || sysctl -n hw.ncpu || echo 8) / 2 )) )
endif
DEB_BUILD_OPTIONS+=parallel=$(THREADS_COUNT)

View File

@ -6,7 +6,7 @@ RUN apt-get update \
&& apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \
--yes --no-install-recommends --verbose-versions \
&& export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \
&& wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \
&& apt-key add /tmp/llvm-snapshot.gpg.key \
&& export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \

View File

@ -7,7 +7,7 @@ RUN apt-get update \
&& apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \
--yes --no-install-recommends --verbose-versions \
&& export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \
&& wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \
&& apt-key add /tmp/llvm-snapshot.gpg.key \
&& export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \
@ -55,7 +55,6 @@ RUN apt-get update \
cmake \
gdb \
rename \
wget \
build-essential \
--yes --no-install-recommends
@ -83,14 +82,14 @@ RUN git clone https://github.com/tpoechtrager/cctools-port.git \
&& rm -rf cctools-port
# Download toolchain for Darwin
RUN wget https://github.com/phracker/MacOSX-SDKs/releases/download/10.14-beta4/MacOSX10.14.sdk.tar.xz
RUN wget -nv https://github.com/phracker/MacOSX-SDKs/releases/download/10.14-beta4/MacOSX10.14.sdk.tar.xz
# Download toolchain for ARM
# It contains all required headers and libraries. Note that it's named as "gcc" but actually we are using clang for cross compiling.
RUN wget "https://developer.arm.com/-/media/Files/downloads/gnu-a/8.3-2019.03/binrel/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz?revision=2e88a73f-d233-4f96-b1f4-d8b36e9bb0b9&la=en" -O gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz
RUN wget -nv "https://developer.arm.com/-/media/Files/downloads/gnu-a/8.3-2019.03/binrel/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz?revision=2e88a73f-d233-4f96-b1f4-d8b36e9bb0b9&la=en" -O gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz
# Download toolchain for FreeBSD 11.3
RUN wget https://clickhouse-datasets.s3.yandex.net/toolchains/toolchains/freebsd-11.3-toolchain.tar.xz
RUN wget -nv https://clickhouse-datasets.s3.yandex.net/toolchains/toolchains/freebsd-11.3-toolchain.tar.xz
COPY build.sh /
CMD ["/bin/bash", "/build.sh"]

View File

@ -7,7 +7,7 @@ RUN apt-get update \
&& apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \
--yes --no-install-recommends --verbose-versions \
&& export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \
&& wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \
&& apt-key add /tmp/llvm-snapshot.gpg.key \
&& export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \
@ -34,7 +34,7 @@ RUN curl -O https://clickhouse-builds.s3.yandex.net/utils/1/dpkg-deb \
ENV APACHE_PUBKEY_HASH="bba6987b63c63f710fd4ed476121c588bc3812e99659d27a855f8c4d312783ee66ad6adfce238765691b04d62fa3688f"
RUN export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \
&& wget -O /tmp/arrow-keyring.deb "https://apache.bintray.com/arrow/ubuntu/apache-arrow-archive-keyring-latest-${CODENAME}.deb" \
&& wget -nv -O /tmp/arrow-keyring.deb "https://apache.bintray.com/arrow/ubuntu/apache-arrow-archive-keyring-latest-${CODENAME}.deb" \
&& echo "${APACHE_PUBKEY_HASH} /tmp/arrow-keyring.deb" | sha384sum -c \
&& dpkg -i /tmp/arrow-keyring.deb

View File

@ -7,7 +7,7 @@ RUN apt-get update \
&& apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \
--yes --no-install-recommends --verbose-versions \
&& export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \
&& wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \
&& apt-key add /tmp/llvm-snapshot.gpg.key \
&& export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \

View File

@ -15,7 +15,7 @@ RUN apt-get --allow-unauthenticated update -y \
gpg-agent \
git
RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add -
RUN wget -nv -O - https://apt.kitware.com/keys/kitware-archive-latest.asc | sudo apt-key add -
RUN sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
RUN sudo echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-8 main" >> /etc/apt/sources.list

View File

@ -7,7 +7,7 @@ RUN apt-get update \
&& apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \
--yes --no-install-recommends --verbose-versions \
&& export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \
&& wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \
&& echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \
&& apt-key add /tmp/llvm-snapshot.gpg.key \
&& export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \
@ -61,7 +61,6 @@ RUN apt-get update \
software-properties-common \
tzdata \
unixodbc \
wget \
--yes --no-install-recommends
# This symlink required by gcc to find lld compiler
@ -70,7 +69,7 @@ RUN ln -s /usr/bin/lld-${LLVM_VERSION} /usr/bin/ld.lld
ARG odbc_driver_url="https://github.com/ClickHouse/clickhouse-odbc/releases/download/v1.1.4.20200302/clickhouse-odbc-1.1.4-Linux.tar.gz"
RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
&& odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
&& odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \

View File

@ -227,5 +227,4 @@ EOF
;&
esac
exit $task_exit_code
exit $task_exit_code

View File

@ -17,7 +17,6 @@ RUN apt-get update \
odbc-postgresql \
sqlite3 \
curl \
bind9-host \
tar
RUN rm -rf \
/var/lib/apt/lists/* \

View File

@ -46,7 +46,7 @@ RUN set -eux; \
\
# this "case" statement is generated via "update.sh"
\
if ! wget -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
if ! wget -nv -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
echo >&2 "error: failed to download 'docker-${DOCKER_VERSION}' from '${DOCKER_CHANNEL}' for '${x86_64}'"; \
exit 1; \
fi; \

View File

@ -12,8 +12,8 @@ RUN apt-get update --yes \
strace \
--yes --no-install-recommends
#RUN wget -q -O - http://files.viva64.com/etc/pubkey.txt | sudo apt-key add -
#RUN sudo wget -O /etc/apt/sources.list.d/viva64.list http://files.viva64.com/etc/viva64.list
#RUN wget -nv -O - http://files.viva64.com/etc/pubkey.txt | sudo apt-key add -
#RUN sudo wget -nv -O /etc/apt/sources.list.d/viva64.list http://files.viva64.com/etc/viva64.list
#
#RUN apt-get --allow-unauthenticated update -y \
# && env DEBIAN_FRONTEND=noninteractive \
@ -24,10 +24,10 @@ ENV PKG_VERSION="pvs-studio-latest"
RUN set -x \
&& export PUBKEY_HASHSUM="486a0694c7f92e96190bbfac01c3b5ac2cb7823981db510a28f744c99eabbbf17a7bcee53ca42dc6d84d4323c2742761" \
&& wget https://files.viva64.com/etc/pubkey.txt -O /tmp/pubkey.txt \
&& wget -nv https://files.viva64.com/etc/pubkey.txt -O /tmp/pubkey.txt \
&& echo "${PUBKEY_HASHSUM} /tmp/pubkey.txt" | sha384sum -c \
&& apt-key add /tmp/pubkey.txt \
&& wget "https://files.viva64.com/${PKG_VERSION}.deb" \
&& wget -nv "https://files.viva64.com/${PKG_VERSION}.deb" \
&& { debsig-verify ${PKG_VERSION}.deb \
|| echo "WARNING: Some file was just downloaded from the internet without any validation and we are installing it into the system"; } \
&& dpkg -i "${PKG_VERSION}.deb"

View File

@ -26,7 +26,7 @@ RUN apt-get update -y \
zookeeperd
RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
&& odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
&& odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \

View File

@ -71,7 +71,7 @@ RUN apt-get --allow-unauthenticated update -y \
zookeeperd
RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
&& odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
&& odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \

View File

@ -33,7 +33,7 @@ RUN apt-get update -y \
qemu-user-static
RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
&& cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
&& odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
&& odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \

View File

@ -44,7 +44,7 @@ RUN set -eux; \
\
# this "case" statement is generated via "update.sh"
\
if ! wget -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
if ! wget -nv -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
echo >&2 "error: failed to download 'docker-${DOCKER_VERSION}' from '${DOCKER_CHANNEL}' for '${x86_64}'"; \
exit 1; \
fi; \

View File

@ -7,7 +7,7 @@ toc_title: RabbitMQ
This engine allows integrating ClickHouse with [RabbitMQ](https://www.rabbitmq.com).
RabbitMQ lets you:
`RabbitMQ` lets you:
- Publish or subscribe to data flows.
- Process streams as they become available.
@ -44,8 +44,8 @@ Optional parameters:
- `rabbitmq_routing_key_list` A comma-separated list of routing keys.
- `rabbitmq_row_delimiter` Delimiter character, which ends the message.
- `rabbitmq_num_consumers` The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient.
- `rabbitmq_num_queues` The number of queues per consumer. Default: `1`. Specify more queues if the capacity of one queue per consumer is insufficient. Single queue can contain up to 50K messages at the same time.
- `rabbitmq_transactional_channel` Wrap insert queries in transactions. Default: `0`.
- `rabbitmq_num_queues` The number of queues per consumer. Default: `1`. Specify more queues if the capacity of one queue per consumer is insufficient. A single queue can contain up to 50K messages at the same time.
- `rabbitmq_transactional_channel` Wrap `INSERT` queries in transactions. Default: `0`.
Required configuration:
@ -72,7 +72,7 @@ Example:
## Description {#description}
`SELECT` is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using materialized views. To do this:
`SELECT` is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using [materialized views](../../../sql-reference/statements/create/view.md). To do this (a minimal sketch follows the steps below):
1. Use the engine to create a RabbitMQ consumer and consider it a data stream.
2. Create a table with the desired structure.
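A minimal sketch of that pattern, assuming a local broker; the names `queue`, `daily`, `consumer` and `exchange1` are placeholders, not part of this page:

```sql
CREATE TABLE queue (key UInt64, value String)
    ENGINE = RabbitMQ
    SETTINGS rabbitmq_host_port = 'localhost:5672',
             rabbitmq_exchange_name = 'exchange1',
             rabbitmq_format = 'JSONEachRow';

CREATE TABLE daily (key UInt64, value String)
    ENGINE = MergeTree() ORDER BY key;

CREATE MATERIALIZED VIEW consumer TO daily
    AS SELECT key, value FROM queue;
```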
@ -86,13 +86,13 @@ There can be no more than one exchange per table. One exchange can be shared bet
Exchange type options:
- `direct` - Routing is based on exact matching of keys. Example table key list: `key1,key2,key3,key4,key5`, message key can eqaul any of them.
- `direct` - Routing is based on the exact matching of keys. Example table key list: `key1,key2,key3,key4,key5`, message key can equal any of them.
- `fanout` - Routing to all tables (where exchange name is the same) regardless of the keys.
- `topic` - Routing is based on patterns with dot-separated keys. Examples: `*.logs`, `records.*.*.2020`, `*.2018,*.2019,*.2020`.
- `headers` - Routing is based on `key=value` matches with a setting `x-match=all` or `x-match=any`. Example table key list: `x-match=all,format=logs,type=report,year=2020`.
- `consistent-hash` - Data is evenly distributed between all bound tables (where exchange name is the same). Note that this exchange type must be enabled with RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
- `consistent-hash` - Data is evenly distributed between all bound tables (where the exchange name is the same). Note that this exchange type must be enabled with RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
If exchange type is not specified, then default is `fanout` and routing keys for data publishing must be randomized in range `[1, num_consumers]` for every message/batch (or in range `[1, num_consumers * num_queues]` if `rabbitmq_num_queues` is set). This table configuration works quicker then any other, especially when `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are set.
If exchange type is not specified, then default is `fanout` and routing keys for data publishing must be randomized in range `[1, num_consumers]` for every message/batch (or in range `[1, num_consumers * num_queues]` if `rabbitmq_num_queues` is set). This table configuration works quicker than any other, especially when `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are set.
If `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are specified along with `rabbitmq_exchange_type`, then:

View File

@ -36,7 +36,7 @@ Examples:
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -38,7 +38,7 @@ Ejemplos:
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -38,7 +38,7 @@ Ok.
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -38,7 +38,7 @@ Exemple:
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -38,7 +38,7 @@ GETメソッドを使用する場合, readonly 設定されています。
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -0,0 +1,122 @@
---
toc_priority: 6
toc_title: RabbitMQ
---
# RabbitMQ {#rabbitmq-engine}
The engine works with [RabbitMQ](https://www.rabbitmq.com).
`RabbitMQ` lets you:
- Publish or subscribe to data flows.
- Process streams as they become available.
## Creating a Table {#table_engine-rabbitmq-creating-a-table}
``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
...
) ENGINE = RabbitMQ SETTINGS
rabbitmq_host_port = 'host:port',
rabbitmq_exchange_name = 'exchange_name',
rabbitmq_format = 'data_format'[,]
[rabbitmq_exchange_type = 'exchange_type',]
[rabbitmq_routing_key_list = 'key1,key2,...',]
[rabbitmq_row_delimiter = 'delimiter_symbol',]
[rabbitmq_num_consumers = N,]
[rabbitmq_num_queues = N,]
[rabbitmq_transactional_channel = 0]
```
Required parameters:
- `rabbitmq_host_port` The server address (`host:port`). For example: `localhost:5672`.
- `rabbitmq_exchange_name` The RabbitMQ exchange name.
- `rabbitmq_format` The message format. Uses the same notation as the SQL `FORMAT` function, for example, `JSONEachRow`. For more information, see the [Formats](../../../interfaces/formats.md) section.
Optional parameters:
- `rabbitmq_exchange_type` The RabbitMQ exchange type: `direct`, `fanout`, `topic`, `headers`, `consistent-hash`. Default: `fanout`.
- `rabbitmq_routing_key_list` A comma-separated list of routing keys.
- `rabbitmq_row_delimiter` Delimiter character, which ends the message.
- `rabbitmq_num_consumers` The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient.
- `rabbitmq_num_queues` The number of queues per consumer. Default: `1`. Specify more queues if the capacity of one queue per consumer is insufficient. A single queue can hold up to 50K messages at the same time.
- `rabbitmq_transactional_channel` Wrap `INSERT` queries in transactions. Default: `0`.
Required configuration:
The RabbitMQ server configuration is added via the ClickHouse configuration file.
``` xml
<rabbitmq>
<username>root</username>
<password>clickhouse</password>
</rabbitmq>
```
Example:
``` sql
CREATE TABLE queue (
key UInt64,
value UInt64
) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
rabbitmq_exchange_name = 'exchange1',
rabbitmq_format = 'JSONEachRow',
rabbitmq_num_consumers = 5;
```
## Description {#description}
A `SELECT` query is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using [materialized views](../../../sql-reference/statements/create/view.md). To do this:
1. Use the engine to create a RabbitMQ consumer and consider it a data stream.
2. Create a table with the desired structure.
3. Create a materialized view that converts data from the engine and puts it into the previously created table.
When a materialized view is attached to the engine, it starts collecting data in the background. This lets you continually receive messages from RabbitMQ and convert them to the required format using `SELECT`.
One RabbitMQ table can have as many materialized views as you like.
Data is channeled based on the `rabbitmq_exchange_type` and `rabbitmq_routing_key_list` parameters.
There can be no more than one exchange per table. One exchange can be shared between multiple tables: it enables routing into several tables at the same time.
Exchange type options:
- `direct` - Routing is based on the exact matching of keys. Example table key list: `key1,key2,key3,key4,key5`; the message key can equal any of them.
- `fanout` - Routing to all tables (where the exchange name is the same) regardless of the keys.
- `topic` - Routing is based on patterns with dot-separated keys. Examples: `*.logs`, `records.*.*.2020`, `*.2018,*.2019,*.2020`.
- `headers` - Routing is based on `key=value` matches with the `x-match=all` or `x-match=any` setting. Example table key list: `x-match=all,format=logs,type=report,year=2020`.
- `consistent-hash` - Data is evenly distributed between all bound tables (where the exchange name is the same). Note that this exchange type must be enabled with the RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
If the exchange type is not specified, the default is `fanout`, and the routing keys for data publishing must be randomized in the range `[1, num_consumers]` for every message/batch (or in the range `[1, num_consumers * num_queues]` if `rabbitmq_num_queues` is set). This table configuration works quicker than any other, especially when the `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are set.
If the `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are specified along with `rabbitmq_exchange_type`, then:
- the `rabbitmq-consistent-hash-exchange` plugin must be enabled.
- the `message_id` property must be defined (unique for each message/batch).
Example:
``` sql
CREATE TABLE queue (
key UInt64,
value UInt64
) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
rabbitmq_exchange_name = 'exchange1',
rabbitmq_exchange_type = 'headers',
rabbitmq_routing_key_list = 'format=logs,type=report,year=2020',
rabbitmq_format = 'JSONEachRow',
rabbitmq_num_consumers = 5;
CREATE TABLE daily (key UInt64, value UInt64)
ENGINE = MergeTree();
CREATE MATERIALIZED VIEW consumer TO daily
AS SELECT key, value FROM queue;
SELECT key, value FROM daily ORDER BY key;
```

View File

@ -31,7 +31,7 @@ Ok.
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -401,13 +401,91 @@ INSERT INTO test VALUES (lower('Hello')), (lower('world')), (lower('INSERT')), (
Sets the type of [JOIN](../../sql-reference/statements/select/join.md) behavior. When merging tables, empty cells may appear. ClickHouse fills them differently depending on this setting.
Possible values
Possible values:
- 0 — The empty cells are filled with the default value of the corresponding field type.
- 1 — `JOIN` behaves the same way as in standard SQL. The type of the corresponding field is converted to [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable), and empty cells are filled with [NULL](../../sql-reference/syntax.md).
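As a minimal sketch (the setting's own heading sits above this hunk; the name `join_use_nulls` is assumed here), toggling the behavior for a session could look like this:

``` sql
-- Assumption: the setting described above is join_use_nulls.
SET join_use_nulls = 1;

SELECT a.x, b.y
FROM a
LEFT JOIN b ON a.x = b.x;   -- unmatched rows now carry NULL in b.y instead of a default value
```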
## partial_merge_join_optimizations {#partial_merge_join_optimizations}
Disables all optimizations for [JOIN](../../sql-reference/statements/select/join.md) queries that use the partial merge join algorithm.
The optimizations are enabled by default, which may lead to incorrect results. If you see suspicious results in your queries, disable the optimizations with this setting. The optimizations may differ between ClickHouse server versions.
Possible values:
- 0 — Optimizations are disabled.
- 1 — Optimizations are enabled.
Default value: 1.
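For example, the optimizations could be switched off for the current session only (a minimal sketch):

``` sql
-- Disable partial merge join optimizations for this session.
SET partial_merge_join_optimizations = 0;
```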
## partial_merge_join_rows_in_right_blocks {#partial_merge_join_rows_in_right_blocks}
Limits the size of right-hand join data blocks for [JOIN](../../sql-reference/statements/select/join.md) queries with the partial merge join algorithm.
The ClickHouse server:
1. Splits the right-hand join data into blocks with the specified number of rows.
2. Indexes the minimum and maximum value of each block.
3. Unloads the prepared blocks to disk if possible.
Possible values:
- Any positive integer. The recommended range of values is [1000, 100000].
Default value: 65536.
## join_on_disk_max_files_to_merge {#join_on_disk_max_files_to_merge}
Limits the number of files allowed for parallel sorting when MergeJoin operations are executed on disk.
The greater the value of the setting, the more RAM is used and the less disk I/O is needed.
Possible values:
- Any positive integer greater than 2.
Default value: 64.
## temporary_files_codec {#temporary_files_codec}
Sets the compression method for temporary files on disk used during sorting and joining.
Possible values:
- LZ4 — Apply compression using the [LZ4](https://ru.wikipedia.org/wiki/LZ4) algorithm.
- NONE — Do not apply compression.
Default value: LZ4.
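A hedged sketch of how the three settings above might be combined for an on-disk partial merge join; selecting the algorithm through `join_algorithm = 'partial_merge'` and the table names are assumptions of this example:

``` sql
SET join_algorithm = 'partial_merge';
SET partial_merge_join_rows_in_right_blocks = 10000;
SET join_on_disk_max_files_to_merge = 64;
SET temporary_files_codec = 'LZ4';

-- Hypothetical tables; the right side is split into blocks and may spill to disk.
SELECT l.id, r.value
FROM left_table AS l
JOIN right_table AS r ON l.id = r.id;
```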
## any_join_distinct_right_table_keys {#any_join_distinct_right_table_keys}
Enables legacy ClickHouse server behavior for `ANY INNER|LEFT JOIN` operations.
!!! note "Note"
    Use this setting only for backward compatibility if your use cases depend on the legacy `JOIN` behavior.
When the legacy behavior is enabled:
- Results of `t1 ANY LEFT JOIN t2` and `t2 ANY RIGHT JOIN t1` operations are not equal, because ClickHouse uses the logic with many-to-one left-to-right table key mapping.
- Results of `ANY INNER JOIN` operations contain all rows from the left table, like `SEMI LEFT JOIN` operations do.
When the legacy behavior is disabled:
- Results of `t1 ANY LEFT JOIN t2` and `t2 ANY RIGHT JOIN t1` operations are equal, because ClickHouse uses the logic which provides one-to-many key mapping in `ANY RIGHT JOIN` operations.
- Results of `ANY INNER JOIN` operations contain one row per key from both the left and right tables.
Possible values:
- 0 — Legacy behavior is disabled.
- 1 — Legacy behavior is enabled.
Default value: 0.
See also:
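A minimal sketch of the difference (table and column names are hypothetical):

``` sql
-- Default behavior: ANY INNER JOIN keeps one row per key from both sides.
SET any_join_distinct_right_table_keys = 0;
SELECT t1.k, t2.v FROM t1 ANY INNER JOIN t2 ON t1.k = t2.k;

-- Legacy behavior: the result keeps all rows from the left table.
SET any_join_distinct_right_table_keys = 1;
SELECT t1.k, t2.v FROM t1 ANY INNER JOIN t2 ON t1.k = t2.k;
```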
- [JOIN strictness](../../sql-reference/statements/select/join.md#select-join-strictness)
## max\_block\_size {#setting-max_block_size}
In ClickHouse, data is processed by blocks (sets of column parts). The internal processing cycles for a single block are efficient enough, but there are noticeable costs per block. The `max_block_size` setting is a recommendation for what block size (in number of rows) to load from tables. The block size should not be too small, or the per-block costs become noticeable, but also not too large, so that a query with a LIMIT that completes after the first block is still processed quickly. The goal is to avoid using too much memory when extracting a large number of columns in multiple threads and to preserve at least some cache locality.
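The recommendation can also be overridden per query; a sketch with a hypothetical table name:

``` sql
-- Read the table in smaller blocks for this query only.
SELECT count()
FROM hits
SETTINGS max_block_size = 8192;
```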

View File

@ -24,7 +24,7 @@
The utility should be run manually as follows:
``` bash
$ clickhouse-copier copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir
$ clickhouse-copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir
```
Launch parameters:

View File

@ -36,7 +36,9 @@ FROM <left_table>
!!! note "Note"
    The default strictness value can be overridden using the [join\_default\_strictness](../../../operations/settings/settings.md#settings-join_default_strictness) setting.
The behavior of the ClickHouse server for `ANY JOIN` operations depends on the [any_join_distinct_right_table_keys](../../../operations/settings/settings.md#any_join_distinct_right_table_keys) setting.
### ASOF JOIN Usage {#asof-join-usage}
`ASOF JOIN` is useful when you need to join records that have no exact match.
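A hedged sketch of a typical use, with hypothetical `trades` and `quotes` tables: each trade is matched with the latest quote at or before its timestamp.

``` sql
SELECT t.symbol, t.trade_time, t.price, q.bid, q.ask
FROM trades AS t
ASOF LEFT JOIN quotes AS q
    ON t.symbol = q.symbol AND t.trade_time >= q.quote_time;
```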

View File

@ -38,7 +38,7 @@ GET yöntemini kullanırken, readonly ayar .lanmıştır. Başka bir deyi
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123

View File

@ -23,7 +23,7 @@ Ok.
$ curl 'http://localhost:8123/?query=SELECT%201'
1
$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1
$ GET 'http://localhost:8123/?query=SELECT 1'

View File

@ -13,6 +13,7 @@ namespace DB
namespace ErrorCodes
{
extern const int UNKNOWN_EXCEPTION;
extern const int LOGICAL_ERROR;
}
namespace MySQLReplication
@ -103,19 +104,19 @@ namespace MySQLReplication
= header.event_size - EVENT_HEADER_LENGTH - 4 - 4 - 1 - 2 - 2 - status_len - schema_len - 1 - CHECKSUM_CRC32_SIGNATURE_LENGTH;
query.resize(len);
payload.readStrict(reinterpret_cast<char *>(query.data()), len);
if (query.rfind("BEGIN", 0) == 0 || query.rfind("COMMIT") == 0)
if (query.starts_with("BEGIN") || query.starts_with("COMMIT"))
{
typ = QUERY_EVENT_MULTI_TXN_FLAG;
}
else if (query.rfind("XA", 0) == 0)
else if (query.starts_with("XA"))
{
if (query.rfind("XA ROLLBACK", 0) == 0)
throw ReplicationError("ParseQueryEvent: Unsupported query event:" + query, ErrorCodes::UNKNOWN_EXCEPTION);
if (query.starts_with("XA ROLLBACK"))
throw ReplicationError("ParseQueryEvent: Unsupported query event:" + query, ErrorCodes::LOGICAL_ERROR);
typ = QUERY_EVENT_XA;
}
else if (query.rfind("SAVEPOINT", 0) == 0)
else if (query.starts_with("SAVEPOINT"))
{
throw ReplicationError("ParseQueryEvent: Unsupported query event:" + query, ErrorCodes::UNKNOWN_EXCEPTION);
throw ReplicationError("ParseQueryEvent: Unsupported query event:" + query, ErrorCodes::LOGICAL_ERROR);
}
}

View File

@ -13,6 +13,7 @@ namespace DB
namespace ErrorCodes
{
extern const int UNSUPPORTED_METHOD;
extern const int LOGICAL_ERROR;
}
@ -239,12 +240,15 @@ std::string ExternalQueryBuilder::composeLoadIdsQuery(const std::vector<UInt64>
}
std::string
ExternalQueryBuilder::composeLoadKeysQuery(const Columns & key_columns, const std::vector<size_t> & requested_rows, LoadKeysMethod method, size_t partition_key_prefix)
std::string ExternalQueryBuilder::composeLoadKeysQuery(
const Columns & key_columns, const std::vector<size_t> & requested_rows, LoadKeysMethod method, size_t partition_key_prefix)
{
if (!dict_struct.key)
throw Exception{"Composite key required for method", ErrorCodes::UNSUPPORTED_METHOD};
if (key_columns.size() != dict_struct.key->size())
throw Exception{"The size of key_columns does not equal to the size of dictionary key", ErrorCodes::LOGICAL_ERROR};
WriteBufferFromOwnString out;
writeString("SELECT ", out);

View File

@ -1120,6 +1120,8 @@ void SSDComplexKeyCacheStorage::update(
AbsentIdHandler && on_key_not_found,
const DictionaryLifetime lifetime)
{
assert(key_columns.size() == key_types.size());
auto append_block = [&key_types, this](
const Columns & new_keys,
const SSDComplexKeyCachePartition::Attributes & new_attributes,
@ -1447,6 +1449,12 @@ void SSDComplexKeyCacheDictionary::getItemsNumberImpl(
const Columns & key_columns, const DataTypes & key_types,
ResultArrayType<OutputType> & out, DefaultGetter && get_default) const
{
assert(dict_struct.key);
assert(key_columns.size() == key_types.size());
assert(key_columns.size() == dict_struct.key->size());
dict_struct.validateKeyTypes(key_types);
const auto now = std::chrono::system_clock::now();
TemporalComplexKeysPool not_found_pool;
@ -1527,6 +1535,8 @@ void SSDComplexKeyCacheDictionary::getItemsStringImpl(
ColumnString * out,
DefaultGetter && get_default) const
{
dict_struct.validateKeyTypes(key_types);
const auto now = std::chrono::system_clock::now();
TemporalComplexKeysPool not_found_pool;

View File

@ -427,9 +427,8 @@ private:
using SSDComplexKeyCachePartitionPtr = std::shared_ptr<SSDComplexKeyCachePartition>;
/*
Class for managing SSDCachePartition and getting data from source.
*/
/** Class for managing SSDCachePartition and getting data from source.
*/
class SSDComplexKeyCacheStorage
{
public:
@ -515,9 +514,8 @@ private:
};
/*
Dictionary interface
*/
/** Dictionary interface
*/
class SSDComplexKeyCacheDictionary final : public IDictionaryBase
{
public:

View File

@ -1,6 +1,6 @@
#!/usr/bin/env bash
[ ! -f public_suffix_list.dat ] && wget -O public_suffix_list.dat https://publicsuffix.org/list/public_suffix_list.dat
[ ! -f public_suffix_list.dat ] && wget -nv -O public_suffix_list.dat https://publicsuffix.org/list/public_suffix_list.dat
echo '%language=C++
%define lookup-function-name is_valid

View File

@ -13,6 +13,7 @@
#include <Processors/Transforms/ConvertingTransform.h>
#include <Processors/Sources/RemoteSource.h>
#include <Processors/Sources/DelayedSource.h>
#include <Processors/QueryPlan/QueryPlan.h>
namespace ProfileEvents
{
@ -68,14 +69,19 @@ SelectStreamFactory::SelectStreamFactory(
namespace
{
QueryPipeline createLocalStream(
auto createLocalPipe(
const ASTPtr & query_ast, const Block & header, const Context & context, QueryProcessingStage::Enum processed_stage)
{
checkStackSize();
InterpreterSelectQuery interpreter{query_ast, context, SelectQueryOptions(processed_stage)};
InterpreterSelectQuery interpreter(query_ast, context, SelectQueryOptions(processed_stage));
auto query_plan = std::make_unique<QueryPlan>();
auto pipeline = interpreter.execute().pipeline;
interpreter.buildQueryPlan(*query_plan);
auto pipeline = std::move(*query_plan->buildQueryPipeline());
/// Avoid it going out-of-scope for EXPLAIN
pipeline.addQueryPlan(std::move(query_plan));
pipeline.addSimpleTransform([&](const Block & source_header)
{
@ -94,7 +100,7 @@ QueryPipeline createLocalStream(
/// return std::make_shared<MaterializingBlockInputStream>(stream);
pipeline.setMaxThreads(1);
return pipeline;
return QueryPipeline::getPipe(std::move(pipeline));
}
String formattedAST(const ASTPtr & ast)
@ -130,7 +136,7 @@ void SelectStreamFactory::createForShard(
auto emplace_local_stream = [&]()
{
pipes.emplace_back(QueryPipeline::getPipe(createLocalStream(modified_query_ast, header, context, processed_stage)));
pipes.emplace_back(createLocalPipe(modified_query_ast, header, context, processed_stage));
};
String modified_query = formattedAST(modified_query_ast);
@ -270,7 +276,7 @@ void SelectStreamFactory::createForShard(
}
if (try_results.empty() || local_delay < max_remote_delay)
return QueryPipeline::getPipe(createLocalStream(modified_query_ast, header, context, stage));
return createLocalPipe(modified_query_ast, header, context, stage);
else
{
std::vector<IConnectionPool::Entry> connections;

View File

@ -269,7 +269,9 @@ BlockInputStreamPtr InterpreterExplainQuery::executeImpl()
if (settings.graph)
{
auto processors = Pipe::detachProcessors(QueryPipeline::getPipe(std::move(*pipeline)));
/// Pipe holds QueryPlan, should not go out-of-scope
auto pipe = QueryPipeline::getPipe(std::move(*pipeline));
const auto & processors = pipe.getProcessors();
if (settings.compact)
printPipelineCompact(processors, buffer, settings.query_pipeline_options.header);

View File

@ -18,6 +18,7 @@
#include <DataTypes/DataTypeNullable.h>
#include <Parsers/MySQL/ASTDeclareIndex.h>
#include <Common/quoteString.h>
#include <Common/assert_cast.h>
#include <Interpreters/Context.h>
#include <Interpreters/InterpreterCreateQuery.h>
#include <Storages/IStorage.h>
@ -124,8 +125,37 @@ static NamesAndTypesList getNames(const ASTFunction & expr, const Context & cont
return expression->getRequiredColumnsWithTypes();
}
static NamesAndTypesList modifyPrimaryKeysToNonNullable(const NamesAndTypesList & primary_keys, NamesAndTypesList & columns)
{
/// https://dev.mysql.com/doc/refman/5.7/en/create-table.html#create-table-indexes-keys
/// PRIMARY KEY:
/// A unique index where all key columns must be defined as NOT NULL.
/// If they are not explicitly declared as NOT NULL, MySQL declares them so implicitly (and silently).
/// A table can have only one PRIMARY KEY. The name of a PRIMARY KEY is always PRIMARY,
/// which thus cannot be used as the name for any other kind of index.
NamesAndTypesList non_nullable_primary_keys;
for (const auto & primary_key : primary_keys)
{
if (!primary_key.type->isNullable())
non_nullable_primary_keys.emplace_back(primary_key);
else
{
non_nullable_primary_keys.emplace_back(
NameAndTypePair(primary_key.name, assert_cast<const DataTypeNullable *>(primary_key.type.get())->getNestedType()));
for (auto & column : columns)
{
if (column.name == primary_key.name)
column.type = assert_cast<const DataTypeNullable *>(column.type.get())->getNestedType();
}
}
}
return non_nullable_primary_keys;
}
static inline std::tuple<NamesAndTypesList, NamesAndTypesList, NamesAndTypesList, NameSet> getKeys(
ASTExpressionList * columns_define, ASTExpressionList * indices_define, const Context & context, const NamesAndTypesList & columns)
ASTExpressionList * columns_define, ASTExpressionList * indices_define, const Context & context, NamesAndTypesList & columns)
{
NameSet increment_columns;
auto keys = makeASTFunction("tuple");
@ -171,8 +201,9 @@ static inline std::tuple<NamesAndTypesList, NamesAndTypesList, NamesAndTypesList
}
}
return std::make_tuple(
getNames(*primary_keys, context, columns), getNames(*unique_keys, context, columns), getNames(*keys, context, columns), increment_columns);
const auto & primary_keys_names_and_types = getNames(*primary_keys, context, columns);
const auto & non_nullable_primary_keys_names_and_types = modifyPrimaryKeysToNonNullable(primary_keys_names_and_types, columns);
return std::make_tuple(non_nullable_primary_keys_names_and_types, getNames(*unique_keys, context, columns), getNames(*keys, context, columns), increment_columns);
}
static String getUniqueColumnName(NamesAndTypesList columns_name_and_type, const String & prefix)
@ -201,14 +232,13 @@ static String getUniqueColumnName(NamesAndTypesList columns_name_and_type, const
static ASTPtr getPartitionPolicy(const NamesAndTypesList & primary_keys)
{
const auto & numbers_partition = [&](const String & column_name, bool is_nullable, size_t type_max_size)
const auto & numbers_partition = [&](const String & column_name, size_t type_max_size) -> ASTPtr
{
ASTPtr column = std::make_shared<ASTIdentifier>(column_name);
if (type_max_size <= 1000)
return std::make_shared<ASTIdentifier>(column_name);
if (is_nullable)
column = makeASTFunction("assumeNotNull", column);
return makeASTFunction("intDiv", column, std::make_shared<ASTLiteral>(UInt64(type_max_size / 1000)));
return makeASTFunction("intDiv", std::make_shared<ASTIdentifier>(column_name),
std::make_shared<ASTLiteral>(UInt64(type_max_size / 1000)));
};
ASTPtr best_partition;
@ -219,16 +249,12 @@ static ASTPtr getPartitionPolicy(const NamesAndTypesList & primary_keys)
WhichDataType which(type);
if (which.isNullable())
{
type = (static_cast<const DataTypeNullable &>(*type)).getNestedType();
which = WhichDataType(type);
}
throw Exception("LOGICAL ERROR: MySQL primary key must be not null, it is a bug.", ErrorCodes::LOGICAL_ERROR);
if (which.isDateOrDateTime())
{
/// In any case, date or datetime is always the best partitioning key
ASTPtr res = std::make_shared<ASTIdentifier>(primary_key.name);
return makeASTFunction("toYYYYMM", primary_key.type->isNullable() ? makeASTFunction("assumeNotNull", res) : res);
return makeASTFunction("toYYYYMM", std::make_shared<ASTIdentifier>(primary_key.name));
}
if (type->haveMaximumSizeOfValue() && (!best_size || type->getSizeOfValueInMemory() < best_size))
@ -236,25 +262,22 @@ static ASTPtr getPartitionPolicy(const NamesAndTypesList & primary_keys)
if (which.isInt8() || which.isUInt8())
{
best_size = type->getSizeOfValueInMemory();
best_partition = std::make_shared<ASTIdentifier>(primary_key.name);
if (primary_key.type->isNullable())
best_partition = makeASTFunction("assumeNotNull", best_partition);
best_partition = numbers_partition(primary_key.name, std::numeric_limits<UInt8>::max());
}
else if (which.isInt16() || which.isUInt16())
{
best_size = type->getSizeOfValueInMemory();
best_partition = numbers_partition(primary_key.name, primary_key.type->isNullable(), std::numeric_limits<UInt16>::max());
best_partition = numbers_partition(primary_key.name, std::numeric_limits<UInt16>::max());
}
else if (which.isInt32() || which.isUInt32())
{
best_size = type->getSizeOfValueInMemory();
best_partition = numbers_partition(primary_key.name, primary_key.type->isNullable(), std::numeric_limits<UInt32>::max());
best_partition = numbers_partition(primary_key.name, std::numeric_limits<UInt32>::max());
}
else if (which.isInt64() || which.isUInt64())
{
best_size = type->getSizeOfValueInMemory();
best_partition = numbers_partition(primary_key.name, primary_key.type->isNullable(), std::numeric_limits<UInt64>::max());
best_partition = numbers_partition(primary_key.name, std::numeric_limits<UInt64>::max());
}
}
}
@ -266,12 +289,12 @@ static ASTPtr getOrderByPolicy(
const NamesAndTypesList & primary_keys, const NamesAndTypesList & unique_keys, const NamesAndTypesList & keys, const NameSet & increment_columns)
{
NameSet order_by_columns_set;
std::deque<std::vector<String>> order_by_columns_list;
std::deque<NamesAndTypesList> order_by_columns_list;
const auto & add_order_by_expression = [&](const NamesAndTypesList & names_and_types)
{
std::vector<String> increment_keys;
std::vector<String> non_increment_keys;
NamesAndTypesList increment_keys;
NamesAndTypesList non_increment_keys;
for (const auto & [name, type] : names_and_types)
{
@ -280,13 +303,13 @@ static ASTPtr getOrderByPolicy(
if (increment_columns.count(name))
{
increment_keys.emplace_back(name);
order_by_columns_set.emplace(name);
increment_keys.emplace_back(NameAndTypePair(name, type));
}
else
{
order_by_columns_set.emplace(name);
non_increment_keys.emplace_back(name);
non_increment_keys.emplace_back(NameAndTypePair(name, type));
}
}
@ -305,8 +328,13 @@ static ASTPtr getOrderByPolicy(
for (const auto & order_by_columns : order_by_columns_list)
{
for (const auto & order_by_column : order_by_columns)
order_by_expression->arguments->children.emplace_back(std::make_shared<ASTIdentifier>(order_by_column));
for (const auto & [name, type] : order_by_columns)
{
order_by_expression->arguments->children.emplace_back(std::make_shared<ASTIdentifier>(name));
if (type->isNullable())
order_by_expression->arguments->children.back() = makeASTFunction("assumeNotNull", order_by_expression->arguments->children.back());
}
}
return order_by_expression;

View File

@ -103,21 +103,12 @@ TEST(MySQLCreateRewritten, PartitionPolicy)
{"TIMESTAMP", "DateTime", " PARTITION BY toYYYYMM(key)"}, {"BOOLEAN", "Int8", " PARTITION BY key"}
};
const auto & replace_string = [](const String & str, const String & old_str, const String & new_str)
{
String new_string = str;
size_t pos = new_string.find(old_str);
if (pos != std::string::npos)
new_string = new_string.replace(pos, old_str.size(), new_str);
return new_string;
};
for (const auto & [test_type, mapped_type, partition_policy] : test_types)
{
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` " + test_type + " PRIMARY KEY)", context_holder.context)),
"CREATE TABLE test_database.test_table_1 (`key` Nullable(" + mapped_type + "), `_sign` Int8() MATERIALIZED 1, "
"`_version` UInt64() MATERIALIZED 1) ENGINE = ReplacingMergeTree(_version)" + replace_string(partition_policy, "key", "assumeNotNull(key)") + " ORDER BY tuple(key)");
"CREATE TABLE test_database.test_table_1 (`key` " + mapped_type + ", `_sign` Int8() MATERIALIZED 1, "
"`_version` UInt64() MATERIALIZED 1) ENGINE = ReplacingMergeTree(_version)" + partition_policy + " ORDER BY tuple(key)");
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` " + test_type + " NOT NULL PRIMARY KEY)", context_holder.context)),
@ -126,6 +117,45 @@ TEST(MySQLCreateRewritten, PartitionPolicy)
}
}
TEST(MySQLCreateRewritten, OrderbyPolicy)
{
tryRegisterFunctions();
const auto & context_holder = getContext();
std::vector<std::tuple<String, String, String>> test_types
{
{"TINYINT", "Int8", " PARTITION BY key"}, {"SMALLINT", "Int16", " PARTITION BY intDiv(key, 65)"},
{"MEDIUMINT", "Int32", " PARTITION BY intDiv(key, 4294967)"}, {"INT", "Int32", " PARTITION BY intDiv(key, 4294967)"},
{"INTEGER", "Int32", " PARTITION BY intDiv(key, 4294967)"}, {"BIGINT", "Int64", " PARTITION BY intDiv(key, 18446744073709551)"},
{"FLOAT", "Float32", ""}, {"DOUBLE", "Float64", ""}, {"VARCHAR(10)", "String", ""}, {"CHAR(10)", "String", ""},
{"Date", "Date", " PARTITION BY toYYYYMM(key)"}, {"DateTime", "DateTime", " PARTITION BY toYYYYMM(key)"},
{"TIMESTAMP", "DateTime", " PARTITION BY toYYYYMM(key)"}, {"BOOLEAN", "Int8", " PARTITION BY key"}
};
for (const auto & [test_type, mapped_type, partition_policy] : test_types)
{
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` " + test_type + " PRIMARY KEY, `key2` " + test_type + " UNIQUE KEY)", context_holder.context)),
"CREATE TABLE test_database.test_table_1 (`key` " + mapped_type + ", `key2` Nullable(" + mapped_type + "), `_sign` Int8() MATERIALIZED 1, "
"`_version` UInt64() MATERIALIZED 1) ENGINE = ReplacingMergeTree(_version)" + partition_policy + " ORDER BY (key, assumeNotNull(key2))");
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` " + test_type + " NOT NULL PRIMARY KEY, `key2` " + test_type + " NOT NULL UNIQUE KEY)", context_holder.context)),
"CREATE TABLE test_database.test_table_1 (`key` " + mapped_type + ", `key2` " + mapped_type + ", `_sign` Int8() MATERIALIZED 1, "
"`_version` UInt64() MATERIALIZED 1) ENGINE = ReplacingMergeTree(_version)" + partition_policy + " ORDER BY (key, key2)");
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` " + test_type + " KEY UNIQUE KEY)", context_holder.context)),
"CREATE TABLE test_database.test_table_1 (`key` " + mapped_type + ", `_sign` Int8() MATERIALIZED 1, "
"`_version` UInt64() MATERIALIZED 1) ENGINE = ReplacingMergeTree(_version)" + partition_policy + " ORDER BY tuple(key)");
EXPECT_EQ(queryToString(tryRewrittenCreateQuery(
"CREATE TABLE `test_database`.`test_table_1` (`key` " + test_type + ", `key2` " + test_type + " UNIQUE KEY, PRIMARY KEY(`key`, `key2`))", context_holder.context)),
"CREATE TABLE test_database.test_table_1 (`key` " + mapped_type + ", `key2` " + mapped_type + ", `_sign` Int8() MATERIALIZED 1, "
"`_version` UInt64() MATERIALIZED 1) ENGINE = ReplacingMergeTree(_version)" + partition_policy + " ORDER BY (key, key2)");
}
}
TEST(MySQLCreateRewritten, RewrittenQueryWithPrimaryKey)
{
tryRegisterFunctions();

View File

@ -20,6 +20,7 @@ namespace ErrorCodes
extern const int TOO_DEEP_AST;
extern const int CYCLIC_ALIASES;
extern const int UNKNOWN_QUERY_PARAMETER;
extern const int BAD_ARGUMENTS;
}
@ -151,6 +152,13 @@ void QueryNormalizer::visitChildren(const ASTPtr & node, Data & data)
{
if (const auto * func_node = node->as<ASTFunction>())
{
if (func_node->query)
{
if (func_node->name != "view")
throw Exception("Query argument can only be used in the `view` TableFunction", ErrorCodes::BAD_ARGUMENTS);
/// Don't go into query argument.
return;
}
/// We skip the first argument. We also assume that the lambda function can not have parameters.
size_t first_pos = 0;
if (func_node->name == "lambda")

View File

@ -18,6 +18,7 @@
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTColumnsMatcher.h>
#include <Parsers/ASTColumnsTransformers.h>
namespace DB
@ -135,8 +136,8 @@ void TranslateQualifiedNamesMatcher::visit(ASTFunction & node, const ASTPtr &, D
void TranslateQualifiedNamesMatcher::visit(const ASTQualifiedAsterisk &, const ASTPtr & ast, Data & data)
{
if (ast->children.size() != 1)
throw Exception("Logical error: qualified asterisk must have exactly one child", ErrorCodes::LOGICAL_ERROR);
if (ast->children.empty())
throw Exception("Logical error: qualified asterisk must have children", ErrorCodes::LOGICAL_ERROR);
auto & ident = ast->children[0];
@ -242,6 +243,10 @@ void TranslateQualifiedNamesMatcher::visit(ASTExpressionList & node, const ASTPt
first_table = false;
}
for (const auto & transformer : asterisk->children)
{
IASTColumnsTransformer::transform(transformer, node.children);
}
}
else if (const auto * asterisk_pattern = child->as<ASTColumnsMatcher>())
{
@ -258,6 +263,11 @@ void TranslateQualifiedNamesMatcher::visit(ASTExpressionList & node, const ASTPt
first_table = false;
}
// ColumnsMatcher's transformers start to appear at child 1
for (auto it = asterisk_pattern->children.begin() + 1; it != asterisk_pattern->children.end(); ++it)
{
IASTColumnsTransformer::transform(*it, node.children);
}
}
else if (const auto * qualified_asterisk = child->as<ASTQualifiedAsterisk>())
{
@ -274,6 +284,11 @@ void TranslateQualifiedNamesMatcher::visit(ASTExpressionList & node, const ASTPt
break;
}
}
// QualifiedAsterisk's transformers start to appear at child 1
for (auto it = qualified_asterisk->children.begin() + 1; it != qualified_asterisk->children.end(); ++it)
{
IASTColumnsTransformer::transform(*it, node.children);
}
}
else
node.children.emplace_back(child);

View File

@ -13,9 +13,14 @@ ASTPtr ASTAsterisk::clone() const
void ASTAsterisk::appendColumnName(WriteBuffer & ostr) const { ostr.write('*'); }
void ASTAsterisk::formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const
void ASTAsterisk::formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const
{
settings.ostr << "*";
for (const auto & child : children)
{
settings.ostr << ' ';
child->formatImpl(settings, state, frame);
}
}
}

View File

@ -9,6 +9,9 @@ namespace DB
struct AsteriskSemantic;
struct AsteriskSemanticImpl;
/** SELECT * is expanded to all visible columns of the source table.
* Optional transformers can be attached to further manipulate these expanded columns.
*/
class ASTAsterisk : public IAST
{
public:

View File

@ -28,10 +28,15 @@ void ASTColumnsMatcher::updateTreeHashImpl(SipHash & hash_state) const
IAST::updateTreeHashImpl(hash_state);
}
void ASTColumnsMatcher::formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const
void ASTColumnsMatcher::formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "COLUMNS" << (settings.hilite ? hilite_none : "") << "("
<< quoteString(original_pattern) << ")";
for (ASTs::const_iterator it = children.begin() + 1; it != children.end(); ++it)
{
settings.ostr << ' ';
(*it)->formatImpl(settings, state, frame);
}
}
void ASTColumnsMatcher::setPattern(String pattern)

View File

@ -23,6 +23,7 @@ struct AsteriskSemanticImpl;
/** SELECT COLUMNS('regexp') is expanded to multiple columns like * (asterisk).
* Optional transformers can be attached to further manipulate these expanded columns.
*/
class ASTColumnsMatcher : public IAST
{

View File

@ -0,0 +1,159 @@
#include <map>
#include "ASTColumnsTransformers.h"
#include <IO/WriteHelpers.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Common/SipHash.h>
#include <Common/quoteString.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
void IASTColumnsTransformer::transform(const ASTPtr & transformer, ASTs & nodes)
{
if (const auto * apply = transformer->as<ASTColumnsApplyTransformer>())
{
apply->transform(nodes);
}
else if (const auto * except = transformer->as<ASTColumnsExceptTransformer>())
{
except->transform(nodes);
}
else if (const auto * replace = transformer->as<ASTColumnsReplaceTransformer>())
{
replace->transform(nodes);
}
}
void ASTColumnsApplyTransformer::formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "APPLY" << (settings.hilite ? hilite_none : "") << "(" << func_name << ")";
}
void ASTColumnsApplyTransformer::transform(ASTs & nodes) const
{
for (auto & column : nodes)
{
column = makeASTFunction(func_name, column);
}
}
void ASTColumnsExceptTransformer::formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "EXCEPT" << (settings.hilite ? hilite_none : "") << "(";
for (ASTs::const_iterator it = children.begin(); it != children.end(); ++it)
{
if (it != children.begin())
{
settings.ostr << ", ";
}
(*it)->formatImpl(settings, state, frame);
}
settings.ostr << ")";
}
void ASTColumnsExceptTransformer::transform(ASTs & nodes) const
{
nodes.erase(
std::remove_if(
nodes.begin(),
nodes.end(),
[this](const ASTPtr & node_child)
{
if (const auto * id = node_child->as<ASTIdentifier>())
{
for (const auto & except_child : children)
{
if (except_child->as<const ASTIdentifier &>().name == id->shortName())
return true;
}
}
return false;
}),
nodes.end());
}
void ASTColumnsReplaceTransformer::Replacement::formatImpl(
const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const
{
expr->formatImpl(settings, state, frame);
settings.ostr << (settings.hilite ? hilite_keyword : "") << " AS " << (settings.hilite ? hilite_none : "") << name;
}
void ASTColumnsReplaceTransformer::formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "REPLACE" << (settings.hilite ? hilite_none : "") << "(";
for (ASTs::const_iterator it = children.begin(); it != children.end(); ++it)
{
if (it != children.begin())
{
settings.ostr << ", ";
}
(*it)->formatImpl(settings, state, frame);
}
settings.ostr << ")";
}
void ASTColumnsReplaceTransformer::replaceChildren(ASTPtr & node, const ASTPtr & replacement, const String & name)
{
for (auto & child : node->children)
{
if (const auto * id = child->as<ASTIdentifier>())
{
if (id->shortName() == name)
child = replacement;
}
else
replaceChildren(child, replacement, name);
}
}
void ASTColumnsReplaceTransformer::transform(ASTs & nodes) const
{
std::map<String, ASTPtr> replace_map;
for (const auto & replace_child : children)
{
auto & replacement = replace_child->as<Replacement &>();
if (replace_map.find(replacement.name) != replace_map.end())
throw Exception(
"Expressions in columns transformer REPLACE should not contain the same replacement more than once",
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
replace_map.emplace(replacement.name, replacement.expr);
}
for (auto & column : nodes)
{
if (const auto * id = column->as<ASTIdentifier>())
{
auto replace_it = replace_map.find(id->shortName());
if (replace_it != replace_map.end())
{
column = replace_it->second;
column->setAlias(replace_it->first);
}
}
else if (auto * ast_with_alias = dynamic_cast<ASTWithAlias *>(column.get()))
{
auto replace_it = replace_map.find(ast_with_alias->alias);
if (replace_it != replace_map.end())
{
auto new_ast = replace_it->second->clone();
ast_with_alias->alias = ""; // remove the old alias as it's useless after replace transformation
replaceChildren(new_ast, column, replace_it->first);
column = new_ast;
column->setAlias(replace_it->first);
}
}
}
}
}

View File

@ -0,0 +1,85 @@
#pragma once
#include <Parsers/IAST.h>
namespace DB
{
class IASTColumnsTransformer : public IAST
{
public:
virtual void transform(ASTs & nodes) const = 0;
static void transform(const ASTPtr & transformer, ASTs & nodes);
};
class ASTColumnsApplyTransformer : public IASTColumnsTransformer
{
public:
String getID(char) const override { return "ColumnsApplyTransformer"; }
ASTPtr clone() const override
{
auto res = std::make_shared<ASTColumnsApplyTransformer>(*this);
return res;
}
void transform(ASTs & nodes) const override;
String func_name;
protected:
void formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const override;
};
class ASTColumnsExceptTransformer : public IASTColumnsTransformer
{
public:
String getID(char) const override { return "ColumnsExceptTransformer"; }
ASTPtr clone() const override
{
auto clone = std::make_shared<ASTColumnsExceptTransformer>(*this);
clone->cloneChildren();
return clone;
}
void transform(ASTs & nodes) const override;
protected:
void formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const override;
};
class ASTColumnsReplaceTransformer : public IASTColumnsTransformer
{
public:
class Replacement : public IAST
{
public:
String getID(char) const override { return "ColumnsReplaceTransformer::Replacement"; }
ASTPtr clone() const override
{
auto replacement = std::make_shared<Replacement>(*this);
replacement->name = name;
replacement->expr = expr->clone();
replacement->children.push_back(replacement->expr);
return replacement;
}
String name;
ASTPtr expr;
protected:
void formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const override;
};
String getID(char) const override { return "ColumnsReplaceTransformer"; }
ASTPtr clone() const override
{
auto clone = std::make_shared<ASTColumnsReplaceTransformer>(*this);
clone->cloneChildren();
return clone;
}
void transform(ASTs & nodes) const override;
protected:
void formatImpl(const FormatSettings & settings, FormatState &, FormatStateStacked) const override;
private:
static void replaceChildren(ASTPtr & node, const ASTPtr & replacement, const String & name);
};
}

View File

@ -48,6 +48,7 @@ ASTPtr ASTFunction::clone() const
auto res = std::make_shared<ASTFunction>(*this);
res->children.clear();
if (query) { res->query = query->clone(); res->children.push_back(res->query); }
if (arguments) { res->arguments = arguments->clone(); res->children.push_back(res->arguments); }
if (parameters) { res->parameters = parameters->clone(); res->children.push_back(res->parameters); }
@ -118,6 +119,18 @@ void ASTFunction::formatImplWithoutAlias(const FormatSettings & settings, Format
nested_need_parens.need_parens = true;
nested_dont_need_parens.need_parens = false;
if (query)
{
std::string nl_or_nothing = settings.one_line ? "" : "\n";
std::string indent_str = settings.one_line ? "" : std::string(4u * frame.indent, ' ');
settings.ostr << (settings.hilite ? hilite_function : "") << name << "(" << nl_or_nothing;
FormatStateStacked frame_nested = frame;
frame_nested.need_parens = false;
++frame_nested.indent;
query->formatImpl(settings, state, frame_nested);
settings.ostr << nl_or_nothing << indent_str << ")";
return;
}
/// Should this function to be written as operator?
bool written = false;
if (arguments && !parameters)

View File

@ -13,6 +13,7 @@ class ASTFunction : public ASTWithAlias
{
public:
String name;
ASTPtr query; // It's possible for a function to accept a query as its only argument.
ASTPtr arguments;
/// parameters - for parametric aggregate function. Example: quantile(0.9)(x) - what in first parens are 'parameters'.
ASTPtr parameters;

View File

@ -16,6 +16,11 @@ void ASTQualifiedAsterisk::formatImpl(const FormatSettings & settings, FormatSta
const auto & qualifier = children.at(0);
qualifier->formatImpl(settings, state, frame);
settings.ostr << ".*";
for (ASTs::const_iterator it = children.begin() + 1; it != children.end(); ++it)
{
settings.ostr << ' ';
(*it)->formatImpl(settings, state, frame);
}
}
}

View File

@ -11,6 +11,7 @@ struct AsteriskSemanticImpl;
/** Something like t.*
* It will have qualifier as its child ASTIdentifier.
* Optional transformers can be attached to further manipulate these expanded columns.
*/
class ASTQualifiedAsterisk : public IAST
{

View File

@ -18,9 +18,13 @@
#include <Parsers/ASTQueryParameter.h>
#include <Parsers/ASTTTLElement.h>
#include <Parsers/ASTOrderByElement.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSubquery.h>
#include <Parsers/ASTFunctionWithKeyValueArguments.h>
#include <Parsers/ASTColumnsTransformers.h>
#include <Parsers/parseIdentifierOrStringLiteral.h>
#include <Parsers/parseIntervalKind.h>
#include <Parsers/ExpressionListParsers.h>
#include <Parsers/ParserSelectWithUnionQuery.h>
@ -217,10 +221,12 @@ bool ParserFunction::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
ParserIdentifier id_parser;
ParserKeyword distinct("DISTINCT");
ParserExpressionList contents(false);
ParserSelectWithUnionQuery select;
bool has_distinct_modifier = false;
ASTPtr identifier;
ASTPtr query;
ASTPtr expr_list_args;
ASTPtr expr_list_params;
@ -231,8 +237,36 @@ bool ParserFunction::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
return false;
++pos;
if (distinct.ignore(pos, expected))
has_distinct_modifier = true;
else
{
auto old_pos = pos;
auto maybe_an_subquery = pos->type == TokenType::OpeningRoundBracket;
if (select.parse(pos, query, expected))
{
auto & select_ast = query->as<ASTSelectWithUnionQuery &>();
if (select_ast.list_of_selects->children.size() == 1 && maybe_an_subquery)
{
// It's a subquery. Bail out.
pos = old_pos;
}
else
{
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto function_node = std::make_shared<ASTFunction>();
tryGetIdentifierNameInto(identifier, function_node->name);
function_node->query = query;
function_node->children.push_back(function_node->query);
node = function_node;
return true;
}
}
}
const char * contents_begin = pos->begin;
if (!contents.parse(pos, expr_list_args, expected))
@ -1172,17 +1206,131 @@ bool ParserColumnsMatcher::parseImpl(Pos & pos, ASTPtr & node, Expected & expect
auto res = std::make_shared<ASTColumnsMatcher>();
res->setPattern(regex_node->as<ASTLiteral &>().value.get<String>());
res->children.push_back(regex_node);
ParserColumnsTransformers transformers_p;
ASTPtr transformer;
while (transformers_p.parse(pos, transformer, expected))
{
res->children.push_back(transformer);
}
node = std::move(res);
return true;
}
bool ParserAsterisk::parseImpl(Pos & pos, ASTPtr & node, Expected &)
bool ParserColumnsTransformers::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
ParserKeyword apply("APPLY");
ParserKeyword except("EXCEPT");
ParserKeyword replace("REPLACE");
ParserKeyword as("AS");
if (apply.ignore(pos, expected))
{
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
String func_name;
if (!parseIdentifierOrStringLiteral(pos, expected, func_name))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto res = std::make_shared<ASTColumnsApplyTransformer>();
res->func_name = func_name;
node = std::move(res);
return true;
}
else if (except.ignore(pos, expected))
{
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
ASTs identifiers;
auto parse_id = [&identifiers, &pos, &expected]
{
ASTPtr identifier;
if (!ParserIdentifier().parse(pos, identifier, expected))
return false;
identifiers.emplace_back(std::move(identifier));
return true;
};
if (!ParserList::parseUtil(pos, expected, parse_id, false))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto res = std::make_shared<ASTColumnsExceptTransformer>();
res->children = std::move(identifiers);
node = std::move(res);
return true;
}
else if (replace.ignore(pos, expected))
{
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
ASTs replacements;
ParserExpression element_p;
ParserIdentifier ident_p;
auto parse_id = [&]
{
ASTPtr expr;
if (!element_p.parse(pos, expr, expected))
return false;
if (!as.ignore(pos, expected))
return false;
ASTPtr ident;
if (!ident_p.parse(pos, ident, expected))
return false;
auto replacement = std::make_shared<ASTColumnsReplaceTransformer::Replacement>();
replacement->name = getIdentifierName(ident);
replacement->expr = std::move(expr);
replacements.emplace_back(std::move(replacement));
return true;
};
if (!ParserList::parseUtil(pos, expected, parse_id, false))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto res = std::make_shared<ASTColumnsReplaceTransformer>();
res->children = std::move(replacements);
node = std::move(res);
return true;
}
return false;
}
bool ParserAsterisk::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
if (pos->type == TokenType::Asterisk)
{
++pos;
node = std::make_shared<ASTAsterisk>();
auto asterisk = std::make_shared<ASTAsterisk>();
ParserColumnsTransformers transformers_p;
ASTPtr transformer;
while (transformers_p.parse(pos, transformer, expected))
{
asterisk->children.push_back(transformer);
}
node = asterisk;
return true;
}
return false;
@ -1204,6 +1352,12 @@ bool ParserQualifiedAsterisk::parseImpl(Pos & pos, ASTPtr & node, Expected & exp
auto res = std::make_shared<ASTQualifiedAsterisk>();
res->children.push_back(node);
ParserColumnsTransformers transformers_p;
ASTPtr transformer;
while (transformers_p.parse(pos, transformer, expected))
{
res->children.push_back(transformer);
}
node = std::move(res);
return true;
}

View File

@ -88,6 +88,15 @@ protected:
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
/** *, t.*, db.table.*, COLUMNS('<regular expression>') APPLY(...) or EXCEPT(...) or REPLACE(...)
*/
class ParserColumnsTransformers : public IParserBase
{
protected:
const char * getName() const override { return "COLUMNS transformers"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
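For context, a hedged SQL sketch of the syntax this parser is meant to accept (table and column names are hypothetical):

``` sql
-- Transformers attached to *, to a qualified asterisk, and to COLUMNS('regexp').
SELECT * EXCEPT (secret) APPLY (sum) FROM t;
SELECT t.* REPLACE (x + 1 AS x) FROM t;
SELECT COLUMNS('^metric_') APPLY (max) FROM t;
```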
/** A function, for example, f(x, y + 1, g(z)).
* Or an aggregate function: sum(x + f(y)), corr(x, y). The syntax is the same as the usual function.
* Or a parametric aggregate function: quantile(0.9)(x + y).

View File

@ -46,10 +46,10 @@ static inline bool parseColumnDeclareOptions(IParser::Pos & pos, ASTPtr & node,
OptionDescribe("DEFAULT", "default", std::make_unique<ParserExpression>()),
OptionDescribe("ON UPDATE", "on_update", std::make_unique<ParserExpression>()),
OptionDescribe("AUTO_INCREMENT", "auto_increment", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("UNIQUE", "unique_key", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("UNIQUE KEY", "unique_key", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("KEY", "primary_key", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("PRIMARY KEY", "primary_key", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("UNIQUE", "unique_key", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("KEY", "primary_key", std::make_unique<ParserAlwaysTrue>()),
OptionDescribe("COMMENT", "comment", std::make_unique<ParserStringLiteral>()),
OptionDescribe("CHARACTER SET", "charset_name", std::make_unique<ParserCharsetName>()),
OptionDescribe("COLLATE", "collate", std::make_unique<ParserCharsetName>()),

View File

@ -10,6 +10,7 @@ SRCS(
ASTAsterisk.cpp
ASTColumnDeclaration.cpp
ASTColumnsMatcher.cpp
ASTColumnsTransformers.cpp
ASTConstraintDeclaration.cpp
ASTCreateQuery.cpp
ASTCreateQuotaQuery.cpp

View File

@ -102,6 +102,8 @@ Pipe::Holder & Pipe::Holder::operator=(Holder && rhs)
storage_holders.insert(storage_holders.end(), rhs.storage_holders.begin(), rhs.storage_holders.end());
interpreter_context.insert(interpreter_context.end(),
rhs.interpreter_context.begin(), rhs.interpreter_context.end());
for (auto & plan : rhs.query_plans)
query_plans.emplace_back(std::move(plan));
return *this;
}

View File

@ -1,6 +1,7 @@
#pragma once
#include <Processors/IProcessor.h>
#include <Processors/Sources/SourceWithProgress.h>
#include <Processors/QueryPlan/QueryPlan.h>
namespace DB
{
@ -8,6 +9,8 @@ namespace DB
class Pipe;
using Pipes = std::vector<Pipe>;
class QueryPipeline;
class IStorage;
using StoragePtr = std::shared_ptr<IStorage>;
@ -86,6 +89,8 @@ public:
/// Get processors from Pipe. Use with caution, it is easy to lose totals and extremes ports.
static Processors detachProcessors(Pipe pipe) { return std::move(pipe.processors); }
/// Get processors from Pipe w/o destroying pipe (used for EXPLAIN to keep QueryPlan).
const Processors & getProcessors() const { return processors; }
/// Specify quotas and limits for every ISourceWithProgress.
void setLimits(const SourceWithProgress::LocalLimits & limits);
@ -96,6 +101,8 @@ public:
/// This methods are from QueryPipeline. Needed to make conversion from pipeline to pipe possible.
void addInterpreterContext(std::shared_ptr<Context> context) { holder.interpreter_context.emplace_back(std::move(context)); }
void addStorageHolder(StoragePtr storage) { holder.storage_holders.emplace_back(std::move(storage)); }
/// For queries with nested interpreters (i.e. StorageDistributed)
void addQueryPlan(std::unique_ptr<QueryPlan> plan) { holder.query_plans.emplace_back(std::move(plan)); }
private:
/// Destruction order: processors, header, locks, temporary storages, local contexts
@ -113,6 +120,7 @@ private:
std::vector<std::shared_ptr<Context>> interpreter_context;
std::vector<StoragePtr> storage_holders;
std::vector<TableLockHolder> table_locks;
std::vector<std::unique_ptr<QueryPlan>> query_plans;
};
Holder holder;

View File

@ -21,6 +21,8 @@ class QueryPipelineProcessorsCollector;
struct AggregatingTransformParams;
using AggregatingTransformParamsPtr = std::shared_ptr<AggregatingTransformParams>;
class QueryPlan;
class QueryPipeline
{
public:
@ -93,6 +95,7 @@ public:
void addTableLock(const TableLockHolder & lock) { pipe.addTableLock(lock); }
void addInterpreterContext(std::shared_ptr<Context> context) { pipe.addInterpreterContext(std::move(context)); }
void addStorageHolder(StoragePtr storage) { pipe.addStorageHolder(std::move(storage)); }
void addQueryPlan(std::unique_ptr<QueryPlan> plan) { pipe.addQueryPlan(std::move(plan)); }
/// For compatibility with IBlockInputStream.
void setProgressCallback(const ProgressCallback & callback);

View File

@ -104,7 +104,13 @@ void StorageView::replaceWithSubquery(ASTSelectQuery & outer_query, ASTPtr view_
ASTTableExpression * table_expression = getFirstTableExpression(outer_query);
if (!table_expression->database_and_table_name)
throw Exception("Logical error: incorrect table expression", ErrorCodes::LOGICAL_ERROR);
{
// If it's a view table function, add a fake db.table name.
if (table_expression->table_function && table_expression->table_function->as<ASTFunction>()->name == "view")
table_expression->database_and_table_name = std::make_shared<ASTIdentifier>("__view");
else
throw Exception("Logical error: incorrect table expression", ErrorCodes::LOGICAL_ERROR);
}
DatabaseAndTableWithAlias db_table(table_expression->database_and_table_name);
String alias = db_table.alias.empty() ? db_table.table : db_table.alias;

View File

@ -0,0 +1,45 @@
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Parsers/ASTCreateQuery.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Storages/StorageView.h>
#include <TableFunctions/ITableFunction.h>
#include <TableFunctions/TableFunctionFactory.h>
#include <TableFunctions/TableFunctionView.h>
#include "registerTableFunctions.h"
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
}
StoragePtr TableFunctionView::executeImpl(const ASTPtr & ast_function, const Context & context, const std::string & table_name) const
{
if (const auto * function = ast_function->as<ASTFunction>())
{
if (function->query)
{
if (auto * select = function->query->as<ASTSelectWithUnionQuery>())
{
auto sample = InterpreterSelectWithUnionQuery::getSampleBlock(function->query, context);
auto columns = ColumnsDescription(sample.getNamesAndTypesList());
ASTCreateQuery create;
create.select = select;
auto res = StorageView::create(StorageID(getDatabaseName(), table_name), create, columns);
res->startup();
return res;
}
}
}
throw Exception("Table function '" + getName() + "' requires a query argument.", ErrorCodes::BAD_ARGUMENTS);
}
void registerTableFunctionView(TableFunctionFactory & factory)
{
factory.registerFunction<TableFunctionView>();
}
}

View File

@ -0,0 +1,27 @@
#pragma once
#include <TableFunctions/ITableFunction.h>
#include <Core/Types.h>
namespace DB
{
/* view(query)
* Turns a subquery into a table.
* Useful for passing a subquery around.
*/
class TableFunctionView : public ITableFunction
{
public:
static constexpr auto name = "view";
std::string getName() const override { return name; }
private:
StoragePtr executeImpl(const ASTPtr & ast_function, const Context & context, const std::string & table_name) const override;
const char * getStorageTypeName() const override { return "View"; }
UInt64 evaluateArgument(const Context & context, ASTPtr & argument) const;
};
}

View File

@ -30,6 +30,8 @@ void registerTableFunctions()
registerTableFunctionODBC(factory);
registerTableFunctionJDBC(factory);
registerTableFunctionView(factory);
#if USE_MYSQL
registerTableFunctionMySQL(factory);
#endif

View File

@ -30,6 +30,8 @@ void registerTableFunctionHDFS(TableFunctionFactory & factory);
void registerTableFunctionODBC(TableFunctionFactory & factory);
void registerTableFunctionJDBC(TableFunctionFactory & factory);
void registerTableFunctionView(TableFunctionFactory & factory);
#if USE_MYSQL
void registerTableFunctionMySQL(TableFunctionFactory & factory);
#endif

View File

@ -21,6 +21,7 @@ SRCS(
TableFunctionRemote.cpp
TableFunctionURL.cpp
TableFunctionValues.cpp
TableFunctionView.cpp
TableFunctionZeros.cpp
)

View File

@ -1,4 +1,5 @@
import time
import pymysql.cursors
def check_query(clickhouse_node, query, result_set, retry_count=3, interval_seconds=3):
@ -321,3 +322,35 @@ def alter_rename_table_with_materialize_mysql_database(clickhouse_node, mysql_no
clickhouse_node.query("DROP DATABASE test_database")
mysql_node.query("DROP DATABASE test_database")
def query_event_with_empty_transaction(clickhouse_node, mysql_node, service_name):
mysql_node.query("CREATE DATABASE test_database")
mysql_node.query("RESET MASTER")
mysql_node.query("CREATE TABLE test_database.t1(a INT NOT NULL PRIMARY KEY, b VARCHAR(255) DEFAULT 'BEGIN')")
mysql_node.query("INSERT INTO test_database.t1(a) VALUES(1)")
clickhouse_node.query(
"CREATE DATABASE test_database ENGINE = MaterializeMySQL('{}:3306', 'test_database', 'root', 'clickhouse')".format(
service_name))
# ClickHouse must reject (skip) one empty GTID transaction: QUERY events 'BEGIN' and 'COMMIT'.
mysql_cursor = mysql_node.alloc_connection().cursor(pymysql.cursors.DictCursor)
mysql_cursor.execute("SHOW MASTER STATUS")
(uuid, seqs) = mysql_cursor.fetchall()[0]["Executed_Gtid_Set"].split(":")
(seq_begin, seq_end) = seqs.split("-")
next_gtid = uuid + ":" + str(int(seq_end) + 1)
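# e.g. (hypothetical values) Executed_Gtid_Set '3e11fa47-...:1-5' splits into seq_end '5', so next_gtid becomes '3e11fa47-...:6'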
mysql_node.query("SET gtid_next='" + next_gtid + "'")
mysql_node.query("BEGIN")
mysql_node.query("COMMIT")
mysql_node.query("SET gtid_next='AUTOMATIC'")
# ClickHouse must reject (skip) one 'BEGIN' QUERY event and one 'COMMIT' XID event.
mysql_node.query("/* start */ begin /* end */")
mysql_node.query("INSERT INTO test_database.t1(a) VALUES(2)")
mysql_node.query("/* start */ commit /* end */")
check_query(clickhouse_node, "SELECT * FROM test_database.t1 ORDER BY a FORMAT TSV",
"1\tBEGIN\n2\tBEGIN\n")
clickhouse_node.query("DROP DATABASE test_database")
mysql_node.query("DROP DATABASE test_database")

View File

@ -120,3 +120,8 @@ def test_materialize_database_ddl_with_mysql_8_0(started_cluster, started_mysql_
materialize_with_ddl.alter_rename_column_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")
materialize_with_ddl.alter_modify_column_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")
def test_materialize_database_ddl_with_empty_transaction_5_7(started_cluster, started_mysql_5_7):
materialize_with_ddl.query_event_with_empty_transaction(clickhouse_node, started_mysql_5_7, "mysql5_7")
def test_materialize_database_ddl_with_empty_transaction_8_0(started_cluster, started_mysql_8_0):
materialize_with_ddl.query_event_with_empty_transaction(clickhouse_node, started_mysql_8_0, "mysql8_0")

View File

@ -214,7 +214,7 @@ def test_different_part_types_on_replicas(start_cluster, table, part_type):
node7 = cluster.add_instance('node7', user_configs=["configs_old/users.d/not_optimize_count.xml"], with_zookeeper=True, image='yandex/clickhouse-server', tag='19.17.8.54', stay_alive=True, with_installed_binary=True)
node8 = cluster.add_instance('node8', main_configs=[], user_configs=["configs/users.d/not_optimize_count.xml"], with_zookeeper=True)
node8 = cluster.add_instance('node8', user_configs=["configs/users.d/not_optimize_count.xml"], with_zookeeper=True)
settings7 = {'index_granularity_bytes' : 10485760}
settings8 = {'index_granularity_bytes' : 10485760, 'min_rows_for_wide_part' : 512, 'min_rows_for_compact_part' : 0}

View File

@ -0,0 +1,10 @@
1
1
SELECT `1`
FROM view(
SELECT 1
)
SELECT `1`
FROM remote(\'127.0.0.1\', view(
SELECT 1
))

View File

@ -0,0 +1,5 @@
SELECT * FROM view(SELECT 1);
SELECT * FROM remote('127.0.0.1', view(SELECT 1));
EXPLAIN SYNTAX SELECT * FROM view(SELECT 1);
EXPLAIN SYNTAX SELECT * FROM remote('127.0.0.1', view(SELECT 1));

View File

@ -0,0 +1,63 @@
220 18 347
110 9 173.5
1970-04-11 1970-01-11 1970-11-21
2 3
1 2
18 347
110 173.5
1970-04-11 1970-01-11 1970-11-21
222 18 347
111 11 173.5
1970-04-11 1970-01-11 1970-11-21
SELECT
sum(i),
sum(j),
sum(k)
FROM columns_transformers
SELECT
avg(i),
avg(j),
avg(k)
FROM columns_transformers
SELECT
toDate(any(i)),
toDate(any(j)),
toDate(any(k))
FROM columns_transformers AS a
SELECT
length(toString(j)),
length(toString(k))
FROM columns_transformers
SELECT
sum(j),
sum(k)
FROM columns_transformers
SELECT
avg(i),
avg(k)
FROM columns_transformers
SELECT
toDate(any(i)),
toDate(any(j)),
toDate(any(k))
FROM columns_transformers AS a
SELECT
sum(i + 1 AS i),
sum(j),
sum(k)
FROM columns_transformers
SELECT
avg(i + 1 AS i),
avg(j + 2 AS j),
avg(k)
FROM columns_transformers
SELECT
toDate(any(i)),
toDate(any(j)),
toDate(any(k))
FROM columns_transformers AS a
SELECT
(i + 1) + 1 AS i,
j,
k
FROM columns_transformers

View File

@ -0,0 +1,36 @@
DROP TABLE IF EXISTS columns_transformers;
CREATE TABLE columns_transformers (i Int64, j Int16, k Int64) Engine=TinyLog;
INSERT INTO columns_transformers VALUES (100, 10, 324), (120, 8, 23);
SELECT * APPLY(sum) from columns_transformers;
SELECT columns_transformers.* APPLY(avg) from columns_transformers;
SELECT a.* APPLY(toDate) APPLY(any) from columns_transformers a;
SELECT COLUMNS('[jk]') APPLY(toString) APPLY(length) from columns_transformers;
SELECT * EXCEPT(i) APPLY(sum) from columns_transformers;
SELECT columns_transformers.* EXCEPT(j) APPLY(avg) from columns_transformers;
-- EXCEPT after APPLY will not match anything
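-- (after APPLY(toDate) the select list contains expressions like toDate(i), so EXCEPT(i, j) finds no column named i or j and removes nothing)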
SELECT a.* APPLY(toDate) EXCEPT(i, j) APPLY(any) from columns_transformers a;
SELECT * REPLACE(i + 1 AS i) APPLY(sum) from columns_transformers;
SELECT columns_transformers.* REPLACE(j + 2 AS j, i + 1 AS i) APPLY(avg) from columns_transformers;
SELECT columns_transformers.* REPLACE(j + 1 AS j, j + 2 AS j) APPLY(avg) from columns_transformers; -- { serverError 43 }
-- REPLACE after APPLY will not match anything
SELECT a.* APPLY(toDate) REPLACE(i + 1 AS i) APPLY(any) from columns_transformers a;
EXPLAIN SYNTAX SELECT * APPLY(sum) from columns_transformers;
EXPLAIN SYNTAX SELECT columns_transformers.* APPLY(avg) from columns_transformers;
EXPLAIN SYNTAX SELECT a.* APPLY(toDate) APPLY(any) from columns_transformers a;
EXPLAIN SYNTAX SELECT COLUMNS('[jk]') APPLY(toString) APPLY(length) from columns_transformers;
EXPLAIN SYNTAX SELECT * EXCEPT(i) APPLY(sum) from columns_transformers;
EXPLAIN SYNTAX SELECT columns_transformers.* EXCEPT(j) APPLY(avg) from columns_transformers;
EXPLAIN SYNTAX SELECT a.* APPLY(toDate) EXCEPT(i, j) APPLY(any) from columns_transformers a;
EXPLAIN SYNTAX SELECT * REPLACE(i + 1 AS i) APPLY(sum) from columns_transformers;
EXPLAIN SYNTAX SELECT columns_transformers.* REPLACE(j + 2 AS j, i + 1 AS i) APPLY(avg) from columns_transformers;
EXPLAIN SYNTAX SELECT a.* APPLY(toDate) REPLACE(i + 1 AS i) APPLY(any) from columns_transformers a;
-- Multiple REPLACE in a row
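-- (the two REPLACE clauses compose: i is rewritten to (i + 1) + 1 AS i, as the reference output shows)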
EXPLAIN SYNTAX SELECT * REPLACE(i + 1 AS i) REPLACE(i + 1 AS i) from columns_transformers;
DROP TABLE columns_transformers;

View File

@ -0,0 +1,6 @@
--
-- regressions
--
-- SIGSEGV regression due to QueryPlan lifetime
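-- remote() reads through StorageDistributed, whose nested interpreter builds a QueryPlan; that plan must stay owned by the pipe that EXPLAIN PIPELINE inspects (see Pipe::addQueryPlan)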
EXPLAIN PIPELINE graph=1 SELECT * FROM remote('127.{1,2}', system.one) FORMAT Null;

View File

@ -11,7 +11,7 @@ ROOT_DIR=${CUR_DIR}/../../build_no_submodules
mkdir -p $ROOT_DIR
cd $ROOT_DIR
URL=`git remote get-url origin | sed 's/.git$//'`
wget -O ch.zip $URL/archive/${BRANCH}.zip
wget -nv -O ch.zip $URL/archive/${BRANCH}.zip
unzip -ou ch.zip
# TODO: make lz4 and zstd disableable

View File

@ -18,7 +18,7 @@ THREADS=$(grep -c ^processor /proc/cpuinfo)
mkdir "${WORKSPACE}/gcc"
pushd "${WORKSPACE}/gcc"
wget https://ftpmirror.gnu.org/gcc/${GCC_SOURCES_VERSION}/${GCC_SOURCES_VERSION}.tar.xz
wget -nv https://ftpmirror.gnu.org/gcc/${GCC_SOURCES_VERSION}/${GCC_SOURCES_VERSION}.tar.xz
tar xf ${GCC_SOURCES_VERSION}.tar.xz
pushd ${GCC_SOURCES_VERSION}
./contrib/download_prerequisites

View File

@ -29,7 +29,7 @@ baseUrl="https://partner-images.canonical.com/core/$VERSION"
# install qemu-user-static
if [ -n "${QEMU_ARCH}" ]; then
if [ ! -f x86_64_qemu-${QEMU_ARCH}-static.tar.gz ]; then
wget -N https://github.com/multiarch/qemu-user-static/releases/download/${QEMU_VER}/x86_64_qemu-${QEMU_ARCH}-static.tar.gz
wget -nv -N https://github.com/multiarch/qemu-user-static/releases/download/${QEMU_VER}/x86_64_qemu-${QEMU_ARCH}-static.tar.gz
fi
tar -xvf x86_64_qemu-${QEMU_ARCH}-static.tar.gz -C $ROOTFS/usr/bin/
fi
@ -37,13 +37,13 @@ fi
# get the image
if \
wget -q --spider "$baseUrl/current" \
&& wget -q --spider "$baseUrl/current/$thisTar" \
wget -nv --spider "$baseUrl/current" \
&& wget -nv --spider "$baseUrl/current/$thisTar" \
; then
baseUrl+='/current'
fi
wget -qN "$baseUrl/"{{MD5,SHA{1,256}}SUMS{,.gpg},"$thisTarBase.manifest",'unpacked/build-info.txt'} || true
wget -N "$baseUrl/$thisTar"
wget -nv -N "$baseUrl/"{{MD5,SHA{1,256}}SUMS{,.gpg},"$thisTarBase.manifest",'unpacked/build-info.txt'} || true
wget -nv -N "$baseUrl/$thisTar"
# check checksum
if [ -f SHA256SUMS ]; then

View File

@ -24,7 +24,7 @@ param="$1"
if [ "${param}" = "list" ]
then
# https://stackoverflow.com/a/39454426/1555175
wget -q https://registry.hub.docker.com/v1/repositories/yandex/clickhouse-server/tags -O - | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}'
wget -nv https://registry.hub.docker.com/v1/repositories/yandex/clickhouse-server/tags -O - | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}'
else
docker pull yandex/clickhouse-server:${param}
tmp_dir=$(mktemp -d -t ci-XXXXXXXXXX) # older versions require the /nonexistent folder to exist to run clickhouse client :D

View File

@ -20,7 +20,7 @@ class Backport:
def getPullRequests(self, from_commit):
return self._gh.get_pull_requests(from_commit)
def execute(self, repo, til, number, run_cherrypick):
def execute(self, repo, until_commit, number, run_cherrypick):
repo = LocalRepo(repo, 'origin', self.default_branch_name)
branches = repo.get_release_branches()[-number:] # [(branch_name, base_commit)]
@ -31,9 +31,9 @@ class Backport:
for branch in branches:
logging.info('Found release branch: %s', branch[0])
if not til:
til = branches[0][1]
prs = self.getPullRequests(til)
if not until_commit:
until_commit = branches[0][1]
pull_requests = self.getPullRequests(until_commit)
backport_map = {}
@ -42,7 +42,7 @@ class Backport:
RE_BACKPORTED = re.compile(r'^v(\d+\.\d+)-backported$')
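# e.g. a (hypothetical) label 'v20.7-backported' matches with group(1) == '20.7', the release series already handled manually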
# pull-requests are sorted by ancestry from the least recent.
for pr in prs:
for pr in pull_requests:
while repo.comparator(branches[-1][1]) >= repo.comparator(pr['mergeCommit']['oid']):
logging.info("PR #{} is already inside {}. Dropping this branch for further PRs".format(pr['number'], branches[-1][0]))
branches.pop()
@ -55,28 +55,28 @@ class Backport:
# First pass. Find all must-backports
for label in pr['labels']['nodes']:
if label['name'].startswith('pr-') and label['color'] == 'ff0000':
if label['name'] == 'pr-bugfix':
backport_map[pr['number']] = branch_set.copy()
continue
m = RE_MUST_BACKPORT.match(label['name'])
if m:
matched = RE_MUST_BACKPORT.match(label['name'])
if matched:
if pr['number'] not in backport_map:
backport_map[pr['number']] = set()
backport_map[pr['number']].add(m.group(1))
backport_map[pr['number']].add(matched.group(1))
# Second pass. Find all no-backports
for label in pr['labels']['nodes']:
if label['name'] == 'pr-no-backport' and pr['number'] in backport_map:
del backport_map[pr['number']]
break
m1 = RE_NO_BACKPORT.match(label['name'])
m2 = RE_BACKPORTED.match(label['name'])
if m1 and pr['number'] in backport_map and m1.group(1) in backport_map[pr['number']]:
backport_map[pr['number']].remove(m1.group(1))
logging.info('\tskipping %s because of forced no-backport', m1.group(1))
elif m2 and pr['number'] in backport_map and m2.group(1) in backport_map[pr['number']]:
backport_map[pr['number']].remove(m2.group(1))
logging.info('\tskipping %s because it\'s already backported manually', m2.group(1))
matched_no_backport = RE_NO_BACKPORT.match(label['name'])
matched_backported = RE_BACKPORTED.match(label['name'])
if matched_no_backport and pr['number'] in backport_map and matched_no_backport.group(1) in backport_map[pr['number']]:
backport_map[pr['number']].remove(matched_no_backport.group(1))
logging.info('\tskipping %s because of forced no-backport', matched_no_backport.group(1))
elif matched_backported and pr['number'] in backport_map and matched_backported.group(1) in backport_map[pr['number']]:
backport_map[pr['number']].remove(matched_backported.group(1))
logging.info('\tskipping %s because it\'s already backported manually', matched_backported.group(1))
for pr, branches in backport_map.items():
logging.info('PR #%s needs to be backported to:', pr)