Merge branch 'master' into CheSema-patch-1

Sema Checherinda 2023-03-08 11:27:42 +01:00 committed by GitHub
commit 6000881230
78 changed files with 2374 additions and 244 deletions


@ -0,0 +1,55 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v22.12.4.76-stable (cb5772db805) FIXME as compared to v22.12.3.5-stable (893de538f02)
#### Performance Improvement
* Backported in [#45704](https://github.com/ClickHouse/ClickHouse/issues/45704): Fixed the performance of short `SELECT` queries that read from tables with a large number of `Array`/`Map`/`Nested` columns. [#45630](https://github.com/ClickHouse/ClickHouse/pull/45630) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#46378](https://github.com/ClickHouse/ClickHouse/issues/46378): Fix too big memory usage for vertical merges on non-remote disk. Respect `max_insert_delayed_streams_for_parallel_write` for the remote disk. [#46275](https://github.com/ClickHouse/ClickHouse/pull/46275) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### Bug Fix
* Backported in [#45672](https://github.com/ClickHouse/ClickHouse/issues/45672): Fix wiping sensitive info in logs. [#45603](https://github.com/ClickHouse/ClickHouse/pull/45603) ([Vitaly Baranov](https://github.com/vitlibar)).
#### Build/Testing/Packaging Improvement
* Backported in [#45200](https://github.com/ClickHouse/ClickHouse/issues/45200): Fix zookeeper downloading, update the version, and optimize the image size. [#44853](https://github.com/ClickHouse/ClickHouse/pull/44853) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#46116](https://github.com/ClickHouse/ClickHouse/issues/46116): Remove the dependency on the `adduser` tool from the packages, because we don't use it. This fixes [#44934](https://github.com/ClickHouse/ClickHouse/issues/44934). [#45011](https://github.com/ClickHouse/ClickHouse/pull/45011) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46035](https://github.com/ClickHouse/ClickHouse/issues/46035): Add systemd.service file for clickhouse-keeper. Fixes [#44293](https://github.com/ClickHouse/ClickHouse/issues/44293). [#45568](https://github.com/ClickHouse/ClickHouse/pull/45568) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#46484](https://github.com/ClickHouse/ClickHouse/issues/46484): Get rid of unnecessary build for standalone clickhouse-keeper. [#46367](https://github.com/ClickHouse/ClickHouse/pull/46367) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#46509](https://github.com/ClickHouse/ClickHouse/issues/46509): Some time ago the ccache compression was changed to `zst`, but `gz` archives were downloaded by default. This is fixed by prioritizing the `zst` archive. [#46490](https://github.com/ClickHouse/ClickHouse/pull/46490) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#47058](https://github.com/ClickHouse/ClickHouse/issues/47058): Fix an error during server startup on old distros (e.g. Amazon Linux 2) and on ARM where glibc 2.28 symbols are not found. [#47008](https://github.com/ClickHouse/ClickHouse/pull/47008) ([Robert Schulze](https://github.com/rschu1ze)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#45904](https://github.com/ClickHouse/ClickHouse/issues/45904): Fixed a bug with a non-parsable default value for an EPHEMERAL column in table metadata. [#44026](https://github.com/ClickHouse/ClickHouse/pull/44026) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#45321](https://github.com/ClickHouse/ClickHouse/issues/45321): Fixed a bug in normalization of a `DEFAULT` expression in `CREATE TABLE` statement. The second argument of function `in` (or the right argument of operator `IN`) might be replaced with the result of its evaluation during CREATE query execution. Fixes [#44496](https://github.com/ClickHouse/ClickHouse/issues/44496). [#44547](https://github.com/ClickHouse/ClickHouse/pull/44547) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Backported in [#45000](https://github.com/ClickHouse/ClickHouse/issues/45000): Another fix for `Cannot read all data` error which could happen while reading `LowCardinality` dictionary from remote fs. Fixes [#44709](https://github.com/ClickHouse/ClickHouse/issues/44709). [#44875](https://github.com/ClickHouse/ClickHouse/pull/44875) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#45553](https://github.com/ClickHouse/ClickHouse/issues/45553): Fix `SELECT ... FROM system.dictionaries` exception when there is a dictionary with a bad structure (e.g. incorrect type in xml config). [#45399](https://github.com/ClickHouse/ClickHouse/pull/45399) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Backported in [#46226](https://github.com/ClickHouse/ClickHouse/issues/46226): A couple of segfaults have been reported around `c-ares`. All of the recent stack traces observed fail on inserting into `std::unordered_set<>`. The root cause seems to be unprocessed queries. Prior to this PR, CH called `poll` to wait on the file descriptors in the `c-ares` channel. According to the [poll docs](https://man7.org/linux/man-pages/man2/poll.2.html), a negative return value means an error has occurred. Because of this, we would abort the execution and return failure. The problem is that `poll` also returns a negative value if a system interrupt occurs. A system interrupt does not mean the processing has failed or ended, but we would abort it anyway because we were checking for negative values. Once the execution is aborted, the whole stack is destroyed, which includes the `std::unordered_set<std::string>` passed to the `void *` parameter of the c-ares callback. Once c-ares completed the request, the callback would be invoked and would access an invalid memory address, causing a segfault (a minimal sketch of the retry pattern appears after this list). [#45629](https://github.com/ClickHouse/ClickHouse/pull/45629) ([Arthur Passos](https://github.com/arthurpassos)).
* Backported in [#46218](https://github.com/ClickHouse/ClickHouse/issues/46218): Fix reading of non-existing nested columns with multiple levels in compact parts. [#46045](https://github.com/ClickHouse/ClickHouse/pull/46045) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#46446](https://github.com/ClickHouse/ClickHouse/issues/46446): Fix possible `LOGICAL_ERROR` in asynchronous inserts with invalid data sent in format `VALUES`. [#46350](https://github.com/ClickHouse/ClickHouse/pull/46350) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#46678](https://github.com/ClickHouse/ClickHouse/issues/46678): Fix an invalid processing of constant `LowCardinality` argument in function `arrayMap`. This bug could lead to a segfault in release, and logical error `Bad cast` in debug build. [#46569](https://github.com/ClickHouse/ClickHouse/pull/46569) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46872](https://github.com/ClickHouse/ClickHouse/issues/46872): Fix a bug in the `Map` data type. This closes [#46855](https://github.com/ClickHouse/ClickHouse/issues/46855). [#46856](https://github.com/ClickHouse/ClickHouse/pull/46856) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46954](https://github.com/ClickHouse/ClickHouse/issues/46954): Fix result of LIKE predicates which translate to substring searches and contain quoted non-LIKE metacharacters. [#46875](https://github.com/ClickHouse/ClickHouse/pull/46875) ([Robert Schulze](https://github.com/rschu1ze)).
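The `poll` fix described above boils down to treating `EINTR` as retryable rather than fatal. A minimal sketch of that pattern (illustrative names, not the actual ClickHouse code):

``` cpp
#include <poll.h>
#include <cerrno>

/// Retry poll() when it is interrupted by a signal: EINTR means
/// "interrupted, try again", not "failed".
int pollWithRetry(pollfd * fds, nfds_t nfds, int timeout_ms)
{
    while (true)
    {
        int rc = ::poll(fds, nfds, timeout_ms);
        if (rc >= 0)
            return rc;      /// 0 = timeout, > 0 = number of ready descriptors
        if (errno == EINTR)
            continue;       /// a signal interrupted the call: retry instead of aborting
        return -1;          /// a real error: report it to the caller
    }
}
```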
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Automatically merge green backport PRs and green approved PRs [#41110](https://github.com/ClickHouse/ClickHouse/pull/41110) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Improve release scripts [#45074](https://github.com/ClickHouse/ClickHouse/pull/45074) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix wrong approved_at, simplify conditions [#45302](https://github.com/ClickHouse/ClickHouse/pull/45302) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Get rid of artifactory in favor of r2 + ch-repos-manager [#45421](https://github.com/ClickHouse/ClickHouse/pull/45421) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Another attempt to fix automerge, or at least to have debug footprint [#45476](https://github.com/ClickHouse/ClickHouse/pull/45476) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Trim refs/tags/ from GITHUB_TAG in release workflow [#45636](https://github.com/ClickHouse/ClickHouse/pull/45636) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add check for running workflows to merge_pr.py [#45803](https://github.com/ClickHouse/ClickHouse/pull/45803) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Get rid of progress timestamps in release publishing [#45818](https://github.com/ClickHouse/ClickHouse/pull/45818) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add necessary dependency for sanitizers [#45959](https://github.com/ClickHouse/ClickHouse/pull/45959) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add helping logging to auto-merge script [#46080](https://github.com/ClickHouse/ClickHouse/pull/46080) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix write buffer destruction order for vertical merge. [#46205](https://github.com/ClickHouse/ClickHouse/pull/46205) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Improve install_check.py [#46458](https://github.com/ClickHouse/ClickHouse/pull/46458) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix dependencies for InstallPackagesTestAarch64 [#46597](https://github.com/ClickHouse/ClickHouse/pull/46597) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Get rid of legacy DocsReleaseChecks [#46665](https://github.com/ClickHouse/ClickHouse/pull/46665) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Reduce updates of Mergeable Check [#46781](https://github.com/ClickHouse/ClickHouse/pull/46781) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).


@ -0,0 +1,40 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v22.8.14.53-lts (4ea67c40077) FIXME as compared to v22.8.13.20-lts (e4817946d18)
#### Performance Improvement
* Backported in [#45845](https://github.com/ClickHouse/ClickHouse/issues/45845): Fixed the performance of short `SELECT` queries that read from tables with a large number of `Array`/`Map`/`Nested` columns. [#45630](https://github.com/ClickHouse/ClickHouse/pull/45630) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#46374](https://github.com/ClickHouse/ClickHouse/issues/46374): Fix too big memory usage for vertical merges on non-remote disk. Respect `max_insert_delayed_streams_for_parallel_write` for the remote disk. [#46275](https://github.com/ClickHouse/ClickHouse/pull/46275) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#46358](https://github.com/ClickHouse/ClickHouse/issues/46358): Allow using Vertical merge algorithm with parts in Compact format. This will allow ClickHouse server to use much less memory for background operations. This closes [#46084](https://github.com/ClickHouse/ClickHouse/issues/46084). [#46282](https://github.com/ClickHouse/ClickHouse/pull/46282) ([Anton Popov](https://github.com/CurtizJ)).
#### Build/Testing/Packaging Improvement
* Backported in [#46112](https://github.com/ClickHouse/ClickHouse/issues/46112): Remove the dependency on the `adduser` tool from the packages, because we don't use it. This fixes [#44934](https://github.com/ClickHouse/ClickHouse/issues/44934). [#45011](https://github.com/ClickHouse/ClickHouse/pull/45011) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46482](https://github.com/ClickHouse/ClickHouse/issues/46482): Get rid of unnecessary build for standalone clickhouse-keeper. [#46367](https://github.com/ClickHouse/ClickHouse/pull/46367) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#46505](https://github.com/ClickHouse/ClickHouse/issues/46505): Some time ago the ccache compression was changed to `zst`, but `gz` archives were downloaded by default. This is fixed by prioritizing the `zst` archive. [#46490](https://github.com/ClickHouse/ClickHouse/pull/46490) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#45908](https://github.com/ClickHouse/ClickHouse/issues/45908): Fixed a bug with a non-parsable default value for an EPHEMERAL column in table metadata. [#44026](https://github.com/ClickHouse/ClickHouse/pull/44026) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#46238](https://github.com/ClickHouse/ClickHouse/issues/46238): A couple of segfaults have been reported around `c-ares`. All of the recent stack traces observed fail on inserting into `std::unordered_set<>`. The root cause seems to be unprocessed queries. Prior to this PR, CH called `poll` to wait on the file descriptors in the `c-ares` channel. According to the [poll docs](https://man7.org/linux/man-pages/man2/poll.2.html), a negative return value means an error has occurred. Because of this, we would abort the execution and return failure. The problem is that `poll` also returns a negative value if a system interrupt occurs. A system interrupt does not mean the processing has failed or ended, but we would abort it anyway because we were checking for negative values. Once the execution is aborted, the whole stack is destroyed, which includes the `std::unordered_set<std::string>` passed to the `void *` parameter of the c-ares callback. Once c-ares completed the request, the callback would be invoked and would access an invalid memory address, causing a segfault. [#45629](https://github.com/ClickHouse/ClickHouse/pull/45629) ([Arthur Passos](https://github.com/arthurpassos)).
* Backported in [#45727](https://github.com/ClickHouse/ClickHouse/issues/45727): Fix key description when encountering duplicate primary keys. This can happen in projections. See [#45590](https://github.com/ClickHouse/ClickHouse/issues/45590) for details. [#45686](https://github.com/ClickHouse/ClickHouse/pull/45686) ([Amos Bird](https://github.com/amosbird)).
* Backported in [#46394](https://github.com/ClickHouse/ClickHouse/issues/46394): Fix `SYSTEM UNFREEZE` queries failing with the exception `CANNOT_PARSE_INPUT_ASSERTION_FAILED`. [#46325](https://github.com/ClickHouse/ClickHouse/pull/46325) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Backported in [#46442](https://github.com/ClickHouse/ClickHouse/issues/46442): Fix possible `LOGICAL_ERROR` in asynchronous inserts with invalid data sent in format `VALUES`. [#46350](https://github.com/ClickHouse/ClickHouse/pull/46350) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#46674](https://github.com/ClickHouse/ClickHouse/issues/46674): Fix an invalid processing of constant `LowCardinality` argument in function `arrayMap`. This bug could lead to a segfault in release, and logical error `Bad cast` in debug build. [#46569](https://github.com/ClickHouse/ClickHouse/pull/46569) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46879](https://github.com/ClickHouse/ClickHouse/issues/46879): Fix MSan report in the `maxIntersections` function. This closes [#43126](https://github.com/ClickHouse/ClickHouse/issues/43126). [#46847](https://github.com/ClickHouse/ClickHouse/pull/46847) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46871](https://github.com/ClickHouse/ClickHouse/issues/46871): Fix a bug in the `Map` data type. This closes [#46855](https://github.com/ClickHouse/ClickHouse/issues/46855). [#46856](https://github.com/ClickHouse/ClickHouse/pull/46856) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Another attempt to fix automerge, or at least to have debug footprint [#45476](https://github.com/ClickHouse/ClickHouse/pull/45476) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add check for running workflows to merge_pr.py [#45803](https://github.com/ClickHouse/ClickHouse/pull/45803) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Get rid of progress timestamps in release publishing [#45818](https://github.com/ClickHouse/ClickHouse/pull/45818) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add necessary dependency for sanitizers [#45959](https://github.com/ClickHouse/ClickHouse/pull/45959) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add helping logging to auto-merge script [#46080](https://github.com/ClickHouse/ClickHouse/pull/46080) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix write buffer destruction order for vertical merge. [#46205](https://github.com/ClickHouse/ClickHouse/pull/46205) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Get rid of legacy DocsReleaseChecks [#46665](https://github.com/ClickHouse/ClickHouse/pull/46665) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).


@ -0,0 +1,47 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.1.4.58-stable (9ed562163a5) FIXME as compared to v23.1.3.5-stable (548b494bcce)
#### Performance Improvement
* Backported in [#46380](https://github.com/ClickHouse/ClickHouse/issues/46380): Fix too big memory usage for vertical merges on non-remote disk. Respect `max_insert_delayed_streams_for_parallel_write` for the remote disk. [#46275](https://github.com/ClickHouse/ClickHouse/pull/46275) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### Improvement
* Backported in [#46985](https://github.com/ClickHouse/ClickHouse/issues/46985): Apply `ALTER TABLE table_name ON CLUSTER cluster MOVE PARTITION|PART partition_expr TO DISK|VOLUME 'disk_name'` to all replicas, because `ALTER TABLE t MOVE` is not replicated. [#46402](https://github.com/ClickHouse/ClickHouse/pull/46402) ([lizhuoyu5](https://github.com/lzydmxy)).
* Backported in [#46778](https://github.com/ClickHouse/ClickHouse/issues/46778): Backward compatibility for T64 codec support for IPv4. [#46747](https://github.com/ClickHouse/ClickHouse/pull/46747) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#47020](https://github.com/ClickHouse/ClickHouse/issues/47020): Allow IPv4 in range(). [#46995](https://github.com/ClickHouse/ClickHouse/pull/46995) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
#### Build/Testing/Packaging Improvement
* Backported in [#46031](https://github.com/ClickHouse/ClickHouse/issues/46031): Add systemd.service file for clickhouse-keeper. Fixes [#44293](https://github.com/ClickHouse/ClickHouse/issues/44293). [#45568](https://github.com/ClickHouse/ClickHouse/pull/45568) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#46477](https://github.com/ClickHouse/ClickHouse/issues/46477): Get rid of unnecessary build for standalone clickhouse-keeper. [#46367](https://github.com/ClickHouse/ClickHouse/pull/46367) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#46511](https://github.com/ClickHouse/ClickHouse/issues/46511): Some time ago the ccache compression was changed to `zst`, but `gz` archives were downloaded by default. This is fixed by prioritizing the `zst` archive. [#46490](https://github.com/ClickHouse/ClickHouse/pull/46490) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#46228](https://github.com/ClickHouse/ClickHouse/issues/46228): A couple of segfaults have been reported around `c-ares`. All of the recent stack traces observed fail on inserting into `std::unordered_set<>`. The root cause seems to be unprocessed queries. Prior to this PR, CH called `poll` to wait on the file descriptors in the `c-ares` channel. According to the [poll docs](https://man7.org/linux/man-pages/man2/poll.2.html), a negative return value means an error has occurred. Because of this, we would abort the execution and return failure. The problem is that `poll` also returns a negative value if a system interrupt occurs. A system interrupt does not mean the processing has failed or ended, but we would abort it anyway because we were checking for negative values. Once the execution is aborted, the whole stack is destroyed, which includes the `std::unordered_set<std::string>` passed to the `void *` parameter of the c-ares callback. Once c-ares completed the request, the callback would be invoked and would access an invalid memory address, causing a segfault. [#45629](https://github.com/ClickHouse/ClickHouse/pull/45629) ([Arthur Passos](https://github.com/arthurpassos)).
* Backported in [#46967](https://github.com/ClickHouse/ClickHouse/issues/46967): Backward compatibility: allow an implicit narrowing conversion from UInt64 to IPv4, required for `INSERT ... VALUES ...` expressions. [#45865](https://github.com/ClickHouse/ClickHouse/pull/45865) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#46220](https://github.com/ClickHouse/ClickHouse/issues/46220): Fix reading of non-existing nested columns with multiple levels in compact parts. [#46045](https://github.com/ClickHouse/ClickHouse/pull/46045) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#46751](https://github.com/ClickHouse/ClickHouse/issues/46751): Follow-up fix for replacing the domain IP types (IPv4, IPv6) with native ones, see https://github.com/ClickHouse/ClickHouse/pull/43221. [#46087](https://github.com/ClickHouse/ClickHouse/pull/46087) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#46448](https://github.com/ClickHouse/ClickHouse/issues/46448): Fix possible `LOGICAL_ERROR` in asynchronous inserts with invalid data sent in format `VALUES`. [#46350](https://github.com/ClickHouse/ClickHouse/pull/46350) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#46680](https://github.com/ClickHouse/ClickHouse/issues/46680): Fix an invalid processing of constant `LowCardinality` argument in function `arrayMap`. This bug could lead to a segfault in release, and logical error `Bad cast` in debug build. [#46569](https://github.com/ClickHouse/ClickHouse/pull/46569) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46873](https://github.com/ClickHouse/ClickHouse/issues/46873): Fix a bug in the `Map` data type. This closes [#46855](https://github.com/ClickHouse/ClickHouse/issues/46855). [#46856](https://github.com/ClickHouse/ClickHouse/pull/46856) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46956](https://github.com/ClickHouse/ClickHouse/issues/46956): Fix result of LIKE predicates which translate to substring searches and contain quoted non-LIKE metacharacters. [#46875](https://github.com/ClickHouse/ClickHouse/pull/46875) ([Robert Schulze](https://github.com/rschu1ze)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Another attempt to fix automerge, or at least to have debug footprint [#45476](https://github.com/ClickHouse/ClickHouse/pull/45476) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Support DELETE ON CLUSTER [#45786](https://github.com/ClickHouse/ClickHouse/pull/45786) ([Alexander Gololobov](https://github.com/davenger)).
* Add check for running workflows to merge_pr.py [#45803](https://github.com/ClickHouse/ClickHouse/pull/45803) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add necessary dependency for sanitizers [#45959](https://github.com/ClickHouse/ClickHouse/pull/45959) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add helping logging to auto-merge script [#46080](https://github.com/ClickHouse/ClickHouse/pull/46080) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix write buffer destruction order for vertical merge. [#46205](https://github.com/ClickHouse/ClickHouse/pull/46205) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Wait for background tasks in ~UploadHelper [#46334](https://github.com/ClickHouse/ClickHouse/pull/46334) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Improve install_check.py [#46458](https://github.com/ClickHouse/ClickHouse/pull/46458) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix dependencies for InstallPackagesTestAarch64 [#46597](https://github.com/ClickHouse/ClickHouse/pull/46597) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Get rid of legacy DocsReleaseChecks [#46665](https://github.com/ClickHouse/ClickHouse/pull/46665) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Reduce updates of Mergeable Check [#46781](https://github.com/ClickHouse/ClickHouse/pull/46781) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).


@ -0,0 +1,30 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.2.2.20-stable (f6c269c8df2) FIXME as compared to v23.2.1.2537-stable (52bf836e03a)
#### Improvement
* Backported in [#46914](https://github.com/ClickHouse/ClickHouse/issues/46914): Allow PREWHERE for Merge tables with a different DEFAULT expression for a column. [#46831](https://github.com/ClickHouse/ClickHouse/pull/46831) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#47022](https://github.com/ClickHouse/ClickHouse/issues/47022): Allow IPv4 in range(). [#46995](https://github.com/ClickHouse/ClickHouse/pull/46995) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
#### Bug Fix
* Backported in [#46828](https://github.com/ClickHouse/ClickHouse/issues/46828): A combined PREWHERE column accumulated from multiple PREWHERE steps in some cases didn't contain the 0's from previous steps. The fix is to apply the final filter if we know that it wasn't already applied at the last step. [#46785](https://github.com/ClickHouse/ClickHouse/pull/46785) ([Alexander Gololobov](https://github.com/davenger)).
#### Build/Testing/Packaging Improvement
* Backported in [#47062](https://github.com/ClickHouse/ClickHouse/issues/47062): Fix an error during server startup on old distros (e.g. Amazon Linux 2) and on ARM where glibc 2.28 symbols are not found. [#47008](https://github.com/ClickHouse/ClickHouse/pull/47008) ([Robert Schulze](https://github.com/rschu1ze)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#46895](https://github.com/ClickHouse/ClickHouse/issues/46895): Fixed a bug in automatic retries of the `DROP TABLE` query with `ReplicatedMergeTree` tables and `Atomic` databases. In rare cases it could lead to `Can't get data for node /zk_path/log_pointer` and `The specified key does not exist` errors if the ZooKeeper session expired during the DROP and a new replicated table with the same path in ZooKeeper was created in parallel. [#46384](https://github.com/ClickHouse/ClickHouse/pull/46384) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Backported in [#46865](https://github.com/ClickHouse/ClickHouse/issues/46865): Fix a bug in the `Map` data type. This closes [#46855](https://github.com/ClickHouse/ClickHouse/issues/46855). [#46856](https://github.com/ClickHouse/ClickHouse/pull/46856) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#46958](https://github.com/ClickHouse/ClickHouse/issues/46958): Fix result of LIKE predicates which translate to substring searches and contain quoted non-LIKE metacharacters. [#46875](https://github.com/ClickHouse/ClickHouse/pull/46875) ([Robert Schulze](https://github.com/rschu1ze)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* More concise logging at trace level for PREWHERE steps [#46771](https://github.com/ClickHouse/ClickHouse/pull/46771) ([Alexander Gololobov](https://github.com/davenger)).
* Reduce updates of Mergeable Check [#46781](https://github.com/ClickHouse/ClickHouse/pull/46781) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).


@ -172,7 +172,7 @@ Global thread pool is `GlobalThreadPool` singleton class. To allocate thread fro
The global pool is universal, and all pools described below are implemented on top of it. This can be thought of as a hierarchy of pools. Any specialized pool takes its threads from the global pool using the `ThreadPool` class. So the main purpose of any specialized pool is to apply a limit on the number of simultaneous jobs and to do job scheduling. If more jobs are scheduled than there are threads in a pool, `ThreadPool` accumulates jobs in a queue with priorities. Each job has an integer priority. The default priority is zero. All jobs with higher priority values are started before any job with a lower priority value. But there is no difference between already executing jobs, thus priority matters only when the pool is overloaded.
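A simplified, self-contained model of that queueing rule (an illustration only, not ClickHouse's actual `ThreadPool` API):

``` cpp
#include <functional>
#include <iostream>
#include <queue>
#include <vector>

struct PendingJob
{
    int priority = 0;
    std::function<void()> job;
};

struct ByPriority
{
    bool operator()(const PendingJob & lhs, const PendingJob & rhs) const
    {
        return lhs.priority < rhs.priority; // top() = highest priority value
    }
};

int main()
{
    // Jobs scheduled while all threads are busy wait in a priority queue.
    std::priority_queue<PendingJob, std::vector<PendingJob>, ByPriority> pending;
    pending.push({0, [] { std::cout << "default\n"; }});
    pending.push({10, [] { std::cout << "urgent\n"; }});

    // A freed worker thread takes the highest-priority pending job first;
    // already-running jobs are unaffected, so priority matters only under overload.
    while (!pending.empty())
    {
        pending.top().job();
        pending.pop();
    }
}
```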
IO thread pool is implemented as a plain `ThreadPool` accessible via `IOThreadPool::get()` method. It is configured in the same way as global pool with `max_io_thread_pool_size`, `max_io_thread_pool_free_size` and `io_thread_pool_queue_size` settings. The main purpose of IO thread pool is to avoid exhaustion of the global pool with IO jobs, which could prevent queries from fully utilizing CPU.
IO thread pool is implemented as a plain `ThreadPool` accessible via `IOThreadPool::get()` method. It is configured in the same way as global pool with `max_io_thread_pool_size`, `max_io_thread_pool_free_size` and `io_thread_pool_queue_size` settings. The main purpose of IO thread pool is to avoid exhaustion of the global pool with IO jobs, which could prevent queries from fully utilizing CPU. Backup to S3 performs a significant amount of IO operations, and to avoid impact on interactive queries there is a separate `BackupsIOThreadPool` configured with `max_backups_io_thread_pool_size`, `max_backups_io_thread_pool_free_size` and `backups_io_thread_pool_queue_size` settings.
For periodic task execution there is `BackgroundSchedulePool` class. You can register tasks using `BackgroundSchedulePool::TaskHolder` objects and the pool ensures that no task runs two jobs at the same time. It also allows you to postpone task execution to a specific instant in the future or temporarily deactivate task. Global `Context` provides a few instances of this class for different purposes. For general purpose tasks `Context::getSchedulePool()` is used.


@ -67,7 +67,7 @@ It generally means that the SSH keys for connecting to GitHub are missing. These
You can also clone the repository via https protocol:
git clone --recursive--shallow-submodules https://github.com/ClickHouse/ClickHouse.git
git clone --recursive --shallow-submodules https://github.com/ClickHouse/ClickHouse.git
This, however, will not let you send your changes to the server. You can still use it temporarily and add the SSH keys later, replacing the remote address of the repository with the `git remote` command.


@ -19,8 +19,8 @@ Kafka lets you:
``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
name1 [type1],
name2 [type2],
...
) ENGINE = Kafka()
SETTINGS
@ -113,6 +113,10 @@ Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format
</details>
:::info
The Kafka table engine doesn't support columns with a [default value](../../../sql-reference/statements/create/table.md#default_value). If you need columns with a default value, you can add them at the materialized view level (see the sketch below).
:::
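For illustration, a minimal sketch of that pattern (broker address, topic, and table names are hypothetical):

``` sql
-- The Kafka table itself carries only plain columns.
CREATE TABLE queue (value String)
ENGINE = Kafka('localhost:9092', 'topic', 'group1', 'JSONEachRow');

-- Columns with defaults live in the destination table.
CREATE TABLE dest
(
    value String,
    inserted_at DateTime DEFAULT now()
)
ENGINE = MergeTree
ORDER BY inserted_at;

-- The materialized view moves rows from the queue into the destination.
CREATE MATERIALIZED VIEW queue_mv TO dest AS
SELECT value FROM queue;
```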
## Description {#description}
The delivered messages are tracked automatically, so each message in a group is only counted once. If you want to get the data twice, then create a copy of the table with another group name.


@ -280,12 +280,20 @@ SELECT
## toIPv4OrDefault(string)
Same as `toIPv4`, but if the IPv4 address has an invalid format, it returns 0.
Same as `toIPv4`, but if the IPv4 address has an invalid format, it returns `0.0.0.0` (0 IPv4).
## toIPv4OrNull(string)
Same as `toIPv4`, but if the IPv4 address has an invalid format, it returns `NULL`.
## toIPv6OrDefault(string)
Same as `toIPv6`, but if the IPv6 address has an invalid format, it returns `::` (0 IPv6).
## toIPv6OrNull(string)
Same as `toIPv6`, but if the IPv6 address has an invalid format, it returns `NULL`.
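For illustration, invalid input behaves as follows across the four variants:

``` sql
SELECT
    toIPv4OrDefault('junk'),  -- 0.0.0.0
    toIPv4OrNull('junk'),     -- NULL
    toIPv6OrDefault('junk'),  -- ::
    toIPv6OrNull('junk');     -- NULL
```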
## toIPv6
Converts the string form of an IPv6 address to the [IPv6](../../sql-reference/data-types/domains/ipv6.md) type. If the IPv6 address has an invalid format, it returns an empty value.


@ -330,7 +330,7 @@ repeat(s, n)
**Arguments**
- `s` — The string to repeat. [String](../../sql-reference/data-types/string.md).
- `n` — The number of times to repeat the string. [UInt](../../sql-reference/data-types/int-uint.md).
- `n` — The number of times to repeat the string. [UInt or Int](../../sql-reference/data-types/int-uint.md).
**Returned value**
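A minimal call for illustration:

``` sql
SELECT repeat('ab', 3); -- 'ababab'
```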


@ -110,7 +110,7 @@ If the type is not `Nullable` and if `NULL` is specified, it will be treated as
See also [data_type_default_nullable](../../../operations/settings/settings.md#data_type_default_nullable) setting.
## Default Values
## Default Values {#default_values}
The column description can specify an expression for a default value, in one of the following ways: `DEFAULT expr`, `MATERIALIZED expr`, `ALIAS expr`.
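For illustration, a small sketch showing all three forms (table and column names are hypothetical):

``` sql
CREATE TABLE events
(
    id UInt64,
    created DateTime DEFAULT now(),              -- stored; computed if omitted on INSERT
    created_date Date MATERIALIZED toDate(created), -- always computed and stored
    created_text String ALIAS toString(created)     -- computed on read, not stored
)
ENGINE = MergeTree
ORDER BY id;
```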
@ -576,7 +576,7 @@ SELECT * FROM base.t1;
You can add a comment to the table when creating it.
:::note
The comment is supported for all table engines except [Kafka](../../../engines/table-engines/integrations/kafka.md), [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md) and [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md).
The comment clause is supported by all table engines except [Kafka](../../../engines/table-engines/integrations/kafka.md), [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md) and [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md).
:::


@ -3,6 +3,7 @@
#include <Core/NamesAndTypes.h>
#include <Analyzer/IQueryTreeNode.h>
#include <DataTypes/DataTypeNullable.h>
namespace DB
{
@ -117,6 +118,11 @@ public:
return column.type;
}
void convertToNullable() override
{
column.type = makeNullableSafe(column.type);
}
void dumpTreeImpl(WriteBuffer & buffer, FormatState & state, size_t indent) const override;
protected:


@ -99,7 +99,7 @@ void FunctionNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state
buffer << ", function_type: " << function_type;
if (function)
buffer << ", result_type: " + function->getResultType()->getName();
buffer << ", result_type: " + getResultType()->getName();
const auto & parameters = getParameters();
if (!parameters.getNodes().empty())
@ -177,6 +177,7 @@ QueryTreeNodePtr FunctionNode::cloneImpl() const
*/
result_function->function = function;
result_function->kind = kind;
result_function->wrap_with_nullable = wrap_with_nullable;
return result_function;
}


@ -8,6 +8,7 @@
#include <Common/typeid_cast.h>
#include <Core/ColumnsWithTypeAndName.h>
#include <Core/IResolvedFunction.h>
#include <DataTypes/DataTypeNullable.h>
#include <Functions/IFunction.h>
namespace DB
@ -187,7 +188,16 @@ public:
throw Exception(ErrorCodes::UNSUPPORTED_METHOD,
"Function node with name '{}' is not resolved",
function_name);
return function->getResultType();
auto type = function->getResultType();
if (wrap_with_nullable)
return makeNullableSafe(type);
return type;
}
void convertToNullable() override
{
chassert(kind == FunctionKind::ORDINARY);
wrap_with_nullable = true;
}
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
@ -205,6 +215,7 @@ private:
String function_name;
FunctionKind kind = FunctionKind::UNKNOWN;
IResolvedFunctionPtr function;
bool wrap_with_nullable = false;
static constexpr size_t parameters_child_index = 0;
static constexpr size_t arguments_child_index = 1;


@ -90,6 +90,11 @@ public:
throw Exception(ErrorCodes::UNSUPPORTED_METHOD, "Method getResultType is not supported for {} query node", getNodeTypeName());
}
virtual void convertToNullable()
{
throw Exception(ErrorCodes::UNSUPPORTED_METHOD, "Method convertToNullable is not supported for {} query node", getNodeTypeName());
}
struct CompareOptions
{
bool compare_aliases = true;


@ -0,0 +1,237 @@
#include <Analyzer/Passes/LogicalExpressionOptimizerPass.h>
#include <Functions/FunctionFactory.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/ConstantNode.h>
#include <Analyzer/HashUtils.h>
#include <DataTypes/DataTypeString.h>
namespace DB
{
class LogicalExpressionOptimizerVisitor : public InDepthQueryTreeVisitorWithContext<LogicalExpressionOptimizerVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<LogicalExpressionOptimizerVisitor>;
explicit LogicalExpressionOptimizerVisitor(ContextPtr context)
: Base(std::move(context))
{}
void visitImpl(QueryTreeNodePtr & node)
{
auto * function_node = node->as<FunctionNode>();
if (!function_node)
return;
if (function_node->getFunctionName() == "or")
{
tryReplaceOrEqualsChainWithIn(node);
return;
}
if (function_node->getFunctionName() == "and")
{
tryReplaceAndEqualsChainsWithConstant(node);
return;
}
}
private:
void tryReplaceAndEqualsChainsWithConstant(QueryTreeNodePtr & node)
{
auto & function_node = node->as<FunctionNode &>();
assert(function_node.getFunctionName() == "and");
if (function_node.getResultType()->isNullable())
return;
QueryTreeNodes and_operands;
QueryTreeNodePtrWithHashMap<const ConstantNode *> node_to_constants;
for (const auto & argument : function_node.getArguments())
{
auto * argument_function = argument->as<FunctionNode>();
if (!argument_function || argument_function->getFunctionName() != "equals")
{
and_operands.push_back(argument);
continue;
}
const auto & equals_arguments = argument_function->getArguments().getNodes();
const auto & lhs = equals_arguments[0];
const auto & rhs = equals_arguments[1];
const auto has_and_with_different_constant = [&](const QueryTreeNodePtr & expression, const ConstantNode * constant)
{
if (auto it = node_to_constants.find(expression); it != node_to_constants.end())
{
if (!it->second->isEqual(*constant))
return true;
}
else
{
node_to_constants.emplace(expression, constant);
and_operands.push_back(argument);
}
return false;
};
bool collapse_to_false = false;
if (const auto * lhs_literal = lhs->as<ConstantNode>())
collapse_to_false = has_and_with_different_constant(rhs, lhs_literal);
else if (const auto * rhs_literal = rhs->as<ConstantNode>())
collapse_to_false = has_and_with_different_constant(lhs, rhs_literal);
else
and_operands.push_back(argument);
if (collapse_to_false)
{
auto false_value = std::make_shared<ConstantValue>(0u, function_node.getResultType());
auto false_node = std::make_shared<ConstantNode>(std::move(false_value));
node = std::move(false_node);
return;
}
}
if (and_operands.size() == 1)
{
/// The AND operator can have UInt8 or bool as its type.
/// bool is used if at least one operand is a bool constant.
/// Because we reduce the number of operands here by eliminating identical equality checks,
/// the only way we can end up here is an AND chain where all the equality checks are the same, so we know the type is UInt8.
/// Otherwise, we will have more than one operand and don't have to do anything.
assert(!function_node.getResultType()->isNullable() && and_operands[0]->getResultType()->equals(*function_node.getResultType()));
node = std::move(and_operands[0]);
return;
}
auto and_function_resolver = FunctionFactory::instance().get("and", getContext());
function_node.getArguments().getNodes() = std::move(and_operands);
function_node.resolveAsFunction(and_function_resolver);
}
void tryReplaceOrEqualsChainWithIn(QueryTreeNodePtr & node)
{
auto & function_node = node->as<FunctionNode &>();
assert(function_node.getFunctionName() == "or");
QueryTreeNodes or_operands;
QueryTreeNodePtrWithHashMap<QueryTreeNodes> node_to_equals_functions;
QueryTreeNodePtrWithHashMap<QueryTreeNodeConstRawPtrWithHashSet> node_to_constants;
for (const auto & argument : function_node.getArguments())
{
auto * argument_function = argument->as<FunctionNode>();
if (!argument_function || argument_function->getFunctionName() != "equals")
{
or_operands.push_back(argument);
continue;
}
/// collect all equality checks (x = value)
const auto & equals_arguments = argument_function->getArguments().getNodes();
const auto & lhs = equals_arguments[0];
const auto & rhs = equals_arguments[1];
const auto add_equals_function_if_not_present = [&](const auto & expression_node, const ConstantNode * constant)
{
auto & constant_set = node_to_constants[expression_node];
if (!constant_set.contains(constant))
{
constant_set.insert(constant);
node_to_equals_functions[expression_node].push_back(argument);
}
};
if (const auto * lhs_literal = lhs->as<ConstantNode>())
add_equals_function_if_not_present(rhs, lhs_literal);
else if (const auto * rhs_literal = rhs->as<ConstantNode>())
add_equals_function_if_not_present(lhs, rhs_literal);
else
or_operands.push_back(argument);
}
auto in_function_resolver = FunctionFactory::instance().get("in", getContext());
for (auto & [expression, equals_functions] : node_to_equals_functions)
{
const auto & settings = getSettings();
if (equals_functions.size() < settings.optimize_min_equality_disjunction_chain_length && !expression.node->getResultType()->lowCardinality())
{
std::move(equals_functions.begin(), equals_functions.end(), std::back_inserter(or_operands));
continue;
}
Tuple args;
args.reserve(equals_functions.size());
/// first we create tuple from RHS of equals functions
for (const auto & equals : equals_functions)
{
const auto * equals_function = equals->as<FunctionNode>();
assert(equals_function && equals_function->getFunctionName() == "equals");
const auto & equals_arguments = equals_function->getArguments().getNodes();
if (const auto * rhs_literal = equals_arguments[1]->as<ConstantNode>())
{
args.push_back(rhs_literal->getValue());
}
else
{
const auto * lhs_literal = equals_arguments[0]->as<ConstantNode>();
assert(lhs_literal);
args.push_back(lhs_literal->getValue());
}
}
auto rhs_node = std::make_shared<ConstantNode>(std::move(args));
auto in_function = std::make_shared<FunctionNode>("in");
QueryTreeNodes in_arguments;
in_arguments.reserve(2);
in_arguments.push_back(expression.node);
in_arguments.push_back(std::move(rhs_node));
in_function->getArguments().getNodes() = std::move(in_arguments);
in_function->resolveAsFunction(in_function_resolver);
or_operands.push_back(std::move(in_function));
}
if (or_operands.size() == 1)
{
/// if the result type of operand is the same as the result type of OR
/// we can replace OR with the operand
if (or_operands[0]->getResultType()->equals(*function_node.getResultType()))
{
assert(!function_node.getResultType()->isNullable());
node = std::move(or_operands[0]);
return;
}
/// otherwise add a stub 0 to make OR correct
or_operands.push_back(std::make_shared<ConstantNode>(static_cast<UInt8>(0)));
}
auto or_function_resolver = FunctionFactory::instance().get("or", getContext());
function_node.getArguments().getNodes() = std::move(or_operands);
function_node.resolveAsFunction(or_function_resolver);
}
};
void LogicalExpressionOptimizerPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
LogicalExpressionOptimizerVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}
}


@ -0,0 +1,82 @@
#pragma once
#include <Analyzer/IQueryTreePass.h>
namespace DB
{
/**
* This pass tries to do optimizations on logical expressions:
*
* 1. Replaces chains of equality functions inside an OR with a single IN operator.
* The replacement is done if:
* - one of the operands of the equality function is a constant
* - the chain is at least 'optimize_min_equality_disjunction_chain_length' equality checks long OR the expression has a LowCardinality type
*
* E.g. (optimize_min_equality_disjunction_chain_length = 2)
* -------------------------------
* SELECT *
* FROM table
* WHERE a = 1 OR b = 'test' OR a = 2;
*
* will be transformed into
*
* SELECT *
* FROM TABLE
* WHERE b = 'test' OR a IN (1, 2);
* -------------------------------
*
* 2. Removes duplicate OR checks
* -------------------------------
* SELECT *
* FROM table
* WHERE a = 1 OR b = 'test' OR a = 1;
*
* will be transformed into
*
* SELECT *
* FROM TABLE
* WHERE a = 1 OR b = 'test';
* -------------------------------
*
* 3. Replaces AND chains with a single constant.
* The replacement is done if:
* - one of the operands of the equality function is a constant
* - the constants are different for the same expression
* -------------------------------
* SELECT *
* FROM table
* WHERE a = 1 AND b = 'test' AND a = 2;
*
* will be transformed into
*
* SELECT *
* FROM TABLE
* WHERE 0;
* -------------------------------
*
* 4. Removes duplicate AND checks
* -------------------------------
* SELECT *
* FROM table
* WHERE a = 1 AND b = 'test' AND a = 1;
*
* will be transformed into
*
* SELECT *
* FROM TABLE
* WHERE a = 1 AND b = 'test';
* -------------------------------
*/
class LogicalExpressionOptimizerPass final : public IQueryTreePass
{
public:
String getName() override { return "LogicalExpressionOptimizer"; }
String getDescription() override { return "Transform equality chain to a single IN function or a constant if possible"; }
void run(QueryTreeNodePtr query_tree_node, ContextPtr context) override;
};
}


@ -199,7 +199,6 @@ namespace ErrorCodes
* TODO: SELECT (compound_expression).*, (compound_expression).COLUMNS are not supported on parser level.
* TODO: SELECT a.b.c.*, a.b.c.COLUMNS. Qualified matcher where identifier size is greater than 2 are not supported on parser level.
* TODO: Support function identifier resolve from parent query scope, if lambda in parent scope does not capture any columns.
* TODO: Support group_by_use_nulls.
* TODO: Scalar subqueries cache.
*/
@ -472,6 +471,12 @@ public:
alias_name_to_expressions[node_alias].push_back(node);
}
if (const auto * function = node->as<FunctionNode>())
{
if (AggregateFunctionFactory::instance().isAggregateFunctionName(function->getFunctionName()))
++aggregate_functions_counter;
}
expressions.emplace_back(node);
}
@ -490,6 +495,12 @@ public:
alias_name_to_expressions.erase(it);
}
if (const auto * function = top_expression->as<FunctionNode>())
{
if (AggregateFunctionFactory::instance().isAggregateFunctionName(function->getFunctionName()))
--aggregate_functions_counter;
}
expressions.pop_back();
}
@ -508,6 +519,11 @@ public:
return alias_name_to_expressions.contains(alias);
}
bool hasAggregateFunction() const
{
return aggregate_functions_counter > 0;
}
QueryTreeNodePtr getExpressionWithAlias(const std::string & alias) const
{
auto expression_it = alias_name_to_expressions.find(alias);
@ -554,6 +570,7 @@ public:
private:
QueryTreeNodes expressions;
size_t aggregate_functions_counter = 0;
std::unordered_map<std::string, QueryTreeNodes> alias_name_to_expressions;
};
@ -686,7 +703,11 @@ struct IdentifierResolveScope
if (auto * union_node = scope_node->as<UnionNode>())
context = union_node->getContext();
else if (auto * query_node = scope_node->as<QueryNode>())
{
context = query_node->getContext();
group_by_use_nulls = context->getSettingsRef().group_by_use_nulls &&
(query_node->isGroupByWithGroupingSets() || query_node->isGroupByWithRollup() || query_node->isGroupByWithCube());
}
}
QueryTreeNodePtr scope_node;
@ -734,9 +755,14 @@ struct IdentifierResolveScope
/// Table expression node to data
std::unordered_map<QueryTreeNodePtr, TableExpressionData> table_expression_node_to_data;
QueryTreeNodePtrWithHashSet nullable_group_by_keys;
/// Use identifier lookup to result cache
bool use_identifier_lookup_to_result_cache = true;
/// Apply nullability to aggregation keys
bool group_by_use_nulls = false;
/// JOINs count
size_t joins_count = 0;
@ -5407,10 +5433,18 @@ ProjectionNames QueryAnalyzer::resolveExpressionNode(QueryTreeNodePtr & node, Id
}
}
if (node
&& scope.nullable_group_by_keys.contains(node)
&& !scope.expressions_in_resolve_process_stack.hasAggregateFunction())
{
node = node->clone();
node->convertToNullable();
}
/** Update aliases after expression node was resolved.
* Do not update node in alias table if we resolve it for duplicate alias.
*/
if (!node_alias.empty() && use_alias_table)
if (!node_alias.empty() && use_alias_table && !scope.group_by_use_nulls)
{
auto it = scope.alias_name_to_expression_node.find(node_alias);
if (it != scope.alias_name_to_expression_node.end())
@ -6418,9 +6452,6 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
auto & query_node_typed = query_node->as<QueryNode &>();
const auto & settings = scope.context->getSettingsRef();
if (settings.group_by_use_nulls)
throw Exception(ErrorCodes::UNSUPPORTED_METHOD, "GROUP BY use nulls is not supported");
bool is_rollup_or_cube = query_node_typed.isGroupByWithRollup() || query_node_typed.isGroupByWithCube();
if (query_node_typed.isGroupByWithGroupingSets() && query_node_typed.isGroupByWithTotals())
@ -6556,16 +6587,11 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
resolveQueryJoinTreeNode(query_node_typed.getJoinTree(), scope, visitor);
}
scope.use_identifier_lookup_to_result_cache = true;
if (!scope.group_by_use_nulls)
scope.use_identifier_lookup_to_result_cache = true;
/// Resolve query node sections.
auto projection_columns = resolveProjectionExpressionNodeList(query_node_typed.getProjectionNode(), scope);
if (query_node_typed.getProjection().getNodes().empty())
throw Exception(ErrorCodes::EMPTY_LIST_OF_COLUMNS_QUERIED,
"Empty list of columns in projection. In scope {}",
scope.scope_node->formatASTForErrorMessage());
if (query_node_typed.hasWith())
resolveExpressionNodeList(query_node_typed.getWithNode(), scope, true /*allow_lambda_expression*/, false /*allow_table_expression*/);
@ -6586,6 +6612,15 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
resolveExpressionNodeList(grouping_sets_keys_list_node, scope, false /*allow_lambda_expression*/, false /*allow_table_expression*/);
}
if (scope.group_by_use_nulls)
{
for (const auto & grouping_set : query_node_typed.getGroupBy().getNodes())
{
for (const auto & group_by_elem : grouping_set->as<ListNode>()->getNodes())
scope.nullable_group_by_keys.insert(group_by_elem);
}
}
}
else
{
@ -6593,6 +6628,12 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
replaceNodesWithPositionalArguments(query_node_typed.getGroupByNode(), query_node_typed.getProjection().getNodes(), scope);
resolveExpressionNodeList(query_node_typed.getGroupByNode(), scope, false /*allow_lambda_expression*/, false /*allow_table_expression*/);
if (scope.group_by_use_nulls)
{
for (const auto & group_by_elem : query_node_typed.getGroupBy().getNodes())
scope.nullable_group_by_keys.insert(group_by_elem);
}
}
}
@ -6645,6 +6686,12 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
convertLimitOffsetExpression(query_node_typed.getOffset(), "OFFSET", scope);
}
auto projection_columns = resolveProjectionExpressionNodeList(query_node_typed.getProjectionNode(), scope);
if (query_node_typed.getProjection().getNodes().empty())
throw Exception(ErrorCodes::EMPTY_LIST_OF_COLUMNS_QUERIED,
"Empty list of columns in projection. In scope {}",
scope.scope_node->formatASTForErrorMessage());
/** Resolve nodes with duplicate aliases.
* Table expressions cannot have duplicate aliases.
*
@ -6708,7 +6755,7 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
"ARRAY JOIN",
"in PREWHERE");
validateAggregates(query_node);
validateAggregates(query_node, { .group_by_use_nulls = scope.group_by_use_nulls });
/** WITH section can be safely removed, because WITH section only can provide aliases to query expressions
* and CTE for other sections to use.
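The user-visible effect of the `group_by_use_nulls` support added here can be sketched with a small query (an illustration of the semantics described above: aggregation keys become Nullable and the ROLLUP/CUBE totals rows use NULL):

``` sql
SET group_by_use_nulls = 1;

SELECT number % 2 AS k, count() AS c
FROM numbers(4)
GROUP BY ROLLUP(k);

-- With the setting enabled, the totals row reports k = NULL
-- instead of a default value such as 0.
```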


@ -38,6 +38,7 @@
#include <Analyzer/Passes/AutoFinalOnQueryPass.h>
#include <Analyzer/Passes/ArrayExistsToHasPass.h>
#include <Analyzer/Passes/ComparisonTupleEliminationPass.h>
#include <Analyzer/Passes/LogicalExpressionOptimizerPass.h>
#include <Analyzer/Passes/CrossToInnerJoinPass.h>
#include <Analyzer/Passes/ShardNumColumnToFunctionPass.h>
@ -147,7 +148,6 @@ private:
/** ClickHouse query tree pass manager.
*
* TODO: Support logical expressions optimizer.
* TODO: Support setting convert_query_to_cnf.
* TODO: Support setting optimize_using_constraints.
* TODO: Support setting optimize_substitute_columns.
@ -262,6 +262,8 @@ void addQueryTreePasses(QueryTreePassManager & manager)
manager.addPass(std::make_unique<ConvertOrLikeChainPass>());
manager.addPass(std::make_unique<LogicalExpressionOptimizerPass>());
manager.addPass(std::make_unique<GroupingFunctionsResolvePass>());
manager.addPass(std::make_unique<AutoFinalOnQueryPass>());
manager.addPass(std::make_unique<CrossToInnerJoinPass>());


@ -105,7 +105,7 @@ private:
const QueryTreeNodePtr & query_node;
};
void validateAggregates(const QueryTreeNodePtr & query_node)
void validateAggregates(const QueryTreeNodePtr & query_node, ValidationParams params)
{
const auto & query_node_typed = query_node->as<QueryNode &>();
auto join_tree_node_type = query_node_typed.getJoinTree()->getNodeType();
@ -182,7 +182,9 @@ void validateAggregates(const QueryTreeNodePtr & query_node)
if (grouping_set_key->as<ConstantNode>())
continue;
group_by_keys_nodes.push_back(grouping_set_key);
group_by_keys_nodes.push_back(grouping_set_key->clone());
if (params.group_by_use_nulls)
group_by_keys_nodes.back()->convertToNullable();
}
}
else
@ -190,7 +192,9 @@ void validateAggregates(const QueryTreeNodePtr & query_node)
if (node->as<ConstantNode>())
continue;
group_by_keys_nodes.push_back(node);
group_by_keys_nodes.push_back(node->clone());
if (params.group_by_use_nulls)
group_by_keys_nodes.back()->convertToNullable();
}
}


@ -5,6 +5,11 @@
namespace DB
{
struct ValidationParams
{
bool group_by_use_nulls;
};
/** Validate aggregates in query node.
*
* 1. Check that there are no aggregate functions and GROUPING function in JOIN TREE, WHERE, PREWHERE, in another aggregate functions.
@ -15,7 +20,7 @@ namespace DB
* PROJECTION.
* 5. Throws exception if there is GROUPING SETS or ROLLUP or CUBE or WITH TOTALS without aggregation.
*/
void validateAggregates(const QueryTreeNodePtr & query_node);
void validateAggregates(const QueryTreeNodePtr & query_node, ValidationParams params);
/** Assert that there are no function nodes with specified function name in node children.
* Do not visit subqueries.


@ -273,7 +273,7 @@ void DatabaseAtomic::renameTable(ContextPtr local_context, const String & table_
else
renameNoReplace(old_metadata_path, new_metadata_path);
/// After metadata was successfully moved, the following methods should not throw (if them do, it's a logical error)
/// After metadata was successfully moved, the following methods should not throw (if they do, it's a logical error)
table_data_path = detach(*this, table_name, table->storesDataOnDisk());
if (exchange)
other_table_data_path = detach(other_db, to_table_name, other_table->storesDataOnDisk());


@ -6,9 +6,13 @@
#include <Disks/DiskSelector.h>
#include <Parsers/formatAST.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/isDiskFunction.h>
#include <Interpreters/Context.h>
#include <Parsers/IAST.h>
#include <Interpreters/InDepthNodeVisitor.h>
namespace DB
{
@ -18,43 +22,85 @@ namespace ErrorCodes
extern const int BAD_ARGUMENTS;
}
std::string getOrCreateDiskFromDiskAST(const ASTFunction & function, ContextPtr context)
namespace
{
/// We need a unique name for a created custom disk, but it needs to be the same
/// after table is reattached or server is restarted, so take a hash of the disk
/// configuration serialized ast as a disk name suffix.
auto disk_setting_string = serializeAST(function, true);
auto disk_name = DiskSelector::TMP_INTERNAL_DISK_PREFIX
+ toString(sipHash128(disk_setting_string.data(), disk_setting_string.size()));
auto result_disk = context->getOrCreateDisk(disk_name, [&](const DisksMap & disks_map) -> DiskPtr {
const auto * function_args_expr = assert_cast<const ASTExpressionList *>(function.arguments.get());
const auto & function_args = function_args_expr->children;
auto config = getDiskConfigurationFromAST(disk_name, function_args, context);
auto disk = DiskFactory::instance().create(disk_name, *config, disk_name, context, disks_map);
/// Mark that disk can be used without storage policy.
disk->markDiskAsCustom();
return disk;
});
if (!result_disk->isRemote())
std::string getOrCreateDiskFromDiskAST(const ASTFunction & function, ContextPtr context)
{
static constexpr auto custom_disks_base_dir_in_config = "custom_local_disks_base_directory";
auto disk_path_expected_prefix = context->getConfigRef().getString(custom_disks_base_dir_in_config, "");
/// We need a unique name for a created custom disk, but it needs to be the same
/// after the table is reattached or the server is restarted, so take a hash of the
/// serialized disk configuration AST as a disk name suffix.
auto disk_setting_string = serializeAST(function, true);
auto disk_name = DiskSelector::TMP_INTERNAL_DISK_PREFIX
+ toString(sipHash128(disk_setting_string.data(), disk_setting_string.size()));
if (disk_path_expected_prefix.empty())
throw Exception(
ErrorCodes::BAD_ARGUMENTS,
"Base path for custom local disks must be defined in config file by `{}`",
custom_disks_base_dir_in_config);
auto result_disk = context->getOrCreateDisk(disk_name, [&](const DisksMap & disks_map) -> DiskPtr {
const auto * function_args_expr = assert_cast<const ASTExpressionList *>(function.arguments.get());
const auto & function_args = function_args_expr->children;
auto config = getDiskConfigurationFromAST(disk_name, function_args, context);
auto disk = DiskFactory::instance().create(disk_name, *config, disk_name, context, disks_map);
/// Mark that disk can be used without storage policy.
disk->markDiskAsCustom();
return disk;
});
if (!pathStartsWith(result_disk->getPath(), disk_path_expected_prefix))
throw Exception(
ErrorCodes::BAD_ARGUMENTS,
"Path of the custom local disk must be inside `{}` directory",
disk_path_expected_prefix);
if (!result_disk->isRemote())
{
static constexpr auto custom_disks_base_dir_in_config = "custom_local_disks_base_directory";
auto disk_path_expected_prefix = context->getConfigRef().getString(custom_disks_base_dir_in_config, "");
if (disk_path_expected_prefix.empty())
throw Exception(
ErrorCodes::BAD_ARGUMENTS,
"Base path for custom local disks must be defined in config file by `{}`",
custom_disks_base_dir_in_config);
if (!pathStartsWith(result_disk->getPath(), disk_path_expected_prefix))
throw Exception(
ErrorCodes::BAD_ARGUMENTS,
"Path of the custom local disk must be inside `{}` directory",
disk_path_expected_prefix);
}
return disk_name;
}
class DiskConfigurationFlattener
{
public:
struct Data
{
ContextPtr context;
};
static bool needChildVisit(const ASTPtr &, const ASTPtr &) { return true; }
static void visit(ASTPtr & ast, Data & data)
{
if (isDiskFunction(ast))
{
auto disk_name = getOrCreateDiskFromDiskAST(*ast->as<ASTFunction>(), data.context);
ast = std::make_shared<ASTLiteral>(disk_name);
}
}
};
/// Visits children first.
using FlattenDiskConfigurationVisitor = InDepthNodeVisitor<DiskConfigurationFlattener, false>;
}
std::string getOrCreateDiskFromDiskAST(const ASTPtr & disk_function, ContextPtr context)
{
if (!isDiskFunction(disk_function))
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Expected a disk function");
auto ast = disk_function->clone();
FlattenDiskConfigurationVisitor::Data data{context};
FlattenDiskConfigurationVisitor{data}.visit(ast);
auto disk_name = assert_cast<const ASTLiteral &>(*ast).value.get<String>();
LOG_TRACE(&Poco::Logger::get("getOrCreateDiskFromDiskAST"), "Result disk name: {}", disk_name);
return disk_name;
}
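Schematically, the children-first visitor turns a nested definition into plain literals; the names below are illustrative (the real name is `DiskSelector::TMP_INTERNAL_DISK_PREFIX` plus a SipHash of the serialized configuration):

// Before flattening:
//   disk(type = cache, path = '...', disk = disk(type = s3, endpoint = '...'))
// After the inner call is visited, the outer function sees a literal:
//   disk(type = cache, path = '...', disk = '<prefix><hash-of-inner-config>')
// Finally the outer call collapses to its own generated disk name:
auto disk_name = getOrCreateDiskFromDiskAST(disk_function_ast, context);  // e.g. "<prefix><hash>"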

View File

@ -13,6 +13,6 @@ class ASTFunction;
* add it to DiskSelector by a unique (but always the same for a given configuration) disk name
* and return this name.
*/
std::string getOrCreateDiskFromDiskAST(const ASTFunction & function, ContextPtr context);
std::string getOrCreateDiskFromDiskAST(const ASTPtr & disk_function, ContextPtr context);
}

View File

@ -39,13 +39,15 @@ struct RepeatImpl
size, max_string_size);
}
template <typename T>
static void vectorStrConstRepeat(
const ColumnString::Chars & data,
const ColumnString::Offsets & offsets,
ColumnString::Chars & res_data,
ColumnString::Offsets & res_offsets,
UInt64 repeat_time)
T repeat_time)
{
repeat_time = repeat_time < 0 ? 0 : repeat_time;
checkRepeatTime(repeat_time);
UInt64 data_size = 0;
@ -77,7 +79,8 @@ struct RepeatImpl
res_offsets.assign(offsets);
for (UInt64 i = 0; i < col_num.size(); ++i)
{
size_t repeated_size = (offsets[i] - offsets[i - 1] - 1) * col_num[i] + 1;
T repeat_time = col_num[i] < 0 ? 0 : col_num[i];
size_t repeated_size = (offsets[i] - offsets[i - 1] - 1) * repeat_time + 1;
checkStringSize(repeated_size);
data_size += repeated_size;
res_offsets[i] = data_size;
@ -86,7 +89,7 @@ struct RepeatImpl
for (UInt64 i = 0; i < col_num.size(); ++i)
{
T repeat_time = col_num[i];
T repeat_time = col_num[i] < 0 ? 0 : col_num[i];
checkRepeatTime(repeat_time);
process(data.data() + offsets[i - 1], res_data.data() + res_offsets[i - 1], offsets[i] - offsets[i - 1], repeat_time);
}
@ -105,7 +108,8 @@ struct RepeatImpl
UInt64 col_size = col_num.size();
for (UInt64 i = 0; i < col_size; ++i)
{
size_t repeated_size = str_size * col_num[i] + 1;
T repeat_time = col_num[i] < 0 ? 0 : col_num[i];
size_t repeated_size = str_size * repeat_time + 1;
checkStringSize(repeated_size);
data_size += repeated_size;
res_offsets[i] = data_size;
@ -113,7 +117,7 @@ struct RepeatImpl
res_data.resize(data_size);
for (UInt64 i = 0; i < col_size; ++i)
{
T repeat_time = col_num[i];
T repeat_time = col_num[i] < 0 ? 0 : col_num[i];
checkRepeatTime(repeat_time);
process(
reinterpret_cast<UInt8 *>(const_cast<char *>(copy_str.data())),
@ -168,7 +172,8 @@ class FunctionRepeat : public IFunction
template <typename F>
static bool castType(const IDataType * type, F && f)
{
return castTypeToEither<DataTypeUInt8, DataTypeUInt16, DataTypeUInt32, DataTypeUInt64>(type, std::forward<F>(f));
return castTypeToEither<DataTypeInt8, DataTypeInt16, DataTypeInt32, DataTypeInt64,
DataTypeUInt8, DataTypeUInt16, DataTypeUInt32, DataTypeUInt64>(type, std::forward<F>(f));
}
public:
@ -186,7 +191,7 @@ public:
if (!isString(arguments[0]))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}",
arguments[0]->getName(), getName());
if (!isUnsignedInteger(arguments[1]))
if (!isInteger(arguments[1]))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}",
arguments[1]->getName(), getName());
return arguments[0];
@ -204,9 +209,15 @@ public:
{
if (const ColumnConst * scale_column_num = checkAndGetColumn<ColumnConst>(numcolumn.get()))
{
UInt64 repeat_time = scale_column_num->getValue<UInt64>();
auto col_res = ColumnString::create();
RepeatImpl::vectorStrConstRepeat(col->getChars(), col->getOffsets(), col_res->getChars(), col_res->getOffsets(), repeat_time);
castType(arguments[1].type.get(), [&](const auto & type)
{
using DataType = std::decay_t<decltype(type)>;
using T = typename DataType::FieldType;
T repeat_time = scale_column_num->getValue<T>();
RepeatImpl::vectorStrConstRepeat(col->getChars(), col->getOffsets(), col_res->getChars(), col_res->getOffsets(), repeat_time);
return true;
});
return col_res;
}
else if (castType(arguments[1].type.get(), [&](const auto & type)
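The user-visible change, isolated in a minimal standalone sketch (a model of the new semantics, not the ClickHouse implementation): `repeat` now accepts signed counts and treats negative values as zero.

#include <cassert>
#include <cstdint>
#include <string>

// Model of the new semantics: a negative repeat count behaves like zero.
static std::string repeatModel(const std::string & s, int64_t n)
{
    const uint64_t times = n < 0 ? 0 : static_cast<uint64_t>(n);
    std::string out;
    out.reserve(s.size() * times);
    for (uint64_t i = 0; i < times; ++i)
        out += s;
    return out;
}

int main()
{
    assert(repeatModel("ab", 3) == "ababab");
    assert(repeatModel("ab", -1).empty());  // previously rejected: only unsigned argument types were allowed
}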

View File

@ -1,5 +1,6 @@
#include <Interpreters/ActionsDAG.h>
#include <Analyzer/FunctionNode.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeString.h>
#include <Functions/IFunction.h>
@ -199,6 +200,23 @@ const ActionsDAG::Node & ActionsDAG::addFunction(
std::move(children),
std::move(arguments),
std::move(result_name),
function_base->getResultType(),
all_const);
}
const ActionsDAG::Node & ActionsDAG::addFunction(
const FunctionNode & function,
NodeRawConstPtrs children,
std::string result_name)
{
auto [arguments, all_const] = getFunctionArguments(children);
return addFunctionImpl(
function.getFunction(),
std::move(children),
std::move(arguments),
std::move(result_name),
function.getResultType(),
all_const);
}
@ -214,6 +232,7 @@ const ActionsDAG::Node & ActionsDAG::addFunction(
std::move(children),
std::move(arguments),
std::move(result_name),
function_base->getResultType(),
all_const);
}
@ -238,6 +257,7 @@ const ActionsDAG::Node & ActionsDAG::addFunctionImpl(
NodeRawConstPtrs children,
ColumnsWithTypeAndName arguments,
std::string result_name,
DataTypePtr result_type,
bool all_const)
{
size_t num_arguments = children.size();
@ -247,7 +267,7 @@ const ActionsDAG::Node & ActionsDAG::addFunctionImpl(
node.children = std::move(children);
node.function_base = function_base;
node.result_type = node.function_base->getResultType();
node.result_type = result_type;
node.function = node.function_base->prepare(arguments);
node.is_deterministic = node.function_base->isDeterministic();
@ -2264,7 +2284,15 @@ ActionsDAGPtr ActionsDAG::buildFilterActionsDAG(
for (const auto & child : node->children)
function_children.push_back(node_to_result_node.find(child)->second);
result_node = &result_dag->addFunction(node->function_base, std::move(function_children), {});
auto [arguments, all_const] = getFunctionArguments(function_children);
result_node = &result_dag->addFunctionImpl(
node->function_base,
std::move(function_children),
std::move(arguments),
{},
node->result_type,
all_const);
break;
}
}
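The apparent motivation (inferred from this commit, not stated in it) is that the query tree node may carry a result type that differs from what `function_base->getResultType()` would re-derive, e.g. a key wrapped in `Nullable` under `group_by_use_nulls`; the new overload lets the DAG trust the node. A sketch of the call shape:

// `function_node` is an analyzer FunctionNode whose result type is authoritative.
const ActionsDAG::Node & node = dag.addFunction(function_node, std::move(children), /*result_name=*/ {});
// node.result_type now equals function_node.getResultType(), even when the
// prepared function would report a different (e.g. non-Nullable) type.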

View File

@ -23,6 +23,8 @@ using FunctionBasePtr = std::shared_ptr<const IFunctionBase>;
class IFunctionOverloadResolver;
using FunctionOverloadResolverPtr = std::shared_ptr<IFunctionOverloadResolver>;
class FunctionNode;
class IDataType;
using DataTypePtr = std::shared_ptr<const IDataType>;
@ -139,6 +141,10 @@ public:
const FunctionOverloadResolverPtr & function,
NodeRawConstPtrs children,
std::string result_name);
const Node & addFunction(
const FunctionNode & function,
NodeRawConstPtrs children,
std::string result_name);
const Node & addFunction(
const FunctionBasePtr & function_base,
NodeRawConstPtrs children,
@ -358,6 +364,7 @@ private:
NodeRawConstPtrs children,
ColumnsWithTypeAndName arguments,
std::string result_name,
DataTypePtr result_type,
bool all_const);
#if USE_EMBEDDED_COMPILER

View File

@ -10,7 +10,6 @@
#include <Core/ProtocolDefines.h>
#include <Disks/IVolume.h>
#include <Disks/TemporaryFileOnDisk.h>
#include <IO/WriteBufferFromTemporaryFile.h>
#include <Common/logger_useful.h>
#include <Common/thread_local_rng.h>

View File

@ -139,6 +139,7 @@ private:
mutable SharedMutex rehash_mutex;
FileBucket * current_bucket = nullptr;
mutable std::mutex current_bucket_mutex;
InMemoryJoinPtr hash_join;

View File

@ -161,6 +161,8 @@ public:
if (curr_process.processed)
continue;
LOG_DEBUG(&Poco::Logger::get("KillQuery"), "Will kill query {} (synchronously)", curr_process.query_id);
auto code = process_list.sendCancelToQuery(curr_process.query_id, curr_process.user, true);
if (code != CancellationCode::QueryIsNotInitializedYet && code != CancellationCode::CancelSent)
@ -226,6 +228,8 @@ BlockIO InterpreterKillQueryQuery::execute()
MutableColumns res_columns = header.cloneEmptyColumns();
for (const auto & query_desc : queries_to_stop)
{
if (!query.test)
LOG_DEBUG(&Poco::Logger::get("KillQuery"), "Will kill query {} (asynchronously)", query_desc.query_id);
auto code = (query.test) ? CancellationCode::Unknown : process_list.sendCancelToQuery(query_desc.query_id, query_desc.user, true);
insertResultRow(query_desc.source_num, code, processes_block, header, res_columns);
}

View File

@ -998,7 +998,7 @@ static std::tuple<ASTPtr, BlockIO> executeQueryImpl(
{
double elapsed_seconds = static_cast<double>(info.elapsed_microseconds) / 1000000.0;
double rows_per_second = static_cast<double>(elem.read_rows) / elapsed_seconds;
LOG_INFO(
LOG_DEBUG(
&Poco::Logger::get("executeQuery"),
"Read {} rows, {} in {} sec., {} rows/sec., {}/sec.",
elem.read_rows,

View File

@ -6,6 +6,7 @@
#include <Parsers/ASTFunction.h>
#include <Parsers/isDiskFunction.h>
#include <Common/assert_cast.h>
#include <Interpreters/InDepthNodeVisitor.h>
namespace DB
@ -31,42 +32,64 @@ bool FieldFromASTImpl::isSecret() const
return isDiskFunction(ast);
}
class DiskConfigurationMasker
{
public:
struct Data {};
static bool needChildVisit(const ASTPtr &, const ASTPtr &) { return true; }
static void visit(ASTPtr & ast, Data &)
{
if (isDiskFunction(ast))
{
const auto & disk_function = assert_cast<const ASTFunction &>(*ast);
const auto * disk_function_args_expr = assert_cast<const ASTExpressionList *>(disk_function.arguments.get());
const auto & disk_function_args = disk_function_args_expr->children;
auto is_secret_arg = [](const std::string & arg_name)
{
/// We allow the type of the disk to stay visible, e.g. disk(type = s3, ...),
/// and likewise a reference to a nested disk, e.g. disk = 'disk_name'.
return arg_name != "type" && arg_name != "disk";
};
for (const auto & arg : disk_function_args)
{
auto * setting_function = arg->as<ASTFunction>();
if (!setting_function || setting_function->name != "equals")
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected equals function");
auto * function_args_expr = assert_cast<ASTExpressionList *>(setting_function->arguments.get());
if (!function_args_expr)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected arguments");
auto & function_args = function_args_expr->children;
if (function_args.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected non zero number of arguments");
auto * key_identifier = function_args[0]->as<ASTIdentifier>();
if (!key_identifier)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected Identifier");
const std::string & key = key_identifier->name();
if (is_secret_arg(key))
function_args[1] = std::make_shared<ASTLiteral>("[HIDDEN]");
}
}
}
};
/// Visits children first.
using HideDiskConfigurationVisitor = InDepthNodeVisitor<DiskConfigurationMasker, false>;
String FieldFromASTImpl::toString(bool show_secrets) const
{
if (!show_secrets && isDiskFunction(ast))
{
auto hidden = ast->clone();
const auto & disk_function = assert_cast<const ASTFunction &>(*hidden);
const auto * disk_function_args_expr = assert_cast<const ASTExpressionList *>(disk_function.arguments.get());
const auto & disk_function_args = disk_function_args_expr->children;
auto is_secret_arg = [](const std::string & arg_name)
{
return arg_name != "type";
};
for (const auto & arg : disk_function_args)
{
auto * setting_function = arg->as<ASTFunction>();
if (!setting_function || setting_function->name != "equals")
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected equals function");
auto * function_args_expr = assert_cast<ASTExpressionList *>(setting_function->arguments.get());
if (!function_args_expr)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected arguments");
auto & function_args = function_args_expr->children;
if (function_args.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected non zero number of arguments");
auto * key_identifier = function_args[0]->as<ASTIdentifier>();
if (!key_identifier)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Bad format: expected Identifier");
const std::string & key = key_identifier->name();
if (is_secret_arg(key))
function_args[1] = std::make_shared<ASTLiteral>("[HIDDEN]");
}
HideDiskConfigurationVisitor::Data data{};
HideDiskConfigurationVisitor{data}.visit(hidden);
return serializeAST(*hidden);
}
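The observable effect (matching the integration test later in this commit; the variable name is illustrative):

String shown = field_from_ast.toString(/*show_secrets=*/ false);
// e.g.: disk(type = cache, max_size = '[HIDDEN]', path = '[HIDDEN]',
//             disk = disk(type = s3, endpoint = '[HIDDEN]'))
// Only `type` and nested `disk` references stay readable.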

View File

@ -132,7 +132,7 @@ public:
}
template <typename FunctionOrOverloadResolver>
const ActionsDAG::Node * addFunctionIfNecessary(const std::string & node_name, ActionsDAG::NodeRawConstPtrs children, FunctionOrOverloadResolver function)
const ActionsDAG::Node * addFunctionIfNecessary(const std::string & node_name, ActionsDAG::NodeRawConstPtrs children, const FunctionOrOverloadResolver & function)
{
auto it = node_name_to_node.find(node_name);
if (it != node_name_to_node.end())
@ -339,7 +339,7 @@ PlannerActionsVisitorImpl::NodeNameAndNodeMinLevel PlannerActionsVisitorImpl::vi
actions_stack.pop_back();
// TODO: Pass IFunctionBase here not FunctionCaptureOverloadResolver.
actions_stack[level].addFunctionIfNecessary(lambda_node_name, std::move(lambda_children), std::move(function_capture));
actions_stack[level].addFunctionIfNecessary(lambda_node_name, std::move(lambda_children), function_capture);
size_t actions_stack_size = actions_stack.size();
for (size_t i = level + 1; i < actions_stack_size; ++i)
@ -501,7 +501,7 @@ PlannerActionsVisitorImpl::NodeNameAndNodeMinLevel PlannerActionsVisitorImpl::vi
}
else
{
actions_stack[level].addFunctionIfNecessary(function_node_name, children, function_node.getFunction());
actions_stack[level].addFunctionIfNecessary(function_node_name, children, function_node);
}
size_t actions_stack_size = actions_stack.size();

View File

@ -1,6 +1,7 @@
#include <Planner/PlannerExpressionAnalysis.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeNullable.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/ConstantNode.h>
@ -33,12 +34,11 @@ namespace
* It is the client's responsibility to update the filter analysis result if the filter column must be removed after the chain is finalized.
*/
FilterAnalysisResult analyzeFilter(const QueryTreeNodePtr & filter_expression_node,
const ColumnsWithTypeAndName & join_tree_input_columns,
const ColumnsWithTypeAndName & current_output_columns,
const PlannerContextPtr & planner_context,
ActionsChain & actions_chain)
{
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
const auto & filter_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
const auto & filter_input = current_output_columns;
FilterAnalysisResult result;
@ -52,8 +52,8 @@ FilterAnalysisResult analyzeFilter(const QueryTreeNodePtr & filter_expression_no
/** Construct aggregation analysis result if query tree has GROUP BY or aggregates.
* Actions before aggregation are added into the actions chain if the result is a non-empty optional.
*/
std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodePtr & query_tree,
const ColumnsWithTypeAndName & join_tree_input_columns,
std::pair<std::optional<AggregationAnalysisResult>, std::optional<ColumnsWithTypeAndName>> analyzeAggregation(const QueryTreeNodePtr & query_tree,
const ColumnsWithTypeAndName & current_output_columns,
const PlannerContextPtr & planner_context,
ActionsChain & actions_chain)
{
@ -69,8 +69,7 @@ std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodeP
Names aggregation_keys;
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
const auto & group_by_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
const auto & group_by_input = current_output_columns;
ActionsDAGPtr before_aggregation_actions = std::make_shared<ActionsDAG>(group_by_input);
before_aggregation_actions->getOutputs().clear();
@ -83,6 +82,8 @@ std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodeP
PlannerActionsVisitor actions_visitor(planner_context);
/// Add expressions from GROUP BY
bool group_by_use_nulls = planner_context->getQueryContext()->getSettingsRef().group_by_use_nulls &&
(query_node.isGroupByWithGroupingSets() || query_node.isGroupByWithRollup() || query_node.isGroupByWithCube());
if (query_node.hasGroupBy())
{
@ -107,6 +108,8 @@ std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodeP
if (before_aggregation_actions_output_node_names.contains(expression_dag_node->result_name))
continue;
auto expression_type_after_aggregation = group_by_use_nulls ? makeNullableSafe(expression_dag_node->result_type) : expression_dag_node->result_type;
available_columns_after_aggregation.emplace_back(nullptr, expression_type_after_aggregation, expression_dag_node->result_name);
aggregation_keys.push_back(expression_dag_node->result_name);
before_aggregation_actions->getOutputs().push_back(expression_dag_node);
before_aggregation_actions_output_node_names.insert(expression_dag_node->result_name);
@ -150,6 +153,8 @@ std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodeP
if (before_aggregation_actions_output_node_names.contains(expression_dag_node->result_name))
continue;
auto expression_type_after_aggregation = group_by_use_nulls ? makeNullableSafe(expression_dag_node->result_type) : expression_dag_node->result_type;
available_columns_after_aggregation.emplace_back(nullptr, expression_type_after_aggregation, expression_dag_node->result_name);
aggregation_keys.push_back(expression_dag_node->result_name);
before_aggregation_actions->getOutputs().push_back(expression_dag_node);
before_aggregation_actions_output_node_names.insert(expression_dag_node->result_name);
@ -157,9 +162,6 @@ std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodeP
}
}
for (auto & node : before_aggregation_actions->getOutputs())
available_columns_after_aggregation.emplace_back(nullptr, node->result_type, node->result_name);
/// Add expressions from aggregate functions arguments
for (auto & aggregate_function_node : aggregate_function_nodes)
@ -201,14 +203,14 @@ std::optional<AggregationAnalysisResult> analyzeAggregation(const QueryTreeNodeP
aggregation_analysis_result.grouping_sets_parameters_list = std::move(grouping_sets_parameters_list);
aggregation_analysis_result.group_by_with_constant_keys = group_by_with_constant_keys;
return aggregation_analysis_result;
return { aggregation_analysis_result, available_columns_after_aggregation };
}
/** Construct window analysis result if query tree has window functions.
* Actions before window functions are added into the actions chain if the result is a non-empty optional.
*/
std::optional<WindowAnalysisResult> analyzeWindow(const QueryTreeNodePtr & query_tree,
const ColumnsWithTypeAndName & join_tree_input_columns,
const ColumnsWithTypeAndName & current_output_columns,
const PlannerContextPtr & planner_context,
ActionsChain & actions_chain)
{
@ -218,8 +220,7 @@ std::optional<WindowAnalysisResult> analyzeWindow(const QueryTreeNodePtr & query
auto window_descriptions = extractWindowDescriptions(window_function_nodes, *planner_context);
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
const auto & window_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
const auto & window_input = current_output_columns;
PlannerActionsVisitor actions_visitor(planner_context);
@ -298,12 +299,11 @@ std::optional<WindowAnalysisResult> analyzeWindow(const QueryTreeNodePtr & query
* It is the client's responsibility to update the projection analysis result with the project names actions after the chain is finalized.
*/
ProjectionAnalysisResult analyzeProjection(const QueryNode & query_node,
const ColumnsWithTypeAndName & join_tree_input_columns,
const ColumnsWithTypeAndName & current_output_columns,
const PlannerContextPtr & planner_context,
ActionsChain & actions_chain)
{
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
const auto & projection_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
const auto & projection_input = current_output_columns;
auto projection_actions = buildActionsDAGFromExpressionNode(query_node.getProjectionNode(), projection_input, planner_context);
auto projection_columns = query_node.getProjectionColumns();
@ -347,12 +347,11 @@ ProjectionAnalysisResult analyzeProjection(const QueryNode & query_node,
* Actions before sort are added into actions chain.
*/
SortAnalysisResult analyzeSort(const QueryNode & query_node,
const ColumnsWithTypeAndName & join_tree_input_columns,
const ColumnsWithTypeAndName & current_output_columns,
const PlannerContextPtr & planner_context,
ActionsChain & actions_chain)
{
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
const auto & order_by_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
const auto & order_by_input = current_output_columns;
ActionsDAGPtr before_sort_actions = std::make_shared<ActionsDAG>(order_by_input);
auto & before_sort_actions_outputs = before_sort_actions->getOutputs();
@ -437,13 +436,12 @@ SortAnalysisResult analyzeSort(const QueryNode & query_node,
* Actions before limit by are added into actions chain.
*/
LimitByAnalysisResult analyzeLimitBy(const QueryNode & query_node,
const ColumnsWithTypeAndName & join_tree_input_columns,
const ColumnsWithTypeAndName & current_output_columns,
const PlannerContextPtr & planner_context,
const NameSet & required_output_nodes_names,
ActionsChain & actions_chain)
{
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
const auto & limit_by_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
const auto & limit_by_input = current_output_columns;
auto before_limit_by_actions = buildActionsDAGFromExpressionNode(query_node.getLimitByNode(), limit_by_input, planner_context);
NameSet limit_by_column_names_set;
@ -482,29 +480,43 @@ PlannerExpressionsAnalysisResult buildExpressionAnalysisResult(const QueryTreeNo
std::optional<FilterAnalysisResult> where_analysis_result_optional;
std::optional<size_t> where_action_step_index_optional;
const auto * input_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
ColumnsWithTypeAndName current_output_columns = input_columns ? *input_columns : join_tree_input_columns;
if (query_node.hasWhere())
{
where_analysis_result_optional = analyzeFilter(query_node.getWhere(), join_tree_input_columns, planner_context, actions_chain);
where_analysis_result_optional = analyzeFilter(query_node.getWhere(), current_output_columns, planner_context, actions_chain);
where_action_step_index_optional = actions_chain.getLastStepIndex();
current_output_columns = actions_chain.getLastStepAvailableOutputColumns();
}
auto aggregation_analysis_result_optional = analyzeAggregation(query_tree, join_tree_input_columns, planner_context, actions_chain);
auto [aggregation_analysis_result_optional, aggregated_columns_optional] = analyzeAggregation(query_tree, current_output_columns, planner_context, actions_chain);
if (aggregated_columns_optional)
current_output_columns = std::move(*aggregated_columns_optional);
std::optional<FilterAnalysisResult> having_analysis_result_optional;
std::optional<size_t> having_action_step_index_optional;
if (query_node.hasHaving())
{
having_analysis_result_optional = analyzeFilter(query_node.getHaving(), join_tree_input_columns, planner_context, actions_chain);
having_analysis_result_optional = analyzeFilter(query_node.getHaving(), current_output_columns, planner_context, actions_chain);
having_action_step_index_optional = actions_chain.getLastStepIndex();
current_output_columns = actions_chain.getLastStepAvailableOutputColumns();
}
auto window_analysis_result_optional = analyzeWindow(query_tree, join_tree_input_columns, planner_context, actions_chain);
auto projection_analysis_result = analyzeProjection(query_node, join_tree_input_columns, planner_context, actions_chain);
auto window_analysis_result_optional = analyzeWindow(query_tree, current_output_columns, planner_context, actions_chain);
if (window_analysis_result_optional)
current_output_columns = actions_chain.getLastStepAvailableOutputColumns();
auto projection_analysis_result = analyzeProjection(query_node, current_output_columns, planner_context, actions_chain);
current_output_columns = actions_chain.getLastStepAvailableOutputColumns();
std::optional<SortAnalysisResult> sort_analysis_result_optional;
if (query_node.hasOrderBy())
sort_analysis_result_optional = analyzeSort(query_node, join_tree_input_columns, planner_context, actions_chain);
{
sort_analysis_result_optional = analyzeSort(query_node, current_output_columns, planner_context, actions_chain);
current_output_columns = actions_chain.getLastStepAvailableOutputColumns();
}
std::optional<LimitByAnalysisResult> limit_by_analysis_result_optional;
@ -526,14 +538,15 @@ PlannerExpressionsAnalysisResult buildExpressionAnalysisResult(const QueryTreeNo
}
limit_by_analysis_result_optional = analyzeLimitBy(query_node,
join_tree_input_columns,
current_output_columns,
planner_context,
required_output_nodes_names,
actions_chain);
current_output_columns = actions_chain.getLastStepAvailableOutputColumns();
}
const auto * chain_available_output_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
auto project_names_input = chain_available_output_columns ? *chain_available_output_columns : join_tree_input_columns;
auto project_names_input = chain_available_output_columns ? *chain_available_output_columns : current_output_columns;
bool has_with_fill = sort_analysis_result_optional.has_value() && sort_analysis_result_optional->has_with_fill;
/** If there is WITH FILL we must use non constant projection columns.
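The overall pattern after this refactor, condensed (a sketch of the control flow above, not a new API):

const auto * input_columns = actions_chain.getLastStepAvailableOutputColumnsOrNull();
ColumnsWithTypeAndName current = input_columns ? *input_columns : join_tree_input_columns;
if (query_node.hasWhere())
{
    analyzeFilter(query_node.getWhere(), current, planner_context, actions_chain);
    current = actions_chain.getLastStepAvailableOutputColumns();
}
// ... aggregation, HAVING, window, projection, ORDER BY and LIMIT BY all
// consume `current` and update it the same way, instead of each stage
// falling back to join_tree_input_columns on its own.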

View File

@ -1,5 +1,7 @@
#include <Planner/PlannerJoinTree.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeAggregateFunction.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <Functions/FunctionFactory.h>
@ -19,6 +21,8 @@
#include <Analyzer/JoinNode.h>
#include <Analyzer/ArrayJoinNode.h>
#include <Analyzer/Utils.h>
#include <Analyzer/AggregationUtils.h>
#include <Analyzer/FunctionNode.h>
#include <Processors/Sources/NullSource.h>
#include <Processors/QueryPlan/SortingStep.h>
@ -27,6 +31,7 @@
#include <Processors/QueryPlan/ExpressionStep.h>
#include <Processors/QueryPlan/JoinStep.h>
#include <Processors/QueryPlan/ArrayJoinStep.h>
#include <Processors/Sources/SourceFromSingleChunk.h>
#include <Interpreters/Context.h>
#include <Interpreters/IJoin.h>
@ -40,6 +45,10 @@
#include <Planner/PlannerActionsVisitor.h>
#include <Planner/Utils.h>
#include <AggregateFunctions/AggregateFunctionCount.h>
#include <Columns/ColumnAggregateFunction.h>
#include <Common/scope_guard_safe.h>
namespace DB
{
@ -143,6 +152,100 @@ NameAndTypePair chooseSmallestColumnToReadFromStorage(const StoragePtr & storage
return result;
}
bool applyTrivialCountIfPossible(
QueryPlan & query_plan,
const TableNode & table_node,
const QueryTreeNodePtr & query_tree,
const ContextPtr & query_context,
const Names & columns_names)
{
const auto & settings = query_context->getSettingsRef();
if (!settings.optimize_trivial_count_query)
return false;
/// can't apply if FINAL
if (table_node.getTableExpressionModifiers().has_value() && table_node.getTableExpressionModifiers()->hasFinal())
return false;
auto & main_query_node = query_tree->as<QueryNode &>();
if (main_query_node.hasGroupBy())
return false;
const auto & storage = table_node.getStorage();
if (!storage || storage->hasLightweightDeletedMask())
return false;
if (settings.max_parallel_replicas > 1 || settings.allow_experimental_query_deduplication
|| settings.empty_result_for_aggregation_by_empty_set)
return false;
QueryTreeNodes aggregates = collectAggregateFunctionNodes(query_tree);
if (aggregates.size() != 1)
return false;
const auto & function_node = aggregates.front().get()->as<const FunctionNode &>();
chassert(function_node.getAggregateFunction() != nullptr);
const auto * count_func = typeid_cast<const AggregateFunctionCount *>(function_node.getAggregateFunction().get());
if (!count_func)
return false;
/// get number of rows
std::optional<UInt64> num_rows{};
/// The transaction check here is necessary because
/// MergeTree maintains a total row count for all parts in Active state and simply returns that number for a trivial SELECT count() FROM table query.
/// But if there is a current transaction, then we should return the number of rows in the current snapshot (which may include parts in Outdated state),
/// so we have to use totalRowsByPartitionPredicate() instead of totalRows() even for the trivial query.
/// See https://github.com/ClickHouse/ClickHouse/pull/24258/files#r828182031
if (!main_query_node.hasPrewhere() && !main_query_node.hasWhere() && !query_context->getCurrentTransaction())
{
num_rows = storage->totalRows(settings);
}
// TODO:
// else // It's possible to optimize count() given only partition predicates
// {
// SelectQueryInfo temp_query_info;
// temp_query_info.query = query_ptr;
// temp_query_info.syntax_analyzer_result = syntax_analyzer_result;
// temp_query_info.prepared_sets = query_analyzer->getPreparedSets();
// num_rows = storage->totalRowsByPartitionPredicate(temp_query_info, context);
// }
if (!num_rows)
return false;
/// set aggregation state
const AggregateFunctionCount & agg_count = *count_func;
std::vector<char> state(agg_count.sizeOfData());
AggregateDataPtr place = state.data();
agg_count.create(place);
SCOPE_EXIT_MEMORY_SAFE(agg_count.destroy(place));
agg_count.set(place, num_rows.value());
auto column = ColumnAggregateFunction::create(function_node.getAggregateFunction());
column->insertFrom(place);
/// get count() argument type
DataTypes argument_types;
argument_types.reserve(columns_names.size());
{
const Block source_header = table_node.getStorageSnapshot()->getSampleBlockForColumns(columns_names);
for (const auto & column_name : columns_names)
argument_types.push_back(source_header.getByName(column_name).type);
}
Block block_with_count{
{std::move(column),
std::make_shared<DataTypeAggregateFunction>(function_node.getAggregateFunction(), argument_types, Array{}),
columns_names.front()}};
auto source = std::make_shared<SourceFromSingleChunk>(block_with_count);
auto prepared_count = std::make_unique<ReadFromPreparedSource>(Pipe(std::move(source)));
prepared_count->setStepDescription("Optimized trivial count");
query_plan.addStep(std::move(prepared_count));
return true;
}
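Summarizing the guards above: the shortcut only fires for a plain single-table SELECT count() FROM t with no FINAL, GROUP BY, WHERE/PREWHERE, lightweight deletes, or open transaction, and with optimize_trivial_count_query enabled. Downstream, the plan proceeds as if partial aggregation already happened, which is why buildQueryPlanForTableExpression (further down) sets:

from_stage = QueryProcessingStage::WithMergeableState;  // the emitted one-row chunk carries a mergeable count() state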
JoinTreeQueryPlan buildQueryPlanForTableExpression(const QueryTreeNodePtr & table_expression,
const SelectQueryInfo & select_query_info,
const SelectQueryOptions & select_query_options,
@ -306,32 +409,43 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(const QueryTreeNodePtr & tabl
}
}
if (!select_query_options.only_analyze)
{
from_stage = storage->getQueryProcessingStage(query_context, select_query_options.to_stage, storage_snapshot, table_expression_query_info);
storage->read(query_plan, columns_names, storage_snapshot, table_expression_query_info, query_context, from_stage, max_block_size, max_streams);
}
/// Apply trivial_count optimization if possible
bool is_trivial_count_applied = !select_query_options.only_analyze && is_single_table_expression && table_node && select_query_info.has_aggregates
&& applyTrivialCountIfPossible(query_plan, *table_node, select_query_info.query_tree, planner_context->getQueryContext(), columns_names);
if (query_plan.isInitialized())
if (is_trivial_count_applied)
{
/** Specify the number of threads only if it wasn't specified in storage.
*
* But in case of a remote query with prefer_localhost_replica=1 (the default),
* the inner local query (executed in the same process, without network
* interaction) will call setMaxThreads earlier, and the distributed
* query will not update it.
*/
if (!query_plan.getMaxThreads() || is_remote)
query_plan.setMaxThreads(max_threads_execute_query);
from_stage = QueryProcessingStage::WithMergeableState;
}
else
{
/// Create step which reads from empty source if storage has no data
auto source_header = storage_snapshot->getSampleBlockForColumns(columns_names);
Pipe pipe(std::make_shared<NullSource>(source_header));
auto read_from_pipe = std::make_unique<ReadFromPreparedSource>(std::move(pipe));
read_from_pipe->setStepDescription("Read from NullSource");
query_plan.addStep(std::move(read_from_pipe));
if (!select_query_options.only_analyze)
{
from_stage = storage->getQueryProcessingStage(query_context, select_query_options.to_stage, storage_snapshot, table_expression_query_info);
storage->read(query_plan, columns_names, storage_snapshot, table_expression_query_info, query_context, from_stage, max_block_size, max_streams);
}
if (query_plan.isInitialized())
{
/** Specify the number of threads only if it wasn't specified in storage.
*
* But in case of a remote query with prefer_localhost_replica=1 (the default),
* the inner local query (executed in the same process, without network
* interaction) will call setMaxThreads earlier, and the distributed
* query will not update it.
*/
if (!query_plan.getMaxThreads() || is_remote)
query_plan.setMaxThreads(max_threads_execute_query);
}
else
{
/// Create step which reads from empty source if storage has no data.
auto source_header = storage_snapshot->getSampleBlockForColumns(columns_names);
Pipe pipe(std::make_shared<NullSource>(source_header));
auto read_from_pipe = std::make_unique<ReadFromPreparedSource>(std::move(pipe));
read_from_pipe->setStepDescription("Read from NullSource");
query_plan.addStep(std::move(read_from_pipe));
}
}
}
else if (query_node || union_node)

View File

@ -187,7 +187,7 @@ void PushingAsyncPipelineExecutor::push(Chunk chunk)
if (!is_pushed)
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Pipeline for PushingPipelineExecutor was finished before all data was inserted");
"Pipeline for PushingAsyncPipelineExecutor was finished before all data was inserted");
}
void PushingAsyncPipelineExecutor::push(Block block)

View File

@ -318,13 +318,24 @@ DelayedJoinedBlocksWorkerTransform::DelayedJoinedBlocksWorkerTransform(Block out
IProcessor::Status DelayedJoinedBlocksWorkerTransform::prepare()
{
auto & output = outputs.front();
auto & input = inputs.front();
if (output.isFinished())
{
input.close();
return Status::Finished;
}
if (!output.canPush())
{
input.setNotNeeded();
return Status::PortFull;
}
if (inputs.size() != 1 && outputs.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "DelayedJoinedBlocksWorkerTransform must have exactly one input port");
auto & output = outputs.front();
auto & input = inputs.front();
if (output_chunk)
{
input.setNotNeeded();
@ -397,15 +408,25 @@ DelayedJoinedBlocksTransform::DelayedJoinedBlocksTransform(size_t num_streams, J
void DelayedJoinedBlocksTransform::work()
{
if (finished)
return;
delayed_blocks = join->getDelayedBlocks();
finished = finished || delayed_blocks == nullptr;
}
IProcessor::Status DelayedJoinedBlocksTransform::prepare()
{
for (auto & output : outputs)
{
if (output.isFinished())
{
/// If at least one output is finished, then we have read all data from buckets.
/// Some workers can still be busy with joining the last chunk of data in memory,
/// but after that they will also finish when they try to get the next chunk.
finished = true;
continue;
}
if (!output.canPush())
return Status::PortFull;
}
@ -414,6 +435,8 @@ IProcessor::Status DelayedJoinedBlocksTransform::prepare()
{
for (auto & output : outputs)
{
if (output.isFinished())
continue;
Chunk chunk;
chunk.setChunkInfo(std::make_shared<DelayedBlocksTask>());
output.push(std::move(chunk));

View File

@ -1224,14 +1224,7 @@ void TCPHandler::receiveHello()
session = makeSession();
auto & client_info = session->getClientInfo();
/// Extract the last entry from comma separated list of forwarded_for addresses.
/// Only the last proxy can be trusted (if any).
String forwarded_address = client_info.getLastForwardedFor();
if (!forwarded_address.empty() && server.config().getBool("auth_use_forwarded_address", false))
session->authenticate(user, password, Poco::Net::SocketAddress(forwarded_address, socket().peerAddress().port()));
else
session->authenticate(user, password, socket().peerAddress());
session->authenticate(user, password, getClientAddress(client_info));
}
void TCPHandler::receiveAddendum()
@ -1522,11 +1515,16 @@ void TCPHandler::receiveQuery()
/// so we should not rely on that. However, in this particular case we got client_info from another clickhouse-server, so it's ok.
if (client_info.initial_user.empty())
{
LOG_DEBUG(log, "User (no user, interserver mode)");
LOG_DEBUG(log, "User (no user, interserver mode) (client: {})", getClientAddress(client_info).toString());
}
else
{
LOG_DEBUG(log, "User (initial, interserver mode): {}", client_info.initial_user);
LOG_DEBUG(log, "User (initial, interserver mode): {} (client: {})", client_info.initial_user, getClientAddress(client_info).toString());
/// In inter-server mode, authorization is done with the
/// initial address of the client, not the real address from which
/// the query came, since the real address is the address of
/// the initiator server, while we are interested in the client's
/// address.
session->authenticate(AlwaysAllowCredentials{client_info.initial_user}, client_info.initial_address);
}
#else
@ -2012,4 +2010,15 @@ void TCPHandler::run()
}
}
Poco::Net::SocketAddress TCPHandler::getClientAddress(const ClientInfo & client_info)
{
/// Extract the last entry from comma separated list of forwarded_for addresses.
/// Only the last proxy can be trusted (if any).
String forwarded_address = client_info.getLastForwardedFor();
if (!forwarded_address.empty() && server.config().getBool("auth_use_forwarded_address", false))
return Poco::Net::SocketAddress(forwarded_address, socket().peerAddress().port());
else
return socket().peerAddress();
}
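For reference, the trust model expressed above (assuming `getLastForwardedFor()` returns the last comma-separated entry of the header):

// X-Forwarded-For: "198.51.100.5, 10.0.0.1"
// getLastForwardedFor() presumably yields "10.0.0.1": the entry appended by
// the proxy that connected directly to the server, hence the only one that
// can be trusted. It is used only when auth_use_forwarded_address = true:
Poco::Net::SocketAddress addr = getClientAddress(client_info);  // forwarded or plain peer address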
}

View File

@ -273,6 +273,8 @@ private:
/// This function is called from different threads.
void updateProgress(const Progress & value);
Poco::Net::SocketAddress getClientAddress(const ClientInfo & client_info);
};
}

View File

@ -383,6 +383,15 @@ NamesAndTypesList ColumnsDescription::getEphemeral() const
return ret;
}
NamesAndTypesList ColumnsDescription::getWithDefaultExpression() const
{
NamesAndTypesList ret;
for (const auto & col : columns)
if (col.default_desc.expression)
ret.emplace_back(col.name, col.type);
return ret;
}
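This accessor exists for the Kafka engine check added later in this commit; the guard there reduces to (sketch mirroring registerStorageKafka below):

bool has_special_columns =
    args.columns.getOrdinary() != args.columns.getAll()      // MATERIALIZED / ALIAS / EPHEMERAL present
    || !args.columns.getWithDefaultExpression().empty();     // explicit DEFAULT expressions
if (has_special_columns)
    throw Exception(ErrorCodes::BAD_ARGUMENTS, "KafkaEngine doesn't support DEFAULT/MATERIALIZED/EPHEMERAL/ALIAS expressions for columns.");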
NamesAndTypesList ColumnsDescription::getAll() const
{
NamesAndTypesList ret;

View File

@ -132,6 +132,7 @@ public:
NamesAndTypesList getInsertable() const; /// ordinary + ephemeral
NamesAndTypesList getAliases() const;
NamesAndTypesList getEphemeral() const;
NamesAndTypesList getWithDefaultExpression() const; /// columns with a default expression, for example set by a `CREATE TABLE` statement
NamesAndTypesList getAllPhysical() const; /// ordinary + materialized.
NamesAndTypesList getAll() const; /// ordinary + materialized + aliases + ephemeral
/// Returns .size0/.null/...

View File

@ -94,14 +94,15 @@ IndexDescription IndexDescription::getIndexFromAST(const ASTPtr & definition_ast
auto syntax = TreeRewriter(context).analyze(expr_list, columns.getAllPhysical());
result.expression = ExpressionAnalyzer(expr_list, syntax, context).getActions(true);
Block block_without_columns = result.expression->getSampleBlock();
result.sample_block = result.expression->getSampleBlock();
for (size_t i = 0; i < block_without_columns.columns(); ++i)
for (auto & elem : result.sample_block)
{
const auto & column = block_without_columns.getByPosition(i);
result.column_names.emplace_back(column.name);
result.data_types.emplace_back(column.type);
result.sample_block.insert(ColumnWithTypeAndName(column.type->createColumn(), column.type, column.name));
if (!elem.column)
elem.column = elem.type->createColumn();
result.column_names.push_back(elem.name);
result.data_types.push_back(elem.type);
}
const auto & definition_arguments = index_definition->type->arguments;

View File

@ -959,6 +959,11 @@ void registerStorageKafka(StorageFactory & factory)
{
throw Exception(ErrorCodes::BAD_ARGUMENTS, "kafka_poll_max_batch_size can not be lower than 1");
}
if (args.columns.getOrdinary() != args.columns.getAll() || !args.columns.getWithDefaultExpression().empty())
{
throw Exception(ErrorCodes::BAD_ARGUMENTS, "KafkaEngine doesn't support DEFAULT/MATERIALIZED/EPHEMERAL/ALIAS expressions for columns. "
"See https://clickhouse.com/docs/en/engines/table-engines/integrations/kafka/#configuration");
}
return std::make_shared<StorageKafka>(args.table_id, args.getContext(), args.columns, std::move(kafka_settings), collection_name);
};

View File

@ -241,7 +241,18 @@ StorageLiveView::StorageLiveView(
blocks_metadata_ptr = std::make_shared<BlocksMetadataPtr>();
active_ptr = std::make_shared<bool>(true);
periodic_refresh_task = getContext()->getSchedulePool().createTask("LiveViewPeriodicRefreshTask", [this]{ periodicRefreshTaskFunc(); });
periodic_refresh_task = getContext()->getSchedulePool().createTask("LiveViewPeriodicRefreshTask",
[this]
{
try
{
periodicRefreshTaskFunc();
}
catch (...)
{
tryLogCurrentException(log, "Exception in LiveView periodic refresh task in BackgroundSchedulePool");
}
});
periodic_refresh_task->deactivate();
}
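The wrap-and-log pattern generalizes to any BackgroundSchedulePool task; a hedged sketch (the helper name is hypothetical, `tryLogCurrentException` is the existing logging utility):

auto makeSafeTask = [](Poco::Logger * log, std::function<void()> body)
{
    return [log, body = std::move(body)]
    {
        try
        {
            body();
        }
        catch (...)
        {
            tryLogCurrentException(log, "Exception in background task");  // never let it escape the pool thread
        }
    };
};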

View File

@ -525,7 +525,6 @@ void MergeTreeData::checkProperties(
for (const auto & index : new_metadata.secondary_indices)
{
MergeTreeIndexFactory::instance().validate(index, attach);
if (indices_names.find(index.name) != indices_names.end())

View File

@ -35,6 +35,7 @@ MergeTreeIndexPtr MergeTreeIndexFactory::get(
{
auto it = creators.find(index.type);
if (it == creators.end())
{
throw Exception(ErrorCodes::INCORRECT_QUERY,
"Unknown Index type '{}'. Available index types: {}", index.type,
std::accumulate(creators.cbegin(), creators.cend(), std::string{},
@ -46,6 +47,7 @@ MergeTreeIndexPtr MergeTreeIndexFactory::get(
return left + ", " + right.first;
})
);
}
return it->second(index);
}
@ -61,8 +63,31 @@ MergeTreeIndices MergeTreeIndexFactory::getMany(const std::vector<IndexDescripti
void MergeTreeIndexFactory::validate(const IndexDescription & index, bool attach) const
{
/// Do not allow constant and non-deterministic expressions.
/// Do not throw on attach for compatibility.
if (!attach)
{
if (index.expression->hasArrayJoin())
throw Exception(ErrorCodes::INCORRECT_QUERY, "Secondary index '{}' cannot contain array joins", index.name);
try
{
index.expression->assertDeterministic();
}
catch (Exception & e)
{
e.addMessage(fmt::format("for secondary index '{}'", index.name));
throw;
}
for (const auto & elem : index.sample_block)
if (elem.column && (isColumnConst(*elem.column) || elem.column->isDummy()))
throw Exception(ErrorCodes::INCORRECT_QUERY, "Secondary index '{}' cannot contain constants", index.name);
}
auto it = validators.find(index.type);
if (it == validators.end())
{
throw Exception(ErrorCodes::INCORRECT_QUERY,
"Unknown Index type '{}'. Available index types: {}", index.type,
std::accumulate(
@ -77,6 +102,7 @@ void MergeTreeIndexFactory::validate(const IndexDescription & index, bool attach
return left + ", " + right.first;
})
);
}
it->second(index, attach);
}
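Concretely, definitions like the following are now rejected at CREATE time while still being accepted on ATTACH for backward compatibility (the examples are illustrative):

// INDEX i1 arrayJoin(arr)  TYPE minmax GRANULARITY 1  -- "cannot contain array joins"
// INDEX i2 rand()          TYPE minmax GRANULARITY 1  -- assertDeterministic() throws
// INDEX i3 42              TYPE minmax GRANULARITY 1  -- constant column in sample_block
MergeTreeIndexFactory::instance().validate(index_description, /*attach=*/ false);  // throws INCORRECT_QUERY for the cases above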

View File

@ -64,8 +64,7 @@ void MergeTreeSettings::loadFromQuery(ASTStorage & storage_def, ContextPtr conte
auto ast = dynamic_cast<const FieldFromASTImpl &>(custom.getImpl()).ast;
if (ast && isDiskFunction(ast))
{
const auto & ast_function = assert_cast<const ASTFunction &>(*ast);
auto disk_name = getOrCreateDiskFromDiskAST(ast_function, context);
auto disk_name = getOrCreateDiskFromDiskAST(ast, context);
LOG_TRACE(&Poco::Logger::get("MergeTreeSettings"), "Created custom disk {}", disk_name);
value = disk_name;
}

View File

@ -8434,7 +8434,11 @@ std::pair<bool, NameSet> StorageReplicatedMergeTree::unlockSharedDataByID(
}
else if (error_code == Coordination::Error::ZNONODE)
{
LOG_TRACE(logger, "Node with parent zookeeper lock {} for part {} doesn't exist (part was unlocked before)", zookeeper_part_uniq_node, part_name);
/// We don't know what to do, because this part can be a mutation part
/// with hardlinked columns. Since we don't have this information (about which blobs must not be removed),
/// we refuse to remove blobs.
LOG_WARNING(logger, "Node with parent zookeeper lock {} for part {} doesn't exist (part was unlocked before), refuse to remove blobs", zookeeper_part_uniq_node, part_name);
return {false, {}};
}
else
{

View File

@ -294,6 +294,65 @@ def test_merge_tree_custom_disk_setting(start_cluster):
).strip()
)
node1.query(f"DROP TABLE {TABLE_NAME} SYNC")
node1.query(f"DROP TABLE {TABLE_NAME}_2 SYNC")
node1.query(f"DROP TABLE {TABLE_NAME}_3 SYNC")
node1.query(f"DROP TABLE {TABLE_NAME}_4 SYNC")
node2.query(f"DROP TABLE {TABLE_NAME}_4 SYNC")
def test_merge_tree_nested_custom_disk_setting(start_cluster):
node = cluster.instances["node1"]
minio = cluster.minio_client
for obj in list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)):
minio.remove_object(cluster.minio_bucket, obj.object_name)
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)))
== 0
)
node.query(
f"""
DROP TABLE IF EXISTS {TABLE_NAME} SYNC;
CREATE TABLE {TABLE_NAME} (a Int32)
ENGINE = MergeTree() order by tuple()
SETTINGS disk = disk(
type=cache,
max_size='1Gi',
path='/var/lib/clickhouse/custom_disk_cache/',
disk=disk(
type=s3,
endpoint='http://minio1:9001/root/data/',
access_key_id='minio',
secret_access_key='minio123'));
"""
)
node.query(f"INSERT INTO {TABLE_NAME} SELECT number FROM numbers(100)")
node.query("SYSTEM DROP FILESYSTEM CACHE")
# Check cache is filled
assert 0 == int(node.query("SELECT count() FROM system.filesystem_cache"))
assert 100 == int(node.query(f"SELECT count() FROM {TABLE_NAME}"))
node.query(f"SELECT * FROM {TABLE_NAME}")
assert 0 < int(node.query("SELECT count() FROM system.filesystem_cache"))
# Check s3 is filled
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True))) > 0
)
node.restart_clickhouse()
assert 100 == int(node.query(f"SELECT count() FROM {TABLE_NAME}"))
expected = """
SETTINGS disk = disk(type = cache, max_size = \\'[HIDDEN]\\', path = \\'[HIDDEN]\\', disk = disk(type = s3, endpoint = \\'[HIDDEN]\\'
"""
assert expected.strip() in node.query(f"SHOW CREATE TABLE {TABLE_NAME}").strip()
node.query(f"DROP TABLE {TABLE_NAME} SYNC")
def test_merge_tree_setting_override(start_cluster):
node = cluster.instances["node3"]
@ -367,3 +426,4 @@ def test_merge_tree_setting_override(start_cluster):
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True))) > 0
)
node.query(f"DROP TABLE {TABLE_NAME} SYNC")

View File

@ -44,8 +44,6 @@ def test_file_path_escaping(started_cluster):
]
)
def test_file_path_escaping_atomic_db(started_cluster):
node.query("CREATE DATABASE IF NOT EXISTS `test 2` ENGINE = Atomic")
node.query(
"""

View File

@ -594,8 +594,6 @@ def test_cancel_while_processing_input():
stub = clickhouse_grpc_pb2_grpc.ClickHouseStub(main_channel)
result = stub.ExecuteQueryWithStreamInput(send_query_info())
assert result.cancelled == True
assert result.progress.written_rows == 6
assert query("SELECT a FROM t ORDER BY a") == "1\n2\n3\n4\n5\n6\n"
def test_cancel_while_generating_output():

View File

@ -10,7 +10,7 @@ node = cluster.add_instance(
config = """<clickhouse>
<logger>
<level>information</level>
<level>debug</level>
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
</logger>
</clickhouse>"""
@ -63,4 +63,4 @@ def test_log_levels_update(start_cluster):
log = get_log(node)
assert len(log) > 0
assert not re.search("(<Trace>|<Debug>)", log)
assert not re.search("<Trace>", log)

View File

@ -285,6 +285,56 @@ def avro_confluent_message(schema_registry_client, value):
# Tests
def test_kafka_prohibited_column_types(kafka_cluster):
def assert_returned_exception(e):
assert e.value.returncode == 36
assert (
"KafkaEngine doesn't support DEFAULT/MATERIALIZED/EPHEMERAL/ALIAS expressions for columns."
in str(e.value)
)
# check column with DEFAULT expression
with pytest.raises(QueryRuntimeException) as exception:
instance.query(
"""
CREATE TABLE test.kafka (a Int, b Int DEFAULT 0)
ENGINE = Kafka('{kafka_broker}:19092', '{kafka_topic_new}', '{kafka_group_name_new}', '{kafka_format_json_each_row}', '\\n')
"""
)
assert_returned_exception(exception)
# check EPHEMERAL
with pytest.raises(QueryRuntimeException) as exception:
instance.query(
"""
CREATE TABLE test.kafka (a Int, b Int EPHEMERAL)
ENGINE = Kafka('{kafka_broker}:19092', '{kafka_topic_new}', '{kafka_group_name_new}', '{kafka_format_json_each_row}', '\\n')
"""
)
assert_returned_exception(exception)
# check ALIAS
with pytest.raises(QueryRuntimeException) as exception:
instance.query(
"""
CREATE TABLE test.kafka (a Int, b String Alias toString(a))
ENGINE = Kafka('{kafka_broker}:19092', '{kafka_topic_new}', '{kafka_group_name_new}', '{kafka_format_json_each_row}', '\\n')
"""
)
assert_returned_exception(exception)
# check MATERIALIZED
with pytest.raises(QueryRuntimeException) as exception:
instance.query(
"""
CREATE TABLE test.kafka (a Int, b String MATERIALIZED toString(a))
ENGINE = Kafka('{kafka_broker}:19092', '{kafka_topic_new}', '{kafka_group_name_new}', '{kafka_format_json_each_row}', '\\n')
"""
)
assert_returned_exception(exception)
def test_kafka_settings_old_syntax(kafka_cluster):
assert TSV(
instance.query(

View File

@ -3,3 +3,43 @@
2
2
2
QUERY id: 0
PROJECTION COLUMNS
count() UInt64
PROJECTION
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
TABLE id: 3, table_name: default.regression_for_in_operator_view
WHERE
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 6, column_name: g, result_type: String, source_id: 3
CONSTANT id: 7, constant_value: Tuple_(\'5\', \'6\'), constant_value_type: Tuple(String, String)
SETTINGS allow_experimental_analyzer=1
2
2
QUERY id: 0
PROJECTION COLUMNS
count() UInt64
PROJECTION
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
TABLE id: 3, table_name: default.regression_for_in_operator_view
WHERE
FUNCTION id: 4, function_name: or, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
FUNCTION id: 6, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 7, nodes: 2
COLUMN id: 8, column_name: g, result_type: String, source_id: 3
CONSTANT id: 9, constant_value: \'5\', constant_value_type: String
FUNCTION id: 10, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 11, nodes: 2
COLUMN id: 8, column_name: g, result_type: String, source_id: 3
CONSTANT id: 12, constant_value: \'6\', constant_value_type: String
SETTINGS allow_experimental_analyzer=1

View File

@ -12,9 +12,13 @@ SELECT count() FROM regression_for_in_operator_view WHERE g IN ('5','6');
SET optimize_min_equality_disjunction_chain_length = 1;
SELECT count() FROM regression_for_in_operator_view WHERE g = '5' OR g = '6';
SELECT count() FROM regression_for_in_operator_view WHERE g = '5' OR g = '6' SETTINGS allow_experimental_analyzer = 1;
EXPLAIN QUERY TREE SELECT count() FROM regression_for_in_operator_view WHERE g = '5' OR g = '6' SETTINGS allow_experimental_analyzer = 1;
SET optimize_min_equality_disjunction_chain_length = 3;
SELECT count() FROM regression_for_in_operator_view WHERE g = '5' OR g = '6';
SELECT count() FROM regression_for_in_operator_view WHERE g = '5' OR g = '6' SETTINGS allow_experimental_analyzer = 1;
EXPLAIN QUERY TREE SELECT count() FROM regression_for_in_operator_view WHERE g = '5' OR g = '6' SETTINGS allow_experimental_analyzer = 1;
DROP TABLE regression_for_in_operator_view;
DROP TABLE regression_for_in_operator;

View File

@ -25,6 +25,81 @@
3 21
3 22
3 23
QUERY id: 0
PROJECTION COLUMNS
k UInt64
s UInt64
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.bug
WHERE
FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 6, nodes: 2
FUNCTION id: 7, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 8, nodes: 2
COLUMN id: 9, column_name: k, result_type: UInt64, source_id: 3
CONSTANT id: 10, constant_value: Tuple_(UInt64_1, UInt64_2, UInt64_3), constant_value_type: Tuple(UInt8, UInt8, UInt8)
FUNCTION id: 11, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 12, nodes: 2
COLUMN id: 13, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 14, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
SETTINGS allow_experimental_analyzer=1
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
QUERY id: 0
PROJECTION COLUMNS
k UInt64
s UInt64
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3
JOIN TREE
QUERY id: 3, is_subquery: 1
PROJECTION COLUMNS
k UInt64
s UInt64
PROJECTION
LIST id: 5, nodes: 2
COLUMN id: 6, column_name: k, result_type: UInt64, source_id: 7
COLUMN id: 8, column_name: s, result_type: UInt64, source_id: 7
JOIN TREE
TABLE id: 7, table_name: default.bug
WHERE
FUNCTION id: 9, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 10, nodes: 2
COLUMN id: 11, column_name: k, result_type: UInt64, source_id: 7
CONSTANT id: 12, constant_value: Tuple_(UInt64_1, UInt64_2, UInt64_3), constant_value_type: Tuple(UInt8, UInt8, UInt8)
WHERE
FUNCTION id: 13, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 14, nodes: 2
COLUMN id: 15, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 16, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
SETTINGS allow_experimental_analyzer=1
1 1 21 1 1 1
1 1 22 0 1 1
1 1 23 0 0 1
@ -34,42 +109,6 @@
3 1 21 1 1 1
3 1 22 0 1 1
3 1 23 0 0 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 1 21 1 1 1
1 1 22 0 1 1
1 1 23 0 0 1
@ -79,6 +118,41 @@
3 1 21 1 1 1
3 1 22 0 1 1
3 1 23 0 0 1
QUERY id: 0
PROJECTION COLUMNS
k UInt64
or(equals(k, 1), equals(k, 2), equals(k, 3)) UInt8
s UInt64
equals(s, 21) UInt8
or(equals(s, 21), equals(s, 22)) UInt8
or(equals(s, 21), equals(s, 22), equals(s, 23)) UInt8
PROJECTION
LIST id: 1, nodes: 6
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
CONSTANT id: 6, constant_value: Tuple_(UInt64_1, UInt64_2, UInt64_3), constant_value_type: Tuple(UInt8, UInt8, UInt8)
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
FUNCTION id: 8, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 9, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 10, constant_value: UInt64_21, constant_value_type: UInt8
FUNCTION id: 11, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 12, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 13, constant_value: Tuple_(UInt64_21, UInt64_22), constant_value_type: Tuple(UInt8, UInt8)
FUNCTION id: 14, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 15, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 16, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
JOIN TREE
TABLE id: 3, table_name: default.bug
SETTINGS allow_experimental_analyzer=1
21 1
22 1
23 1
@ -88,3 +162,256 @@
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
QUERY id: 0
PROJECTION COLUMNS
s UInt64
or(equals(s, 21), equals(s, 22), equals(s, 23)) UInt8
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 6, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
JOIN TREE
TABLE id: 3, table_name: default.bug
SETTINGS allow_experimental_analyzer=1
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
QUERY id: 0
PROJECTION COLUMNS
k UInt64
s UInt64
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.bug
WHERE
FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 6, nodes: 2
FUNCTION id: 7, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 8, nodes: 2
COLUMN id: 9, column_name: k, result_type: UInt64, source_id: 3
CONSTANT id: 10, constant_value: Tuple_(UInt64_1, UInt64_2, UInt64_3), constant_value_type: Tuple(UInt8, UInt8, UInt8)
FUNCTION id: 11, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 12, nodes: 2
COLUMN id: 13, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 14, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
SETTINGS allow_experimental_analyzer=1
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
1 21
1 22
1 23
2 21
2 22
2 23
3 21
3 22
3 23
QUERY id: 0
PROJECTION COLUMNS
k UInt64
s UInt64
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3
JOIN TREE
QUERY id: 3, is_subquery: 1
PROJECTION COLUMNS
k UInt64
s UInt64
PROJECTION
LIST id: 5, nodes: 2
COLUMN id: 6, column_name: k, result_type: UInt64, source_id: 7
COLUMN id: 8, column_name: s, result_type: UInt64, source_id: 7
JOIN TREE
TABLE id: 7, table_name: default.bug
WHERE
FUNCTION id: 9, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 10, nodes: 2
COLUMN id: 11, column_name: k, result_type: UInt64, source_id: 7
CONSTANT id: 12, constant_value: Tuple_(UInt64_1, UInt64_2, UInt64_3), constant_value_type: Tuple(UInt8, UInt8, UInt8)
WHERE
FUNCTION id: 13, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 14, nodes: 2
COLUMN id: 15, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 16, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
SETTINGS allow_experimental_analyzer=1
1 1 21 1 1 1
1 1 22 0 1 1
1 1 23 0 0 1
2 1 21 1 1 1
2 1 22 0 1 1
2 1 23 0 0 1
3 1 21 1 1 1
3 1 22 0 1 1
3 1 23 0 0 1
1 1 21 1 1 1
1 1 22 0 1 1
1 1 23 0 0 1
2 1 21 1 1 1
2 1 22 0 1 1
2 1 23 0 0 1
3 1 21 1 1 1
3 1 22 0 1 1
3 1 23 0 0 1
QUERY id: 0
PROJECTION COLUMNS
k UInt64
or(equals(k, 1), equals(k, 2), equals(k, 3)) UInt8
s UInt64
equals(s, 21) UInt8
or(equals(s, 21), equals(s, 22)) UInt8
or(equals(s, 21), equals(s, 22), equals(s, 23)) UInt8
PROJECTION
LIST id: 1, nodes: 6
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3
CONSTANT id: 6, constant_value: Tuple_(UInt64_1, UInt64_2, UInt64_3), constant_value_type: Tuple(UInt8, UInt8, UInt8)
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
FUNCTION id: 8, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 9, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 10, constant_value: UInt64_21, constant_value_type: UInt8
FUNCTION id: 11, function_name: or, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 12, nodes: 2
FUNCTION id: 13, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 14, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 15, constant_value: UInt64_21, constant_value_type: UInt8
FUNCTION id: 16, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 17, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 18, constant_value: UInt64_22, constant_value_type: UInt8
FUNCTION id: 19, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 20, nodes: 2
COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 21, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
JOIN TREE
TABLE id: 3, table_name: default.bug
SETTINGS allow_experimental_analyzer=1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
QUERY id: 0
PROJECTION COLUMNS
s UInt64
or(equals(s, 21), equals(s, 22), equals(s, 23)) UInt8
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 6, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
JOIN TREE
TABLE id: 3, table_name: default.bug
SETTINGS allow_experimental_analyzer=1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
21 1
22 1
23 1
QUERY id: 0
PROJECTION COLUMNS
s UInt64
or(equals(s, 21), equals(22, s), equals(23, s)) UInt8
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3
CONSTANT id: 6, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8)
JOIN TREE
TABLE id: 3, table_name: default.bug
SETTINGS allow_experimental_analyzer=1

View File

@ -5,17 +5,45 @@ insert into bug values(1,21),(1,22),(1,23),(2,21),(2,22),(2,23),(3,21),(3,22),(3
set optimize_min_equality_disjunction_chain_length = 2;
select * from bug;
select * from bug where (k=1 or k=2 or k=3) and (s=21 or s=22 or s=23);
select * from bug where (k=1 or k=2 or k=3) and (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
explain query tree select * from bug where (k=1 or k=2 or k=3) and (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
select * from (select * from bug where k=1 or k=2 or k=3) where (s=21 or s=22 or s=23);
select * from (select * from bug where k=1 or k=2 or k=3) where (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
explain query tree select * from (select * from bug where k=1 or k=2 or k=3) where (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
select k, (k=1 or k=2 or k=3), s, (s=21), (s=21 or s=22), (s=21 or s=22 or s=23) from bug;
select k, (k=1 or k=2 or k=3), s, (s=21), (s=21 or s=22), (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
explain query tree select k, (k=1 or k=2 or k=3), s, (s=21), (s=21 or s=22), (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
select s, (s=21 or s=22 or s=23) from bug;
select s, (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
explain query tree select s, (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
set optimize_min_equality_disjunction_chain_length = 3;
select * from bug;
select * from bug where (k=1 or k=2 or k=3) and (s=21 or s=22 or s=23);
select * from bug where (k=1 or k=2 or k=3) and (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
explain query tree select * from bug where (k=1 or k=2 or k=3) and (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
select * from (select * from bug where k=1 or k=2 or k=3) where (s=21 or s=22 or s=23);
select * from (select * from bug where k=1 or k=2 or k=3) where (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
explain query tree select * from (select * from bug where k=1 or k=2 or k=3) where (s=21 or s=22 or s=23) SETTINGS allow_experimental_analyzer = 1;
select k, (k=1 or k=2 or k=3), s, (s=21), (s=21 or s=22), (s=21 or s=22 or s=23) from bug;
select k, (k=1 or k=2 or k=3), s, (s=21), (s=21 or s=22), (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
explain query tree select k, (k=1 or k=2 or k=3), s, (s=21), (s=21 or s=22), (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
select s, (s=21 or s=22 or s=23) from bug;
select s, (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
explain query tree select s, (s=21 or s=22 or s=23) from bug SETTINGS allow_experimental_analyzer = 1;
select s, (s=21 or 22=s or 23=s) from bug;
select s, (s=21 or 22=s or 23=s) from bug SETTINGS allow_experimental_analyzer = 1;
explain query tree select s, (s=21 or 22=s or 23=s) from bug SETTINGS allow_experimental_analyzer = 1;
DROP TABLE bug;
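-- A minimal sketch of the rewrite this test exercises (the chain_demo table is hypothetical, not part of the
-- original test): once an OR chain of equality comparisons on a single column reaches
-- optimize_min_equality_disjunction_chain_length, it is rewritten to one IN, which EXPLAIN SYNTAX makes visible.
create table chain_demo(x UInt8) Engine=Memory;
set optimize_min_equality_disjunction_chain_length = 3;
explain syntax select * from chain_demo where x=1 or x=2 or x=3; -- expected rewrite: WHERE x IN (1, 2, 3)
explain syntax select * from chain_demo where x=1 or x=2;        -- expected: the OR is kept, the chain is below the limit
drop table chain_demo;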

View File

@ -2,7 +2,6 @@
.
<Debug>
.
<Information>
.
<Error>
-

View File

@ -1,7 +1,7 @@
abcabcabcabcabcabcabcabcabcabc
abcabcabc
sdfggsdfgg
xywq
abcabcabcabcabcabcabcabcabcabcabcabc
sdfggsdfggsdfggsdfggsdfggsdfggsdfggsdfggsdfggsdfgg
@ -20,8 +20,8 @@ sdfggsdfggsdfggsdfggsdfggsdfggsdfggsdfggsdfggsdfgg
xywqxywqxywqxywqxywqxywqxywqxywqxywqxywq
plkfplkfplkfplkfplkfplkfplkfplkfplkfplkf
abcabcabc
abcabc
abc
abcabcabcabcabcabcabcabcabcabcabcabc
abcabcabcabcabcabcabcabcabcabc

View File

@ -3,20 +3,20 @@ DROP TABLE IF EXISTS defaults;
CREATE TABLE defaults
(
strings String,
u8 UInt8,
i8 Int8,
u16 UInt16,
u32 UInt32,
u64 UInt64
)ENGINE = Memory();
INSERT INTO defaults values ('abc', 3, 12, 4, 56) ('sdfgg', 2, 10, 21, 200) ('xywq', 1, 4, 9, 5) ('plkf', 0, 5, 7,77);
INSERT INTO defaults values ('abc', 3, 12, 4, 56) ('sdfgg', -2, 10, 21, 200) ('xywq', -1, 4, 9, 5) ('plkf', 0, 5, 7,77);
SELECT repeat(strings, u8) FROM defaults;
SELECT repeat(strings, i8) FROM defaults;
SELECT repeat(strings, u16) FROM defaults;
SELECT repeat(strings, u32) from defaults;
SELECT repeat(strings, u64) FROM defaults;
SELECT repeat(strings, 10) FROM defaults;
SELECT repeat('abc', u8) FROM defaults;
SELECT repeat('abc', i8) FROM defaults;
SELECT repeat('abc', u16) FROM defaults;
SELECT repeat('abc', u32) FROM defaults;
SELECT repeat('abc', u64) FROM defaults;

View File

@ -1,2 +1,2 @@
CREATE TABLE default.x\n(\n `i` Int32,\n INDEX mm rand() TYPE minmax GRANULARITY 1,\n INDEX nn rand() TYPE minmax GRANULARITY 1,\n PROJECTION p\n (\n SELECT max(i)\n ),\n PROJECTION p2\n (\n SELECT min(i)\n )\n)\nENGINE = ReplicatedMergeTree(\'/clickhouse/tables/default/x\', \'r\')\nORDER BY i\nSETTINGS index_granularity = 8192
metadata format version: 1\ndate column: \nsampling expression: \nindex granularity: 8192\nmode: 0\nsign column: \nprimary key: i\ndata format version: 1\npartition key: \nindices: mm rand() TYPE minmax GRANULARITY 1, nn rand() TYPE minmax GRANULARITY 1\nprojections: p (SELECT max(i)), p2 (SELECT min(i))\ngranularity bytes: 10485760\n
CREATE TABLE default.x\n(\n `i` Int32,\n INDEX mm log2(i) TYPE minmax GRANULARITY 1,\n INDEX nn log2(i) TYPE minmax GRANULARITY 1,\n PROJECTION p\n (\n SELECT max(i)\n ),\n PROJECTION p2\n (\n SELECT min(i)\n )\n)\nENGINE = ReplicatedMergeTree(\'/clickhouse/tables/default/x\', \'r\')\nORDER BY i\nSETTINGS index_granularity = 8192
metadata format version: 1\ndate column: \nsampling expression: \nindex granularity: 8192\nmode: 0\nsign column: \nprimary key: i\ndata format version: 1\npartition key: \nindices: mm log2(i) TYPE minmax GRANULARITY 1, nn log2(i) TYPE minmax GRANULARITY 1\nprojections: p (SELECT max(i)), p2 (SELECT min(i))\ngranularity bytes: 10485760\n

View File

@ -2,9 +2,9 @@
drop table if exists x;
create table x(i int, index mm RAND() type minmax granularity 1, projection p (select MAX(i))) engine ReplicatedMergeTree('/clickhouse/tables/{database}/x', 'r') order by i;
create table x(i int, index mm LOG2(i) type minmax granularity 1, projection p (select MAX(i))) engine ReplicatedMergeTree('/clickhouse/tables/{database}/x', 'r') order by i;
alter table x add index nn RAND() type minmax granularity 1, add projection p2 (select MIN(i));
alter table x add index nn LOG2(i) type minmax granularity 1, add projection p2 (select MIN(i));
show create x;

View File

@ -49,7 +49,16 @@ insert_client_opts=(
timeout 250s $CLICKHOUSE_CLIENT "${client_opts[@]}" "${insert_client_opts[@]}" -q "insert into function remote('127.2', currentDatabase(), in_02232) select * from numbers(1e6)"
# Kill underlying query of remote() to make KILL faster
timeout 60s $CLICKHOUSE_CLIENT "${client_opts[@]}" -q "KILL QUERY WHERE Settings['log_comment'] = '$CLICKHOUSE_LOG_COMMENT' SYNC" --format Null
# This test reproduces very interesting behaviour.
# The block size is 1, so the secondary query creates an InterpreterSelectQuery for each row due to pushing to the MV.
# It works extremely slowly, while the initial query produces new blocks and writes them to the socket much faster
# than the secondary query can read and process them. Therefore, it fills the network buffers in the kernel.
# Once a buffer in the kernel is full, send(...) blocks until the secondary query finishes processing the data
# it already has in ReadBufferFromPocoSocket and calls recv.
# Or until the kernel decides to resize the buffer (it seems to have non-trivial rules for that).
# Either way, the initial query may look stuck, but actually it is not.
# Moreover, the initial query cannot be killed at that point, so KILL QUERY ... SYNC will appear "stuck" as well.
timeout 30s $CLICKHOUSE_CLIENT "${client_opts[@]}" -q "KILL QUERY WHERE query like '%INSERT INTO $CLICKHOUSE_DATABASE.in_02232%' SYNC" --format Null
echo $?
$CLICKHOUSE_CLIENT "${client_opts[@]}" -nm -q "

View File

@ -1,6 +1,62 @@
SELECT a
FROM t_logical_expressions_optimizer_low_cardinality
WHERE a IN (\'x\', \'y\')
QUERY id: 0
PROJECTION COLUMNS
a LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 1
COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
WHERE
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
CONSTANT id: 6, constant_value: Tuple_(\'x\', \'y\'), constant_value_type: Tuple(String, String)
SETTINGS allow_experimental_analyzer=1
SELECT a
FROM t_logical_expressions_optimizer_low_cardinality
WHERE (a = \'x\') OR (\'y\' = a)
QUERY id: 0
PROJECTION COLUMNS
a LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 1
COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
WHERE
FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
CONSTANT id: 6, constant_value: Tuple_(\'x\', \'y\'), constant_value_type: Tuple(String, String)
SETTINGS allow_experimental_analyzer=1
SELECT a
FROM t_logical_expressions_optimizer_low_cardinality
WHERE (b = 0) OR (b = 1)
QUERY id: 0
PROJECTION COLUMNS
a LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 1
COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
WHERE
FUNCTION id: 4, function_name: or, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 5, nodes: 2
FUNCTION id: 6, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 7, nodes: 2
COLUMN id: 8, column_name: b, result_type: UInt32, source_id: 3
CONSTANT id: 9, constant_value: UInt64_0, constant_value_type: UInt8
FUNCTION id: 10, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 11, nodes: 2
COLUMN id: 8, column_name: b, result_type: UInt32, source_id: 3
CONSTANT id: 12, constant_value: UInt64_1, constant_value_type: UInt8
SETTINGS allow_experimental_analyzer=1

View File

@ -4,7 +4,11 @@ CREATE TABLE t_logical_expressions_optimizer_low_cardinality (a LowCardinality(S
-- LowCardinality case: the optimize_min_equality_disjunction_chain_length limit is ignored and the optimizer is applied
EXPLAIN SYNTAX SELECT a FROM t_logical_expressions_optimizer_low_cardinality WHERE a = 'x' OR a = 'y';
EXPLAIN QUERY TREE SELECT a FROM t_logical_expressions_optimizer_low_cardinality WHERE a = 'x' OR a = 'y' SETTINGS allow_experimental_analyzer = 1;
EXPLAIN SYNTAX SELECT a FROM t_logical_expressions_optimizer_low_cardinality WHERE a = 'x' OR 'y' = a;
EXPLAIN QUERY TREE SELECT a FROM t_logical_expressions_optimizer_low_cardinality WHERE a = 'x' OR 'y' = a SETTINGS allow_experimental_analyzer = 1;
-- Non-LowCardinality case: the optimizer is not applied to short chains
EXPLAIN SYNTAX SELECT a FROM t_logical_expressions_optimizer_low_cardinality WHERE b = 0 OR b = 1;
EXPLAIN QUERY TREE SELECT a FROM t_logical_expressions_optimizer_low_cardinality WHERE b = 0 OR b = 1 SETTINGS allow_experimental_analyzer = 1;
DROP TABLE t_logical_expressions_optimizer_low_cardinality;
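-- A minimal sketch of the contrast (the lc_demo table is hypothetical, not part of the original test): with the
-- default chain-length limit of 3, a two-element OR chain is rewritten to IN only for the LowCardinality column,
-- while the plain String column keeps the OR.
CREATE TABLE lc_demo (lc LowCardinality(String), s String) ENGINE = Memory;
EXPLAIN SYNTAX SELECT lc FROM lc_demo WHERE lc = 'x' OR lc = 'y'; -- expected rewrite: WHERE lc IN ('x', 'y')
EXPLAIN SYNTAX SELECT s FROM lc_demo WHERE s = 'x' OR s = 'y';    -- expected: the OR is kept
DROP TABLE lc_demo;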

View File

@ -0,0 +1,256 @@
-- { echoOn }
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
0 0 0
1 1 1
2 0 2
3 1 3
4 0 4
5 1 5
6 0 6
7 1 7
8 0 8
9 1 9
\N \N 45
set optimize_group_by_function_keys = 0;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
0 0 0
0 \N 0
1 1 1
1 \N 1
2 0 2
2 \N 2
3 1 3
3 \N 3
4 0 4
4 \N 4
5 1 5
5 \N 5
6 0 6
6 \N 6
7 1 7
7 \N 7
8 0 8
8 \N 8
9 1 9
9 \N 9
\N \N 45
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=0;
0 0 0
0 0 0
0 0 45
1 0 1
1 1 1
2 0 2
2 0 2
3 0 3
3 1 3
4 0 4
4 0 4
5 0 5
5 1 5
6 0 6
6 0 6
7 0 7
7 1 7
8 0 8
8 0 8
9 0 9
9 1 9
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY CUBE(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
0 0 0
0 \N 0
1 1 1
1 \N 1
2 0 2
2 \N 2
3 1 3
3 \N 3
4 0 4
4 \N 4
5 1 5
5 \N 5
6 0 6
6 \N 6
7 1 7
7 \N 7
8 0 8
8 \N 8
9 1 9
9 \N 9
\N 0 20
\N 1 25
\N \N 45
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY CUBE(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=0;
0 0 0
0 0 0
0 0 20
0 0 45
0 1 25
1 0 1
1 1 1
2 0 2
2 0 2
3 0 3
3 1 3
4 0 4
4 0 4
5 0 5
5 1 5
6 0 6
6 0 6
7 0 7
7 1 7
8 0 8
8 0 8
9 0 9
9 1 9
SELECT
number,
number % 2,
sum(number) AS val
FROM numbers(10)
GROUP BY
GROUPING SETS (
(number),
(number % 2)
)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls = 1;
0 \N 0
1 \N 1
2 \N 2
3 \N 3
4 \N 4
5 \N 5
6 \N 6
7 \N 7
8 \N 8
9 \N 9
\N 0 20
\N 1 25
SELECT
number,
number % 2,
sum(number) AS val
FROM numbers(10)
GROUP BY
GROUPING SETS (
(number),
(number % 2)
)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls = 0;
0 0 0
0 0 20
0 1 25
1 0 1
2 0 2
3 0 3
4 0 4
5 0 5
6 0 6
7 0 7
8 0 8
9 0 9
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2) WITH TOTALS
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
0 0 0
0 \N 0
1 1 1
1 \N 1
2 0 2
2 \N 2
3 1 3
3 \N 3
4 0 4
4 \N 4
5 1 5
5 \N 5
6 0 6
6 \N 6
7 1 7
7 \N 7
8 0 8
8 \N 8
9 1 9
9 \N 9
\N \N 45
0 0 45
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY CUBE(number, number % 2) WITH TOTALS
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
0 0 0
0 \N 0
1 1 1
1 \N 1
2 0 2
2 \N 2
3 1 3
3 \N 3
4 0 4
4 \N 4
5 1 5
5 \N 5
6 0 6
6 \N 6
7 1 7
7 \N 7
8 0 8
8 \N 8
9 1 9
9 \N 9
\N 0 20
\N 1 25
\N \N 45
0 0 45
SELECT
number,
number % 2,
sum(number) AS val
FROM numbers(10)
GROUP BY
GROUPING SETS (
(number),
(number % 2)
)
ORDER BY 1, tuple(val)
SETTINGS group_by_use_nulls = 1, max_bytes_before_external_sort=10;
0 \N 0
1 \N 1
2 \N 2
3 \N 3
4 \N 4
5 \N 5
6 \N 6
7 \N 7
8 \N 8
9 \N 9
\N 0 20
\N 1 25

View File

@ -0,0 +1,85 @@
SET allow_experimental_analyzer=1;
-- { echoOn }
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
set optimize_group_by_function_keys = 0;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=0;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY CUBE(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY CUBE(number, number % 2)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=0;
SELECT
number,
number % 2,
sum(number) AS val
FROM numbers(10)
GROUP BY
GROUPING SETS (
(number),
(number % 2)
)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls = 1;
SELECT
number,
number % 2,
sum(number) AS val
FROM numbers(10)
GROUP BY
GROUPING SETS (
(number),
(number % 2)
)
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls = 0;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY ROLLUP(number, number % 2) WITH TOTALS
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
SELECT number, number % 2, sum(number) AS val
FROM numbers(10)
GROUP BY CUBE(number, number % 2) WITH TOTALS
ORDER BY (number, number % 2, val)
SETTINGS group_by_use_nulls=1;
SELECT
number,
number % 2,
sum(number) AS val
FROM numbers(10)
GROUP BY
GROUPING SETS (
(number),
(number % 2)
)
ORDER BY 1, tuple(val)
SETTINGS group_by_use_nulls = 1, max_bytes_before_external_sort=10;
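-- A minimal sketch of what the setting changes (illustrative, not part of the original test): with
-- group_by_use_nulls = 1 the keys absent from the current grouping set are padded with NULL, as the SQL standard
-- prescribes; with 0 they are padded with the type's default value, so the ROLLUP total row differs only in the key.
SELECT number % 2 AS parity, sum(number) AS val
FROM numbers(4)
GROUP BY ROLLUP(parity)
ORDER BY parity
SETTINGS group_by_use_nulls = 1; -- expected: (0, 2), (1, 4) and the total row (NULL, 6)
SELECT number % 2 AS parity, sum(number) AS val
FROM numbers(4)
GROUP BY ROLLUP(parity)
ORDER BY parity
SETTINGS group_by_use_nulls = 0; -- expected: (0, 2), (1, 4) and the total row (0, 6)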

View File

@ -6,7 +6,7 @@ Arrow
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
Parquet
ipv6 Nullable(FixedString(16))
ipv4 Nullable(Int64)
ipv4 Nullable(UInt32)
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
ORC
ipv6 Nullable(String)

View File

@ -47,6 +47,7 @@ $CLICKHOUSE_CLIENT --query "SYSTEM FLUSH LOGS"
$CLICKHOUSE_CLIENT --query "
SELECT 'id_' || splitByChar('_', query_id)[1] AS id FROM system.text_log
WHERE query_id LIKE '%$query_id_suffix' AND message LIKE '%$message%'
ORDER BY id
"
$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS t_async_insert_fallback"

View File

@ -0,0 +1,89 @@
1 test
3 another
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02668_logical_optimizer
WHERE
FUNCTION id: 5, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 6, nodes: 2
COLUMN id: 7, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 8, constant_value: Tuple_(UInt64_1, UInt64_3), constant_value_type: Tuple(UInt8, UInt8)
1 test
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02668_logical_optimizer
WHERE
FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 6, nodes: 2
COLUMN id: 7, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 8, constant_value: UInt64_1, constant_value_type: UInt8
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02668_logical_optimizer
WHERE
CONSTANT id: 5, constant_value: UInt64_0, constant_value_type: UInt8
3 another
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02668_logical_optimizer
WHERE
FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 6, nodes: 2
FUNCTION id: 7, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 8, nodes: 2
COLUMN id: 9, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: LowCardinality(UInt8)
ARGUMENTS
LIST id: 12, nodes: 2
COLUMN id: 13, column_name: b, result_type: LowCardinality(String), source_id: 3
CONSTANT id: 14, constant_value: \'another\', constant_value_type: String
2 test2
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02668_logical_optimizer
WHERE
FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 6, nodes: 2
COLUMN id: 7, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 8, constant_value: UInt64_2, constant_value_type: UInt8

View File

@ -0,0 +1,26 @@
SET allow_experimental_analyzer = 1;
DROP TABLE IF EXISTS 02668_logical_optimizer;
CREATE TABLE 02668_logical_optimizer
(a Int32, b LowCardinality(String))
ENGINE=Memory;
INSERT INTO 02668_logical_optimizer VALUES (1, 'test'), (2, 'test2'), (3, 'another');
SET optimize_min_equality_disjunction_chain_length = 2;
SELECT * FROM 02668_logical_optimizer WHERE a = 1 OR 3 = a OR 1 = a;
EXPLAIN QUERY TREE SELECT * FROM 02668_logical_optimizer WHERE a = 1 OR 3 = a OR 1 = a;
SELECT * FROM 02668_logical_optimizer WHERE a = 1 OR 1 = a;
EXPLAIN QUERY TREE SELECT * FROM 02668_logical_optimizer WHERE a = 1 OR 1 = a;
SELECT * FROM 02668_logical_optimizer WHERE a = 1 AND 2 = a;
EXPLAIN QUERY TREE SELECT * FROM 02668_logical_optimizer WHERE a = 1 AND 2 = a;
SELECT * FROM 02668_logical_optimizer WHERE 3 = a AND b = 'another' AND a = 3;
EXPLAIN QUERY TREE SELECT * FROM 02668_logical_optimizer WHERE a = 3 AND b = 'another' AND a = 3;
SELECT * FROM 02668_logical_optimizer WHERE a = 2 AND 2 = a;
EXPLAIN QUERY TREE SELECT * FROM 02668_logical_optimizer WHERE a = 2 AND 2 = a;

View File

@ -0,0 +1,25 @@
DROP TABLE IF EXISTS t_constant_index;
CREATE TABLE t_constant_index
(
id UInt64,
INDEX t_constant_index 'foo' TYPE set(2) GRANULARITY 1
) ENGINE = MergeTree
ORDER BY id; -- { serverError INCORRECT_QUERY }
CREATE TABLE t_constant_index
(
id UInt64,
INDEX t_constant_index id + rand() TYPE set(2) GRANULARITY 1
) ENGINE = MergeTree
ORDER BY id; -- { serverError BAD_ARGUMENTS }
CREATE TABLE t_constant_index
(
id UInt64,
INDEX t_constant_index id * 2 TYPE set(2) GRANULARITY 1
) ENGINE = MergeTree
ORDER BY id;
DROP TABLE t_constant_index;

View File

@ -0,0 +1,47 @@
-- { echoOn }
set allow_experimental_analyzer=1;
set optimize_trivial_count_query=1;
create table m3(a Int64, b UInt64) Engine=MergeTree order by tuple();
select count() from m3;
0
insert into m3 values (0,0);
insert into m3 values (-1,1);
select trimBoth(explain) from (explain select count() from m3) where explain like '%ReadFromPreparedSource (Optimized trivial count)%';
ReadFromPreparedSource (Optimized trivial count)
select count() from m3;
2
select count(*) from m3;
2
select count(a) from m3;
2
select count(b) from m3;
2
select count() + 1 from m3;
3
drop table m3;
-- checking queries with FINAL
create table replacing_m3(a Int64, b UInt64) Engine=ReplacingMergeTree() order by (a, b);
SYSTEM STOP MERGES replacing_m3;
select count() from replacing_m3;
0
insert into replacing_m3 values (0,0);
insert into replacing_m3 values (0,0);
insert into replacing_m3 values (-1,1);
insert into replacing_m3 values (-2,2);
select trimBoth(explain) from (explain select count() from replacing_m3) where explain like '%ReadFromPreparedSource (Optimized trivial count)%';
ReadFromPreparedSource (Optimized trivial count)
select count() from replacing_m3;
4
select count(*) from replacing_m3;
4
select count(a) from replacing_m3;
4
select count(b) from replacing_m3;
4
select count() from replacing_m3 FINAL;
3
select count(a) from replacing_m3 FINAL;
3
select count(b) from replacing_m3 FINAL;
3
drop table replacing_m3;

View File

@ -0,0 +1,45 @@
drop table if exists m3;
drop table if exists replacing_m3;
-- { echoOn }
set allow_experimental_analyzer=1;
set optimize_trivial_count_query=1;
create table m3(a Int64, b UInt64) Engine=MergeTree order by tuple();
select count() from m3;
insert into m3 values (0,0);
insert into m3 values (-1,1);
select trimBoth(explain) from (explain select count() from m3) where explain like '%ReadFromPreparedSource (Optimized trivial count)%';
select count() from m3;
select count(*) from m3;
select count(a) from m3;
select count(b) from m3;
select count() + 1 from m3;
drop table m3;
-- checking queries with FINAL
create table replacing_m3(a Int64, b UInt64) Engine=ReplacingMergeTree() order by (a, b);
SYSTEM STOP MERGES replacing_m3;
select count() from replacing_m3;
insert into replacing_m3 values (0,0);
insert into replacing_m3 values (0,0);
insert into replacing_m3 values (-1,1);
insert into replacing_m3 values (-2,2);
select trimBoth(explain) from (explain select count() from replacing_m3) where explain like '%ReadFromPreparedSource (Optimized trivial count)%';
select count() from replacing_m3;
select count(*) from replacing_m3;
select count(a) from replacing_m3;
select count(b) from replacing_m3;
select count() from replacing_m3 FINAL;
select count(a) from replacing_m3 FINAL;
select count(b) from replacing_m3 FINAL;
drop table replacing_m3;
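-- A minimal sketch (the m3_demo table is hypothetical, not part of the original test): the trivial count path
-- answers count() from part metadata without reading any column, which the plan makes visible; adding a WHERE
-- clause disqualifies it and forces a normal read.
create table m3_demo(a Int64) Engine=MergeTree order by tuple();
insert into m3_demo select number from numbers(10);
select trimBoth(explain) from (explain select count() from m3_demo) where explain like '%Optimized trivial count%';
select count() from m3_demo where a >= 5; -- expected: 5, computed by scanning rather than via the trivial count path
drop table m3_demo;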

View File

@ -0,0 +1 @@
1000

View File

@ -0,0 +1,16 @@
DROP TABLE IF EXISTS test_grace_hash;
CREATE TABLE test_grace_hash (id UInt32, value UInt64) ENGINE = MergeTree ORDER BY id;
INSERT INTO test_grace_hash SELECT number, number % 100 = 0 FROM numbers(100000);
SET join_algorithm = 'grace_hash';
SELECT count() FROM (
SELECT f.id FROM test_grace_hash AS f
LEFT JOIN test_grace_hash AS d
ON f.id = d.id
LIMIT 1000
);
DROP TABLE test_grace_hash;
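-- A minimal sketch of why grace_hash scales (the grace_demo table is illustrative, and
-- grace_hash_join_initial_buckets is assumed to be available in this version): both join inputs are split into
-- buckets that spill to disk, so only one bucket's hash table has to fit in memory at a time.
SET join_algorithm = 'grace_hash';
SET grace_hash_join_initial_buckets = 4;
CREATE TABLE grace_demo (id UInt32) ENGINE = MergeTree ORDER BY id;
INSERT INTO grace_demo SELECT number FROM numbers(100000);
SELECT count() FROM grace_demo AS f LEFT JOIN grace_demo AS d ON f.id = d.id; -- expected: 100000
DROP TABLE grace_demo;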

View File

@ -8,3 +8,25 @@
59183 1336
33010362 1336
800784 1336
-- { echoOn }
set allow_experimental_analyzer = 1;
SELECT
CounterID AS k,
quantileBFloat16(0.5)(ResolutionWidth)
FROM remote('127.0.0.{1,2}', test, hits)
GROUP BY k
ORDER BY
count() DESC,
CounterID ASC
LIMIT 10
SETTINGS group_by_use_nulls = 1;
1704509 1384
732797 1336
598875 1384
792887 1336
3807842 1336
25703952 1336
716829 1384
59183 1336
33010362 1336
800784 1336

View File

@ -8,3 +8,28 @@ ORDER BY
CounterID ASC
LIMIT 10
SETTINGS group_by_use_nulls = 1;
SELECT
CounterID AS k,
quantileBFloat16(0.5)(ResolutionWidth)
FROM test.hits
GROUP BY k
ORDER BY
count() DESC,
CounterID ASC
LIMIT 10
SETTINGS group_by_use_nulls = 1 FORMAT Null;
-- { echoOn }
set allow_experimental_analyzer = 1;
SELECT
CounterID AS k,
quantileBFloat16(0.5)(ResolutionWidth)
FROM remote('127.0.0.{1,2}', test, hits)
GROUP BY k
ORDER BY
count() DESC,
CounterID ASC
LIMIT 10
SETTINGS group_by_use_nulls = 1;