diff --git a/CHANGELOG.md b/CHANGELOG.md index 283000f1804..1b36142cc9f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -22,7 +22,7 @@ * The MergeTree setting `clean_deleted_rows` is deprecated; it no longer has any effect. The `CLEANUP` keyword for `OPTIMIZE` is not allowed by default (it can be unlocked with the `allow_experimental_replacing_merge_with_cleanup` setting). [#58267](https://github.com/ClickHouse/ClickHouse/pull/58267) ([Alexander Tokmakov](https://github.com/tavplubix)). This fixes [#57930](https://github.com/ClickHouse/ClickHouse/issues/57930). This closes [#54988](https://github.com/ClickHouse/ClickHouse/issues/54988). This closes [#54570](https://github.com/ClickHouse/ClickHouse/issues/54570). This closes [#50346](https://github.com/ClickHouse/ClickHouse/issues/50346). This closes [#47579](https://github.com/ClickHouse/ClickHouse/issues/47579). The feature has to be removed because it is not good. We have to remove it as quickly as possible, because there is no other option. [#57932](https://github.com/ClickHouse/ClickHouse/pull/57932) ([Alexey Milovidov](https://github.com/alexey-milovidov)). #### New Feature -* Implement Refreshable Materialized Views, requested in [#33919](https://github.com/ClickHouse/ClickHouse/issues/57995). [#56946](https://github.com/ClickHouse/ClickHouse/pull/56946) ([Michael Kolupaev](https://github.com/al13n321), [Michael Guzov](https://github.com/koloshmet)). +* Implement Refreshable Materialized Views, requested in [#33919](https://github.com/ClickHouse/ClickHouse/issues/33919). [#56946](https://github.com/ClickHouse/ClickHouse/pull/56946) ([Michael Kolupaev](https://github.com/al13n321), [Michael Guzov](https://github.com/koloshmet)). * Introduce `PASTE JOIN`, which allows users to join tables without an `ON` clause, simply by row number. Example: `SELECT * FROM (SELECT number AS a FROM numbers(2)) AS t1 PASTE JOIN (SELECT number AS a FROM numbers(2) ORDER BY a DESC) AS t2`. [#57995](https://github.com/ClickHouse/ClickHouse/pull/57995) ([Yarik Briukhovetskyi](https://github.com/yariks5s)). * The `ORDER BY` clause now supports specifying `ALL`, meaning that ClickHouse sorts by all columns in the `SELECT` clause. Example: `SELECT col1, col2 FROM tab WHERE [...] ORDER BY ALL`. [#57875](https://github.com/ClickHouse/ClickHouse/pull/57875) ([zhongyuankai](https://github.com/zhongyuankai)). * Added a new mutation command `ALTER TABLE APPLY DELETED MASK`, which allows enforcing the application of the mask written by lightweight delete and removing rows marked as deleted from disk. [#57433](https://github.com/ClickHouse/ClickHouse/pull/57433) ([Anton Popov](https://github.com/CurtizJ)). @@ -375,6 +375,7 @@ * Do not interpret the `send_timeout` set on the client side as the `receive_timeout` on the server side and vice versa. [#56035](https://github.com/ClickHouse/ClickHouse/pull/56035) ([Azat Khuzhin](https://github.com/azat)). * Comparison of time intervals with different units will throw an exception. This closes [#55942](https://github.com/ClickHouse/ClickHouse/issues/55942). You might have occasionally relied on the previous behavior, when the underlying numeric values were compared regardless of the units. [#56090](https://github.com/ClickHouse/ClickHouse/pull/56090) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Rewrote the experimental `S3Queue` table engine completely: changed the way we keep information in ZooKeeper, which allows making fewer ZooKeeper requests; added caching of the ZooKeeper state in cases when we know the state will not change; made the polling from S3 less aggressive; and changed the way the TTL and the maximum size of the set of tracked files are maintained (it is now a background process). Added `system.s3queue` and `system.s3queue_log` tables. Closes [#54998](https://github.com/ClickHouse/ClickHouse/issues/54998). [#54422](https://github.com/ClickHouse/ClickHouse/pull/54422) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Arbitrary paths on the HTTP endpoint are no longer interpreted as a request to the `/query` endpoint. [#55521](https://github.com/ClickHouse/ClickHouse/pull/55521) ([Konstantin Bogdanov](https://github.com/thevar1able)). #### New Feature * Add function `arrayFold(accumulator, x1, ..., xn -> expression, initial, array1, ..., arrayn)` which applies a lambda function to multiple arrays of the same cardinality and collects the result in an accumulator. [#49794](https://github.com/ClickHouse/ClickHouse/pull/49794) ([Lirikl](https://github.com/Lirikl)). diff --git a/docker/keeper/Dockerfile b/docker/keeper/Dockerfile index 145f5d13cc2..4b5e8cd3970 100644 --- a/docker/keeper/Dockerfile +++ b/docker/keeper/Dockerfile @@ -34,7 +34,7 @@ RUN arch=${TARGETARCH:-amd64} \ # lts / testing / prestable / etc ARG REPO_CHANNEL="stable" ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}" -ARG VERSION="23.12.1.1368" +ARG VERSION="23.12.2.59" ARG PACKAGES="clickhouse-keeper" ARG DIRECT_DOWNLOAD_URLS="" diff --git a/docker/server/Dockerfile.alpine b/docker/server/Dockerfile.alpine index 26d65eb3ccc..452d8539a48 100644 --- a/docker/server/Dockerfile.alpine +++ b/docker/server/Dockerfile.alpine @@ -32,7 +32,7 @@ RUN arch=${TARGETARCH:-amd64} \ # lts / testing / prestable / etc ARG REPO_CHANNEL="stable" ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}" -ARG VERSION="23.12.1.1368" +ARG VERSION="23.12.2.59" ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static" ARG DIRECT_DOWNLOAD_URLS="" diff --git a/docker/server/Dockerfile.ubuntu b/docker/server/Dockerfile.ubuntu index 5b96b208b11..0cefa3c14cb 100644 --- a/docker/server/Dockerfile.ubuntu +++ b/docker/server/Dockerfile.ubuntu @@ -30,7 +30,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list ARG REPO_CHANNEL="stable" ARG REPOSITORY="deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb ${REPO_CHANNEL} main" -ARG VERSION="23.12.1.1368" +ARG VERSION="23.12.2.59" ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static" # set non-empty deb_location_url url to create a docker image diff --git a/docker/test/stateless/stress_tests.lib b/docker/test/stateless/stress_tests.lib index 8f89c1b80dd..6f0dabb5207 100644 --- a/docker/test/stateless/stress_tests.lib +++ b/docker/test/stateless/stress_tests.lib @@ -236,6 +236,10 @@ function check_logs_for_critical_errors() && echo -e "S3_ERROR No such key thrown (see clickhouse-server.log or no_such_key_errors.txt)$FAIL$(trim_server_logs no_such_key_errors.txt)" >> /test_output/test_results.tsv \ || echo -e "No lost s3 keys$OK" >> /test_output/test_results.tsv + rg -Fa "it is lost forever" /var/log/clickhouse-server/clickhouse-server*.log | grep 'SharedMergeTreePartCheckThread' > /dev/null \ + && echo -e "Lost forever for SharedMergeTree$FAIL" >>
/test_output/test_results.tsv \ + || echo -e "No SharedMergeTree lost forever in clickhouse-server.log$OK" >> /test_output/test_results.tsv + # Remove file no_such_key_errors.txt if it's empty [ -s /test_output/no_such_key_errors.txt ] || rm /test_output/no_such_key_errors.txt diff --git a/docs/changelogs/v23.10.6.60-stable.md b/docs/changelogs/v23.10.6.60-stable.md new file mode 100644 index 00000000000..5e1c126e729 --- /dev/null +++ b/docs/changelogs/v23.10.6.60-stable.md @@ -0,0 +1,51 @@ +--- +sidebar_position: 1 +sidebar_label: 2024 +--- + +# 2024 Changelog + +### ClickHouse release v23.10.6.60-stable (68907bbe643) FIXME as compared to v23.10.5.20-stable (e84001e5c61) + +#### Improvement +* Backported in [#58493](https://github.com/ClickHouse/ClickHouse/issues/58493): Fix the transformation of queries to MySQL-compatible queries. Fixes [#57253](https://github.com/ClickHouse/ClickHouse/issues/57253). Fixes [#52654](https://github.com/ClickHouse/ClickHouse/issues/52654). Fixes [#56729](https://github.com/ClickHouse/ClickHouse/issues/56729). [#56456](https://github.com/ClickHouse/ClickHouse/pull/56456) ([flynn](https://github.com/ucasfl)). +* Backported in [#57659](https://github.com/ClickHouse/ClickHouse/issues/57659): Handle the SIGABRT case when getting a PostgreSQL table structure with an empty array. [#57618](https://github.com/ClickHouse/ClickHouse/pull/57618) ([Mike Kot (Михаил Кот)](https://github.com/myrrc)). + +#### Build/Testing/Packaging Improvement +* Backported in [#57586](https://github.com/ClickHouse/ClickHouse/issues/57586): Fix issue caught in https://github.com/docker-library/official-images/pull/15846. [#57571](https://github.com/ClickHouse/ClickHouse/pull/57571) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). + +#### Bug Fix (user-visible misbehavior in an official stable release) + +* Flatten only true Nested type if flatten_nested=1, not all Array(Tuple) [#56132](https://github.com/ClickHouse/ClickHouse/pull/56132) ([Kruglov Pavel](https://github.com/Avogar)). +* Fix ALTER COLUMN with ALIAS [#56493](https://github.com/ClickHouse/ClickHouse/pull/56493) ([Nikolay Degterinsky](https://github.com/evillique)). +* Prevent incompatible ALTER of projection columns [#56948](https://github.com/ClickHouse/ClickHouse/pull/56948) ([Amos Bird](https://github.com/amosbird)). +* Fix segfault after ALTER UPDATE with Nullable MATERIALIZED column [#57147](https://github.com/ClickHouse/ClickHouse/pull/57147) ([Nikolay Degterinsky](https://github.com/evillique)). +* Fix incorrect JOIN plan optimization with partially materialized normal projection [#57196](https://github.com/ClickHouse/ClickHouse/pull/57196) ([Amos Bird](https://github.com/amosbird)). +* Fix `ReadonlyReplica` metric for all cases [#57267](https://github.com/ClickHouse/ClickHouse/pull/57267) ([Antonio Andelic](https://github.com/antonio2368)). +* Background merges correctly use temporary data storage in the cache [#57275](https://github.com/ClickHouse/ClickHouse/pull/57275) ([vdimir](https://github.com/vdimir)). +* MergeTree mutations reuse source part index granularity [#57352](https://github.com/ClickHouse/ClickHouse/pull/57352) ([Maksim Kita](https://github.com/kitaisreal)). +* Fix function jsonMergePatch for partially const columns [#57379](https://github.com/ClickHouse/ClickHouse/pull/57379) ([Nikolay Degterinsky](https://github.com/evillique)). +* Fix working with read buffers in StreamingFormatExecutor [#57438](https://github.com/ClickHouse/ClickHouse/pull/57438) ([Kruglov Pavel](https://github.com/Avogar)).
+* bugfix: correctly parse SYSTEM STOP LISTEN TCP SECURE [#57483](https://github.com/ClickHouse/ClickHouse/pull/57483) ([joelynch](https://github.com/joelynch)). +* Ignore ON CLUSTER clause in grant/revoke queries for management of replicated access entities. [#57538](https://github.com/ClickHouse/ClickHouse/pull/57538) ([MikhailBurdukov](https://github.com/MikhailBurdukov)). +* Disable system.kafka_consumers by default (due to possible live memory leak) [#57822](https://github.com/ClickHouse/ClickHouse/pull/57822) ([Azat Khuzhin](https://github.com/azat)). +* Fix invalid memory access in BLAKE3 (Rust) [#57876](https://github.com/ClickHouse/ClickHouse/pull/57876) ([Raúl Marín](https://github.com/Algunenano)). +* Normalize function names in CREATE INDEX [#57906](https://github.com/ClickHouse/ClickHouse/pull/57906) ([Alexander Tokmakov](https://github.com/tavplubix)). +* Fix invalid preprocessing on Keeper [#58069](https://github.com/ClickHouse/ClickHouse/pull/58069) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix Integer overflow in Poco::UTF32Encoding [#58073](https://github.com/ClickHouse/ClickHouse/pull/58073) ([Andrey Fedotov](https://github.com/anfedotoff)). +* Remove parallel parsing for JSONCompactEachRow [#58181](https://github.com/ClickHouse/ClickHouse/pull/58181) ([Alexey Milovidov](https://github.com/alexey-milovidov)). +* Fix parallel parsing for JSONCompactEachRow [#58250](https://github.com/ClickHouse/ClickHouse/pull/58250) ([Kruglov Pavel](https://github.com/Avogar)). +* Fix lost blobs after dropping a replica with broken detached parts [#58333](https://github.com/ClickHouse/ClickHouse/pull/58333) ([Alexander Tokmakov](https://github.com/tavplubix)). +* MergeTreePrefetchedReadPool disable for LIMIT only queries [#58505](https://github.com/ClickHouse/ClickHouse/pull/58505) ([Maksim Kita](https://github.com/kitaisreal)). + +#### NO CL CATEGORY + +* Backported in [#57916](https://github.com/ClickHouse/ClickHouse/issues/57916):. [#57909](https://github.com/ClickHouse/ClickHouse/pull/57909) ([Alexey Milovidov](https://github.com/alexey-milovidov)). + +#### NOT FOR CHANGELOG / INSIGNIFICANT + +* Pin alpine version of integration tests helper container [#57669](https://github.com/ClickHouse/ClickHouse/pull/57669) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). +* Remove heavy rust stable toolchain [#57905](https://github.com/ClickHouse/ClickHouse/pull/57905) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). +* Fix docker image for integration tests (fixes CI) [#57952](https://github.com/ClickHouse/ClickHouse/pull/57952) ([Azat Khuzhin](https://github.com/azat)). +* Fix test_user_valid_until [#58409](https://github.com/ClickHouse/ClickHouse/pull/58409) ([Nikolay Degterinsky](https://github.com/evillique)). + diff --git a/docs/changelogs/v23.11.4.24-stable.md b/docs/changelogs/v23.11.4.24-stable.md new file mode 100644 index 00000000000..40096285b06 --- /dev/null +++ b/docs/changelogs/v23.11.4.24-stable.md @@ -0,0 +1,26 @@ +--- +sidebar_position: 1 +sidebar_label: 2024 +--- + +# 2024 Changelog + +### ClickHouse release v23.11.4.24-stable (e79d840d7fe) FIXME as compared to v23.11.3.23-stable (a14ab450b0e) + +#### Bug Fix (user-visible misbehavior in an official stable release) + +* Flatten only true Nested type if flatten_nested=1, not all Array(Tuple) [#56132](https://github.com/ClickHouse/ClickHouse/pull/56132) ([Kruglov Pavel](https://github.com/Avogar)). 
+* Fix working with read buffers in StreamingFormatExecutor [#57438](https://github.com/ClickHouse/ClickHouse/pull/57438) ([Kruglov Pavel](https://github.com/Avogar)). +* Disable system.kafka_consumers by default (due to possible live memory leak) [#57822](https://github.com/ClickHouse/ClickHouse/pull/57822) ([Azat Khuzhin](https://github.com/azat)). +* Fix invalid preprocessing on Keeper [#58069](https://github.com/ClickHouse/ClickHouse/pull/58069) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix Integer overflow in Poco::UTF32Encoding [#58073](https://github.com/ClickHouse/ClickHouse/pull/58073) ([Andrey Fedotov](https://github.com/anfedotoff)). +* Remove parallel parsing for JSONCompactEachRow [#58181](https://github.com/ClickHouse/ClickHouse/pull/58181) ([Alexey Milovidov](https://github.com/alexey-milovidov)). +* Fix parallel parsing for JSONCompactEachRow [#58250](https://github.com/ClickHouse/ClickHouse/pull/58250) ([Kruglov Pavel](https://github.com/Avogar)). +* Fix lost blobs after dropping a replica with broken detached parts [#58333](https://github.com/ClickHouse/ClickHouse/pull/58333) ([Alexander Tokmakov](https://github.com/tavplubix)). +* MergeTreePrefetchedReadPool disable for LIMIT only queries [#58505](https://github.com/ClickHouse/ClickHouse/pull/58505) ([Maksim Kita](https://github.com/kitaisreal)). + +#### NOT FOR CHANGELOG / INSIGNIFICANT + +* Handle another case for preprocessing in Keeper [#58308](https://github.com/ClickHouse/ClickHouse/pull/58308) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix test_user_valid_until [#58409](https://github.com/ClickHouse/ClickHouse/pull/58409) ([Nikolay Degterinsky](https://github.com/evillique)). + diff --git a/docs/changelogs/v23.12.2.59-stable.md b/docs/changelogs/v23.12.2.59-stable.md new file mode 100644 index 00000000000..6533f4e6b86 --- /dev/null +++ b/docs/changelogs/v23.12.2.59-stable.md @@ -0,0 +1,32 @@ +--- +sidebar_position: 1 +sidebar_label: 2024 +--- + +# 2024 Changelog + +### ClickHouse release v23.12.2.59-stable (17ab210e761) FIXME as compared to v23.12.1.1368-stable (a2faa65b080) + +#### Backward Incompatible Change +* Backported in [#58389](https://github.com/ClickHouse/ClickHouse/issues/58389): The MergeTree setting `clean_deleted_rows` is deprecated; it no longer has any effect. The `CLEANUP` keyword for `OPTIMIZE` is not allowed by default (unless `allow_experimental_replacing_merge_with_cleanup` is enabled). See the usage sketch below. [#58316](https://github.com/ClickHouse/ClickHouse/pull/58316) ([Alexander Tokmakov](https://github.com/tavplubix)). + +#### Bug Fix (user-visible misbehavior in an official stable release) + +* Flatten only true Nested type if flatten_nested=1, not all Array(Tuple) [#56132](https://github.com/ClickHouse/ClickHouse/pull/56132) ([Kruglov Pavel](https://github.com/Avogar)). +* Fix working with read buffers in StreamingFormatExecutor [#57438](https://github.com/ClickHouse/ClickHouse/pull/57438) ([Kruglov Pavel](https://github.com/Avogar)). +* Fix lost blobs after dropping a replica with broken detached parts [#58333](https://github.com/ClickHouse/ClickHouse/pull/58333) ([Alexander Tokmakov](https://github.com/tavplubix)). +* Fix segfault when graphite table does not have agg function [#58453](https://github.com/ClickHouse/ClickHouse/pull/58453) ([Duc Canh Le](https://github.com/canhld94)). +* MergeTreePrefetchedReadPool disable for LIMIT only queries [#58505](https://github.com/ClickHouse/ClickHouse/pull/58505) ([Maksim Kita](https://github.com/kitaisreal)).
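To make the backward-incompatible `CLEANUP` change above concrete, here is a minimal sketch. The table name `events` is hypothetical; `CLEANUP` applies to `ReplacingMergeTree` tables with an `is_deleted` column, and `allow_experimental_replacing_merge_with_cleanup` is a MergeTree-level setting:

```sql
-- Hypothetical ReplacingMergeTree table; since 23.12 the CLEANUP keyword is
-- rejected unless the experimental MergeTree setting below is enabled first.
ALTER TABLE events MODIFY SETTING allow_experimental_replacing_merge_with_cleanup = 1;

-- Merges all parts and drops rows marked as deleted during the merge.
OPTIMIZE TABLE events FINAL CLEANUP;
```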
+ +#### NO CL ENTRY + +* NO CL ENTRY: 'Revert "Refreshable materialized views (takeover)"'. [#58296](https://github.com/ClickHouse/ClickHouse/pull/58296) ([Alexander Tokmakov](https://github.com/tavplubix)). + +#### NOT FOR CHANGELOG / INSIGNIFICANT + +* Fix an error in the release script - it didn't allow making 23.12. [#58288](https://github.com/ClickHouse/ClickHouse/pull/58288) ([Alexey Milovidov](https://github.com/alexey-milovidov)). +* Update version_date.tsv and changelogs after v23.12.1.1368-stable [#58290](https://github.com/ClickHouse/ClickHouse/pull/58290) ([robot-clickhouse](https://github.com/robot-clickhouse)). +* Fix test_storage_s3_queue/test.py::test_drop_table [#58293](https://github.com/ClickHouse/ClickHouse/pull/58293) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Handle another case for preprocessing in Keeper [#58308](https://github.com/ClickHouse/ClickHouse/pull/58308) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix test_user_valid_until [#58409](https://github.com/ClickHouse/ClickHouse/pull/58409) ([Nikolay Degterinsky](https://github.com/evillique)). + diff --git a/docs/changelogs/v23.3.19.32-lts.md b/docs/changelogs/v23.3.19.32-lts.md new file mode 100644 index 00000000000..4604c986fe6 --- /dev/null +++ b/docs/changelogs/v23.3.19.32-lts.md @@ -0,0 +1,36 @@ +--- +sidebar_position: 1 +sidebar_label: 2024 +--- + +# 2024 Changelog + +### ClickHouse release v23.3.19.32-lts (c4d4ca8ec02) FIXME as compared to v23.3.18.15-lts (7228475d77a) + +#### Backward Incompatible Change +* Backported in [#57840](https://github.com/ClickHouse/ClickHouse/issues/57840): Remove function `arrayFold` because it has a bug. This closes [#57816](https://github.com/ClickHouse/ClickHouse/issues/57816). This closes [#57458](https://github.com/ClickHouse/ClickHouse/issues/57458). [#57836](https://github.com/ClickHouse/ClickHouse/pull/57836) ([Alexey Milovidov](https://github.com/alexey-milovidov)). + +#### Improvement +* Backported in [#58489](https://github.com/ClickHouse/ClickHouse/issues/58489): Fix the transformation of queries to MySQL-compatible queries. Fixes [#57253](https://github.com/ClickHouse/ClickHouse/issues/57253). Fixes [#52654](https://github.com/ClickHouse/ClickHouse/issues/52654). Fixes [#56729](https://github.com/ClickHouse/ClickHouse/issues/56729). [#56456](https://github.com/ClickHouse/ClickHouse/pull/56456) ([flynn](https://github.com/ucasfl)). +* Backported in [#57653](https://github.com/ClickHouse/ClickHouse/issues/57653): Handle the SIGABRT case when getting a PostgreSQL table structure with an empty array. [#57618](https://github.com/ClickHouse/ClickHouse/pull/57618) ([Mike Kot (Михаил Кот)](https://github.com/myrrc)). + +#### Build/Testing/Packaging Improvement +* Backported in [#57580](https://github.com/ClickHouse/ClickHouse/issues/57580): Fix issue caught in https://github.com/docker-library/official-images/pull/15846. [#57571](https://github.com/ClickHouse/ClickHouse/pull/57571) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). + +#### Bug Fix (user-visible misbehavior in an official stable release) + +* Prevent incompatible ALTER of projection columns [#56948](https://github.com/ClickHouse/ClickHouse/pull/56948) ([Amos Bird](https://github.com/amosbird)). +* Fix segfault after ALTER UPDATE with Nullable MATERIALIZED column [#57147](https://github.com/ClickHouse/ClickHouse/pull/57147) ([Nikolay Degterinsky](https://github.com/evillique)).
+* Fix incorrect JOIN plan optimization with partially materialized normal projection [#57196](https://github.com/ClickHouse/ClickHouse/pull/57196) ([Amos Bird](https://github.com/amosbird)). +* MergeTree mutations reuse source part index granularity [#57352](https://github.com/ClickHouse/ClickHouse/pull/57352) ([Maksim Kita](https://github.com/kitaisreal)). +* Fix invalid memory access in BLAKE3 (Rust) [#57876](https://github.com/ClickHouse/ClickHouse/pull/57876) ([Raúl Marín](https://github.com/Algunenano)). +* Normalize function names in CREATE INDEX [#57906](https://github.com/ClickHouse/ClickHouse/pull/57906) ([Alexander Tokmakov](https://github.com/tavplubix)). +* Fix invalid preprocessing on Keeper [#58069](https://github.com/ClickHouse/ClickHouse/pull/58069) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix Integer overflow in Poco::UTF32Encoding [#58073](https://github.com/ClickHouse/ClickHouse/pull/58073) ([Andrey Fedotov](https://github.com/anfedotoff)). +* Remove parallel parsing for JSONCompactEachRow [#58181](https://github.com/ClickHouse/ClickHouse/pull/58181) ([Alexey Milovidov](https://github.com/alexey-milovidov)). + +#### NOT FOR CHANGELOG / INSIGNIFICANT + +* Pin alpine version of integration tests helper container [#57669](https://github.com/ClickHouse/ClickHouse/pull/57669) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). +* Fix docker image for integration tests (fixes CI) [#57952](https://github.com/ClickHouse/ClickHouse/pull/57952) ([Azat Khuzhin](https://github.com/azat)). + diff --git a/docs/changelogs/v23.8.9.54-lts.md b/docs/changelogs/v23.8.9.54-lts.md new file mode 100644 index 00000000000..00607c60c39 --- /dev/null +++ b/docs/changelogs/v23.8.9.54-lts.md @@ -0,0 +1,47 @@ +--- +sidebar_position: 1 +sidebar_label: 2024 +--- + +# 2024 Changelog + +### ClickHouse release v23.8.9.54-lts (192a1d231fa) FIXME as compared to v23.8.8.20-lts (5e012a03bf2) + +#### Improvement +* Backported in [#57668](https://github.com/ClickHouse/ClickHouse/issues/57668): Output valid JSON/XML on exception during HTTP query execution. Add setting `http_write_exception_in_output_format` to enable/disable this behaviour (enabled by default). See the sketch after this list. [#52853](https://github.com/ClickHouse/ClickHouse/pull/52853) ([Kruglov Pavel](https://github.com/Avogar)). +* Backported in [#58491](https://github.com/ClickHouse/ClickHouse/issues/58491): Fix the transformation of queries to MySQL-compatible queries. Fixes [#57253](https://github.com/ClickHouse/ClickHouse/issues/57253). Fixes [#52654](https://github.com/ClickHouse/ClickHouse/issues/52654). Fixes [#56729](https://github.com/ClickHouse/ClickHouse/issues/56729). [#56456](https://github.com/ClickHouse/ClickHouse/pull/56456) ([flynn](https://github.com/ucasfl)). +* Backported in [#57238](https://github.com/ClickHouse/ClickHouse/issues/57238): Fetching a part now waits until that part is fully committed on the remote replica. It is better not to send a part in the PreActive state; in the case of zero-copy replication this is a mandatory restriction. [#56808](https://github.com/ClickHouse/ClickHouse/pull/56808) ([Sema Checherinda](https://github.com/CheSema)). +* Backported in [#57655](https://github.com/ClickHouse/ClickHouse/issues/57655): Handle the SIGABRT case when getting a PostgreSQL table structure with an empty array. [#57618](https://github.com/ClickHouse/ClickHouse/pull/57618) ([Mike Kot (Михаил Кот)](https://github.com/myrrc)).
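As a sketch of the `http_write_exception_in_output_format` behaviour described in the first Improvement entry above (the failing query is illustrative, and the setting matters for queries run over the HTTP interface):

```sql
-- Enabled by default: if an exception is thrown while the result is being
-- streamed, it is written into the JSON/XML output itself, so the payload
-- stays well-formed instead of being cut off mid-document.
SET http_write_exception_in_output_format = 1;

-- intDiv throws "Division by zero" at number = 5, after some rows were sent.
SELECT number, intDiv(1, number - 5) FROM numbers(10) FORMAT JSON;
```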
+ +#### Build/Testing/Packaging Improvement +* Backported in [#57582](https://github.com/ClickHouse/ClickHouse/issues/57582): Fix issue caught in https://github.com/docker-library/official-images/pull/15846. [#57571](https://github.com/ClickHouse/ClickHouse/pull/57571) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). + +#### Bug Fix (user-visible misbehavior in an official stable release) + +* Flatten only true Nested type if flatten_nested=1, not all Array(Tuple) [#56132](https://github.com/ClickHouse/ClickHouse/pull/56132) ([Kruglov Pavel](https://github.com/Avogar)). +* Fix ALTER COLUMN with ALIAS [#56493](https://github.com/ClickHouse/ClickHouse/pull/56493) ([Nikolay Degterinsky](https://github.com/evillique)). +* Prevent incompatible ALTER of projection columns [#56948](https://github.com/ClickHouse/ClickHouse/pull/56948) ([Amos Bird](https://github.com/amosbird)). +* Fix segfault after ALTER UPDATE with Nullable MATERIALIZED column [#57147](https://github.com/ClickHouse/ClickHouse/pull/57147) ([Nikolay Degterinsky](https://github.com/evillique)). +* Fix incorrect JOIN plan optimization with partially materialized normal projection [#57196](https://github.com/ClickHouse/ClickHouse/pull/57196) ([Amos Bird](https://github.com/amosbird)). +* Fix `ReadonlyReplica` metric for all cases [#57267](https://github.com/ClickHouse/ClickHouse/pull/57267) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix working with read buffers in StreamingFormatExecutor [#57438](https://github.com/ClickHouse/ClickHouse/pull/57438) ([Kruglov Pavel](https://github.com/Avogar)). +* bugfix: correctly parse SYSTEM STOP LISTEN TCP SECURE [#57483](https://github.com/ClickHouse/ClickHouse/pull/57483) ([joelynch](https://github.com/joelynch)). +* Ignore ON CLUSTER clause in grant/revoke queries for management of replicated access entities. [#57538](https://github.com/ClickHouse/ClickHouse/pull/57538) ([MikhailBurdukov](https://github.com/MikhailBurdukov)). +* Disable system.kafka_consumers by default (due to possible live memory leak) [#57822](https://github.com/ClickHouse/ClickHouse/pull/57822) ([Azat Khuzhin](https://github.com/azat)). +* Fix invalid memory access in BLAKE3 (Rust) [#57876](https://github.com/ClickHouse/ClickHouse/pull/57876) ([Raúl Marín](https://github.com/Algunenano)). +* Normalize function names in CREATE INDEX [#57906](https://github.com/ClickHouse/ClickHouse/pull/57906) ([Alexander Tokmakov](https://github.com/tavplubix)). +* Fix invalid preprocessing on Keeper [#58069](https://github.com/ClickHouse/ClickHouse/pull/58069) ([Antonio Andelic](https://github.com/antonio2368)). +* Fix Integer overflow in Poco::UTF32Encoding [#58073](https://github.com/ClickHouse/ClickHouse/pull/58073) ([Andrey Fedotov](https://github.com/anfedotoff)). +* Remove parallel parsing for JSONCompactEachRow [#58181](https://github.com/ClickHouse/ClickHouse/pull/58181) ([Alexey Milovidov](https://github.com/alexey-milovidov)). +* Fix parallel parsing for JSONCompactEachRow [#58250](https://github.com/ClickHouse/ClickHouse/pull/58250) ([Kruglov Pavel](https://github.com/Avogar)). + +#### NO CL ENTRY + +* NO CL ENTRY: 'Update PeekableWriteBuffer.cpp'. [#57701](https://github.com/ClickHouse/ClickHouse/pull/57701) ([Kruglov Pavel](https://github.com/Avogar)). + +#### NOT FOR CHANGELOG / INSIGNIFICANT + +* Pin alpine version of integration tests helper container [#57669](https://github.com/ClickHouse/ClickHouse/pull/57669) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). 
+* Remove heavy rust stable toolchain [#57905](https://github.com/ClickHouse/ClickHouse/pull/57905) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). +* Fix docker image for integration tests (fixes CI) [#57952](https://github.com/ClickHouse/ClickHouse/pull/57952) ([Azat Khuzhin](https://github.com/azat)). + diff --git a/docs/en/interfaces/formats.md b/docs/en/interfaces/formats.md index 836b1f2f637..ed67af48af7 100644 --- a/docs/en/interfaces/formats.md +++ b/docs/en/interfaces/formats.md @@ -1262,6 +1262,7 @@ SELECT * FROM json_each_row_nested - [input_format_import_nested_json](/docs/en/operations/settings/settings-formats.md/#input_format_import_nested_json) - map nested JSON data to nested tables (it works for JSONEachRow format). Default value - `false`. - [input_format_json_read_bools_as_numbers](/docs/en/operations/settings/settings-formats.md/#input_format_json_read_bools_as_numbers) - allow to parse bools as numbers in JSON input formats. Default value - `true`. +- [input_format_json_read_bools_as_strings](/docs/en/operations/settings/settings-formats.md/#input_format_json_read_bools_as_strings) - allow to parse bools as strings in JSON input formats. Default value - `true`. - [input_format_json_read_numbers_as_strings](/docs/en/operations/settings/settings-formats.md/#input_format_json_read_numbers_as_strings) - allow to parse numbers as strings in JSON input formats. Default value - `true`. - [input_format_json_read_arrays_as_strings](/docs/en/operations/settings/settings-formats.md/#input_format_json_read_arrays_as_strings) - allow to parse JSON arrays as strings in JSON input formats. Default value - `true`. - [input_format_json_read_objects_as_strings](/docs/en/operations/settings/settings-formats.md/#input_format_json_read_objects_as_strings) - allow to parse JSON objects as strings in JSON input formats. Default value - `true`. diff --git a/docs/en/interfaces/schema-inference.md b/docs/en/interfaces/schema-inference.md index ef858796936..4db1d53987a 100644 --- a/docs/en/interfaces/schema-inference.md +++ b/docs/en/interfaces/schema-inference.md @@ -614,6 +614,26 @@ DESC format(JSONEachRow, $$ └───────┴─────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘ ``` +##### input_format_json_read_bools_as_strings + +Enabling this setting allows reading Bool values as strings. + +This setting is enabled by default. + +**Example:** + +```sql +SET input_format_json_read_bools_as_strings = 1; +DESC format(JSONEachRow, $$ + {"value" : true} + {"value" : "Hello, World"} + $$) +``` +```response +┌─name──┬─type─────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐ +│ value │ Nullable(String) │ │ │ │ │ │ +└───────┴──────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘ +``` ##### input_format_json_read_arrays_as_strings Enabling this setting allows reading JSON array values as strings. diff --git a/docs/en/operations/settings/settings-formats.md b/docs/en/operations/settings/settings-formats.md index 3d76bd9df73..43a73844b79 100644 --- a/docs/en/operations/settings/settings-formats.md +++ b/docs/en/operations/settings/settings-formats.md @@ -377,6 +377,12 @@ Allow parsing bools as numbers in JSON input formats. Enabled by default. +## input_format_json_read_bools_as_strings {#input_format_json_read_bools_as_strings} + +Allow parsing bools as strings in JSON input formats. + +Enabled by default. 
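A minimal usage sketch for the new setting, mirroring the schema-inference example above (the JSON literals are illustrative):

```sql
SET input_format_json_read_bools_as_strings = 1;

-- The mix of a Bool and a String makes the inferred column type Nullable(String);
-- the boolean is read as the string 'true'.
SELECT * FROM format(JSONEachRow, $$
    {"value" : true}
    {"value" : "Hello, World"}
    $$);
```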
+ ## input_format_json_read_numbers_as_strings {#input_format_json_read_numbers_as_strings} Allow parsing numbers as strings in JSON input formats. diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md index 6e087467bb9..d4ee8106320 100644 --- a/docs/en/operations/settings/settings.md +++ b/docs/en/operations/settings/settings.md @@ -3847,6 +3847,8 @@ Possible values: - `none` — Is similar to throw, but distributed DDL query returns no result set. - `null_status_on_timeout` — Returns `NULL` as execution status in some rows of result set instead of throwing `TIMEOUT_EXCEEDED` if query is not finished on the corresponding hosts. - `never_throw` — Do not throw `TIMEOUT_EXCEEDED` and do not rethrow exceptions if query has failed on some hosts. +- `null_status_on_timeout_only_active` — similar to `null_status_on_timeout`, but doesn't wait for inactive replicas of the `Replicated` database. +- `throw_only_active` — similar to `throw`, but doesn't wait for inactive replicas of the `Replicated` database. Default value: `throw`. diff --git a/docs/en/operations/utilities/clickhouse-format.md b/docs/en/operations/utilities/clickhouse-format.md index 101310cc65e..3e4295598aa 100644 --- a/docs/en/operations/utilities/clickhouse-format.md +++ b/docs/en/operations/utilities/clickhouse-format.md @@ -27,7 +27,7 @@ $ clickhouse-format --query "select number from numbers(10) where number%2 order Result: -```sql SELECT number FROM numbers(10) WHERE number % 2 @@ -49,22 +49,20 @@ SELECT sum(number) FROM numbers(5) 3. Multiqueries: ```bash -$ clickhouse-format -n <<< "SELECT * FROM (SELECT 1 AS x UNION ALL SELECT 1 UNION DISTINCT SELECT 3);" +$ clickhouse-format -n <<< "SELECT min(number) FROM numbers(5); SELECT max(number) FROM numbers(5);" ``` Result: -```sql -SELECT * -FROM -( - SELECT 1 AS x - UNION ALL - SELECT 1 - UNION DISTINCT - SELECT 3 -) +``` +SELECT min(number) +FROM numbers(5) ; + +SELECT max(number) +FROM numbers(5) +; + ``` 4. Obfuscating: @@ -75,7 +73,7 @@ $ clickhouse-format --seed Hello --obfuscate <<< "SELECT cost_first_screen BETWE Result: -```sql +``` SELECT treasury_mammoth_hazelnut BETWEEN nutmeg AND span, CASE WHEN chive >= 116 THEN switching ELSE ANYTHING END; ``` @@ -87,7 +85,7 @@ $ clickhouse-format --seed World --obfuscate <<< "SELECT cost_first_screen BETWE Result: -```sql +``` SELECT horse_tape_summer BETWEEN folklore AND moccasins, CASE WHEN intestine >= 116 THEN nonconformist ELSE FORESTRY END; ``` @@ -99,7 +97,7 @@ $ clickhouse-format --backslash <<< "SELECT * FROM (SELECT 1 AS x UNION ALL SELE Result: -```sql +``` SELECT * \ FROM \ ( \ diff --git a/docs/en/sql-reference/functions/date-time-functions.md b/docs/en/sql-reference/functions/date-time-functions.md index 0261589b968..5622097537e 100644 --- a/docs/en/sql-reference/functions/date-time-functions.md +++ b/docs/en/sql-reference/functions/date-time-functions.md @@ -1483,7 +1483,9 @@ For mode values with a meaning of “with 4 or more days this year,” weeks are - Otherwise, it is the last week of the previous year, and the next week is week 1. -For mode values with a meaning of “contains January 1”, the week contains January 1 is week 1. It does not matter how many days in the new year the week contained, even if it contained only one day. +For mode values with a meaning of “contains January 1”, the week containing January 1 is week 1. +It does not matter how many days of the new year the week contains, even if it contains only one day. +I.e.
if the last week of December contains January 1 of the next year, it will be week 1 of the next year. **Syntax** diff --git a/docs/en/sql-reference/functions/hash-functions.md b/docs/en/sql-reference/functions/hash-functions.md index a23849c13aa..2c6a468af0e 100644 --- a/docs/en/sql-reference/functions/hash-functions.md +++ b/docs/en/sql-reference/functions/hash-functions.md @@ -1779,7 +1779,9 @@ Result: ## sqid -Transforms numbers into YouTube-like short URL hash called [Sqid](https://sqids.org/). +Transforms numbers into a [Sqid](https://sqids.org/), which is a YouTube-like ID string. +The output alphabet is `abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789`. +Do not use this function for hashing - the generated IDs can be decoded back into numbers. **Syntax** diff --git a/docs/en/sql-reference/functions/rounding-functions.md b/docs/en/sql-reference/functions/rounding-functions.md index 84839c2489c..3ede66cf316 100644 --- a/docs/en/sql-reference/functions/rounding-functions.md +++ b/docs/en/sql-reference/functions/rounding-functions.md @@ -53,7 +53,7 @@ The rounded number of the same type as the input number. **Example of use with Float** ``` sql -SELECT number / 2 AS x, round(x) FROM system.numbers LIMIT 3 +SELECT number / 2 AS x, round(x) FROM system.numbers LIMIT 3; ``` ``` text @@ -67,7 +67,22 @@ SELECT number / 2 AS x, round(x) FROM system.numbers LIMIT 3 **Example of use with Decimal** ``` sql -SELECT cast(number / 2 AS Decimal(10,4)) AS x, round(x) FROM system.numbers LIMIT 3 +SELECT cast(number / 2 AS Decimal(10,4)) AS x, round(x) FROM system.numbers LIMIT 3; +``` + +``` text +┌───x─┬─round(CAST(divide(number, 2), 'Decimal(10, 4)'))─┐ +│ 0 │ 0 │ +│ 0.5 │ 1 │ +│ 1 │ 1 │ +└─────┴──────────────────────────────────────────────────┘ +``` + +If you want to keep the trailing zeros, you need to enable the `output_format_decimal_trailing_zeros` setting. + +``` sql +SELECT cast(number / 2 AS Decimal(10,4)) AS x, round(x) FROM system.numbers LIMIT 3 SETTINGS output_format_decimal_trailing_zeros = 1; + ``` ``` text diff --git a/docs/ru/sql-reference/functions/date-time-functions.md b/docs/ru/sql-reference/functions/date-time-functions.md index fa5728a097d..cbbb456aa80 100644 --- a/docs/ru/sql-reference/functions/date-time-functions.md +++ b/docs/ru/sql-reference/functions/date-time-functions.md @@ -578,7 +578,9 @@ SELECT - В противном случае это последняя неделя предыдущего года, а следующая неделя - неделя 1. -Для режимов со значением «содержит 1 января», неделя 1 – это неделя содержащая 1 января. Не имеет значения, сколько дней в новом году содержала неделя, даже если она содержала только один день. +Для режимов со значением «содержит 1 января», неделя 1 – это неделя, содержащая 1 января. +Не имеет значения, сколько дней нового года содержит эта неделя, даже если она содержит только один день. +Так, если последняя неделя декабря содержит 1 января следующего года, то она считается неделей 1 следующего года.
**Пример** diff --git a/programs/library-bridge/LibraryBridgeHandlers.cpp b/programs/library-bridge/LibraryBridgeHandlers.cpp index 7c77e633a44..9642dd7ee63 100644 --- a/programs/library-bridge/LibraryBridgeHandlers.cpp +++ b/programs/library-bridge/LibraryBridgeHandlers.cpp @@ -2,7 +2,6 @@ #include "CatBoostLibraryHandler.h" #include "CatBoostLibraryHandlerFactory.h" -#include "Common/ProfileEvents.h" #include "ExternalDictionaryLibraryHandler.h" #include "ExternalDictionaryLibraryHandlerFactory.h" @@ -45,7 +44,7 @@ namespace response.setStatusAndReason(HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); if (!response.sent()) - *response.send() << message << '\n'; + *response.send() << message << std::endl; LOG_WARNING(&Poco::Logger::get("LibraryBridge"), fmt::runtime(message)); } @@ -97,7 +96,7 @@ ExternalDictionaryLibraryBridgeRequestHandler::ExternalDictionaryLibraryBridgeRe } -void ExternalDictionaryLibraryBridgeRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void ExternalDictionaryLibraryBridgeRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { LOG_TRACE(log, "Request URI: {}", request.getURI()); HTMLForm params(getContext()->getSettingsRef(), request); @@ -385,7 +384,7 @@ ExternalDictionaryLibraryBridgeExistsHandler::ExternalDictionaryLibraryBridgeExi } -void ExternalDictionaryLibraryBridgeExistsHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void ExternalDictionaryLibraryBridgeExistsHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { try { @@ -424,7 +423,7 @@ CatBoostLibraryBridgeRequestHandler::CatBoostLibraryBridgeRequestHandler( } -void CatBoostLibraryBridgeRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void CatBoostLibraryBridgeRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { LOG_TRACE(log, "Request URI: {}", request.getURI()); HTMLForm params(getContext()->getSettingsRef(), request); @@ -622,7 +621,7 @@ CatBoostLibraryBridgeExistsHandler::CatBoostLibraryBridgeExistsHandler(size_t ke } -void CatBoostLibraryBridgeExistsHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void CatBoostLibraryBridgeExistsHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { try { diff --git a/programs/library-bridge/LibraryBridgeHandlers.h b/programs/library-bridge/LibraryBridgeHandlers.h index 4f08d7a6084..16815e84723 100644 --- a/programs/library-bridge/LibraryBridgeHandlers.h +++ b/programs/library-bridge/LibraryBridgeHandlers.h @@ -20,7 +20,7 @@ class ExternalDictionaryLibraryBridgeRequestHandler : public HTTPRequestHandler, public: ExternalDictionaryLibraryBridgeRequestHandler(size_t keep_alive_timeout_, ContextPtr context_); - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: static constexpr inline auto FORMAT = "RowBinary"; @@ -36,7 +36,7 @@ class ExternalDictionaryLibraryBridgeExistsHandler : public HTTPRequestHandler, public: ExternalDictionaryLibraryBridgeExistsHandler(size_t keep_alive_timeout_, ContextPtr context_); - void 
handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: const size_t keep_alive_timeout; @@ -65,7 +65,7 @@ class CatBoostLibraryBridgeRequestHandler : public HTTPRequestHandler, WithConte public: CatBoostLibraryBridgeRequestHandler(size_t keep_alive_timeout_, ContextPtr context_); - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: const size_t keep_alive_timeout; @@ -79,7 +79,7 @@ class CatBoostLibraryBridgeExistsHandler : public HTTPRequestHandler, WithContex public: CatBoostLibraryBridgeExistsHandler(size_t keep_alive_timeout_, ContextPtr context_); - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: const size_t keep_alive_timeout; diff --git a/programs/odbc-bridge/ColumnInfoHandler.cpp b/programs/odbc-bridge/ColumnInfoHandler.cpp index 774883657b7..434abf0bf14 100644 --- a/programs/odbc-bridge/ColumnInfoHandler.cpp +++ b/programs/odbc-bridge/ColumnInfoHandler.cpp @@ -69,7 +69,7 @@ namespace } -void ODBCColumnsInfoHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void ODBCColumnsInfoHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { HTMLForm params(getContext()->getSettingsRef(), request, request.getStream()); LOG_TRACE(log, "Request URI: {}", request.getURI()); @@ -78,7 +78,7 @@ void ODBCColumnsInfoHandler::handleRequest(HTTPServerRequest & request, HTTPServ { response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); if (!response.sent()) - *response.send() << message << '\n'; + *response.send() << message << std::endl; LOG_WARNING(log, fmt::runtime(message)); }; diff --git a/programs/odbc-bridge/ColumnInfoHandler.h b/programs/odbc-bridge/ColumnInfoHandler.h index e3087701182..3ba8b182ba6 100644 --- a/programs/odbc-bridge/ColumnInfoHandler.h +++ b/programs/odbc-bridge/ColumnInfoHandler.h @@ -23,7 +23,7 @@ public: { } - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: Poco::Logger * log; diff --git a/programs/odbc-bridge/IdentifierQuoteHandler.cpp b/programs/odbc-bridge/IdentifierQuoteHandler.cpp index a23efb112de..f622995bf15 100644 --- a/programs/odbc-bridge/IdentifierQuoteHandler.cpp +++ b/programs/odbc-bridge/IdentifierQuoteHandler.cpp @@ -21,7 +21,7 @@ namespace DB { -void IdentifierQuoteHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void IdentifierQuoteHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { HTMLForm params(getContext()->getSettingsRef(), request, request.getStream()); LOG_TRACE(log, "Request URI: {}", request.getURI()); @@ -30,7 +30,7 @@ void IdentifierQuoteHandler::handleRequest(HTTPServerRequest & request, HTTPServ { response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); if 
(!response.sent()) - response.send()->writeln(message); + *response.send() << message << std::endl; LOG_WARNING(log, fmt::runtime(message)); }; diff --git a/programs/odbc-bridge/IdentifierQuoteHandler.h b/programs/odbc-bridge/IdentifierQuoteHandler.h index ff5c02ca07b..d57bbc0ca8a 100644 --- a/programs/odbc-bridge/IdentifierQuoteHandler.h +++ b/programs/odbc-bridge/IdentifierQuoteHandler.h @@ -21,7 +21,7 @@ public: { } - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: Poco::Logger * log; diff --git a/programs/odbc-bridge/MainHandler.cpp b/programs/odbc-bridge/MainHandler.cpp index e350afa2b10..9130b3e0f47 100644 --- a/programs/odbc-bridge/MainHandler.cpp +++ b/programs/odbc-bridge/MainHandler.cpp @@ -46,12 +46,12 @@ void ODBCHandler::processError(HTTPServerResponse & response, const std::string { response.setStatusAndReason(HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); if (!response.sent()) - *response.send() << message << '\n'; + *response.send() << message << std::endl; LOG_WARNING(log, fmt::runtime(message)); } -void ODBCHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void ODBCHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { HTMLForm params(getContext()->getSettingsRef(), request); LOG_TRACE(log, "Request URI: {}", request.getURI()); diff --git a/programs/odbc-bridge/MainHandler.h b/programs/odbc-bridge/MainHandler.h index 7977245ff82..bc0fca8b9a5 100644 --- a/programs/odbc-bridge/MainHandler.h +++ b/programs/odbc-bridge/MainHandler.h @@ -30,7 +30,7 @@ public: { } - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: Poco::Logger * log; diff --git a/programs/odbc-bridge/PingHandler.cpp b/programs/odbc-bridge/PingHandler.cpp index 80d0e2bf4a9..e3ab5e5cd00 100644 --- a/programs/odbc-bridge/PingHandler.cpp +++ b/programs/odbc-bridge/PingHandler.cpp @@ -6,7 +6,7 @@ namespace DB { -void PingHandler::handleRequest(HTTPServerRequest & /* request */, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void PingHandler::handleRequest(HTTPServerRequest & /* request */, HTTPServerResponse & response) { try { diff --git a/programs/odbc-bridge/PingHandler.h b/programs/odbc-bridge/PingHandler.h index c5447107e0c..c969ec55af7 100644 --- a/programs/odbc-bridge/PingHandler.h +++ b/programs/odbc-bridge/PingHandler.h @@ -10,7 +10,7 @@ class PingHandler : public HTTPRequestHandler { public: explicit PingHandler(size_t keep_alive_timeout_) : keep_alive_timeout(keep_alive_timeout_) {} - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: size_t keep_alive_timeout; diff --git a/programs/odbc-bridge/SchemaAllowedHandler.cpp b/programs/odbc-bridge/SchemaAllowedHandler.cpp index c7025ca4311..020359f51fd 100644 --- a/programs/odbc-bridge/SchemaAllowedHandler.cpp +++ b/programs/odbc-bridge/SchemaAllowedHandler.cpp @@ -29,7 +29,7 @@ namespace } -void SchemaAllowedHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, 
const ProfileEvents::Event & /*write_event*/) +void SchemaAllowedHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { HTMLForm params(getContext()->getSettingsRef(), request, request.getStream()); LOG_TRACE(log, "Request URI: {}", request.getURI()); @@ -38,7 +38,7 @@ void SchemaAllowedHandler::handleRequest(HTTPServerRequest & request, HTTPServer { response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); if (!response.sent()) - *response.send() << message << '\n'; + *response.send() << message << std::endl; LOG_WARNING(log, fmt::runtime(message)); }; diff --git a/programs/odbc-bridge/SchemaAllowedHandler.h b/programs/odbc-bridge/SchemaAllowedHandler.h index aa0b04b1d31..cb71a6fb5a2 100644 --- a/programs/odbc-bridge/SchemaAllowedHandler.h +++ b/programs/odbc-bridge/SchemaAllowedHandler.h @@ -24,7 +24,7 @@ public: { } - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: Poco::Logger * log; diff --git a/programs/server/Server.cpp b/programs/server/Server.cpp index 7ad7460f6f8..1fa3d1cfa73 100644 --- a/programs/server/Server.cpp +++ b/programs/server/Server.cpp @@ -152,18 +152,6 @@ namespace ProfileEvents { extern const Event MainConfigLoads; extern const Event ServerStartupMilliseconds; - extern const Event InterfaceNativeSendBytes; - extern const Event InterfaceNativeReceiveBytes; - extern const Event InterfaceHTTPSendBytes; - extern const Event InterfaceHTTPReceiveBytes; - extern const Event InterfacePrometheusSendBytes; - extern const Event InterfacePrometheusReceiveBytes; - extern const Event InterfaceInterserverSendBytes; - extern const Event InterfaceInterserverReceiveBytes; - extern const Event InterfaceMySQLSendBytes; - extern const Event InterfaceMySQLReceiveBytes; - extern const Event InterfacePostgreSQLSendBytes; - extern const Event InterfacePostgreSQLReceiveBytes; } namespace fs = std::filesystem; @@ -1272,11 +1260,11 @@ try { Settings::checkNoSettingNamesAtTopLevel(*config, config_path); - ServerSettings server_settings_; - server_settings_.loadSettingsFromConfig(*config); + ServerSettings new_server_settings; + new_server_settings.loadSettingsFromConfig(*config); - size_t max_server_memory_usage = server_settings_.max_server_memory_usage; - double max_server_memory_usage_to_ram_ratio = server_settings_.max_server_memory_usage_to_ram_ratio; + size_t max_server_memory_usage = new_server_settings.max_server_memory_usage; + double max_server_memory_usage_to_ram_ratio = new_server_settings.max_server_memory_usage_to_ram_ratio; size_t current_physical_server_memory = getMemoryAmount(); /// With cgroups, the amount of memory available to the server can be changed dynamically. 
size_t default_max_server_memory_usage = static_cast<size_t>(current_physical_server_memory * max_server_memory_usage_to_ram_ratio); @@ -1306,9 +1294,9 @@ try total_memory_tracker.setDescription("(total)"); total_memory_tracker.setMetric(CurrentMetrics::MemoryTracking); - size_t merges_mutations_memory_usage_soft_limit = server_settings_.merges_mutations_memory_usage_soft_limit; + size_t merges_mutations_memory_usage_soft_limit = new_server_settings.merges_mutations_memory_usage_soft_limit; - size_t default_merges_mutations_server_memory_usage = static_cast<size_t>(current_physical_server_memory * server_settings_.merges_mutations_memory_usage_to_ram_ratio); + size_t default_merges_mutations_server_memory_usage = static_cast<size_t>(current_physical_server_memory * new_server_settings.merges_mutations_memory_usage_to_ram_ratio); if (merges_mutations_memory_usage_soft_limit == 0) { merges_mutations_memory_usage_soft_limit = default_merges_mutations_server_memory_usage; @@ -1316,7 +1304,7 @@ try " ({} available * {:.2f} merges_mutations_memory_usage_to_ram_ratio)", formatReadableSizeWithBinarySuffix(merges_mutations_memory_usage_soft_limit), formatReadableSizeWithBinarySuffix(current_physical_server_memory), - server_settings_.merges_mutations_memory_usage_to_ram_ratio); + new_server_settings.merges_mutations_memory_usage_to_ram_ratio); } else if (merges_mutations_memory_usage_soft_limit > default_merges_mutations_server_memory_usage) { @@ -1325,7 +1313,7 @@ try " ({} available * {:.2f} merges_mutations_memory_usage_to_ram_ratio)", formatReadableSizeWithBinarySuffix(merges_mutations_memory_usage_soft_limit), formatReadableSizeWithBinarySuffix(current_physical_server_memory), - server_settings_.merges_mutations_memory_usage_to_ram_ratio); + new_server_settings.merges_mutations_memory_usage_to_ram_ratio); } LOG_INFO(log, "Merges and mutations memory limit is set to {}", @@ -1334,7 +1322,7 @@ try background_memory_tracker.setDescription("(background)"); background_memory_tracker.setMetric(CurrentMetrics::MergesMutationsMemoryTracking); - total_memory_tracker.setAllowUseJemallocMemory(server_settings_.allow_use_jemalloc_memory); + total_memory_tracker.setAllowUseJemallocMemory(new_server_settings.allow_use_jemalloc_memory); auto * global_overcommit_tracker = global_context->getGlobalOvercommitTracker(); total_memory_tracker.setOvercommitTracker(global_overcommit_tracker); @@ -1358,26 +1346,26 @@ try global_context->setRemoteHostFilter(*config); global_context->setHTTPHeaderFilter(*config); - global_context->setMaxTableSizeToDrop(server_settings_.max_table_size_to_drop); - global_context->setMaxPartitionSizeToDrop(server_settings_.max_partition_size_to_drop); - global_context->setMaxTableNumToWarn(server_settings_.max_table_num_to_warn); - global_context->setMaxDatabaseNumToWarn(server_settings_.max_database_num_to_warn); - global_context->setMaxPartNumToWarn(server_settings_.max_part_num_to_warn); + global_context->setMaxTableSizeToDrop(new_server_settings.max_table_size_to_drop); + global_context->setMaxPartitionSizeToDrop(new_server_settings.max_partition_size_to_drop); + global_context->setMaxTableNumToWarn(new_server_settings.max_table_num_to_warn); + global_context->setMaxDatabaseNumToWarn(new_server_settings.max_database_num_to_warn); + global_context->setMaxPartNumToWarn(new_server_settings.max_part_num_to_warn); ConcurrencyControl::SlotCount concurrent_threads_soft_limit = ConcurrencyControl::Unlimited; - if (server_settings_.concurrent_threads_soft_limit_num > 0 &&
server_settings_.concurrent_threads_soft_limit_num < concurrent_threads_soft_limit) - concurrent_threads_soft_limit = server_settings_.concurrent_threads_soft_limit_num; - if (server_settings_.concurrent_threads_soft_limit_ratio_to_cores > 0) + if (new_server_settings.concurrent_threads_soft_limit_num > 0 && new_server_settings.concurrent_threads_soft_limit_num < concurrent_threads_soft_limit) + concurrent_threads_soft_limit = new_server_settings.concurrent_threads_soft_limit_num; + if (new_server_settings.concurrent_threads_soft_limit_ratio_to_cores > 0) { - auto value = server_settings_.concurrent_threads_soft_limit_ratio_to_cores * std::thread::hardware_concurrency(); + auto value = new_server_settings.concurrent_threads_soft_limit_ratio_to_cores * std::thread::hardware_concurrency(); if (value > 0 && value < concurrent_threads_soft_limit) concurrent_threads_soft_limit = value; } ConcurrencyControl::instance().setMaxConcurrency(concurrent_threads_soft_limit); - global_context->getProcessList().setMaxSize(server_settings_.max_concurrent_queries); - global_context->getProcessList().setMaxInsertQueriesAmount(server_settings_.max_concurrent_insert_queries); - global_context->getProcessList().setMaxSelectQueriesAmount(server_settings_.max_concurrent_select_queries); + global_context->getProcessList().setMaxSize(new_server_settings.max_concurrent_queries); + global_context->getProcessList().setMaxInsertQueriesAmount(new_server_settings.max_concurrent_insert_queries); + global_context->getProcessList().setMaxSelectQueriesAmount(new_server_settings.max_concurrent_select_queries); if (config->has("keeper_server")) global_context->updateKeeperConfiguration(*config); @@ -1388,68 +1376,68 @@ try /// This is done for backward compatibility. if (global_context->areBackgroundExecutorsInitialized()) { - auto new_pool_size = server_settings_.background_pool_size; - auto new_ratio = server_settings_.background_merges_mutations_concurrency_ratio; + auto new_pool_size = new_server_settings.background_pool_size; + auto new_ratio = new_server_settings.background_merges_mutations_concurrency_ratio; global_context->getMergeMutateExecutor()->increaseThreadsAndMaxTasksCount(new_pool_size, static_cast<size_t>(new_pool_size * new_ratio)); - global_context->getMergeMutateExecutor()->updateSchedulingPolicy(server_settings_.background_merges_mutations_scheduling_policy.toString()); + global_context->getMergeMutateExecutor()->updateSchedulingPolicy(new_server_settings.background_merges_mutations_scheduling_policy.toString()); } if (global_context->areBackgroundExecutorsInitialized()) { - auto new_pool_size = server_settings_.background_move_pool_size; + auto new_pool_size = new_server_settings.background_move_pool_size; global_context->getMovesExecutor()->increaseThreadsAndMaxTasksCount(new_pool_size, new_pool_size); } if (global_context->areBackgroundExecutorsInitialized()) { - auto new_pool_size = server_settings_.background_fetches_pool_size; + auto new_pool_size = new_server_settings.background_fetches_pool_size; global_context->getFetchesExecutor()->increaseThreadsAndMaxTasksCount(new_pool_size, new_pool_size); } if (global_context->areBackgroundExecutorsInitialized()) { - auto new_pool_size = server_settings_.background_common_pool_size; + auto new_pool_size = new_server_settings.background_common_pool_size; global_context->getCommonExecutor()->increaseThreadsAndMaxTasksCount(new_pool_size, new_pool_size); } -
global_context->getBufferFlushSchedulePool().increaseThreadsCount(server_settings_.background_buffer_flush_schedule_pool_size); - global_context->getSchedulePool().increaseThreadsCount(server_settings_.background_schedule_pool_size); - global_context->getMessageBrokerSchedulePool().increaseThreadsCount(server_settings_.background_message_broker_schedule_pool_size); - global_context->getDistributedSchedulePool().increaseThreadsCount(server_settings_.background_distributed_schedule_pool_size); + global_context->getBufferFlushSchedulePool().increaseThreadsCount(new_server_settings.background_buffer_flush_schedule_pool_size); + global_context->getSchedulePool().increaseThreadsCount(new_server_settings.background_schedule_pool_size); + global_context->getMessageBrokerSchedulePool().increaseThreadsCount(new_server_settings.background_message_broker_schedule_pool_size); + global_context->getDistributedSchedulePool().increaseThreadsCount(new_server_settings.background_distributed_schedule_pool_size); - global_context->getAsyncLoader().setMaxThreads(TablesLoaderForegroundPoolId, server_settings_.tables_loader_foreground_pool_size); - global_context->getAsyncLoader().setMaxThreads(TablesLoaderBackgroundLoadPoolId, server_settings_.tables_loader_background_pool_size); - global_context->getAsyncLoader().setMaxThreads(TablesLoaderBackgroundStartupPoolId, server_settings_.tables_loader_background_pool_size); + global_context->getAsyncLoader().setMaxThreads(TablesLoaderForegroundPoolId, new_server_settings.tables_loader_foreground_pool_size); + global_context->getAsyncLoader().setMaxThreads(TablesLoaderBackgroundLoadPoolId, new_server_settings.tables_loader_background_pool_size); + global_context->getAsyncLoader().setMaxThreads(TablesLoaderBackgroundStartupPoolId, new_server_settings.tables_loader_background_pool_size); getIOThreadPool().reloadConfiguration( - server_settings.max_io_thread_pool_size, - server_settings.max_io_thread_pool_free_size, - server_settings.io_thread_pool_queue_size); + new_server_settings.max_io_thread_pool_size, + new_server_settings.max_io_thread_pool_free_size, + new_server_settings.io_thread_pool_queue_size); getBackupsIOThreadPool().reloadConfiguration( - server_settings.max_backups_io_thread_pool_size, - server_settings.max_backups_io_thread_pool_free_size, - server_settings.backups_io_thread_pool_queue_size); + new_server_settings.max_backups_io_thread_pool_size, + new_server_settings.max_backups_io_thread_pool_free_size, + new_server_settings.backups_io_thread_pool_queue_size); getActivePartsLoadingThreadPool().reloadConfiguration( - server_settings.max_active_parts_loading_thread_pool_size, + new_server_settings.max_active_parts_loading_thread_pool_size, 0, // We don't need any threads once all the parts will be loaded - server_settings.max_active_parts_loading_thread_pool_size); + new_server_settings.max_active_parts_loading_thread_pool_size); getOutdatedPartsLoadingThreadPool().reloadConfiguration( - server_settings.max_outdated_parts_loading_thread_pool_size, + new_server_settings.max_outdated_parts_loading_thread_pool_size, 0, // We don't need any threads once all the parts will be loaded - server_settings.max_outdated_parts_loading_thread_pool_size); + new_server_settings.max_outdated_parts_loading_thread_pool_size); /// It could grow if we need to synchronously wait until all the data parts will be loaded. 
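Editorial note: the hunks above all apply one pattern, reading the freshly parsed new_server_settings instead of the stale server_settings_ snapshot when configuration is reloaded. For the merges/mutations soft limit earlier in this hunk, a configured value of 0 means "derive the limit from physical RAM and a ratio". A minimal sketch of that fallback, with an illustrative name (effectiveSoftLimit is not a real ClickHouse function), clamping as the two log branches above suggest:

    #include <cstddef>

    // 0 acts as "unset": fall back to a share of physical memory. An explicit
    // value above the RAM-derived default is clamped back down, mirroring the
    // LOG_INFO / LOG_WARNING branches in the hunk above.
    size_t effectiveSoftLimit(size_t configured, size_t physical_memory, double ram_ratio)
    {
        const size_t derived = static_cast<size_t>(physical_memory * ram_ratio);
        if (configured == 0 || configured > derived)
            return derived;
        return configured;
    }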
getOutdatedPartsLoadingThreadPool().setMaxTurboThreads( - server_settings.max_active_parts_loading_thread_pool_size + new_server_settings.max_active_parts_loading_thread_pool_size ); getPartsCleaningThreadPool().reloadConfiguration( - server_settings.max_parts_cleaning_thread_pool_size, + new_server_settings.max_parts_cleaning_thread_pool_size, 0, // We don't need any threads once all the parts will be deleted - server_settings.max_parts_cleaning_thread_pool_size); + new_server_settings.max_parts_cleaning_thread_pool_size); if (config->has("resources")) { @@ -2059,7 +2047,7 @@ std::unique_ptr Server::buildProtocolStackFromConfig( auto create_factory = [&](const std::string & type, const std::string & conf_name) -> TCPServerConnectionFactory::Ptr { if (type == "tcp") - return TCPServerConnectionFactory::Ptr(new TCPHandlerFactory(*this, false, false, ProfileEvents::InterfaceNativeReceiveBytes, ProfileEvents::InterfaceNativeSendBytes)); + return TCPServerConnectionFactory::Ptr(new TCPHandlerFactory(*this, false, false)); if (type == "tls") #if USE_SSL @@ -2071,20 +2059,20 @@ std::unique_ptr Server::buildProtocolStackFromConfig( if (type == "proxy1") return TCPServerConnectionFactory::Ptr(new ProxyV1HandlerFactory(*this, conf_name)); if (type == "mysql") - return TCPServerConnectionFactory::Ptr(new MySQLHandlerFactory(*this, ProfileEvents::InterfaceMySQLReceiveBytes, ProfileEvents::InterfaceMySQLSendBytes)); + return TCPServerConnectionFactory::Ptr(new MySQLHandlerFactory(*this)); if (type == "postgres") - return TCPServerConnectionFactory::Ptr(new PostgreSQLHandlerFactory(*this, ProfileEvents::InterfacePostgreSQLReceiveBytes, ProfileEvents::InterfacePostgreSQLSendBytes)); + return TCPServerConnectionFactory::Ptr(new PostgreSQLHandlerFactory(*this)); if (type == "http") return TCPServerConnectionFactory::Ptr( - new HTTPServerConnectionFactory(httpContext(), http_params, createHandlerFactory(*this, config, async_metrics, "HTTPHandler-factory"), ProfileEvents::InterfaceHTTPReceiveBytes, ProfileEvents::InterfaceHTTPSendBytes) + new HTTPServerConnectionFactory(httpContext(), http_params, createHandlerFactory(*this, config, async_metrics, "HTTPHandler-factory")) ); if (type == "prometheus") return TCPServerConnectionFactory::Ptr( - new HTTPServerConnectionFactory(httpContext(), http_params, createHandlerFactory(*this, config, async_metrics, "PrometheusHandler-factory"), ProfileEvents::InterfacePrometheusReceiveBytes, ProfileEvents::InterfacePrometheusSendBytes) + new HTTPServerConnectionFactory(httpContext(), http_params, createHandlerFactory(*this, config, async_metrics, "PrometheusHandler-factory")) ); if (type == "interserver") return TCPServerConnectionFactory::Ptr( - new HTTPServerConnectionFactory(httpContext(), http_params, createHandlerFactory(*this, config, async_metrics, "InterserverIOHTTPHandler-factory"), ProfileEvents::InterfaceInterserverReceiveBytes, ProfileEvents::InterfaceInterserverSendBytes) + new HTTPServerConnectionFactory(httpContext(), http_params, createHandlerFactory(*this, config, async_metrics, "InterserverIOHTTPHandler-factory")) ); throw Exception(ErrorCodes::INVALID_CONFIG_PARAMETER, "Protocol configuration error, unknown protocol name '{}'", type); @@ -2217,7 +2205,7 @@ void Server::createServers( port_name, "http://" + address.toString(), std::make_unique( -
httpContext(), createHandlerFactory(*this, config, async_metrics, "HTTPHandler-factory"), server_pool, socket, http_params)); }); } @@ -2237,7 +2225,7 @@ void Server::createServers( port_name, "https://" + address.toString(), std::make_unique( - httpContext(), createHandlerFactory(*this, config, async_metrics, "HTTPSHandler-factory"), server_pool, socket, http_params, ProfileEvents::InterfaceHTTPReceiveBytes, ProfileEvents::InterfaceHTTPSendBytes)); + httpContext(), createHandlerFactory(*this, config, async_metrics, "HTTPSHandler-factory"), server_pool, socket, http_params)); #else UNUSED(port); throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "HTTPS protocol is disabled because Poco library was built without NetSSL support."); @@ -2260,7 +2248,7 @@ void Server::createServers( port_name, "native protocol (tcp): " + address.toString(), std::make_unique( - new TCPHandlerFactory(*this, /* secure */ false, /* proxy protocol */ false, ProfileEvents::InterfaceNativeReceiveBytes, ProfileEvents::InterfaceNativeSendBytes), + new TCPHandlerFactory(*this, /* secure */ false, /* proxy protocol */ false), server_pool, socket, new Poco::Net::TCPServerParams)); @@ -2282,7 +2270,7 @@ void Server::createServers( port_name, "native protocol (tcp) with PROXY: " + address.toString(), std::make_unique( - new TCPHandlerFactory(*this, /* secure */ false, /* proxy protocol */ true, ProfileEvents::InterfaceNativeReceiveBytes, ProfileEvents::InterfaceNativeSendBytes), + new TCPHandlerFactory(*this, /* secure */ false, /* proxy protocol */ true), server_pool, socket, new Poco::Net::TCPServerParams)); @@ -2305,7 +2293,7 @@ void Server::createServers( port_name, "secure native protocol (tcp_secure): " + address.toString(), std::make_unique( - new TCPHandlerFactory(*this, /* secure */ true, /* proxy protocol */ false, ProfileEvents::InterfaceNativeReceiveBytes, ProfileEvents::InterfaceNativeSendBytes), + new TCPHandlerFactory(*this, /* secure */ true, /* proxy protocol */ false), server_pool, socket, new Poco::Net::TCPServerParams)); @@ -2329,7 +2317,7 @@ void Server::createServers( listen_host, port_name, "MySQL compatibility protocol: " + address.toString(), - std::make_unique(new MySQLHandlerFactory(*this, ProfileEvents::InterfaceMySQLReceiveBytes, ProfileEvents::InterfaceMySQLSendBytes), server_pool, socket, new Poco::Net::TCPServerParams)); + std::make_unique(new MySQLHandlerFactory(*this), server_pool, socket, new Poco::Net::TCPServerParams)); }); } @@ -2346,7 +2334,7 @@ void Server::createServers( listen_host, port_name, "PostgreSQL compatibility protocol: " + address.toString(), - std::make_unique(new PostgreSQLHandlerFactory(*this, ProfileEvents::InterfacePostgreSQLReceiveBytes, ProfileEvents::InterfacePostgreSQLSendBytes), server_pool, socket, new Poco::Net::TCPServerParams)); + std::make_unique(new PostgreSQLHandlerFactory(*this), server_pool, socket, new Poco::Net::TCPServerParams)); }); } @@ -2380,7 +2368,7 @@ void Server::createServers( port_name, "Prometheus: http://" + address.toString(), std::make_unique( - httpContext(), createHandlerFactory(*this, config, async_metrics, "PrometheusHandler-factory"), server_pool, socket, http_params, ProfileEvents::InterfacePrometheusReceiveBytes, ProfileEvents::InterfacePrometheusSendBytes)); + httpContext(), createHandlerFactory(*this, config, async_metrics, "PrometheusHandler-factory"), server_pool, socket, http_params)); }); } } @@ -2426,9 +2414,7 @@ void Server::createInterserverServers( createHandlerFactory(*this, config, async_metrics, 
"InterserverIOHTTPHandler-factory"), server_pool, socket, - http_params, - ProfileEvents::InterfaceInterserverReceiveBytes, - ProfileEvents::InterfaceInterserverSendBytes)); + http_params)); }); } @@ -2451,9 +2437,7 @@ void Server::createInterserverServers( createHandlerFactory(*this, config, async_metrics, "InterserverIOHTTPSHandler-factory"), server_pool, socket, - http_params, - ProfileEvents::InterfaceInterserverReceiveBytes, - ProfileEvents::InterfaceInterserverSendBytes)); + http_params)); #else UNUSED(port); throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "SSL support for TCP protocol is disabled because Poco library was built without NetSSL support."); diff --git a/programs/server/config.xml b/programs/server/config.xml index 52a1c528040..1be20c5cad8 100644 --- a/programs/server/config.xml +++ b/programs/server/config.xml @@ -1379,6 +1379,9 @@ + + + diff --git a/src/Access/SettingsProfilesCache.cpp b/src/Access/SettingsProfilesCache.cpp index f03e68ba455..275b3aeb6b5 100644 --- a/src/Access/SettingsProfilesCache.cpp +++ b/src/Access/SettingsProfilesCache.cpp @@ -140,8 +140,7 @@ void SettingsProfilesCache::mergeSettingsAndConstraintsFor(EnabledSettings & ena auto info = std::make_shared(access_control); - info->profiles = merged_settings.toProfileIDs(); - substituteProfiles(merged_settings, info->profiles_with_implicit, info->names_of_profiles); + substituteProfiles(merged_settings, info->profiles, info->profiles_with_implicit, info->names_of_profiles); info->settings = merged_settings.toSettingsChanges(); info->constraints = merged_settings.toSettingsConstraints(access_control); @@ -152,9 +151,12 @@ void SettingsProfilesCache::mergeSettingsAndConstraintsFor(EnabledSettings & ena void SettingsProfilesCache::substituteProfiles( SettingsProfileElements & elements, + std::vector & profiles, std::vector & substituted_profiles, std::unordered_map & names_of_substituted_profiles) const { + profiles = elements.toProfileIDs(); + /// We should substitute profiles in reversive order because the same profile can occur /// in `elements` multiple times (with some other settings in between) and in this case /// the last occurrence should override all the previous ones. 
@@ -184,6 +186,11 @@ void SettingsProfilesCache::substituteProfiles( names_of_substituted_profiles.emplace(profile_id, profile->getName()); } std::reverse(substituted_profiles.begin(), substituted_profiles.end()); + + std::erase_if(profiles, [&substituted_profiles_set](const UUID & profile_id) + { + return !substituted_profiles_set.contains(profile_id); + }); } std::shared_ptr SettingsProfilesCache::getEnabledSettings( @@ -225,13 +232,13 @@ std::shared_ptr SettingsProfilesCache::getSettingsPr if (auto pos = this->profile_infos_cache.get(profile_id)) return *pos; - SettingsProfileElements elements = all_profiles[profile_id]->elements; + SettingsProfileElements elements; + auto & element = elements.emplace_back(); + element.parent_profile = profile_id; auto info = std::make_shared(access_control); - info->profiles.push_back(profile_id); - info->profiles_with_implicit.push_back(profile_id); - substituteProfiles(elements, info->profiles_with_implicit, info->names_of_profiles); + substituteProfiles(elements, info->profiles, info->profiles_with_implicit, info->names_of_profiles); info->settings = elements.toSettingsChanges(); info->constraints.merge(elements.toSettingsConstraints(access_control)); diff --git a/src/Access/SettingsProfilesCache.h b/src/Access/SettingsProfilesCache.h index 28914596ccc..afc3c3e13a5 100644 --- a/src/Access/SettingsProfilesCache.h +++ b/src/Access/SettingsProfilesCache.h @@ -37,7 +37,11 @@ private: void profileRemoved(const UUID & profile_id); void mergeSettingsAndConstraints(); void mergeSettingsAndConstraintsFor(EnabledSettings & enabled) const; - void substituteProfiles(SettingsProfileElements & elements, std::vector & substituted_profiles, std::unordered_map & names_of_substituted_profiles) const; + + void substituteProfiles(SettingsProfileElements & elements, + std::vector & profiles, + std::vector & substituted_profiles, + std::unordered_map & names_of_substituted_profiles) const; const AccessControl & access_control; std::unordered_map all_profiles; diff --git a/src/AggregateFunctions/AggregateFunctionLargestTriangleThreeBuckets.cpp b/src/AggregateFunctions/AggregateFunctionLargestTriangleThreeBuckets.cpp index 850a7c688ad..d5abdbc12fb 100644 --- a/src/AggregateFunctions/AggregateFunctionLargestTriangleThreeBuckets.cpp +++ b/src/AggregateFunctions/AggregateFunctionLargestTriangleThreeBuckets.cpp @@ -14,8 +14,9 @@ #include #include #include -#include #include +#include +#include #include #include @@ -48,7 +49,7 @@ struct LargestTriangleThreeBucketsData : public StatisticalSamplex and this->y in ascending order of this->x using index std::vector index(this->x.size()); - std::iota(index.begin(), index.end(), 0); + iota(index.data(), index.size(), size_t(0)); ::sort(index.begin(), index.end(), [&](size_t i1, size_t i2) { return this->x[i1] < this->x[i2]; }); SampleX temp_x{}; diff --git a/src/AggregateFunctions/AggregateFunctionMax.cpp b/src/AggregateFunctions/AggregateFunctionMax.cpp index e74224a24c3..e9cd651b8db 100644 --- a/src/AggregateFunctions/AggregateFunctionMax.cpp +++ b/src/AggregateFunctions/AggregateFunctionMax.cpp @@ -1,7 +1,8 @@ #include #include #include -#include +#include +#include namespace DB { @@ -19,7 +20,7 @@ public: explicit AggregateFunctionsSingleValueMax(const DataTypePtr & type) : Parent(type) { } /// Specializations for native numeric types - ALWAYS_INLINE inline void addBatchSinglePlace( + void addBatchSinglePlace( size_t row_begin, size_t row_end, AggregateDataPtr __restrict place, @@ -27,7 +28,7 @@ public: Arena * arena, ssize_t 
if_argument_pos) const override; - ALWAYS_INLINE inline void addBatchSinglePlaceNotNull( + void addBatchSinglePlaceNotNull( size_t row_begin, size_t row_end, AggregateDataPtr __restrict place, @@ -53,10 +54,10 @@ void AggregateFunctionsSingleValueMax= 0) \ { \ const auto & flags = assert_cast(*columns[if_argument_pos]).getData(); \ - opt = findNumericMaxIf(column.getData().data(), flags.data(), row_begin, row_end); \ + opt = findExtremeMaxIf(column.getData().data(), flags.data(), row_begin, row_end); \ } \ else \ - opt = findNumericMax(column.getData().data(), row_begin, row_end); \ + opt = findExtremeMax(column.getData().data(), row_begin, row_end); \ if (opt.has_value()) \ this->data(place).changeIfGreater(opt.value()); \ } @@ -74,7 +75,57 @@ void AggregateFunctionsSingleValueMax::addBatchSinglePlace( Arena * arena, ssize_t if_argument_pos) const { - return Parent::addBatchSinglePlace(row_begin, row_end, place, columns, arena, if_argument_pos); + if constexpr (!is_any_of) + { + /// Leave other numeric types (large integers, decimals, etc) to keep doing the comparison as it's + /// faster than doing a permutation + return Parent::addBatchSinglePlace(row_begin, row_end, place, columns, arena, if_argument_pos); + } + + constexpr int nan_direction_hint = 1; + auto const & column = *columns[0]; + if (if_argument_pos >= 0) + { + size_t index = row_begin; + const auto & if_flags = assert_cast(*columns[if_argument_pos]).getData(); + while (if_flags[index] == 0 && index < row_end) + index++; + if (index >= row_end) + return; + + for (size_t i = index + 1; i < row_end; i++) + { + if ((if_flags[i] != 0) && (column.compareAt(i, index, column, nan_direction_hint) > 0)) + index = i; + } + this->data(place).changeIfGreater(column, index, arena); + } + else + { + if (row_begin >= row_end) + return; + + /// TODO: Introduce row_begin and row_end to getPermutation + if (row_begin != 0 || row_end != column.size()) + { + size_t index = row_begin; + for (size_t i = index + 1; i < row_end; i++) + { + if (column.compareAt(i, index, column, nan_direction_hint) > 0) + index = i; + } + this->data(place).changeIfGreater(column, index, arena); + } + else + { + constexpr IColumn::PermutationSortDirection direction = IColumn::PermutationSortDirection::Descending; + constexpr IColumn::PermutationSortStability stability = IColumn::PermutationSortStability::Unstable; + IColumn::Permutation permutation; + constexpr UInt64 limit = 1; + column.getPermutation(direction, stability, limit, nan_direction_hint, permutation); + this->data(place).changeIfGreater(column, permutation[0], arena); + } + } } // NOLINTBEGIN(bugprone-macro-parentheses) @@ -97,10 +148,10 @@ void AggregateFunctionsSingleValueMax(row_end); \ for (size_t i = row_begin; i < row_end; ++i) \ final_flags[i] = (!null_map[i]) & !!if_flags[i]; \ - opt = findNumericMaxIf(column.getData().data(), final_flags.get(), row_begin, row_end); \ + opt = findExtremeMaxIf(column.getData().data(), final_flags.get(), row_begin, row_end); \ } \ else \ - opt = findNumericMaxNotNull(column.getData().data(), null_map, row_begin, row_end); \ + opt = findExtremeMaxNotNull(column.getData().data(), null_map, row_begin, row_end); \ if (opt.has_value()) \ this->data(place).changeIfGreater(opt.value()); \ } @@ -119,7 +170,46 @@ void AggregateFunctionsSingleValueMax::addBatchSinglePlaceNotNull( Arena * arena, ssize_t if_argument_pos) const { - return Parent::addBatchSinglePlaceNotNull(row_begin, row_end, place, columns, null_map, arena, if_argument_pos); + if constexpr (!is_any_of) + { + 
/// Leave other numeric types (large integers, decimals, etc) to keep doing the comparison as it's + /// faster than doing a permutation + return Parent::addBatchSinglePlaceNotNull(row_begin, row_end, place, columns, null_map, arena, if_argument_pos); + } + + constexpr int nan_direction_hint = 1; + auto const & column = *columns[0]; + if (if_argument_pos >= 0) + { + size_t index = row_begin; + const auto & if_flags = assert_cast(*columns[if_argument_pos]).getData(); + while ((if_flags[index] == 0 || null_map[index] != 0) && (index < row_end)) + index++; + if (index >= row_end) + return; + + for (size_t i = index + 1; i < row_end; i++) + { + if ((if_flags[i] != 0) && (null_map[i] == 0) && (column.compareAt(i, index, column, nan_direction_hint) > 0)) + index = i; + } + this->data(place).changeIfGreater(column, index, arena); + } + else + { + size_t index = row_begin; + while ((null_map[index] != 0) && (index < row_end)) + index++; + if (index >= row_end) + return; + + for (size_t i = index + 1; i < row_end; i++) + { + if ((null_map[i] == 0) && (column.compareAt(i, index, column, nan_direction_hint) > 0)) + index = i; + } + this->data(place).changeIfGreater(column, index, arena); + } } AggregateFunctionPtr createAggregateFunctionMax( diff --git a/src/AggregateFunctions/AggregateFunctionMin.cpp b/src/AggregateFunctions/AggregateFunctionMin.cpp index 48758aa74b0..d767bd5c563 100644 --- a/src/AggregateFunctions/AggregateFunctionMin.cpp +++ b/src/AggregateFunctions/AggregateFunctionMin.cpp @@ -1,7 +1,8 @@ #include #include #include -#include +#include +#include namespace DB @@ -20,7 +21,7 @@ public: explicit AggregateFunctionsSingleValueMin(const DataTypePtr & type) : Parent(type) { } /// Specializations for native numeric types - ALWAYS_INLINE inline void addBatchSinglePlace( + void addBatchSinglePlace( size_t row_begin, size_t row_end, AggregateDataPtr __restrict place, @@ -28,7 +29,7 @@ public: Arena * arena, ssize_t if_argument_pos) const override; - ALWAYS_INLINE inline void addBatchSinglePlaceNotNull( + void addBatchSinglePlaceNotNull( size_t row_begin, size_t row_end, AggregateDataPtr __restrict place, @@ -54,10 +55,10 @@ public: if (if_argument_pos >= 0) \ { \ const auto & flags = assert_cast(*columns[if_argument_pos]).getData(); \ - opt = findNumericMinIf(column.getData().data(), flags.data(), row_begin, row_end); \ + opt = findExtremeMinIf(column.getData().data(), flags.data(), row_begin, row_end); \ } \ else \ - opt = findNumericMin(column.getData().data(), row_begin, row_end); \ + opt = findExtremeMin(column.getData().data(), row_begin, row_end); \ if (opt.has_value()) \ this->data(place).changeIfLess(opt.value()); \ } @@ -75,7 +76,57 @@ void AggregateFunctionsSingleValueMin::addBatchSinglePlace( Arena * arena, ssize_t if_argument_pos) const { - return Parent::addBatchSinglePlace(row_begin, row_end, place, columns, arena, if_argument_pos); + if constexpr (!is_any_of) + { + /// Leave other numeric types (large integers, decimals, etc) to keep doing the comparison as it's + /// faster than doing a permutation + return Parent::addBatchSinglePlace(row_begin, row_end, place, columns, arena, if_argument_pos); + } + + constexpr int nan_direction_hint = 1; + auto const & column = *columns[0]; + if (if_argument_pos >= 0) + { + size_t index = row_begin; + const auto & if_flags = assert_cast(*columns[if_argument_pos]).getData(); + while (if_flags[index] == 0 && index < row_end) + index++; + if (index >= row_end) + return; + + for (size_t i = index + 1; i < row_end; i++) + { + if ((if_flags[i] != 
0) && (column.compareAt(i, index, column, nan_direction_hint) < 0)) + index = i; + } + this->data(place).changeIfLess(column, index, arena); + } + else + { + if (row_begin >= row_end) + return; + + /// TODO: Introduce row_begin and row_end to getPermutation + if (row_begin != 0 || row_end != column.size()) + { + size_t index = row_begin; + for (size_t i = index + 1; i < row_end; i++) + { + if (column.compareAt(i, index, column, nan_direction_hint) < 0) + index = i; + } + this->data(place).changeIfLess(column, index, arena); + } + else + { + constexpr IColumn::PermutationSortDirection direction = IColumn::PermutationSortDirection::Ascending; + constexpr IColumn::PermutationSortStability stability = IColumn::PermutationSortStability::Unstable; + IColumn::Permutation permutation; + constexpr UInt64 limit = 1; + column.getPermutation(direction, stability, limit, nan_direction_hint, permutation); + this->data(place).changeIfLess(column, permutation[0], arena); + } + } } // NOLINTBEGIN(bugprone-macro-parentheses) @@ -98,10 +149,10 @@ void AggregateFunctionsSingleValueMin::addBatchSinglePlace( auto final_flags = std::make_unique(row_end); \ for (size_t i = row_begin; i < row_end; ++i) \ final_flags[i] = (!null_map[i]) & !!if_flags[i]; \ - opt = findNumericMinIf(column.getData().data(), final_flags.get(), row_begin, row_end); \ + opt = findExtremeMinIf(column.getData().data(), final_flags.get(), row_begin, row_end); \ } \ else \ - opt = findNumericMinNotNull(column.getData().data(), null_map, row_begin, row_end); \ + opt = findExtremeMinNotNull(column.getData().data(), null_map, row_begin, row_end); \ if (opt.has_value()) \ this->data(place).changeIfLess(opt.value()); \ } @@ -120,7 +171,46 @@ void AggregateFunctionsSingleValueMin::addBatchSinglePlaceNotNull( Arena * arena, ssize_t if_argument_pos) const { - return Parent::addBatchSinglePlaceNotNull(row_begin, row_end, place, columns, null_map, arena, if_argument_pos); + if constexpr (!is_any_of) + { + /// Leave other numeric types (large integers, decimals, etc) to keep doing the comparison as it's + /// faster than doing a permutation + return Parent::addBatchSinglePlaceNotNull(row_begin, row_end, place, columns, null_map, arena, if_argument_pos); + } + + constexpr int nan_direction_hint = 1; + auto const & column = *columns[0]; + if (if_argument_pos >= 0) + { + size_t index = row_begin; + const auto & if_flags = assert_cast(*columns[if_argument_pos]).getData(); + while ((if_flags[index] == 0 || null_map[index] != 0) && (index < row_end)) + index++; + if (index >= row_end) + return; + + for (size_t i = index + 1; i < row_end; i++) + { + if ((if_flags[i] != 0) && (null_map[i] == 0) && (column.compareAt(i, index, column, nan_direction_hint) < 0)) + index = i; + } + this->data(place).changeIfLess(column, index, arena); + } + else + { + size_t index = row_begin; + while ((null_map[index] != 0) && (index < row_end)) + index++; + if (index >= row_end) + return; + + for (size_t i = index + 1; i < row_end; i++) + { + if ((null_map[i] == 0) && (column.compareAt(i, index, column, nan_direction_hint) < 0)) + index = i; + } + this->data(place).changeIfLess(column, index, arena); + } } AggregateFunctionPtr createAggregateFunctionMin( diff --git a/src/AggregateFunctions/AggregateFunctionMinMaxAny.h b/src/AggregateFunctions/AggregateFunctionMinMaxAny.h index b69a0b100a3..dec70861543 100644 --- a/src/AggregateFunctions/AggregateFunctionMinMaxAny.h +++ b/src/AggregateFunctions/AggregateFunctionMinMaxAny.h @@ -965,6 +965,7 @@ template struct 
AggregateFunctionMinData : Data { using Self = AggregateFunctionMinData; + using Impl = Data; bool changeIfBetter(const IColumn & column, size_t row_num, Arena * arena) { return this->changeIfLess(column, row_num, arena); } bool changeIfBetter(const Self & to, Arena * arena) { return this->changeIfLess(to, arena); } @@ -993,6 +994,7 @@ template struct AggregateFunctionMaxData : Data { using Self = AggregateFunctionMaxData; + using Impl = Data; bool changeIfBetter(const IColumn & column, size_t row_num, Arena * arena) { return this->changeIfGreater(column, row_num, arena); } bool changeIfBetter(const Self & to, Arena * arena) { return this->changeIfGreater(to, arena); } diff --git a/src/AggregateFunctions/QuantilesCommon.h b/src/AggregateFunctions/QuantilesCommon.h index 3dda0119485..afbca84b827 100644 --- a/src/AggregateFunctions/QuantilesCommon.h +++ b/src/AggregateFunctions/QuantilesCommon.h @@ -6,6 +6,7 @@ #include #include +#include namespace DB @@ -63,10 +64,9 @@ struct QuantileLevels if (isNaN(levels[i]) || levels[i] < 0 || levels[i] > 1) throw Exception(ErrorCodes::PARAMETER_OUT_OF_BOUND, "Quantile level is out of range [0..1]"); - - permutation[i] = i; } + iota(permutation.data(), size, Permutation::value_type(0)); ::sort(permutation.begin(), permutation.end(), [this] (size_t a, size_t b) { return levels[a] < levels[b]; }); } }; diff --git a/src/AggregateFunctions/StatCommon.h b/src/AggregateFunctions/StatCommon.h index 23054e25189..8b1395ea95c 100644 --- a/src/AggregateFunctions/StatCommon.h +++ b/src/AggregateFunctions/StatCommon.h @@ -7,6 +7,7 @@ #include #include +#include #include #include @@ -30,7 +31,7 @@ std::pair computeRanksAndTieCorrection(const Values & value const size_t size = values.size(); /// Save initial positions, than sort indices according to the values. std::vector indexes(size); - std::iota(indexes.begin(), indexes.end(), 0); + iota(indexes.data(), indexes.size(), size_t(0)); std::sort(indexes.begin(), indexes.end(), [&] (size_t lhs, size_t rhs) { return values[lhs] < values[rhs]; }); diff --git a/src/AggregateFunctions/findNumeric.cpp b/src/AggregateFunctions/findNumeric.cpp deleted file mode 100644 index bbad8c1fe3d..00000000000 --- a/src/AggregateFunctions/findNumeric.cpp +++ /dev/null @@ -1,15 +0,0 @@ -#include - -namespace DB -{ -#define INSTANTIATION(T) \ - template std::optional findNumericMin(const T * __restrict ptr, size_t start, size_t end); \ - template std::optional findNumericMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ - template std::optional findNumericMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ - template std::optional findNumericMax(const T * __restrict ptr, size_t start, size_t end); \ - template std::optional findNumericMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ - template std::optional findNumericMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); - -FOR_BASIC_NUMERIC_TYPES(INSTANTIATION) -#undef INSTANTIATION -} diff --git a/src/Analyzer/IQueryTreeNode.h b/src/Analyzer/IQueryTreeNode.h index 922eaabe75c..b07aa2d31b0 100644 --- a/src/Analyzer/IQueryTreeNode.h +++ b/src/Analyzer/IQueryTreeNode.h @@ -143,9 +143,17 @@ public: return alias; } + const String & getOriginalAlias() const + { + return original_alias.empty() ? 
alias : original_alias; + } + /// Set node alias void setAlias(String alias_value) { + if (original_alias.empty()) + original_alias = std::move(alias); + alias = std::move(alias_value); } @@ -276,6 +284,9 @@ protected: private: String alias; + /// An alias from query. Alias can be replaced by query passes, + /// but we need to keep the original one to support additional_table_filters. + String original_alias; ASTPtr original_ast; }; diff --git a/src/Analyzer/Passes/FuseFunctionsPass.cpp b/src/Analyzer/Passes/FuseFunctionsPass.cpp index e77b3ddcb20..443e13b7d9d 100644 --- a/src/Analyzer/Passes/FuseFunctionsPass.cpp +++ b/src/Analyzer/Passes/FuseFunctionsPass.cpp @@ -1,5 +1,6 @@ #include +#include #include #include #include @@ -184,7 +185,7 @@ FunctionNodePtr createFusedQuantilesNode(std::vector & nodes { /// Sort nodes and parameters in ascending order of quantile level std::vector permutation(nodes.size()); - std::iota(permutation.begin(), permutation.end(), 0); + iota(permutation.data(), permutation.size(), size_t(0)); std::sort(permutation.begin(), permutation.end(), [&](size_t i, size_t j) { return parameters[i].get() < parameters[j].get(); }); std::vector new_nodes; diff --git a/src/Analyzer/Passes/QueryAnalysisPass.cpp b/src/Analyzer/Passes/QueryAnalysisPass.cpp index 3290d918a8b..4ad9581b5b6 100644 --- a/src/Analyzer/Passes/QueryAnalysisPass.cpp +++ b/src/Analyzer/Passes/QueryAnalysisPass.cpp @@ -52,6 +52,7 @@ #include +#include #include #include #include @@ -1198,7 +1199,7 @@ private: static void mergeWindowWithParentWindow(const QueryTreeNodePtr & window_node, const QueryTreeNodePtr & parent_window_node, IdentifierResolveScope & scope); - static void replaceNodesWithPositionalArguments(QueryTreeNodePtr & node_list, const QueryTreeNodes & projection_nodes, IdentifierResolveScope & scope); + void replaceNodesWithPositionalArguments(QueryTreeNodePtr & node_list, const QueryTreeNodes & projection_nodes, IdentifierResolveScope & scope); static void convertLimitOffsetExpression(QueryTreeNodePtr & expression_node, const String & expression_description, IdentifierResolveScope & scope); @@ -2168,7 +2169,12 @@ void QueryAnalyzer::replaceNodesWithPositionalArguments(QueryTreeNodePtr & node_ scope.scope_node->formatASTForErrorMessage()); --positional_argument_number; - *node_to_replace = projection_nodes[positional_argument_number]; + *node_to_replace = projection_nodes[positional_argument_number]->clone(); + if (auto it = resolved_expressions.find(projection_nodes[positional_argument_number]); + it != resolved_expressions.end()) + { + resolved_expressions[*node_to_replace] = it->second; + } } } @@ -7366,6 +7372,7 @@ void QueryAnalysisPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context { QueryAnalyzer analyzer; analyzer.resolve(query_tree_node, table_expression, context); + createUniqueTableAliases(query_tree_node, table_expression, context); } } diff --git a/src/Analyzer/Utils.cpp b/src/Analyzer/Utils.cpp index f75022220e7..53fcf534f64 100644 --- a/src/Analyzer/Utils.cpp +++ b/src/Analyzer/Utils.cpp @@ -326,7 +326,7 @@ void addTableExpressionOrJoinIntoTablesInSelectQuery(ASTPtr & tables_in_select_q } } -QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node) +QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node, bool add_array_join) { QueryTreeNodes result; @@ -357,6 +357,8 @@ QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node) { auto & array_join_node = node_to_process->as(); 
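Editorial note: the IQueryTreeNode change above remembers the user-supplied alias before later passes overwrite it (for example, with the __tableN names generated by the new pass below). A minimal standalone sketch of that bookkeeping (AliasHolder is an illustrative stand-in for the real node class):

    #include <string>
    #include <utility>

    class AliasHolder
    {
    public:
        // Returns the first alias the node ever had, falling back to the current one.
        const std::string & getOriginalAlias() const
        {
            return original_alias.empty() ? alias : original_alias;
        }

        void setAlias(std::string alias_value)
        {
            if (original_alias.empty())
                original_alias = std::move(alias);   // preserve the user-visible name
            alias = std::move(alias_value);
        }

    private:
        std::string alias;
        std::string original_alias;   // kept so additional_table_filters still match
    };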
nodes_to_process.push_front(array_join_node.getTableExpression()); + if (add_array_join) + result.push_back(std::move(node_to_process)); break; } case QueryTreeNodeType::JOIN: diff --git a/src/Analyzer/Utils.h b/src/Analyzer/Utils.h index e3316f5ad6b..d3eb6ba3cc2 100644 --- a/src/Analyzer/Utils.h +++ b/src/Analyzer/Utils.h @@ -51,7 +51,7 @@ std::optional tryExtractConstantFromConditionNode(const QueryTreeNodePtr & void addTableExpressionOrJoinIntoTablesInSelectQuery(ASTPtr & tables_in_select_query_ast, const QueryTreeNodePtr & table_expression, const IQueryTreeNode::ConvertToASTOptions & convert_to_ast_options); /// Extract table, table function, query, union from join tree -QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node); +QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node, bool add_array_join = false); /// Extract left table expression from join tree QueryTreeNodePtr extractLeftTableExpression(const QueryTreeNodePtr & join_tree_node); diff --git a/src/Analyzer/createUniqueTableAliases.cpp b/src/Analyzer/createUniqueTableAliases.cpp new file mode 100644 index 00000000000..8f850fe8dec --- /dev/null +++ b/src/Analyzer/createUniqueTableAliases.cpp @@ -0,0 +1,141 @@ +#include +#include +#include +#include +#include +#include +#include +#include + +namespace DB +{ + +namespace +{ + +class CreateUniqueTableAliasesVisitor : public InDepthQueryTreeVisitorWithContext +{ +public: + using Base = InDepthQueryTreeVisitorWithContext; + + explicit CreateUniqueTableAliasesVisitor(const ContextPtr & context) + : Base(context) + { + // Insert a fake node on top of the stack. + scope_nodes_stack.push_back(std::make_shared(Names{}, nullptr)); + } + + void enterImpl(QueryTreeNodePtr & node) + { + auto node_type = node->getNodeType(); + + switch (node_type) + { + case QueryTreeNodeType::QUERY: + [[fallthrough]]; + case QueryTreeNodeType::UNION: + { + /// Queries like `(SELECT 1) as t` have invalid syntax. To avoid creating such queries (e.g. in StorageDistributed) + /// we need to remove aliases for top level queries. + /// N.B. Subquery depth starts count from 1, so the following condition checks if it's a top level. + if (getSubqueryDepth() == 1) + { + node->removeAlias(); + break; + } + [[fallthrough]]; + } + case QueryTreeNodeType::TABLE: + [[fallthrough]]; + case QueryTreeNodeType::TABLE_FUNCTION: + [[fallthrough]]; + case QueryTreeNodeType::ARRAY_JOIN: + { + auto & alias = table_expression_to_alias[node]; + if (alias.empty()) + { + scope_to_nodes_with_aliases[scope_nodes_stack.back()].push_back(node); + alias = fmt::format("__table{}", ++next_id); + node->setAlias(alias); + } + break; + } + default: + break; + } + + switch (node_type) + { + case QueryTreeNodeType::QUERY: + [[fallthrough]]; + case QueryTreeNodeType::UNION: + [[fallthrough]]; + case QueryTreeNodeType::LAMBDA: + scope_nodes_stack.push_back(node); + break; + default: + break; + } + } + + void leaveImpl(QueryTreeNodePtr & node) + { + if (scope_nodes_stack.back() == node) + { + if (auto it = scope_to_nodes_with_aliases.find(scope_nodes_stack.back()); + it != scope_to_nodes_with_aliases.end()) + { + for (const auto & node_with_alias : it->second) + { + table_expression_to_alias.erase(node_with_alias); + } + scope_to_nodes_with_aliases.erase(it); + } + scope_nodes_stack.pop_back(); + } + + /// Here we revisit subquery for IN function. Reasons: + /// * For remote query execution, query tree may be traversed a few times. 
+ /// In such a case, it is possible to get AST like + /// `IN ((SELECT ... FROM table AS __table4) AS __table1)` which results in a + /// `Multiple expressions for the alias` exception + /// * Tables in subqueries could have different aliases => different tree hashes, + /// which is important to be able to find a set in PreparedSets + /// See 01253_subquery_in_aggregate_function_JustStranger. + /// + /// So, we revisit this subquery to make aliases stable. + /// This should be safe because columns from the IN subquery can't be used in the main query anyway. + if (node->getNodeType() == QueryTreeNodeType::FUNCTION) + { + auto * function_node = node->as(); + if (isNameOfInFunction(function_node->getFunctionName())) + { + auto arg = function_node->getArguments().getNodes().back(); + /// Avoid aliasing IN `table` + if (arg->getNodeType() != QueryTreeNodeType::TABLE) + CreateUniqueTableAliasesVisitor(getContext()).visit(function_node->getArguments().getNodes().back()); + } + } + } + +private: + size_t next_id = 0; + + // Stack of nodes which create scopes: QUERY, UNION and LAMBDA. + QueryTreeNodes scope_nodes_stack; + + std::unordered_map scope_to_nodes_with_aliases; + + // We need to use raw pointer as a key, not a QueryTreeNodePtrWithHash. + std::unordered_map table_expression_to_alias; +}; + +} + + +void createUniqueTableAliases(QueryTreeNodePtr & node, const QueryTreeNodePtr & /*table_expression*/, const ContextPtr & context) +{ + CreateUniqueTableAliasesVisitor(context).visit(node); +} + +} diff --git a/src/Analyzer/createUniqueTableAliases.h b/src/Analyzer/createUniqueTableAliases.h new file mode 100644 index 00000000000..d57a198498c --- /dev/null +++ b/src/Analyzer/createUniqueTableAliases.h @@ -0,0 +1,18 @@ +#pragma once + +#include +#include + +class IQueryTreeNode; +using QueryTreeNodePtr = std::shared_ptr; + +namespace DB +{ + +/* + * For each table expression in the Query Tree, generate and add a unique alias. + * If a table expression had an alias in the initial query tree, override it. + */ +void createUniqueTableAliases(QueryTreeNodePtr & node, const QueryTreeNodePtr & table_expression, const ContextPtr & context); + +} diff --git a/src/Backups/RestorerFromBackup.cpp b/src/Backups/RestorerFromBackup.cpp index 4e580e493a7..a33773f19ab 100644 --- a/src/Backups/RestorerFromBackup.cpp +++ b/src/Backups/RestorerFromBackup.cpp @@ -573,11 +573,12 @@ void RestorerFromBackup::createDatabase(const String & database_name) const create_database_query->if_not_exists = (restore_settings.create_table == RestoreTableCreationMode::kCreateIfNotExists); LOG_TRACE(log, "Creating database {}: {}", backQuoteIfNeed(database_name), serializeAST(*create_database_query)); - + auto query_context = Context::createCopy(context); + query_context->setSetting("allow_deprecated_database_ordinary", 1); try { /// Execute CREATE DATABASE query. 
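Editorial note: the RestorerFromBackup hunk that follows runs its internal CREATE DATABASE under a copied context with allow_deprecated_database_ordinary forced on, so the override never leaks into the caller's context. A compilable stand-in for that pattern (Context here is a toy, not the ClickHouse class):

    #include <map>
    #include <memory>
    #include <string>

    struct Context
    {
        std::map<std::string, int> settings;

        static std::shared_ptr<Context> createCopy(const std::shared_ptr<const Context> & other)
        {
            return std::make_shared<Context>(*other);   // settings are duplicated, not shared
        }

        void setSetting(const std::string & name, int value) { settings[name] = value; }
    };

    void restoreDatabase(const std::shared_ptr<const Context> & caller_context)
    {
        auto query_context = Context::createCopy(caller_context);
        query_context->setSetting("allow_deprecated_database_ordinary", 1);
        // ... execute the internal CREATE DATABASE with query_context;
        // caller_context is left untouched.
    }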
- InterpreterCreateQuery interpreter{create_database_query, context}; + InterpreterCreateQuery interpreter{create_database_query, query_context}; interpreter.setInternal(true); interpreter.execute(); } diff --git a/src/Columns/ColumnAggregateFunction.cpp b/src/Columns/ColumnAggregateFunction.cpp index 0ec5db6c69d..2018015b46d 100644 --- a/src/Columns/ColumnAggregateFunction.cpp +++ b/src/Columns/ColumnAggregateFunction.cpp @@ -1,18 +1,19 @@ #include #include #include -#include -#include +#include #include #include -#include -#include -#include +#include #include -#include #include -#include +#include #include +#include +#include +#include +#include +#include namespace DB @@ -626,8 +627,7 @@ void ColumnAggregateFunction::getPermutation(PermutationSortDirection /*directio { size_t s = data.size(); res.resize(s); - for (size_t i = 0; i < s; ++i) - res[i] = i; + iota(res.data(), s, IColumn::Permutation::value_type(0)); } void ColumnAggregateFunction::updatePermutation(PermutationSortDirection, PermutationSortStability, diff --git a/src/Columns/ColumnConst.cpp b/src/Columns/ColumnConst.cpp index 10e960ea244..9aa0f5cfa49 100644 --- a/src/Columns/ColumnConst.cpp +++ b/src/Columns/ColumnConst.cpp @@ -2,9 +2,10 @@ #include #include -#include -#include #include +#include +#include +#include #include @@ -128,8 +129,7 @@ void ColumnConst::getPermutation(PermutationSortDirection /*direction*/, Permuta size_t /*limit*/, int /*nan_direction_hint*/, Permutation & res) const { res.resize(s); - for (size_t i = 0; i < s; ++i) - res[i] = i; + iota(res.data(), s, IColumn::Permutation::value_type(0)); } void ColumnConst::updatePermutation(PermutationSortDirection /*direction*/, PermutationSortStability /*stability*/, diff --git a/src/Columns/ColumnDecimal.cpp b/src/Columns/ColumnDecimal.cpp index baccfc69147..20fc5d8e1fe 100644 --- a/src/Columns/ColumnDecimal.cpp +++ b/src/Columns/ColumnDecimal.cpp @@ -1,10 +1,11 @@ -#include #include -#include -#include -#include +#include #include #include +#include +#include +#include +#include #include @@ -163,8 +164,7 @@ void ColumnDecimal::getPermutation(IColumn::PermutationSortDirection directio if (limit >= data_size) limit = 0; - for (size_t i = 0; i < data_size; ++i) - res[i] = i; + iota(res.data(), data_size, IColumn::Permutation::value_type(0)); if constexpr (is_arithmetic_v && !is_big_int_v) { @@ -183,8 +183,7 @@ void ColumnDecimal::getPermutation(IColumn::PermutationSortDirection directio /// Thresholds on size. Lower threshold is arbitrary. Upper threshold is chosen by the type for histogram counters. 
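Editorial note: many hunks in this section swap a hand-written index-filling loop (or std::iota over iterators) for a shared iota() helper over raw pointers. Assuming the helper simply writes consecutive values (the real one may add SIMD specializations), an equivalent is:

    #include <cstddef>

    template <typename T>
    void iota(T * begin, size_t count, T first_value)
    {
        for (size_t i = 0; i < count; ++i)
            begin[i] = first_value + static_cast<T>(i);   // first, first+1, ...
    }

    // Usage mirroring the call sites in these hunks:
    //   iota(res.data(), data_size, IColumn::Permutation::value_type(0));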
if (data_size >= 256 && data_size <= std::numeric_limits::max() && use_radix_sort) { - for (size_t i = 0; i < data_size; ++i) - res[i] = i; + iota(res.data(), data_size, IColumn::Permutation::value_type(0)); bool try_sort = false; diff --git a/src/Columns/ColumnObject.cpp b/src/Columns/ColumnObject.cpp index 2052ec3c968..f7176568a1b 100644 --- a/src/Columns/ColumnObject.cpp +++ b/src/Columns/ColumnObject.cpp @@ -2,6 +2,7 @@ #include #include #include +#include #include #include #include @@ -838,7 +839,7 @@ MutableColumnPtr ColumnObject::cloneResized(size_t new_size) const void ColumnObject::getPermutation(PermutationSortDirection, PermutationSortStability, size_t, int, Permutation & res) const { res.resize(num_rows); - std::iota(res.begin(), res.end(), 0); + iota(res.data(), res.size(), size_t(0)); } void ColumnObject::compareColumn(const IColumn & rhs, size_t rhs_row_num, diff --git a/src/Columns/ColumnSparse.cpp b/src/Columns/ColumnSparse.cpp index 057c0cd7112..02e6e9e56b4 100644 --- a/src/Columns/ColumnSparse.cpp +++ b/src/Columns/ColumnSparse.cpp @@ -1,11 +1,12 @@ -#include -#include #include +#include #include -#include -#include -#include +#include #include +#include +#include +#include +#include #include #include @@ -499,8 +500,7 @@ void ColumnSparse::getPermutationImpl(IColumn::PermutationSortDirection directio res.resize(_size); if (offsets->empty()) { - for (size_t i = 0; i < _size; ++i) - res[i] = i; + iota(res.data(), _size, IColumn::Permutation::value_type(0)); return; } diff --git a/src/Columns/ColumnTuple.cpp b/src/Columns/ColumnTuple.cpp index d8992125be4..356bb0493d2 100644 --- a/src/Columns/ColumnTuple.cpp +++ b/src/Columns/ColumnTuple.cpp @@ -1,16 +1,17 @@ #include -#include -#include #include +#include #include -#include +#include #include #include +#include +#include #include #include +#include #include -#include namespace DB @@ -378,8 +379,7 @@ void ColumnTuple::getPermutationImpl(IColumn::PermutationSortDirection direction { size_t rows = size(); res.resize(rows); - for (size_t i = 0; i < rows; ++i) - res[i] = i; + iota(res.data(), rows, IColumn::Permutation::value_type(0)); if (limit >= rows) limit = 0; diff --git a/src/Columns/ColumnVector.cpp b/src/Columns/ColumnVector.cpp index 37e62c76596..b1cf449dfde 100644 --- a/src/Columns/ColumnVector.cpp +++ b/src/Columns/ColumnVector.cpp @@ -1,24 +1,25 @@ #include "ColumnVector.h" -#include #include +#include #include #include -#include #include +#include +#include +#include +#include +#include #include #include #include #include #include #include -#include #include +#include #include -#include -#include -#include -#include +#include #include #include @@ -244,8 +245,7 @@ void ColumnVector::getPermutation(IColumn::PermutationSortDirection direction if (limit >= data_size) limit = 0; - for (size_t i = 0; i < data_size; ++i) - res[i] = i; + iota(res.data(), data_size, IColumn::Permutation::value_type(0)); if constexpr (is_arithmetic_v && !is_big_int_v) { diff --git a/src/Columns/IColumnDummy.cpp b/src/Columns/IColumnDummy.cpp index 01091a87049..7c237536f94 100644 --- a/src/Columns/IColumnDummy.cpp +++ b/src/Columns/IColumnDummy.cpp @@ -1,7 +1,8 @@ -#include -#include -#include #include +#include +#include +#include +#include namespace DB @@ -87,8 +88,7 @@ void IColumnDummy::getPermutation(IColumn::PermutationSortDirection /*direction* size_t /*limit*/, int /*nan_direction_hint*/, Permutation & res) const { res.resize(s); - for (size_t i = 0; i < s; ++i) - res[i] = i; + iota(res.data(), s, IColumn::Permutation::value_type(0)); 
} ColumnPtr IColumnDummy::replicate(const Offsets & offsets) const diff --git a/src/Columns/IColumnImpl.h b/src/Columns/IColumnImpl.h index 0eab9452813..8e0bf0014f2 100644 --- a/src/Columns/IColumnImpl.h +++ b/src/Columns/IColumnImpl.h @@ -6,10 +6,11 @@ * implementation. */ -#include -#include -#include #include +#include +#include +#include +#include namespace DB @@ -299,8 +300,7 @@ void IColumn::getPermutationImpl( if (limit >= data_size) limit = 0; - for (size_t i = 0; i < data_size; ++i) - res[i] = i; + iota(res.data(), data_size, Permutation::value_type(0)); if (limit) { diff --git a/src/Columns/tests/gtest_column_sparse.cpp b/src/Columns/tests/gtest_column_sparse.cpp index c3450ff91b4..02b15a2f5c4 100644 --- a/src/Columns/tests/gtest_column_sparse.cpp +++ b/src/Columns/tests/gtest_column_sparse.cpp @@ -1,6 +1,7 @@ #include #include +#include #include #include #include @@ -191,7 +192,7 @@ TEST(ColumnSparse, Permute) auto [sparse_src, full_src] = createColumns(n, k); IColumn::Permutation perm(n); - std::iota(perm.begin(), perm.end(), 0); + iota(perm.data(), perm.size(), size_t(0)); std::shuffle(perm.begin(), perm.end(), rng); auto sparse_dst = sparse_src->permute(perm, limit); diff --git a/src/Columns/tests/gtest_column_stable_permutation.cpp b/src/Columns/tests/gtest_column_stable_permutation.cpp index df898cffa04..0dabd4d1fc2 100644 --- a/src/Columns/tests/gtest_column_stable_permutation.cpp +++ b/src/Columns/tests/gtest_column_stable_permutation.cpp @@ -9,7 +9,6 @@ #include #include #include - #include #include #include @@ -17,6 +16,7 @@ #include #include #include +#include using namespace DB; @@ -32,8 +32,7 @@ void stableGetColumnPermutation( size_t size = column.size(); out_permutation.resize(size); - for (size_t i = 0; i < size; ++i) - out_permutation[i] = i; + iota(out_permutation.data(), size, IColumn::Permutation::value_type(0)); std::stable_sort( out_permutation.begin(), @@ -146,10 +145,7 @@ void assertColumnPermutations(ColumnCreateFunc column_create_func, ValueTransfor std::vector> ranges(ranges_size); std::vector ranges_permutations(ranges_size); - for (size_t i = 0; i < ranges_size; ++i) - { - ranges_permutations[i] = i; - } + iota(ranges_permutations.data(), ranges_size, IColumn::Permutation::value_type(0)); IColumn::Permutation actual_permutation; IColumn::Permutation expected_permutation; diff --git a/src/Common/CounterInFile.h b/src/Common/CounterInFile.h index 993ed97966a..854bf7cc675 100644 --- a/src/Common/CounterInFile.h +++ b/src/Common/CounterInFile.h @@ -88,7 +88,7 @@ public: { /// A more understandable error message. if (e.code() == DB::ErrorCodes::CANNOT_READ_ALL_DATA || e.code() == DB::ErrorCodes::ATTEMPT_TO_READ_AFTER_EOF) - throw DB::ParsingException(e.code(), "File {} is empty. You must fill it manually with appropriate value.", path); + throw DB::Exception(e.code(), "File {} is empty. 
You must fill it manually with appropriate value.", path); else throw; } diff --git a/src/Common/ErrorCodes.cpp b/src/Common/ErrorCodes.cpp index 9222a27afdf..577a83e40b9 100644 --- a/src/Common/ErrorCodes.cpp +++ b/src/Common/ErrorCodes.cpp @@ -589,6 +589,7 @@ M(707, GCP_ERROR) \ M(708, ILLEGAL_STATISTIC) \ M(709, CANNOT_GET_REPLICATED_DATABASE_SNAPSHOT) \ + M(710, FAULT_INJECTED) \ \ M(999, KEEPER_EXCEPTION) \ M(1000, POCO_EXCEPTION) \ diff --git a/src/Common/Exception.cpp b/src/Common/Exception.cpp index d5f1984a5ff..e1f010cc740 100644 --- a/src/Common/Exception.cpp +++ b/src/Common/Exception.cpp @@ -616,48 +616,4 @@ ExecutionStatus ExecutionStatus::fromText(const std::string & data) return status; } -ParsingException::ParsingException() = default; -ParsingException::ParsingException(const std::string & msg, int code) - : Exception(msg, code) -{ -} - -/// We use additional field formatted_message_ to make this method const. -std::string ParsingException::displayText() const -{ - try - { - formatted_message = message(); - bool need_newline = false; - if (!file_name.empty()) - { - formatted_message += fmt::format(": (in file/uri {})", file_name); - need_newline = true; - } - - if (line_number != -1) - { - formatted_message += fmt::format(": (at row {})", line_number); - need_newline = true; - } - - if (need_newline) - formatted_message += "\n"; - } - catch (...) {} // NOLINT(bugprone-empty-catch) - - if (!formatted_message.empty()) - { - std::string result = name(); - result.append(": "); - result.append(formatted_message); - return result; - } - else - { - return Exception::displayText(); - } -} - - } diff --git a/src/Common/Exception.h b/src/Common/Exception.h index aabc848b230..6f30fde3876 100644 --- a/src/Common/Exception.h +++ b/src/Common/Exception.h @@ -235,43 +235,6 @@ private: const char * className() const noexcept override { return "DB::ErrnoException"; } }; - -/// Special class of exceptions, used mostly in ParallelParsingInputFormat for -/// more convenient calculation of problem line number. -class ParsingException : public Exception -{ - ParsingException(const std::string & msg, int code); -public: - ParsingException(); - - // Format message with fmt::format, like the logging functions. - template - ParsingException(int code, FormatStringHelper fmt, Args &&... args) : Exception(fmt::format(fmt.fmt_str, std::forward(args)...), code) - { - message_format_string = fmt.message_format_string; - } - - std::string displayText() const override; - - ssize_t getLineNumber() const { return line_number; } - void setLineNumber(int line_number_) { line_number = line_number_;} - - String getFileName() const { return file_name; } - void setFileName(const String & file_name_) { file_name = file_name_; } - - Exception * clone() const override { return new ParsingException(*this); } - void rethrow() const override { throw *this; } // NOLINT - -private: - ssize_t line_number{-1}; - String file_name; - mutable std::string formatted_message; - - const char * name() const noexcept override { return "DB::ParsingException"; } - const char * className() const noexcept override { return "DB::ParsingException"; } -}; - - using Exceptions = std::vector; /** Try to write an exception to the log (and forget about it). 
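Editorial note: with ParsingException removed, throw sites such as the CounterInFile.h hunk above fall back to the generic Exception carrying the same error code and message; the file/line decoration that displayText() used to append is simply dropped. A toy before/after illustration (Exception here is a stand-in type, and the error code is arbitrary for the sketch):

    #include <stdexcept>
    #include <string>

    struct Exception : std::runtime_error
    {
        int code;
        Exception(int code_, const std::string & msg) : std::runtime_error(msg), code(code_) {}
    };

    void failOnEmptyFile(const std::string & path)
    {
        // Before: throw ParsingException(code, "File {} is empty. ...", path);
        // After:  the plain Exception carries the identical code and text.
        throw Exception(/* error code, arbitrary in this sketch */ 1,
                        "File " + path + " is empty. You must fill it manually with appropriate value.");
    }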
diff --git a/src/Common/FailPoint.cpp b/src/Common/FailPoint.cpp index 9665788dac2..f29aee0cdcc 100644 --- a/src/Common/FailPoint.cpp +++ b/src/Common/FailPoint.cpp @@ -34,6 +34,8 @@ static struct InitFiu #define APPLY_FOR_FAILPOINTS(ONCE, REGULAR, PAUSEABLE_ONCE, PAUSEABLE) \ ONCE(replicated_merge_tree_commit_zk_fail_after_op) \ + ONCE(replicated_queue_fail_next_entry) \ + REGULAR(replicated_queue_unfail_entries) \ ONCE(replicated_merge_tree_insert_quorum_fail_0) \ REGULAR(replicated_merge_tree_commit_zk_fail_when_recovering_from_hw_fault) \ REGULAR(use_delayed_remote_source) \ diff --git a/src/Common/ProfileEvents.cpp b/src/Common/ProfileEvents.cpp index d6e5a77b64a..119e0d99143 100644 --- a/src/Common/ProfileEvents.cpp +++ b/src/Common/ProfileEvents.cpp @@ -288,6 +288,18 @@ The server successfully detected this situation and will download merged part fr M(OSReadChars, "Number of bytes read from filesystem, including page cache.") \ M(OSWriteChars, "Number of bytes written to filesystem, including page cache.") \ \ + M(ParallelReplicasHandleRequestMicroseconds, "Time spent processing requests for marks from replicas") \ + M(ParallelReplicasHandleAnnouncementMicroseconds, "Time spent processing replicas announcements") \ + \ + M(ParallelReplicasReadAssignedMarks, "Sum across all replicas of how many of scheduled marks were assigned by consistent hash") \ + M(ParallelReplicasReadUnassignedMarks, "Sum across all replicas of how many unassigned marks were scheduled") \ + M(ParallelReplicasReadAssignedForStealingMarks, "Sum across all replicas of how many of scheduled marks were assigned for stealing by consistent hash") \ + \ + M(ParallelReplicasStealingByHashMicroseconds, "Time spent collecting segments meant for stealing by hash") \ + M(ParallelReplicasProcessingPartsMicroseconds, "Time spent processing data parts") \ + M(ParallelReplicasStealingLeftoversMicroseconds, "Time spent collecting orphaned segments") \ + M(ParallelReplicasCollectingOwnedSegmentsMicroseconds, "Time spent collecting segments meant by hash") \ + \ M(PerfCpuCycles, "Total cycles. Be wary of what happens during CPU frequency scaling.") \ M(PerfInstructions, "Retired instructions. Be careful, these can be affected by various issues, most notably hardware interrupt counts.") \ M(PerfCacheReferences, "Cache accesses. Usually, this indicates Last Level Cache accesses, but this may vary depending on your CPU. 
This may include prefetches and coherency messages; again this depends on the design of your CPU.") \ @@ -587,19 +599,6 @@ The server successfully detected this situation and will download merged part fr M(LogError, "Number of log messages with level Error") \ M(LogFatal, "Number of log messages with level Fatal") \ \ - M(InterfaceHTTPSendBytes, "Number of bytes sent through HTTP interfaces") \ - M(InterfaceHTTPReceiveBytes, "Number of bytes received through HTTP interfaces") \ - M(InterfaceNativeSendBytes, "Number of bytes sent through native interfaces") \ - M(InterfaceNativeReceiveBytes, "Number of bytes received through native interfaces") \ - M(InterfacePrometheusSendBytes, "Number of bytes sent through Prometheus interfaces") \ - M(InterfacePrometheusReceiveBytes, "Number of bytes received through Prometheus interfaces") \ - M(InterfaceInterserverSendBytes, "Number of bytes sent through interserver interfaces") \ - M(InterfaceInterserverReceiveBytes, "Number of bytes received through interserver interfaces") \ - M(InterfaceMySQLSendBytes, "Number of bytes sent through MySQL interfaces") \ - M(InterfaceMySQLReceiveBytes, "Number of bytes received through MySQL interfaces") \ - M(InterfacePostgreSQLSendBytes, "Number of bytes sent through PostgreSQL interfaces") \ - M(InterfacePostgreSQLReceiveBytes, "Number of bytes received through PostgreSQL interfaces") \ - \ M(ParallelReplicasUsedCount, "Number of replicas used to execute a query with task-based parallel replicas") \ #ifdef APPLY_FOR_EXTERNAL_EVENTS diff --git a/src/Common/TargetSpecific.h b/src/Common/TargetSpecific.h index 4ee29d3fc55..68f6d39c3ff 100644 --- a/src/Common/TargetSpecific.h +++ b/src/Common/TargetSpecific.h @@ -365,7 +365,7 @@ DECLARE_AVX512VBMI2_SPECIFIC_CODE( FUNCTION_HEADER \ \ name \ - FUNCTION_BODY \ + FUNCTION_BODY \ /// NOLINTNEXTLINE #define MULTITARGET_FUNCTION_AVX512BW_AVX512F_AVX2_SSE42(FUNCTION_HEADER, name, FUNCTION_BODY) \ diff --git a/src/AggregateFunctions/findNumeric.h b/src/Common/findExtreme.cpp similarity index 57% rename from src/AggregateFunctions/findNumeric.h rename to src/Common/findExtreme.cpp index df7c325569a..032ac75b79b 100644 --- a/src/AggregateFunctions/findNumeric.h +++ b/src/Common/findExtreme.cpp @@ -1,18 +1,9 @@ -#pragma once - #include -#include -#include -#include #include - -#include -#include +#include namespace DB { -template -concept is_any_native_number = (is_any_of); template struct MinComparator @@ -28,8 +19,8 @@ struct MaxComparator MULTITARGET_FUNCTION_AVX2_SSE42( MULTITARGET_FUNCTION_HEADER(template static std::optional NO_INLINE), - findNumericExtremeImpl, - MULTITARGET_FUNCTION_BODY((const T * __restrict ptr, const UInt8 * __restrict condition_map [[maybe_unused]], size_t row_begin, size_t row_end) + findExtremeImpl, + MULTITARGET_FUNCTION_BODY((const T * __restrict ptr, const UInt8 * __restrict condition_map [[maybe_unused]], size_t row_begin, size_t row_end) /// NOLINT { size_t count = row_end - row_begin; ptr += row_begin; @@ -86,69 +77,67 @@ MULTITARGET_FUNCTION_AVX2_SSE42( } )) - /// Given a vector of T finds the extreme (MIN or MAX) value template static std::optional -findNumericExtreme(const T * __restrict ptr, const UInt8 * __restrict condition_map [[maybe_unused]], size_t start, size_t end) +findExtreme(const T * __restrict ptr, const UInt8 * __restrict condition_map [[maybe_unused]], size_t start, size_t end) { #if USE_MULTITARGET_CODE /// We see no benefit from using AVX512BW or AVX512F (over AVX2), so we only declare SSE and AVX2 if 
(isArchSupported(TargetArch::AVX2)) - return findNumericExtremeImplAVX2(ptr, condition_map, start, end); + return findExtremeImplAVX2(ptr, condition_map, start, end); if (isArchSupported(TargetArch::SSE42)) - return findNumericExtremeImplSSE42(ptr, condition_map, start, end); + return findExtremeImplSSE42(ptr, condition_map, start, end); #endif - return findNumericExtremeImpl(ptr, condition_map, start, end); + return findExtremeImpl(ptr, condition_map, start, end); } template -std::optional findNumericMin(const T * __restrict ptr, size_t start, size_t end) +std::optional findExtremeMin(const T * __restrict ptr, size_t start, size_t end) { - return findNumericExtreme, true, false>(ptr, nullptr, start, end); + return findExtreme, true, false>(ptr, nullptr, start, end); } template -std::optional findNumericMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) +std::optional findExtremeMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) { - return findNumericExtreme, false, true>(ptr, condition_map, start, end); + return findExtreme, false, true>(ptr, condition_map, start, end); } template -std::optional findNumericMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) +std::optional findExtremeMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) { - return findNumericExtreme, false, false>(ptr, condition_map, start, end); + return findExtreme, false, false>(ptr, condition_map, start, end); } template -std::optional findNumericMax(const T * __restrict ptr, size_t start, size_t end) +std::optional findExtremeMax(const T * __restrict ptr, size_t start, size_t end) { - return findNumericExtreme, true, false>(ptr, nullptr, start, end); + return findExtreme, true, false>(ptr, nullptr, start, end); } template -std::optional findNumericMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) +std::optional findExtremeMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) { - return findNumericExtreme, false, true>(ptr, condition_map, start, end); + return findExtreme, false, true>(ptr, condition_map, start, end); } template -std::optional findNumericMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) +std::optional findExtremeMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) { - return findNumericExtreme, false, false>(ptr, condition_map, start, end); + return findExtreme, false, false>(ptr, condition_map, start, end); } -#define EXTERN_INSTANTIATION(T) \ - extern template std::optional findNumericMin(const T * __restrict ptr, size_t start, size_t end); \ - extern template std::optional findNumericMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ - extern template std::optional findNumericMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ - extern template std::optional findNumericMax(const T * __restrict ptr, size_t start, size_t end); \ - extern template std::optional findNumericMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ - extern template std::optional findNumericMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); - - 
FOR_BASIC_NUMERIC_TYPES(EXTERN_INSTANTIATION) -#undef EXTERN_INSTANTIATION +#define INSTANTIATION(T) \ + template std::optional findExtremeMin(const T * __restrict ptr, size_t start, size_t end); \ + template std::optional findExtremeMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ + template std::optional findExtremeMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ + template std::optional findExtremeMax(const T * __restrict ptr, size_t start, size_t end); \ + template std::optional findExtremeMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ + template std::optional findExtremeMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); +FOR_BASIC_NUMERIC_TYPES(INSTANTIATION) +#undef INSTANTIATION } diff --git a/src/Common/findExtreme.h b/src/Common/findExtreme.h new file mode 100644 index 00000000000..b38c24697c0 --- /dev/null +++ b/src/Common/findExtreme.h @@ -0,0 +1,45 @@ +#pragma once + +#include +#include +#include +#include + +#include +#include + +namespace DB +{ +template +concept is_any_native_number = (is_any_of); + +template +std::optional findExtremeMin(const T * __restrict ptr, size_t start, size_t end); + +template +std::optional findExtremeMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); + +template +std::optional findExtremeMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); + +template +std::optional findExtremeMax(const T * __restrict ptr, size_t start, size_t end); + +template +std::optional findExtremeMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); + +template +std::optional findExtremeMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); + +#define EXTERN_INSTANTIATION(T) \ + extern template std::optional findExtremeMin(const T * __restrict ptr, size_t start, size_t end); \ + extern template std::optional findExtremeMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ + extern template std::optional findExtremeMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ + extern template std::optional findExtremeMax(const T * __restrict ptr, size_t start, size_t end); \ + extern template std::optional findExtremeMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \ + extern template std::optional findExtremeMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); + + FOR_BASIC_NUMERIC_TYPES(EXTERN_INSTANTIATION) +#undef EXTERN_INSTANTIATION + +} diff --git a/src/Common/iota.cpp b/src/Common/iota.cpp new file mode 100644 index 00000000000..98f18eb195b --- /dev/null +++ b/src/Common/iota.cpp @@ -0,0 +1,36 @@ +#include +#include +#include + +namespace DB +{ + +MULTITARGET_FUNCTION_AVX2_SSE42( + MULTITARGET_FUNCTION_HEADER(template void NO_INLINE), + iotaImpl, MULTITARGET_FUNCTION_BODY((T * begin, size_t count, T first_value) /// NOLINT + { + for (size_t i = 0; i < count; i++) + *(begin + i) = static_cast(first_value + i); + }) +) + +template +void iota(T * begin, size_t count, T first_value) +{ +#if USE_MULTITARGET_CODE + if (isArchSupported(TargetArch::AVX2)) + return iotaImplAVX2(begin, 
count, first_value); + + if (isArchSupported(TargetArch::SSE42)) + return iotaImplSSE42(begin, count, first_value); +#endif + return iotaImpl(begin, count, first_value); +} + +template void iota(UInt8 * begin, size_t count, UInt8 first_value); +template void iota(UInt32 * begin, size_t count, UInt32 first_value); +template void iota(UInt64 * begin, size_t count, UInt64 first_value); +#if defined(OS_DARWIN) +template void iota(size_t * begin, size_t count, size_t first_value); +#endif +} diff --git a/src/Common/iota.h b/src/Common/iota.h new file mode 100644 index 00000000000..7910274d15d --- /dev/null +++ b/src/Common/iota.h @@ -0,0 +1,34 @@ +#pragma once + +#include +#include + +/// This is a replacement for std::iota to use dynamic dispatch +/// Note that it is only defined for containers with contiguous memory + +namespace DB +{ + +/// Make sure to add any new type to the extern declaration at the end of the file and instantiate it in iota.cpp + +template +concept iota_supported_types = (is_any_of< + T, + UInt8, + UInt32, + UInt64 +#if defined(OS_DARWIN) + , + size_t +#endif + >); + +template void iota(T * begin, size_t count, T first_value); + +extern template void iota(UInt8 * begin, size_t count, UInt8 first_value); +extern template void iota(UInt32 * begin, size_t count, UInt32 first_value); +extern template void iota(UInt64 * begin, size_t count, UInt64 first_value); +#if defined(OS_DARWIN) +extern template void iota(size_t * begin, size_t count, size_t first_value); +#endif +} diff --git a/src/Common/levenshteinDistance.cpp b/src/Common/levenshteinDistance.cpp index 9eb6c0f9050..3ab80af94bb 100644 --- a/src/Common/levenshteinDistance.cpp +++ b/src/Common/levenshteinDistance.cpp @@ -1,5 +1,6 @@ -#include #include +#include +#include namespace DB { @@ -11,8 +12,7 @@ size_t levenshteinDistance(const String & lhs, const String & rhs) PODArrayWithStackMemory row(n + 1); - for (size_t i = 1; i <= n; ++i) - row[i] = i; + iota(row.data() + 1, n, size_t(1)); for (size_t j = 1; j <= m; ++j) { diff --git a/src/Common/tests/gtest_hash_table.cpp b/src/Common/tests/gtest_hash_table.cpp index 72941126cfd..ae432de7766 100644 --- a/src/Common/tests/gtest_hash_table.cpp +++ b/src/Common/tests/gtest_hash_table.cpp @@ -6,6 +6,7 @@ #include #include #include +#include #include #include @@ -20,7 +21,7 @@ namespace std::vector getVectorWithNumbersUpToN(size_t n) { std::vector res(n); - std::iota(res.begin(), res.end(), 0); + iota(res.data(), res.size(), UInt64(0)); return res; } diff --git a/src/Core/ServerSettings.h b/src/Core/ServerSettings.h index 85e3d33f80b..310b3585eab 100644 --- a/src/Core/ServerSettings.h +++ b/src/Core/ServerSettings.h @@ -26,6 +26,8 @@ namespace DB M(UInt64, max_active_parts_loading_thread_pool_size, 64, "The number of threads to load active set of data parts (Active ones) at startup.", 0) \ M(UInt64, max_outdated_parts_loading_thread_pool_size, 32, "The number of threads to load inactive set of data parts (Outdated ones) at startup.", 0) \ M(UInt64, max_parts_cleaning_thread_pool_size, 128, "The number of threads for concurrent removal of inactive data parts.", 0) \ + M(UInt64, max_mutations_bandwidth_for_server, 0, "The maximum read speed of all mutations on the server in bytes per second. Zero means unlimited.", 0) \ + M(UInt64, max_merges_bandwidth_for_server, 0, "The maximum read speed of all merges on the server in bytes per second. 
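
Reviewer note: a usage sketch for the two SIMD-dispatching helpers this patch introduces, with types taken from the call sites above (the wrapper function is illustrative; returning `std::nullopt` for an empty range is my reading of the `findExtreme` implementation):

```cpp
#include <Common/findExtreme.h>
#include <Common/iota.h>
#include <base/types.h> /// UInt64
#include <optional>
#include <vector>

std::optional<UInt64> maxOfFirstN(size_t n)
{
    std::vector<UInt64> res(n);
    /// Replaces `std::iota(res.begin(), res.end(), 0)`: DB::iota takes a raw pointer
    /// (contiguous memory only) and picks an AVX2/SSE4.2 body at runtime.
    iota(res.data(), res.size(), UInt64(0));
    /// Scans the half-open range [start, end) of the buffer.
    return findExtremeMax(res.data(), 0, res.size());
}
```
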
Zero means unlimited.", 0) \ M(UInt64, max_replicated_fetches_network_bandwidth_for_server, 0, "The maximum speed of data exchange over the network in bytes per second for replicated fetches. Zero means unlimited.", 0) \ M(UInt64, max_replicated_sends_network_bandwidth_for_server, 0, "The maximum speed of data exchange over the network in bytes per second for replicated sends. Zero means unlimited.", 0) \ M(UInt64, max_remote_read_network_bandwidth_for_server, 0, "The maximum speed of data exchange over the network in bytes per second for read. Zero means unlimited.", 0) \ diff --git a/src/Core/Settings.h b/src/Core/Settings.h index 988c4f357e0..58b7cbab4c9 100644 --- a/src/Core/Settings.h +++ b/src/Core/Settings.h @@ -185,6 +185,7 @@ class IColumn; M(Float, parallel_replicas_single_task_marks_count_multiplier, 2, "A multiplier which will be added during calculation for minimal number of marks to retrieve from coordinator. This will be applied only for remote replicas.", 0) \ M(Bool, parallel_replicas_for_non_replicated_merge_tree, false, "If true, ClickHouse will use parallel replicas algorithm also for non-replicated MergeTree tables", 0) \ M(UInt64, parallel_replicas_min_number_of_rows_per_replica, 0, "Limit the number of replicas used in a query to (estimated rows to read / min_number_of_rows_per_replica). The max is still limited by 'max_parallel_replicas'", 0) \ + M(UInt64, parallel_replicas_mark_segment_size, 128, "Parts virtually divided into segments to be distributed between replicas for parallel reading. This setting controls the size of these segments. Not recommended to change until you're absolutely sure in what you're doing", 0) \ \ M(Bool, skip_unavailable_shards, false, "If true, ClickHouse silently skips unavailable shards. Shard is marked as unavailable when: 1) The shard cannot be reached due to a connection failure. 2) Shard is unresolvable through DNS. 3) Table does not exist on the shard.", 0) \ \ @@ -584,6 +585,7 @@ class IColumn; M(Bool, enable_early_constant_folding, true, "Enable query optimization where we analyze function and subqueries results and rewrite query if there're constants there", 0) \ M(Bool, deduplicate_blocks_in_dependent_materialized_views, false, "Should deduplicate blocks for materialized views if the block is not a duplicate for the table. Use true to always deduplicate in dependent tables.", 0) \ M(Bool, materialized_views_ignore_errors, false, "Allows to ignore errors for MATERIALIZED VIEW, and deliver original block to the table regardless of MVs", 0) \ + M(Bool, ignore_materialized_views_with_dropped_target_table, false, "Ignore MVs with dropped taraget table during pushing to views", 0) \ M(Bool, allow_experimental_refreshable_materialized_view, false, "Allow refreshable materialized views (CREATE MATERIALIZED VIEW REFRESH ...).", 0) \ M(Bool, stop_refreshable_materialized_views_on_startup, false, "On server startup, prevent scheduling of refreshable materialized views, as if with SYSTEM STOP VIEWS. You can manually start them with SYSTEM START VIEWS or SYSTEM START VIEW afterwards. Also applies to newly created views. 
Has no effect on non-refreshable materialized views.", 0) \ M(Bool, use_compact_format_in_distributed_parts_names, true, "Changes format of directories names for distributed table insert parts.", 0) \ @@ -707,7 +709,6 @@ class IColumn; M(Bool, query_plan_execute_functions_after_sorting, true, "Allow to re-order functions after sorting", 0) \ M(Bool, query_plan_reuse_storage_ordering_for_window_functions, true, "Allow to use the storage sorting for window functions", 0) \ M(Bool, query_plan_lift_up_union, true, "Allow to move UNIONs up so that more parts of the query plan can be optimized", 0) \ - M(Bool, query_plan_optimize_primary_key, true, "Analyze primary key using query plan (instead of AST)", 0) \ M(Bool, query_plan_read_in_order, true, "Use query plan for read-in-order optimization", 0) \ M(Bool, query_plan_aggregation_in_order, true, "Use query plan for aggregation-in-order optimization", 0) \ M(Bool, query_plan_remove_redundant_sorting, true, "Remove redundant sorting in query plan. For example, sorting steps related to ORDER BY clauses in subqueries", 0) \ @@ -843,7 +844,7 @@ class IColumn; M(Timezone, session_timezone, "", "This setting can be removed in the future due to potential caveats. It is experimental and is not suitable for production usage. The default timezone for current session or query. The server default timezone if empty.", 0) \ M(Bool, allow_create_index_without_type, false, "Allow CREATE INDEX query without TYPE. Query will be ignored. Made for SQL compatibility tests.", 0) \ M(Bool, create_index_ignore_unique, false, "Ignore UNIQUE keyword in CREATE UNIQUE INDEX. Made for SQL compatibility tests.", 0) \ - M(Bool, print_pretty_type_names, false, "Print pretty type names in DESCRIBE query and toTypeName() function", 0) \ + M(Bool, print_pretty_type_names, true, "Print pretty type names in DESCRIBE query and toTypeName() function", 0) \ M(Bool, create_table_empty_primary_key_by_default, false, "Allow to create *MergeTree tables with empty primary key when ORDER BY and PRIMARY KEY not specified", 0) \ M(Bool, allow_named_collection_override_by_default, true, "Allow named collections' fields override by default.", 0)\ M(Bool, allow_experimental_shared_merge_tree, false, "Only available in ClickHouse Cloud", 0) \ @@ -916,6 +917,7 @@ class IColumn; MAKE_OBSOLETE(M, Bool, optimize_move_functions_out_of_any, false) \ MAKE_OBSOLETE(M, Bool, allow_experimental_undrop_table_query, true) \ MAKE_OBSOLETE(M, Bool, allow_experimental_s3queue, true) \ + MAKE_OBSOLETE(M, Bool, query_plan_optimize_primary_key, true) \ /** The section above is for obsolete settings. Do not add anything there. */ @@ -981,6 +983,7 @@ class IColumn; M(SchemaInferenceMode, schema_inference_mode, "default", "Mode of schema inference. 
'default' - assume that all files have the same schema and schema can be inferred from any file, 'union' - files can have different schemas and the resulting schema should be a union of schemas of all files", 0) \ M(Bool, schema_inference_make_columns_nullable, true, "If set to true, all inferred types will be Nullable in schema inference for formats without information about nullability.", 0) \ M(Bool, input_format_json_read_bools_as_numbers, true, "Allow to parse bools as numbers in JSON input formats", 0) \ + M(Bool, input_format_json_read_bools_as_strings, true, "Allow to parse bools as strings in JSON input formats", 0) \ M(Bool, input_format_json_try_infer_numbers_from_strings, false, "Try to infer numbers from string fields while schema inference", 0) \ M(Bool, input_format_json_validate_types_from_metadata, true, "For JSON/JSONCompact/JSONColumnsWithMetadata input formats this controls whether format parser should check if data types from input metadata match data types of the corresponding columns from the table", 0) \ M(Bool, input_format_json_read_numbers_as_strings, true, "Allow to parse numbers as strings in JSON input formats", 0) \ diff --git a/src/Core/SettingsChangesHistory.h b/src/Core/SettingsChangesHistory.h index aad57ffebb7..fdee1fd5b13 100644 --- a/src/Core/SettingsChangesHistory.h +++ b/src/Core/SettingsChangesHistory.h @@ -81,6 +81,8 @@ namespace SettingsChangesHistory /// It's used to implement `compatibility` setting (see https://github.com/ClickHouse/ClickHouse/issues/35972) static std::map settings_changes_history = { + {"24.1", {{"print_pretty_type_names", false, true, "Better user experience."}, + {"input_format_json_read_bools_as_strings", false, true, "Allow to read bools as strings in JSON formats by default"}}}, {"23.12", {{"allow_suspicious_ttl_expressions", true, false, "It is a new setting, and in previous versions the behavior was equivalent to allowing."}, {"input_format_parquet_allow_missing_columns", false, true, "Allow missing columns in Parquet files by default"}, {"input_format_orc_allow_missing_columns", false, true, "Allow missing columns in ORC files by default"}, diff --git a/src/Core/SettingsEnums.cpp b/src/Core/SettingsEnums.cpp index ee113a6776f..0c84c1cc7d2 100644 --- a/src/Core/SettingsEnums.cpp +++ b/src/Core/SettingsEnums.cpp @@ -115,6 +115,8 @@ IMPLEMENT_SETTING_ENUM(DistributedDDLOutputMode, ErrorCodes::BAD_ARGUMENTS, {{"none", DistributedDDLOutputMode::NONE}, {"throw", DistributedDDLOutputMode::THROW}, {"null_status_on_timeout", DistributedDDLOutputMode::NULL_STATUS_ON_TIMEOUT}, + {"throw_only_active", DistributedDDLOutputMode::THROW_ONLY_ACTIVE}, + {"null_status_on_timeout_only_active", DistributedDDLOutputMode::NULL_STATUS_ON_TIMEOUT_ONLY_ACTIVE}, {"never_throw", DistributedDDLOutputMode::NEVER_THROW}}) IMPLEMENT_SETTING_ENUM(StreamingHandleErrorMode, ErrorCodes::BAD_ARGUMENTS, diff --git a/src/Core/SettingsEnums.h b/src/Core/SettingsEnums.h index 7977a0b3ab6..246cdf6f684 100644 --- a/src/Core/SettingsEnums.h +++ b/src/Core/SettingsEnums.h @@ -173,6 +173,8 @@ enum class DistributedDDLOutputMode THROW, NULL_STATUS_ON_TIMEOUT, NEVER_THROW, + THROW_ONLY_ACTIVE, + NULL_STATUS_ON_TIMEOUT_ONLY_ACTIVE, }; DECLARE_SETTING_ENUM(DistributedDDLOutputMode) diff --git a/src/DataTypes/DataTypeMap.cpp b/src/DataTypes/DataTypeMap.cpp index acd26ca338b..1f246af74d3 100644 --- a/src/DataTypes/DataTypeMap.cpp +++ b/src/DataTypes/DataTypeMap.cpp @@ -85,10 +85,7 @@ std::string DataTypeMap::doGetName() const std::string 
DataTypeMap::doGetPrettyName(size_t indent) const { WriteBufferFromOwnString s; - s << "Map(\n" - << fourSpaceIndent(indent + 1) << key_type->getPrettyName(indent + 1) << ",\n" - << fourSpaceIndent(indent + 1) << value_type->getPrettyName(indent + 1) << '\n' - << fourSpaceIndent(indent) << ')'; + s << "Map(" << key_type->getPrettyName(indent) << ", " << value_type->getPrettyName(indent) << ')'; return s.str(); } diff --git a/src/DataTypes/DataTypeTuple.cpp b/src/DataTypes/DataTypeTuple.cpp index fd2e5e6a784..db8a14c537a 100644 --- a/src/DataTypes/DataTypeTuple.cpp +++ b/src/DataTypes/DataTypeTuple.cpp @@ -98,21 +98,38 @@ std::string DataTypeTuple::doGetPrettyName(size_t indent) const { size_t size = elems.size(); WriteBufferFromOwnString s; - s << "Tuple(\n"; - for (size_t i = 0; i != size; ++i) + /// If the Tuple is named, we will output it on multiple lines with indentation. + if (have_explicit_names) { - if (i != 0) - s << ",\n"; + s << "Tuple(\n"; - s << fourSpaceIndent(indent + 1); - if (have_explicit_names) - s << backQuoteIfNeed(names[i]) << ' '; + for (size_t i = 0; i != size; ++i) + { + if (i != 0) + s << ",\n"; - s << elems[i]->getPrettyName(indent + 1); + s << fourSpaceIndent(indent + 1) + << backQuoteIfNeed(names[i]) << ' ' + << elems[i]->getPrettyName(indent + 1); + } + + s << ')'; + } + else + { + s << "Tuple("; + + for (size_t i = 0; i != size; ++i) + { + if (i != 0) + s << ", "; + s << elems[i]->getPrettyName(indent); + } + + s << ')'; } - s << '\n' << fourSpaceIndent(indent) << ')'; return s.str(); } diff --git a/src/DataTypes/Serializations/SerializationArray.cpp b/src/DataTypes/Serializations/SerializationArray.cpp index 1a21a45d7b8..0d99b741a23 100644 --- a/src/DataTypes/Serializations/SerializationArray.cpp +++ b/src/DataTypes/Serializations/SerializationArray.cpp @@ -390,7 +390,7 @@ void SerializationArray::deserializeBinaryBulkWithMultipleStreams( /// Check consistency between offsets and elements subcolumns. /// But if elements column is empty - it's ok for columns of Nested types that were added by ALTER. 
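
Reviewer note: the net effect of the two `doGetPrettyName` rewrites above, sketched against the public `getPrettyName` accessor on `IDataType` (the expected strings are my reading of the new code, not captured output):

```cpp
#include <DataTypes/DataTypeFactory.h>

void prettyNameExamples()
{
    const auto & factory = DataTypeFactory::instance();

    /// Map and unnamed Tuple now render on a single line:
    factory.get("Map(String, UInt64)")->getPrettyName();   /// "Map(String, UInt64)"
    factory.get("Tuple(String, UInt64)")->getPrettyName(); /// "Tuple(String, UInt64)"

    /// Only Tuples with explicit element names keep the indented multi-line form:
    factory.get("Tuple(k String, v UInt64)")->getPrettyName();
    /// "Tuple(
    ///     k String,
    ///     v UInt64)"
}
```
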
if (!nested_column->empty() && nested_column->size() != last_offset) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Cannot read all array values: read just {} of {}", + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Cannot read all array values: read just {} of {}", toString(nested_column->size()), toString(last_offset)); column = std::move(mutable_column); @@ -445,7 +445,7 @@ static void deserializeTextImpl(IColumn & column, ReadBuffer & istr, Reader && r if (*istr.position() == ',') ++istr.position(); else - throw ParsingException(ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT, + throw Exception(ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT, "Cannot read array from text, expected comma or end of array, found '{}'", *istr.position()); } diff --git a/src/DataTypes/Serializations/SerializationNullable.cpp b/src/DataTypes/Serializations/SerializationNullable.cpp index 15203bdc9fa..d9efc6fff10 100644 --- a/src/DataTypes/Serializations/SerializationNullable.cpp +++ b/src/DataTypes/Serializations/SerializationNullable.cpp @@ -359,7 +359,7 @@ ReturnType SerializationNullable::deserializeTextEscapedAndRawImpl(IColumn & col nested_column.popBack(1); if (null_representation.find('\t') != std::string::npos || null_representation.find('\n') != std::string::npos) - throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "TSV custom null representation " + throw DB::Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "TSV custom null representation " "containing '\\t' or '\\n' may not work correctly for large input."); WriteBufferFromOwnString parsed_value; @@ -367,7 +367,7 @@ ReturnType SerializationNullable::deserializeTextEscapedAndRawImpl(IColumn & col nested_serialization->serializeTextEscaped(nested_column, nested_column.size() - 1, parsed_value, settings); else nested_serialization->serializeTextRaw(nested_column, nested_column.size() - 1, parsed_value, settings); - throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable" + throw DB::Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable" " at position {}: got \"{}\", which was deserialized as \"{}\". " "It seems that input data is ill-formatted.", std::string(pos, buf.buffer().end()), @@ -452,7 +452,7 @@ ReturnType SerializationNullable::deserializeTextQuotedImpl(IColumn & column, Re /// It can happen only if there is an unquoted string instead of a number. /// We also should delete incorrectly deserialized value from nested column. 
nested_column.popBack(1); - throw DB::ParsingException( + throw DB::Exception( ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing Nullable: got an unquoted string {} instead of a number", String(buf.position(), std::min(10ul, buf.available()))); @@ -589,12 +589,12 @@ ReturnType SerializationNullable::deserializeTextCSVImpl(IColumn & column, ReadB if (null_representation.find(settings.csv.delimiter) != std::string::npos || null_representation.find('\r') != std::string::npos || null_representation.find('\n') != std::string::npos) - throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "CSV custom null representation containing " + throw DB::Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "CSV custom null representation containing " "format_csv_delimiter, '\\r' or '\\n' may not work correctly for large input."); WriteBufferFromOwnString parsed_value; nested_serialization->serializeTextCSV(nested_column, nested_column.size() - 1, parsed_value, settings); - throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable" + throw DB::Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable" " at position {}: got \"{}\", which was deserialized as \"{}\". " "It seems that input data is ill-formatted.", std::string(pos, buf.buffer().end()), diff --git a/src/DataTypes/Serializations/SerializationString.cpp b/src/DataTypes/Serializations/SerializationString.cpp index 788ff429088..b2b083fd466 100644 --- a/src/DataTypes/Serializations/SerializationString.cpp +++ b/src/DataTypes/Serializations/SerializationString.cpp @@ -335,6 +335,22 @@ void SerializationString::deserializeTextJSON(IColumn & column, ReadBuffer & ist { read(column, [&](ColumnString::Chars & data) { readJSONArrayInto(data, istr); }); } + else if (settings.json.read_bools_as_strings && !istr.eof() && (*istr.position() == 't' || *istr.position() == 'f')) + { + String str_value; + if (*istr.position() == 't') + { + assertString("true", istr); + str_value = "true"; + } + else if (*istr.position() == 'f') + { + assertString("false", istr); + str_value = "false"; + } + + read(column, [&](ColumnString::Chars & data) { data.insert(str_value.begin(), str_value.end()); }); + } else if (settings.json.read_numbers_as_strings && !istr.eof() && *istr.position() != '"') { String field; diff --git a/src/Databases/DatabaseFactory.cpp b/src/Databases/DatabaseFactory.cpp index 2c2e4030821..fc8073eac3b 100644 --- a/src/Databases/DatabaseFactory.cpp +++ b/src/Databases/DatabaseFactory.cpp @@ -92,9 +92,16 @@ void validate(const ASTCreateQuery & create_query) DatabasePtr DatabaseFactory::get(const ASTCreateQuery & create, const String & metadata_path, ContextPtr context) { + const auto engine_name = create.storage->engine->name; /// check if the database engine is a valid one before proceeding - if (!database_engines.contains(create.storage->engine->name)) - throw Exception(ErrorCodes::UNKNOWN_DATABASE_ENGINE, "Unknown database engine: {}", create.storage->engine->name); + if (!database_engines.contains(engine_name)) + { + auto hints = getHints(engine_name); + if (!hints.empty()) + throw Exception(ErrorCodes::UNKNOWN_DATABASE_ENGINE, "Unknown database engine {}. Maybe you meant: {}", engine_name, toString(hints)); + else + throw Exception(ErrorCodes::UNKNOWN_DATABASE_ENGINE, "Unknown database engine: {}", create.storage->engine->name); + } /// if the engine is found (i.e. 
registered with the factory instance), then validate if the /// supplied engine arguments, settings and table overrides are valid for the engine. diff --git a/src/Databases/DatabaseFactory.h b/src/Databases/DatabaseFactory.h index c86eaddb29d..6b92963f46e 100644 --- a/src/Databases/DatabaseFactory.h +++ b/src/Databases/DatabaseFactory.h @@ -1,5 +1,6 @@ #pragma once +#include #include #include #include @@ -24,7 +25,7 @@ static inline ValueType safeGetLiteralValue(const ASTPtr &ast, const String &eng return ast->as()->value.safeGet(); } -class DatabaseFactory : private boost::noncopyable +class DatabaseFactory : private boost::noncopyable, public IHints<> { public: @@ -52,6 +53,14 @@ public: const DatabaseEngines & getDatabaseEngines() const { return database_engines; } + std::vector getAllRegisteredNames() const override + { + std::vector result; + auto getter = [](const auto & pair) { return pair.first; }; + std::transform(database_engines.begin(), database_engines.end(), std::back_inserter(result), getter); + return result; + } + private: DatabaseEngines database_engines; diff --git a/src/Dictionaries/HashedDictionaryParallelLoader.h b/src/Dictionaries/HashedDictionaryParallelLoader.h index 907a987555e..ec892af7e36 100644 --- a/src/Dictionaries/HashedDictionaryParallelLoader.h +++ b/src/Dictionaries/HashedDictionaryParallelLoader.h @@ -2,6 +2,7 @@ #include #include +#include #include #include #include @@ -53,7 +54,7 @@ public: LOG_TRACE(dictionary.log, "Will load the dictionary using {} threads (with {} backlog)", shards, backlog); shards_slots.resize(shards); - std::iota(shards_slots.begin(), shards_slots.end(), 0); + iota(shards_slots.data(), shards_slots.size(), UInt64(0)); for (size_t shard = 0; shard < shards; ++shard) { diff --git a/src/Dictionaries/PolygonDictionary.cpp b/src/Dictionaries/PolygonDictionary.cpp index df3ae439b00..6f800bd921d 100644 --- a/src/Dictionaries/PolygonDictionary.cpp +++ b/src/Dictionaries/PolygonDictionary.cpp @@ -5,6 +5,7 @@ #include +#include #include #include #include @@ -507,7 +508,7 @@ const IColumn * unrollSimplePolygons(const ColumnPtr & column, Offset & offset) if (!ptr_polygons) throw Exception(ErrorCodes::TYPE_MISMATCH, "Expected a column containing arrays of points"); offset.ring_offsets.assign(ptr_polygons->getOffsets()); - std::iota(offset.polygon_offsets.begin(), offset.polygon_offsets.end(), 1); + iota(offset.polygon_offsets.data(), offset.polygon_offsets.size(), IColumn::Offsets::value_type(1)); offset.multi_polygon_offsets.assign(offset.polygon_offsets); return ptr_polygons->getDataPtr().get(); diff --git a/src/Dictionaries/PolygonDictionaryUtils.h b/src/Dictionaries/PolygonDictionaryUtils.h index 0238ef0b2b9..63d97e9dabd 100644 --- a/src/Dictionaries/PolygonDictionaryUtils.h +++ b/src/Dictionaries/PolygonDictionaryUtils.h @@ -1,6 +1,7 @@ #pragma once #include +#include #include #include @@ -184,7 +185,7 @@ public: { setBoundingBox(); std::vector order(polygons.size()); - std::iota(order.begin(), order.end(), 0); + iota(order.data(), order.size(), size_t(0)); root = makeCell(min_x, min_y, max_x, max_y, order); } diff --git a/src/Formats/EscapingRuleUtils.cpp b/src/Formats/EscapingRuleUtils.cpp index 9cc7cb3b89e..a7e9fb8e99f 100644 --- a/src/Formats/EscapingRuleUtils.cpp +++ b/src/Formats/EscapingRuleUtils.cpp @@ -450,10 +450,11 @@ String getAdditionalFormatInfoByEscapingRule(const FormatSettings & settings, Fo break; case FormatSettings::EscapingRule::JSON: result += fmt::format( - ", try_infer_numbers_from_strings={}, 
read_bools_as_numbers={}, read_objects_as_strings={}, read_numbers_as_strings={}, " + ", try_infer_numbers_from_strings={}, read_bools_as_numbers={}, read_bools_as_strings={}, read_objects_as_strings={}, read_numbers_as_strings={}, " "read_arrays_as_strings={}, try_infer_objects_as_tuples={}, infer_incomplete_types_as_strings={}, try_infer_objects={}", settings.json.try_infer_numbers_from_strings, settings.json.read_bools_as_numbers, + settings.json.read_bools_as_strings, settings.json.read_objects_as_strings, settings.json.read_numbers_as_strings, settings.json.read_arrays_as_strings, diff --git a/src/Formats/FormatFactory.cpp b/src/Formats/FormatFactory.cpp index 15743365d7d..0344ed54ae3 100644 --- a/src/Formats/FormatFactory.cpp +++ b/src/Formats/FormatFactory.cpp @@ -111,6 +111,7 @@ FormatSettings getFormatSettings(ContextPtr context, const Settings & settings) format_settings.json.quote_denormals = settings.output_format_json_quote_denormals; format_settings.json.quote_decimals = settings.output_format_json_quote_decimals; format_settings.json.read_bools_as_numbers = settings.input_format_json_read_bools_as_numbers; + format_settings.json.read_bools_as_strings = settings.input_format_json_read_bools_as_strings; format_settings.json.read_numbers_as_strings = settings.input_format_json_read_numbers_as_strings; format_settings.json.read_objects_as_strings = settings.input_format_json_read_objects_as_strings; format_settings.json.read_arrays_as_strings = settings.input_format_json_read_arrays_as_strings; diff --git a/src/Formats/FormatSettings.h b/src/Formats/FormatSettings.h index 8d5c044a311..5982d30f6a7 100644 --- a/src/Formats/FormatSettings.h +++ b/src/Formats/FormatSettings.h @@ -204,6 +204,7 @@ struct FormatSettings bool ignore_unknown_keys_in_named_tuple = false; bool serialize_as_strings = false; bool read_bools_as_numbers = true; + bool read_bools_as_strings = true; bool read_numbers_as_strings = true; bool read_objects_as_strings = true; bool read_arrays_as_strings = true; diff --git a/src/Formats/JSONUtils.cpp b/src/Formats/JSONUtils.cpp index b8b9a9ecb0d..779f38032d8 100644 --- a/src/Formats/JSONUtils.cpp +++ b/src/Formats/JSONUtils.cpp @@ -43,7 +43,7 @@ namespace JSONUtils { const auto current_object_size = memory.size() + static_cast(pos - in.position()); if (min_bytes != 0 && current_object_size > 10 * min_bytes) - throw ParsingException(ErrorCodes::INCORRECT_DATA, + throw Exception(ErrorCodes::INCORRECT_DATA, "Size of JSON object at position {} is extremely large. Expected not greater than {} bytes, but current is {} bytes per row. 
" "Increase the value setting 'min_chunk_bytes_for_parallel_parsing' or check your data manually, " "most likely JSON is malformed", in.count(), min_bytes, current_object_size); diff --git a/src/Formats/NativeReader.cpp b/src/Formats/NativeReader.cpp index 4c25460eb63..8286b24d0a6 100644 --- a/src/Formats/NativeReader.cpp +++ b/src/Formats/NativeReader.cpp @@ -120,7 +120,7 @@ Block NativeReader::read() if (istr.eof()) { if (use_index) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Input doesn't contain all data for index."); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Input doesn't contain all data for index."); return res; } diff --git a/src/Formats/SchemaInferenceUtils.cpp b/src/Formats/SchemaInferenceUtils.cpp index e2ba188d015..f065d2f0f4d 100644 --- a/src/Formats/SchemaInferenceUtils.cpp +++ b/src/Formats/SchemaInferenceUtils.cpp @@ -377,6 +377,22 @@ namespace type_indexes.erase(TypeIndex::UInt8); } + /// If we have Bool and String types convert all numbers to String. + /// It's applied only when setting input_format_json_read_bools_as_strings is enabled. + void transformJSONBoolsAndStringsToString(DataTypes & data_types, TypeIndexesSet & type_indexes) + { + if (!type_indexes.contains(TypeIndex::String) || !type_indexes.contains(TypeIndex::UInt8)) + return; + + for (auto & type : data_types) + { + if (isBool(type)) + type = std::make_shared(); + } + + type_indexes.erase(TypeIndex::UInt8); + } + /// If we have type Nothing/Nullable(Nothing) and some other non Nothing types, /// convert all Nothing/Nullable(Nothing) types to the first non Nothing. /// For example, when we have [Nothing, Array(Int64)] it will convert it to [Array(Int64), Array(Int64)] @@ -628,6 +644,10 @@ namespace if (settings.json.read_bools_as_numbers) transformBoolsAndNumbersToNumbers(data_types, type_indexes); + /// Convert Bool to String if needed. 
+ if (settings.json.read_bools_as_strings) + transformJSONBoolsAndStringsToString(data_types, type_indexes); + if (settings.json.try_infer_objects_as_tuples) mergeJSONPaths(data_types, type_indexes, settings, json_info); }; diff --git a/src/Functions/FunctionsStringDistance.cpp b/src/Functions/FunctionsStringDistance.cpp index 3098d02630a..a5e819179d6 100644 --- a/src/Functions/FunctionsStringDistance.cpp +++ b/src/Functions/FunctionsStringDistance.cpp @@ -6,6 +6,7 @@ #include #include #include +#include #ifdef __SSE4_2__ # include @@ -246,8 +247,7 @@ struct ByteEditDistanceImpl ResultType insertion = 0; ResultType deletion = 0; - for (size_t i = 0; i <= haystack_size; ++i) - distances0[i] = i; + iota(distances0.data(), haystack_size + 1, ResultType(0)); for (size_t pos_needle = 0; pos_needle < needle_size; ++pos_needle) { diff --git a/src/Functions/array/arrayRandomSample.cpp b/src/Functions/array/arrayRandomSample.cpp index 1e28e089a2a..40344efb077 100644 --- a/src/Functions/array/arrayRandomSample.cpp +++ b/src/Functions/array/arrayRandomSample.cpp @@ -1,5 +1,6 @@ #include #include +#include #include #include #include @@ -80,7 +81,7 @@ public: const size_t cur_samples = std::min(num_elements, samples); indices.resize(num_elements); - std::iota(indices.begin(), indices.end(), prev_array_offset); + iota(indices.data(), indices.size(), prev_array_offset); std::shuffle(indices.begin(), indices.end(), rng); for (UInt64 i = 0; i < cur_samples; i++) diff --git a/src/Functions/array/arrayShuffle.cpp b/src/Functions/array/arrayShuffle.cpp index faa5ae47b29..10cb51d27d2 100644 --- a/src/Functions/array/arrayShuffle.cpp +++ b/src/Functions/array/arrayShuffle.cpp @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -150,7 +151,7 @@ ColumnPtr FunctionArrayShuffleImpl::executeGeneric(const ColumnArray & a size_t size = offsets.size(); size_t nested_size = array.getData().size(); IColumn::Permutation permutation(nested_size); - std::iota(std::begin(permutation), std::end(permutation), 0); + iota(permutation.data(), permutation.size(), IColumn::Permutation::value_type(0)); ColumnArray::Offset current_offset = 0; for (size_t i = 0; i < size; ++i) diff --git a/src/Functions/array/arraySort.cpp b/src/Functions/array/arraySort.cpp index a853289e8cc..184b1f82280 100644 --- a/src/Functions/array/arraySort.cpp +++ b/src/Functions/array/arraySort.cpp @@ -1,5 +1,6 @@ -#include #include +#include +#include namespace DB { @@ -55,9 +56,7 @@ ColumnPtr ArraySortImpl::execute( size_t size = offsets.size(); size_t nested_size = array.getData().size(); IColumn::Permutation permutation(nested_size); - - for (size_t i = 0; i < nested_size; ++i) - permutation[i] = i; + iota(permutation.data(), nested_size, IColumn::Permutation::value_type(0)); ColumnArray::Offset current_offset = 0; for (size_t i = 0; i < size; ++i) diff --git a/src/Functions/rowNumberInBlock.cpp b/src/Functions/rowNumberInBlock.cpp index e5fe2aeb178..25c9e9c56f3 100644 --- a/src/Functions/rowNumberInBlock.cpp +++ b/src/Functions/rowNumberInBlock.cpp @@ -56,8 +56,7 @@ public: auto column = ColumnUInt64::create(); auto & data = column->getData(); data.resize(input_rows_count); - for (size_t i = 0; i < input_rows_count; ++i) - data[i] = i; + iota(data.data(), input_rows_count, UInt64(0)); return column; } diff --git a/src/Functions/FunctionSqid.cpp b/src/Functions/sqid.cpp similarity index 94% rename from src/Functions/FunctionSqid.cpp rename to src/Functions/sqid.cpp index 546263914c2..363a3f8ac13 100644 --- 
a/src/Functions/FunctionSqid.cpp +++ b/src/Functions/sqid.cpp @@ -1,6 +1,6 @@ #include "config.h" -#ifdef ENABLE_SQIDS +#if USE_SQIDS #include #include @@ -57,9 +57,10 @@ public: ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override { - size_t num_args = arguments.size(); auto col_res = ColumnString::create(); + col_res->reserve(input_rows_count); + const size_t num_args = arguments.size(); std::vector numbers(num_args); for (size_t i = 0; i < input_rows_count; ++i) { @@ -83,7 +84,7 @@ REGISTER_FUNCTION(Sqid) { factory.registerFunction(FunctionDocumentation{ .description=R"( -Transforms numbers into YouTube-like short URL hash called [Sqid](https://sqids.org/).)", +Transforms numbers into a [Sqid](https://sqids.org/) which is a YouTube-like ID string.)", .syntax="sqid(number1, ...)", .arguments={{"number1, ...", "Arbitrarily many UInt8, UInt16, UInt32 or UInt64 arguments"}}, .returned_value="A hash id [String](/docs/en/sql-reference/data-types/string.md).", diff --git a/src/Functions/translate.cpp b/src/Functions/translate.cpp index 836cb4de2f3..ad5be7d9dfd 100644 --- a/src/Functions/translate.cpp +++ b/src/Functions/translate.cpp @@ -3,6 +3,7 @@ #include #include #include +#include #include #include #include @@ -31,7 +32,7 @@ struct TranslateImpl if (map_from.size() != map_to.size()) throw Exception(ErrorCodes::BAD_ARGUMENTS, "Second and third arguments must be the same length"); - std::iota(map.begin(), map.end(), 0); + iota(map.data(), map.size(), UInt8(0)); for (size_t i = 0; i < map_from.size(); ++i) { @@ -129,7 +130,7 @@ struct TranslateUTF8Impl if (map_from_size != map_to_size) throw Exception(ErrorCodes::BAD_ARGUMENTS, "Second and third arguments must be the same length"); - std::iota(map_ascii.begin(), map_ascii.end(), 0); + iota(map_ascii.data(), map_ascii.size(), UInt32(0)); const UInt8 * map_from_ptr = reinterpret_cast(map_from.data()); const UInt8 * map_from_end = map_from_ptr + map_from.size(); diff --git a/src/IO/BrotliWriteBuffer.cpp b/src/IO/BrotliWriteBuffer.cpp index a497b78a6c2..a19c6770dad 100644 --- a/src/IO/BrotliWriteBuffer.cpp +++ b/src/IO/BrotliWriteBuffer.cpp @@ -13,14 +13,33 @@ namespace ErrorCodes } -BrotliWriteBuffer::BrotliStateWrapper::BrotliStateWrapper() -: state(BrotliEncoderCreateInstance(nullptr, nullptr, nullptr)) +class BrotliWriteBuffer::BrotliStateWrapper { -} +public: + BrotliStateWrapper() + : state(BrotliEncoderCreateInstance(nullptr, nullptr, nullptr)) + { + } -BrotliWriteBuffer::BrotliStateWrapper::~BrotliStateWrapper() + ~BrotliStateWrapper() + { + BrotliEncoderDestroyInstance(state); + } + + BrotliEncoderState * state; +}; + +BrotliWriteBuffer::BrotliWriteBuffer(std::unique_ptr out_, int compression_level, size_t buf_size, char * existing_memory, size_t alignment) + : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) + , brotli(std::make_unique()) + , in_available(0) + , in_data(nullptr) + , out_capacity(0) + , out_data(nullptr) { - BrotliEncoderDestroyInstance(state); + BrotliEncoderSetParameter(brotli->state, BROTLI_PARAM_QUALITY, static_cast(compression_level)); + // Set LZ77 window size. 
According to brotli sources, the default value is 24 (c/tools/brotli.c:81) + BrotliEncoderSetParameter(brotli->state, BROTLI_PARAM_LGWIN, 24); } BrotliWriteBuffer::~BrotliWriteBuffer() = default; @@ -39,20 +58,18 @@ void BrotliWriteBuffer::nextImpl() { do { - const auto * in_data_ptr = in_data; out->nextIfAtEnd(); out_data = reinterpret_cast(out->position()); out_capacity = out->buffer().end() - out->position(); int result = BrotliEncoderCompressStream( brotli->state, - BROTLI_OPERATION_PROCESS, + in_available ? BROTLI_OPERATION_PROCESS : BROTLI_OPERATION_FINISH, &in_available, &in_data, &out_capacity, &out_data, nullptr); - total_in += in_data - in_data_ptr; out->position() = out->buffer().end() - out_capacity; @@ -75,10 +92,6 @@ void BrotliWriteBuffer::finalizeBefore() { next(); - /// Don't write out if no data was ever compressed - if (!compress_empty && total_in == 0) - return; - while (true) { out->nextIfAtEnd(); diff --git a/src/IO/BrotliWriteBuffer.h b/src/IO/BrotliWriteBuffer.h index d4cda7b270c..8cbc78bd9e7 100644 --- a/src/IO/BrotliWriteBuffer.h +++ b/src/IO/BrotliWriteBuffer.h @@ -4,38 +4,18 @@ #include #include -#include "config.h" - -#if USE_BROTLI -# include - namespace DB { - class BrotliWriteBuffer : public WriteBufferWithOwnMemoryDecorator { public: - template BrotliWriteBuffer( - WriteBufferT && out_, + std::unique_ptr out_, int compression_level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty_ = true) - : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) - , brotli(std::make_unique()) - , in_available(0) - , in_data(nullptr) - , out_capacity(0) - , out_data(nullptr) - , compress_empty(compress_empty_) - { - BrotliEncoderSetParameter(brotli->state, BROTLI_PARAM_QUALITY, static_cast(compression_level)); - // Set LZ77 window size. 
According to brotli sources default value is 24 (c/tools/brotli.c:81) - BrotliEncoderSetParameter(brotli->state, BROTLI_PARAM_LGWIN, 24); - } + size_t alignment = 0); ~BrotliWriteBuffer() override; @@ -44,15 +24,7 @@ private: void finalizeBefore() override; - class BrotliStateWrapper - { - public: - BrotliStateWrapper(); - ~BrotliStateWrapper(); - - BrotliEncoderState * state; - }; - + class BrotliStateWrapper; std::unique_ptr brotli; @@ -61,12 +33,6 @@ private: size_t out_capacity; uint8_t * out_data; - -protected: - UInt64 total_in = 0; - bool compress_empty = true; }; } - -#endif diff --git a/src/IO/BufferBase.h b/src/IO/BufferBase.h index 4c0a467b155..7a59687fa56 100644 --- a/src/IO/BufferBase.h +++ b/src/IO/BufferBase.h @@ -2,7 +2,6 @@ #include #include -#include namespace DB diff --git a/src/IO/Bzip2WriteBuffer.cpp b/src/IO/Bzip2WriteBuffer.cpp index 3421b4c3985..b84cbdd1e41 100644 --- a/src/IO/Bzip2WriteBuffer.cpp +++ b/src/IO/Bzip2WriteBuffer.cpp @@ -15,22 +15,34 @@ namespace ErrorCodes } -Bzip2WriteBuffer::Bzip2StateWrapper::Bzip2StateWrapper(int compression_level) +class Bzip2WriteBuffer::Bzip2StateWrapper { - memset(&stream, 0, sizeof(stream)); +public: + explicit Bzip2StateWrapper(int compression_level) + { + memset(&stream, 0, sizeof(stream)); - int ret = BZ2_bzCompressInit(&stream, compression_level, 0, 0); + int ret = BZ2_bzCompressInit(&stream, compression_level, 0, 0); - if (ret != BZ_OK) - throw Exception( - ErrorCodes::BZIP2_STREAM_ENCODER_FAILED, - "bzip2 stream encoder init failed: error code: {}", - ret); -} + if (ret != BZ_OK) + throw Exception( + ErrorCodes::BZIP2_STREAM_ENCODER_FAILED, + "bzip2 stream encoder init failed: error code: {}", + ret); + } -Bzip2WriteBuffer::Bzip2StateWrapper::~Bzip2StateWrapper() + ~Bzip2StateWrapper() + { + BZ2_bzCompressEnd(&stream); + } + + bz_stream stream; +}; + +Bzip2WriteBuffer::Bzip2WriteBuffer(std::unique_ptr out_, int compression_level, size_t buf_size, char * existing_memory, size_t alignment) + : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) + , bz(std::make_unique(compression_level)) { - BZ2_bzCompressEnd(&stream); } Bzip2WriteBuffer::~Bzip2WriteBuffer() = default; @@ -65,8 +77,6 @@ void Bzip2WriteBuffer::nextImpl() } while (bz->stream.avail_in > 0); - - total_in += offset(); } catch (...) 
{ @@ -80,10 +90,6 @@ void Bzip2WriteBuffer::finalizeBefore() { next(); - /// Don't write out if no data was ever compressed - if (!compress_empty && total_in == 0) - return; - out->nextIfAtEnd(); bz->stream.next_out = out->position(); bz->stream.avail_out = static_cast(out->buffer().end() - out->position()); diff --git a/src/IO/Bzip2WriteBuffer.h b/src/IO/Bzip2WriteBuffer.h index 63c67461c6a..d0371903487 100644 --- a/src/IO/Bzip2WriteBuffer.h +++ b/src/IO/Bzip2WriteBuffer.h @@ -4,29 +4,18 @@ #include #include -#include "config.h" - -#if USE_BZIP2 -# include - namespace DB { class Bzip2WriteBuffer : public WriteBufferWithOwnMemoryDecorator { public: - template Bzip2WriteBuffer( - WriteBufferT && out_, + std::unique_ptr out_, int compression_level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty_ = true) - : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment), bz(std::make_unique(compression_level)) - , compress_empty(compress_empty_) - { - } + size_t alignment = 0); ~Bzip2WriteBuffer() override; @@ -35,20 +24,8 @@ private: void finalizeBefore() override; - class Bzip2StateWrapper - { - public: - explicit Bzip2StateWrapper(int compression_level); - ~Bzip2StateWrapper(); - - bz_stream stream; - }; - + class Bzip2StateWrapper; std::unique_ptr bz; - bool compress_empty = true; - UInt64 total_in = 0; }; } - -#endif diff --git a/src/IO/CompressionMethod.cpp b/src/IO/CompressionMethod.cpp index 90453e16961..13e1adbb702 100644 --- a/src/IO/CompressionMethod.cpp +++ b/src/IO/CompressionMethod.cpp @@ -169,66 +169,37 @@ std::unique_ptr wrapReadBufferWithCompressionMethod( return createCompressedWrapper(std::move(nested), method, buf_size, existing_memory, alignment, zstd_window_log_max); } - -template -std::unique_ptr createWriteCompressedWrapper( - WriteBufferT && nested, CompressionMethod method, int level, size_t buf_size, char * existing_memory, size_t alignment, bool compress_empty) +std::unique_ptr wrapWriteBufferWithCompressionMethod( + std::unique_ptr nested, CompressionMethod method, int level, size_t buf_size, char * existing_memory, size_t alignment) { if (method == DB::CompressionMethod::Gzip || method == CompressionMethod::Zlib) - return std::make_unique(std::forward(nested), method, level, buf_size, existing_memory, alignment, compress_empty); + return std::make_unique(std::move(nested), method, level, buf_size, existing_memory, alignment); #if USE_BROTLI if (method == DB::CompressionMethod::Brotli) - return std::make_unique(std::forward(nested), level, buf_size, existing_memory, alignment, compress_empty); + return std::make_unique(std::move(nested), level, buf_size, existing_memory, alignment); #endif if (method == CompressionMethod::Xz) - return std::make_unique(std::forward(nested), level, buf_size, existing_memory, alignment, compress_empty); + return std::make_unique(std::move(nested), level, buf_size, existing_memory, alignment); if (method == CompressionMethod::Zstd) - return std::make_unique(std::forward(nested), level, buf_size, existing_memory, alignment, compress_empty); + return std::make_unique(std::move(nested), level, buf_size, existing_memory, alignment); if (method == CompressionMethod::Lz4) - return std::make_unique(std::forward(nested), level, buf_size, existing_memory, alignment, compress_empty); + return std::make_unique(std::move(nested), level, buf_size, existing_memory, alignment); #if USE_BZIP2 if (method == CompressionMethod::Bzip2) - return 
std::make_unique(std::forward(nested), level, buf_size, existing_memory, alignment, compress_empty); + return std::make_unique(std::move(nested), level, buf_size, existing_memory, alignment); #endif #if USE_SNAPPY if (method == CompressionMethod::Snappy) throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Unsupported compression method"); #endif + if (method == CompressionMethod::None) + return nested; throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Unsupported compression method"); } - -std::unique_ptr wrapWriteBufferWithCompressionMethod( - std::unique_ptr nested, - CompressionMethod method, - int level, - size_t buf_size, - char * existing_memory, - size_t alignment, - bool compress_empty) -{ - if (method == CompressionMethod::None) - return nested; - return createWriteCompressedWrapper(nested, method, level, buf_size, existing_memory, alignment, compress_empty); -} - - -std::unique_ptr wrapWriteBufferWithCompressionMethod( - WriteBuffer * nested, - CompressionMethod method, - int level, - size_t buf_size, - char * existing_memory, - size_t alignment, - bool compress_empty) -{ - assert(method != CompressionMethod::None); - return createWriteCompressedWrapper(nested, method, level, buf_size, existing_memory, alignment, compress_empty); -} - } diff --git a/src/IO/CompressionMethod.h b/src/IO/CompressionMethod.h index d218e4c5882..c142531cd05 100644 --- a/src/IO/CompressionMethod.h +++ b/src/IO/CompressionMethod.h @@ -61,22 +61,13 @@ std::unique_ptr wrapReadBufferWithCompressionMethod( char * existing_memory = nullptr, size_t alignment = 0); + std::unique_ptr wrapWriteBufferWithCompressionMethod( std::unique_ptr nested, CompressionMethod method, int level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty = true); - -std::unique_ptr wrapWriteBufferWithCompressionMethod( - WriteBuffer * nested, - CompressionMethod method, - int level, - size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, - char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty = true); + size_t alignment = 0); } diff --git a/src/IO/LZMADeflatingWriteBuffer.cpp b/src/IO/LZMADeflatingWriteBuffer.cpp index db8f8c95fe6..a77b2bb7b39 100644 --- a/src/IO/LZMADeflatingWriteBuffer.cpp +++ b/src/IO/LZMADeflatingWriteBuffer.cpp @@ -7,7 +7,9 @@ namespace ErrorCodes extern const int LZMA_STREAM_ENCODER_FAILED; } -void LZMADeflatingWriteBuffer::initialize(int compression_level) +LZMADeflatingWriteBuffer::LZMADeflatingWriteBuffer( + std::unique_ptr out_, int compression_level, size_t buf_size, char * existing_memory, size_t alignment) + : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) { lstr = LZMA_STREAM_INIT; @@ -92,10 +94,6 @@ void LZMADeflatingWriteBuffer::finalizeBefore() { next(); - /// Don't write out if no data was ever compressed - if (!compress_empty && lstr.total_out == 0) - return; - do { out->nextIfAtEnd(); diff --git a/src/IO/LZMADeflatingWriteBuffer.h b/src/IO/LZMADeflatingWriteBuffer.h index 797b85cd400..2e135455e00 100644 --- a/src/IO/LZMADeflatingWriteBuffer.h +++ b/src/IO/LZMADeflatingWriteBuffer.h @@ -14,32 +14,22 @@ namespace DB class LZMADeflatingWriteBuffer : public WriteBufferWithOwnMemoryDecorator { public: - template LZMADeflatingWriteBuffer( - WriteBufferT && out_, + std::unique_ptr out_, int compression_level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty_ = true) - : 
WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment), compress_empty(compress_empty_) - { - initialize(compression_level); - } + size_t alignment = 0); ~LZMADeflatingWriteBuffer() override; private: - void initialize(int compression_level); - void nextImpl() override; void finalizeBefore() override; void finalizeAfter() override; lzma_stream lstr; - - bool compress_empty = true; }; } diff --git a/src/IO/Lz4DeflatingWriteBuffer.cpp b/src/IO/Lz4DeflatingWriteBuffer.cpp index a8cac823b50..8241bfd4f3c 100644 --- a/src/IO/Lz4DeflatingWriteBuffer.cpp +++ b/src/IO/Lz4DeflatingWriteBuffer.cpp @@ -63,8 +63,11 @@ namespace ErrorCodes extern const int LZ4_ENCODER_FAILED; } +Lz4DeflatingWriteBuffer::Lz4DeflatingWriteBuffer( + std::unique_ptr out_, int compression_level, size_t buf_size, char * existing_memory, size_t alignment) + : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) + , tmp_memory(buf_size) -void Lz4DeflatingWriteBuffer::initialize(int compression_level) { kPrefs = { {LZ4F_max256KB, @@ -102,7 +105,7 @@ void Lz4DeflatingWriteBuffer::nextImpl() if (first_time) { - auto sink = SinkToOut(out, tmp_memory, LZ4F_HEADER_SIZE_MAX); + auto sink = SinkToOut(out.get(), tmp_memory, LZ4F_HEADER_SIZE_MAX); chassert(sink.getCapacity() >= LZ4F_HEADER_SIZE_MAX); /// write frame header and check for errors @@ -128,7 +131,7 @@ void Lz4DeflatingWriteBuffer::nextImpl() /// Ensure that there is enough space for a compressed block of minimal size size_t min_compressed_block_size = LZ4F_compressBound(1, &kPrefs); - auto sink = SinkToOut(out, tmp_memory, min_compressed_block_size); + auto sink = SinkToOut(out.get(), tmp_memory, min_compressed_block_size); chassert(sink.getCapacity() >= min_compressed_block_size); /// LZ4F_compressUpdate compresses the whole input buffer at once, so we need to shrink it manually @@ -160,12 +163,8 @@ void Lz4DeflatingWriteBuffer::finalizeBefore() { next(); - /// Don't write out if no data was ever compressed - if (!compress_empty && first_time) - return; - auto suffix_size = LZ4F_compressBound(0, &kPrefs); - auto sink = SinkToOut(out, tmp_memory, suffix_size); + auto sink = SinkToOut(out.get(), tmp_memory, suffix_size); chassert(sink.getCapacity() >= suffix_size); /// compression end diff --git a/src/IO/Lz4DeflatingWriteBuffer.h b/src/IO/Lz4DeflatingWriteBuffer.h index b37d61fa732..7bb8a5e6c0e 100644 --- a/src/IO/Lz4DeflatingWriteBuffer.h +++ b/src/IO/Lz4DeflatingWriteBuffer.h @@ -14,26 +14,16 @@ namespace DB class Lz4DeflatingWriteBuffer : public WriteBufferWithOwnMemoryDecorator { public: - template Lz4DeflatingWriteBuffer( - WriteBufferT && out_, + std::unique_ptr out_, int compression_level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty_ = true) - : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) - , tmp_memory(buf_size) - , compress_empty(compress_empty_) - { - initialize(compression_level); - } + size_t alignment = 0); ~Lz4DeflatingWriteBuffer() override; private: - void initialize(int compression_level); - void nextImpl() override; void finalizeBefore() override; @@ -45,6 +35,5 @@ private: Memory<> tmp_memory; bool first_time = true; - bool compress_empty = true; }; } diff --git a/src/IO/ReadBufferFromPocoSocket.cpp b/src/IO/ReadBufferFromPocoSocket.cpp index d399721d060..ff72dc5386c 100644 --- a/src/IO/ReadBufferFromPocoSocket.cpp +++ b/src/IO/ReadBufferFromPocoSocket.cpp @@ -99,9 +99,6 @@ 
bool ReadBufferFromPocoSocket::nextImpl() if (bytes_read < 0) throw NetException(ErrorCodes::CANNOT_READ_FROM_SOCKET, "Cannot read from socket ({})", peer_address.toString()); - if (read_event != ProfileEvents::end()) - ProfileEvents::increment(read_event, bytes_read); - if (bytes_read) working_buffer.resize(bytes_read); else @@ -114,17 +111,10 @@ ReadBufferFromPocoSocket::ReadBufferFromPocoSocket(Poco::Net::Socket & socket_, : BufferWithOwnMemory(buf_size) , socket(socket_) , peer_address(socket.peerAddress()) - , read_event(ProfileEvents::end()) , socket_description("socket (" + peer_address.toString() + ")") { } -ReadBufferFromPocoSocket::ReadBufferFromPocoSocket(Poco::Net::Socket & socket_, const ProfileEvents::Event & read_event_, size_t buf_size) - : ReadBufferFromPocoSocket(socket_, buf_size) -{ - read_event = read_event_; -} - bool ReadBufferFromPocoSocket::poll(size_t timeout_microseconds) const { if (available()) diff --git a/src/IO/ReadBufferFromPocoSocket.h b/src/IO/ReadBufferFromPocoSocket.h index 76156612764..dab4ac86295 100644 --- a/src/IO/ReadBufferFromPocoSocket.h +++ b/src/IO/ReadBufferFromPocoSocket.h @@ -20,13 +20,10 @@ protected: */ Poco::Net::SocketAddress peer_address; - ProfileEvents::Event read_event; - bool nextImpl() override; public: explicit ReadBufferFromPocoSocket(Poco::Net::Socket & socket_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE); - explicit ReadBufferFromPocoSocket(Poco::Net::Socket & socket_, const ProfileEvents::Event & read_event_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE); bool poll(size_t timeout_microseconds) const; diff --git a/src/IO/ReadHelpers.cpp b/src/IO/ReadHelpers.cpp index ff5743a63af..05d35a57b12 100644 --- a/src/IO/ReadHelpers.cpp +++ b/src/IO/ReadHelpers.cpp @@ -89,7 +89,7 @@ void NO_INLINE throwAtAssertionFailed(const char * s, ReadBuffer & buf) else out << " before: " << quote << String(buf.position(), std::min(SHOW_CHARS_ON_SYNTAX_ERROR, buf.buffer().end() - buf.position())); - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "Cannot parse input: expected {}", out.str()); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "Cannot parse input: expected {}", out.str()); } @@ -562,7 +562,7 @@ static ReturnType readAnyQuotedStringInto(Vector & s, ReadBuffer & buf) if (buf.eof() || *buf.position() != quote) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_QUOTED_STRING, + throw Exception(ErrorCodes::CANNOT_PARSE_QUOTED_STRING, "Cannot parse quoted string: expected opening quote '{}', got '{}'", std::string{quote}, buf.eof() ? 
"EOF" : std::string{*buf.position()}); else @@ -608,7 +608,7 @@ static ReturnType readAnyQuotedStringInto(Vector & s, ReadBuffer & buf) } if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_QUOTED_STRING, "Cannot parse quoted string: expected closing quote"); + throw Exception(ErrorCodes::CANNOT_PARSE_QUOTED_STRING, "Cannot parse quoted string: expected closing quote"); else return ReturnType(false); } @@ -958,7 +958,7 @@ ReturnType readJSONStringInto(Vector & s, ReadBuffer & buf) auto error = [](FormatStringHelper<> message [[maybe_unused]], int code [[maybe_unused]]) { if constexpr (throw_exception) - throw ParsingException(code, std::move(message)); + throw Exception(code, std::move(message)); return ReturnType(false); }; @@ -1009,7 +1009,7 @@ ReturnType readJSONObjectOrArrayPossiblyInvalid(Vector & s, ReadBuffer & buf) auto error = [](FormatStringHelper<> message [[maybe_unused]], int code [[maybe_unused]]) { if constexpr (throw_exception) - throw ParsingException(code, std::move(message)); + throw Exception(code, std::move(message)); return ReturnType(false); }; @@ -1185,7 +1185,7 @@ ReturnType readDateTimeTextFallback(time_t & datetime, ReadBuffer & buf, const D else { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime"); + throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime"); else return false; } @@ -1212,7 +1212,7 @@ ReturnType readDateTimeTextFallback(time_t & datetime, ReadBuffer & buf, const D s_pos[size] = 0; if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime {}", s); + throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime {}", s); else return false; } @@ -1235,7 +1235,7 @@ ReturnType readDateTimeTextFallback(time_t & datetime, ReadBuffer & buf, const D s_pos[size] = 0; if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse time component of DateTime {}", s); + throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse time component of DateTime {}", s); else return false; } @@ -1266,7 +1266,7 @@ ReturnType readDateTimeTextFallback(time_t & datetime, ReadBuffer & buf, const D if (too_short && negative_multiplier != -1) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime"); + throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime"); else return false; } @@ -1382,8 +1382,12 @@ void skipJSONField(ReadBuffer & buf, StringRef name_of_field) } else { - throw Exception(ErrorCodes::INCORRECT_DATA, "Unexpected symbol '{}' for key '{}'", - std::string(*buf.position(), 1), name_of_field.toString()); + throw Exception( + ErrorCodes::INCORRECT_DATA, + "Cannot read JSON field here: '{}'. Unexpected symbol '{}'{}", + String(buf.position(), std::min(buf.available(), size_t(10))), + std::string(1, *buf.position()), + name_of_field.empty() ? 
"" : " for key " + name_of_field.toString()); } } @@ -1753,7 +1757,7 @@ void readQuotedField(String & s, ReadBuffer & buf) void readJSONField(String & s, ReadBuffer & buf) { s.clear(); - auto parse_func = [](ReadBuffer & in) { skipJSONField(in, "json_field"); }; + auto parse_func = [](ReadBuffer & in) { skipJSONField(in, ""); }; readParsedValueInto(s, buf, parse_func); } diff --git a/src/IO/ReadHelpers.h b/src/IO/ReadHelpers.h index bba0b694d23..85584d63ee8 100644 --- a/src/IO/ReadHelpers.h +++ b/src/IO/ReadHelpers.h @@ -296,7 +296,7 @@ inline void readBoolTextWord(bool & x, ReadBuffer & buf, bool support_upper_case [[fallthrough]]; } default: - throw ParsingException(ErrorCodes::CANNOT_PARSE_BOOL, "Unexpected Bool value"); + throw Exception(ErrorCodes::CANNOT_PARSE_BOOL, "Unexpected Bool value"); } } @@ -340,7 +340,7 @@ ReturnType readIntTextImpl(T & x, ReadBuffer & buf) if (has_sign) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot parse number with multiple sign (+/-) characters"); else return ReturnType(false); @@ -357,7 +357,7 @@ ReturnType readIntTextImpl(T & x, ReadBuffer & buf) if (has_sign) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot parse number with multiple sign (+/-) characters"); else return ReturnType(false); @@ -368,7 +368,7 @@ ReturnType readIntTextImpl(T & x, ReadBuffer & buf) else { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Unsigned type must not contain '-' symbol"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Unsigned type must not contain '-' symbol"); else return ReturnType(false); } @@ -430,7 +430,7 @@ end: if (has_sign && !has_number) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot parse number with a sign character but without any numeric character"); else return ReturnType(false); @@ -837,7 +837,7 @@ inline ReturnType readUUIDTextImpl(UUID & uuid, ReadBuffer & buf) if constexpr (throw_exception) { - throw ParsingException(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse uuid {}", s); + throw Exception(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse uuid {}", s); } else { @@ -855,7 +855,7 @@ inline ReturnType readUUIDTextImpl(UUID & uuid, ReadBuffer & buf) if constexpr (throw_exception) { - throw ParsingException(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse uuid {}", s); + throw Exception(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse uuid {}", s); } else { @@ -881,7 +881,7 @@ inline ReturnType readIPv4TextImpl(IPv4 & ip, ReadBuffer & buf) return ReturnType(true); if constexpr (std::is_same_v) - throw ParsingException(ErrorCodes::CANNOT_PARSE_IPV4, "Cannot parse IPv4 {}", std::string_view(buf.position(), buf.available())); + throw Exception(ErrorCodes::CANNOT_PARSE_IPV4, "Cannot parse IPv4 {}", std::string_view(buf.position(), buf.available())); else return ReturnType(false); } @@ -903,7 +903,7 @@ inline ReturnType readIPv6TextImpl(IPv6 & ip, ReadBuffer & buf) return ReturnType(true); if constexpr (std::is_same_v) - throw ParsingException(ErrorCodes::CANNOT_PARSE_IPV6, "Cannot parse IPv6 {}", std::string_view(buf.position(), buf.available())); + throw Exception(ErrorCodes::CANNOT_PARSE_IPV6, "Cannot parse IPv6 {}", std::string_view(buf.position(), buf.available())); else return ReturnType(false); } @@ -944,7 +944,7 
@@ inline ReturnType readDateTimeTextImpl(time_t & datetime, ReadBuffer & buf, cons
     if (!buf.eof() && !isNumericASCII(*buf.position()))
     {
         if constexpr (throw_exception)
-            throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse datetime");
+            throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse datetime");
         else
             return false;
     }
@@ -1017,7 +1017,7 @@ inline ReturnType readDateTimeTextImpl(DateTime64 & datetime64, UInt32 scale, Re
     {
         readDateTimeTextImpl(whole, buf, date_lut);
     }
-    catch (const DB::ParsingException &)
+    catch (const DB::Exception &)
     {
         if (buf.eof() || *buf.position() != '.')
             throw;
@@ -1125,7 +1125,7 @@ inline void readDateTimeText(LocalDateTime & datetime, ReadBuffer & buf)
     if (10 != size)
     {
         s[size] = 0;
-        throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime {}", s);
+        throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime {}", s);
     }

     datetime.year((s[0] - '0') * 1000 + (s[1] - '0') * 100 + (s[2] - '0') * 10 + (s[3] - '0'));
@@ -1141,7 +1141,7 @@ inline void readDateTimeText(LocalDateTime & datetime, ReadBuffer & buf)
     if (8 != size)
     {
         s[size] = 0;
-        throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse time component of DateTime {}", s);
+        throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse time component of DateTime {}", s);
     }

     datetime.hour((s[0] - '0') * 10 + (s[1] - '0'));
@@ -1174,7 +1174,7 @@ inline ReturnType readTimeTextImpl(time_t & time, ReadBuffer & buf)
         s[size] = 0;

         if constexpr (throw_exception)
-            throw ParsingException(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime {}", s);
+            throw Exception(ErrorCodes::CANNOT_PARSE_DATETIME, "Cannot parse DateTime {}", s);
         else
             return false;
     }
@@ -1482,7 +1482,7 @@ void readQuoted(std::vector<T> & x, ReadBuffer & buf)
             if (*buf.position() == ',')
                 ++buf.position();
             else
-                throw ParsingException(ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT, "Cannot read array from text");
+                throw Exception(ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT, "Cannot read array from text");
         }

         first = false;
@@ -1505,7 +1505,7 @@ void readDoubleQuoted(std::vector<T> & x, ReadBuffer & buf)
             if (*buf.position() == ',')
                 ++buf.position();
             else
-                throw ParsingException(ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT, "Cannot read array from text");
+                throw Exception(ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT, "Cannot read array from text");
         }

         first = false;
diff --git a/src/IO/WriteBufferDecorator.h b/src/IO/WriteBufferDecorator.h
index ee47834b7af..7c984eeea8d 100644
--- a/src/IO/WriteBufferDecorator.h
+++ b/src/IO/WriteBufferDecorator.h
@@ -12,21 +12,13 @@ class WriteBuffer;

 /// WriteBuffer that decorates data and delegates it to underlying buffer.
 /// It's used for writing compressed and encrypted data
-/// This class can own or not own underlying buffer - constructor will differentiate
-/// std::unique_ptr<WriteBuffer> for owning and WriteBuffer* for not owning.
 template <class Base>
 class WriteBufferDecorator : public Base
 {
 public:
     template <class ... BaseArgs>
     explicit WriteBufferDecorator(std::unique_ptr<WriteBuffer> out_, BaseArgs && ... args)
-        : Base(std::forward<BaseArgs>(args)...), owning_holder(std::move(out_)), out(owning_holder.get())
-    {
-    }
-
-    template <class ... BaseArgs>
-    explicit WriteBufferDecorator(WriteBuffer * out_, BaseArgs && ...
args) - : Base(std::forward(args)...), out(out_) + : Base(std::forward(args)...), out(std::move(out_)) { } @@ -46,7 +38,7 @@ public: } } - WriteBuffer * getNestedBuffer() { return out; } + WriteBuffer * getNestedBuffer() { return out.get(); } protected: /// Do some finalization before finalization of underlying buffer. @@ -55,8 +47,7 @@ protected: /// Do some finalization after finalization of underlying buffer. virtual void finalizeAfter() {} - std::unique_ptr owning_holder; - WriteBuffer * out; + std::unique_ptr out; }; using WriteBufferWithOwnMemoryDecorator = WriteBufferDecorator>; diff --git a/src/IO/WriteBufferFromEncryptedFile.h b/src/IO/WriteBufferFromEncryptedFile.h index f8f864d00a6..25dd54ca9d5 100644 --- a/src/IO/WriteBufferFromEncryptedFile.h +++ b/src/IO/WriteBufferFromEncryptedFile.h @@ -28,7 +28,7 @@ public: void sync() override; - std::string getFileName() const override { return assert_cast(out)->getFileName(); } + std::string getFileName() const override { return assert_cast(out.get())->getFileName(); } private: void nextImpl() override; diff --git a/src/IO/WriteBufferFromPocoSocket.cpp b/src/IO/WriteBufferFromPocoSocket.cpp index 10d9fd131cd..171e7f1ce69 100644 --- a/src/IO/WriteBufferFromPocoSocket.cpp +++ b/src/IO/WriteBufferFromPocoSocket.cpp @@ -34,97 +34,6 @@ namespace ErrorCodes extern const int LOGICAL_ERROR; } -ssize_t WriteBufferFromPocoSocket::socketSendBytesImpl(const char * ptr, size_t size) -{ - ssize_t res = 0; - - /// If async_callback is specified, set socket to non-blocking mode - /// and try to write data to it, if socket is not ready for writing, - /// run async_callback and try again later. - /// It is expected that file descriptor may be polled externally. - /// Note that send timeout is not checked here. External code should check it while polling. - if (async_callback) - { - socket.setBlocking(false); - /// Set socket to blocking mode at the end. - SCOPE_EXIT(socket.setBlocking(true)); - bool secure = socket.secure(); - res = socket.impl()->sendBytes(ptr, static_cast(size)); - - /// Check EAGAIN and ERR_SSL_WANT_WRITE/ERR_SSL_WANT_READ for secure socket (writing to secure socket can read too). - while (res < 0 && (errno == EAGAIN || (secure && (checkSSLWantRead(res) || checkSSLWantWrite(res))))) - { - /// In case of ERR_SSL_WANT_READ we should wait for socket to be ready for reading, otherwise - for writing. - if (secure && checkSSLWantRead(res)) - async_callback(socket.impl()->sockfd(), socket.getReceiveTimeout(), AsyncEventTimeoutType::RECEIVE, socket_description, AsyncTaskExecutor::Event::READ | AsyncTaskExecutor::Event::ERROR); - else - async_callback(socket.impl()->sockfd(), socket.getSendTimeout(), AsyncEventTimeoutType::SEND, socket_description, AsyncTaskExecutor::Event::WRITE | AsyncTaskExecutor::Event::ERROR); - - /// Try to write again. - res = socket.impl()->sendBytes(ptr, static_cast(size)); - } - } - else - { - res = socket.impl()->sendBytes(ptr, static_cast(size)); - } - - return res; -} - -void WriteBufferFromPocoSocket::socketSendBytes(const char * ptr, size_t size) -{ - if (!size) - return; - - Stopwatch watch; - size_t bytes_written = 0; - - SCOPE_EXIT({ - ProfileEvents::increment(ProfileEvents::NetworkSendElapsedMicroseconds, watch.elapsedMicroseconds()); - ProfileEvents::increment(ProfileEvents::NetworkSendBytes, bytes_written); - if (write_event != ProfileEvents::end()) - ProfileEvents::increment(write_event, bytes_written); - }); - - while (bytes_written < size) - { - ssize_t res = 0; - - /// Add more details to exceptions. 
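Both the removed socketSendBytes() and the send loop that survives in nextImpl() lean on the same scope-guard idiom: SCOPE_EXIT defers its body to scope exit, so the byte counters are recorded and the socket is put back into blocking mode on the success path and the exception path alike. Distilled (socket and callback setup omitted):

    socket.setBlocking(false);
    /// Runs when the enclosing scope exits, even if sendBytes() throws.
    SCOPE_EXIT(socket.setBlocking(true));
    /// ... non-blocking send loop, retrying on EAGAIN via async_callback ...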
- try - { - CurrentMetrics::Increment metric_increment(CurrentMetrics::NetworkSend); - if (size > INT_MAX) - throw Exception(ErrorCodes::LOGICAL_ERROR, "Buffer overflow"); - - res = socketSendBytesImpl(ptr + bytes_written, size - bytes_written); - } - catch (const Poco::Net::NetException & e) - { - throw NetException(ErrorCodes::NETWORK_ERROR, "{}, while writing to socket ({} -> {})", e.displayText(), - our_address.toString(), peer_address.toString()); - } - catch (const Poco::TimeoutException &) - { - throw NetException(ErrorCodes::SOCKET_TIMEOUT, "Timeout exceeded while writing to socket ({}, {} ms)", - peer_address.toString(), - socket.impl()->getSendTimeout().totalMilliseconds()); - } - catch (const Poco::IOException & e) - { - throw NetException(ErrorCodes::NETWORK_ERROR, "{}, while writing to socket ({} -> {})", e.displayText(), - our_address.toString(), peer_address.toString()); - } - - if (res < 0) - throw NetException(ErrorCodes::CANNOT_WRITE_TO_SOCKET, "Cannot write to socket ({} -> {})", - our_address.toString(), peer_address.toString()); - - bytes_written += res; - } -} - void WriteBufferFromPocoSocket::nextImpl() { if (!offset()) @@ -151,7 +60,36 @@ void WriteBufferFromPocoSocket::nextImpl() if (size > INT_MAX) throw Exception(ErrorCodes::LOGICAL_ERROR, "Buffer overflow"); - res = socketSendBytesImpl(pos, size); + /// If async_callback is specified, set socket to non-blocking mode + /// and try to write data to it, if socket is not ready for writing, + /// run async_callback and try again later. + /// It is expected that file descriptor may be polled externally. + /// Note that send timeout is not checked here. External code should check it while polling. + if (async_callback) + { + socket.setBlocking(false); + /// Set socket to blocking mode at the end. + SCOPE_EXIT(socket.setBlocking(true)); + bool secure = socket.secure(); + res = socket.impl()->sendBytes(pos, static_cast(size)); + + /// Check EAGAIN and ERR_SSL_WANT_WRITE/ERR_SSL_WANT_READ for secure socket (writing to secure socket can read too). + while (res < 0 && (errno == EAGAIN || (secure && (checkSSLWantRead(res) || checkSSLWantWrite(res))))) + { + /// In case of ERR_SSL_WANT_READ we should wait for socket to be ready for reading, otherwise - for writing. + if (secure && checkSSLWantRead(res)) + async_callback(socket.impl()->sockfd(), socket.getReceiveTimeout(), AsyncEventTimeoutType::RECEIVE, socket_description, AsyncTaskExecutor::Event::READ | AsyncTaskExecutor::Event::ERROR); + else + async_callback(socket.impl()->sockfd(), socket.getSendTimeout(), AsyncEventTimeoutType::SEND, socket_description, AsyncTaskExecutor::Event::WRITE | AsyncTaskExecutor::Event::ERROR); + + /// Try to write again. 
+ res = socket.impl()->sendBytes(pos, static_cast(size)); + } + } + else + { + res = socket.impl()->sendBytes(pos, static_cast(size)); + } } catch (const Poco::Net::NetException & e) { @@ -187,12 +125,6 @@ WriteBufferFromPocoSocket::WriteBufferFromPocoSocket(Poco::Net::Socket & socket_ { } -WriteBufferFromPocoSocket::WriteBufferFromPocoSocket(Poco::Net::Socket & socket_, const ProfileEvents::Event & write_event_, size_t buf_size) - : WriteBufferFromPocoSocket(socket_, buf_size) -{ - write_event = write_event_; -} - WriteBufferFromPocoSocket::~WriteBufferFromPocoSocket() { try diff --git a/src/IO/WriteBufferFromPocoSocket.h b/src/IO/WriteBufferFromPocoSocket.h index 9c5509aebd1..ecb61020357 100644 --- a/src/IO/WriteBufferFromPocoSocket.h +++ b/src/IO/WriteBufferFromPocoSocket.h @@ -17,33 +17,14 @@ class WriteBufferFromPocoSocket : public BufferWithOwnMemory { public: explicit WriteBufferFromPocoSocket(Poco::Net::Socket & socket_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE); - explicit WriteBufferFromPocoSocket(Poco::Net::Socket & socket_, const ProfileEvents::Event & write_event_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE); ~WriteBufferFromPocoSocket() override; void setAsyncCallback(AsyncCallback async_callback_) { async_callback = std::move(async_callback_); } - using WriteBuffer::write; - void write(const std::string & str) { WriteBuffer::write(str.c_str(), str.size()); } - void write(std::string_view str) { WriteBuffer::write(str.data(), str.size()); } - void write(const char * str) { WriteBuffer::write(str, strlen(str)); } - void writeln(const std::string & str) { write(str); WriteBuffer::write("\n", 1); } - void writeln(std::string_view str) { write(str); WriteBuffer::write("\n", 1); } - void writeln(const char * str) { write(str); WriteBuffer::write("\n", 1); } - protected: void nextImpl() override; - void socketSendBytes(const char * ptr, size_t size); - void socketSendStr(const std::string & str) - { - return socketSendBytes(str.data(), str.size()); - } - void socketSendStr(const char * ptr) - { - return socketSendBytes(ptr, strlen(ptr)); - } - Poco::Net::Socket & socket; /** For error messages. It is necessary to receive this address in advance, because, @@ -53,13 +34,9 @@ protected: Poco::Net::SocketAddress peer_address; Poco::Net::SocketAddress our_address; - ProfileEvents::Event write_event; - private: AsyncCallback async_callback; std::string socket_description; - - ssize_t socketSendBytesImpl(const char * ptr, size_t size); }; } diff --git a/src/IO/WriteHelpers.h b/src/IO/WriteHelpers.h index b4f8b476b11..094352638e6 100644 --- a/src/IO/WriteHelpers.h +++ b/src/IO/WriteHelpers.h @@ -63,7 +63,9 @@ namespace ErrorCodes inline void writeChar(char x, WriteBuffer & buf) { - buf.write(x); + buf.nextIfAtEnd(); + *buf.position() = x; + ++buf.position(); } /// Write the same character n times. 
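The writeChar() change above replaces a call to WriteBuffer::write(char) with the buffer's primitive operations on the hot path; spelled out with comments:

    inline void writeChar(char x, DB::WriteBuffer & buf)
    {
        buf.nextIfAtEnd();    /// flush (call next()) only when the working buffer is exhausted
        *buf.position() = x;  /// store the byte directly into the working buffer
        ++buf.position();     /// advance the cursor; no per-byte function call remains
    }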
diff --git a/src/IO/ZlibDeflatingWriteBuffer.cpp b/src/IO/ZlibDeflatingWriteBuffer.cpp index ab6763fe6a6..6e4ab742413 100644 --- a/src/IO/ZlibDeflatingWriteBuffer.cpp +++ b/src/IO/ZlibDeflatingWriteBuffer.cpp @@ -10,6 +10,36 @@ namespace ErrorCodes extern const int ZLIB_DEFLATE_FAILED; } + +ZlibDeflatingWriteBuffer::ZlibDeflatingWriteBuffer( + std::unique_ptr out_, + CompressionMethod compression_method, + int compression_level, + size_t buf_size, + char * existing_memory, + size_t alignment) + : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) +{ + zstr.zalloc = nullptr; + zstr.zfree = nullptr; + zstr.opaque = nullptr; + zstr.next_in = nullptr; + zstr.avail_in = 0; + zstr.next_out = nullptr; + zstr.avail_out = 0; + + int window_bits = 15; + if (compression_method == CompressionMethod::Gzip) + { + window_bits += 16; + } + + int rc = deflateInit2(&zstr, compression_level, Z_DEFLATED, window_bits, 8, Z_DEFAULT_STRATEGY); + + if (rc != Z_OK) + throw Exception(ErrorCodes::ZLIB_DEFLATE_FAILED, "deflateInit2 failed: {}; zlib version: {}", zError(rc), ZLIB_VERSION); +} + void ZlibDeflatingWriteBuffer::nextImpl() { if (!offset()) @@ -52,10 +82,6 @@ void ZlibDeflatingWriteBuffer::finalizeBefore() { next(); - /// Don't write out if no data was ever compressed - if (!compress_empty && zstr.total_out == 0) - return; - /// https://github.com/zlib-ng/zlib-ng/issues/494 do { diff --git a/src/IO/ZlibDeflatingWriteBuffer.h b/src/IO/ZlibDeflatingWriteBuffer.h index f01c41c7d13..58e709b54e6 100644 --- a/src/IO/ZlibDeflatingWriteBuffer.h +++ b/src/IO/ZlibDeflatingWriteBuffer.h @@ -12,45 +12,17 @@ namespace DB { -namespace ErrorCodes -{ - extern const int ZLIB_DEFLATE_FAILED; -} - /// Performs compression using zlib library and writes compressed data to out_ WriteBuffer. 
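The constructor body moved into the .cpp above encodes a zlib convention that is easy to miss: with deflateInit2, windowBits in 8..15 produces a zlib-wrapped stream, and adding 16 requests a gzip header and trailer instead (see zlib.h). Isolated, assuming compression_level and compression_method are in scope as in the constructor:

    z_stream zstr{};
    int window_bits = 15;                 /// maximum window, zlib wrapper
    if (compression_method == CompressionMethod::Gzip)
        window_bits += 16;                /// same deflate stream, gzip framing
    int rc = deflateInit2(&zstr, compression_level, Z_DEFLATED, window_bits,
                          8 /* memLevel */, Z_DEFAULT_STRATEGY);
    if (rc != Z_OK)
        throw DB::Exception(DB::ErrorCodes::ZLIB_DEFLATE_FAILED,
                            "deflateInit2 failed: {}; zlib version: {}", zError(rc), ZLIB_VERSION);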
class ZlibDeflatingWriteBuffer : public WriteBufferWithOwnMemoryDecorator { public: - template ZlibDeflatingWriteBuffer( - WriteBufferT && out_, + std::unique_ptr out_, CompressionMethod compression_method, int compression_level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty_ = true) - : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment), compress_empty(compress_empty_) - { - zstr.zalloc = nullptr; - zstr.zfree = nullptr; - zstr.opaque = nullptr; - zstr.next_in = nullptr; - zstr.avail_in = 0; - zstr.next_out = nullptr; - zstr.avail_out = 0; - - int window_bits = 15; - if (compression_method == CompressionMethod::Gzip) - { - window_bits += 16; - } - - int rc = deflateInit2(&zstr, compression_level, Z_DEFLATED, window_bits, 8, Z_DEFAULT_STRATEGY); - - if (rc != Z_OK) - throw Exception(ErrorCodes::ZLIB_DEFLATE_FAILED, "deflateInit2 failed: {}; zlib version: {}", zError(rc), ZLIB_VERSION); - } + size_t alignment = 0); ~ZlibDeflatingWriteBuffer() override; @@ -64,7 +36,6 @@ private: virtual void finalizeAfter() override; z_stream zstr; - bool compress_empty = true; }; } diff --git a/src/IO/ZstdDeflatingWriteBuffer.cpp b/src/IO/ZstdDeflatingWriteBuffer.cpp index bad6e733cf1..949d65926b3 100644 --- a/src/IO/ZstdDeflatingWriteBuffer.cpp +++ b/src/IO/ZstdDeflatingWriteBuffer.cpp @@ -8,7 +8,9 @@ namespace ErrorCodes extern const int ZSTD_ENCODER_FAILED; } -void ZstdDeflatingWriteBuffer::initialize(int compression_level) +ZstdDeflatingWriteBuffer::ZstdDeflatingWriteBuffer( + std::unique_ptr out_, int compression_level, size_t buf_size, char * existing_memory, size_t alignment) + : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment) { cctx = ZSTD_createCCtx(); if (cctx == nullptr) @@ -42,7 +44,6 @@ void ZstdDeflatingWriteBuffer::flush(ZSTD_EndDirective mode) try { - size_t out_offset = out->offset(); bool ended = false; do { @@ -66,8 +67,6 @@ void ZstdDeflatingWriteBuffer::flush(ZSTD_EndDirective mode) ended = everything_was_compressed && everything_was_flushed; } while (!ended); - - total_out += out->offset() - out_offset; } catch (...) { @@ -85,9 +84,6 @@ void ZstdDeflatingWriteBuffer::nextImpl() void ZstdDeflatingWriteBuffer::finalizeBefore() { - /// Don't write out if no data was ever compressed - if (!compress_empty && total_out == 0) - return; flush(ZSTD_e_end); } diff --git a/src/IO/ZstdDeflatingWriteBuffer.h b/src/IO/ZstdDeflatingWriteBuffer.h index d25db515d28..a66d6085a74 100644 --- a/src/IO/ZstdDeflatingWriteBuffer.h +++ b/src/IO/ZstdDeflatingWriteBuffer.h @@ -14,18 +14,12 @@ namespace DB class ZstdDeflatingWriteBuffer : public WriteBufferWithOwnMemoryDecorator { public: - template ZstdDeflatingWriteBuffer( - WriteBufferT && out_, + std::unique_ptr out_, int compression_level, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE, char * existing_memory = nullptr, - size_t alignment = 0, - bool compress_empty_ = true) - : WriteBufferWithOwnMemoryDecorator(std::move(out_), buf_size, existing_memory, alignment), compress_empty(compress_empty_) - { - initialize(compression_level); - } + size_t alignment = 0); ~ZstdDeflatingWriteBuffer() override; @@ -35,8 +29,6 @@ public: } private: - void initialize(int compression_level); - void nextImpl() override; /// Flush all pending data and write zstd footer to the underlying buffer. 
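For reference, the flush(ZSTD_EndDirective) helper above follows the standard zstd streaming contract: keep calling ZSTD_compressStream2 until it returns 0 for ZSTD_e_end. A self-contained sketch of that loop (chunk size illustrative):

    #include <zstd.h>
    #include <stdexcept>
    #include <string>

    void compressAll(ZSTD_CCtx * cctx, const char * data, size_t size, std::string & result)
    {
        ZSTD_inBuffer input{data, size, 0};
        char chunk[4096];
        size_t remaining = 0;
        do
        {
            ZSTD_outBuffer output{chunk, sizeof(chunk), 0};
            /// ZSTD_e_end: compress the remaining input and flush the frame epilogue.
            remaining = ZSTD_compressStream2(cctx, &output, &input, ZSTD_e_end);
            if (ZSTD_isError(remaining))
                throw std::runtime_error(ZSTD_getErrorName(remaining));
            result.append(chunk, output.pos);
        } while (remaining != 0);  /// 0 means the frame is fully written
    }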
@@ -50,9 +42,6 @@ private: ZSTD_CCtx * cctx; ZSTD_inBuffer input; ZSTD_outBuffer output; - - size_t total_out = 0; - bool compress_empty = true; }; } diff --git a/src/IO/parseDateTimeBestEffort.cpp b/src/IO/parseDateTimeBestEffort.cpp index 83fde8e8830..9734ba1c84f 100644 --- a/src/IO/parseDateTimeBestEffort.cpp +++ b/src/IO/parseDateTimeBestEffort.cpp @@ -95,7 +95,7 @@ ReturnType parseDateTimeBestEffortImpl( FmtArgs && ...fmt_args [[maybe_unused]]) { if constexpr (std::is_same_v) - throw ParsingException(error_code, std::move(fmt_string), std::forward(fmt_args)...); + throw Exception(error_code, std::move(fmt_string), std::forward(fmt_args)...); else return false; }; diff --git a/src/IO/readDecimalText.h b/src/IO/readDecimalText.h index 9fd9c439b87..3417310a990 100644 --- a/src/IO/readDecimalText.h +++ b/src/IO/readDecimalText.h @@ -121,7 +121,7 @@ inline bool readDigits(ReadBuffer & buf, T & x, uint32_t & digits, int32_t & exp if (!tryReadIntText(addition_exp, buf)) { if constexpr (_throw_on_error) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot parse exponent while reading decimal"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot parse exponent while reading decimal"); else return false; } @@ -134,7 +134,7 @@ inline bool readDigits(ReadBuffer & buf, T & x, uint32_t & digits, int32_t & exp if (digits_only) { if constexpr (_throw_on_error) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Unexpected symbol while reading decimal"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Unexpected symbol while reading decimal"); return false; } stop = true; diff --git a/src/IO/readFloatText.h b/src/IO/readFloatText.h index b0682576183..23e904f305a 100644 --- a/src/IO/readFloatText.h +++ b/src/IO/readFloatText.h @@ -160,7 +160,7 @@ ReturnType readFloatTextPreciseImpl(T & x, ReadBuffer & buf) if (unlikely(res.ec != std::errc())) { if constexpr (throw_exception) - throw ParsingException( + throw Exception( ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value here: {}", String(initial_position, buf.buffer().end() - initial_position)); @@ -253,7 +253,7 @@ ReturnType readFloatTextPreciseImpl(T & x, ReadBuffer & buf) if (unlikely(res.ec != std::errc() || res.ptr - tmp_buf != num_copied_chars)) { if constexpr (throw_exception) - throw ParsingException( + throw Exception( ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value here: {}", String(tmp_buf, num_copied_chars)); else return ReturnType(false); @@ -342,7 +342,7 @@ ReturnType readFloatTextFastImpl(T & x, ReadBuffer & in) if (in.eof()) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value"); else return false; } @@ -400,7 +400,7 @@ ReturnType readFloatTextFastImpl(T & x, ReadBuffer & in) if (in.eof()) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: nothing after exponent"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: nothing after exponent"); else return false; } @@ -438,7 +438,7 @@ ReturnType readFloatTextFastImpl(T & x, ReadBuffer & in) if (in.eof()) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: no digits read"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: no digits read"); else return 
false; } @@ -449,14 +449,14 @@ ReturnType readFloatTextFastImpl(T & x, ReadBuffer & in) if (in.eof()) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: nothing after plus sign"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: nothing after plus sign"); else return false; } else if (negative) { if constexpr (throw_exception) - throw ParsingException(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: plus after minus sign"); + throw Exception(ErrorCodes::CANNOT_PARSE_NUMBER, "Cannot read floating point value: plus after minus sign"); else return false; } diff --git a/src/Interpreters/ActionsDAG.h b/src/Interpreters/ActionsDAG.h index 94b6b1ac41d..f18ae5d5c75 100644 --- a/src/Interpreters/ActionsDAG.h +++ b/src/Interpreters/ActionsDAG.h @@ -115,6 +115,7 @@ public: explicit ActionsDAG(const ColumnsWithTypeAndName & inputs_); const Nodes & getNodes() const { return nodes; } + static Nodes detachNodes(ActionsDAG && dag) { return std::move(dag.nodes); } const NodeRawConstPtrs & getOutputs() const { return outputs; } /** Output nodes can contain any column returned from DAG. * You may manually change it if needed. diff --git a/src/Interpreters/ActionsVisitor.cpp b/src/Interpreters/ActionsVisitor.cpp index 827914eaefe..1789cc6c4b1 100644 --- a/src/Interpreters/ActionsVisitor.cpp +++ b/src/Interpreters/ActionsVisitor.cpp @@ -1419,7 +1419,7 @@ FutureSetPtr ActionsMatcher::makeSet(const ASTFunction & node, Data & data, bool return set; } - FutureSetPtr external_table_set; + FutureSetFromSubqueryPtr external_table_set; /// A special case is if the name of the table is specified on the right side of the IN statement, /// and the table has the type Set (a previously prepared set). diff --git a/src/Interpreters/ClusterProxy/executeQuery.cpp b/src/Interpreters/ClusterProxy/executeQuery.cpp index 18f7280dd19..c448206ed78 100644 --- a/src/Interpreters/ClusterProxy/executeQuery.cpp +++ b/src/Interpreters/ClusterProxy/executeQuery.cpp @@ -412,7 +412,8 @@ void executeQueryWithParallelReplicas( new_cluster = not_optimized_cluster->getClusterWithReplicasAsShards(settings, settings.max_parallel_replicas); } - auto coordinator = std::make_shared(new_cluster->getShardCount()); + auto coordinator + = std::make_shared(new_cluster->getShardCount(), settings.parallel_replicas_mark_segment_size); auto external_tables = new_context->getExternalTables(); auto read_from_remote = std::make_unique( query_ast, diff --git a/src/Interpreters/Context.cpp b/src/Interpreters/Context.cpp index e9962d08160..38944b21c49 100644 --- a/src/Interpreters/Context.cpp +++ b/src/Interpreters/Context.cpp @@ -330,6 +330,9 @@ struct ContextSharedPart : boost::noncopyable mutable ThrottlerPtr backups_server_throttler; /// A server-wide throttler for BACKUPs + mutable ThrottlerPtr mutations_throttler; /// A server-wide throttler for mutations + mutable ThrottlerPtr merges_throttler; /// A server-wide throttler for merges + MultiVersion macros; /// Substitutions extracted from config. std::unique_ptr ddl_worker TSA_GUARDED_BY(mutex); /// Process ddl commands from zk. 
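The two throttler members added above follow the established server-wide pattern: one shared Throttler per resource class, capped in bytes per second (populated below from max_mutations_bandwidth_for_server and max_merges_bandwidth_for_server), consulted by every I/O path doing that kind of work. A consumer-side sketch (the copy loop itself is illustrative):

    /// Created once at startup, e.g. capped at 100 MB/s.
    DB::ThrottlerPtr limiter = std::make_shared<DB::Throttler>(/*max_speed=*/100'000'000);

    void copyWithThrottling(DB::ReadBuffer & in, DB::WriteBuffer & out, const DB::ThrottlerPtr & throttler)
    {
        while (!in.eof())
        {
            size_t bytes = in.available();
            out.write(in.position(), bytes);
            in.position() += bytes;
            if (throttler)
                throttler->add(bytes);  /// blocks as needed to keep the average rate under max_speed
        }
    }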
LoadTaskPtr ddl_worker_startup_task; /// To postpone `ddl_worker->startup()` after all tables startup @@ -738,6 +741,12 @@ struct ContextSharedPart : boost::noncopyable if (auto bandwidth = server_settings.max_backup_bandwidth_for_server) backups_server_throttler = std::make_shared(bandwidth); + + if (auto bandwidth = server_settings.max_mutations_bandwidth_for_server) + mutations_throttler = std::make_shared(bandwidth); + + if (auto bandwidth = server_settings.max_merges_bandwidth_for_server) + merges_throttler = std::make_shared(bandwidth); } }; @@ -3001,6 +3010,16 @@ ThrottlerPtr Context::getBackupsThrottler() const return throttler; } +ThrottlerPtr Context::getMutationsThrottler() const +{ + return shared->mutations_throttler; +} + +ThrottlerPtr Context::getMergesThrottler() const +{ + return shared->merges_throttler; +} + bool Context::hasDistributedDDL() const { return getConfigRef().has("distributed_ddl"); diff --git a/src/Interpreters/Context.h b/src/Interpreters/Context.h index b09eeb8ca2d..640aeb0539c 100644 --- a/src/Interpreters/Context.h +++ b/src/Interpreters/Context.h @@ -1328,6 +1328,9 @@ public: ThrottlerPtr getBackupsThrottler() const; + ThrottlerPtr getMutationsThrottler() const; + ThrottlerPtr getMergesThrottler() const; + /// Kitchen sink using ContextData::KitchenSink; using ContextData::kitchen_sink; diff --git a/src/Interpreters/DDLTask.cpp b/src/Interpreters/DDLTask.cpp index 6e9155ab2a2..d418be51cc5 100644 --- a/src/Interpreters/DDLTask.cpp +++ b/src/Interpreters/DDLTask.cpp @@ -215,20 +215,47 @@ ContextMutablePtr DDLTaskBase::makeQueryContext(ContextPtr from_context, const Z } -bool DDLTask::findCurrentHostID(ContextPtr global_context, Poco::Logger * log, const ZooKeeperPtr & zookeeper) +bool DDLTask::findCurrentHostID(ContextPtr global_context, Poco::Logger * log, const ZooKeeperPtr & zookeeper, const std::optional & config_host_name) { bool host_in_hostlist = false; std::exception_ptr first_exception = nullptr; + const auto maybe_secure_port = global_context->getTCPPortSecure(); + const auto port = global_context->getTCPPort(); + + if (config_host_name) + { + bool is_local_port = (maybe_secure_port && HostID(*config_host_name, *maybe_secure_port).isLocalAddress(*maybe_secure_port)) || + HostID(*config_host_name, port).isLocalAddress(port); + + if (!is_local_port) + throw Exception( + ErrorCodes::DNS_ERROR, + "{} is not a local address. Check parameter 'host_name' in the configuration", + *config_host_name); + } + for (const HostID & host : entry.hosts) { - auto maybe_secure_port = global_context->getTCPPortSecure(); + if (config_host_name) + { + if (config_host_name != host.host_name) + continue; + + if (maybe_secure_port != host.port && port != host.port) + continue; + + host_in_hostlist = true; + host_id = host; + host_id_str = host.toString(); + break; + } try { /// The port is considered local if it matches TCP or TCP secure port that the server is listening. 
bool is_local_port - = (maybe_secure_port && host.isLocalAddress(*maybe_secure_port)) || host.isLocalAddress(global_context->getTCPPort()); + = (maybe_secure_port && host.isLocalAddress(*maybe_secure_port)) || host.isLocalAddress(port); if (!is_local_port) continue; diff --git a/src/Interpreters/DDLTask.h b/src/Interpreters/DDLTask.h index 1ceb74c7048..bc45b46bf0f 100644 --- a/src/Interpreters/DDLTask.h +++ b/src/Interpreters/DDLTask.h @@ -44,6 +44,9 @@ struct HostID explicit HostID(const Cluster::Address & address) : host_name(address.host_name), port(address.port) {} + HostID(const String & host_name_, UInt16 port_) + : host_name(host_name_), port(port_) {} + static HostID fromString(const String & host_port_str); String toString() const @@ -143,7 +146,7 @@ struct DDLTask : public DDLTaskBase { DDLTask(const String & name, const String & path) : DDLTaskBase(name, path) {} - bool findCurrentHostID(ContextPtr global_context, Poco::Logger * log, const ZooKeeperPtr & zookeeper); + bool findCurrentHostID(ContextPtr global_context, Poco::Logger * log, const ZooKeeperPtr & zookeeper, const std::optional & config_host_name); void setClusterInfo(ContextPtr context, Poco::Logger * log); diff --git a/src/Interpreters/DDLWorker.cpp b/src/Interpreters/DDLWorker.cpp index f08fd72ff7f..c0611dfaf7d 100644 --- a/src/Interpreters/DDLWorker.cpp +++ b/src/Interpreters/DDLWorker.cpp @@ -107,6 +107,9 @@ DDLWorker::DDLWorker( cleanup_delay_period = config->getUInt64(prefix + ".cleanup_delay_period", static_cast(cleanup_delay_period)); max_tasks_in_queue = std::max(1, config->getUInt64(prefix + ".max_tasks_in_queue", max_tasks_in_queue)); + if (config->has(prefix + ".host_name")) + config_host_name = config->getString(prefix + ".host_name"); + if (config->has(prefix + ".profile")) context->setSetting("profile", config->getString(prefix + ".profile")); } @@ -214,7 +217,7 @@ DDLTaskPtr DDLWorker::initAndCheckTask(const String & entry_name, String & out_r /// Stage 2: resolve host_id and check if we should execute query or not /// Multiple clusters can use single DDL queue path in ZooKeeper, /// So we should skip task if we cannot find current host in cluster hosts list. - if (!task->findCurrentHostID(context, log, zookeeper)) + if (!task->findCurrentHostID(context, log, zookeeper, config_host_name)) { out_reason = "There is no a local address in host list"; return add_to_skip_set(); diff --git a/src/Interpreters/DDLWorker.h b/src/Interpreters/DDLWorker.h index d34a4135199..adc9a491d81 100644 --- a/src/Interpreters/DDLWorker.h +++ b/src/Interpreters/DDLWorker.h @@ -153,6 +153,8 @@ protected: ContextMutablePtr context; Poco::Logger * log; + std::optional config_host_name; /// host_name from config + std::string host_fqdn; /// current host domain name std::string host_fqdn_id; /// host_name:port std::string queue_dir; /// dir with queue of queries diff --git a/src/Interpreters/DatabaseCatalog.h b/src/Interpreters/DatabaseCatalog.h index 6d8fd84557c..19882b0b828 100644 --- a/src/Interpreters/DatabaseCatalog.h +++ b/src/Interpreters/DatabaseCatalog.h @@ -82,8 +82,8 @@ private: using DDLGuardPtr = std::unique_ptr; -class FutureSet; -using FutureSetPtr = std::shared_ptr; +class FutureSetFromSubquery; +using FutureSetFromSubqueryPtr = std::shared_ptr; /// Creates temporary table in `_temporary_and_external_tables` with randomly generated unique StorageID. /// Such table can be accessed from everywhere by its ID. 
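Summarizing the resolution rule introduced above: when host_name is set in the distributed_ddl section of the server config, the DDL worker identifies itself in a task's host list by exact host-name plus port comparison and skips DNS resolution of every candidate entirely. The predicate, distilled (the free-standing signature is illustrative):

    bool matchesSelf(const HostID & host, const String & config_host_name,
                     UInt16 tcp_port, std::optional<UInt16> tcp_port_secure)
    {
        if (host.host_name != config_host_name)
            return false;
        /// A task entry may carry either the plain or the secure TCP port.
        return host.port == tcp_port || (tcp_port_secure && host.port == *tcp_port_secure);
    }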
@@ -116,7 +116,7 @@ struct TemporaryTableHolder : boost::noncopyable, WithContext IDatabase * temporary_tables = nullptr; UUID id = UUIDHelpers::Nil; - FutureSetPtr future_set; + FutureSetFromSubqueryPtr future_set; }; ///TODO maybe remove shared_ptr from here? diff --git a/src/Interpreters/InterpreterSelectQuery.cpp b/src/Interpreters/InterpreterSelectQuery.cpp index cdf1b4228bc..8e8482ccbd7 100644 --- a/src/Interpreters/InterpreterSelectQuery.cpp +++ b/src/Interpreters/InterpreterSelectQuery.cpp @@ -2378,12 +2378,25 @@ std::optional InterpreterSelectQuery::getTrivialCount(UInt64 max_paralle else { // It's possible to optimize count() given only partition predicates - SelectQueryInfo temp_query_info; - temp_query_info.query = query_ptr; - temp_query_info.syntax_analyzer_result = syntax_analyzer_result; - temp_query_info.prepared_sets = query_analyzer->getPreparedSets(); + ActionsDAG::NodeRawConstPtrs filter_nodes; + if (analysis_result.hasPrewhere()) + { + auto & prewhere_info = analysis_result.prewhere_info; + filter_nodes.push_back(&prewhere_info->prewhere_actions->findInOutputs(prewhere_info->prewhere_column_name)); - return storage->totalRowsByPartitionPredicate(temp_query_info, context); + if (prewhere_info->row_level_filter) + filter_nodes.push_back(&prewhere_info->row_level_filter->findInOutputs(prewhere_info->row_level_column_name)); + } + if (analysis_result.hasWhere()) + { + filter_nodes.push_back(&analysis_result.before_where->findInOutputs(analysis_result.where_column_name)); + } + + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes, {}, context); + if (!filter_actions_dag) + return {}; + + return storage->totalRowsByPartitionPredicate(filter_actions_dag, context); } } @@ -2501,7 +2514,12 @@ void InterpreterSelectQuery::executeFetchColumns(QueryProcessingStage::Enum proc max_block_size = std::max(1, max_block_limited); max_threads_execute_query = max_streams = 1; } - if (max_block_limited < local_limits.local_limits.size_limits.max_rows) + if (local_limits.local_limits.size_limits.max_rows != 0) + { + if (max_block_limited < local_limits.local_limits.size_limits.max_rows) + query_info.limit = max_block_limited; + } + else { query_info.limit = max_block_limited; } diff --git a/src/Interpreters/MutationsInterpreter.cpp b/src/Interpreters/MutationsInterpreter.cpp index bf50766c165..a6ea03f8a03 100644 --- a/src/Interpreters/MutationsInterpreter.cpp +++ b/src/Interpreters/MutationsInterpreter.cpp @@ -1280,6 +1280,7 @@ void MutationsInterpreter::Source::read( VirtualColumns virtual_columns(std::move(required_columns), part); createReadFromPartStep( + MergeTreeSequentialSourceType::Mutation, plan, *data, storage_snapshot, part, std::move(virtual_columns.columns_to_read), apply_deleted_mask_, filter, context_, diff --git a/src/Interpreters/PreparedSets.cpp b/src/Interpreters/PreparedSets.cpp index 18a25482b7f..cc3db726f01 100644 --- a/src/Interpreters/PreparedSets.cpp +++ b/src/Interpreters/PreparedSets.cpp @@ -97,7 +97,7 @@ FutureSetFromSubquery::FutureSetFromSubquery( String key, std::unique_ptr source_, StoragePtr external_table_, - FutureSetPtr external_table_set_, + std::shared_ptr external_table_set_, const Settings & settings, bool in_subquery_) : external_table(std::move(external_table_)) @@ -168,6 +168,24 @@ std::unique_ptr FutureSetFromSubquery::build(const ContextPtr & conte return plan; } +void FutureSetFromSubquery::buildSetInplace(const ContextPtr & context) +{ + if (external_table_set) + external_table_set->buildSetInplace(context); + + auto plan 
= build(context); + + if (!plan) + return; + + auto builder = plan->buildQueryPipeline(QueryPlanOptimizationSettings::fromContext(context), BuildQueryPipelineSettings::fromContext(context)); + auto pipeline = QueryPipelineBuilder::getPipeline(std::move(*builder)); + pipeline.complete(std::make_shared(Block())); + + CompletedPipelineExecutor executor(pipeline); + executor.execute(); +} + SetPtr FutureSetFromSubquery::buildOrderedSetInplace(const ContextPtr & context) { if (!context->getSettingsRef().use_index_for_in_with_subqueries) @@ -233,7 +251,7 @@ String PreparedSets::toString(const PreparedSets::Hash & key, const DataTypes & return buf.str(); } -FutureSetPtr PreparedSets::addFromTuple(const Hash & key, Block block, const Settings & settings) +FutureSetFromTuplePtr PreparedSets::addFromTuple(const Hash & key, Block block, const Settings & settings) { auto from_tuple = std::make_shared(std::move(block), settings); const auto & set_types = from_tuple->getTypes(); @@ -247,7 +265,7 @@ FutureSetPtr PreparedSets::addFromTuple(const Hash & key, Block block, const Set return from_tuple; } -FutureSetPtr PreparedSets::addFromStorage(const Hash & key, SetPtr set_) +FutureSetFromStoragePtr PreparedSets::addFromStorage(const Hash & key, SetPtr set_) { auto from_storage = std::make_shared(std::move(set_)); auto [it, inserted] = sets_from_storage.emplace(key, from_storage); @@ -258,11 +276,11 @@ FutureSetPtr PreparedSets::addFromStorage(const Hash & key, SetPtr set_) return from_storage; } -FutureSetPtr PreparedSets::addFromSubquery( +FutureSetFromSubqueryPtr PreparedSets::addFromSubquery( const Hash & key, std::unique_ptr source, StoragePtr external_table, - FutureSetPtr external_table_set, + FutureSetFromSubqueryPtr external_table_set, const Settings & settings, bool in_subquery) { @@ -282,7 +300,7 @@ FutureSetPtr PreparedSets::addFromSubquery( return from_subquery; } -FutureSetPtr PreparedSets::addFromSubquery( +FutureSetFromSubqueryPtr PreparedSets::addFromSubquery( const Hash & key, QueryTreeNodePtr query_tree, const Settings & settings) @@ -300,7 +318,7 @@ FutureSetPtr PreparedSets::addFromSubquery( return from_subquery; } -FutureSetPtr PreparedSets::findTuple(const Hash & key, const DataTypes & types) const +FutureSetFromTuplePtr PreparedSets::findTuple(const Hash & key, const DataTypes & types) const { auto it = sets_from_tuple.find(key); if (it == sets_from_tuple.end()) diff --git a/src/Interpreters/PreparedSets.h b/src/Interpreters/PreparedSets.h index 9f8bac9f71c..7178cff73b9 100644 --- a/src/Interpreters/PreparedSets.h +++ b/src/Interpreters/PreparedSets.h @@ -69,6 +69,8 @@ private: SetPtr set; }; +using FutureSetFromStoragePtr = std::shared_ptr; + /// Set from tuple is filled as well as set from storage. /// Additionally, it can be converted to set useful for PK. class FutureSetFromTuple final : public FutureSet @@ -86,6 +88,8 @@ private: SetKeyColumns set_key_columns; }; +using FutureSetFromTuplePtr = std::shared_ptr; + /// Set from subquery can be built inplace for PK or in CreatingSet step. /// If use_index_for_in_with_subqueries_max_values is reached, set for PK won't be created, /// but ordinary set would be created instead. 
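The refactoring continued below swaps the catch-all FutureSetPtr for per-kind aliases (FutureSetFromStoragePtr, FutureSetFromTuplePtr, FutureSetFromSubqueryPtr), so each container in PreparedSets states exactly which subtype it holds; the keys remain 128-bit AST hashes folded to 64 bits by the Hashing functor. The shape of the pattern, reduced to essentials (Key here is illustrative):

    struct Key { UInt64 low64 = 0; UInt64 high64 = 0; bool operator==(const Key &) const = default; };
    struct Hashing { UInt64 operator()(const Key & k) const { return k.low64 ^ k.high64; } };

    /// The value type now documents what the map stores, and lookups hand back
    /// the concrete interface (e.g. buildSetInplace()) without downcasting.
    using FromSubqueryPtr = std::shared_ptr<FutureSetFromSubquery>;
    std::unordered_map<Key, FromSubqueryPtr, Hashing> sets_from_subqueries;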
@@ -96,7 +100,7 @@ public:
         String key,
         std::unique_ptr<QueryPlan> source_,
         StoragePtr external_table_,
-        FutureSetPtr external_table_set_,
+        std::shared_ptr<FutureSetFromSubquery> external_table_set_,
         const Settings & settings,
         bool in_subquery_);

@@ -110,6 +114,7 @@ public:
     SetPtr buildOrderedSetInplace(const ContextPtr & context) override;

     std::unique_ptr<QueryPlan> build(const ContextPtr & context);
+    void buildSetInplace(const ContextPtr & context);

     QueryTreeNodePtr detachQueryTree() { return std::move(query_tree); }
     void setQueryPlan(std::unique_ptr<QueryPlan> source_);
@@ -119,7 +124,7 @@ public:
 private:
     SetAndKeyPtr set_and_key;
     StoragePtr external_table;
-    FutureSetPtr external_table_set;
+    std::shared_ptr<FutureSetFromSubquery> external_table_set;

     std::unique_ptr<QueryPlan> source;
     QueryTreeNodePtr query_tree;
@@ -130,6 +135,8 @@ // with new analyzer it's not a case
 };

+using FutureSetFromSubqueryPtr = std::shared_ptr<FutureSetFromSubquery>;
+
 /// Container for all the sets used in query.
 class PreparedSets
 {
@@ -141,32 +148,32 @@ public:
         UInt64 operator()(const Hash & key) const { return key.low64 ^ key.high64; }
     };

-    using SetsFromTuple = std::unordered_map<Hash, std::vector<std::shared_ptr<FutureSetFromTuple>>, Hashing>;
-    using SetsFromStorage = std::unordered_map<Hash, std::shared_ptr<FutureSetFromStorage>, Hashing>;
-    using SetsFromSubqueries = std::unordered_map<Hash, std::shared_ptr<FutureSetFromSubquery>, Hashing>;
+    using SetsFromTuple = std::unordered_map<Hash, std::vector<FutureSetFromTuplePtr>, Hashing>;
+    using SetsFromStorage = std::unordered_map<Hash, FutureSetFromStoragePtr, Hashing>;
+    using SetsFromSubqueries = std::unordered_map<Hash, FutureSetFromSubqueryPtr, Hashing>;

-    FutureSetPtr addFromStorage(const Hash & key, SetPtr set_);
-    FutureSetPtr addFromTuple(const Hash & key, Block block, const Settings & settings);
+    FutureSetFromStoragePtr addFromStorage(const Hash & key, SetPtr set_);
+    FutureSetFromTuplePtr addFromTuple(const Hash & key, Block block, const Settings & settings);

-    FutureSetPtr addFromSubquery(
+    FutureSetFromSubqueryPtr addFromSubquery(
         const Hash & key,
         std::unique_ptr<QueryPlan> source,
         StoragePtr external_table,
-        FutureSetPtr external_table_set,
+        FutureSetFromSubqueryPtr external_table_set,
         const Settings & settings,
         bool in_subquery = false);

-    FutureSetPtr addFromSubquery(
+    FutureSetFromSubqueryPtr addFromSubquery(
         const Hash & key,
         QueryTreeNodePtr query_tree,
         const Settings & settings);

-    FutureSetPtr findTuple(const Hash & key, const DataTypes & types) const;
-    std::shared_ptr<FutureSetFromStorage> findStorage(const Hash & key) const;
-    std::shared_ptr<FutureSetFromSubquery> findSubquery(const Hash & key) const;
+    FutureSetFromTuplePtr findTuple(const Hash & key, const DataTypes & types) const;
+    FutureSetFromStoragePtr findStorage(const Hash & key) const;
+    FutureSetFromSubqueryPtr findSubquery(const Hash & key) const;
     void markAsINSubquery(const Hash & key);

-    using Subqueries = std::vector<std::shared_ptr<FutureSetFromSubquery>>;
+    using Subqueries = std::vector<FutureSetFromSubqueryPtr>;
     Subqueries getSubqueries() const;
     bool hasSubqueries() const { return !sets_from_subqueries.empty(); }
diff --git a/src/Interpreters/RequiredSourceColumnsData.h b/src/Interpreters/RequiredSourceColumnsData.h
index dd4e2dc3d68..501f6961efa 100644
--- a/src/Interpreters/RequiredSourceColumnsData.h
+++ b/src/Interpreters/RequiredSourceColumnsData.h
@@ -36,7 +36,6 @@ struct RequiredSourceColumnsData

     bool has_table_join = false;
     bool has_array_join = false;
-    bool visit_index_hint = false;

     bool addColumnAliasIfAny(const IAST & ast);
     void addColumnIdentifier(const ASTIdentifier & node);
diff --git a/src/Interpreters/RequiredSourceColumnsVisitor.cpp b/src/Interpreters/RequiredSourceColumnsVisitor.cpp
index c07d783788a..3971c8b58f4 100644
--- a/src/Interpreters/RequiredSourceColumnsVisitor.cpp
+++ b/src/Interpreters/RequiredSourceColumnsVisitor.cpp
@@ -72,11 +72,6 @@ void RequiredSourceColumnsMatcher::visit(const ASTPtr & ast,
Data & data) } if (auto * t = ast->as()) { - /// "indexHint" is a special function for index analysis. - /// Everything that is inside it is not calculated. See KeyCondition - if (!data.visit_index_hint && t->name == "indexHint") - return; - data.addColumnAliasIfAny(*ast); visit(*t, ast, data); return; diff --git a/src/Interpreters/Session.cpp b/src/Interpreters/Session.cpp index d2f9fe8b325..162772061b5 100644 --- a/src/Interpreters/Session.cpp +++ b/src/Interpreters/Session.cpp @@ -112,7 +112,8 @@ public: throw Exception(ErrorCodes::SESSION_NOT_FOUND, "Session {} not found", session_id); /// Create a new session from current context. - it = sessions.insert(std::make_pair(key, std::make_shared(key, global_context, timeout, *this))).first; + auto context = Context::createCopy(global_context); + it = sessions.insert(std::make_pair(key, std::make_shared(key, context, timeout, *this))).first; const auto & session = it->second; if (!thread.joinable()) @@ -127,7 +128,7 @@ public: /// Use existing session. const auto & session = it->second; - LOG_TRACE(log, "Reuse session from storage with session_id: {}, user_id: {}", key.second, key.first); + LOG_TEST(log, "Reuse session from storage with session_id: {}, user_id: {}", key.second, key.first); if (!session.unique()) throw Exception(ErrorCodes::SESSION_IS_LOCKED, "Session {} is locked by a concurrent client", session_id); @@ -702,10 +703,6 @@ void Session::releaseSessionID() { if (!named_session) return; - - prepared_client_info = getClientInfo(); - session_context.reset(); - named_session->release(); named_session = nullptr; } diff --git a/src/Interpreters/Session.h b/src/Interpreters/Session.h index 75e1414b8cb..2249d8fbb2f 100644 --- a/src/Interpreters/Session.h +++ b/src/Interpreters/Session.h @@ -8,7 +8,6 @@ #include #include -#include #include namespace Poco::Net { class SocketAddress; } diff --git a/src/Interpreters/TreeRewriter.cpp b/src/Interpreters/TreeRewriter.cpp index 9cbf24091e3..6ed3ff2f1e6 100644 --- a/src/Interpreters/TreeRewriter.cpp +++ b/src/Interpreters/TreeRewriter.cpp @@ -995,13 +995,12 @@ void TreeRewriterResult::collectSourceColumns(bool add_special) /// Calculate which columns are required to execute the expression. /// Then, delete all other columns from the list of available columns. /// After execution, columns will only contain the list of columns needed to read from the table. 
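The Session.cpp hunk above changes what a freshly created named session receives: its own copy of the global context, rather than a reference to global_context itself, so per-session changes can no longer bleed into the shared context. In outline (NamedSessionData construction simplified from the call above):

    auto context = DB::Context::createCopy(global_context);
    /// Each named session now owns an isolated context created at insertion time.
    auto session = std::make_shared<NamedSessionData>(key, context, timeout, *this);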
-bool TreeRewriterResult::collectUsedColumns(const ASTPtr & query, bool is_select, bool visit_index_hint, bool no_throw) +bool TreeRewriterResult::collectUsedColumns(const ASTPtr & query, bool is_select, bool no_throw) { /// We calculate required_source_columns with source_columns modifications and swap them on exit required_source_columns = source_columns; RequiredSourceColumnsVisitor::Data columns_context; - columns_context.visit_index_hint = visit_index_hint; RequiredSourceColumnsVisitor(columns_context).visit(query); NameSet source_column_names; @@ -1385,7 +1384,7 @@ TreeRewriterResultPtr TreeRewriter::analyzeSelect( result.window_function_asts = getWindowFunctions(query, *select_query); result.expressions_with_window_function = getExpressionsWithWindowFunctions(query); - result.collectUsedColumns(query, true, settings.query_plan_optimize_primary_key); + result.collectUsedColumns(query, true); if (!result.missed_subcolumns.empty()) { @@ -1422,7 +1421,7 @@ TreeRewriterResultPtr TreeRewriter::analyzeSelect( result.aggregates = getAggregates(query, *select_query); result.window_function_asts = getWindowFunctions(query, *select_query); result.expressions_with_window_function = getExpressionsWithWindowFunctions(query); - result.collectUsedColumns(query, true, settings.query_plan_optimize_primary_key); + result.collectUsedColumns(query, true); } } @@ -1499,7 +1498,7 @@ TreeRewriterResultPtr TreeRewriter::analyze( else assertNoAggregates(query, "in wrong place"); - bool is_ok = result.collectUsedColumns(query, false, settings.query_plan_optimize_primary_key, no_throw); + bool is_ok = result.collectUsedColumns(query, false, no_throw); if (!is_ok) return {}; diff --git a/src/Interpreters/TreeRewriter.h b/src/Interpreters/TreeRewriter.h index 1858488afa3..205b4760423 100644 --- a/src/Interpreters/TreeRewriter.h +++ b/src/Interpreters/TreeRewriter.h @@ -88,7 +88,7 @@ struct TreeRewriterResult bool add_special = true); void collectSourceColumns(bool add_special); - bool collectUsedColumns(const ASTPtr & query, bool is_select, bool visit_index_hint, bool no_throw = false); + bool collectUsedColumns(const ASTPtr & query, bool is_select, bool no_throw = false); Names requiredSourceColumns() const { return required_source_columns.getNames(); } const Names & requiredSourceColumnsForAccessCheck() const { return required_source_columns_before_expanding_alias_columns; } NameSet getArrayJoinSourceNameSet() const; diff --git a/src/Interpreters/executeDDLQueryOnCluster.cpp b/src/Interpreters/executeDDLQueryOnCluster.cpp index 9486350a0f6..6b6054fdae3 100644 --- a/src/Interpreters/executeDDLQueryOnCluster.cpp +++ b/src/Interpreters/executeDDLQueryOnCluster.cpp @@ -200,8 +200,6 @@ public: Status prepare() override; private: - static Strings getChildrenAllowNoNode(const std::shared_ptr & zookeeper, const String & node_path); - static Block getSampleBlock(ContextPtr context_, bool hosts_to_wait); Strings getNewAndUpdate(const Strings & current_list_of_finished_hosts); @@ -228,7 +226,8 @@ private: NameSet waiting_hosts; /// hosts from task host list NameSet finished_hosts; /// finished hosts from host list NameSet ignoring_hosts; /// appeared hosts that are not in hosts list - Strings current_active_hosts; /// Hosts that were in active state at the last check + Strings current_active_hosts; /// Hosts that are currently executing the task + NameSet offline_hosts; /// Hosts that are not currently running size_t num_hosts_finished = 0; /// Save the first detected error and throw it at the end of execution @@ -237,7 
+236,10 @@ private: Int64 timeout_seconds = 120; bool is_replicated_database = false; bool throw_on_timeout = true; + bool only_running_hosts = false; + bool timeout_exceeded = false; + bool stop_waiting_offline_hosts = false; }; @@ -310,12 +312,15 @@ DDLQueryStatusSource::DDLQueryStatusSource( , log(&Poco::Logger::get("DDLQueryStatusSource")) { auto output_mode = context->getSettingsRef().distributed_ddl_output_mode; - throw_on_timeout = output_mode == DistributedDDLOutputMode::THROW || output_mode == DistributedDDLOutputMode::NONE; + throw_on_timeout = output_mode == DistributedDDLOutputMode::THROW || output_mode == DistributedDDLOutputMode::THROW_ONLY_ACTIVE + || output_mode == DistributedDDLOutputMode::NONE; if (hosts_to_wait) { waiting_hosts = NameSet(hosts_to_wait->begin(), hosts_to_wait->end()); is_replicated_database = true; + only_running_hosts = output_mode == DistributedDDLOutputMode::THROW_ONLY_ACTIVE || + output_mode == DistributedDDLOutputMode::NULL_STATUS_ON_TIMEOUT_ONLY_ACTIVE; } else { @@ -377,6 +382,38 @@ Chunk DDLQueryStatusSource::generateChunkWithUnfinishedHosts() const return Chunk(std::move(columns), unfinished_hosts.size()); } +static NameSet getOfflineHosts(const String & node_path, const NameSet & hosts_to_wait, const ZooKeeperPtr & zookeeper, Poco::Logger * log) +{ + fs::path replicas_path; + if (node_path.ends_with('/')) + replicas_path = fs::path(node_path).parent_path().parent_path().parent_path() / "replicas"; + else + replicas_path = fs::path(node_path).parent_path().parent_path() / "replicas"; + + Strings paths; + Strings hosts_array; + for (const auto & host : hosts_to_wait) + { + hosts_array.push_back(host); + paths.push_back(replicas_path / host / "active"); + } + + NameSet offline; + auto res = zookeeper->tryGet(paths); + for (size_t i = 0; i < res.size(); ++i) + if (res[i].error == Coordination::Error::ZNONODE) + offline.insert(hosts_array[i]); + + if (offline.size() == hosts_to_wait.size()) + { + /// Avoid reporting that all hosts are offline + LOG_WARNING(log, "Did not find active hosts, will wait for all {} hosts. This should not happen often", offline.size()); + return {}; + } + + return offline; +} + Chunk DDLQueryStatusSource::generate() { bool all_hosts_finished = num_hosts_finished >= waiting_hosts.size(); @@ -398,7 +435,7 @@ Chunk DDLQueryStatusSource::generate() if (isCancelled()) return {}; - if (timeout_seconds >= 0 && watch.elapsedSeconds() > timeout_seconds) + if (stop_waiting_offline_hosts || (timeout_seconds >= 0 && watch.elapsedSeconds() > timeout_seconds)) { timeout_exceeded = true; @@ -406,7 +443,7 @@ Chunk DDLQueryStatusSource::generate() size_t num_active_hosts = current_active_hosts.size(); constexpr auto msg_format = "Watching task {} is executing longer than distributed_ddl_task_timeout (={}) seconds. 
" - "There are {} unfinished hosts ({} of them are currently active), " + "There are {} unfinished hosts ({} of them are currently executing the task), " "they are going to execute the query in background"; if (throw_on_timeout) { @@ -425,10 +462,7 @@ Chunk DDLQueryStatusSource::generate() return generateChunkWithUnfinishedHosts(); } - if (num_hosts_finished != 0 || try_number != 0) - { - sleepForMilliseconds(std::min(1000, 50 * (try_number + 1))); - } + sleepForMilliseconds(std::min(1000, 50 * try_number)); bool node_exists = false; Strings tmp_hosts; @@ -440,9 +474,21 @@ Chunk DDLQueryStatusSource::generate() retries_ctl.retryLoop([&]() { auto zookeeper = context->getZooKeeper(); - node_exists = zookeeper->exists(node_path); - tmp_hosts = getChildrenAllowNoNode(zookeeper, fs::path(node_path) / node_to_wait); - tmp_active_hosts = getChildrenAllowNoNode(zookeeper, fs::path(node_path) / "active"); + Strings paths = {String(fs::path(node_path) / node_to_wait), String(fs::path(node_path) / "active")}; + auto res = zookeeper->tryGetChildren(paths); + for (size_t i = 0; i < res.size(); ++i) + if (res[i].error != Coordination::Error::ZOK && res[i].error != Coordination::Error::ZNONODE) + throw Coordination::Exception::fromPath(res[i].error, paths[i]); + + if (res[0].error == Coordination::Error::ZNONODE) + node_exists = zookeeper->exists(node_path); + else + node_exists = true; + tmp_hosts = res[0].names; + tmp_active_hosts = res[1].names; + + if (only_running_hosts) + offline_hosts = getOfflineHosts(node_path, waiting_hosts, zookeeper, log); }); } @@ -460,6 +506,17 @@ Chunk DDLQueryStatusSource::generate() Strings new_hosts = getNewAndUpdate(tmp_hosts); ++try_number; + + if (only_running_hosts) + { + size_t num_finished_or_offline = 0; + for (const auto & host : waiting_hosts) + num_finished_or_offline += finished_hosts.contains(host) || offline_hosts.contains(host); + + if (num_finished_or_offline == waiting_hosts.size()) + stop_waiting_offline_hosts = true; + } + if (new_hosts.empty()) continue; @@ -470,7 +527,13 @@ Chunk DDLQueryStatusSource::generate() { ExecutionStatus status(-1, "Cannot obtain error message"); - if (node_to_wait == "finished") + /// Replicated database retries in case of error, it should not write error status. +#ifdef ABORT_ON_LOGICAL_ERROR + bool need_check_status = true; +#else + bool need_check_status = !is_replicated_database; +#endif + if (need_check_status) { String status_data; bool finished_exists = false; @@ -496,7 +559,6 @@ Chunk DDLQueryStatusSource::generate() if (status.code != 0 && !first_exception && context->getSettingsRef().distributed_ddl_output_mode != DistributedDDLOutputMode::NEVER_THROW) { - /// Replicated database retries in case of error, it should not write error status. 
if (is_replicated_database) throw Exception(ErrorCodes::LOGICAL_ERROR, "There was an error on {}: {} (probably it's a bug)", host_id, status.message); @@ -555,15 +617,6 @@ IProcessor::Status DDLQueryStatusSource::prepare() return ISource::prepare(); } -Strings DDLQueryStatusSource::getChildrenAllowNoNode(const std::shared_ptr<zkutil::ZooKeeper> & zookeeper, const String & node_path) -{ - Strings res; - Coordination::Error code = zookeeper->tryGetChildren(node_path, res); - if (code != Coordination::Error::ZOK && code != Coordination::Error::ZNONODE) - throw Coordination::Exception::fromPath(code, node_path); - return res; -} - Strings DDLQueryStatusSource::getNewAndUpdate(const Strings & current_list_of_finished_hosts) { Strings diff; diff --git a/src/Interpreters/sortBlock.cpp b/src/Interpreters/sortBlock.cpp index 89c4220ccdf..d75786f33b9 100644 --- a/src/Interpreters/sortBlock.cpp +++ b/src/Interpreters/sortBlock.cpp @@ -4,6 +4,7 @@ #include #include #include +#include <Common/iota.h> #ifdef __SSE2__ #include @@ -155,8 +156,7 @@ void getBlockSortPermutationImpl(const Block & block, const SortDescription & de { size_t size = block.rows(); permutation.resize(size); - for (size_t i = 0; i < size; ++i) - permutation[i] = i; + iota(permutation.data(), size, IColumn::Permutation::value_type(0)); if (limit >= size) limit = 0; diff --git a/src/Interpreters/tests/gtest_filecache.cpp b/src/Interpreters/tests/gtest_filecache.cpp index 1005e6090b8..3e061db4f56 100644 --- a/src/Interpreters/tests/gtest_filecache.cpp +++ b/src/Interpreters/tests/gtest_filecache.cpp @@ -11,6 +11,7 @@ #include #include +#include <Common/iota.h> #include #include #include @@ -788,7 +789,7 @@ TEST_F(FileCacheTest, writeBuffer) /// get random permutation of indexes std::vector<size_t> indexes(data.size()); - std::iota(indexes.begin(), indexes.end(), 0); + iota(indexes.data(), indexes.size(), size_t(0)); std::shuffle(indexes.begin(), indexes.end(), rng); for (auto i : indexes)
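For orientation, the `iota` used in both hunks above is ClickHouse's own helper rather than `std::iota`; the signature below is an assumption inferred from the call sites, a minimal sketch only:

// Assumed contract of DB::iota (sketch inferred from the call sites above, not the real implementation):
template <typename T>
void iota(T * data, size_t count, T first_value)
{
    /// Fills data[0..count) with first_value, first_value + 1, ...
    for (size_t i = 0; i < count; ++i)
        data[i] = first_value + T(i);
}
// e.g. the removed loop `for (size_t i = 0; i < size; ++i) permutation[i] = i;`
// becomes `iota(permutation.data(), size, IColumn::Permutation::value_type(0));`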
diff --git a/src/Planner/CollectTableExpressionData.cpp b/src/Planner/CollectTableExpressionData.cpp index 5ba318dab6a..78a7c7074c3 100644 --- a/src/Planner/CollectTableExpressionData.cpp +++ b/src/Planner/CollectTableExpressionData.cpp @@ -8,6 +8,8 @@ #include #include #include +#include +#include #include #include @@ -33,6 +35,28 @@ public: void visitImpl(QueryTreeNodePtr & node) { + /// Special case for USING clause which contains references to ALIAS columns. + /// We can not modify such ColumnNode. + if (auto * join_node = node->as<JoinNode>()) + { + if (!join_node->isUsingJoinExpression()) + return; + + auto & using_list = join_node->getJoinExpression()->as<ListNode &>(); + for (auto & using_element : using_list) + { + auto & column_node = using_element->as<ColumnNode &>(); + /// This list contains column nodes from left and right tables. + auto & columns_from_subtrees = column_node.getExpressionOrThrow()->as<ListNode &>().getNodes(); + + /// Visit left table column node. + visitUsingColumn(columns_from_subtrees[0]); + /// Visit right table column node. + visitUsingColumn(columns_from_subtrees[1]); + } + return; + } + auto * column_node = node->as<ColumnNode>(); if (!column_node) return; @@ -55,7 +79,13 @@ public: if (column_node->hasExpression() && column_source_node_type != QueryTreeNodeType::ARRAY_JOIN) { /// Replace ALIAS column with expression - table_expression_data.addAliasColumnName(column_node->getColumnName()); + bool column_already_exists = table_expression_data.hasColumn(column_node->getColumnName()); + if (!column_already_exists) + { + auto column_identifier = planner_context.getGlobalPlannerContext()->createColumnIdentifier(node); + table_expression_data.addAliasColumnName(column_node->getColumnName(), column_identifier); + } + node = column_node->getExpression(); visitImpl(node); return; @@ -78,13 +108,38 @@ public: table_expression_data.addColumn(column_node->getColumn(), column_identifier); } - static bool needChildVisit(const QueryTreeNodePtr &, const QueryTreeNodePtr & child_node) + static bool needChildVisit(const QueryTreeNodePtr & parent, const QueryTreeNodePtr & child_node) { + if (auto * join_node = parent->as<JoinNode>()) + { + if (join_node->getJoinExpression() == child_node && join_node->isUsingJoinExpression()) + return false; + } auto child_node_type = child_node->getNodeType(); return !(child_node_type == QueryTreeNodeType::QUERY || child_node_type == QueryTreeNodeType::UNION); } private: + + void visitUsingColumn(QueryTreeNodePtr & node) + { + auto & column_node = node->as<ColumnNode &>(); + if (column_node.hasExpression()) + { + auto & table_expression_data = planner_context.getOrCreateTableExpressionData(column_node.getColumnSource()); + bool column_already_exists = table_expression_data.hasColumn(column_node.getColumnName()); + if (column_already_exists) + return; + + auto column_identifier = planner_context.getGlobalPlannerContext()->createColumnIdentifier(node); + table_expression_data.addAliasColumnName(column_node.getColumnName(), column_identifier); + + visitImpl(column_node.getExpressionOrThrow()); + } + else + visitImpl(node); + } + PlannerContext & planner_context; };
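A hypothetical query that exercises this code path (schema invented for illustration, not taken from the patch):

// CREATE TABLE t1 (id UInt64, id2 ALIAS id + 1) ENGINE = MergeTree ORDER BY id;
// CREATE TABLE t2 (id UInt64, id2 ALIAS id + 1) ENGINE = MergeTree ORDER BY id;
// SELECT * FROM t1 JOIN t2 USING (id2);
//
// The USING element `id2` resolves to a pair of ColumnNodes (left and right side),
// each carrying the ALIAS expression `id + 1`. visitUsingColumn() registers that
// expression with the table expression data instead of rewriting the node in place,
// so the planner can later compute the join key even though no physical `id2` exists.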
+ column.name; + else + column_identifier = column.name; auto [it, inserted] = column_identifiers.emplace(column_identifier); assert(inserted); diff --git a/src/Planner/PlannerJoinTree.cpp b/src/Planner/PlannerJoinTree.cpp index e2cdf146a69..f6569d998f1 100644 --- a/src/Planner/PlannerJoinTree.cpp +++ b/src/Planner/PlannerJoinTree.cpp @@ -645,7 +645,12 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres max_threads_execute_query = 1; } - if (max_block_size_limited < select_query_info.local_storage_limits.local_limits.size_limits.max_rows) + if (select_query_info.local_storage_limits.local_limits.size_limits.max_rows != 0) + { + if (max_block_size_limited < select_query_info.local_storage_limits.local_limits.size_limits.max_rows) + table_expression_query_info.limit = max_block_size_limited; + } + else { table_expression_query_info.limit = max_block_size_limited; } @@ -812,7 +817,7 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres } } - const auto & table_expression_alias = table_expression->getAlias(); + const auto & table_expression_alias = table_expression->getOriginalAlias(); auto additional_filters_info = buildAdditionalFiltersIfNeeded(storage, table_expression_alias, table_expression_query_info, planner_context); add_filter(additional_filters_info, "additional filter"); @@ -978,6 +983,57 @@ void joinCastPlanColumnsToNullable(QueryPlan & plan_to_add_cast, PlannerContextP plan_to_add_cast.addStep(std::move(cast_join_columns_step)); } +/// Actions to calculate table columns that have a functional representation (ALIASes and subcolumns) +/// and used in USING clause of JOIN expression. +struct UsingAliasKeyActions +{ + UsingAliasKeyActions( + const ColumnsWithTypeAndName & left_plan_output_columns, + const ColumnsWithTypeAndName & right_plan_output_columns + ) + : left_alias_columns_keys(std::make_shared(left_plan_output_columns)) + , right_alias_columns_keys(std::make_shared(right_plan_output_columns)) + {} + + void addLeftColumn(QueryTreeNodePtr & node, const ColumnsWithTypeAndName & plan_output_columns, const PlannerContextPtr & planner_context) + { + addColumnImpl(left_alias_columns_keys, node, plan_output_columns, planner_context); + } + + void addRightColumn(QueryTreeNodePtr & node, const ColumnsWithTypeAndName & plan_output_columns, const PlannerContextPtr & planner_context) + { + addColumnImpl(right_alias_columns_keys, node, plan_output_columns, planner_context); + } + + ActionsDAGPtr getLeftActions() + { + left_alias_columns_keys->projectInput(); + return std::move(left_alias_columns_keys); + } + + ActionsDAGPtr getRightActions() + { + right_alias_columns_keys->projectInput(); + return std::move(right_alias_columns_keys); + } + +private: + void addColumnImpl(ActionsDAGPtr & alias_columns_keys, QueryTreeNodePtr & node, const ColumnsWithTypeAndName & plan_output_columns, const PlannerContextPtr & planner_context) + { + auto & column_node = node->as(); + if (column_node.hasExpression()) + { + auto dag = buildActionsDAGFromExpressionNode(column_node.getExpressionOrThrow(), plan_output_columns, planner_context); + const auto & left_inner_column_identifier = planner_context->getColumnNodeIdentifierOrThrow(node); + dag->addOrReplaceInOutputs(dag->addAlias(*dag->getOutputs().front(), left_inner_column_identifier)); + alias_columns_keys->mergeInplace(std::move(*dag)); + } + } + + ActionsDAGPtr left_alias_columns_keys; + ActionsDAGPtr right_alias_columns_keys; +}; + JoinTreeQueryPlan buildQueryPlanForJoinNode(const 
 JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_expression, JoinTreeQueryPlan left_join_tree_query_plan, JoinTreeQueryPlan right_join_tree_query_plan, @@ -1002,6 +1058,18 @@ JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_ auto right_plan = std::move(right_join_tree_query_plan.query_plan); auto right_plan_output_columns = right_plan.getCurrentDataStream().header.getColumnsWithTypeAndName(); JoinClausesAndActions join_clauses_and_actions; JoinKind join_kind = join_node.getKind(); JoinStrictness join_strictness = join_node.getStrictness(); @@ -1034,6 +1102,8 @@ JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_ if (join_node.isUsingJoinExpression()) { + UsingAliasKeyActions using_alias_key_actions{left_plan_output_columns, right_plan_output_columns}; + auto & join_node_using_columns_list = join_node.getJoinExpression()->as<ListNode &>(); for (auto & join_node_using_node : join_node_using_columns_list.getNodes()) { @@ -1043,9 +1113,13 @@ JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_ auto & left_inner_column_node = inner_columns_list.getNodes().at(0); auto & left_inner_column = left_inner_column_node->as<ColumnNode &>(); + using_alias_key_actions.addLeftColumn(left_inner_column_node, left_plan_output_columns, planner_context); + auto & right_inner_column_node = inner_columns_list.getNodes().at(1); auto & right_inner_column = right_inner_column_node->as<ColumnNode &>(); + using_alias_key_actions.addRightColumn(right_inner_column_node, right_plan_output_columns, planner_context); + const auto & join_node_using_column_node_type = join_node_using_column_node.getColumnType(); if (!left_inner_column.getColumnType()->equals(*join_node_using_column_node_type)) { @@ -1059,6 +1133,14 @@ JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_ right_plan_column_name_to_cast_type.emplace(right_inner_column_identifier, join_node_using_column_node_type); } } + + auto left_alias_columns_keys_step = std::make_unique<ExpressionStep>(left_plan.getCurrentDataStream(), using_alias_key_actions.getLeftActions()); + left_alias_columns_keys_step->setStepDescription("Actions for left table alias column keys"); + left_plan.addStep(std::move(left_alias_columns_keys_step)); + + auto right_alias_columns_keys_step = std::make_unique<ExpressionStep>(right_plan.getCurrentDataStream(), using_alias_key_actions.getRightActions()); + right_alias_columns_keys_step->setStepDescription("Actions for right table alias column keys"); + right_plan.addStep(std::move(right_alias_columns_keys_step)); } auto join_cast_plan_output_nodes = [&](QueryPlan & plan_to_add_cast, std::unordered_map<std::string, DataTypePtr> & plan_column_name_to_cast_type)
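Sketched shape of the resulting plan for the USING-on-ALIAS case (shape only; real plans contain more steps):

// Join
// ├── Expression ("Actions for left table alias column keys")   <- using_alias_key_actions.getLeftActions()
// │   └── ReadFromMergeTree (left table)
// └── Expression ("Actions for right table alias column keys")  <- using_alias_key_actions.getRightActions()
//     └── ReadFromMergeTree (right table)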
diff --git a/src/Planner/PlannerJoins.cpp b/src/Planner/PlannerJoins.cpp index 5e9de4dedcf..9b249d21a24 100644 --- a/src/Planner/PlannerJoins.cpp +++ b/src/Planner/PlannerJoins.cpp @@ -20,6 +20,7 @@ #include #include +#include #include #include #include @@ -113,41 +114,96 @@ String JoinClause::dump() const namespace { -std::optional<JoinTableSide> extractJoinTableSideFromExpression(const ActionsDAG::Node * expression_root_node, - const std::unordered_set<const ActionsDAG::Node *> & join_expression_dag_input_nodes, - const NameSet & left_table_expression_columns_names, - const NameSet & right_table_expression_columns_names, +using TableExpressionSet = std::unordered_set<const IQueryTreeNode *>; + +TableExpressionSet extractTableExpressionsSet(const QueryTreeNodePtr & node) +{ + TableExpressionSet res; + for (const auto & expr : extractTableExpressions(node, true)) + res.insert(expr.get()); + + return res; +} + +std::optional<JoinTableSide> extractJoinTableSideFromExpression( + const IQueryTreeNode * expression_root_node, + const TableExpressionSet & left_table_expressions, + const TableExpressionSet & right_table_expressions, const JoinNode & join_node) { std::optional<JoinTableSide> table_side; - std::vector<const ActionsDAG::Node *> nodes_to_process; + std::vector<const IQueryTreeNode *> nodes_to_process; nodes_to_process.push_back(expression_root_node); while (!nodes_to_process.empty()) { const auto * node_to_process = nodes_to_process.back(); nodes_to_process.pop_back(); - for (const auto & child : node_to_process->children) - nodes_to_process.push_back(child); - if (!join_expression_dag_input_nodes.contains(node_to_process)) + if (const auto * function_node = node_to_process->as<FunctionNode>()) + { + for (const auto & child : function_node->getArguments()) + nodes_to_process.push_back(child.get()); + + continue; + } + + const auto * column_node = node_to_process->as<ColumnNode>(); + if (!column_node) continue; - const auto & input_name = node_to_process->result_name; - bool left_table_expression_contains_input = left_table_expression_columns_names.contains(input_name); - bool right_table_expression_contains_input = right_table_expression_columns_names.contains(input_name); + const auto & input_name = column_node->getColumnName(); - if (!left_table_expression_contains_input && !right_table_expression_contains_input) + const auto * column_source = column_node->getColumnSource().get(); + if (!column_source) + throw Exception(ErrorCodes::LOGICAL_ERROR, "No source for column {} in JOIN {}", input_name, join_node.formatASTForErrorMessage());
+ bool is_column_from_left_expr = left_table_expressions.contains(column_source); + bool is_column_from_right_expr = right_table_expressions.contains(column_source); + + if (!is_column_from_left_expr && !is_column_from_right_expr) throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION, "JOIN {} actions has column {} that do not exist in left {} or right {} table expression columns", join_node.formatASTForErrorMessage(), - input_name, - boost::join(left_table_expression_columns_names, ", "), - boost::join(right_table_expression_columns_names, ", ")); + column_source->formatASTForErrorMessage(), + join_node.getLeftTableExpression()->formatASTForErrorMessage(), + join_node.getRightTableExpression()->formatASTForErrorMessage()); - auto input_table_side = left_table_expression_contains_input ? JoinTableSide::Left : JoinTableSide::Right; + auto input_table_side = is_column_from_left_expr ? JoinTableSide::Left : JoinTableSide::Right; if (table_side && (*table_side) != input_table_side) throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION, "JOIN {} join expression contains column from left and right table", @@ -159,29 +215,58 @@ std::optional<JoinTableSide> extractJoinTableSideFromExpression(const ActionsDAG return table_side; }
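Why the lookup switched from column names to column sources (my summary of the rewrite above):

// Name matching could not distinguish identically named columns:
//   SELECT * FROM t1 JOIN t2 ON t1.id = t2.id   -- both key columns are named "id"
// ColumnNode::getColumnSource() instead points at the concrete table expression node,
// so membership in left_table_expressions / right_table_expressions stays unambiguous
// even when names collide or the same expression text appears on both sides.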
-void buildJoinClause(ActionsDAGPtr join_expression_dag, - const std::unordered_set<const ActionsDAG::Node *> & join_expression_dag_input_nodes, - const ActionsDAG::Node * join_expressions_actions_node, - const NameSet & left_table_expression_columns_names, - const NameSet & right_table_expression_columns_names, +const ActionsDAG::Node * appendExpression( + ActionsDAGPtr & dag, + const QueryTreeNodePtr & expression, + const PlannerContextPtr & planner_context, + const JoinNode & join_node) +{ + PlannerActionsVisitor join_expression_visitor(planner_context); + auto join_expression_dag_node_raw_pointers = join_expression_visitor.visit(dag, expression); + if (join_expression_dag_node_raw_pointers.size() != 1) + throw Exception(ErrorCodes::LOGICAL_ERROR, + "JOIN {} ON clause contains multiple expressions", + join_node.formatASTForErrorMessage()); + + return join_expression_dag_node_raw_pointers[0]; +} + +void buildJoinClause( + ActionsDAGPtr & left_dag, + ActionsDAGPtr & right_dag, + const PlannerContextPtr & planner_context, + const QueryTreeNodePtr & join_expression, + const TableExpressionSet & left_table_expressions, + const TableExpressionSet & right_table_expressions, const JoinNode & join_node, JoinClause & join_clause) { std::string function_name; - if (join_expressions_actions_node->function) - function_name = join_expressions_actions_node->function->getName(); + auto * function_node = join_expression->as<FunctionNode>(); + if (function_node) + function_name = function_node->getFunction()->getName(); /// For 'and' function go into children if (function_name == "and") { - for (const auto & child : join_expressions_actions_node->children) + for (const auto & child : function_node->getArguments()) { - buildJoinClause(join_expression_dag, - join_expression_dag_input_nodes, + buildJoinClause( + left_dag, + right_dag, + planner_context, child, - left_table_expression_columns_names, - right_table_expression_columns_names, + left_table_expressions, + right_table_expressions, join_node, join_clause); } @@ -194,45 +279,49 @@ void buildJoinClause(ActionsDAGPtr join_expression_dag, if (function_name == "equals" || function_name == "isNotDistinctFrom" || is_asof_join_inequality) { - const auto * left_child = join_expressions_actions_node->children.at(0); - const auto * right_child = join_expressions_actions_node->children.at(1); + const auto left_child = function_node->getArguments().getNodes().at(0); + const auto right_child = function_node->getArguments().getNodes().at(1); - auto left_expression_side_optional = extractJoinTableSideFromExpression(left_child, - join_expression_dag_input_nodes, - left_table_expression_columns_names, - right_table_expression_columns_names, + auto left_expression_side_optional = extractJoinTableSideFromExpression(left_child.get(), + left_table_expressions, + right_table_expressions, join_node); - auto right_expression_side_optional = extractJoinTableSideFromExpression(right_child, - join_expression_dag_input_nodes, - left_table_expression_columns_names, - right_table_expression_columns_names, + auto right_expression_side_optional = extractJoinTableSideFromExpression(right_child.get(), + left_table_expressions, + right_table_expressions, join_node); if (!left_expression_side_optional && !right_expression_side_optional) { throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION, - "JOIN {} ON expression {} with constants is not supported", - join_node.formatASTForErrorMessage(), - join_expressions_actions_node->result_name); + "JOIN {} ON expression with constants is not supported", + join_node.formatASTForErrorMessage()); } else if (left_expression_side_optional && !right_expression_side_optional) { - join_clause.addCondition(*left_expression_side_optional, join_expressions_actions_node); + auto & dag = *left_expression_side_optional == JoinTableSide::Left ? left_dag : right_dag; + const auto * node = appendExpression(dag, join_expression, planner_context, join_node); + join_clause.addCondition(*left_expression_side_optional, node); } else if (!left_expression_side_optional && right_expression_side_optional) { - join_clause.addCondition(*right_expression_side_optional, join_expressions_actions_node); + auto & dag = *right_expression_side_optional == JoinTableSide::Left ?
left_dag : right_dag; + const auto * node = appendExpression(dag, join_expression, planner_context, join_node); + join_clause.addCondition(*right_expression_side_optional, node); } else { auto left_expression_side = *left_expression_side_optional; auto right_expression_side = *right_expression_side_optional; if (left_expression_side != right_expression_side) { - const ActionsDAG::Node * left_key = left_child; - const ActionsDAG::Node * right_key = right_child; + auto left_key = left_child; + auto right_key = right_child; if (left_expression_side == JoinTableSide::Right) { @@ -241,6 +330,9 @@ void buildJoinClause(ActionsDAGPtr join_expression_dag, asof_inequality = reverseASOFJoinInequality(asof_inequality); } + const auto * left_node = appendExpression(left_dag, left_key, planner_context, join_node); + const auto * right_node = appendExpression(right_dag, right_key, planner_context, join_node); + if (is_asof_join_inequality) { if (join_clause.hasASOF()) @@ -250,55 +342,66 @@ void buildJoinClause(ActionsDAGPtr join_expression_dag, join_node.formatASTForErrorMessage()); } - join_clause.addASOFKey(left_key, right_key, asof_inequality); + join_clause.addASOFKey(left_node, right_node, asof_inequality); } else { bool null_safe_comparison = function_name == "isNotDistinctFrom"; - join_clause.addKey(left_key, right_key, null_safe_comparison); + join_clause.addKey(left_node, right_node, null_safe_comparison); } } else { - join_clause.addCondition(left_expression_side, join_expressions_actions_node); + auto & dag = left_expression_side == JoinTableSide::Left ? left_dag : right_dag; + const auto * node = appendExpression(dag, join_expression, planner_context, join_node); + join_clause.addCondition(left_expression_side, node); } } return; } - auto expression_side_optional = extractJoinTableSideFromExpression(join_expressions_actions_node, - join_expression_dag_input_nodes, - left_table_expression_columns_names, - right_table_expression_columns_names, + auto expression_side_optional = extractJoinTableSideFromExpression( + join_expression.get(), + left_table_expressions, + right_table_expressions, join_node); if (!expression_side_optional) expression_side_optional = JoinTableSide::Right; auto expression_side = *expression_side_optional; - join_clause.addCondition(expression_side, join_expressions_actions_node); + auto & dag = expression_side == JoinTableSide::Left ?
left_dag : right_dag; + const auto * node = appendExpression(dag, join_expression, planner_context, join_node); + join_clause.addCondition(expression_side, node); }
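Condensed routing rule implemented by buildJoinClause (my paraphrase of the code above):

// side(expr) == Left only    -> appendExpression(left_dag, expr)  + addCondition(Left, node)
// side(expr) == Right only   -> appendExpression(right_dag, expr) + addCondition(Right, node)
// equality across both sides -> appendExpression on each side, then addKey / addASOFKey
// no side at all (constants) -> INVALID_JOIN_ON_EXPRESSION ("... with constants is not supported")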
-JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & join_expression_input_columns, +JoinClausesAndActions buildJoinClausesAndActions( const ColumnsWithTypeAndName & left_table_expression_columns, const ColumnsWithTypeAndName & right_table_expression_columns, const JoinNode & join_node, const PlannerContextPtr & planner_context) { - ActionsDAGPtr join_expression_actions = std::make_shared<ActionsDAG>(join_expression_input_columns); + ActionsDAGPtr left_join_actions = std::make_shared<ActionsDAG>(left_table_expression_columns); + ActionsDAGPtr right_join_actions = std::make_shared<ActionsDAG>(right_table_expression_columns); /** In ActionsDAG if input node has constant representation additional constant column is added. * That way we cannot simply check that node has INPUT type during resolution of expression join table side. * Put all nodes after actions dag initialization in set. * To check if actions dag node is input column, we check if set contains it. */ - const auto & join_expression_actions_nodes = join_expression_actions->getNodes(); - std::unordered_set<const ActionsDAG::Node *> join_expression_dag_input_nodes; - join_expression_dag_input_nodes.reserve(join_expression_actions_nodes.size()); - for (const auto & node : join_expression_actions_nodes) - join_expression_dag_input_nodes.insert(&node); /** It is possible to have constant value in JOIN ON section, that we need to ignore during DAG construction. * If we do not ignore it, this function will be replaced by underlying constant. @@ -308,6 +411,9 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & * ON (t1.id = t2.id) AND 1 != 1 AND (t1.value >= t1.value); */ auto join_expression = join_node.getJoinExpression(); + auto * constant_join_expression = join_expression->as<ConstantNode>(); if (constant_join_expression && constant_join_expression->hasSourceExpression()) @@ -319,18 +425,18 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & "JOIN {} join expression expected function", join_node.formatASTForErrorMessage()); - PlannerActionsVisitor join_expression_visitor(planner_context); - auto join_expression_dag_node_raw_pointers = join_expression_visitor.visit(join_expression_actions, join_expression); - if (join_expression_dag_node_raw_pointers.size() != 1) - throw Exception(ErrorCodes::LOGICAL_ERROR, - "JOIN {} ON clause contains multiple expressions", - join_node.formatASTForErrorMessage()); - const auto * join_expressions_actions_root_node = join_expression_dag_node_raw_pointers[0]; - if (!join_expressions_actions_root_node->function) - throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION, - "JOIN {} join expression expected function", - join_node.formatASTForErrorMessage()); size_t left_table_expression_columns_size = left_table_expression_columns.size(); @@ -360,21 +466,27 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & join_right_actions_names_set.insert(right_table_expression_column.name); } - JoinClausesAndActions result; - result.join_expression_actions = join_expression_actions; + auto join_left_table_expressions = extractTableExpressionsSet(join_node.getLeftTableExpression()); + auto join_right_table_expressions = extractTableExpressionsSet(join_node.getRightTableExpression()); - const auto & function_name = join_expressions_actions_root_node->function->getName(); + JoinClausesAndActions result; + + const auto & function_name = function_node->getFunction()->getName(); if (function_name == "or") { - for (const auto & child : join_expressions_actions_root_node->children) + for (const auto & child : function_node->getArguments()) { result.join_clauses.emplace_back(); - buildJoinClause(join_expression_actions, - join_expression_dag_input_nodes, + buildJoinClause( + left_join_actions, + right_join_actions, + planner_context, child, - join_left_actions_names_set, - join_right_actions_names_set, + join_left_table_expressions, + join_right_table_expressions,
join_node, result.join_clauses.back()); } @@ -383,11 +495,15 @@ { result.join_clauses.emplace_back(); - buildJoinClause(join_expression_actions, - join_expression_dag_input_nodes, - join_expressions_actions_root_node, - join_left_actions_names_set, - join_right_actions_names_set, + buildJoinClause( + left_join_actions, + right_join_actions, + planner_context, + join_expression, + join_left_table_expressions, + join_right_table_expressions, join_node, result.join_clauses.back()); } @@ -412,12 +528,12 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & const ActionsDAG::Node * dag_filter_condition_node = nullptr; if (left_filter_condition_nodes.size() > 1) - dag_filter_condition_node = &join_expression_actions->addFunction(and_function, left_filter_condition_nodes, {}); + dag_filter_condition_node = &left_join_actions->addFunction(and_function, left_filter_condition_nodes, {}); else dag_filter_condition_node = left_filter_condition_nodes[0]; join_clause.getLeftFilterConditionNodes() = {dag_filter_condition_node}; - join_expression_actions->addOrReplaceInOutputs(*dag_filter_condition_node); + left_join_actions->addOrReplaceInOutputs(*dag_filter_condition_node); add_necessary_name_if_needed(JoinTableSide::Left, dag_filter_condition_node->result_name); } @@ -428,12 +544,12 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & const ActionsDAG::Node * dag_filter_condition_node = nullptr; if (right_filter_condition_nodes.size() > 1) - dag_filter_condition_node = &join_expression_actions->addFunction(and_function, right_filter_condition_nodes, {}); + dag_filter_condition_node = &right_join_actions->addFunction(and_function, right_filter_condition_nodes, {}); else dag_filter_condition_node = right_filter_condition_nodes[0]; join_clause.getRightFilterConditionNodes() = {dag_filter_condition_node}; - join_expression_actions->addOrReplaceInOutputs(*dag_filter_condition_node); + right_join_actions->addOrReplaceInOutputs(*dag_filter_condition_node); add_necessary_name_if_needed(JoinTableSide::Right, dag_filter_condition_node->result_name); } @@ -470,10 +586,10 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & } if (!left_key_node->result_type->equals(*common_type)) - left_key_node = &join_expression_actions->addCast(*left_key_node, common_type, {}); + left_key_node = &left_join_actions->addCast(*left_key_node, common_type, {}); if (!right_key_node->result_type->equals(*common_type)) - right_key_node = &join_expression_actions->addCast(*right_key_node, common_type, {}); + right_key_node = &right_join_actions->addCast(*right_key_node, common_type, {}); } if (join_clause.isNullsafeCompareKey(i) && left_key_node->result_type->isNullable() && right_key_node->result_type->isNullable()) @@ -490,22 +606,29 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName & * SELECT * FROM t1 JOIN t2 ON tuple(t1.a) == tuple(t2.b) */ auto wrap_nullsafe_function = FunctionFactory::instance().get("tuple", planner_context->getQueryContext()); - left_key_node = &join_expression_actions->addFunction(wrap_nullsafe_function, {left_key_node}, {}); - right_key_node = &join_expression_actions->addFunction(wrap_nullsafe_function, {right_key_node}, {}); + left_key_node = &left_join_actions->addFunction(wrap_nullsafe_function, 
{left_key_node}, {}); + right_key_node = &right_join_actions->addFunction(wrap_nullsafe_function, {right_key_node}, {}); } - join_expression_actions->addOrReplaceInOutputs(*left_key_node); - join_expression_actions->addOrReplaceInOutputs(*right_key_node); + left_join_actions->addOrReplaceInOutputs(*left_key_node); + right_join_actions->addOrReplaceInOutputs(*right_key_node); add_necessary_name_if_needed(JoinTableSide::Left, left_key_node->result_name); add_necessary_name_if_needed(JoinTableSide::Right, right_key_node->result_name); } } - result.left_join_expressions_actions = join_expression_actions->clone(); + result.left_join_expressions_actions = left_join_actions->clone(); + result.left_join_tmp_expression_actions = std::move(left_join_actions); result.left_join_expressions_actions->removeUnusedActions(join_left_actions_names); - result.right_join_expressions_actions = join_expression_actions->clone(); + result.right_join_expressions_actions = right_join_actions->clone(); + result.right_join_tmp_expression_actions = std::move(right_join_actions); result.right_join_expressions_actions->removeUnusedActions(join_right_actions_names); return result; @@ -525,10 +648,10 @@ JoinClausesAndActions buildJoinClausesAndActions( "JOIN {} join does not have ON section", join_node_typed.formatASTForErrorMessage()); - auto join_expression_input_columns = left_table_expression_columns; - join_expression_input_columns.insert(join_expression_input_columns.end(), right_table_expression_columns.begin(), right_table_expression_columns.end()); - return buildJoinClausesAndActions(join_expression_input_columns, left_table_expression_columns, right_table_expression_columns, join_node_typed, planner_context); + return buildJoinClausesAndActions(left_table_expression_columns, right_table_expression_columns, join_node_typed, planner_context); } std::optional<bool> tryExtractConstantFromJoinNode(const QueryTreeNodePtr & join_node)
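Why both a `*_tmp_expression_actions` DAG and a pruned clone are stored (my interpretation of the ownership, not stated in the patch):

// JoinClause keeps raw ActionsDAG::Node pointers produced by appendExpression();
// those nodes live in left_join_actions / right_join_actions. Moving the DAGs into
// result.*_join_tmp_expression_actions keeps the pointed-to nodes alive, while the
// clones pruned with removeUnusedActions() are what actually gets executed per side.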
diff --git a/src/Planner/PlannerJoins.h b/src/Planner/PlannerJoins.h index 94f32e7ad51..7bc65cfb544 100644 --- a/src/Planner/PlannerJoins.h +++ b/src/Planner/PlannerJoins.h @@ -165,7 +165,8 @@ struct JoinClausesAndActions /// Join clauses. Actions dag nodes point into join_expression_actions. JoinClauses join_clauses; /// Whole JOIN ON section expressions - ActionsDAGPtr join_expression_actions; + ActionsDAGPtr left_join_tmp_expression_actions; + ActionsDAGPtr right_join_tmp_expression_actions; /// Left join expressions actions ActionsDAGPtr left_join_expressions_actions; /// Right join expressions actions diff --git a/src/Planner/TableExpressionData.h b/src/Planner/TableExpressionData.h index 9f963dc182a..f6ef4017c98 100644 --- a/src/Planner/TableExpressionData.h +++ b/src/Planner/TableExpressionData.h @@ -80,9 +80,11 @@ public: } /// Add alias column name - void addAliasColumnName(const std::string & column_name) + void addAliasColumnName(const std::string & column_name, const ColumnIdentifier & column_identifier) { alias_columns_names.insert(column_name); + + column_name_to_column_identifier.emplace(column_name, column_identifier); } /// Get alias columns names diff --git a/src/Planner/Utils.cpp b/src/Planner/Utils.cpp index 9a6ef6f5d83..ba29cab5956 100644 --- a/src/Planner/Utils.cpp +++ b/src/Planner/Utils.cpp @@ -357,6 +357,7 @@ QueryTreeNodePtr mergeConditionNodes(const QueryTreeNodes & condition_nodes, con QueryTreeNodePtr replaceTableExpressionsWithDummyTables(const QueryTreeNodePtr & query_node, const ContextPtr & context, ResultReplacementMap * result_replacement_map) { auto & query_node_typed = query_node->as<QueryNode &>(); @@ -406,6 +407,13 @@ QueryTreeNodePtr replaceTableExpressionsWithDummyTables(const QueryTreeNodePtr & if (result_replacement_map) result_replacement_map->emplace(table_expression, dummy_table_node); + dummy_table_node->setAlias(table_expression->getAlias()); + replacement_map.emplace(table_expression.get(), std::move(dummy_table_node)); } diff --git a/src/Processors/Formats/IInputFormat.cpp b/src/Processors/Formats/IInputFormat.cpp index e487a0054e7..3009e91c45a 100644 --- a/src/Processors/Formats/IInputFormat.cpp +++ b/src/Processors/Formats/IInputFormat.cpp @@ -1,6 +1,7 @@ #include #include - +#include <IO/WithFileName.h> +#include namespace DB { @@ -11,6 +12,21 @@ IInputFormat::IInputFormat(Block header, ReadBuffer * in_) column_mapping = std::make_shared<ColumnMapping>(); } +Chunk IInputFormat::generate() +{ + try + { + return read(); + } + catch (Exception & e) + { + auto file_name = getFileNameFromReadBuffer(getReadBuffer()); + if (!file_name.empty()) + e.addMessage(fmt::format("(in file/uri {})", file_name)); + throw; + } +} +
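Under the new contract a concrete format implements only read(); the base class attaches the file/URI context. A minimal hypothetical format (class name and behavior invented, sketch only):

// Returns a single empty chunk and then signals end-of-data:
class TrivialInputFormat final : public IInputFormat
{
public:
    TrivialInputFormat(Block header_, ReadBuffer & in_) : IInputFormat(std::move(header_), &in_) {}
    String getName() const override { return "Trivial"; }

private:
    Chunk read() override   // was `Chunk generate()` before this change
    {
        if (done)
            return {};
        done = true;
        return Chunk(getPort().getHeader().cloneEmptyColumns(), 0);
    }

    bool done = false;
};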
 void IInputFormat::resetParser() { chassert(in); diff --git a/src/Processors/Formats/IInputFormat.h b/src/Processors/Formats/IInputFormat.h index 6722f5ebebf..713c1089d28 100644 --- a/src/Processors/Formats/IInputFormat.h +++ b/src/Processors/Formats/IInputFormat.h @@ -27,6 +27,11 @@ public: /// ReadBuffer can be nullptr for random-access formats. IInputFormat(Block header, ReadBuffer * in_); + Chunk generate() override; + + /// All data reading from the read buffer must be performed by this method. + virtual Chunk read() = 0; + /** In some usecase (hello Kafka) we need to read a lot of tiny streams in exactly the same format. * The recreating of parser for each small stream takes too long, so we introduce a method * resetParser() which allow to reset the state of parser to continue reading of @@ -49,8 +54,9 @@ public: /// Must be called from ParallelParsingInputFormat before readPrefix void setColumnMapping(ColumnMappingPtr column_mapping_) { column_mapping = column_mapping_; } - size_t getCurrentUnitNumber() const { return current_unit_number; } - void setCurrentUnitNumber(size_t current_unit_number_) { current_unit_number = current_unit_number_; } + /// Set the number of rows that were already read in + /// parallel parsing before creating this parser. + virtual void setRowsReadBefore(size_t /*rows*/) {} void addBuffer(std::unique_ptr<ReadBuffer> buffer) { owned_buffers.emplace_back(std::move(buffer)); } @@ -72,9 +78,6 @@ protected: bool need_only_count = false; private: - /// Number of currently parsed chunk (if parallel parsing is enabled) - size_t current_unit_number = 0; - std::vector<std::unique_ptr<ReadBuffer>> owned_buffers; }; diff --git a/src/Processors/Formats/IRowInputFormat.cpp b/src/Processors/Formats/IRowInputFormat.cpp index 8c563b6f13b..5f27fa78c55 100644 --- a/src/Processors/Formats/IRowInputFormat.cpp +++ b/src/Processors/Formats/IRowInputFormat.cpp @@ -83,7 +83,7 @@ void IRowInputFormat::logError() errors_logger->logError(InputFormatErrorsLogger::ErrorEntry{now_time, total_rows, diagnostic, raw_data}); } -Chunk IRowInputFormat::generate() +Chunk IRowInputFormat::read() { if (total_rows == 0) { @@ -93,10 +93,6 @@ Chunk IRowInputFormat::generate() } catch (Exception & e) { - auto file_name = getFileNameFromReadBuffer(getReadBuffer()); - if (!file_name.empty()) - e.addMessage(fmt::format("(in file/uri {})", file_name)); - e.addMessage("(while reading header)"); throw; } @@ -132,8 +128,6 @@ Chunk IRowInputFormat::generate() { try { - ++total_rows; - info.read_columns.clear(); continue_reading = readRow(columns, info); @@ -148,6 +142,8 @@ Chunk IRowInputFormat::generate() } } + ++total_rows; + /// Some formats may read row AND say the read is finished. /// For such a case, get the number of rows from first column. if (!columns.empty()) @@ -162,6 +158,8 @@ Chunk IRowInputFormat::generate() } catch (Exception & e) { + ++total_rows; + /// Logic for possible skipping of errors. if (!isParseError(e.code())) @@ -204,27 +202,6 @@ Chunk IRowInputFormat::generate() } } } - catch (ParsingException & e) - { - String verbose_diagnostic; - try - { - verbose_diagnostic = getDiagnosticInfo(); - } - catch (const Exception & exception) - { - verbose_diagnostic = "Cannot get verbose diagnostic: " + exception.message(); - } - catch (...) // NOLINT(bugprone-empty-catch) - { - /// Error while trying to obtain verbose diagnostic. Ok to ignore. - } - - e.setFileName(getFileNameFromReadBuffer(getReadBuffer())); - e.setLineNumber(static_cast<int>(total_rows)); - e.addMessage(verbose_diagnostic); - throw; - } catch (Exception & e) { if (!isParseError(e.code())) @@ -244,10 +221,6 @@ Chunk IRowInputFormat::generate() /// Error while trying to obtain verbose diagnostic. Ok to ignore. 
} - auto file_name = getFileNameFromReadBuffer(getReadBuffer()); - if (!file_name.empty()) - e.addMessage(fmt::format("(in file/uri {})", file_name)); - e.addMessage(fmt::format("(at row {})\n", total_rows)); e.addMessage(verbose_diagnostic); throw; diff --git a/src/Processors/Formats/IRowInputFormat.h b/src/Processors/Formats/IRowInputFormat.h index 1b48647a224..f8796df8604 100644 --- a/src/Processors/Formats/IRowInputFormat.h +++ b/src/Processors/Formats/IRowInputFormat.h @@ -42,7 +42,7 @@ public: IRowInputFormat(Block header, ReadBuffer & in_, Params params_); - Chunk generate() override; + Chunk read() override; void resetParser() override; @@ -79,10 +79,12 @@ protected: const BlockMissingValues & getMissingValues() const override { return block_missing_values; } - size_t getTotalRows() const { return total_rows; } + size_t getRowNum() const { return total_rows; } size_t getApproxBytesReadForChunk() const override { return approx_bytes_read_for_chunk; } + void setRowsReadBefore(size_t rows) override { total_rows = rows; } + Serializations serializations; private: diff --git a/src/Processors/Formats/Impl/ArrowBlockInputFormat.cpp b/src/Processors/Formats/Impl/ArrowBlockInputFormat.cpp index bac6c540381..206e244c75f 100644 --- a/src/Processors/Formats/Impl/ArrowBlockInputFormat.cpp +++ b/src/Processors/Formats/Impl/ArrowBlockInputFormat.cpp @@ -28,7 +28,7 @@ ArrowBlockInputFormat::ArrowBlockInputFormat(ReadBuffer & in_, const Block & hea { } -Chunk ArrowBlockInputFormat::generate() +Chunk ArrowBlockInputFormat::read() { Chunk res; block_missing_values.clear(); @@ -64,7 +64,7 @@ Chunk ArrowBlockInputFormat::generate() { auto rows = file_reader->RecordBatchCountRows(record_batch_current++); if (!rows.ok()) - throw ParsingException( + throw Exception( ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading batch of Arrow data: {}", rows.status().ToString()); return getChunkForCount(*rows); } @@ -73,12 +73,12 @@ Chunk ArrowBlockInputFormat::generate() } if (!batch_result.ok()) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading batch of Arrow data: {}", batch_result.status().ToString()); auto table_result = arrow::Table::FromRecordBatches({*batch_result}); if (!table_result.ok()) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading batch of Arrow data: {}", table_result.status().ToString()); ++record_batch_current; @@ -213,7 +213,7 @@ std::optional<size_t> ArrowSchemaReader::readNumberOrRows() auto rows = file_reader->CountRows(); if (!rows.ok()) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading batch of Arrow data: {}", rows.status().ToString()); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading batch of Arrow data: {}", rows.status().ToString()); return *rows; } diff --git a/src/Processors/Formats/Impl/ArrowBlockInputFormat.h b/src/Processors/Formats/Impl/ArrowBlockInputFormat.h index 06a7b470312..cdbc5e57e4e 100644 --- a/src/Processors/Formats/Impl/ArrowBlockInputFormat.h +++ b/src/Processors/Formats/Impl/ArrowBlockInputFormat.h @@ -30,7 +30,7 @@ public: size_t getApproxBytesReadForChunk() const override { return approx_bytes_read_for_chunk; } private: - Chunk generate() override; + Chunk read() override; void onCancel() override { diff --git a/src/Processors/Formats/Impl/AvroRowInputFormat.cpp b/src/Processors/Formats/Impl/AvroRowInputFormat.cpp index 9841b5e70c6..46d1c426ef4 100644 --- 
a/src/Processors/Formats/Impl/AvroRowInputFormat.cpp +++ b/src/Processors/Formats/Impl/AvroRowInputFormat.cpp @@ -186,7 +186,7 @@ static AvroDeserializer::DeserializeFn createDecimalDeserializeFn(const avro::No tmp = decoder.decodeBytes(); if (tmp.size() > field_type_size || tmp.empty()) - throw ParsingException( + throw Exception( ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse type {}, expected non-empty binary data with size equal to or less than {}, got {}", target_type->getName(), @@ -274,7 +274,7 @@ AvroDeserializer::DeserializeFn AvroDeserializer::createDeserializeFn(const avro { decoder.decodeString(tmp); if (tmp.length() != 36) - throw ParsingException(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse uuid {}", tmp); + throw Exception(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse uuid {}", tmp); const UUID uuid = parseUUID({reinterpret_cast<const UInt8 *>(tmp.data()), tmp.length()}); assert_cast<DataTypeUUID::ColumnType &>(column).insertValue(uuid); @@ -530,7 +530,7 @@ AvroDeserializer::DeserializeFn AvroDeserializer::createDeserializeFn(const avro { decoder.decodeFixed(fixed_size, tmp); if (tmp.size() != 36) - throw ParsingException(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse UUID from type Fixed, because it's size ({}) is not equal to the size of UUID (36)", fixed_size); + throw Exception(ErrorCodes::CANNOT_PARSE_UUID, "Cannot parse UUID from type Fixed, because its size ({}) is not equal to the size of UUID (36)", fixed_size); const UUID uuid = parseUUID({reinterpret_cast<const UInt8 *>(tmp.data()), tmp.size()}); assert_cast<DataTypeUUID::ColumnType &>(column).insertValue(uuid); diff --git a/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.cpp b/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.cpp index b38aaa426fd..340bcc8aae5 100644 --- a/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.cpp +++ b/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.cpp @@ -1031,17 +1031,17 @@ fileSegmentationEngineBSONEachRow(ReadBuffer & in, DB::Memory<> & memory, size_t readBinaryLittleEndian(document_size, in); if (document_size < sizeof(document_size)) - throw ParsingException(ErrorCodes::INCORRECT_DATA, "Size of BSON document is invalid"); + throw Exception(ErrorCodes::INCORRECT_DATA, "Size of BSON document is invalid"); if (min_bytes != 0 && document_size > 10 * min_bytes) - throw ParsingException( + throw Exception( ErrorCodes::INCORRECT_DATA, "Size of BSON document is extremely large. Expected not greater than {} bytes, but current is {} bytes per row. 
Increase " "the value setting 'min_chunk_bytes_for_parallel_parsing' or check your data manually, most likely BSON is malformed", min_bytes, document_size); if (document_size < sizeof(document_size)) - throw ParsingException(ErrorCodes::INCORRECT_DATA, "Size of BSON document is invalid"); + throw Exception(ErrorCodes::INCORRECT_DATA, "Size of BSON document is invalid"); size_t old_size = memory.size(); memory.resize(old_size + document_size); diff --git a/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.h b/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.h index 5e8bee50963..a1f197557b4 100644 --- a/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.h +++ b/src/Processors/Formats/Impl/BSONEachRowRowInputFormat.h @@ -57,9 +57,6 @@ public: void resetParser() override; private: - void readPrefix() override { } - void readSuffix() override { } - bool readRow(MutableColumns & columns, RowReadExtension & ext) override; bool allowSyncAfterError() const override { return true; } void syncAfterError() override; diff --git a/src/Processors/Formats/Impl/DWARFBlockInputFormat.cpp b/src/Processors/Formats/Impl/DWARFBlockInputFormat.cpp index 4c3bb219415..43ef2521032 100644 --- a/src/Processors/Formats/Impl/DWARFBlockInputFormat.cpp +++ b/src/Processors/Formats/Impl/DWARFBlockInputFormat.cpp @@ -888,7 +888,7 @@ void DWARFBlockInputFormat::parseRanges( } } -Chunk DWARFBlockInputFormat::generate() +Chunk DWARFBlockInputFormat::read() { initializeIfNeeded(); diff --git a/src/Processors/Formats/Impl/DWARFBlockInputFormat.h b/src/Processors/Formats/Impl/DWARFBlockInputFormat.h index e1409dd3373..0345a264d47 100644 --- a/src/Processors/Formats/Impl/DWARFBlockInputFormat.h +++ b/src/Processors/Formats/Impl/DWARFBlockInputFormat.h @@ -30,7 +30,7 @@ public: size_t getApproxBytesReadForChunk() const override { return approx_bytes_read_for_chunk; } protected: - Chunk generate() override; + Chunk read() override; void onCancel() override { diff --git a/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.cpp b/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.cpp index 1c148f5b3d3..53cb5a77898 100644 --- a/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.cpp +++ b/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.cpp @@ -109,7 +109,7 @@ void JSONColumnsBlockInputFormatBase::setReadBuffer(ReadBuffer & in_) IInputFormat::setReadBuffer(in_); } -Chunk JSONColumnsBlockInputFormatBase::generate() +Chunk JSONColumnsBlockInputFormatBase::read() { MutableColumns columns = getPort().getHeader().cloneEmptyColumns(); block_missing_values.clear(); diff --git a/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.h b/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.h index 53d65bb3539..fe80d77cd87 100644 --- a/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.h +++ b/src/Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.h @@ -56,7 +56,7 @@ public: size_t getApproxBytesReadForChunk() const override { return approx_bytes_read_for_chunk; } protected: - Chunk generate() override; + Chunk read() override; size_t readColumn(IColumn & column, const DataTypePtr & type, const SerializationPtr & serialization, const String & column_name); diff --git a/src/Processors/Formats/Impl/JSONEachRowRowInputFormat.cpp b/src/Processors/Formats/Impl/JSONEachRowRowInputFormat.cpp index 95563fd2f62..0ef19a9c14f 100644 --- a/src/Processors/Formats/Impl/JSONEachRowRowInputFormat.cpp +++ b/src/Processors/Formats/Impl/JSONEachRowRowInputFormat.cpp @@ -142,7 +142,7 @@ 
inline bool JSONEachRowRowInputFormat::advanceToNextKey(size_t key_index) skipWhitespaceIfAny(*in); if (in->eof()) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected end of stream while parsing JSONEachRow format"); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected end of stream while parsing JSONEachRow format"); else if (*in->position() == '}') { ++in->position(); @@ -205,7 +205,7 @@ bool JSONEachRowRowInputFormat::readRow(MutableColumns & columns, RowReadExtensi return false; skipWhitespaceIfAny(*in); - bool is_first_row = getCurrentUnitNumber() == 0 && getTotalRows() == 1; + bool is_first_row = getRowNum() == 0; if (checkEndOfData(is_first_row)) return false; @@ -308,7 +308,7 @@ size_t JSONEachRowRowInputFormat::countRows(size_t max_block_size) return 0; size_t num_rows = 0; - bool is_first_row = getCurrentUnitNumber() == 0 && getTotalRows() == 0; + bool is_first_row = getRowNum() == 0; skipWhitespaceIfAny(*in); while (num_rows < max_block_size && !checkEndOfData(is_first_row)) { diff --git a/src/Processors/Formats/Impl/NativeFormat.cpp b/src/Processors/Formats/Impl/NativeFormat.cpp index 65ea87479a3..73ffc02bbc1 100644 --- a/src/Processors/Formats/Impl/NativeFormat.cpp +++ b/src/Processors/Formats/Impl/NativeFormat.cpp @@ -35,7 +35,7 @@ public: reader->resetParser(); } - Chunk generate() override + Chunk read() override { block_missing_values.clear(); size_t block_start = getDataOffsetMaybeCompressed(*in); diff --git a/src/Processors/Formats/Impl/NativeORCBlockInputFormat.cpp b/src/Processors/Formats/Impl/NativeORCBlockInputFormat.cpp index 4629127186a..2fa5c1d2850 100644 --- a/src/Processors/Formats/Impl/NativeORCBlockInputFormat.cpp +++ b/src/Processors/Formats/Impl/NativeORCBlockInputFormat.cpp @@ -905,7 +905,7 @@ bool NativeORCBlockInputFormat::prepareStripeReader() return true; } -Chunk NativeORCBlockInputFormat::generate() +Chunk NativeORCBlockInputFormat::read() { block_missing_values.clear(); diff --git a/src/Processors/Formats/Impl/NativeORCBlockInputFormat.h b/src/Processors/Formats/Impl/NativeORCBlockInputFormat.h index 6ea7a063e0d..a3ef9ed4b8f 100644 --- a/src/Processors/Formats/Impl/NativeORCBlockInputFormat.h +++ b/src/Processors/Formats/Impl/NativeORCBlockInputFormat.h @@ -62,7 +62,7 @@ public: size_t getApproxBytesReadForChunk() const override { return approx_bytes_read_for_chunk; } protected: - Chunk generate() override; + Chunk read() override; void onCancel() override { is_stopped = 1; } diff --git a/src/Processors/Formats/Impl/ORCBlockInputFormat.cpp b/src/Processors/Formats/Impl/ORCBlockInputFormat.cpp index 5cde51a4927..a41eacf26b7 100644 --- a/src/Processors/Formats/Impl/ORCBlockInputFormat.cpp +++ b/src/Processors/Formats/Impl/ORCBlockInputFormat.cpp @@ -27,7 +27,7 @@ ORCBlockInputFormat::ORCBlockInputFormat(ReadBuffer & in_, Block header_, const { } -Chunk ORCBlockInputFormat::generate() +Chunk ORCBlockInputFormat::read() { block_missing_values.clear(); @@ -48,7 +48,7 @@ Chunk ORCBlockInputFormat::generate() auto batch_result = file_reader->ReadStripe(stripe_current, include_indices); if (!batch_result.ok()) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Failed to create batch reader: {}", batch_result.status().ToString()); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Failed to create batch reader: {}", batch_result.status().ToString()); auto batch = batch_result.ValueOrDie(); if (!batch) @@ -56,7 +56,7 @@ Chunk ORCBlockInputFormat::generate() auto table_result = arrow::Table::FromRecordBatches({batch}); 
if (!table_result.ok()) - throw ParsingException( + throw Exception( ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading batch of ORC data: {}", table_result.status().ToString()); /// We should extract the number of rows directly from the stripe, because in case when diff --git a/src/Processors/Formats/Impl/ORCBlockInputFormat.h b/src/Processors/Formats/Impl/ORCBlockInputFormat.h index 4d878f85255..34630345849 100644 --- a/src/Processors/Formats/Impl/ORCBlockInputFormat.h +++ b/src/Processors/Formats/Impl/ORCBlockInputFormat.h @@ -32,7 +32,7 @@ public: size_t getApproxBytesReadForChunk() const override { return approx_bytes_read_for_chunk; } protected: - Chunk generate() override; + Chunk read() override; void onCancel() override { diff --git a/src/Processors/Formats/Impl/OneFormat.cpp b/src/Processors/Formats/Impl/OneFormat.cpp index 4a9c8caebf3..f190cce6425 100644 --- a/src/Processors/Formats/Impl/OneFormat.cpp +++ b/src/Processors/Formats/Impl/OneFormat.cpp @@ -23,7 +23,7 @@ OneInputFormat::OneInputFormat(const Block & header, ReadBuffer & in_) : IInputF header.getByPosition(0).type->getName()); } -Chunk OneInputFormat::generate() +Chunk OneInputFormat::read() { if (done) return {}; diff --git a/src/Processors/Formats/Impl/OneFormat.h b/src/Processors/Formats/Impl/OneFormat.h index f73b2dab66a..060b9b21def 100644 --- a/src/Processors/Formats/Impl/OneFormat.h +++ b/src/Processors/Formats/Impl/OneFormat.h @@ -14,7 +14,7 @@ public: String getName() const override { return "One"; } protected: - Chunk generate() override; + Chunk read() override; private: bool done = false; diff --git a/src/Processors/Formats/Impl/ParallelParsingInputFormat.cpp b/src/Processors/Formats/Impl/ParallelParsingInputFormat.cpp index 24f1bcde6aa..8b6969bbfcc 100644 --- a/src/Processors/Formats/Impl/ParallelParsingInputFormat.cpp +++ b/src/Processors/Formats/Impl/ParallelParsingInputFormat.cpp @@ -61,7 +61,7 @@ void ParallelParsingInputFormat::segmentatorThreadFunction(ThreadGroupPtr thread } catch (...) { - onBackgroundException(successfully_read_rows_count); + onBackgroundException(); } } @@ -90,7 +90,7 @@ void ParallelParsingInputFormat::parserThreadFunction(ThreadGroupPtr thread_grou ReadBuffer read_buffer(unit.segment.data(), unit.segment.size(), 0); InputFormatPtr input_format = internal_parser_creator(read_buffer); - input_format->setCurrentUnitNumber(current_ticket_number); + input_format->setRowsReadBefore(unit.offset); input_format->setErrorsLogger(errors_logger); InternalParser parser(input_format); @@ -132,28 +132,16 @@ void ParallelParsingInputFormat::parserThreadFunction(ThreadGroupPtr thread_grou } catch (...) 
{ - onBackgroundException(unit.offset); + onBackgroundException(); } } -void ParallelParsingInputFormat::onBackgroundException(size_t offset) +void ParallelParsingInputFormat::onBackgroundException() { std::lock_guard lock(mutex); if (!background_exception) - { background_exception = std::current_exception(); - if (ParsingException * e = exception_cast(background_exception)) - { - /// NOTE: it is not that safe to use line number hack here (may exceed INT_MAX) - if (e->getLineNumber() != -1) - e->setLineNumber(static_cast(e->getLineNumber() + offset)); - - auto file_name = getFileNameFromReadBuffer(getReadBuffer()); - if (!file_name.empty()) - e->setFileName(file_name); - } - } if (is_server) tryLogCurrentException(__PRETTY_FUNCTION__); @@ -164,7 +152,7 @@ void ParallelParsingInputFormat::onBackgroundException(size_t offset) segmentator_condvar.notify_all(); } -Chunk ParallelParsingInputFormat::generate() +Chunk ParallelParsingInputFormat::read() { /// Delayed launching of segmentator thread if (unlikely(!parsing_started.exchange(true))) diff --git a/src/Processors/Formats/Impl/ParallelParsingInputFormat.h b/src/Processors/Formats/Impl/ParallelParsingInputFormat.h index 8432e053eba..ff97afa8348 100644 --- a/src/Processors/Formats/Impl/ParallelParsingInputFormat.h +++ b/src/Processors/Formats/Impl/ParallelParsingInputFormat.h @@ -135,7 +135,7 @@ public: private: - Chunk generate() override final; + Chunk read() override final; void onCancel() override final { @@ -333,7 +333,7 @@ private: /// threads. This function is used by segmentator and parsed threads. /// readImpl() is called from the main thread, so the exception handling /// is different. - void onBackgroundException(size_t offset); + void onBackgroundException(); }; } diff --git a/src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp b/src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp index d37c2dc1160..62e576d4953 100644 --- a/src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp +++ b/src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp @@ -570,7 +570,7 @@ void ParquetBlockInputFormat::decodeOneChunk(size_t row_group_batch_idx, std::un // We may be able to schedule more work now, but can't call scheduleMoreWorkIfNeeded() right // here because we're running on the same thread pool, so it'll deadlock if thread limit is - // reached. Wake up generate() instead. + // reached. Wake up read() instead. 
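The two hunks above are the heart of this refactoring: instead of patching line numbers into a ParsingException after the fact, each parser thread's input format is now seeded with the number of rows that precede its segment (setRowsReadBefore(unit.offset)), so every row number it reports is already absolute. A minimal standalone model of that bookkeeping (the RowFormat type below is a hypothetical stand-in, not the real IInputFormat):

    #include <cstddef>
    #include <iostream>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for a row input format: it only tracks row positions.
    class RowFormat
    {
    public:
        // Called once per segment by the parallel parser, before any row is read.
        void setRowsReadBefore(size_t rows) { rows_read_before = rows; }

        // Absolute 0-based index of the next row across the whole input,
        // valid even though this instance only sees one segment.
        size_t getRowNum() const { return rows_read_before + rows_read_in_segment; }

        void parseRow(const std::string & row)
        {
            if (row == "bad")
                throw std::runtime_error("Cannot parse row " + std::to_string(getRowNum() + 1));
            ++rows_read_in_segment;
        }

    private:
        size_t rows_read_before = 0;
        size_t rows_read_in_segment = 0;
    };

    int main()
    {
        // Two segments as a parallel parser would hand them out; the second
        // segment starts at absolute row offset 3.
        std::vector<std::string> segment1 = {"a", "b", "c"};
        std::vector<std::string> segment2 = {"d", "bad", "f"};

        RowFormat f1, f2;
        f1.setRowsReadBefore(0);
        f2.setRowsReadBefore(segment1.size());

        try
        {
            for (const auto & r : segment1) f1.parseRow(r);
            for (const auto & r : segment2) f2.parseRow(r);
        }
        catch (const std::exception & e)
        {
            std::cout << e.what() << '\n'; // "Cannot parse row 5", already absolute
        }
    }

With the offset baked in at creation time, the background exception handler no longer needs to know where the failing segment started, which is why onBackgroundException() loses its offset parameter.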
condvar.notify_all(); }; @@ -579,7 +579,7 @@ void ParquetBlockInputFormat::decodeOneChunk(size_t row_group_batch_idx, std::un auto batch = row_group_batch.record_batch_reader->Next(); if (!batch.ok()) - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading Parquet data: {}", batch.status().ToString()); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while reading Parquet data: {}", batch.status().ToString()); if (!*batch) { @@ -637,7 +637,7 @@ void ParquetBlockInputFormat::scheduleMoreWorkIfNeeded(std::optional row } } -Chunk ParquetBlockInputFormat::generate() +Chunk ParquetBlockInputFormat::read() { initializeIfNeeded(); diff --git a/src/Processors/Formats/Impl/ParquetBlockInputFormat.h b/src/Processors/Formats/Impl/ParquetBlockInputFormat.h index 7fdf03a0606..b5b884b5efa 100644 --- a/src/Processors/Formats/Impl/ParquetBlockInputFormat.h +++ b/src/Processors/Formats/Impl/ParquetBlockInputFormat.h @@ -65,7 +65,7 @@ public: size_t getApproxBytesReadForChunk() const override { return previous_approx_bytes_read_for_chunk; } private: - Chunk generate() override; + Chunk read() override; void onCancel() override { @@ -142,7 +142,7 @@ private: // reading its data (using RAM). Row group becomes inactive when we finish reading and // delivering all its blocks and free the RAM. Size of the window is max_decoding_threads. // - // Decoded blocks are placed in `pending_chunks` queue, then picked up by generate(). + // Decoded blocks are placed in `pending_chunks` queue, then picked up by read(). // If row group decoding runs too far ahead of delivery (by `max_pending_chunks_per_row_group` // chunks), we pause the stream for the row group, to avoid using too much memory when decoded // chunks are much bigger than the compressed data. @@ -150,7 +150,7 @@ private: // Also: // * If preserve_order = true, we deliver chunks strictly in order of increasing row group. // Decoding may still proceed in later row groups. - // * If max_decoding_threads <= 1, we run all tasks inline in generate(), without thread pool. + // * If max_decoding_threads <= 1, we run all tasks inline in read(), without thread pool. // Potential improvements: // * Plan all read ranges ahead of time, for the whole file, and do prefetching for them @@ -189,7 +189,7 @@ private: Status status = Status::NotStarted; - // Window of chunks that were decoded but not returned from generate(): + // Window of chunks that were decoded but not returned from read(): // // (delivered) next_chunk_idx // v v v @@ -215,7 +215,7 @@ private: std::unique_ptr arrow_column_to_ch_column; }; - // Chunk ready to be delivered by generate(). + // Chunk ready to be delivered by read(). struct PendingChunk { Chunk chunk; @@ -265,7 +265,7 @@ private: // Done NotStarted std::mutex mutex; - // Wakes up the generate() call, if any. + // Wakes up the read() call, if any. 
std::condition_variable condvar; std::vector row_group_batches; diff --git a/src/Processors/Formats/Impl/ParquetMetadataInputFormat.cpp b/src/Processors/Formats/Impl/ParquetMetadataInputFormat.cpp index 1f81f5ac201..7fd6e93dd80 100644 --- a/src/Processors/Formats/Impl/ParquetMetadataInputFormat.cpp +++ b/src/Processors/Formats/Impl/ParquetMetadataInputFormat.cpp @@ -140,7 +140,7 @@ ParquetMetadataInputFormat::ParquetMetadataInputFormat(ReadBuffer & in_, Block h checkHeader(getPort().getHeader()); } -Chunk ParquetMetadataInputFormat::generate() +Chunk ParquetMetadataInputFormat::read() { Chunk res; if (done) diff --git a/src/Processors/Formats/Impl/ParquetMetadataInputFormat.h b/src/Processors/Formats/Impl/ParquetMetadataInputFormat.h index 2d027e5000f..1aa2d99ca76 100644 --- a/src/Processors/Formats/Impl/ParquetMetadataInputFormat.h +++ b/src/Processors/Formats/Impl/ParquetMetadataInputFormat.h @@ -63,7 +63,7 @@ public: void resetParser() override; private: - Chunk generate() override; + Chunk read() override; void onCancel() override { diff --git a/src/Processors/Formats/Impl/ProtobufListInputFormat.cpp b/src/Processors/Formats/Impl/ProtobufListInputFormat.cpp index 220a24b3c8c..2382b3cf27a 100644 --- a/src/Processors/Formats/Impl/ProtobufListInputFormat.cpp +++ b/src/Processors/Formats/Impl/ProtobufListInputFormat.cpp @@ -61,7 +61,7 @@ bool ProtobufListInputFormat::readRow(MutableColumns & columns, RowReadExtension size_t ProtobufListInputFormat::countRows(size_t max_block_size) { - if (getTotalRows() == 0) + if (getRowNum() == 0) reader->startMessage(true); if (reader->eof()) diff --git a/src/Processors/Formats/Impl/TSKVRowInputFormat.cpp b/src/Processors/Formats/Impl/TSKVRowInputFormat.cpp index f4f92583473..432e944a246 100644 --- a/src/Processors/Formats/Impl/TSKVRowInputFormat.cpp +++ b/src/Processors/Formats/Impl/TSKVRowInputFormat.cpp @@ -92,7 +92,7 @@ static bool readName(ReadBuffer & buf, StringRef & ref, String & tmp) } } - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected end of stream while reading key name from TSKV format"); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected end of stream while reading key name from TSKV format"); } @@ -161,7 +161,7 @@ bool TSKVRowInputFormat::readRow(MutableColumns & columns, RowReadExtension & ex if (in->eof()) { - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected end of stream after field in TSKV format: {}", name_ref.toString()); + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected end of stream after field in TSKV format: {}", name_ref.toString()); } else if (*in->position() == '\t') { diff --git a/src/Processors/Formats/Impl/TemplateRowInputFormat.cpp b/src/Processors/Formats/Impl/TemplateRowInputFormat.cpp index ede0426a0a2..a6e4600d83b 100644 --- a/src/Processors/Formats/Impl/TemplateRowInputFormat.cpp +++ b/src/Processors/Formats/Impl/TemplateRowInputFormat.cpp @@ -21,7 +21,7 @@ namespace ErrorCodes [[noreturn]] static void throwUnexpectedEof(size_t row_num) { - throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected EOF while parsing row {}. " + throw Exception(ErrorCodes::CANNOT_READ_ALL_DATA, "Unexpected EOF while parsing row {}. 
" "Maybe last row has wrong format or input doesn't contain specified suffix before EOF.", std::to_string(row_num)); } @@ -121,7 +121,7 @@ bool TemplateRowInputFormat::readRow(MutableColumns & columns, RowReadExtension updateDiagnosticInfo(); - if (likely(row_num != 1)) + if (likely(getRowNum() != 0)) format_reader->skipRowBetweenDelimiter(); extra.read_columns.assign(columns.size(), false); @@ -160,7 +160,7 @@ bool TemplateRowInputFormat::deserializeField(const DataTypePtr & type, catch (Exception & e) { if (e.code() == ErrorCodes::ATTEMPT_TO_READ_AFTER_EOF) - throwUnexpectedEof(row_num); + throwUnexpectedEof(getRowNum()); throw; } } @@ -198,7 +198,7 @@ bool TemplateRowInputFormat::parseRowAndPrintDiagnosticInfo(MutableColumns & col out << "\nUsing format string (from format_schema_rows): " << row_format.dump() << "\n"; out << "\nTrying to parse next row, because suffix does not match:\n"; - if (likely(row_num != 1) && !parseDelimiterWithDiagnosticInfo(out, *buf, row_between_delimiter, "delimiter between rows", ignore_spaces)) + if (likely(getRowNum() != 0) && !parseDelimiterWithDiagnosticInfo(out, *buf, row_between_delimiter, "delimiter between rows", ignore_spaces)) return false; for (size_t i = 0; i < row_format.columnsCount(); ++i) diff --git a/src/Processors/Formats/Impl/ValuesBlockInputFormat.cpp b/src/Processors/Formats/Impl/ValuesBlockInputFormat.cpp index 1a203302238..aa193ffd36a 100644 --- a/src/Processors/Formats/Impl/ValuesBlockInputFormat.cpp +++ b/src/Processors/Formats/Impl/ValuesBlockInputFormat.cpp @@ -98,7 +98,7 @@ bool ValuesBlockInputFormat::skipToNextRow(ReadBuffer * buf, size_t min_chunk_by return true; } -Chunk ValuesBlockInputFormat::generate() +Chunk ValuesBlockInputFormat::read() { if (total_rows == 0) readPrefix(); diff --git a/src/Processors/Formats/Impl/ValuesBlockInputFormat.h b/src/Processors/Formats/Impl/ValuesBlockInputFormat.h index 9ea7407f12d..bf2765bfd1e 100644 --- a/src/Processors/Formats/Impl/ValuesBlockInputFormat.h +++ b/src/Processors/Formats/Impl/ValuesBlockInputFormat.h @@ -58,7 +58,7 @@ private: using ConstantExpressionTemplates = std::vector>; - Chunk generate() override; + Chunk read() override; void readRow(MutableColumns & columns, size_t row_num); void readUntilTheEndOfRowAndReTokenize(size_t current_column_idx); diff --git a/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.cpp b/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.cpp index 6358a99d6b4..a56c24a740a 100644 --- a/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.cpp +++ b/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.cpp @@ -26,8 +26,6 @@ RowInputFormatWithDiagnosticInfo::RowInputFormatWithDiagnosticInfo(const Block & void RowInputFormatWithDiagnosticInfo::updateDiagnosticInfo() { - ++row_num; - bytes_read_at_start_of_buffer_on_prev_row = bytes_read_at_start_of_buffer_on_current_row; bytes_read_at_start_of_buffer_on_current_row = in->count() - in->offset(); @@ -73,7 +71,7 @@ std::pair RowInputFormatWithDiagnosticInfo::getDiagnosticAndRawD { in->position() = in->buffer().begin() + offset_of_prev_row; - out_diag << "\nRow " << (row_num - 1) << ":\n"; + out_diag << "\nRow " << getRowNum() - 1 << ":\n"; if (!parseRowAndPrintDiagnosticInfo(columns, out_diag)) return std::make_pair(out_diag.str(), out_data.str()); } @@ -96,7 +94,7 @@ std::pair RowInputFormatWithDiagnosticInfo::getDiagnosticAndRawD ++data; } - out_diag << "\nRow " << row_num << ":\n"; + out_diag << "\nRow " << getRowNum() << ":\n"; parseRowAndPrintDiagnosticInfo(columns, out_diag); 
out_diag << "\n"; @@ -193,7 +191,6 @@ bool RowInputFormatWithDiagnosticInfo::deserializeFieldAndPrintDiagnosticInfo(co void RowInputFormatWithDiagnosticInfo::resetParser() { IRowInputFormat::resetParser(); - row_num = 0; bytes_read_at_start_of_buffer_on_current_row = 0; bytes_read_at_start_of_buffer_on_prev_row = 0; offset_of_current_row = std::numeric_limits::max(); diff --git a/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.h b/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.h index 49793fcd208..f067ebd7583 100644 --- a/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.h +++ b/src/Processors/Formats/RowInputFormatWithDiagnosticInfo.h @@ -29,9 +29,6 @@ protected: virtual void tryDeserializeField(const DataTypePtr & type, IColumn & column, size_t file_column) = 0; virtual bool isGarbageAfterField(size_t after_input_pos_idx, ReadBuffer::Position pos) = 0; - /// For convenient diagnostics in case of an error. - size_t row_num = 0; - private: /// How many bytes were read, not counting those still in the buffer. size_t bytes_read_at_start_of_buffer_on_current_row = 0; diff --git a/src/Processors/Formats/RowInputFormatWithNamesAndTypes.cpp b/src/Processors/Formats/RowInputFormatWithNamesAndTypes.cpp index f7345848559..478ce41f924 100644 --- a/src/Processors/Formats/RowInputFormatWithNamesAndTypes.cpp +++ b/src/Processors/Formats/RowInputFormatWithNamesAndTypes.cpp @@ -66,11 +66,6 @@ RowInputFormatWithNamesAndTypes::RowInputFormatWithNamesAndTypes( void RowInputFormatWithNamesAndTypes::readPrefix() { - /// This is a bit of abstraction leakage, but we need it in parallel parsing: - /// we check if this InputFormat is working with the "real" beginning of the data. - if (getCurrentUnitNumber() != 0) - return; - /// Search and remove BOM only in textual formats (CSV, TSV etc), not in binary ones (RowBinary*). /// Also, we assume that column name or type cannot contain BOM, so, if format has header, /// then BOM at beginning of stream cannot be confused with name or type of field, and it is safe to skip it. 
@@ -206,7 +201,7 @@ bool RowInputFormatWithNamesAndTypes::readRow(MutableColumns & columns, RowReadE updateDiagnosticInfo(); - if (likely(row_num != 1 || getCurrentUnitNumber() != 0 || (getCurrentUnitNumber() == 0 && (with_names || with_types || is_header_detected)))) + if (likely(getRowNum() != 0 || with_names || with_types || is_header_detected)) format_reader->skipRowBetweenDelimiter(); format_reader->skipRowStartDelimiter(); @@ -270,7 +265,7 @@ size_t RowInputFormatWithNamesAndTypes::countRows(size_t max_block_size) return 0; size_t num_rows = 0; - bool is_first_row = getTotalRows() == 0 && !with_names && !with_types && !is_header_detected; + bool is_first_row = getRowNum() == 0 && !with_names && !with_types && !is_header_detected; while (!format_reader->checkForSuffix() && num_rows < max_block_size) { if (likely(!is_first_row)) @@ -323,7 +318,7 @@ bool RowInputFormatWithNamesAndTypes::parseRowAndPrintDiagnosticInfo(MutableColu if (!format_reader->tryParseSuffixWithDiagnosticInfo(out)) return false; - if (likely(row_num != 1) && !format_reader->parseRowBetweenDelimiterWithDiagnosticInfo(out)) + if (likely(getRowNum() != 0) && !format_reader->parseRowBetweenDelimiterWithDiagnosticInfo(out)) return false; if (!format_reader->parseRowStartWithDiagnosticInfo(out)) diff --git a/src/Processors/Merges/Algorithms/Graphite.h b/src/Processors/Merges/Algorithms/Graphite.h index 692e36d2eae..04bb4548c14 100644 --- a/src/Processors/Merges/Algorithms/Graphite.h +++ b/src/Processors/Merges/Algorithms/Graphite.h @@ -127,7 +127,12 @@ struct Pattern { hash.update(rule_type); hash.update(regexp_str); - hash.update(function->getName()); + if (function) + { + hash.update(function->getName()); + for (const auto & p : function->getParameters()) + hash.update(toString(p)); + } for (const auto & r : retentions) { hash.update(r.age); diff --git a/src/Processors/QueryPlan/Optimizations/optimizeUseAggregateProjection.cpp b/src/Processors/QueryPlan/Optimizations/optimizeUseAggregateProjection.cpp index c5e42e76653..d1f0c1ebe5e 100644 --- a/src/Processors/QueryPlan/Optimizations/optimizeUseAggregateProjection.cpp +++ b/src/Processors/QueryPlan/Optimizations/optimizeUseAggregateProjection.cpp @@ -436,7 +436,6 @@ AggregateProjectionCandidates getAggregateProjectionCandidates( AggregateProjectionCandidates candidates; const auto & parts = reading.getParts(); - const auto & query_info = reading.getQueryInfo(); const auto metadata = reading.getStorageMetadata(); ContextPtr context = reading.getContext(); @@ -481,8 +480,7 @@ AggregateProjectionCandidates getAggregateProjectionCandidates( auto block = reading.getMergeTreeData().getMinMaxCountProjectionBlock( metadata, candidate.dag->getRequiredColumnsNames(), - dag.filter_node != nullptr, - query_info, + (dag.filter_node ? dag.dag : nullptr), parts, max_added_blocks.get(), context); diff --git a/src/Processors/QueryPlan/ReadFromMergeTree.cpp b/src/Processors/QueryPlan/ReadFromMergeTree.cpp index aa1c463e4e6..68786bdec6c 100644 --- a/src/Processors/QueryPlan/ReadFromMergeTree.cpp +++ b/src/Processors/QueryPlan/ReadFromMergeTree.cpp @@ -23,6 +23,8 @@ #include #include #include +#include +#include #include #include #include @@ -418,7 +420,13 @@ Pipe ReadFromMergeTree::readFromPool( && settings.allow_prefetched_read_pool_for_local_filesystem && MergeTreePrefetchedReadPool::checkReadMethodAllowed(reader_settings.read_settings.local_fs_method); - if (allow_prefetched_remote || allow_prefetched_local) + /** Do not use prefetched read pool if query is trivial limit query. 
+ * Because the time spent filling per-thread tasks can be greater than the whole query + * execution time for big tables with a small limit. + */ + bool use_prefetched_read_pool = query_info.limit == 0 && (allow_prefetched_remote || allow_prefetched_local); + + if (use_prefetched_read_pool) { pool = std::make_shared( std::move(parts_with_range), @@ -1331,26 +1339,12 @@ static void buildIndexes( const Names & primary_key_column_names = primary_key.column_names; const auto & settings = context->getSettingsRef(); - if (settings.query_plan_optimize_primary_key) - { - NameSet array_join_name_set; - if (query_info.syntax_analyzer_result) - array_join_name_set = query_info.syntax_analyzer_result->getArrayJoinSourceNameSet(); - indexes.emplace(ReadFromMergeTree::Indexes{{ - filter_actions_dag, - context, - primary_key_column_names, - primary_key.expression}, {}, {}, {}, {}, false, {}}); - } - else - { - indexes.emplace(ReadFromMergeTree::Indexes{{ - query_info, - context, - primary_key_column_names, - primary_key.expression}, {}, {}, {}, {}, false, {}}); - } + indexes.emplace(ReadFromMergeTree::Indexes{{ + filter_actions_dag, + context, + primary_key_column_names, + primary_key.expression}, {}, {}, {}, {}, false, {}}); if (metadata_snapshot->hasPartitionKey()) { @@ -1363,11 +1357,7 @@ static void buildIndexes( } /// TODO Support row_policy_filter and additional_filters - if (settings.allow_experimental_analyzer) - indexes->part_values = MergeTreeDataSelectExecutor::filterPartsByVirtualColumns(data, parts, filter_actions_dag, context); - else - indexes->part_values = MergeTreeDataSelectExecutor::filterPartsByVirtualColumns(data, parts, query_info.query, context); - + indexes->part_values = MergeTreeDataSelectExecutor::filterPartsByVirtualColumns(data, parts, filter_actions_dag, context); MergeTreeDataSelectExecutor::buildKeyConditionFromPartOffset(indexes->part_offset_condition, filter_actions_dag, context); indexes->use_skip_indexes = settings.use_skip_indexes; @@ -1379,14 +1369,18 @@ static void buildIndexes( if (!indexes->use_skip_indexes) return; - const SelectQueryInfo * info = &query_info; std::optional info_copy; - if (settings.allow_experimental_analyzer) + auto get_query_info = [&]() -> const SelectQueryInfo & { - info_copy.emplace(query_info); - info_copy->filter_actions_dag = filter_actions_dag; - info = &*info_copy; - } + if (settings.allow_experimental_analyzer) + { + info_copy.emplace(query_info); + info_copy->filter_actions_dag = filter_actions_dag; + return *info_copy; + } + + return query_info; + }; std::unordered_set ignored_index_names; @@ -1427,14 +1421,30 @@ static void buildIndexes( if (inserted) { skip_indexes.merged_indices.emplace_back(); - skip_indexes.merged_indices.back().condition = index_helper->createIndexMergedCondition(*info, metadata_snapshot); + skip_indexes.merged_indices.back().condition = index_helper->createIndexMergedCondition(get_query_info(), metadata_snapshot); } skip_indexes.merged_indices[it->second].addIndex(index_helper); } else { - auto condition = index_helper->createIndexCondition(*info, context); + MergeTreeIndexConditionPtr condition; + if (index_helper->isVectorSearch()) + { +#ifdef ENABLE_ANNOY + if (const auto * annoy = typeid_cast(index_helper.get())) + condition = annoy->createIndexCondition(get_query_info(), context); +#endif +#ifdef ENABLE_USEARCH + if (const auto * usearch = typeid_cast(index_helper.get())) + condition = usearch->createIndexCondition(get_query_info(), context); +#endif + if (!condition) + throw 
Exception(ErrorCodes::LOGICAL_ERROR, "Unknown vector search index {}", index_helper->index.name); + } + else + condition = index_helper->createIndexCondition(filter_actions_dag, context); + if (!condition->alwaysUnknownOrTrue()) skip_indexes.useful_indices.emplace_back(index_helper, condition); } @@ -1467,34 +1477,15 @@ MergeTreeDataSelectAnalysisResultPtr ReadFromMergeTree::selectRangesToRead( Poco::Logger * log, std::optional & indexes) { - const auto & settings = context->getSettingsRef(); - if (settings.allow_experimental_analyzer || settings.query_plan_optimize_primary_key) - { - auto updated_query_info_with_filter_dag = query_info; - updated_query_info_with_filter_dag.filter_actions_dag = buildFilterDAG(context, prewhere_info, added_filter_nodes, query_info); - - return selectRangesToReadImpl( - std::move(parts), - std::move(alter_conversions), - metadata_snapshot_base, - metadata_snapshot, - updated_query_info_with_filter_dag, - context, - num_streams, - max_block_numbers_to_read, - data, - real_column_names, - sample_factor_column_queried, - log, - indexes); - } + auto updated_query_info_with_filter_dag = query_info; + updated_query_info_with_filter_dag.filter_actions_dag = buildFilterDAG(context, prewhere_info, added_filter_nodes, query_info); return selectRangesToReadImpl( std::move(parts), std::move(alter_conversions), metadata_snapshot_base, metadata_snapshot, - query_info, + updated_query_info_with_filter_dag, context, num_streams, max_block_numbers_to_read, diff --git a/src/Processors/QueryPlan/ReadFromPreparedSource.cpp b/src/Processors/QueryPlan/ReadFromPreparedSource.cpp index 798073f94d3..e7b170f0f91 100644 --- a/src/Processors/QueryPlan/ReadFromPreparedSource.cpp +++ b/src/Processors/QueryPlan/ReadFromPreparedSource.cpp @@ -30,19 +30,9 @@ void ReadFromStorageStep::applyFilters() if (!context) return; - std::shared_ptr key_condition; - if (!context->getSettingsRef().allow_experimental_analyzer) - { - for (const auto & processor : pipe.getProcessors()) - if (auto * source = dynamic_cast(processor.get())) - source->setKeyCondition(query_info, context); - } - else - { - for (const auto & processor : pipe.getProcessors()) - if (auto * source = dynamic_cast(processor.get())) - source->setKeyCondition(filter_nodes.nodes, context); - } + for (const auto & processor : pipe.getProcessors()) + if (auto * source = dynamic_cast(processor.get())) + source->setKeyCondition(filter_nodes.nodes, context); } } diff --git a/src/Processors/QueryPlan/ReadFromSystemNumbersStep.cpp b/src/Processors/QueryPlan/ReadFromSystemNumbersStep.cpp index ec43c647b77..aec959233ea 100644 --- a/src/Processors/QueryPlan/ReadFromSystemNumbersStep.cpp +++ b/src/Processors/QueryPlan/ReadFromSystemNumbersStep.cpp @@ -9,6 +9,7 @@ #include #include #include +#include #include namespace DB @@ -40,11 +41,10 @@ protected: auto column = ColumnUInt64::create(block_size); ColumnUInt64::Container & vec = column->getData(); - size_t curr = next; /// The local variable for some reason works faster (>20%) than member of class. + UInt64 curr = next; /// The local variable for some reason works faster (>20%) than member of class. UInt64 * pos = vec.data(); /// This also accelerates the code. 
UInt64 * end = &vec[block_size]; - while (pos < end) - *pos++ = curr++; + iota(pos, static_cast(end - pos), curr); next += step; @@ -211,17 +211,18 @@ protected: { auto start_value_64 = static_cast(start_value); auto end_value_64 = static_cast(end_value); - while (start_value_64 < end_value_64) - *(pos++) = start_value_64++; + auto size = end_value_64 - start_value_64; + iota(pos, static_cast(size), start_value_64); + pos += size; } }; if (can_provide > need) { UInt64 start_value = first_value(range) + cursor.offset_in_range; - UInt64 end_value = start_value + need; /// end_value will never overflow - while (start_value < end_value) - *(pos++) = start_value++; + /// end_value will never overflow + iota(pos, static_cast(need), start_value); + pos += need; provided += need; cursor.offset_in_range += need; diff --git a/src/Processors/SourceWithKeyCondition.h b/src/Processors/SourceWithKeyCondition.h index 9e641cc8c51..82d46eb74a4 100644 --- a/src/Processors/SourceWithKeyCondition.h +++ b/src/Processors/SourceWithKeyCondition.h @@ -16,33 +16,18 @@ protected: /// Represents pushed down filters in source std::shared_ptr key_condition; - void setKeyConditionImpl(const SelectQueryInfo & query_info, ContextPtr context, const Block & keys) - { - if (!context->getSettingsRef().allow_experimental_analyzer) - { - key_condition = std::make_shared( - query_info, - context, - keys.getNames(), - std::make_shared(std::make_shared(keys.getColumnsWithTypeAndName()))); - } - } - void setKeyConditionImpl(const ActionsDAG::NodeRawConstPtrs & nodes, ContextPtr context, const Block & keys) { - if (context->getSettingsRef().allow_experimental_analyzer) - { - std::unordered_map node_name_to_input_column; - for (const auto & column : keys.getColumnsWithTypeAndName()) - node_name_to_input_column.insert({column.name, column}); + std::unordered_map node_name_to_input_column; + for (const auto & column : keys.getColumnsWithTypeAndName()) + node_name_to_input_column.insert({column.name, column}); - auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(nodes, node_name_to_input_column, context); - key_condition = std::make_shared( - filter_actions_dag, - context, - keys.getNames(), - std::make_shared(std::make_shared(keys.getColumnsWithTypeAndName()))); - } + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(nodes, node_name_to_input_column, context); + key_condition = std::make_shared( + filter_actions_dag, + context, + keys.getNames(), + std::make_shared(std::make_shared(keys.getColumnsWithTypeAndName()))); } public: @@ -52,10 +37,7 @@ public: /// Set key_condition directly. It is used for filter push down in source. virtual void setKeyCondition(const std::shared_ptr & key_condition_) { key_condition = key_condition_; } - /// Set key_condition created by query_info and context. It is used for filter push down when allow_experimental_analyzer is false. - virtual void setKeyCondition(const SelectQueryInfo & /*query_info*/, ContextPtr /*context*/) { } - - /// Set key_condition created by nodes and context. It is used for filter push down when allow_experimental_analyzer is true. + /// Set key_condition created by nodes and context. 
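Several hunks above replace hand-written increment loops with a call to an iota helper (its exact ClickHouse signature is assumed here); std::iota demonstrates the same equivalence:

    #include <cassert>
    #include <cstdint>
    #include <numeric>
    #include <vector>

    int main()
    {
        const size_t n = 8;
        const uint64_t start = 100;

        // Hand-rolled loop, as the code looked before.
        std::vector<uint64_t> a(n);
        uint64_t curr = start;
        uint64_t * pos = a.data();
        uint64_t * end = pos + n;
        while (pos < end)
            *pos++ = curr++;

        // Same result with the standard sequential-fill algorithm; a dedicated
        // helper can additionally vectorize the common integer cases.
        std::vector<uint64_t> b(n);
        std::iota(b.begin(), b.end(), start);

        assert(a == b);
    }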
virtual void setKeyCondition(const ActionsDAG::NodeRawConstPtrs & /*nodes*/, ContextPtr /*context*/) { } }; } diff --git a/src/Processors/Transforms/PartialSortingTransform.cpp b/src/Processors/Transforms/PartialSortingTransform.cpp index 3fc9a4e71db..e79673f6645 100644 --- a/src/Processors/Transforms/PartialSortingTransform.cpp +++ b/src/Processors/Transforms/PartialSortingTransform.cpp @@ -1,7 +1,8 @@ -#include -#include #include +#include +#include #include +#include namespace DB { @@ -36,9 +37,7 @@ size_t getFilterMask(const ColumnRawPtrs & raw_block_columns, const Columns & th else { rows_to_compare.resize(num_rows); - - for (size_t i = 0; i < num_rows; ++i) - rows_to_compare[i] = i; + iota(rows_to_compare.data(), num_rows, UInt64(0)); size_t size = description.size(); for (size_t i = 0; i < size; ++i) diff --git a/src/Processors/Transforms/buildPushingToViewsChain.cpp b/src/Processors/Transforms/buildPushingToViewsChain.cpp index f85dc28f4c7..ab9b3a80f12 100644 --- a/src/Processors/Transforms/buildPushingToViewsChain.cpp +++ b/src/Processors/Transforms/buildPushingToViewsChain.cpp @@ -39,6 +39,7 @@ namespace DB namespace ErrorCodes { extern const int LOGICAL_ERROR; + extern const int UNKNOWN_TABLE; } ThreadStatusesHolder::~ThreadStatusesHolder() @@ -316,7 +317,21 @@ Chain buildPushingToViewsChain( type = QueryViewsLogElement::ViewType::MATERIALIZED; result_chain.addTableLock(lock); - StoragePtr inner_table = materialized_view->getTargetTable(); + StoragePtr inner_table = materialized_view->tryGetTargetTable(); + /// If the target table was dropped, ignore this materialized view. + if (!inner_table) + { + if (context->getSettingsRef().ignore_materialized_views_with_dropped_target_table) + continue; + + throw Exception( + ErrorCodes::UNKNOWN_TABLE, + "Target table '{}' of view '{}' doesn't exist. To ignore this view, use the setting " + "ignore_materialized_views_with_dropped_target_table", + materialized_view->getTargetTableId().getFullTableName(), + view_id.getFullTableName()); + } + auto inner_table_id = inner_table->getStorageID(); auto inner_metadata_snapshot = inner_table->getInMemoryMetadataPtr(); diff --git a/src/QueryPipeline/QueryPipelineBuilder.cpp b/src/QueryPipeline/QueryPipelineBuilder.cpp index a0fabe3273c..46c6a77f60f 100644 --- a/src/QueryPipeline/QueryPipelineBuilder.cpp +++ b/src/QueryPipeline/QueryPipelineBuilder.cpp @@ -1,14 +1,12 @@ #include -#include -#include -#include "Core/UUID.h" #include +#include +#include #include #include #include #include -#include #include #include #include @@ -25,11 +23,14 @@ #include #include #include -#include #include #include +#include #include #include +#include +#include +#include namespace DB { @@ -619,8 +620,7 @@ void QueryPipelineBuilder::addPipelineBefore(QueryPipelineBuilder pipeline) bool has_extremes = pipe.getExtremesPort(); size_t num_extra_ports = (has_totals ? 1 : 0) + (has_extremes ? 
1 : 0); IProcessor::PortNumbers delayed_streams(pipe.numOutputPorts() + num_extra_ports); - for (size_t i = 0; i < delayed_streams.size(); ++i) - delayed_streams[i] = i; + iota(delayed_streams.data(), delayed_streams.size(), IProcessor::PortNumbers::value_type(0)); auto * collected_processors = pipe.collected_processors; diff --git a/src/Server/HTTP/HTTPRequestHandler.h b/src/Server/HTTP/HTTPRequestHandler.h index 7902e86e3ed..19340866bb7 100644 --- a/src/Server/HTTP/HTTPRequestHandler.h +++ b/src/Server/HTTP/HTTPRequestHandler.h @@ -13,8 +13,7 @@ class HTTPRequestHandler : private boost::noncopyable public: virtual ~HTTPRequestHandler() = default; - virtual void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) = 0; - virtual void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { handleRequest(request, response, ProfileEvents::end()); } + virtual void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) = 0; }; } diff --git a/src/Server/HTTP/HTTPServer.cpp b/src/Server/HTTP/HTTPServer.cpp index 90bdebf6451..46734933263 100644 --- a/src/Server/HTTP/HTTPServer.cpp +++ b/src/Server/HTTP/HTTPServer.cpp @@ -10,10 +10,8 @@ HTTPServer::HTTPServer( HTTPRequestHandlerFactoryPtr factory_, Poco::ThreadPool & thread_pool, Poco::Net::ServerSocket & socket_, - Poco::Net::HTTPServerParams::Ptr params, - const ProfileEvents::Event & read_event, - const ProfileEvents::Event & write_event) - : TCPServer(new HTTPServerConnectionFactory(context, params, factory_, read_event, write_event), thread_pool, socket_, params), factory(factory_) + Poco::Net::HTTPServerParams::Ptr params) + : TCPServer(new HTTPServerConnectionFactory(context, params, factory_), thread_pool, socket_, params), factory(factory_) { } diff --git a/src/Server/HTTP/HTTPServer.h b/src/Server/HTTP/HTTPServer.h index 9911cde1b93..adfb21e7c62 100644 --- a/src/Server/HTTP/HTTPServer.h +++ b/src/Server/HTTP/HTTPServer.h @@ -20,9 +20,7 @@ public: HTTPRequestHandlerFactoryPtr factory, Poco::ThreadPool & thread_pool, Poco::Net::ServerSocket & socket, - Poco::Net::HTTPServerParams::Ptr params, - const ProfileEvents::Event & read_event_ = ProfileEvents::end(), - const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + Poco::Net::HTTPServerParams::Ptr params); ~HTTPServer() override; diff --git a/src/Server/HTTP/HTTPServerConnection.cpp b/src/Server/HTTP/HTTPServerConnection.cpp index 047db014560..042f5e2e5df 100644 --- a/src/Server/HTTP/HTTPServerConnection.cpp +++ b/src/Server/HTTP/HTTPServerConnection.cpp @@ -11,10 +11,8 @@ HTTPServerConnection::HTTPServerConnection( TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket, Poco::Net::HTTPServerParams::Ptr params_, - HTTPRequestHandlerFactoryPtr factory_, - const ProfileEvents::Event & read_event_, - const ProfileEvents::Event & write_event_) - : TCPServerConnection(socket), context(std::move(context_)), tcp_server(tcp_server_), params(params_), factory(factory_), read_event(read_event_), write_event(write_event_), stopped(false) + HTTPRequestHandlerFactoryPtr factory_) + : TCPServerConnection(socket), context(std::move(context_)), tcp_server(tcp_server_), params(params_), factory(factory_), stopped(false) { poco_check_ptr(factory); } @@ -32,7 +30,7 @@ void HTTPServerConnection::run() if (!stopped && tcp_server.isOpen() && session.connected()) { HTTPServerResponse response(session); - HTTPServerRequest request(context, response, session, read_event); + HTTPServerRequest 
request(context, response, session); Poco::Timestamp now; @@ -67,7 +65,7 @@ void HTTPServerConnection::run() if (request.getExpectContinue() && response.getStatus() == Poco::Net::HTTPResponse::HTTP_OK) response.sendContinue(); - handler->handleRequest(request, response, write_event); + handler->handleRequest(request, response); session.setKeepAlive(params->getKeepAlive() && response.getKeepAlive() && session.canKeepAlive()); } else diff --git a/src/Server/HTTP/HTTPServerConnection.h b/src/Server/HTTP/HTTPServerConnection.h index c6b1dc1ba25..7087f8d5a21 100644 --- a/src/Server/HTTP/HTTPServerConnection.h +++ b/src/Server/HTTP/HTTPServerConnection.h @@ -19,9 +19,7 @@ public: TCPServer & tcp_server, const Poco::Net::StreamSocket & socket, Poco::Net::HTTPServerParams::Ptr params, - HTTPRequestHandlerFactoryPtr factory, - const ProfileEvents::Event & read_event_ = ProfileEvents::end(), - const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + HTTPRequestHandlerFactoryPtr factory); HTTPServerConnection( HTTPContextPtr context_, @@ -29,10 +27,8 @@ public: const Poco::Net::StreamSocket & socket_, Poco::Net::HTTPServerParams::Ptr params_, HTTPRequestHandlerFactoryPtr factory_, - const String & forwarded_for_, - const ProfileEvents::Event & read_event_ = ProfileEvents::end(), - const ProfileEvents::Event & write_event_ = ProfileEvents::end()) - : HTTPServerConnection(context_, tcp_server_, socket_, params_, factory_, read_event_, write_event_) + const String & forwarded_for_) + : HTTPServerConnection(context_, tcp_server_, socket_, params_, factory_) { forwarded_for = forwarded_for_; } @@ -48,8 +44,6 @@ private: Poco::Net::HTTPServerParams::Ptr params; HTTPRequestHandlerFactoryPtr factory; String forwarded_for; - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; bool stopped; std::mutex mutex; // guards the |factory| with assumption that creating handlers is not thread-safe. 
}; diff --git a/src/Server/HTTP/HTTPServerConnectionFactory.cpp b/src/Server/HTTP/HTTPServerConnectionFactory.cpp index 16e5160fe3f..2c9ac0cda2a 100644 --- a/src/Server/HTTP/HTTPServerConnectionFactory.cpp +++ b/src/Server/HTTP/HTTPServerConnectionFactory.cpp @@ -5,20 +5,20 @@ namespace DB { HTTPServerConnectionFactory::HTTPServerConnectionFactory( - HTTPContextPtr context_, Poco::Net::HTTPServerParams::Ptr params_, HTTPRequestHandlerFactoryPtr factory_, const ProfileEvents::Event & read_event_, const ProfileEvents::Event & write_event_) - : context(std::move(context_)), params(params_), factory(factory_), read_event(read_event_), write_event(write_event_) + HTTPContextPtr context_, Poco::Net::HTTPServerParams::Ptr params_, HTTPRequestHandlerFactoryPtr factory_) + : context(std::move(context_)), params(params_), factory(factory_) { poco_check_ptr(factory); } Poco::Net::TCPServerConnection * HTTPServerConnectionFactory::createConnection(const Poco::Net::StreamSocket & socket, TCPServer & tcp_server) { - return new HTTPServerConnection(context, tcp_server, socket, params, factory, read_event, write_event); + return new HTTPServerConnection(context, tcp_server, socket, params, factory); } Poco::Net::TCPServerConnection * HTTPServerConnectionFactory::createConnection(const Poco::Net::StreamSocket & socket, TCPServer & tcp_server, TCPProtocolStackData & stack_data) { - return new HTTPServerConnection(context, tcp_server, socket, params, factory, stack_data.forwarded_for, read_event, write_event); + return new HTTPServerConnection(context, tcp_server, socket, params, factory, stack_data.forwarded_for); } } diff --git a/src/Server/HTTP/HTTPServerConnectionFactory.h b/src/Server/HTTP/HTTPServerConnectionFactory.h index 4b785e31744..e18249da4de 100644 --- a/src/Server/HTTP/HTTPServerConnectionFactory.h +++ b/src/Server/HTTP/HTTPServerConnectionFactory.h @@ -12,7 +12,7 @@ namespace DB class HTTPServerConnectionFactory : public TCPServerConnectionFactory { public: - HTTPServerConnectionFactory(HTTPContextPtr context, Poco::Net::HTTPServerParams::Ptr params, HTTPRequestHandlerFactoryPtr factory, const ProfileEvents::Event & read_event_ = ProfileEvents::end(), const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + HTTPServerConnectionFactory(HTTPContextPtr context, Poco::Net::HTTPServerParams::Ptr params, HTTPRequestHandlerFactoryPtr factory); Poco::Net::TCPServerConnection * createConnection(const Poco::Net::StreamSocket & socket, TCPServer & tcp_server) override; Poco::Net::TCPServerConnection * createConnection(const Poco::Net::StreamSocket & socket, TCPServer & tcp_server, TCPProtocolStackData & stack_data) override; @@ -21,8 +21,6 @@ private: HTTPContextPtr context; Poco::Net::HTTPServerParams::Ptr params; HTTPRequestHandlerFactoryPtr factory; - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; }; } diff --git a/src/Server/HTTP/HTTPServerRequest.cpp b/src/Server/HTTP/HTTPServerRequest.cpp index 4a6e85ba0fb..de5dde3c4aa 100644 --- a/src/Server/HTTP/HTTPServerRequest.cpp +++ b/src/Server/HTTP/HTTPServerRequest.cpp @@ -22,7 +22,7 @@ namespace DB { -HTTPServerRequest::HTTPServerRequest(HTTPContextPtr context, HTTPServerResponse & response, Poco::Net::HTTPServerSession & session, const ProfileEvents::Event & read_event) +HTTPServerRequest::HTTPServerRequest(HTTPContextPtr context, HTTPServerResponse & response, Poco::Net::HTTPServerSession & session) : max_uri_size(context->getMaxUriSize()) , max_fields_number(context->getMaxFields()) , 
max_field_name_size(context->getMaxFieldNameSize()) @@ -41,7 +41,7 @@ HTTPServerRequest::HTTPServerRequest(HTTPContextPtr context, HTTPServerResponse session.socket().setReceiveTimeout(receive_timeout); session.socket().setSendTimeout(send_timeout); - auto in = std::make_unique(session.socket(), read_event); + auto in = std::make_unique(session.socket()); socket = session.socket().impl(); readRequest(*in); /// Try parse according to RFC7230 diff --git a/src/Server/HTTP/HTTPServerRequest.h b/src/Server/HTTP/HTTPServerRequest.h index aaec89ab757..1f38334c745 100644 --- a/src/Server/HTTP/HTTPServerRequest.h +++ b/src/Server/HTTP/HTTPServerRequest.h @@ -4,7 +4,6 @@ #include #include #include -#include #include "config.h" #include @@ -20,7 +19,7 @@ class ReadBufferFromPocoSocket; class HTTPServerRequest : public HTTPRequest { public: - HTTPServerRequest(HTTPContextPtr context, HTTPServerResponse & response, Poco::Net::HTTPServerSession & session, const ProfileEvents::Event & read_event = ProfileEvents::end()); + HTTPServerRequest(HTTPContextPtr context, HTTPServerResponse & response, Poco::Net::HTTPServerSession & session); /// FIXME: it's a little bit inconvenient interface. The rationale is that all other ReadBuffer's wrap each other /// via unique_ptr - but we can't inherit HTTPServerRequest from ReadBuffer and pass it around, diff --git a/src/Server/HTTP/HTTPServerResponse.cpp b/src/Server/HTTP/HTTPServerResponse.cpp index 3c2d54a67df..25e7604a515 100644 --- a/src/Server/HTTP/HTTPServerResponse.cpp +++ b/src/Server/HTTP/HTTPServerResponse.cpp @@ -9,15 +9,12 @@ #include #include #include -#include namespace DB { -HTTPServerResponse::HTTPServerResponse(Poco::Net::HTTPServerSession & session_, const ProfileEvents::Event & write_event_) - : session(session_) - , write_event(write_event_) +HTTPServerResponse::HTTPServerResponse(Poco::Net::HTTPServerSession & session_) : session(session_) { } @@ -27,45 +24,42 @@ void HTTPServerResponse::sendContinue() hs << getVersion() << " 100 Continue\r\n\r\n"; } -std::shared_ptr HTTPServerResponse::send() +std::shared_ptr HTTPServerResponse::send() { poco_assert(!stream); if ((request && request->getMethod() == HTTPRequest::HTTP_HEAD) || getStatus() < 200 || getStatus() == HTTPResponse::HTTP_NO_CONTENT || getStatus() == HTTPResponse::HTTP_NOT_MODIFIED) { - // Send header - Poco::Net::HTTPHeaderOutputStream hs(session); - write(hs); - stream = std::make_shared(session.socket(), write_event); + Poco::CountingOutputStream cs; + write(cs); + stream = std::make_shared(session, cs.chars()); + write(*stream); } else if (getChunkedTransferEncoding()) { - // Send header Poco::Net::HTTPHeaderOutputStream hs(session); write(hs); - stream = std::make_shared(session.socket(), write_event); + stream = std::make_shared(session); } else if (hasContentLength()) { - // Send header - Poco::Net::HTTPHeaderOutputStream hs(session); - write(hs); - stream = std::make_shared(session.socket(), getContentLength(), write_event); + Poco::CountingOutputStream cs; + write(cs); + stream = std::make_shared(session, getContentLength64() + cs.chars()); + write(*stream); } else { + stream = std::make_shared(session); setKeepAlive(false); - // Send header - Poco::Net::HTTPHeaderOutputStream hs(session); - write(hs); - stream = std::make_shared(session.socket(), write_event); + write(*stream); } return stream; } -std::pair, std::shared_ptr> HTTPServerResponse::beginSend() +std::pair, std::shared_ptr> HTTPServerResponse::beginSend() { poco_assert(!stream); poco_assert(!header_stream); @@ -77,39 
+71,40 @@ std::pair, std::shared_ptr(session); + beginWrite(*header_stream); + stream = std::make_shared(session); + } + else if (hasContentLength()) { throw Poco::Exception("HTTPServerResponse::beginSend is invalid for response with Content-Length header"); } - - // Write header to buffer - std::stringstream header; //STYLE_CHECK_ALLOW_STD_STRING_STREAM - beginWrite(header); - // Send header - auto str = header.str(); - header_stream = std::make_shared(session.socket(), write_event, str.size()); - header_stream->write(str); - - if (getChunkedTransferEncoding()) - stream = std::make_shared(session.socket(), write_event); else - stream = std::make_shared(session.socket(), write_event); + { + stream = std::make_shared(session); + header_stream = stream; + setKeepAlive(false); + beginWrite(*stream); + } return std::make_pair(header_stream, stream); } void HTTPServerResponse::sendBuffer(const void * buffer, std::size_t length) { + poco_assert(!stream); + setContentLength(static_cast(length)); setChunkedTransferEncoding(false); - // Send header - Poco::Net::HTTPHeaderOutputStream hs(session); - write(hs); - hs.flush(); + stream = std::make_shared(session); + write(*stream); if (request && request->getMethod() != HTTPRequest::HTTP_HEAD) - WriteBufferFromPocoSocket(session.socket(), write_event).write(static_cast(buffer), length); + { + stream->write(static_cast(buffer), static_cast(length)); + } } void HTTPServerResponse::requireAuthentication(const std::string & realm) diff --git a/src/Server/HTTP/HTTPServerResponse.h b/src/Server/HTTP/HTTPServerResponse.h index 6efe48667eb..236a56e2323 100644 --- a/src/Server/HTTP/HTTPServerResponse.h +++ b/src/Server/HTTP/HTTPServerResponse.h @@ -1,12 +1,9 @@ #pragma once -#include #include #include #include -#include -#include #include @@ -14,182 +11,12 @@ namespace DB { - -class HTTPWriteBufferChunked : public WriteBufferFromPocoSocket -{ - using WriteBufferFromPocoSocket::WriteBufferFromPocoSocket; -protected: - void nextImpl() override - { - if (offset() == 0) - return; - - std::string chunk_header; - Poco::NumberFormatter::appendHex(chunk_header, offset()); - chunk_header.append("\r\n", 2); - socketSendBytes(chunk_header.data(), static_cast(chunk_header.size())); - WriteBufferFromPocoSocket::nextImpl(); - socketSendBytes("\r\n", 2); - } - - void finalizeImpl() override - { - WriteBufferFromPocoSocket::finalizeImpl(); - socketSendBytes("0\r\n\r\n", 5); - } -}; - -class HTTPWriteBufferFixedLength : public WriteBufferFromPocoSocket -{ -public: - explicit HTTPWriteBufferFixedLength(Poco::Net::Socket & socket_, size_t fixed_length_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - : WriteBufferFromPocoSocket(socket_, buf_size) - { - fixed_length = fixed_length_; - } - explicit HTTPWriteBufferFixedLength(Poco::Net::Socket & socket_, size_t fixed_length_, const ProfileEvents::Event & write_event_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - : WriteBufferFromPocoSocket(socket_, write_event_, buf_size) - { - fixed_length = fixed_length_; - } -protected: - void nextImpl() override - { - if (count_length >= fixed_length || offset() == 0) - return; - - if (count_length + offset() > fixed_length) - pos -= offset() - (fixed_length - count_length); - - count_length += offset(); - - WriteBufferFromPocoSocket::nextImpl(); - } -private: - size_t fixed_length; - size_t count_length = 0; -}; - -/// Universal HTTP buffer, can be switched for different Transfer-Encoding/Content-Length on the fly -/// so it can be used to output HTTP header and then switched to appropriate mode 
for body -class HTTPWriteBuffer : public WriteBufferFromPocoSocket -{ -public: - explicit HTTPWriteBuffer(Poco::Net::Socket & socket_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - : WriteBufferFromPocoSocket(socket_, buf_size) - { - } - explicit HTTPWriteBuffer(Poco::Net::Socket & socket_, const ProfileEvents::Event & write_event_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - : WriteBufferFromPocoSocket(socket_, write_event_, buf_size) - { - } - - void setChunked(size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - { - chunked = true; - resizeIfNeeded(buf_size); - } - - bool isChunked() - { - return chunked; - } - - void setFixedLength(size_t length) - { - chunked = false; - fixed_length = length; - count_length = 0; - resizeIfNeeded(length); - } - - size_t isFixedLength() - { - return chunked ? 0 : fixed_length; - } - - void setPlain(size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - { - chunked = false; - fixed_length = 0; - count_length = 0; - resizeIfNeeded(buf_size); - } - - bool isPlain() - { - return !(isChunked() || isFixedLength()); - } - -protected: - void finalizeImpl() override - { - WriteBufferFromPocoSocket::finalizeImpl(); - if (chunked) - socketSendBytes("0\r\n\r\n", 5); - } - - void nextImpl() override - { - if (chunked) - return nextImplChunked(); - - if (fixed_length) - return nextImplFixedLength(); - - WriteBufferFromPocoSocket::nextImpl(); - } - - void nextImplFixedLength() - { - if (count_length >= fixed_length || offset() == 0) - return; - - if (count_length + offset() > fixed_length) - pos -= offset() - (fixed_length - count_length); - - count_length += offset(); - - WriteBufferFromPocoSocket::nextImpl(); - } - - void nextImplChunked() - { - if (offset() == 0) - return; - - std::string chunk_header; - Poco::NumberFormatter::appendHex(chunk_header, offset()); - chunk_header.append("\r\n", 2); - socketSendBytes(chunk_header.data(), static_cast(chunk_header.size())); - WriteBufferFromPocoSocket::nextImpl(); - socketSendBytes("\r\n", 2); - } - - void resizeIfNeeded(size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE) - { - if (!buf_size) - return; - - auto data_size = offset(); - assert(data_size <= buf_size); - - memory.resize(buf_size); - set(memory.data(), memory.size(), data_size); - } -private: - bool chunked = false; - size_t fixed_length = 0; - size_t count_length = 0; -}; - - class HTTPServerRequest; class HTTPServerResponse : public HTTPResponse { public: - explicit HTTPServerResponse(Poco::Net::HTTPServerSession & session, const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + explicit HTTPServerResponse(Poco::Net::HTTPServerSession & session); void sendContinue(); /// Sends a 100 Continue response to the client. @@ -199,7 +26,7 @@ public: /// /// Must not be called after beginSend(), sendFile(), sendBuffer() /// or redirect() has been called. - std::shared_ptr send(); + std::shared_ptr send(); /// TODO: use some WriteBuffer implementation here. /// Sends the response headers to the client /// but do not finish headers with \r\n, @@ -207,7 +34,7 @@ public: /// /// Must not be called after send(), sendFile(), sendBuffer() /// or redirect() has been called. - std::pair, std::shared_ptr> beginSend(); + std::pair, std::shared_ptr> beginSend(); /// TODO: use some WriteBuffer implementation here. /// Sends the response header to the client, followed /// by the contents of the given buffer. @@ -231,16 +58,13 @@ public: /// Returns true if the response (header) has been sent. 
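The HTTPWriteBuffer removed above could switch between chunked, fixed-length, and plain framing on the fly; its chunked mode wrote each flush as the chunk size in hex plus CRLF, the data, then CRLF, and terminated the stream with 0\r\n\r\n in finalizeImpl(). A standalone sketch of just that wire format (illustrative only; the socket and buffer machinery is omitted):

    #include <cstdio>
    #include <iostream>
    #include <string>
    #include <string_view>

    // Frames one buffer flush as an HTTP/1.1 chunk: <hex size>\r\n<data>\r\n.
    std::string makeChunk(std::string_view data)
    {
        char header[32];
        int len = std::snprintf(header, sizeof(header), "%zx\r\n", data.size());
        std::string chunk(header, static_cast<size_t>(len));
        chunk.append(data);
        chunk.append("\r\n");
        return chunk;
    }

    int main()
    {
        std::string body;
        body += makeChunk("Hello, ");   // "7\r\nHello, \r\n"
        body += makeChunk("world");     // "5\r\nworld\r\n"
        body += "0\r\n\r\n";            // terminating chunk, as in finalizeImpl()
        std::cout << body;
    }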
bool sent() const { return !!stream; } - Poco::Net::StreamSocket & getSocket() { return session.socket(); } - void attachRequest(HTTPServerRequest * request_) { request = request_; } private: Poco::Net::HTTPServerSession & session; HTTPServerRequest * request = nullptr; - ProfileEvents::Event write_event; - std::shared_ptr stream; - std::shared_ptr header_stream; + std::shared_ptr stream; + std::shared_ptr header_stream; }; } diff --git a/src/Server/HTTP/WriteBufferFromHTTPServerResponse.cpp b/src/Server/HTTP/WriteBufferFromHTTPServerResponse.cpp index 07b87c3ae96..1a12c09a8c7 100644 --- a/src/Server/HTTP/WriteBufferFromHTTPServerResponse.cpp +++ b/src/Server/HTTP/WriteBufferFromHTTPServerResponse.cpp @@ -1,15 +1,17 @@ #include + #include #include #include #include -#include -#include -#include namespace DB { +namespace ErrorCodes +{ +} + void WriteBufferFromHTTPServerResponse::startSendHeaders() { @@ -17,33 +19,27 @@ void WriteBufferFromHTTPServerResponse::startSendHeaders() { headers_started_sending = true; - if (response.getChunkedTransferEncoding()) - setChunked(); - if (add_cors_header) response.set("Access-Control-Allow-Origin", "*"); setResponseDefaultHeaders(response, keep_alive_timeout); - std::stringstream header; //STYLE_CHECK_ALLOW_STD_STRING_STREAM - response.beginWrite(header); - auto header_str = header.str(); - socketSendBytes(header_str.data(), header_str.size()); + if (!is_http_method_head) + std::tie(response_header_ostr, response_body_ostr) = response.beginSend(); } } void WriteBufferFromHTTPServerResponse::writeHeaderProgressImpl(const char * header_name) { - if (is_http_method_head || headers_finished_sending || !headers_started_sending) + if (headers_finished_sending) return; WriteBufferFromOwnString progress_string_writer; accumulated_progress.writeJSON(progress_string_writer); - socketSendBytes(header_name, strlen(header_name)); - socketSendBytes(progress_string_writer.str().data(), progress_string_writer.str().size()); - socketSendBytes("\r\n", 2); + if (response_header_ostr) + *response_header_ostr << header_name << progress_string_writer.str() << "\r\n" << std::flush; } void WriteBufferFromHTTPServerResponse::writeHeaderSummary() @@ -61,30 +57,30 @@ void WriteBufferFromHTTPServerResponse::writeExceptionCode() { if (headers_finished_sending || !exception_code) return; - if (headers_started_sending) - { - socketSendBytes("X-ClickHouse-Exception-Code: ", sizeof("X-ClickHouse-Exception-Code: ") - 1); - auto str_code = std::to_string(exception_code); - socketSendBytes(str_code.data(), str_code.size()); - socketSendBytes("\r\n", 2); - } + if (response_header_ostr) + *response_header_ostr << "X-ClickHouse-Exception-Code: " << exception_code << "\r\n" << std::flush; } void WriteBufferFromHTTPServerResponse::finishSendHeaders() { - if (headers_finished_sending) - return; + if (!headers_finished_sending) + { + writeHeaderSummary(); + writeExceptionCode(); + headers_finished_sending = true; - if (!headers_started_sending) - startSendHeaders(); - - writeHeaderSummary(); - writeExceptionCode(); - - headers_finished_sending = true; - - /// Send end of headers delimiter. - socketSendBytes("\r\n", 2); + if (!is_http_method_head) + { + /// Send end of headers delimiter. 
+ if (response_header_ostr) + *response_header_ostr << "\r\n" << std::flush; + } + else + { + if (!response_body_ostr) + response_body_ostr = response.send(); + } + } } @@ -93,19 +89,47 @@ void WriteBufferFromHTTPServerResponse::nextImpl() if (!initialized) { std::lock_guard lock(mutex); + /// Initialize as early as possible since if the code throws, /// next() should not be called anymore. initialized = true; - if (compression_method != CompressionMethod::None) - response.set("Content-Encoding", toContentEncodingName(compression_method)); - startSendHeaders(); + + if (!out && !is_http_method_head) + { + if (compress) + { + auto content_encoding_name = toContentEncodingName(compression_method); + + *response_header_ostr << "Content-Encoding: " << content_encoding_name << "\r\n"; + } + + /// We reuse our buffer in "out" to avoid extra allocations and copies. + + if (compress) + out = wrapWriteBufferWithCompressionMethod( + std::make_unique(*response_body_ostr), + compress ? compression_method : CompressionMethod::None, + compression_level, + working_buffer.size(), + working_buffer.begin()); + else + out = std::make_unique( + *response_body_ostr, + working_buffer.size(), + working_buffer.begin()); + } + finishSendHeaders(); } - if (!is_http_method_head) - HTTPWriteBuffer::nextImpl(); + if (out) + { + out->buffer() = buffer(); + out->position() = position(); + out->next(); + } } @@ -113,11 +137,14 @@ WriteBufferFromHTTPServerResponse::WriteBufferFromHTTPServerResponse( HTTPServerResponse & response_, bool is_http_method_head_, UInt64 keep_alive_timeout_, - const ProfileEvents::Event & write_event_) - : HTTPWriteBuffer(response_.getSocket(), write_event_) + bool compress_, + CompressionMethod compression_method_) + : BufferWithOwnMemory(DBMS_DEFAULT_BUFFER_SIZE) , response(response_) , is_http_method_head(is_http_method_head_) , keep_alive_timeout(keep_alive_timeout_) + , compress(compress_) + , compression_method(compression_method_) { } @@ -142,15 +169,6 @@ void WriteBufferFromHTTPServerResponse::onProgress(const Progress & progress) } } -void WriteBufferFromHTTPServerResponse::setExceptionCode(int exception_code_) -{ - std::lock_guard lock(mutex); - if (headers_started_sending) - exception_code = exception_code_; - else - response.set("X-ClickHouse-Exception-Code", toString(exception_code_)); -} - WriteBufferFromHTTPServerResponse::~WriteBufferFromHTTPServerResponse() { finalize(); @@ -158,20 +176,30 @@ WriteBufferFromHTTPServerResponse::~WriteBufferFromHTTPServerResponse() void WriteBufferFromHTTPServerResponse::finalizeImpl() { - if (!headers_finished_sending) + try { - std::lock_guard lock(mutex); - /// If no body data just send header - startSendHeaders(); - - if (!initialized && offset() && compression_method != CompressionMethod::None) - socketSendStr("Content-Encoding: " + toContentEncodingName(compression_method) + "\r\n"); - - finishSendHeaders(); + next(); + if (out) + out->finalize(); + out.reset(); + /// Catch write-after-finalize bugs. + set(nullptr, 0); + } + catch (...) + { + /// Avoid calling WriteBufferFromOStream::next() from dtor + /// (via WriteBufferFromHTTPServerResponse::next()) + out.reset(); + throw; } - if (!is_http_method_head) - HTTPWriteBuffer::finalizeImpl(); + if (!offset()) + { + /// If no remaining data, just send headers. 
diff --git a/src/Server/HTTP/WriteBufferFromHTTPServerResponse.h b/src/Server/HTTP/WriteBufferFromHTTPServerResponse.h
index a3952b7c553..38345f27952 100644
--- a/src/Server/HTTP/WriteBufferFromHTTPServerResponse.h
+++ b/src/Server/HTTP/WriteBufferFromHTTPServerResponse.h
@@ -5,8 +5,8 @@
 #include
 #include
 #include
+#include
 #include
-#include
 #include
 #include

@@ -17,26 +17,48 @@
 namespace DB
 {

-/// Postpone sending HTTP header until first data is flushed. This is needed in HTTP servers
-/// to change some HTTP headers (e.g. response code) before any data is sent to the client.
+/// The difference from WriteBufferFromOStream is that this buffer gets the underlying std::ostream
+/// (using response.send()) only after data is flushed for the first time. This is needed in HTTP
+/// servers to change some HTTP headers (e.g. response code) before any data is sent to the client
+/// (headers can't be changed after response.send() is called).
+///
+/// In short, it allows delaying the call to response.send().
+///
+/// Additionally, supports HTTP response compression (in this case corresponding Content-Encoding
+/// header will be set).
 ///
 /// Also this class write and flush special X-ClickHouse-Progress HTTP headers
 /// if no data was sent at the time of progress notification.
 /// This allows to implement progress bar in HTTP clients.
-class WriteBufferFromHTTPServerResponse final : public HTTPWriteBuffer
+class WriteBufferFromHTTPServerResponse final : public BufferWithOwnMemory<WriteBuffer>
 {
 public:
     WriteBufferFromHTTPServerResponse(
         HTTPServerResponse & response_,
         bool is_http_method_head_,
         UInt64 keep_alive_timeout_,
-        const ProfileEvents::Event & write_event_ = ProfileEvents::end());
+        bool compress_ = false, /// If true - set Content-Encoding header and compress the result.
+        CompressionMethod compression_method_ = CompressionMethod::None);

     ~WriteBufferFromHTTPServerResponse() override;

     /// Writes progress in repeating HTTP headers.
     void onProgress(const Progress & progress);

+    /// Turn compression on or off.
+    /// The setting has any effect only if HTTP headers haven't been sent yet.
+    void setCompression(bool enable_compression)
+    {
+        compress = enable_compression;
+    }
+
+    /// Set compression level if the compression is turned on.
+    /// The setting has any effect only if HTTP headers haven't been sent yet.
+    void setCompressionLevel(int level)
+    {
+        compression_level = level;
+    }
+
     /// Turn CORS on or off.
     /// The setting has any effect only if HTTP headers haven't been sent yet.
     void addHeaderCORS(bool enable_cors)
@@ -53,13 +75,7 @@ public:
         send_progress_interval_ms = send_progress_interval_ms_;
     }

-    /// Content-Encoding header will be set on first data package
-    void setCompressionMethodHeader(const CompressionMethod & compression_method_)
-    {
-        compression_method = compression_method_;
-    }
-
-    void setExceptionCode(int exception_code_);
+    void setExceptionCode(int exception_code_) { exception_code = exception_code_; }

 private:
     /// Send at least HTTP headers if no data has been sent yet.
@@ -92,7 +108,14 @@ private:
     bool is_http_method_head;
     bool add_cors_header = false;
     size_t keep_alive_timeout = 0;
+    bool compress = false;
+    CompressionMethod compression_method;
+    int compression_level = 1;
+
+    std::shared_ptr<std::ostream> response_body_ostr;
+    std::shared_ptr<std::ostream> response_header_ostr;
+
+    std::unique_ptr<WriteBuffer> out;

     bool initialized = false;

     bool headers_started_sending = false;
@@ -103,8 +126,6 @@ private:
     size_t send_progress_interval_ms = 100;
     Stopwatch progress_watch;

-    CompressionMethod compression_method = CompressionMethod::None;
-
     int exception_code = 0;

     std::mutex mutex; /// progress callback could be called from different threads.
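Taken together, the restored interface is used roughly as follows. This is a hedged handler-side sketch wired to the declarations above, not code from the patch; the timeout and level values are placeholders.

    void writeSmallResult(HTTPServerResponse & response)
    {
        auto out = std::make_shared<WriteBufferFromHTTPServerResponse>(
            response,
            /* is_http_method_head_ = */ false,
            /* keep_alive_timeout_ = */ 30,
            /* compress_ = */ true,
            CompressionMethod::Gzip);

        out->setCompressionLevel(3);  /// Only effective before headers are sent.
        out->addHeaderCORS(true);

        writeString("{\"status\":\"ok\"}\n", *out);
        out->finalize();  /// Flushes the body; sends headers even if nothing was written.
    }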
diff --git a/src/Server/HTTPHandler.cpp b/src/Server/HTTPHandler.cpp
index 675fa922c90..81a02c38696 100644
--- a/src/Server/HTTPHandler.cpp
+++ b/src/Server/HTTPHandler.cpp
@@ -47,9 +47,7 @@
 #include
 #include
-#include
 #include
-#include
 #include

 #if USE_SSL
@@ -295,7 +293,7 @@ void HTTPHandler::pushDelayedResults(Output & used_output)
     std::vector<WriteBufferPtr> write_buffers;
     ConcatReadBuffer::Buffers read_buffers;

-    auto * cascade_buffer = typeid_cast<CascadeWriteBuffer *>(used_output.out_maybe_delayed_and_compressed);
+    auto * cascade_buffer = typeid_cast<CascadeWriteBuffer *>(used_output.out_maybe_delayed_and_compressed.get());
     if (!cascade_buffer)
         throw Exception(ErrorCodes::LOGICAL_ERROR, "Expected CascadeWriteBuffer");

@@ -547,8 +545,7 @@ void HTTPHandler::processQuery(
     HTMLForm & params,
     HTTPServerResponse & response,
     Output & used_output,
-    std::optional<CurrentThread::QueryScope> & query_scope,
-    const ProfileEvents::Event & write_event)
+    std::optional<CurrentThread::QueryScope> & query_scope)
 {
     using namespace Poco::Net;

@@ -559,9 +556,6 @@ void HTTPHandler::processQuery(
     /// The user could specify session identifier and session timeout.
     /// It allows to modify settings, create temporary tables and reuse them in subsequent requests.
-
-    SCOPE_EXIT({ session->releaseSessionID(); });
-
     String session_id;
     std::chrono::steady_clock::duration session_timeout;
     bool session_is_set = params.has("session_id");
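The buffer setup in the next hunk depends on 'client_supports_http_compression', derived earlier from the request's Accept-Encoding header. A simplified sketch of that negotiation; chooseMethod and the Method enum are illustrative stand-ins for chooseHTTPCompressionMethod and CompressionMethod.

    #include <string>

    enum class Method { None, Gzip, Zstd };

    Method chooseMethod(const std::string & accept_encoding)
    {
        /// Naive substring matching; the real implementation also honours
        /// ordering and quality values within the header.
        if (accept_encoding.find("zstd") != std::string::npos)
            return Method::Zstd;
        if (accept_encoding.find("gzip") != std::string::npos)
            return Method::Gzip;
        return Method::None;  /// No Content-Encoding header will be set.
    }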
@@ -614,35 +608,15 @@ void HTTPHandler::processQuery(
     size_t buffer_size_http = DBMS_DEFAULT_BUFFER_SIZE;
     size_t buffer_size_memory = (buffer_size_total > buffer_size_http) ? buffer_size_total : 0;

-    bool enable_http_compression = params.getParsed<bool>("enable_http_compression", context->getSettingsRef().enable_http_compression);
-    Int64 http_zlib_compression_level = params.getParsed<Int64>("http_zlib_compression_level", context->getSettingsRef().http_zlib_compression_level);
-
-    used_output.out_holder =
-        std::make_shared<WriteBufferFromHTTPServerResponse>(
-            response,
-            request.getMethod() == HTTPRequest::HTTP_HEAD,
-            context->getServerSettings().keep_alive_timeout.totalSeconds(),
-            write_event);
-    used_output.out = used_output.out_holder;
-    used_output.out_maybe_compressed = used_output.out_holder;
-
-    if (client_supports_http_compression && enable_http_compression)
-    {
-        used_output.out_holder->setCompressionMethodHeader(http_response_compression_method);
-        used_output.wrap_compressed_holder =
-            wrapWriteBufferWithCompressionMethod(
-                used_output.out_holder.get(),
-                http_response_compression_method,
-                static_cast<int>(http_zlib_compression_level),
-                DBMS_DEFAULT_BUFFER_SIZE, nullptr, 0, false);
-        used_output.out = used_output.wrap_compressed_holder;
-    }
+    used_output.out = std::make_shared<WriteBufferFromHTTPServerResponse>(
+        response,
+        request.getMethod() == HTTPRequest::HTTP_HEAD,
+        context->getServerSettings().keep_alive_timeout.totalSeconds(),
+        client_supports_http_compression,
+        http_response_compression_method);

     if (internal_compression)
-    {
-        used_output.out_compressed_holder = std::make_shared<CompressedWriteBuffer>(*used_output.out);
-        used_output.out_maybe_compressed = used_output.out_compressed_holder;
-    }
+        used_output.out_maybe_compressed = std::make_shared<CompressedWriteBuffer>(*used_output.out);
     else
         used_output.out_maybe_compressed = used_output.out;

@@ -682,12 +656,12 @@ void HTTPHandler::processQuery(
             cascade_buffer2.emplace_back(push_memory_buffer_and_continue);
         }

-        used_output.out_delayed_and_compressed_holder = std::make_unique<CascadeWriteBuffer>(std::move(cascade_buffer1), std::move(cascade_buffer2));
-        used_output.out_maybe_delayed_and_compressed = used_output.out_delayed_and_compressed_holder.get();
+        used_output.out_maybe_delayed_and_compressed = std::make_shared<CascadeWriteBuffer>(
+            std::move(cascade_buffer1), std::move(cascade_buffer2));
     }
     else
     {
-        used_output.out_maybe_delayed_and_compressed = used_output.out_maybe_compressed.get();
+        used_output.out_maybe_delayed_and_compressed = used_output.out_maybe_compressed;
     }

     /// Request body can be compressed using algorithm specified in the Content-Encoding header.
@@ -816,8 +790,14 @@ void HTTPHandler::processQuery(
     const auto & query = getQuery(request, params, context);
     std::unique_ptr<ReadBuffer> in_param = std::make_unique<ReadBufferFromString>(query);

-    used_output.out_holder->setSendProgress(settings.send_progress_in_http_headers);
-    used_output.out_holder->setSendProgressInterval(settings.http_headers_progress_interval_ms);
+    /// HTTP response compression is turned on only if the client signalled that they support it
+    /// (using Accept-Encoding header) and 'enable_http_compression' setting is turned on.
+    used_output.out->setCompression(client_supports_http_compression && settings.enable_http_compression);
+    if (client_supports_http_compression)
+        used_output.out->setCompressionLevel(static_cast<int>(settings.http_zlib_compression_level));
+
+    used_output.out->setSendProgress(settings.send_progress_in_http_headers);
+    used_output.out->setSendProgressInterval(settings.http_headers_progress_interval_ms);

     /// If 'http_native_compression_disable_checksumming_on_decompress' setting is turned on,
     /// checksums of client data compressed with internal algorithm are not checked.
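For the delayed-output branch above: a CascadeWriteBuffer first accumulates the result in memory (so headers, including an exception code, can still change) and only spills into the real response buffer once the memory budget is exceeded. A self-contained sketch of that idea with simplified types; Sink and CascadeSketch are illustrative, not the real classes.

    #include <cstddef>
    #include <string>

    struct Sink
    {
        virtual void write(const std::string & data) = 0;
        virtual ~Sink() = default;
    };

    class CascadeSketch : public Sink
    {
    public:
        CascadeSketch(std::size_t memory_limit_, Sink & overflow_)
            : memory_limit(memory_limit_), overflow(overflow_) {}

        void write(const std::string & data) override
        {
            if (!spilled && buffered.size() + data.size() <= memory_limit)
            {
                buffered += data;  /// Response not started yet: still delayable.
                return;
            }
            if (!spilled)
            {
                overflow.write(buffered);  /// Commit everything buffered so far.
                buffered.clear();
                spilled = true;
            }
            overflow.write(data);  /// From now on, pass writes straight through.
        }

    private:
        std::size_t memory_limit;
        std::string buffered;
        bool spilled = false;
        Sink & overflow;
    };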
@@ -828,7 +808,7 @@ void HTTPHandler::processQuery( /// Note that whether the header is added is determined by the settings, and we can only get the user settings after authentication. /// Once the authentication fails, the header can't be added. if (settings.add_http_cors_header && !request.get("Origin", "").empty() && !config.has("http_options_response")) - used_output.out_holder->addHeaderCORS(true); + used_output.out->addHeaderCORS(true); auto append_callback = [my_context = context] (ProgressCallback callback) { @@ -847,7 +827,7 @@ void HTTPHandler::processQuery( /// Note that we add it unconditionally so the progress is available for `X-ClickHouse-Summary` append_callback([&used_output](const Progress & progress) { - used_output.out_holder->onProgress(progress); + used_output.out->onProgress(progress); }); if (settings.readonly > 0 && settings.cancel_http_readonly_queries_on_client_close) @@ -900,8 +880,6 @@ void HTTPHandler::processQuery( {}, handle_exception_in_output_format); - session->releaseSessionID(); - if (used_output.hasDelayed()) { /// TODO: set Content-Length if possible @@ -916,8 +894,10 @@ void HTTPHandler::trySendExceptionToClient( const std::string & s, int exception_code, HTTPServerRequest & request, HTTPServerResponse & response, Output & used_output) try { - if (used_output.out_holder) - used_output.out_holder->setExceptionCode(exception_code); + /// In case data has already been sent, like progress headers, try using the output buffer to + /// set the exception code since it will be able to append it if it hasn't finished writing headers + if (response.sent() && used_output.out) + used_output.out->setExceptionCode(exception_code); else response.set("X-ClickHouse-Exception-Code", toString(exception_code)); @@ -942,10 +922,10 @@ try response.setStatusAndReason(exceptionCodeToHTTPStatus(exception_code)); } - if (!used_output.out_holder && !used_output.exception_is_written) + if (!response.sent() && !used_output.out_maybe_compressed && !used_output.exception_is_written) { /// If nothing was sent yet and we don't even know if we must compress the response. - WriteBufferFromHTTPServerResponse(response, request.getMethod() == HTTPRequest::HTTP_HEAD, DEFAULT_HTTP_KEEP_ALIVE_TIMEOUT).writeln(s); + *response.send() << s << std::endl; } else if (used_output.out_maybe_compressed) { @@ -955,8 +935,7 @@ try /// do not call finalize here for CascadeWriteBuffer used_output.out_maybe_delayed_and_compressed, /// exception is written into used_output.out_maybe_compressed later /// HTTPHandler::trySendExceptionToClient is called with exception context, it is Ok to destroy buffers - used_output.out_delayed_and_compressed_holder.reset(); - used_output.out_maybe_delayed_and_compressed = nullptr; + used_output.out_maybe_delayed_and_compressed.reset(); } if (!used_output.exception_is_written) @@ -966,12 +945,12 @@ try /// Also HTTP code 200 could have already been sent. /// If buffer has data, and that data wasn't sent yet, then no need to send that data - bool data_sent = used_output.out_holder->count() != used_output.out_holder->offset(); + bool data_sent = used_output.out->count() != used_output.out->offset(); if (!data_sent) { used_output.out_maybe_compressed->position() = used_output.out_maybe_compressed->buffer().begin(); - used_output.out_holder->position() = used_output.out_holder->buffer().begin(); + used_output.out->position() = used_output.out->buffer().begin(); } writeString(s, *used_output.out_maybe_compressed); @@ -1002,7 +981,7 @@ catch (...) 
 }

-void HTTPHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event)
+void HTTPHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response)
 {
     setThreadName("HTTPHandler");
     ThreadStatus thread_status;
@@ -1091,7 +1070,7 @@ void HTTPHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse
                 "is no Content-Length header for POST request");
     }

-    processQuery(request, params, response, used_output, query_scope, write_event);
+    processQuery(request, params, response, used_output, query_scope);
     if (request_credentials)
         LOG_DEBUG(log, "Authentication in progress...");
     else

diff --git a/src/Server/HTTPHandler.h b/src/Server/HTTPHandler.h
index 815b0f84231..16af5db21c6 100644
--- a/src/Server/HTTPHandler.h
+++ b/src/Server/HTTPHandler.h
@@ -7,8 +7,6 @@
 #include
 #include
 #include
-#include
-#include

 namespace CurrentMetrics
 {
@@ -34,7 +32,7 @@ public:
     HTTPHandler(IServer & server_, const std::string & name, const std::optional<String> & content_type_override_);
     virtual ~HTTPHandler() override;

-    void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override;
+    void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override;

     /// This method is called right before the query execution.
     virtual void customizeContext(HTTPServerRequest & /* request */, ContextMutablePtr /* context */, ReadBuffer & /* body */) {}

@@ -55,22 +53,11 @@ private:
      * WriteBufferFromHTTPServerResponse out
      */

-    /// Holds original response buffer
-    std::shared_ptr<WriteBufferFromHTTPServerResponse> out_holder;
-    /// If HTTP compression is enabled holds compression wrapper over original response buffer
-    std::shared_ptr<WriteBuffer> wrap_compressed_holder;
-    /// Points either to out_holder or to wrap_compressed_holder
-    std::shared_ptr<WriteBuffer> out;
-
-    /// If internal compression is enabled holds compression wrapper over out buffer
-    std::shared_ptr<WriteBuffer> out_compressed_holder;
-    /// Points to 'out' or to CompressedWriteBuffer(*out)
+    std::shared_ptr<WriteBufferFromHTTPServerResponse> out;
+    /// Points to 'out' or to CompressedWriteBuffer(*out), depending on settings.
     std::shared_ptr<WriteBuffer> out_maybe_compressed;
-
-    /// If output should be delayed holds cascade buffer
-    std::unique_ptr<CascadeWriteBuffer> out_delayed_and_compressed_holder;
-    /// Points to out_maybe_compressed or to CascadeWriteBuffer.
-    WriteBuffer * out_maybe_delayed_and_compressed = nullptr;
+    /// Points to 'out' or to CompressedWriteBuffer(*out) or to CascadeWriteBuffer.
+    std::shared_ptr<WriteBuffer> out_maybe_delayed_and_compressed;

     bool finalized = false;

@@ -78,7 +65,7 @@ private:
     inline bool hasDelayed() const
     {
-        return out_maybe_delayed_and_compressed != out_maybe_compressed.get();
+        return out_maybe_delayed_and_compressed != out_maybe_compressed;
     }

     inline void finalize()
@@ -87,9 +74,11 @@ private:
             return;
         finalized = true;

+        if (out_maybe_delayed_and_compressed)
+            out_maybe_delayed_and_compressed->finalize();
         if (out_maybe_compressed)
             out_maybe_compressed->finalize();
-        else if (out)
+        if (out)
             out->finalize();
     }

@@ -138,8 +127,7 @@ private:
         HTMLForm & params,
         HTTPServerResponse & response,
         Output & used_output,
-        std::optional<CurrentThread::QueryScope> & query_scope,
-        const ProfileEvents::Event & write_event);
+        std::optional<CurrentThread::QueryScope> & query_scope);

     void trySendExceptionToClient(
         const std::string & s,
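The finalize() order above matters: wrappers must be finalized outermost-first, so each layer flushes its remaining tail into the layer below before that layer is itself closed. As a sketch, spelled out as a free function over the same shared_ptr chain (not code from the patch):

    void finalizeChain(
        std::shared_ptr<WriteBuffer> delayed,     /// e.g. CascadeWriteBuffer
        std::shared_ptr<WriteBuffer> compressed,  /// e.g. CompressedWriteBuffer(*raw)
        std::shared_ptr<WriteBuffer> raw)         /// WriteBufferFromHTTPServerResponse
    {
        if (delayed)
            delayed->finalize();
        if (compressed)
            compressed->finalize();
        if (raw)
            raw->finalize();  /// Also sends headers if no data was ever written.
    }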
diff --git a/src/Server/InterserverIOHTTPHandler.cpp b/src/Server/InterserverIOHTTPHandler.cpp
index c41d68bab02..53773a83b40 100644
--- a/src/Server/InterserverIOHTTPHandler.cpp
+++ b/src/Server/InterserverIOHTTPHandler.cpp
@@ -77,7 +77,7 @@ void InterserverIOHTTPHandler::processQuery(HTTPServerRequest & request, HTTPSer
 }

-void InterserverIOHTTPHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event)
+void InterserverIOHTTPHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response)
 {
     setThreadName("IntersrvHandler");
     ThreadStatus thread_status;
@@ -89,7 +89,7 @@ void InterserverIOHTTPHandler::handleRequest(HTTPServerRequest & request, HTTPSe
     Output used_output;
     const auto keep_alive_timeout = server.context()->getServerSettings().keep_alive_timeout.totalSeconds();
     used_output.out = std::make_shared<WriteBufferFromHTTPServerResponse>(
-        response, request.getMethod() == Poco::Net::HTTPRequest::HTTP_HEAD, keep_alive_timeout, write_event);
+        response, request.getMethod() == Poco::Net::HTTPRequest::HTTP_HEAD, keep_alive_timeout);

     auto write_response = [&](const std::string & message)
     {

diff --git a/src/Server/InterserverIOHTTPHandler.h b/src/Server/InterserverIOHTTPHandler.h
index 66042ad3d1d..da5b286b9e5 100644
--- a/src/Server/InterserverIOHTTPHandler.h
+++ b/src/Server/InterserverIOHTTPHandler.h
@@ -30,7 +30,7 @@ public:
     {
     }

-    void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override;
+    void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override;

 private:
     struct Output

diff --git a/src/Server/KeeperReadinessHandler.cpp b/src/Server/KeeperReadinessHandler.cpp
index de6edd199d7..ed972055aee 100644
--- a/src/Server/KeeperReadinessHandler.cpp
+++ b/src/Server/KeeperReadinessHandler.cpp
@@ -19,7 +19,7 @@
 namespace DB
 {

-void KeeperReadinessHandler::handleRequest(HTTPServerRequest & /*request*/, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/)
+void KeeperReadinessHandler::handleRequest(HTTPServerRequest & /*request*/, HTTPServerResponse & response)
 {
     try
     {
@@ -58,7 +58,7 @@ void KeeperReadinessHandler::handleRequest(HTTPServerRequest & /*request*/, HTTP
         if (!response.sent())
         {
             /// We have not sent anything yet and we don't even know if we need to compress response.
-            *response.send() << getCurrentExceptionMessage(false) << '\n';
+            *response.send() << getCurrentExceptionMessage(false) << std::endl;
         }
     }
     catch (...)
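Several handlers in this patch switch their one-line error responses from '\n' to std::endl. The difference is the implicit flush: std::endl writes the newline and then flushes the stream, which guarantees that a short error body actually leaves the buffered response stream. A generic illustration (not code from the patch):

    #include <ostream>
    #include <string>

    std::ostream & respondWithError(std::ostream & os, const std::string & message)
    {
        return os << message << std::endl;  /// Same as: os << message << '\n' << std::flush;
    }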
diff --git a/src/Server/KeeperReadinessHandler.h b/src/Server/KeeperReadinessHandler.h index a16aa9f8021..00b51b886f9 100644 --- a/src/Server/KeeperReadinessHandler.h +++ b/src/Server/KeeperReadinessHandler.h @@ -22,7 +22,7 @@ public: { } - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; }; HTTPRequestHandlerFactoryPtr diff --git a/src/Server/MySQLHandler.cpp b/src/Server/MySQLHandler.cpp index cd063361811..f0d7c576e7a 100644 --- a/src/Server/MySQLHandler.cpp +++ b/src/Server/MySQLHandler.cpp @@ -70,17 +70,13 @@ MySQLHandler::MySQLHandler( IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, - bool ssl_enabled, uint32_t connection_id_, - const ProfileEvents::Event & read_event_, - const ProfileEvents::Event & write_event_) + bool ssl_enabled, uint32_t connection_id_) : Poco::Net::TCPServerConnection(socket_) , server(server_) , tcp_server(tcp_server_) , log(&Poco::Logger::get("MySQLHandler")) , connection_id(connection_id_) , auth_plugin(new MySQLProtocol::Authentication::Native41()) - , read_event(read_event_) - , write_event(write_event_) { server_capabilities = CLIENT_PROTOCOL_41 | CLIENT_SECURE_CONNECTION | CLIENT_PLUGIN_AUTH | CLIENT_PLUGIN_AUTH_LENENC_CLIENT_DATA | CLIENT_CONNECT_WITH_DB | CLIENT_DEPRECATE_EOF; if (ssl_enabled) @@ -102,8 +98,8 @@ void MySQLHandler::run() session->setClientConnectionId(connection_id); - in = std::make_shared(socket(), read_event); - out = std::make_shared(socket(), write_event); + in = std::make_shared(socket()); + out = std::make_shared(socket()); packet_endpoint = std::make_shared(*in, *out, sequence_id); try @@ -493,10 +489,8 @@ MySQLHandlerSSL::MySQLHandlerSSL( bool ssl_enabled, uint32_t connection_id_, RSA & public_key_, - RSA & private_key_, - const ProfileEvents::Event & read_event_, - const ProfileEvents::Event & write_event_) - : MySQLHandler(server_, tcp_server_, socket_, ssl_enabled, connection_id_, read_event_, write_event_) + RSA & private_key_) + : MySQLHandler(server_, tcp_server_, socket_, ssl_enabled, connection_id_) , public_key(public_key_) , private_key(private_key_) {} diff --git a/src/Server/MySQLHandler.h b/src/Server/MySQLHandler.h index 36d63ebca84..194b18bdc39 100644 --- a/src/Server/MySQLHandler.h +++ b/src/Server/MySQLHandler.h @@ -42,9 +42,7 @@ public: TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, bool ssl_enabled, - uint32_t connection_id_, - const ProfileEvents::Event & read_event_ = ProfileEvents::end(), - const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + uint32_t connection_id_); void run() final; @@ -104,9 +102,6 @@ protected: std::shared_ptr in; std::shared_ptr out; bool secure_connection = false; - - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; }; #if USE_SSL @@ -120,9 +115,7 @@ public: bool ssl_enabled, uint32_t connection_id_, RSA & public_key_, - RSA & private_key_, - const ProfileEvents::Event & read_event_ = ProfileEvents::end(), - const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + RSA & private_key_); private: void authPluginSSL() override; diff --git a/src/Server/MySQLHandlerFactory.cpp b/src/Server/MySQLHandlerFactory.cpp index 79234c647aa..f74f57926f9 100644 --- a/src/Server/MySQLHandlerFactory.cpp +++ b/src/Server/MySQLHandlerFactory.cpp @@ -21,11 +21,9 @@ namespace ErrorCodes extern const int OPENSSL_ERROR; } 
-MySQLHandlerFactory::MySQLHandlerFactory(IServer & server_, const ProfileEvents::Event & read_event_, const ProfileEvents::Event & write_event_) +MySQLHandlerFactory::MySQLHandlerFactory(IServer & server_) : server(server_) , log(&Poco::Logger::get("MySQLHandlerFactory")) - , read_event(read_event_) - , write_event(write_event_) { #if USE_SSL try diff --git a/src/Server/MySQLHandlerFactory.h b/src/Server/MySQLHandlerFactory.h index 307ee3b2f0d..fa4ce93f765 100644 --- a/src/Server/MySQLHandlerFactory.h +++ b/src/Server/MySQLHandlerFactory.h @@ -4,7 +4,6 @@ #include #include #include -#include #include "config.h" @@ -38,11 +37,8 @@ private: #endif std::atomic last_connection_id = 0; - - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; public: - explicit MySQLHandlerFactory(IServer & server_, const ProfileEvents::Event & read_event_ = ProfileEvents::end(), const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + explicit MySQLHandlerFactory(IServer & server_); void readRSAKeys(); diff --git a/src/Server/NotFoundHandler.cpp b/src/Server/NotFoundHandler.cpp index 38f56921c89..5b1db508551 100644 --- a/src/Server/NotFoundHandler.cpp +++ b/src/Server/NotFoundHandler.cpp @@ -5,7 +5,7 @@ namespace DB { -void NotFoundHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void NotFoundHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { try { diff --git a/src/Server/NotFoundHandler.h b/src/Server/NotFoundHandler.h index a484d237771..1cbfcd57f8f 100644 --- a/src/Server/NotFoundHandler.h +++ b/src/Server/NotFoundHandler.h @@ -10,7 +10,7 @@ class NotFoundHandler : public HTTPRequestHandler { public: NotFoundHandler(std::vector hints_) : hints(std::move(hints_)) {} - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; private: std::vector hints; }; diff --git a/src/Server/PostgreSQLHandler.cpp b/src/Server/PostgreSQLHandler.cpp index c62dc8109ea..eeb3784c1df 100644 --- a/src/Server/PostgreSQLHandler.cpp +++ b/src/Server/PostgreSQLHandler.cpp @@ -32,16 +32,12 @@ PostgreSQLHandler::PostgreSQLHandler( TCPServer & tcp_server_, bool ssl_enabled_, Int32 connection_id_, - std::vector> & auth_methods_, - const ProfileEvents::Event & read_event_, - const ProfileEvents::Event & write_event_) + std::vector> & auth_methods_) : Poco::Net::TCPServerConnection(socket_) , server(server_) , tcp_server(tcp_server_) , ssl_enabled(ssl_enabled_) , connection_id(connection_id_) - , read_event(read_event_) - , write_event(write_event_) , authentication_manager(auth_methods_) { changeIO(socket()); @@ -49,8 +45,8 @@ PostgreSQLHandler::PostgreSQLHandler( void PostgreSQLHandler::changeIO(Poco::Net::StreamSocket & socket) { - in = std::make_shared(socket, read_event); - out = std::make_shared(socket, write_event); + in = std::make_shared(socket); + out = std::make_shared(socket); message_transport = std::make_shared(in.get(), out.get()); } diff --git a/src/Server/PostgreSQLHandler.h b/src/Server/PostgreSQLHandler.h index 57b91a0ad04..f20af3df02c 100644 --- a/src/Server/PostgreSQLHandler.h +++ b/src/Server/PostgreSQLHandler.h @@ -33,9 +33,7 @@ public: TCPServer & tcp_server_, bool ssl_enabled_, Int32 connection_id_, - std::vector> & auth_methods_, - const ProfileEvents::Event & read_event_ = ProfileEvents::end(), - const 
ProfileEvents::Event & write_event_ = ProfileEvents::end()); + std::vector> & auth_methods_); void run() final; @@ -53,9 +51,6 @@ private: std::shared_ptr out; std::shared_ptr message_transport; - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; - #if USE_SSL std::shared_ptr ss; #endif diff --git a/src/Server/PostgreSQLHandlerFactory.cpp b/src/Server/PostgreSQLHandlerFactory.cpp index 096bbbdcda9..6f2124861e7 100644 --- a/src/Server/PostgreSQLHandlerFactory.cpp +++ b/src/Server/PostgreSQLHandlerFactory.cpp @@ -5,11 +5,9 @@ namespace DB { -PostgreSQLHandlerFactory::PostgreSQLHandlerFactory(IServer & server_, const ProfileEvents::Event & read_event_, const ProfileEvents::Event & write_event_) +PostgreSQLHandlerFactory::PostgreSQLHandlerFactory(IServer & server_) : server(server_) , log(&Poco::Logger::get("PostgreSQLHandlerFactory")) - , read_event(read_event_) - , write_event(write_event_) { auth_methods = { @@ -22,7 +20,7 @@ Poco::Net::TCPServerConnection * PostgreSQLHandlerFactory::createConnection(cons { Int32 connection_id = last_connection_id++; LOG_TRACE(log, "PostgreSQL connection. Id: {}. Address: {}", connection_id, socket.peerAddress().toString()); - return new PostgreSQLHandler(socket, server, tcp_server, ssl_enabled, connection_id, auth_methods, read_event, write_event); + return new PostgreSQLHandler(socket, server, tcp_server, ssl_enabled, connection_id, auth_methods); } } diff --git a/src/Server/PostgreSQLHandlerFactory.h b/src/Server/PostgreSQLHandlerFactory.h index e5f762fca6d..35046325386 100644 --- a/src/Server/PostgreSQLHandlerFactory.h +++ b/src/Server/PostgreSQLHandlerFactory.h @@ -15,8 +15,6 @@ class PostgreSQLHandlerFactory : public TCPServerConnectionFactory private: IServer & server; Poco::Logger * log; - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; #if USE_SSL bool ssl_enabled = true; @@ -28,7 +26,7 @@ private: std::vector> auth_methods; public: - explicit PostgreSQLHandlerFactory(IServer & server_, const ProfileEvents::Event & read_event_ = ProfileEvents::end(), const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + explicit PostgreSQLHandlerFactory(IServer & server_); Poco::Net::TCPServerConnection * createConnection(const Poco::Net::StreamSocket & socket, TCPServer & server) override; }; diff --git a/src/Server/PrometheusRequestHandler.cpp b/src/Server/PrometheusRequestHandler.cpp index 12caad5eea1..188f61e51d5 100644 --- a/src/Server/PrometheusRequestHandler.cpp +++ b/src/Server/PrometheusRequestHandler.cpp @@ -13,7 +13,7 @@ namespace DB { -void PrometheusRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) +void PrometheusRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { try { @@ -27,7 +27,7 @@ void PrometheusRequestHandler::handleRequest(HTTPServerRequest & request, HTTPSe response.setContentType("text/plain; version=0.0.4; charset=UTF-8"); - WriteBufferFromHTTPServerResponse wb(response, request.getMethod() == Poco::Net::HTTPRequest::HTTP_HEAD, keep_alive_timeout, write_event); + WriteBufferFromHTTPServerResponse wb(response, request.getMethod() == Poco::Net::HTTPRequest::HTTP_HEAD, keep_alive_timeout); try { metrics_writer.write(wb); diff --git a/src/Server/PrometheusRequestHandler.h b/src/Server/PrometheusRequestHandler.h index 9ec54cc2e4e..7d424f437e0 100644 --- a/src/Server/PrometheusRequestHandler.h +++ b/src/Server/PrometheusRequestHandler.h @@ -22,7 +22,7 @@ public: { } - 
void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; }; } diff --git a/src/Server/ProxyV1Handler.cpp b/src/Server/ProxyV1Handler.cpp index 56621940a23..d5e6ab23360 100644 --- a/src/Server/ProxyV1Handler.cpp +++ b/src/Server/ProxyV1Handler.cpp @@ -29,38 +29,38 @@ void ProxyV1Handler::run() // read "PROXY" if (!readWord(5, word, eol) || word != "PROXY" || eol) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); // read "TCP4" or "TCP6" or "UNKNOWN" if (!readWord(7, word, eol)) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); if (word != "TCP4" && word != "TCP6" && word != "UNKNOWN") - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); if (word == "UNKNOWN" && eol) return; if (eol) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); // read address if (!readWord(39, word, eol) || eol) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); stack_data.forwarded_for = std::move(word); // read address if (!readWord(39, word, eol) || eol) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); // read port if (!readWord(5, word, eol) || eol) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); // read port and "\r\n" if (!readWord(5, word, eol) || !eol) - throw ParsingException(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); + throw Exception(ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED, "PROXY protocol violation"); if (!stack_data.forwarded_for.empty()) LOG_TRACE(log, "Forwarded client address from PROXY header: {}", stack_data.forwarded_for); diff --git a/src/Server/ReplicasStatusHandler.cpp b/src/Server/ReplicasStatusHandler.cpp index 07f3b67b6a7..c30c3ebaa77 100644 --- a/src/Server/ReplicasStatusHandler.cpp +++ b/src/Server/ReplicasStatusHandler.cpp @@ -22,7 +22,7 @@ ReplicasStatusHandler::ReplicasStatusHandler(IServer & server) : WithContext(ser { } -void ReplicasStatusHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void ReplicasStatusHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { try { @@ -113,7 +113,7 @@ void ReplicasStatusHandler::handleRequest(HTTPServerRequest & request, HTTPServe if (!response.sent()) { /// We have not sent anything yet and we don't even know if we need to compress response. 
- *response.send() << getCurrentExceptionMessage(false) << '\n'; + *response.send() << getCurrentExceptionMessage(false) << std::endl; } } catch (...) diff --git a/src/Server/ReplicasStatusHandler.h b/src/Server/ReplicasStatusHandler.h index 08fd757b0d6..1a5388aa2ab 100644 --- a/src/Server/ReplicasStatusHandler.h +++ b/src/Server/ReplicasStatusHandler.h @@ -14,7 +14,7 @@ class ReplicasStatusHandler : public HTTPRequestHandler, WithContext public: explicit ReplicasStatusHandler(IServer & server_); - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; }; diff --git a/src/Server/StaticRequestHandler.cpp b/src/Server/StaticRequestHandler.cpp index 67bf3875de4..34cb5d2d169 100644 --- a/src/Server/StaticRequestHandler.cpp +++ b/src/Server/StaticRequestHandler.cpp @@ -33,11 +33,9 @@ namespace ErrorCodes extern const int INVALID_CONFIG_PARAMETER; } -static inline std::unique_ptr +static inline WriteBufferPtr responseWriteBuffer(HTTPServerRequest & request, HTTPServerResponse & response, UInt64 keep_alive_timeout) { - auto buf = std::unique_ptr(new WriteBufferFromHTTPServerResponse(response, request.getMethod() == HTTPRequest::HTTP_HEAD, keep_alive_timeout)); - /// The client can pass a HTTP header indicating supported compression method (gzip or deflate). String http_response_compression_methods = request.get("Accept-Encoding", ""); CompressionMethod http_response_compression_method = CompressionMethod::None; @@ -45,11 +43,14 @@ responseWriteBuffer(HTTPServerRequest & request, HTTPServerResponse & response, if (!http_response_compression_methods.empty()) http_response_compression_method = chooseHTTPCompressionMethod(http_response_compression_methods); - if (http_response_compression_method == CompressionMethod::None) - return buf; + bool client_supports_http_compression = http_response_compression_method != CompressionMethod::None; - response.set("Content-Encoding", toContentEncodingName(http_response_compression_method)); - return wrapWriteBufferWithCompressionMethod(std::move(buf), http_response_compression_method, 1); + return std::make_shared( + response, + request.getMethod() == Poco::Net::HTTPRequest::HTTP_HEAD, + keep_alive_timeout, + client_supports_http_compression, + http_response_compression_method); } static inline void trySendExceptionToClient( @@ -68,7 +69,7 @@ static inline void trySendExceptionToClient( response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR); if (!response.sent()) - *response.send() << s << '\n'; + *response.send() << s << std::endl; else { if (out.count() != out.offset()) @@ -87,10 +88,10 @@ static inline void trySendExceptionToClient( } } -void StaticRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void StaticRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { auto keep_alive_timeout = server.context()->getServerSettings().keep_alive_timeout.totalSeconds(); - auto out = responseWriteBuffer(request, response, keep_alive_timeout); + const auto & out = responseWriteBuffer(request, response, keep_alive_timeout); try { diff --git a/src/Server/StaticRequestHandler.h b/src/Server/StaticRequestHandler.h index 38d774bb0aa..df9374d4409 100644 --- a/src/Server/StaticRequestHandler.h +++ b/src/Server/StaticRequestHandler.h @@ -29,7 +29,7 @@ public: void 
writeResponse(WriteBuffer & out); - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; }; } diff --git a/src/Server/TCPHandler.cpp b/src/Server/TCPHandler.cpp index b56df48a121..a563e0e0004 100644 --- a/src/Server/TCPHandler.cpp +++ b/src/Server/TCPHandler.cpp @@ -184,27 +184,23 @@ void validateClientInfo(const ClientInfo & session_client_info, const ClientInfo namespace DB { -TCPHandler::TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, bool parse_proxy_protocol_, std::string server_display_name_, const ProfileEvents::Event & read_event_, const ProfileEvents::Event & write_event_) +TCPHandler::TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, bool parse_proxy_protocol_, std::string server_display_name_) : Poco::Net::TCPServerConnection(socket_) , server(server_) , tcp_server(tcp_server_) , parse_proxy_protocol(parse_proxy_protocol_) , log(&Poco::Logger::get("TCPHandler")) - , read_event(read_event_) - , write_event(write_event_) , server_display_name(std::move(server_display_name_)) { } -TCPHandler::TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, TCPProtocolStackData & stack_data, std::string server_display_name_, const ProfileEvents::Event & read_event_, const ProfileEvents::Event & write_event_) +TCPHandler::TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, TCPProtocolStackData & stack_data, std::string server_display_name_) : Poco::Net::TCPServerConnection(socket_) , server(server_) , tcp_server(tcp_server_) , log(&Poco::Logger::get("TCPHandler")) , forwarded_for(stack_data.forwarded_for) , certificate(stack_data.certificate) - , read_event(read_event_) - , write_event(write_event_) , default_database(stack_data.default_database) , server_display_name(std::move(server_display_name_)) { @@ -237,8 +233,8 @@ void TCPHandler::runImpl() socket().setSendTimeout(send_timeout); socket().setNoDelay(true); - in = std::make_shared(socket(), read_event); - out = std::make_shared(socket(), write_event); + in = std::make_shared(socket()); + out = std::make_shared(socket()); /// Support for PROXY protocol if (parse_proxy_protocol && !receiveProxyHeader()) diff --git a/src/Server/TCPHandler.h b/src/Server/TCPHandler.h index 4eb84ee5eee..45c10b1c27d 100644 --- a/src/Server/TCPHandler.h +++ b/src/Server/TCPHandler.h @@ -147,8 +147,8 @@ public: * because it allows to check the IP ranges of the trusted proxy. * Proxy-forwarded (original client) IP address is used for quota accounting if quota is keyed by forwarded IP. 
*/ - TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, bool parse_proxy_protocol_, std::string server_display_name_, const ProfileEvents::Event & read_event_ = ProfileEvents::end(), const ProfileEvents::Event & write_event_ = ProfileEvents::end()); - TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, TCPProtocolStackData & stack_data, std::string server_display_name_, const ProfileEvents::Event & read_event_ = ProfileEvents::end(), const ProfileEvents::Event & write_event_ = ProfileEvents::end()); + TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, bool parse_proxy_protocol_, std::string server_display_name_); + TCPHandler(IServer & server_, TCPServer & tcp_server_, const Poco::Net::StreamSocket & socket_, TCPProtocolStackData & stack_data, std::string server_display_name_); ~TCPHandler() override; void run() override; @@ -191,9 +191,6 @@ private: std::shared_ptr in; std::shared_ptr out; - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; - /// Time after the last check to stop the request and send the progress. Stopwatch after_check_cancelled; Stopwatch after_send_progress; diff --git a/src/Server/TCPHandlerFactory.h b/src/Server/TCPHandlerFactory.h index 3eb032f4250..fde04c6e0ab 100644 --- a/src/Server/TCPHandlerFactory.h +++ b/src/Server/TCPHandlerFactory.h @@ -21,9 +21,6 @@ private: Poco::Logger * log; std::string server_display_name; - ProfileEvents::Event read_event; - ProfileEvents::Event write_event; - class DummyTCPHandler : public Poco::Net::TCPServerConnection { public: @@ -36,11 +33,9 @@ public: * and set the information about forwarded address accordingly. * See https://github.com/wolfeidau/proxyv2/blob/master/docs/proxy-protocol.txt */ - TCPHandlerFactory(IServer & server_, bool secure_, bool parse_proxy_protocol_, const ProfileEvents::Event & read_event_ = ProfileEvents::end(), const ProfileEvents::Event & write_event_ = ProfileEvents::end()) + TCPHandlerFactory(IServer & server_, bool secure_, bool parse_proxy_protocol_) : server(server_), parse_proxy_protocol(parse_proxy_protocol_) , log(&Poco::Logger::get(std::string("TCP") + (secure_ ? "S" : "") + "HandlerFactory")) - , read_event(read_event_) - , write_event(write_event_) { server_display_name = server.config().getString("display_name", getFQDNOrHostName()); } @@ -50,7 +45,8 @@ public: try { LOG_TRACE(log, "TCP Request. Address: {}", socket.peerAddress().toString()); - return new TCPHandler(server, tcp_server, socket, parse_proxy_protocol, server_display_name, read_event, write_event); + + return new TCPHandler(server, tcp_server, socket, parse_proxy_protocol, server_display_name); } catch (const Poco::Net::NetException &) { @@ -64,7 +60,8 @@ public: try { LOG_TRACE(log, "TCP Request. 
Address: {}", socket.peerAddress().toString()); - return new TCPHandler(server, tcp_server, socket, stack_data, server_display_name, read_event, write_event); + + return new TCPHandler(server, tcp_server, socket, stack_data, server_display_name); } catch (const Poco::Net::NetException &) { diff --git a/src/Server/WebUIRequestHandler.cpp b/src/Server/WebUIRequestHandler.cpp index ac7a3bfccf3..b26ec06b4c1 100644 --- a/src/Server/WebUIRequestHandler.cpp +++ b/src/Server/WebUIRequestHandler.cpp @@ -1,6 +1,5 @@ #include "WebUIRequestHandler.h" #include "IServer.h" -#include #include #include @@ -29,7 +28,7 @@ WebUIRequestHandler::WebUIRequestHandler(IServer & server_) } -void WebUIRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & /*write_event*/) +void WebUIRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) { auto keep_alive_timeout = server.context()->getServerSettings().keep_alive_timeout.totalSeconds(); @@ -43,7 +42,7 @@ void WebUIRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerR if (request.getURI().starts_with("/play")) { response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_OK); - WriteBufferFromHTTPServerResponse(response, request.getMethod() == HTTPRequest::HTTP_HEAD, keep_alive_timeout).write(reinterpret_cast(gresource_play_htmlData), gresource_play_htmlSize); + *response.send() << std::string_view(reinterpret_cast(gresource_play_htmlData), gresource_play_htmlSize); } else if (request.getURI().starts_with("/dashboard")) { @@ -59,17 +58,17 @@ void WebUIRequestHandler::handleRequest(HTTPServerRequest & request, HTTPServerR static re2::RE2 uplot_url = R"(https://[^\s"'`]+u[Pp]lot[^\s"'`]*\.js)"; RE2::Replace(&html, uplot_url, "/js/uplot.js"); - WriteBufferFromHTTPServerResponse(response, request.getMethod() == HTTPRequest::HTTP_HEAD, keep_alive_timeout).write(html); + *response.send() << html; } else if (request.getURI().starts_with("/binary")) { response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_OK); - WriteBufferFromHTTPServerResponse(response, request.getMethod() == HTTPRequest::HTTP_HEAD, keep_alive_timeout).write(reinterpret_cast(gresource_binary_htmlData), gresource_binary_htmlSize); + *response.send() << std::string_view(reinterpret_cast(gresource_binary_htmlData), gresource_binary_htmlSize); } else if (request.getURI() == "/js/uplot.js") { response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_OK); - WriteBufferFromHTTPServerResponse(response, request.getMethod() == HTTPRequest::HTTP_HEAD, keep_alive_timeout).write(reinterpret_cast(gresource_uplot_jsData), gresource_uplot_jsSize); + *response.send() << std::string_view(reinterpret_cast(gresource_uplot_jsData), gresource_uplot_jsSize); } else { diff --git a/src/Server/WebUIRequestHandler.h b/src/Server/WebUIRequestHandler.h index c52946e2089..09fe62d41c3 100644 --- a/src/Server/WebUIRequestHandler.h +++ b/src/Server/WebUIRequestHandler.h @@ -16,7 +16,7 @@ private: public: WebUIRequestHandler(IServer & server_); - void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response, const ProfileEvents::Event & write_event) override; + void handleRequest(HTTPServerRequest & request, HTTPServerResponse & response) override; }; } diff --git a/src/Storages/HDFS/StorageHDFS.cpp b/src/Storages/HDFS/StorageHDFS.cpp index cf4843a513d..4f263d6cea8 100644 --- a/src/Storages/HDFS/StorageHDFS.cpp +++ b/src/Storages/HDFS/StorageHDFS.cpp @@ -16,6 +16,9 @@ #include #include #include +#include 
+#include +#include #include #include @@ -400,22 +403,22 @@ ColumnsDescription StorageHDFS::getTableStructureFromData( class HDFSSource::DisclosedGlobIterator::Impl { public: - Impl(const String & uri, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context) + Impl(const String & uri, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context) { const auto [path_from_uri, uri_without_path] = getPathFromUriAndUriWithoutPath(uri); uris = getPathsList(path_from_uri, uri_without_path, context); - ASTPtr filter_ast; + ActionsDAGPtr filter_dag; if (!uris.empty()) - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, uris[0].path, context); + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); - if (filter_ast) + if (filter_dag) { std::vector paths; paths.reserve(uris.size()); for (const auto & path_with_info : uris) paths.push_back(path_with_info.path); - VirtualColumnUtils::filterByPathOrFile(uris, paths, query, virtual_columns, context, filter_ast); + VirtualColumnUtils::filterByPathOrFile(uris, paths, filter_dag, virtual_columns, context); } auto file_progress_callback = context->getFileProgressCallback(); @@ -448,21 +451,21 @@ private: class HDFSSource::URISIterator::Impl : WithContext { public: - explicit Impl(const std::vector & uris_, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context_) + explicit Impl(const std::vector & uris_, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context_) : WithContext(context_), uris(uris_), file_progress_callback(context_->getFileProgressCallback()) { - ASTPtr filter_ast; + ActionsDAGPtr filter_dag; if (!uris.empty()) - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, getPathFromUriAndUriWithoutPath(uris[0]).first, getContext()); + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); - if (filter_ast) + if (filter_dag) { std::vector paths; paths.reserve(uris.size()); for (const auto & uri : uris) paths.push_back(getPathFromUriAndUriWithoutPath(uri).first); - VirtualColumnUtils::filterByPathOrFile(uris, paths, query, virtual_columns, getContext(), filter_ast); + VirtualColumnUtils::filterByPathOrFile(uris, paths, filter_dag, virtual_columns, getContext()); } if (!uris.empty()) @@ -509,16 +512,16 @@ private: std::function file_progress_callback; }; -HDFSSource::DisclosedGlobIterator::DisclosedGlobIterator(const String & uri, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context) - : pimpl(std::make_shared(uri, query, virtual_columns, context)) {} +HDFSSource::DisclosedGlobIterator::DisclosedGlobIterator(const String & uri, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context) + : pimpl(std::make_shared(uri, predicate, virtual_columns, context)) {} StorageHDFS::PathWithInfo HDFSSource::DisclosedGlobIterator::next() { return pimpl->next(); } -HDFSSource::URISIterator::URISIterator(const std::vector & uris_, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context) - : pimpl(std::make_shared(uris_, query, virtual_columns, context)) +HDFSSource::URISIterator::URISIterator(const std::vector & uris_, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context) + : 
pimpl(std::make_shared(uris_, predicate, virtual_columns, context)) { } @@ -533,8 +536,7 @@ HDFSSource::HDFSSource( ContextPtr context_, UInt64 max_block_size_, std::shared_ptr file_iterator_, - bool need_only_count_, - const SelectQueryInfo & query_info_) + bool need_only_count_) : ISource(info.source_header, false) , WithContext(context_) , storage(std::move(storage_)) @@ -545,7 +547,6 @@ HDFSSource::HDFSSource( , file_iterator(file_iterator_) , columns_description(info.columns_description) , need_only_count(need_only_count_) - , query_info(query_info_) { initialize(); } @@ -835,7 +836,57 @@ bool StorageHDFS::supportsSubsetOfColumns(const ContextPtr & context_) const return FormatFactory::instance().checkIfFormatSupportsSubsetOfColumns(format_name, context_); } -Pipe StorageHDFS::read( +class ReadFromHDFS : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromHDFS"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromHDFS( + Block sample_block, + ReadFromFormatInfo info_, + bool need_only_count_, + std::shared_ptr storage_, + ContextPtr context_, + size_t max_block_size_, + size_t num_streams_) + : SourceStepWithFilter(DataStream{.header = std::move(sample_block)}) + , info(std::move(info_)) + , need_only_count(need_only_count_) + , storage(std::move(storage_)) + , context(std::move(context_)) + , max_block_size(max_block_size_) + , num_streams(num_streams_) + { + } + +private: + ReadFromFormatInfo info; + const bool need_only_count; + std::shared_ptr storage; + + ContextPtr context; + size_t max_block_size; + size_t num_streams; + + std::shared_ptr iterator_wrapper; + + void createIterator(const ActionsDAG::Node * predicate); +}; + +void ReadFromHDFS::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createIterator(predicate); +} + +void StorageHDFS::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -844,18 +895,40 @@ Pipe StorageHDFS::read( size_t max_block_size, size_t num_streams) { - std::shared_ptr iterator_wrapper{nullptr}; - if (distributed_processing) + auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(context_), virtual_columns); + bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) + && context_->getSettingsRef().optimize_count_from_files; + + auto this_ptr = std::static_pointer_cast(shared_from_this()); + + auto reading = std::make_unique( + read_from_format_info.source_header, + std::move(read_from_format_info), + need_only_count, + std::move(this_ptr), + context_, + max_block_size, + num_streams); + + query_plan.addStep(std::move(reading)); +} + +void ReadFromHDFS::createIterator(const ActionsDAG::Node * predicate) +{ + if (iterator_wrapper) + return; + + if (storage->distributed_processing) { iterator_wrapper = std::make_shared( - [callback = context_->getReadTaskCallback()]() -> StorageHDFS::PathWithInfo { + [callback = context->getReadTaskCallback()]() -> StorageHDFS::PathWithInfo { return StorageHDFS::PathWithInfo{callback(), std::nullopt}; }); } - else if (is_path_with_globs) + else if (storage->is_path_with_globs) { /// Iterate 
through disclosed globs and make a source for each file - auto glob_iterator = std::make_shared(uris[0], query_info.query, virtual_columns, context_); + auto glob_iterator = std::make_shared(storage->uris[0], predicate, storage->virtual_columns, context); iterator_wrapper = std::make_shared([glob_iterator]() { return glob_iterator->next(); @@ -863,31 +936,38 @@ Pipe StorageHDFS::read( } else { - auto uris_iterator = std::make_shared(uris, query_info.query, virtual_columns, context_); + auto uris_iterator = std::make_shared(storage->uris, predicate, storage->virtual_columns, context); iterator_wrapper = std::make_shared([uris_iterator]() { return uris_iterator->next(); }); } +} - auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(context_), getVirtuals()); - bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) - && context_->getSettingsRef().optimize_count_from_files; +void ReadFromHDFS::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + createIterator(nullptr); Pipes pipes; - auto this_ptr = std::static_pointer_cast(shared_from_this()); for (size_t i = 0; i < num_streams; ++i) { pipes.emplace_back(std::make_shared( - read_from_format_info, - this_ptr, - context_, + info, + storage, + context, max_block_size, iterator_wrapper, - need_only_count, - query_info)); + need_only_count)); } - return Pipe::unitePipes(std::move(pipes)); + + auto pipe = Pipe::unitePipes(std::move(pipes)); + if (pipe.empty()) + pipe = Pipe(std::make_shared(info.source_header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } SinkToStoragePtr StorageHDFS::write(const ASTPtr & query, const StorageMetadataPtr & metadata_snapshot, ContextPtr context_, bool /*async_insert*/) diff --git a/src/Storages/HDFS/StorageHDFS.h b/src/Storages/HDFS/StorageHDFS.h index 18eeb787d77..f1f0019d3e0 100644 --- a/src/Storages/HDFS/StorageHDFS.h +++ b/src/Storages/HDFS/StorageHDFS.h @@ -51,7 +51,8 @@ public: String getName() const override { return "HDFS"; } - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -93,6 +94,7 @@ public: protected: friend class HDFSSource; + friend class ReadFromHDFS; private: std::vector uris; @@ -114,7 +116,7 @@ public: class DisclosedGlobIterator { public: - DisclosedGlobIterator(const String & uri_, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context); + DisclosedGlobIterator(const String & uri_, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context); StorageHDFS::PathWithInfo next(); private: class Impl; @@ -125,7 +127,7 @@ public: class URISIterator { public: - URISIterator(const std::vector & uris_, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context); + URISIterator(const std::vector & uris_, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context); StorageHDFS::PathWithInfo next(); private: class Impl; @@ -142,8 +144,7 @@ public: ContextPtr context_, UInt64 max_block_size_, std::shared_ptr file_iterator_, - bool need_only_count_, - const SelectQueryInfo & query_info_); + bool need_only_count_); String getName() const override; @@ -162,7 +163,6 @@ private: ColumnsDescription 
columns_description; bool need_only_count; size_t total_rows_in_file = 0; - SelectQueryInfo query_info; std::unique_ptr read_buf; std::shared_ptr input_format; diff --git a/src/Storages/HDFS/StorageHDFSCluster.cpp b/src/Storages/HDFS/StorageHDFSCluster.cpp index bff22936e95..2e8129b9845 100644 --- a/src/Storages/HDFS/StorageHDFSCluster.cpp +++ b/src/Storages/HDFS/StorageHDFSCluster.cpp @@ -79,9 +79,9 @@ void StorageHDFSCluster::addColumnsStructureToQuery(ASTPtr & query, const String } -RemoteQueryExecutor::Extension StorageHDFSCluster::getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const +RemoteQueryExecutor::Extension StorageHDFSCluster::getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const { - auto iterator = std::make_shared(uri, query, virtual_columns, context); + auto iterator = std::make_shared(uri, predicate, virtual_columns, context); auto callback = std::make_shared>([iter = std::move(iterator)]() mutable -> String { return iter->next().path; }); return RemoteQueryExecutor::Extension{.task_iterator = std::move(callback)}; } diff --git a/src/Storages/HDFS/StorageHDFSCluster.h b/src/Storages/HDFS/StorageHDFSCluster.h index 8ad4a83c5b9..7c4c41a573a 100644 --- a/src/Storages/HDFS/StorageHDFSCluster.h +++ b/src/Storages/HDFS/StorageHDFSCluster.h @@ -35,7 +35,7 @@ public: NamesAndTypesList getVirtuals() const override; - RemoteQueryExecutor::Extension getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const override; + RemoteQueryExecutor::Extension getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const override; bool supportsSubcolumns() const override { return true; } diff --git a/src/Storages/Hive/StorageHive.cpp b/src/Storages/Hive/StorageHive.cpp index f03136e4edf..0c4e4f956a0 100644 --- a/src/Storages/Hive/StorageHive.cpp +++ b/src/Storages/Hive/StorageHive.cpp @@ -29,10 +29,14 @@ #include #include #include +#include #include #include #include #include +#include +#include +#include #include #include #include @@ -123,7 +127,6 @@ public: String compression_method_, Block sample_block_, ContextPtr context_, - const SelectQueryInfo & query_info_, UInt64 max_block_size_, const StorageHive & storage_, const Names & text_input_field_names_ = {}) @@ -140,7 +143,6 @@ public: , text_input_field_names(text_input_field_names_) , format_settings(getFormatSettings(getContext())) , read_settings(getContext()->getReadSettings()) - , query_info(query_info_) { to_read_block = sample_block; @@ -395,7 +397,6 @@ private: const Names & text_input_field_names; FormatSettings format_settings; ReadSettings read_settings; - SelectQueryInfo query_info; HiveFilePtr current_file; String current_path; @@ -574,7 +575,7 @@ static HiveFilePtr createHiveFile( HiveFiles StorageHive::collectHiveFilesFromPartition( const Apache::Hadoop::Hive::Partition & partition, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, const HiveTableMetadataPtr & hive_table_metadata, const HDFSFSPtr & fs, const ContextPtr & context_, @@ -638,7 +639,7 @@ HiveFiles StorageHive::collectHiveFilesFromPartition( for (size_t i = 0; i < partition_names.size(); ++i) ranges.emplace_back(fields[i]); - const KeyCondition partition_key_condition(query_info, getContext(), partition_names, partition_minmax_idx_expr); + const KeyCondition partition_key_condition(filter_actions_dag, getContext(), partition_names, partition_minmax_idx_expr); if (!partition_key_condition.checkInHyperrectangle(ranges, 
partition_types).can_be_true) return {}; } @@ -648,7 +649,7 @@ HiveFiles StorageHive::collectHiveFilesFromPartition( hive_files.reserve(file_infos.size()); for (const auto & file_info : file_infos) { - auto hive_file = getHiveFileIfNeeded(file_info, fields, query_info, hive_table_metadata, context_, prune_level); + auto hive_file = getHiveFileIfNeeded(file_info, fields, filter_actions_dag, hive_table_metadata, context_, prune_level); if (hive_file) { LOG_TRACE( @@ -672,7 +673,7 @@ StorageHive::listDirectory(const String & path, const HiveTableMetadataPtr & hiv HiveFilePtr StorageHive::getHiveFileIfNeeded( const FileInfo & file_info, const FieldVector & fields, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, const HiveTableMetadataPtr & hive_table_metadata, const ContextPtr & context_, PruneLevel prune_level) const @@ -706,7 +707,7 @@ HiveFilePtr StorageHive::getHiveFileIfNeeded( if (prune_level >= PruneLevel::File) { - const KeyCondition hivefile_key_condition(query_info, getContext(), hivefile_name_types.getNames(), hivefile_minmax_idx_expr); + const KeyCondition hivefile_key_condition(filter_actions_dag, getContext(), hivefile_name_types.getNames(), hivefile_minmax_idx_expr); if (hive_file->useFileMinMaxIndex()) { /// Load file level minmax index and apply @@ -758,10 +759,77 @@ bool StorageHive::supportsSubsetOfColumns() const return format_name == "Parquet" || format_name == "ORC"; } -Pipe StorageHive::read( +class ReadFromHive : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromHive"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromHive( + Block header, + std::shared_ptr storage_, + std::shared_ptr sources_info_, + HDFSBuilderWrapper builder_, + HDFSFSPtr fs_, + HiveMetastoreClient::HiveTableMetadataPtr hive_table_metadata_, + Block sample_block_, + Poco::Logger * log_, + ContextPtr context_, + size_t max_block_size_, + size_t num_streams_) + : SourceStepWithFilter(DataStream{.header = std::move(header)}) + , storage(std::move(storage_)) + , sources_info(std::move(sources_info_)) + , builder(std::move(builder_)) + , fs(std::move(fs_)) + , hive_table_metadata(std::move(hive_table_metadata_)) + , sample_block(std::move(sample_block_)) + , log(log_) + , context(std::move(context_)) + , max_block_size(max_block_size_) + , num_streams(num_streams_) + { + } + +private: + std::shared_ptr storage; + std::shared_ptr sources_info; + HDFSBuilderWrapper builder; + HDFSFSPtr fs; + HiveMetastoreClient::HiveTableMetadataPtr hive_table_metadata; + Block sample_block; + Poco::Logger * log; + + ContextPtr context; + size_t max_block_size; + size_t num_streams; + + std::optional hive_files; + + void createFiles(const ActionsDAGPtr & filter_actions_dag); +}; + +void ReadFromHive::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + createFiles(filter_actions_dag); +} + +void ReadFromHive::createFiles(const ActionsDAGPtr & filter_actions_dag) +{ + if (hive_files) + return; + + hive_files = storage->collectHiveFiles(num_streams, filter_actions_dag, hive_table_metadata, fs, context); + LOG_INFO(log, "Collect {} hive files to read", hive_files->size()); +} + +void StorageHive::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, - SelectQueryInfo & query_info, + SelectQueryInfo &, ContextPtr context_, 
QueryProcessingStage::Enum /* processed_stage */, size_t max_block_size, @@ -774,15 +842,7 @@ Pipe StorageHive::read( auto hive_metastore_client = HiveMetastoreClientFactory::instance().getOrCreate(hive_metastore_url); auto hive_table_metadata = hive_metastore_client->getTableMetadata(hive_database, hive_table); - /// Collect Hive files to read - HiveFiles hive_files = collectHiveFiles(num_streams, query_info, hive_table_metadata, fs, context_); - LOG_INFO(log, "Collect {} hive files to read", hive_files.size()); - - if (hive_files.empty()) - return {}; - auto sources_info = std::make_shared(); - sources_info->hive_files = std::move(hive_files); sources_info->database_name = hive_database; sources_info->table_name = hive_table; sources_info->hive_metastore_client = hive_metastore_client; @@ -822,6 +882,36 @@ Pipe StorageHive::read( sources_info->need_file_column = true; } + auto this_ptr = std::static_pointer_cast(shared_from_this()); + + auto reading = std::make_unique( + StorageHiveSource::getHeader(sample_block, sources_info), + std::move(this_ptr), + std::move(sources_info), + std::move(builder), + std::move(fs), + std::move(hive_table_metadata), + std::move(sample_block), + log, + context_, + max_block_size, + num_streams); + + query_plan.addStep(std::move(reading)); +} + +void ReadFromHive::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + createFiles(nullptr); + + if (hive_files->empty()) + { + pipeline.init(Pipe(std::make_shared(getOutputStream().header))); + return; + } + + sources_info->hive_files = std::move(*hive_files); + if (num_streams > sources_info->hive_files.size()) num_streams = sources_info->hive_files.size(); @@ -830,22 +920,29 @@ Pipe StorageHive::read( { pipes.emplace_back(std::make_shared( sources_info, - hdfs_namenode_url, - format_name, - compression_method, + storage->hdfs_namenode_url, + storage->format_name, + storage->compression_method, sample_block, - context_, - query_info, + context, max_block_size, - *this, - text_input_field_names)); + *storage, + storage->text_input_field_names)); } - return Pipe::unitePipes(std::move(pipes)); + + auto pipe = Pipe::unitePipes(std::move(pipes)); + if (pipe.empty()) + pipe = Pipe(std::make_shared(getOutputStream().header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } HiveFiles StorageHive::collectHiveFiles( size_t max_threads, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, const HiveTableMetadataPtr & hive_table_metadata, const HDFSFSPtr & fs, const ContextPtr & context_, @@ -871,7 +968,7 @@ HiveFiles StorageHive::collectHiveFiles( [&]() { auto hive_files_in_partition - = collectHiveFilesFromPartition(partition, query_info, hive_table_metadata, fs, context_, prune_level); + = collectHiveFilesFromPartition(partition, filter_actions_dag, hive_table_metadata, fs, context_, prune_level); if (!hive_files_in_partition.empty()) { std::lock_guard lock(hive_files_mutex); @@ -897,7 +994,7 @@ HiveFiles StorageHive::collectHiveFiles( pool.scheduleOrThrowOnError( [&]() { - auto hive_file = getHiveFileIfNeeded(file_info, {}, query_info, hive_table_metadata, context_, prune_level); + auto hive_file = getHiveFileIfNeeded(file_info, {}, filter_actions_dag, hive_table_metadata, context_, prune_level); if (hive_file) { std::lock_guard lock(hive_files_mutex); @@ -925,13 +1022,12 @@ NamesAndTypesList StorageHive::getVirtuals() const std::optional StorageHive::totalRows(const 
Settings & settings) const { /// query_info is not used when prune_level == PruneLevel::None - SelectQueryInfo query_info; - return totalRowsImpl(settings, query_info, getContext(), PruneLevel::None); + return totalRowsImpl(settings, nullptr, getContext(), PruneLevel::None); } -std::optional StorageHive::totalRowsByPartitionPredicate(const SelectQueryInfo & query_info, ContextPtr context_) const +std::optional StorageHive::totalRowsByPartitionPredicate(const ActionsDAGPtr & filter_actions_dag, ContextPtr context_) const { - return totalRowsImpl(context_->getSettingsRef(), query_info, context_, PruneLevel::Partition); + return totalRowsImpl(context_->getSettingsRef(), filter_actions_dag, context_, PruneLevel::Partition); } void StorageHive::checkAlterIsPossible(const AlterCommands & commands, ContextPtr /*local_context*/) const @@ -946,7 +1042,7 @@ void StorageHive::checkAlterIsPossible(const AlterCommands & commands, ContextPt } std::optional -StorageHive::totalRowsImpl(const Settings & settings, const SelectQueryInfo & query_info, ContextPtr context_, PruneLevel prune_level) const +StorageHive::totalRowsImpl(const Settings & settings, const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, PruneLevel prune_level) const { /// Row-based format like Text doesn't support totalRowsByPartitionPredicate if (!supportsSubsetOfColumns()) @@ -958,7 +1054,7 @@ StorageHive::totalRowsImpl(const Settings & settings, const SelectQueryInfo & qu HDFSFSPtr fs = createHDFSFS(builder.get()); HiveFiles hive_files = collectHiveFiles( settings.max_threads, - query_info, + filter_actions_dag, hive_table_metadata, fs, context_, diff --git a/src/Storages/Hive/StorageHive.h b/src/Storages/Hive/StorageHive.h index 8b378bf9e54..b0ec96604cc 100644 --- a/src/Storages/Hive/StorageHive.h +++ b/src/Storages/Hive/StorageHive.h @@ -42,10 +42,11 @@ public: bool supportsSubcolumns() const override { return true; } - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, - SelectQueryInfo & query_info, + SelectQueryInfo &, ContextPtr context, QueryProcessingStage::Enum processed_stage, size_t max_block_size, @@ -58,9 +59,12 @@ public: bool supportsSubsetOfColumns() const; std::optional totalRows(const Settings & settings) const override; - std::optional totalRowsByPartitionPredicate(const SelectQueryInfo & query_info, ContextPtr context_) const override; + std::optional totalRowsByPartitionPredicate(const ActionsDAGPtr & filter_actions_dag, ContextPtr context_) const override; void checkAlterIsPossible(const AlterCommands & commands, ContextPtr local_context) const override; +protected: + friend class ReadFromHive; + private: using FileFormat = IHiveFile::FileFormat; using FileInfo = HiveMetastoreClient::FileInfo; @@ -88,7 +92,7 @@ private: HiveFiles collectHiveFiles( size_t max_threads, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, const HiveTableMetadataPtr & hive_table_metadata, const HDFSFSPtr & fs, const ContextPtr & context_, @@ -96,7 +100,7 @@ private: HiveFiles collectHiveFilesFromPartition( const Apache::Hadoop::Hive::Partition & partition, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, const HiveTableMetadataPtr & hive_table_metadata, const HDFSFSPtr & fs, const ContextPtr & context_, @@ -105,7 +109,7 @@ private: HiveFilePtr getHiveFileIfNeeded( const FileInfo & file_info, const FieldVector & fields, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & 
filter_actions_dag, const HiveTableMetadataPtr & hive_table_metadata, const ContextPtr & context_, PruneLevel prune_level = PruneLevel::Max) const; @@ -113,7 +117,7 @@ private: void lazyInitialize(); std::optional - totalRowsImpl(const Settings & settings, const SelectQueryInfo & query_info, ContextPtr context_, PruneLevel prune_level) const; + totalRowsImpl(const Settings & settings, const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, PruneLevel prune_level) const; String hive_metastore_url; diff --git a/src/Storages/IStorage.h b/src/Storages/IStorage.h index 1102c77ca58..4fa6bfdd617 100644 --- a/src/Storages/IStorage.h +++ b/src/Storages/IStorage.h @@ -669,7 +669,7 @@ public: virtual std::optional totalRows(const Settings &) const { return {}; } /// Same as above but also take partition predicate into account. - virtual std::optional totalRowsByPartitionPredicate(const SelectQueryInfo &, ContextPtr) const { return {}; } + virtual std::optional totalRowsByPartitionPredicate(const ActionsDAGPtr &, ContextPtr) const { return {}; } /// If it is possible to quickly determine exact number of bytes for the table on storage: /// - memory (approximated, resident) diff --git a/src/Storages/IStorageCluster.cpp b/src/Storages/IStorageCluster.cpp index 1447dad1374..6f42d8f855c 100644 --- a/src/Storages/IStorageCluster.cpp +++ b/src/Storages/IStorageCluster.cpp @@ -1,7 +1,7 @@ -#include "Storages/IStorageCluster.h" +#include -#include "Common/Exception.h" -#include "Core/QueryProcessingStage.h" +#include +#include #include #include #include @@ -11,11 +11,14 @@ #include #include #include +#include +#include +#include +#include #include #include -#include #include -#include +#include #include #include #include @@ -38,9 +41,66 @@ IStorageCluster::IStorageCluster( { } +class ReadFromCluster : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromCluster"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromCluster( + Block sample_block, + std::shared_ptr storage_, + ASTPtr query_to_send_, + QueryProcessingStage::Enum processed_stage_, + ClusterPtr cluster_, + Poco::Logger * log_, + ContextPtr context_) + : SourceStepWithFilter(DataStream{.header = std::move(sample_block)}) + , storage(std::move(storage_)) + , query_to_send(std::move(query_to_send_)) + , processed_stage(processed_stage_) + , cluster(std::move(cluster_)) + , log(log_) + , context(std::move(context_)) + { + } + +private: + std::shared_ptr storage; + ASTPtr query_to_send; + QueryProcessingStage::Enum processed_stage; + ClusterPtr cluster; + Poco::Logger * log; + ContextPtr context; + + std::optional extension; + + void createExtension(const ActionsDAG::Node * predicate); + ContextPtr updateSettings(const Settings & settings); +}; + +void ReadFromCluster::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createExtension(predicate); +} + +void ReadFromCluster::createExtension(const ActionsDAG::Node * predicate) +{ + if (extension) + return; + + extension = storage->getTaskIteratorExtension(predicate, context); +} /// The code executes on initiator -Pipe IStorageCluster::read( +void IStorageCluster::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, 
SelectQueryInfo & query_info, @@ -49,10 +109,10 @@ Pipe IStorageCluster::read( size_t /*max_block_size*/, size_t /*num_streams*/) { - updateBeforeRead(context); + storage_snapshot->check(column_names); + updateBeforeRead(context); auto cluster = getCluster(context); - auto extension = getTaskIteratorExtension(query_info.query, context); /// Calculate the header. This is significant, because some columns could be thrown away in some cases like query with count(*) @@ -70,12 +130,6 @@ Pipe IStorageCluster::read( query_to_send = interpreter.getQueryInfo().query->clone(); } - const Scalars & scalars = context->hasQueryContext() ? context->getQueryContext()->getScalars() : Scalars{}; - - Pipes pipes; - - const bool add_agg_info = processed_stage == QueryProcessingStage::WithMergeableState; - if (!structure_argument_was_provided) addColumnsStructureToQuery(query_to_send, storage_snapshot->metadata->getColumns().getAll().toNamesAndTypesDescription(), context); @@ -89,7 +143,29 @@ Pipe IStorageCluster::read( /* only_replace_in_join_= */true); visitor.visit(query_to_send); - auto new_context = updateSettings(context, context->getSettingsRef()); + auto this_ptr = std::static_pointer_cast(shared_from_this()); + + auto reading = std::make_unique( + sample_block, + std::move(this_ptr), + std::move(query_to_send), + processed_stage, + cluster, + log, + context); + + query_plan.addStep(std::move(reading)); +} + +void ReadFromCluster::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + createExtension(nullptr); + + const Scalars & scalars = context->hasQueryContext() ? context->getQueryContext()->getScalars() : Scalars{}; + const bool add_agg_info = processed_stage == QueryProcessingStage::WithMergeableState; + + Pipes pipes; + auto new_context = updateSettings(context->getSettingsRef()); const auto & current_settings = new_context->getSettingsRef(); auto timeouts = ConnectionTimeouts::getTCPTimeoutsWithFailover(current_settings); for (const auto & shard_info : cluster->getShardsInfo()) @@ -100,7 +176,7 @@ Pipe IStorageCluster::read( auto remote_query_executor = std::make_shared( std::vector{try_result}, queryToString(query_to_send), - sample_block, + getOutputStream().header, new_context, /*throttler=*/nullptr, scalars, @@ -113,8 +189,14 @@ Pipe IStorageCluster::read( } } - storage_snapshot->check(column_names); - return Pipe::unitePipes(std::move(pipes)); + auto pipe = Pipe::unitePipes(std::move(pipes)); + if (pipe.empty()) + pipe = Pipe(std::make_shared(getOutputStream().header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } QueryProcessingStage::Enum IStorageCluster::getQueryProcessingStage( @@ -129,7 +211,7 @@ QueryProcessingStage::Enum IStorageCluster::getQueryProcessingStage( return QueryProcessingStage::Enum::FetchColumns; } -ContextPtr IStorageCluster::updateSettings(ContextPtr context, const Settings & settings) +ContextPtr ReadFromCluster::updateSettings(const Settings & settings) { Settings new_settings = settings; diff --git a/src/Storages/IStorageCluster.h b/src/Storages/IStorageCluster.h index b15ed37202a..b233f20103d 100644 --- a/src/Storages/IStorageCluster.h +++ b/src/Storages/IStorageCluster.h @@ -22,7 +22,8 @@ public: Poco::Logger * log_, bool structure_argument_was_provided_); - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -33,7 +34,7 @@ public: 
ClusterPtr getCluster(ContextPtr context) const; /// Query is needed for pruning by virtual columns (_file, _path) - virtual RemoteQueryExecutor::Extension getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const = 0; + virtual RemoteQueryExecutor::Extension getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const = 0; QueryProcessingStage::Enum getQueryProcessingStage(ContextPtr, QueryProcessingStage::Enum, const StorageSnapshotPtr &, SelectQueryInfo &) const override; @@ -45,8 +46,6 @@ protected: virtual void addColumnsStructureToQuery(ASTPtr & query, const String & structure, const ContextPtr & context) = 0; private: - ContextPtr updateSettings(ContextPtr context, const Settings & settings); - Poco::Logger * log; String cluster_name; bool structure_argument_was_provided; diff --git a/src/Storages/MergeTree/KeyCondition.cpp b/src/Storages/MergeTree/KeyCondition.cpp index 1cc672fb98f..d5922ae1bc2 100644 --- a/src/Storages/MergeTree/KeyCondition.cpp +++ b/src/Storages/MergeTree/KeyCondition.cpp @@ -762,92 +762,6 @@ void KeyCondition::getAllSpaceFillingCurves() } } -KeyCondition::KeyCondition( - const ASTPtr & query, - const ASTs & additional_filter_asts, - Block block_with_constants, - PreparedSetsPtr prepared_sets, - ContextPtr context, - const Names & key_column_names, - const ExpressionActionsPtr & key_expr_, - NameSet array_joined_column_names_, - bool single_point_, - bool strict_) - : key_expr(key_expr_) - , key_subexpr_names(getAllSubexpressionNames(*key_expr)) - , array_joined_column_names(std::move(array_joined_column_names_)) - , single_point(single_point_) - , strict(strict_) -{ - size_t key_index = 0; - for (const auto & name : key_column_names) - { - if (!key_columns.contains(name)) - { - key_columns[name] = key_columns.size(); - key_indices.push_back(key_index); - } - ++key_index; - } - - if (context->getSettingsRef().analyze_index_with_space_filling_curves) - getAllSpaceFillingCurves(); - - ASTPtr filter_node; - if (query) - filter_node = buildFilterNode(query, additional_filter_asts); - - if (!filter_node) - { - has_filter = false; - rpn.emplace_back(RPNElement::FUNCTION_UNKNOWN); - return; - } - - has_filter = true; - - /** When non-strictly monotonic functions are employed in functional index (e.g. ORDER BY toStartOfHour(dateTime)), - * the use of NOT operator in predicate will result in the indexing algorithm leave out some data. - * This is caused by rewriting in KeyCondition::tryParseAtomFromAST of relational operators to less strict - * when parsing the AST into internal RPN representation. - * To overcome the problem, before parsing the AST we transform it to its semantically equivalent form where all NOT's - * are pushed down and applied (when possible) to leaf nodes. 
- */ - auto inverted_filter_node = DB::cloneASTWithInversionPushDown(filter_node); - - RPNBuilder builder( - inverted_filter_node, - std::move(context), - std::move(block_with_constants), - std::move(prepared_sets), - [&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); }); - - rpn = std::move(builder).extractRPN(); - - findHyperrectanglesForArgumentsOfSpaceFillingCurves(); -} - -KeyCondition::KeyCondition( - const SelectQueryInfo & query_info, - ContextPtr context, - const Names & key_column_names, - const ExpressionActionsPtr & key_expr_, - bool single_point_, - bool strict_) - : KeyCondition( - query_info.query, - query_info.filter_asts, - KeyCondition::getBlockWithConstants(query_info.query, query_info.syntax_analyzer_result, context), - query_info.prepared_sets, - context, - key_column_names, - key_expr_, - query_info.syntax_analyzer_result ? query_info.syntax_analyzer_result->getArrayJoinSourceNameSet() : NameSet{}, - single_point_, - strict_) -{ -} - KeyCondition::KeyCondition( ActionsDAGPtr filter_dag, ContextPtr context, @@ -883,6 +797,13 @@ KeyCondition::KeyCondition( has_filter = true; + /** When non-strictly monotonic functions are employed in functional index (e.g. ORDER BY toStartOfHour(dateTime)), + * the use of NOT operator in predicate will result in the indexing algorithm leaving out some data. + * This is caused by rewriting in KeyCondition::tryParseAtomFromAST of relational operators to less strict + * when parsing the AST into internal RPN representation. + * To overcome the problem, before parsing the AST we transform it to its semantically equivalent form where all NOT's + * are pushed down and applied (when possible) to leaf nodes. + */ auto inverted_dag = cloneASTWithInversionPushDown({filter_dag->getOutputs().at(0)}, context); assert(inverted_dag->getOutputs().size() == 1); diff --git a/src/Storages/MergeTree/KeyCondition.h b/src/Storages/MergeTree/KeyCondition.h index 980c248835d..6e248dd664a 100644 --- a/src/Storages/MergeTree/KeyCondition.h +++ b/src/Storages/MergeTree/KeyCondition.h @@ -39,30 +39,6 @@ struct ActionDAGNodes; class KeyCondition { public: - /// Construct key condition from AST SELECT query WHERE, PREWHERE and additional filters - KeyCondition( - const ASTPtr & query, - const ASTs & additional_filter_asts, - Block block_with_constants, - PreparedSetsPtr prepared_sets_, - ContextPtr context, - const Names & key_column_names, - const ExpressionActionsPtr & key_expr, - NameSet array_joined_column_names, - bool single_point_ = false, - bool strict_ = false); - - /** Construct key condition from AST SELECT query WHERE, PREWHERE and additional filters. - * Select query, additional filters, prepared sets are initialized using query info.
- */ - KeyCondition( - const SelectQueryInfo & query_info, - ContextPtr context, - const Names & key_column_names, - const ExpressionActionsPtr & key_expr_, - bool single_point_ = false, - bool strict_ = false); - /// Construct key condition from ActionsDAG nodes KeyCondition( ActionsDAGPtr filter_dag, diff --git a/src/Storages/MergeTree/MergeFromLogEntryTask.cpp b/src/Storages/MergeTree/MergeFromLogEntryTask.cpp index 3d8bc62b5cc..23037b1ee7a 100644 --- a/src/Storages/MergeTree/MergeFromLogEntryTask.cpp +++ b/src/Storages/MergeTree/MergeFromLogEntryTask.cpp @@ -43,6 +43,8 @@ ReplicatedMergeMutateTaskBase::PrepareResult MergeFromLogEntryTask::prepare() LOG_TRACE(log, "Executing log entry to merge parts {} to {}", fmt::join(entry.source_parts, ", "), entry.new_part_name); + StorageMetadataPtr metadata_snapshot = storage.getInMemoryMetadataPtr(); + int32_t metadata_version = metadata_snapshot->getMetadataVersion(); const auto storage_settings_ptr = storage.getSettings(); if (storage_settings_ptr->always_fetch_merged_part) @@ -129,6 +131,18 @@ ReplicatedMergeMutateTaskBase::PrepareResult MergeFromLogEntryTask::prepare() }; } + int32_t part_metadata_version = source_part_or_covering->getMetadataVersion(); + if (part_metadata_version > metadata_version) + { + LOG_DEBUG(log, "Source part metadata version {} is newer than the table metadata version {}. ALTER_METADATA is still in progress.", + part_metadata_version, metadata_version); + return PrepareResult{ + .prepared_successfully = false, + .need_to_check_missing_part_in_fetch = false, + .part_log_writer = {} + }; + } + parts.push_back(source_part_or_covering); } @@ -176,8 +190,6 @@ ReplicatedMergeMutateTaskBase::PrepareResult MergeFromLogEntryTask::prepare() /// It will live until the whole task is being destroyed table_lock_holder = storage.lockForShare(RWLockImpl::NO_QUERY, storage_settings_ptr->lock_acquire_timeout_for_background_operations); - StorageMetadataPtr metadata_snapshot = storage.getInMemoryMetadataPtr(); - auto future_merged_part = std::make_shared(parts, entry.new_part_format); if (future_merged_part->name != entry.new_part_name) { diff --git a/src/Storages/MergeTree/MergeTask.cpp b/src/Storages/MergeTree/MergeTask.cpp index 786960beb37..4b5b7ca8018 100644 --- a/src/Storages/MergeTree/MergeTask.cpp +++ b/src/Storages/MergeTree/MergeTask.cpp @@ -570,6 +570,7 @@ void MergeTask::VerticalMergeStage::prepareVerticalMergeForOneColumn() const for (size_t part_num = 0; part_num < global_ctx->future_part->parts.size(); ++part_num) { Pipe pipe = createMergeTreeSequentialSource( + MergeTreeSequentialSourceType::Merge, *global_ctx->data, global_ctx->storage_snapshot, global_ctx->future_part->parts[part_num], @@ -925,6 +926,7 @@ void MergeTask::ExecuteAndFinalizeHorizontalPart::createMergedStream() for (const auto & part : global_ctx->future_part->parts) { Pipe pipe = createMergeTreeSequentialSource( + MergeTreeSequentialSourceType::Merge, *global_ctx->data, global_ctx->storage_snapshot, part, diff --git a/src/Storages/MergeTree/MergeTreeData.cpp b/src/Storages/MergeTree/MergeTreeData.cpp index 1c80778f1ca..e6b0c581f27 100644 --- a/src/Storages/MergeTree/MergeTreeData.cpp +++ b/src/Storages/MergeTree/MergeTreeData.cpp @@ -1075,26 +1075,30 @@ Block MergeTreeData::getBlockWithVirtualPartColumns(const MergeTreeData::DataPar std::optional MergeTreeData::totalRowsByPartitionPredicateImpl( - const SelectQueryInfo & query_info, ContextPtr local_context, const DataPartsVector & parts) const + const ActionsDAGPtr & filter_actions_dag, ContextPtr
local_context, const DataPartsVector & parts) const { if (parts.empty()) return 0u; auto metadata_snapshot = getInMemoryMetadataPtr(); - ASTPtr expression_ast; Block virtual_columns_block = getBlockWithVirtualPartColumns(parts, true /* one_part */); - // Generate valid expressions for filtering - bool valid = VirtualColumnUtils::prepareFilterBlockWithQuery(query_info.query, local_context, virtual_columns_block, expression_ast); + auto filter_dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(filter_actions_dag->getOutputs().at(0), nullptr); - PartitionPruner partition_pruner(metadata_snapshot, query_info, local_context, true /* strict */); + // Generate valid expressions for filtering + bool valid = true; + for (const auto * input : filter_dag->getInputs()) + if (!virtual_columns_block.has(input->result_name)) + valid = false; + + PartitionPruner partition_pruner(metadata_snapshot, filter_dag, local_context, true /* strict */); if (partition_pruner.isUseless() && !valid) return {}; std::unordered_set part_values; - if (valid && expression_ast) + if (valid) { virtual_columns_block = getBlockWithVirtualPartColumns(parts, false /* one_part */); - VirtualColumnUtils::filterBlockWithQuery(query_info.query, virtual_columns_block, local_context, expression_ast); + VirtualColumnUtils::filterBlockWithDAG(filter_dag, virtual_columns_block, local_context); part_values = VirtualColumnUtils::extractSingleValueFromBlock(virtual_columns_block, "_part"); if (part_values.empty()) return 0; @@ -3985,8 +3989,15 @@ MergeTreeData::PartsToRemoveFromZooKeeper MergeTreeData::removePartsInRangeFromW /// FIXME refactor removePartsFromWorkingSet(...), do not remove parts twice removePartsFromWorkingSet(txn, parts_to_remove, clear_without_timeout, lock); + /// We can only create a covering part for a blocks range that starts with 0 (otherwise we may get "intersecting parts" + /// if we remove a range from the middle when dropping a part). + /// Maybe we could do it by incrementing mutation version to get a name for the empty covering part, + /// but it's okay to simply avoid creating it for DROP PART (for a part in the middle). + /// NOTE: Block numbers in ReplicatedMergeTree start from 0. For MergeTree, is_new_syntax is always false. + assert(!create_empty_part || supportsReplication()); + bool range_in_the_middle = drop_range.min_block; bool is_new_syntax = format_version >= MERGE_TREE_DATA_MIN_FORMAT_VERSION_WITH_CUSTOM_PARTITIONING; - if (create_empty_part && !parts_to_remove.empty() && is_new_syntax) + if (create_empty_part && !parts_to_remove.empty() && is_new_syntax && !range_in_the_middle) { /// We are going to remove a lot of parts from zookeeper just after returning from this function. /// And we will remove parts from disk later (because some queries may use them). @@ -3995,12 +4006,9 @@ MergeTreeData::PartsToRemoveFromZooKeeper MergeTreeData::removePartsInRangeFromW /// We don't need to commit it to zk, and don't even need to activate it. 
MergeTreePartInfo empty_info = drop_range; - empty_info.level = empty_info.mutation = 0; - if (!empty_info.min_block) - empty_info.min_block = MergeTreePartInfo::MAX_BLOCK_NUMBER; + empty_info.min_block = empty_info.level = empty_info.mutation = 0; for (const auto & part : parts_to_remove) { - empty_info.min_block = std::min(empty_info.min_block, part->info.min_block); empty_info.level = std::max(empty_info.level, part->info.level); empty_info.mutation = std::max(empty_info.mutation, part->info.mutation); } @@ -6617,8 +6625,7 @@ using PartitionIdToMaxBlock = std::unordered_map; Block MergeTreeData::getMinMaxCountProjectionBlock( const StorageMetadataPtr & metadata_snapshot, const Names & required_columns, - bool has_filter, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_dag, const DataPartsVector & parts, const PartitionIdToMaxBlock * max_block_numbers_to_read, ContextPtr query_context) const @@ -6668,7 +6675,7 @@ Block MergeTreeData::getMinMaxCountProjectionBlock( Block virtual_columns_block; auto virtual_block = getSampleBlockWithVirtualColumns(); bool has_virtual_column = std::any_of(required_columns.begin(), required_columns.end(), [&](const auto & name) { return virtual_block.has(name); }); - if (has_virtual_column || has_filter) + if (has_virtual_column || filter_dag) { virtual_columns_block = getBlockWithVirtualPartColumns(parts, false /* one_part */, true /* ignore_empty */); if (virtual_columns_block.rows() == 0) @@ -6680,7 +6687,7 @@ Block MergeTreeData::getMinMaxCountProjectionBlock( std::optional partition_pruner; std::optional minmax_idx_condition; DataTypes minmax_columns_types; - if (has_filter) + if (filter_dag) { if (metadata_snapshot->hasPartitionKey()) { @@ -6689,16 +6696,15 @@ Block MergeTreeData::getMinMaxCountProjectionBlock( minmax_columns_types = getMinMaxColumnsTypes(partition_key); minmax_idx_condition.emplace( - query_info, query_context, minmax_columns_names, + filter_dag, query_context, minmax_columns_names, getMinMaxExpr(partition_key, ExpressionActionsSettings::fromContext(query_context))); - partition_pruner.emplace(metadata_snapshot, query_info, query_context, false /* strict */); + partition_pruner.emplace(metadata_snapshot, filter_dag, query_context, false /* strict */); } + const auto * predicate = filter_dag->getOutputs().at(0); + // Generate valid expressions for filtering - ASTPtr expression_ast; - VirtualColumnUtils::prepareFilterBlockWithQuery(query_info.query, query_context, virtual_columns_block, expression_ast); - if (expression_ast) - VirtualColumnUtils::filterBlockWithQuery(query_info.query, virtual_columns_block, query_context, expression_ast); + VirtualColumnUtils::filterBlockWithPredicate(predicate, virtual_columns_block, query_context); rows = virtual_columns_block.rows(); part_name_column = virtual_columns_block.getByName("_part").column; diff --git a/src/Storages/MergeTree/MergeTreeData.h b/src/Storages/MergeTree/MergeTreeData.h index dfa13eca11d..f0dbaf0e307 100644 --- a/src/Storages/MergeTree/MergeTreeData.h +++ b/src/Storages/MergeTree/MergeTreeData.h @@ -404,8 +404,7 @@ public: Block getMinMaxCountProjectionBlock( const StorageMetadataPtr & metadata_snapshot, const Names & required_columns, - bool has_filter, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_dag, const DataPartsVector & parts, const PartitionIdToMaxBlock * max_block_numbers_to_read, ContextPtr query_context) const; @@ -1222,7 +1221,7 @@ protected: boost::iterator_range range, const ColumnsDescription & storage_columns); 
std::optional totalRowsByPartitionPredicateImpl( - const SelectQueryInfo & query_info, ContextPtr context, const DataPartsVector & parts) const; + const ActionsDAGPtr & filter_actions_dag, ContextPtr context, const DataPartsVector & parts) const; static decltype(auto) getStateModifier(DataPartState state) { diff --git a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp index 7b30622a4fc..d5b9b4423a9 100644 --- a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp +++ b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp @@ -784,7 +784,7 @@ void MergeTreeDataSelectExecutor::buildKeyConditionFromPartOffset( = {ColumnWithTypeAndName(part_offset_type->createColumn(), part_offset_type, "_part_offset"), ColumnWithTypeAndName(part_type->createColumn(), part_type, "_part")}; - auto dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(filter_dag->getOutputs().at(0), sample); + auto dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(filter_dag->getOutputs().at(0), &sample); if (!dag) return; @@ -810,7 +810,7 @@ std::optional> MergeTreeDataSelectExecutor::filterPar if (!filter_dag) return {}; auto sample = data.getSampleBlockWithVirtualColumns(); - auto dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(filter_dag->getOutputs().at(0), sample); + auto dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(filter_dag->getOutputs().at(0), &sample); if (!dag) return {}; @@ -819,34 +819,6 @@ std::optional> MergeTreeDataSelectExecutor::filterPar return VirtualColumnUtils::extractSingleValueFromBlock(virtual_columns_block, "_part"); } - -std::optional> MergeTreeDataSelectExecutor::filterPartsByVirtualColumns( - const MergeTreeData & data, - const MergeTreeData::DataPartsVector & parts, - const ASTPtr & query, - ContextPtr context) -{ - std::unordered_set part_values; - ASTPtr expression_ast; - auto virtual_columns_block = data.getBlockWithVirtualPartColumns(parts, true /* one_part */); - - if (virtual_columns_block.rows() == 0) - return {}; - - // Generate valid expressions for filtering - VirtualColumnUtils::prepareFilterBlockWithQuery(query, context, virtual_columns_block, expression_ast); - - // If there is still something left, fill the virtual block and do the filtering. - if (expression_ast) - { - virtual_columns_block = data.getBlockWithVirtualPartColumns(parts, false /* one_part */); - VirtualColumnUtils::filterBlockWithQuery(query, virtual_columns_block, context, expression_ast); - return VirtualColumnUtils::extractSingleValueFromBlock(virtual_columns_block, "_part"); - } - - return {}; -} - void MergeTreeDataSelectExecutor::filterPartsByPartition( const std::optional & partition_pruner, const std::optional & minmax_idx_condition, diff --git a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.h b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.h index 11c8e172a4f..4c6e1086cbc 100644 --- a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.h +++ b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.h @@ -169,12 +169,6 @@ public: /// If possible, filter using expression on virtual columns. /// Example: SELECT count() FROM table WHERE _part = 'part_name' /// If expression found, return a set with allowed part names (std::nullopt otherwise). 
- static std::optional> filterPartsByVirtualColumns( - const MergeTreeData & data, - const MergeTreeData::DataPartsVector & parts, - const ASTPtr & query, - ContextPtr context); - static std::optional> filterPartsByVirtualColumns( const MergeTreeData & data, const MergeTreeData::DataPartsVector & parts, diff --git a/src/Storages/MergeTree/MergeTreeIndexAnnoy.cpp b/src/Storages/MergeTree/MergeTreeIndexAnnoy.cpp index 4411d46e124..e36459b019f 100644 --- a/src/Storages/MergeTree/MergeTreeIndexAnnoy.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexAnnoy.cpp @@ -23,6 +23,7 @@ namespace ErrorCodes extern const int INCORRECT_NUMBER_OF_COLUMNS; extern const int INCORRECT_QUERY; extern const int LOGICAL_ERROR; + extern const int NOT_IMPLEMENTED; } template @@ -331,6 +332,11 @@ MergeTreeIndexConditionPtr MergeTreeIndexAnnoy::createIndexCondition(const Selec return std::make_shared(index, query, distance_function, context); }; +MergeTreeIndexConditionPtr MergeTreeIndexAnnoy::createIndexCondition(const ActionsDAGPtr &, ContextPtr) const +{ + throw Exception(ErrorCodes::NOT_IMPLEMENTED, "MergeTreeIndexAnnoy cannot be created with ActionsDAG"); +} + MergeTreeIndexPtr annoyIndexCreator(const IndexDescription & index) { static constexpr auto DEFAULT_DISTANCE_FUNCTION = DISTANCE_FUNCTION_L2; diff --git a/src/Storages/MergeTree/MergeTreeIndexAnnoy.h b/src/Storages/MergeTree/MergeTreeIndexAnnoy.h index dead12fe66f..d511ab84859 100644 --- a/src/Storages/MergeTree/MergeTreeIndexAnnoy.h +++ b/src/Storages/MergeTree/MergeTreeIndexAnnoy.h @@ -88,7 +88,7 @@ private: }; -class MergeTreeIndexAnnoy : public IMergeTreeIndex +class MergeTreeIndexAnnoy final : public IMergeTreeIndex { public: @@ -98,7 +98,9 @@ public: MergeTreeIndexGranulePtr createIndexGranule() const override; MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; - MergeTreeIndexConditionPtr createIndexCondition(const SelectQueryInfo & query, ContextPtr context) const override; + MergeTreeIndexConditionPtr createIndexCondition(const SelectQueryInfo & query, ContextPtr context) const; + MergeTreeIndexConditionPtr createIndexCondition(const ActionsDAGPtr &, ContextPtr) const override; + bool isVectorSearch() const override { return true; } private: const UInt64 trees; diff --git a/src/Storages/MergeTree/MergeTreeIndexBloomFilter.cpp b/src/Storages/MergeTree/MergeTreeIndexBloomFilter.cpp index fa05f9e61e1..dbd33609a00 100644 --- a/src/Storages/MergeTree/MergeTreeIndexBloomFilter.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexBloomFilter.cpp @@ -43,9 +43,9 @@ MergeTreeIndexAggregatorPtr MergeTreeIndexBloomFilter::createIndexAggregator(con return std::make_shared(bits_per_row, hash_functions, index.column_names); } -MergeTreeIndexConditionPtr MergeTreeIndexBloomFilter::createIndexCondition(const SelectQueryInfo & query_info, ContextPtr context) const +MergeTreeIndexConditionPtr MergeTreeIndexBloomFilter::createIndexCondition(const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const { - return std::make_shared(query_info, context, index.sample_block, hash_functions); + return std::make_shared(filter_actions_dag, context, index.sample_block, hash_functions); } static void assertIndexColumnsType(const Block & header) diff --git a/src/Storages/MergeTree/MergeTreeIndexBloomFilter.h b/src/Storages/MergeTree/MergeTreeIndexBloomFilter.h index 4d688ae3cfc..d6f4d6f2cf5 100644 --- a/src/Storages/MergeTree/MergeTreeIndexBloomFilter.h +++ b/src/Storages/MergeTree/MergeTreeIndexBloomFilter.h @@ -20,7 +20,7 @@ 
public: MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; - MergeTreeIndexConditionPtr createIndexCondition(const SelectQueryInfo & query_info, ContextPtr context) const override; + MergeTreeIndexConditionPtr createIndexCondition(const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override; private: size_t bits_per_row; diff --git a/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.cpp b/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.cpp index 398a85e92ac..da49814b83a 100644 --- a/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.cpp @@ -97,39 +97,18 @@ bool maybeTrueOnBloomFilter(const IColumn * hash_column, const BloomFilterPtr & } MergeTreeIndexConditionBloomFilter::MergeTreeIndexConditionBloomFilter( - const SelectQueryInfo & info_, ContextPtr context_, const Block & header_, size_t hash_functions_) - : WithContext(context_), header(header_), query_info(info_), hash_functions(hash_functions_) + const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & header_, size_t hash_functions_) + : WithContext(context_), header(header_), hash_functions(hash_functions_) { - if (context_->getSettingsRef().allow_experimental_analyzer) - { - if (!query_info.filter_actions_dag) - { - rpn.push_back(RPNElement::FUNCTION_UNKNOWN); - return; - } - - RPNBuilder builder( - query_info.filter_actions_dag->getOutputs().at(0), - context_, - [&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); }); - rpn = std::move(builder).extractRPN(); - return; - } - - ASTPtr filter_node = buildFilterNode(query_info.query); - - if (!filter_node) + if (!filter_actions_dag) { rpn.push_back(RPNElement::FUNCTION_UNKNOWN); return; } - auto block_with_constants = KeyCondition::getBlockWithConstants(query_info.query, query_info.syntax_analyzer_result, context_); RPNBuilder builder( - filter_node, + filter_actions_dag->getOutputs().at(0), context_, - std::move(block_with_constants), - query_info.prepared_sets, [&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); }); rpn = std::move(builder).extractRPN(); } diff --git a/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.h b/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.h index 952948fd582..db85c804d8d 100644 --- a/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.h +++ b/src/Storages/MergeTree/MergeTreeIndexConditionBloomFilter.h @@ -44,7 +44,7 @@ public: std::vector> predicate; }; - MergeTreeIndexConditionBloomFilter(const SelectQueryInfo & info_, ContextPtr context_, const Block & header_, size_t hash_functions_); + MergeTreeIndexConditionBloomFilter(const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & header_, size_t hash_functions_); bool alwaysUnknownOrTrue() const override; @@ -58,7 +58,6 @@ public: private: const Block & header; - const SelectQueryInfo & query_info; const size_t hash_functions; std::vector rpn; diff --git a/src/Storages/MergeTree/MergeTreeIndexFullText.cpp b/src/Storages/MergeTree/MergeTreeIndexFullText.cpp index 6c1fff53109..4cd616513ac 100644 --- a/src/Storages/MergeTree/MergeTreeIndexFullText.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexFullText.cpp @@ -1,22 +1,23 @@ #include #include -#include +#include +#include #include -#include +#include #include +#include #include #include #include #include -#include -#include -#include #include 
#include -#include #include -#include +#include +#include +#include +#include #include @@ -137,7 +138,7 @@ void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos, } MergeTreeConditionFullText::MergeTreeConditionFullText( - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, ContextPtr context, const Block & index_sample_block, const BloomFilterParameters & params_, @@ -146,38 +147,16 @@ MergeTreeConditionFullText::MergeTreeConditionFullText( , index_data_types(index_sample_block.getNamesAndTypesList().getTypes()) , params(params_) , token_extractor(token_extactor_) - , prepared_sets(query_info.prepared_sets) { - if (context->getSettingsRef().allow_experimental_analyzer) - { - if (!query_info.filter_actions_dag) - { - rpn.push_back(RPNElement::FUNCTION_UNKNOWN); - return; - } - - RPNBuilder builder( - query_info.filter_actions_dag->getOutputs().at(0), - context, - [&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); }); - rpn = std::move(builder).extractRPN(); - return; - } - - ASTPtr filter_node = buildFilterNode(query_info.query); - - if (!filter_node) + if (!filter_actions_dag) { rpn.push_back(RPNElement::FUNCTION_UNKNOWN); return; } - auto block_with_constants = KeyCondition::getBlockWithConstants(query_info.query, query_info.syntax_analyzer_result, context); RPNBuilder builder( - filter_node, + filter_actions_dag->getOutputs().at(0), context, - std::move(block_with_constants), - query_info.prepared_sets, [&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); }); rpn = std::move(builder).extractRPN(); } @@ -201,6 +180,7 @@ bool MergeTreeConditionFullText::alwaysUnknownOrTrue() const || element.function == RPNElement::FUNCTION_IN || element.function == RPNElement::FUNCTION_NOT_IN || element.function == RPNElement::FUNCTION_MULTI_SEARCH + || element.function == RPNElement::FUNCTION_MATCH || element.function == RPNElement::FUNCTION_HAS_ANY || element.function == RPNElement::ALWAYS_FALSE) { @@ -285,8 +265,27 @@ bool MergeTreeConditionFullText::mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx for (size_t row = 0; row < bloom_filters.size(); ++row) result[row] = result[row] && granule->bloom_filters[element.key_column].contains(bloom_filters[row]); - rpn_stack.emplace_back( - std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true); + rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true); + } + else if (element.function == RPNElement::FUNCTION_MATCH) + { + if (!element.set_bloom_filters.empty()) + { + /// Alternative substrings + std::vector result(element.set_bloom_filters.back().size(), true); + + const auto & bloom_filters = element.set_bloom_filters[0]; + + for (size_t row = 0; row < bloom_filters.size(); ++row) + result[row] = result[row] && granule->bloom_filters[element.key_column].contains(bloom_filters[row]); + + rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true); + } + else if (element.bloom_filter) + { + /// Required substrings + rpn_stack.emplace_back(granule->bloom_filters[element.key_column].contains(*element.bloom_filter), true); + } } else if (element.function == RPNElement::FUNCTION_NOT) { @@ -392,6 +391,7 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode & function_name == "notEquals" || function_name == "has" || function_name == "mapContains" || + function_name == "match" || 
function_name == "like" || function_name == "notLike" || function_name.starts_with("hasToken") || @@ -513,6 +513,7 @@ bool MergeTreeConditionFullText::traverseTreeEquals( token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter); return true; } + else if (function_name == "has") { out.key_column = *key_index; @@ -600,6 +601,39 @@ bool MergeTreeConditionFullText::traverseTreeEquals( out.set_bloom_filters = std::move(bloom_filters); return true; } + else if (function_name == "match") + { + out.key_column = *key_index; + out.function = RPNElement::FUNCTION_MATCH; + out.bloom_filter = std::make_unique(params); + + auto & value = const_value.get(); + String required_substring; + bool dummy_is_trivial, dummy_required_substring_is_prefix; + std::vector alternatives; + OptimizedRegularExpression::analyze(value, required_substring, dummy_is_trivial, dummy_required_substring_is_prefix, alternatives); + + if (required_substring.empty() && alternatives.empty()) + return false; + + /// out.set_bloom_filters means alternatives exist + /// out.bloom_filter means required_substring exists + if (!alternatives.empty()) + { + std::vector> bloom_filters; + bloom_filters.emplace_back(); + for (const auto & alternative : alternatives) + { + bloom_filters.back().emplace_back(params); + token_extractor->stringToBloomFilter(alternative.data(), alternative.size(), bloom_filters.back().back()); + } + out.set_bloom_filters = std::move(bloom_filters); + } + else + token_extractor->stringToBloomFilter(required_substring.data(), required_substring.size(), *out.bloom_filter); + + return true; + } return false; } @@ -691,9 +725,9 @@ MergeTreeIndexAggregatorPtr MergeTreeIndexFullText::createIndexAggregator(const } MergeTreeIndexConditionPtr MergeTreeIndexFullText::createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const + const ActionsDAGPtr & filter_dag, ContextPtr context) const { - return std::make_shared(query, context, index.sample_block, params, token_extractor.get()); + return std::make_shared(filter_dag, context, index.sample_block, params, token_extractor.get()); } MergeTreeIndexPtr bloomFilterIndexCreator( diff --git a/src/Storages/MergeTree/MergeTreeIndexFullText.h b/src/Storages/MergeTree/MergeTreeIndexFullText.h index 22f9215d563..e66f498ce1d 100644 --- a/src/Storages/MergeTree/MergeTreeIndexFullText.h +++ b/src/Storages/MergeTree/MergeTreeIndexFullText.h @@ -62,7 +62,7 @@ class MergeTreeConditionFullText final : public IMergeTreeIndexCondition { public: MergeTreeConditionFullText( - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, ContextPtr context, const Block & index_sample_block, const BloomFilterParameters & params_, @@ -90,6 +90,7 @@ private: FUNCTION_NOT_EQUALS, FUNCTION_HAS, FUNCTION_IN, + FUNCTION_MATCH, FUNCTION_NOT_IN, FUNCTION_MULTI_SEARCH, FUNCTION_HAS_ANY, @@ -143,9 +144,6 @@ private: BloomFilterParameters params; TokenExtractorPtr token_extractor; RPN rpn; - - /// Sets from syntax analyzer. - PreparedSetsPtr prepared_sets; }; class MergeTreeIndexFullText final : public IMergeTreeIndex @@ -165,7 +163,7 @@ public: MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; MergeTreeIndexConditionPtr createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const override; + const ActionsDAGPtr & filter_dag, ContextPtr context) const override; BloomFilterParameters params; /// Function for selecting next token. 
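The new `match` support above hinges on one idea: `OptimizedRegularExpression::analyze` decomposes a regex into either a single required substring (stored in `out.bloom_filter`) or a list of alternative substrings (stored in `out.set_bloom_filters`), and `mayBeTrueOnGranule` then ORs the alternatives together while treating the required substring as a must-have; a pattern that yields neither makes `traverseTreeEquals` return false, so the atom stays FUNCTION_UNKNOWN and the index cannot prune. Below is a minimal, self-contained sketch of that pruning rule only — not ClickHouse code. A `std::set` of tokens stands in for the per-granule bloom filter, and `MatchAtom`/`granuleMayMatch` are hypothetical names introduced here for illustration.

// Sketch of the FUNCTION_MATCH pruning rule, under the assumptions stated above.
#include <optional>
#include <set>
#include <string>
#include <vector>

struct MatchAtom
{
    std::vector<std::string> alternatives;         /// regex decomposed as s1|s2|... (OR semantics)
    std::optional<std::string> required_substring; /// a fragment every match must contain
};

/// May the granule contain a row matching the regex? false means "provably absent".
bool granuleMayMatch(const MatchAtom & atom, const std::set<std::string> & granule_tokens)
{
    if (!atom.alternatives.empty())
    {
        /// Alternatives are OR-ed: the granule survives if any of them can occur.
        for (const auto & alternative : atom.alternatives)
            if (granule_tokens.contains(alternative))
                return true;
        return false; /// no alternative can occur in this granule -> skip it
    }

    if (atom.required_substring)
        return granule_tokens.contains(*atom.required_substring);

    return true; /// nothing usable extracted from the pattern -> cannot prune
}

In the real code the same rule runs over bloom filters (`token_extractor->stringToBloomFilter` builds the probe, `granule->bloom_filters[element.key_column].contains(...)` answers it), so a negative answer is always safe while a positive one may be a false positive — exactly the contract a skip index needs.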
diff --git a/src/Storages/MergeTree/MergeTreeIndexHypothesis.cpp b/src/Storages/MergeTree/MergeTreeIndexHypothesis.cpp index 818bae40067..0995e2724ec 100644 --- a/src/Storages/MergeTree/MergeTreeIndexHypothesis.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexHypothesis.cpp @@ -79,7 +79,7 @@ MergeTreeIndexAggregatorPtr MergeTreeIndexHypothesis::createIndexAggregator(cons } MergeTreeIndexConditionPtr MergeTreeIndexHypothesis::createIndexCondition( - const SelectQueryInfo &, ContextPtr) const + const ActionsDAGPtr &, ContextPtr) const { throw Exception(ErrorCodes::LOGICAL_ERROR, "Not supported"); } diff --git a/src/Storages/MergeTree/MergeTreeIndexHypothesis.h b/src/Storages/MergeTree/MergeTreeIndexHypothesis.h index 1cd0e3daf27..2296e1b717d 100644 --- a/src/Storages/MergeTree/MergeTreeIndexHypothesis.h +++ b/src/Storages/MergeTree/MergeTreeIndexHypothesis.h @@ -70,7 +70,7 @@ public: MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; MergeTreeIndexConditionPtr createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const override; + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override; MergeTreeIndexMergedConditionPtr createIndexMergedCondition( const SelectQueryInfo & query_info, StorageMetadataPtr storage_metadata) const override; diff --git a/src/Storages/MergeTree/MergeTreeIndexInverted.cpp b/src/Storages/MergeTree/MergeTreeIndexInverted.cpp index 5e2a034cb97..4c28fe8f00b 100644 --- a/src/Storages/MergeTree/MergeTreeIndexInverted.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexInverted.cpp @@ -184,7 +184,7 @@ void MergeTreeIndexAggregatorInverted::update(const Block & block, size_t * pos, } MergeTreeConditionInverted::MergeTreeConditionInverted( - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & index_sample_block, const GinFilterParameters & params_, @@ -192,41 +192,20 @@ MergeTreeConditionInverted::MergeTreeConditionInverted( : WithContext(context_), header(index_sample_block) , params(params_) , token_extractor(token_extactor_) - , prepared_sets(query_info.prepared_sets) { - if (context_->getSettingsRef().allow_experimental_analyzer) - { - if (!query_info.filter_actions_dag) - { - rpn.push_back(RPNElement::FUNCTION_UNKNOWN); - return; - } - - rpn = std::move( - RPNBuilder( - query_info.filter_actions_dag->getOutputs().at(0), context_, - [&](const RPNBuilderTreeNode & node, RPNElement & out) - { - return this->traverseAtomAST(node, out); - }).extractRPN()); - return; - } - - ASTPtr filter_node = buildFilterNode(query_info.query); - if (!filter_node) + if (!filter_actions_dag) { rpn.push_back(RPNElement::FUNCTION_UNKNOWN); return; } - auto block_with_constants = KeyCondition::getBlockWithConstants(query_info.query, query_info.syntax_analyzer_result, context_); - RPNBuilder builder( - filter_node, - context_, - std::move(block_with_constants), - query_info.prepared_sets, - [&](const RPNBuilderTreeNode & node, RPNElement & out) { return traverseAtomAST(node, out); }); - rpn = std::move(builder).extractRPN(); + rpn = std::move( + RPNBuilder( + filter_actions_dag->getOutputs().at(0), context_, + [&](const RPNBuilderTreeNode & node, RPNElement & out) + { + return this->traverseAtomAST(node, out); + }).extractRPN()); } /// Keep in-sync with MergeTreeConditionFullText::alwaysUnknownOrTrue @@ -721,9 +700,9 @@ MergeTreeIndexAggregatorPtr MergeTreeIndexInverted::createIndexAggregatorForPart } MergeTreeIndexConditionPtr 
MergeTreeIndexInverted::createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const { - return std::make_shared(query, context, index.sample_block, params, token_extractor.get()); + return std::make_shared(filter_actions_dag, context, index.sample_block, params, token_extractor.get()); }; MergeTreeIndexPtr invertedIndexCreator( diff --git a/src/Storages/MergeTree/MergeTreeIndexInverted.h b/src/Storages/MergeTree/MergeTreeIndexInverted.h index 413cf206f0e..807651d0c26 100644 --- a/src/Storages/MergeTree/MergeTreeIndexInverted.h +++ b/src/Storages/MergeTree/MergeTreeIndexInverted.h @@ -64,7 +64,7 @@ class MergeTreeConditionInverted final : public IMergeTreeIndexCondition, WithCo { public: MergeTreeConditionInverted( - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, ContextPtr context, const Block & index_sample_block, const GinFilterParameters & params_, @@ -169,7 +169,7 @@ public: MergeTreeIndexGranulePtr createIndexGranule() const override; MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; MergeTreeIndexAggregatorPtr createIndexAggregatorForPart(const GinIndexStorePtr & store, const MergeTreeWriterSettings & /*settings*/) const override; - MergeTreeIndexConditionPtr createIndexCondition(const SelectQueryInfo & query, ContextPtr context) const override; + MergeTreeIndexConditionPtr createIndexCondition(const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override; GinFilterParameters params; /// Function for selecting next token. diff --git a/src/Storages/MergeTree/MergeTreeIndexMinMax.cpp b/src/Storages/MergeTree/MergeTreeIndexMinMax.cpp index 535fef45872..b1f8e09be9f 100644 --- a/src/Storages/MergeTree/MergeTreeIndexMinMax.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexMinMax.cpp @@ -156,20 +156,17 @@ void MergeTreeIndexAggregatorMinMax::update(const Block & block, size_t * pos, s namespace { -KeyCondition buildCondition(const IndexDescription & index, const SelectQueryInfo & query_info, ContextPtr context) +KeyCondition buildCondition(const IndexDescription & index, const ActionsDAGPtr & filter_actions_dag, ContextPtr context) { - if (context->getSettingsRef().allow_experimental_analyzer) - return KeyCondition{query_info.filter_actions_dag, context, index.column_names, index.expression}; - - return KeyCondition{query_info, context, index.column_names, index.expression}; + return KeyCondition{filter_actions_dag, context, index.column_names, index.expression}; } } MergeTreeIndexConditionMinMax::MergeTreeIndexConditionMinMax( - const IndexDescription & index, const SelectQueryInfo & query_info, ContextPtr context) + const IndexDescription & index, const ActionsDAGPtr & filter_actions_dag, ContextPtr context) : index_data_types(index.data_types) - , condition(buildCondition(index, query_info, context)) + , condition(buildCondition(index, filter_actions_dag, context)) { } @@ -200,9 +197,9 @@ MergeTreeIndexAggregatorPtr MergeTreeIndexMinMax::createIndexAggregator(const Me } MergeTreeIndexConditionPtr MergeTreeIndexMinMax::createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const { - return std::make_shared(index, query, context); + return std::make_shared(index, filter_actions_dag, context); } MergeTreeIndexFormat MergeTreeIndexMinMax::getDeserializedFormat(const IDataPartStorage & data_part_storage, const 
std::string & relative_path_prefix) const diff --git a/src/Storages/MergeTree/MergeTreeIndexMinMax.h b/src/Storages/MergeTree/MergeTreeIndexMinMax.h index a1a216fdf72..1e2abe6983f 100644 --- a/src/Storages/MergeTree/MergeTreeIndexMinMax.h +++ b/src/Storages/MergeTree/MergeTreeIndexMinMax.h @@ -52,7 +52,7 @@ class MergeTreeIndexConditionMinMax final : public IMergeTreeIndexCondition public: MergeTreeIndexConditionMinMax( const IndexDescription & index, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_actions_dag, ContextPtr context); bool alwaysUnknownOrTrue() const override; @@ -79,7 +79,7 @@ public: MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; MergeTreeIndexConditionPtr createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const override; + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override; const char* getSerializedFileExtension() const override { return ".idx2"; } MergeTreeIndexFormat getDeserializedFormat(const IDataPartStorage & data_part_storage, const std::string & path_prefix) const override; /// NOLINT diff --git a/src/Storages/MergeTree/MergeTreeIndexSet.cpp b/src/Storages/MergeTree/MergeTreeIndexSet.cpp index 612c5d868cb..831856f8085 100644 --- a/src/Storages/MergeTree/MergeTreeIndexSet.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexSet.cpp @@ -247,7 +247,7 @@ MergeTreeIndexConditionSet::MergeTreeIndexConditionSet( const String & index_name_, const Block & index_sample_block, size_t max_rows_, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_dag, ContextPtr context) : index_name(index_name_) , max_rows(max_rows_) @@ -256,42 +256,20 @@ MergeTreeIndexConditionSet::MergeTreeIndexConditionSet( if (!key_columns.contains(name)) key_columns.insert(name); - if (context->getSettingsRef().allow_experimental_analyzer) - { - if (!query_info.filter_actions_dag) - return; + if (!filter_dag) + return; - if (checkDAGUseless(*query_info.filter_actions_dag->getOutputs().at(0), context)) - return; + if (checkDAGUseless(*filter_dag->getOutputs().at(0), context)) + return; - const auto * filter_node = query_info.filter_actions_dag->getOutputs().at(0); - auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG({filter_node}, {}, context); - const auto * filter_actions_dag_node = filter_actions_dag->getOutputs().at(0); + auto filter_actions_dag = filter_dag->clone(); + const auto * filter_actions_dag_node = filter_actions_dag->getOutputs().at(0); - std::unordered_map node_to_result_node; - filter_actions_dag->getOutputs()[0] = &traverseDAG(*filter_actions_dag_node, filter_actions_dag, context, node_to_result_node); + std::unordered_map node_to_result_node; + filter_actions_dag->getOutputs()[0] = &traverseDAG(*filter_actions_dag_node, filter_actions_dag, context, node_to_result_node); - filter_actions_dag->removeUnusedActions(); - actions = std::make_shared(filter_actions_dag); - } - else - { - ASTPtr ast_filter_node = buildFilterNode(query_info.query); - if (!ast_filter_node) - return; - - if (checkASTUseless(ast_filter_node)) - return; - - auto expression_ast = ast_filter_node->clone(); - - /// Replace logical functions with bit functions. - /// Working with UInt8: last bit = can be true, previous = can be false (Like src/Storages/MergeTree/BoolMask.h). 
- traverseAST(expression_ast); - - auto syntax_analyzer_result = TreeRewriter(context).analyze(expression_ast, index_sample_block.getNamesAndTypesList()); - actions = ExpressionAnalyzer(expression_ast, syntax_analyzer_result, context).getActions(true); - } + filter_actions_dag->removeUnusedActions(); + actions = std::make_shared<ExpressionActions>(filter_actions_dag); } bool MergeTreeIndexConditionSet::alwaysUnknownOrTrue() const @@ -704,9 +682,9 @@ MergeTreeIndexAggregatorPtr MergeTreeIndexSet::createIndexAggregator(const Merge } MergeTreeIndexConditionPtr MergeTreeIndexSet::createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const { - return std::make_shared<MergeTreeIndexConditionSet>(index.name, index.sample_block, max_rows, query, context); + return std::make_shared<MergeTreeIndexConditionSet>(index.name, index.sample_block, max_rows, filter_actions_dag, context); } MergeTreeIndexPtr setIndexCreator(const IndexDescription & index) diff --git a/src/Storages/MergeTree/MergeTreeIndexSet.h b/src/Storages/MergeTree/MergeTreeIndexSet.h index a53476ca751..ea9f7ddef3d 100644 --- a/src/Storages/MergeTree/MergeTreeIndexSet.h +++ b/src/Storages/MergeTree/MergeTreeIndexSet.h @@ -87,7 +87,7 @@ public: const String & index_name_, const Block & index_sample_block, size_t max_rows_, - const SelectQueryInfo & query_info, + const ActionsDAGPtr & filter_dag, ContextPtr context); bool alwaysUnknownOrTrue() const override; @@ -149,7 +149,7 @@ public: MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; MergeTreeIndexConditionPtr createIndexCondition( - const SelectQueryInfo & query, ContextPtr context) const override; + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override; size_t max_rows = 0; }; diff --git a/src/Storages/MergeTree/MergeTreeIndexUSearch.cpp b/src/Storages/MergeTree/MergeTreeIndexUSearch.cpp index dc8ed368011..c9df7210569 100644 --- a/src/Storages/MergeTree/MergeTreeIndexUSearch.cpp +++ b/src/Storages/MergeTree/MergeTreeIndexUSearch.cpp @@ -36,6 +36,7 @@ namespace ErrorCodes { extern const int INCORRECT_NUMBER_OF_COLUMNS; extern const int INCORRECT_QUERY; extern const int LOGICAL_ERROR; + extern const int NOT_IMPLEMENTED; } namespace @@ -366,6 +367,11 @@ MergeTreeIndexConditionPtr MergeTreeIndexUSearch::createIndexCondition(const Sel return std::make_shared<MergeTreeIndexConditionUSearch>(index, query, distance_function, context); }; +MergeTreeIndexConditionPtr MergeTreeIndexUSearch::createIndexCondition(const ActionsDAGPtr &, ContextPtr) const +{ + throw Exception(ErrorCodes::NOT_IMPLEMENTED, "MergeTreeIndexUSearch cannot be created with ActionsDAG"); +} + MergeTreeIndexPtr usearchIndexCreator(const IndexDescription & index) { static constexpr auto default_distance_function = DISTANCE_FUNCTION_L2; diff --git a/src/Storages/MergeTree/MergeTreeIndexUSearch.h b/src/Storages/MergeTree/MergeTreeIndexUSearch.h index a7675620a2e..5107cfee371 100644 --- a/src/Storages/MergeTree/MergeTreeIndexUSearch.h +++ b/src/Storages/MergeTree/MergeTreeIndexUSearch.h @@ -100,7 +100,9 @@ public: MergeTreeIndexGranulePtr createIndexGranule() const override; MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override; - MergeTreeIndexConditionPtr createIndexCondition(const SelectQueryInfo & query, ContextPtr context) const override; + MergeTreeIndexConditionPtr createIndexCondition(const SelectQueryInfo & query, ContextPtr context) const; + MergeTreeIndexConditionPtr createIndexCondition(const ActionsDAGPtr 
&, ContextPtr) const override; + bool isVectorSearch() const override { return true; } private: const String distance_function; diff --git a/src/Storages/MergeTree/MergeTreeIndices.h b/src/Storages/MergeTree/MergeTreeIndices.h index da1e914b90e..4749470bedd 100644 --- a/src/Storages/MergeTree/MergeTreeIndices.h +++ b/src/Storages/MergeTree/MergeTreeIndices.h @@ -170,7 +170,9 @@ struct IMergeTreeIndex } virtual MergeTreeIndexConditionPtr createIndexCondition( - const SelectQueryInfo & query_info, ContextPtr context) const = 0; + const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const = 0; + + virtual bool isVectorSearch() const { return false; } virtual MergeTreeIndexMergedConditionPtr createIndexMergedCondition( const SelectQueryInfo & /*query_info*/, StorageMetadataPtr /*storage_metadata*/) const diff --git a/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.cpp b/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.cpp index e61ddf0d122..69e64d5ea98 100644 --- a/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.cpp +++ b/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.cpp @@ -1,5 +1,6 @@ #include + namespace DB { @@ -30,12 +31,10 @@ MergeTreeReadPoolParallelReplicas::MergeTreeReadPoolParallelReplicas( settings_, context_) , extension(std::move(extension_)) + , coordination_mode(CoordinationMode::Default) { - extension.all_callback(InitialAllRangesAnnouncement( - CoordinationMode::Default, - parts_ranges.getDescriptions(), - extension.number_of_current_replica - )); + extension.all_callback( + InitialAllRangesAnnouncement(coordination_mode, parts_ranges.getDescriptions(), extension.number_of_current_replica)); } MergeTreeReadTaskPtr MergeTreeReadPoolParallelReplicas::getTask(size_t /*task_idx*/, MergeTreeReadTask * previous_task) @@ -48,7 +47,7 @@ MergeTreeReadTaskPtr MergeTreeReadPoolParallelReplicas::getTask(size_t /*task_id if (buffered_ranges.empty()) { auto result = extension.callback(ParallelReadRequest( - CoordinationMode::Default, + coordination_mode, extension.number_of_current_replica, pool_settings.min_marks_for_concurrent_read * pool_settings.threads, /// For Default coordination mode we don't need to pass part names. diff --git a/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.h b/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.h index 08020565ec4..7579a892b67 100644 --- a/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.h +++ b/src/Storages/MergeTree/MergeTreeReadPoolParallelReplicas.h @@ -31,6 +31,7 @@ private: mutable std::mutex mutex; const ParallelReadingExtension extension; + const CoordinationMode coordination_mode; RangesInDataPartsDescription buffered_ranges; bool no_more_tasks_available{false}; Poco::Logger * log = &Poco::Logger::get("MergeTreeReadPoolParallelReplicas"); diff --git a/src/Storages/MergeTree/MergeTreeReaderCompact.cpp b/src/Storages/MergeTree/MergeTreeReaderCompact.cpp index 9713cc8b890..02048009296 100644 --- a/src/Storages/MergeTree/MergeTreeReaderCompact.cpp +++ b/src/Storages/MergeTree/MergeTreeReaderCompact.cpp @@ -216,6 +216,10 @@ size_t MergeTreeReaderCompact::readRows( { size_t rows_to_read = data_part_info_for_read->getIndexGranularity().getMarkRows(from_mark); + /// If we need to read multiple subcolumns from a single column in storage, + /// we will read this column only once and then reuse it to extract all subcolumns. 
+ std::unordered_map<String, ColumnPtr> columns_cache_for_subcolumns; + for (size_t pos = 0; pos < num_columns; ++pos) { if (!res_columns[pos]) @@ -226,7 +230,7 @@ size_t MergeTreeReaderCompact::readRows( auto & column = res_columns[pos]; size_t column_size_before_reading = column->size(); - readData(columns_to_read[pos], column, from_mark, current_task_last_mark, *column_positions[pos], rows_to_read, columns_for_offsets[pos]); + readData(columns_to_read[pos], column, from_mark, current_task_last_mark, *column_positions[pos], rows_to_read, columns_for_offsets[pos], columns_cache_for_subcolumns); size_t read_rows_in_column = column->size() - column_size_before_reading; if (read_rows_in_column != rows_to_read) @@ -265,7 +269,7 @@ size_t MergeTreeReaderCompact::readRows( void MergeTreeReaderCompact::readData( const NameAndTypePair & name_and_type, ColumnPtr & column, size_t from_mark, size_t current_task_last_mark, size_t column_position, size_t rows_to_read, - ColumnNameLevel name_level_for_offsets) + ColumnNameLevel name_level_for_offsets, std::unordered_map<String, ColumnPtr> & columns_cache_for_subcolumns) { const auto & [name, type] = name_and_type; std::optional column_for_offsets; @@ -327,34 +331,54 @@ void MergeTreeReaderCompact::readData( ISerialization::DeserializeBinaryBulkSettings deserialize_settings; deserialize_settings.avg_value_size_hint = avg_value_size_hints[name]; + bool columns_cache_was_used = false; if (name_and_type.isSubcolumn()) { NameAndTypePair name_type_in_storage{name_and_type.getNameInStorage(), name_and_type.getTypeInStorage()}; + ColumnPtr temp_column; - /// In case of reading onlys offset use the correct serialization for reading of the prefix - auto serialization = getSerializationInPart(name_type_in_storage); - ColumnPtr temp_column = name_type_in_storage.type->createColumn(*serialization); - - if (column_for_offsets) + auto it = columns_cache_for_subcolumns.find(name_type_in_storage.name); + if (!column_for_offsets && it != columns_cache_for_subcolumns.end()) { - auto serialization_for_prefix = getSerializationInPart(*column_for_offsets); + temp_column = it->second; + auto subcolumn = name_type_in_storage.type->getSubcolumn(name_and_type.getSubcolumnName(), temp_column); + if (column->empty()) + column = IColumn::mutate(subcolumn); + else + column->assumeMutable()->insertRangeFrom(*subcolumn, 0, subcolumn->size()); - deserialize_settings.getter = buffer_getter_for_prefix; - serialization_for_prefix->deserializeBinaryBulkStatePrefix(deserialize_settings, state_for_prefix); + columns_cache_was_used = true; } - - deserialize_settings.getter = buffer_getter; - serialization->deserializeBinaryBulkStatePrefix(deserialize_settings, state); - serialization->deserializeBinaryBulkWithMultipleStreams(temp_column, rows_to_read, deserialize_settings, state, nullptr); - - auto subcolumn = name_type_in_storage.type->getSubcolumn(name_and_type.getSubcolumnName(), temp_column); - - /// TODO: Avoid extra copying. 
- if (column->empty()) - column = subcolumn; else - column->assumeMutable()->insertRangeFrom(*subcolumn, 0, subcolumn->size()); + { + /// In case of reading only offsets, use the correct serialization for reading the prefix + auto serialization = getSerializationInPart(name_type_in_storage); + temp_column = name_type_in_storage.type->createColumn(*serialization); + + if (column_for_offsets) + { + auto serialization_for_prefix = getSerializationInPart(*column_for_offsets); + + deserialize_settings.getter = buffer_getter_for_prefix; + serialization_for_prefix->deserializeBinaryBulkStatePrefix(deserialize_settings, state_for_prefix); + } + + deserialize_settings.getter = buffer_getter; + serialization->deserializeBinaryBulkStatePrefix(deserialize_settings, state); + serialization->deserializeBinaryBulkWithMultipleStreams(temp_column, rows_to_read, deserialize_settings, state, nullptr); + + if (!column_for_offsets) + columns_cache_for_subcolumns[name_type_in_storage.name] = temp_column; + + auto subcolumn = name_type_in_storage.type->getSubcolumn(name_and_type.getSubcolumnName(), temp_column); + + /// TODO: Avoid extra copying. + if (column->empty()) + column = subcolumn; + else + column->assumeMutable()->insertRangeFrom(*subcolumn, 0, subcolumn->size()); + } } else { @@ -374,8 +398,8 @@ void MergeTreeReaderCompact::readData( serialization->deserializeBinaryBulkWithMultipleStreams(column, rows_to_read, deserialize_settings, state, nullptr); } - /// The buffer is left in inconsistent state after reading single offsets - if (name_level_for_offsets.has_value()) + /// The buffer is left in an inconsistent state after reading single offsets or after using the columns cache while reading subcolumns. + if (name_level_for_offsets.has_value() || columns_cache_was_used) last_read_granule.reset(); else last_read_granule.emplace(from_mark, column_position); diff --git a/src/Storages/MergeTree/MergeTreeReaderCompact.h b/src/Storages/MergeTree/MergeTreeReaderCompact.h index cf706526363..dace4ec468e 100644 --- a/src/Storages/MergeTree/MergeTreeReaderCompact.h +++ b/src/Storages/MergeTree/MergeTreeReaderCompact.h @@ -76,7 +76,7 @@ private: void readData(const NameAndTypePair & name_and_type, ColumnPtr & column, size_t from_mark, size_t current_task_last_mark, size_t column_position, - size_t rows_to_read, ColumnNameLevel name_level_for_offsets); + size_t rows_to_read, ColumnNameLevel name_level_for_offsets, std::unordered_map<String, ColumnPtr> & columns_cache_for_subcolumns); /// Returns maximal value of granule size in compressed file from @mark_ranges. /// This value is used as size of read buffer. diff --git a/src/Storages/MergeTree/MergeTreeSequentialSource.cpp b/src/Storages/MergeTree/MergeTreeSequentialSource.cpp index 076dec00bcc..82e9f8fd2db 100644 --- a/src/Storages/MergeTree/MergeTreeSequentialSource.cpp +++ b/src/Storages/MergeTree/MergeTreeSequentialSource.cpp @@ -22,7 +22,9 @@ namespace ErrorCodes } -/// Lightweight (in terms of logic) stream for reading single part from MergeTree +/// Lightweight (in terms of logic) stream for reading single part from +/// MergeTree, used for merges and mutations. +/// /// NOTE: /// It doesn't filter out rows that are deleted with lightweight deletes. /// Use createMergeTreeSequentialSource to filter out those rows. 
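Stepping back from the diff for a moment: the subcolumn cache added to MergeTreeReaderCompact::readData above is a read-once / extract-many pattern. A minimal self-contained sketch of the idea, with simplified stand-in types rather than the real ClickHouse interfaces:

#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a fully deserialized column with addressable subcolumns.
struct Column
{
    std::map<std::string, std::vector<int>> subcolumns;
};
using ColumnPtr = std::shared_ptr<const Column>;

// The expensive step: deserializing the whole column from storage.
static int reads_from_storage = 0;
ColumnPtr readColumnFromStorage(const std::string & /*name*/)
{
    ++reads_from_storage;
    auto col = std::make_shared<Column>();
    col->subcolumns = {{"a", {1, 2, 3}}, {"b", {4, 5, 6}}};
    return col;
}

// The cheap step: extracting one subcolumn from an already read column.
std::vector<int> getSubcolumn(const ColumnPtr & col, const std::string & sub)
{
    return col->subcolumns.at(sub);
}

// Mirrors the readData() change: consult the cache first and fill it on a miss,
// so reading "t.a" and "t.b" deserializes the storage column "t" only once.
std::vector<int> readSubcolumn(
    const std::string & storage_name, const std::string & subcolumn_name, std::map<std::string, ColumnPtr> & cache)
{
    auto it = cache.find(storage_name);
    if (it == cache.end())
        it = cache.emplace(storage_name, readColumnFromStorage(storage_name)).first;
    return getSubcolumn(it->second, subcolumn_name);
}

int main()
{
    std::map<std::string, ColumnPtr> cache; // lives for a single readRows() call
    readSubcolumn("t", "a", cache);
    readSubcolumn("t", "b", cache);
    std::cout << "storage reads: " << reads_from_storage << '\n'; // prints 1
}

Note that the real change deliberately bypasses the cache when a column is read only for its offsets (column_for_offsets), and it resets last_read_granule whenever the cache was used, because the underlying buffer position then no longer matches what was actually deserialized.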
@@ -30,6 +32,7 @@ class MergeTreeSequentialSource : public ISource { public: MergeTreeSequentialSource( + MergeTreeSequentialSourceType type, const MergeTreeData & storage_, const StorageSnapshotPtr & storage_snapshot_, MergeTreeData::DataPartPtr data_part_, @@ -85,6 +88,7 @@ private: MergeTreeSequentialSource::MergeTreeSequentialSource( + MergeTreeSequentialSourceType type, const MergeTreeData & storage_, const StorageSnapshotPtr & storage_snapshot_, MergeTreeData::DataPartPtr data_part_, @@ -144,10 +148,25 @@ MergeTreeSequentialSource::MergeTreeSequentialSource( columns_for_reader = data_part->getColumns().addTypes(columns_to_read); } - ReadSettings read_settings; + const auto & context = storage.getContext(); + ReadSettings read_settings = context->getReadSettings(); + read_settings.read_from_filesystem_cache_if_exists_otherwise_bypass_cache = true; + /// It does not make sense to use pthread_threadpool for background merges/mutations, + /// and using pread also preserves backward compatibility + read_settings.local_fs_method = LocalFSReadMethod::pread; if (read_with_direct_io) read_settings.direct_io_threshold = 1; - read_settings.read_from_filesystem_cache_if_exists_otherwise_bypass_cache = true; + /// Configure throttling + switch (type) + { + case Mutation: + read_settings.local_throttler = context->getMutationsThrottler(); + break; + case Merge: + read_settings.local_throttler = context->getMergesThrottler(); + break; + } + read_settings.remote_throttler = read_settings.local_throttler; MergeTreeReaderSettings reader_settings = { @@ -242,6 +261,7 @@ MergeTreeSequentialSource::~MergeTreeSequentialSource() = default; Pipe createMergeTreeSequentialSource( + MergeTreeSequentialSourceType type, const MergeTreeData & storage, const StorageSnapshotPtr & storage_snapshot, MergeTreeData::DataPartPtr data_part, @@ -262,7 +282,7 @@ Pipe createMergeTreeSequentialSource( if (need_to_filter_deleted_rows && !has_filter_column) columns_to_read.emplace_back(filter_column.name); - auto column_part_source = std::make_shared<MergeTreeSequentialSource>( + auto column_part_source = std::make_shared<MergeTreeSequentialSource>(type, storage, storage_snapshot, data_part, columns_to_read, std::move(mark_ranges), /*apply_deleted_mask=*/ false, read_with_direct_io, take_column_types_from_storage, quiet); @@ -290,6 +310,7 @@ class ReadFromPart final : public ISourceStep { public: ReadFromPart( + MergeTreeSequentialSourceType type_, const MergeTreeData & storage_, const StorageSnapshotPtr & storage_snapshot_, MergeTreeData::DataPartPtr data_part_, @@ -299,6 +320,7 @@ public: ContextPtr context_, Poco::Logger * log_) : ISourceStep(DataStream{.header = storage_snapshot_->getSampleBlockForColumns(columns_to_read_)}) + , type(type_) , storage(storage_) , storage_snapshot(storage_snapshot_) , data_part(std::move(data_part_)) @@ -335,7 +357,7 @@ public: } } - auto source = createMergeTreeSequentialSource( + auto source = createMergeTreeSequentialSource(type, storage, storage_snapshot, data_part, @@ -351,6 +373,7 @@ public: } private: + MergeTreeSequentialSourceType type; const MergeTreeData & storage; StorageSnapshotPtr storage_snapshot; MergeTreeData::DataPartPtr data_part; @@ -362,6 +385,7 @@ private: }; void createReadFromPartStep( + MergeTreeSequentialSourceType type, QueryPlan & plan, const MergeTreeData & storage, const StorageSnapshotPtr & storage_snapshot, @@ -372,7 +396,7 @@ void createReadFromPartStep( ContextPtr context, Poco::Logger * log) { - auto reading = std::make_unique<ReadFromPart>( + auto reading = std::make_unique<ReadFromPart>(type, storage, storage_snapshot, std::move(data_part), 
std::move(columns_to_read), apply_deleted_mask, filter, std::move(context), log); diff --git a/src/Storages/MergeTree/MergeTreeSequentialSource.h b/src/Storages/MergeTree/MergeTreeSequentialSource.h index 396d3f76886..41def48aab6 100644 --- a/src/Storages/MergeTree/MergeTreeSequentialSource.h +++ b/src/Storages/MergeTree/MergeTreeSequentialSource.h @@ -8,9 +8,16 @@ namespace DB { +enum MergeTreeSequentialSourceType +{ + Mutation, + Merge, +}; + /// Create stream for reading single part from MergeTree. /// If the part has lightweight delete mask then the deleted rows are filtered out. Pipe createMergeTreeSequentialSource( + MergeTreeSequentialSourceType type, const MergeTreeData & storage, const StorageSnapshotPtr & storage_snapshot, MergeTreeData::DataPartPtr data_part, @@ -25,6 +32,7 @@ Pipe createMergeTreeSequentialSource( class QueryPlan; void createReadFromPartStep( + MergeTreeSequentialSourceType type, QueryPlan & plan, const MergeTreeData & storage, const StorageSnapshotPtr & storage_snapshot, diff --git a/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp b/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp index 333a0590d6b..bbe8c30a5c0 100644 --- a/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp +++ b/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp @@ -1,27 +1,77 @@ #include #include +#include +#include +#include +#include #include #include -#include -#include #include - +#include +#include +#include +#include #include -#include "Common/Exception.h" -#include -#include -#include -#include -#include "IO/WriteBufferFromString.h" #include -#include "Storages/MergeTree/RangesInDataPart.h" -#include "Storages/MergeTree/RequestResponse.h" -#include +#include #include +#include +#include +#include +#include +#include +#include +#include #include #include +#include +#include +#include +#include +#include +#include + +using namespace DB; + +namespace +{ +size_t roundDownToMultiple(size_t num, size_t multiple) +{ + return (num / multiple) * multiple; +} + +size_t +takeFromRange(const MarkRange & range, size_t min_number_of_marks, size_t & current_marks_amount, RangesInDataPartDescription & result) +{ + const auto marks_needed = min_number_of_marks - current_marks_amount; + chassert(marks_needed); + auto range_we_take = MarkRange{range.begin, range.begin + std::min(marks_needed, range.getNumberOfMarks())}; + if (!result.ranges.empty() && result.ranges.back().end == range_we_take.begin) + /// Can extend the previous range + result.ranges.back().end = range_we_take.end; + else + result.ranges.emplace_back(range_we_take); + current_marks_amount += range_we_take.getNumberOfMarks(); + return range_we_take.getNumberOfMarks(); +} +} + +namespace ProfileEvents +{ +extern const Event ParallelReplicasHandleRequestMicroseconds; +extern const Event ParallelReplicasHandleAnnouncementMicroseconds; + +extern const Event ParallelReplicasStealingByHashMicroseconds; +extern const Event ParallelReplicasProcessingPartsMicroseconds; +extern const Event ParallelReplicasStealingLeftoversMicroseconds; +extern const Event ParallelReplicasCollectingOwnedSegmentsMicroseconds; + +extern const Event ParallelReplicasReadAssignedMarks; +extern const Event ParallelReplicasReadUnassignedMarks; +extern const Event ParallelReplicasReadAssignedForStealingMarks; +} namespace ProfileEvents { @@ -58,7 +108,8 @@ namespace DB namespace ErrorCodes { - extern const int LOGICAL_ERROR; +extern const int BAD_ARGUMENTS; +extern const int LOGICAL_ERROR; } class 
ParallelReplicasReadingCoordinator::ImplInterface @@ -68,6 +119,15 @@ public: { size_t number_of_requests{0}; size_t sum_marks{0}; + + /// Marks assigned to the given replica by consistent hash + size_t assigned_to_me = 0; + /// Marks stolen from other replicas + size_t stolen_unassigned = 0; + + /// Stolen marks that were assigned for stealing to the given replica by hash. Makes sense only for DefaultCoordinator + size_t stolen_by_hash = 0; + bool is_unavailable{false}; }; using Stats = std::vector<Stat>; @@ -76,7 +136,15 @@ public: String result = "Statistics: "; std::vector<String> stats_by_replica; for (size_t i = 0; i < stats.size(); ++i) - stats_by_replica.push_back(fmt::format("replica {}{} - {{requests: {} marks: {}}}", i, stats[i].is_unavailable ? " is unavailable" : "", stats[i].number_of_requests, stats[i].sum_marks)); + stats_by_replica.push_back(fmt::format( + "replica {}{} - {{requests: {} marks: {} assigned_to_me: {} stolen_by_hash: {} stolen_unassigned: {}}}", + i, + stats[i].is_unavailable ? " is unavailable" : "", + stats[i].number_of_requests, + stats[i].sum_marks, + stats[i].assigned_to_me, + stats[i].stolen_by_hash, + stats[i].stolen_unassigned)); result += fmt::format("{}", fmt::join(stats_by_replica, "; ")); return result; } @@ -92,6 +160,7 @@ public: {} virtual ~ImplInterface() = default; + virtual ParallelReadResponse handleRequest(ParallelReadRequest request) = 0; virtual void handleInitialAllRangesAnnouncement(InitialAllRangesAnnouncement announcement) = 0; virtual void markReplicaAsUnavailable(size_t replica_number) = 0; @@ -103,165 +172,227 @@ using Parts = std::set<Part>; using PartRefs = std::deque<Parts::const_iterator>; +/// This coordinator relies heavily on the fact that we work with a single shard, +/// i.e. the difference in parts contained in each replica's snapshot is rather negligible (it is only recently inserted or merged parts). +/// So the guarantees we provide here are basically the same as with single-node reading: we will read from parts as they were seen by some node at the moment the query started. +/// +/// Knowing that almost every part can be read by every node, we suppose the ranges of each part to be available to all the replicas and thus distribute them evenly between them +/// (of course we still check if the replica has access to the given part before scheduling a read from it). +/// +/// Of course we want to distribute marks evenly. Looks like it is better to split parts into reasonably small segments of equal size +/// (something between 16 and 128 granules i.e. ~100K and ~1M rows should work). +/// This approach seems to work ok for all three main cases: full scan, reading random sub-ranges and reading only {pre,suf}-fix of parts. +/// Also we could expect that more granular division will make distribution more even up to a certain point. 
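Before the class that follows, the distribution scheme from the comment above can be shown in miniature: parts are cut into aligned segments of `mark_segment_size` marks, and a segment's owner is derived from a hash of the part name and the segment start. A hedged sketch — jump consistent hash and std::hash stand in here for the SipHash + ConsistentHashing pair used by the real coordinator:

#include <cstddef>
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>

// Jump consistent hash (Lamping & Veach, 2014); a stand-in for the
// ConsistentHashing() call in the actual code.
size_t jumpConsistentHash(uint64_t key, size_t num_buckets)
{
    int64_t b = -1, j = 0;
    while (j < static_cast<int64_t>(num_buckets))
    {
        b = j;
        key = key * 2862933555777941757ULL + 1;
        j = static_cast<int64_t>((b + 1) * (double(1LL << 31) / double((key >> 33) + 1)));
    }
    return static_cast<size_t>(b);
}

size_t roundDownToMultiple(size_t num, size_t multiple) { return num / multiple * multiple; }

// Owner of the segment starting at `segment_begin` inside `part_name`.
// Toy key mixing: the real code feeds both values into SipHash.
size_t segmentOwner(const std::string & part_name, size_t segment_begin, size_t replicas)
{
    const uint64_t key = std::hash<std::string>{}(part_name) ^ segment_begin;
    return jumpConsistentHash(key, replicas);
}

int main()
{
    const size_t replicas = 3, mark_segment_size = 128;
    // Walk a mark range [140, 700) segment by segment, the way processPartsFurther() below does.
    for (size_t begin = roundDownToMultiple(140, mark_segment_size); begin < 700; begin += mark_segment_size)
        std::cout << "segment at mark " << begin << " -> replica "
                  << segmentOwner("all_1_10_2", begin, replicas) << '\n';
}

The important property is stability: a segment's owner depends only on the part name, the segment start, and the replica count, so every replica can compute the same assignment independently and no global precalculation is needed.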
class DefaultCoordinator : public ParallelReplicasReadingCoordinator::ImplInterface { public: - using ParallelReadRequestPtr = std::unique_ptr; - using PartToMarkRanges = std::map; - - explicit DefaultCoordinator(size_t replicas_count_) + explicit DefaultCoordinator(size_t replicas_count_, size_t mark_segment_size_) : ParallelReplicasReadingCoordinator::ImplInterface(replicas_count_) - , reading_state(replicas_count_) + , mark_segment_size(mark_segment_size_) + , replica_status(replicas_count_) + , distribution_by_hash_queue(replicas_count_) { + if (mark_segment_size == 0) + throw Exception(ErrorCodes::BAD_ARGUMENTS, "Zero value provided for `mark_segment_size`"); } ~DefaultCoordinator() override; - struct PartitionReading - { - PartSegments part_ranges; - PartToMarkRanges mark_ranges_in_part; - }; + ParallelReadResponse handleRequest(ParallelReadRequest request) override; - using PartitionToBlockRanges = std::map; - PartitionToBlockRanges partitions; + void handleInitialAllRangesAnnouncement(InitialAllRangesAnnouncement announcement) override; + + void markReplicaAsUnavailable(size_t replica_number) override; + +private: + /// This many granules will represent a single segment of marks that will be assigned to a replica + const size_t mark_segment_size{0}; size_t sent_initial_requests{0}; + bool state_initialized{false}; + size_t finished_replicas{0}; - Parts all_parts_to_read; - /// Contains only parts which we haven't started to read from - PartRefs delayed_parts; - /// Per-replica preferred parts split by consistent hash - /// Once all task will be done by some replica, it can steal tasks - std::vector reading_state; + struct ReplicaStatus + { + bool is_finished{false}; + bool is_announcement_received{false}; + }; + std::vector<ReplicaStatus> replica_status; Poco::Logger * log = &Poco::Logger::get("DefaultCoordinator"); - std::atomic state_initialized{false}; + /// Workflow of a segment: + /// 0. `all_parts_to_read` contains all the parts and thus all the segments initially present there (virtually) + /// 1. when we traverse `all_parts_to_read` in selectPartsAndRanges() we either: + /// * take this segment into output + /// * put this segment into `distribution_by_hash_queue` for its owner if it's available and can read from it + /// * otherwise put this segment into `distribution_by_hash_queue` for its stealer_by_hash if it's available and can read from it + /// * otherwise put this segment into `ranges_for_stealing_queue` + /// 2. when we traverse `distribution_by_hash_queue` in `selectPartsAndRanges` we either: + /// * take this segment into output + /// * otherwise put this segment into `distribution_by_hash_queue` for its stealer_by_hash if it's available and can read from it + /// * otherwise put this segment into `ranges_for_stealing_queue` + /// 3. when we figure out that some replica is unavailable, we move all segments from its `distribution_by_hash_queue` to their stealers by hash or to `ranges_for_stealing_queue` + /// 4. 
when we get the announcement from a replica, we move all segments it cannot read to their stealers by hash or to `ranges_for_stealing_queue` + /// + /// So, segments always move in one direction down this path (possibly skipping some stops): + /// `all_parts_to_read` -> `distribution_by_hash_queue[owner]` -> `distribution_by_hash_queue[stealer_by_hash]` -> `ranges_for_stealing_queue` - ParallelReadResponse handleRequest(ParallelReadRequest request) override; - void handleInitialAllRangesAnnouncement(InitialAllRangesAnnouncement announcement) override; - void markReplicaAsUnavailable(size_t replica_number) override; + /// We take the set of parts announced by this replica as the working set for the whole query. + /// For this replica we know for sure that + /// 1. it sees all the parts from this set + /// 2. it was available at the beginning of execution (since we got its announcement), so if it becomes unavailable at some point, the query will fail with an exception. + /// this means that we can delegate reading of all leftover segments (i.e. segments that were not read by their owner or stealer by hash) to this node + size_t source_replica_for_parts_snapshot{0}; - void updateReadingState(InitialAllRangesAnnouncement announcement); - void finalizeReadingState(); + /// Parts view from the first announcement we received + std::vector<Part> all_parts_to_read; - size_t computeConsistentHash(const MergeTreePartInfo & info) const + std::unordered_map<String, std::unordered_set<size_t>> part_visibility; /// part_name -> set of replicas that announced that part + + /// We order parts from biggest (= oldest) to newest and steal from the newest, because we assume + /// that they are going to be merged soon anyway, so we should already expect a worse cache hit rate for them. + struct BiggerPartsFirst { - auto hash = SipHash(); - hash.update(info.getPartNameV1()); - return ConsistentHashing(hash.get64(), replicas_count); - } + bool operator()(const auto & lhs, const auto & rhs) const { return lhs.info.getBlocksCount() > rhs.info.getBlocksCount(); } + }; - void selectPartsAndRanges(const PartRefs & container, size_t replica_num, size_t min_number_of_marks, size_t & current_mark_size, ParallelReadResponse & response) const; + /// We don't precalculate the whole assignment for each node at the start. + /// When a replica asks the coordinator for a new portion of data to read, it traverses `all_parts_to_read` to find ranges relevant to this replica (by consistent hash). + /// Many hashes are calculated during this process, and to not lose this work we save, for all the ranges + /// observed along the way, the information about which node they belong to. + /// Ranges in this queue might belong to a part that the given replica cannot read from - the corresponding check happens later. + /// TODO: consider making it bounded in size + std::vector> distribution_by_hash_queue; + + /// For some ranges, their owner and their stealer (by consistent hash) cannot read from the given part at all, so such ranges have to be stolen anyway. + /// TODO: consider making it bounded in size + RangesInDataPartsDescription ranges_for_stealing_queue; + + /// We take only the first replica's set of parts as the whole working set for this query. + /// For other replicas we'll just discard parts that they know about but that weren't present in the first request we received. + /// The second and all subsequent announcements are needed only to understand whether we can schedule reading from the given part to the given replica. 
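The one-way movement of segments described in the comments above - owner's queue, then the stealer-by-hash's queue, then the leftover queue - can be modelled with a few containers. A toy sketch; the availability flags and hash functions are simplified placeholders, not the actual coordinator logic:

#include <cstddef>
#include <deque>
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

struct Segment { std::string part; size_t begin = 0; };

struct Coordinator
{
    std::vector<bool> replica_available;            // toy stand-in for stats/replica_status
    std::vector<std::deque<Segment>> by_hash_queue; // distribution_by_hash_queue
    std::deque<Segment> leftovers;                  // ranges_for_stealing_queue

    // Toy hashes: any deterministic functions of (part, begin) would do here.
    size_t owner(const Segment & s) const { return (std::hash<std::string>{}(s.part) + s.begin) % by_hash_queue.size(); }
    size_t stealer(const Segment & s) const { return (owner(s) + 1) % by_hash_queue.size(); }

    // Mirrors enqueueSegment()/enqueueToStealerOrStealingQueue(): a segment only
    // ever moves "down" - owner's queue, then stealer's queue, then leftovers.
    void enqueue(const Segment & s)
    {
        if (replica_available[owner(s)])
            by_hash_queue[owner(s)].push_back(s);
        else if (replica_available[stealer(s)])
            by_hash_queue[stealer(s)].push_back(s);
        else
            leftovers.push_back(s);
    }

    // Mirrors markReplicaAsUnavailable(): re-route everything the replica still held.
    void markUnavailable(size_t replica)
    {
        replica_available[replica] = false;
        auto queue = std::move(by_hash_queue[replica]);
        by_hash_queue[replica].clear();
        for (const auto & s : queue)
            enqueue(s);
    }
};

int main()
{
    Coordinator c{.replica_available = {true, true, true}, .by_hash_queue = std::vector<std::deque<Segment>>(3)};
    c.enqueue({"all_1_5_1", 0});
    c.enqueue({"all_1_5_1", 128});
    c.markUnavailable(c.owner({"all_1_5_1", 0})); // its segments fall through to the stealer
    std::cout << "segments in leftover queue: " << c.leftovers.size() << '\n';
}

Because a segment only ever moves forward along this chain, it can never be handed out twice; the consistency checks in handleRequest further down (leftover and per-replica queues must be empty once every replica has finished) then prove that nothing was left unread.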
+ void initializeReadingState(InitialAllRangesAnnouncement announcement); + + void setProgressCallback(); + + enum class ScanMode + { + /// Main working set for the replica + TakeWhatsMineByHash, + /// We need to steal to optimize tail latency, let's do it by hash nevertheless + TakeWhatsMineForStealing, + /// All bets are off, we need to steal "for correctness" - to not leave any segments unread + TakeEverythingAvailable + }; + + void selectPartsAndRanges( + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description); + + size_t computeConsistentHash(const std::string & part_name, size_t segment_begin, ScanMode scan_mode) const; + + void tryToTakeFromDistributionQueue( + size_t replica_num, size_t min_number_of_marks, size_t & current_marks_amount, RangesInDataPartsDescription & description); + + void tryToStealFromQueues( + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description); + + void tryToStealFromQueue( + auto & queue, + ssize_t owner, /// In case `queue` is `distribution_by_hash_queue[replica]` + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description); + + void processPartsFurther( + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description); + + bool possiblyCanReadPart(size_t replica, const MergeTreePartInfo & info) const; + void enqueueSegment(const MergeTreePartInfo & info, const MarkRange & segment, size_t owner); + void enqueueToStealerOrStealingQueue(const MergeTreePartInfo & info, const MarkRange & segment); }; + DefaultCoordinator::~DefaultCoordinator() { - LOG_DEBUG(log, "Coordination done: {}", toString(stats)); + try + { + LOG_DEBUG(log, "Coordination done: {}", toString(stats)); + } + catch (...) 
+ { + tryLogCurrentException(log); + } } -void DefaultCoordinator::updateReadingState(InitialAllRangesAnnouncement announcement) +void DefaultCoordinator::initializeReadingState(InitialAllRangesAnnouncement announcement) { - PartRefs parts_diff; - - /// To get rid of duplicates - for (auto && part_ranges: announcement.description) + for (const auto & part : announcement.description) { - Part part{.description = std::move(part_ranges), .replicas = {announcement.replica_num}}; - const MergeTreePartInfo & announced_part = part.description.info; - - auto it = std::lower_bound(cbegin(all_parts_to_read), cend(all_parts_to_read), part); - if (it != all_parts_to_read.cend()) - { - const MergeTreePartInfo & found_part = it->description.info; - if (found_part == announced_part) - { - /// We have the same part - add the info about presence on current replica - it->replicas.insert(announcement.replica_num); - continue; - } - else - { - /// check if it is covering or covered part - /// need to compare with 2 nearest parts in set, - lesser and greater than the part from the announcement - bool is_disjoint = found_part.isDisjoint(announced_part); - if (it != all_parts_to_read.cbegin() && is_disjoint) - { - const MergeTreePartInfo & lesser_part = (--it)->description.info; - is_disjoint &= lesser_part.isDisjoint(announced_part); - } - if (!is_disjoint) - continue; - } - } - else if (!all_parts_to_read.empty()) - { - /// the announced part is greatest - check if it's disjoint with lesser part - const MergeTreePartInfo & lesser_part = all_parts_to_read.crbegin()->description.info; - if (!lesser_part.isDisjoint(announced_part)) - continue; - } - - auto [insert_it, _] = all_parts_to_read.emplace(std::move(part)); - parts_diff.push_back(insert_it); + /// We don't really care here if this part will be included in the working set or not + part_visibility[part.info.getPartNameV1()].insert(announcement.replica_num); } - /// Split all parts by consistent hash - while (!parts_diff.empty()) + /// If the state is already initialized, just register the availability info and leave + if (state_initialized) + return; + + for (auto && part : announcement.description) { - auto current_part_it = parts_diff.front(); - parts_diff.pop_front(); - auto consistent_hash = computeConsistentHash(current_part_it->description.info); + auto intersecting_it = std::find_if( + all_parts_to_read.begin(), + all_parts_to_read.end(), + [&part](const Part & other) { return !other.description.info.isDisjoint(part.info); }); - /// Check whether the new part can easy go to replica queue - if (current_part_it->replicas.contains(consistent_hash)) - { - reading_state[consistent_hash].emplace_back(current_part_it); - continue; - } + if (intersecting_it != all_parts_to_read.end()) + throw Exception(ErrorCodes::LOGICAL_ERROR, "Intersecting parts found in announcement"); - /// Add to delayed parts - delayed_parts.emplace_back(current_part_it); + all_parts_to_read.push_back(Part{.description = std::move(part), .replicas = {announcement.replica_num}}); } + + std::ranges::sort( + all_parts_to_read, [](const Part & lhs, const Part & rhs) { return BiggerPartsFirst()(lhs.description, rhs.description); }); + state_initialized = true; + source_replica_for_parts_snapshot = announcement.replica_num; + + LOG_DEBUG(log, "Reading state is fully initialized: {}", fmt::join(all_parts_to_read, "; ")); } void DefaultCoordinator::markReplicaAsUnavailable(size_t replica_number) { - if (stats[replica_number].is_unavailable == false) + LOG_DEBUG(log, "Replica number {} is 
unavailable", replica_number); + + ++unavailable_replicas_count; + stats[replica_number].is_unavailable = true; + + if (sent_initial_requests == replicas_count - unavailable_replicas_count) + setProgressCallback(); + + for (const auto & segment : distribution_by_hash_queue[replica_number]) { - LOG_DEBUG(log, "Replica number {} is unavailable", replica_number); - - stats[replica_number].is_unavailable = true; - ++unavailable_replicas_count; - - if (sent_initial_requests == replicas_count - unavailable_replicas_count) - finalizeReadingState(); + chassert(segment.ranges.size() == 1); + enqueueToStealerOrStealingQueue(segment.info, segment.ranges.front()); } + distribution_by_hash_queue[replica_number].clear(); } -void DefaultCoordinator::finalizeReadingState() +void DefaultCoordinator::setProgressCallback() { - /// Clear all the delayed queue - while (!delayed_parts.empty()) - { - auto current_part_it = delayed_parts.front(); - auto consistent_hash = computeConsistentHash(current_part_it->description.info); - - if (current_part_it->replicas.contains(consistent_hash)) - { - reading_state[consistent_hash].emplace_back(current_part_it); - delayed_parts.pop_front(); - continue; - } - - /// In this situation just assign to a random replica which has this part - auto replica = *(std::next(current_part_it->replicas.begin(), thread_local_rng() % current_part_it->replicas.size())); - reading_state[replica].emplace_back(current_part_it); - delayed_parts.pop_front(); - } - - // update progress with total rows + // Update progress with total rows if (progress_callback) { size_t total_rows_to_read = 0; @@ -274,116 +405,378 @@ void DefaultCoordinator::finalizeReadingState() LOG_DEBUG(log, "Total rows to read: {}", total_rows_to_read); } - - LOG_DEBUG(log, "Reading state is fully initialized: {}", fmt::join(all_parts_to_read, "; ")); } - void DefaultCoordinator::handleInitialAllRangesAnnouncement(InitialAllRangesAnnouncement announcement) { const auto replica_num = announcement.replica_num; - updateReadingState(std::move(announcement)); + LOG_DEBUG(log, "Initial request from replica {}: {}", announcement.replica_num, announcement.describe()); + + initializeReadingState(std::move(announcement)); if (replica_num >= stats.size()) - throw Exception(ErrorCodes::LOGICAL_ERROR, "Replica number ({}) is bigger than total replicas count ({})", replica_num, stats.size()); + throw Exception( + ErrorCodes::LOGICAL_ERROR, "Replica number ({}) is bigger than total replicas count ({})", replica_num, stats.size()); ++stats[replica_num].number_of_requests; + replica_status[replica_num].is_announcement_received = true; ++sent_initial_requests; LOG_DEBUG(log, "Sent initial requests: {} Replicas count: {}", sent_initial_requests, replicas_count); + if (sent_initial_requests == replicas_count) - finalizeReadingState(); -} + setProgressCallback(); -void DefaultCoordinator::selectPartsAndRanges(const PartRefs & container, size_t replica_num, size_t min_number_of_marks, size_t & current_mark_size, ParallelReadResponse & response) const -{ - for (const auto & part : container) + /// Sift the queue to move out all invisible segments + for (const auto & segment : distribution_by_hash_queue[replica_num]) { - if (current_mark_size >= min_number_of_marks) + if (!part_visibility[segment.info.getPartNameV1()].contains(replica_num)) { - LOG_TEST(log, "Current mark size {} is bigger than min_number_marks {}", current_mark_size, min_number_of_marks); - break; - } - - if (part->description.ranges.empty()) - { - LOG_TEST(log, "Part {} is 
already empty in reading state", part->description.info.getPartNameV1()); - continue; - } - - if (std::find(part->replicas.begin(), part->replicas.end(), replica_num) == part->replicas.end()) - { - LOG_TEST(log, "Not found part {} on replica {}", part->description.info.getPartNameV1(), replica_num); - continue; - } - - response.description.push_back({ - .info = part->description.info, - .ranges = {}, - }); - - while (!part->description.ranges.empty() && current_mark_size < min_number_of_marks) - { - auto & range = part->description.ranges.front(); - const size_t needed = min_number_of_marks - current_mark_size; - - if (range.getNumberOfMarks() > needed) - { - auto range_we_take = MarkRange{range.begin, range.begin + needed}; - response.description.back().ranges.emplace_back(range_we_take); - current_mark_size += range_we_take.getNumberOfMarks(); - - range.begin += needed; - break; - } - - response.description.back().ranges.emplace_back(range); - current_mark_size += range.getNumberOfMarks(); - part->description.ranges.pop_front(); + chassert(segment.ranges.size() == 1); + enqueueToStealerOrStealingQueue(segment.info, segment.ranges.front()); } } } +void DefaultCoordinator::tryToTakeFromDistributionQueue( + size_t replica_num, size_t min_number_of_marks, size_t & current_marks_amount, RangesInDataPartsDescription & description) +{ + ProfileEventTimeIncrement watch(ProfileEvents::ParallelReplicasCollectingOwnedSegmentsMicroseconds); + + auto & distribution_queue = distribution_by_hash_queue[replica_num]; + auto replica_can_read_part = [&](auto replica, const auto & part) { return part_visibility[part.getPartNameV1()].contains(replica); }; + + RangesInDataPartDescription result; + + while (!distribution_queue.empty() && current_marks_amount < min_number_of_marks) + { + if (result.ranges.empty() || distribution_queue.begin()->info != result.info) + { + if (!result.ranges.empty()) + /// We're switching to a different part, so have to save currently accumulated ranges + description.push_back(result); + result = {.info = distribution_queue.begin()->info}; + } + + /// NOTE: this works because ranges are not considered by the comparator + auto & part_ranges = const_cast(*distribution_queue.begin()); + chassert(part_ranges.ranges.size() == 1); + auto & range = part_ranges.ranges.front(); + + if (replica_can_read_part(replica_num, part_ranges.info)) + { + if (auto taken = takeFromRange(range, min_number_of_marks, current_marks_amount, result); taken == range.getNumberOfMarks()) + distribution_queue.erase(distribution_queue.begin()); + else + { + range.begin += taken; + break; + } + } + else + { + /// It might be that `replica_num` is the stealer by hash itself - no problem, + /// we'll just have a redundant hash computation inside this function + enqueueToStealerOrStealingQueue(part_ranges.info, range); + distribution_queue.erase(distribution_queue.begin()); + } + } + + if (!result.ranges.empty()) + description.push_back(result); +} + +void DefaultCoordinator::tryToStealFromQueues( + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description) +{ + auto steal_from_other_replicas = [&]() + { + /// Try to steal from other replicas starting from replicas with longest queues + std::vector order(replicas_count); + std::iota(order.begin(), order.end(), 0); + std::ranges::sort( + order, [&](auto lhs, auto rhs) { return distribution_by_hash_queue[lhs].size() > distribution_by_hash_queue[rhs].size(); }); + + for (auto 
replica : order) + tryToStealFromQueue( + distribution_by_hash_queue[replica], + replica, + replica_num, + scan_mode, + min_number_of_marks, + current_marks_amount, + description); + }; + + if (scan_mode == ScanMode::TakeWhatsMineForStealing) + { + ProfileEventTimeIncrement watch(ProfileEvents::ParallelReplicasStealingByHashMicroseconds); + steal_from_other_replicas(); + } + else + { + ProfileEventTimeIncrement watch(ProfileEvents::ParallelReplicasStealingLeftoversMicroseconds); + /// Check orphaned ranges + tryToStealFromQueue( + ranges_for_stealing_queue, /*owner=*/-1, replica_num, scan_mode, min_number_of_marks, current_marks_amount, description); + /// Last hope. In case we haven't yet figured out that some node is unavailable, its segments are still in the distribution queue. + steal_from_other_replicas(); + } +} + +void DefaultCoordinator::tryToStealFromQueue( + auto & queue, + ssize_t owner, + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description) +{ + auto replica_can_read_part = [&](auto replica, const auto & part) { return part_visibility[part.getPartNameV1()].contains(replica); }; + + RangesInDataPartDescription result; + + auto it = queue.rbegin(); + while (it != queue.rend() && current_marks_amount < min_number_of_marks) + { + auto & part_ranges = const_cast(*it); + chassert(part_ranges.ranges.size() == 1); + auto & range = part_ranges.ranges.front(); + + if (result.ranges.empty() || part_ranges.info != result.info) + { + if (!result.ranges.empty()) + /// We're switching to a different part, so have to save currently accumulated ranges + description.push_back(result); + result = {.info = part_ranges.info}; + } + + if (replica_can_read_part(replica_num, part_ranges.info)) + { + bool can_take = false; + if (scan_mode == ScanMode::TakeWhatsMineForStealing) + { + chassert(owner >= 0); + const size_t segment_begin = roundDownToMultiple(range.begin, mark_segment_size); + can_take = computeConsistentHash(part_ranges.info.getPartNameV1(), segment_begin, scan_mode) == replica_num; + } + else + { + /// Don't steal segments with an alive owner that sees them + can_take = owner == -1 || stats[owner].is_unavailable || !replica_status[owner].is_announcement_received; + } + if (can_take) + { + if (auto taken = takeFromRange(range, min_number_of_marks, current_marks_amount, result); taken == range.getNumberOfMarks()) + { + it = decltype(it)(queue.erase(std::next(it).base())); + continue; + } + else + range.begin += taken; + } + } + + ++it; + } + + if (!result.ranges.empty()) + description.push_back(result); +} + +void DefaultCoordinator::processPartsFurther( + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description) +{ + ProfileEventTimeIncrement watch(ProfileEvents::ParallelReplicasProcessingPartsMicroseconds); + + for (const auto & part : all_parts_to_read) + { + if (current_marks_amount >= min_number_of_marks) + { + LOG_TEST(log, "Current mark size {} is bigger than min_number_marks {}", current_marks_amount, min_number_of_marks); + return; + } + + RangesInDataPartDescription result{.info = part.description.info}; + + while (!part.description.ranges.empty() && current_marks_amount < min_number_of_marks) + { + auto & range = part.description.ranges.front(); + + /// Parts are divided into segments of `mark_segment_size` granules starting from the 0-th granule + for (size_t segment_begin = 
roundDownToMultiple(range.begin, mark_segment_size); + segment_begin < range.end && current_marks_amount < min_number_of_marks; + segment_begin += mark_segment_size) + { + const auto cur_segment + = MarkRange{std::max(range.begin, segment_begin), std::min(range.end, segment_begin + mark_segment_size)}; + + const auto owner = computeConsistentHash(part.description.info.getPartNameV1(), segment_begin, scan_mode); + if (owner == replica_num) + { + const auto taken = takeFromRange(cur_segment, min_number_of_marks, current_marks_amount, result); + if (taken == range.getNumberOfMarks()) + part.description.ranges.pop_front(); + else + { + range.begin += taken; + break; + } + } + else + { + chassert(scan_mode == ScanMode::TakeWhatsMineByHash); + enqueueSegment(part.description.info, cur_segment, owner); + range.begin += cur_segment.getNumberOfMarks(); + if (range.getNumberOfMarks() == 0) + part.description.ranges.pop_front(); + } + } + } + + if (!result.ranges.empty()) + description.push_back(std::move(result)); + } +} + +void DefaultCoordinator::selectPartsAndRanges( + size_t replica_num, + ScanMode scan_mode, + size_t min_number_of_marks, + size_t & current_marks_amount, + RangesInDataPartsDescription & description) +{ + if (scan_mode == ScanMode::TakeWhatsMineByHash) + { + tryToTakeFromDistributionQueue(replica_num, min_number_of_marks, current_marks_amount, description); + processPartsFurther(replica_num, scan_mode, min_number_of_marks, current_marks_amount, description); + /// We might back-fill `distribution_by_hash_queue` for this replica in `enqueueToStealerOrStealingQueue` + tryToTakeFromDistributionQueue(replica_num, min_number_of_marks, current_marks_amount, description); + } + else + tryToStealFromQueues(replica_num, scan_mode, min_number_of_marks, current_marks_amount, description); +} + +bool DefaultCoordinator::possiblyCanReadPart(size_t replica, const MergeTreePartInfo & info) const +{ + /// At this point we might not be sure if `owner` can read from the given part. + /// Then we will check it while processing `owner`'s data requests - they are guaranteed to come after the announcement. 
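+ /// E.g. before the announcement from `replica` has arrived we optimistically return true for any part;
+ /// once it has arrived, only parts recorded for `replica` in `part_visibility` qualify.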
+ return !stats[replica].is_unavailable && !replica_status[replica].is_finished + && (!replica_status[replica].is_announcement_received || part_visibility.at(info.getPartNameV1()).contains(replica)); +} + +void DefaultCoordinator::enqueueSegment(const MergeTreePartInfo & info, const MarkRange & segment, size_t owner) +{ + if (possiblyCanReadPart(owner, info)) + { + /// TODO: optimize me (maybe we can store something lighter than RangesInDataPartDescription) + distribution_by_hash_queue[owner].insert(RangesInDataPartDescription{.info = info, .ranges = {segment}}); + LOG_TEST(log, "Segment {} is added to its owner's ({}) queue", segment, owner); + } + else + enqueueToStealerOrStealingQueue(info, segment); +} + +void DefaultCoordinator::enqueueToStealerOrStealingQueue(const MergeTreePartInfo & info, const MarkRange & segment) +{ + auto && range = RangesInDataPartDescription{.info = info, .ranges = {segment}}; + const auto stealer_by_hash = computeConsistentHash( + info.getPartNameV1(), roundDownToMultiple(segment.begin, mark_segment_size), ScanMode::TakeWhatsMineForStealing); + if (possiblyCanReadPart(stealer_by_hash, info)) + { + distribution_by_hash_queue[stealer_by_hash].insert(std::move(range)); + LOG_TEST(log, "Segment {} is added to its stealer's ({}) queue", segment, stealer_by_hash); + } + else + { + ranges_for_stealing_queue.push_back(std::move(range)); + LOG_TEST(log, "Segment {} is added to stealing queue", segment); + } +} + +size_t DefaultCoordinator::computeConsistentHash(const std::string & part_name, size_t segment_begin, ScanMode scan_mode) const +{ + chassert(segment_begin % mark_segment_size == 0); + auto hash = SipHash(); + hash.update(part_name); + hash.update(segment_begin); + hash.update(scan_mode); + return ConsistentHashing(hash.get64(), replicas_count); +} + ParallelReadResponse DefaultCoordinator::handleRequest(ParallelReadRequest request) { LOG_TRACE(log, "Handling request from replica {}, minimal marks size is {}", request.replica_num, request.min_number_of_marks); - size_t current_mark_size = 0; ParallelReadResponse response; - /// 1. Try to select from preferred set of parts for current replica - selectPartsAndRanges(reading_state[request.replica_num], request.replica_num, request.min_number_of_marks, current_mark_size, response); + size_t current_mark_size = 0; - /// 2. Try to use parts from delayed queue - while (!delayed_parts.empty() && current_mark_size < request.min_number_of_marks) - { - auto part = delayed_parts.front(); - delayed_parts.pop_front(); - reading_state[request.replica_num].emplace_back(part); - selectPartsAndRanges(reading_state[request.replica_num], request.replica_num, request.min_number_of_marks, current_mark_size, response); - } + /// 1. Try to select ranges meant for this replica by consistent hash + selectPartsAndRanges( + request.replica_num, ScanMode::TakeWhatsMineByHash, request.min_number_of_marks, current_mark_size, response.description); + const size_t assigned_to_me = current_mark_size; - /// 3. Try to steal tasks; - if (current_mark_size < request.min_number_of_marks) - { - for (size_t i = 0; i < replicas_count; ++i) - { - if (i != request.replica_num) - selectPartsAndRanges(reading_state[i], request.replica_num, request.min_number_of_marks, current_mark_size, response); + /// 2. 
Try to steal, but still by consistent hash (with a different key), to keep the assignment cache-friendly + selectPartsAndRanges( + request.replica_num, ScanMode::TakeWhatsMineForStealing, request.min_number_of_marks, current_mark_size, response.description); + const size_t stolen_by_hash = current_mark_size - assigned_to_me; - if (current_mark_size >= request.min_number_of_marks) - break; - } - } + /// 3. Try to steal with no preference. We're trying to postpone it as much as possible. + if (current_mark_size == 0 && request.replica_num == source_replica_for_parts_snapshot) + selectPartsAndRanges( + request.replica_num, ScanMode::TakeEverythingAvailable, request.min_number_of_marks, current_mark_size, response.description); + const size_t stolen_unassigned = current_mark_size - stolen_by_hash - assigned_to_me; stats[request.replica_num].number_of_requests += 1; stats[request.replica_num].sum_marks += current_mark_size; + stats[request.replica_num].assigned_to_me += assigned_to_me; + stats[request.replica_num].stolen_by_hash += stolen_by_hash; + stats[request.replica_num].stolen_unassigned += stolen_unassigned; + + ProfileEvents::increment(ProfileEvents::ParallelReplicasReadAssignedMarks, assigned_to_me); + ProfileEvents::increment(ProfileEvents::ParallelReplicasReadUnassignedMarks, stolen_unassigned); + ProfileEvents::increment(ProfileEvents::ParallelReplicasReadAssignedForStealingMarks, stolen_by_hash); + if (response.description.empty()) + { response.finish = true; - LOG_TRACE(log, "Going to respond to replica {} with {}", request.replica_num, response.describe()); + replica_status[request.replica_num].is_finished = true; + + if (++finished_replicas == replicas_count - unavailable_replicas_count) + { + /// Nobody will come to process any more data + + if (!ranges_for_stealing_queue.empty()) + throw Exception(ErrorCodes::LOGICAL_ERROR, "Some orphaned segments were left unread"); + + for (size_t replica = 0; replica < replicas_count; ++replica) + if (!distribution_by_hash_queue[replica].empty()) + throw Exception(ErrorCodes::LOGICAL_ERROR, "Non-empty distribution_by_hash_queue for replica {}", replica); + } + } + + LOG_DEBUG( + log, + "Going to respond to replica {} with {}; mine_marks={}, stolen_by_hash={}, stolen_rest={}", + request.replica_num, + response.describe(), + assigned_to_me, + stolen_by_hash, + stolen_unassigned); + return response; } @@ -456,6 +849,8 @@ void InOrderCoordinator::handleInitialAllRangesAnnouncement(InitialAllRang std::sort(ranges.begin(), ranges.end()); } + ++stats[announcement.replica_num].number_of_requests; + if (new_rows_to_read > 0) { Progress progress; @@ -557,6 +952,8 @@ ParallelReadResponse InOrderCoordinator::handleRequest(ParallelReadRequest void ParallelReplicasReadingCoordinator::handleInitialAllRangesAnnouncement(InitialAllRangesAnnouncement announcement) { + ProfileEventTimeIncrement watch(ProfileEvents::ParallelReplicasHandleAnnouncementMicroseconds); + std::lock_guard lock(mutex); if (!pimpl) @@ -570,6 +967,8 @@ void ParallelReplicasReadingCoordinator::handleInitialAllRangesAnnouncement(Init ParallelReadResponse ParallelReplicasReadingCoordinator::handleRequest(ParallelReadRequest request) { + ProfileEventTimeIncrement watch(ProfileEvents::ParallelReplicasHandleRequestMicroseconds); + std::lock_guard lock(mutex); if (!pimpl) @@ -604,7 +1003,7 @@ void ParallelReplicasReadingCoordinator::initialize() switch (mode) { case CoordinationMode::Default: - pimpl = std::make_unique<DefaultCoordinator>(replicas_count); + pimpl = std::make_unique<DefaultCoordinator>(replicas_count, mark_segment_size); break; case CoordinationMode::WithOrder: 
pimpl = std::make_unique<InOrderCoordinator<CoordinationMode::WithOrder>>(replicas_count); @@ -621,7 +1020,10 @@ void ParallelReplicasReadingCoordinator::initialize() pimpl->markReplicaAsUnavailable(replica); } -ParallelReplicasReadingCoordinator::ParallelReplicasReadingCoordinator(size_t replicas_count_) : replicas_count(replicas_count_) {} +ParallelReplicasReadingCoordinator::ParallelReplicasReadingCoordinator(size_t replicas_count_, size_t mark_segment_size_) + : replicas_count(replicas_count_), mark_segment_size(mark_segment_size_) +{ +} ParallelReplicasReadingCoordinator::~ParallelReplicasReadingCoordinator() = default; diff --git a/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.h b/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.h index acc265c124f..9cba7d8e8c2 100644 --- a/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.h +++ b/src/Storages/MergeTree/ParallelReplicasReadingCoordinator.h @@ -15,7 +15,7 @@ class ParallelReplicasReadingCoordinator public: class ImplInterface; - explicit ParallelReplicasReadingCoordinator(size_t replicas_count_); + explicit ParallelReplicasReadingCoordinator(size_t replicas_count_, size_t mark_segment_size_ = 0); ~ParallelReplicasReadingCoordinator(); void handleInitialAllRangesAnnouncement(InitialAllRangesAnnouncement); @@ -35,8 +35,8 @@ private: std::mutex mutex; size_t replicas_count{0}; + size_t mark_segment_size{0}; CoordinationMode mode{CoordinationMode::Default}; - std::atomic<bool> initialized{false}; std::unique_ptr<ImplInterface> pimpl; ProgressCallback progress_callback; // store the callback only to bypass it to coordinator implementation std::set<size_t> replicas_used; diff --git a/src/Storages/MergeTree/PartitionPruner.cpp b/src/Storages/MergeTree/PartitionPruner.cpp index c559ba4371a..668576f9021 100644 --- a/src/Storages/MergeTree/PartitionPruner.cpp +++ b/src/Storages/MergeTree/PartitionPruner.cpp @@ -9,10 +9,7 @@ namespace KeyCondition buildKeyCondition(const KeyDescription & partition_key, const SelectQueryInfo & query_info, ContextPtr context, bool strict) { - if (context->getSettingsRef().allow_experimental_analyzer) - return {query_info.filter_actions_dag, context, partition_key.column_names, partition_key.expression, true /* single_point */, strict}; - - return {query_info, context, partition_key.column_names, partition_key.expression, true /* single_point */, strict}; + return {query_info.filter_actions_dag, context, partition_key.column_names, partition_key.expression, true /* single_point */, strict}; } } diff --git a/src/Storages/MergeTree/RPNBuilder.h b/src/Storages/MergeTree/RPNBuilder.h index f14f241cac8..b0755ccd3ca 100644 --- a/src/Storages/MergeTree/RPNBuilder.h +++ b/src/Storages/MergeTree/RPNBuilder.h @@ -202,17 +202,6 @@ public: traverseTree(RPNBuilderTreeNode(filter_actions_dag_node, tree_context)); } - RPNBuilder(const ASTPtr & filter_node, - ContextPtr query_context_, - Block block_with_constants_, - PreparedSetsPtr prepared_sets_, - const ExtractAtomFromTreeFunction & extract_atom_from_tree_function_) - : tree_context(std::move(query_context_), std::move(block_with_constants_), std::move(prepared_sets_)) - , extract_atom_from_tree_function(extract_atom_from_tree_function_) - { - traverseTree(RPNBuilderTreeNode(filter_node.get(), tree_context)); - } - RPNElements && extractRPN() && { return std::move(rpn_elements); } private: diff --git a/src/Storages/MergeTree/ReplicatedMergeTreeLogEntry.h b/src/Storages/MergeTree/ReplicatedMergeTreeLogEntry.h index 0ce59b18818..054c576cfc5 100644 --- a/src/Storages/MergeTree/ReplicatedMergeTreeLogEntry.h +++ 
b/src/Storages/MergeTree/ReplicatedMergeTreeLogEntry.h @@ -172,6 +172,9 @@ struct ReplicatedMergeTreeLogEntryData /// The quorum value (for GET_PART) is a non-zero value when the quorum write is enabled. size_t quorum = 0; + /// Used only in tests for permanent fault injection for particular queue entry. + bool fault_injected = false; + /// If this MUTATE_PART entry caused by alter(modify/drop) query. bool isAlterMutation() const { diff --git a/src/Storages/S3Queue/StorageS3Queue.cpp b/src/Storages/S3Queue/StorageS3Queue.cpp index 33e63d45c8d..bc33e8cf2a9 100644 --- a/src/Storages/S3Queue/StorageS3Queue.cpp +++ b/src/Storages/S3Queue/StorageS3Queue.cpp @@ -6,11 +6,14 @@ #include #include #include +#include +#include #include #include #include -#include -#include +#include +#include +#include #include #include #include @@ -20,6 +23,7 @@ #include #include #include +#include #include @@ -204,10 +208,65 @@ bool StorageS3Queue::supportsSubsetOfColumns(const ContextPtr & context_) const return FormatFactory::instance().checkIfFormatSupportsSubsetOfColumns(configuration.format, context_, format_settings); } -Pipe StorageS3Queue::read( +class ReadFromS3Queue : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromS3Queue"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromS3Queue( + Block sample_block, + ReadFromFormatInfo info_, + std::shared_ptr<StorageS3Queue> storage_, + ContextPtr context_, + size_t max_block_size_, + size_t num_streams_) + : SourceStepWithFilter(DataStream{.header = std::move(sample_block)}) + , info(std::move(info_)) + , storage(std::move(storage_)) + , context(std::move(context_)) + , max_block_size(max_block_size_) + , num_streams(num_streams_) + { + } + +private: + ReadFromFormatInfo info; + std::shared_ptr<StorageS3Queue> storage; + ContextPtr context; + size_t max_block_size; + size_t num_streams; + + std::shared_ptr<StorageS3Queue::FileIterator> iterator; + + void createIterator(const ActionsDAG::Node * predicate); +}; + +void ReadFromS3Queue::createIterator(const ActionsDAG::Node * predicate) +{ + if (iterator) + return; + + iterator = storage->createFileIterator(context, predicate); +} + + +void ReadFromS3Queue::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createIterator(predicate); +} +
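The ordering contract implemented above deserves a note: the planner may call applyFilters() (when a predicate can be pushed down) before initializePipeline(); if it never does, initializePipeline() falls back to an unfiltered iterator, and either way the iterator is created exactly once. A minimal standalone sketch of that contract, assuming simplified stand-ins (Predicate, FileIterator and ReadStep below are not the real classes):

#include <cstdio>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the pushed-down predicate and the file iterator.
using Predicate = std::function<bool(const std::string & path)>;

struct FileIterator { std::vector<std::string> paths; };

class ReadStep
{
public:
    // Called by the planner only when there is something to push down.
    void applyFilters(Predicate predicate) { createIterator(&predicate); }

    void initializePipeline()
    {
        createIterator(nullptr); // fall back: no filter was pushed down
        for (const auto & path : iterator->paths)
            std::printf("reading %s\n", path.c_str());
    }

private:
    void createIterator(const Predicate * predicate)
    {
        if (iterator) // already created during filter push-down
            return;
        std::vector<std::string> all = {"a.csv", "b.csv", "skip.csv"};
        auto it = std::make_shared<FileIterator>();
        for (auto & p : all)
            if (!predicate || (*predicate)(p))
                it->paths.push_back(p);
        iterator = std::move(it);
    }

    std::shared_ptr<FileIterator> iterator;
};

int main()
{
    ReadStep step;
    step.applyFilters([](const std::string & p) { return p.find("skip") == std::string::npos; });
    step.initializePipeline(); // reuses the filtered iterator
}

+void StorageS3Queue::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, - SelectQueryInfo & query_info, + SelectQueryInfo & /*query_info*/, ContextPtr local_context, QueryProcessingStage::Enum /*processed_stage*/, size_t max_block_size, @@ -225,27 +284,49 @@ Pipe StorageS3Queue::read( "Cannot read from {} with attached materialized views", getName()); } - Pipes pipes; - const size_t adjusted_num_streams = std::min(num_streams, s3queue_settings->s3queue_processing_threads_num); + auto this_ptr = std::static_pointer_cast<StorageS3Queue>(shared_from_this()); + auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(local_context), getVirtuals()); - auto file_iterator = createFileIterator(local_context, query_info.query); + auto reading = std::make_unique<ReadFromS3Queue>( + read_from_format_info.source_header, + read_from_format_info, + std::move(this_ptr), + local_context, + max_block_size, + num_streams); + 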
query_plan.addStep(std::move(reading)); +} + +void ReadFromS3Queue::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + Pipes pipes; + const size_t adjusted_num_streams = std::min(num_streams, storage->s3queue_settings->s3queue_processing_threads_num); + + createIterator(nullptr); for (size_t i = 0; i < adjusted_num_streams; ++i) - pipes.emplace_back(createSource(file_iterator, column_names, storage_snapshot, max_block_size, local_context)); - return Pipe::unitePipes(std::move(pipes)); + pipes.emplace_back(storage->createSource(info, iterator, max_block_size, context)); + + auto pipe = Pipe::unitePipes(std::move(pipes)); + if (pipe.empty()) + pipe = Pipe(std::make_shared<NullSource>(info.source_header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } std::shared_ptr<StorageS3QueueSource> StorageS3Queue::createSource( + const ReadFromFormatInfo & info, std::shared_ptr<StorageS3Queue::FileIterator> file_iterator, - const Names & column_names, - const StorageSnapshotPtr & storage_snapshot, size_t max_block_size, ContextPtr local_context) { auto configuration_snapshot = updateConfigurationAndGetCopy(local_context); - auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(local_context), getVirtuals()); auto internal_source = std::make_unique<StorageS3Source>( - read_from_format_info, configuration.format, getName(), local_context, format_settings, + info, configuration.format, getName(), local_context, format_settings, max_block_size, configuration_snapshot.request_settings, configuration_snapshot.compression_method, @@ -253,7 +334,7 @@ std::shared_ptr<StorageS3QueueSource> StorageS3Queue::createSource( configuration_snapshot.url.bucket, configuration_snapshot.url.version_id, configuration_snapshot.url.uri.getHost() + std::to_string(configuration_snapshot.url.uri.getPort()), - file_iterator, local_context->getSettingsRef().max_download_threads, false, /* query_info */ std::nullopt); + file_iterator, local_context->getSettingsRef().max_download_threads, false); auto file_deleter = [this, bucket = configuration_snapshot.url.bucket, client = configuration_snapshot.client, blob_storage_log = BlobStorageLogWriter::create()](const std::string & path) mutable { @@ -277,8 +358,8 @@ std::shared_ptr<StorageS3QueueSource> StorageS3Queue::createSource( }; auto s3_queue_log = s3queue_settings->s3queue_enable_logging_to_s3queue_log ? local_context->getS3QueueLog() : nullptr; return std::make_shared<StorageS3QueueSource>( - getName(), read_from_format_info.source_header, std::move(internal_source), - files_metadata, after_processing, file_deleter, read_from_format_info.requested_virtual_columns, + getName(), info.source_header, std::move(internal_source), + files_metadata, after_processing, file_deleter, info.requested_virtual_columns, local_context, shutdown_called, table_is_being_dropped, s3_queue_log, getStorageID(), log); } @@ -375,13 +456,14 @@ bool StorageS3Queue::streamToViews() auto block_io = interpreter.execute(); auto file_iterator = createFileIterator(s3queue_context, nullptr); + auto read_from_format_info = prepareReadingFromFormat(block_io.pipeline.getHeader().getNames(), storage_snapshot, supportsSubsetOfColumns(s3queue_context), getVirtuals()); + Pipes pipes; pipes.reserve(s3queue_settings->s3queue_processing_threads_num); for (size_t i = 0; i < s3queue_settings->s3queue_processing_threads_num; ++i) { auto source = createSource( - file_iterator, block_io.pipeline.getHeader().getNames(), - storage_snapshot, DBMS_DEFAULT_BUFFER_SIZE, s3queue_context); + read_from_format_info, file_iterator, DBMS_DEFAULT_BUFFER_SIZE, s3queue_context); pipes.emplace_back(std::move(source)); } @@ -479,10 +561,10 @@ void StorageS3Queue::checkTableStructure(const String & zookeeper_prefix, const } } -std::shared_ptr<StorageS3Queue::FileIterator> StorageS3Queue::createFileIterator(ContextPtr local_context, ASTPtr query) +std::shared_ptr<StorageS3Queue::FileIterator> StorageS3Queue::createFileIterator(ContextPtr local_context, const ActionsDAG::Node * predicate) { auto glob_iterator = std::make_unique<StorageS3QueueSource::GlobIterator>( - *configuration.client, configuration.url, query, virtual_columns, local_context, + *configuration.client, configuration.url, predicate, virtual_columns, local_context, /* read_keys */nullptr, configuration.request_settings); return std::make_shared<FileIterator>(files_metadata, std::move(glob_iterator), shutdown_called); } diff --git a/src/Storages/S3Queue/StorageS3Queue.h b/src/Storages/S3Queue/StorageS3Queue.h index f26b1175150..3d3594dc2ab 100644 --- a/src/Storages/S3Queue/StorageS3Queue.h +++ b/src/Storages/S3Queue/StorageS3Queue.h @@ -39,10 +39,11 @@ public: String getName() const override { return "S3Queue"; } - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, - SelectQueryInfo & query_info, + SelectQueryInfo & /*query_info*/, ContextPtr context, QueryProcessingStage::Enum processed_stage, size_t max_block_size, @@ -57,6 +58,7 @@ public: zkutil::ZooKeeperPtr getZooKeeper() const; private: + friend class ReadFromS3Queue; using FileIterator = StorageS3QueueSource::FileIterator; const std::unique_ptr<S3QueueSettings> s3queue_settings; @@ -85,11 +87,10 @@ private: bool supportsSubsetOfColumns(const ContextPtr & context_) const; bool supportsSubcolumns() const override { return true; } - std::shared_ptr<FileIterator> createFileIterator(ContextPtr local_context, ASTPtr query); + std::shared_ptr<FileIterator> createFileIterator(ContextPtr local_context, const ActionsDAG::Node * predicate); std::shared_ptr<StorageS3QueueSource> createSource( + const ReadFromFormatInfo & info, std::shared_ptr<FileIterator> file_iterator, - const Names & column_names, - const StorageSnapshotPtr & storage_snapshot, size_t max_block_size, ContextPtr local_context); diff --git a/src/Storages/StorageAzureBlob.cpp b/src/Storages/StorageAzureBlob.cpp index 6322aa5bb76..80ca3f4b07a 100644 --- a/src/Storages/StorageAzureBlob.cpp +++ b/src/Storages/StorageAzureBlob.cpp @@ -1,6 +1,5 @@ #include - #if USE_AZURE_BLOB_STORAGE #include #include @@ -22,6 +21,9 @@ 
#include #include #include +#include +#include +#include #include #include @@ -658,7 +660,58 @@ private: } -Pipe StorageAzureBlob::read( +class ReadFromAzureBlob : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromAzureBlob"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromAzureBlob( + Block sample_block, + std::shared_ptr<StorageAzureBlob> storage_, + ReadFromFormatInfo info_, + const bool need_only_count_, + ContextPtr context_, + size_t max_block_size_, + size_t num_streams_) + : SourceStepWithFilter(DataStream{.header = std::move(sample_block)}) + , storage(std::move(storage_)) + , info(std::move(info_)) + , need_only_count(need_only_count_) + , context(std::move(context_)) + , max_block_size(max_block_size_) + , num_streams(num_streams_) + { + } + +private: + std::shared_ptr<StorageAzureBlob> storage; + ReadFromFormatInfo info; + const bool need_only_count; + + ContextPtr context; + + size_t max_block_size; + const size_t num_streams; + + std::shared_ptr<StorageAzureBlobSource::IIterator> iterator_wrapper; + + void createIterator(const ActionsDAG::Node * predicate); +}; + +void ReadFromAzureBlob::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createIterator(predicate); +} + +void StorageAzureBlob::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -670,51 +723,83 @@ if (partition_by && configuration.withWildcard()) throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Reading from a partitioned Azure storage is not implemented yet"); - Pipes pipes; - - std::shared_ptr<StorageAzureBlobSource::IIterator> iterator_wrapper; - if (distributed_processing) - { - iterator_wrapper = std::make_shared<StorageAzureBlobSource::ReadIterator>(local_context, - local_context->getReadTaskCallback()); - } - else if (configuration.withGlobs()) - { - /// Iterate through disclosed globs and make a source for each file - iterator_wrapper = std::make_shared<StorageAzureBlobSource::GlobIterator>( - object_storage.get(), configuration.container, configuration.blob_path, - query_info.query, virtual_columns, local_context, nullptr, local_context->getFileProgressCallback()); - } - else - { - iterator_wrapper = std::make_shared<StorageAzureBlobSource::KeysIterator>( - object_storage.get(), configuration.container, configuration.blobs_paths, - query_info.query, virtual_columns, local_context, nullptr, local_context->getFileProgressCallback()); - } + auto this_ptr = std::static_pointer_cast<StorageAzureBlob>(shared_from_this()); auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(local_context), getVirtuals()); bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) && local_context->getSettingsRef().optimize_count_from_files; + auto reading = std::make_unique<ReadFromAzureBlob>( + read_from_format_info.source_header, + std::move(this_ptr), + std::move(read_from_format_info), + need_only_count, + local_context, + max_block_size, + num_streams); + + query_plan.addStep(std::move(reading)); +} + +void ReadFromAzureBlob::createIterator(const ActionsDAG::Node * predicate) +{ + if (iterator_wrapper) + return; + + const auto & configuration = storage->configuration; + + if (storage->distributed_processing) + { + iterator_wrapper = std::make_shared<StorageAzureBlobSource::ReadIterator>(context, + context->getReadTaskCallback()); + } + else if (configuration.withGlobs()) + { + /// Iterate through disclosed globs and make a source for each file + iterator_wrapper = std::make_shared<StorageAzureBlobSource::GlobIterator>( + storage->object_storage.get(), configuration.container, configuration.blob_path, + predicate, storage->virtual_columns, context, nullptr, context->getFileProgressCallback()); + } + else + { + iterator_wrapper = std::make_shared<StorageAzureBlobSource::KeysIterator>( + storage->object_storage.get(), configuration.container, configuration.blobs_paths, + predicate, storage->virtual_columns, context, nullptr, context->getFileProgressCallback()); + } +} + +void ReadFromAzureBlob::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + createIterator(nullptr); + + const auto & configuration = storage->configuration; + Pipes pipes; + for (size_t i = 0; i < num_streams; ++i) { pipes.emplace_back(std::make_shared<StorageAzureBlobSource>( - read_from_format_info, + info, configuration.format, getName(), - local_context, - format_settings, + context, + storage->format_settings, max_block_size, configuration.compression_method, - object_storage.get(), + storage->object_storage.get(), configuration.container, configuration.connection_url, iterator_wrapper, - need_only_count, - query_info)); + need_only_count)); } - return Pipe::unitePipes(std::move(pipes)); + auto pipe = Pipe::unitePipes(std::move(pipes)); + if (pipe.empty()) + pipe = Pipe(std::make_shared<NullSource>(info.source_header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } SinkToStoragePtr StorageAzureBlob::write(const ASTPtr & query, const StorageMetadataPtr & metadata_snapshot, ContextPtr local_context, bool /*async_insert*/) @@ -821,7 +906,7 @@ StorageAzureBlobSource::GlobIterator::GlobIterator( AzureObjectStorage * object_storage_, const std::string & container_, String blob_path_with_globs_, - ASTPtr query_, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns_, ContextPtr context_, RelativePathsWithMetadata * outer_blobs_, @@ -830,7 +915,6 @@ StorageAzureBlobSource::GlobIterator::GlobIterator( , object_storage(object_storage_) , container(container_) , blob_path_with_globs(blob_path_with_globs_) - , query(query_) , virtual_columns(virtual_columns_) , outer_blobs(outer_blobs_) , file_progress_callback(file_progress_callback_) @@ -862,6 +946,8 @@ StorageAzureBlobSource::GlobIterator::GlobIterator( ErrorCodes::CANNOT_COMPILE_REGEXP, "Cannot compile regex from glob ({}): {}", blob_path_with_globs, matcher->error()); recursive = blob_path_with_globs == "/**" ? true : false; + + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); } RelativePathWithMetadata StorageAzureBlobSource::GlobIterator::next() @@ -901,20 +987,15 @@ RelativePathWithMetadata StorageAzureBlobSource::GlobIterator::next() } index = 0; - if (!is_initialized) - { - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, fs::path(container) / new_batch.front().relative_path, getContext()); - is_initialized = true; - } - if (filter_ast) + if (filter_dag) { std::vector<String> paths; paths.reserve(new_batch.size()); for (auto & path_with_metadata : new_batch) paths.push_back(fs::path(container) / path_with_metadata.relative_path); - VirtualColumnUtils::filterByPathOrFile(new_batch, paths, query, virtual_columns, getContext(), filter_ast); + VirtualColumnUtils::filterByPathOrFile(new_batch, paths, filter_dag, virtual_columns, getContext()); } if (outer_blobs) @@ -940,7 +1021,7 @@ StorageAzureBlobSource::KeysIterator::KeysIterator( AzureObjectStorage * object_storage_, const std::string & container_, const Strings & keys_, - ASTPtr query_, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns_, ContextPtr context_, RelativePathsWithMetadata * outer_blobs, @@ -948,23 +1029,22 @@ : IIterator(context_) , object_storage(object_storage_) , container(container_) - , query(query_) , virtual_columns(virtual_columns_) { Strings all_keys = keys_; ASTPtr filter_ast; if (!all_keys.empty()) - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, fs::path(container) / all_keys[0], getContext()); + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); - if (filter_ast) + if (filter_dag) { Strings paths; paths.reserve(all_keys.size()); for (const auto & key : all_keys) paths.push_back(fs::path(container) / key); - VirtualColumnUtils::filterByPathOrFile(all_keys, paths, query, virtual_columns, getContext(), filter_ast); + VirtualColumnUtils::filterByPathOrFile(all_keys, paths, filter_dag, virtual_columns, getContext()); } for (auto && key : all_keys) @@ -1070,8 +1150,7 @@ StorageAzureBlobSource::StorageAzureBlobSource( const String & container_, const String & connection_url_, std::shared_ptr<IIterator> file_iterator_, - bool need_only_count_, - const SelectQueryInfo & query_info_) + bool need_only_count_) :ISource(info.source_header, false) , WithContext(context_) , requested_columns(info.requested_columns) @@ -1088,7 +1167,6 @@ StorageAzureBlobSource::StorageAzureBlobSource( , connection_url(connection_url_) , file_iterator(file_iterator_) , need_only_count(need_only_count_) - , query_info(query_info_) , create_reader_pool(CurrentMetrics::ObjectStorageAzureThreads, CurrentMetrics::ObjectStorageAzureThreadsActive, CurrentMetrics::ObjectStorageAzureThreadsScheduled, 1) , create_reader_scheduler(threadPoolCallbackRunner<ReaderHolder>(create_reader_pool, "AzureReader")) { 
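Both iterators above delegate to VirtualColumnUtils::filterByPathOrFile: candidate keys are dropped by evaluating the pushed-down predicate over the _path/_file virtual columns before any blob is opened. A standalone sketch of that idea, assuming a simplified PathPredicate in place of the real ActionsDAG evaluation:

#include <algorithm>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical predicate over the _path/_file virtual columns; in the real
// code this is an ActionsDAG node evaluated over a block with one row per key.
using PathPredicate = std::function<bool(const std::string & path, const std::string & file)>;

// Mimics filterByPathOrFile(): drop keys whose virtual columns fail the
// pushed-down predicate before any object is listed further or opened.
static void filterKeys(std::vector<std::string> & keys, const std::string & container, const PathPredicate & predicate)
{
    std::erase_if(keys, [&](const std::string & key)
    {
        std::string path = container + "/" + key;
        std::string file = key.substr(key.find_last_of('/') + 1);
        return !predicate(path, file);
    });
}

int main()
{
    std::vector<std::string> keys = {"data/2024/a.parquet", "data/2023/b.parquet"};
    filterKeys(keys, "mycontainer", [](const std::string & path, const std::string &)
    {
        return path.find("/2024/") != std::string::npos; // e.g. WHERE _path LIKE '%/2024/%'
    });
    for (const auto & k : keys)
        std::printf("%s\n", k.c_str());
}

diff --git a/src/Storages/StorageAzureBlob.h b/src/Storages/StorageAzureBlob.h index 77ddfe07d50..16e5b9edfb6 100644 --- a/src/Storages/StorageAzureBlob.h +++ b/src/Storages/StorageAzureBlob.h @@ -80,7 +80,8 @@ public: return name; } - Pipe read( + void read( + QueryPlan & query_plan, const Names &, const StorageSnapshotPtr &, SelectQueryInfo &, @@ -118,6 +119,8 @@ public: bool distributed_processing = false); private: + friend class ReadFromAzureBlob; + std::string name; Configuration configuration; std::unique_ptr<AzureObjectStorage> object_storage; @@ -148,7 +151,7 @@ public: 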
AzureObjectStorage * object_storage_, const std::string & container_, String blob_path_with_globs_, - ASTPtr query_, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns_, ContextPtr context_, RelativePathsWithMetadata * outer_blobs_, @@ -161,8 +164,7 @@ public: AzureObjectStorage * object_storage; std::string container; String blob_path_with_globs; - ASTPtr query; - ASTPtr filter_ast; + ActionsDAGPtr filter_dag; NamesAndTypesList virtual_columns; size_t index = 0; @@ -176,7 +178,6 @@ public: void createFilterAST(const String & any_key); bool is_finished = false; - bool is_initialized = false; std::mutex next_mutex; std::function file_progress_callback; @@ -204,7 +205,7 @@ public: AzureObjectStorage * object_storage_, const std::string & container_, const Strings & keys_, - ASTPtr query_, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns_, ContextPtr context_, RelativePathsWithMetadata * outer_blobs, @@ -218,7 +219,7 @@ public: std::string container; RelativePathsWithMetadata keys; - ASTPtr query; + ActionsDAGPtr filter_dag; NamesAndTypesList virtual_columns; std::atomic index = 0; @@ -236,8 +237,7 @@ public: const String & container_, const String & connection_url_, std::shared_ptr file_iterator_, - bool need_only_count_, - const SelectQueryInfo & query_info_); + bool need_only_count_); ~StorageAzureBlobSource() override; Chunk generate() override; @@ -263,7 +263,6 @@ private: std::shared_ptr file_iterator; bool need_only_count; size_t total_rows_in_file = 0; - SelectQueryInfo query_info; struct ReaderHolder { diff --git a/src/Storages/StorageAzureBlobCluster.cpp b/src/Storages/StorageAzureBlobCluster.cpp index b8f95458379..a6372577fb0 100644 --- a/src/Storages/StorageAzureBlobCluster.cpp +++ b/src/Storages/StorageAzureBlobCluster.cpp @@ -69,11 +69,11 @@ void StorageAzureBlobCluster::addColumnsStructureToQuery(ASTPtr & query, const S TableFunctionAzureBlobStorageCluster::addColumnsStructureToArguments(expression_list->children, structure, context); } -RemoteQueryExecutor::Extension StorageAzureBlobCluster::getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const +RemoteQueryExecutor::Extension StorageAzureBlobCluster::getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const { auto iterator = std::make_shared( object_storage.get(), configuration.container, configuration.blob_path, - query, virtual_columns, context, nullptr); + predicate, virtual_columns, context, nullptr); auto callback = std::make_shared>([iterator]() mutable -> String{ return iterator->next().relative_path; }); return RemoteQueryExecutor::Extension{ .task_iterator = std::move(callback) }; } diff --git a/src/Storages/StorageAzureBlobCluster.h b/src/Storages/StorageAzureBlobCluster.h index 2900243708c..2831b94f825 100644 --- a/src/Storages/StorageAzureBlobCluster.h +++ b/src/Storages/StorageAzureBlobCluster.h @@ -34,7 +34,7 @@ public: NamesAndTypesList getVirtuals() const override; - RemoteQueryExecutor::Extension getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const override; + RemoteQueryExecutor::Extension getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const override; bool supportsSubcolumns() const override { return true; } diff --git a/src/Storages/StorageDistributed.cpp b/src/Storages/StorageDistributed.cpp index a928a4daf63..7ef2ff08827 100644 --- a/src/Storages/StorageDistributed.cpp +++ b/src/Storages/StorageDistributed.cpp @@ -91,6 +91,7 @@ 
#include #include #include +#include #include #include #include @@ -1068,15 +1069,67 @@ std::optional<QueryPipeline> StorageDistributed::distributedWriteBetweenDistribu return pipeline; } +static ActionsDAGPtr getFilterFromQuery(const ASTPtr & ast, ContextPtr context) +{ + QueryPlan plan; + SelectQueryOptions options; + options.only_analyze = true; + if (context->getSettingsRef().allow_experimental_analyzer) + { + InterpreterSelectQueryAnalyzer interpreter(ast, context, options); + plan = std::move(interpreter).extractQueryPlan(); + } + else + { + InterpreterSelectWithUnionQuery interpreter(ast, context, options); + interpreter.buildQueryPlan(plan); + } + + plan.optimize(QueryPlanOptimizationSettings::fromContext(context)); + + std::stack<QueryPlan::Node *> nodes; + nodes.push(plan.getRootNode()); + + SourceStepWithFilter * source = nullptr; + + while (!nodes.empty()) + { + const auto * node = nodes.top(); + nodes.pop(); + + if (auto * with_filter = dynamic_cast<SourceStepWithFilter *>(node->step.get())) + { + if (source) + { + WriteBufferFromOwnString buf; + plan.explainPlan(buf, {}); + throw Exception(ErrorCodes::LOGICAL_ERROR, + "Found multiple source steps for query\n{}\nPlan\n{}", + queryToString(ast), buf.str()); + } + + source = with_filter; + } + + for (const auto & child : node->children) + nodes.push(child); + } + + if (!source) + return nullptr; + + return ActionsDAG::buildFilterActionsDAG(source->getFilterNodes().nodes, {}, context); +} + std::optional<QueryPipeline> StorageDistributed::distributedWriteFromClusterStorage(const IStorageCluster & src_storage_cluster, const ASTInsertQuery & query, ContextPtr local_context) const { const auto & settings = local_context->getSettingsRef(); - auto & select = query.select->as<ASTSelectWithUnionQuery &>(); + + auto filter = getFilterFromQuery(query.select, local_context); + const ActionsDAG::Node * predicate = nullptr; + if (filter) + predicate = filter->getOutputs().at(0); + /// Select query is needed for pruning on virtual columns - auto extension = src_storage_cluster.getTaskIteratorExtension( - select.list_of_selects->children.at(0)->as<ASTSelectQuery>()->clone(), - local_context); + auto extension = src_storage_cluster.getTaskIteratorExtension(predicate, local_context); auto dst_cluster = getCluster(); 
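The traversal in getFilterFromQuery() above is a small pattern worth isolating: build the plan in analysis-only mode, optimize it, then DFS it expecting exactly one filtering source step, failing loudly on ambiguity. A standalone sketch under simplified assumptions (Node and is_filtering_source stand in for QueryPlan::Node and SourceStepWithFilter):

#include <cstdio>
#include <stack>
#include <stdexcept>
#include <vector>

// Schematic plan node: a step is "filtering" if it can expose pushed-down
// filter conditions (stand-in for SourceStepWithFilter).
struct Node
{
    bool is_filtering_source = false;
    const char * name = "";
    std::vector<Node *> children;
};

// Mimics getFilterFromQuery(): DFS over the optimized plan, expect at most
// one filtering source, throw on ambiguity.
static Node * findSingleSource(Node * root)
{
    std::stack<Node *> nodes;
    nodes.push(root);
    Node * source = nullptr;
    while (!nodes.empty())
    {
        Node * node = nodes.top();
        nodes.pop();
        if (node->is_filtering_source)
        {
            if (source)
                throw std::runtime_error("Found multiple source steps");
            source = node;
        }
        for (Node * child : node->children)
            nodes.push(child);
    }
    return source; // may be nullptr: then no filter can be extracted
}

int main()
{
    Node source{true, "ReadFromFile", {}};
    Node filter{false, "Filter", {&source}};
    Node * found = findSingleSource(&filter);
    std::printf("source: %s\n", found ? found->name : "none");
}

diff --git a/src/Storages/StorageFile.cpp b/src/Storages/StorageFile.cpp index e726e664765..5105a652a11 100644 --- a/src/Storages/StorageFile.cpp +++ b/src/Storages/StorageFile.cpp @@ -9,6 +9,7 @@ #include #include +#include #include #include @@ -37,6 +38,8 @@ #include #include #include +#include +#include #include #include @@ -921,22 +924,21 @@ static std::chrono::seconds getLockTimeout(ContextPtr context) using StorageFilePtr = std::shared_ptr<StorageFile>; - StorageFileSource::FilesIterator::FilesIterator( const Strings & files_, std::optional<StorageFile::ArchiveInfo> archive_info_, - ASTPtr query, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, ContextPtr context_, bool distributed_processing_) : files(files_), archive_info(std::move(archive_info_)), distributed_processing(distributed_processing_), context(context_) { - ASTPtr filter_ast; - if (!distributed_processing && !archive_info && !files.empty() && !files[0].empty()) - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, files[0], context_); + ActionsDAGPtr filter_dag; + if (!distributed_processing && !archive_info && !files.empty()) + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); - if (filter_ast) - VirtualColumnUtils::filterByPathOrFile(files, files, query, virtual_columns, context_, filter_ast); + if (filter_dag) + VirtualColumnUtils::filterByPathOrFile(files, files, filter_dag, 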
virtual_columns, context_); } String StorageFileSource::FilesIterator::next() @@ -966,16 +968,13 @@ const String & StorageFileSource::FilesIterator::getFileNameInArchive() StorageFileSource::StorageFileSource( const ReadFromFormatInfo & info, std::shared_ptr storage_, - const StorageSnapshotPtr & storage_snapshot_, ContextPtr context_, - const SelectQueryInfo & query_info_, UInt64 max_block_size_, FilesIteratorPtr files_iterator_, std::unique_ptr read_buf_, bool need_only_count_) : SourceWithKeyCondition(info.source_header, false) , storage(std::move(storage_)) - , storage_snapshot(storage_snapshot_) , files_iterator(std::move(files_iterator_)) , read_buf(std::move(read_buf_)) , columns_description(info.columns_description) @@ -983,7 +982,6 @@ StorageFileSource::StorageFileSource( , requested_virtual_columns(info.requested_virtual_columns) , block_for_format(info.format_header) , context(context_) - , query_info(query_info_) , max_block_size(max_block_size_) , need_only_count(need_only_count_) { @@ -1050,11 +1048,6 @@ StorageFileSource::~StorageFileSource() beforeDestroy(); } -void StorageFileSource::setKeyCondition(const SelectQueryInfo & query_info_, ContextPtr context_) -{ - setKeyConditionImpl(query_info_, context_, block_for_format); -} - void StorageFileSource::setKeyCondition(const ActionsDAG::NodeRawConstPtrs & nodes, ContextPtr context_) { setKeyConditionImpl(nodes, context_, block_for_format); @@ -1314,14 +1307,64 @@ std::optional StorageFileSource::tryGetNumRowsFromCache(const String & p return schema_cache.tryGetNumRows(key, get_last_mod_time); } -Pipe StorageFile::read( +class ReadFromFile : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromFile"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromFile( + Block sample_block, + std::shared_ptr storage_, + ReadFromFormatInfo info_, + const bool need_only_count_, + ContextPtr context_, + size_t max_block_size_, + size_t num_streams_) + : SourceStepWithFilter(DataStream{.header = std::move(sample_block)}) + , storage(std::move(storage_)) + , info(std::move(info_)) + , need_only_count(need_only_count_) + , context(std::move(context_)) + , max_block_size(max_block_size_) + , max_num_streams(num_streams_) + { + } + +private: + std::shared_ptr storage; + ReadFromFormatInfo info; + const bool need_only_count; + + ContextPtr context; + size_t max_block_size; + const size_t max_num_streams; + + std::shared_ptr files_iterator; + + void createIterator(const ActionsDAG::Node * predicate); +}; + +void ReadFromFile::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createIterator(predicate); +} + +void StorageFile::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, ContextPtr context, QueryProcessingStage::Enum /*processed_stage*/, size_t max_block_size, - const size_t max_num_streams) + size_t num_streams) { if (use_table_fd) { @@ -1338,24 +1381,58 @@ Pipe StorageFile::read( if (p->size() == 1 && !fs::exists(p->at(0))) { - if (context->getSettingsRef().engine_file_empty_if_not_exists) - return Pipe(std::make_shared(storage_snapshot->getSampleBlockForColumns(column_names))); - else + if 
(!context->getSettingsRef().engine_file_empty_if_not_exists) throw Exception(ErrorCodes::FILE_DOESNT_EXIST, "File {} doesn't exist", p->at(0)); + + auto header = storage_snapshot->getSampleBlockForColumns(column_names); + InterpreterSelectQuery::addEmptySourceToQueryPlan(query_plan, header, query_info, context); + return; } } - auto files_iterator = std::make_shared(paths, archive_info, query_info.query, virtual_columns, context, distributed_processing); - auto this_ptr = std::static_pointer_cast(shared_from_this()); + auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(context), getVirtuals()); + bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) + && context->getSettingsRef().optimize_count_from_files; + + auto reading = std::make_unique( + read_from_format_info.source_header, + std::move(this_ptr), + std::move(read_from_format_info), + need_only_count, + context, + max_block_size, + num_streams); + + query_plan.addStep(std::move(reading)); +} + +void ReadFromFile::createIterator(const ActionsDAG::Node * predicate) +{ + if (files_iterator) + return; + + files_iterator = std::make_shared( + storage->paths, + storage->archive_info, + predicate, + storage->virtual_columns, + context, + storage->distributed_processing); +} + +void ReadFromFile::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + createIterator(nullptr); + size_t num_streams = max_num_streams; size_t files_to_read = 0; - if (archive_info) - files_to_read = archive_info->paths_to_archives.size(); + if (storage->archive_info) + files_to_read = storage->archive_info->paths_to_archives.size(); else - files_to_read = paths.size(); + files_to_read = storage->paths.size(); if (max_num_streams > files_to_read) num_streams = files_to_read; @@ -1366,12 +1443,8 @@ Pipe StorageFile::read( /// Set total number of bytes to process. For progress bar. auto progress_callback = context->getFileProgressCallback(); - if (progress_callback && !archive_info) - progress_callback(FileProgress(0, total_bytes_to_read)); - - auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(context), getVirtuals()); - bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) - && context->getSettingsRef().optimize_count_from_files; + if (progress_callback && !storage->archive_info) + progress_callback(FileProgress(0, storage->total_bytes_to_read)); for (size_t i = 0; i < num_streams; ++i) { @@ -1380,22 +1453,35 @@ Pipe StorageFile::read( /// If yes, then we should use it in StorageFileSource. Atomic bool flag is needed /// to prevent data race in case of parallel reads. 
std::unique_ptr read_buffer; - if (has_peekable_read_buffer_from_fd.exchange(false)) - read_buffer = std::move(peekable_read_buffer_from_fd); + if (storage->has_peekable_read_buffer_from_fd.exchange(false)) + read_buffer = std::move(storage->peekable_read_buffer_from_fd); - pipes.emplace_back(std::make_shared( - read_from_format_info, - this_ptr, - storage_snapshot, + auto source = std::make_shared( + info, + storage, context, - query_info, max_block_size, files_iterator, std::move(read_buffer), - need_only_count)); + need_only_count); + + source->setKeyCondition(filter_nodes.nodes, context); + pipes.emplace_back(std::move(source)); } - return Pipe::unitePipes(std::move(pipes)); + auto pipe = Pipe::unitePipes(std::move(pipes)); + size_t output_ports = pipe.numOutputPorts(); + const bool parallelize_output = context->getSettingsRef().parallelize_output_from_storages; + if (parallelize_output && storage->parallelizeOutputAfterReading(context) && output_ports > 0 && output_ports < max_num_streams) + pipe.resize(max_num_streams); + + if (pipe.empty()) + pipe = Pipe(std::make_shared(info.source_header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } diff --git a/src/Storages/StorageFile.h b/src/Storages/StorageFile.h index 1fd3f2e0edf..b74868597a6 100644 --- a/src/Storages/StorageFile.h +++ b/src/Storages/StorageFile.h @@ -53,7 +53,8 @@ public: std::string getName() const override { return "File"; } - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -137,6 +138,7 @@ public: protected: friend class StorageFileSource; friend class StorageFileSink; + friend class ReadFromFile; private: void setStorageMetadata(CommonArguments args); @@ -194,7 +196,7 @@ public: explicit FilesIterator( const Strings & files_, std::optional archive_info_, - ASTPtr query, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, ContextPtr context_, bool distributed_processing_ = false); @@ -234,9 +236,7 @@ private: StorageFileSource( const ReadFromFormatInfo & info, std::shared_ptr storage_, - const StorageSnapshotPtr & storage_snapshot_, ContextPtr context_, - const SelectQueryInfo & query_info_, UInt64 max_block_size_, FilesIteratorPtr files_iterator_, std::unique_ptr read_buf_, @@ -256,8 +256,6 @@ private: return storage->getName(); } - void setKeyCondition(const SelectQueryInfo & query_info_, ContextPtr context_) override; - void setKeyCondition(const ActionsDAG::NodeRawConstPtrs & nodes, ContextPtr context_) override; bool tryGetCountFromCache(const struct stat & file_stat); @@ -269,7 +267,6 @@ private: std::optional tryGetNumRowsFromCache(const String & path, time_t last_mod_time) const; std::shared_ptr storage; - StorageSnapshotPtr storage_snapshot; FilesIteratorPtr files_iterator; String current_path; std::optional current_file_size; @@ -290,7 +287,6 @@ private: Block block_for_format; ContextPtr context; /// TODO Untangle potential issues with context lifetime. 
- SelectQueryInfo query_info; UInt64 max_block_size; bool finished_generate = false; diff --git a/src/Storages/StorageFileCluster.cpp b/src/Storages/StorageFileCluster.cpp index 782c36c9819..c12124f1e07 100644 --- a/src/Storages/StorageFileCluster.cpp +++ b/src/Storages/StorageFileCluster.cpp @@ -71,9 +71,9 @@ void StorageFileCluster::addColumnsStructureToQuery(ASTPtr & query, const String TableFunctionFileCluster::addColumnsStructureToArguments(expression_list->children, structure, context); } -RemoteQueryExecutor::Extension StorageFileCluster::getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const +RemoteQueryExecutor::Extension StorageFileCluster::getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const { - auto iterator = std::make_shared(paths, std::nullopt, query, virtual_columns, context); + auto iterator = std::make_shared(paths, std::nullopt, predicate, virtual_columns, context); auto callback = std::make_shared([iter = std::move(iterator)]() mutable -> String { return iter->next(); }); return RemoteQueryExecutor::Extension{.task_iterator = std::move(callback)}; } diff --git a/src/Storages/StorageFileCluster.h b/src/Storages/StorageFileCluster.h index e907fbad0de..a6e57c3bb4f 100644 --- a/src/Storages/StorageFileCluster.h +++ b/src/Storages/StorageFileCluster.h @@ -31,7 +31,7 @@ public: NamesAndTypesList getVirtuals() const override { return virtual_columns; } - RemoteQueryExecutor::Extension getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const override; + RemoteQueryExecutor::Extension getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const override; bool supportsSubcolumns() const override { return true; } diff --git a/src/Storages/StorageMaterializedView.h b/src/Storages/StorageMaterializedView.h index abca5833d26..4678060d81f 100644 --- a/src/Storages/StorageMaterializedView.h +++ b/src/Storages/StorageMaterializedView.h @@ -73,6 +73,7 @@ public: StoragePtr getTargetTable() const; StoragePtr tryGetTargetTable() const; + StorageID getTargetTableId() const; /// Get the virtual column of the target table; NamesAndTypesList getVirtuals() const override; @@ -119,7 +120,6 @@ private: std::tuple> prepareRefresh() const; StorageID exchangeTargetTable(StorageID fresh_table, ContextPtr refresh_context); - StorageID getTargetTableId() const; void setTargetTableId(StorageID id); void updateTargetTableId(std::optional database_name, std::optional table_name); }; diff --git a/src/Storages/StorageMerge.cpp b/src/Storages/StorageMerge.cpp index 868dbc4b231..5d4f50baa53 100644 --- a/src/Storages/StorageMerge.cpp +++ b/src/Storages/StorageMerge.cpp @@ -80,7 +80,6 @@ namespace ErrorCodes { extern const int BAD_ARGUMENTS; extern const int NOT_IMPLEMENTED; - extern const int ILLEGAL_PREWHERE; extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH; extern const int SAMPLING_NOT_SUPPORTED; extern const int ALTER_OF_COLUMN_IS_FORBIDDEN; @@ -88,6 +87,20 @@ namespace ErrorCodes extern const int LOGICAL_ERROR; } +StorageMerge::DatabaseNameOrRegexp::DatabaseNameOrRegexp( + const String & source_database_name_or_regexp_, + bool database_is_regexp_, + std::optional source_database_regexp_, + std::optional source_table_regexp_, + std::optional source_databases_and_tables_) + : source_database_name_or_regexp(source_database_name_or_regexp_) + , database_is_regexp(database_is_regexp_) + , source_database_regexp(std::move(source_database_regexp_)) + , 
source_table_regexp(std::move(source_table_regexp_)) + , source_databases_and_tables(std::move(source_databases_and_tables_)) +{ +} + StorageMerge::StorageMerge( const StorageID & table_id_, const ColumnsDescription & columns_, @@ -98,10 +111,11 @@ StorageMerge::StorageMerge( ContextPtr context_) : IStorage(table_id_) , WithContext(context_->getGlobalContext()) - , source_database_regexp(source_database_name_or_regexp_) - , source_databases_and_tables(source_databases_and_tables_) - , source_database_name_or_regexp(source_database_name_or_regexp_) - , database_is_regexp(database_is_regexp_) + , database_name_or_regexp( + source_database_name_or_regexp_, + database_is_regexp_, + source_database_name_or_regexp_, {}, + source_databases_and_tables_) { StorageInMemoryMetadata storage_metadata; storage_metadata.setColumns(columns_.empty() ? getColumnsDescriptionFromSourceTables() : columns_); @@ -119,10 +133,11 @@ StorageMerge::StorageMerge( ContextPtr context_) : IStorage(table_id_) , WithContext(context_->getGlobalContext()) - , source_database_regexp(source_database_name_or_regexp_) - , source_table_regexp(source_table_regexp_) - , source_database_name_or_regexp(source_database_name_or_regexp_) - , database_is_regexp(database_is_regexp_) + , database_name_or_regexp( + source_database_name_or_regexp_, + database_is_regexp_, + source_database_name_or_regexp_, + source_table_regexp_, {}) { StorageInMemoryMetadata storage_metadata; storage_metadata.setColumns(columns_.empty() ? getColumnsDescriptionFromSourceTables() : columns_); @@ -130,6 +145,11 @@ StorageMerge::StorageMerge( setInMemoryMetadata(storage_metadata); } +StorageMerge::DatabaseTablesIterators StorageMerge::getDatabaseIterators(ContextPtr context_) const +{ + return database_name_or_regexp.getDatabaseIterators(context_); +} + ColumnsDescription StorageMerge::getColumnsDescriptionFromSourceTables() const { auto table = getFirstTable([](auto && t) { return t; }); @@ -141,7 +161,7 @@ ColumnsDescription StorageMerge::getColumnsDescriptionFromSourceTables() const template StoragePtr StorageMerge::getFirstTable(F && predicate) const { - auto database_table_iterators = getDatabaseIterators(getContext()); + auto database_table_iterators = database_name_or_regexp.getDatabaseIterators(getContext()); for (auto & iterator : database_table_iterators) { @@ -236,7 +256,6 @@ std::optional StorageMerge::supportedPrewhereColumns() const return supported_columns; } - QueryProcessingStage::Enum StorageMerge::getQueryProcessingStage( ContextPtr local_context, QueryProcessingStage::Enum to_stage, @@ -255,7 +274,7 @@ QueryProcessingStage::Enum StorageMerge::getQueryProcessingStage( auto stage_in_source_tables = QueryProcessingStage::FetchColumns; - DatabaseTablesIterators database_table_iterators = getDatabaseIterators(local_context); + DatabaseTablesIterators database_table_iterators = database_name_or_regexp.getDatabaseIterators(local_context); size_t selected_table_size = 0; @@ -297,45 +316,6 @@ void StorageMerge::read( */ auto modified_context = Context::createCopy(local_context); modified_context->setSetting("optimize_move_to_prewhere", false); - - bool has_database_virtual_column = false; - bool has_table_virtual_column = false; - Names real_column_names; - real_column_names.reserve(column_names.size()); - - for (const auto & column_name : column_names) - { - if (column_name == "_database" && isVirtualColumn(column_name, storage_snapshot->metadata)) - has_database_virtual_column = true; - else if (column_name == "_table" && 
isVirtualColumn(column_name, storage_snapshot->metadata)) - has_table_virtual_column = true; - else - real_column_names.push_back(column_name); - } - - StorageListWithLocks selected_tables - = getSelectedTables(modified_context, query_info.query, has_database_virtual_column, has_table_virtual_column); - - InputOrderInfoPtr input_sorting_info; - if (query_info.order_optimizer) - { - for (auto it = selected_tables.begin(); it != selected_tables.end(); ++it) - { - auto storage_ptr = std::get<1>(*it); - auto storage_metadata_snapshot = storage_ptr->getInMemoryMetadataPtr(); - auto current_info = query_info.order_optimizer->getInputOrder(storage_metadata_snapshot, modified_context); - if (it == selected_tables.begin()) - input_sorting_info = current_info; - else if (!current_info || (input_sorting_info && *current_info != *input_sorting_info)) - input_sorting_info.reset(); - - if (!input_sorting_info) - break; - } - - query_info.input_order_info = input_sorting_info; - } - query_plan.addInterpreterContext(modified_context); /// What will be result structure depending on query processed stage in source tables? @@ -343,10 +323,7 @@ void StorageMerge::read( auto step = std::make_unique( common_header, - std::move(selected_tables), - real_column_names, - has_database_virtual_column, - has_table_virtual_column, + column_names, max_block_size, num_streams, shared_from_this(), @@ -358,43 +335,9 @@ void StorageMerge::read( query_plan.addStep(std::move(step)); } -/// An object of this helper class is created -/// when processing a Merge table data source (subordinary table) -/// that has row policies -/// to guarantee that these row policies are applied -class ReadFromMerge::RowPolicyData -{ -public: - RowPolicyData(RowPolicyFilterPtr, std::shared_ptr, ContextPtr); - - /// Add to data stream columns that are needed only for row policies - /// SELECT x from T if T has row policy y=42 - /// required y in data pipeline - void extendNames(Names &) const; - - /// Use storage facilities to filter data - /// optimization - /// does not guarantee accuracy, but reduces number of rows - void addStorageFilter(SourceStepWithFilter *) const; - - /// Create explicit filter transform to exclude - /// rows that are not conform to row level policy - void addFilterTransform(QueryPipelineBuilder &) const; - -private: - std::string filter_column_name; // complex filter, may contain logic operations - ActionsDAGPtr actions_dag; - ExpressionActionsPtr filter_actions; - StorageMetadataPtr storage_metadata_snapshot; -}; - - ReadFromMerge::ReadFromMerge( Block common_header_, - StorageListWithLocks selected_tables_, - Names column_names_, - bool has_database_virtual_column_, - bool has_table_virtual_column_, + Names all_column_names_, size_t max_block_size, size_t num_streams, StoragePtr storage, @@ -406,21 +349,19 @@ ReadFromMerge::ReadFromMerge( , required_max_block_size(max_block_size) , requested_num_streams(num_streams) , common_header(std::move(common_header_)) - , selected_tables(std::move(selected_tables_)) - , column_names(std::move(column_names_)) - , has_database_virtual_column(has_database_virtual_column_) - , has_table_virtual_column(has_table_virtual_column_) + , all_column_names(std::move(all_column_names_)) , storage_merge(std::move(storage)) , merge_storage_snapshot(std::move(storage_snapshot)) , query_info(query_info_) , context(std::move(context_)) , common_processed_stage(processed_stage) { - createChildPlans(); } void ReadFromMerge::initializePipeline(QueryPipelineBuilder & pipeline, const 
BuildQueryPipelineSettings &) { + filterTablesAndCreateChildrenPlans(); + if (selected_tables.empty()) { pipeline.init(Pipe(std::make_shared(output_stream->header))); @@ -430,13 +371,10 @@ void ReadFromMerge::initializePipeline(QueryPipelineBuilder & pipeline, const Bu QueryPlanResourceHolder resources; std::vector> pipelines; - chassert(selected_tables.size() == child_plans.size()); - chassert(selected_tables.size() == table_aliases.size()); - chassert(selected_tables.size() == table_row_policy_data_opts.size()); auto table_it = selected_tables.begin(); for (size_t i = 0; i < selected_tables.size(); ++i, ++table_it) { - auto & plan = child_plans.at(i); + auto & child_plan = child_plans->at(i); const auto & table = *table_it; const auto storage = std::get<1>(table); @@ -446,13 +384,13 @@ void ReadFromMerge::initializePipeline(QueryPipelineBuilder & pipeline, const Bu auto modified_query_info = getModifiedQueryInfo(query_info, context, table, nested_storage_snaphsot); auto source_pipeline = createSources( - plan, + child_plan.plan, nested_storage_snaphsot, modified_query_info, common_processed_stage, common_header, - table_aliases.at(i), - table_row_policy_data_opts.at(i), + child_plan.table_aliases, + child_plan.row_policy_data_opt, table, context); @@ -490,10 +428,37 @@ void ReadFromMerge::initializePipeline(QueryPipelineBuilder & pipeline, const Bu pipeline.addResources(std::move(resources)); } -void ReadFromMerge::createChildPlans() +void ReadFromMerge::filterTablesAndCreateChildrenPlans() +{ + if (child_plans) + return; + + has_database_virtual_column = false; + has_table_virtual_column = false; + column_names.clear(); + column_names.reserve(column_names.size()); + + for (const auto & column_name : all_column_names) + { + if (column_name == "_database" && storage_merge->isVirtualColumn(column_name, merge_storage_snapshot->metadata)) + has_database_virtual_column = true; + else if (column_name == "_table" && storage_merge->isVirtualColumn(column_name, merge_storage_snapshot->metadata)) + has_table_virtual_column = true; + else + column_names.push_back(column_name); + } + + selected_tables = getSelectedTables(context, has_database_virtual_column, has_table_virtual_column); + + child_plans = createChildrenPlans(query_info); +} + +std::vector ReadFromMerge::createChildrenPlans(SelectQueryInfo & query_info_) const { if (selected_tables.empty()) - return; + return {}; + + std::vector res; size_t tables_count = selected_tables.size(); Float64 num_streams_multiplier @@ -503,7 +468,7 @@ void ReadFromMerge::createChildPlans() if (order_info) { - query_info.input_order_info = order_info; + query_info_.input_order_info = order_info; } else if (query_info.order_optimizer) { @@ -522,7 +487,7 @@ void ReadFromMerge::createChildPlans() break; } - query_info.input_order_info = input_sorting_info; + query_info_.input_order_info = input_sorting_info; } for (const auto & table : selected_tables) @@ -542,8 +507,10 @@ void ReadFromMerge::createChildPlans() if (sampling_requested && !storage->supportsSampling()) throw Exception(ErrorCodes::SAMPLING_NOT_SUPPORTED, "Illegal SAMPLE: table {} doesn't support sampling", storage->getStorageID().getNameForLogs()); - auto & aliases = table_aliases.emplace_back(); - auto & row_policy_data_opt = table_row_policy_data_opts.emplace_back(); + res.emplace_back(); + + auto & aliases = res.back().table_aliases; + auto & row_policy_data_opt = res.back().row_policy_data_opt; auto storage_metadata_snapshot = storage->getInMemoryMetadataPtr(); auto nested_storage_snaphsot = 
storage->getStorageSnapshot(storage_metadata_snapshot, context); @@ -616,7 +583,7 @@ void ReadFromMerge::createChildPlans() } } - child_plans.emplace_back(createPlanForTable( + res.back().plan = createPlanForTable( nested_storage_snaphsot, modified_query_info, common_processed_stage, @@ -625,8 +592,10 @@ void ReadFromMerge::createChildPlans() column_names_as_aliases.empty() ? std::move(real_column_names) : std::move(column_names_as_aliases), row_policy_data_opt, context, - current_streams)); + current_streams); } + + return res; } SelectQueryInfo ReadFromMerge::getModifiedQueryInfo(const SelectQueryInfo & query_info, @@ -804,7 +773,7 @@ QueryPlan ReadFromMerge::createPlanForTable( Names && real_column_names, const RowPolicyDataOpt & row_policy_data_opt, ContextMutablePtr modified_context, - size_t streams_num) + size_t streams_num) const { const auto & [database_name, storage, _, table_name] = storage_with_lock; @@ -967,21 +936,14 @@ void ReadFromMerge::RowPolicyData::addFilterTransform(QueryPipelineBuilder & bui }); } -StorageMerge::StorageListWithLocks StorageMerge::getSelectedTables( +StorageMerge::StorageListWithLocks ReadFromMerge::getSelectedTables( ContextPtr query_context, - const ASTPtr & query /* = nullptr */, - bool filter_by_database_virtual_column /* = false */, - bool filter_by_table_virtual_column /* = false */) const + bool filter_by_database_virtual_column, + bool filter_by_table_virtual_column) const { - /// FIXME: filtering does not work with allow_experimental_analyzer due to - /// different column names there (it has "table_name._table" not just - /// "_table") - - assert(!filter_by_database_virtual_column || !filter_by_table_virtual_column || query); - const Settings & settings = query_context->getSettingsRef(); - StorageListWithLocks selected_tables; - DatabaseTablesIterators database_table_iterators = getDatabaseIterators(getContext()); + StorageListWithLocks res; + DatabaseTablesIterators database_table_iterators = assert_cast(*storage_merge).getDatabaseIterators(query_context); MutableColumnPtr database_name_virtual_column; MutableColumnPtr table_name_virtual_column; @@ -1005,13 +967,10 @@ StorageMerge::StorageListWithLocks StorageMerge::getSelectedTables( if (!storage) continue; - if (query && query->as()->prewhere() && !storage->supportsPrewhere()) - throw Exception(ErrorCodes::ILLEGAL_PREWHERE, "Storage {} doesn't support PREWHERE.", storage->getName()); - - if (storage.get() != this) + if (storage.get() != storage_merge.get()) { auto table_lock = storage->lockForShare(query_context->getCurrentQueryId(), settings.lock_acquire_timeout); - selected_tables.emplace_back(iterator->databaseName(), storage, std::move(table_lock), iterator->name()); + res.emplace_back(iterator->databaseName(), storage, std::move(table_lock), iterator->name()); if (filter_by_table_virtual_column) table_name_virtual_column->insert(iterator->name()); } @@ -1020,33 +979,42 @@ StorageMerge::StorageListWithLocks StorageMerge::getSelectedTables( } } + if (!filter_by_database_virtual_column && !filter_by_table_virtual_column) + return res; + + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + if (!filter_actions_dag) + return res; + + const auto * predicate = filter_actions_dag->getOutputs().at(0); + if (filter_by_database_virtual_column) { /// Filter names of selected tables if there is a condition on "_database" virtual column in WHERE clause Block virtual_columns_block = Block{ColumnWithTypeAndName(std::move(database_name_virtual_column), 
std::make_shared(), "_database")}; - VirtualColumnUtils::filterBlockWithQuery(query, virtual_columns_block, query_context); + VirtualColumnUtils::filterBlockWithPredicate(predicate, virtual_columns_block, query_context); auto values = VirtualColumnUtils::extractSingleValueFromBlock(virtual_columns_block, "_database"); /// Remove unused databases from the list - selected_tables.remove_if([&](const auto & elem) { return values.find(std::get<0>(elem)) == values.end(); }); + res.remove_if([&](const auto & elem) { return values.find(std::get<0>(elem)) == values.end(); }); } if (filter_by_table_virtual_column) { /// Filter names of selected tables if there is a condition on "_table" virtual column in WHERE clause Block virtual_columns_block = Block{ColumnWithTypeAndName(std::move(table_name_virtual_column), std::make_shared(), "_table")}; - VirtualColumnUtils::filterBlockWithQuery(query, virtual_columns_block, query_context); + VirtualColumnUtils::filterBlockWithPredicate(predicate, virtual_columns_block, query_context); auto values = VirtualColumnUtils::extractSingleValueFromBlock(virtual_columns_block, "_table"); /// Remove unused tables from the list - selected_tables.remove_if([&](const auto & elem) { return values.find(std::get<3>(elem)) == values.end(); }); + res.remove_if([&](const auto & elem) { return values.find(std::get<3>(elem)) == values.end(); }); } - return selected_tables; + return res; } -DatabaseTablesIteratorPtr StorageMerge::getDatabaseIterator(const String & database_name, ContextPtr local_context) const +DatabaseTablesIteratorPtr StorageMerge::DatabaseNameOrRegexp::getDatabaseIterator(const String & database_name, ContextPtr local_context) const { auto database = DatabaseCatalog::instance().getDatabase(database_name); @@ -1066,7 +1034,7 @@ DatabaseTablesIteratorPtr StorageMerge::getDatabaseIterator(const String & datab return database->getTablesIterator(local_context, table_name_match); } -StorageMerge::DatabaseTablesIterators StorageMerge::getDatabaseIterators(ContextPtr local_context) const +StorageMerge::DatabaseTablesIterators StorageMerge::DatabaseNameOrRegexp::getDatabaseIterators(ContextPtr local_context) const { try { @@ -1191,8 +1159,16 @@ void ReadFromMerge::convertAndFilterSourceStream( }); } +const ReadFromMerge::StorageListWithLocks & ReadFromMerge::getSelectedTables() +{ + filterTablesAndCreateChildrenPlans(); + return selected_tables; +} + bool ReadFromMerge::requestReadingInOrder(InputOrderInfoPtr order_info_) { + filterTablesAndCreateChildrenPlans(); + /// Disable read-in-order optimization for reverse order with final. /// Otherwise, it can lead to incorrect final behavior because the implementation may rely on the reading in direct order). 
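/// Illustrative example (hypothetical table, not from the patch): for a table `t`
/// with engine ReplacingMergeTree ORDER BY k, a query such as
///     SELECT * FROM t FINAL ORDER BY k DESC
/// would need reverse read-in-order, and FINAL's merging may rely on rows arriving
/// in the direct order; the check below therefore rejects this combination.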
if (order_info_->direction != 1 && InterpreterSelectQuery::isQueryWithFinal(query_info)) @@ -1205,9 +1181,9 @@ bool ReadFromMerge::requestReadingInOrder(InputOrderInfoPtr order_info_) }; bool ok = true; - for (const auto & plan : child_plans) - if (plan.isInitialized()) - ok &= recursivelyApplyToReadingSteps(plan.getRootNode(), request_read_in_order); + for (const auto & child_plan : *child_plans) + if (child_plan.plan.isInitialized()) + ok &= recursivelyApplyToReadingSteps(child_plan.plan.getRootNode(), request_read_in_order); if (!ok) return false; @@ -1234,9 +1210,11 @@ void ReadFromMerge::applyFilters(const QueryPlan & plan) const void ReadFromMerge::applyFilters() { - for (const auto & plan : child_plans) - if (plan.isInitialized()) - applyFilters(plan); + filterTablesAndCreateChildrenPlans(); + + for (const auto & child_plan : *child_plans) + if (child_plan.plan.isInitialized()) + applyFilters(child_plan.plan); } IStorage::ColumnSizeByName StorageMerge::getColumnSizes() const diff --git a/src/Storages/StorageMerge.h b/src/Storages/StorageMerge.h index 97e453facdf..703e5db9c50 100644 --- a/src/Storages/StorageMerge.h +++ b/src/Storages/StorageMerge.h @@ -12,6 +12,9 @@ namespace DB struct QueryPlanResourceHolder; +struct RowPolicyFilter; +using RowPolicyFilterPtr = std::shared_ptr; + /** A table that represents the union of an arbitrary number of other tables. * All tables must have the same structure. */ @@ -78,24 +81,36 @@ public: std::optional totalRows(const Settings & settings) const override; std::optional totalBytes(const Settings & settings) const override; + using DatabaseTablesIterators = std::vector; + DatabaseTablesIterators getDatabaseIterators(ContextPtr context) const; + private: - std::optional source_database_regexp; - std::optional source_table_regexp; - std::optional source_databases_and_tables; - - String source_database_name_or_regexp; - bool database_is_regexp = false; - /// (Database, Table, Lock, TableName) using StorageWithLockAndName = std::tuple; using StorageListWithLocks = std::list; - using DatabaseTablesIterators = std::vector; - StorageMerge::StorageListWithLocks getSelectedTables( - ContextPtr query_context, - const ASTPtr & query = nullptr, - bool filter_by_database_virtual_column = false, - bool filter_by_table_virtual_column = false) const; + struct DatabaseNameOrRegexp + { + String source_database_name_or_regexp; + bool database_is_regexp = false; + + std::optional source_database_regexp; + std::optional source_table_regexp; + std::optional source_databases_and_tables; + + DatabaseNameOrRegexp( + const String & source_database_name_or_regexp_, + bool database_is_regexp_, + std::optional source_database_regexp_, + std::optional source_table_regexp_, + std::optional source_databases_and_tables_); + + DatabaseTablesIteratorPtr getDatabaseIterator(const String & database_name, ContextPtr context) const; + + DatabaseTablesIterators getDatabaseIterators(ContextPtr context) const; + }; + + DatabaseNameOrRegexp database_name_or_regexp; template StoragePtr getFirstTable(F && predicate) const; @@ -103,10 +118,6 @@ private: template void forEachTable(F && func) const; - DatabaseTablesIteratorPtr getDatabaseIterator(const String & database_name, ContextPtr context) const; - - DatabaseTablesIterators getDatabaseIterators(ContextPtr context) const; - NamesAndTypesList getVirtuals() const override; ColumnSizeByName getColumnSizes() const override; @@ -132,10 +143,7 @@ public: ReadFromMerge( Block common_header_, - StorageListWithLocks selected_tables_, - Names 
column_names_, - bool has_database_virtual_column_, - bool has_table_virtual_column_, + Names all_column_names_, size_t max_block_size, size_t num_streams, StoragePtr storage, @@ -146,7 +154,7 @@ public: void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; - const StorageListWithLocks & getSelectedTables() const { return selected_tables; } + const StorageListWithLocks & getSelectedTables(); /// Returns `false` if requested reading cannot be performed. bool requestReadingInOrder(InputOrderInfoPtr order_info_); @@ -159,16 +167,13 @@ private: const Block common_header; StorageListWithLocks selected_tables; + Names all_column_names; Names column_names; bool has_database_virtual_column; bool has_table_virtual_column; StoragePtr storage_merge; StorageSnapshotPtr merge_storage_snapshot; - /// Store read plan for each child table. - /// It's needed to guarantee lifetime for child steps to be the same as for this step (mainly for EXPLAIN PIPELINE). - std::vector child_plans; - SelectQueryInfo query_info; ContextMutablePtr context; QueryProcessingStage::Enum common_processed_stage; @@ -184,14 +189,52 @@ private: using Aliases = std::vector; - class RowPolicyData; + /// An object of this helper class is created when processing a Merge table + /// data source (an underlying table) that has row policies, + /// to guarantee that those row policies are applied + class RowPolicyData + { + public: + RowPolicyData(RowPolicyFilterPtr, std::shared_ptr, ContextPtr); + + /// Add to the data stream the columns that are needed only for row policies, + /// e.g. for SELECT x FROM t with a row policy y=42, + /// column y is required in the data pipeline + void extendNames(Names &) const; + + /// Use storage facilities to filter data: an optimization that + /// does not guarantee accuracy, but reduces the number of rows + void addStorageFilter(SourceStepWithFilter *) const; + + /// Create an explicit filter transform to exclude + /// rows that do not conform to the row-level policy + void addFilterTransform(QueryPipelineBuilder &) const; + + private: + std::string filter_column_name; // complex filter, may contain logic operations + ActionsDAGPtr actions_dag; + ExpressionActionsPtr filter_actions; + StorageMetadataPtr storage_metadata_snapshot; + }; + using RowPolicyDataOpt = std::optional; - std::vector table_aliases; + struct ChildPlan + { + QueryPlan plan; + Aliases table_aliases; + RowPolicyDataOpt row_policy_data_opt; + }; - std::vector table_row_policy_data_opts; + /// Store read plan for each child table. + /// It's needed to guarantee lifetime for child steps to be the same as for this step (mainly for EXPLAIN PIPELINE).
+ std::optional> child_plans; - void createChildPlans(); + std::vector createChildrenPlans(SelectQueryInfo & query_info_) const; + + void filterTablesAndCreateChildrenPlans(); void applyFilters(const QueryPlan & plan) const; @@ -204,7 +247,7 @@ private: Names && real_column_names, const RowPolicyDataOpt & row_policy_data_opt, ContextMutablePtr modified_context, - size_t streams_num); + size_t streams_num) const; QueryPipelineBuilderPtr createSources( QueryPlan & plan, @@ -231,6 +274,11 @@ private: ContextPtr context, QueryPipelineBuilder & builder, QueryProcessingStage::Enum processed_stage); + + StorageMerge::StorageListWithLocks getSelectedTables( + ContextPtr query_context, + bool filter_by_database_virtual_column, + bool filter_by_table_virtual_column) const; }; } diff --git a/src/Storages/StorageMergeTree.cpp b/src/Storages/StorageMergeTree.cpp index e7ca50f4a5c..b8804ad3c6d 100644 --- a/src/Storages/StorageMergeTree.cpp +++ b/src/Storages/StorageMergeTree.cpp @@ -262,10 +262,10 @@ std::optional StorageMergeTree::totalRows(const Settings &) const return getTotalActiveSizeInRows(); } -std::optional StorageMergeTree::totalRowsByPartitionPredicate(const SelectQueryInfo & query_info, ContextPtr local_context) const +std::optional StorageMergeTree::totalRowsByPartitionPredicate(const ActionsDAGPtr & filter_actions_dag, ContextPtr local_context) const { auto parts = getVisibleDataPartsVector(local_context); - return totalRowsByPartitionPredicateImpl(query_info, local_context, parts); + return totalRowsByPartitionPredicateImpl(filter_actions_dag, local_context, parts); } std::optional StorageMergeTree::totalBytes(const Settings &) const diff --git a/src/Storages/StorageMergeTree.h b/src/Storages/StorageMergeTree.h index b2829ecb17f..51bf6aa42e7 100644 --- a/src/Storages/StorageMergeTree.h +++ b/src/Storages/StorageMergeTree.h @@ -66,7 +66,7 @@ public: size_t num_streams) override; std::optional totalRows(const Settings &) const override; - std::optional totalRowsByPartitionPredicate(const SelectQueryInfo &, ContextPtr) const override; + std::optional totalRowsByPartitionPredicate(const ActionsDAGPtr & filter_actions_dag, ContextPtr) const override; std::optional totalBytes(const Settings &) const override; std::optional totalBytesUncompressed(const Settings &) const override; diff --git a/src/Storages/StorageReplicatedMergeTree.cpp b/src/Storages/StorageReplicatedMergeTree.cpp index eefcab01236..a8404052c59 100644 --- a/src/Storages/StorageReplicatedMergeTree.cpp +++ b/src/Storages/StorageReplicatedMergeTree.cpp @@ -18,6 +18,7 @@ #include #include #include +#include #include @@ -147,6 +148,12 @@ namespace CurrentMetrics namespace DB { +namespace FailPoints +{ + extern const char replicated_queue_fail_next_entry[]; + extern const char replicated_queue_unfail_entries[]; +} + namespace ErrorCodes { extern const int CANNOT_READ_ALL_DATA; @@ -191,6 +198,7 @@ namespace ErrorCodes extern const int TABLE_IS_DROPPED; extern const int CANNOT_BACKUP_TABLE; extern const int SUPPORT_IS_DISABLED; + extern const int FAULT_INJECTED; } namespace ActionLocks @@ -1737,14 +1745,12 @@ bool StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps( if (replica_part_header.getColumnsHash() != local_part_header.getColumnsHash()) { - /// Currently there are two (known) cases when it may happen: + /// Currently there is only one (known) case when it may happen: /// - KILL MUTATION query had removed mutation before all replicas have executed assigned MUTATE_PART entries.
/// Some replicas may skip this mutation and update part version without actually applying any changes. /// It leads to mismatching checksum if changes were applied on other replicas. - /// - ALTER_METADATA and MERGE_PARTS were reordered on some replicas. - /// It may lead to different number of columns in merged parts on these replicas. throw Exception(ErrorCodes::CHECKSUM_DOESNT_MATCH, "Part {} from {} has different columns hash " - "(it may rarely happen on race condition with KILL MUTATION or ALTER COLUMN).", part_name, replica); + "(it may rarely happen on race condition with KILL MUTATION).", part_name, replica); } replica_part_header.getChecksums().checkEqual(local_part_header.getChecksums(), true); @@ -1931,6 +1937,17 @@ MergeTreeData::MutableDataPartPtr StorageReplicatedMergeTree::attachPartHelperFo bool StorageReplicatedMergeTree::executeLogEntry(LogEntry & entry) { + fiu_do_on(FailPoints::replicated_queue_fail_next_entry, + { + entry.fault_injected = true; + }); + fiu_do_on(FailPoints::replicated_queue_unfail_entries, + { + entry.fault_injected = false; + }); + if (entry.fault_injected) + throw Exception(ErrorCodes::FAULT_INJECTED, "Injecting fault for log entry {}", entry.getDescriptionForLogs(format_version)); + if (entry.type == LogEntry::DROP_RANGE || entry.type == LogEntry::DROP_PART) { executeDropRange(entry); @@ -5453,11 +5470,11 @@ std::optional StorageReplicatedMergeTree::totalRows(const Settings & set return res; } -std::optional StorageReplicatedMergeTree::totalRowsByPartitionPredicate(const SelectQueryInfo & query_info, ContextPtr local_context) const +std::optional StorageReplicatedMergeTree::totalRowsByPartitionPredicate(const ActionsDAGPtr & filter_actions_dag, ContextPtr local_context) const { DataPartsVector parts; foreachActiveParts([&](auto & part) { parts.push_back(part); }, local_context->getSettingsRef().select_sequential_consistency); - return totalRowsByPartitionPredicateImpl(query_info, local_context, parts); + return totalRowsByPartitionPredicateImpl(filter_actions_dag, local_context, parts); } std::optional StorageReplicatedMergeTree::totalBytes(const Settings & settings) const diff --git a/src/Storages/StorageReplicatedMergeTree.h b/src/Storages/StorageReplicatedMergeTree.h index 556d23d6903..2bd1fcbc693 100644 --- a/src/Storages/StorageReplicatedMergeTree.h +++ b/src/Storages/StorageReplicatedMergeTree.h @@ -163,7 +163,7 @@ public: size_t num_streams) override; std::optional totalRows(const Settings & settings) const override; - std::optional totalRowsByPartitionPredicate(const SelectQueryInfo & query_info, ContextPtr context) const override; + std::optional totalRowsByPartitionPredicate(const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override; std::optional totalBytes(const Settings & settings) const override; std::optional totalBytesUncompressed(const Settings & settings) const override; diff --git a/src/Storages/StorageS3.cpp b/src/Storages/StorageS3.cpp index 60ae7f219f4..d7cc86ed321 100644 --- a/src/Storages/StorageS3.cpp +++ b/src/Storages/StorageS3.cpp @@ -1,6 +1,4 @@ #include "config.h" -#include -#include "Parsers/ASTCreateQuery.h" #if USE_AWS_S3 @@ -16,6 +14,7 @@ #include #include +#include #include #include @@ -42,6 +41,7 @@ #include #include #include +#include #include @@ -57,6 +57,7 @@ #include #include #include +#include #include #include @@ -146,7 +147,8 @@ public: const Names & column_names_, StorageSnapshotPtr storage_snapshot_, StorageS3 & storage_, - SelectQueryInfo query_info_, + ReadFromFormatInfo 
read_from_format_info_, + bool need_only_count_, ContextPtr context_, size_t max_block_size_, size_t num_streams_) @@ -154,106 +156,36 @@ public: , column_names(column_names_) , storage_snapshot(std::move(storage_snapshot_)) , storage(storage_) - , query_info(std::move(query_info_)) + , read_from_format_info(std::move(read_from_format_info_)) + , need_only_count(need_only_count_) , local_context(std::move(context_)) , max_block_size(max_block_size_) , num_streams(num_streams_) { + query_configuration = storage.updateConfigurationAndGetCopy(local_context); + virtual_columns = storage.getVirtuals(); } private: Names column_names; StorageSnapshotPtr storage_snapshot; StorageS3 & storage; - SelectQueryInfo query_info; + ReadFromFormatInfo read_from_format_info; + bool need_only_count; + StorageS3::Configuration query_configuration; + NamesAndTypesList virtual_columns; + ContextPtr local_context; size_t max_block_size; size_t num_streams; + + std::shared_ptr iterator_wrapper; + + void createIterator(const ActionsDAG::Node * predicate); }; -static Block getBlockWithVirtuals(const NamesAndTypesList & virtual_columns, const String & bucket, const std::unordered_set & keys) -{ - Block virtual_columns_block; - fs::path bucket_path(bucket); - - for (const auto & [column_name, column_type] : virtual_columns) - { - if (column_name == "_path") - { - auto column = column_type->createColumn(); - for (const auto & key : keys) - column->insert((bucket_path / key).string()); - virtual_columns_block.insert({std::move(column), column_type, column_name}); - } - else if (column_name == "_file") - { - auto column = column_type->createColumn(); - for (const auto & key : keys) - { - auto pos = key.find_last_of('/'); - if (pos != std::string::npos) - column->insert(key.substr(pos + 1)); - else - column->insert(key); - } - virtual_columns_block.insert({std::move(column), column_type, column_name}); - } - else if (column_name == "_key") - { - auto column = column_type->createColumn(); - for (const auto & key : keys) - column->insert(key); - virtual_columns_block.insert({std::move(column), column_type, column_name}); - } - else - { - auto column = column_type->createColumn(); - column->insertManyDefaults(keys.size()); - virtual_columns_block.insert({std::move(column), column_type, column_name}); - } - } - - /// Column _key is mandatory and may not be in virtual_columns list - if (!virtual_columns_block.has("_key")) - { - auto column_type = std::make_shared(); - auto column = column_type->createColumn(); for (const auto & key : keys) - column->insert(key); - virtual_columns_block.insert({std::move(column), column_type, "_key"}); - } - - return virtual_columns_block; -} - -static std::vector filterKeysForPartitionPruning( - const std::vector & keys, - const String & bucket, - const NamesAndTypesList & virtual_columns, - const std::vector & filter_dags, - ContextPtr context) -{ - std::unordered_set result_keys(keys.begin(), keys.end()); - for (const auto & filter_dag : filter_dags) - { - if (result_keys.empty()) - break; - - auto block = getBlockWithVirtuals(virtual_columns, bucket, result_keys); - - auto filter_actions = VirtualColumnUtils::splitFilterDagForAllowedInputs(filter_dag->getOutputs().at(0), block); - if (!filter_actions) - continue; - VirtualColumnUtils::filterBlockWithDAG(filter_actions, block, context); - - result_keys = VirtualColumnUtils::extractSingleValueFromBlock(block, "_key"); - } - - LOG_DEBUG(&Poco::Logger::get("StorageS3"), "Applied partition pruning {} from {} keys left", result_keys.size(), 
keys.size()); - return std::vector(result_keys.begin(), result_keys.end()); -} - class IOutputFormat; using OutputFormatPtr = std::shared_ptr; @@ -263,7 +195,7 @@ public: Impl( const S3::Client & client_, const S3::URI & globbed_uri_, - ASTPtr & query_, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns_, ContextPtr context_, KeysWithInfo * read_keys_, @@ -272,7 +204,6 @@ public: : WithContext(context_) , client(client_.clone()) , globbed_uri(globbed_uri_) - , query(query_) , virtual_columns(virtual_columns_) , read_keys(read_keys_) , request_settings(request_settings_) @@ -306,6 +237,8 @@ public: "Cannot compile regex from glob ({}): {}", globbed_uri.key, matcher->error()); recursive = globbed_uri.key == "/**" ? true : false; + + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); fillInternalBufferAssumeLocked(); } @@ -424,20 +357,14 @@ private: return; } - if (!is_initialized) - { - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, fs::path(globbed_uri.bucket) / temp_buffer.front()->key, getContext()); - is_initialized = true; - } - - if (filter_ast) + if (filter_dag) { std::vector paths; paths.reserve(temp_buffer.size()); for (const auto & key_with_info : temp_buffer) paths.push_back(fs::path(globbed_uri.bucket) / key_with_info->key); - VirtualColumnUtils::filterByPathOrFile(temp_buffer, paths, query, virtual_columns, getContext(), filter_ast); + VirtualColumnUtils::filterByPathOrFile(temp_buffer, paths, filter_dag, virtual_columns, getContext()); } buffer = std::move(temp_buffer); @@ -479,8 +406,7 @@ private: S3::URI globbed_uri; ASTPtr query; NamesAndTypesList virtual_columns; - bool is_initialized{false}; - ASTPtr filter_ast; + ActionsDAGPtr filter_dag; std::unique_ptr matcher; bool recursive{false}; bool is_finished{false}; @@ -498,13 +424,13 @@ private: StorageS3Source::DisclosedGlobIterator::DisclosedGlobIterator( const S3::Client & client_, const S3::URI & globbed_uri_, - ASTPtr query, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns_, ContextPtr context, KeysWithInfo * read_keys_, const S3Settings::RequestSettings & request_settings_, std::function file_progress_callback_) - : pimpl(std::make_shared(client_, globbed_uri_, query, virtual_columns_, context, read_keys_, request_settings_, file_progress_callback_)) + : pimpl(std::make_shared(client_, globbed_uri_, predicate, virtual_columns_, context, read_keys_, request_settings_, file_progress_callback_)) { } @@ -646,8 +572,7 @@ StorageS3Source::StorageS3Source( const String & url_host_and_port_, std::shared_ptr file_iterator_, const size_t max_parsing_threads_, - bool need_only_count_, - std::optional query_info_) + bool need_only_count_) : SourceWithKeyCondition(info.source_header, false) , WithContext(context_) , name(std::move(name_)) @@ -663,7 +588,6 @@ StorageS3Source::StorageS3Source( , client(client_) , sample_block(info.format_header) , format_settings(format_settings_) - , query_info(std::move(query_info_)) , requested_virtual_columns(info.requested_virtual_columns) , file_iterator(file_iterator_) , max_parsing_threads(max_parsing_threads_) @@ -1151,8 +1075,7 @@ static std::shared_ptr createFileIterator( const StorageS3::Configuration & configuration, bool distributed_processing, ContextPtr local_context, - ASTPtr query, - const std::vector & filter_dags, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, StorageS3::KeysWithInfo * read_keys = nullptr, 
std::function file_progress_callback = {}) @@ -1165,12 +1088,22 @@ static std::shared_ptr createFileIterator( { /// Iterate through disclosed globs and make a source for each file return std::make_shared( - *configuration.client, configuration.url, query, virtual_columns, + *configuration.client, configuration.url, predicate, virtual_columns, local_context, read_keys, configuration.request_settings, file_progress_callback); } else { - Strings keys = filterKeysForPartitionPruning(configuration.keys, configuration.url.bucket, virtual_columns, filter_dags, local_context); + Strings keys = configuration.keys; + auto filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); + if (filter_dag) + { + std::vector paths; + paths.reserve(keys.size()); + for (const auto & key : keys) + paths.push_back(fs::path(configuration.url.bucket) / key); + VirtualColumnUtils::filterByPathOrFile(keys, paths, filter_dag, virtual_columns, local_context); + } + return std::make_shared( *configuration.client, configuration.url.version_id, keys, configuration.url.bucket, configuration.request_settings, read_keys, file_progress_callback); @@ -1204,12 +1137,16 @@ void StorageS3::read( { auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(local_context), virtual_columns); + bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) + && local_context->getSettingsRef().optimize_count_from_files; + auto reading = std::make_unique( read_from_format_info.source_header, column_names, storage_snapshot, *this, - query_info, + std::move(read_from_format_info), + need_only_count, local_context, max_block_size, num_streams); @@ -1217,19 +1154,32 @@ void StorageS3::read( query_plan.addStep(std::move(reading)); } +void ReadFromStorageS3Step::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, local_context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createIterator(predicate); +} + +void ReadFromStorageS3Step::createIterator(const ActionsDAG::Node * predicate) +{ + if (iterator_wrapper) + return; + + iterator_wrapper = createFileIterator( + query_configuration, storage.distributed_processing, local_context, predicate, + virtual_columns, nullptr, local_context->getFileProgressCallback()); +} + void ReadFromStorageS3Step::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) { - auto query_configuration = storage.updateConfigurationAndGetCopy(local_context); - if (storage.partition_by && query_configuration.withWildcard()) throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Reading from a partitioned S3 storage is not implemented yet"); - auto virtual_columns = storage.getVirtuals(); - auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, storage.supportsSubsetOfColumns(local_context), virtual_columns); - - std::shared_ptr iterator_wrapper = createFileIterator( - query_configuration, storage.distributed_processing, local_context, query_info.query, filter_dags, - virtual_columns, nullptr, local_context->getFileProgressCallback()); + createIterator(nullptr); size_t estimated_keys_count = iterator_wrapper->estimatedKeysCount(); if (estimated_keys_count > 1) @@ -1238,9 +1188,6 @@ void ReadFromStorageS3Step::initializePipeline(QueryPipelineBuilder & pipeline, /// Disclosed glob iterator can underestimate 
the amount of keys in some cases. We will keep one stream for this particular case. num_streams = 1; - bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) - && local_context->getSettingsRef().optimize_count_from_files; - const size_t max_threads = local_context->getSettingsRef().max_threads; const size_t max_parsing_threads = num_streams >= max_threads ? 1 : (max_threads / std::max(num_streams, 1ul)); LOG_DEBUG(&Poco::Logger::get("StorageS3"), "Reading in {} streams, {} threads per stream", num_streams, max_parsing_threads); @@ -1249,7 +1196,7 @@ void ReadFromStorageS3Step::initializePipeline(QueryPipelineBuilder & pipeline, pipes.reserve(num_streams); for (size_t i = 0; i < num_streams; ++i) { - pipes.emplace_back(std::make_shared( + auto source = std::make_shared( read_from_format_info, query_configuration.format, storage.getName(), @@ -1264,17 +1211,20 @@ void ReadFromStorageS3Step::initializePipeline(QueryPipelineBuilder & pipeline, query_configuration.url.uri.getHost() + std::to_string(query_configuration.url.uri.getPort()), iterator_wrapper, max_parsing_threads, - need_only_count, - query_info)); + need_only_count); + + source->setKeyCondition(filter_nodes.nodes, local_context); + pipes.emplace_back(std::move(source)); } - pipeline.init(Pipe::unitePipes(std::move(pipes))); -} + auto pipe = Pipe::unitePipes(std::move(pipes)); + if (pipe.empty()) + pipe = Pipe(std::make_shared(read_from_format_info.source_header)); + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); -void ReadFromStorageS3Step::applyFilters() -{ - /// We will use filter_dags in filterKeysForPartitionPruning called from initializePipeline, nothing to do here + pipeline.init(std::move(pipe)); } SinkToStoragePtr StorageS3::write(const ASTPtr & query, const StorageMetadataPtr & metadata_snapshot, ContextPtr local_context, bool /*async_insert*/) @@ -1858,7 +1808,7 @@ ColumnsDescription StorageS3::getTableStructureFromDataImpl( { KeysWithInfo read_keys; - auto file_iterator = createFileIterator(configuration, false, ctx, {}, {}, {}, &read_keys); + auto file_iterator = createFileIterator(configuration, false, ctx, {}, {}, &read_keys); ReadBufferIterator read_buffer_iterator(file_iterator, read_keys, configuration, format_settings, ctx); return readSchemaFromFormat(configuration.format, format_settings, read_buffer_iterator, configuration.withGlobs(), ctx); diff --git a/src/Storages/StorageS3.h b/src/Storages/StorageS3.h index 07d965d8bb3..b90a0d394cb 100644 --- a/src/Storages/StorageS3.h +++ b/src/Storages/StorageS3.h @@ -78,7 +78,7 @@ public: DisclosedGlobIterator( const S3::Client & client_, const S3::URI & globbed_uri_, - ASTPtr query, + const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, ContextPtr context, KeysWithInfo * read_keys_ = nullptr, @@ -145,18 +145,12 @@ public: const String & url_host_and_port, std::shared_ptr file_iterator_, size_t max_parsing_threads, - bool need_only_count_, - std::optional query_info); + bool need_only_count_); ~StorageS3Source() override; String getName() const override; - void setKeyCondition(const SelectQueryInfo & query_info_, ContextPtr context_) override - { - setKeyConditionImpl(query_info_, context_, sample_block); - } - void setKeyCondition(const ActionsDAG::NodeRawConstPtrs & nodes, ContextPtr context_) override { setKeyConditionImpl(nodes, context_, sample_block); @@ -180,7 +174,6 @@ private: std::shared_ptr client; Block sample_block; std::optional 
format_settings; - std::optional query_info; struct ReaderHolder { diff --git a/src/Storages/StorageS3Cluster.cpp b/src/Storages/StorageS3Cluster.cpp index 702b1f14ae7..e1738056e9d 100644 --- a/src/Storages/StorageS3Cluster.cpp +++ b/src/Storages/StorageS3Cluster.cpp @@ -78,10 +78,10 @@ void StorageS3Cluster::updateConfigurationIfChanged(ContextPtr local_context) s3_configuration.update(local_context); } -RemoteQueryExecutor::Extension StorageS3Cluster::getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const +RemoteQueryExecutor::Extension StorageS3Cluster::getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const { auto iterator = std::make_shared( - *s3_configuration.client, s3_configuration.url, query, virtual_columns, context, nullptr, s3_configuration.request_settings, context->getFileProgressCallback()); + *s3_configuration.client, s3_configuration.url, predicate, virtual_columns, context, nullptr, s3_configuration.request_settings, context->getFileProgressCallback()); auto callback = std::make_shared>([iterator]() mutable -> String { diff --git a/src/Storages/StorageS3Cluster.h b/src/Storages/StorageS3Cluster.h index 81fb48d2398..c526f14834a 100644 --- a/src/Storages/StorageS3Cluster.h +++ b/src/Storages/StorageS3Cluster.h @@ -34,7 +34,7 @@ public: NamesAndTypesList getVirtuals() const override; - RemoteQueryExecutor::Extension getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const override; + RemoteQueryExecutor::Extension getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const override; bool supportsSubcolumns() const override { return true; } diff --git a/src/Storages/StorageURL.cpp b/src/Storages/StorageURL.cpp index 83d0dc5496f..e08aca7555e 100644 --- a/src/Storages/StorageURL.cpp +++ b/src/Storages/StorageURL.cpp @@ -26,6 +26,8 @@ #include #include #include +#include +#include #include #include @@ -182,22 +184,22 @@ namespace class StorageURLSource::DisclosedGlobIterator::Impl { public: - Impl(const String & uri_, size_t max_addresses, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context) + Impl(const String & uri_, size_t max_addresses, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context) { uris = parseRemoteDescription(uri_, 0, uri_.size(), ',', max_addresses); - ASTPtr filter_ast; + ActionsDAGPtr filter_dag; if (!uris.empty()) - filter_ast = VirtualColumnUtils::createPathAndFileFilterAst(query, virtual_columns, Poco::URI(uris[0]).getPath(), context); + filter_dag = VirtualColumnUtils::createPathAndFileFilterDAG(predicate, virtual_columns); - if (filter_ast) + if (filter_dag) { std::vector paths; paths.reserve(uris.size()); for (const auto & uri : uris) paths.push_back(Poco::URI(uri).getPath()); - VirtualColumnUtils::filterByPathOrFile(uris, paths, query, virtual_columns, context, filter_ast); + VirtualColumnUtils::filterByPathOrFile(uris, paths, filter_dag, virtual_columns, context); } } @@ -220,8 +222,8 @@ private: std::atomic_size_t index = 0; }; -StorageURLSource::DisclosedGlobIterator::DisclosedGlobIterator(const String & uri, size_t max_addresses, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context) - : pimpl(std::make_shared(uri, max_addresses, query, virtual_columns, context)) {} +StorageURLSource::DisclosedGlobIterator::DisclosedGlobIterator(const String & uri, size_t max_addresses, const ActionsDAG::Node * predicate, const 
NamesAndTypesList & virtual_columns, const ContextPtr & context) + : pimpl(std::make_shared(uri, max_addresses, predicate, virtual_columns, context)) {} String StorageURLSource::DisclosedGlobIterator::next() { @@ -260,7 +262,6 @@ StorageURLSource::StorageURLSource( const ConnectionTimeouts & timeouts, CompressionMethod compression_method, size_t max_parsing_threads, - const SelectQueryInfo &, const HTTPHeaderEntries & headers_, const URIParams & params, bool glob_url, @@ -874,7 +875,70 @@ bool IStorageURLBase::parallelizeOutputAfterReading(ContextPtr context) const return FormatFactory::instance().checkParallelizeOutputAfterReading(format_name, context); } -Pipe IStorageURLBase::read( +class ReadFromURL : public SourceStepWithFilter +{ +public: + std::string getName() const override { return "ReadFromURL"; } + void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; + void applyFilters() override; + + ReadFromURL( + Block sample_block, + std::shared_ptr storage_, + std::vector * uri_options_, + ReadFromFormatInfo info_, + const bool need_only_count_, + std::vector> read_uri_params_, + std::function read_post_data_callback_, + ContextPtr context_, + size_t max_block_size_, + size_t num_streams_) + : SourceStepWithFilter(DataStream{.header = std::move(sample_block)}) + , storage(std::move(storage_)) + , uri_options(uri_options_) + , info(std::move(info_)) + , need_only_count(need_only_count_) + , read_uri_params(std::move(read_uri_params_)) + , read_post_data_callback(std::move(read_post_data_callback_)) + , context(std::move(context_)) + , max_block_size(max_block_size_) + , num_streams(num_streams_) + { + } + +private: + std::shared_ptr storage; + std::vector * uri_options; + + ReadFromFormatInfo info; + const bool need_only_count; + std::vector> read_uri_params; + std::function read_post_data_callback; + + ContextPtr context; + + size_t max_block_size; + size_t num_streams; + + std::shared_ptr iterator_wrapper; + bool is_url_with_globs = false; + bool is_empty_glob = false; + + void createIterator(const ActionsDAG::Node * predicate); +}; + +void ReadFromURL::applyFilters() +{ + auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, {}, context); + const ActionsDAG::Node * predicate = nullptr; + if (filter_actions_dag) + predicate = filter_actions_dag->getOutputs().at(0); + + createIterator(predicate); +} + +void IStorageURLBase::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -884,16 +948,61 @@ Pipe IStorageURLBase::read( size_t num_streams) { auto params = getReadURIParams(column_names, storage_snapshot, query_info, local_context, processed_stage, max_block_size); - - std::shared_ptr iterator_wrapper{nullptr}; - bool is_url_with_globs = urlWithGlobs(uri); - size_t max_addresses = local_context->getSettingsRef().glob_expansion_max_elements; auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(local_context), getVirtuals()); - if (distributed_processing) + bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) + && local_context->getSettingsRef().optimize_count_from_files; + + auto read_post_data_callback = getReadPOSTDataCallback( + read_from_format_info.columns_description.getNamesOfPhysical(), + read_from_format_info.columns_description, + query_info, + local_context, + processed_stage, + max_block_size); + + auto 
this_ptr = std::static_pointer_cast(shared_from_this()); + + auto reading = std::make_unique( + read_from_format_info.source_header, + std::move(this_ptr), + nullptr, + std::move(read_from_format_info), + need_only_count, + std::move(params), + std::move(read_post_data_callback), + local_context, + max_block_size, + num_streams); + + query_plan.addStep(std::move(reading)); +} + +void ReadFromURL::createIterator(const ActionsDAG::Node * predicate) +{ + if (iterator_wrapper || is_empty_glob) + return; + + if (uri_options) + { + iterator_wrapper = std::make_shared([&, done = false]() mutable + { + if (done) + return StorageURLSource::FailoverOptions{}; + done = true; + return *uri_options; + }); + + return; + } + + size_t max_addresses = context->getSettingsRef().glob_expansion_max_elements; + is_url_with_globs = urlWithGlobs(storage->uri); + + if (storage->distributed_processing) { iterator_wrapper = std::make_shared( - [callback = local_context->getReadTaskCallback(), max_addresses]() + [callback = context->getReadTaskCallback(), max_addresses]() { String next_uri = callback(); if (next_uri.empty()) @@ -904,11 +1013,14 @@ Pipe IStorageURLBase::read( else if (is_url_with_globs) { /// Iterate through disclosed globs and make a source for each file - auto glob_iterator = std::make_shared(uri, max_addresses, query_info.query, virtual_columns, local_context); + auto glob_iterator = std::make_shared(storage->uri, max_addresses, predicate, storage->virtual_columns, context); /// check if we filtered out all the paths if (glob_iterator->size() == 0) - return Pipe(std::make_shared(read_from_format_info.source_header)); + { + is_empty_glob = true; + return; + } iterator_wrapper = std::make_shared([glob_iterator, max_addresses]() { @@ -923,7 +1035,7 @@ Pipe IStorageURLBase::read( } else { - iterator_wrapper = std::make_shared([&, max_addresses, done = false]() mutable + iterator_wrapper = std::make_shared([max_addresses, done = false, &uri = storage->uri]() mutable { if (done) return StorageURLSource::FailoverOptions{}; @@ -932,49 +1044,69 @@ Pipe IStorageURLBase::read( }); num_streams = 1; } +} - bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) - && local_context->getSettingsRef().optimize_count_from_files; +void ReadFromURL::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) +{ + createIterator(nullptr); + + if (is_empty_glob) + { + pipeline.init(Pipe(std::make_shared(info.source_header))); + return; + } Pipes pipes; pipes.reserve(num_streams); - const size_t max_threads = local_context->getSettingsRef().max_threads; + const size_t max_threads = context->getSettingsRef().max_threads; const size_t max_parsing_threads = num_streams >= max_threads ? 
1 : (max_threads / num_streams); for (size_t i = 0; i < num_streams; ++i) { - pipes.emplace_back(std::make_shared( - read_from_format_info, + auto source = std::make_shared( + info, iterator_wrapper, - getReadMethod(), - getReadPOSTDataCallback( - read_from_format_info.columns_description.getNamesOfPhysical(), - read_from_format_info.columns_description, - query_info, - local_context, - processed_stage, - max_block_size), - format_name, - format_settings, - getName(), - local_context, + storage->getReadMethod(), + read_post_data_callback, + storage->format_name, + storage->format_settings, + storage->getName(), + context, max_block_size, - getHTTPTimeouts(local_context), - compression_method, + getHTTPTimeouts(context), + storage->compression_method, max_parsing_threads, - query_info, - headers, - params, + storage->headers, + read_uri_params, is_url_with_globs, - need_only_count)); + need_only_count); + + source->setKeyCondition(filter_nodes.nodes, context); + pipes.emplace_back(std::move(source)); } - return Pipe::unitePipes(std::move(pipes)); + if (uri_options) + std::shuffle(uri_options->begin(), uri_options->end(), thread_local_rng); + + auto pipe = Pipe::unitePipes(std::move(pipes)); + size_t output_ports = pipe.numOutputPorts(); + const bool parallelize_output = context->getSettingsRef().parallelize_output_from_storages; + if (parallelize_output && storage->parallelizeOutputAfterReading(context) && output_ports > 0 && output_ports < num_streams) + pipe.resize(num_streams); + + if (pipe.empty()) + pipe = Pipe(std::make_shared(info.source_header)); + + for (const auto & processor : pipe.getProcessors()) + processors.emplace_back(processor); + + pipeline.init(std::move(pipe)); } -Pipe StorageURLWithFailover::read( +void StorageURLWithFailover::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -984,38 +1116,34 @@ Pipe StorageURLWithFailover::read( size_t num_streams) { auto params = getReadURIParams(column_names, storage_snapshot, query_info, local_context, processed_stage, max_block_size); - - auto iterator_wrapper = std::make_shared([&, done = false]() mutable - { - if (done) - return StorageURLSource::FailoverOptions{}; - done = true; - return uri_options; - }); - auto read_from_format_info = prepareReadingFromFormat(column_names, storage_snapshot, supportsSubsetOfColumns(local_context), getVirtuals()); - const size_t max_threads = local_context->getSettingsRef().max_threads; - const size_t max_parsing_threads = num_streams >= max_threads ? 
1 : (max_threads / num_streams); + bool need_only_count = (query_info.optimize_trivial_count || read_from_format_info.requested_columns.empty()) + && local_context->getSettingsRef().optimize_count_from_files; - auto pipe = Pipe(std::make_shared( - read_from_format_info, - iterator_wrapper, - getReadMethod(), - getReadPOSTDataCallback(read_from_format_info.columns_description.getNamesOfPhysical(), read_from_format_info.columns_description, query_info, local_context, processed_stage, max_block_size), - format_name, - format_settings, - getName(), + auto read_post_data_callback = getReadPOSTDataCallback( + read_from_format_info.columns_description.getNamesOfPhysical(), + read_from_format_info.columns_description, + query_info, + local_context, + processed_stage, + max_block_size); + + auto this_ptr = std::static_pointer_cast(shared_from_this()); + + auto reading = std::make_unique( + read_from_format_info.source_header, + std::move(this_ptr), + &uri_options, + std::move(read_from_format_info), + need_only_count, + std::move(params), + std::move(read_post_data_callback), local_context, max_block_size, - getHTTPTimeouts(local_context), - compression_method, - max_parsing_threads, - query_info, - headers, - params)); - std::shuffle(uri_options.begin(), uri_options.end(), thread_local_rng); - return pipe; + num_streams); + + query_plan.addStep(std::move(reading)); } diff --git a/src/Storages/StorageURL.h b/src/Storages/StorageURL.h index 8d027025882..07d4d0cad38 100644 --- a/src/Storages/StorageURL.h +++ b/src/Storages/StorageURL.h @@ -34,7 +34,8 @@ class PullingPipelineExecutor; class IStorageURLBase : public IStorage { public: - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -67,6 +68,8 @@ public: const ContextPtr & context); protected: + friend class ReadFromURL; + IStorageURLBase( const String & uri_, ContextPtr context_, @@ -136,7 +139,7 @@ public: class DisclosedGlobIterator { public: - DisclosedGlobIterator(const String & uri_, size_t max_addresses, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context); + DisclosedGlobIterator(const String & uri_, size_t max_addresses, const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns, const ContextPtr & context); String next(); size_t size(); @@ -162,7 +165,6 @@ public: const ConnectionTimeouts & timeouts, CompressionMethod compression_method, size_t max_parsing_threads, - const SelectQueryInfo & query_info, const HTTPHeaderEntries & headers_ = {}, const URIParams & params = {}, bool glob_url = false, @@ -170,11 +172,6 @@ public: String getName() const override { return name; } - void setKeyCondition(const SelectQueryInfo & query_info_, ContextPtr context_) override - { - setKeyConditionImpl(query_info_, context_, block_for_format); - } - void setKeyCondition(const ActionsDAG::NodeRawConstPtrs & nodes, ContextPtr context_) override { setKeyConditionImpl(nodes, context_, block_for_format); @@ -317,7 +314,8 @@ public: ContextPtr context_, const String & compression_method_); - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, diff --git a/src/Storages/StorageURLCluster.cpp b/src/Storages/StorageURLCluster.cpp index c052e781877..a0b5fcd6f28 100644 --- a/src/Storages/StorageURLCluster.cpp +++ b/src/Storages/StorageURLCluster.cpp @@ -81,9 +81,9 @@ void 
StorageURLCluster::addColumnsStructureToQuery(ASTPtr & query, const String TableFunctionURLCluster::addColumnsStructureToArguments(expression_list->children, structure, context); } -RemoteQueryExecutor::Extension StorageURLCluster::getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const +RemoteQueryExecutor::Extension StorageURLCluster::getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const { - auto iterator = std::make_shared(uri, context->getSettingsRef().glob_expansion_max_elements, query, virtual_columns, context); + auto iterator = std::make_shared(uri, context->getSettingsRef().glob_expansion_max_elements, predicate, virtual_columns, context); auto callback = std::make_shared([iter = std::move(iterator)]() mutable -> String { return iter->next(); }); return RemoteQueryExecutor::Extension{.task_iterator = std::move(callback)}; } diff --git a/src/Storages/StorageURLCluster.h b/src/Storages/StorageURLCluster.h index ddf7e6f0790..07978040029 100644 --- a/src/Storages/StorageURLCluster.h +++ b/src/Storages/StorageURLCluster.h @@ -34,7 +34,7 @@ public: NamesAndTypesList getVirtuals() const override { return virtual_columns; } - RemoteQueryExecutor::Extension getTaskIteratorExtension(ASTPtr query, const ContextPtr & context) const override; + RemoteQueryExecutor::Extension getTaskIteratorExtension(const ActionsDAG::Node * predicate, const ContextPtr & context) const override; bool supportsSubcolumns() const override { return true; } diff --git a/src/Storages/StorageXDBC.cpp b/src/Storages/StorageXDBC.cpp index a569c50835c..a274b1ba4db 100644 --- a/src/Storages/StorageXDBC.cpp +++ b/src/Storages/StorageXDBC.cpp @@ -102,7 +102,8 @@ std::function StorageXDBC::getReadPOSTDataCallback( return write_body_callback; } -Pipe StorageXDBC::read( +void StorageXDBC::read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, @@ -114,7 +115,7 @@ Pipe StorageXDBC::read( storage_snapshot->check(column_names); bridge_helper->startBridgeSync(); - return IStorageURLBase::read(column_names, storage_snapshot, query_info, local_context, processed_stage, max_block_size, num_streams); + IStorageURLBase::read(query_plan, column_names, storage_snapshot, query_info, local_context, processed_stage, max_block_size, num_streams); } SinkToStoragePtr StorageXDBC::write(const ASTPtr & /* query */, const StorageMetadataPtr & metadata_snapshot, ContextPtr local_context, bool /*async_insert*/) diff --git a/src/Storages/StorageXDBC.h b/src/Storages/StorageXDBC.h index 1c1651cb333..fe678785dc2 100644 --- a/src/Storages/StorageXDBC.h +++ b/src/Storages/StorageXDBC.h @@ -19,7 +19,8 @@ namespace DB class StorageXDBC : public IStorageURLBase { public: - Pipe read( + void read( + QueryPlan & query_plan, const Names & column_names, const StorageSnapshotPtr & storage_snapshot, SelectQueryInfo & query_info, diff --git a/src/Storages/System/StorageSystemTables.cpp b/src/Storages/System/StorageSystemTables.cpp index 53b28543bf1..d2c01ec3dea 100644 --- a/src/Storages/System/StorageSystemTables.cpp +++ b/src/Storages/System/StorageSystemTables.cpp @@ -104,7 +104,7 @@ ColumnPtr getFilteredTables(const ActionsDAG::Node * predicate, const ColumnPtr MutableColumnPtr database_column = ColumnString::create(); MutableColumnPtr engine_column; - auto dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(predicate, sample); + auto dag = VirtualColumnUtils::splitFilterDagForAllowedInputs(predicate, &sample); if (dag) 
{ bool filter_by_engine = false; diff --git a/src/Storages/VirtualColumnUtils.cpp b/src/Storages/VirtualColumnUtils.cpp index aed06fb0540..e845e03d122 100644 --- a/src/Storages/VirtualColumnUtils.cpp +++ b/src/Storages/VirtualColumnUtils.cpp @@ -36,7 +36,10 @@ #include #include #include +#include "Functions/FunctionsLogical.h" #include "Functions/IFunction.h" +#include "Functions/IFunctionAdaptors.h" +#include "Functions/indexHint.h" #include #include #include @@ -250,19 +253,7 @@ static void makeSets(const ExpressionActionsPtr & actions, const ContextPtr & co if (!future_set->get()) { if (auto * set_from_subquery = typeid_cast(future_set.get())) - { - auto plan = set_from_subquery->build(context); - - if (!plan) - continue; - - auto builder = plan->buildQueryPipeline(QueryPlanOptimizationSettings::fromContext(context), BuildQueryPipelineSettings::fromContext(context)); - auto pipeline = QueryPipelineBuilder::getPipeline(std::move(*builder)); - pipeline.complete(std::make_shared(Block())); - - CompletedPipelineExecutor executor(pipeline); - executor.execute(); - } + set_from_subquery->buildSetInplace(context); } } } @@ -390,9 +381,9 @@ static void addPathAndFileToVirtualColumns(Block & block, const String & path, s block.getByName("_idx").column->assumeMutableRef().insert(idx); } -ASTPtr createPathAndFileFilterAst(const ASTPtr & query, const NamesAndTypesList & virtual_columns, const String & path_example, const ContextPtr & context) +ActionsDAGPtr createPathAndFileFilterDAG(const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns) { - if (!query || virtual_columns.empty()) + if (!predicate || virtual_columns.empty()) return {}; Block block; @@ -401,16 +392,12 @@ ASTPtr createPathAndFileFilterAst(const ASTPtr & query, const NamesAndTypesList if (column.name == "_file" || column.name == "_path") block.insert({column.type->createColumn(), column.type, column.name}); } - /// Create a block with one row to construct filter - /// Append "idx" column as the filter result + block.insert({ColumnUInt64::create(), std::make_shared(), "_idx"}); - addPathAndFileToVirtualColumns(block, path_example, 0); - ASTPtr filter_ast; - prepareFilterBlockWithQuery(query, context, block, filter_ast); - return filter_ast; + return splitFilterDagForAllowedInputs(predicate, &block); } -ColumnPtr getFilterByPathAndFileIndexes(const std::vector & paths, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context, ASTPtr filter_ast) +ColumnPtr getFilterByPathAndFileIndexes(const std::vector & paths, const ActionsDAGPtr & dag, const NamesAndTypesList & virtual_columns, const ContextPtr & context) { Block block; for (const auto & column : virtual_columns) @@ -423,7 +410,7 @@ ColumnPtr getFilterByPathAndFileIndexes(const std::vector & paths, const for (size_t i = 0; i != paths.size(); ++i) addPathAndFileToVirtualColumns(block, paths[i], i); - filterBlockWithQuery(query, block, context, filter_ast); + filterBlockWithDAG(dag, block, context); return block.getByName("_idx").column; } @@ -481,7 +468,7 @@ static bool canEvaluateSubtree(const ActionsDAG::Node * node, const Block & allo static const ActionsDAG::Node * splitFilterNodeForAllowedInputs( const ActionsDAG::Node * node, - const Block & allowed_inputs, + const Block * allowed_inputs, ActionsDAG::Nodes & additional_nodes) { if (node->type == ActionsDAG::ActionType::FUNCTION) @@ -502,9 +489,12 @@ static const ActionsDAG::Node * splitFilterNodeForAllowedInputs( const ActionsDAG::Node * res = node_copy.children.front(); 
/// Expression like (not_allowed AND 256) can't be reduced to (and(256)) because AND requires /// at least two arguments; also it can't be reduced to (256) because result type is different. - /// TODO: add CAST here if (!res->result_type->equals(*node->result_type)) - return nullptr; + { + ActionsDAG tmp_dag; + res = &tmp_dag.addCast(*res, node->result_type, {}); + additional_nodes.splice(additional_nodes.end(), ActionsDAG::detachNodes(std::move(tmp_dag))); + } return res; } @@ -520,15 +510,46 @@ static const ActionsDAG::Node * splitFilterNodeForAllowedInputs( return &node_copy; } + else if (node->function_base->getName() == "indexHint") + { + if (const auto * adaptor = typeid_cast(node->function_base.get())) + { + if (const auto * index_hint = typeid_cast(adaptor->getFunction().get())) + { + auto index_hint_dag = index_hint->getActions()->clone(); + ActionsDAG::NodeRawConstPtrs atoms; + for (const auto & output : index_hint_dag->getOutputs()) + if (const auto * child_copy = splitFilterNodeForAllowedInputs(output, allowed_inputs, additional_nodes)) + atoms.push_back(child_copy); + + if (!atoms.empty()) + { + const auto * res = atoms.at(0); + + if (atoms.size() > 1) + { + FunctionOverloadResolverPtr func_builder_and = std::make_unique(std::make_shared()); + res = &index_hint_dag->addFunction(func_builder_and, atoms, {}); + } + + if (!res->result_type->equals(*node->result_type)) + res = &index_hint_dag->addCast(*res, node->result_type, {}); + + additional_nodes.splice(additional_nodes.end(), ActionsDAG::detachNodes(std::move(*index_hint_dag))); + return res; + } + } + } + } } - if (!canEvaluateSubtree(node, allowed_inputs)) + if (allowed_inputs && !canEvaluateSubtree(node, *allowed_inputs)) return nullptr; return node; } -ActionsDAGPtr splitFilterDagForAllowedInputs(const ActionsDAG::Node * predicate, const Block & allowed_inputs) +ActionsDAGPtr splitFilterDagForAllowedInputs(const ActionsDAG::Node * predicate, const Block * allowed_inputs) { if (!predicate) return nullptr; @@ -543,7 +564,7 @@ ActionsDAGPtr splitFilterDagForAllowedInputs(const ActionsDAG::Node * predicate, void filterBlockWithPredicate(const ActionsDAG::Node * predicate, Block & block, ContextPtr context) { - auto dag = splitFilterDagForAllowedInputs(predicate, block); + auto dag = splitFilterDagForAllowedInputs(predicate, &block); if (dag) filterBlockWithDAG(dag, block, context); } diff --git a/src/Storages/VirtualColumnUtils.h b/src/Storages/VirtualColumnUtils.h index e22b9742888..7a9b2605339 100644 --- a/src/Storages/VirtualColumnUtils.h +++ b/src/Storages/VirtualColumnUtils.h @@ -42,7 +42,7 @@ void filterBlockWithPredicate(const ActionsDAG::Node * predicate, Block & block, void filterBlockWithDAG(ActionsDAGPtr dag, Block & block, ContextPtr context); /// Extract a part of predicate that can be evaluated using only columns from input_names.
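/// Illustrative example (hypothetical column names): with allowed inputs {_path, _file},
/// a predicate such as
///     _file LIKE '%.csv' AND value > 42
/// is split so that only the `_file LIKE '%.csv'` conjunct is kept in the returned DAG,
/// while `value > 42` references a column outside the allowed inputs and is dropped.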
-ActionsDAGPtr splitFilterDagForAllowedInputs(const ActionsDAG::Node * predicate, const Block & allowed_inputs); +ActionsDAGPtr splitFilterDagForAllowedInputs(const ActionsDAG::Node * predicate, const Block * allowed_inputs); /// Extract from the input stream a set of `name` column values template @@ -58,14 +58,14 @@ auto extractSingleValueFromBlock(const Block & block, const String & name) NamesAndTypesList getPathFileAndSizeVirtualsForStorage(NamesAndTypesList storage_columns); -ASTPtr createPathAndFileFilterAst(const ASTPtr & query, const NamesAndTypesList & virtual_columns, const String & path_example, const ContextPtr & context); +ActionsDAGPtr createPathAndFileFilterDAG(const ActionsDAG::Node * predicate, const NamesAndTypesList & virtual_columns); -ColumnPtr getFilterByPathAndFileIndexes(const std::vector & paths, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context, ASTPtr filter_ast); +ColumnPtr getFilterByPathAndFileIndexes(const std::vector & paths, const ActionsDAGPtr & dag, const NamesAndTypesList & virtual_columns, const ContextPtr & context); template -void filterByPathOrFile(std::vector & sources, const std::vector & paths, const ASTPtr & query, const NamesAndTypesList & virtual_columns, const ContextPtr & context, ASTPtr filter_ast) +void filterByPathOrFile(std::vector & sources, const std::vector & paths, const ActionsDAGPtr & dag, const NamesAndTypesList & virtual_columns, const ContextPtr & context) { - auto indexes_column = getFilterByPathAndFileIndexes(paths, query, virtual_columns, context, filter_ast); + auto indexes_column = getFilterByPathAndFileIndexes(paths, dag, virtual_columns, context); const auto & indexes = typeid_cast(*indexes_column).getData(); if (indexes.size() == sources.size()) return; diff --git a/src/Storages/buildQueryTreeForShard.cpp b/src/Storages/buildQueryTreeForShard.cpp index 74f2709f458..00cc5e3ee58 100644 --- a/src/Storages/buildQueryTreeForShard.cpp +++ b/src/Storages/buildQueryTreeForShard.cpp @@ -1,6 +1,7 @@ #include +#include #include #include #include @@ -372,6 +373,10 @@ QueryTreeNodePtr buildQueryTreeForShard(SelectQueryInfo & query_info, QueryTreeN removeGroupingFunctionSpecializations(query_tree_to_modify); + // std::cerr << "====================== build 1 \n" << query_tree_to_modify->dumpTree() << std::endl; + createUniqueTableAliases(query_tree_to_modify, nullptr, planner_context->getQueryContext()); + // std::cerr << "====================== build 2 \n" << query_tree_to_modify->dumpTree() << std::endl; + return query_tree_to_modify; } diff --git a/tests/ci/ci_config.py b/tests/ci/ci_config.py index 031ab0be8a0..895a12313da 100644 --- a/tests/ci/ci_config.py +++ b/tests/ci/ci_config.py @@ -72,10 +72,20 @@ class BuildConfig: include_paths=[ "./src", "./contrib/*-cmake", + "./contrib/consistent-hashing", + "./contrib/murmurhash", + "./contrib/libfarmhash", + "./contrib/pdqsort", + "./contrib/cityhash102", + "./contrib/sparse-checkout", + "./contrib/libmetrohash", + "./contrib/update-submodules.sh", + "./contrib/CMakeLists.txt", "./cmake", "./base", "./programs", "./packages", + "./docker/packager/packager", ], exclude_files=[".md"], docker=["clickhouse/binary-builder"], diff --git a/tests/config/config.d/graphite_alternative.xml b/tests/config/config.d/graphite_alternative.xml index 1a00de52af5..6c0bd13ce43 100644 --- a/tests/config/config.d/graphite_alternative.xml +++ b/tests/config/config.d/graphite_alternative.xml @@ -26,4 +26,28 @@ + + Version + + sum + + 0 + 600 + + + 17280 + 6000 + + + + 
diff --git a/tests/integration/test_parallel_replicas_distributed_read_from_all/__init__.py b/tests/integration/test_ddl_config_hostname/__init__.py similarity index 100% rename from tests/integration/test_parallel_replicas_distributed_read_from_all/__init__.py rename to tests/integration/test_ddl_config_hostname/__init__.py
diff --git a/tests/integration/test_ddl_config_hostname/configs/remote_servers.xml b/tests/integration/test_ddl_config_hostname/configs/remote_servers.xml new file mode 100644 index 00000000000..8c6a507951d --- /dev/null +++ b/tests/integration/test_ddl_config_hostname/configs/remote_servers.xml @@ -0,0 +1,19 @@
+<clickhouse>
+    <remote_servers>
+        <test_cluster>
+            <shard>
+                <internal_replication>true</internal_replication>
+                <replica>
+                    <host>node1</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+        </test_cluster>
+    </remote_servers>
+
+    <distributed_ddl>
+        <pool_size>1</pool_size>
+        <host_name>node1</host_name>
+    </distributed_ddl>
+</clickhouse>
diff --git a/tests/integration/test_ddl_config_hostname/test.py b/tests/integration/test_ddl_config_hostname/test.py new file mode 100644 index 00000000000..724e766c9dc --- /dev/null +++ b/tests/integration/test_ddl_config_hostname/test.py @@ -0,0 +1,76 @@ +import pytest + +from helpers.cluster import ClickHouseCluster + +cluster = ClickHouseCluster(__file__) + +node1 = cluster.add_instance( + "node1", + main_configs=["configs/remote_servers.xml"], + with_zookeeper=True, + stay_alive=True, +) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + yield cluster + finally: + cluster.shutdown() + + +def test_ddl_queue_delete_add_replica(started_cluster): + # Some query started on the cluster, then we deleted some unfinished node + # and added a new node to the cluster. Considering that there are fewer + # finished nodes than expected and we can't resolve the deleted node's hostname, + # the queue will be stuck on the new node. + # <host_name> inside <distributed_ddl> allows us to simply discard the deleted + # node's hostname by simple comparison without trying to resolve it. + + node1.query( + "create table hostname_change on cluster test_cluster (n int) engine=Log" + ) + + # There's no easy way to change hostname of a container, so let's update values in zk + query_znode = node1.query( + "select max(name) from system.zookeeper where path='/clickhouse/task_queue/ddl'" + )[:-1] + + value = ( + node1.query( + f"select value from system.zookeeper where path='/clickhouse/task_queue/ddl' and name='{query_znode}' format TSVRaw" + )[:-1] + .replace( + "hosts: ['node1:9000']", "hosts: ['finished_node:9000','deleted_node:9000']" + ) + .replace("initiator: node1:9000", "initiator: finished_node:9000") + .replace("\\'", "#") + .replace("'", "\\'") + .replace("\n", "\\n") + .replace("#", "\\'") + ) + + finished_znode = node1.query( + f"select name from system.zookeeper where path='/clickhouse/task_queue/ddl/{query_znode}/finished' and name like '%node1%'" + )[:-1] + + node1.query( + f"insert into system.zookeeper (name, path, value) values ('{query_znode}', '/clickhouse/task_queue/ddl', '{value}')" + ) + started_cluster.get_kazoo_client("zoo1").delete( + f"/clickhouse/task_queue/ddl/{query_znode}/finished/{finished_znode}" + ) + + finished_znode = finished_znode.replace("node1", "finished_node") + + node1.query( + f"insert into system.zookeeper (name, path, value) values ('{finished_znode}', '/clickhouse/task_queue/ddl/{query_znode}/finished', '0\\n')" + ) + + node1.restart_clickhouse(kill=True) + + node1.query( + "create table hostname_change2 on cluster test_cluster (n int) engine=Log" + )
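The new test drives that scenario by editing the DDL queue znodes directly. For readers unfamiliar with the layout, a rough sketch of the znodes involved, using kazoo (the client library the integration helpers already wrap); the task name below is a hypothetical example, while /clickhouse/task_queue/ddl is the default queue path seen in the test itself:

    # Sketch only: inspect a DDL task and its finished/ children, the same
    # znodes the test above rewrites. 'query-0000000123' is a hypothetical name.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="localhost:2181")
    zk.start()

    task = "/clickhouse/task_queue/ddl/query-0000000123"
    # The task value lists the hosts that must execute the query; the test
    # swaps 'node1' for one finished host and one deleted (unresolvable) host.
    value, _ = zk.get(task)
    print(value.decode())

    # Every host that completed the task leaves a child under .../finished.
    # With <host_name> set, a leftover entry for an unresolvable host can be
    # discarded by plain string comparison instead of a DNS lookup.
    for child in zk.get_children(task + "/finished"):
        print(child)

    zk.stop()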
diff --git a/tests/integration/test_kafka_bad_messages/test.py b/tests/integration/test_kafka_bad_messages/test.py index 1633f230f83..954b6042305 100644 --- a/tests/integration/test_kafka_bad_messages/test.py +++ b/tests/integration/test_kafka_bad_messages/test.py @@ -294,7 +294,7 @@ def test_bad_messages_parsing_exception(kafka_cluster, max_retries=20): ]: print(format_name) - kafka_create_topic(admin_client, f"{format_name}_err") + kafka_create_topic(admin_client, f"{format_name}_parsing_err") instance.query( f""" @@ -305,7 +305,7 @@ def test_bad_messages_parsing_exception(kafka_cluster, max_retries=20): CREATE TABLE kafka_{format_name} (key UInt64, value UInt64) ENGINE = Kafka SETTINGS kafka_broker_list = 'kafka1:19092', - kafka_topic_list = '{format_name}_err', + kafka_topic_list = '{format_name}_parsing_err', kafka_group_name = '{format_name}', kafka_format = '{format_name}', kafka_num_consumers = 1; @@ -316,16 +316,18 @@ def test_bad_messages_parsing_exception(kafka_cluster, max_retries=20): ) kafka_produce( - kafka_cluster, f"{format_name}_err", ["qwertyuiop", "asdfghjkl", "zxcvbnm"] + kafka_cluster, + f"{format_name}_parsing_err", + ["qwertyuiop", "asdfghjkl", "zxcvbnm"], ) - expected_result = """avro::Exception: Invalid data file. Magic does not match: : while parsing Kafka message (topic: Avro_err, partition: 0, offset: 0)\\'|1|1|1|default|kafka_Avro -Cannot parse input: expected \\'{\\' before: \\'qwertyuiop\\': while parsing Kafka message (topic: JSONEachRow_err, partition: 0, offset: 0|1|1|1|default|kafka_JSONEachRow + expected_result = """avro::Exception: Invalid data file.
Magic does not match: : while parsing Kafka message (topic: Avro_parsing_err, partition: 0, offset: 0)\\'|1|1|1|default|kafka_Avro +Cannot parse input: expected \\'{\\' before: \\'qwertyuiop\\': (at row 1)\\n: while parsing Kafka message (topic: JSONEachRow_parsing_err, partition:|1|1|1|default|kafka_JSONEachRow """ # filter out stacktrace in exceptions.text[1] because it is hardly stable enough result_system_kafka_consumers = instance.query_with_retry( """ - SELECT substr(exceptions.text[1], 1, 131), length(exceptions.text) > 1 AND length(exceptions.text) < 15, length(exceptions.time) > 1 AND length(exceptions.time) < 15, abs(dateDiff('second', exceptions.time[1], now())) < 40, database, table FROM system.kafka_consumers WHERE table in('kafka_Avro', 'kafka_JSONEachRow') ORDER BY table, assignments.partition_id[1] + SELECT substr(exceptions.text[1], 1, 139), length(exceptions.text) > 1 AND length(exceptions.text) < 15, length(exceptions.time) > 1 AND length(exceptions.time) < 15, abs(dateDiff('second', exceptions.time[1], now())) < 40, database, table FROM system.kafka_consumers WHERE table in('kafka_Avro', 'kafka_JSONEachRow') ORDER BY table, assignments.partition_id[1] """, retry_count=max_retries, sleep_time=1, @@ -338,7 +340,7 @@ Cannot parse input: expected \\'{\\' before: \\'qwertyuiop\\': while parsing Kaf "Avro", "JSONEachRow", ]: - kafka_delete_topic(admin_client, f"{format_name}_err") + kafka_delete_topic(admin_client, f"{format_name}_parsing_err") def test_bad_messages_to_mv(kafka_cluster, max_retries=20):
diff --git a/tests/integration/test_parallel_replicas_working_set/__init__.py b/tests/integration/test_parallel_replicas_all_marks_read/__init__.py similarity index 100% rename from tests/integration/test_parallel_replicas_working_set/__init__.py rename to tests/integration/test_parallel_replicas_all_marks_read/__init__.py
diff --git a/tests/integration/test_parallel_replicas_all_marks_read/configs/remote_servers.xml b/tests/integration/test_parallel_replicas_all_marks_read/configs/remote_servers.xml new file mode 100644 index 00000000000..1ad562334f5 --- /dev/null +++ b/tests/integration/test_parallel_replicas_all_marks_read/configs/remote_servers.xml @@ -0,0 +1,32 @@
+<clickhouse>
+    <remote_servers>
+        <parallel_replicas_with_unavailable_nodes>
+            <shard>
+                <replica>
+                    <host>node0</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <host>node1</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <host>node2</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <host>node3</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <host>node4</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <host>node5</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+        </parallel_replicas_with_unavailable_nodes>
+    </remote_servers>
+</clickhouse>
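Note that the cluster definition lists six hosts while the test below starts only three instances, which appears to be the point of the cluster's name: some replicas stay unavailable. For intuition about the parallel_replicas_mark_segment_size setting exercised below, a back-of-envelope helper (my arithmetic, not code from the patch): a part with N rows and index_granularity rows per mark has roughly N / index_granularity marks, and the segment size groups those marks into the units handed out to replicas.

    # Rough arithmetic only; assumes marks ~= rows / index_granularity,
    # which ignores the final partial granule and the split across parts.
    def approx_segments(table_size: int, index_granularity: int, segment_size: int) -> int:
        marks = -(-table_size // index_granularity)  # ceil division
        return -(-marks // segment_size)

    # e.g. 100000 rows at granularity 10 -> 10000 marks -> 1000 segments of 10
    print(approx_segments(100000, 10, 10))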
""" + ) + + nodes[0].query( + f""" + INSERT INTO {table_name} + SELECT number, toString(number) FROM numbers_mt({table_size}) + """ + ) + + +def _create_query(query_tmpl, table_name): + rand_set = [randint(0, 500) for i in range(42)] + return query_tmpl.format(table_name=table_name, rand_set=rand_set) + + +def _get_result_without_parallel_replicas(query): + return nodes[0].query( + query, + settings={ + "allow_experimental_parallel_reading_from_replicas": 0, + }, + ) + + +def _get_result_with_parallel_replicas( + query, query_id, cluster_name, parallel_replicas_mark_segment_size +): + return nodes[0].query( + query, + settings={ + "allow_experimental_parallel_reading_from_replicas": 2, + "max_parallel_replicas": 6, + "cluster_for_parallel_replicas": f"{cluster_name}", + "parallel_replicas_mark_segment_size": parallel_replicas_mark_segment_size, + "query_id": query_id, + }, + ) + + +def _get_expected_amount_of_marks_to_read(query): + return json.loads( + nodes[0].query( + f""" + EXPLAIN ESTIMATE + {query} + FORMAT JSONEachRow + """ + ) + )["marks"] + + +def _get_number_of_marks_read_by_replicas(query_id): + nodes[0].query("SYSTEM FLUSH LOGS") + return ( + nodes[0] + .query( + f""" + SELECT sum( + ProfileEvents['ParallelReplicasReadAssignedMarks'] + + ProfileEvents['ParallelReplicasReadUnassignedMarks'] + + ProfileEvents['ParallelReplicasReadAssignedForStealingMarks'] + ) + FROM system.query_log + WHERE query_id = '{query_id}' + """ + ) + .strip() + ) + + +@pytest.mark.parametrize( + "query_tmpl", + [ + "SELECT sum(cityHash64(*)) FROM {table_name}", + "SELECT sum(cityHash64(*)) FROM {table_name} WHERE intDiv(key, 100) IN {rand_set}", + ], +) +@pytest.mark.parametrize( + "table_size", + [1000, 10000, 100000], +) +@pytest.mark.parametrize( + "index_granularity", + [10, 100], +) +@pytest.mark.parametrize( + "parallel_replicas_mark_segment_size", + [1, 10], +) +def test_number_of_marks_read( + start_cluster, + query_tmpl, + table_size, + index_granularity, + parallel_replicas_mark_segment_size, +): + if nodes[0].is_built_with_sanitizer(): + pytest.skip("Disabled for sanitizers (too slow)") + + table_name = f"tbl_{len(query_tmpl)}_{cluster_name}_{table_size}_{index_granularity}_{parallel_replicas_mark_segment_size}" + _create_tables(table_name, table_size, index_granularity) + + if "where" in query_tmpl.lower(): + # We need all the replicas to see the same state of parts to make sure that index analysis will pick the same amount of marks for reading + # regardless of which replica's state will be chosen as the working set. This should became redundant once we start to always use initiator's snapshot. 
+ nodes[0].query(f"OPTIMIZE TABLE {table_name} FINAL", settings={"alter_sync": 2}) + for node in nodes: + node.query(f"SYSTEM SYNC REPLICA {table_name} STRICT") + + query = _create_query(query_tmpl, table_name) + query_id = f"{table_name}_{randint(0, 1e9)}" + + assert _get_result_with_parallel_replicas( + query, query_id, cluster_name, parallel_replicas_mark_segment_size + ) == _get_result_without_parallel_replicas(query) + + assert _get_number_of_marks_read_by_replicas( + query_id + ) == _get_expected_amount_of_marks_to_read(query) diff --git a/tests/integration/test_parallel_replicas_distributed_read_from_all/configs/remote_servers.xml b/tests/integration/test_parallel_replicas_distributed_read_from_all/configs/remote_servers.xml deleted file mode 100644 index 02a315479f8..00000000000 --- a/tests/integration/test_parallel_replicas_distributed_read_from_all/configs/remote_servers.xml +++ /dev/null @@ -1,22 +0,0 @@ - - - - - true - - n1 - 9000 - - - n2 - 9000 - - - n3 - 9000 - - - - - - diff --git a/tests/integration/test_parallel_replicas_distributed_read_from_all/test.py b/tests/integration/test_parallel_replicas_distributed_read_from_all/test.py deleted file mode 100644 index 8af7bb12595..00000000000 --- a/tests/integration/test_parallel_replicas_distributed_read_from_all/test.py +++ /dev/null @@ -1,156 +0,0 @@ -import pytest -from helpers.cluster import ClickHouseCluster - -cluster = ClickHouseCluster(__file__) - -nodes = [ - cluster.add_instance( - f"n{i}", main_configs=["configs/remote_servers.xml"], with_zookeeper=True - ) - for i in (1, 2, 3) -] - - -@pytest.fixture(scope="module", autouse=True) -def start_cluster(): - try: - cluster.start() - yield cluster - finally: - cluster.shutdown() - - -def create_tables(cluster, table_name): - """create replicated tables in special way - - each table is populated by equal number of rows - - fetches are disabled, so each replica will have different set of rows - which enforce parallel replicas read from each replica - """ - - # create replicated tables - for node in nodes: - node.query(f"DROP TABLE IF EXISTS {table_name} SYNC") - - nodes[0].query( - f"""CREATE TABLE IF NOT EXISTS {table_name} (key Int64, value String) Engine=ReplicatedMergeTree('/test_parallel_replicas/shard1/{table_name}', 'r1') - ORDER BY (key)""" - ) - nodes[1].query( - f"""CREATE TABLE IF NOT EXISTS {table_name} (key Int64, value String) Engine=ReplicatedMergeTree('/test_parallel_replicas/shard1/{table_name}', 'r2') - ORDER BY (key)""" - ) - nodes[2].query( - f"""CREATE TABLE IF NOT EXISTS {table_name} (key Int64, value String) Engine=ReplicatedMergeTree('/test_parallel_replicas/shard1/{table_name}', 'r3') - ORDER BY (key)""" - ) - # stop merges - nodes[0].query(f"system stop merges {table_name}") - nodes[1].query(f"system stop merges {table_name}") - nodes[2].query(f"system stop merges {table_name}") - # stop fetches - nodes[0].query(f"system stop fetches {table_name}") - nodes[1].query(f"system stop fetches {table_name}") - nodes[2].query(f"system stop fetches {table_name}") - - # create distributed table - nodes[0].query(f"DROP TABLE IF EXISTS {table_name}_d SYNC") - nodes[0].query( - f""" - CREATE TABLE {table_name}_d AS {table_name} - Engine=Distributed( - {cluster}, - currentDatabase(), - {table_name}, - rand() - ) - """ - ) - - # populate data, equal number of rows for each replica - nodes[0].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(10)", - settings={"distributed_foreground_insert": 1}, - ) - nodes[0].query( - f"INSERT INTO 
diff --git a/tests/integration/test_parallel_replicas_working_set/configs/remote_servers.xml b/tests/integration/test_parallel_replicas_working_set/configs/remote_servers.xml deleted file mode 100644 index 02a315479f8..00000000000 --- a/tests/integration/test_parallel_replicas_working_set/configs/remote_servers.xml +++ /dev/null @@ -1,22 +0,0 @@
-<clickhouse>
-    <remote_servers>
-        <test_single_shard_multiple_replicas>
-            <shard>
-                <internal_replication>true</internal_replication>
-                <replica>
-                    <host>n1</host>
-                    <port>9000</port>
-                </replica>
-                <replica>
-                    <host>n2</host>
-                    <port>9000</port>
-                </replica>
-                <replica>
-                    <host>n3</host>
-                    <port>9000</port>
-                </replica>
-            </shard>
-        </test_single_shard_multiple_replicas>
-    </remote_servers>
-</clickhouse>
diff --git a/tests/integration/test_parallel_replicas_working_set/test.py b/tests/integration/test_parallel_replicas_working_set/test.py deleted file mode 100644 index 0ede9d9b1a5..00000000000 --- a/tests/integration/test_parallel_replicas_working_set/test.py +++ /dev/null @@ -1,140 +0,0 @@ -import pytest -from helpers.cluster import ClickHouseCluster - -cluster = ClickHouseCluster(__file__) - -nodes = 
[ - cluster.add_instance( - f"n{i}", main_configs=["configs/remote_servers.xml"], with_zookeeper=True - ) - for i in (1, 2, 3) -] - - -@pytest.fixture(scope="module", autouse=True) -def start_cluster(): - try: - cluster.start() - yield cluster - finally: - cluster.shutdown() - - -def create_tables(cluster, table_name, node_with_covering_part): - # create replicated tables - for node in nodes: - node.query(f"DROP TABLE IF EXISTS {table_name} SYNC") - - nodes[0].query( - f"""CREATE TABLE IF NOT EXISTS {table_name} (key Int64, value String) Engine=ReplicatedMergeTree('/test_parallel_replicas/shard1/{table_name}', 'r1') - ORDER BY (key)""" - ) - nodes[1].query( - f"""CREATE TABLE IF NOT EXISTS {table_name} (key Int64, value String) Engine=ReplicatedMergeTree('/test_parallel_replicas/shard1/{table_name}', 'r2') - ORDER BY (key)""" - ) - nodes[2].query( - f"""CREATE TABLE IF NOT EXISTS {table_name} (key Int64, value String) Engine=ReplicatedMergeTree('/test_parallel_replicas/shard1/{table_name}', 'r3') - ORDER BY (key)""" - ) - # stop merges to keep original parts - # stop fetches to keep only parts created on the nodes - for i in (0, 1, 2): - if i != node_with_covering_part: - nodes[i].query(f"system stop fetches {table_name}") - nodes[i].query(f"system stop merges {table_name}") - - # populate data, equal number of rows for each replica - nodes[0].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(10)", - ) - nodes[0].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(10, 10)" - ) - nodes[1].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(20, 10)" - ) - nodes[1].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(30, 10)" - ) - nodes[2].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(40, 10)" - ) - nodes[2].query( - f"INSERT INTO {table_name} SELECT number, number FROM numbers(50, 10)" - ) - nodes[node_with_covering_part].query(f"system sync replica {table_name}") - nodes[node_with_covering_part].query(f"optimize table {table_name}") - - # check we have expected set of parts - expected_active_parts = "" - if node_with_covering_part == 0: - expected_active_parts = ( - "all_0_5_1\nall_2_2_0\nall_3_3_0\nall_4_4_0\nall_5_5_0\n" - ) - - if node_with_covering_part == 1: - expected_active_parts = ( - "all_0_0_0\nall_0_5_1\nall_1_1_0\nall_4_4_0\nall_5_5_0\n" - ) - - if node_with_covering_part == 2: - expected_active_parts = ( - "all_0_0_0\nall_0_5_1\nall_1_1_0\nall_2_2_0\nall_3_3_0\n" - ) - - assert ( - nodes[0].query( - f"select distinct name from clusterAllReplicas({cluster}, system.parts) where table='{table_name}' and active order by name" - ) - == expected_active_parts - ) - - -@pytest.mark.parametrize("node_with_covering_part", [0, 1, 2]) -def test_covering_part_in_announcement(start_cluster, node_with_covering_part): - """create and populate table in special way (see create_table()), - node_with_covering_part contains all parts merged into one, - other nodes contain only parts which are result of insert via the node - """ - - cluster = "test_single_shard_multiple_replicas" - table_name = "test_table" - create_tables(cluster, table_name, node_with_covering_part) - - # query result can be one of the following outcomes - # (1) query result if parallel replicas working set contains all_0_5_1 - expected_full_result = "60\t0\t59\t1770\n" - expected_results = {expected_full_result} - - # (2) query result if parallel replicas working set DOESN'T contain all_0_5_1 - if node_with_covering_part == 0: - 
expected_results.add("40\t20\t59\t1580\n") - if node_with_covering_part == 1: - expected_results.add("40\t0\t59\t1180\n") - if node_with_covering_part == 2: - expected_results.add("40\t0\t39\t780\n") - - # parallel replicas - result = nodes[0].query( - f"SELECT count(), min(key), max(key), sum(key) FROM {table_name}", - settings={ - "allow_experimental_parallel_reading_from_replicas": 2, - "prefer_localhost_replica": 0, - "max_parallel_replicas": 3, - "use_hedged_requests": 0, - "cluster_for_parallel_replicas": cluster, - }, - ) - assert result in expected_results - - # w/o parallel replicas - assert ( - nodes[node_with_covering_part].query( - f"SELECT count(), min(key), max(key), sum(key) FROM {table_name}", - settings={ - "allow_experimental_parallel_reading_from_replicas": 0, - }, - ) - == expected_full_result - ) diff --git a/tests/integration/test_replicated_database/test.py b/tests/integration/test_replicated_database/test.py index 3ced82ebb57..1fc3fe37044 100644 --- a/tests/integration/test_replicated_database/test.py +++ b/tests/integration/test_replicated_database/test.py @@ -507,7 +507,7 @@ def test_alters_from_different_replicas(started_cluster): settings = {"distributed_ddl_task_timeout": 5} assert ( - "There are 1 unfinished hosts (0 of them are currently active)" + "There are 1 unfinished hosts (0 of them are currently executing the task" in competing_node.query_and_get_error( "ALTER TABLE alters_from_different_replicas.concurrent_test ADD COLUMN Added0 UInt32;", settings=settings, diff --git a/tests/integration/test_replicated_database_cluster_groups/test.py b/tests/integration/test_replicated_database_cluster_groups/test.py index b14581c1fe6..647626d8014 100644 --- a/tests/integration/test_replicated_database_cluster_groups/test.py +++ b/tests/integration/test_replicated_database_cluster_groups/test.py @@ -96,7 +96,7 @@ def test_cluster_groups(started_cluster): main_node_2.stop_clickhouse() settings = {"distributed_ddl_task_timeout": 5} assert ( - "There are 1 unfinished hosts (0 of them are currently active)" + "There are 1 unfinished hosts (0 of them are currently executing the task)" in main_node_1.query_and_get_error( "CREATE TABLE cluster_groups.table_2 (d Date, k UInt64) ENGINE=ReplicatedMergeTree ORDER BY k PARTITION BY toYYYYMM(d);", settings=settings, diff --git a/tests/integration/test_storage_iceberg/test.py b/tests/integration/test_storage_iceberg/test.py index d5f8d04e258..9a75dc50d61 100644 --- a/tests/integration/test_storage_iceberg/test.py +++ b/tests/integration/test_storage_iceberg/test.py @@ -463,7 +463,9 @@ def test_schema_inference(started_cluster, format_version): create_iceberg_table(instance, TABLE_NAME, format) - res = instance.query(f"DESC {TABLE_NAME} FORMAT TSVRaw") + res = instance.query( + f"DESC {TABLE_NAME} FORMAT TSVRaw", settings={"print_pretty_type_names": 0} + ) expected = TSV( [ ["intC", "Nullable(Int32)"], diff --git a/tests/integration/test_throttling/configs/server_overrides.xml b/tests/integration/test_throttling/configs/dynamic_overrides.xml similarity index 100% rename from tests/integration/test_throttling/configs/server_overrides.xml rename to tests/integration/test_throttling/configs/dynamic_overrides.xml diff --git a/tests/integration/test_throttling/configs/server_backups.xml b/tests/integration/test_throttling/configs/static_overrides.xml similarity index 83% rename from tests/integration/test_throttling/configs/server_backups.xml rename to tests/integration/test_throttling/configs/static_overrides.xml index 
a8c43f8beaf..9f3bad2f882 100644 --- a/tests/integration/test_throttling/configs/server_backups.xml +++ b/tests/integration/test_throttling/configs/static_overrides.xml @@ -31,4 +31,7 @@
         <allowed_disk>default</allowed_disk>
         <allowed_path>/backups/</allowed_path>
     </backups>
+
+    <max_mutations_bandwidth_for_server>1000000</max_mutations_bandwidth_for_server>
+    <max_merges_bandwidth_for_server>1000000</max_merges_bandwidth_for_server>
 </clickhouse>
diff --git a/tests/integration/test_throttling/test.py b/tests/integration/test_throttling/test.py index 04d02cc859d..c53c2bb1ddf 100644 --- a/tests/integration/test_throttling/test.py +++ b/tests/integration/test_throttling/test.py @@ -34,8 +34,8 @@ node = cluster.add_instance( "node", stay_alive=True, main_configs=[ - "configs/server_backups.xml", - "configs/server_overrides.xml", + "configs/static_overrides.xml", + "configs/dynamic_overrides.xml", "configs/ssl.xml", ], user_configs=[ @@ -64,7 +64,7 @@ def revert_config(): [ "bash", "-c", - f"echo '<clickhouse/>' > /etc/clickhouse-server/config.d/server_overrides.xml", + f"echo '<clickhouse/>' > /etc/clickhouse-server/config.d/dynamic_overrides.xml", ] ) node.exec_in_container( @@ -96,7 +96,7 @@ def node_update_config(mode, setting, value=None): if mode is None: return if mode == "server": - config_path = "/etc/clickhouse-server/config.d/server_overrides.xml" + config_path = "/etc/clickhouse-server/config.d/dynamic_overrides.xml" config_content = f""" <clickhouse><{setting}>{value}</{setting}></clickhouse> """ @@ -430,3 +430,32 @@ def test_write_throttling(policy, mode, setting, value, should_took): ) _, took = elapsed(node.query, f"insert into data select * from numbers(1e6)") assert_took(took, should_took) + + +def test_max_mutations_bandwidth_for_server(): + node.query( + """ + drop table if exists data; + create table data (key UInt64 CODEC(NONE)) engine=MergeTree() order by tuple() settings min_bytes_for_wide_part=1e9; + """ + ) + node.query("insert into data select * from numbers(1e6)") + _, took = elapsed( + node.query, + "alter table data update key = -key where 1 settings mutations_sync = 1", + ) + # reading 1e6*8 bytes with 1M/s bandwidth should take (8-1)/1=7 seconds + assert_took(took, 7) + + +def test_max_merges_bandwidth_for_server(): + node.query( + """ + drop table if exists data; + create table data (key UInt64 CODEC(NONE)) engine=MergeTree() order by tuple() settings min_bytes_for_wide_part=1e9; + """ + ) + node.query("insert into data select * from numbers(1e6)") + _, took = elapsed(node.query, "optimize table data final") + # reading 1e6*8 bytes with 1M/s bandwidth should take (8-1)/1=7 seconds + assert_took(took, 7)
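Both new tests rely on the same arithmetic spelled out in their comments: 1e6 uncompressed UInt64 rows are 8 MB, the server-wide cap is 1 MB/s, and the first megabyte apparently fits into the throttler's initial budget, hence (8 - 1) / 1 = 7 seconds. The expectation as a standalone function (my restatement of the comment's model, not code from the patch):

    # Expected duration of a read throttled to rate_bytes_per_s, assuming the
    # first rate_bytes_per_s bytes are covered by the throttler's initial budget.
    def expected_seconds(rows: int, bytes_per_row: int, rate_bytes_per_s: float) -> float:
        total = rows * bytes_per_row  # 1e6 rows * 8 bytes = 8 MB to read
        return (total - rate_bytes_per_s) / rate_bytes_per_s

    # (8 MB - 1 MB) / 1 MB/s = 7 seconds
    print(expected_seconds(1_000_000, 8, 1_000_000))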
diff --git a/tests/integration/test_wrong_db_or_table_name/test.py b/tests/integration/test_wrong_db_or_table_name/test.py index 641501eac84..a5096d80ca9 100644 --- a/tests/integration/test_wrong_db_or_table_name/test.py +++ b/tests/integration/test_wrong_db_or_table_name/test.py @@ -57,6 +57,31 @@ def test_drop_wrong_database_name(start): node.query("DROP DATABASE test;") +def test_database_engine_name(start): + # test with a valid database engine + node.query( + """ + CREATE DATABASE test_atomic ENGINE = Atomic; + CREATE TABLE test_atomic.table_test_atomic (i Int64) ENGINE = MergeTree() ORDER BY i; + INSERT INTO test_atomic.table_test_atomic SELECT 1; + """ + ) + assert 1 == int(node.query("SELECT * FROM test_atomic.table_test_atomic").strip()) + # test with an invalid database engine + with pytest.raises( + QueryRuntimeException, + match="DB::Exception: Unknown database engine Atomic123. Maybe you meant: \\['Atomic'\\].", + ): + node.query("CREATE DATABASE test_atomic123 ENGINE = Atomic123;") + + node.query( + """ + DROP TABLE test_atomic.table_test_atomic; + DROP DATABASE test_atomic; + """ + ) + + def test_wrong_table_name(start): node.query( """
diff --git a/tests/performance/agg_functions_min_max_any.xml b/tests/performance/agg_functions_min_max_any.xml index 2926a5ed3c8..f8469244643 100644 --- a/tests/performance/agg_functions_min_max_any.xml +++ b/tests/performance/agg_functions_min_max_any.xml @@ -87,4 +87,9 @@
 <query>select any(FromTag) from hits_100m_single where FromTag != '' group by intHash32(UserID) % {group_scale} FORMAT Null</query>
 <query>select anyHeavy(FromTag) from hits_100m_single where FromTag != '' group by intHash32(UserID) % {group_scale} FORMAT Null</query>
+
+<query>select min((WatchID, CounterID)) from hits_100m_single FORMAT Null</query>
+<query>select max((WatchID, CounterID)) from hits_100m_single FORMAT Null</query>
+<query>select any((WatchID, CounterID)) from hits_100m_single FORMAT Null</query>
+<query>select anyHeavy((WatchID, CounterID)) from hits_100m_single FORMAT Null</query>
 </test>
diff --git a/tests/performance/bounding_ratio.xml b/tests/performance/bounding_ratio.xml index e3a15f90013..ed0b25848df 100644 --- a/tests/performance/bounding_ratio.xml +++ b/tests/performance/bounding_ratio.xml @@ -1,4 +1,4 @@
 <test>
-    <query>SELECT boundingRatio(number, number) FROM numbers(100000000)</query>
-    <query>SELECT (argMax(number, number) - argMin(number, number)) / (max(number) - min(number)) FROM numbers(100000000)</query>
+    <query>SELECT boundingRatio(number, number) FROM numbers(30000000)</query>
+    <query>SELECT (argMax(number, number) - argMin(number, number)) / (max(number) - min(number)) FROM numbers(30000000)</query>
 </test>
diff --git a/tests/performance/decimal_parse.xml b/tests/performance/decimal_parse.xml index 19e940b13df..966363d6fec 100644 --- a/tests/performance/decimal_parse.xml +++ b/tests/performance/decimal_parse.xml @@ -1,3 +1,3 @@
 <test>
-    <query>SELECT count() FROM zeros(10000000) WHERE NOT ignore(toDecimal32OrZero(toString(rand() % 10000), 5))</query>
+    <query>SELECT count() FROM zeros(3000000) WHERE NOT ignore(toDecimal32OrZero(toString(rand() % 10000), 5))</query>
 </test>
diff --git a/tests/performance/group_by_fixed_keys.xml b/tests/performance/group_by_fixed_keys.xml index a64208eb3de..d74b65ad47a 100644 --- a/tests/performance/group_by_fixed_keys.xml +++ b/tests/performance/group_by_fixed_keys.xml @@ -11,7 +11,7 @@
     <create_query>create table group_by_fk(a UInt32, b UInt32, c LowCardinality(UInt32), d Nullable(UInt32), e UInt64, f UInt64, g UInt64, h LowCardinality(UInt64), i Nullable(UInt64)) engine=MergeTree order by tuple()</create_query>
-    <fill_query>insert into group_by_fk select number, number, number % 10000, number % 2 == 0 ? number : Null, number, number, number, number % 10000, number % 2 == 0 ? number : Null from numbers_mt(3e7)
+    <fill_query>insert into group_by_fk select number, number, number % 10000, number % 2 == 0 ? number : Null, number, number, number, number % 10000, number % 2 == 0 ? number : Null from numbers_mt(1e7)
     settings max_insert_threads=8</fill_query>
     <query>select a, b from group_by_fk group by a, b format Null</query>
diff --git a/tests/performance/group_by_sundy_li.xml b/tests/performance/group_by_sundy_li.xml index 694fafcbbcd..46f659d9cc0 100644 --- a/tests/performance/group_by_sundy_li.xml +++ b/tests/performance/group_by_sundy_li.xml @@ -16,10 +16,10 @@
     ORDER BY (d, n)
-    <fill_query>insert into a select '2000-01-01', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(100000000)</fill_query>
-    <fill_query>insert into a select '2000-01-02', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(100000000)</fill_query>
-    <fill_query>insert into a select '2000-01-03', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(100000000)</fill_query>
-    <fill_query>insert into a select '2000-01-04', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(100000000)</fill_query>
+    <fill_query>insert into a select '2000-01-01', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(10000000)</fill_query>
+    <fill_query>insert into a select '2000-01-02', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(10000000)</fill_query>
+    <fill_query>insert into a select '2000-01-03', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(10000000)</fill_query>
+    <fill_query>insert into a select '2000-01-04', ['aa','bb','cc','dd'][number % 4 + 1], number from numbers_mt(10000000)</fill_query>
     <fill_query>OPTIMIZE TABLE a FINAL</fill_query>
diff --git a/tests/performance/hashed_dictionary.xml b/tests/performance/hashed_dictionary.xml index e9038e694c6..b9de02a70e0 100644 --- a/tests/performance/hashed_dictionary.xml +++ b/tests/performance/hashed_dictionary.xml @@ -82,7 +82,6 @@
         <name>elements_count</name>
         <values>
             <value>5000000</value>
-            <value>7500000</value>
         </values>
@@ -90,16 +89,14 @@
         WITH rand64() % toUInt64({elements_count}) as key
         SELECT dictGet('default.simple_key_hashed_dictionary', {column_name}, key)
-        FROM system.numbers
-        LIMIT {elements_count}
+        FROM numbers_mt({elements_count})
         FORMAT Null;
     </query>
     <query>
         WITH rand64() % toUInt64({elements_count}) as key
         SELECT dictHas('default.simple_key_hashed_dictionary', key)
-        FROM system.numbers
-        LIMIT {elements_count}
+        FROM numbers_mt({elements_count})
         FORMAT Null;
    </query>
@@ -111,16 +108,14 @@
         WITH (rand64() % toUInt64({elements_count}), toString(rand64() % toUInt64({elements_count}))) as key
         SELECT dictGet('default.complex_key_hashed_dictionary', {column_name}, key)
-        FROM system.numbers
-        LIMIT {elements_count}
+        FROM numbers_mt({elements_count})
         FORMAT Null;
     </query>
     <query>
         WITH (rand64() % toUInt64({elements_count}), toString(rand64() % toUInt64({elements_count}))) as key
         SELECT dictHas('default.complex_key_hashed_dictionary', key)
-        FROM system.numbers
-        LIMIT {elements_count}
+        FROM numbers_mt({elements_count})
         FORMAT Null;
    </query>
diff --git a/tests/performance/join_used_flags.xml b/tests/performance/join_used_flags.xml index cd2073ee106..1bb994f7be2 100644 --- a/tests/performance/join_used_flags.xml +++ b/tests/performance/join_used_flags.xml @@ -1,6 +1,6 @@
 <test>
     <create_query>CREATE TABLE test_join_used_flags (i64 Int64, i32 Int32) ENGINE = Memory</create_query>
-    <fill_query>INSERT INTO test_join_used_flags SELECT number AS i64, rand32() AS i32 FROM numbers(20000000)</fill_query>
+    <fill_query>INSERT INTO test_join_used_flags SELECT number AS i64, rand32() AS i32 FROM numbers_mt(1500000)</fill_query>
     <query>SELECT l.i64, r.i64, l.i32, r.i32 FROM test_join_used_flags l RIGHT JOIN test_join_used_flags r USING i64 format Null</query>
     <drop_query>DROP TABLE IF EXISTS test_join_used_flags</drop_query>
 </test>
diff --git a/tests/performance/min_max_index.xml b/tests/performance/min_max_index.xml index b7b5d4fb991..518696144e2 100644 --- a/tests/performance/min_max_index.xml +++ b/tests/performance/min_max_index.xml @@ -1,7 +1,7 @@
 <test>
     <create_query>CREATE TABLE index_test (z UInt32, INDEX i_x (mortonDecode(2, z).1) TYPE minmax, INDEX i_y (mortonDecode(2, z).2) TYPE minmax) ENGINE = MergeTree ORDER BY z</create_query>
-    <fill_query>INSERT INTO index_test SELECT number FROM numbers(0x100000000) WHERE rand() % 3 = 1</fill_query>
+    <fill_query>INSERT INTO index_test SELECT number * 10 FROM numbers_mt(toUInt64(0x100000000 / 10)) SETTINGS max_insert_threads=8</fill_query>
     <query>SELECT count() FROM index_test WHERE mortonDecode(2, z).1 >= 20000 AND mortonDecode(2, z).1 <= 20100 AND mortonDecode(2, z).2 >= 10000 AND mortonDecode(2, z).2 <= 10100</query>
 </test>
diff --git a/tests/performance/polymorphic_parts_l.xml b/tests/performance/polymorphic_parts_l.xml index d2ae9417bf7..66c5b73caa8 100644 --- a/tests/performance/polymorphic_parts_l.xml +++ b/tests/performance/polymorphic_parts_l.xml @@ -25,8 +25,8 @@
-    <query>INSERT INTO hits_wide(UserID) SELECT rand() FROM numbers(100000)</query>
-    <query>INSERT INTO hits_compact(UserID) SELECT rand() FROM numbers(100000)</query>
+    <query>INSERT INTO hits_wide(UserID) SELECT rand() FROM numbers(100000)</query>
+    <query>INSERT INTO hits_compact(UserID) SELECT rand() FROM numbers(100000)</query>
     <query>INSERT INTO hits_buffer(UserID) SELECT rand() FROM numbers(100000)</query>
     <drop_query>DROP TABLE IF EXISTS hits_wide</drop_query>
diff --git a/tests/performance/polymorphic_parts_m.xml b/tests/performance/polymorphic_parts_m.xml index 54a81def55e..0a44038ffbd 100644 --- a/tests/performance/polymorphic_parts_m.xml +++ b/tests/performance/polymorphic_parts_m.xml @@ -25,8 +25,8 @@
-    <query>INSERT INTO hits_wide(UserID) SELECT rand() FROM numbers(10000)</query>
-    <query>INSERT INTO hits_compact(UserID) SELECT rand() FROM numbers(100000)</query>
+    <query>INSERT INTO hits_wide(UserID) SELECT rand() FROM numbers(10000)</query>
+    <query>INSERT INTO hits_compact(UserID) SELECT rand() FROM numbers(10000)</query>
     <query>INSERT INTO hits_buffer(UserID) SELECT rand() FROM numbers(10000)</query>
     <drop_query>DROP TABLE IF EXISTS hits_wide</drop_query>
diff --git a/tests/performance/reinterpret_as.xml b/tests/performance/reinterpret_as.xml index dbf6df160ed..d05ef3bb038 100644 --- a/tests/performance/reinterpret_as.xml +++ b/tests/performance/reinterpret_as.xml @@ -19,7 +19,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -38,7 +38,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -57,7 +57,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -76,7 +76,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -95,7 +95,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(10000000) + FROM numbers_mt(5000000) SETTINGS max_threads = 8 FORMAT Null @@ -115,7 +115,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -134,7 +134,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -153,7 +153,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -172,7 +172,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(200000000) + FROM numbers_mt(100000000) SETTINGS max_threads = 8 FORMAT Null @@ -191,7 +191,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(100000000) + FROM numbers_mt(50000000) SETTINGS max_threads = 8 FORMAT Null @@ -210,7 +210,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(10000000) + FROM numbers_mt(5000000) SETTINGS max_threads = 8 FORMAT Null @@ -230,7 +230,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(20000000) + FROM numbers_mt(10000000) SETTINGS max_threads = 8 FORMAT Null @@ -249,7 +249,7 @@ toInt256(number) as d, toString(number) as f, toFixedString(f, 20) as g - FROM numbers_mt(100000000) + FROM numbers_mt(50000000) SETTINGS max_threads = 8 FORMAT Null
diff --git a/tests/performance/sum_map.xml b/tests/performance/sum_map.xml index f55af077023..ffb9b9507ae 100644 --- a/tests/performance/sum_map.xml +++ b/tests/performance/sum_map.xml @@ -7,7 +7,7 @@
         <name>scale</name>
         <values>
-            <value>1000000</value>
+            <value>100000</value>
         </values>
     </substitution>
diff --git a/tests/queries/0_stateless/00547_named_tuples.reference b/tests/queries/0_stateless/00547_named_tuples.reference index 70cd0054bdd..041ead4ca79 100644 --- a/tests/queries/0_stateless/00547_named_tuples.reference +++ b/tests/queries/0_stateless/00547_named_tuples.reference @@ -1 +1 @@ -(1,'Hello') Tuple(x UInt64, s String) 1 Hello 1 Hello +(1,'Hello') Tuple(\n x UInt64,\n s String) 1 Hello 1 Hello
diff --git a/tests/queries/0_stateless/00578_merge_table_and_table_virtual_column.sql b/tests/queries/0_stateless/00578_merge_table_and_table_virtual_column.sql index c2bc334ea38..f292eb30648 100644 --- a/tests/queries/0_stateless/00578_merge_table_and_table_virtual_column.sql +++ b/tests/queries/0_stateless/00578_merge_table_and_table_virtual_column.sql @@ -13,6 +13,8 @@ CREATE TABLE numbers5 ENGINE = MergeTree ORDER BY number AS SELECT number FROM n SELECT count() FROM merge(currentDatabase(), '^numbers\\d+$'); SELECT DISTINCT count() FROM merge(currentDatabase(), '^numbers\\d+$') GROUP BY number; +SET optimize_aggregation_in_order = 0; -- FIXME : in order may happen before filter push down + SET max_rows_to_read = 1000; SET max_threads = 'auto';
diff --git a/tests/queries/0_stateless/00621_regression_for_in_operator.reference b/tests/queries/0_stateless/00621_regression_for_in_operator.reference index ab8bcf307eb..b68f550a742 100644 --- a/tests/queries/0_stateless/00621_regression_for_in_operator.reference +++ b/tests/queries/0_stateless/00621_regression_for_in_operator.reference @@ -10,7 +10,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - TABLE id: 3, table_name: default.regression_for_in_operator_view + TABLE id: 3, alias: __table1, table_name: default.regression_for_in_operator_view WHERE FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -27,7 +27,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - TABLE id: 3, table_name: default.regression_for_in_operator_view + TABLE id: 3, alias: __table1, table_name: default.regression_for_in_operator_view WHERE FUNCTION id: 4, function_name: or, function_type: ordinary, result_type: UInt8 ARGUMENTS
diff --git a/tests/queries/0_stateless/00736_disjunction_optimisation.reference b/tests/queries/0_stateless/00736_disjunction_optimisation.reference index 84477a64057..f28dcacef0e 100644 --- a/tests/queries/0_stateless/00736_disjunction_optimisation.reference +++ b/tests/queries/0_stateless/00736_disjunction_optimisation.reference @@ -34,7 +34,7 @@ QUERY 
id: 0 COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3 COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug WHERE FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -77,7 +77,7 @@ QUERY id: 0 COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3 COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3 JOIN TREE - QUERY id: 3, is_subquery: 1 + QUERY id: 3, alias: __table1, is_subquery: 1 PROJECTION COLUMNS k UInt64 s UInt64 @@ -86,7 +86,7 @@ QUERY id: 0 COLUMN id: 6, column_name: k, result_type: UInt64, source_id: 7 COLUMN id: 8, column_name: s, result_type: UInt64, source_id: 7 JOIN TREE - TABLE id: 7, table_name: default.bug + TABLE id: 7, alias: __table2, table_name: default.bug WHERE FUNCTION id: 9, function_name: in, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -151,7 +151,7 @@ QUERY id: 0 COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3 CONSTANT id: 16, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8) JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug SETTINGS allow_experimental_analyzer=1 21 1 22 1 @@ -184,7 +184,7 @@ QUERY id: 0 COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3 CONSTANT id: 6, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8) JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug SETTINGS allow_experimental_analyzer=1 1 21 1 22 @@ -222,7 +222,7 @@ QUERY id: 0 COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3 COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug WHERE FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -265,7 +265,7 @@ QUERY id: 0 COLUMN id: 2, column_name: k, result_type: UInt64, source_id: 3 COLUMN id: 4, column_name: s, result_type: UInt64, source_id: 3 JOIN TREE - QUERY id: 3, is_subquery: 1 + QUERY id: 3, alias: __table1, is_subquery: 1 PROJECTION COLUMNS k UInt64 s UInt64 @@ -274,7 +274,7 @@ QUERY id: 0 COLUMN id: 6, column_name: k, result_type: UInt64, source_id: 7 COLUMN id: 8, column_name: s, result_type: UInt64, source_id: 7 JOIN TREE - TABLE id: 7, table_name: default.bug + TABLE id: 7, alias: __table2, table_name: default.bug WHERE FUNCTION id: 9, function_name: in, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -347,7 +347,7 @@ QUERY id: 0 COLUMN id: 7, column_name: s, result_type: UInt64, source_id: 3 CONSTANT id: 21, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8) JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug SETTINGS allow_experimental_analyzer=1 21 1 22 1 @@ -380,7 +380,7 @@ QUERY id: 0 COLUMN id: 2, column_name: s, result_type: UInt64, source_id: 3 CONSTANT id: 6, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8) JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug SETTINGS allow_experimental_analyzer=1 21 1 22 1 @@ -413,5 +413,5 @@ QUERY id: 0 COLUMN id: 2, column_name: s, result_type: UInt64, 
source_id: 3 CONSTANT id: 6, constant_value: Tuple_(UInt64_21, UInt64_22, UInt64_23), constant_value_type: Tuple(UInt8, UInt8, UInt8) JOIN TREE - TABLE id: 3, table_name: default.bug + TABLE id: 3, alias: __table1, table_name: default.bug SETTINGS allow_experimental_analyzer=1 diff --git a/tests/queries/0_stateless/00918_json_functions.reference b/tests/queries/0_stateless/00918_json_functions.reference index be8e603f8dc..5264d51fa73 100644 --- a/tests/queries/0_stateless/00918_json_functions.reference +++ b/tests/queries/0_stateless/00918_json_functions.reference @@ -69,8 +69,8 @@ hello 123456.1234 Decimal(20, 4) 123456.1234 Decimal(20, 4) 123456789012345.12 Decimal(30, 4) -(1234567890.1234567890123456789,'test') Tuple(a Decimal(35, 20), b LowCardinality(String)) -(1234567890.12345678901234567890123456789,'test') Tuple(a Decimal(45, 30), b LowCardinality(String)) +(1234567890.1234567890123456789,'test') Tuple(\n a Decimal(35, 20),\n b LowCardinality(String)) +(1234567890.12345678901234567890123456789,'test') Tuple(\n a Decimal(45, 30),\n b LowCardinality(String)) 123456789012345.1136 123456789012345.1136 1234567890.12345677879616925706 (1234567890.12345677879616925706,'test') 1234567890.123456695758468374595199311875 (1234567890.123456695758468374595199311875,'test') diff --git a/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.reference b/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.reference index 39979a98bde..b9a66a1e1a9 100644 --- a/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.reference +++ b/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.reference @@ -3,7 +3,7 @@ Received exception from server: Code: 57. Error: Received from localhost:9000. Error: There was an error on [localhost:9000]: Code: 57. Error: Table default.none already exists. (TABLE_ALREADY_EXISTS) (query: create table none on cluster test_shard_localhost (n int) engine=Memory;) Received exception from server: -Code: 159. Error: Received from localhost:9000. Error: Watching task is executing longer than distributed_ddl_task_timeout (=1) seconds. There are 1 unfinished hosts (0 of them are currently active), they are going to execute the query in background. (TIMEOUT_EXCEEDED) +Code: 159. Error: Received from localhost:9000. Error: Watching task is executing longer than distributed_ddl_task_timeout (=1) seconds. There are 1 unfinished hosts (0 of them are currently executing the task), they are going to execute the query in background. (TIMEOUT_EXCEEDED) (query: drop table if exists none on cluster test_unavailable_shard;) throw localhost 9000 0 0 0 @@ -12,7 +12,7 @@ Code: 57. Error: Received from localhost:9000. Error: There was an error on [loc (query: create table throw on cluster test_shard_localhost (n int) engine=Memory format Null;) localhost 9000 0 1 0 Received exception from server: -Code: 159. Error: Received from localhost:9000. Error: Watching task is executing longer than distributed_ddl_task_timeout (=1) seconds. There are 1 unfinished hosts (0 of them are currently active), they are going to execute the query in background. (TIMEOUT_EXCEEDED) +Code: 159. Error: Received from localhost:9000. Error: Watching task is executing longer than distributed_ddl_task_timeout (=1) seconds. There are 1 unfinished hosts (0 of them are currently executing the task), they are going to execute the query in background. 
(TIMEOUT_EXCEEDED) (query: drop table if exists throw on cluster test_unavailable_shard;) null_status_on_timeout localhost 9000 0 0 0 diff --git a/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.sh b/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.sh index d2695e602c5..12e142adda9 100755 --- a/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.sh +++ b/tests/queries/0_stateless/01175_distributed_ddl_output_mode_long.sh @@ -33,7 +33,7 @@ function run_until_out_contains() done } -RAND_COMMENT="01175_DDL_$RANDOM" +RAND_COMMENT="01175_DDL_$CLICKHOUSE_DATABASE" LOG_COMMENT="${CLICKHOUSE_LOG_COMMENT}_$RAND_COMMENT" CLICKHOUSE_CLIENT_WITH_SETTINGS=${CLICKHOUSE_CLIENT/--log_comment ${CLICKHOUSE_LOG_COMMENT}/--log_comment ${LOG_COMMENT}} diff --git a/tests/queries/0_stateless/01300_group_by_other_keys_having.reference b/tests/queries/0_stateless/01300_group_by_other_keys_having.reference index a9be79800c1..f861da3da2b 100644 --- a/tests/queries/0_stateless/01300_group_by_other_keys_having.reference +++ b/tests/queries/0_stateless/01300_group_by_other_keys_having.reference @@ -49,7 +49,7 @@ QUERY id: 0 LIST id: 9, nodes: 1 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11 JOIN TREE - TABLE_FUNCTION id: 11, table_function_name: numbers + TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers ARGUMENTS LIST id: 12, nodes: 1 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32 @@ -124,7 +124,7 @@ QUERY id: 0 LIST id: 9, nodes: 1 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11 JOIN TREE - TABLE_FUNCTION id: 11, table_function_name: numbers + TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers ARGUMENTS LIST id: 12, nodes: 1 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32 @@ -194,7 +194,7 @@ QUERY id: 0 COLUMN id: 6, column_name: number, result_type: UInt64, source_id: 7 CONSTANT id: 11, constant_value: UInt64_5, constant_value_type: UInt8 JOIN TREE - TABLE_FUNCTION id: 7, table_function_name: numbers + TABLE_FUNCTION id: 7, alias: __table1, table_function_name: numbers ARGUMENTS LIST id: 12, nodes: 1 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32 @@ -276,7 +276,7 @@ QUERY id: 0 LIST id: 9, nodes: 1 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11 JOIN TREE - TABLE_FUNCTION id: 11, table_function_name: numbers + TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers ARGUMENTS LIST id: 12, nodes: 1 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32 diff --git a/tests/queries/0_stateless/01323_redundant_functions_in_order_by.reference b/tests/queries/0_stateless/01323_redundant_functions_in_order_by.reference index bf184d142ec..d47f12ff4d1 100644 --- a/tests/queries/0_stateless/01323_redundant_functions_in_order_by.reference +++ b/tests/queries/0_stateless/01323_redundant_functions_in_order_by.reference @@ -49,14 +49,14 @@ QUERY id: 0 LIST id: 3, nodes: 1 COLUMN id: 4, column_name: x, result_type: UInt64, source_id: 5 JOIN TREE - QUERY id: 5, is_subquery: 1 + QUERY id: 5, alias: __table1, is_subquery: 1 PROJECTION COLUMNS x UInt64 PROJECTION LIST id: 6, nodes: 1 COLUMN id: 7, column_name: number, result_type: UInt64, source_id: 8 JOIN TREE - TABLE_FUNCTION id: 8, table_function_name: numbers + TABLE_FUNCTION id: 8, alias: __table2, table_function_name: numbers ARGUMENTS LIST id: 9, nodes: 1 CONSTANT id: 10, constant_value: UInt64_3, 
constant_value_type: UInt8
@@ -83,14 +83,14 @@ QUERY id: 0
LIST id: 3, nodes: 1
COLUMN id: 4, column_name: x, result_type: UInt64, source_id: 5
JOIN TREE
- QUERY id: 5, is_subquery: 1
+ QUERY id: 5, alias: __table1, is_subquery: 1
PROJECTION COLUMNS
x UInt64
PROJECTION
LIST id: 6, nodes: 1
COLUMN id: 7, column_name: number, result_type: UInt64, source_id: 8
JOIN TREE
- TABLE_FUNCTION id: 8, table_function_name: numbers
+ TABLE_FUNCTION id: 8, alias: __table2, table_function_name: numbers
ARGUMENTS
LIST id: 9, nodes: 1
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
@@ -119,14 +119,14 @@ QUERY id: 0
LIST id: 3, nodes: 1
COLUMN id: 4, column_name: x, result_type: UInt64, source_id: 5
JOIN TREE
- QUERY id: 5, is_subquery: 1
+ QUERY id: 5, alias: __table1, is_subquery: 1
PROJECTION COLUMNS
x UInt64
PROJECTION
LIST id: 6, nodes: 1
COLUMN id: 7, column_name: number, result_type: UInt64, source_id: 8
JOIN TREE
- TABLE_FUNCTION id: 8, table_function_name: numbers
+ TABLE_FUNCTION id: 8, alias: __table2, table_function_name: numbers
ARGUMENTS
LIST id: 9, nodes: 1
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
@@ -171,7 +171,7 @@ QUERY id: 0
JOIN TREE
JOIN id: 8, strictness: ALL, kind: FULL
LEFT TABLE EXPRESSION
- QUERY id: 3, alias: s, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
PROJECTION COLUMNS
key UInt64
PROJECTION
@@ -182,12 +182,12 @@ QUERY id: 0
COLUMN id: 12, column_name: number, result_type: UInt64, source_id: 13
CONSTANT id: 14, constant_value: UInt64_2, constant_value_type: UInt8
JOIN TREE
- TABLE_FUNCTION id: 13, table_function_name: numbers
+ TABLE_FUNCTION id: 13, alias: __table2, table_function_name: numbers
ARGUMENTS
LIST id: 15, nodes: 1
CONSTANT id: 16, constant_value: UInt64_4, constant_value_type: UInt8
RIGHT TABLE EXPRESSION
- TABLE id: 5, alias: t, table_name: default.test
+ TABLE id: 5, alias: __table3, table_name: default.test
JOIN EXPRESSION
LIST id: 17, nodes: 1
COLUMN id: 18, column_name: key, result_type: UInt64, source_id: 8
@@ -220,7 +220,7 @@ QUERY id: 0
COLUMN id: 2, column_name: key, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: a, result_type: UInt8, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.test
+ TABLE id: 3, alias: __table1, table_name: default.test
ORDER BY
LIST id: 5, nodes: 2
SORT id: 6, sort_direction: ASCENDING, with_fill: 0
@@ -246,7 +246,7 @@ QUERY id: 0
COLUMN id: 2, column_name: key, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: a, result_type: UInt8, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.test
+ TABLE id: 3, alias: __table1, table_name: default.test
ORDER BY
LIST id: 5, nodes: 2
SORT id: 6, sort_direction: ASCENDING, with_fill: 0
@@ -270,7 +270,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
COLUMN id: 2, column_name: key, result_type: UInt64, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.test
+ TABLE id: 3, alias: __table1, table_name: default.test
GROUP BY
LIST id: 4, nodes: 1
COLUMN id: 2, column_name: key, result_type: UInt64, source_id: 3
@@ -297,9 +297,9 @@ QUERY id: 0
JOIN TREE
JOIN id: 6, strictness: ALL, kind: INNER
LEFT TABLE EXPRESSION
- TABLE id: 3, table_name: default.t1
+ TABLE id: 3, alias: __table1, table_name: default.t1
RIGHT TABLE EXPRESSION
- TABLE id: 5, table_name: default.t2
+ TABLE id: 5, alias: __table2, table_name: default.t2
JOIN EXPRESSION
FUNCTION id: 7, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
diff --git a/tests/queries/0_stateless/01455_opentelemetry_distributed.reference b/tests/queries/0_stateless/01455_opentelemetry_distributed.reference
index a6d43856aec..2920b387aa2 100644
--- a/tests/queries/0_stateless/01455_opentelemetry_distributed.reference
+++ b/tests/queries/0_stateless/01455_opentelemetry_distributed.reference
@@ -1,9 +1,9 @@
===http===
{"query":"select 1 from remote('127.0.0.2', system, one) settings allow_experimental_analyzer = 1 format Null\n","status":"QueryFinish","tracestate":"some custom state","sorted_by_start_time":1}
{"query":"DESC TABLE system.one","status":"QueryFinish","tracestate":"some custom state","sorted_by_start_time":1}
-{"query":"SELECT 1 AS `1` FROM `system`.`one`","status":"QueryFinish","tracestate":"some custom state","sorted_by_start_time":1}
+{"query":"SELECT 1 AS `1` FROM `system`.`one` AS `__table1`","status":"QueryFinish","tracestate":"some custom state","sorted_by_start_time":1}
{"query":"DESC TABLE system.one","query_status":"QueryFinish","tracestate":"some custom state","sorted_by_finish_time":1}
-{"query":"SELECT 1 AS `1` FROM `system`.`one`","query_status":"QueryFinish","tracestate":"some custom state","sorted_by_finish_time":1}
+{"query":"SELECT 1 AS `1` FROM `system`.`one` AS `__table1`","query_status":"QueryFinish","tracestate":"some custom state","sorted_by_finish_time":1}
{"query":"select 1 from remote('127.0.0.2', system, one) settings allow_experimental_analyzer = 1 format Null\n","query_status":"QueryFinish","tracestate":"some custom state","sorted_by_finish_time":1}
{"total spans":"3","unique spans":"3","unique non-zero parent spans":"3"}
{"initial query spans with proper parent":"2"}
diff --git a/tests/queries/0_stateless/01458_named_tuple_millin.reference b/tests/queries/0_stateless/01458_named_tuple_millin.reference
index d6d6d7ae8d4..954dfe36563 100644
--- a/tests/queries/0_stateless/01458_named_tuple_millin.reference
+++ b/tests/queries/0_stateless/01458_named_tuple_millin.reference
@@ -3,10 +3,10 @@ CREATE TABLE default.tuple
`j` Tuple(a Int8, b String)
)
ENGINE = Memory
-j Tuple(a Int8, b String)
+j Tuple(\n    a Int8,\n    b String)
CREATE TABLE default.tuple
(
`j` Tuple(a Int8, b String)
)
ENGINE = Memory
-j Tuple(a Int8, b String)
+j Tuple(\n    a Int8,\n    b String)
diff --git a/tests/queries/0_stateless/01532_tuple_with_name_type.reference b/tests/queries/0_stateless/01532_tuple_with_name_type.reference
index 8a3e57d9016..66b85f05fa6 100644
--- a/tests/queries/0_stateless/01532_tuple_with_name_type.reference
+++ b/tests/queries/0_stateless/01532_tuple_with_name_type.reference
@@ -1,4 +1,4 @@
-a Tuple(key String, value String)
-a Tuple(Tuple(key String, value String))
-a Array(Tuple(key String, value String))
-a Tuple(UInt8, Tuple(key String, value String))
+a Tuple(\n    key String,\n    value String)
+a Tuple(Tuple(\n    key String,\n    value String))
+a Array(Tuple(\n    key String,\n    value String))
+a Tuple(UInt8, Tuple(\n    key String,\n    value String))
diff --git a/tests/queries/0_stateless/01561_clickhouse_client_stage.reference b/tests/queries/0_stateless/01561_clickhouse_client_stage.reference
index 8a34751b071..2631199cbab 100644
--- a/tests/queries/0_stateless/01561_clickhouse_client_stage.reference
+++ b/tests/queries/0_stateless/01561_clickhouse_client_stage.reference
@@ -2,7 +2,7 @@ execute: --allow_experimental_analyzer=1
"foo"
1
execute: --allow_experimental_analyzer=1 --stage fetch_columns
-"dummy_0"
+"__table1.dummy"
0
execute: --allow_experimental_analyzer=1 --stage with_mergeable_state
"1_UInt8"
diff --git a/tests/queries/0_stateless/01591_window_functions.reference b/tests/queries/0_stateless/01591_window_functions.reference
index 5d12a09a846..156f36f7dba 100644
--- a/tests/queries/0_stateless/01591_window_functions.reference
+++ b/tests/queries/0_stateless/01591_window_functions.reference
@@ -917,9 +917,9 @@ from
;
Expression ((Project names + Projection))
Window (Window step for window \'\')
- Window (Window step for window \'PARTITION BY p_0\')
- Window (Window step for window \'PARTITION BY p_0 ORDER BY o_1 ASC\')
- Sorting (Sorting for window \'PARTITION BY p_0 ORDER BY o_1 ASC\')
+ Window (Window step for window \'PARTITION BY __table1.p\')
+ Window (Window step for window \'PARTITION BY __table1.p ORDER BY __table1.o ASC\')
+ Sorting (Sorting for window \'PARTITION BY __table1.p ORDER BY __table1.o ASC\')
Expression ((Before WINDOW + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))))
ReadFromSystemNumbers
explain select
@@ -930,11 +930,11 @@ from
from numbers(16)) t
;
Expression ((Project names + Projection))
- Window (Window step for window \'ORDER BY o_0 ASC, number_1 ASC\')
- Sorting (Sorting for window \'ORDER BY o_0 ASC, number_1 ASC\')
- Window (Window step for window \'ORDER BY number_1 ASC\')
+ Window (Window step for window \'ORDER BY __table1.o ASC, __table1.number ASC\')
+ Sorting (Sorting for window \'ORDER BY __table1.o ASC, __table1.number ASC\')
+ Window (Window step for window \'ORDER BY __table1.number ASC\')
Expression ((Before WINDOW + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))) [lifted up part])
- Sorting (Sorting for window \'ORDER BY number_1 ASC\')
+ Sorting (Sorting for window \'ORDER BY __table1.number ASC\')
Expression ((Before WINDOW + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))))
ReadFromSystemNumbers
-- A test case for the sort comparator found by fuzzer.
diff --git a/tests/queries/0_stateless/01622_constraints_simple_optimization.reference b/tests/queries/0_stateless/01622_constraints_simple_optimization.reference
index ef6425b485b..d267df2237f 100644
--- a/tests/queries/0_stateless/01622_constraints_simple_optimization.reference
+++ b/tests/queries/0_stateless/01622_constraints_simple_optimization.reference
@@ -45,7 +45,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.constraint_test_constants
+ TABLE id: 3, alias: __table1, table_name: default.constraint_test_constants
WHERE
FUNCTION id: 4, function_name: greater, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -63,7 +63,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.constraint_test_constants
+ TABLE id: 3, alias: __table1, table_name: default.constraint_test_constants
WHERE
FUNCTION id: 4, function_name: greater, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -80,5 +80,5 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.constraint_test_constants
+ TABLE id: 3, alias: __table1, table_name: default.constraint_test_constants
SETTINGS allow_experimental_analyzer=1
diff --git a/tests/queries/0_stateless/01622_constraints_where_optimization.reference b/tests/queries/0_stateless/01622_constraints_where_optimization.reference
index b5520d75b0e..3f6e8211f1a 100644
--- a/tests/queries/0_stateless/01622_constraints_where_optimization.reference
+++ b/tests/queries/0_stateless/01622_constraints_where_optimization.reference
@@ -8,7 +8,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.t_constraints_where
+ TABLE id: 3, alias: __table1, table_name: default.t_constraints_where
WHERE
CONSTANT id: 4, constant_value: UInt64_0, constant_value_type: UInt8
SETTINGS allow_experimental_analyzer=1
@@ -22,7 +22,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.t_constraints_where
+ TABLE id: 3, alias: __table1, table_name: default.t_constraints_where
WHERE
CONSTANT id: 4, constant_value: UInt64_0, constant_value_type: UInt8
SETTINGS allow_experimental_analyzer=1
@@ -36,7 +36,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.t_constraints_where
+ TABLE id: 3, alias: __table1, table_name: default.t_constraints_where
WHERE
CONSTANT id: 4, constant_value: UInt64_0, constant_value_type: UInt8
SETTINGS allow_experimental_analyzer=1
@@ -50,7 +50,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.t_constraints_where
+ TABLE id: 3, alias: __table1, table_name: default.t_constraints_where
WHERE
FUNCTION id: 4, function_name: less, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -68,7 +68,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.t_constraints_where
+ TABLE id: 3, alias: __table1, table_name: default.t_constraints_where
PREWHERE
FUNCTION id: 4, function_name: less, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -85,5 +85,5 @@ QUERY id: 0
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
JOIN TREE
- TABLE id: 3, table_name: default.t_constraints_where
+ TABLE id: 3, alias: __table1, table_name: default.t_constraints_where
SETTINGS allow_experimental_analyzer=1
diff --git a/tests/queries/0_stateless/01623_constraints_column_swap.reference b/tests/queries/0_stateless/01623_constraints_column_swap.reference
index 3639ad47228..555a4c93f70 100644
--- a/tests/queries/0_stateless/01623_constraints_column_swap.reference
+++ b/tests/queries/0_stateless/01623_constraints_column_swap.reference
@@ -20,7 +20,7 @@ QUERY id: 0
COLUMN id: 9, column_name: b, result_type: UInt64, source_id: 5
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
JOIN TREE
- TABLE id: 5, table_name: default.column_swap_test_test
+ TABLE id: 5, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -50,7 +50,7 @@ QUERY id: 0
COLUMN id: 9, column_name: b, result_type: UInt64, source_id: 5
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
JOIN TREE
- TABLE id: 5, table_name: default.column_swap_test_test
+ TABLE id: 5, alias: __table1, table_name: default.column_swap_test_test
PREWHERE
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -80,7 +80,7 @@ QUERY id: 0
COLUMN id: 9, column_name: b, result_type: UInt64, source_id: 5
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
JOIN TREE
- TABLE id: 5, table_name: default.column_swap_test_test
+ TABLE id: 5, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -110,7 +110,7 @@ QUERY id: 0
COLUMN id: 9, column_name: b, result_type: UInt64, source_id: 5
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
JOIN TREE
- TABLE id: 5, table_name: default.column_swap_test_test
+ TABLE id: 5, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -140,7 +140,7 @@ QUERY id: 0
COLUMN id: 9, column_name: b, result_type: UInt64, source_id: 5
CONSTANT id: 10, constant_value: UInt64_3, constant_value_type: UInt8
JOIN TREE
- TABLE id: 5, table_name: default.column_swap_test_test
+ TABLE id: 5, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -162,7 +162,7 @@ QUERY id: 0
COLUMN id: 4, column_name: b, result_type: UInt64, source_id: 5
CONSTANT id: 6, constant_value: UInt64_10, constant_value_type: UInt8
JOIN TREE
- TABLE id: 5, table_name: default.column_swap_test_test
+ TABLE id: 5, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 7, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -191,7 +191,7 @@ QUERY id: 0
CONSTANT id: 8, constant_value: UInt64_10, constant_value_type: UInt8
COLUMN id: 9, column_name: a, result_type: String, source_id: 7
JOIN TREE
- TABLE id: 7, table_name: default.column_swap_test_test
+ TABLE id: 7, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 10, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -223,7 +223,7 @@ QUERY id: 0
CONSTANT id: 8, constant_value: UInt64_10, constant_value_type: UInt8
COLUMN id: 9, column_name: a, result_type: String, source_id: 7
JOIN TREE
- TABLE id: 7, table_name: default.column_swap_test_test
+ TABLE id: 7, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 10, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -248,7 +248,7 @@ QUERY id: 0
COLUMN id: 2, column_name: a, result_type: String, source_id: 3
COLUMN id: 4, column_name: a, result_type: String, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.column_swap_test_test
+ TABLE id: 3, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -270,7 +270,7 @@ QUERY id: 0
COLUMN id: 2, column_name: a, result_type: String, source_id: 3
COLUMN id: 4, column_name: a, result_type: String, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.column_swap_test_test
+ TABLE id: 3, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -292,7 +292,7 @@ QUERY id: 0
COLUMN id: 2, column_name: a, result_type: String, source_id: 3
COLUMN id: 4, column_name: a, result_type: String, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.column_swap_test_test
+ TABLE id: 3, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -310,7 +310,7 @@ QUERY id: 0
LIST id: 1, nodes: 1
COLUMN id: 2, column_name: a, result_type: String, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.column_swap_test_test
+ TABLE id: 3, alias: __table1, table_name: default.column_swap_test_test
WHERE
FUNCTION id: 4, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
@@ -327,5 +327,5 @@ QUERY id: 0
LIST id: 1, nodes: 1
COLUMN id: 2, column_name: a, result_type: UInt32, source_id: 3
JOIN TREE
- TABLE id: 3, table_name: default.t_bad_constraint
+ TABLE id: 3, alias: __table1, table_name: default.t_bad_constraint
SETTINGS allow_experimental_analyzer=1
diff --git a/tests/queries/0_stateless/01646_rewrite_sum_if.reference b/tests/queries/0_stateless/01646_rewrite_sum_if.reference
index 871c75737c6..af582908f03 100644
--- a/tests/queries/0_stateless/01646_rewrite_sum_if.reference
+++ b/tests/queries/0_stateless/01646_rewrite_sum_if.reference
@@ -56,7 +56,7 @@ QUERY id: 0
CONSTANT id: 13, constant_value: UInt64_2, constant_value_type: UInt8
CONSTANT id: 14, constant_value: UInt64_0, constant_value_type: UInt8
JOIN TREE
- TABLE_FUNCTION id: 12, table_function_name: numbers
+ TABLE_FUNCTION id: 12, alias: __table1, table_function_name: numbers
ARGUMENTS
LIST id: 15, nodes: 1
CONSTANT id: 16, constant_value: UInt64_100, constant_value_type: UInt8
@@ -82,7 +82,7 @@ QUERY id: 0
CONSTANT id: 13, constant_value: UInt64_2, constant_value_type: UInt8
CONSTANT id: 14, constant_value: UInt64_0, constant_value_type: UInt8
JOIN TREE
- TABLE_FUNCTION id: 12, table_function_name: numbers
+ TABLE_FUNCTION id: 12, alias: __table1, table_function_name: numbers
ARGUMENTS
LIST id: 15, nodes: 1
CONSTANT id: 16, constant_value: UInt64_100, constant_value_type: UInt8
@@ -111,7 +111,7 @@ QUERY id: 0
CONSTANT id: 15, constant_value: UInt64_2, constant_value_type: UInt8
CONSTANT id: 16, constant_value: UInt64_0, constant_value_type: UInt8
JOIN TREE
- TABLE_FUNCTION id: 14, table_function_name: numbers
+ TABLE_FUNCTION id: 14, alias: __table1, table_function_name: numbers
ARGUMENTS
LIST id: 17, nodes: 1
CONSTANT id: 18, constant_value: UInt64_100, constant_value_type: UInt8
diff --git a/tests/queries/0_stateless/01655_plan_optimizations.reference b/tests/queries/0_stateless/01655_plan_optimizations.reference
index 54ca55d2068..436d06c5076 100644
--- a/tests/queries/0_stateless/01655_plan_optimizations.reference
+++ b/tests/queries/0_stateless/01655_plan_optimizations.reference
@@ -28,7 +28,7 @@ Aggregating
Filter
Filter
> (analyzer) filter should be pushed down after aggregating, column after aggregation is const
-COLUMN Const(UInt8) -> notEquals(y_1, 0_UInt8)
+COLUMN Const(UInt8) -> notEquals(__table1.y, 0_UInt8)
Aggregating
Filter
Filter
@@ -49,9 +49,9 @@ Aggregating
Filter column: notEquals(y, 0)
> (analyzer) one condition of filter should be pushed down after aggregating, other condition is aliased
Filter column
-ALIAS notEquals(s_0, 4_UInt8) :: 0 -> and(notEquals(y_1, 0_UInt8), notEquals(s_0, 4_UInt8))
+ALIAS notEquals(__table1.s, 4_UInt8) :: 0 -> and(notEquals(__table1.y, 0_UInt8), notEquals(__table1.s, 4_UInt8))
Aggregating
-Filter column: notEquals(y_1, 0_UInt8)
+Filter column: notEquals(__table1.y, 0_UInt8)
0 1
1 2
2 3
@@ -68,9 +68,9 @@ Aggregating
Filter column: notEquals(y, 0)
> (analyzer) one condition of filter should be pushed down after aggregating, other condition is casted
Filter column
-FUNCTION and(minus(s_0, 4_UInt8) :: 0, 1 :: 3) -> and(notEquals(y_1, 0_UInt8), minus(s_0, 4_UInt8)) UInt8 : 2
+FUNCTION and(minus(__table1.s, 4_UInt8) :: 0, 1 :: 3) -> and(notEquals(__table1.y, 0_UInt8), minus(__table1.s, 4_UInt8)) UInt8 : 2
Aggregating
-Filter column: notEquals(y_1, 0_UInt8)
+Filter column: notEquals(__table1.y, 0_UInt8)
0 1
1 2
2 3
@@ -87,9 +87,9 @@ Aggregating
Filter column: notEquals(y, 0)
> (analyzer) one condition of filter should be pushed down after aggregating, other two conditions are ANDed
Filter column
-FUNCTION and(minus(s_0, 8_UInt8) :: 0, minus(s_0, 4_UInt8) :: 2) -> and(notEquals(y_1, 0_UInt8), minus(s_0, 8_UInt8), minus(s_0, 4_UInt8))
+FUNCTION and(minus(__table1.s, 8_UInt8) :: 0, minus(__table1.s, 4_UInt8) :: 2) -> and(notEquals(__table1.y, 0_UInt8), minus(__table1.s, 8_UInt8), minus(__table1.s, 4_UInt8))
Aggregating
-Filter column: notEquals(y_1, 0_UInt8)
+Filter column: notEquals(__table1.y, 0_UInt8)
0 1
1 2
2 3
@@ -105,9 +105,9 @@ Aggregating
Filter column: and(notEquals(y, 0), minus(y, 4))
> (analyzer) two conditions of filter should be pushed down after aggregating and ANDed, one condition is aliased
Filter column
-ALIAS notEquals(s_0, 8_UInt8) :: 0 -> and(notEquals(y_1, 0_UInt8), notEquals(s_0, 8_UInt8), minus(y_1, 4_UInt8))
+ALIAS notEquals(__table1.s, 8_UInt8) :: 0 -> and(notEquals(__table1.y, 0_UInt8), notEquals(__table1.s, 8_UInt8), minus(__table1.y, 4_UInt8))
Aggregating
-Filter column: and(notEquals(y_1, 0_UInt8), minus(y_1, 4_UInt8))
+Filter column: and(notEquals(__table1.y, 0_UInt8), minus(__table1.y, 4_UInt8))
0 1
1 2
2 3
@@ -121,9 +121,9 @@ Filter column: and(notEquals(y, 2), notEquals(x, 0))
ARRAY JOIN x
Filter column: notEquals(y, 2)
> (analyzer) filter is split, one part is filtered before ARRAY JOIN
-Filter column: and(notEquals(y_1, 2_UInt8), notEquals(x_0, 0_UInt8))
-ARRAY JOIN x_0
-Filter column: notEquals(y_1, 2_UInt8)
+Filter column: and(notEquals(__table2.y, 2_UInt8), notEquals(__table1.x, 0_UInt8))
+ARRAY JOIN __table1.x
+Filter column: notEquals(__table2.y, 2_UInt8)
1 3
> filter is pushed down before Distinct
Distinct
@@ -132,7 +132,7 @@ Filter column: notEquals(y, 2)
> (analyzer) filter is pushed down before Distinct
Distinct
Distinct
-Filter column: notEquals(y_1, 2_UInt8)
+Filter column: notEquals(__table1.y, 2_UInt8)
0 0
0 1
1 0
@@ -144,7 +144,7 @@ Filter column: and(notEquals(x, 0), notEquals(y, 0))
> (analyzer) filter is pushed down before sorting steps
Sorting
Sorting
-Filter column: and(notEquals(x_0, 0_UInt8), notEquals(y_1, 0_UInt8))
+Filter column: and(notEquals(__table1.x, 0_UInt8), notEquals(__table1.y, 0_UInt8))
1 2
1 1
> filter is pushed down before TOTALS HAVING and aggregating
@@ -154,7 +154,7 @@ Filter column: notEquals(y, 2)
> (analyzer) filter is pushed down before TOTALS HAVING and aggregating
TotalsHaving
Aggregating
-Filter column: notEquals(y_0, 2_UInt8)
+Filter column: notEquals(__table1.y, 2_UInt8)
0 12
1 15
3 10
@@ -174,7 +174,7 @@ Join
> (analyzer) one condition of filter is pushed down before LEFT JOIN
Join
Join
-Filter column: notEquals(number_0, 1_UInt8)
+Filter column: notEquals(__table1.number, 1_UInt8)
0 0
3 3
> one condition of filter is pushed down before INNER JOIN
@@ -185,7 +185,7 @@ Join
> (analyzer) one condition of filter is pushed down before INNER JOIN
Join
Join
-Filter column: notEquals(number_0, 1_UInt8)
+Filter column: notEquals(__table1.number, 1_UInt8)
3 3
> filter is pushed down before UNION
Union
diff --git a/tests/queries/0_stateless/01655_plan_optimizations.sh b/tests/queries/0_stateless/01655_plan_optimizations.sh
index a765a6ea4fa..5a517264243 100755
--- a/tests/queries/0_stateless/01655_plan_optimizations.sh
+++ b/tests/queries/0_stateless/01655_plan_optimizations.sh
@@ -36,7 +36,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
explain actions = 1 select s, y, y != 0 from (select sum(x) as s, y from (
select number as x, number + 1 as y from numbers(10)) group by y
) where y != 0
- settings enable_optimize_predicate_expression=0" | grep -o "Aggregating\|Filter\|COLUMN Const(UInt8) -> notEquals(y_1, 0_UInt8)"
+ settings enable_optimize_predicate_expression=0" | grep -o "Aggregating\|Filter\|COLUMN Const(UInt8) -> notEquals(__table1.y, 0_UInt8)"
$CLICKHOUSE_CLIENT -q "
select s, y, y != 0 from (select sum(x) as s, y from (
select number as x, number + 1 as y from numbers(10)) group by y
@@ -56,7 +56,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
) where y != 0 and s != 4
settings enable_optimize_predicate_expression=0" |
- grep -o "Aggregating\|Filter column\|Filter column: notEquals(y_1, 0_UInt8)\|ALIAS notEquals(s_0, 4_UInt8) :: 0 -> and(notEquals(y_1, 0_UInt8), notEquals(s_0, 4_UInt8))"
+ grep -o "Aggregating\|Filter column\|Filter column: notEquals(__table1.y, 0_UInt8)\|ALIAS notEquals(__table1.s, 4_UInt8) :: 0 -> and(notEquals(__table1.y, 0_UInt8), notEquals(__table1.s, 4_UInt8))"
$CLICKHOUSE_CLIENT -q "
select s, y from (
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
@@ -76,7 +76,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
) where y != 0 and s - 4
settings enable_optimize_predicate_expression=0" |
- grep -o "Aggregating\|Filter column\|Filter column: notEquals(y_1, 0_UInt8)\|FUNCTION and(minus(s_0, 4_UInt8) :: 0, 1 :: 3) -> and(notEquals(y_1, 0_UInt8), minus(s_0, 4_UInt8)) UInt8 : 2"
+ grep -o "Aggregating\|Filter column\|Filter column: notEquals(__table1.y, 0_UInt8)\|FUNCTION and(minus(__table1.s, 4_UInt8) :: 0, 1 :: 3) -> and(notEquals(__table1.y, 0_UInt8), minus(__table1.s, 4_UInt8)) UInt8 : 2"
$CLICKHOUSE_CLIENT -q "
select s, y from (
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
@@ -96,7 +96,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 --convert_query_to_cnf=0 -q "
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
) where y != 0 and s - 8 and s - 4
settings enable_optimize_predicate_expression=0" |
- grep -o "Aggregating\|Filter column\|Filter column: notEquals(y_1, 0_UInt8)\|FUNCTION and(minus(s_0, 8_UInt8) :: 0, minus(s_0, 4_UInt8) :: 2) -> and(notEquals(y_1, 0_UInt8), minus(s_0, 8_UInt8), minus(s_0, 4_UInt8))"
+ grep -o "Aggregating\|Filter column\|Filter column: notEquals(__table1.y, 0_UInt8)\|FUNCTION and(minus(__table1.s, 8_UInt8) :: 0, minus(__table1.s, 4_UInt8) :: 2) -> and(notEquals(__table1.y, 0_UInt8), minus(__table1.s, 8_UInt8), minus(__table1.s, 4_UInt8))"
$CLICKHOUSE_CLIENT -q "
select s, y from (
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
@@ -116,7 +116,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 --convert_query_to_cnf=0 -q "
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
) where y != 0 and s != 8 and y - 4
settings enable_optimize_predicate_expression=0" |
- grep -o "Aggregating\|Filter column\|Filter column: and(notEquals(y_1, 0_UInt8), minus(y_1, 4_UInt8))\|ALIAS notEquals(s_0, 8_UInt8) :: 0 -> and(notEquals(y_1, 0_UInt8), notEquals(s_0, 8_UInt8), minus(y_1, 4_UInt8))"
+ grep -o "Aggregating\|Filter column\|Filter column: and(notEquals(__table1.y, 0_UInt8), minus(__table1.y, 4_UInt8))\|ALIAS notEquals(__table1.s, 8_UInt8) :: 0 -> and(notEquals(__table1.y, 0_UInt8), notEquals(__table1.s, 8_UInt8), minus(__table1.y, 4_UInt8))"
$CLICKHOUSE_CLIENT -q "
select s, y from (
select sum(x) as s, y from (select number as x, number + 1 as y from numbers(10)) group by y
@@ -134,7 +134,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
explain actions = 1 select x, y from (
select range(number) as x, number + 1 as y from numbers(3)
) array join x where y != 2 and x != 0" |
- grep -o "Filter column: and(notEquals(y_1, 2_UInt8), notEquals(x_0, 0_UInt8))\|ARRAY JOIN x_0\|Filter column: notEquals(y_1, 2_UInt8)"
+ grep -o "Filter column: and(notEquals(__table2.y, 2_UInt8), notEquals(__table1.x, 0_UInt8))\|ARRAY JOIN __table1.x\|Filter column: notEquals(__table2.y, 2_UInt8)"
$CLICKHOUSE_CLIENT -q "
select x, y from (
select range(number) as x, number + 1 as y from numbers(3)
@@ -166,7 +166,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
select distinct x, y from (select number % 2 as x, number % 3 as y from numbers(10))
) where y != 2
settings enable_optimize_predicate_expression=0" |
- grep -o "Distinct\|Filter column: notEquals(y_1, 2_UInt8)"
+ grep -o "Distinct\|Filter column: notEquals(__table1.y, 2_UInt8)"
$CLICKHOUSE_CLIENT -q "
select x, y from (
select distinct x, y from (select number % 2 as x, number % 3 as y from numbers(10))
@@ -186,7 +186,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 --convert_query_to_cnf=0 -q "
select number % 2 as x, number % 3 as y from numbers(6) order by y desc
) where x != 0 and y != 0
settings enable_optimize_predicate_expression = 0" |
- grep -o "Sorting\|Filter column: and(notEquals(x_0, 0_UInt8), notEquals(y_1, 0_UInt8))"
+ grep -o "Sorting\|Filter column: and(notEquals(__table1.x, 0_UInt8), notEquals(__table1.y, 0_UInt8))"
$CLICKHOUSE_CLIENT -q "
select x, y from (
select number % 2 as x, number % 3 as y from numbers(6) order by y desc
@@ -206,7 +206,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
select y, sum(x) from (select number as x, number % 4 as y from numbers(10)) group by y with totals
) where y != 2
settings enable_optimize_predicate_expression=0" |
- grep -o "TotalsHaving\|Aggregating\|Filter column: notEquals(y_0, 2_UInt8)"
+ grep -o "TotalsHaving\|Aggregating\|Filter column: notEquals(__table1.y, 2_UInt8)"
$CLICKHOUSE_CLIENT -q "
select * from (
select y, sum(x) from (select number as x, number % 4 as y from numbers(10)) group by y with totals
@@ -236,7 +236,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
select number as a, r.b from numbers(4) as l any left join (
select number + 2 as b from numbers(3)
) as r on a = r.b where a != 1 and b != 2
settings enable_optimize_predicate_expression = 0" |
- grep -o "Join\|Filter column: notEquals(number_0, 1_UInt8)"
+ grep -o "Join\|Filter column: notEquals(__table1.number, 1_UInt8)"
$CLICKHOUSE_CLIENT -q "
select number as a, r.b from numbers(4) as l any left join (
select number + 2 as b from numbers(3)
@@ -255,7 +255,7 @@ $CLICKHOUSE_CLIENT --allow_experimental_analyzer=1 -q "
select number as a, r.b from numbers(4) as l any inner join (
select number + 2 as b from numbers(3)
) as r on a = r.b where a != 1 and b != 2
settings enable_optimize_predicate_expression = 0" |
- grep -o "Join\|Filter column: notEquals(number_0, 1_UInt8)"
+ grep -o "Join\|Filter column: notEquals(__table1.number, 1_UInt8)"
$CLICKHOUSE_CLIENT -q "
select number as a, r.b from numbers(4) as l any inner join (
select number + 2 as b from numbers(3)
diff --git a/tests/queries/0_stateless/01655_plan_optimizations_optimize_read_in_window_order.reference b/tests/queries/0_stateless/01655_plan_optimizations_optimize_read_in_window_order.reference
index 8a33df9fad2..7c2753124b3 100644
--- a/tests/queries/0_stateless/01655_plan_optimizations_optimize_read_in_window_order.reference
+++ b/tests/queries/0_stateless/01655_plan_optimizations_optimize_read_in_window_order.reference
@@ -7,19 +7,19 @@ Partial sorting plan
Prefix sort description: n ASC
Result sort description: n ASC, x ASC
optimize_read_in_window_order=1, allow_experimental_analyzer=1
- Prefix sort description: n_0 ASC
- Result sort description: n_0 ASC, x_1 ASC
+ Prefix sort description: __table1.n ASC
+ Result sort description: __table1.n ASC, __table1.x ASC
No sorting plan
optimize_read_in_window_order=0
Sort description: n ASC, x ASC
optimize_read_in_window_order=0, allow_experimental_analyzer=1
- Sort description: n_0 ASC, x_1 ASC
+ Sort description: __table1.n ASC, __table1.x ASC
optimize_read_in_window_order=1
Prefix sort description: n ASC, x ASC
Result sort description: n ASC, x ASC
optimize_read_in_window_order=1, allow_experimental_analyzer=1
- Prefix sort description: n_0 ASC, x_1 ASC
- Result sort description: n_0 ASC, x_1 ASC
+ Prefix sort description: __table1.n ASC, __table1.x ASC
+ Result sort description: __table1.n ASC, __table1.x ASC
Complex ORDER BY
optimize_read_in_window_order=0
3 3 1
diff --git a/tests/queries/0_stateless/01823_explain_json.reference b/tests/queries/0_stateless/01823_explain_json.reference
index befbf82f4fb..23fb34c2192 100644
--- a/tests/queries/0_stateless/01823_explain_json.reference
+++ b/tests/queries/0_stateless/01823_explain_json.reference
@@ -37,59 +37,59 @@
"Node Type": "Aggregating",
"Header": [
{
- "Name": "number_0",
+ "Name": "__table1.number",
"Type": "UInt64"
},
{
- "Name": "quantile(0.2_Float64)(number_0)",
+ "Name": "quantile(0.2_Float64)(__table1.number)",
"Type": "Float64"
},
{
- "Name": "sumIf(number_0, greater(number_0, 0_UInt8))",
+ "Name": "sumIf(__table1.number, greater(__table1.number, 0_UInt8))",
"Type": "UInt64"
}
],
- "Keys": ["number_0"],
+ "Keys": ["__table1.number"],
"Aggregates": [
{
- "Name": "quantile(0.2_Float64)(number_0)",
+ "Name": "quantile(0.2_Float64)(__table1.number)",
"Function": {
"Name": "quantile",
"Parameters": ["0.2"],
"Argument Types": ["UInt64"],
"Result Type": "Float64"
},
- "Arguments": ["number_0"]
+ "Arguments": ["__table1.number"]
},
{
- "Name": "sumIf(number_0, greater(number_0, 0_UInt8))",
+ "Name": "sumIf(__table1.number, greater(__table1.number, 0_UInt8))",
"Function": {
"Name": "sumIf",
"Argument Types": ["UInt64", "UInt8"],
"Result Type": "UInt64"
},
- "Arguments": ["number_0", "greater(number_0, 0_UInt8)"]
+ "Arguments": ["__table1.number", "greater(__table1.number, 0_UInt8)"]
}
],
--------
"Node Type": "ArrayJoin",
"Left": false,
- "Columns": ["x_0", "y_1"],
+ "Columns": ["__table1.x", "__table1.y"],
--------
"Node Type": "Distinct",
- "Columns": ["intDiv(number_0, 2_UInt8)", "intDiv(number_0, 3_UInt8)"],
+ "Columns": ["intDiv(__table1.number, 2_UInt8)", "intDiv(__table1.number, 3_UInt8)"],
--
"Node Type": "Distinct",
- "Columns": ["intDiv(number_0, 2_UInt8)", "intDiv(number_0, 3_UInt8)"],
+ "Columns": ["intDiv(__table1.number, 2_UInt8)", "intDiv(__table1.number, 3_UInt8)"],
--------
"Sort Description": [
{
- "Column": "number_0",
+ "Column": "__table1.number",
"Ascending": false,
"With Fill": false
},
{
- "Column": "plus(number_0, 1_UInt8)",
+ "Column": "plus(__table1.number, 1_UInt8)",
"Ascending": true,
"With Fill": false
}
diff --git a/tests/queries/0_stateless/01825_type_json_10.reference b/tests/queries/0_stateless/01825_type_json_10.reference
index 53fe604fa51..4161fb59c93 100644
--- a/tests/queries/0_stateless/01825_type_json_10.reference
+++ b/tests/queries/0_stateless/01825_type_json_10.reference
@@ -1,4 +1,4 @@
-Tuple(a Tuple(b Int8, c Nested(d Int8, e Array(Int16), f Int8)))
+Tuple(\n    a Tuple(\n    b Int8,\n    c Nested(d Int8, e Array(Int16), f Int8)))
{"o":{"a":{"b":1,"c":[{"d":10,"e":[31],"f":0},{"d":20,"e":[63,127],"f":0}]}}}
{"o":{"a":{"b":2,"c":[]}}}
{"o":{"a":{"b":3,"c":[{"d":0,"e":[32],"f":20},{"d":0,"e":[64,128],"f":30}]}}}
diff --git a/tests/queries/0_stateless/01825_type_json_11.reference b/tests/queries/0_stateless/01825_type_json_11.reference
index 27569620cd7..0575743e019 100644
--- a/tests/queries/0_stateless/01825_type_json_11.reference
+++ b/tests/queries/0_stateless/01825_type_json_11.reference
@@ -1,4 +1,4 @@
-Tuple(id Int8, key_1 Nested(key_2 Int32, key_3 Nested(key_4 Nested(key_5 Int8), key_7 Int16)))
+Tuple(\n    id Int8,\n    key_1 Nested(key_2 Int32, key_3 Nested(key_4 Nested(key_5 Int8), key_7 Int16)))
{"obj":{"id":1,"key_1":[{"key_2":100,"key_3":[{"key_4":[{"key_5":-2}],"key_7":257}]},{"key_2":65536,"key_3":[]}]}}
{"obj":{"id":2,"key_1":[{"key_2":101,"key_3":[{"key_4":[{"key_5":-2}],"key_7":0}]},{"key_2":102,"key_3":[{"key_4":[],"key_7":257}]},{"key_2":65536,"key_3":[]}]}}
{"obj.key_1.key_3":[[{"key_4":[{"key_5":-2}],"key_7":257}],[]]}
diff --git a/tests/queries/0_stateless/01825_type_json_12.reference b/tests/queries/0_stateless/01825_type_json_12.reference
index 7f4f5bf190e..ff60ba33f94 100644
--- a/tests/queries/0_stateless/01825_type_json_12.reference
+++ b/tests/queries/0_stateless/01825_type_json_12.reference
@@ -1,3 +1,3 @@
-Tuple(id Int8, key_0 Nested(key_1 Nested(key_3 Nested(key_4 String, key_5 Float64, key_6 String, key_7 Float64))))
+Tuple(\n    id Int8,\n    key_0 Nested(key_1 Nested(key_3 Nested(key_4 String, key_5 Float64, key_6 String, key_7 Float64))))
{"obj":{"id":1,"key_0":[{"key_1":[{"key_3":[{"key_4":"1048576","key_5":0.0001048576,"key_6":"25.5","key_7":1025},{"key_4":"","key_5":0,"key_6":"","key_7":2}]}]},{"key_1":[]},{"key_1":[{"key_3":[{"key_4":"","key_5":-1,"key_6":"aqbjfiruu","key_7":-922337203685477600},{"key_4":"","key_5":0,"key_6":"","key_7":65537}]},{"key_3":[{"key_4":"ghdqyeiom","key_5":1048575,"key_6":"","key_7":21474836.48}]}]}]}}
[[['1048576','']],[],[['',''],['ghdqyeiom']]] [[[0.0001048576,0]],[],[[-1,0],[1048575]]] [[['25.5','']],[],[['aqbjfiruu',''],['']]] [[[1025,2]],[],[[-922337203685477600,65537],[21474836.48]]]
diff --git a/tests/queries/0_stateless/01825_type_json_13.reference b/tests/queries/0_stateless/01825_type_json_13.reference
index e420021f406..fa105f1a4c6 100644
--- a/tests/queries/0_stateless/01825_type_json_13.reference
+++ b/tests/queries/0_stateless/01825_type_json_13.reference
@@ -1,3 +1,3 @@
-Tuple(id Int8, key_1 Nested(key_2 Nested(key_3 Nested(key_4 Nested(key_5 Float64, key_6 Int64, key_7 Int32), key_8 Int32))))
+Tuple(\n    id Int8,\n    key_1 Nested(key_2 Nested(key_3 Nested(key_4 Nested(key_5 Float64, key_6 Int64, key_7 Int32), key_8 Int32))))
{"obj":{"id":1,"key_1":[{"key_2":[{"key_3":[{"key_4":[],"key_8":65537},{"key_4":[{"key_5":-0.02,"key_6":"0","key_7":0},{"key_5":0,"key_6":"0","key_7":1023},{"key_5":0,"key_6":"9223372036854775807","key_7":1}],"key_8":0},{"key_4":[{"key_5":0,"key_6":"0","key_7":65537}],"key_8":0}]}]}]}}
[[[65537,0,0]]] [[[[],[-0.02,0,0],[0]]]] [[[[],[0,0,9223372036854775807],[0]]]] [[[[],[0,1023,1],[65537]]]]
diff --git a/tests/queries/0_stateless/01825_type_json_15.reference b/tests/queries/0_stateless/01825_type_json_15.reference
index ab4b1b82877..4f13731d35a 100644
--- a/tests/queries/0_stateless/01825_type_json_15.reference
+++ b/tests/queries/0_stateless/01825_type_json_15.reference
@@ -1,3 +1,3 @@
-Tuple(id Int8, key_0 Nested(key_0 Float64, key_1 Tuple(key_2 Array(Int8), key_8 String), key_10 Float64))
+Tuple(\n    id Int8,\n    key_0 Nested(key_0 Float64, key_1 Tuple(key_2 Array(Int8), key_8 String), key_10 Float64))
{"obj":{"id":1,"key_0":[{"key_0":-1,"key_1":{"key_2":[1,2,3],"key_8":"sffjx"},"key_10":65535},{"key_0":922337203.685,"key_1":{"key_2":[],"key_8":""},"key_10":10.23}]}}
[[1,2,3],[]] ['sffjx',''] [65535,10.23] [-1,922337203.685]
diff --git a/tests/queries/0_stateless/01825_type_json_16.reference b/tests/queries/0_stateless/01825_type_json_16.reference
index f40f0d747d5..a8cc682f8e1 100644
--- a/tests/queries/0_stateless/01825_type_json_16.reference
+++ b/tests/queries/0_stateless/01825_type_json_16.reference
@@ -1,3 +1,3 @@
-Tuple(id Int8, key_0 Nested(key_1 Nested(key_2 Tuple(key_3 Nested(key_4 Int32, key_6 Int8, key_7 Int16), key_5 Nested(key_6 Int8, key_7 String)))))
+Tuple(\n    id Int8,\n    key_0 Nested(key_1 Nested(key_2 Tuple(key_3 Nested(key_4 Int32, key_6 Int8, key_7 Int16), key_5 Nested(key_6 Int8, key_7 String)))))
{"obj":{"id":1,"key_0":[{"key_1":[{"key_2":{"key_3":[{"key_4":255,"key_6":0,"key_7":0},{"key_4":65535,"key_6":0,"key_7":0},{"key_4":0,"key_6":3,"key_7":255}],"key_5":[{"key_6":1,"key_7":"nnpqx"},{"key_6":3,"key_7":"255"}]}}]}]}} [[[255,65535,0]]] [[[0,0,3]]] [[[0,0,255]]] [[[1,3]]] [[['nnpqx','255']]] diff --git a/tests/queries/0_stateless/01825_type_json_17.reference b/tests/queries/0_stateless/01825_type_json_17.reference index 0f97bfed5bc..c830cf41cf1 100644 --- a/tests/queries/0_stateless/01825_type_json_17.reference +++ b/tests/queries/0_stateless/01825_type_json_17.reference @@ -1,4 +1,4 @@ -Tuple(arr Nested(k1 Nested(k2 String, k3 String, k4 Int8), k5 Tuple(k6 String)), id Int8) +Tuple(\n arr Nested(k1 Nested(k2 String, k3 String, k4 Int8), k5 Tuple(k6 String)),\n id Int8) {"obj":{"arr":[{"k1":[{"k2":"aaa","k3":"bbb","k4":0},{"k2":"ccc","k3":"","k4":0}],"k5":{"k6":""}}],"id":1}} {"obj":{"arr":[{"k1":[{"k2":"","k3":"ddd","k4":10},{"k2":"","k3":"","k4":20}],"k5":{"k6":"foo"}}],"id":2}} [['bbb','']] [['aaa','ccc']] @@ -6,7 +6,7 @@ Tuple(arr Nested(k1 Nested(k2 String, k3 String, k4 Int8), k5 Tuple(k6 String)), 1 [[0,0]] [[10,20]] -Tuple(arr Nested(k1 Nested(k2 String, k3 Nested(k4 Int8))), id Int8) +Tuple(\n arr Nested(k1 Nested(k2 String, k3 Nested(k4 Int8))),\n id Int8) {"obj":{"arr":[{"k1":[{"k2":"aaa","k3":[]}]}],"id":1}} {"obj":{"arr":[{"k1":[{"k2":"bbb","k3":[{"k4":10}]},{"k2":"ccc","k3":[{"k4":20}]}]}],"id":2}} [['aaa']] [[[]]] @@ -14,7 +14,7 @@ Tuple(arr Nested(k1 Nested(k2 String, k3 Nested(k4 Int8))), id Int8) 1 [[[]]] [[[10],[20]]] -Tuple(arr Nested(k1 Nested(k2 String, k4 Nested(k5 Int8)), k3 String), id Int8) +Tuple(\n arr Nested(k1 Nested(k2 String, k4 Nested(k5 Int8)), k3 String),\n id Int8) {"obj":{"arr":[{"k1":[],"k3":"qqq"},{"k1":[],"k3":"www"}],"id":1}} {"obj":{"arr":[{"k1":[{"k2":"aaa","k4":[]}],"k3":"eee"}],"id":2}} {"obj":{"arr":[{"k1":[{"k2":"bbb","k4":[{"k5":10}]},{"k2":"ccc","k4":[{"k5":20}]}],"k3":"rrr"}],"id":3}} diff --git a/tests/queries/0_stateless/01825_type_json_18.reference b/tests/queries/0_stateless/01825_type_json_18.reference index d93f9bda63c..d61baf5eb6f 100644 --- a/tests/queries/0_stateless/01825_type_json_18.reference +++ b/tests/queries/0_stateless/01825_type_json_18.reference @@ -1,2 +1,2 @@ -1 (1) Tuple(k1 Int8) -1 ([1,2]) Tuple(k1 Array(Int8)) +1 (1) Tuple(\n k1 Int8) +1 ([1,2]) Tuple(\n k1 Array(Int8)) diff --git a/tests/queries/0_stateless/01825_type_json_2.reference b/tests/queries/0_stateless/01825_type_json_2.reference index 8524035a3a4..790d825a894 100644 --- a/tests/queries/0_stateless/01825_type_json_2.reference +++ b/tests/queries/0_stateless/01825_type_json_2.reference @@ -1,24 +1,24 @@ -1 (1,2,0) Tuple(k1 Int8, k2 Int8, k3 Int8) -2 (0,3,4) Tuple(k1 Int8, k2 Int8, k3 Int8) +1 (1,2,0) Tuple(\n k1 Int8,\n k2 Int8,\n k3 Int8) +2 (0,3,4) Tuple(\n k1 Int8,\n k2 Int8,\n k3 Int8) 1 1 2 0 2 0 3 4 -1 (1,2,'0') Tuple(k1 Int8, k2 Int8, k3 String) -2 (0,3,'4') Tuple(k1 Int8, k2 Int8, k3 String) -3 (0,0,'10') Tuple(k1 Int8, k2 Int8, k3 String) -4 (0,5,'str') Tuple(k1 Int8, k2 Int8, k3 String) +1 (1,2,'0') Tuple(\n k1 Int8,\n k2 Int8,\n k3 String) +2 (0,3,'4') Tuple(\n k1 Int8,\n k2 Int8,\n k3 String) +3 (0,0,'10') Tuple(\n k1 Int8,\n k2 Int8,\n k3 String) +4 (0,5,'str') Tuple(\n k1 Int8,\n k2 Int8,\n k3 String) 1 1 2 0 2 0 3 4 3 0 0 10 4 0 5 str ============ -1 ([1,2,3.3]) Tuple(k1 Array(Float64)) +1 ([1,2,3.3]) Tuple(\n k1 Array(Float64)) 1 [1,2,3.3] -1 (['1','2','3.3']) Tuple(k1 Array(String)) -2 (['a','4','b']) Tuple(k1 Array(String)) +1 
(['1','2','3.3']) Tuple(\n k1 Array(String)) +2 (['a','4','b']) Tuple(\n k1 Array(String)) 1 ['1','2','3.3'] 2 ['a','4','b'] ============ -1 ([(11,0,0),(0,22,0)]) Tuple(k1 Nested(k2 Int8, k3 Int8, k4 Int8)) -2 ([(0,33,0),(0,0,44),(0,55,66)]) Tuple(k1 Nested(k2 Int8, k3 Int8, k4 Int8)) +1 ([(11,0,0),(0,22,0)]) Tuple(\n k1 Nested(k2 Int8, k3 Int8, k4 Int8)) +2 ([(0,33,0),(0,0,44),(0,55,66)]) Tuple(\n k1 Nested(k2 Int8, k3 Int8, k4 Int8)) 1 [11,0] [0,22] [0,0] 2 [0,0,0] [33,0,55] [0,44,66] diff --git a/tests/queries/0_stateless/01825_type_json_3.reference.j2 b/tests/queries/0_stateless/01825_type_json_3.reference.j2 index 23f38b74fd1..8646cf48872 100644 --- a/tests/queries/0_stateless/01825_type_json_3.reference.j2 +++ b/tests/queries/0_stateless/01825_type_json_3.reference.j2 @@ -1,17 +1,17 @@ {% for engine in ["ReplicatedMergeTree('/clickhouse/tables/{database}/test_01825_3/t_json_3', 'r1') ORDER BY tuple()", "Memory"] -%} -1 ('',0) Tuple(k1 String, k2 Int8) -2 ('v1',2) Tuple(k1 String, k2 Int8) +1 ('',0) Tuple(\n k1 String,\n k2 Int8) +2 ('v1',2) Tuple(\n k1 String,\n k2 Int8) 1 0 2 v1 2 ======== -1 ([]) Tuple(k1 Nested(k2 String, k3 String)) -2 ([('v1','v3'),('v4','')]) Tuple(k1 Nested(k2 String, k3 String)) +1 ([]) Tuple(\n k1 Nested(k2 String, k3 String)) +2 ([('v1','v3'),('v4','')]) Tuple(\n k1 Nested(k2 String, k3 String)) 1 [] [] 2 ['v1','v4'] ['v3',''] -1 ([]) Tuple(k1 Nested(k2 String, k3 String)) -2 ([('v1','v3'),('v4','')]) Tuple(k1 Nested(k2 String, k3 String)) -3 ([]) Tuple(k1 Nested(k2 String, k3 String)) -4 ([]) Tuple(k1 Nested(k2 String, k3 String)) +1 ([]) Tuple(\n k1 Nested(k2 String, k3 String)) +2 ([('v1','v3'),('v4','')]) Tuple(\n k1 Nested(k2 String, k3 String)) +3 ([]) Tuple(\n k1 Nested(k2 String, k3 String)) +4 ([]) Tuple(\n k1 Nested(k2 String, k3 String)) 1 [] [] 2 ['v1','v4'] ['v3',''] 3 [] [] @@ -26,9 +26,9 @@ data Tuple(k1 Nested(k2 String, k3 String)) 3 [] [] 4 [] [] ======== -1 ((1,'foo'),[]) Tuple(k1 Tuple(k2 Int8, k3 String), k4 Array(Int8)) -2 ((0,''),[1,2,3]) Tuple(k1 Tuple(k2 Int8, k3 String), k4 Array(Int8)) -3 ((10,''),[]) Tuple(k1 Tuple(k2 Int8, k3 String), k4 Array(Int8)) +1 ((1,'foo'),[]) Tuple(\n k1 Tuple(\n k2 Int8,\n k3 String),\n k4 Array(Int8)) +2 ((0,''),[1,2,3]) Tuple(\n k1 Tuple(\n k2 Int8,\n k3 String),\n k4 Array(Int8)) +3 ((10,''),[]) Tuple(\n k1 Tuple(\n k2 Int8,\n k3 String),\n k4 Array(Int8)) 1 1 foo [] 2 0 [1,2,3] 3 10 [] diff --git a/tests/queries/0_stateless/01825_type_json_4.reference b/tests/queries/0_stateless/01825_type_json_4.reference index 1b23bf2213e..58b1d067a2b 100644 --- a/tests/queries/0_stateless/01825_type_json_4.reference +++ b/tests/queries/0_stateless/01825_type_json_4.reference @@ -1,5 +1,5 @@ Code: 645 Code: 15 Code: 53 -1 ('v1') Tuple(k1 String) +1 ('v1') Tuple(\n k1 String) 1 v1 diff --git a/tests/queries/0_stateless/01825_type_json_5.reference b/tests/queries/0_stateless/01825_type_json_5.reference index 4ac0aa26ffd..3c21f2840a2 100644 --- a/tests/queries/0_stateless/01825_type_json_5.reference +++ b/tests/queries/0_stateless/01825_type_json_5.reference @@ -2,4 +2,4 @@ {"s":{"a.b":1,"a.c":2}} 1 [22,33] 2 qqq [44] -Tuple(k1 Int8, k2 Tuple(k3 String, k4 Array(Int8))) +Tuple(\n k1 Int8,\n k2 Tuple(\n k3 String,\n k4 Array(Int8))) diff --git a/tests/queries/0_stateless/01825_type_json_6.reference b/tests/queries/0_stateless/01825_type_json_6.reference index 7fcd2a40826..15e1ab3ac80 100644 --- a/tests/queries/0_stateless/01825_type_json_6.reference +++ b/tests/queries/0_stateless/01825_type_json_6.reference @@ -1,3 
+1,3 @@ -Tuple(key String, out Nested(outputs Nested(index Int32, n Int8), type Int8, value Int8)) +Tuple(\n key String,\n out Nested(outputs Nested(index Int32, n Int8), type Int8, value Int8)) v1 [0,0] [1,2] [[],[1960131]] [[],[0]] v2 [1,1] [4,3] [[1881212],[]] [[1],[]] diff --git a/tests/queries/0_stateless/01825_type_json_7.reference b/tests/queries/0_stateless/01825_type_json_7.reference index 263f1688a91..cf6b32d73e8 100644 --- a/tests/queries/0_stateless/01825_type_json_7.reference +++ b/tests/queries/0_stateless/01825_type_json_7.reference @@ -1,4 +1,4 @@ -Tuple(categories Array(String), key String) +Tuple(\n categories Array(String),\n key String) v1 [] v2 ['foo','bar'] v3 [] diff --git a/tests/queries/0_stateless/01825_type_json_8.reference b/tests/queries/0_stateless/01825_type_json_8.reference index b64e6d0c9b9..27770317862 100644 --- a/tests/queries/0_stateless/01825_type_json_8.reference +++ b/tests/queries/0_stateless/01825_type_json_8.reference @@ -1,2 +1,2 @@ -([[(1,2),(3,4)],[(5,6)]]) Tuple(k1 Array(Nested(k2 Int8, k3 Int8))) -([([1,3,4,5],[6,7]),([8],[9,10,11])]) Tuple(k1 Nested(k2 Array(Int8), k3 Array(Int8))) +([[(1,2),(3,4)],[(5,6)]]) Tuple(\n k1 Array(Nested(k2 Int8, k3 Int8))) +([([1,3,4,5],[6,7]),([8],[9,10,11])]) Tuple(\n k1 Nested(k2 Array(Int8), k3 Array(Int8))) diff --git a/tests/queries/0_stateless/01825_type_json_9.reference b/tests/queries/0_stateless/01825_type_json_9.reference index a426b09a100..f58a64eda5a 100644 --- a/tests/queries/0_stateless/01825_type_json_9.reference +++ b/tests/queries/0_stateless/01825_type_json_9.reference @@ -1 +1 @@ -Tuple(foo Int8, k1 Int8, k2 Int8) +Tuple(\n foo Int8,\n k1 Int8,\n k2 Int8) diff --git a/tests/queries/0_stateless/01825_type_json_bools.reference b/tests/queries/0_stateless/01825_type_json_bools.reference index bed8c2ad2c3..6b4d2382dc2 100644 --- a/tests/queries/0_stateless/01825_type_json_bools.reference +++ b/tests/queries/0_stateless/01825_type_json_bools.reference @@ -1 +1 @@ -(1,0) Tuple(k1 UInt8, k2 UInt8) +(1,0) Tuple(\n k1 UInt8,\n k2 UInt8) diff --git a/tests/queries/0_stateless/01825_type_json_btc.reference b/tests/queries/0_stateless/01825_type_json_btc.reference index cee3b31a798..e85c0ef45bd 100644 --- a/tests/queries/0_stateless/01825_type_json_btc.reference +++ b/tests/queries/0_stateless/01825_type_json_btc.reference @@ -1,5 +1,5 @@ 100 -data Tuple(double_spend UInt8, fee Int32, hash String, inputs Nested(index Int8, prev_out Tuple(addr String, n Int16, script String, spending_outpoints Nested(n Int8, tx_index Int64), spent UInt8, tx_index Int64, type Int8, value Int64), script String, sequence Int64, witness String), lock_time Int32, out Nested(addr String, n Int8, script String, spending_outpoints Nested(n Int8, tx_index Int64), spent UInt8, tx_index Int64, type Int8, value Int64), rbf UInt8, relayed_by String, size Int16, time Int32, tx_index Int64, ver Int8, vin_sz Int8, vout_sz Int8, weight Int16) +data Tuple(\n double_spend UInt8,\n fee Int32,\n hash String,\n inputs Nested(index Int8, prev_out Tuple(addr String, n Int16, script String, spending_outpoints Nested(n Int8, tx_index Int64), spent UInt8, tx_index Int64, type Int8, value Int64), script String, sequence Int64, witness String),\n lock_time Int32,\n out Nested(addr String, n Int8, script String, spending_outpoints Nested(n Int8, tx_index Int64), spent UInt8, tx_index Int64, type Int8, value Int64),\n rbf UInt8,\n relayed_by String,\n size Int16,\n time Int32,\n tx_index Int64,\n ver Int8,\n vin_sz Int8,\n vout_sz Int8,\n weight Int16) 
8174.56 2680 2.32
1
[[],[(0,359661801933760)]]
diff --git a/tests/queries/0_stateless/01825_type_json_describe.reference b/tests/queries/0_stateless/01825_type_json_describe.reference
index 629b60cb629..98b2bf8be83 100644
--- a/tests/queries/0_stateless/01825_type_json_describe.reference
+++ b/tests/queries/0_stateless/01825_type_json_describe.reference
@@ -1,3 +1,3 @@
data Object(\'json\')
-data Tuple(k1 Int8)
-data Tuple(k1 String, k2 Array(Int8))
+data Tuple(\n    k1 Int8)
+data Tuple(\n    k1 String,\n    k2 Array(Int8))
diff --git a/tests/queries/0_stateless/01825_type_json_distributed.reference b/tests/queries/0_stateless/01825_type_json_distributed.reference
index 9ae85ac888c..9735fec2fe5 100644
--- a/tests/queries/0_stateless/01825_type_json_distributed.reference
+++ b/tests/queries/0_stateless/01825_type_json_distributed.reference
@@ -1,4 +1,4 @@
-(2,('qqq',[44,55])) Tuple(k1 Int8, k2 Tuple(k3 String, k4 Array(Int8)))
-(2,('qqq',[44,55])) Tuple(k1 Int8, k2 Tuple(k3 String, k4 Array(Int8)))
+(2,('qqq',[44,55])) Tuple(\n    k1 Int8,\n    k2 Tuple(\n    k3 String,\n    k4 Array(Int8)))
+(2,('qqq',[44,55])) Tuple(\n    k1 Int8,\n    k2 Tuple(\n    k3 String,\n    k4 Array(Int8)))
2 qqq [44,55]
2 qqq [44,55]
diff --git a/tests/queries/0_stateless/01825_type_json_field.reference b/tests/queries/0_stateless/01825_type_json_field.reference
index b5637b1fbb7..8afd0110b63 100644
--- a/tests/queries/0_stateless/01825_type_json_field.reference
+++ b/tests/queries/0_stateless/01825_type_json_field.reference
@@ -1,12 +1,12 @@
1 10 a
-Tuple(a UInt8, s String)
+Tuple(\n    a UInt8,\n    s String)
1 10 a 0
2 sss b 300
3 20 c 0
-Tuple(a String, b UInt16, s String)
+Tuple(\n    a String,\n    b UInt16,\n    s String)
1 10 a 0
2 sss b 300
3 20 c 0
4 30 400
5 0 qqq 0 foo
-Tuple(a String, b UInt16, s String, t String)
+Tuple(\n    a String,\n    b UInt16,\n    s String,\n    t String)
diff --git a/tests/queries/0_stateless/01825_type_json_from_map.reference b/tests/queries/0_stateless/01825_type_json_from_map.reference
index dbcf67faef3..90680ee383b 100644
--- a/tests/queries/0_stateless/01825_type_json_from_map.reference
+++ b/tests/queries/0_stateless/01825_type_json_from_map.reference
@@ -1,4 +1,4 @@
800000 2000000 1400000 900000
800000 2000000 1400000 900000
-Tuple(col0 UInt64, col1 UInt64, col2 UInt64, col3 UInt64, col4 UInt64, col5 UInt64, col6 UInt64, col7 UInt64, col8 UInt64)
+Tuple(\n    col0 UInt64,\n    col1 UInt64,\n    col2 UInt64,\n    col3 UInt64,\n    col4 UInt64,\n    col5 UInt64,\n    col6 UInt64,\n    col7 UInt64,\n    col8 UInt64)
1600000 4000000 2800000 1800000
diff --git a/tests/queries/0_stateless/01825_type_json_in_array.reference b/tests/queries/0_stateless/01825_type_json_in_array.reference
index c36a22e6951..82207f53a21 100644
--- a/tests/queries/0_stateless/01825_type_json_in_array.reference
+++ b/tests/queries/0_stateless/01825_type_json_in_array.reference
@@ -5,7 +5,7 @@
{"arr":{"k1":1,"k2":{"k3":2,"k4":3,"k5":""}}}
{"arr":{"k1":2,"k2":{"k3":0,"k4":0,"k5":"foo"}}}
{"arr":{"k1":3,"k2":{"k3":4,"k4":5,"k5":""}}}
-Array(Tuple(k1 Int8, k2 Tuple(k3 Int8, k4 Int8, k5 String)))
+Array(Tuple(\n    k1 Int8,\n    k2 Tuple(\n    k3 Int8,\n    k4 Int8,\n    k5 String)))
{"id":1,"arr":[{"k1":[{"k2":"aaa","k3":"bbb","k4":0},{"k2":"ccc","k3":"","k4":0}],"k5":{"k6":""}}]}
{"id":2,"arr":[{"k1":[{"k2":"","k3":"ddd","k4":10},{"k2":"","k3":"","k4":20}],"k5":{"k6":"foo"}}]}
1 [['aaa','ccc']] [['bbb','']] [[0,0]] ['']
@@ -14,7 +14,7 @@ Array(Tuple(k1 Int8, k2 Tuple(k3 Int8, k4 Int8, k5 String)))
{"k1":{"k2":"","k3":"ddd","k4":10}}
{"k1":{"k2":"aaa","k3":"bbb","k4":0}}
{"k1":{"k2":"ccc","k3":"","k4":0}} -Tuple(k2 String, k3 String, k4 Int8) +Tuple(\n k2 String,\n k3 String,\n k4 Int8) {"arr":[{"x":1}]} {"arr":{"x":{"y":1},"t":{"y":2}}} {"arr":[1,{"y":1}]} diff --git a/tests/queries/0_stateless/01825_type_json_in_other_types.reference b/tests/queries/0_stateless/01825_type_json_in_other_types.reference index b94885a65ab..fa8af729cc7 100644 --- a/tests/queries/0_stateless/01825_type_json_in_other_types.reference +++ b/tests/queries/0_stateless/01825_type_json_in_other_types.reference @@ -1,4 +1,4 @@ -Tuple(String, Map(String, Array(Tuple(k1 Nested(k2 Int8, k3 Int8, k5 String), k4 String))), Tuple(k1 String, k2 Tuple(k3 String, k4 String))) +Tuple(String, Map(String, Array(Tuple(\n k1 Nested(k2 Int8, k3 Int8, k5 String),\n k4 String))), Tuple(\n k1 String,\n k2 Tuple(\n k3 String,\n k4 String))) ============= {"id":1,"data":["foo",{"aa":[{"k1":[{"k2":1,"k3":2,"k5":""},{"k2":0,"k3":3,"k5":""}],"k4":""},{"k1":[{"k2":4,"k3":0,"k5":""},{"k2":0,"k3":5,"k5":""},{"k2":6,"k3":0,"k5":""}],"k4":"qqq"}],"bb":[{"k1":[],"k4":"www"},{"k1":[{"k2":7,"k3":8,"k5":""},{"k2":9,"k3":10,"k5":""},{"k2":11,"k3":12,"k5":""}],"k4":""}]},{"k1":"aa","k2":{"k3":"bb","k4":"c"}}]} {"id":2,"data":["bar",{"aa":[{"k1":[{"k2":13,"k3":14,"k5":""},{"k2":15,"k3":16,"k5":""}],"k4":"www"}]},{"k1":"","k2":{"k3":"","k4":""}}]} diff --git a/tests/queries/0_stateless/01825_type_json_insert_select.reference b/tests/queries/0_stateless/01825_type_json_insert_select.reference index 6778da508f2..cb46a9c607e 100644 --- a/tests/queries/0_stateless/01825_type_json_insert_select.reference +++ b/tests/queries/0_stateless/01825_type_json_insert_select.reference @@ -1,10 +1,10 @@ -Tuple(k1 Int8, k2 String) +Tuple(\n k1 Int8,\n k2 String) 1 (1,'foo') -Tuple(k1 Int8, k2 String, k3 String) +Tuple(\n k1 Int8,\n k2 String,\n k3 String) 1 (1,'foo','') 2 (2,'bar','') 3 (3,'','aaa') -Tuple(arr Nested(k11 Int8, k22 String, k33 Int8), k1 Int8, k2 String, k3 String) +Tuple(\n arr Nested(k11 Int8, k22 String, k33 Int8),\n k1 Int8,\n k2 String,\n k3 String) 1 ([],1,'foo','') 2 ([],2,'bar','') 3 ([],3,'','aaa') @@ -12,7 +12,7 @@ Tuple(arr Nested(k11 Int8, k22 String, k33 Int8), k1 Int8, k2 String, k3 String) 5 ([(0,'str1',0)],0,'','') {"data":{"k1":1,"k10":[{"a":"1","b":"2","c":{"k11":""}},{"a":"2","b":"3","c":{"k11":""}}]}} {"data":{"k1":2,"k10":[{"a":"1","b":"2","c":{"k11":"haha"}}]}} -Tuple(k1 Int8, k10 Nested(a String, b String, c Tuple(k11 String))) +Tuple(\n k1 Int8,\n k10 Nested(a String, b String, c Tuple(k11 String))) {"data":{"k1":1,"k10":[{"a":"1","b":"2","c":{"k11":""}},{"a":"2","b":"3","c":{"k11":""}}]}} {"data":{"k1":2,"k10":[{"a":"1","b":"2","c":{"k11":"haha"}}]}} -Tuple(k1 Int8, k10 Nested(a String, b String, c Tuple(k11 String))) +Tuple(\n k1 Int8,\n k10 Nested(a String, b String, c Tuple(k11 String))) diff --git a/tests/queries/0_stateless/01825_type_json_missed_values.reference b/tests/queries/0_stateless/01825_type_json_missed_values.reference index b480493995b..2a4b3a6f671 100644 --- a/tests/queries/0_stateless/01825_type_json_missed_values.reference +++ b/tests/queries/0_stateless/01825_type_json_missed_values.reference @@ -1,2 +1,2 @@ -Tuple(foo Int8, k1 Int8, k2 Int8) +Tuple(\n foo Int8,\n k1 Int8,\n k2 Int8) 1 diff --git a/tests/queries/0_stateless/01825_type_json_multiple_files.reference b/tests/queries/0_stateless/01825_type_json_multiple_files.reference index b887abc8590..6dcdb00e139 100644 --- a/tests/queries/0_stateless/01825_type_json_multiple_files.reference +++ 
b/tests/queries/0_stateless/01825_type_json_multiple_files.reference @@ -4,11 +4,11 @@ {"data":{"k0":0,"k1":0,"k2":0,"k3":100,"k4":0,"k5":0}} {"data":{"k0":0,"k1":0,"k2":0,"k3":0,"k4":100,"k5":0}} {"data":{"k0":0,"k1":0,"k2":0,"k3":0,"k4":0,"k5":100}} -Tuple(k0 Int8, k1 Int8, k2 Int8, k3 Int8, k4 Int8, k5 Int8) +Tuple(\n k0 Int8,\n k1 Int8,\n k2 Int8,\n k3 Int8,\n k4 Int8,\n k5 Int8) {"data":{"k0":100,"k1":0,"k2":0}} {"data":{"k0":0,"k1":100,"k2":0}} {"data":{"k0":0,"k1":0,"k2":100}} -Tuple(k0 Int8, k1 Int8, k2 Int8) +Tuple(\n k0 Int8,\n k1 Int8,\n k2 Int8) {"data":{"k1":100,"k3":0}} {"data":{"k1":0,"k3":100}} -Tuple(k1 Int8, k3 Int8) +Tuple(\n k1 Int8,\n k3 Int8) diff --git a/tests/queries/0_stateless/01825_type_json_nbagames.reference b/tests/queries/0_stateless/01825_type_json_nbagames.reference index 5aa63dceb86..70df8f967f3 100644 --- a/tests/queries/0_stateless/01825_type_json_nbagames.reference +++ b/tests/queries/0_stateless/01825_type_json_nbagames.reference @@ -1,5 +1,5 @@ 1000 -data Tuple(_id Tuple(`$oid` String), date Tuple(`$date` String), teams Nested(abbreviation String, city String, home UInt8, name String, players Nested(ast Int8, blk Int8, drb Int8, fg Int8, fg3 Int8, fg3_pct String, fg3a Int8, fg_pct String, fga Int8, ft Int8, ft_pct String, fta Int8, mp String, orb Int8, pf Int8, player String, pts Int8, stl Int8, tov Int8, trb Int8), results Tuple(ast Int8, blk Int8, drb Int8, fg Int8, fg3 Int8, fg3_pct String, fg3a Int8, fg_pct String, fga Int8, ft Int8, ft_pct String, fta Int8, mp Int16, orb Int8, pf Int8, pts Int16, stl Int8, tov Int8, trb Int8), score Int16, won Int8)) +data Tuple(\n _id Tuple(\n `$oid` String),\n date Tuple(\n `$date` String),\n teams Nested(abbreviation String, city String, home UInt8, name String, players Nested(ast Int8, blk Int8, drb Int8, fg Int8, fg3 Int8, fg3_pct String, fg3a Int8, fg_pct String, fga Int8, ft Int8, ft_pct String, fta Int8, mp String, orb Int8, pf Int8, player String, pts Int8, stl Int8, tov Int8, trb Int8), results Tuple(ast Int8, blk Int8, drb Int8, fg Int8, fg3 Int8, fg3_pct String, fg3a Int8, fg_pct String, fga Int8, ft Int8, ft_pct String, fta Int8, mp Int16, orb Int8, pf Int8, pts Int16, stl Int8, tov Int8, trb Int8), score Int16, won Int8)) Boston Celtics 70 Los Angeles Lakers 64 Milwaukee Bucks 61 diff --git a/tests/queries/0_stateless/01825_type_json_nullable.reference b/tests/queries/0_stateless/01825_type_json_nullable.reference index 587fb1b1bc9..597ede47615 100644 --- a/tests/queries/0_stateless/01825_type_json_nullable.reference +++ b/tests/queries/0_stateless/01825_type_json_nullable.reference @@ -1,17 +1,17 @@ -1 (1,2,NULL) Tuple(k1 Nullable(Int8), k2 Nullable(Int8), k3 Nullable(Int8)) -2 (NULL,3,4) Tuple(k1 Nullable(Int8), k2 Nullable(Int8), k3 Nullable(Int8)) +1 (1,2,NULL) Tuple(\n k1 Nullable(Int8),\n k2 Nullable(Int8),\n k3 Nullable(Int8)) +2 (NULL,3,4) Tuple(\n k1 Nullable(Int8),\n k2 Nullable(Int8),\n k3 Nullable(Int8)) 1 1 2 \N 2 \N 3 4 -1 (1,2,NULL) Tuple(k1 Nullable(Int8), k2 Nullable(Int8), k3 Nullable(String)) -2 (NULL,3,'4') Tuple(k1 Nullable(Int8), k2 Nullable(Int8), k3 Nullable(String)) -3 (NULL,NULL,'10') Tuple(k1 Nullable(Int8), k2 Nullable(Int8), k3 Nullable(String)) -4 (NULL,5,'str') Tuple(k1 Nullable(Int8), k2 Nullable(Int8), k3 Nullable(String)) +1 (1,2,NULL) Tuple(\n k1 Nullable(Int8),\n k2 Nullable(Int8),\n k3 Nullable(String)) +2 (NULL,3,'4') Tuple(\n k1 Nullable(Int8),\n k2 Nullable(Int8),\n k3 Nullable(String)) +3 (NULL,NULL,'10') Tuple(\n k1 Nullable(Int8),\n k2 Nullable(Int8),\n k3 
Nullable(String)) +4 (NULL,5,'str') Tuple(\n k1 Nullable(Int8),\n k2 Nullable(Int8),\n k3 Nullable(String)) 1 1 2 \N 2 \N 3 4 3 \N \N 10 4 \N 5 str ============ -1 ([(11,NULL,NULL),(NULL,22,NULL)]) Tuple(k1 Nested(k2 Nullable(Int8), k3 Nullable(Int8), k4 Nullable(Int8))) -2 ([(NULL,33,NULL),(NULL,NULL,44),(NULL,55,66)]) Tuple(k1 Nested(k2 Nullable(Int8), k3 Nullable(Int8), k4 Nullable(Int8))) +1 ([(11,NULL,NULL),(NULL,22,NULL)]) Tuple(\n k1 Nested(k2 Nullable(Int8), k3 Nullable(Int8), k4 Nullable(Int8))) +2 ([(NULL,33,NULL),(NULL,NULL,44),(NULL,55,66)]) Tuple(\n k1 Nested(k2 Nullable(Int8), k3 Nullable(Int8), k4 Nullable(Int8))) 1 [11,NULL] [NULL,22] [NULL,NULL] 2 [NULL,NULL,NULL] [33,NULL,55] [NULL,44,66] diff --git a/tests/queries/0_stateless/01825_type_json_parallel_insert.reference b/tests/queries/0_stateless/01825_type_json_parallel_insert.reference index 158d61d46f7..e93e0aeb956 100644 --- a/tests/queries/0_stateless/01825_type_json_parallel_insert.reference +++ b/tests/queries/0_stateless/01825_type_json_parallel_insert.reference @@ -1 +1 @@ -Tuple(k1 Int8, k2 String) 500000 +Tuple(\n k1 Int8,\n k2 String) 500000 diff --git a/tests/queries/0_stateless/01825_type_json_schema_inference.reference b/tests/queries/0_stateless/01825_type_json_schema_inference.reference index a1dd269f9b4..72e3b58b8a8 100644 --- a/tests/queries/0_stateless/01825_type_json_schema_inference.reference +++ b/tests/queries/0_stateless/01825_type_json_schema_inference.reference @@ -1,5 +1,5 @@ {"id":"1","obj":{"k1":1,"k2":{"k3":"2","k4":[{"k5":3,"k6":null},{"k5":4,"k6":null}]},"some":null},"s":"foo"} {"id":"2","obj":{"k1":null,"k2":{"k3":"str","k4":[{"k5":null,"k6":55}]},"some":42},"s":"bar"} -Tuple(k1 Nullable(Int8), k2 Tuple(k3 Nullable(String), k4 Nested(k5 Nullable(Int8), k6 Nullable(Int8))), some Nullable(Int8)) +Tuple(\n k1 Nullable(Int8),\n k2 Tuple(\n k3 Nullable(String),\n k4 Nested(k5 Nullable(Int8), k6 Nullable(Int8))),\n some Nullable(Int8)) {"id":"1","obj":"aaa","s":"foo"} {"id":"2","obj":"bbb","s":"bar"} diff --git a/tests/queries/0_stateless/01881_join_on_conditions_hash.sql.j2 b/tests/queries/0_stateless/01881_join_on_conditions_hash.sql.j2 index fafefd72cb8..bd20d34b684 100644 --- a/tests/queries/0_stateless/01881_join_on_conditions_hash.sql.j2 +++ b/tests/queries/0_stateless/01881_join_on_conditions_hash.sql.j2 @@ -30,7 +30,7 @@ SELECT t1.key, t1.key2 FROM t1 INNER ALL JOIN t2 ON t1.id == t2.id AND t2.key == SELECT '--'; SELECT t1.key FROM t1 INNER ANY JOIN t2 ON t1.id == t2.id AND t2.key == t2.key2 AND t1.key == t1.key2; -SELECT t1.key FROM t1 INNER ANY JOIN t2 ON t1.id == t2.id AND t2.key == t2.key2 AND t1.key == t1.key2 AND 0; -- { serverError INVALID_JOIN_ON_EXPRESSION } +SELECT t1.key FROM t1 INNER ANY JOIN t2 ON t1.id == t2.id AND t2.key == t2.key2 AND t1.key == t1.key2 AND 0; -- { serverError INVALID_JOIN_ON_EXPRESSION,NOT_FOUND_COLUMN_IN_BLOCK } SELECT '--'; SELECT '333' = t1.key FROM t1 INNER ANY JOIN t2 ON t1.id == t2.id AND t2.key == t2.key2 AND t1.key == t1.key2 AND t2.id > 2; diff --git a/tests/queries/0_stateless/02026_describe_include_subcolumns.reference b/tests/queries/0_stateless/02026_describe_include_subcolumns.reference index ba792ea9f74..ac114a03837 100644 --- a/tests/queries/0_stateless/02026_describe_include_subcolumns.reference +++ b/tests/queries/0_stateless/02026_describe_include_subcolumns.reference @@ -1,23 +1,33 @@ 
-┌─name─┬─type────────────────────────────────────────────────┬─default_type─┬─default_expression─┬─comment─────────────────┬─codec_expression─┬─ttl_expression───────┐ -│ d │ Date │ │ │ │ │ │ -│ n │ Nullable(String) │ │ │ It is a nullable column │ │ │ -│ arr1 │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ -│ arr2 │ Array(Array(String)) │ │ │ │ │ d + toIntervalDay(1) │ -│ t │ Tuple(s String, a Array(Tuple(a UInt32, b UInt32))) │ │ │ │ ZSTD(1) │ │ -└──────┴─────────────────────────────────────────────────────┴──────────────┴────────────────────┴─────────────────────────┴──────────────────┴──────────────────────┘ -┌─name───────┬─type────────────────────────────────────────────────┬─default_type─┬─default_expression─┬─comment─────────────────┬─codec_expression─┬─ttl_expression───────┬─is_subcolumn─┐ -│ d │ Date │ │ │ │ │ │ 0 │ -│ n │ Nullable(String) │ │ │ It is a nullable column │ │ │ 0 │ -│ arr1 │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ 0 │ -│ arr2 │ Array(Array(String)) │ │ │ │ │ d + toIntervalDay(1) │ 0 │ -│ t │ Tuple(s String, a Array(Tuple(a UInt32, b UInt32))) │ │ │ │ ZSTD(1) │ │ 0 │ -│ n.null │ UInt8 │ │ │ It is a nullable column │ │ │ 1 │ -│ arr1.size0 │ UInt64 │ │ │ │ │ │ 1 │ -│ arr2.size0 │ UInt64 │ │ │ │ │ d + toIntervalDay(1) │ 1 │ -│ arr2.size1 │ Array(UInt64) │ │ │ │ │ d + toIntervalDay(1) │ 1 │ -│ t.s │ String │ │ │ │ ZSTD(1) │ │ 1 │ -│ t.a │ Array(Tuple(a UInt32, b UInt32)) │ │ │ │ │ │ 1 │ -│ t.a.size0 │ UInt64 │ │ │ │ │ │ 1 │ -│ t.a.a │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ 1 │ -│ t.a.b │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ 1 │ -└────────────┴─────────────────────────────────────────────────────┴──────────────┴────────────────────┴─────────────────────────┴──────────────────┴──────────────────────┴──────────────┘ +┌─name─┬─type──────────────────────────────────────────────────────────────────────┬─default_type─┬─default_expression─┬─comment─────────────────┬─codec_expression─┬─ttl_expression───────┐ +│ d │ Date │ │ │ │ │ │ +│ n │ Nullable(String) │ │ │ It is a nullable column │ │ │ +│ arr1 │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ +│ arr2 │ Array(Array(String)) │ │ │ │ │ d + toIntervalDay(1) │ +│ t │ Tuple( + s String, + a Array(Tuple( + a UInt32, + b UInt32))) │ │ │ │ ZSTD(1) │ │ +└──────┴───────────────────────────────────────────────────────────────────────────┴──────────────┴────────────────────┴─────────────────────────┴──────────────────┴──────────────────────┘ +┌─name───────┬─type──────────────────────────────────────────────────────────────────────┬─default_type─┬─default_expression─┬─comment─────────────────┬─codec_expression─┬─ttl_expression───────┬─is_subcolumn─┐ +│ d │ Date │ │ │ │ │ │ 0 │ +│ n │ Nullable(String) │ │ │ It is a nullable column │ │ │ 0 │ +│ arr1 │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ 0 │ +│ arr2 │ Array(Array(String)) │ │ │ │ │ d + toIntervalDay(1) │ 0 │ +│ t │ Tuple( + s String, + a Array(Tuple( + a UInt32, + b UInt32))) │ │ │ │ ZSTD(1) │ │ 0 │ +│ n.null │ UInt8 │ │ │ It is a nullable column │ │ │ 1 │ +│ arr1.size0 │ UInt64 │ │ │ │ │ │ 1 │ +│ arr2.size0 │ UInt64 │ │ │ │ │ d + toIntervalDay(1) │ 1 │ +│ arr2.size1 │ Array(UInt64) │ │ │ │ │ d + toIntervalDay(1) │ 1 │ +│ t.s │ String │ │ │ │ ZSTD(1) │ │ 1 │ +│ t.a │ Array(Tuple( + a UInt32, + b UInt32)) │ │ │ │ │ │ 1 │ +│ t.a.size0 │ UInt64 │ │ │ │ │ │ 1 │ +│ t.a.a │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ 1 │ +│ t.a.b │ Array(UInt32) │ │ │ │ ZSTD(1) │ │ 1 │ 
+└────────────┴───────────────────────────────────────────────────────────────────────────┴──────────────┴────────────────────┴─────────────────────────┴──────────────────┴──────────────────────┴──────────────┘ diff --git a/tests/queries/0_stateless/02048_clickhouse_local_stage.reference b/tests/queries/0_stateless/02048_clickhouse_local_stage.reference index 8a34751b071..2631199cbab 100644 --- a/tests/queries/0_stateless/02048_clickhouse_local_stage.reference +++ b/tests/queries/0_stateless/02048_clickhouse_local_stage.reference @@ -2,7 +2,7 @@ execute: --allow_experimental_analyzer=1 "foo" 1 execute: --allow_experimental_analyzer=1 --stage fetch_columns -"dummy_0" +"__table1.dummy" 0 execute: --allow_experimental_analyzer=1 --stage with_mergeable_state "1_UInt8" diff --git a/tests/queries/0_stateless/02149_external_schema_inference.reference b/tests/queries/0_stateless/02149_external_schema_inference.reference index ebc30e874da..194c8ca62cb 100644 --- a/tests/queries/0_stateless/02149_external_schema_inference.reference +++ b/tests/queries/0_stateless/02149_external_schema_inference.reference @@ -31,8 +31,8 @@ lotteryWin Float64 someRatio Float32 temperature Float32 randomBigNumber Int64 -measureUnits Array(Tuple(unit String, coef Float32)) -nestiness_a_b_c Tuple(d UInt32, e Array(UInt32)) +measureUnits Array(Tuple(\n unit String,\n coef Float32)) +nestiness_a_b_c Tuple(\n d UInt32,\n e Array(UInt32)) location Array(Int32) pi Float32 @@ -78,8 +78,8 @@ lotteryWin String someRatio String temperature String randomBigNumber String -measureUnits Tuple(unit Array(String), coef Array(String)) -nestiness_a_b_c Tuple(d String, e Array(String)) +measureUnits Tuple(\n unit Array(String),\n coef Array(String)) +nestiness_a_b_c Tuple(\n d String,\n e Array(String)) uuid String name String @@ -101,14 +101,14 @@ lotteryWin Float64 someRatio Float32 temperature Float32 randomBigNumber Int64 -measureunits Tuple(coef Array(Float32), unit Array(String)) -nestiness_a_b_c Tuple(d UInt32, e Array(UInt32)) +measureunits Tuple(\n coef Array(Float32),\n unit Array(String)) +nestiness_a_b_c Tuple(\n d UInt32,\n e Array(UInt32)) newFieldStr String newFieldInt Int32 newBool UInt8 identifier String -modules Array(Tuple(module_id UInt32, supply UInt32, temp UInt32, nodes Array(Tuple(node_id UInt32, opening_time UInt32, closing_time UInt32, current UInt32, coords_y Float32)))) +modules Array(Tuple(\n module_id UInt32,\n supply UInt32,\n temp UInt32,\n nodes Array(Tuple(\n node_id UInt32,\n opening_time UInt32,\n closing_time UInt32,\n current UInt32,\n coords_y Float32)))) Capnproto @@ -123,15 +123,15 @@ lc2 Nullable(String) lc3 Array(Nullable(String)) value UInt64 -nested Tuple(a Tuple(b UInt64, c Array(Array(UInt64))), d Array(Tuple(e Array(Array(Tuple(f UInt64, g UInt64))), h Array(Tuple(k Array(UInt64)))))) +nested Tuple(\n a Tuple(\n b UInt64,\n c Array(Array(UInt64))),\n d Array(Tuple(\n e Array(Array(Tuple(\n f UInt64,\n g UInt64))),\n h Array(Tuple(\n k Array(UInt64)))))) -nested Tuple(value Array(UInt64), array Array(Array(UInt64)), tuple Array(Tuple(one UInt64, two UInt64))) +nested Tuple(\n value Array(UInt64),\n array Array(Array(UInt64)),\n tuple Array(Tuple(\n one UInt64,\n two UInt64))) -a Tuple(b UInt64, c Tuple(d UInt64, e Tuple(f UInt64))) +a Tuple(\n b UInt64,\n c Tuple(\n d UInt64,\n e Tuple(\n f UInt64))) nullable Nullable(UInt64) array Array(Nullable(UInt64)) -tuple Tuple(nullable Nullable(UInt64)) +tuple Tuple(\n nullable Nullable(UInt64)) int8 Int8 uint8 UInt8 @@ -151,8 +151,8 @@ datetime 
UInt32 datetime64 Int64 value UInt64 -tuple1 Tuple(one UInt64, two Tuple(three UInt64, four UInt64)) -tuple2 Tuple(nested1 Tuple(nested2 Tuple(x UInt64))) +tuple1 Tuple(\n one UInt64,\n two Tuple(\n three UInt64,\n four UInt64)) +tuple2 Tuple(\n nested1 Tuple(\n nested2 Tuple(\n x UInt64))) RawBLOB diff --git a/tests/queries/0_stateless/02149_schema_inference.reference b/tests/queries/0_stateless/02149_schema_inference.reference index 6d70c4682f5..ca634ac1701 100644 --- a/tests/queries/0_stateless/02149_schema_inference.reference +++ b/tests/queries/0_stateless/02149_schema_inference.reference @@ -37,30 +37,30 @@ d Array(Nullable(Int64)) JSONCompactEachRow c1 Nullable(Float64) c2 Array(Tuple(Nullable(Int64), Nullable(String))) -c3 Tuple(key Nullable(Int64), key2 Nullable(Int64)) +c3 Tuple(\n key Nullable(Int64),\n key2 Nullable(Int64)) c4 Nullable(Bool) 42.42 [(1,'String'),(2,'abcd')] (42,24) true c1 Nullable(Int64) c2 Array(Tuple(Nullable(Int64), Nullable(String))) -c3 Tuple(key1 Nullable(Int64), key2 Nullable(Int64)) +c3 Tuple(\n key1 Nullable(Int64),\n key2 Nullable(Int64)) c4 Nullable(Bool) \N [(1,'String'),(2,NULL)] (NULL,24) \N 32 [(2,'String 2'),(3,'hello')] (4242,2424) true JSONCompactEachRowWithNames a Nullable(Float64) b Array(Tuple(Nullable(Int64), Nullable(String))) -c Tuple(key Nullable(Int64), key2 Nullable(Int64)) +c Tuple(\n key Nullable(Int64),\n key2 Nullable(Int64)) d Nullable(Bool) 42.42 [(1,'String'),(2,'abcd')] (42,24) true JSONEachRow a Nullable(Float64) b Array(Tuple(Nullable(Int64), Nullable(String))) -c Tuple(key Nullable(Int64), key2 Nullable(Int64)) +c Tuple(\n key Nullable(Int64),\n key2 Nullable(Int64)) d Nullable(Bool) 42.42 [(1,'String'),(2,'abcd')] (42,24) true a Nullable(Int64) b Array(Tuple(Nullable(Int64), Nullable(String))) -c Tuple(key1 Nullable(Int64), key2 Nullable(Int64)) +c Tuple(\n key1 Nullable(Int64),\n key2 Nullable(Int64)) d Nullable(Bool) \N [(1,'String'),(2,NULL)] (NULL,24) \N 32 [(2,'String 2'),(3,'hello')] (4242,2424) true diff --git a/tests/queries/0_stateless/02149_schema_inference_formats_with_schema_1.reference b/tests/queries/0_stateless/02149_schema_inference_formats_with_schema_1.reference index 4e020427ad0..ee83ed63dc1 100644 --- a/tests/queries/0_stateless/02149_schema_inference_formats_with_schema_1.reference +++ b/tests/queries/0_stateless/02149_schema_inference_formats_with_schema_1.reference @@ -24,12 +24,12 @@ fixed_string Nullable(FixedString(3)) Str: 0 100 Str: 1 200 array Array(Nullable(UInt64)) -tuple Tuple(`1` Nullable(UInt64), `2` Nullable(String)) +tuple Tuple(\n `1` Nullable(UInt64),\n `2` Nullable(String)) map Map(String, Nullable(UInt64)) [0,1] (0,'0') {'0':0} [1,2] (1,'1') {'1':1} -nested1 Array(Tuple(`1` Array(Nullable(UInt64)), `2` Map(String, Nullable(UInt64)))) -nested2 Tuple(`1` Tuple(`1` Array(Array(Nullable(UInt64))), `2` Map(UInt64, Array(Tuple(`1` Nullable(UInt64), `2` Nullable(String))))), `2` Nullable(UInt8)) +nested1 Array(Tuple(\n `1` Array(Nullable(UInt64)),\n `2` Map(String, Nullable(UInt64)))) +nested2 Tuple(\n `1` Tuple(\n `1` Array(Array(Nullable(UInt64))),\n `2` Map(UInt64, Array(Tuple(\n `1` Nullable(UInt64),\n `2` Nullable(String))))),\n `2` Nullable(UInt8)) [([0,1],{'42':0}),([],{}),([42],{'42':42})] (([[0],[1],[]],{0:[(0,'42'),(1,'42')]}),42) [([1,2],{'42':1}),([],{}),([42],{'42':42})] (([[1],[2],[]],{1:[(1,'42'),(2,'42')]}),42) ArrowStream @@ -58,12 +58,12 @@ fixed_string Nullable(FixedString(3)) Str: 0 100 Str: 1 200 array Array(Nullable(UInt64)) -tuple Tuple(`1` Nullable(UInt64), `2` 
Nullable(String)) +tuple Tuple(\n `1` Nullable(UInt64),\n `2` Nullable(String)) map Map(String, Nullable(UInt64)) [0,1] (0,'0') {'0':0} [1,2] (1,'1') {'1':1} -nested1 Array(Tuple(`1` Array(Nullable(UInt64)), `2` Map(String, Nullable(UInt64)))) -nested2 Tuple(`1` Tuple(`1` Array(Array(Nullable(UInt64))), `2` Map(UInt64, Array(Tuple(`1` Nullable(UInt64), `2` Nullable(String))))), `2` Nullable(UInt8)) +nested1 Array(Tuple(\n `1` Array(Nullable(UInt64)),\n `2` Map(String, Nullable(UInt64)))) +nested2 Tuple(\n `1` Tuple(\n `1` Array(Array(Nullable(UInt64))),\n `2` Map(UInt64, Array(Tuple(\n `1` Nullable(UInt64),\n `2` Nullable(String))))),\n `2` Nullable(UInt8)) [([0,1],{'42':0}),([],{}),([42],{'42':42})] (([[0],[1],[]],{0:[(0,'42'),(1,'42')]}),42) [([1,2],{'42':1}),([],{}),([42],{'42':42})] (([[1],[2],[]],{1:[(1,'42'),(2,'42')]}),42) Parquet @@ -92,12 +92,12 @@ fixed_string Nullable(FixedString(3)) Str: 0 100 Str: 1 200 array Array(Nullable(UInt64)) -tuple Tuple(`1` Nullable(UInt64), `2` Nullable(String)) +tuple Tuple(\n `1` Nullable(UInt64),\n `2` Nullable(String)) map Map(String, Nullable(UInt64)) [0,1] (0,'0') {'0':0} [1,2] (1,'1') {'1':1} -nested1 Array(Tuple(`1` Array(Nullable(UInt64)), `2` Map(String, Nullable(UInt64)))) -nested2 Tuple(`1` Tuple(`1` Array(Array(Nullable(UInt64))), `2` Map(UInt64, Array(Tuple(`1` Nullable(UInt64), `2` Nullable(String))))), `2` Nullable(UInt8)) +nested1 Array(Tuple(\n `1` Array(Nullable(UInt64)),\n `2` Map(String, Nullable(UInt64)))) +nested2 Tuple(\n `1` Tuple(\n `1` Array(Array(Nullable(UInt64))),\n `2` Map(UInt64, Array(Tuple(\n `1` Nullable(UInt64),\n `2` Nullable(String))))),\n `2` Nullable(UInt8)) [([0,1],{'42':0}),([],{}),([42],{'42':42})] (([[0],[1],[]],{0:[(0,'42'),(1,'42')]}),42) [([1,2],{'42':1}),([],{}),([42],{'42':42})] (([[1],[2],[]],{1:[(1,'42'),(2,'42')]}),42) ORC @@ -126,12 +126,12 @@ fixed_string Nullable(String) Str: 0 100 Str: 1 200 array Array(Nullable(Int64)) -tuple Tuple(`1` Nullable(Int64), `2` Nullable(String)) +tuple Tuple(\n `1` Nullable(Int64),\n `2` Nullable(String)) map Map(String, Nullable(Int64)) [0,1] (0,'0') {'0':0} [1,2] (1,'1') {'1':1} -nested1 Array(Tuple(`1` Array(Nullable(Int64)), `2` Map(String, Nullable(Int64)))) -nested2 Tuple(`1` Tuple(`1` Array(Array(Nullable(Int64))), `2` Map(Int64, Array(Tuple(`1` Nullable(Int64), `2` Nullable(String))))), `2` Nullable(Int8)) +nested1 Array(Tuple(\n `1` Array(Nullable(Int64)),\n `2` Map(String, Nullable(Int64)))) +nested2 Tuple(\n `1` Tuple(\n `1` Array(Array(Nullable(Int64))),\n `2` Map(Int64, Array(Tuple(\n `1` Nullable(Int64),\n `2` Nullable(String))))),\n `2` Nullable(Int8)) [([0,1],{'42':0}),([],{}),([42],{'42':42})] (([[0],[1],[]],{0:[(0,'42'),(1,'42')]}),42) [([1,2],{'42':1}),([],{}),([42],{'42':42})] (([[1],[2],[]],{1:[(1,'42'),(2,'42')]}),42) Native diff --git a/tests/queries/0_stateless/02179_map_cast_to_array.reference b/tests/queries/0_stateless/02179_map_cast_to_array.reference index 81bb9fba537..e87d1c69c1b 100644 --- a/tests/queries/0_stateless/02179_map_cast_to_array.reference +++ b/tests/queries/0_stateless/02179_map_cast_to_array.reference @@ -6,4 +6,4 @@ {1:{1:'1234'}} [(1,{1:1234})] [(1,{1:1234})] {1:{1:'1234'}} [(1,[(1,'1234')])] [(1,[(1,'1234')])] {1:{1:'1234'}} [(1,[(1,1234)])] [(1,[(1,1234)])] -[(1,'val1'),(2,'val2')] Array(Tuple(k UInt32, v String)) +[(1,'val1'),(2,'val2')] Array(Tuple(\n k UInt32,\n v String)) diff --git a/tests/queries/0_stateless/02226_analyzer_or_like_combine.reference 
b/tests/queries/0_stateless/02226_analyzer_or_like_combine.reference index d741391067c..0ff24b39709 100644 --- a/tests/queries/0_stateless/02226_analyzer_or_like_combine.reference +++ b/tests/queries/0_stateless/02226_analyzer_or_like_combine.reference @@ -11,7 +11,7 @@ QUERY id: 0 LIST id: 3, nodes: 1 CONSTANT id: 4, constant_value: \'Привет, World\', constant_value_type: String JOIN TREE - TABLE id: 5, table_name: system.one + TABLE id: 5, alias: __table1, table_name: system.one WHERE FUNCTION id: 6, function_name: or, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -54,7 +54,7 @@ QUERY id: 0 LIST id: 3, nodes: 1 CONSTANT id: 4, constant_value: \'Привет, World\', constant_value_type: String JOIN TREE - TABLE id: 5, table_name: system.one + TABLE id: 5, alias: __table1, table_name: system.one WHERE FUNCTION id: 6, function_name: or, function_type: ordinary, result_type: UInt8 ARGUMENTS diff --git a/tests/queries/0_stateless/02227_union_match_by_name.reference b/tests/queries/0_stateless/02227_union_match_by_name.reference index 42b9b01a529..c28035fab49 100644 --- a/tests/queries/0_stateless/02227_union_match_by_name.reference +++ b/tests/queries/0_stateless/02227_union_match_by_name.reference @@ -4,15 +4,15 @@ EXPLAIN header = 1, optimize = 0 SELECT avgWeighted(x, y) FROM (SELECT NULL, 255 Expression (Project names) Header: avgWeighted(x, y) Nullable(Float64) Expression (Projection) - Header: avgWeighted(x_0, y_1) Nullable(Float64) + Header: avgWeighted(__table1.x, __table1.y) Nullable(Float64) Aggregating - Header: avgWeighted(x_0, y_1) Nullable(Float64) + Header: avgWeighted(__table1.x, __table1.y) Nullable(Float64) Expression (Before GROUP BY) - Header: x_0 Nullable(UInt8) - y_1 UInt8 + Header: __table1.x Nullable(UInt8) + __table1.y UInt8 Expression (Change column names to column identifiers) - Header: x_0 Nullable(UInt8) - y_1 UInt8 + Header: __table1.x Nullable(UInt8) + __table1.y UInt8 Union Header: x Nullable(UInt8) y UInt8 @@ -26,7 +26,7 @@ Header: avgWeighted(x, y) Nullable(Float64) Header: 255_UInt8 UInt8 1_UInt8 UInt8 Expression (Change column names to column identifiers) - Header: dummy_0 UInt8 + Header: __table3.dummy UInt8 ReadFromStorage (SystemOne) Header: dummy UInt8 Expression (Conversion before UNION) @@ -39,7 +39,7 @@ Header: avgWeighted(x, y) Nullable(Float64) Header: NULL_Nullable(Nothing) Nullable(Nothing) 1_UInt8 UInt8 Expression (Change column names to column identifiers) - Header: dummy_0 UInt8 + Header: __table5.dummy UInt8 ReadFromStorage (SystemOne) Header: dummy UInt8 SELECT avgWeighted(x, y) FROM (SELECT NULL, 255 AS x, 1 AS y UNION ALL SELECT y, NULL AS x, 1 AS y); diff --git a/tests/queries/0_stateless/02242_arrow_orc_parquet_nullable_schema_inference.reference b/tests/queries/0_stateless/02242_arrow_orc_parquet_nullable_schema_inference.reference index 2ecce985eb4..cd39bf8879b 100644 --- a/tests/queries/0_stateless/02242_arrow_orc_parquet_nullable_schema_inference.reference +++ b/tests/queries/0_stateless/02242_arrow_orc_parquet_nullable_schema_inference.reference @@ -2,7 +2,7 @@ Arrow x Nullable(UInt64) arr1 Array(Nullable(UInt64)) arr2 Array(Array(Nullable(String))) -arr3 Array(Tuple(`1` Nullable(String), `2` Nullable(UInt64))) +arr3 Array(Tuple(\n `1` Nullable(String),\n `2` Nullable(UInt64))) 0 [0,1] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,0)] \N [NULL,2] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,1)] 2 [2,3] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,2)] @@ -12,7 +12,7 @@ 
ArrowStream x Nullable(UInt64) arr1 Array(Nullable(UInt64)) arr2 Array(Array(Nullable(String))) -arr3 Array(Tuple(`1` Nullable(String), `2` Nullable(UInt64))) +arr3 Array(Tuple(\n `1` Nullable(String),\n `2` Nullable(UInt64))) 0 [0,1] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,0)] \N [NULL,2] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,1)] 2 [2,3] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,2)] @@ -22,7 +22,7 @@ Parquet x Nullable(UInt64) arr1 Array(Nullable(UInt64)) arr2 Array(Array(Nullable(String))) -arr3 Array(Tuple(`1` Nullable(String), `2` Nullable(UInt64))) +arr3 Array(Tuple(\n `1` Nullable(String),\n `2` Nullable(UInt64))) 0 [0,1] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,0)] \N [NULL,2] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,1)] 2 [2,3] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,2)] @@ -32,7 +32,7 @@ ORC x Nullable(Int64) arr1 Array(Nullable(Int64)) arr2 Array(Array(Nullable(String))) -arr3 Array(Tuple(`1` Nullable(String), `2` Nullable(Int64))) +arr3 Array(Tuple(\n `1` Nullable(String),\n `2` Nullable(Int64))) 0 [0,1] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,0)] \N [NULL,2] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,1)] 2 [2,3] [[NULL,'String'],[NULL],[]] [(NULL,NULL),('String',NULL),(NULL,2)] diff --git a/tests/queries/0_stateless/02246_flatten_tuple.reference b/tests/queries/0_stateless/02246_flatten_tuple.reference index 0320150025d..ad0ca1fa03a 100644 --- a/tests/queries/0_stateless/02246_flatten_tuple.reference +++ b/tests/queries/0_stateless/02246_flatten_tuple.reference @@ -1,4 +1,4 @@ -([1,2],['a','b'],3,'c',4) Tuple(`t1.a` Array(UInt32), `t1.s` Array(String), b UInt32, `t2.k` String, `t2.v` UInt32) -Tuple(id Int8, obj Tuple(k1 Int8, k2 Tuple(k3 String, k4 Nested(k5 Int8, k6 Int8)), some Int8), s String) Tuple(id Int8, `obj.k1` Int8, `obj.k2.k3` String, `obj.k2.k4.k5` Array(Int8), `obj.k2.k4.k6` Array(Int8), `obj.some` Int8, s String) +([1,2],['a','b'],3,'c',4) Tuple(\n `t1.a` Array(UInt32),\n `t1.s` Array(String),\n b UInt32,\n `t2.k` String,\n `t2.v` UInt32) +Tuple(\n id Int8,\n obj Tuple(\n k1 Int8,\n k2 Tuple(\n k3 String,\n k4 Nested(k5 Int8, k6 Int8)),\n some Int8),\n s String) Tuple(\n id Int8,\n `obj.k1` Int8,\n `obj.k2.k3` String,\n `obj.k2.k4.k5` Array(Int8),\n `obj.k2.k4.k6` Array(Int8),\n `obj.some` Int8,\n s String) 1 1 2 [3,4] [0,0] 0 foo 2 0 str [0] [55] 42 bar diff --git a/tests/queries/0_stateless/02286_tuple_numeric_identifier.reference b/tests/queries/0_stateless/02286_tuple_numeric_identifier.reference index 5f330409b2a..21348493d1d 100644 --- a/tests/queries/0_stateless/02286_tuple_numeric_identifier.reference +++ b/tests/queries/0_stateless/02286_tuple_numeric_identifier.reference @@ -4,7 +4,7 @@ CREATE TABLE default.t_tuple_numeric\n(\n `t` Tuple(`1` Tuple(`2` Int32, `3` 2 3 4 2 3 4 2 3 4 -Tuple(`1` Tuple(`2` Int8, `3` Int8), `4` Int8) +Tuple(\n `1` Tuple(\n `2` Int8,\n `3` Int8),\n `4` Int8) {"t":{"1":{"2":2,"3":3},"4":4}} 2 3 4 (('value')) diff --git a/tests/queries/0_stateless/02287_type_object_convert.reference b/tests/queries/0_stateless/02287_type_object_convert.reference index 2df54dcbcbc..501536f1f3e 100644 --- a/tests/queries/0_stateless/02287_type_object_convert.reference +++ b/tests/queries/0_stateless/02287_type_object_convert.reference @@ -1,15 +1,15 @@ -1 (1) Tuple(x Nullable(Int8)) -1 (1,NULL) Tuple(x Nullable(Int8), y Nullable(Int8)) -2 (NULL,2) Tuple(x Nullable(Int8), y 
Nullable(Int8)) -1 (1,NULL) Tuple(x Nullable(Int8), y Nullable(Int8)) -2 (NULL,2) Tuple(x Nullable(Int8), y Nullable(Int8)) -3 (1,2) Tuple(x Nullable(Int8), y Nullable(Int8)) +1 (1) Tuple(\n x Nullable(Int8)) +1 (1,NULL) Tuple(\n x Nullable(Int8),\n y Nullable(Int8)) +2 (NULL,2) Tuple(\n x Nullable(Int8),\n y Nullable(Int8)) +1 (1,NULL) Tuple(\n x Nullable(Int8),\n y Nullable(Int8)) +2 (NULL,2) Tuple(\n x Nullable(Int8),\n y Nullable(Int8)) +3 (1,2) Tuple(\n x Nullable(Int8),\n y Nullable(Int8)) 1 1 \N 2 \N 2 3 1 2 -1 (1) Tuple(x Int8) -1 (1,0) Tuple(x Int8, y Int8) -2 (0,2) Tuple(x Int8, y Int8) +1 (1) Tuple(\n x Int8) +1 (1,0) Tuple(\n x Int8,\n y Int8) +2 (0,2) Tuple(\n x Int8,\n y Int8) {"x":1} {"x":1} {"x":[[1],[1,2]]} diff --git a/tests/queries/0_stateless/02293_http_header_full_summary_without_progress.sh b/tests/queries/0_stateless/02293_http_header_full_summary_without_progress.sh index a08928a773c..8f08bd6f84b 100755 --- a/tests/queries/0_stateless/02293_http_header_full_summary_without_progress.sh +++ b/tests/queries/0_stateless/02293_http_header_full_summary_without_progress.sh @@ -7,7 +7,7 @@ CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) CURL_OUTPUT=$(echo 'SELECT 1 + sleepEachRow(0.00002) FROM numbers(100000)' | \ - ${CLICKHOUSE_CURL_COMMAND} -vsS "${CLICKHOUSE_URL}&wait_end_of_query=1&send_progress_in_http_headers=0&max_execution_time=1" --data-binary @- 2>&1) + ${CLICKHOUSE_CURL_COMMAND} -v "${CLICKHOUSE_URL}&wait_end_of_query=1&send_progress_in_http_headers=0&max_execution_time=1" --data-binary @- 2>&1) READ_ROWS=$(echo "${CURL_OUTPUT}" | \ grep 'X-ClickHouse-Summary' | \ diff --git a/tests/queries/0_stateless/02303_query_kind.reference b/tests/queries/0_stateless/02303_query_kind.reference index 8d119fb22b2..53a0df682b2 100644 --- a/tests/queries/0_stateless/02303_query_kind.reference +++ b/tests/queries/0_stateless/02303_query_kind.reference @@ -2,35 +2,35 @@ clickhouse-client --allow_experimental_analyzer=1 --query_kind secondary_query - Expression ((Project names + Projection)) Header: dummy String Aggregating - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String Expression ((Before GROUP BY + Change column names to column identifiers)) - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String ReadFromStorage (SystemOne) Header: dummy UInt8 clickhouse-local --allow_experimental_analyzer=1 --query_kind secondary_query -q explain plan header=1 select toString(dummy) as dummy from system.one group by dummy Expression ((Project names + Projection)) Header: dummy String Aggregating - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String Expression ((Before GROUP BY + Change column names to column identifiers)) - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String ReadFromStorage (SystemOne) Header: dummy UInt8 clickhouse-client --allow_experimental_analyzer=1 --query_kind initial_query -q explain plan header=1 select toString(dummy) as dummy from system.one group by dummy Expression ((Project names + Projection)) Header: dummy String Aggregating - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String Expression ((Before GROUP BY + Change column names to column identifiers)) - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String ReadFromStorage (SystemOne) Header: dummy UInt8 clickhouse-local --allow_experimental_analyzer=1 --query_kind initial_query -q explain plan header=1 select toString(dummy) as dummy from system.one group by dummy 
Expression ((Project names + Projection)) Header: dummy String Aggregating - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String Expression ((Before GROUP BY + Change column names to column identifiers)) - Header: toString(dummy_0) String + Header: toString(__table1.dummy) String ReadFromStorage (SystemOne) Header: dummy UInt8 diff --git a/tests/queries/0_stateless/02313_avro_records_and_maps.reference b/tests/queries/0_stateless/02313_avro_records_and_maps.reference index 24fc635cdce..329462a4dda 100644 --- a/tests/queries/0_stateless/02313_avro_records_and_maps.reference +++ b/tests/queries/0_stateless/02313_avro_records_and_maps.reference @@ -1,8 +1,8 @@ -t Tuple(a Int32, b String) +t Tuple(\n a Int32,\n b String) (0,'String') (1,'String') (2,'String') -t Tuple(a Int32, b Tuple(c Int32, d Int32), e Array(Int32)) +t Tuple(\n a Int32,\n b Tuple(\n c Int32,\n d Int32),\n e Array(Int32)) (0,(1,2),[]) (1,(2,3),[0]) (2,(3,4),[0,1]) @@ -11,7 +11,7 @@ a.c Array(Int32) [0,1] [2,3] [1,2] [3,4] [2,3] [4,5] -a.b Array(Array(Tuple(c Int32, d Int32))) +a.b Array(Array(Tuple(\n c Int32,\n d Int32))) [[(0,1),(2,3)]] [[(1,2),(3,4)]] [[(2,3),(4,5)]] @@ -19,7 +19,7 @@ m Map(String, Int64) {'key_0':0} {'key_1':1} {'key_2':2} -m Map(String, Tuple(`1` Int64, `2` Array(Int64))) +m Map(String, Tuple(\n `1` Int64,\n `2` Array(Int64))) {'key_0':(0,[])} {'key_1':(1,[0])} {'key_2':(2,[0,1])} diff --git a/tests/queries/0_stateless/02314_avro_null_as_default.reference b/tests/queries/0_stateless/02314_avro_null_as_default.reference index ba38a15f924..e5d1b1c3752 100644 --- a/tests/queries/0_stateless/02314_avro_null_as_default.reference +++ b/tests/queries/0_stateless/02314_avro_null_as_default.reference @@ -1,5 +1,5 @@ a Nullable(Int64) -b Array(Tuple(c Nullable(Int64), d Nullable(String))) +b Array(Tuple(\n c Nullable(Int64),\n d Nullable(String))) 1 [(100,'Q'),(200,'W')] 0 0 diff --git a/tests/queries/0_stateless/02317_distinct_in_order_optimization_explain.reference b/tests/queries/0_stateless/02317_distinct_in_order_optimization_explain.reference index da07e94cead..69571551c2b 100644 --- a/tests/queries/0_stateless/02317_distinct_in_order_optimization_explain.reference +++ b/tests/queries/0_stateless/02317_distinct_in_order_optimization_explain.reference @@ -83,36 +83,36 @@ Sorting (Stream): a ASC, b ASC Sorting (Stream): a ASC, b ASC === enable new analyzer === -- enabled, check that sorting properties are propagated from ReadFromMergeTree till preliminary distinct -Sorting (Stream): a_1 ASC, b_0 ASC -Sorting (Stream): a_1 ASC, b_0 ASC -Sorting (Stream): a_1 ASC, b_0 ASC -Sorting (Stream): a_1 ASC, b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, b ASC -- disabled, check that sorting description for ReadFromMergeTree match ORDER BY columns -Sorting (Stream): a_1 ASC -Sorting (Stream): a_1 ASC -Sorting (Stream): a_1 ASC +Sorting (Stream): __table1.a ASC +Sorting (Stream): __table1.a ASC +Sorting (Stream): __table1.a ASC Sorting (Stream): a ASC -- enabled, check that ReadFromMergeTree sorting description is overwritten by DISTINCT optimization i.e. 
it contains columns from DISTINCT clause -Sorting (Stream): a_1 ASC, b_0 ASC -Sorting (Stream): a_1 ASC, b_0 ASC -Sorting (Stream): a_1 ASC, b_0 ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC Sorting (Stream): a ASC, b ASC -- enabled, check that ReadFromMergeTree sorting description is overwritten by DISTINCT optimization, but direction used from ORDER BY clause -Sorting (Stream): a_1 DESC, b_0 DESC -Sorting (Stream): a_1 DESC, b_0 DESC -Sorting (Stream): a_1 DESC, b_0 DESC +Sorting (Stream): __table1.a DESC, __table1.b DESC +Sorting (Stream): __table1.a DESC, __table1.b DESC +Sorting (Stream): __table1.a DESC, __table1.b DESC Sorting (Stream): a DESC, b DESC -- enabled, check that ReadFromMergeTree sorting description is NOT overwritten by DISTINCT optimization (1), - it contains columns from ORDER BY clause -Sorting (Stream): a_0 ASC, b_1 ASC -Sorting (Stream): a_0 ASC, b_1 ASC -Sorting (Stream): a_0 ASC, b_1 ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC Sorting (Stream): a ASC, b ASC -- enabled, check that ReadFromMergeTree sorting description is NOT overwritten by DISTINCT optimization (2), - direction used from ORDER BY clause -Sorting (Stream): a_1 DESC, b_0 DESC -Sorting (Stream): a_1 DESC, b_0 DESC -Sorting (Stream): a_1 DESC, b_0 DESC +Sorting (Stream): __table1.a DESC, __table1.b DESC +Sorting (Stream): __table1.a DESC, __table1.b DESC +Sorting (Stream): __table1.a DESC, __table1.b DESC Sorting (Stream): a DESC, b DESC -- enabled, check that disabling other 'read in order' optimizations do not disable distinct in order optimization -Sorting (Stream): a_0 ASC, b_1 ASC -Sorting (Stream): a_0 ASC, b_1 ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC +Sorting (Stream): __table1.a ASC, __table1.b ASC Sorting (Stream): a ASC, b ASC diff --git a/tests/queries/0_stateless/02325_dates_schema_inference.reference b/tests/queries/0_stateless/02325_dates_schema_inference.reference index a37360dae62..c8eebd3262e 100644 --- a/tests/queries/0_stateless/02325_dates_schema_inference.reference +++ b/tests/queries/0_stateless/02325_dates_schema_inference.reference @@ -5,14 +5,14 @@ x Nullable(DateTime64(9)) x Array(Nullable(Date)) x Array(Nullable(DateTime64(9))) x Array(Nullable(DateTime64(9))) -x Tuple(date1 Nullable(DateTime64(9)), date2 Nullable(Date)) +x Tuple(\n date1 Nullable(DateTime64(9)),\n date2 Nullable(Date)) x Array(Nullable(DateTime64(9))) x Array(Nullable(DateTime64(9))) x Nullable(DateTime64(9)) x Array(Nullable(String)) x Nullable(String) x Array(Nullable(String)) -x Tuple(key1 Array(Array(Nullable(DateTime64(9)))), key2 Array(Array(Nullable(String)))) +x Tuple(\n key1 Array(Array(Nullable(DateTime64(9)))),\n key2 Array(Array(Nullable(String)))) CSV c1 Nullable(Date) c1 Nullable(DateTime64(9)) diff --git a/tests/queries/0_stateless/02326_settings_changes_system_table.reference b/tests/queries/0_stateless/02326_settings_changes_system_table.reference index c4a3c71edfd..1c8c4fa1880 100644 --- a/tests/queries/0_stateless/02326_settings_changes_system_table.reference +++ b/tests/queries/0_stateless/02326_settings_changes_system_table.reference @@ -1,3 +1,3 @@ version String -changes Array(Tuple(name String, previous_value String, new_value String, reason String)) +changes Array(Tuple(\n name String,\n previous_value String,\n new_value String,\n reason String)) 22.5 
[('memory_overcommit_ratio_denominator','0','1073741824','Enable memory overcommit feature by default'),('memory_overcommit_ratio_denominator_for_user','0','1073741824','Enable memory overcommit feature by default')] diff --git a/tests/queries/0_stateless/02327_try_infer_integers_schema_inference.reference b/tests/queries/0_stateless/02327_try_infer_integers_schema_inference.reference index a0e0f8f6b5e..d190476a7da 100644 --- a/tests/queries/0_stateless/02327_try_infer_integers_schema_inference.reference +++ b/tests/queries/0_stateless/02327_try_infer_integers_schema_inference.reference @@ -1,12 +1,12 @@ JSONEachRow x Nullable(Int64) x Array(Nullable(Int64)) -x Tuple(a Array(Nullable(Int64))) -x Tuple(a Array(Nullable(Int64)), b Array(Nullable(Int64))) +x Tuple(\n a Array(Nullable(Int64))) +x Tuple(\n a Array(Nullable(Int64)),\n b Array(Nullable(Int64))) x Nullable(Float64) x Nullable(Float64) x Array(Nullable(Float64)) -x Tuple(a Array(Nullable(Int64)), b Array(Nullable(Float64))) +x Tuple(\n a Array(Nullable(Int64)),\n b Array(Nullable(Float64))) CSV c1 Nullable(Int64) c1 Array(Nullable(Int64)) diff --git a/tests/queries/0_stateless/02342_analyzer_compound_types.reference b/tests/queries/0_stateless/02342_analyzer_compound_types.reference index 51e0bbe6e92..c384b548473 100644 --- a/tests/queries/0_stateless/02342_analyzer_compound_types.reference +++ b/tests/queries/0_stateless/02342_analyzer_compound_types.reference @@ -8,33 +8,33 @@ Constant tuple Tuple -- id UInt64 -value Tuple(value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String), value_1_level_0 String) +value Tuple(\n value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String),\n value_1_level_0 String) 0 (('value_0_level_1','value_1_level_1'),'value_1_level_0') -- id UInt64 -value Tuple(value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String), value_1_level_0 String) +value Tuple(\n value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String),\n value_1_level_0 String) 0 (('value_0_level_1','value_1_level_1'),'value_1_level_0') -- -value.value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String) +value.value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String) value.value_1_level_0 String ('value_0_level_1','value_1_level_1') value_1_level_0 -- -alias_value Tuple(value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String), value_1_level_0 String) -alias_value.value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String) +alias_value Tuple(\n value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String),\n value_1_level_0 String) +alias_value.value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String) alias_value.value_1_level_0 String (('value_0_level_1','value_1_level_1'),'value_1_level_0') ('value_0_level_1','value_1_level_1') value_1_level_0 -- -alias_value Tuple(value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String), value_1_level_0 String) -alias_value.value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String) +alias_value Tuple(\n value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String),\n value_1_level_0 String) +alias_value.value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String) alias_value.value_1_level_0 String (('value_0_level_1','value_1_level_1'),'value_1_level_0') ('value_0_level_1','value_1_level_1') value_1_level_0 -- -alias_value Tuple(value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String), value_1_level_0 
String) +alias_value Tuple(\n value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String),\n value_1_level_0 String) toString(alias_value.value_0_level_0) String toString(alias_value.value_1_level_0) String (('value_0_level_1','value_1_level_1'),'value_1_level_0') (\'value_0_level_1\',\'value_1_level_1\') value_1_level_0 -- -value.value_0_level_0 Tuple(value_0_level_1 String, value_1_level_1 String) +value.value_0_level_0 Tuple(\n value_0_level_1 String,\n value_1_level_1 String) value.value_1_level_0 String ('value_0_level_1','value_1_level_1') value_1_level_0 -- @@ -46,17 +46,17 @@ value.value_0_level_0.value_0_level_1 String value.value_0_level_0.value_1_level_1 String value_0_level_1 value_1_level_1 -- -alias_value Tuple(value_0_level_1 String, value_1_level_1 String) +alias_value Tuple(\n value_0_level_1 String,\n value_1_level_1 String) alias_value.value_0_level_1 String alias_value.value_1_level_1 String ('value_0_level_1','value_1_level_1') value_0_level_1 value_1_level_1 -- -alias_value Tuple(value_0_level_1 String, value_1_level_1 String) +alias_value Tuple(\n value_0_level_1 String,\n value_1_level_1 String) alias_value.value_0_level_1 String alias_value.value_1_level_1 String ('value_0_level_1','value_1_level_1') value_0_level_1 value_1_level_1 -- -alias_value Tuple(value_0_level_1 String, value_1_level_1 String) +alias_value Tuple(\n value_0_level_1 String,\n value_1_level_1 String) toString(alias_value.value_0_level_1) String toString(alias_value.value_1_level_1) String ('value_0_level_1','value_1_level_1') value_0_level_1 value_1_level_1 diff --git a/tests/queries/0_stateless/02366_explain_query_tree.reference b/tests/queries/0_stateless/02366_explain_query_tree.reference index 769d7661e68..acbedbd0622 100644 --- a/tests/queries/0_stateless/02366_explain_query_tree.reference +++ b/tests/queries/0_stateless/02366_explain_query_tree.reference @@ -22,7 +22,7 @@ QUERY id: 0 COLUMN id: 2, column_name: id, result_type: UInt64, source_id: 3 COLUMN id: 4, column_name: value, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.test_table + TABLE id: 3, alias: __table1, table_name: default.test_table -- QUERY id: 0 PROJECTION @@ -64,7 +64,7 @@ QUERY id: 0 CONSTANT id: 9, constant_value: UInt64_1, constant_value_type: UInt8 CONSTANT id: 10, constant_value: Array_[UInt64_1, UInt64_2, UInt64_3], constant_value_type: Array(UInt8) JOIN TREE - TABLE id: 11, table_name: default.test_table + TABLE id: 11, alias: __table1, table_name: default.test_table -- QUERY id: 0 WITH @@ -99,4 +99,4 @@ QUERY id: 0 COLUMN id: 4, column_name: id, result_type: UInt64, source_id: 5 CONSTANT id: 6, constant_value: UInt64_1, constant_value_type: UInt8 JOIN TREE - TABLE id: 5, table_name: default.test_table + TABLE id: 5, alias: __table1, table_name: default.test_table diff --git a/tests/queries/0_stateless/02373_progress_contain_result.sh b/tests/queries/0_stateless/02373_progress_contain_result.sh index fd343df1013..c87a5ec7615 100755 --- a/tests/queries/0_stateless/02373_progress_contain_result.sh +++ b/tests/queries/0_stateless/02373_progress_contain_result.sh @@ -5,5 +5,5 @@ CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) . 
"$CURDIR"/../shell_config.sh echo 'SELECT 1 FROM numbers(100)' | - ${CLICKHOUSE_CURL_COMMAND} -vsS "${CLICKHOUSE_URL}&wait_end_of_query=1&send_progress_in_http_headers=0" --data-binary @- 2>&1 | + ${CLICKHOUSE_CURL_COMMAND} -v "${CLICKHOUSE_URL}&wait_end_of_query=1&send_progress_in_http_headers=0" --data-binary @- 2>&1 | grep 'X-ClickHouse-Summary' | sed 's/,\"elapsed_ns[^}]*//' diff --git a/tests/queries/0_stateless/02377_optimize_sorting_by_input_stream_properties_explain.reference b/tests/queries/0_stateless/02377_optimize_sorting_by_input_stream_properties_explain.reference index 5c9e39805b7..2c50d1028fe 100644 --- a/tests/queries/0_stateless/02377_optimize_sorting_by_input_stream_properties_explain.reference +++ b/tests/queries/0_stateless/02377_optimize_sorting_by_input_stream_properties_explain.reference @@ -8,7 +8,7 @@ Sorting (None) -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting ORDER BY a Sorting (Global): a ASC Sorting (Sorting for ORDER BY) -Sorting (Global): a_0 ASC +Sorting (Global): __table1.a ASC Sorting (None) Sorting (None) -- disable optimization -> sorting order is NOT propagated from subquery -> full sort @@ -36,8 +36,8 @@ Sorting (Stream): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting ORDER BY a Sorting (Global): a ASC Sorting (Sorting for ORDER BY) -Sorting (Global): a_0 ASC -Sorting (Stream): a_0 ASC +Sorting (Global): __table1.a ASC +Sorting (Stream): __table1.a ASC Sorting (Stream): a ASC -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting ORDER BY a+1 Sorting (None) @@ -48,8 +48,8 @@ Sorting (Chunk): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting ORDER BY a+1 Sorting (None) Sorting (Sorting for ORDER BY) -Sorting (Global): plus(a_0, 1_UInt8) ASC -Sorting (Chunk): a_0 ASC +Sorting (Global): plus(__table1.a, 1_UInt8) ASC +Sorting (Chunk): __table1.a ASC Sorting (Chunk): a ASC -- ExpressionStep breaks sort mode -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a+1 FROM optimize_sorting ORDER BY a+1 @@ -61,7 +61,7 @@ Sorting (Chunk): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a+1 FROM optimize_sorting ORDER BY a+1 Sorting (Global): plus(a, 1) ASC Sorting (Sorting for ORDER BY) -Sorting (Global): plus(a_0, 1_UInt8) ASC +Sorting (Global): plus(__table1.a, 1_UInt8) ASC Sorting (None) Sorting (Chunk): a ASC -- FilterStep preserves sort mode @@ -71,7 +71,7 @@ Sorting (Chunk): a ASC Sorting (Chunk): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting WHERE a > 0 Sorting (Chunk): a ASC -Sorting (Chunk): a_0 ASC +Sorting (Chunk): __table1.a ASC Sorting (Chunk): a ASC -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN 
actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting WHERE a+1 > 0 Sorting (Chunk): a ASC @@ -79,7 +79,7 @@ Sorting (Chunk): a ASC Sorting (Chunk): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM optimize_sorting WHERE a+1 > 0 Sorting (Chunk): a ASC -Sorting (Chunk): a_0 ASC +Sorting (Chunk): __table1.a ASC Sorting (Chunk): a ASC -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a, a+1 FROM optimize_sorting WHERE a+1 > 0 Sorting (Chunk): a ASC @@ -87,7 +87,7 @@ Sorting (Chunk): a ASC Sorting (Chunk): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a, a+1 FROM optimize_sorting WHERE a+1 > 0 Sorting (Chunk): a ASC -Sorting (Chunk): a_0 ASC +Sorting (Chunk): __table1.a ASC Sorting (Chunk): a ASC -- FilterStep breaks sort mode -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a > 0 FROM optimize_sorting WHERE a > 0 @@ -119,11 +119,11 @@ Sorting (Stream): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a FROM (SELECT sipHash64(a) AS a FROM (SELECT a FROM optimize_sorting ORDER BY a)) ORDER BY a Sorting (Global): a ASC Sorting (Sorting for ORDER BY) -Sorting (Global): a_0 ASC +Sorting (Global): __table1.a ASC Sorting (None) Sorting (Sorting for ORDER BY) -Sorting (Global): a_2 ASC -Sorting (Stream): a_2 ASC +Sorting (Global): __table3.a ASC +Sorting (Stream): __table3.a ASC Sorting (Stream): a ASC -- aliases DONT break sorting order -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a, b FROM (SELECT x AS a, y AS b FROM (SELECT a AS x, b AS y FROM optimize_sorting) ORDER BY x, y) @@ -135,8 +135,8 @@ Sorting (Stream): a ASC, b ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a, b FROM (SELECT x AS a, y AS b FROM (SELECT a AS x, b AS y FROM optimize_sorting) ORDER BY x, y) Sorting (Global): a ASC, b ASC Sorting (Sorting for ORDER BY) -Sorting (Global): x_2 ASC, y_3 ASC -Sorting (Stream): x_2 ASC, y_3 ASC +Sorting (Global): __table2.x ASC, __table2.y ASC +Sorting (Stream): __table2.x ASC, __table2.y ASC Sorting (Stream): a ASC, b ASC -- actions chain breaks sorting order: input(column a)->sipHash64(column a)->alias(sipHash64(column a), a)->plus(alias a, 1) -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a, z FROM (SELECT sipHash64(a) AS a, a + 1 AS z FROM (SELECT a FROM optimize_sorting ORDER BY a + 1)) ORDER BY a + 1 @@ -151,11 +151,11 @@ Sorting (Chunk): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN actions=1, header=1, sorting=1 SELECT a, z FROM (SELECT sipHash64(a) AS a, a + 1 AS z FROM (SELECT a FROM optimize_sorting ORDER BY a + 1)) ORDER BY a + 1 Sorting (None) Sorting (Sorting for ORDER BY) -Sorting (Global): plus(a_0, 1_UInt8) ASC 
-Sorting (Global): plus(a_3, 1_UInt8) ASC +Sorting (Global): plus(__table1.a, 1_UInt8) ASC +Sorting (Global): plus(__table3.a, 1_UInt8) ASC Sorting (Sorting for ORDER BY) -Sorting (Global): plus(a_3, 1_UInt8) ASC -Sorting (Chunk): a_3 ASC +Sorting (Global): plus(__table3.a, 1_UInt8) ASC +Sorting (Chunk): __table3.a ASC Sorting (Chunk): a ASC -- check that correct sorting info is provided in case of only prefix of sorting key is in ORDER BY clause but all sorting key columns returned by query -- QUERY: set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN sorting=1 SELECT a, b FROM optimize_sorting ORDER BY a @@ -167,6 +167,6 @@ Sorting (Stream): a ASC -- QUERY (analyzer): set optimize_read_in_order=1;set max_threads=3;set query_plan_remove_redundant_sorting=0;EXPLAIN PLAN sorting=1 SELECT a, b FROM optimize_sorting ORDER BY a Sorting (Global): a ASC Sorting (Sorting for ORDER BY) -Sorting (Global): a_0 ASC -Sorting (Stream): a_0 ASC +Sorting (Global): __table1.a ASC +Sorting (Stream): __table1.a ASC Sorting (Stream): a ASC diff --git a/tests/queries/0_stateless/02378_analyzer_projection_names.reference b/tests/queries/0_stateless/02378_analyzer_projection_names.reference index f8b18e6df15..fd5bc7d4ae8 100644 --- a/tests/queries/0_stateless/02378_analyzer_projection_names.reference +++ b/tests/queries/0_stateless/02378_analyzer_projection_names.reference @@ -13,7 +13,7 @@ concat(\'Value_1\', \'Value_2\') String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)')); -CAST((1, \'Value\'), \'Tuple (id UInt64, value String)\') Tuple(id UInt64, value String) +CAST((1, \'Value\'), \'Tuple (id UInt64, value String)\') Tuple(\n id UInt64,\n value String) SELECT 'Columns'; Columns DESCRIBE (SELECT test_table.id, test_table.id, id FROM test_table); @@ -77,45 +77,45 @@ e String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, a.id, a.value); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) a.id UInt64 a.value String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, a.*); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) a.id UInt64 a.value String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, a.* EXCEPT id); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) a.value String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, a.* EXCEPT value); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) a.id UInt64 SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, a.* EXCEPT value APPLY toString); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) toString(a.id) String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, a.* EXCEPT value APPLY x -> toString(x)); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) toString(a.id) String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, untuple(a)); -a Tuple(id UInt64, value String) +a Tuple(\n id UInt64,\n value String) tupleElement(a, \'id\') UInt64 tupleElement(a, \'value\') String SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS a, untuple(a) AS b); -a Tuple(id UInt64, 
value String) +a Tuple(\n id UInt64,\n value String) b.id UInt64 b.value String SELECT 'Columns with aliases'; @@ -199,63 +199,63 @@ arrayMap(lambda(tuple(x), toString(id)), [1, 2, 3]) Array(String) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS compound_value, arrayMap(x -> compound_value.*, [1,2,3])); -compound_value Tuple(id UInt64) +compound_value Tuple(\n id UInt64) arrayMap(lambda(tuple(x), compound_value.id), [1, 2, 3]) Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS compound_value, arrayMap(x -> compound_value.* APPLY x -> x, [1,2,3])); -compound_value Tuple(id UInt64) +compound_value Tuple(\n id UInt64) arrayMap(lambda(tuple(x), compound_value.id), [1, 2, 3]) Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS compound_value, arrayMap(x -> compound_value.* APPLY toString, [1,2,3])); -compound_value Tuple(id UInt64) +compound_value Tuple(\n id UInt64) arrayMap(lambda(tuple(x), toString(compound_value.id)), [1, 2, 3]) Array(String) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS compound_value, arrayMap(x -> compound_value.* APPLY x -> toString(x), [1,2,3])); -compound_value Tuple(id UInt64) +compound_value Tuple(\n id UInt64) arrayMap(lambda(tuple(x), toString(compound_value.id)), [1, 2, 3]) Array(String) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS compound_value, arrayMap(x -> compound_value.* EXCEPT value, [1,2,3])); -compound_value Tuple(id UInt64, value String) +compound_value Tuple(\n id UInt64,\n value String) arrayMap(lambda(tuple(x), compound_value.id), [1, 2, 3]) Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS compound_value, arrayMap(x -> compound_value.* EXCEPT value APPLY x -> x, [1,2,3])); -compound_value Tuple(id UInt64, value String) +compound_value Tuple(\n id UInt64,\n value String) arrayMap(lambda(tuple(x), compound_value.id), [1, 2, 3]) Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS compound_value, arrayMap(x -> compound_value.* EXCEPT value APPLY toString, [1,2,3])); -compound_value Tuple(id UInt64, value String) +compound_value Tuple(\n id UInt64,\n value String) arrayMap(lambda(tuple(x), toString(compound_value.id)), [1, 2, 3]) Array(String) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1, 'Value'), 'Tuple (id UInt64, value String)') AS compound_value, arrayMap(x -> compound_value.* EXCEPT value APPLY x -> toString(x), [1,2,3])); -compound_value Tuple(id UInt64, value String) +compound_value Tuple(\n id UInt64,\n value String) arrayMap(lambda(tuple(x), toString(compound_value.id)), [1, 2, 3]) Array(String) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS a, arrayMap(x -> untuple(a), [1,2,3]) FROM test_table); -a Tuple(id UInt64) +a Tuple(\n id UInt64) arrayMap(lambda(tuple(x), tupleElement(a, \'id\')), [1, 2, 3]) Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS a, arrayMap(x -> untuple(a) AS untupled_value, [1,2,3]) FROM test_table); -a Tuple(id UInt64) +a Tuple(\n id UInt64) arrayMap(untupled_value, [1, 2, 3]) Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS a, untuple(a) AS untupled_value, arrayMap(x -> untupled_value, [1,2,3]) FROM test_table); -a Tuple(id UInt64) +a Tuple(\n id UInt64) untupled_value.id UInt64 arrayMap(lambda(tuple(x), untupled_value.id), [1, 2, 3]) 
Array(UInt64) SELECT '--'; -- DESCRIBE (SELECT cast(tuple(1), 'Tuple (id UInt64)') AS a, untuple(a) AS untupled_value, arrayMap(x -> untupled_value AS untupled_value_in_lambda, [1,2,3]) FROM test_table); -a Tuple(id UInt64) +a Tuple(\n id UInt64) untupled_value.id UInt64 arrayMap(untupled_value_in_lambda, [1, 2, 3]) Array(UInt64) SELECT 'Standalone lambda'; @@ -285,13 +285,13 @@ arrayMap(lambda(tuple(x), _subquery_3), [1, 2, 3]) Array(Nullable(UInt8)) SELECT '--'; -- DESCRIBE (SELECT (SELECT 1 AS a, 2 AS b) AS c, c.a, c.b); -c Tuple(a UInt8, b UInt8) +c Tuple(\n a UInt8,\n b UInt8) c.a UInt8 c.b UInt8 SELECT '--'; -- DESCRIBE (SELECT (SELECT 1 AS a, 2 AS b) AS c, c.*); -c Tuple(a UInt8, b UInt8) +c Tuple(\n a UInt8,\n b UInt8) c.a UInt8 c.b UInt8 SELECT '--'; @@ -311,13 +311,13 @@ arrayMap(lambda(tuple(x), _subquery_3), [1, 2, 3]) Array(Nullable(UInt8)) SELECT '--'; -- DESCRIBE (SELECT (SELECT 1 AS a, 2 AS b UNION DISTINCT SELECT 1, 2) AS c, c.a, c.b); -c Tuple(a UInt8, b UInt8) +c Tuple(\n a UInt8,\n b UInt8) c.a UInt8 c.b UInt8 SELECT '--'; -- DESCRIBE (SELECT (SELECT 1 AS a, 2 AS b UNION DISTINCT SELECT 1, 2) AS c, c.*); -c Tuple(a UInt8, b UInt8) +c Tuple(\n a UInt8,\n b UInt8) c.a UInt8 c.b UInt8 SELECT '--'; diff --git a/tests/queries/0_stateless/02381_join_dup_columns_in_plan.reference b/tests/queries/0_stateless/02381_join_dup_columns_in_plan.reference index dd5c9d4616e..5dd39c39852 100644 --- a/tests/queries/0_stateless/02381_join_dup_columns_in_plan.reference +++ b/tests/queries/0_stateless/02381_join_dup_columns_in_plan.reference @@ -2,51 +2,51 @@ Expression Header: key String value String Join - Header: key_0 String - value_1 String + Header: __table1.key String + __table3.value String Expression - Header: key_0 String + Header: __table1.key String ReadFromStorage Header: dummy UInt8 Union - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String Expression - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String ReadFromStorage Header: dummy UInt8 Expression - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String ReadFromStorage Header: dummy UInt8 Expression Header: key String value String Join - Header: key_0 String - key_2 String - value_1 String + Header: __table1.key String + __table3.key String + __table3.value String Sorting - Header: key_0 String + Header: __table1.key String Expression - Header: key_0 String + Header: __table1.key String ReadFromStorage Header: dummy UInt8 Sorting - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String Union - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String Expression - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String ReadFromStorage Header: dummy UInt8 Expression - Header: key_2 String - value_1 String + Header: __table3.key String + __table3.value String ReadFromStorage Header: dummy UInt8 diff --git a/tests/queries/0_stateless/02416_json_tuple_to_array_schema_inference.reference b/tests/queries/0_stateless/02416_json_tuple_to_array_schema_inference.reference index 57cafb6c8e0..3f4eeac37b3 100644 --- a/tests/queries/0_stateless/02416_json_tuple_to_array_schema_inference.reference +++ b/tests/queries/0_stateless/02416_json_tuple_to_array_schema_inference.reference @@ -1,3 +1,3 @@ x Array(Array(Nullable(Int64))) x Tuple(Array(Array(Nullable(Int64))), Nullable(Int64)) -x Tuple(key Array(Nullable(Int64))) 
+x Tuple(\n key Array(Nullable(Int64))) diff --git a/tests/queries/0_stateless/02421_explain_subquery.sql b/tests/queries/0_stateless/02421_explain_subquery.sql index 4b970f81219..2970003cb1c 100644 --- a/tests/queries/0_stateless/02421_explain_subquery.sql +++ b/tests/queries/0_stateless/02421_explain_subquery.sql @@ -34,7 +34,7 @@ DROP TABLE t1; SET allow_experimental_analyzer = 1; -SELECT count() > 3 FROM (EXPLAIN PIPELINE header = 1 SELECT * FROM system.numbers ORDER BY number DESC) WHERE explain LIKE '%Header: number__ UInt64%'; +SELECT count() > 3 FROM (EXPLAIN PIPELINE header = 1 SELECT * FROM system.numbers ORDER BY number DESC) WHERE explain LIKE '%Header: \_\_table1.number UInt64%'; SELECT count() > 0 FROM (EXPLAIN PLAN SELECT * FROM system.numbers ORDER BY number DESC) WHERE explain ILIKE '%Sort%'; SELECT count() > 0 FROM (EXPLAIN SELECT * FROM system.numbers ORDER BY number DESC) WHERE explain ILIKE '%Sort%'; SELECT count() > 0 FROM (EXPLAIN CURRENT TRANSACTION); diff --git a/tests/queries/0_stateless/02421_type_json_empty_parts.reference b/tests/queries/0_stateless/02421_type_json_empty_parts.reference index f360b4b92cd..3c1d2aafec1 100644 --- a/tests/queries/0_stateless/02421_type_json_empty_parts.reference +++ b/tests/queries/0_stateless/02421_type_json_empty_parts.reference @@ -3,24 +3,24 @@ Collapsing 0 id UInt64 s Int8 -data Tuple(_dummy UInt8) +data Tuple(\n _dummy UInt8) DELETE all 2 1 id UInt64 -data Tuple(k1 String, k2 String) +data Tuple(\n k1 String,\n k2 String) 0 0 id UInt64 -data Tuple(_dummy UInt8) +data Tuple(\n _dummy UInt8) TTL 1 1 id UInt64 d Date -data Tuple(k1 String, k2 String) +data Tuple(\n k1 String,\n k2 String) 0 0 id UInt64 d Date -data Tuple(_dummy UInt8) +data Tuple(\n _dummy UInt8) diff --git a/tests/queries/0_stateless/02447_drop_database_replica.reference b/tests/queries/0_stateless/02447_drop_database_replica.reference index f2b41569540..1af3ee244f1 100644 --- a/tests/queries/0_stateless/02447_drop_database_replica.reference +++ b/tests/queries/0_stateless/02447_drop_database_replica.reference @@ -13,9 +13,15 @@ t rdb_default 1 1 s1 r1 1 2 2 +s1 r1 OK 2 0 +s1 r2 QUEUED 2 0 +s2 r1 QUEUED 2 0 +2 rdb_default 1 1 s1 r1 1 rdb_default 1 2 s1 r2 0 2 2 t +t2 +t3 rdb_default_4 1 1 s1 r1 1 diff --git a/tests/queries/0_stateless/02447_drop_database_replica.sh b/tests/queries/0_stateless/02447_drop_database_replica.sh index d5b3ceef46a..fb89db5045b 100755 --- a/tests/queries/0_stateless/02447_drop_database_replica.sh +++ b/tests/queries/0_stateless/02447_drop_database_replica.sh @@ -32,6 +32,10 @@ $CLICKHOUSE_CLIENT -q "system sync database replica $db" $CLICKHOUSE_CLIENT -q "select cluster, shard_num, replica_num, database_shard_name, database_replica_name, is_active from system.clusters where cluster='$db' and shard_num=1 and replica_num=1" $CLICKHOUSE_CLIENT -q "system drop database replica 's1|r1' from database $db2" 2>&1| grep -Fac "is active, cannot drop it" +# Also check that it doesn't exceed distributed_ddl_task_timeout waiting for inactive replicas +timeout 60s $CLICKHOUSE_CLIENT --distributed_ddl_task_timeout=1000 --distributed_ddl_output_mode=throw_only_active -q "create table $db.t2 (n int) engine=Log" 2>&1| grep -Fac "TIMEOUT_EXCEEDED" +timeout 60s $CLICKHOUSE_CLIENT --distributed_ddl_task_timeout=1000 --distributed_ddl_output_mode=null_status_on_timeout_only_active -q "create table $db.t3 (n int) engine=Log" | sort + $CLICKHOUSE_CLIENT -q "detach database $db3" $CLICKHOUSE_CLIENT -q "system drop database replica 'r1' from shard 's2' from 
database $db" $CLICKHOUSE_CLIENT -q "attach database $db3" 2>/dev/null diff --git a/tests/queries/0_stateless/02451_order_by_monotonic.reference b/tests/queries/0_stateless/02451_order_by_monotonic.reference index 05f20a9bad8..4b2f9f7e227 100644 --- a/tests/queries/0_stateless/02451_order_by_monotonic.reference +++ b/tests/queries/0_stateless/02451_order_by_monotonic.reference @@ -4,19 +4,19 @@ 2022-09-09 12:00:00 0x 2022-09-09 12:00:00 1 2022-09-09 12:00:00 1x - Prefix sort description: toStartOfMinute(t_0) ASC - Result sort description: toStartOfMinute(t_0) ASC, c1_1 ASC - Prefix sort description: toStartOfMinute(t_0) ASC - Result sort description: toStartOfMinute(t_0) ASC - Prefix sort description: negate(a_0) ASC - Result sort description: negate(a_0) ASC - Prefix sort description: negate(a_0) ASC, negate(b_1) ASC - Result sort description: negate(a_0) ASC, negate(b_1) ASC - Prefix sort description: a_0 DESC, negate(b_1) ASC - Result sort description: a_0 DESC, negate(b_1) ASC - Prefix sort description: negate(a_0) ASC, b_1 DESC - Result sort description: negate(a_0) ASC, b_1 DESC - Prefix sort description: negate(a_0) ASC - Result sort description: negate(a_0) ASC, b_1 ASC - Prefix sort description: a_0 ASC - Result sort description: a_0 ASC, negate(b_1) ASC + Prefix sort description: toStartOfMinute(__table1.t) ASC + Result sort description: toStartOfMinute(__table1.t) ASC, __table1.c1 ASC + Prefix sort description: toStartOfMinute(__table1.t) ASC + Result sort description: toStartOfMinute(__table1.t) ASC + Prefix sort description: negate(__table1.a) ASC + Result sort description: negate(__table1.a) ASC + Prefix sort description: negate(__table1.a) ASC, negate(__table1.b) ASC + Result sort description: negate(__table1.a) ASC, negate(__table1.b) ASC + Prefix sort description: __table1.a DESC, negate(__table1.b) ASC + Result sort description: __table1.a DESC, negate(__table1.b) ASC + Prefix sort description: negate(__table1.a) ASC, __table1.b DESC + Result sort description: negate(__table1.a) ASC, __table1.b DESC + Prefix sort description: negate(__table1.a) ASC + Result sort description: negate(__table1.a) ASC, __table1.b ASC + Prefix sort description: __table1.a ASC + Result sort description: __table1.a ASC, negate(__table1.b) ASC diff --git a/tests/queries/0_stateless/02475_bson_each_row_format.reference b/tests/queries/0_stateless/02475_bson_each_row_format.reference index f90867d92b1..5659e5201b1 100644 --- a/tests/queries/0_stateless/02475_bson_each_row_format.reference +++ b/tests/queries/0_stateless/02475_bson_each_row_format.reference @@ -166,7 +166,7 @@ Tuple ('Hello',4) OK OK -tuple Tuple(x Nullable(Int64), s Nullable(String)) +tuple Tuple(\n x Nullable(Int64),\n s Nullable(String)) (0,'Hello') (1,'Hello') (2,'Hello') @@ -214,7 +214,7 @@ Nested types [[0,1,2],[0,1,2,3]] ((3,'Hello'),'Hello') {'a':{'a.a':3,'a.b':4},'b':{'b.a':3,'b.b':4}} [[0,1,2,3],[0,1,2,3,4]] ((4,'Hello'),'Hello') {'a':{'a.a':4,'a.b':5},'b':{'b.a':4,'b.b':5}} nested1 Array(Array(Nullable(Int64))) -nested2 Tuple(Tuple(x Nullable(Int64), s Nullable(String)), Nullable(String)) +nested2 Tuple(Tuple(\n x Nullable(Int64),\n s Nullable(String)), Nullable(String)) nested3 Map(String, Map(String, Nullable(Int64))) [[],[0]] ((0,'Hello'),'Hello') {'a':{'a.a':0,'a.b':1},'b':{'b.a':0,'b.b':1}} [[0],[0,1]] ((1,'Hello'),'Hello') {'a':{'a.a':1,'a.b':2},'b':{'b.a':1,'b.b':2}} diff --git a/tests/queries/0_stateless/02476_fuse_sum_count.reference b/tests/queries/0_stateless/02476_fuse_sum_count.reference index 
diff --git a/tests/queries/0_stateless/02476_fuse_sum_count.reference b/tests/queries/0_stateless/02476_fuse_sum_count.reference
index 43a39e8b7e5..1eb156743b0 100644
--- a/tests/queries/0_stateless/02476_fuse_sum_count.reference
+++ b/tests/queries/0_stateless/02476_fuse_sum_count.reference
@@ -21,7 +21,7 @@ QUERY id: 0
 LIST id: 7, nodes: 1
 COLUMN id: 4, column_name: a, result_type: Nullable(Int8), source_id: 5
 JOIN TREE
- TABLE id: 5, table_name: default.fuse_tbl
+ TABLE id: 5, alias: __table1, table_name: default.fuse_tbl
 QUERY id: 0
 PROJECTION COLUMNS
 sum(b) Int64
@@ -59,7 +59,7 @@ QUERY id: 0
 COLUMN id: 6, column_name: b, result_type: Int8, source_id: 7
 CONSTANT id: 18, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE id: 7, table_name: default.fuse_tbl
+ TABLE id: 7, alias: __table1, table_name: default.fuse_tbl
 QUERY id: 0
 PROJECTION COLUMNS
 sum(plus(a, 1)) Nullable(Int64)
@@ -138,7 +138,7 @@ QUERY id: 0
 LIST id: 39, nodes: 1
 COLUMN id: 6, column_name: a, result_type: Nullable(Int8), source_id: 7
 JOIN TREE
- TABLE id: 7, table_name: default.fuse_tbl
+ TABLE id: 7, alias: __table1, table_name: default.fuse_tbl
 QUERY id: 0
 PROJECTION COLUMNS
 multiply(avg(b), 3) Float64
@@ -215,14 +215,14 @@ QUERY id: 0
 COLUMN id: 10, column_name: b, result_type: Int8, source_id: 11
 CONSTANT id: 37, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- QUERY id: 11, is_subquery: 1
+ QUERY id: 11, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 b Int8
 PROJECTION
 LIST id: 38, nodes: 1
 COLUMN id: 39, column_name: b, result_type: Int8, source_id: 40
 JOIN TREE
- TABLE id: 40, table_name: default.fuse_tbl
+ TABLE id: 40, alias: __table2, table_name: default.fuse_tbl
 QUERY id: 0
 PROJECTION COLUMNS
 sum(b) Int64
@@ -246,14 +246,14 @@ QUERY id: 0
 COLUMN id: 6, column_name: b, result_type: Int64, source_id: 7
 CONSTANT id: 11, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- QUERY id: 7, is_subquery: 1
+ QUERY id: 7, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 b Int64
 PROJECTION
 LIST id: 12, nodes: 1
 COLUMN id: 13, column_name: x, result_type: Int64, source_id: 14
 JOIN TREE
- QUERY id: 14, is_subquery: 1
+ QUERY id: 14, alias: __table2, is_subquery: 1
 PROJECTION COLUMNS
 x Int64
 count(b) UInt64
@@ -276,7 +276,7 @@ QUERY id: 0
 COLUMN id: 20, column_name: b, result_type: Int8, source_id: 21
 CONSTANT id: 25, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE id: 21, table_name: default.fuse_tbl
+ TABLE id: 21, alias: __table3, table_name: default.fuse_tbl
 0 0 nan
 0 0 nan
 45 10 4.5 Decimal(38, 0) UInt64 Float64
diff --git a/tests/queries/0_stateless/02477_fuse_quantiles.reference b/tests/queries/0_stateless/02477_fuse_quantiles.reference
index 7c7d581f7fb..7603381416c 100644
--- a/tests/queries/0_stateless/02477_fuse_quantiles.reference
+++ b/tests/queries/0_stateless/02477_fuse_quantiles.reference
@@ -34,7 +34,7 @@ QUERY id: 0
 COLUMN id: 9, column_name: b, result_type: Float64, source_id: 10
 CONSTANT id: 14, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- QUERY id: 10, is_subquery: 1
+ QUERY id: 10, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 b Float64
 PROJECTION
@@ -45,7 +45,7 @@ QUERY id: 0
 COLUMN id: 18, column_name: x, result_type: Float64, source_id: 19
 CONSTANT id: 20, constant_value: UInt64_1, constant_value_type: UInt8
 JOIN TREE
- QUERY id: 19, is_subquery: 1
+ QUERY id: 19, alias: __table2, is_subquery: 1
 PROJECTION COLUMNS
 x Float64
 quantile(0.9)(b) Float64
@@ -76,7 +76,7 @@ QUERY id: 0
 COLUMN id: 29, column_name: b, result_type: Int32, source_id: 30
 CONSTANT id: 34, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE id: 30, table_name: default.fuse_tbl
+ TABLE id: 30, alias: __table3, table_name: default.fuse_tbl
 GROUP BY
 LIST id: 35, nodes: 1
 COLUMN id: 18, column_name: x, result_type: Float64, source_id: 19
diff --git a/tests/queries/0_stateless/02477_logical_expressions_optimizer_low_cardinality.reference b/tests/queries/0_stateless/02477_logical_expressions_optimizer_low_cardinality.reference
index ff5f7e5a687..649b037fafa 100644
--- a/tests/queries/0_stateless/02477_logical_expressions_optimizer_low_cardinality.reference
+++ b/tests/queries/0_stateless/02477_logical_expressions_optimizer_low_cardinality.reference
@@ -8,7 +8,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
+ TABLE id: 3, alias: __table1, table_name: default.t_logical_expressions_optimizer_low_cardinality
 WHERE
 FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -26,7 +26,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
+ TABLE id: 3, alias: __table1, table_name: default.t_logical_expressions_optimizer_low_cardinality
 WHERE
 FUNCTION id: 4, function_name: in, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -44,7 +44,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
+ TABLE id: 3, alias: __table1, table_name: default.t_logical_expressions_optimizer_low_cardinality
 WHERE
 FUNCTION id: 4, function_name: notIn, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -62,7 +62,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
+ TABLE id: 3, alias: __table1, table_name: default.t_logical_expressions_optimizer_low_cardinality
 WHERE
 FUNCTION id: 4, function_name: notIn, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -80,7 +80,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
+ TABLE id: 3, alias: __table1, table_name: default.t_logical_expressions_optimizer_low_cardinality
 WHERE
 FUNCTION id: 4, function_name: or, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -106,7 +106,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: a, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.t_logical_expressions_optimizer_low_cardinality
+ TABLE id: 3, alias: __table1, table_name: default.t_logical_expressions_optimizer_low_cardinality
 WHERE
 FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8
 ARGUMENTS
diff --git a/tests/queries/0_stateless/02479_analyzer_join_with_constants.sql b/tests/queries/0_stateless/02479_analyzer_join_with_constants.sql
index 99f20290ff0..50248665bc9 100644
--- a/tests/queries/0_stateless/02479_analyzer_join_with_constants.sql
+++ b/tests/queries/0_stateless/02479_analyzer_join_with_constants.sql
@@ -24,4 +24,4 @@ SELECT * FROM (SELECT 1 AS id, 1 AS value) AS t1 ASOF LEFT JOIN (SELECT 1 AS id,
 
 SELECT '--';
 
-SELECT b.dt FROM (SELECT NULL > NULL AS pk, 1 AS dt FROM numbers(5)) AS a ASOF LEFT JOIN (SELECT NULL AS pk, 1 AS dt) AS b ON (a.pk = b.pk) AND 1 != 1 AND (a.dt >= b.dt); -- { serverError 403 }
+SELECT b.dt FROM (SELECT NULL > NULL AS pk, 1 AS dt FROM numbers(5)) AS a ASOF LEFT JOIN (SELECT NULL AS pk, 1 AS dt) AS b ON (a.pk = b.pk) AND 1 != 1 AND (a.dt >= b.dt); -- { serverError 403, NOT_FOUND_COLUMN_IN_BLOCK }
diff --git a/tests/queries/0_stateless/02479_mysql_connect_to_self.reference b/tests/queries/0_stateless/02479_mysql_connect_to_self.reference
index f4dd01bc184..6838dacc3b3 100644
--- a/tests/queries/0_stateless/02479_mysql_connect_to_self.reference
+++ b/tests/queries/0_stateless/02479_mysql_connect_to_self.reference
@@ -50,7 +50,7 @@ QUERY id: 0
 COLUMN id: 5, column_name: b, result_type: String, source_id: 3
 COLUMN id: 6, column_name: c, result_type: String, source_id: 3
 JOIN TREE
- TABLE_FUNCTION id: 3, table_function_name: mysql
+ TABLE_FUNCTION id: 3, alias: __table1, table_function_name: mysql
 ARGUMENTS
 LIST id: 7, nodes: 5
 CONSTANT id: 8, constant_value: \'127.0.0.1:9004\', constant_value_type: String
@@ -63,10 +63,10 @@ QUERY id: 0
 SETTINGS connection_wait_timeout=123 connect_timeout=40123002 read_write_timeout=40123001 connection_pool_size=3
 SELECT
- key AS key,
- a AS a,
- b AS b,
- c AS c
-FROM mysql(\'127.0.0.1:9004\', \'default\', foo, \'default\', \'\', SETTINGS connection_wait_timeout = 123, connect_timeout = 40123002, read_write_timeout = 40123001, connection_pool_size = 3)
+ __table1.key AS key,
+ __table1.a AS a,
+ __table1.b AS b,
+ __table1.c AS c
+FROM mysql(\'127.0.0.1:9004\', \'default\', foo, \'default\', \'\', SETTINGS connection_wait_timeout = 123, connect_timeout = 40123002, read_write_timeout = 40123001, connection_pool_size = 3) AS __table1
 ---
 5
diff --git a/tests/queries/0_stateless/02479_race_condition_between_insert_and_droppin_mv.sh b/tests/queries/0_stateless/02479_race_condition_between_insert_and_droppin_mv.sh
index 9ce4b459fce..6899b31d1d9 100755
--- a/tests/queries/0_stateless/02479_race_condition_between_insert_and_droppin_mv.sh
+++ b/tests/queries/0_stateless/02479_race_condition_between_insert_and_droppin_mv.sh
@@ -14,7 +14,7 @@ function insert
 {
 offset=500
 while true; do
- ${CLICKHOUSE_CLIENT} -q "INSERT INTO test_race_condition_landing SELECT number, toString(number), toString(number) from system.numbers limit $i, $offset"
+ ${CLICKHOUSE_CLIENT} -q "INSERT INTO test_race_condition_landing SELECT number, toString(number), toString(number) from system.numbers limit $i, $offset settings ignore_materialized_views_with_dropped_target_table=1"
 i=$(( $i + $RANDOM % 100 + 400 ))
 done
 }
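The settings clause added to the INSERT above is what keeps this stress scenario from failing spuriously: with ignore_materialized_views_with_dropped_target_table = 1, an INSERT into a source table skips a materialized view whose target table has already been dropped instead of aborting. The same statement in standalone form (assuming the test's landing table and views exist):

INSERT INTO test_race_condition_landing
SELECT number, toString(number), toString(number)
FROM system.numbers
LIMIT 100
SETTINGS ignore_materialized_views_with_dropped_target_table = 1;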
diff --git a/tests/queries/0_stateless/02481_aggregation_in_order_plan.reference b/tests/queries/0_stateless/02481_aggregation_in_order_plan.reference
index b11f3e3a1d3..969ec320790 100644
--- a/tests/queries/0_stateless/02481_aggregation_in_order_plan.reference
+++ b/tests/queries/0_stateless/02481_aggregation_in_order_plan.reference
@@ -6,5 +6,5 @@
 Order: a ASC, c ASC
 ReadFromMergeTree (default.tab)
 Aggregating
- Order: a_0 ASC, c_2 ASC
+ Order: __table1.a ASC, __table1.c ASC
 ReadFromMergeTree (default.tab)
diff --git a/tests/queries/0_stateless/02481_analyzer_optimize_aggregation_arithmetics.reference b/tests/queries/0_stateless/02481_analyzer_optimize_aggregation_arithmetics.reference
index 22dda253066..a26773baae2 100644
--- a/tests/queries/0_stateless/02481_analyzer_optimize_aggregation_arithmetics.reference
+++ b/tests/queries/0_stateless/02481_analyzer_optimize_aggregation_arithmetics.reference
@@ -20,7 +20,7 @@ QUERY id: 0
 LIST id: 9, nodes: 1
 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10, constant_value_type: UInt8
@@ -44,7 +44,7 @@ QUERY id: 0
 LIST id: 10, nodes: 1
 CONSTANT id: 11, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 7, table_function_name: numbers
+ TABLE_FUNCTION id: 7, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10, constant_value_type: UInt8
diff --git a/tests/queries/0_stateless/02481_analyzer_optimize_grouping_sets_keys.reference b/tests/queries/0_stateless/02481_analyzer_optimize_grouping_sets_keys.reference
index 03722034708..9f9c1da5e88 100644
--- a/tests/queries/0_stateless/02481_analyzer_optimize_grouping_sets_keys.reference
+++ b/tests/queries/0_stateless/02481_analyzer_optimize_grouping_sets_keys.reference
@@ -17,7 +17,7 @@ QUERY id: 0, group_by_type: grouping_sets
 LIST id: 9, nodes: 1
 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32
@@ -103,7 +103,7 @@ QUERY id: 0, group_by_type: grouping_sets
 LIST id: 9, nodes: 1
 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32
@@ -180,7 +180,7 @@ QUERY id: 0, group_by_type: grouping_sets
 LIST id: 9, nodes: 1
 COLUMN id: 10, column_name: number, result_type: UInt64, source_id: 11
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10000000, constant_value_type: UInt32
@@ -253,7 +253,7 @@ QUERY id: 0, group_by_type: grouping_sets
 LIST id: 1, nodes: 1
 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64
 JOIN TREE
- TABLE_FUNCTION id: 3, table_function_name: numbers
+ TABLE_FUNCTION id: 3, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 4, nodes: 1
 CONSTANT id: 5, constant_value: UInt64_1000, constant_value_type: UInt16
diff --git a/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.reference b/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.reference
index 2ece1147d78..824d4bbec98 100644
--- a/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.reference
+++ b/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.reference
@@ -13,3 +13,5 @@
 5 rmt2
 7 rmt2
 9 rmt2
+1
+3
diff --git a/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.sql b/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.sql
index 52e8be236c8..5c90313b6b8 100644
--- a/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.sql
+++ b/tests/queries/0_stateless/02486_truncate_and_unexpected_parts.sql
@@ -50,3 +50,20 @@ system sync replica rmt1;
 system sync replica rmt2;
 
 select *, _table from merge(currentDatabase(), '') order by _table, (*,);
+
+
+create table rmt3 (n int) engine=ReplicatedMergeTree('/test/02468/{database}3', '1') order by tuple() settings replicated_max_ratio_of_wrong_parts=0, max_suspicious_broken_parts=0, max_suspicious_broken_parts_bytes=0;
+set insert_keeper_fault_injection_probability=0;
+insert into rmt3 values (1);
+insert into rmt3 values (2);
+insert into rmt3 values (3);
+
+system stop cleanup rmt3;
+system sync replica rmt3 pull;
+alter table rmt3 drop part 'all_1_1_0';
+optimize table rmt3 final;
+
+detach table rmt3 sync;
+attach table rmt3;
+
+select * from rmt3 order by n;
diff --git a/tests/queries/0_stateless/02493_analyzer_sum_if_to_count_if.reference b/tests/queries/0_stateless/02493_analyzer_sum_if_to_count_if.reference
index eccf51501ed..23e91dc2703 100644
--- a/tests/queries/0_stateless/02493_analyzer_sum_if_to_count_if.reference
+++ b/tests/queries/0_stateless/02493_analyzer_sum_if_to_count_if.reference
@@ -16,7 +16,7 @@ QUERY id: 0
 CONSTANT id: 10, constant_value: UInt64_2, constant_value_type: UInt8
 CONSTANT id: 11, constant_value: UInt64_0, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10, constant_value_type: UInt8
@@ -41,7 +41,7 @@ QUERY id: 0
 CONSTANT id: 10, constant_value: UInt64_2, constant_value_type: UInt8
 CONSTANT id: 11, constant_value: UInt64_0, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10, constant_value_type: UInt8
@@ -69,7 +69,7 @@ QUERY id: 0
 CONSTANT id: 12, constant_value: UInt64_2, constant_value_type: UInt8
 CONSTANT id: 13, constant_value: UInt64_0, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 14, nodes: 1
 CONSTANT id: 15, constant_value: UInt64_10, constant_value_type: UInt8
diff --git a/tests/queries/0_stateless/02493_analyzer_uniq_injective_functions_elimination.reference b/tests/queries/0_stateless/02493_analyzer_uniq_injective_functions_elimination.reference
index 5b808310f0e..01d7fa2a2cb 100644
--- a/tests/queries/0_stateless/02493_analyzer_uniq_injective_functions_elimination.reference
+++ b/tests/queries/0_stateless/02493_analyzer_uniq_injective_functions_elimination.reference
@@ -13,7 +13,7 @@ QUERY id: 0
 LIST id: 6, nodes: 1
 CONSTANT id: 7, constant_value: \'\', constant_value_type: String
 JOIN TREE
- TABLE_FUNCTION id: 8, table_function_name: numbers
+ TABLE_FUNCTION id: 8, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 9, nodes: 1
 CONSTANT id: 10, constant_value: UInt64_1, constant_value_type: UInt8
diff --git a/tests/queries/0_stateless/02497_if_transform_strings_to_enum.reference b/tests/queries/0_stateless/02497_if_transform_strings_to_enum.reference
index 88f23334d31..d77fd1028f2 100644
--- a/tests/queries/0_stateless/02497_if_transform_strings_to_enum.reference
+++ b/tests/queries/0_stateless/02497_if_transform_strings_to_enum.reference
@@ -35,7 +35,7 @@ QUERY id: 0
 CONSTANT id: 15, constant_value: \'other\', constant_value_type: String
 CONSTANT id: 16, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\', constant_value_type: String
 JOIN TREE
- TABLE id: 7, table_name: system.numbers
+ TABLE id: 7, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 17, constant_value: UInt64_10, constant_value_type: UInt64
 google
@@ -78,7 +78,7 @@ QUERY id: 0
 CONSTANT id: 17, constant_value: \'google\', constant_value_type: String
 CONSTANT id: 18, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2)\', constant_value_type: String
 JOIN TREE
- TABLE id: 9, table_name: system.numbers
+ TABLE id: 9, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 19, constant_value: UInt64_10, constant_value_type: UInt64
 other1
@@ -122,7 +122,7 @@ QUERY id: 0
 CONSTANT id: 18, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\', constant_value_type: String
 CONSTANT id: 19, constant_value: \'1\', constant_value_type: String
 JOIN TREE
- TABLE id: 9, table_name: system.numbers
+ TABLE id: 9, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 20, constant_value: UInt64_10, constant_value_type: UInt64
 google1
@@ -169,7 +169,7 @@ QUERY id: 0
 CONSTANT id: 20, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2)\', constant_value_type: String
 CONSTANT id: 21, constant_value: \'1\', constant_value_type: String
 JOIN TREE
- TABLE id: 11, table_name: system.numbers
+ TABLE id: 11, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 22, constant_value: UInt64_10, constant_value_type: UInt64
 google
@@ -196,7 +196,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: String, source_id: 3
 JOIN TREE
- QUERY id: 3, alias: t1, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value String
 PROJECTION
@@ -223,7 +223,7 @@ QUERY id: 0
 CONSTANT id: 20, constant_value: \'google\', constant_value_type: String
 CONSTANT id: 21, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2)\', constant_value_type: String
 JOIN TREE
- TABLE id: 12, table_name: system.numbers
+ TABLE id: 12, alias: __table2, table_name: system.numbers
 LIMIT
 CONSTANT id: 22, constant_value: UInt64_10, constant_value_type: UInt64
 other
@@ -250,7 +250,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: String, source_id: 3
 JOIN TREE
- QUERY id: 3, alias: t1, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value String
 PROJECTION
@@ -274,7 +274,7 @@ QUERY id: 0
 CONSTANT id: 18, constant_value: \'other\', constant_value_type: String
 CONSTANT id: 19, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\', constant_value_type: String
 JOIN TREE
- TABLE id: 10, table_name: system.numbers
+ TABLE id: 10, alias: __table2, table_name: system.numbers
 LIMIT
 CONSTANT id: 20, constant_value: UInt64_10, constant_value_type: UInt64
 google google
@@ -341,7 +341,7 @@ QUERY id: 0
 CONSTANT id: 17, constant_value: \'google\', constant_value_type: String
 CONSTANT id: 18, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2)\', constant_value_type: String
 JOIN TREE
- TABLE id: 9, table_name: system.numbers
+ TABLE id: 9, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 19, constant_value: UInt64_10, constant_value_type: UInt64
 other other
@@ -402,7 +402,7 @@ QUERY id: 0
 CONSTANT id: 15, constant_value: \'other\', constant_value_type: String
 CONSTANT id: 16, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\', constant_value_type: String
 JOIN TREE
- TABLE id: 7, table_name: system.numbers
+ TABLE id: 7, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 17, constant_value: UInt64_10, constant_value_type: UInt64
 other
@@ -446,14 +446,14 @@ QUERY id: 0
 CONSTANT id: 15, constant_value: \'other\', constant_value_type: String
 CONSTANT id: 16, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\', constant_value_type: String
 JOIN TREE
- QUERY id: 7, is_subquery: 1
+ QUERY id: 7, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 number Nullable(Nothing)
 PROJECTION
 LIST id: 17, nodes: 1
 CONSTANT id: 18, constant_value: NULL, constant_value_type: Nullable(Nothing)
 JOIN TREE
- TABLE id: 19, table_name: system.numbers
+ TABLE id: 19, alias: __table2, table_name: system.numbers
 LIMIT
 CONSTANT id: 20, constant_value: UInt64_10, constant_value_type: UInt64
 other
@@ -482,7 +482,7 @@ QUERY id: 0
 CONSTANT id: 7, constant_value: Array_[\'google\', \'censor.net\', \'yahoo\'], constant_value_type: Array(String)
 CONSTANT id: 8, constant_value: \'other\', constant_value_type: String
 JOIN TREE
- TABLE id: 5, table_name: system.numbers
+ TABLE id: 5, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 9, constant_value: UInt64_10, constant_value_type: UInt64
 google
@@ -514,6 +514,6 @@ QUERY id: 0
 CONSTANT id: 9, constant_value: \'censor.net\', constant_value_type: String
 CONSTANT id: 10, constant_value: \'google\', constant_value_type: String
 JOIN TREE
- TABLE id: 7, table_name: system.numbers
+ TABLE id: 7, alias: __table1, table_name: system.numbers
 LIMIT
 CONSTANT id: 11, constant_value: UInt64_10, constant_value_type: UInt64
diff --git a/tests/queries/0_stateless/02498_analyzer_settings_push_down.reference b/tests/queries/0_stateless/02498_analyzer_settings_push_down.reference
index 583da07380e..f24edd96996 100644
--- a/tests/queries/0_stateless/02498_analyzer_settings_push_down.reference
+++ b/tests/queries/0_stateless/02498_analyzer_settings_push_down.reference
@@ -12,7 +12,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: UInt64, source_id: 3
 JOIN TREE
- QUERY id: 3, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value UInt64
 PROJECTION
@@ -23,7 +23,7 @@ QUERY id: 0
 COLUMN id: 7, column_name: value, result_type: Tuple(a UInt64), source_id: 8
 CONSTANT id: 9, constant_value: \'a\', constant_value_type: String
 JOIN TREE
- TABLE id: 8, table_name: default.test_table
+ TABLE id: 8, alias: __table2, table_name: default.test_table
 SELECT '--';
 --
 EXPLAIN QUERY TREE SELECT value FROM (
@@ -36,14 +36,14 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: UInt64, source_id: 3
 JOIN TREE
- QUERY id: 3, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value UInt64
 PROJECTION
 LIST id: 4, nodes: 1
 COLUMN id: 5, column_name: value.a, result_type: UInt64, source_id: 6
 JOIN TREE
- TABLE id: 6, table_name: default.test_table
+ TABLE id: 6, alias: __table2, table_name: default.test_table
 SETTINGS optimize_functions_to_subcolumns=1
 SELECT '--';
 --
@@ -57,7 +57,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: UInt64, source_id: 3
 JOIN TREE
- QUERY id: 3, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value UInt64
 PROJECTION
@@ -68,7 +68,7 @@ QUERY id: 0
 COLUMN id: 7, column_name: value, result_type: Tuple(a UInt64), source_id: 8
 CONSTANT id: 9, constant_value: \'a\', constant_value_type: String
 JOIN TREE
- TABLE id: 8, table_name: default.test_table
+ TABLE id: 8, alias: __table2, table_name: default.test_table
 SETTINGS optimize_functions_to_subcolumns=0
 SETTINGS optimize_functions_to_subcolumns=1
 SELECT '--';
@@ -83,7 +83,7 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: UInt64, source_id: 3
 JOIN TREE
- QUERY id: 3, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value UInt64
 PROJECTION
@@ -94,7 +94,7 @@ QUERY id: 0
 COLUMN id: 7, column_name: value, result_type: Tuple(a UInt64), source_id: 8
 CONSTANT id: 9, constant_value: \'a\', constant_value_type: String
 JOIN TREE
- TABLE id: 8, table_name: default.test_table
+ TABLE id: 8, alias: __table2, table_name: default.test_table
 SETTINGS optimize_functions_to_subcolumns=0
 SELECT '--';
 --
@@ -108,13 +108,13 @@ QUERY id: 0
 LIST id: 1, nodes: 1
 COLUMN id: 2, column_name: value, result_type: UInt64, source_id: 3
 JOIN TREE
- QUERY id: 3, is_subquery: 1
+ QUERY id: 3, alias: __table1, is_subquery: 1
 PROJECTION COLUMNS
 value UInt64
 PROJECTION
 LIST id: 4, nodes: 1
 COLUMN id: 5, column_name: value.a, result_type: UInt64, source_id: 6
 JOIN TREE
- TABLE id: 6, table_name: default.test_table
+ TABLE id: 6, alias: __table2, table_name: default.test_table
 SETTINGS optimize_functions_to_subcolumns=1
 SETTINGS optimize_functions_to_subcolumns=0
diff --git a/tests/queries/0_stateless/02514_analyzer_drop_join_on.reference b/tests/queries/0_stateless/02514_analyzer_drop_join_on.reference
index 51e009dcd91..a5a71560d00 100644
--- a/tests/queries/0_stateless/02514_analyzer_drop_join_on.reference
+++ b/tests/queries/0_stateless/02514_analyzer_drop_join_on.reference
@@ -6,43 +6,43 @@ SELECT count() FROM a JOIN b ON b.b1 = a.a1 JOIN c ON c.c1 = b.b1 JOIN d ON d.d1
 Expression ((Project names + Projection))
 Header: count() UInt64
 Aggregating
- Header: a2_4 String
+ Header: __table1.a2 String
 count() UInt64
 Expression ((Before GROUP BY + DROP unused columns after JOIN))
- Header: a2_4 String
+ Header: __table1.a2 String
 Join (JOIN FillRightFirst)
- Header: a2_4 String
- c1_2 UInt64
+ Header: __table1.a2 String
+ __table3.c1 UInt64
 Expression ((JOIN actions + DROP unused columns after JOIN))
- Header: a2_4 String
- c1_2 UInt64
+ Header: __table1.a2 String
+ __table3.c1 UInt64
 Join (JOIN FillRightFirst)
- Header: a2_4 String
- b1_0 UInt64
- c1_2 UInt64
+ Header: __table1.a2 String
+ __table2.b1 UInt64
+ __table3.c1 UInt64
 Expression ((JOIN actions + DROP unused columns after JOIN))
- Header: a2_4 String
- b1_0 UInt64
+ Header: __table1.a2 String
+ __table2.b1 UInt64
 Join (JOIN FillRightFirst)
- Header: a1_1 UInt64
- a2_4 String
- b1_0 UInt64
+ Header: __table1.a1 UInt64
+ __table1.a2 String
+ __table2.b1 UInt64
 Expression ((JOIN actions + Change column names to column identifiers))
- Header: a1_1 UInt64
- a2_4 String
+ Header: __table1.a1 UInt64
+ __table1.a2 String
 ReadFromMemoryStorage
 Header: a1 UInt64
 a2 String
 Expression ((JOIN actions + Change column names to column identifiers))
- Header: b1_0 UInt64
+ Header: __table2.b1 UInt64
 ReadFromMemoryStorage
 Header: b1 UInt64
 Expression ((JOIN actions + Change column names to column identifiers))
- Header: c1_2 UInt64
+ Header: __table3.c1 UInt64
 ReadFromMemoryStorage
 Header: c1 UInt64
 Expression ((JOIN actions + Change column names to column identifiers))
- Header: d1_3 UInt64
+ Header: __table4.d1 UInt64
 ReadFromMemoryStorage
 Header: d1 UInt64
 EXPLAIN PLAN header = 1
@@ -52,38 +52,38 @@ Expression ((Project names + (Projection + DROP unused columns after JOIN)))
 Header: a2 String
 d2 String
 Join (JOIN FillRightFirst)
- Header: a2_0 String
- k_2 UInt64
- d2_1 String
- Expression (DROP unused columns after JOIN)
- Header: a2_0 String
- k_2 UInt64
+ Header: __table1.a2 String
+ __table1.k UInt64
+ __table4.d2 String
+ Expression ((Actions for left table alias column keys + DROP unused columns after JOIN))
+ Header: __table1.a2 String
+ __table1.k UInt64
 Join (JOIN FillRightFirst)
- Header: a2_0 String
- k_2 UInt64
- Expression (DROP unused columns after JOIN)
- Header: a2_0 String
- k_2 UInt64
+ Header: __table1.a2 String
+ __table1.k UInt64
+ Expression ((Actions for left table alias column keys + DROP unused columns after JOIN))
+ Header: __table1.a2 String
+ __table1.k UInt64
 Join (JOIN FillRightFirst)
- Header: a2_0 String
- k_2 UInt64
- Expression (Change column names to column identifiers)
- Header: a2_0 String
- k_2 UInt64
+ Header: __table1.a2 String
+ __table1.k UInt64
+ Expression ((Actions for left table alias column keys + Change column names to column identifiers))
+ Header: __table1.a2 String
+ __table1.k UInt64
 ReadFromMemoryStorage
 Header: a2 String
 k UInt64
- Expression (Change column names to column identifiers)
- Header: k_3 UInt64
+ Expression ((Actions for right table alias column keys + Change column names to column identifiers))
+ Header: __table2.k UInt64
 ReadFromMemoryStorage
 Header: k UInt64
- Expression (Change column names to column identifiers)
- Header: k_4 UInt64
+ Expression ((Actions for right table alias column keys + Change column names to column identifiers))
+ Header: __table3.k UInt64
 ReadFromMemoryStorage
 Header: k UInt64
- Expression (Change column names to column identifiers)
- Header: d2_1 String
- k_5 UInt64
+ Expression ((Actions for right table alias column keys + Change column names to column identifiers))
+ Header: __table4.d2 String
+ __table4.k UInt64
 ReadFromMemoryStorage
 Header: d2 String
 k UInt64
@@ -97,55 +97,55 @@ WHERE c.c2 != '' ORDER BY a.a2
 Expression (Project names)
 Header: bx String
 Sorting (Sorting for ORDER BY)
- Header: a2_6 String
- bx_0 String
+ Header: __table1.a2 String
+ __table2.bx String
 Expression ((Before ORDER BY + (Projection + )))
- Header: a2_6 String
- bx_0 String
+ Header: __table1.a2 String
+ __table2.bx String
 Join (JOIN FillRightFirst)
- Header: a2_6 String
- bx_0 String
- c2_5 String
- c1_3 UInt64
+ Header: __table1.a2 String
+ __table2.bx String
+ __table4.c2 String
+ __table4.c1 UInt64
 Expression
- Header: a2_6 String
- bx_0 String
- c2_5 String
- c1_3 UInt64
+ Header: __table1.a2 String
+ __table2.bx String
+ __table4.c2 String
+ __table4.c1 UInt64
 Join (JOIN FillRightFirst)
- Header: a2_6 String
- bx_0 String
- b1_1 UInt64
- c2_5 String
- c1_3 UInt64
+ Header: __table1.a2 String
+ __table2.bx String
+ __table2.b1 UInt64
+ __table4.c2 String
+ __table4.c1 UInt64
 Expression ((JOIN actions + DROP unused columns after JOIN))
- Header: a2_6 String
- bx_0 String
- b1_1 UInt64
+ Header: __table1.a2 String
+ __table2.bx String
+ __table2.b1 UInt64
 Join (JOIN FillRightFirst)
- Header: a1_2 UInt64
- a2_6 String
- bx_0 String
- b1_1 UInt64
+ Header: __table1.a1 UInt64
+ __table1.a2 String
+ __table2.bx String
+ __table2.b1 UInt64
 Expression ((JOIN actions + Change column names to column identifiers))
- Header: a1_2 UInt64
- a2_6 String
+ Header: __table1.a1 UInt64
+ __table1.a2 String
 ReadFromMemoryStorage
 Header: a1 UInt64
 a2 String
 Expression ((JOIN actions + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))))
- Header: b1_1 UInt64
- bx_0 String
+ Header: __table2.b1 UInt64
+ __table2.bx String
 ReadFromMemoryStorage
 Header: b1 UInt64
 b2 String
 Filter (( + (JOIN actions + Change column names to column identifiers)))
- Header: c1_3 UInt64
- c2_5 String
+ Header: __table4.c1 UInt64
+ __table4.c2 String
 ReadFromMemoryStorage
 Header: c1 UInt64
 c2 String
 Expression ((JOIN actions + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))))
- Header: d1_4 UInt64
+ Header: __table5.d1 UInt64
 ReadFromSystemNumbers
 Header: number UInt64
diff --git a/tests/queries/0_stateless/02518_rewrite_aggregate_function_with_if.reference b/tests/queries/0_stateless/02518_rewrite_aggregate_function_with_if.reference
index 37680adf8e0..15543789c1d 100644
--- a/tests/queries/0_stateless/02518_rewrite_aggregate_function_with_if.reference
+++ b/tests/queries/0_stateless/02518_rewrite_aggregate_function_with_if.reference
@@ -17,7 +17,7 @@ QUERY id: 0
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 CONSTANT id: 11, constant_value: UInt64_0, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_100, constant_value_type: UInt8
@@ -40,7 +40,7 @@ QUERY id: 0
 CONSTANT id: 11, constant_value: UInt64_0, constant_value_type: UInt8
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_100, constant_value_type: UInt8
@@ -63,7 +63,7 @@ QUERY id: 0
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 CONSTANT id: 11, constant_value: NULL, constant_value_type: Nullable(Nothing)
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_100, constant_value_type: UInt8
@@ -86,7 +86,7 @@ QUERY id: 0
 CONSTANT id: 11, constant_value: NULL, constant_value_type: Nullable(Nothing)
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_100, constant_value_type: UInt8
@@ -109,7 +109,7 @@ QUERY id: 0
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 CONSTANT id: 11, constant_value: NULL, constant_value_type: Nullable(Nothing)
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_100, constant_value_type: UInt8
@@ -132,7 +132,7 @@ QUERY id: 0
 CONSTANT id: 11, constant_value: NULL, constant_value_type: Nullable(Nothing)
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_100, constant_value_type: UInt8
@@ -160,7 +160,7 @@ QUERY id: 0
 COLUMN id: 12, column_name: number, result_type: UInt64, source_id: 13
 CONSTANT id: 15, constant_value: NULL, constant_value_type: Nullable(Nothing)
 JOIN TREE
- TABLE_FUNCTION id: 13, table_function_name: numbers
+ TABLE_FUNCTION id: 13, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 16, nodes: 1
 CONSTANT id: 17, constant_value: UInt64_100, constant_value_type: UInt8
@@ -188,7 +188,7 @@ QUERY id: 0
 CONSTANT id: 15, constant_value: NULL, constant_value_type: Nullable(Nothing)
 COLUMN id: 12, column_name: number, result_type: UInt64, source_id: 13
 JOIN TREE
- TABLE_FUNCTION id: 13, table_function_name: numbers
+ TABLE_FUNCTION id: 13, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 16, nodes: 1
 CONSTANT id: 17, constant_value: UInt64_100, constant_value_type: UInt8
@@ -207,7 +207,7 @@ QUERY id: 0
 COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
 CONSTANT id: 8, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 5, table_function_name: numbers
+ TABLE_FUNCTION id: 5, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 9, nodes: 1
 CONSTANT id: 10, constant_value: UInt64_100, constant_value_type: UInt8
@@ -229,7 +229,7 @@ QUERY id: 0
 COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
 CONSTANT id: 10, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 5, table_function_name: numbers
+ TABLE_FUNCTION id: 5, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 11, nodes: 1
 CONSTANT id: 12, constant_value: UInt64_100, constant_value_type: UInt8
@@ -248,7 +248,7 @@ QUERY id: 0
 COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
 CONSTANT id: 8, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 5, table_function_name: numbers
+ TABLE_FUNCTION id: 5, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 9, nodes: 1
 CONSTANT id: 10, constant_value: UInt64_100, constant_value_type: UInt8
@@ -270,7 +270,7 @@ QUERY id: 0
 COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
 CONSTANT id: 10, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 5, table_function_name: numbers
+ TABLE_FUNCTION id: 5, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 11, nodes: 1
 CONSTANT id: 12, constant_value: UInt64_100, constant_value_type: UInt8
@@ -289,7 +289,7 @@ QUERY id: 0
 COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
 CONSTANT id: 8, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 5, table_function_name: numbers
+ TABLE_FUNCTION id: 5, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 9, nodes: 1
 CONSTANT id: 10, constant_value: UInt64_100, constant_value_type: UInt8
@@ -311,7 +311,7 @@ QUERY id: 0
 COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
 CONSTANT id: 10, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 5, table_function_name: numbers
+ TABLE_FUNCTION id: 5, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 11, nodes: 1
 CONSTANT id: 12, constant_value: UInt64_100, constant_value_type: UInt8
@@ -335,7 +335,7 @@ QUERY id: 0
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 CONSTANT id: 12, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 13, nodes: 1
 CONSTANT id: 14, constant_value: UInt64_100, constant_value_type: UInt8
@@ -362,7 +362,7 @@ QUERY id: 0
 COLUMN id: 8, column_name: number, result_type: UInt64, source_id: 9
 CONSTANT id: 14, constant_value: UInt64_2, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 9, table_function_name: numbers
+ TABLE_FUNCTION id: 9, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 15, nodes: 1
 CONSTANT id: 16, constant_value: UInt64_100, constant_value_type: UInt8
diff --git a/tests/queries/0_stateless/02521_avro_union_null_nested.reference b/tests/queries/0_stateless/02521_avro_union_null_nested.reference
index e4818b4bcac..a3cb5ba4858 100644
--- a/tests/queries/0_stateless/02521_avro_union_null_nested.reference
+++ b/tests/queries/0_stateless/02521_avro_union_null_nested.reference
@@ -5,7 +5,7 @@ added_snapshot_id Nullable(Int64)
 added_data_files_count Nullable(Int32)
 existing_data_files_count Nullable(Int32)
 deleted_data_files_count Nullable(Int32)
-partitions Array(Tuple(contains_null Bool, contains_nan Nullable(Bool), lower_bound Nullable(String), upper_bound Nullable(String)))
+partitions Array(Tuple(\n contains_null Bool,\n contains_nan Nullable(Bool),\n lower_bound Nullable(String),\n upper_bound Nullable(String)))
 added_rows_count Nullable(Int64)
 existing_rows_count Nullable(Int64)
 deleted_rows_count Nullable(Int64)
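The Tuple(\n ...) strings above reflect that named Tuple types are now pretty-printed with one element per line in DESCRIBE and schema-inference output (reference files encode the newlines as literal \n). A small illustration, not tied to the Avro files these tests use, with the rendered layout approximated:

DESCRIBE (SELECT CAST((1, 'a'), 'Tuple(id UInt64, s String)') AS t);
-- t  Tuple(
--        id UInt64,
--        s String)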
diff --git a/tests/queries/0_stateless/02522_avro_complicate_schema.reference b/tests/queries/0_stateless/02522_avro_complicate_schema.reference
index 55c0369020f..a885163d609 100644
--- a/tests/queries/0_stateless/02522_avro_complicate_schema.reference
+++ b/tests/queries/0_stateless/02522_avro_complicate_schema.reference
@@ -1,5 +1,5 @@
 status Int32
 snapshot_id Nullable(Int64)
-data_file Tuple(file_path String, file_format String, partition Tuple(vendor_id Nullable(Int64)), record_count Int64, file_size_in_bytes Int64, block_size_in_bytes Int64, column_sizes Array(Tuple(key Int32, value Int64)), value_counts Array(Tuple(key Int32, value Int64)), null_value_counts Array(Tuple(key Int32, value Int64)), nan_value_counts Array(Tuple(key Int32, value Int64)), lower_bounds Array(Tuple(key Int32, value String)), upper_bounds Array(Tuple(key Int32, value String)), key_metadata Nullable(String), split_offsets Array(Int64), sort_order_id Nullable(Int32))
+data_file Tuple(\n file_path String,\n file_format String,\n partition Tuple(\n vendor_id Nullable(Int64)),\n record_count Int64,\n file_size_in_bytes Int64,\n block_size_in_bytes Int64,\n column_sizes Array(Tuple(\n key Int32,\n value Int64)),\n value_counts Array(Tuple(\n key Int32,\n value Int64)),\n null_value_counts Array(Tuple(\n key Int32,\n value Int64)),\n nan_value_counts Array(Tuple(\n key Int32,\n value Int64)),\n lower_bounds Array(Tuple(\n key Int32,\n value String)),\n upper_bounds Array(Tuple(\n key Int32,\n value String)),\n key_metadata Nullable(String),\n split_offsets Array(Int64),\n sort_order_id Nullable(Int32))
 1 6850377589038341628 ('file:/warehouse/nyc.db/taxis/data/vendor_id=1/00000-0-c070e655-dc44-43d2-a01a-484f107210cb-00001.parquet','PARQUET',(1),2,1565,67108864,[(1,87),(2,51),(3,51),(4,57),(5,51)],[(1,2),(2,2),(3,2),(4,2),(5,2)],[(1,0),(2,0),(3,0),(4,0),(5,0)],[(3,0),(4,0)],[(1,'\0\0\0\0\0\0\0'),(2,'C\0\0\0\0\0'),(3,'ff?'),(4,'p=\nף.@'),(5,'N')],[(1,'\0\0\0\0\0\0\0'),(2,'C\0\0\0\0\0'),(3,'ffA'),(4,'q=\nףE@'),(5,'Y')],NULL,[4],0)
 1 6850377589038341628 ('file:/warehouse/nyc.db/taxis/data/vendor_id=2/00000-0-c070e655-dc44-43d2-a01a-484f107210cb-00002.parquet','PARQUET',(2),2,1620,67108864,[(1,87),(2,51),(3,51),(4,57),(5,89)],[(1,2),(2,2),(3,2),(4,2),(5,2)],[(1,0),(2,0),(3,0),(4,0),(5,0)],[(3,0),(4,0)],[(1,'\0\0\0\0\0\0\0'),(2,'C\0\0\0\0\0'),(3,'fff?'),(4,'Q"@'),(5,'N')],[(1,'\0\0\0\0\0\0\0'),(2,'C\0\0\0\0\0'),(3,'\0\0 @'),(4,'fffff&6@'),(5,'N')],NULL,[4],0)
diff --git a/tests/queries/0_stateless/02534_analyzer_grouping_function.reference b/tests/queries/0_stateless/02534_analyzer_grouping_function.reference
index fcbf625ef22..1b496644547 100644
--- a/tests/queries/0_stateless/02534_analyzer_grouping_function.reference
+++ b/tests/queries/0_stateless/02534_analyzer_grouping_function.reference
@@ -16,7 +16,7 @@ QUERY id: 0
 LIST id: 7, nodes: 1
 COLUMN id: 8, column_name: value, result_type: String, source_id: 5
 JOIN TREE
- TABLE id: 5, table_name: default.test_table
+ TABLE id: 5, alias: __table1, table_name: default.test_table
 GROUP BY
 LIST id: 9, nodes: 2
 COLUMN id: 4, column_name: id, result_type: UInt64, source_id: 5
@@ -42,7 +42,7 @@ QUERY id: 0, group_by_type: rollup
 COLUMN id: 9, column_name: __grouping_set, result_type: UInt64
 COLUMN id: 10, column_name: value, result_type: String, source_id: 6
 JOIN TREE
- TABLE id: 6, table_name: default.test_table
+ TABLE id: 6, alias: __table1, table_name: default.test_table
 GROUP BY
 LIST id: 11, nodes: 2
 COLUMN id: 5, column_name: id, result_type: UInt64, source_id: 6
@@ -70,7 +70,7 @@ QUERY id: 0, group_by_type: cube
 COLUMN id: 9, column_name: __grouping_set, result_type: UInt64
 COLUMN id: 10, column_name: value, result_type: String, source_id: 6
 JOIN TREE
- TABLE id: 6, table_name: default.test_table
+ TABLE id: 6, alias: __table1, table_name: default.test_table
 GROUP BY
 LIST id: 11, nodes: 2
 COLUMN id: 5, column_name: id, result_type: UInt64, source_id: 6
@@ -99,7 +99,7 @@ QUERY id: 0, group_by_type: grouping_sets
 COLUMN id: 9, column_name: __grouping_set, result_type: UInt64
 COLUMN id: 10, column_name: value, result_type: String, source_id: 6
 JOIN TREE
- TABLE id: 6, table_name: default.test_table
+ TABLE id: 6, alias: __table1, table_name: default.test_table
 GROUP BY
 LIST id: 11, nodes: 2
 LIST id: 12, nodes: 1
@@ -128,7 +128,7 @@ QUERY id: 0, group_by_type: grouping_sets
 COLUMN id: 9, column_name: __grouping_set, result_type: UInt64
 COLUMN id: 10, column_name: value, result_type: String, source_id: 6
 JOIN TREE
- TABLE id: 6, table_name: default.test_table
+ TABLE id: 6, alias: __table1, table_name: default.test_table
 GROUP BY
 LIST id: 11, nodes: 2
 LIST id: 12, nodes: 1
diff --git a/tests/queries/0_stateless/02564_analyzer_cross_to_inner.reference b/tests/queries/0_stateless/02564_analyzer_cross_to_inner.reference
index e4d7ff55b86..5b9bc206695 100644
--- a/tests/queries/0_stateless/02564_analyzer_cross_to_inner.reference
+++ b/tests/queries/0_stateless/02564_analyzer_cross_to_inner.reference
@@ -29,9 +29,9 @@ QUERY id: 0
 LEFT TABLE EXPRESSION
 JOIN id: 11, strictness: ALL, kind: INNER
 LEFT TABLE EXPRESSION
- TABLE id: 3, table_name: default.t1
+ TABLE id: 3, alias: __table1, table_name: default.t1
 RIGHT TABLE EXPRESSION
- TABLE id: 6, table_name: default.t2
+ TABLE id: 6, alias: __table2, table_name: default.t2
 JOIN EXPRESSION
 FUNCTION id: 12, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -48,14 +48,14 @@ QUERY id: 0
 COLUMN id: 21, column_name: a, result_type: UInt64, source_id: 6
 CONSTANT id: 22, constant_value: UInt64_0, constant_value_type: UInt8
 RIGHT TABLE EXPRESSION
- QUERY id: 9, alias: t3, is_subquery: 1
+ QUERY id: 9, alias: __table3, is_subquery: 1
 PROJECTION COLUMNS
 x UInt64
 PROJECTION
 LIST id: 23, nodes: 1
 COLUMN id: 24, column_name: a, result_type: UInt64, source_id: 25
 JOIN TREE
- TABLE id: 25, table_name: default.t3
+ TABLE id: 25, alias: __table4, table_name: default.t3
 WHERE
 FUNCTION id: 26, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -97,18 +97,18 @@ QUERY id: 0
 LEFT TABLE EXPRESSION
 JOIN id: 11, kind: COMMA
 LEFT TABLE EXPRESSION
- TABLE id: 3, table_name: default.t1
+ TABLE id: 3, alias: __table1, table_name: default.t1
 RIGHT TABLE EXPRESSION
- TABLE id: 6, table_name: default.t2
+ TABLE id: 6, alias: __table2, table_name: default.t2
 RIGHT TABLE EXPRESSION
- QUERY id: 9, alias: t3, is_subquery: 1
+ QUERY id: 9, alias: __table3, is_subquery: 1
 PROJECTION COLUMNS
 x UInt64
 PROJECTION
 LIST id: 12, nodes: 1
 COLUMN id: 13, column_name: a, result_type: UInt64, source_id: 14
 JOIN TREE
- TABLE id: 14, table_name: default.t3
+ TABLE id: 14, alias: __table4, table_name: default.t3
 WHERE
 FUNCTION id: 15, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -166,9 +166,9 @@ QUERY id: 0
 LEFT TABLE EXPRESSION
 JOIN id: 11, strictness: ALL, kind: INNER
 LEFT TABLE EXPRESSION
- TABLE id: 3, table_name: default.t1
+ TABLE id: 3, alias: __table1, table_name: default.t1
 RIGHT TABLE EXPRESSION
- TABLE id: 6, table_name: default.t2
+ TABLE id: 6, alias: __table2, table_name: default.t2
 JOIN EXPRESSION
 FUNCTION id: 12, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -185,14 +185,14 @@ QUERY id: 0
 COLUMN id: 21, column_name: a, result_type: UInt64, source_id: 6
 CONSTANT id: 22, constant_value: UInt64_0, constant_value_type: UInt8
 RIGHT TABLE EXPRESSION
- QUERY id: 9, alias: t3, is_subquery: 1
+ QUERY id: 9, alias: __table3, is_subquery: 1
 PROJECTION COLUMNS
 x UInt64
 PROJECTION
 LIST id: 23, nodes: 1
 COLUMN id: 24, column_name: a, result_type: UInt64, source_id: 25
 JOIN TREE
- TABLE id: 25, table_name: default.t3
+ TABLE id: 25, alias: __table4, table_name: default.t3
 WHERE
 FUNCTION id: 26, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
diff --git a/tests/queries/0_stateless/02576_predicate_push_down_sorting_fix.reference b/tests/queries/0_stateless/02576_predicate_push_down_sorting_fix.reference
index b8c68f90135..dd107065380 100644
--- a/tests/queries/0_stateless/02576_predicate_push_down_sorting_fix.reference
+++ b/tests/queries/0_stateless/02576_predicate_push_down_sorting_fix.reference
@@ -1,21 +1,21 @@
 Expression ((Project names + (Projection + )))
 Header: number UInt64
-Actions: INPUT : 0 -> number_1 UInt64 : 0
- ALIAS number_1 :: 0 -> number UInt64 : 1
- ALIAS number :: 1 -> number_0 UInt64 : 0
- ALIAS number_0 :: 0 -> number UInt64 : 1
+Actions: INPUT : 0 -> __table2.number UInt64 : 0
+ ALIAS __table2.number :: 0 -> number UInt64 : 1
+ ALIAS number :: 1 -> __table1.number UInt64 : 0
+ ALIAS __table1.number :: 0 -> number UInt64 : 1
 Positions: 1
 Sorting (Sorting for ORDER BY)
 Header: ignore(2_UInt8) UInt8
- number_1 UInt64
+ __table2.number UInt64
 Sort description: ignore(2_UInt8) ASC
 Filter (( + (Before ORDER BY + (Projection + Change column names to column identifiers))))
 Header: ignore(2_UInt8) UInt8
- number_1 UInt64
+ __table2.number UInt64
 Filter column: ignore(2_UInt8)
 Actions: INPUT : 0 -> number UInt64 : 0
 COLUMN Const(UInt8) -> 2_UInt8 UInt8 : 1
- ALIAS number :: 0 -> number_1 UInt64 : 2
+ ALIAS number :: 0 -> __table2.number UInt64 : 2
 FUNCTION ignore(2_UInt8 :: 1) -> ignore(2_UInt8) UInt8 : 0
 Positions: 0 2
 ReadFromSystemNumbers
diff --git a/tests/queries/0_stateless/02576_rewrite_array_exists_to_has.reference b/tests/queries/0_stateless/02576_rewrite_array_exists_to_has.reference
index b6964976c20..f4e09c4b4de 100644
--- a/tests/queries/0_stateless/02576_rewrite_array_exists_to_has.reference
+++ b/tests/queries/0_stateless/02576_rewrite_array_exists_to_has.reference
@@ -26,7 +26,7 @@ QUERY id: 0
 LIST id: 14, nodes: 1
 CONSTANT id: 15, constant_value: UInt64_10, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 16, table_function_name: numbers
+ TABLE_FUNCTION id: 16, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 17, nodes: 1
 CONSTANT id: 18, constant_value: UInt64_10, constant_value_type: UInt8
@@ -58,7 +58,7 @@ QUERY id: 0
 LIST id: 14, nodes: 1
 CONSTANT id: 15, constant_value: UInt64_10, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 16, table_function_name: numbers
+ TABLE_FUNCTION id: 16, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 17, nodes: 1
 CONSTANT id: 18, constant_value: UInt64_10, constant_value_type: UInt8
@@ -81,7 +81,7 @@ QUERY id: 0
 CONSTANT id: 9, constant_value: UInt64_10, constant_value_type: UInt8
 CONSTANT id: 10, constant_value: UInt64_5, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10, constant_value_type: UInt8
@@ -104,7 +104,7 @@ QUERY id: 0
 CONSTANT id: 9, constant_value: UInt64_10, constant_value_type: UInt8
 CONSTANT id: 10, constant_value: UInt64_5, constant_value_type: UInt8
 JOIN TREE
- TABLE_FUNCTION id: 11, table_function_name: numbers
+ TABLE_FUNCTION id: 11, alias: __table1, table_function_name: numbers
 ARGUMENTS
 LIST id: 12, nodes: 1
 CONSTANT id: 13, constant_value: UInt64_10, constant_value_type: UInt8
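For context on the query trees above: this test covers the optimization that rewrites arrayExists with a constant-equality lambda into has(). A sketch of the kind of query involved — the exact statements and the setting name are assumptions based on the test's file name, not taken from the patch:

SET optimize_rewrite_array_exists_to_has = 1;
EXPLAIN QUERY TREE
SELECT number
FROM numbers(10)
WHERE arrayExists(x -> x = 5, [number]);
-- the predicate is expected to appear as has([number], 5) in the rewritten tree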
diff --git a/tests/queries/0_stateless/02668_logical_optimizer_removing_redundant_checks.reference b/tests/queries/0_stateless/02668_logical_optimizer_removing_redundant_checks.reference
index 089d1849eb4..cf60d63b1cf 100644
--- a/tests/queries/0_stateless/02668_logical_optimizer_removing_redundant_checks.reference
+++ b/tests/queries/0_stateless/02668_logical_optimizer_removing_redundant_checks.reference
@@ -9,7 +9,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 FUNCTION id: 5, function_name: in, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -26,7 +26,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -42,7 +42,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 CONSTANT id: 5, constant_value: UInt64_0, constant_value_type: UInt8
 3 another
@@ -55,7 +55,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -80,7 +80,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 FUNCTION id: 5, function_name: equals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -97,7 +97,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 FUNCTION id: 5, function_name: notIn, function_type: ordinary, result_type: UInt8
 ARGUMENTS
@@ -115,7 +115,7 @@ QUERY id: 0
 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
 JOIN TREE
- TABLE id: 3, table_name: default.02668_logical_optimizer
+ TABLE id: 3, alias: __table1, table_name: default.02668_logical_optimizer
 WHERE
 FUNCTION id: 5, function_name: notEquals, function_type: ordinary, result_type: UInt8
 ARGUMENTS
diff --git a/tests/queries/0_stateless/02675_predicate_push_down_filled_join_fix.reference b/tests/queries/0_stateless/02675_predicate_push_down_filled_join_fix.reference
index 2630c5b95b6..e6c4d5768af 100644
--- a/tests/queries/0_stateless/02675_predicate_push_down_filled_join_fix.reference
+++ b/tests/queries/0_stateless/02675_predicate_push_down_filled_join_fix.reference
@@ -2,31 +2,31 @@ Expression ((Project names + (Projection + )))
 Header: t1.id UInt64
 t1.value String
 t2.value String
-Actions: INPUT : 0 -> id_0 UInt64 : 0
- INPUT : 1 -> value_1 String : 1
- INPUT : 2 -> value_2 String : 2
- ALIAS id_0 :: 0 -> t1.id UInt64 : 3
- ALIAS value_1 :: 1 -> t1.value String : 0
- ALIAS value_2 :: 2 -> t2.value String : 1
+Actions: INPUT : 0 -> __table1.id UInt64 : 0
+ INPUT : 1 -> __table1.value String : 1
+ INPUT : 2 -> __table2.value String : 2
+ ALIAS __table1.id :: 0 -> t1.id UInt64 : 3
+ ALIAS __table1.value :: 1 -> t1.value String : 0
+ ALIAS __table2.value :: 2 -> t2.value String : 1
 Positions: 3 0 1
 FilledJoin (Filled JOIN)
- Header: id_0 UInt64
- value_1 String
- value_2 String
+ Header: __table1.id UInt64
+ __table1.value String
+ __table2.value String
 Type: INNER
 Strictness: ALL
 Algorithm: HashJoin
- Clauses: [(id_0) = (id)]
+ Clauses: [(__table1.id) = (id)]
 Filter (( + (JOIN actions + Change column names to column identifiers)))
- Header: id_0 UInt64
- value_1 String
- Filter column: equals(id_0, 0_UInt8) (removed)
+ Header: __table1.id UInt64
+ __table1.value String
+ Filter column: equals(__table1.id, 0_UInt8) (removed)
 Actions: INPUT : 0 -> id UInt64 : 0
 INPUT : 1 -> value String : 1
 COLUMN Const(UInt8) -> 0_UInt8 UInt8 : 2
- ALIAS id :: 0 -> id_0 UInt64 : 3
- ALIAS value :: 1 -> value_1 String : 0
0_UInt8 :: 2) -> equals(id_0, 0_UInt8) UInt8 : 1 + ALIAS id :: 0 -> __table1.id UInt64 : 3 + ALIAS value :: 1 -> __table1.value String : 0 + FUNCTION equals(__table1.id : 3, 0_UInt8 :: 2) -> equals(__table1.id, 0_UInt8) UInt8 : 1 Positions: 1 3 0 ReadFromMergeTree (default.test_table) Header: id UInt64 diff --git a/tests/queries/0_stateless/02679_explain_merge_tree_prewhere_row_policy.reference b/tests/queries/0_stateless/02679_explain_merge_tree_prewhere_row_policy.reference index cc16a1fce02..4a4e338438b 100644 --- a/tests/queries/0_stateless/02679_explain_merge_tree_prewhere_row_policy.reference +++ b/tests/queries/0_stateless/02679_explain_merge_tree_prewhere_row_policy.reference @@ -29,10 +29,10 @@ Header: id UInt64 value String Actions: INPUT : 0 -> id UInt64 : 0 INPUT : 1 -> value String : 1 - ALIAS id :: 0 -> id_0 UInt64 : 2 - ALIAS value :: 1 -> value_1 String : 0 - ALIAS id_0 :: 2 -> id UInt64 : 1 - ALIAS value_1 :: 0 -> value String : 2 + ALIAS id :: 0 -> __table1.id UInt64 : 2 + ALIAS value :: 1 -> __table1.value String : 0 + ALIAS __table1.id :: 2 -> id UInt64 : 1 + ALIAS __table1.value :: 0 -> value String : 2 Positions: 1 2 ReadFromMergeTree (default.test_table) Header: id UInt64 diff --git a/tests/queries/0_stateless/02702_logical_optimizer_with_nulls.reference b/tests/queries/0_stateless/02702_logical_optimizer_with_nulls.reference index e7f46a974e6..c25b446dcdc 100644 --- a/tests/queries/0_stateless/02702_logical_optimizer_with_nulls.reference +++ b/tests/queries/0_stateless/02702_logical_optimizer_with_nulls.reference @@ -9,7 +9,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02702_logical_optimizer + TABLE id: 3, alias: __table1, table_name: default.02702_logical_optimizer WHERE FUNCTION id: 5, function_name: or, function_type: ordinary, result_type: Nullable(UInt8) ARGUMENTS @@ -41,7 +41,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02702_logical_optimizer + TABLE id: 3, alias: __table1, table_name: default.02702_logical_optimizer WHERE FUNCTION id: 5, function_name: or, function_type: ordinary, result_type: Nullable(UInt8) ARGUMENTS @@ -68,7 +68,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Nullable(Int32), source_id: 3 COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02702_logical_optimizer_with_null_column + TABLE id: 3, alias: __table1, table_name: default.02702_logical_optimizer_with_null_column WHERE FUNCTION id: 5, function_name: in, function_type: ordinary, result_type: Nullable(UInt8) ARGUMENTS diff --git a/tests/queries/0_stateless/02725_parquet_preserve_order.reference b/tests/queries/0_stateless/02725_parquet_preserve_order.reference index e9c8f99bb33..3f410c13ec4 100644 --- a/tests/queries/0_stateless/02725_parquet_preserve_order.reference +++ b/tests/queries/0_stateless/02725_parquet_preserve_order.reference @@ -3,10 +3,10 @@ 2 (Expression) ExpressionTransform - (ReadFromStorage) + (ReadFromFile) File 0 → 1 (Expression) ExpressionTransform × 2 - (ReadFromStorage) + (ReadFromFile) Resize 1 → 2 File 0 → 1 diff --git a/tests/queries/0_stateless/02771_parallel_replicas_analyzer.reference b/tests/queries/0_stateless/02771_parallel_replicas_analyzer.reference 
index 35573110550..3b8a394a522 100644 --- a/tests/queries/0_stateless/02771_parallel_replicas_analyzer.reference +++ b/tests/queries/0_stateless/02771_parallel_replicas_analyzer.reference @@ -9,4 +9,4 @@ 7885388429666205427 8124171311239967992 1 1 -- Simple query with analyzer and pure parallel replicas\nSELECT number\nFROM join_inner_table__fuzz_146_replicated\n SETTINGS\n allow_experimental_analyzer = 1,\n max_parallel_replicas = 2,\n cluster_for_parallel_replicas = \'test_cluster_one_shard_three_replicas_localhost\',\n allow_experimental_parallel_reading_from_replicas = 1; -0 2 SELECT `join_inner_table__fuzz_146_replicated`.`number` AS `number` FROM `default`.`join_inner_table__fuzz_146_replicated` SETTINGS allow_experimental_analyzer = 1, max_parallel_replicas = 2, cluster_for_parallel_replicas = \'test_cluster_one_shard_three_replicas_localhost\', allow_experimental_parallel_reading_from_replicas = 1 +0 2 SELECT `__table1`.`number` AS `number` FROM `default`.`join_inner_table__fuzz_146_replicated` AS `__table1` SETTINGS allow_experimental_analyzer = 1, max_parallel_replicas = 2, cluster_for_parallel_replicas = \'test_cluster_one_shard_three_replicas_localhost\', allow_experimental_parallel_reading_from_replicas = 1 diff --git a/tests/queries/0_stateless/02785_date_predicate_optimizations_ast_query_tree_rewrite.reference b/tests/queries/0_stateless/02785_date_predicate_optimizations_ast_query_tree_rewrite.reference index 0fd2f694aeb..63658890119 100644 --- a/tests/queries/0_stateless/02785_date_predicate_optimizations_ast_query_tree_rewrite.reference +++ b/tests/queries/0_stateless/02785_date_predicate_optimizations_ast_query_tree_rewrite.reference @@ -8,7 +8,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -50,7 +50,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -92,7 +92,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -126,7 +126,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -160,7 +160,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -194,7 +194,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t 
WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -228,7 +228,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -270,7 +270,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -335,7 +335,7 @@ QUERY id: 0 LIST id: 5, nodes: 1 COLUMN id: 6, column_name: date1, result_type: Date, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 7, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -377,7 +377,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -412,7 +412,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t PREWHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -452,7 +452,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -492,7 +492,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -529,7 +529,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -566,7 +566,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -608,7 +608,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -650,7 +650,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: 
default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -692,7 +692,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -726,7 +726,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -760,7 +760,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -794,7 +794,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -828,7 +828,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date_t + TABLE id: 3, alias: __table1, table_name: default.date_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -878,7 +878,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.datetime_t + TABLE id: 3, alias: __table1, table_name: default.datetime_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -920,7 +920,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.datetime_t + TABLE id: 3, alias: __table1, table_name: default.datetime_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -962,7 +962,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date32_t + TABLE id: 3, alias: __table1, table_name: default.date32_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -1004,7 +1004,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.date32_t + TABLE id: 3, alias: __table1, table_name: default.date32_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -1046,7 +1046,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.datetime64_t + TABLE id: 3, alias: __table1, table_name: default.datetime64_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -1088,7 +1088,7 @@ QUERY id: 0 LIST id: 1, nodes: 1 
COLUMN id: 2, column_name: value1, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.datetime64_t + TABLE id: 3, alias: __table1, table_name: default.datetime64_t WHERE FUNCTION id: 4, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS diff --git a/tests/queries/0_stateless/02835_join_step_explain.reference b/tests/queries/0_stateless/02835_join_step_explain.reference index 0cc2e802682..06f4a9cfc99 100644 --- a/tests/queries/0_stateless/02835_join_step_explain.reference +++ b/tests/queries/0_stateless/02835_join_step_explain.reference @@ -3,31 +3,31 @@ Header: id UInt64 value_1 String rhs.id UInt64 rhs.value_1 String -Actions: INPUT : 0 -> id_0 UInt64 : 0 - INPUT : 1 -> value_1_1 String : 1 - INPUT : 2 -> value_1_3 String : 2 - INPUT : 3 -> id_2 UInt64 : 3 - ALIAS id_0 :: 0 -> id UInt64 : 4 - ALIAS value_1_1 :: 1 -> value_1 String : 0 - ALIAS value_1_3 :: 2 -> rhs.value_1 String : 1 - ALIAS id_2 :: 3 -> rhs.id UInt64 : 2 +Actions: INPUT : 0 -> __table1.id UInt64 : 0 + INPUT : 1 -> __table1.value_1 String : 1 + INPUT : 2 -> __table2.value_1 String : 2 + INPUT : 3 -> __table2.id UInt64 : 3 + ALIAS __table1.id :: 0 -> id UInt64 : 4 + ALIAS __table1.value_1 :: 1 -> value_1 String : 0 + ALIAS __table2.value_1 :: 2 -> rhs.value_1 String : 1 + ALIAS __table2.id :: 3 -> rhs.id UInt64 : 2 Positions: 4 0 2 1 Join (JOIN FillRightFirst) - Header: id_0 UInt64 - value_1_1 String - value_1_3 String - id_2 UInt64 + Header: __table1.id UInt64 + __table1.value_1 String + __table2.value_1 String + __table2.id UInt64 Type: INNER Strictness: ALL Algorithm: HashJoin - Clauses: [(id_0) = (id_2)] + Clauses: [(__table1.id) = (__table2.id)] Expression ((JOIN actions + Change column names to column identifiers)) - Header: id_0 UInt64 - value_1_1 String + Header: __table1.id UInt64 + __table1.value_1 String Actions: INPUT : 0 -> id UInt64 : 0 INPUT : 1 -> value_1 String : 1 - ALIAS id :: 0 -> id_0 UInt64 : 2 - ALIAS value_1 :: 1 -> value_1_1 String : 0 + ALIAS id :: 0 -> __table1.id UInt64 : 2 + ALIAS value_1 :: 1 -> __table1.value_1 String : 0 Positions: 2 0 ReadFromMergeTree (default.test_table_1) Header: id UInt64 @@ -36,12 +36,12 @@ Positions: 4 0 2 1 Parts: 1 Granules: 1 Expression ((JOIN actions + Change column names to column identifiers)) - Header: id_2 UInt64 - value_1_3 String + Header: __table2.id UInt64 + __table2.value_1 String Actions: INPUT : 0 -> id UInt64 : 0 INPUT : 1 -> value_1 String : 1 - ALIAS id :: 0 -> id_2 UInt64 : 2 - ALIAS value_1 :: 1 -> value_1_3 String : 0 + ALIAS id :: 0 -> __table2.id UInt64 : 2 + ALIAS value_1 :: 1 -> __table2.value_1 String : 0 Positions: 2 0 ReadFromMergeTree (default.test_table_2) Header: id UInt64 @@ -55,39 +55,39 @@ Header: id UInt64 value_1 String rhs.id UInt64 rhs.value_1 String -Actions: INPUT : 0 -> id_0 UInt64 : 0 - INPUT : 1 -> value_1_1 String : 1 - INPUT :: 2 -> value_2_4 UInt64 : 2 - INPUT : 3 -> value_1_3 String : 3 - INPUT :: 4 -> value_2_5 UInt64 : 4 - INPUT : 5 -> id_2 UInt64 : 5 - ALIAS id_0 :: 0 -> id UInt64 : 6 - ALIAS value_1_1 :: 1 -> value_1 String : 0 - ALIAS value_1_3 :: 3 -> rhs.value_1 String : 1 - ALIAS id_2 :: 5 -> rhs.id UInt64 : 3 +Actions: INPUT : 0 -> __table1.id UInt64 : 0 + INPUT : 1 -> __table1.value_1 String : 1 + INPUT :: 2 -> __table1.value_2 UInt64 : 2 + INPUT : 3 -> __table2.value_1 String : 3 + INPUT :: 4 -> __table2.value_2 UInt64 : 4 + INPUT : 5 -> __table2.id UInt64 : 5 + ALIAS __table1.id :: 0 -> id UInt64 : 6 + ALIAS __table1.value_1 :: 1 -> value_1 String : 0 + ALIAS 
__table2.value_1 :: 3 -> rhs.value_1 String : 1 + ALIAS __table2.id :: 5 -> rhs.id UInt64 : 3 Positions: 6 0 3 1 Join (JOIN FillRightFirst) - Header: id_0 UInt64 - value_1_1 String - value_2_4 UInt64 - value_1_3 String - value_2_5 UInt64 - id_2 UInt64 + Header: __table1.id UInt64 + __table1.value_1 String + __table1.value_2 UInt64 + __table2.value_1 String + __table2.value_2 UInt64 + __table2.id UInt64 Type: INNER Strictness: ASOF Algorithm: HashJoin ASOF inequality: LESS - Clauses: [(id_0, value_2_4) = (id_2, value_2_5)] + Clauses: [(__table1.id, __table1.value_2) = (__table2.id, __table2.value_2)] Expression ((JOIN actions + Change column names to column identifiers)) - Header: id_0 UInt64 - value_1_1 String - value_2_4 UInt64 + Header: __table1.id UInt64 + __table1.value_1 String + __table1.value_2 UInt64 Actions: INPUT : 0 -> id UInt64 : 0 INPUT : 1 -> value_1 String : 1 INPUT : 2 -> value_2 UInt64 : 2 - ALIAS id :: 0 -> id_0 UInt64 : 3 - ALIAS value_1 :: 1 -> value_1_1 String : 0 - ALIAS value_2 :: 2 -> value_2_4 UInt64 : 1 + ALIAS id :: 0 -> __table1.id UInt64 : 3 + ALIAS value_1 :: 1 -> __table1.value_1 String : 0 + ALIAS value_2 :: 2 -> __table1.value_2 UInt64 : 1 Positions: 3 0 1 ReadFromMergeTree (default.test_table_1) Header: id UInt64 @@ -97,15 +97,15 @@ Positions: 6 0 3 1 Parts: 1 Granules: 1 Expression ((JOIN actions + Change column names to column identifiers)) - Header: id_2 UInt64 - value_1_3 String - value_2_5 UInt64 + Header: __table2.id UInt64 + __table2.value_1 String + __table2.value_2 UInt64 Actions: INPUT : 0 -> id UInt64 : 0 INPUT : 1 -> value_1 String : 1 INPUT : 2 -> value_2 UInt64 : 2 - ALIAS id :: 0 -> id_2 UInt64 : 3 - ALIAS value_1 :: 1 -> value_1_3 String : 0 - ALIAS value_2 :: 2 -> value_2_5 UInt64 : 1 + ALIAS id :: 0 -> __table2.id UInt64 : 3 + ALIAS value_1 :: 1 -> __table2.value_1 String : 0 + ALIAS value_2 :: 2 -> __table2.value_2 UInt64 : 1 Positions: 3 0 1 ReadFromMergeTree (default.test_table_2) Header: id UInt64 diff --git a/tests/queries/0_stateless/02868_distinct_to_count_optimization.reference b/tests/queries/0_stateless/02868_distinct_to_count_optimization.reference index a2c441fa460..c2075f72f33 100644 --- a/tests/queries/0_stateless/02868_distinct_to_count_optimization.reference +++ b/tests/queries/0_stateless/02868_distinct_to_count_optimization.reference @@ -15,14 +15,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, is_subquery: 1, is_distinct: 1 + QUERY id: 3, alias: __table1, is_subquery: 1, is_distinct: 1 PROJECTION COLUMNS a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count SETTINGS allow_experimental_analyzer=1 2. test distinct with subquery alias 3 @@ -41,14 +41,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, alias: t, is_subquery: 1, is_distinct: 1 + QUERY id: 3, alias: __table1, is_subquery: 1, is_distinct: 1 PROJECTION COLUMNS a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count SETTINGS allow_experimental_analyzer=1 3. 
test distinct with compound column name 3 @@ -67,14 +67,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, alias: t, is_subquery: 1, is_distinct: 1 + QUERY id: 3, alias: __table1, is_subquery: 1, is_distinct: 1 PROJECTION COLUMNS a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count SETTINGS allow_experimental_analyzer=1 4. test distinct with select expression alias 3 @@ -93,14 +93,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, alias: t, is_subquery: 1, is_distinct: 1 + QUERY id: 3, alias: __table1, is_subquery: 1, is_distinct: 1 PROJECTION COLUMNS alias_of_a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count SETTINGS allow_experimental_analyzer=1 5. test simple group by 3 @@ -122,14 +122,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, is_subquery: 1 + QUERY id: 3, alias: __table1, is_subquery: 1 PROJECTION COLUMNS a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count GROUP BY LIST id: 7, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 @@ -154,14 +154,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, alias: t, is_subquery: 1 + QUERY id: 3, alias: __table1, is_subquery: 1 PROJECTION COLUMNS a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count GROUP BY LIST id: 7, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 @@ -186,14 +186,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, alias: t, is_subquery: 1 + QUERY id: 3, alias: __table1, is_subquery: 1 PROJECTION COLUMNS alias_of_a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: __table2, table_name: default.test_rewrite_uniq_to_count GROUP BY LIST id: 7, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 @@ -218,14 +218,14 @@ QUERY id: 0 LIST id: 1, nodes: 1 FUNCTION id: 2, function_name: count, function_type: aggregate, result_type: UInt64 JOIN TREE - QUERY id: 3, alias: t, is_subquery: 1 + QUERY id: 3, alias: __table1, is_subquery: 1 PROJECTION COLUMNS alias_of_a UInt8 PROJECTION LIST id: 4, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 JOIN TREE - TABLE id: 6, table_name: default.test_rewrite_uniq_to_count + TABLE id: 6, alias: 
__table2, table_name: default.test_rewrite_uniq_to_count GROUP BY LIST id: 7, nodes: 1 COLUMN id: 5, column_name: a, result_type: UInt8, source_id: 6 diff --git a/tests/queries/0_stateless/02874_infer_objects_as_named_tuples.reference b/tests/queries/0_stateless/02874_infer_objects_as_named_tuples.reference index 01ef288d81a..06c152a0a3c 100644 --- a/tests/queries/0_stateless/02874_infer_objects_as_named_tuples.reference +++ b/tests/queries/0_stateless/02874_infer_objects_as_named_tuples.reference @@ -1,34 +1,34 @@ -obj Tuple(a Nullable(Int64), b Nullable(String), c Array(Nullable(Int64))) +obj Tuple(\n a Nullable(Int64),\n b Nullable(String),\n c Array(Nullable(Int64))) (42,'Hello',[1,2,3]) -obj Tuple(a Nullable(Int64), b Nullable(String), c Array(Nullable(Int64)), d Nullable(Date)) +obj Tuple(\n a Nullable(Int64),\n b Nullable(String),\n c Array(Nullable(Int64)),\n d Nullable(Date)) (42,'Hello',[1,2,3],NULL) (43,'World',[],'2020-01-01') -obj Tuple(a Nullable(Int64), b Nullable(String), c Array(Nullable(Int64)), d Nullable(Date)) +obj Tuple(\n a Nullable(Int64),\n b Nullable(String),\n c Array(Nullable(Int64)),\n d Nullable(Date)) (42,'Hello',[1,2,3],NULL) (43,'World',[],'2020-01-01') (NULL,NULL,[],NULL) -obj Tuple(a Nullable(Int64), b Nullable(String), c Array(Nullable(Int64)), d Nullable(String)) +obj Tuple(\n a Nullable(Int64),\n b Nullable(String),\n c Array(Nullable(Int64)),\n d Nullable(String)) (42,'Hello',[1,2,3],NULL) (43,'World',[],'2020-01-01') (NULL,NULL,[],NULL) (NULL,'2020-01-01',[],'Hello') -obj Array(Tuple(a Nullable(Int64), b Nullable(String), c Array(Nullable(Int64)), d Nullable(Date))) +obj Array(Tuple(\n a Nullable(Int64),\n b Nullable(String),\n c Array(Nullable(Int64)),\n d Nullable(Date))) [(42,'Hello',[1,2,3],NULL),(43,'World',[],'2020-01-01')] [(NULL,NULL,[],NULL)] -obj Tuple(nested_obj Tuple(a Nullable(Int64), b Nullable(String), c Array(Nullable(Int64)), d Nullable(Date))) +obj Tuple(\n nested_obj Tuple(\n a Nullable(Int64),\n b Nullable(String),\n c Array(Nullable(Int64)),\n d Nullable(Date))) ((42,'Hello',[1,2,3],NULL)) ((43,'World',[],'2020-01-01')) ((NULL,NULL,[],NULL)) -obj Tuple(a Tuple(b Nullable(Int64)), `a.b` Nullable(Int64), `a.b.c` Nullable(String)) +obj Tuple(\n a Tuple(\n b Nullable(Int64)),\n `a.b` Nullable(Int64),\n `a.b.c` Nullable(String)) ((1),NULL,NULL) ((NULL),2,'Hello') -obj Tuple(a Tuple(b Tuple(c Nullable(Int64)))) +obj Tuple(\n a Tuple(\n b Tuple(\n c Nullable(Int64)))) (((NULL))) (((10))) -obj Tuple(a Nullable(String)) +obj Tuple(\n a Nullable(String)) ('{}') obj Nullable(String) {} -obj Tuple(a Array(Tuple(b Array(Nullable(Int64)), c Tuple(d Nullable(Int64)), e Nullable(String)))) +obj Tuple(\n a Array(Tuple(\n b Array(Nullable(Int64)),\n c Tuple(\n d Nullable(Int64)),\n e Nullable(String)))) ([([],(NULL),NULL),([],(NULL),NULL),([],(10),NULL)]) ([([1,2,3],(NULL),'Hello')]) diff --git a/tests/queries/0_stateless/02876_json_incomplete_types_as_strings_inference.reference b/tests/queries/0_stateless/02876_json_incomplete_types_as_strings_inference.reference index db94ffc9466..b904568391b 100644 --- a/tests/queries/0_stateless/02876_json_incomplete_types_as_strings_inference.reference +++ b/tests/queries/0_stateless/02876_json_incomplete_types_as_strings_inference.reference @@ -2,6 +2,6 @@ a Nullable(String) b Nullable(String) c Array(Nullable(String)) \N {} [] -a Tuple(b Nullable(String), c Array(Array(Nullable(String)))) -d Tuple(e Array(Nullable(String)), f Nullable(String)) +a Tuple(\n b Nullable(String),\n c 
Array(Array(Nullable(String)))) +d Tuple(\n e Array(Nullable(String)),\n f Nullable(String)) (NULL,[[],[]]) (['{}','{}'],NULL) diff --git a/tests/queries/0_stateless/02889_file_log_save_errors.reference b/tests/queries/0_stateless/02889_file_log_save_errors.reference index c4a7c1f0bda..849da6ad6fa 100644 --- a/tests/queries/0_stateless/02889_file_log_save_errors.reference +++ b/tests/queries/0_stateless/02889_file_log_save_errors.reference @@ -1,20 +1,20 @@ -Cannot parse input: expected \'{\' before: \'Error 0\' Error 0 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 1\' Error 1 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 2\' Error 2 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 3\' Error 3 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 4\' Error 4 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 5\' Error 5 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 6\' Error 6 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 7\' Error 7 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 8\' Error 8 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 9\' Error 9 a.jsonl -Cannot parse input: expected \'{\' before: \'Error 10\' Error 10 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 11\' Error 11 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 12\' Error 12 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 13\' Error 13 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 14\' Error 14 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 15\' Error 15 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 16\' Error 16 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 17\' Error 17 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 18\' Error 18 b.jsonl -Cannot parse input: expected \'{\' before: \'Error 19\' Error 19 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 0\': (at row 1)\n Error 0 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 1\': (at row 1)\n Error 1 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 2\': (at row 1)\n Error 2 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 3\': (at row 1)\n Error 3 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 4\': (at row 1)\n Error 4 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 5\': (at row 1)\n Error 5 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 6\': (at row 1)\n Error 6 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 7\': (at row 1)\n Error 7 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 8\': (at row 1)\n Error 8 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 9\': (at row 1)\n Error 9 a.jsonl +Cannot parse input: expected \'{\' before: \'Error 10\': (at row 1)\n Error 10 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 11\': (at row 1)\n Error 11 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 12\': (at row 1)\n Error 12 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 13\': (at row 1)\n Error 13 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 14\': (at row 1)\n Error 14 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 15\': (at row 1)\n Error 15 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 16\': (at row 1)\n Error 16 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 17\': (at row 1)\n Error 17 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 18\': (at row 
1)\n Error 18 b.jsonl +Cannot parse input: expected \'{\' before: \'Error 19\': (at row 1)\n Error 19 b.jsonl diff --git a/tests/queries/0_stateless/02889_print_pretty_type_names.reference b/tests/queries/0_stateless/02889_print_pretty_type_names.reference index ea25df165bb..9af8e0142f8 100644 --- a/tests/queries/0_stateless/02889_print_pretty_type_names.reference +++ b/tests/queries/0_stateless/02889_print_pretty_type_names.reference @@ -5,18 +5,11 @@ a Tuple( e Array(UInt32), f Array(Tuple( g String, - h Map( - String, - Array(Tuple( - i String, - j UInt64 - )) - ) - )), - k Date - ), - l Nullable(String) -) + h Map(String, Array(Tuple( + i String, + j UInt64))))), + k Date), + l Nullable(String)) Tuple( b String, c Tuple( @@ -24,15 +17,8 @@ Tuple( e Array(UInt32), f Array(Tuple( g String, - h Map( - String, - Array(Tuple( - i String, - j UInt64 - )) - ) - )), - k Date - ), - l Nullable(String) -) + h Map(String, Array(Tuple( + i String, + j UInt64))))), + k Date), + l Nullable(String)) diff --git a/tests/queries/0_stateless/02890_describe_table_options.reference b/tests/queries/0_stateless/02890_describe_table_options.reference index 2974fd92f3c..5d99df36bb4 100644 --- a/tests/queries/0_stateless/02890_describe_table_options.reference +++ b/tests/queries/0_stateless/02890_describe_table_options.reference @@ -2,205 +2,237 @@ SET describe_compact_output = 0, describe_include_virtual_columns = 0, describe_include_subcolumns = 0; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name─┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┐ -│ id │ UInt64 │ │ │ index column │ │ │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ -└──────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┘ +┌─name─┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┐ +│ id │ UInt64 │ │ │ index column │ │ │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ +└──────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name─┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┐ -│ id │ UInt64 │ │ │ index column │ │ │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ -└──────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┘ +┌─name─┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┐ +│ id │ UInt64 │ │ │ index column │ │ │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ +└──────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┘ SET describe_compact_output = 0, describe_include_virtual_columns = 0, describe_include_subcolumns = 1; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; 
-┌─name──────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┐ -│ id │ UInt64 │ │ │ index column │ │ │ 0 │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ -│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ -│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ -│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ -└───────────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┘ +┌─name──────┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┐ +│ id │ UInt64 │ │ │ index column │ │ │ 0 │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ +│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ +│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ +│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ +└───────────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name──────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┐ -│ id │ UInt64 │ │ │ index column │ │ │ 0 │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ -│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ -│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ -│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ -└───────────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┘ +┌─name──────┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┐ +│ id │ UInt64 │ │ │ index column │ │ │ 0 │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ +│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ +│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ +│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ +└───────────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┘ SET describe_compact_output = 0, describe_include_virtual_columns = 1, describe_include_subcolumns = 0; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name─────────────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_virtual─┐ -│ id │ UInt64 │ │ │ index column │ │ │ 0 │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ -│ _part │ LowCardinality(String) │ │ │ │ │ │ 1 │ -│ _part_index │ UInt64 │ │ │ │ │ │ 1 │ -│ _part_uuid │ UUID │ │ │ │ │ │ 1 │ -│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 1 │ -│ _partition_value │ UInt8 │ │ │ │ │ │ 1 │ -│ _sample_factor │ Float64 │ │ │ │ │ │ 1 │ -│ _part_offset │ UInt64 │ │ │ │ │ │ 1 │ -│ _row_exists │ UInt8 │ │ │ │ │ │ 1 │ -│ _block_number │ UInt64 │ │ │ │ │ │ 1 │ -└──────────────────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴────────────┘ 
+┌─name─────────────┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_virtual─┐ +│ id │ UInt64 │ │ │ index column │ │ │ 0 │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ +│ _part │ LowCardinality(String) │ │ │ │ │ │ 1 │ +│ _part_index │ UInt64 │ │ │ │ │ │ 1 │ +│ _part_uuid │ UUID │ │ │ │ │ │ 1 │ +│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 1 │ +│ _partition_value │ UInt8 │ │ │ │ │ │ 1 │ +│ _sample_factor │ Float64 │ │ │ │ │ │ 1 │ +│ _part_offset │ UInt64 │ │ │ │ │ │ 1 │ +│ _row_exists │ UInt8 │ │ │ │ │ │ 1 │ +│ _block_number │ UInt64 │ │ │ │ │ │ 1 │ +└──────────────────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name───────────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_virtual─┐ -│ id │ UInt64 │ │ │ index column │ │ │ 0 │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ -│ _table │ LowCardinality(String) │ │ │ │ │ │ 1 │ -│ _part │ LowCardinality(String) │ │ │ │ │ │ 1 │ -│ _part_index │ UInt64 │ │ │ │ │ │ 1 │ -│ _part_uuid │ UUID │ │ │ │ │ │ 1 │ -│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 1 │ -│ _sample_factor │ Float64 │ │ │ │ │ │ 1 │ -│ _part_offset │ UInt64 │ │ │ │ │ │ 1 │ -│ _row_exists │ UInt8 │ │ │ │ │ │ 1 │ -│ _block_number │ UInt64 │ │ │ │ │ │ 1 │ -│ _shard_num │ UInt32 │ │ │ │ │ │ 1 │ -└────────────────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴────────────┘ +┌─name───────────┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_virtual─┐ +│ id │ UInt64 │ │ │ index column │ │ │ 0 │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ +│ _table │ LowCardinality(String) │ │ │ │ │ │ 1 │ +│ _part │ LowCardinality(String) │ │ │ │ │ │ 1 │ +│ _part_index │ UInt64 │ │ │ │ │ │ 1 │ +│ _part_uuid │ UUID │ │ │ │ │ │ 1 │ +│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 1 │ +│ _sample_factor │ Float64 │ │ │ │ │ │ 1 │ +│ _part_offset │ UInt64 │ │ │ │ │ │ 1 │ +│ _row_exists │ UInt8 │ │ │ │ │ │ 1 │ +│ _block_number │ UInt64 │ │ │ │ │ │ 1 │ +│ _shard_num │ UInt32 │ │ │ │ │ │ 1 │ +└────────────────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴────────────┘ SET describe_compact_output = 0, describe_include_virtual_columns = 1, describe_include_subcolumns = 1; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name─────────────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┬─is_virtual─┐ -│ id │ UInt64 │ │ │ index column │ │ │ 0 │ 0 │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ 0 │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ 0 │ -│ _part │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ -│ _part_index │ UInt64 │ │ │ │ │ │ 0 │ 1 │ -│ _part_uuid │ UUID │ │ │ │ │ │ 0 │ 1 │ -│ _partition_id │ 
LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ -│ _partition_value │ UInt8 │ │ │ │ │ │ 0 │ 1 │ -│ _sample_factor │ Float64 │ │ │ │ │ │ 0 │ 1 │ -│ _part_offset │ UInt64 │ │ │ │ │ │ 0 │ 1 │ -│ _row_exists │ UInt8 │ │ │ │ │ │ 0 │ 1 │ -│ _block_number │ UInt64 │ │ │ │ │ │ 0 │ 1 │ -│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ 0 │ -│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ -│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ -└──────────────────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┴────────────┘ +┌─name─────────────┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┬─is_virtual─┐ +│ id │ UInt64 │ │ │ index column │ │ │ 0 │ 0 │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ 0 │ +│ _part │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ +│ _part_index │ UInt64 │ │ │ │ │ │ 0 │ 1 │ +│ _part_uuid │ UUID │ │ │ │ │ │ 0 │ 1 │ +│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ +│ _partition_value │ UInt8 │ │ │ │ │ │ 0 │ 1 │ +│ _sample_factor │ Float64 │ │ │ │ │ │ 0 │ 1 │ +│ _part_offset │ UInt64 │ │ │ │ │ │ 0 │ 1 │ +│ _row_exists │ UInt8 │ │ │ │ │ │ 0 │ 1 │ +│ _block_number │ UInt64 │ │ │ │ │ │ 0 │ 1 │ +│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ 0 │ +│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ +│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ +└──────────────────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┴────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name───────────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┬─is_virtual─┐ -│ id │ UInt64 │ │ │ index column │ │ │ 0 │ 0 │ -│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ 0 │ -│ t │ Tuple(a String, b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ 0 │ -│ _table │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ -│ _part │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ -│ _part_index │ UInt64 │ │ │ │ │ │ 0 │ 1 │ -│ _part_uuid │ UUID │ │ │ │ │ │ 0 │ 1 │ -│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ -│ _sample_factor │ Float64 │ │ │ │ │ │ 0 │ 1 │ -│ _part_offset │ UInt64 │ │ │ │ │ │ 0 │ 1 │ -│ _row_exists │ UInt8 │ │ │ │ │ │ 0 │ 1 │ -│ _block_number │ UInt64 │ │ │ │ │ │ 0 │ 1 │ -│ _shard_num │ UInt32 │ │ │ │ │ │ 0 │ 1 │ -│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ 0 │ -│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ -│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ -└────────────────┴───────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┴────────────┘ +┌─name───────────┬─type─────────────────────────────┬─default_type─┬─default_expression─┬─comment──────┬─codec_expression─┬─ttl_expression─┬─is_subcolumn─┬─is_virtual─┐ +│ id │ UInt64 │ │ │ index column │ │ │ 0 │ 0 │ +│ arr │ Array(UInt64) │ DEFAULT │ [10, 20] │ │ ZSTD(1) │ │ 0 │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ DEFAULT │ ('foo', 0) │ │ ZSTD(1) │ │ 0 │ 0 │ +│ _table │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ +│ _part │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ +│ _part_index │ UInt64 │ │ │ │ │ │ 0 │ 1 │ +│ _part_uuid │ UUID │ │ │ │ │ │ 0 │ 1 │ +│ _partition_id │ LowCardinality(String) │ │ │ │ │ │ 0 │ 1 │ +│ _sample_factor │ 
Float64 │ │ │ │ │ │ 0 │ 1 │ +│ _part_offset │ UInt64 │ │ │ │ │ │ 0 │ 1 │ +│ _row_exists │ UInt8 │ │ │ │ │ │ 0 │ 1 │ +│ _block_number │ UInt64 │ │ │ │ │ │ 0 │ 1 │ +│ _shard_num │ UInt32 │ │ │ │ │ │ 0 │ 1 │ +│ arr.size0 │ UInt64 │ │ │ │ │ │ 1 │ 0 │ +│ t.a │ String │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ +│ t.b │ UInt64 │ │ │ │ ZSTD(1) │ │ 1 │ 0 │ +└────────────────┴──────────────────────────────────┴──────────────┴────────────────────┴──────────────┴──────────────────┴────────────────┴──────────────┴────────────┘ SET describe_compact_output = 1, describe_include_virtual_columns = 0, describe_include_subcolumns = 0; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name─┬─type──────────────────────┐ -│ id │ UInt64 │ -│ arr │ Array(UInt64) │ -│ t │ Tuple(a String, b UInt64) │ -└──────┴───────────────────────────┘ +┌─name─┬─type─────────────────────────────┐ +│ id │ UInt64 │ +│ arr │ Array(UInt64) │ +│ t │ Tuple( + a String, + b UInt64) │ +└──────┴──────────────────────────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name─┬─type──────────────────────┐ -│ id │ UInt64 │ -│ arr │ Array(UInt64) │ -│ t │ Tuple(a String, b UInt64) │ -└──────┴───────────────────────────┘ +┌─name─┬─type─────────────────────────────┐ +│ id │ UInt64 │ +│ arr │ Array(UInt64) │ +│ t │ Tuple( + a String, + b UInt64) │ +└──────┴──────────────────────────────────┘ SET describe_compact_output = 1, describe_include_virtual_columns = 0, describe_include_subcolumns = 1; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name──────┬─type──────────────────────┬─is_subcolumn─┐ -│ id │ UInt64 │ 0 │ -│ arr │ Array(UInt64) │ 0 │ -│ t │ Tuple(a String, b UInt64) │ 0 │ -│ arr.size0 │ UInt64 │ 1 │ -│ t.a │ String │ 1 │ -│ t.b │ UInt64 │ 1 │ -└───────────┴───────────────────────────┴──────────────┘ +┌─name──────┬─type─────────────────────────────┬─is_subcolumn─┐ +│ id │ UInt64 │ 0 │ +│ arr │ Array(UInt64) │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ 0 │ +│ arr.size0 │ UInt64 │ 1 │ +│ t.a │ String │ 1 │ +│ t.b │ UInt64 │ 1 │ +└───────────┴──────────────────────────────────┴──────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name──────┬─type──────────────────────┬─is_subcolumn─┐ -│ id │ UInt64 │ 0 │ -│ arr │ Array(UInt64) │ 0 │ -│ t │ Tuple(a String, b UInt64) │ 0 │ -│ arr.size0 │ UInt64 │ 1 │ -│ t.a │ String │ 1 │ -│ t.b │ UInt64 │ 1 │ -└───────────┴───────────────────────────┴──────────────┘ +┌─name──────┬─type─────────────────────────────┬─is_subcolumn─┐ +│ id │ UInt64 │ 0 │ +│ arr │ Array(UInt64) │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ 0 │ +│ arr.size0 │ UInt64 │ 1 │ +│ t.a │ String │ 1 │ +│ t.b │ UInt64 │ 1 │ +└───────────┴──────────────────────────────────┴──────────────┘ SET describe_compact_output = 1, describe_include_virtual_columns = 1, describe_include_subcolumns = 0; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name─────────────┬─type──────────────────────┬─is_virtual─┐ -│ id │ UInt64 │ 0 │ -│ arr │ Array(UInt64) │ 0 │ -│ t │ Tuple(a String, b UInt64) │ 0 │ -│ _part │ LowCardinality(String) │ 1 │ -│ _part_index │ UInt64 │ 1 │ -│ _part_uuid │ UUID │ 1 │ -│ _partition_id │ LowCardinality(String) │ 1 │ -│ _partition_value │ UInt8 │ 1 │ -│ _sample_factor │ Float64 │ 1 │ -│ _part_offset │ UInt64 │ 1 │ -│ _row_exists │ UInt8 │ 1 │ -│ _block_number │ UInt64 │ 1 │ -└──────────────────┴───────────────────────────┴────────────┘ 
+┌─name─────────────┬─type─────────────────────────────┬─is_virtual─┐ +│ id │ UInt64 │ 0 │ +│ arr │ Array(UInt64) │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ 0 │ +│ _part │ LowCardinality(String) │ 1 │ +│ _part_index │ UInt64 │ 1 │ +│ _part_uuid │ UUID │ 1 │ +│ _partition_id │ LowCardinality(String) │ 1 │ +│ _partition_value │ UInt8 │ 1 │ +│ _sample_factor │ Float64 │ 1 │ +│ _part_offset │ UInt64 │ 1 │ +│ _row_exists │ UInt8 │ 1 │ +│ _block_number │ UInt64 │ 1 │ +└──────────────────┴──────────────────────────────────┴────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name───────────┬─type──────────────────────┬─is_virtual─┐ -│ id │ UInt64 │ 0 │ -│ arr │ Array(UInt64) │ 0 │ -│ t │ Tuple(a String, b UInt64) │ 0 │ -│ _table │ LowCardinality(String) │ 1 │ -│ _part │ LowCardinality(String) │ 1 │ -│ _part_index │ UInt64 │ 1 │ -│ _part_uuid │ UUID │ 1 │ -│ _partition_id │ LowCardinality(String) │ 1 │ -│ _sample_factor │ Float64 │ 1 │ -│ _part_offset │ UInt64 │ 1 │ -│ _row_exists │ UInt8 │ 1 │ -│ _block_number │ UInt64 │ 1 │ -│ _shard_num │ UInt32 │ 1 │ -└────────────────┴───────────────────────────┴────────────┘ +┌─name───────────┬─type─────────────────────────────┬─is_virtual─┐ +│ id │ UInt64 │ 0 │ +│ arr │ Array(UInt64) │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ 0 │ +│ _table │ LowCardinality(String) │ 1 │ +│ _part │ LowCardinality(String) │ 1 │ +│ _part_index │ UInt64 │ 1 │ +│ _part_uuid │ UUID │ 1 │ +│ _partition_id │ LowCardinality(String) │ 1 │ +│ _sample_factor │ Float64 │ 1 │ +│ _part_offset │ UInt64 │ 1 │ +│ _row_exists │ UInt8 │ 1 │ +│ _block_number │ UInt64 │ 1 │ +│ _shard_num │ UInt32 │ 1 │ +└────────────────┴──────────────────────────────────┴────────────┘ SET describe_compact_output = 1, describe_include_virtual_columns = 1, describe_include_subcolumns = 1; DESCRIBE TABLE t_describe_options FORMAT PrettyCompactNoEscapes; -┌─name─────────────┬─type──────────────────────┬─is_subcolumn─┬─is_virtual─┐ -│ id │ UInt64 │ 0 │ 0 │ -│ arr │ Array(UInt64) │ 0 │ 0 │ -│ t │ Tuple(a String, b UInt64) │ 0 │ 0 │ -│ _part │ LowCardinality(String) │ 0 │ 1 │ -│ _part_index │ UInt64 │ 0 │ 1 │ -│ _part_uuid │ UUID │ 0 │ 1 │ -│ _partition_id │ LowCardinality(String) │ 0 │ 1 │ -│ _partition_value │ UInt8 │ 0 │ 1 │ -│ _sample_factor │ Float64 │ 0 │ 1 │ -│ _part_offset │ UInt64 │ 0 │ 1 │ -│ _row_exists │ UInt8 │ 0 │ 1 │ -│ _block_number │ UInt64 │ 0 │ 1 │ -│ arr.size0 │ UInt64 │ 1 │ 0 │ -│ t.a │ String │ 1 │ 0 │ -│ t.b │ UInt64 │ 1 │ 0 │ -└──────────────────┴───────────────────────────┴──────────────┴────────────┘ +┌─name─────────────┬─type─────────────────────────────┬─is_subcolumn─┬─is_virtual─┐ +│ id │ UInt64 │ 0 │ 0 │ +│ arr │ Array(UInt64) │ 0 │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ 0 │ 0 │ +│ _part │ LowCardinality(String) │ 0 │ 1 │ +│ _part_index │ UInt64 │ 0 │ 1 │ +│ _part_uuid │ UUID │ 0 │ 1 │ +│ _partition_id │ LowCardinality(String) │ 0 │ 1 │ +│ _partition_value │ UInt8 │ 0 │ 1 │ +│ _sample_factor │ Float64 │ 0 │ 1 │ +│ _part_offset │ UInt64 │ 0 │ 1 │ +│ _row_exists │ UInt8 │ 0 │ 1 │ +│ _block_number │ UInt64 │ 0 │ 1 │ +│ arr.size0 │ UInt64 │ 1 │ 0 │ +│ t.a │ String │ 1 │ 0 │ +│ t.b │ UInt64 │ 1 │ 0 │ +└──────────────────┴──────────────────────────────────┴──────────────┴────────────┘ DESCRIBE remote(default, currentDatabase(), t_describe_options) FORMAT PrettyCompactNoEscapes; -┌─name───────────┬─type──────────────────────┬─is_subcolumn─┬─is_virtual─┐ -│ id │ UInt64 │ 0 │ 0 │ -│ arr │ Array(UInt64) │ 0 │ 0 │ -│ t │ Tuple(a 
String, b UInt64) │ 0 │ 0 │ -│ _table │ LowCardinality(String) │ 0 │ 1 │ -│ _part │ LowCardinality(String) │ 0 │ 1 │ -│ _part_index │ UInt64 │ 0 │ 1 │ -│ _part_uuid │ UUID │ 0 │ 1 │ -│ _partition_id │ LowCardinality(String) │ 0 │ 1 │ -│ _sample_factor │ Float64 │ 0 │ 1 │ -│ _part_offset │ UInt64 │ 0 │ 1 │ -│ _row_exists │ UInt8 │ 0 │ 1 │ -│ _block_number │ UInt64 │ 0 │ 1 │ -│ _shard_num │ UInt32 │ 0 │ 1 │ -│ arr.size0 │ UInt64 │ 1 │ 0 │ -│ t.a │ String │ 1 │ 0 │ -│ t.b │ UInt64 │ 1 │ 0 │ -└────────────────┴───────────────────────────┴──────────────┴────────────┘ +┌─name───────────┬─type─────────────────────────────┬─is_subcolumn─┬─is_virtual─┐ +│ id │ UInt64 │ 0 │ 0 │ +│ arr │ Array(UInt64) │ 0 │ 0 │ +│ t │ Tuple( + a String, + b UInt64) │ 0 │ 0 │ +│ _table │ LowCardinality(String) │ 0 │ 1 │ +│ _part │ LowCardinality(String) │ 0 │ 1 │ +│ _part_index │ UInt64 │ 0 │ 1 │ +│ _part_uuid │ UUID │ 0 │ 1 │ +│ _partition_id │ LowCardinality(String) │ 0 │ 1 │ +│ _sample_factor │ Float64 │ 0 │ 1 │ +│ _part_offset │ UInt64 │ 0 │ 1 │ +│ _row_exists │ UInt8 │ 0 │ 1 │ +│ _block_number │ UInt64 │ 0 │ 1 │ +│ _shard_num │ UInt32 │ 0 │ 1 │ +│ arr.size0 │ UInt64 │ 1 │ 0 │ +│ t.a │ String │ 1 │ 0 │ +│ t.b │ UInt64 │ 1 │ 0 │ +└────────────────┴──────────────────────────────────┴──────────────┴────────────┘ diff --git a/tests/queries/0_stateless/02900_union_schema_inference_mode.reference b/tests/queries/0_stateless/02900_union_schema_inference_mode.reference index 864cd780ddb..31172c41262 100644 --- a/tests/queries/0_stateless/02900_union_schema_inference_mode.reference +++ b/tests/queries/0_stateless/02900_union_schema_inference_mode.reference @@ -1,5 +1,5 @@ a Nullable(Int64) -obj Tuple(f1 Nullable(Int64), f2 Nullable(String), f3 Nullable(Int64)) +obj Tuple(\n f1 Nullable(Int64),\n f2 Nullable(String),\n f3 Nullable(Int64)) b Nullable(Int64) c Nullable(String) {"a":"1","obj":{"f1":"1","f2":"2020-01-01","f3":null},"b":null,"c":null} @@ -10,11 +10,11 @@ UNION data2.jsonl b Nullable(Int64), obj Tuple(f2 Nullable(String), f3 Nullable( UNION data3.jsonl c Nullable(String) c Nullable(String) a Nullable(Int64) -obj Tuple(f1 Nullable(Int64), f2 Nullable(String), f3 Nullable(Int64)) +obj Tuple(\n f1 Nullable(Int64),\n f2 Nullable(String),\n f3 Nullable(Int64)) b Nullable(Int64) c Nullable(String) a Nullable(Int64) -obj Tuple(f1 Nullable(Int64), f2 Nullable(String), f3 Nullable(Int64)) +obj Tuple(\n f1 Nullable(Int64),\n f2 Nullable(String),\n f3 Nullable(Int64)) b Nullable(Int64) c Nullable(String) {"a":"1","obj":{"f1":"1","f2":"2020-01-01","f3":null},"b":null,"c":null} @@ -25,7 +25,7 @@ UNION archive.tar::data2.jsonl b Nullable(Int64), obj Tuple(f2 Nullable(String), UNION archive.tar::data3.jsonl c Nullable(String) c Nullable(String) a Nullable(Int64) -obj Tuple(f1 Nullable(Int64), f2 Nullable(String), f3 Nullable(Int64)) +obj Tuple(\n f1 Nullable(Int64),\n f2 Nullable(String),\n f3 Nullable(Int64)) b Nullable(Int64) c Nullable(String) 1 diff --git a/tests/queries/0_stateless/02906_flatten_only_true_nested.reference b/tests/queries/0_stateless/02906_flatten_only_true_nested.reference index e7a96da8db9..b259b1e4563 100644 --- a/tests/queries/0_stateless/02906_flatten_only_true_nested.reference +++ b/tests/queries/0_stateless/02906_flatten_only_true_nested.reference @@ -1,3 +1,3 @@ data.x Array(UInt32) data.y Array(UInt32) -data Array(Tuple(x UInt64, y UInt64)) +data Array(Tuple(\n x UInt64,\n y UInt64)) diff --git a/tests/queries/0_stateless/02906_orc_tuple_field_prune.reference 
b/tests/queries/0_stateless/02906_orc_tuple_field_prune.reference index dfdd38f5e8e..46738c95cd5 100644 --- a/tests/queries/0_stateless/02906_orc_tuple_field_prune.reference +++ b/tests/queries/0_stateless/02906_orc_tuple_field_prune.reference @@ -1,9 +1,9 @@ int64_column Nullable(Int64) string_column Nullable(String) float64_column Nullable(Float64) -tuple_column Tuple(a Nullable(String), b Nullable(Float64), c Nullable(Int64)) -array_tuple_column Array(Tuple(a Nullable(String), b Nullable(Float64), c Nullable(Int64))) -map_tuple_column Map(String, Tuple(a Nullable(String), b Nullable(Float64), c Nullable(Int64))) +tuple_column Tuple(\n a Nullable(String),\n b Nullable(Float64),\n c Nullable(Int64)) +array_tuple_column Array(Tuple(\n a Nullable(String),\n b Nullable(Float64),\n c Nullable(Int64))) +map_tuple_column Map(String, Tuple(\n a Nullable(String),\n b Nullable(Float64),\n c Nullable(Int64))) -- { echoOn } -- Test primitive types select int64_column, string_column, float64_column from file('02906.orc') where int64_column % 15 = 0; diff --git a/tests/queries/0_stateless/02910_replicated_merge_parameters_must_consistent.sql b/tests/queries/0_stateless/02910_replicated_merge_parameters_must_consistent.sql index 3c1bec4fb3f..0f452105e6d 100644 --- a/tests/queries/0_stateless/02910_replicated_merge_parameters_must_consistent.sql +++ b/tests/queries/0_stateless/02910_replicated_merge_parameters_must_consistent.sql @@ -8,13 +8,22 @@ CREATE TABLE t ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t/', 'r1', legacy_ver) ORDER BY id; -CREATE TABLE t_r +CREATE TABLE t_r_ok +( + `id` UInt64, + `val` String, + `legacy_ver` UInt64, +) +ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t/', 'r2', legacy_ver) +ORDER BY id; + +CREATE TABLE t_r_error ( `id` UInt64, `val` String, `legacy_ver` UInt64 ) -ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t/', 'r2') +ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t/', 'r3') ORDER BY id; -- { serverError METADATA_MISMATCH } CREATE TABLE t2 @@ -27,14 +36,24 @@ CREATE TABLE t2 ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t2/', 'r1', legacy_ver) ORDER BY id; -CREATE TABLE t2_r +CREATE TABLE t2_r_ok ( `id` UInt64, `val` String, `legacy_ver` UInt64, `deleted` UInt8 ) -ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t2/', 'r2', legacy_ver, deleted) +ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t2/', 'r2', legacy_ver) +ORDER BY id; + +CREATE TABLE t2_r_error +( + `id` UInt64, + `val` String, + `legacy_ver` UInt64, + `deleted` UInt8 +) +ENGINE = ReplicatedReplacingMergeTree('/tables/{database}/t2/', 'r3', legacy_ver, deleted) ORDER BY id; -- { serverError METADATA_MISMATCH } CREATE TABLE t3 @@ -46,13 +65,23 @@ CREATE TABLE t3 ENGINE = ReplicatedSummingMergeTree('/tables/{database}/t3/', 'r1', metrics1) ORDER BY key; -CREATE TABLE t3_r +CREATE TABLE t3_r_ok ( `key` UInt64, `metrics1` UInt64, `metrics2` UInt64 ) -ENGINE = ReplicatedSummingMergeTree('/tables/{database}/t3/', 'r2', metrics2) +ENGINE = ReplicatedSummingMergeTree('/tables/{database}/t3/', 'r2', metrics1) +ORDER BY key; + + +CREATE TABLE t3_r_error +( + `key` UInt64, + `metrics1` UInt64, + `metrics2` UInt64 +) +ENGINE = ReplicatedSummingMergeTree('/tables/{database}/t3/', 'r3', metrics2) ORDER BY key; -- { serverError METADATA_MISMATCH } CREATE TABLE t4 @@ -67,7 +96,7 @@ CREATE TABLE t4 ENGINE = ReplicatedGraphiteMergeTree('/tables/{database}/t4/', 'r1', 'graphite_rollup') ORDER BY key; -CREATE TABLE t4_r +CREATE TABLE t4_r_ok ( 
`key` UInt32, `Path` String, @@ -76,5 +105,30 @@ CREATE TABLE t4_r `Version` UInt32, `col` UInt64 ) -ENGINE = ReplicatedGraphiteMergeTree('/tables/{database}/t4/', 'r2', 'graphite_rollup_alternative') +ENGINE = ReplicatedGraphiteMergeTree('/tables/{database}/t4/', 'r2', 'graphite_rollup') +ORDER BY key; + +CREATE TABLE t4_r_error +( + `key` UInt32, + `Path` String, + `Time` DateTime('UTC'), + `Value` Float64, + `Version` UInt32, + `col` UInt64 +) +ENGINE = ReplicatedGraphiteMergeTree('/tables/{database}/t4/', 'r3', 'graphite_rollup_alternative') ORDER BY key; -- { serverError METADATA_MISMATCH } + +-- https://github.com/ClickHouse/ClickHouse/issues/58451 +CREATE TABLE t4_r_error_2 +( + `key` UInt32, + `Path` String, + `Time` DateTime('UTC'), + `Value` Float64, + `Version` UInt32, + `col` UInt64 +) +ENGINE = ReplicatedGraphiteMergeTree('/tables/{database}/t4/', 'r4', 'graphite_rollup_alternative_no_function') +ORDER BY key; -- { serverError METADATA_MISMATCH } \ No newline at end of file diff --git a/tests/queries/0_stateless/02911_analyzer_order_by_read_in_order_query_plan.reference b/tests/queries/0_stateless/02911_analyzer_order_by_read_in_order_query_plan.reference index 5dd0d0d1820..d8f2decba37 100644 --- a/tests/queries/0_stateless/02911_analyzer_order_by_read_in_order_query_plan.reference +++ b/tests/queries/0_stateless/02911_analyzer_order_by_read_in_order_query_plan.reference @@ -13,8 +13,8 @@ select * from tab order by (a + b) * c; 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from tab order by (a + b) * c) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC select * from tab order by (a + b) * c desc; 4 4 4 4 4 4 4 4 @@ -27,8 +27,8 @@ select * from tab order by (a + b) * c desc; 0 0 0 0 0 0 0 0 select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC -- Exact match, full key select * from tab order by (a + b) * c, sin(a / b); 0 0 0 0 @@ -42,8 +42,8 @@ select * from tab order by (a + b) * c, sin(a / b); 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from tab order by (a + b) * c, sin(a / b)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC select * from tab order by (a + b) * c desc, sin(a / b) desc; 4 4 4 4 4 4 4 4 @@ -56,8 +56,8 @@ select * from tab order by (a + b) * c desc, sin(a / b) desc; 0 0 0 0 0 0 0 0 select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc, sin(a / b) desc) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC, 
sin(divide(a_0, b_1)) DESC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC, sin(divide(a_0, b_1)) DESC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, sin(divide(__table1.a, __table1.b)) DESC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, sin(divide(__table1.a, __table1.b)) DESC -- Exact match, mixed direction select * from tab order by (a + b) * c desc, sin(a / b); 4 4 4 4 @@ -71,8 +71,8 @@ select * from tab order by (a + b) * c desc, sin(a / b); 0 0 0 0 0 0 0 0 select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc, sin(a / b)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, sin(divide(__table1.a, __table1.b)) ASC select * from tab order by (a + b) * c, sin(a / b) desc; 0 0 0 0 0 0 0 0 @@ -85,8 +85,8 @@ select * from tab order by (a + b) * c, sin(a / b) desc; 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from tab order by (a + b) * c, sin(a / b) desc) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) DESC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) DESC -- Wrong order, full sort select * from tab order by sin(a / b), (a + b) * c; 1 1 1 1 @@ -100,32 +100,32 @@ select * from tab order by sin(a / b), (a + b) * c; 0 0 0 0 0 0 0 0 select * from (explain plan actions = 1 select * from tab order by sin(a / b), (a + b) * c) where explain ilike '%sort description%'; - Sort description: sin(divide(a_0, b_1)) ASC, multiply(plus(a_0, b_1), c_2) ASC + Sort description: sin(divide(__table1.a, __table1.b)) ASC, multiply(plus(__table1.a, __table1.b), __table1.c) ASC -- Fixed point select * from tab where (a + b) * c = 8 order by sin(a / b); 2 2 2 2 2 2 2 2 select * from (explain plan actions = 1 select * from tab where (a + b) * c = 8 order by sin(a / b)) where explain ilike '%sort description%'; - Prefix sort description: sin(divide(a_0, b_1)) ASC - Result sort description: sin(divide(a_0, b_1)) ASC + Prefix sort description: sin(divide(__table1.a, __table1.b)) ASC + Result sort description: sin(divide(__table1.a, __table1.b)) ASC select * from tab where d + 1 = 2 order by (d + 1) * 4, (a + b) * c; 1 1 1 1 1 1 1 1 select * from (explain plan actions = 1 select * from tab where d + 1 = 2 order by (d + 1) * 4, (a + b) * c) where explain ilike '%sort description%'; - Prefix sort description: multiply(plus(d_3, 1_UInt8), 4_UInt8) ASC, multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(d_3, 1_UInt8), 4_UInt8) ASC, multiply(plus(a_0, b_1), c_2) ASC + Prefix sort description: multiply(plus(__table1.d, 1_UInt8), 4_UInt8) ASC, multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.d, 1_UInt8), 4_UInt8) ASC, multiply(plus(__table1.a, __table1.b), __table1.c) ASC select * from tab where d + 1 = 3 and (a + b) = 4 and c = 2 order by (d + 1) * 4, sin(a / b); 2 2 2 2 2 2 2 2 select * from (explain plan actions = 1 
select * from tab where d + 1 = 3 and (a + b) = 4 and c = 2 order by (d + 1) * 4, sin(a / b)) where explain ilike '%sort description%'; - Prefix sort description: multiply(plus(d_3, 1_UInt8), 4_UInt8) ASC, sin(divide(a_0, b_1)) ASC - Result sort description: multiply(plus(d_3, 1_UInt8), 4_UInt8) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.d, 1_UInt8), 4_UInt8) ASC, sin(divide(__table1.a, __table1.b)) ASC + Result sort description: multiply(plus(__table1.d, 1_UInt8), 4_UInt8) ASC, sin(divide(__table1.a, __table1.b)) ASC -- Wrong order with fixed point select * from tab where (a + b) * c = 8 order by sin(b / a); 2 2 2 2 2 2 2 2 select * from (explain plan actions = 1 select * from tab where (a + b) * c = 8 order by sin(b / a)) where explain ilike '%sort description%'; - Sort description: sin(divide(b_1, a_0)) ASC + Sort description: sin(divide(__table1.b, __table1.a)) ASC -- Monotonicity select * from tab order by intDiv((a + b) * c, 2); 0 0 0 0 @@ -139,8 +139,8 @@ select * from tab order by intDiv((a + b) * c, 2); 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from tab order by intDiv((a + b) * c, 2)) where explain like '%sort description%'; - Prefix sort description: intDiv(multiply(plus(a_0, b_1), c_2), 2_UInt8) ASC - Result sort description: intDiv(multiply(plus(a_0, b_1), c_2), 2_UInt8) ASC + Prefix sort description: intDiv(multiply(plus(__table1.a, __table1.b), __table1.c), 2_UInt8) ASC + Result sort description: intDiv(multiply(plus(__table1.a, __table1.b), __table1.c), 2_UInt8) ASC select * from tab order by intDiv((a + b) * c, 2), sin(a / b); 0 0 0 0 0 0 0 0 @@ -153,36 +153,36 @@ select * from tab order by intDiv((a + b) * c, 2), sin(a / b); 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from tab order by intDiv((a + b) * c, 2), sin(a / b)) where explain like '%sort description%'; - Prefix sort description: intDiv(multiply(plus(a_0, b_1), c_2), 2_UInt8) ASC - Result sort description: intDiv(multiply(plus(a_0, b_1), c_2), 2_UInt8) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: intDiv(multiply(plus(__table1.a, __table1.b), __table1.c), 2_UInt8) ASC + Result sort description: intDiv(multiply(plus(__table1.a, __table1.b), __table1.c), 2_UInt8) ASC, sin(divide(__table1.a, __table1.b)) ASC -- select * from tab order by (a + b) * c, intDiv(sin(a / b), 2); select * from (explain plan actions = 1 select * from tab order by (a + b) * c, intDiv(sin(a / b), 2)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, intDiv(sin(divide(a_0, b_1)), 2_UInt8) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, intDiv(sin(divide(a_0, b_1)), 2_UInt8) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, intDiv(sin(divide(__table1.a, __table1.b)), 2_UInt8) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, intDiv(sin(divide(__table1.a, __table1.b)), 2_UInt8) ASC -- select * from tab order by (a + b) * c desc , intDiv(sin(a / b), 2); select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc , intDiv(sin(a / b), 2)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC, intDiv(sin(divide(a_0, b_1)), 2_UInt8) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC + Result sort description: multiply(plus(__table1.a, __table1.b), 
__table1.c) DESC, intDiv(sin(divide(__table1.a, __table1.b)), 2_UInt8) ASC -- select * from tab order by (a + b) * c, intDiv(sin(a / b), 2) desc; select * from (explain plan actions = 1 select * from tab order by (a + b) * c, intDiv(sin(a / b), 2) desc) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, intDiv(sin(divide(a_0, b_1)), 2_UInt8) DESC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, intDiv(sin(divide(__table1.a, __table1.b)), 2_UInt8) DESC -- select * from tab order by (a + b) * c desc, intDiv(sin(a / b), 2) desc; select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc, intDiv(sin(a / b), 2) desc) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC, intDiv(sin(divide(a_0, b_1)), 2_UInt8) DESC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC, intDiv(sin(divide(a_0, b_1)), 2_UInt8) DESC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, intDiv(sin(divide(__table1.a, __table1.b)), 2_UInt8) DESC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, intDiv(sin(divide(__table1.a, __table1.b)), 2_UInt8) DESC -- select * from tab order by (a + b) * c desc, intDiv(sin(a / b), -2); select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc, intDiv(sin(a / b), -2)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC, intDiv(sin(divide(a_0, b_1)), -2_Int8) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC, intDiv(sin(divide(a_0, b_1)), -2_Int8) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, intDiv(sin(divide(__table1.a, __table1.b)), -2_Int8) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, intDiv(sin(divide(__table1.a, __table1.b)), -2_Int8) ASC -- select * from tab order by (a + b) * c desc, intDiv(intDiv(sin(a / b), -2), -3); select * from (explain plan actions = 1 select * from tab order by (a + b) * c desc, intDiv(intDiv(sin(a / b), -2), -3)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) DESC - Result sort description: multiply(plus(a_0, b_1), c_2) DESC, intDiv(intDiv(sin(divide(a_0, b_1)), -2_Int8), -3_Int8) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) DESC, intDiv(intDiv(sin(divide(__table1.a, __table1.b)), -2_Int8), -3_Int8) ASC -- select * from tab order by (a + b) * c, intDiv(intDiv(sin(a / b), -2), -3); select * from (explain plan actions = 1 select * from tab order by (a + b) * c, intDiv(intDiv(sin(a / b), -2), -3)) where explain like '%sort description%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, intDiv(intDiv(sin(divide(a_0, b_1)), -2_Int8), -3_Int8) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, intDiv(intDiv(sin(divide(a_0, b_1)), -2_Int8), -3_Int8) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, intDiv(intDiv(sin(divide(__table1.a, __table1.b)), -2_Int8), -3_Int8) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, 
intDiv(intDiv(sin(divide(__table1.a, __table1.b)), -2_Int8), -3_Int8) ASC -- Aliases select * from (select *, a + b as x from tab) order by x * c; 0 0 0 0 0 @@ -196,8 +196,8 @@ select * from (select *, a + b as x from tab) order by x * c; 4 4 4 4 8 4 4 4 4 8 select * from (explain plan actions = 1 select * from (select *, a + b as x from tab) order by x * c) where explain like '%sort description%'; - Prefix sort description: multiply(x_4, c_2) ASC - Result sort description: multiply(x_4, c_2) ASC + Prefix sort description: multiply(__table1.x, __table1.c) ASC + Result sort description: multiply(__table1.x, __table1.c) ASC select * from (select *, a + b as x, a / b as y from tab) order by x * c, sin(y); 0 0 0 0 0 nan 0 0 0 0 0 nan @@ -210,8 +210,8 @@ select * from (select *, a + b as x, a / b as y from tab) order by x * c, sin(y) 4 4 4 4 8 1 4 4 4 4 8 1 select * from (explain plan actions = 1 select * from (select *, a + b as x, a / b as y from tab) order by x * c, sin(y)) where explain like '%sort description%'; - Prefix sort description: multiply(x_4, c_2) ASC, sin(y_5) ASC - Result sort description: multiply(x_4, c_2) ASC, sin(y_5) ASC + Prefix sort description: multiply(__table1.x, __table1.c) ASC, sin(__table1.y) ASC + Result sort description: multiply(__table1.x, __table1.c) ASC, sin(__table1.y) ASC select * from (select *, a / b as y from (select *, a + b as x from tab)) order by x * c, sin(y); 0 0 0 0 0 nan 0 0 0 0 0 nan @@ -224,8 +224,8 @@ select * from (select *, a / b as y from (select *, a + b as x from tab)) order 4 4 4 4 8 1 4 4 4 4 8 1 select * from (explain plan actions = 1 select * from (select *, a / b as y from (select *, a + b as x from tab)) order by x * c, sin(y)) where explain like '%sort description%'; - Prefix sort description: multiply(x_4, c_2) ASC, sin(y_5) ASC - Result sort description: multiply(x_4, c_2) ASC, sin(y_5) ASC + Prefix sort description: multiply(__table1.x, __table1.c) ASC, sin(__table1.y) ASC + Result sort description: multiply(__table1.x, __table1.c) ASC, sin(__table1.y) ASC -- { echoOn } select * from tab2 order by toTimeZone(toTimezone(x, 'UTC'), 'CET'), intDiv(intDiv(y, -2), -3); @@ -238,8 +238,8 @@ select * from tab2 order by toTimeZone(toTimezone(x, 'UTC'), 'CET'), intDiv(intD 2020-02-05 00:00:00 3 3 2020-02-05 00:00:00 3 3 select * from (explain plan actions = 1 select * from tab2 order by toTimeZone(toTimezone(x, 'UTC'), 'CET'), intDiv(intDiv(y, -2), -3)) where explain like '%sort description%'; - Prefix sort description: toTimezone(toTimezone(x_0, \'UTC\'_String), \'CET\'_String) ASC, intDiv(intDiv(y_1, -2_Int8), -3_Int8) ASC - Result sort description: toTimezone(toTimezone(x_0, \'UTC\'_String), \'CET\'_String) ASC, intDiv(intDiv(y_1, -2_Int8), -3_Int8) ASC + Prefix sort description: toTimezone(toTimezone(__table1.x, \'UTC\'_String), \'CET\'_String) ASC, intDiv(intDiv(__table1.y, -2_Int8), -3_Int8) ASC + Result sort description: toTimezone(toTimezone(__table1.x, \'UTC\'_String), \'CET\'_String) ASC, intDiv(intDiv(__table1.y, -2_Int8), -3_Int8) ASC select * from tab2 order by toStartOfDay(x), intDiv(intDiv(y, -2), -3); 2020-02-02 00:00:00 0 0 2020-02-02 00:00:00 0 0 @@ -250,12 +250,12 @@ select * from tab2 order by toStartOfDay(x), intDiv(intDiv(y, -2), -3); 2020-02-05 00:00:00 3 3 2020-02-05 00:00:00 3 3 select * from (explain plan actions = 1 select * from tab2 order by toStartOfDay(x), intDiv(intDiv(y, -2), -3)) where explain like '%sort description%'; - Prefix sort description: toStartOfDay(x_0) ASC - Result sort description: 
toStartOfDay(x_0) ASC, intDiv(intDiv(y_1, -2_Int8), -3_Int8) ASC + Prefix sort description: toStartOfDay(__table1.x) ASC + Result sort description: toStartOfDay(__table1.x) ASC, intDiv(intDiv(__table1.y, -2_Int8), -3_Int8) ASC -- select * from tab2 where toTimezone(x, 'CET') = '2020-02-03 01:00:00' order by intDiv(intDiv(y, -2), -3); select * from (explain plan actions = 1 select * from tab2 where toTimezone(x, 'CET') = '2020-02-03 01:00:00' order by intDiv(intDiv(y, -2), -3)) where explain like '%sort description%'; - Prefix sort description: intDiv(intDiv(y_1, -2_Int8), -3_Int8) ASC - Result sort description: intDiv(intDiv(y_1, -2_Int8), -3_Int8) ASC + Prefix sort description: intDiv(intDiv(__table1.y, -2_Int8), -3_Int8) ASC + Result sort description: intDiv(intDiv(__table1.y, -2_Int8), -3_Int8) ASC -- { echoOn } -- Union (not fully supported) @@ -281,8 +281,8 @@ select * from (select * from tab union all select * from tab3) order by (a + b) 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from (select * from tab union all select * from tab3) order by (a + b) * c, sin(a / b)) where explain like '%sort description%' or explain like '%ReadType%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder ReadType: InOrder select * from (select * from tab where (a + b) * c = 8 union all select * from tab3 where (a + b) * c = 18) order by sin(a / b); @@ -291,8 +291,8 @@ select * from (select * from tab where (a + b) * c = 8 union all select * from t 3 3 3 3 3 3 3 3 select * from (explain plan actions = 1 select * from (select * from tab where (a + b) * c = 8 union all select * from tab3 where (a + b) * c = 18) order by sin(a / b)) where explain like '%sort description%' or explain like '%ReadType%'; - Prefix sort description: sin(divide(a_0, b_1)) ASC - Result sort description: sin(divide(a_0, b_1)) ASC + Prefix sort description: sin(divide(__table1.a, __table1.b)) ASC + Result sort description: sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder ReadType: InOrder select * from (select * from tab where (a + b) * c = 8 union all select * from tab4) order by sin(a / b); @@ -309,8 +309,8 @@ select * from (select * from tab where (a + b) * c = 8 union all select * from t 0 0 0 0 0 0 0 0 select * from (explain plan actions = 1 select * from (select * from tab where (a + b) * c = 8 union all select * from tab4) order by sin(a / b)) where explain like '%sort description%' or explain like '%ReadType%'; - Prefix sort description: sin(divide(a_0, b_1)) ASC - Result sort description: sin(divide(a_0, b_1)) ASC + Prefix sort description: sin(divide(__table1.a, __table1.b)) ASC + Result sort description: sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder ReadType: InOrder select * from (select * from tab union all select * from tab5) order by (a + b) * c; @@ -335,8 +335,8 @@ select * from (select * from tab union all select * from tab5) order by (a + b) 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from (select * from tab union all select * from tab5) order by (a + b) * c) where explain like '%sort description%' or explain like '%ReadType%'; - Prefix sort description: 
multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC ReadType: InOrder ReadType: InOrder select * from (select * from tab union all select * from tab5) order by (a + b) * c, sin(a / b); @@ -361,11 +361,11 @@ select * from (select * from tab union all select * from tab5) order by (a + b) 4 4 4 4 4 4 4 4 select * from (explain plan actions = 1 select * from (select * from tab union all select * from tab5) order by (a + b) * c, sin(a / b)) where explain like '%sort description%' or explain like '%ReadType%'; - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder -- Union with limit select * from (select * from tab union all select * from tab5) order by (a + b) * c, sin(a / b) limit 3; @@ -375,12 +375,12 @@ select * from (select * from tab union all select * from tab5) order by (a + b) select * from (explain plan actions = 1 select * from (select * from tab union all select * from tab5) order by (a + b) * c, sin(a / b) limit 3) where explain ilike '%sort description%' or explain like '%ReadType%' or explain like '%Limit%'; Limit (preliminary LIMIT (without OFFSET)) Limit 3 - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC Limit 3 ReadType: InOrder - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder -- In this example, we read-in-order from tab up to ((a + b) * c, sin(a / b)) and from tab5 up to ((a + b) * c). -- In case of tab5, there would be two finish sorting transforms: ((a + b) * c) -> ((a + b) * c, sin(a / b)) -> ((a + b) * c, sin(a / b), d). 
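The comment above explains the mechanism this test exercises: reading in the order of the table's sorting key up to a prefix of the requested ORDER BY, then extending that prefix with finish-sorting steps. As a minimal standalone sketch of the same effect (the table name prefix_tab is hypothetical and not part of this test suite): the sorting key covers only the (a, b) prefix of the requested order (a, b, c), so the plan should read in order up to that prefix and finish-sort by the remaining column.
-- minimal sketch, assuming a hypothetical table prefix_tab sorted by a prefix of the requested order
create table prefix_tab (a UInt64, b UInt64, c UInt64) engine = MergeTree order by (a, b);
insert into prefix_tab select number % 2, number % 3, number from numbers(10);
-- expected: the prefix sort description covers (a, b) only, while the result sort description covers all three columns
select * from (explain plan actions = 1 select * from prefix_tab order by a, b, c) where explain like '%sort description%';
drop table prefix_tab;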
@@ -393,14 +393,14 @@ select * from (select * from tab union all select * from tab5 union all select * select * from (explain plan actions = 1 select * from (select * from tab union all select * from tab5 union all select * from tab4) order by (a + b) * c, sin(a / b), d limit 3) where explain ilike '%sort description%' or explain like '%ReadType%' or explain like '%Limit%'; Limit (preliminary LIMIT (without OFFSET)) Limit 3 - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC, d_3 ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC, __table1.d ASC Limit 3 ReadType: InOrder - Prefix sort description: multiply(plus(a_0, b_1), c_2) ASC - Result sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC + Prefix sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC + Result sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC ReadType: InOrder - Sort description: multiply(plus(a_0, b_1), c_2) ASC, sin(divide(a_0, b_1)) ASC, d_3 ASC + Sort description: multiply(plus(__table1.a, __table1.b), __table1.c) ASC, sin(divide(__table1.a, __table1.b)) ASC, __table1.d ASC Limit 3 ReadType: Default drop table if exists tab; diff --git a/tests/queries/0_stateless/02932_kill_query_sleep.sh b/tests/queries/0_stateless/02932_kill_query_sleep.sh index 08c375b875d..84e84204aa1 100755 --- a/tests/queries/0_stateless/02932_kill_query_sleep.sh +++ b/tests/queries/0_stateless/02932_kill_query_sleep.sh @@ -8,18 +8,31 @@ CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) function wait_query_started() { local query_id="$1" - $CLICKHOUSE_CLIENT --query "SYSTEM FLUSH LOGS" - while [[ $($CLICKHOUSE_CLIENT --query="SELECT count() FROM system.query_log WHERE query_id='$query_id' AND current_database = currentDatabase()") == 0 ]]; do - sleep 0.1; - $CLICKHOUSE_CLIENT --query "SYSTEM FLUSH LOGS;" + timeout=60 + start=$EPOCHSECONDS + while [[ $($CLICKHOUSE_CLIENT --query="SELECT count() FROM system.processes WHERE query_id='$query_id'") == 0 ]]; do + if ((EPOCHSECONDS-start > timeout )); then + echo "Timeout while waiting for query $query_id to start" + exit 1 + fi + sleep 0.1 done } + function kill_query() { local query_id="$1" $CLICKHOUSE_CLIENT --query "KILL QUERY WHERE query_id='$query_id'" >/dev/null - while [[ $($CLICKHOUSE_CLIENT --query="SELECT count() FROM system.processes WHERE query_id='$query_id'") != 0 ]]; do sleep 0.1; done + timeout=60 + start=$EPOCHSECONDS + while [[ $($CLICKHOUSE_CLIENT --query="SELECT count() FROM system.processes WHERE query_id='$query_id'") != 0 ]]; do + if ((EPOCHSECONDS-start > timeout )); then + echo "Timeout while waiting for query $query_id to cancel" + exit 1 + fi + sleep 0.1 + done } diff --git a/tests/queries/0_stateless/02932_materialized_view_with_dropped_target_table_no_exception.reference b/tests/queries/0_stateless/02932_materialized_view_with_dropped_target_table_no_exception.reference new file mode 100644 index 00000000000..8fb8a08e3f9 --- /dev/null +++ b/tests/queries/0_stateless/02932_materialized_view_with_dropped_target_table_no_exception.reference @@ -0,0 +1,4 @@ +42 +42 +42 +42 diff --git 
a/tests/queries/0_stateless/02932_materialized_view_with_dropped_target_table_no_exception.sql b/tests/queries/0_stateless/02932_materialized_view_with_dropped_target_table_no_exception.sql new file mode 100644 index 00000000000..af6dbf24473 --- /dev/null +++ b/tests/queries/0_stateless/02932_materialized_view_with_dropped_target_table_no_exception.sql @@ -0,0 +1,21 @@ +set ignore_materialized_views_with_dropped_target_table = 1; +drop table if exists from_table; +drop table if exists to_table; +drop table if exists mv; + +create table from_table (x UInt32) engine=MergeTree order by x; +create table to_table (x UInt32) engine=MergeTree order by x; +create materialized view mv to to_table as select * from from_table; + +insert into from_table select 42; +select * from from_table; +select * from to_table; + +drop table to_table; + +insert into from_table select 42; +select * from from_table; + +drop table from_table; +drop view mv; + diff --git a/tests/queries/0_stateless/02933_change_cache_setting_without_restart.reference b/tests/queries/0_stateless/02933_change_cache_setting_without_restart.reference index d4dd4da0c5d..17a25d82824 100644 --- a/tests/queries/0_stateless/02933_change_cache_setting_without_restart.reference +++ b/tests/queries/0_stateless/02933_change_cache_setting_without_restart.reference @@ -1,7 +1,7 @@ -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 0 0 0 1 -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 10 1000 0 1 -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 5 1000 0 1 -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 15 1000 0 1 -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 2 1000 0 1 -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 0 1000 0 1 -134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 0 0 0 1 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 0 0 0 16 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 10 1000 0 16 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 5 1000 0 16 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 15 1000 0 16 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 2 1000 0 16 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 0 1000 0 16 +134217728 10000000 33554432 4194304 1 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02933/ 0 0 0 16 diff --git a/tests/queries/0_stateless/02940_json_array_of_unnamed_tuples_inference.reference b/tests/queries/0_stateless/02940_json_array_of_unnamed_tuples_inference.reference index aac3e471264..aac8c4f777e 100644 --- a/tests/queries/0_stateless/02940_json_array_of_unnamed_tuples_inference.reference +++ b/tests/queries/0_stateless/02940_json_array_of_unnamed_tuples_inference.reference @@ -1 +1 @@ -data Array(Tuple(Nullable(Int64), Tuple(a Nullable(Int64), b Nullable(Int64)), Nullable(Int64), Nullable(String))) +data Array(Tuple(Nullable(Int64), Tuple(\n a Nullable(Int64),\n b Nullable(Int64)), Nullable(Int64), Nullable(String))) diff --git 
a/tests/queries/0_stateless/02943_rmt_alter_metadata_merge_checksum_mismatch.reference b/tests/queries/0_stateless/02943_rmt_alter_metadata_merge_checksum_mismatch.reference new file mode 100644 index 00000000000..e69de29bb2d diff --git a/tests/queries/0_stateless/02943_rmt_alter_metadata_merge_checksum_mismatch.sh b/tests/queries/0_stateless/02943_rmt_alter_metadata_merge_checksum_mismatch.sh new file mode 100755 index 00000000000..431f59d7918 --- /dev/null +++ b/tests/queries/0_stateless/02943_rmt_alter_metadata_merge_checksum_mismatch.sh @@ -0,0 +1,98 @@ +#!/usr/bin/env bash +# Tags: no-parallel +# Tag no-parallel: failpoint is in use + +CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +# shellcheck source=../shell_config.sh +. "$CUR_DIR"/../shell_config.sh + +set -e + +function wait_part() +{ + local table=$1 && shift + local part=$1 && shift + + for ((i = 0; i < 100; ++i)); do + if [[ $($CLICKHOUSE_CLIENT -q "select count() from system.parts where database = '$CLICKHOUSE_DATABASE' and table = '$table' and active and name = '$part'") -eq 1 ]]; then + return + fi + sleep 0.1 + done + + echo "Part $table::$part did not appear" >&2 +} + +function restore_failpoints() +{ + # restore the failed entry via failpoints (to avoid endless errors in the logs) + $CLICKHOUSE_CLIENT -nm -q " + system enable failpoint replicated_queue_unfail_entries; + system sync replica $failed_replica; + system disable failpoint replicated_queue_unfail_entries; + " +} +trap restore_failpoints EXIT + +$CLICKHOUSE_CLIENT -nm --insert_keeper_fault_injection_probability=0 -q " + drop table if exists data_r1; + drop table if exists data_r2; + + create table data_r1 (key Int, value Int, index value_idx value type minmax) engine=ReplicatedMergeTree('/clickhouse/tables/{database}/data', '{table}') order by key; + create table data_r2 (key Int, value Int, index value_idx value type minmax) engine=ReplicatedMergeTree('/clickhouse/tables/{database}/data', '{table}') order by key; + + insert into data_r1 (key) values (1); -- part all_0_0_0 +" + +# fail ALTER_METADATA on one of the replicas +$CLICKHOUSE_CLIENT -nm -q " + system enable failpoint replicated_queue_fail_next_entry; + alter table data_r1 drop index value_idx settings alter_sync=0; -- part all_0_0_0_1 + + system sync replica data_r1 pull; + system sync replica data_r2 pull; +" + +# find the replica on which ALTER_METADATA succeeded +success_replica= +for ((i = 0; i < 100; ++i)); do + for table in data_r1 data_r2; do + mutations="$($CLICKHOUSE_CLIENT -q "select count() from system.mutations where database = '$CLICKHOUSE_DATABASE' and table = '$table' and is_done = 0")" + if [[ $mutations -eq 0 ]]; then + success_replica=$table + fi + done + if [[ -n $success_replica ]]; then + break + fi + sleep 0.1 +done +case "$success_replica" in + data_r1) failed_replica=data_r2;; + data_r2) failed_replica=data_r1;; + *) echo "ALTER_METADATA did not succeed on any replica" >&2 && exit 1;; +esac +mutations_on_failed_replica="$($CLICKHOUSE_CLIENT -q "select count() from system.mutations where database = '$CLICKHOUSE_DATABASE' and table = '$failed_replica' and is_done = 0")" +if [[ $mutations_on_failed_replica != 1 ]]; then + echo "Wrong number of mutations on failed replica $failed_replica: $mutations_on_failed_replica" >&2 +fi + +# This will create MERGE_PARTS; on the failed replica the part will be fetched from the source replica (since it does not have all the parts needed to execute the merge) +$CLICKHOUSE_CLIENT -q "optimize table $success_replica final settings optimize_throw_if_noop=1, 
alter_sync=1" # part all_0_0_1_1 + +$CLICKHOUSE_CLIENT -nm --insert_keeper_fault_injection_probability=0 -q " + insert into $success_replica (key) values (2); -- part all_2_2_0 + optimize table $success_replica final settings optimize_throw_if_noop=1, alter_sync=1; -- part all_0_2_2_1 + system sync replica $failed_replica pull; +" + +# Wait for the part to be merged on the failed replica, which will trigger CHECKSUM_DOESNT_MATCH +wait_part "$failed_replica" all_0_2_2_1 + +# CHECKSUM_DOESNT_MATCH is already triggered once the part has been fetched in case of ALTER_METADATA re-ordering, but let's restore the failpoints and sync the failed replica first. +restore_failpoints +trap '' EXIT + +$CLICKHOUSE_CLIENT -q "system flush logs" +# check for the error "Different number of files: 5 compressed (expected 3) and 2 uncompressed ones (expected 2). (CHECKSUM_DOESNT_MATCH)" +$CLICKHOUSE_CLIENT -q "select part_name, merge_reason, event_type, errorCodeToName(error) from system.part_log where database = '$CLICKHOUSE_DATABASE' and error != 0 order by event_time_microseconds" diff --git a/tests/queries/0_stateless/02943_tokenbf_and_ngrambf_indexes_support_match_function.reference b/tests/queries/0_stateless/02943_tokenbf_and_ngrambf_indexes_support_match_function.reference new file mode 100644 index 00000000000..1cf1644fe0a --- /dev/null +++ b/tests/queries/0_stateless/02943_tokenbf_and_ngrambf_indexes_support_match_function.reference @@ -0,0 +1,38 @@ +1 Hello ClickHouse +2 Hello World +1 Hello ClickHouse +2 Hello World + Granules: 6/6 + Granules: 2/6 + Granules: 6/6 + Granules: 2/6 + Granules: 6/6 + Granules: 2/6 + Granules: 6/6 + Granules: 2/6 +--- +1 Hello ClickHouse +2 Hello World +6 World Champion +1 Hello ClickHouse +2 Hello World +6 World Champion + Granules: 6/6 + Granules: 3/6 + Granules: 6/6 + Granules: 3/6 + Granules: 6/6 + Granules: 3/6 + Granules: 6/6 + Granules: 3/6 +--- +5 OLAP Database +5 OLAP Database + Granules: 6/6 + Granules: 1/6 + Granules: 6/6 + Granules: 1/6 + Granules: 6/6 + Granules: 1/6 + Granules: 6/6 + Granules: 1/6 diff --git a/tests/queries/0_stateless/02943_tokenbf_and_ngrambf_indexes_support_match_function.sql b/tests/queries/0_stateless/02943_tokenbf_and_ngrambf_indexes_support_match_function.sql new file mode 100644 index 00000000000..49d39c601ef --- /dev/null +++ b/tests/queries/0_stateless/02943_tokenbf_and_ngrambf_indexes_support_match_function.sql @@ -0,0 +1,185 @@ +DROP TABLE IF EXISTS tokenbf_tab; +DROP TABLE IF EXISTS ngrambf_tab; + +CREATE TABLE tokenbf_tab +( + id UInt32, + str String, + INDEX idx str TYPE tokenbf_v1(256, 2, 0) +) +ENGINE = MergeTree +ORDER BY id +SETTINGS index_granularity = 1; + +CREATE TABLE ngrambf_tab +( + id UInt32, + str String, + INDEX idx str TYPE ngrambf_v1(3, 256, 2, 0) +) +ENGINE = MergeTree +ORDER BY id +SETTINGS index_granularity = 1; + +INSERT INTO tokenbf_tab VALUES (1, 'Hello ClickHouse'), (2, 'Hello World'), (3, 'Good Weather'), (4, 'Say Hello'), (5, 'OLAP Database'), (6, 'World Champion'); +INSERT INTO ngrambf_tab VALUES (1, 'Hello ClickHouse'), (2, 'Hello World'), (3, 'Good Weather'), (4, 'Say Hello'), (5, 'OLAP Database'), (6, 'World Champion'); + +SELECT * FROM tokenbf_tab WHERE match(str, 'Hello (ClickHouse|World)') ORDER BY id; +SELECT * FROM ngrambf_tab WHERE match(str, 'Hello (ClickHouse|World)') ORDER BY id; + +-- Read 2/6 granules +-- Required string: 'Hello ' +-- Alternatives: 'Hello ClickHouse', 'Hello World' + +SELECT * +FROM +( + EXPLAIN PLAN indexes=1 + SELECT * FROM tokenbf_tab WHERE match(str, 'Hello (ClickHouse|World)') ORDER BY id +) +WHERE + 
explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 0; + +SELECT * +FROM +( + EXPLAIN PLAN indexes=1 + SELECT * FROM tokenbf_tab WHERE match(str, 'Hello (ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 1; + +SELECT * +FROM +( + EXPLAIN PLAN indexes=1 + SELECT * FROM ngrambf_tab WHERE match(str, 'Hello (ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 0; + +SELECT * +FROM +( + EXPLAIN PLAN indexes=1 + SELECT * FROM ngrambf_tab WHERE match(str, 'Hello (ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 1; + + +SELECT '---'; + +SELECT * FROM tokenbf_tab WHERE match(str, '.*(ClickHouse|World)') ORDER BY id; +SELECT * FROM ngrambf_tab WHERE match(str, '.*(ClickHouse|World)') ORDER BY id; + +-- Read 3/6 granules +-- Required string: - +-- Alternatives: 'ClickHouse', 'World' + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM tokenbf_tab WHERE match(str, '.*(ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 0; + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM tokenbf_tab WHERE match(str, '.*(ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 1; + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM ngrambf_tab WHERE match(str, '.*(ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 0; + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM ngrambf_tab WHERE match(str, '.*(ClickHouse|World)') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 1; + +SELECT '---'; + +SELECT * FROM tokenbf_tab WHERE match(str, 'OLAP.*') ORDER BY id; +SELECT * FROM ngrambf_tab WHERE match(str, 'OLAP.*') ORDER BY id; + +-- Read 1/6 granules +-- Required string: 'OLAP' +-- Alternatives: - + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM tokenbf_tab WHERE match(str, 'OLAP (.*?)*') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 0; +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM tokenbf_tab WHERE match(str, 'OLAP (.*?)*') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 1; + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM ngrambf_tab WHERE match(str, 'OLAP (.*?)*') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 0; + +SELECT * +FROM +( + EXPLAIN PLAN indexes = 1 + SELECT * FROM ngrambf_tab WHERE match(str, 'OLAP (.*?)*') ORDER BY id +) +WHERE + explain LIKE '%Granules: %' +SETTINGS + allow_experimental_analyzer = 1; + +DROP TABLE tokenbf_tab; +DROP TABLE ngrambf_tab; diff --git a/tests/queries/0_stateless/02944_dynamically_change_filesystem_cache_size.reference b/tests/queries/0_stateless/02944_dynamically_change_filesystem_cache_size.reference index 8620171cb99..4a6bc8498e1 100644 --- a/tests/queries/0_stateless/02944_dynamically_change_filesystem_cache_size.reference +++ b/tests/queries/0_stateless/02944_dynamically_change_filesystem_cache_size.reference @@ -1,20 +1,20 @@ -100 10 10 10 0 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 1 +100 10 10 10 0 0 0 0 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 16 0 10 98 set max_size 
from 100 to 10 -10 10 10 10 0 0 8 1 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 1 +10 10 10 10 0 0 8 1 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 16 1 8 set max_size from 10 to 100 -100 10 10 10 0 0 8 1 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 1 +100 10 10 10 0 0 8 1 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 16 10 98 set max_elements from 10 to 2 -100 2 10 10 0 0 18 2 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 1 +100 2 10 10 0 0 18 2 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 16 2 18 set max_elements from 2 to 10 -100 10 10 10 0 0 18 2 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 1 +100 10 10 10 0 0 18 2 /var/lib/clickhouse/filesystem_caches/s3_cache_02944/ 5 5000 0 16 10 98 diff --git a/tests/queries/0_stateless/02952_conjunction_optimization.reference b/tests/queries/0_stateless/02952_conjunction_optimization.reference index 64663cea662..eeadfaae21d 100644 --- a/tests/queries/0_stateless/02952_conjunction_optimization.reference +++ b/tests/queries/0_stateless/02952_conjunction_optimization.reference @@ -9,7 +9,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02952_disjunction_optimization + TABLE id: 3, alias: __table1, table_name: default.02952_disjunction_optimization WHERE FUNCTION id: 5, function_name: notIn, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -27,7 +27,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02952_disjunction_optimization + TABLE id: 3, alias: __table1, table_name: default.02952_disjunction_optimization WHERE FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: Bool ARGUMENTS @@ -48,7 +48,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02952_disjunction_optimization + TABLE id: 3, alias: __table1, table_name: default.02952_disjunction_optimization WHERE FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -73,7 +73,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02952_disjunction_optimization + TABLE id: 3, alias: __table1, table_name: default.02952_disjunction_optimization WHERE FUNCTION id: 5, function_name: and, function_type: ordinary, result_type: UInt8 ARGUMENTS @@ -100,7 +100,7 @@ QUERY id: 0 COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3 COLUMN id: 4, column_name: b, result_type: String, source_id: 3 JOIN TREE - TABLE id: 3, table_name: default.02952_disjunction_optimization + TABLE id: 3, alias: __table1, table_name: default.02952_disjunction_optimization WHERE FUNCTION id: 5, function_name: or, function_type: ordinary, result_type: UInt8 ARGUMENTS diff --git a/tests/queries/0_stateless/02955_analyzer_using_functional_args.reference b/tests/queries/0_stateless/02955_analyzer_using_functional_args.reference new file mode 100644 index 00000000000..6ed281c757a --- /dev/null +++ b/tests/queries/0_stateless/02955_analyzer_using_functional_args.reference @@ -0,0 +1,2 @@ +1 +1 diff 
--git a/tests/queries/0_stateless/02955_analyzer_using_functional_args.sql b/tests/queries/0_stateless/02955_analyzer_using_functional_args.sql new file mode 100644 index 00000000000..7983b43d7e5 --- /dev/null +++ b/tests/queries/0_stateless/02955_analyzer_using_functional_args.sql @@ -0,0 +1,12 @@ +CREATE TABLE t1 (x Int16, y ALIAS x + x * 2) ENGINE=MergeTree() ORDER BY x; +CREATE TABLE t2 (y Int16, z Int16) ENGINE=MergeTree() ORDER BY y; + +INSERT INTO t1 VALUES (1231), (123); +INSERT INTO t2 VALUES (6666, 48); +INSERT INTO t2 VALUES (369, 50); + +SELECT count() FROM t1 INNER JOIN t2 USING (y); +SELECT count() FROM t2 INNER JOIN t1 USING (y); + +DROP TABLE IF EXISTS t1; +DROP TABLE IF EXISTS t2; diff --git a/tests/queries/0_stateless/02961_read_bool_as_string_json.reference b/tests/queries/0_stateless/02961_read_bool_as_string_json.reference new file mode 100644 index 00000000000..56f15989a45 --- /dev/null +++ b/tests/queries/0_stateless/02961_read_bool_as_string_json.reference @@ -0,0 +1,12 @@ +true +false +str +true +false +str +['true','false'] +['false','true'] +['str1','str2'] +['true','false'] +['false','true'] +['str1','str2'] diff --git a/tests/queries/0_stateless/02961_read_bool_as_string_json.sql b/tests/queries/0_stateless/02961_read_bool_as_string_json.sql new file mode 100644 index 00000000000..b9f4a7926f9 --- /dev/null +++ b/tests/queries/0_stateless/02961_read_bool_as_string_json.sql @@ -0,0 +1,9 @@ +set input_format_json_read_bools_as_strings=1; +select * from format(JSONEachRow, 'x String', '{"x" : true}, {"x" : false}, {"x" : "str"}'); +select * from format(JSONEachRow, '{"x" : true}, {"x" : false}, {"x" : "str"}'); +select * from format(JSONEachRow, 'x String', '{"x" : tru}'); -- {serverError CANNOT_PARSE_INPUT_ASSERTION_FAILED} +select * from format(JSONEachRow, 'x String', '{"x" : fals}'); -- {serverError CANNOT_PARSE_INPUT_ASSERTION_FAILED} +select * from format(JSONEachRow, 'x String', '{"x" : atru}'); -- {serverError INCORRECT_DATA} +select * from format(JSONEachRow, 'x Array(String)', '{"x" : [true, false]}, {"x" : [false, true]}, {"x" : ["str1", "str2"]}'); +select * from format(JSONEachRow, '{"x" : [true, false]}, {"x" : [false, true]}, {"x" : ["str1", "str2"]}'); + diff --git a/utils/list-versions/version_date.tsv b/utils/list-versions/version_date.tsv index 53ad807c44b..b2983033e44 100644 --- a/utils/list-versions/version_date.tsv +++ b/utils/list-versions/version_date.tsv @@ -1,7 +1,10 @@ +v23.12.2.59-stable 2024-01-05 v23.12.1.1368-stable 2023-12-28 +v23.11.4.24-stable 2024-01-05 v23.11.3.23-stable 2023-12-21 v23.11.2.11-stable 2023-12-13 v23.11.1.2711-stable 2023-12-06 +v23.10.6.60-stable 2024-01-05 v23.10.5.20-stable 2023-11-25 v23.10.4.25-stable 2023-11-17 v23.10.3.5-stable 2023-11-10 @@ -13,6 +16,7 @@ v23.9.4.11-stable 2023-11-08 v23.9.3.12-stable 2023-10-31 v23.9.2.56-stable 2023-10-19 v23.9.1.1854-stable 2023-09-29 +v23.8.9.54-lts 2024-01-05 v23.8.8.20-lts 2023-11-25 v23.8.7.24-lts 2023-11-17 v23.8.6.16-lts 2023-11-08 @@ -41,6 +45,7 @@ v23.4.4.16-stable 2023-06-17 v23.4.3.48-stable 2023-06-12 v23.4.2.11-stable 2023-05-02 v23.4.1.1943-stable 2023-04-27 +v23.3.19.32-lts 2024-01-05 v23.3.18.15-lts 2023-11-25 v23.3.17.13-lts 2023-11-17 v23.3.16.7-lts 2023-11-08