Merge branch 'master' into file_default_value

This commit is contained in:
Kruglov Pavel 2022-07-21 12:27:20 +02:00 committed by GitHub
commit b98cb546f6
138 changed files with 2356 additions and 465 deletions

View File

@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v22.7, 2022-07-21](#227)**<br>
**[ClickHouse release v22.6, 2022-06-16](#226)**<br>
**[ClickHouse release v22.5, 2022-05-19](#225)**<br>
**[ClickHouse release v22.4, 2022-04-20](#224)**<br>
@ -7,6 +8,171 @@
**[ClickHouse release v22.1, 2022-01-18](#221)**<br>
**[Changelog for 2021](https://clickhouse.com/docs/en/whats-new/changelog/2021/)**<br>
### <a id="227"></a> ClickHouse release 22.7, 2022-07-21
#### Upgrade Notes
* Enable setting `enable_positional_arguments` by default. It allows queries like `SELECT ... ORDER BY 1, 2` where 1 and 2 refer to positions in the SELECT clause. If you need to return the old behavior, disable this setting (see the example after this list). [#38204](https://github.com/ClickHouse/ClickHouse/pull/38204) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* `Ordinary` database engine and old storage definition syntax for `*MergeTree` tables are deprecated. By default it's not possible to create new databases with `Ordinary` engine. If `system` database has `Ordinary` engine it will be automatically converted to `Atomic` on server startup. There are settings to keep old behavior (`allow_deprecated_database_ordinary` and `allow_deprecated_syntax_for_merge_tree`), but these settings may be removed in future releases. [#38335](https://github.com/ClickHouse/ClickHouse/pull/38335) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Force rewriting comma join to inner join by default (set default value `cross_to_inner_join_rewrite = 2`). To get the old behavior, set `cross_to_inner_join_rewrite = 1`. If you face any incompatibilities, you can turn this setting back. [#39326](https://github.com/ClickHouse/ClickHouse/pull/39326) ([Vladimir C](https://github.com/vdimir)).
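A minimal sketch of the positional-arguments change above (the table name and data are hypothetical):

```sql
-- With enable_positional_arguments = 1 (the new default), 1 and 2 below refer
-- to positions in the SELECT clause.
CREATE TABLE visits (site String, hits UInt64) ENGINE = Memory;
INSERT INTO visits VALUES ('a', 10), ('b', 20), ('a', 5);
SELECT site, sum(hits) FROM visits GROUP BY 1 ORDER BY 2 DESC;
-- Equivalent to: GROUP BY site ORDER BY sum(hits) DESC.
-- To restore the old behavior: SET enable_positional_arguments = 0;
```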
#### New Feature
* Support expressions with window functions. Closes [#19857](https://github.com/ClickHouse/ClickHouse/issues/19857). [#37848](https://github.com/ClickHouse/ClickHouse/pull/37848) ([Dmitry Novik](https://github.com/novikd)).
* Add new `direct` join algorithm for `EmbeddedRocksDB` tables, see [#33582](https://github.com/ClickHouse/ClickHouse/issues/33582). [#35363](https://github.com/ClickHouse/ClickHouse/pull/35363) ([Vladimir C](https://github.com/vdimir)).
* Added full sorting merge join algorithm. [#35796](https://github.com/ClickHouse/ClickHouse/pull/35796) ([Vladimir C](https://github.com/vdimir)).
* Implement NATS table engine, which allows publish/subscribe interaction with NATS. Closes [#32388](https://github.com/ClickHouse/ClickHouse/issues/32388). [#37171](https://github.com/ClickHouse/ClickHouse/pull/37171) ([tchepavel](https://github.com/tchepavel)). ([Kseniia Sumarokova](https://github.com/kssenii))
* Implement table function `mongodb`. Allow writes into `MongoDB` storage / table function. [#37213](https://github.com/ClickHouse/ClickHouse/pull/37213) ([aaapetrenko](https://github.com/aaapetrenko)). ([Kseniia Sumarokova](https://github.com/kssenii))
* Add SQLInsert output format. Closes [#38441](https://github.com/ClickHouse/ClickHouse/issues/38441). [#38477](https://github.com/ClickHouse/ClickHouse/pull/38477) ([Kruglov Pavel](https://github.com/Avogar)).
* Add `compatibility` setting and `system.settings_changes` system table that contains information about changes in settings through ClickHouse versions. Closes [#35972](https://github.com/ClickHouse/ClickHouse/issues/35972). [#38957](https://github.com/ClickHouse/ClickHouse/pull/38957) ([Kruglov Pavel](https://github.com/Avogar)).
* Add functions `translate(string, from_string, to_string)` and `translateUTF8(string, from_string, to_string)`. They replace characters in a string according to a one-to-one mapping (see the sketch after this list). [#38935](https://github.com/ClickHouse/ClickHouse/pull/38935) ([Nikolay Degterinsky](https://github.com/evillique)).
* Support `parseTimeDelta` function. The characters ` ;-+,:` can be used as separators, e.g. `1yr-2mo`, `2m:6s`: `SELECT parseTimeDelta('1yr-2mo-4w + 12 days, 3 hours : 1 minute ; 33 seconds')`. [#39071](https://github.com/ClickHouse/ClickHouse/pull/39071) ([jiahui-97](https://github.com/jiahui-97)).
* Added `CREATE TABLE ... EMPTY AS SELECT` query. It automatically deduces the table structure from the SELECT query, but does not fill the table after creation (see the sketch after this list). Resolves [#38049](https://github.com/ClickHouse/ClickHouse/issues/38049). [#38272](https://github.com/ClickHouse/ClickHouse/pull/38272) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Added options to limit IO operations with remote storage: `max_remote_read_network_bandwidth_for_server` and `max_remote_write_network_bandwidth_for_server`. [#39095](https://github.com/ClickHouse/ClickHouse/pull/39095) ([Sergei Trifonov](https://github.com/serxa)).
* Add `group_by_use_nulls` setting to make aggregation key columns nullable in the case of ROLLUP, CUBE and GROUPING SETS (see the sketch after this list). Closes [#37359](https://github.com/ClickHouse/ClickHouse/issues/37359). [#38642](https://github.com/ClickHouse/ClickHouse/pull/38642) ([Dmitry Novik](https://github.com/novikd)).
* Add the ability to specify compression level during data export. [#38907](https://github.com/ClickHouse/ClickHouse/pull/38907) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add an option to require explicit grants to SELECT from the `system` database. Details: [#38970](https://github.com/ClickHouse/ClickHouse/pull/38970) ([Vitaly Baranov](https://github.com/vitlibar)).
* Functions `multiMatchAny`, `multiMatchAnyIndex`, `multiMatchAllIndices` and their fuzzy variants now accept non-const pattern array argument. [#38485](https://github.com/ClickHouse/ClickHouse/pull/38485) ([Robert Schulze](https://github.com/rschu1ze)). SQL function `multiSearchAllPositions` now accepts non-const needle arguments. [#39167](https://github.com/ClickHouse/ClickHouse/pull/39167) ([Robert Schulze](https://github.com/rschu1ze)).
* Add a setting `zstd_window_log_max` to configure max memory usage on zstd decoding when importing external files. Closes [#35693](https://github.com/ClickHouse/ClickHouse/issues/35693). [#37015](https://github.com/ClickHouse/ClickHouse/pull/37015) ([wuxiaobai24](https://github.com/wuxiaobai24)).
* Add `send_logs_source_regexp` setting. Send server text logs whose source name matches the specified regexp. An empty value means all sources. [#39161](https://github.com/ClickHouse/ClickHouse/pull/39161) ([Amos Bird](https://github.com/amosbird)).
* Support `ALTER` for `Hive` tables. [#38214](https://github.com/ClickHouse/ClickHouse/pull/38214) ([lgbo](https://github.com/lgbo-ustc)).
* Support `isNullable` function. This function checks whether its argument is nullable and returns 1 or 0. Closes [#38611](https://github.com/ClickHouse/ClickHouse/issues/38611). [#38841](https://github.com/ClickHouse/ClickHouse/pull/38841) ([lokax](https://github.com/lokax)).
* Added Base58 encoding/decoding. [#38159](https://github.com/ClickHouse/ClickHouse/pull/38159) ([Andrey Zvonov](https://github.com/zvonand)).
* Add chart visualization to Play UI. [#38197](https://github.com/ClickHouse/ClickHouse/pull/38197) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Added L2 Squared distance and norm functions for both arrays and tuples. [#38545](https://github.com/ClickHouse/ClickHouse/pull/38545) ([Julian Gilyadov](https://github.com/israelg99)).
* Add ability to pass HTTP headers to the `url` table function / storage via SQL. Closes [#37897](https://github.com/ClickHouse/ClickHouse/issues/37897). [#38176](https://github.com/ClickHouse/ClickHouse/pull/38176) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add `clickhouse-diagnostics` binary to the packages. [#38647](https://github.com/ClickHouse/ClickHouse/pull/38647) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
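A small illustration of the new `translate` function described above; the input string and character mapping are made up for the example:

```sql
-- Replaces 'l' with '0' and 'o' with '1', character by character.
SELECT translate('Hello, World!', 'lo', '01');
-- Expected result: 'He001, W1r0d!'
-- translateUTF8 works the same way for non-ASCII characters.
```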
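A sketch of `CREATE TABLE ... EMPTY AS SELECT` from the list above; the source table `hits` and engine choice are hypothetical:

```sql
-- Deduce the structure from the SELECT, but leave the new table empty.
CREATE TABLE hits_copy ENGINE = MergeTree ORDER BY tuple()
EMPTY AS SELECT * FROM hits;
```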
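A sketch of the `group_by_use_nulls` setting mentioned above, assuming its behavior matches the description (key columns become `NULL` in subtotal rows):

```sql
SET group_by_use_nulls = 1;
SELECT number % 2 AS k, count() AS c
FROM numbers(10)
GROUP BY k WITH ROLLUP;
-- With the setting enabled, the total row reports k = NULL instead of k = 0,
-- so real groups and ROLLUP subtotals are no longer ambiguous.
```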
#### Experimental Feature
* Adds new setting `implicit_transaction` to run standalone queries inside a transaction. It handles both creation and closing (via COMMIT if the query succeeded or ROLLBACK if it didn't) of the transaction automatically. [#38344](https://github.com/ClickHouse/ClickHouse/pull/38344) ([Raúl Marín](https://github.com/Algunenano)).
#### Performance Improvement
* Distinct optimization for sorted columns. Use a specialized distinct transformation when the input stream is sorted by the column(s) used in DISTINCT. The optimization can be applied to pre-distinct, final distinct, or both. Initial implementation by @dimarub2000. [#37803](https://github.com/ClickHouse/ClickHouse/pull/37803) ([Igor Nikonov](https://github.com/devcrafter)).
* Improve performance of `ORDER BY`, `MergeTree` merges, window functions using batch version of `BinaryHeap`. [#38022](https://github.com/ClickHouse/ClickHouse/pull/38022) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix significant join performance regression which was introduced in https://github.com/ClickHouse/ClickHouse/pull/35616. It's interesting that common join queries such as the SSB queries had been 10 times slower for almost 3 months while no one complained. [#38052](https://github.com/ClickHouse/ClickHouse/pull/38052) ([Amos Bird](https://github.com/amosbird)).
* Migrate from the Intel hyperscan library to vectorscan, which speeds up many string-matching functions on non-x86 platforms. [#38171](https://github.com/ClickHouse/ClickHouse/pull/38171) ([Robert Schulze](https://github.com/rschu1ze)).
* Increased parallelism of query plan steps executed after aggregation. [#38295](https://github.com/ClickHouse/ClickHouse/pull/38295) ([Nikita Taranov](https://github.com/nickitat)).
* Improve performance of insertion to columns of type `JSON`. [#38320](https://github.com/ClickHouse/ClickHouse/pull/38320) ([Anton Popov](https://github.com/CurtizJ)).
* Optimized insertion and lookups in the HashTable. [#38413](https://github.com/ClickHouse/ClickHouse/pull/38413) ([Nikita Taranov](https://github.com/nickitat)).
* Fix performance degradation from [#32493](https://github.com/ClickHouse/ClickHouse/issues/32493). [#38417](https://github.com/ClickHouse/ClickHouse/pull/38417) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Improve performance of joining with numeric columns using SIMD instructions. [#37235](https://github.com/ClickHouse/ClickHouse/pull/37235) ([zzachimed](https://github.com/zzachimed)). [#38565](https://github.com/ClickHouse/ClickHouse/pull/38565) ([Maksim Kita](https://github.com/kitaisreal)).
* Norm and distance functions for arrays were sped up by 1.2-2 times. [#38740](https://github.com/ClickHouse/ClickHouse/pull/38740) ([Alexander Gololobov](https://github.com/davenger)).
* Add AVX-512 VBMI optimized `copyOverlap32Shuffle` for LZ4 decompression. In other words, LZ4 decompression performance is improved. [#37891](https://github.com/ClickHouse/ClickHouse/pull/37891) ([Guo Wangyang](https://github.com/guowangy)).
* `ORDER BY (a, b)` now gets all the same benefits as `ORDER BY a, b` (see the example after this list). [#38873](https://github.com/ClickHouse/ClickHouse/pull/38873) ([Igor Nikonov](https://github.com/devcrafter)).
* Align branches within a 32B boundary to make benchmarks more stable. It improves performance by 1-2% on average on Intel CPUs. [#38988](https://github.com/ClickHouse/ClickHouse/pull/38988) ([Guo Wangyang](https://github.com/guowangy)).
* Executable UDFs, executable dictionaries, and Executable tables no longer waste one second waiting for subprocess termination. [#38929](https://github.com/ClickHouse/ClickHouse/pull/38929) ([Constantine Peresypkin](https://github.com/pkit)).
* Push down filter to the right side of a sorting join. [#39123](https://github.com/ClickHouse/ClickHouse/pull/39123) ([Vladimir C](https://github.com/vdimir)).
* Optimize accesses to `system.stack_trace` table if not all columns are selected. [#39177](https://github.com/ClickHouse/ClickHouse/pull/39177) ([Azat Khuzhin](https://github.com/azat)).
* Improve isNullable/isConstant/isNull/isNotNull performance for LowCardinality argument. [#39192](https://github.com/ClickHouse/ClickHouse/pull/39192) ([Kruglov Pavel](https://github.com/Avogar)).
* Optimized processing of ORDER BY in window functions. [#34632](https://github.com/ClickHouse/ClickHouse/pull/34632) ([Vladimir Chebotarev](https://github.com/excitoon)).
* The table `system.asynchronous_metric_log` is further optimized for storage space. This closes [#38134](https://github.com/ClickHouse/ClickHouse/issues/38134). See the [YouTube video](https://www.youtube.com/watch?v=0fSp9SF8N8A). [#38428](https://github.com/ClickHouse/ClickHouse/pull/38428) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
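A minimal example of the `ORDER BY (a, b)` improvement above; the table is hypothetical:

```sql
CREATE TABLE t (a UInt32, b UInt32) ENGINE = MergeTree ORDER BY (a, b);
-- Both spellings should now benefit from the same read-in-order optimization:
SELECT * FROM t ORDER BY (a, b) LIMIT 10;  -- tuple form
SELECT * FROM t ORDER BY a, b LIMIT 10;    -- plain form
```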
#### Improvement
* Support SQL standard CREATE INDEX and DROP INDEX syntax. [#35166](https://github.com/ClickHouse/ClickHouse/pull/35166) ([Jianmei Zhang](https://github.com/zhangjmruc)).
* Send profile events for INSERT queries (previously only SELECT was supported). [#37391](https://github.com/ClickHouse/ClickHouse/pull/37391) ([Azat Khuzhin](https://github.com/azat)).
* Implement in order aggregation (`optimize_aggregation_in_order`) for fully materialized projections. [#37469](https://github.com/ClickHouse/ClickHouse/pull/37469) ([Azat Khuzhin](https://github.com/azat)).
* Remove subprocess run for kerberos initialization. Added new integration test. Closes [#27651](https://github.com/ClickHouse/ClickHouse/issues/27651). [#38105](https://github.com/ClickHouse/ClickHouse/pull/38105) ([Roman Vasin](https://github.com/rvasin)).
* Add setting `multiple_joins_try_to_keep_original_names` to not rewrite identifier names during the multiple-JOINs rewrite, close [#34697](https://github.com/ClickHouse/ClickHouse/issues/34697). [#38149](https://github.com/ClickHouse/ClickHouse/pull/38149) ([Vladimir C](https://github.com/vdimir)).
* Improved trace-visualizer UX. [#38169](https://github.com/ClickHouse/ClickHouse/pull/38169) ([Sergei Trifonov](https://github.com/serxa)).
* Enable stack trace collection and query profiler for AArch64. [#38181](https://github.com/ClickHouse/ClickHouse/pull/38181) ([Maksim Kita](https://github.com/kitaisreal)).
* Do not skip symlinks in `user_defined` directory during SQL user defined functions loading. Closes [#38042](https://github.com/ClickHouse/ClickHouse/issues/38042). [#38184](https://github.com/ClickHouse/ClickHouse/pull/38184) ([Maksim Kita](https://github.com/kitaisreal)).
* Added background cleanup of subdirectories in `store/`. In some cases clickhouse-server might leave garbage subdirectories in `store/` (for example, on unsuccessful table creation), and those dirs were never removed. Fixes [#33710](https://github.com/ClickHouse/ClickHouse/issues/33710). [#38265](https://github.com/ClickHouse/ClickHouse/pull/38265) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add `DESCRIBE CACHE` query to show cache settings from config. Add `SHOW CACHES` query to show available filesystem caches list. [#38279](https://github.com/ClickHouse/ClickHouse/pull/38279) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add access check for `system drop filesystem cache`. Support ON CLUSTER. [#38319](https://github.com/ClickHouse/ClickHouse/pull/38319) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix PostgreSQL database engine incompatibility on upgrade from 21.3 to 22.3. Closes [#36659](https://github.com/ClickHouse/ClickHouse/issues/36659). [#38369](https://github.com/ClickHouse/ClickHouse/pull/38369) ([Kseniia Sumarokova](https://github.com/kssenii)).
* `filesystemAvailable` and similar functions now work in `clickhouse-local`. This closes [#38423](https://github.com/ClickHouse/ClickHouse/issues/38423). [#38424](https://github.com/ClickHouse/ClickHouse/pull/38424) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add `revision` function. [#38555](https://github.com/ClickHouse/ClickHouse/pull/38555) ([Azat Khuzhin](https://github.com/azat)).
* Fix GCS via proxy tunnel usage. [#38726](https://github.com/ClickHouse/ClickHouse/pull/38726) ([Azat Khuzhin](https://github.com/azat)).
* Support `\i file` in clickhouse client / local (similar to psql \i). [#38813](https://github.com/ClickHouse/ClickHouse/pull/38813) ([Kseniia Sumarokova](https://github.com/kssenii)).
* New option `optimize = 1` in `EXPLAIN AST`. If enabled, it shows the AST after it is rewritten; otherwise it shows the AST of the original query. Disabled by default (see the example after this list). [#38910](https://github.com/ClickHouse/ClickHouse/pull/38910) ([Igor Nikonov](https://github.com/devcrafter)).
* Allow trailing comma in columns list. Closes [#38425](https://github.com/ClickHouse/ClickHouse/issues/38425). [#38440](https://github.com/ClickHouse/ClickHouse/pull/38440) ([chen](https://github.com/xiedeyantu)).
* Bugfixes and performance improvements for `parallel_hash` JOIN method. [#37648](https://github.com/ClickHouse/ClickHouse/pull/37648) ([Vladimir C](https://github.com/vdimir)).
* Support hadoop secure RPC transfer (hadoop.rpc.protection=privacy and hadoop.rpc.protection=integrity). [#37852](https://github.com/ClickHouse/ClickHouse/pull/37852) ([Peng Liu](https://github.com/michael1589)).
* Add struct type support in `StorageHive`. [#38118](https://github.com/ClickHouse/ClickHouse/pull/38118) ([lgbo](https://github.com/lgbo-ustc)).
* S3 single objects are now removed with `RemoveObjectRequest`. Implement compatibility with GCP, which did not allow using `removeFileIfExists`, effectively breaking approximately half of the `remove` functionality. Add automatic detection of the `DeleteObjects` S3 API, which is not supported by GCS. This allows using GCS without an explicit `support_batch_delete=0` in the configuration. [#37882](https://github.com/ClickHouse/ClickHouse/pull/37882) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Expose basic ClickHouse Keeper related monitoring data (via ProfileEvents and CurrentMetrics). [#38072](https://github.com/ClickHouse/ClickHouse/pull/38072) ([lingpeng0314](https://github.com/lingpeng0314)).
* Support `auto_close` option for PostgreSQL engine connection. Closes [#31486](https://github.com/ClickHouse/ClickHouse/issues/31486). [#38363](https://github.com/ClickHouse/ClickHouse/pull/38363) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Allow `NULL` modifier in columns declaration for table functions. [#38816](https://github.com/ClickHouse/ClickHouse/pull/38816) ([Kruglov Pavel](https://github.com/Avogar)).
* Deactivate `mutations_finalizing_task` before shutdown to avoid benign `TABLE_IS_READ_ONLY` errors during shutdown. [#38851](https://github.com/ClickHouse/ClickHouse/pull/38851) ([Raúl Marín](https://github.com/Algunenano)).
* Eliminate unnecessary waiting of SELECT queries after ALTER queries in the presence of INSERT queries if you use deprecated Ordinary databases. [#38864](https://github.com/ClickHouse/ClickHouse/pull/38864) ([Azat Khuzhin](https://github.com/azat)).
* Stop reporting Zookeeper "Node exists" exceptions in system.errors when they are expected. [#38961](https://github.com/ClickHouse/ClickHouse/pull/38961) ([Raúl Marín](https://github.com/Algunenano)).
* `clickhouse-keeper`: add support for real-time digest calculation and verification. It is disabled by default. [#37555](https://github.com/ClickHouse/ClickHouse/pull/37555) ([Antonio Andelic](https://github.com/antonio2368)).
* Allow specifying globs (`*` or `{expr1, expr2, expr3}`) inside a key for the `clickhouse-extract-from-config` tool. [#38966](https://github.com/ClickHouse/ClickHouse/pull/38966) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* clearOldLogs: Don't report KEEPER_EXCEPTION on concurrent deletes. [#39016](https://github.com/ClickHouse/ClickHouse/pull/39016) ([Raúl Marín](https://github.com/Algunenano)).
* clickhouse-keeper improvement: persist meta-information about keeper servers to disk. [#39069](https://github.com/ClickHouse/ClickHouse/pull/39069) ([Antonio Andelic](https://github.com/antonio2368)). This will make it easier to operate if you shut down or restart all keeper nodes at the same time.
* Continue without exception when running out of disk space when using filesystem cache. [#39106](https://github.com/ClickHouse/ClickHouse/pull/39106) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Handle SIGTERM signals from Kubernetes (k8s). [#39130](https://github.com/ClickHouse/ClickHouse/pull/39130) ([Timur Solodovnikov](https://github.com/tsolodov)).
* Add `merge_algorithm` column (Undecided, Horizontal, Vertical) to system.part_log. [#39181](https://github.com/ClickHouse/ClickHouse/pull/39181) ([Azat Khuzhin](https://github.com/azat)).
* Don't increment a counter in `system.errors` when the disk is not rotational. [#39216](https://github.com/ClickHouse/ClickHouse/pull/39216) ([Raúl Marín](https://github.com/Algunenano)).
* The metric `result_bytes` for `INSERT` queries in `system.query_log` shows the number of bytes inserted. Previously the value was incorrect and stored the same value as `result_rows`. [#39225](https://github.com/ClickHouse/ClickHouse/pull/39225) ([Ilya Yatsishin](https://github.com/qoega)).
* The CPU usage metric in clickhouse-client will be displayed in a better way. Fixes [#38756](https://github.com/ClickHouse/ClickHouse/issues/38756). [#39280](https://github.com/ClickHouse/ClickHouse/pull/39280) ([Sergei Trifonov](https://github.com/serxa)).
* Rethrow exception on filesystem cache initialization on server startup, better error message. [#39386](https://github.com/ClickHouse/ClickHouse/pull/39386) ([Kseniia Sumarokova](https://github.com/kssenii)).
* OpenTelemetry now collects traces without Processors spans by default (there are too many). To enable Processors spans collection, use the `opentelemetry_trace_processors` setting. [#39170](https://github.com/ClickHouse/ClickHouse/pull/39170) ([Ilya Yatsishin](https://github.com/qoega)).
* Functions `multiMatch[Fuzzy](AllIndices/Any/AnyIndex)` - don't throw a logical error if the needle argument is empty. [#39012](https://github.com/ClickHouse/ClickHouse/pull/39012) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow to declare `RabbitMQ` queue without default arguments `x-max-length` and `x-overflow`. [#39259](https://github.com/ClickHouse/ClickHouse/pull/39259) ([rnbondarenko](https://github.com/rnbondarenko)).
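A sketch of the new `EXPLAIN AST` option mentioned above, assuming it follows the usual `EXPLAIN <type> key = value` option syntax:

```sql
-- Show the AST after the rewriting/optimization passes,
-- instead of the AST of the original query text.
EXPLAIN AST optimize = 1
SELECT number FROM numbers(10) WHERE number IN (1 + 1, 2 + 2);
```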
#### Build/Testing/Packaging Improvement
* Apply Clang Thread Safety Analysis (TSA) annotations to ClickHouse. [#38068](https://github.com/ClickHouse/ClickHouse/pull/38068) ([Robert Schulze](https://github.com/rschu1ze)).
* Adapt universal installation script for FreeBSD. [#39302](https://github.com/ClickHouse/ClickHouse/pull/39302) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Preparation for building on `s390x` platform. [#39193](https://github.com/ClickHouse/ClickHouse/pull/39193) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Fix a bug in `jemalloc` library [#38757](https://github.com/ClickHouse/ClickHouse/pull/38757) ([Azat Khuzhin](https://github.com/azat)).
* Hardware benchmark now has support for automatic results uploading. [#38427](https://github.com/ClickHouse/ClickHouse/pull/38427) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* System table "system.licenses" is now correctly populated on Mac (Darwin). [#38294](https://github.com/ClickHouse/ClickHouse/pull/38294) ([Robert Schulze](https://github.com/rschu1ze)).
* Change `all|noarch` packages to architecture-dependent. Fix some documentation for it. Push `aarch64|arm64` packages to artifactory and release assets. Fixes [#36443](https://github.com/ClickHouse/ClickHouse/issues/36443). [#38580](https://github.com/ClickHouse/ClickHouse/pull/38580) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Fix rounding for `Decimal128/Decimal256` with more than 19-digits long scale. [#38027](https://github.com/ClickHouse/ClickHouse/pull/38027) ([Igor Nikonov](https://github.com/devcrafter)).
* Fixed crash caused by data race in storage `Hive` (integration table engine). [#38887](https://github.com/ClickHouse/ClickHouse/pull/38887) ([lgbo](https://github.com/lgbo-ustc)).
* Fix crash when executing GRANT ALL ON *.* with ON CLUSTER. It was broken in https://github.com/ClickHouse/ClickHouse/pull/35767. This closes [#38618](https://github.com/ClickHouse/ClickHouse/issues/38618). [#38674](https://github.com/ClickHouse/ClickHouse/pull/38674) ([Vitaly Baranov](https://github.com/vitlibar)).
* Correct glob expansion in case of `{0..10}` forms. Fixes [#38498](https://github.com/ClickHouse/ClickHouse/issues/38498). The current implementation is similar to what shell does, as mentioned by @rschu1ze [here](https://github.com/ClickHouse/ClickHouse/pull/38502#issuecomment-1169057723). [#38502](https://github.com/ClickHouse/ClickHouse/pull/38502) ([Heena Bansal](https://github.com/HeenaBansal2009)).
* Fix crash for `mapUpdate`, `mapFilter` functions when using with constant map argument. Closes [#38547](https://github.com/ClickHouse/ClickHouse/issues/38547). [#38553](https://github.com/ClickHouse/ClickHouse/pull/38553) ([hexiaoting](https://github.com/hexiaoting)).
* Fix `toHour` monotonicity information for query optimization which can lead to incorrect query result (incorrect index analysis). This fixes [#38333](https://github.com/ClickHouse/ClickHouse/issues/38333). [#38675](https://github.com/ClickHouse/ClickHouse/pull/38675) ([Amos Bird](https://github.com/amosbird)).
* Fix checking whether s3 storage supports parallel writes. It resulted in s3 parallel writes not working. [#38792](https://github.com/ClickHouse/ClickHouse/pull/38792) ([chen](https://github.com/xiedeyantu)).
* Fix s3 seekable reads with parallel read buffer. (Affected memory usage during query). Closes [#38258](https://github.com/ClickHouse/ClickHouse/issues/38258). [#38802](https://github.com/ClickHouse/ClickHouse/pull/38802) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update `simdjson`. This fixes [#38621](https://github.com/ClickHouse/ClickHouse/issues/38621) - a buffer overflow on machines with the latest Intel CPUs with AVX-512 VBMI. [#38838](https://github.com/ClickHouse/ClickHouse/pull/38838) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix possible logical error for Vertical merges. [#38859](https://github.com/ClickHouse/ClickHouse/pull/38859) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix settings profile with seconds unit. [#38896](https://github.com/ClickHouse/ClickHouse/pull/38896) ([Raúl Marín](https://github.com/Algunenano)).
* Fix incorrect partition pruning when there is a nullable partition key. Note: most likely you don't use nullable partition keys - this is an obscure feature you should not use. Nullable keys are a nonsense and this feature is only needed for some crazy use-cases. This fixes [#38941](https://github.com/ClickHouse/ClickHouse/issues/38941). [#38946](https://github.com/ClickHouse/ClickHouse/pull/38946) ([Amos Bird](https://github.com/amosbird)).
* Improve `fsync_part_directory` for fetches. [#38993](https://github.com/ClickHouse/ClickHouse/pull/38993) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible deadlock inside `OvercommitTracker`. Fixes [#37794](https://github.com/ClickHouse/ClickHouse/issues/37794). [#39030](https://github.com/ClickHouse/ClickHouse/pull/39030) ([Dmitry Novik](https://github.com/novikd)).
* Fix bug in filesystem cache that could happen in some corner case which coincided with cache capacity hitting the limit. Closes [#39066](https://github.com/ClickHouse/ClickHouse/issues/39066). [#39070](https://github.com/ClickHouse/ClickHouse/pull/39070) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix some corner cases of interpretation of the arguments of window expressions. Fixes [#38538](https://github.com/ClickHouse/ClickHouse/issues/38538) Allow using of higher-order functions in window expressions. [#39112](https://github.com/ClickHouse/ClickHouse/pull/39112) ([Dmitry Novik](https://github.com/novikd)).
* Keep `LowCardinality` type in `tuple` function. Previously `LowCardinality` type was dropped and elements of created tuple had underlying type of `LowCardinality`. [#39113](https://github.com/ClickHouse/ClickHouse/pull/39113) ([Anton Popov](https://github.com/CurtizJ)).
* Fix error `Block structure mismatch` which could happen for INSERT into table with attached MATERIALIZED VIEW and enabled setting `extremes = 1`. Closes [#29759](https://github.com/ClickHouse/ClickHouse/issues/29759) and [#38729](https://github.com/ClickHouse/ClickHouse/issues/38729). [#39125](https://github.com/ClickHouse/ClickHouse/pull/39125) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix unexpected query result when both `optimize_trivial_count_query` and `empty_result_for_aggregation_by_empty_set` are set to true. This fixes [#39140](https://github.com/ClickHouse/ClickHouse/issues/39140). [#39155](https://github.com/ClickHouse/ClickHouse/pull/39155) ([Amos Bird](https://github.com/amosbird)).
* Fixed error `Not found column Type in block` in selects with `PREWHERE` and read-in-order optimizations. [#39157](https://github.com/ClickHouse/ClickHouse/pull/39157) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix an extremely rare race condition during hardlink creation for remote filesystems. The only way to reproduce it is a concurrent run of backups. [#39190](https://github.com/ClickHouse/ClickHouse/pull/39190) ([alesapin](https://github.com/alesapin)).
* (zero-copy replication is an experimental feature that should not be used in production) Fix fetch of in-memory part with `allow_remote_fs_zero_copy_replication`. [#39214](https://github.com/ClickHouse/ClickHouse/pull/39214) ([Azat Khuzhin](https://github.com/azat)).
* (MaterializedPostgreSQL - experimental feature). Fix segmentation fault in MaterializedPostgreSQL database engine, which could happen if some exception occurred at replication initialisation. Closes [#36939](https://github.com/ClickHouse/ClickHouse/issues/36939). [#39272](https://github.com/ClickHouse/ClickHouse/pull/39272) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix incorrect fetch of table metadata from PostgreSQL database engine. Closes [#33502](https://github.com/ClickHouse/ClickHouse/issues/33502). [#39283](https://github.com/ClickHouse/ClickHouse/pull/39283) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix projection exception when aggregation keys are wrapped inside other functions. This fixes [#37151](https://github.com/ClickHouse/ClickHouse/issues/37151). [#37155](https://github.com/ClickHouse/ClickHouse/pull/37155) ([Amos Bird](https://github.com/amosbird)).
* Fix possible logical error `... with argument with type Nothing and default implementation for Nothing is expected to return result with type Nothing, got ...` in some functions. Closes: [#37610](https://github.com/ClickHouse/ClickHouse/issues/37610) Closes: [#37741](https://github.com/ClickHouse/ClickHouse/issues/37741). [#37759](https://github.com/ClickHouse/ClickHouse/pull/37759) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix incorrect columns order in subqueries of UNION (in case of duplicated columns in subselects may produce incorrect result). [#37887](https://github.com/ClickHouse/ClickHouse/pull/37887) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect work of MODIFY ALTER Column with column names that contain dots. Closes [#37907](https://github.com/ClickHouse/ClickHouse/issues/37907). [#37971](https://github.com/ClickHouse/ClickHouse/pull/37971) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix reading of sparse columns from `MergeTree` tables that store their data in S3. [#37978](https://github.com/ClickHouse/ClickHouse/pull/37978) ([Anton Popov](https://github.com/CurtizJ)).
* Fix possible crash in `Distributed` async insert in case of removing a replica from config. [#38029](https://github.com/ClickHouse/ClickHouse/pull/38029) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix "Missing columns" for GLOBAL JOIN with CTE without alias. [#38056](https://github.com/ClickHouse/ClickHouse/pull/38056) ([Azat Khuzhin](https://github.com/azat)).
* Rewrite tuple functions as literals in backwards-compatibility mode. [#38096](https://github.com/ClickHouse/ClickHouse/pull/38096) ([Anton Kozlov](https://github.com/tonickkozlov)).
* Fix redundant memory reservation for output block during `ORDER BY`. [#38127](https://github.com/ClickHouse/ClickHouse/pull/38127) ([iyupeng](https://github.com/iyupeng)).
* Fix possible logical error `Bad cast from type DB::IColumn* to DB::ColumnNullable*` in array mapped functions. Closes [#38006](https://github.com/ClickHouse/ClickHouse/issues/38006). [#38132](https://github.com/ClickHouse/ClickHouse/pull/38132) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix temporary name clash in partial merge join, close [#37928](https://github.com/ClickHouse/ClickHouse/issues/37928). [#38135](https://github.com/ClickHouse/ClickHouse/pull/38135) ([Vladimir C](https://github.com/vdimir)).
* Fix a minor issue with queries like `CREATE TABLE nested_name_tuples (`a` Tuple(x String, y Tuple(i Int32, j String))) ENGINE = Memory;` [#38136](https://github.com/ClickHouse/ClickHouse/pull/38136) ([lgbo](https://github.com/lgbo-ustc)).
* Fix bug with nested short-circuit functions that led to execution of arguments even if condition is false. Closes [#38040](https://github.com/ClickHouse/ClickHouse/issues/38040). [#38173](https://github.com/ClickHouse/ClickHouse/pull/38173) ([Kruglov Pavel](https://github.com/Avogar)).
* (Window View is an experimental feature) Fix LOGICAL_ERROR for WINDOW VIEW with incorrect structure. [#38205](https://github.com/ClickHouse/ClickHouse/pull/38205) ([Azat Khuzhin](https://github.com/azat)).
* Update librdkafka submodule to fix crash when an OAUTHBEARER refresh callback is set. [#38225](https://github.com/ClickHouse/ClickHouse/pull/38225) ([Rafael Acevedo](https://github.com/racevedoo)).
* Fix INSERT into Distributed hung due to ProfileEvents. [#38307](https://github.com/ClickHouse/ClickHouse/pull/38307) ([Azat Khuzhin](https://github.com/azat)).
* Fix retries in PostgreSQL engine. [#38310](https://github.com/ClickHouse/ClickHouse/pull/38310) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix optimization in PartialSortingTransform (SIGSEGV and possible incorrect result). [#38324](https://github.com/ClickHouse/ClickHouse/pull/38324) ([Azat Khuzhin](https://github.com/azat)).
* Fix RabbitMQ with formats based on PeekableReadBuffer. Closes [#38061](https://github.com/ClickHouse/ClickHouse/issues/38061). [#38356](https://github.com/ClickHouse/ClickHouse/pull/38356) ([Kseniia Sumarokova](https://github.com/kssenii)).
* (MaterializedPostgreSQL is an experimental feature.) Fix possible `Invalid number of rows in Chunk` in MaterializedPostgreSQL. Closes [#37323](https://github.com/ClickHouse/ClickHouse/issues/37323). [#38360](https://github.com/ClickHouse/ClickHouse/pull/38360) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix RabbitMQ configuration with connection string setting. Closes [#36531](https://github.com/ClickHouse/ClickHouse/issues/36531). [#38365](https://github.com/ClickHouse/ClickHouse/pull/38365) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix PostgreSQL engine not using PostgreSQL schema when retrieving array dimension size. Closes [#36755](https://github.com/ClickHouse/ClickHouse/issues/36755). Closes [#36772](https://github.com/ClickHouse/ClickHouse/issues/36772). [#38366](https://github.com/ClickHouse/ClickHouse/pull/38366) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix possibly incorrect result of distributed queries with `DISTINCT` and `LIMIT`. Fixes [#38282](https://github.com/ClickHouse/ClickHouse/issues/38282). [#38371](https://github.com/ClickHouse/ClickHouse/pull/38371) ([Anton Popov](https://github.com/CurtizJ)).
* Fix wrong results of countSubstrings() & position() on patterns with 0-bytes. [#38589](https://github.com/ClickHouse/ClickHouse/pull/38589) ([Robert Schulze](https://github.com/rschu1ze)).
* Now it's possible to start a clickhouse-server and attach/detach tables even for tables with the incorrect values of IPv4/IPv6 representation. Proper fix for issue [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#38590](https://github.com/ClickHouse/ClickHouse/pull/38590) ([alesapin](https://github.com/alesapin)).
* `rankCorr` function will work correctly if some arguments are NaNs. This closes [#38396](https://github.com/ClickHouse/ClickHouse/issues/38396). [#38722](https://github.com/ClickHouse/ClickHouse/pull/38722) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix `parallel_view_processing=1` with `optimize_trivial_insert_select=1`. Fix `max_insert_threads` while pushing to views. [#38731](https://github.com/ClickHouse/ClickHouse/pull/38731) ([Azat Khuzhin](https://github.com/azat)).
* Fix use-after-free for aggregate functions with `Map` combinator that leads to incorrect result. [#38748](https://github.com/ClickHouse/ClickHouse/pull/38748) ([Azat Khuzhin](https://github.com/azat)).
### <a id="226"></a> ClickHouse release 22.6, 2022-06-16
#### Backward Incompatible Change

View File

@ -1,4 +1,4 @@
cmake_minimum_required(VERSION 3.14)
cmake_minimum_required(VERSION 3.15)
project(ClickHouse LANGUAGES C CXX ASM)

View File

@ -1,176 +1,68 @@
#include "atomic.h"
#include <sys/auxv.h>
#include <fcntl.h> // open
#include <sys/stat.h> // O_RDONLY
#include <unistd.h> // read, close
#include <stdlib.h> // ssize_t
#include <stdio.h> // perror, fprintf
#include <link.h> // ElfW
#include "atomic.h"
#include <unistd.h> // __environ
#include <errno.h>
#define ARRAY_SIZE(a) sizeof((a))/sizeof((a[0]))
// We don't have libc struct available here.
// Compute aux vector manually (from /proc/self/auxv).
//
// Right now there is only 51 AT_* constants,
// so 64 should be enough until this implementation will be replaced with musl.
static unsigned long __auxv_procfs[64];
// We don't have libc struct available here. Compute aux vector manually.
static unsigned long * __auxv = NULL;
static unsigned long __auxv_secure = 0;
// Common
static unsigned long * __auxv_environ = NULL;
static void * volatile getauxval_func;
static unsigned long __auxv_init_environ(unsigned long type);
//
// auxv from procfs interface
//
ssize_t __retry_read(int fd, void * buf, size_t count)
{
for (;;)
{
ssize_t ret = read(fd, buf, count);
if (ret == -1)
{
if (errno == EINTR)
{
continue;
}
perror("Cannot read /proc/self/auxv");
abort();
}
return ret;
}
}
unsigned long __getauxval_procfs(unsigned long type)
{
if (type == AT_SECURE)
{
return __auxv_secure;
}
if (type >= ARRAY_SIZE(__auxv_procfs))
{
errno = ENOENT;
return 0;
}
return __auxv_procfs[type];
}
static unsigned long __auxv_init_procfs(unsigned long type)
{
// For debugging:
// - od -t dL /proc/self/auxv
// - LD_SHOW_AUX= ls
int fd = open("/proc/self/auxv", O_RDONLY);
// It is possible in case of:
// - no procfs mounted
// - on android you are not able to read it unless running from shell or debugging
// - some other issues
if (fd == -1)
{
// Fallback to environ.
a_cas_p(&getauxval_func, (void *)__auxv_init_procfs, (void *)__auxv_init_environ);
return __auxv_init_environ(type);
}
ElfW(auxv_t) aux;
/// NOTE: sizeof(aux) is very small (less than PAGE_SIZE), so partial read should not be possible.
_Static_assert(sizeof(aux) < 4096, "Unexpected sizeof(aux)");
while (__retry_read(fd, &aux, sizeof(aux)) == sizeof(aux))
{
if (aux.a_type >= ARRAY_SIZE(__auxv_procfs))
{
fprintf(stderr, "AT_* is out of range: %li (maximum allowed is %zu)\n", aux.a_type, ARRAY_SIZE(__auxv_procfs));
abort();
}
if (__auxv_procfs[aux.a_type])
{
fprintf(stderr, "AUXV already has value (%zu)\n", __auxv_procfs[aux.a_type]);
abort();
}
__auxv_procfs[aux.a_type] = aux.a_un.a_val;
}
close(fd);
__auxv_secure = __getauxval_procfs(AT_SECURE);
// Now we've initialized __auxv_procfs, next time getauxval() will only call __get_auxval().
a_cas_p(&getauxval_func, (void *)__auxv_init_procfs, (void *)__getauxval_procfs);
return __getauxval_procfs(type);
}
//
// auxv from environ interface
//
// NOTE: environ available only after static initializers,
// so you cannot rely on this if you need getauxval() before.
//
// Good example of such user is sanitizers, for example
// LSan will not work with __auxv_init_environ(),
// since it needs getauxval() before.
//
static size_t __find_auxv(unsigned long type)
{
size_t i;
for (i = 0; __auxv_environ[i]; i += 2)
for (i = 0; __auxv[i]; i += 2)
{
if (__auxv_environ[i] == type)
{
if (__auxv[i] == type)
return i + 1;
}
}
return (size_t) -1;
}
unsigned long __getauxval_environ(unsigned long type)
unsigned long __getauxval(unsigned long type)
{
if (type == AT_SECURE)
return __auxv_secure;
if (__auxv_environ)
if (__auxv)
{
size_t index = __find_auxv(type);
if (index != ((size_t) -1))
return __auxv_environ[index];
return __auxv[index];
}
errno = ENOENT;
return 0;
}
static unsigned long __auxv_init_environ(unsigned long type)
static void * volatile getauxval_func;
static unsigned long __auxv_init(unsigned long type)
{
if (!__environ)
{
// __environ is not initialized yet so we can't initialize __auxv_environ right now.
// __environ is not initialized yet so we can't initialize __auxv right now.
// That's normally occurred only when getauxval() is called from some sanitizer's internal code.
errno = ENOENT;
return 0;
}
// Initialize __auxv_environ and __auxv_secure.
// Initialize __auxv and __auxv_secure.
size_t i;
for (i = 0; __environ[i]; i++);
__auxv_environ = (unsigned long *) (__environ + i + 1);
__auxv = (unsigned long *) (__environ + i + 1);
size_t secure_idx = __find_auxv(AT_SECURE);
if (secure_idx != ((size_t) -1))
__auxv_secure = __auxv_environ[secure_idx];
__auxv_secure = __auxv[secure_idx];
// Now we need to switch to __getauxval_environ for all later calls, since
// everything is initialized.
a_cas_p(&getauxval_func, (void *)__auxv_init_environ, (void *)__getauxval_environ);
// Now we've initialized __auxv, next time getauxval() will only call __get_auxval().
a_cas_p(&getauxval_func, (void *)__auxv_init, (void *)__getauxval);
return __getauxval_environ(type);
return __getauxval(type);
}
// Callchain:
// - __auxv_init_procfs -> __getauxval_environ
// - __auxv_init_procfs -> __auxv_init_environ -> __getauxval_environ
static void * volatile getauxval_func = (void *)__auxv_init_procfs;
// First time getauxval() will call __auxv_init().
static void * volatile getauxval_func = (void *)__auxv_init;
unsigned long getauxval(unsigned long type)
{

View File

@ -1,3 +1,19 @@
if (_CLICKHOUSE_TOOLCHAIN_FILE_LOADED)
# During first run of cmake the toolchain file will be loaded twice,
# - /usr/share/cmake-3.23/Modules/CMakeDetermineSystem.cmake
# - /bld/CMakeFiles/3.23.2/CMakeSystem.cmake
#
# But once you already have non-empty cmake cache it will be loaded only
# once:
# - /bld/CMakeFiles/3.23.2/CMakeSystem.cmake
#
# This has no harm except for double load of toolchain will add
# --gcc-toolchain multiple times that will not allow ccache to reuse the
# cache.
return()
endif()
set (_CLICKHOUSE_TOOLCHAIN_FILE_LOADED ON)
set (CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)
set (CMAKE_SYSTEM_NAME "Linux")

View File

@ -93,6 +93,18 @@ set (CMAKE_CXX_STANDARD 17)
set (LLVM_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/llvm/llvm")
set (LLVM_BINARY_DIR "${ClickHouse_BINARY_DIR}/contrib/llvm/llvm")
add_subdirectory ("${LLVM_SOURCE_DIR}" "${LLVM_BINARY_DIR}")
set_directory_properties (PROPERTIES
# due to llvm crosscompile cmake does not know how to clean it, and on clean
# will lead to the following error:
#
# ninja: error: remove(contrib/llvm/llvm/NATIVE): Directory not empty
#
ADDITIONAL_CLEAN_FILES "${LLVM_BINARY_DIR}"
# llvm's cmake configuring this file only when cmake runs,
# and after clean cmake will not know that it should re-run,
# add explicitly depends from llvm-config.h
CMAKE_CONFIGURE_DEPENDS "${LLVM_BINARY_DIR}/include/llvm/Config/llvm-config.h"
)
add_library (_llvm INTERFACE)
target_link_libraries (_llvm INTERFACE ${REQUIRED_LLVM_LIBRARIES})

View File

@ -46,6 +46,9 @@ def get_options(i, backward_compatibility_check):
if i == 13:
client_options.append("memory_tracker_fault_probability=0.001")
if i % 2 == 1 and not backward_compatibility_check:
client_options.append("group_by_use_nulls=1")
if client_options:
options.append(" --client-option " + " ".join(client_options))

View File

@ -75,7 +75,7 @@ This will create the `programs/clickhouse` executable, which can be used with `c
The build requires the following components:
- Git (is used only to checkout the sources, it's not needed for the build)
- CMake 3.14 or newer
- CMake 3.15 or newer
- Ninja
- C++ compiler: clang-14 or newer
- Linker: lld

View File

@ -3329,6 +3329,15 @@ Read more about [memory overcommit](memory-overcommit.md).
Default value: `1GiB`.
## compatibility {#compatibility}
This setting changes other settings according to provided ClickHouse version.
If a behaviour in ClickHouse was changed by using a different default value for some setting, this compatibility setting allows you to use default values from previous versions for all the settings that were not set by the user.
This setting takes ClickHouse version number as a string, like `21.3`, `21.8`. Empty value means that this setting is disabled.
Disabled by default.
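A short usage sketch (the chosen version is arbitrary):

```sql
-- Use default setting values as they were in ClickHouse 21.8 for any settings
-- the user has not changed explicitly.
SET compatibility = '21.8';

-- The new system table records how setting defaults changed across versions.
SELECT * FROM system.settings_changes LIMIT 5;
```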
# Format settings {#format-settings}
## input_format_skip_unknown_fields {#input_format_skip_unknown_fields}

View File

@ -102,9 +102,34 @@ void Client::processError(const String & query) const
}
void Client::showWarnings()
{
try
{
std::vector<String> messages = loadWarningMessages();
if (!messages.empty())
{
std::cout << "Warnings:" << std::endl;
for (const auto & message : messages)
std::cout << " * " << message << std::endl;
std::cout << std::endl;
}
}
catch (...)
{
/// Ignore exception
}
}
/// Make query to get all server warnings
std::vector<String> Client::loadWarningMessages()
{
/// Older server versions cannot execute the query loading warnings.
constexpr UInt64 min_server_revision_to_load_warnings = DBMS_MIN_PROTOCOL_VERSION_WITH_VIEW_IF_PERMITTED;
if (server_revision < min_server_revision_to_load_warnings)
return {};
std::vector<String> messages;
connection->sendQuery(connection_parameters.timeouts,
"SELECT * FROM viewIfPermitted(SELECT message FROM system.warnings ELSE null('message String'))",
@ -226,25 +251,9 @@ try
connect();
/// Load Warnings at the beginning of connection
/// Show warnings at the beginning of connection.
if (is_interactive && !config().has("no-warnings"))
{
try
{
std::vector<String> messages = loadWarningMessages();
if (!messages.empty())
{
std::cout << "Warnings:" << std::endl;
for (const auto & message : messages)
std::cout << " * " << message << std::endl;
std::cout << std::endl;
}
}
catch (...)
{
/// Ignore exception
}
}
showWarnings();
if (is_interactive && !delayed_interactive)
{
@ -370,7 +379,7 @@ void Client::connect()
}
server_version = toString(server_version_major) + "." + toString(server_version_minor) + "." + toString(server_version_patch);
load_suggestions = is_interactive && (server_revision >= Suggest::MIN_SERVER_REVISION && !config().getBool("disable_suggestion", false));
load_suggestions = is_interactive && (server_revision >= Suggest::MIN_SERVER_REVISION) && !config().getBool("disable_suggestion", false);
if (server_display_name = connection->getServerDisplayName(connection_parameters.timeouts); server_display_name.empty())
server_display_name = config().getString("host", "localhost");

View File

@ -45,6 +45,7 @@ protected:
private:
void printChangedSettings() const;
void showWarnings();
std::vector<String> loadWarningMessages();
};
}

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#pragma once
#include "ICommand.h"
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -15,6 +15,7 @@
#include <Poco/JSON/Stringifier.h>
#include <boost/algorithm/string/case_conv.hpp>
#include <boost/range/adaptor/map.hpp>
#include <base/range.h>
#include <filesystem>
#include <fstream>

View File

@ -8,6 +8,8 @@
namespace DB
{
struct Array;
Array getAggregateFunctionParametersArray(
const ASTPtr & expression_list,
const std::string & error_context,

View File

@ -7,6 +7,7 @@
#include <IO/Archives/hasRegisteredArchiveFileExtension.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <filesystem>
#include <Interpreters/Context.h>
namespace DB

View File

@ -376,7 +376,7 @@ if (TARGET ch_contrib::rdkafka)
endif()
if (TARGET ch_contrib::nats_io)
dbms_target_link_libraries(PRIVATE ch_contrib::nats_io)
dbms_target_link_libraries(PRIVATE ch_contrib::nats_io ch_contrib::uv)
endif()
if (TARGET ch_contrib::sasl2)

View File

@ -28,8 +28,8 @@ public:
template <typename ConnectionType>
void load(ContextPtr context, const ConnectionParameters & connection_parameters, Int32 suggestion_limit);
/// Older server versions cannot execute the query above.
static constexpr int MIN_SERVER_REVISION = 54406;
/// Older server versions cannot execute the query loading suggestions.
static constexpr int MIN_SERVER_REVISION = DBMS_MIN_PROTOCOL_VERSION_WITH_VIEW_IF_PERMITTED;
private:
void fetch(IServerConnection & connection, const ConnectionTimeouts & timeouts, const std::string & query);

View File

@ -793,4 +793,18 @@ ColumnPtr makeNullable(const ColumnPtr & column)
return ColumnNullable::create(column, ColumnUInt8::create(column->size(), 0));
}
ColumnPtr makeNullableSafe(const ColumnPtr & column)
{
if (isColumnNullable(*column))
return column;
if (isColumnConst(*column))
return ColumnConst::create(makeNullableSafe(assert_cast<const ColumnConst &>(*column).getDataColumnPtr()), column->size());
if (column->canBeInsideNullable())
return makeNullable(column);
return column;
}
}

View File

@ -223,5 +223,6 @@ private:
};
ColumnPtr makeNullable(const ColumnPtr & column);
ColumnPtr makeNullableSafe(const ColumnPtr & column);
}

View File

@ -45,7 +45,7 @@ void LRUFileCache::initialize()
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
return;
throw;
}
}
else
@ -841,7 +841,11 @@ void LRUFileCache::loadCacheInfoIntoMemory(std::lock_guard<std::mutex> & cache_l
/// cache_base_path / key_prefix / key / offset
if (!files.empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Cache already initialized");
throw Exception(
ErrorCodes::REMOTE_FS_OBJECT_CACHE_ERROR,
"Cache initialization is partially made. "
"This can be a result of a failed first attempt to initialize cache. "
"Please, check log for error messages");
fs::directory_iterator key_prefix_it{cache_base_path};
for (; key_prefix_it != fs::directory_iterator(); ++key_prefix_it)

View File

@ -342,6 +342,23 @@ OptimizedRegularExpressionImpl<thread_safe>::OptimizedRegularExpressionImpl(cons
}
}
template <bool thread_safe>
OptimizedRegularExpressionImpl<thread_safe>::OptimizedRegularExpressionImpl(OptimizedRegularExpressionImpl && rhs) noexcept
: is_trivial(rhs.is_trivial)
, required_substring_is_prefix(rhs.required_substring_is_prefix)
, is_case_insensitive(rhs.is_case_insensitive)
, required_substring(std::move(rhs.required_substring))
, re2(std::move(rhs.re2))
, number_of_subpatterns(rhs.number_of_subpatterns)
{
if (!required_substring.empty())
{
if (is_case_insensitive)
case_insensitive_substring_searcher.emplace(required_substring.data(), required_substring.size());
else
case_sensitive_substring_searcher.emplace(required_substring.data(), required_substring.size());
}
}
template <bool thread_safe>
bool OptimizedRegularExpressionImpl<thread_safe>::match(const char * subject, size_t subject_size) const

View File

@ -56,6 +56,9 @@ public:
using StringPieceType = std::conditional_t<thread_safe, re2::StringPiece, re2_st::StringPiece>;
OptimizedRegularExpressionImpl(const std::string & regexp_, int options = 0); /// NOLINT
/// StringSearcher store pointers to required_substring, it must be updated on move.
OptimizedRegularExpressionImpl(OptimizedRegularExpressionImpl && rhs) noexcept;
OptimizedRegularExpressionImpl(const OptimizedRegularExpressionImpl & rhs) = delete;
bool match(const std::string & subject) const
{

View File

@ -3,6 +3,7 @@
#include <memory>
#include <IO/ReadBufferFromFile.h>
#include <IO/WriteBufferFromFile.h>
#include <unordered_map>
namespace DB

View File

@ -1,33 +0,0 @@
#include <base/defines.h> // ADDRESS_SANITIZER
#ifdef ADDRESS_SANITIZER
#include <cstdlib>
#include <thread>
#include <gtest/gtest.h>
#include <sanitizer/lsan_interface.h>
/// Test that ensures that LSan works.
///
/// Regression test for the case when it may not work,
/// because of broken getauxval() [1].
///
/// [1]: https://github.com/ClickHouse/ClickHouse/pull/33957
TEST(Common, LSan)
{
int sanitizers_exit_code = 1;
ASSERT_EXIT({
std::thread leak_in_thread([]()
{
void * leak = malloc(4096);
ASSERT_NE(leak, nullptr);
});
leak_in_thread.join();
__lsan_do_leak_check();
}, ::testing::ExitedWithCode(sanitizers_exit_code), ".*LeakSanitizer: detected memory leaks.*");
}
#endif

View File

@ -1,5 +1,4 @@
#include <Coordination/CoordinationSettings.h>
#include <Core/Settings.h>
#include <Common/logger_useful.h>
#include <filesystem>
#include <Coordination/Defines.h>

View File

@ -21,6 +21,7 @@
#include <Poco/Util/AbstractConfiguration.h>
#include <Poco/Util/Application.h>
#include <Common/ZooKeeper/ZooKeeperIO.h>
#include <Common/Stopwatch.h>
namespace DB
{

View File

@ -43,9 +43,16 @@ class BaseSettings : public TTraits::Data
{
using CustomSettingMap = std::unordered_map<std::string_view, std::pair<std::shared_ptr<const String>, SettingFieldCustom>>;
public:
BaseSettings() = default;
BaseSettings(const BaseSettings &) = default;
BaseSettings(BaseSettings &&) noexcept = default;
BaseSettings & operator=(const BaseSettings &) = default;
BaseSettings & operator=(BaseSettings &&) noexcept = default;
virtual ~BaseSettings() = default;
using Traits = TTraits;
void set(std::string_view name, const Field & value);
virtual void set(std::string_view name, const Field & value);
Field get(std::string_view name) const;
void setString(std::string_view name, const String & value);
@ -62,6 +69,8 @@ public:
/// Resets all the settings to their default values.
void resetToDefault();
/// Resets specified setting to its default value.
void resetToDefault(std::string_view name);
bool has(std::string_view name) const { return hasBuiltin(name) || hasCustom(name); }
static bool hasBuiltin(std::string_view name);
@ -315,6 +324,14 @@ void BaseSettings<TTraits>::resetToDefault()
custom_settings_map.clear();
}
template <typename TTraits>
void BaseSettings<TTraits>::resetToDefault(std::string_view name)
{
const auto & accessor = Traits::Accessor::instance();
if (size_t index = accessor.find(name); index != static_cast<size_t>(-1))
accessor.resetValueToDefault(*this, index);
}
template <typename TTraits>
bool BaseSettings<TTraits>::hasBuiltin(std::string_view name)
{

View File

@ -52,8 +52,10 @@
/// NOTE: DBMS_TCP_PROTOCOL_VERSION has nothing common with VERSION_REVISION,
/// later is just a number for server version (one number instead of commit SHA)
/// for simplicity (sometimes it may be more convenient in some use cases).
#define DBMS_TCP_PROTOCOL_VERSION 54456
#define DBMS_TCP_PROTOCOL_VERSION 54457
#define DBMS_MIN_PROTOCOL_VERSION_WITH_INITIAL_QUERY_START_TIME 54449
#define DBMS_MIN_PROTOCOL_VERSION_WITH_PROFILE_EVENTS_IN_INSERT 54456
#define DBMS_MIN_PROTOCOL_VERSION_WITH_VIEW_IF_PERMITTED 54457

View File

@ -1,5 +1,6 @@
#include "Settings.h"
#include <Core/SettingsChangesHistory.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Columns/ColumnArray.h>
#include <Columns/ColumnMap.h>
@ -145,6 +146,53 @@ std::vector<String> Settings::getAllRegisteredNames() const
return all_settings;
}
void Settings::set(std::string_view name, const Field & value)
{
BaseSettings::set(name, value);
if (name == "compatibility")
applyCompatibilitySetting();
/// If we change a setting that was previously changed by the compatibility setting,
/// we should remove it from settings_changed_by_compatibility_setting;
/// otherwise, the next time the compatibility setting changes,
/// this setting would be changed too (and we don't want that).
else if (settings_changed_by_compatibility_setting.contains(name))
settings_changed_by_compatibility_setting.erase(name);
}
void Settings::applyCompatibilitySetting()
{
/// First, revert all changes applied by the previous compatibility setting.
for (const auto & setting_name : settings_changed_by_compatibility_setting)
resetToDefault(setting_name);
settings_changed_by_compatibility_setting.clear();
String compatibility = getString("compatibility");
/// If the setting value is empty, we don't need to change any settings.
if (compatibility.empty())
return;
ClickHouseVersion version(compatibility);
/// Iterate through ClickHouse versions in descending order and apply the reversed
/// changes for each version that is higher than the version from the compatibility setting.
for (auto it = settings_changes_history.rbegin(); it != settings_changes_history.rend(); ++it)
{
if (version >= it->first)
break;
/// Apply reversed changes from this version.
for (const auto & change : it->second)
{
/// If this setting was changed manually, we don't change it
if (isChanged(change.name) && !settings_changed_by_compatibility_setting.contains(change.name))
continue;
BaseSettings::set(change.name, change.previous_value);
settings_changed_by_compatibility_setting.insert(change.name);
}
}
}
IMPLEMENT_SETTINGS_TRAITS(FormatFactorySettingsTraits, FORMAT_FACTORY_SETTINGS)
}
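A hedged sketch, not part of the diff, of how the `compatibility` machinery above is expected to behave, relying only on the code shown here and on the 22.7 entries of settings_changes_history later in this commit:
void compatibilitySketch(DB::Settings & settings)
{
    settings.set("compatibility", "22.6");
    /// applyCompatibilitySetting() walks settings_changes_history from the newest version down and,
    /// for every version above 22.6, re-applies the previous values, e.g.
    /// cross_to_inner_join_rewrite back to 1 and enable_positional_arguments back to false.

    settings.set("enable_positional_arguments", true);
    /// A manual change erases the name from settings_changed_by_compatibility_setting,
    /// so a later change of `compatibility` will not override it again.
}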

View File

@ -35,6 +35,10 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
*
* `flags` can be either 0 or IMPORTANT.
* A setting is "IMPORTANT" if it affects the results of queries and can't be ignored by older versions.
*
* When adding new settings that control backward incompatible changes, or when changing the default values of existing settings,
* consider adding them to the settings changes history in SettingsChangesHistory.h so that the special `compatibility` setting
* works correctly.
*/
#define COMMON_SETTINGS(M) \
@ -132,6 +136,8 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
M(UInt64, aggregation_memory_efficient_merge_threads, 0, "Number of threads to use for merging intermediate aggregation results in memory-efficient mode. The bigger the value, the more memory is consumed. 0 means the same as 'max_threads'.", 0) \
M(Bool, enable_positional_arguments, true, "Enable positional arguments in ORDER BY, GROUP BY and LIMIT BY", 0) \
\
M(Bool, group_by_use_nulls, false, "Treat columns mentioned in ROLLUP, CUBE or GROUPING SETS as Nullable", 0) \
\
M(UInt64, max_parallel_replicas, 1, "The maximum number of replicas of each shard used when the query is executed. For consistency (to get different parts of the same partition), this option only works for the specified sampling key. The lag of the replicas is not controlled.", 0) \
M(UInt64, parallel_replicas_count, 0, "", 0) \
M(UInt64, parallel_replica_offset, 0, "", 0) \
@ -599,6 +605,11 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
M(Bool, allow_deprecated_database_ordinary, false, "Allow to create databases with deprecated Ordinary engine", 0) \
M(Bool, allow_deprecated_syntax_for_merge_tree, false, "Allow to create *MergeTree tables with deprecated engine definition syntax", 0) \
\
M(String, compatibility, "", "Changes other settings according to the provided ClickHouse version. If we know that we changed some behaviour in ClickHouse by changing the default values of some settings in some version, this compatibility setting will control those settings", 0) \
\
M(Map, additional_table_filters, "", "Additional filter expression that is applied after reading from the specified table. Syntax: {'table1': 'expression', 'database.table2': 'expression'}", 0) \
M(String, additional_result_filter, "", "Additional filter expression that is applied to the query result", 0) \
\
/** Experimental functions */ \
M(Bool, allow_experimental_funnel_functions, false, "Enable experimental functions for funnel analysis.", 0) \
M(Bool, allow_experimental_nlp_functions, false, "Enable experimental functions for natural language processing.", 0) \
@ -758,7 +769,7 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
M(Bool, output_format_pretty_row_numbers, false, "Add row numbers before each row for pretty output format", 0) \
M(Bool, insert_distributed_one_random_shard, false, "If setting is enabled, inserting into distributed table will choose a random shard to write when there is no sharding key", 0) \
\
M(UInt64, cross_to_inner_join_rewrite, 2, "Use inner join instead of comma/cross join if possible. Possible values: 0 - no rewrite, 1 - apply if possible for comma/cross, 2 - force rewrite all comma joins, cross - if possible", 0) \
M(UInt64, cross_to_inner_join_rewrite, 1, "Use inner join instead of comma/cross join if there're joining expressions in the WHERE section. Values: 0 - no rewrite, 1 - apply if possible for comma/cross, 2 - force rewrite all comma joins, cross - if possible", 0) \
\
M(Bool, output_format_arrow_low_cardinality_as_dictionary, false, "Enable output LowCardinality type as Dictionary Arrow type", 0) \
M(Bool, output_format_arrow_string_as_string, false, "Use Arrow String type instead of Binary for String columns", 0) \
@ -825,6 +836,13 @@ struct Settings : public BaseSettings<SettingsTraits>, public IHints<2, Settings
void addProgramOption(boost::program_options::options_description & options, const SettingFieldRef & field);
void addProgramOptionAsMultitoken(boost::program_options::options_description & options, const SettingFieldRef & field);
void set(std::string_view name, const Field & value) override;
private:
void applyCompatibilitySetting();
std::unordered_set<std::string_view> settings_changed_by_compatibility_setting;
};
/*

View File

@ -0,0 +1,113 @@
#pragma once
#include <Core/Field.h>
#include <Core/Settings.h>
#include <IO/ReadHelpers.h>
#include <boost/algorithm/string.hpp>
#include <map>
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
}
class ClickHouseVersion
{
public:
ClickHouseVersion(const String & version)
{
Strings split;
boost::split(split, version, [](char c){ return c == '.'; });
components.reserve(split.size());
if (split.empty())
throw Exception{ErrorCodes::BAD_ARGUMENTS, "Cannot parse ClickHouse version here: {}", version};
for (const auto & split_element : split)
{
size_t component;
if (!tryParse(component, split_element))
throw Exception{ErrorCodes::BAD_ARGUMENTS, "Cannot parse ClickHouse version here: {}", version};
components.push_back(component);
}
}
ClickHouseVersion(const char * version) : ClickHouseVersion(String(version)) {}
String toString() const
{
String version = std::to_string(components[0]);
for (size_t i = 1; i < components.size(); ++i)
version += "." + std::to_string(components[i]);
return version;
}
bool operator<(const ClickHouseVersion & other) const
{
return components < other.components;
}
bool operator>=(const ClickHouseVersion & other) const
{
return components >= other.components;
}
private:
std::vector<size_t> components;
};
namespace SettingsChangesHistory
{
struct SettingChange
{
String name;
Field previous_value;
Field new_value;
String reason;
};
using SettingsChanges = std::vector<SettingChange>;
}
/// History of settings changes that control some backward incompatible changes
/// across all ClickHouse versions. It maps a ClickHouse version to the settings changes that were made
/// in that version. A settings change list is a vector of structs {setting_name, previous_value, new_value, reason}.
/// It's used to implement the `compatibility` setting (see https://github.com/ClickHouse/ClickHouse/issues/35972).
static std::map<ClickHouseVersion, SettingsChangesHistory::SettingsChanges> settings_changes_history =
{
{"22.7", {{"cross_to_inner_join_rewrite", 1, 2, "Force rewrite comma join to inner"},
{"enable_positional_arguments", false, true, "Enable positional arguments feature by default"}}},
{"22.6", {{"output_format_json_named_tuples_as_objects", false, true, "Allow to serialize named tuples as JSON objects in JSON formats by default"}}},
{"22.5", {{"memory_overcommit_ratio_denominator", 0, 1073741824, "Enable memory overcommit feature by default"},
{"memory_overcommit_ratio_denominator_for_user", 0, 1073741824, "Enable memory overcommit feature by default"}}},
{"22.4", {{"allow_settings_after_format_in_insert", true, false, "Do not allow SETTINGS after FORMAT for INSERT queries because ClickHouse interpret SETTINGS as some values, which is misleading"}}},
{"22.3", {{"cast_ipv4_ipv6_default_on_conversion_error", true, false, "Make functions cast(value, 'IPv4') and cast(value, 'IPv6') behave same as toIPv4 and toIPv6 functions"}}},
{"21.12", {{"stream_like_engine_allow_direct_select", true, false, "Do not allow direct select for Kafka/RabbitMQ/FileLog by default"}}},
{"21.9", {{"output_format_decimal_trailing_zeros", true, false, "Do not output trailing zeros in text representation of Decimal types by default for better looking output"},
{"use_hedged_requests", false, true, "Enable Hedged Requests feature bu default"}}},
{"21.7", {{"legacy_column_name_of_tuple_literal", true, false, "Add this setting only for compatibility reasons. It makes sense to set to 'true', while doing rolling update of cluster from version lower than 21.7 to higher"}}},
{"21.5", {{"async_socket_for_remote", false, true, "Fix all problems and turn on asynchronous reads from socket for remote queries by default again"}}},
{"21.3", {{"async_socket_for_remote", true, false, "Turn off asynchronous reads from socket for remote queries because of some problems"},
{"optimize_normalize_count_variants", false, true, "Rewrite aggregate functions that semantically equals to count() as count() by default"},
{"normalize_function_names", false, true, "Normalize function names to their canonical names, this was needed for projection query routing"}}},
{"21.2", {{"enable_global_with_statement", false, true, "Propagate WITH statements to UNION queries and all subqueries by default"}}},
{"21.1", {{"insert_quorum_parallel", false, true, "Use parallel quorum inserts by default. It is significantly more convenient to use than sequential quorum inserts"},
{"input_format_null_as_default", false, true, "Allow to insert NULL as default for input formats by default"},
{"optimize_on_insert", false, true, "Enable data optimization on INSERT by default for better user experience"},
{"use_compact_format_in_distributed_parts_names", false, true, "Use compact format for async INSERT into Distributed tables by default"}}},
{"20.10", {{"format_regexp_escaping_rule", "Escaped", "Raw", "Use Raw as default escaping rule for Regexp format to male the behaviour more like to what users expect"}}},
{"20.7", {{"show_table_uuid_in_table_create_query_if_not_nil", true, false, "Stop showing UID of the table in its CREATE query for Engine=Atomic"}}},
{"20.5", {{"input_format_with_names_use_header", false, true, "Enable using header with names for formats with WithNames/WithNamesAndTypes suffixes"},
{"allow_suspicious_codecs", true, false, "Don't allow to specify meaningless compression codecs"}}},
{"20.4", {{"validate_polygons", false, true, "Throw exception if polygon is invalid in function pointInPolygon by default instead of returning possibly wrong results"}}},
{"19.18", {{"enable_scalar_subquery_optimization", false, true, "Prevent scalar subqueries from (de)serializing large scalar values and possibly avoid running the same subquery more than once"}}},
{"19.14", {{"any_join_distinct_right_table_keys", true, false, "Disable ANY RIGHT and ANY FULL JOINs by default to avoid inconsistency"}}},
{"19.12", {{"input_format_defaults_for_omitted_fields", false, true, "Enable calculation of complex default expressions for omitted fields for some input formats, because it should be the expected behaviour"}}},
{"19.5", {{"max_partitions_per_insert_block", 0, 100, "Add a limit for the number of partitions in one block"}}},
{"18.12.17", {{"enable_optimize_predicate_expression", 0, 1, "Optimize predicates to subqueries by default"}}},
};
}
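Two hedged illustrations, not part of the diff: how a future entry would be added to the map above (the setting name `some_new_setting` is invented), and how ClickHouseVersion ordering behaves:
/// Hypothetical entry for a future release, placed at the top of the map; the struct layout is
/// {setting_name, previous_value, new_value, reason}:
///     {"22.8", {{"some_new_setting", false, true, "Enable some_new_setting by default"}}},

inline void versionOrderingSketch()
{
    DB::ClickHouseVersion a("22.7");
    DB::ClickHouseVersion b("22.6.1");
    bool newer = a >= b;   /// true: version components are compared lexicographically
    (void)newer;
}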

View File

@ -4,6 +4,8 @@
#include <Common/getNumberOfPhysicalCPUCores.h>
#include <Common/FieldVisitorConvertToNumber.h>
#include <Common/logger_useful.h>
#include <DataTypes/DataTypeMap.h>
#include <DataTypes/DataTypeString.h>
#include <IO/ReadHelpers.h>
#include <IO/ReadBufferFromString.h>
#include <IO/WriteHelpers.h>
@ -51,6 +53,37 @@ namespace
else
return applyVisitor(FieldVisitorConvertToNumber<T>(), f);
}
#ifndef KEEPER_STANDALONE_BUILD
Map stringToMap(const String & str)
{
/// Allow empty string as an empty map
if (str.empty())
return {};
auto type_string = std::make_shared<DataTypeString>();
DataTypeMap type_map(type_string, type_string);
auto serialization = type_map.getSerialization(ISerialization::Kind::DEFAULT);
auto column = type_map.createColumn();
ReadBufferFromString buf(str);
serialization->deserializeTextEscaped(*column, buf, {});
return (*column)[0].safeGet<Map>();
}
Map fieldToMap(const Field & f)
{
if (f.getType() == Field::Types::String)
{
/// Allow parsing a Map from a string field, for convenience.
const auto & str = f.get<const String &>();
return stringToMap(str);
}
return f.safeGet<const Map &>();
}
#endif
}
template <typename T>
@ -291,6 +324,48 @@ void SettingFieldString::readBinary(ReadBuffer & in)
*this = std::move(str);
}
#ifndef KEEPER_STANDALONE_BUILD
SettingFieldMap::SettingFieldMap(const Field & f) : value(fieldToMap(f)) {}
String SettingFieldMap::toString() const
{
auto type_string = std::make_shared<DataTypeString>();
DataTypeMap type_map(type_string, type_string);
auto serialization = type_map.getSerialization(ISerialization::Kind::DEFAULT);
auto column = type_map.createColumn();
column->insert(value);
WriteBufferFromOwnString out;
serialization->serializeTextEscaped(*column, 0, out, {});
return out.str();
}
SettingFieldMap & SettingFieldMap::operator =(const Field & f)
{
*this = fieldToMap(f);
return *this;
}
void SettingFieldMap::parseFromString(const String & str)
{
*this = stringToMap(str);
}
void SettingFieldMap::writeBinary(WriteBuffer & out) const
{
DB::writeBinary(value, out);
}
void SettingFieldMap::readBinary(ReadBuffer & in)
{
Map map;
DB::readBinary(map, in);
*this = map;
}
#endif
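A small sketch (assuming a non-Keeper build, since the code above is guarded by KEEPER_STANDALONE_BUILD) of the escaped Map text format accepted by the new SettingFieldMap; the keys and expressions are invented:
void settingFieldMapSketch()
{
    DB::SettingFieldMap filters;
    filters.parseFromString("{'table_1':'x != 2','db.table_2':'id > 10'}");
    const DB::Map & value = filters;   /// Map is a vector of (key, value) Tuple fields
    /// filters.toString() serializes back using the same escaped Map text representation.
    (void)value;
}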
namespace
{

View File

@ -168,6 +168,32 @@ struct SettingFieldString
void readBinary(ReadBuffer & in);
};
#ifndef KEEPER_STANDALONE_BUILD
struct SettingFieldMap
{
public:
Map value;
bool changed = false;
explicit SettingFieldMap(const Map & map = {}) : value(map) {}
explicit SettingFieldMap(Map && map) : value(std::move(map)) {}
explicit SettingFieldMap(const Field & f);
SettingFieldMap & operator =(const Map & map) { value = map; changed = true; return *this; }
SettingFieldMap & operator =(const Field & f);
operator const Map &() const { return value; } /// NOLINT
explicit operator Field() const { return value; }
String toString() const;
void parseFromString(const String & str);
void writeBinary(WriteBuffer & out) const;
void readBinary(ReadBuffer & in);
};
#endif
struct SettingFieldChar
{

View File

@ -85,6 +85,13 @@ DataTypePtr makeNullable(const DataTypePtr & type)
return std::make_shared<DataTypeNullable>(type);
}
DataTypePtr makeNullableSafe(const DataTypePtr & type)
{
if (type->canBeInsideNullable())
return makeNullable(type);
return type;
}
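A brief sketch of how makeNullableSafe() differs from makeNullable(), using UInt64 and Array(UInt64) as example types (it assumes DataTypesNumber.h and DataTypeArray.h are available):
void makeNullableSafeSketch()
{
    auto number = std::make_shared<DB::DataTypeUInt64>();
    auto array = std::make_shared<DB::DataTypeArray>(number);

    auto a = DB::makeNullableSafe(number);   /// Nullable(UInt64): numbers can be wrapped in Nullable
    auto b = DB::makeNullableSafe(array);    /// Array(UInt64) unchanged: arrays cannot be inside Nullable
    (void)a; (void)b;
}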
DataTypePtr removeNullable(const DataTypePtr & type)
{
if (type->isNullable())

View File

@ -51,6 +51,7 @@ private:
DataTypePtr makeNullable(const DataTypePtr & type);
DataTypePtr makeNullableSafe(const DataTypePtr & type);
DataTypePtr removeNullable(const DataTypePtr & type);
}

View File

@ -5,6 +5,7 @@
#include <Common/typeid_cast.h>
#include <DataTypes/IDataType.h>
#include <DataTypes/DataTypeDecimalBase.h>
#include <DataTypes/DataTypeDateTime64.h>
namespace DB
@ -13,6 +14,7 @@ namespace DB
namespace ErrorCodes
{
extern const int DECIMAL_OVERFLOW;
extern const int LOGICAL_ERROR;
}
/// Implements Decimal(P, S), where P is precision, S is scale.
@ -58,7 +60,7 @@ inline const DataTypeDecimal<T> * checkDecimal(const IDataType & data_type)
return typeid_cast<const DataTypeDecimal<T> *>(&data_type);
}
inline UInt32 getDecimalScale(const IDataType & data_type, UInt32 default_value = std::numeric_limits<UInt32>::max())
inline UInt32 getDecimalScale(const IDataType & data_type)
{
if (const auto * decimal_type = checkDecimal<Decimal32>(data_type))
return decimal_type->getScale();
@ -68,7 +70,10 @@ inline UInt32 getDecimalScale(const IDataType & data_type, UInt32 default_value
return decimal_type->getScale();
if (const auto * decimal_type = checkDecimal<Decimal256>(data_type))
return decimal_type->getScale();
return default_value;
if (const auto * date_time_type = typeid_cast<const DataTypeDateTime64 *>(&data_type))
return date_time_type->getScale();
throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot get decimal scale from type {}", data_type.getName());
}
inline UInt32 getDecimalPrecision(const IDataType & data_type)
@ -81,7 +86,10 @@ inline UInt32 getDecimalPrecision(const IDataType & data_type)
return decimal_type->getPrecision();
if (const auto * decimal_type = checkDecimal<Decimal256>(data_type))
return decimal_type->getPrecision();
return 0;
if (const auto * date_time_type = typeid_cast<const DataTypeDateTime64 *>(&data_type))
return date_time_type->getPrecision();
throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot get decimal precision from type {}", data_type.getName());
}
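Since getDecimalScale() lost its default_value parameter and now throws for unsupported types, callers that may see arbitrary types have to check first. A hedged sketch of that caller pattern, mirroring the getLeastSupertype() change later in this commit:
UInt32 scaleOrZeroSketch(const DB::IDataType & type)
{
    DB::WhichDataType which(type);
    /// Decimal32/64/128/256 and DateTime64 are the only types getDecimalScale() accepts now.
    if (which.isDecimal() || which.isDateTime64())
        return DB::getDecimalScale(type);
    return 0;   /// previously this came from the removed default_value parameter
}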
template <typename T>

View File

@ -532,6 +532,12 @@ inline bool isBool(const DataTypePtr & data_type)
return data_type->getName() == "Bool";
}
inline bool isAggregateFunction(const DataTypePtr & data_type)
{
WhichDataType which(data_type);
return which.isAggregateFunction();
}
template <typename DataType> constexpr bool IsDataTypeDecimal = false;
template <typename DataType> constexpr bool IsDataTypeNumber = false;
template <typename DataType> constexpr bool IsDataTypeDateOrDateTime = false;

View File

@ -554,7 +554,11 @@ DataTypePtr getLeastSupertype(const DataTypes & types)
UInt32 max_scale = 0;
for (const auto & type : types)
{
UInt32 scale = getDecimalScale(*type, 0);
auto type_id = type->getTypeId();
if (type_id != TypeIndex::Decimal32 && type_id != TypeIndex::Decimal64 && type_id != TypeIndex::Decimal128)
continue;
UInt32 scale = getDecimalScale(*type);
if (scale > max_scale)
max_scale = scale;
}

View File

@ -171,15 +171,6 @@ ColumnUInt8::Ptr DirectDictionary<dictionary_key_type>::hasKeys(
auto requested_keys = requested_keys_extractor.extractAllKeys();
size_t requested_keys_size = requested_keys.size();
HashMap<KeyType, size_t> requested_key_to_index;
requested_key_to_index.reserve(requested_keys_size);
for (size_t i = 0; i < requested_keys.size(); ++i)
{
auto requested_key = requested_keys[i];
requested_key_to_index[requested_key] = i;
}
auto result = ColumnUInt8::create(requested_keys_size, false);
auto & result_data = result->getData();
@ -205,15 +196,17 @@ ColumnUInt8::Ptr DirectDictionary<dictionary_key_type>::hasKeys(
{
auto block_key = block_keys_extractor.extractCurrentKey();
const auto * it = requested_key_to_index.find(block_key);
assert(it);
size_t index;
for (index = 0; index < requested_keys_size; ++index)
{
if (!result_data[index] && requested_keys[index] == block_key)
{
keys_found++;
result_data[index] = true;
size_t result_data_found_index = it->getMapped();
/// block_keys_size cannot be used, due to duplicates.
keys_found += !result_data[result_data_found_index];
result_data[result_data_found_index] = true;
block_keys_extractor.rollbackCurrentKey();
block_keys_extractor.rollbackCurrentKey();
}
}
}
block_key_columns.clear();

View File

@ -8,6 +8,8 @@
#include <IO/ReadBufferFromString.h>
#include <IO/WriteBufferFromEncryptedFile.h>
#include <boost/algorithm/hex.hpp>
#include <Common/quoteString.h>
#include <Common/typeid_cast.h>
namespace DB

View File

@ -1,7 +1,6 @@
#pragma once
#include <Interpreters/Context_fwd.h>
#include <Interpreters/Context.h>
#include <Core/Defines.h>
#include <base/types.h>
#include <Common/CurrentMetrics.h>
@ -41,6 +40,10 @@ namespace ErrorCodes
extern const int NOT_IMPLEMENTED;
}
class IDisk;
using DiskPtr = std::shared_ptr<IDisk>;
using DisksMap = std::map<String, DiskPtr>;
class IReservation;
using ReservationPtr = std::unique_ptr<IReservation>;
using Reservations = std::vector<ReservationPtr>;
@ -363,7 +366,6 @@ private:
std::unique_ptr<Executor> executor;
};
using DiskPtr = std::shared_ptr<IDisk>;
using Disks = std::vector<DiskPtr>;
/**

View File

@ -6,6 +6,7 @@
#include <Common/assert_cast.h>
#include <Common/hex.h>
#include <Common/getRandomASCIIString.h>
#include <Interpreters/Context.h>
namespace ProfileEvents

View File

@ -10,6 +10,7 @@
#include <Disks/IO/WriteIndirectBufferFromRemoteFS.h>
#include <Disks/ObjectStorages/IObjectStorage.h>
#include <Common/getRandomASCIIString.h>
#include <Common/MultiVersion.h>
namespace DB

View File

@ -12,6 +12,7 @@
#include <Disks/ObjectStorages/AzureBlobStorage/AzureBlobStorageAuth.h>
#include <Disks/ObjectStorages/AzureBlobStorage/AzureObjectStorage.h>
#include <Disks/ObjectStorages/MetadataStorageFromDisk.h>
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -18,6 +18,7 @@
#include <Disks/ObjectStorages/DiskObjectStorageTransaction.h>
#include <Disks/FakeDiskTransaction.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -3,6 +3,7 @@
#include <Common/FileCacheFactory.h>
#include <Common/IFileCache.h>
#include <Common/FileCacheSettings.h>
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,5 +1,6 @@
#pragma once
#include <Disks/IDisk.h>
#include <Disks/ObjectStorages/IMetadataStorage.h>
#include <Disks/ObjectStorages/MetadataFromDiskTransactionState.h>
#include <Disks/ObjectStorages/MetadataStorageFromDiskTransactionOperations.h>

View File

@ -2,6 +2,7 @@
#include <Disks/IO/ThreadPoolRemoteFSReader.h>
#include <IO/WriteBufferFromFileBase.h>
#include <IO/copyData.h>
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -7,6 +7,7 @@
#include <optional>
#include <Poco/Timestamp.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Core/Defines.h>
#include <Common/Exception.h>
#include <IO/ReadSettings.h>

View File

@ -28,6 +28,7 @@
#include <Common/FileCacheFactory.h>
#include <Common/getRandomASCIIString.h>
#include <Common/logger_useful.h>
#include <Common/MultiVersion.h>
namespace DB
{

View File

@ -11,6 +11,7 @@
#include <aws/s3/model/HeadObjectResult.h>
#include <aws/s3/model/ListObjectsV2Result.h>
#include <Storages/StorageS3Settings.h>
#include <Common/MultiVersion.h>
namespace DB

View File

@ -10,6 +10,7 @@
#include <IO/Progress.h>
#include <Common/filesystemHelpers.h>
#include <sys/stat.h>
#include <Interpreters/Context.h>
#ifdef HAS_RESERVED_IDENTIFIER

View File

@ -1,7 +1,7 @@
#pragma once
#include <IO/ReadBufferFromFileBase.h>
#include <Interpreters/Context.h>
#include <Interpreters/Context_fwd.h>
#include <unistd.h>

View File

@ -23,20 +23,6 @@ ActionLocksManager::ActionLocksManager(ContextPtr context_) : WithContext(contex
{
}
template <typename F>
inline void forEachTable(F && f, ContextPtr context)
{
for (auto & elem : DatabaseCatalog::instance().getDatabases())
for (auto iterator = elem.second->getTablesIterator(context); iterator->isValid(); iterator->next())
if (auto table = iterator->table())
f(table);
}
void ActionLocksManager::add(StorageActionBlockType action_type, ContextPtr context_)
{
forEachTable([&](const StoragePtr & table) { add(table, action_type); }, context_);
}
void ActionLocksManager::add(const StorageID & table_id, StorageActionBlockType action_type)
{
if (auto table = DatabaseCatalog::instance().tryGetTable(table_id, getContext()))
@ -54,14 +40,6 @@ void ActionLocksManager::add(const StoragePtr & table, StorageActionBlockType ac
}
}
void ActionLocksManager::remove(StorageActionBlockType action_type)
{
std::lock_guard lock(mutex);
for (auto & storage_elem : storage_locks)
storage_elem.second.erase(action_type);
}
void ActionLocksManager::remove(const StorageID & table_id, StorageActionBlockType action_type)
{
if (auto table = DatabaseCatalog::instance().tryGetTable(table_id, getContext()))

View File

@ -20,14 +20,10 @@ class ActionLocksManager : WithContext
public:
explicit ActionLocksManager(ContextPtr context);
/// Adds new locks for each table
void add(StorageActionBlockType action_type, ContextPtr context);
/// Add new lock for a table if it has not been already added
void add(const StorageID & table_id, StorageActionBlockType action_type);
void add(const StoragePtr & table, StorageActionBlockType action_type);
/// Remove locks for all tables
void remove(StorageActionBlockType action_type);
/// Removes a lock for a table if it exists
void remove(const StorageID & table_id, StorageActionBlockType action_type);
void remove(const StoragePtr & table, StorageActionBlockType action_type);

View File

@ -989,9 +989,15 @@ void AsynchronousMetrics::update(std::chrono::system_clock::time_point update_ti
if (s.rfind("processor", 0) == 0)
{
/// s390x example: processor 0: version = FF, identification = 039C88, machine = 3906
/// non s390x example: processor : 0
if (auto colon = s.find_first_of(':'))
{
#ifdef __s390x__
core_id = std::stoi(s.substr(10)); /// 10: length of "processor" plus 1
#else
core_id = std::stoi(s.substr(colon + 2));
#endif
}
}
else if (s.rfind("cpu MHz", 0) == 0)

View File

@ -45,6 +45,9 @@
#include <Common/typeid_cast.h>
#include <Common/StringUtils/StringUtils.h>
#include <Columns/ColumnNullable.h>
#include <Core/ColumnsWithTypeAndName.h>
#include <DataTypes/IDataType.h>
#include <Core/SettingsEnums.h>
#include <Core/ColumnNumbers.h>
#include <Core/Names.h>
@ -345,6 +348,7 @@ void ExpressionAnalyzer::analyzeAggregation(ActionsDAGPtr & temp_actions)
group_by_kind = GroupByKind::GROUPING_SETS;
else
group_by_kind = GroupByKind::ORDINARY;
bool use_nulls = group_by_kind != GroupByKind::ORDINARY && getContext()->getSettingsRef().group_by_use_nulls;
/// For GROUPING SETS with multiple groups we always add a virtual __grouping_set column
/// with the set number, which is used as an additional key at the stage of merging aggregated data.
@ -399,7 +403,7 @@ void ExpressionAnalyzer::analyzeAggregation(ActionsDAGPtr & temp_actions)
}
}
NameAndTypePair key{column_name, node->result_type};
NameAndTypePair key{column_name, use_nulls ? makeNullableSafe(node->result_type) : node->result_type };
grouping_set_list.push_back(key);
@ -453,7 +457,7 @@ void ExpressionAnalyzer::analyzeAggregation(ActionsDAGPtr & temp_actions)
}
}
NameAndTypePair key{column_name, node->result_type};
NameAndTypePair key = NameAndTypePair{ column_name, use_nulls ? makeNullableSafe(node->result_type) : node->result_type };
/// Aggregation keys are uniqued.
if (!unique_keys.contains(key.name))
@ -1489,6 +1493,28 @@ void SelectQueryExpressionAnalyzer::appendExpressionsAfterWindowFunctions(Expres
}
}
void SelectQueryExpressionAnalyzer::appendGroupByModifiers(ActionsDAGPtr & before_aggregation, ExpressionActionsChain & chain, bool /* only_types */)
{
const auto * select_query = getAggregatingQuery();
if (!select_query->groupBy() || !(select_query->group_by_with_rollup || select_query->group_by_with_cube))
return;
auto source_columns = before_aggregation->getResultColumns();
ColumnsWithTypeAndName result_columns;
for (const auto & source_column : source_columns)
{
if (source_column.type->canBeInsideNullable())
result_columns.emplace_back(makeNullableSafe(source_column.type), source_column.name);
else
result_columns.push_back(source_column);
}
ExpressionActionsChain::Step & step = chain.lastStep(before_aggregation->getNamesAndTypesList());
step.actions() = ActionsDAG::makeConvertingActions(source_columns, result_columns, ActionsDAG::MatchColumnsMode::Position);
}
void SelectQueryExpressionAnalyzer::appendSelectSkipWindowExpressions(ExpressionActionsChain::Step & step, ASTPtr const & node)
{
if (auto * function = node->as<ASTFunction>())
@ -1816,6 +1842,7 @@ ExpressionAnalysisResult::ExpressionAnalysisResult(
bool second_stage_,
bool only_types,
const FilterDAGInfoPtr & filter_info_,
const FilterDAGInfoPtr & additional_filter,
const Block & source_header)
: first_stage(first_stage_)
, second_stage(second_stage_)
@ -1882,6 +1909,13 @@ ExpressionAnalysisResult::ExpressionAnalysisResult(
columns_for_final.begin(), columns_for_final.end());
}
if (storage && additional_filter)
{
Names columns_for_additional_filter = additional_filter->actions->getRequiredColumnsNames();
additional_required_columns_after_prewhere.insert(additional_required_columns_after_prewhere.end(),
columns_for_additional_filter.begin(), columns_for_additional_filter.end());
}
if (storage && filter_info_)
{
filter_info = filter_info_;
@ -1956,6 +1990,9 @@ ExpressionAnalysisResult::ExpressionAnalysisResult(
query_analyzer.appendAggregateFunctionsArguments(chain, only_types || !first_stage);
before_aggregation = chain.getLastActions();
if (settings.group_by_use_nulls)
query_analyzer.appendGroupByModifiers(before_aggregation, chain, only_types);
finalize_chain(chain);
if (query_analyzer.appendHaving(chain, only_types || !second_stage))

View File

@ -281,6 +281,7 @@ struct ExpressionAnalysisResult
bool second_stage,
bool only_types,
const FilterDAGInfoPtr & filter_info,
const FilterDAGInfoPtr & additional_filter, /// for the additional_table_filters setting
const Block & source_header);
/// Filter for row-level security.
@ -412,6 +413,8 @@ private:
void appendExpressionsAfterWindowFunctions(ExpressionActionsChain & chain, bool only_types);
void appendSelectSkipWindowExpressions(ExpressionActionsChain::Step & step, ASTPtr const & node);
void appendGroupByModifiers(ActionsDAGPtr & before_aggregation, ExpressionActionsChain & chain, bool only_types);
/// After aggregation:
bool appendHaving(ExpressionActionsChain & chain, bool only_types);
/// appendSelect

View File

@ -4,6 +4,13 @@
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h>
#include <Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.h>
#include <QueryPipeline/QueryPipelineBuilder.h>
#include <Parsers/ExpressionListParsers.h>
#include <Parsers/parseQuery.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/ActionsDAG.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Processors/QueryPlan/IQueryPlanStep.h>
#include <Processors/QueryPlan/FilterStep.h>
namespace DB
{
@ -81,6 +88,53 @@ void IInterpreterUnionOrSelectQuery::setQuota(QueryPipeline & pipeline) const
pipeline.setQuota(quota);
}
static ASTPtr parseAdditionalPostFilter(const Context & context)
{
const auto & settings = context.getSettingsRef();
const String & filter = settings.additional_result_filter;
if (filter.empty())
return nullptr;
ParserExpression parser;
return parseQuery(
parser, filter.data(), filter.data() + filter.size(),
"additional filter", settings.max_query_size, settings.max_parser_depth);
}
static ActionsDAGPtr makeAdditionalPostFilter(ASTPtr & ast, ContextPtr context, const Block & header)
{
auto syntax_result = TreeRewriter(context).analyze(ast, header.getNamesAndTypesList());
String result_column_name = ast->getColumnName();
auto dag = ExpressionAnalyzer(ast, syntax_result, context).getActionsDAG(false, false);
const ActionsDAG::Node * result_node = &dag->findInIndex(result_column_name);
auto & index = dag->getIndex();
index.clear();
index.reserve(dag->getInputs().size() + 1);
for (const auto * node : dag->getInputs())
index.push_back(node);
index.push_back(result_node);
return dag;
}
void IInterpreterUnionOrSelectQuery::addAdditionalPostFilter(QueryPlan & plan) const
{
if (options.subquery_depth != 0)
return;
auto ast = parseAdditionalPostFilter(*context);
if (!ast)
return;
auto dag = makeAdditionalPostFilter(ast, context, plan.getCurrentDataStream().header);
std::string filter_name = dag->getIndex().back()->result_name;
auto filter_step = std::make_unique<FilterStep>(
plan.getCurrentDataStream(), std::move(dag), std::move(filter_name), true);
filter_step->setStepDescription("Additional result filter");
plan.addStep(std::move(filter_step));
}
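A hedged sketch of how the new hook is triggered; the filter expression here is invented, and any boolean expression over the result columns should work:
void additionalResultFilterSketch(DB::ContextMutablePtr query_context)
{
    query_context->setSetting("additional_result_filter", DB::String("x != 2"));
    /// During buildQueryPlan() of the outermost query (subquery_depth == 0),
    /// addAdditionalPostFilter() parses the expression and appends a FilterStep
    /// described as "Additional result filter" to the plan.
}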
void IInterpreterUnionOrSelectQuery::addStorageLimits(const StorageLimitsList & limits)
{
for (const auto & val : limits)

View File

@ -72,6 +72,8 @@ protected:
/// Set quotas to query pipeline.
void setQuota(QueryPipeline & pipeline) const;
/// Add a filter from the additional_result_filter setting.
void addAdditionalPostFilter(QueryPlan & plan) const;
static StorageLimits getStorageLimits(const Context & context, const SelectQueryOptions & options);
};

View File

@ -146,14 +146,14 @@ namespace
struct QueryASTSettings
{
bool graph = false;
bool rewrite = false;
bool optimize = false;
constexpr static char name[] = "AST";
std::unordered_map<std::string, std::reference_wrapper<bool>> boolean_settings =
{
{"graph", graph},
{"rewrite", rewrite}
{"optimize", optimize}
};
};
@ -280,7 +280,7 @@ QueryPipeline InterpreterExplainQuery::executeImpl()
case ASTExplainQuery::ParsedAST:
{
auto settings = checkAndGetSettings<QueryASTSettings>(ast.getSettings());
if (settings.rewrite)
if (settings.optimize)
{
ExplainAnalyzedSyntaxVisitor::Data data(getContext());
ExplainAnalyzedSyntaxVisitor(data).visit(query);

View File

@ -138,6 +138,7 @@ void InterpreterSelectIntersectExceptQuery::buildQueryPlan(QueryPlan & query_pla
auto step = std::make_unique<IntersectOrExceptStep>(std::move(data_streams), final_operator, max_threads);
query_plan.unitePlans(std::move(step), std::move(plans));
addAdditionalPostFilter(query_plan);
query_plan.addInterpreterContext(context);
}

View File

@ -109,8 +109,17 @@ namespace ErrorCodes
}
/// Assumes `storage` is set and the table filter (row-level security) is not empty.
String InterpreterSelectQuery::generateFilterActions(ActionsDAGPtr & actions, const Names & prerequisite_columns) const
FilterDAGInfoPtr generateFilterActions(
const StorageID & table_id,
const ASTPtr & row_policy_filter,
const ContextPtr & context,
const StoragePtr & storage,
const StorageSnapshotPtr & storage_snapshot,
const StorageMetadataPtr & metadata_snapshot,
Names & prerequisite_columns)
{
auto filter_info = std::make_shared<FilterDAGInfo>();
const auto & db_name = table_id.getDatabaseName();
const auto & table_name = table_id.getTableName();
@ -146,16 +155,24 @@ String InterpreterSelectQuery::generateFilterActions(ActionsDAGPtr & actions, co
/// Using separate expression analyzer to prevent any possible alias injection
auto syntax_result = TreeRewriter(context).analyzeSelect(query_ast, TreeRewriterResult({}, storage, storage_snapshot));
SelectQueryExpressionAnalyzer analyzer(query_ast, syntax_result, context, metadata_snapshot);
actions = analyzer.simpleSelectActions();
filter_info->actions = analyzer.simpleSelectActions();
auto column_name = expr_list->children.at(0)->getColumnName();
actions->removeUnusedActions(NameSet{column_name});
actions->projectInput(false);
filter_info->column_name = expr_list->children.at(0)->getColumnName();
filter_info->actions->removeUnusedActions(NameSet{filter_info->column_name});
filter_info->actions->projectInput(false);
for (const auto * node : actions->getInputs())
actions->getIndex().push_back(node);
for (const auto * node : filter_info->actions->getInputs())
filter_info->actions->getIndex().push_back(node);
return column_name;
auto required_columns_from_filter = filter_info->actions->getRequiredColumns();
for (const auto & column : required_columns_from_filter)
{
if (prerequisite_columns.end() == std::find(prerequisite_columns.begin(), prerequisite_columns.end(), column.name))
prerequisite_columns.push_back(column.name);
}
return filter_info;
}
InterpreterSelectQuery::InterpreterSelectQuery(
@ -269,6 +286,32 @@ static void checkAccessRightsForSelect(
context->checkAccess(AccessType::SELECT, table_id, syntax_analyzer_result.requiredSourceColumnsForAccessCheck());
}
static ASTPtr parseAdditionalFilterConditionForTable(
const Map & setting,
const DatabaseAndTableWithAlias & target,
const Context & context)
{
for (size_t i = 0; i < setting.size(); ++i)
{
const auto & tuple = setting[i].safeGet<const Tuple &>();
auto & table = tuple.at(0).safeGet<String>();
auto & filter = tuple.at(1).safeGet<String>();
if ((table == target.table && context.getCurrentDatabase() == target.database) ||
(table == target.database + '.' + target.table))
{
/// Try to parse expression
ParserExpression parser;
const auto & settings = context.getSettingsRef();
return parseQuery(
parser, filter.data(), filter.data() + filter.size(),
"additional filter", settings.max_query_size, settings.max_parser_depth);
}
}
return nullptr;
}
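A hedged illustration of how the map setting is matched against the query's target table; the table names and filter expressions are invented:
void additionalTableFiltersSketch(const DB::Context & context)
{
    DB::Map filters;
    filters.push_back(DB::Tuple{DB::String("table_1"), DB::String("x != 2")});      /// bare name: also requires currentDatabase() to match
    filters.push_back(DB::Tuple{DB::String("db2.table_2"), DB::String("id > 10")}); /// qualified name: matched as database.table

    DB::DatabaseAndTableWithAlias target;
    target.database = "db2";
    target.table = "table_2";

    /// Returns the parsed AST of "id > 10" for db2.table_2, or nullptr when no entry matches.
    auto ast = parseAdditionalFilterConditionForTable(filters, target, context);
    (void)ast;
}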
/// Returns true if we should ignore quotas and limits for a specified table in the system database.
static bool shouldIgnoreQuotaAndLimits(const StorageID & table_id)
{
@ -448,6 +491,10 @@ InterpreterSelectQuery::InterpreterSelectQuery(
if (storage)
view = dynamic_cast<StorageView *>(storage.get());
if (!settings.additional_table_filters.value.empty() && storage && !joined_tables.tablesWithColumns().empty())
query_info.additional_filter_ast = parseAdditionalFilterConditionForTable(
settings.additional_table_filters, joined_tables.tablesWithColumns().front().table, *context);
auto analyze = [&] (bool try_move_to_prewhere)
{
/// Allow push down and other optimizations for VIEW: replace with subquery and rewrite it.
@ -566,16 +613,16 @@ InterpreterSelectQuery::InterpreterSelectQuery(
/// Fix source_header for filter actions.
if (row_policy_filter)
{
filter_info = std::make_shared<FilterDAGInfo>();
filter_info->column_name = generateFilterActions(filter_info->actions, required_columns);
filter_info = generateFilterActions(
table_id, row_policy_filter, context, storage, storage_snapshot, metadata_snapshot, required_columns);
}
auto required_columns_from_filter = filter_info->actions->getRequiredColumns();
if (query_info.additional_filter_ast)
{
additional_filter_info = generateFilterActions(
table_id, query_info.additional_filter_ast, context, storage, storage_snapshot, metadata_snapshot, required_columns);
for (const auto & column : required_columns_from_filter)
{
if (required_columns.end() == std::find(required_columns.begin(), required_columns.end(), column.name))
required_columns.push_back(column.name);
}
additional_filter_info->do_remove_column = true;
}
source_header = storage_snapshot->getSampleBlockForColumns(required_columns);
@ -735,7 +782,7 @@ Block InterpreterSelectQuery::getSampleBlockImpl()
&& options.to_stage > QueryProcessingStage::WithMergeableState;
analysis_result = ExpressionAnalysisResult(
*query_analyzer, metadata_snapshot, first_stage, second_stage, options.only_analyze, filter_info, source_header);
*query_analyzer, metadata_snapshot, first_stage, second_stage, options.only_analyze, filter_info, additional_filter_info, source_header);
if (options.to_stage == QueryProcessingStage::Enum::FetchColumns)
{
@ -786,8 +833,16 @@ Block InterpreterSelectQuery::getSampleBlockImpl()
if (analysis_result.use_grouping_set_key)
res.insert({ nullptr, std::make_shared<DataTypeUInt64>(), "__grouping_set" });
for (const auto & key : query_analyzer->aggregationKeys())
res.insert({nullptr, header.getByName(key.name).type, key.name});
if (context->getSettingsRef().group_by_use_nulls && analysis_result.use_grouping_set_key)
{
for (const auto & key : query_analyzer->aggregationKeys())
res.insert({nullptr, makeNullableSafe(header.getByName(key.name).type), key.name});
}
else
{
for (const auto & key : query_analyzer->aggregationKeys())
res.insert({nullptr, header.getByName(key.name).type, key.name});
}
for (const auto & aggregate : query_analyzer->aggregates())
{
@ -1295,6 +1350,18 @@ void InterpreterSelectQuery::executeImpl(QueryPlan & query_plan, std::optional<P
query_plan.addStep(std::move(row_level_security_step));
}
if (additional_filter_info)
{
auto additional_filter_step = std::make_unique<FilterStep>(
query_plan.getCurrentDataStream(),
additional_filter_info->actions,
additional_filter_info->column_name,
additional_filter_info->do_remove_column);
additional_filter_step->setStepDescription("Additional filter");
query_plan.addStep(std::move(additional_filter_step));
}
if (expressions.before_array_join)
{
QueryPlanStepPtr before_array_join_step
@ -1937,6 +2004,7 @@ void InterpreterSelectQuery::executeFetchColumns(QueryProcessingStage::Enum proc
&& storage
&& storage->getName() != "MaterializedMySQL"
&& !row_policy_filter
&& !query_info.additional_filter_ast
&& processing_stage == QueryProcessingStage::FetchColumns
&& query_analyzer->hasAggregation()
&& (query_analyzer->aggregates().size() == 1)
@ -2036,6 +2104,7 @@ void InterpreterSelectQuery::executeFetchColumns(QueryProcessingStage::Enum proc
&& !query.limit_with_ties
&& !query.prewhere()
&& !query.where()
&& !query_info.additional_filter_ast
&& !query.groupBy()
&& !query.having()
&& !query.orderBy()
@ -2326,6 +2395,7 @@ void InterpreterSelectQuery::executeAggregation(QueryPlan & query_plan, const Ac
merge_threads,
temporary_data_merge_threads,
storage_has_evenly_distributed_read,
settings.group_by_use_nulls,
std::move(group_by_info),
std::move(group_by_sort_description),
should_produce_results_in_order_of_bucket_number);
@ -2402,9 +2472,9 @@ void InterpreterSelectQuery::executeRollupOrCube(QueryPlan & query_plan, Modific
QueryPlanStepPtr step;
if (modificator == Modificator::ROLLUP)
step = std::make_unique<RollupStep>(query_plan.getCurrentDataStream(), std::move(params), final);
step = std::make_unique<RollupStep>(query_plan.getCurrentDataStream(), std::move(params), final, settings.group_by_use_nulls);
else if (modificator == Modificator::CUBE)
step = std::make_unique<CubeStep>(query_plan.getCurrentDataStream(), std::move(params), final);
step = std::make_unique<CubeStep>(query_plan.getCurrentDataStream(), std::move(params), final, settings.group_by_use_nulls);
query_plan.addStep(std::move(step));
}

View File

@ -189,8 +189,6 @@ private:
void
executeMergeSorted(QueryPlan & query_plan, const SortDescription & sort_description, UInt64 limit, const std::string & description);
String generateFilterActions(ActionsDAGPtr & actions, const Names & prerequisite_columns = {}) const;
enum class Modificator
{
ROLLUP = 0,
@ -217,6 +215,9 @@ private:
ASTPtr row_policy_filter;
FilterDAGInfoPtr filter_info;
/// For the additional_table_filters setting.
FilterDAGInfoPtr additional_filter_info;
QueryProcessingStage::Enum from_stage = QueryProcessingStage::FetchColumns;
/// List of columns to read to execute the query.

View File

@ -357,6 +357,7 @@ void InterpreterSelectWithUnionQuery::buildQueryPlan(QueryPlan & query_plan)
}
}
addAdditionalPostFilter(query_plan);
query_plan.addInterpreterContext(context);
}

View File

@ -5,7 +5,6 @@
#include <Parsers/IdentifierQuotingStyle.h>
#include <Common/Exception.h>
#include <Common/TypePromotion.h>
#include <Core/Settings.h>
#include <IO/WriteBufferFromString.h>
#include <algorithm>
@ -26,7 +25,7 @@ namespace ErrorCodes
using IdentifierNameSet = std::set<String>;
class WriteBuffer;
using Strings = std::vector<String>;
/** Element of the syntax tree (hereinafter - directed acyclic graph with elements of semantics)
*/

View File

@ -3,6 +3,7 @@
#include <Parsers/IAST.h>
#include <Parsers/IParserBase.h>
#include <Parsers/CommonParsers.h>
#include <unordered_map>
namespace DB
{

View File

@ -12,12 +12,63 @@
namespace DB
{
class ParserLiteralOrMap : public IParserBase
{
public:
protected:
const char * getName() const override { return "literal or map"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override
{
{
ParserLiteral literal;
if (literal.parse(pos, node, expected))
return true;
}
ParserToken l_br(TokenType::OpeningCurlyBrace);
ParserToken r_br(TokenType::ClosingCurlyBrace);
ParserToken comma(TokenType::Comma);
ParserToken colon(TokenType::Colon);
ParserStringLiteral literal;
if (!l_br.ignore(pos, expected))
return false;
Map map;
while (!r_br.ignore(pos, expected))
{
if (!map.empty() && !comma.ignore(pos, expected))
return false;
ASTPtr key;
ASTPtr val;
if (!literal.parse(pos, key, expected))
return false;
if (!colon.ignore(pos, expected))
return false;
if (!literal.parse(pos, val, expected))
return false;
Tuple tuple;
tuple.push_back(std::move(key->as<ASTLiteral>()->value));
tuple.push_back(std::move(val->as<ASTLiteral>()->value));
map.push_back(std::move(tuple));
}
node = std::make_shared<ASTLiteral>(std::move(map));
return true;
}
};
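A short sketch of what the parser addition enables: SET can now accept a map literal, which is how additional_table_filters values are written (the filter text is invented):
void parseSetWithMapLiteralSketch()
{
    DB::ParserSetQuery parser;
    DB::String query = "SET additional_table_filters = {'table_1': 'x != 2'}";
    DB::ASTPtr ast = DB::parseQuery(
        parser, query.data(), query.data() + query.size(),
        "SET with map literal", /* max_query_size = */ 0, /* max_parser_depth = */ 1000);
    (void)ast;   /// the right-hand side is parsed by ParserLiteralOrMap into an ASTLiteral holding a Map
}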
/// Parse `name = value`.
bool ParserSetQuery::parseNameValuePair(SettingChange & change, IParser::Pos & pos, Expected & expected)
{
ParserCompoundIdentifier name_p;
ParserLiteral value_p;
ParserLiteralOrMap value_p;
ParserToken s_eq(TokenType::Equals);
ASTPtr name;

View File

@ -73,6 +73,7 @@ void IOutputFormat::work()
setRowsBeforeLimit(rows_before_limit_counter->get());
finalize();
finalized = true;
return;
}
@ -119,12 +120,9 @@ void IOutputFormat::write(const Block & block)
void IOutputFormat::finalize()
{
if (finalized)
return;
writePrefixIfNot();
writeSuffixIfNot();
finalizeImpl();
finalized = true;
}
}

View File

@ -13,6 +13,7 @@
#include <arrow/buffer.h>
#include <arrow/io/memory.h>
#include <arrow/result.h>
#include <Core/Settings.h>
#include <sys/stat.h>

View File

@ -30,6 +30,7 @@ namespace DB
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int INCORRECT_DATA;
}
CapnProtoRowInputFormat::CapnProtoRowInputFormat(ReadBuffer & in_, Block header, Params params_, const FormatSchemaInfo & info, const FormatSettings & format_settings_)
@ -264,20 +265,20 @@ bool CapnProtoRowInputFormat::readRow(MutableColumns & columns, RowReadExtension
if (in->eof())
return false;
auto array = readMessage();
#if CAPNP_VERSION >= 7000 && CAPNP_VERSION < 8000
capnp::UnalignedFlatArrayMessageReader msg(array);
#else
capnp::FlatArrayMessageReader msg(array);
#endif
auto root_reader = msg.getRoot<capnp::DynamicStruct>(root);
for (size_t i = 0; i != columns.size(); ++i)
try
{
auto value = getReaderByColumnName(root_reader, column_names[i]);
insertValue(*columns[i], column_types[i], value, format_settings.capn_proto.enum_comparing_mode);
auto array = readMessage();
capnp::FlatArrayMessageReader msg(array);
auto root_reader = msg.getRoot<capnp::DynamicStruct>(root);
for (size_t i = 0; i != columns.size(); ++i)
{
auto value = getReaderByColumnName(root_reader, column_names[i]);
insertValue(*columns[i], column_types[i], value, format_settings.capn_proto.enum_comparing_mode);
}
}
catch (const kj::Exception & e)
{
throw Exception(ErrorCodes::INCORRECT_DATA, "Cannot read row: {}", e.getDescription().cStr());
}
return true;

View File

@ -11,6 +11,7 @@
#include <Processors/Merges/AggregatingSortedTransform.h>
#include <Processors/Merges/FinishAggregatingInOrderTransform.h>
#include <Interpreters/Aggregator.h>
#include <Functions/FunctionFactory.h>
#include <Processors/QueryPlan/IQueryPlanStep.h>
#include <Columns/ColumnFixedString.h>
#include <DataTypes/DataTypesNumber.h>
@ -46,22 +47,32 @@ Block appendGroupingSetColumn(Block header)
return res;
}
static Block appendGroupingColumn(Block block, const GroupingSetsParamsList & params)
static inline void convertToNullable(Block & header, const Names & keys)
{
for (const auto & key : keys)
{
auto & column = header.getByName(key);
column.type = makeNullableSafe(column.type);
column.column = makeNullableSafe(column.column);
}
}
Block generateOutputHeader(const Block & input_header, const Names & keys, bool use_nulls)
{
auto header = appendGroupingSetColumn(input_header);
if (use_nulls)
convertToNullable(header, keys);
return header;
}
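A brief sketch of what generateOutputHeader() produces when group_by_use_nulls is set; the column names are invented:
DB::Block useNullsHeaderSketch()
{
    auto type = std::make_shared<DB::DataTypeUInt64>();
    DB::Block input{{type->createColumn(), type, "key"}, {type->createColumn(), type, "count()"}};

    /// Result: a "__grouping_set" UInt64 column is prepended and "key" becomes Nullable(UInt64);
    /// "count()" is not a grouping key, so its type is unchanged.
    return DB::generateOutputHeader(input, {"key"}, /* use_nulls = */ true);
}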
static Block appendGroupingColumn(Block block, const Names & keys, const GroupingSetsParamsList & params, bool use_nulls)
{
if (params.empty())
return block;
Block res;
size_t rows = block.rows();
auto column = ColumnUInt64::create(rows);
res.insert({ColumnPtr(std::move(column)), std::make_shared<DataTypeUInt64>(), "__grouping_set"});
for (auto & col : block)
res.insert(std::move(col));
return res;
return generateOutputHeader(block, keys, use_nulls);
}
AggregatingStep::AggregatingStep(
@ -74,11 +85,12 @@ AggregatingStep::AggregatingStep(
size_t merge_threads_,
size_t temporary_data_merge_threads_,
bool storage_has_evenly_distributed_read_,
bool group_by_use_nulls_,
InputOrderInfoPtr group_by_info_,
SortDescription group_by_sort_description_,
bool should_produce_results_in_order_of_bucket_number_)
: ITransformingStep(
input_stream_, appendGroupingColumn(params_.getHeader(input_stream_.header, final_), grouping_sets_params_), getTraits(should_produce_results_in_order_of_bucket_number_), false)
input_stream_, appendGroupingColumn(params_.getHeader(input_stream_.header, final_), params_.keys, grouping_sets_params_, group_by_use_nulls_), getTraits(should_produce_results_in_order_of_bucket_number_), false)
, params(std::move(params_))
, grouping_sets_params(std::move(grouping_sets_params_))
, final(final_)
@ -87,6 +99,7 @@ AggregatingStep::AggregatingStep(
, merge_threads(merge_threads_)
, temporary_data_merge_threads(temporary_data_merge_threads_)
, storage_has_evenly_distributed_read(storage_has_evenly_distributed_read_)
, group_by_use_nulls(group_by_use_nulls_)
, group_by_info(std::move(group_by_info_))
, group_by_sort_description(std::move(group_by_sort_description_))
, should_produce_results_in_order_of_bucket_number(should_produce_results_in_order_of_bucket_number_)
@ -217,6 +230,8 @@ void AggregatingStep::transformPipeline(QueryPipelineBuilder & pipeline, const B
assert(ports.size() == grouping_sets_size);
auto output_header = transform_params->getHeader();
if (group_by_use_nulls)
convertToNullable(output_header, params.keys);
for (size_t set_counter = 0; set_counter < grouping_sets_size; ++set_counter)
{
@ -236,6 +251,7 @@ void AggregatingStep::transformPipeline(QueryPipelineBuilder & pipeline, const B
const auto & missing_columns = grouping_sets_params[set_counter].missing_keys;
auto to_nullable_function = FunctionFactory::instance().get("toNullable", nullptr);
for (size_t i = 0; i < output_header.columns(); ++i)
{
auto & col = output_header.getByPosition(i);
@ -251,7 +267,13 @@ void AggregatingStep::transformPipeline(QueryPipelineBuilder & pipeline, const B
index.push_back(node);
}
else
index.push_back(dag->getIndex()[header.getPositionByName(col.name)]);
{
const auto * column_node = dag->getIndex()[header.getPositionByName(col.name)];
if (group_by_use_nulls && column_node->result_type->canBeInsideNullable())
index.push_back(&dag->addFunction(to_nullable_function, { column_node }, col.name));
else
index.push_back(column_node);
}
}
dag->getIndex().swap(index);
@ -396,7 +418,7 @@ void AggregatingStep::updateOutputStream()
{
output_stream = createOutputStream(
input_streams.front(),
appendGroupingColumn(params.getHeader(input_streams.front().header, final), grouping_sets_params),
appendGroupingColumn(params.getHeader(input_streams.front().header, final), params.keys, grouping_sets_params, group_by_use_nulls),
getDataStreamTraits());
}

View File

@ -20,6 +20,7 @@ struct GroupingSetsParams
using GroupingSetsParamsList = std::vector<GroupingSetsParams>;
Block appendGroupingSetColumn(Block header);
Block generateOutputHeader(const Block & input_header, const Names & keys, bool use_nulls);
/// Aggregation. See AggregatingTransform.
class AggregatingStep : public ITransformingStep
@ -35,6 +36,7 @@ public:
size_t merge_threads_,
size_t temporary_data_merge_threads_,
bool storage_has_evenly_distributed_read_,
bool group_by_use_nulls_,
InputOrderInfoPtr group_by_info_,
SortDescription group_by_sort_description_,
bool should_produce_results_in_order_of_bucket_number_);
@ -62,6 +64,7 @@ private:
size_t temporary_data_merge_threads;
bool storage_has_evenly_distributed_read;
bool group_by_use_nulls;
InputOrderInfoPtr group_by_info;
SortDescription group_by_sort_description;

View File

@ -4,6 +4,7 @@
#include <Processors/QueryPlan/AggregatingStep.h>
#include <QueryPipeline/QueryPipelineBuilder.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/FunctionFactory.h>
namespace DB
{
@ -24,27 +25,41 @@ static ITransformingStep::Traits getTraits()
};
}
CubeStep::CubeStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_)
: ITransformingStep(input_stream_, appendGroupingSetColumn(params_.getHeader(input_stream_.header, final_)), getTraits())
CubeStep::CubeStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_, bool use_nulls_)
: ITransformingStep(input_stream_, generateOutputHeader(params_.getHeader(input_stream_.header, final_), params_.keys, use_nulls_), getTraits())
, keys_size(params_.keys_size)
, params(std::move(params_))
, final(final_)
, use_nulls(use_nulls_)
{
/// Aggregation keys are distinct
for (const auto & key : params.keys)
output_stream->distinct_columns.insert(key);
}
ProcessorPtr addGroupingSetForTotals(const Block & header, const BuildQueryPipelineSettings & settings, UInt64 grouping_set_number)
ProcessorPtr addGroupingSetForTotals(const Block & header, const Names & keys, bool use_nulls, const BuildQueryPipelineSettings & settings, UInt64 grouping_set_number)
{
auto dag = std::make_shared<ActionsDAG>(header.getColumnsWithTypeAndName());
auto & index = dag->getIndex();
if (use_nulls)
{
auto to_nullable = FunctionFactory::instance().get("toNullable", nullptr);
for (const auto & key : keys)
{
const auto * node = dag->getIndex()[header.getPositionByName(key)];
if (node->result_type->canBeInsideNullable())
{
dag->addOrReplaceInIndex(dag->addFunction(to_nullable, { node }, node->result_name));
}
}
}
auto grouping_col = ColumnUInt64::create(1, grouping_set_number);
const auto * grouping_node = &dag->addColumn(
{ColumnPtr(std::move(grouping_col)), std::make_shared<DataTypeUInt64>(), "__grouping_set"});
grouping_node = &dag->materializeNode(*grouping_node);
auto & index = dag->getIndex();
index.insert(index.begin(), grouping_node);
auto expression = std::make_shared<ExpressionActions>(dag, settings.getActionsSettings());
@ -58,10 +73,10 @@ void CubeStep::transformPipeline(QueryPipelineBuilder & pipeline, const BuildQue
pipeline.addSimpleTransform([&](const Block & header, QueryPipelineBuilder::StreamType stream_type) -> ProcessorPtr
{
if (stream_type == QueryPipelineBuilder::StreamType::Totals)
return addGroupingSetForTotals(header, settings, (UInt64(1) << keys_size) - 1);
return addGroupingSetForTotals(header, params.keys, use_nulls, settings, (UInt64(1) << keys_size) - 1);
auto transform_params = std::make_shared<AggregatingTransformParams>(header, std::move(params), final);
return std::make_shared<CubeTransform>(header, std::move(transform_params));
return std::make_shared<CubeTransform>(header, std::move(transform_params), use_nulls);
});
}
@ -73,7 +88,7 @@ const Aggregator::Params & CubeStep::getParams() const
void CubeStep::updateOutputStream()
{
output_stream = createOutputStream(
input_streams.front(), appendGroupingSetColumn(params.getHeader(input_streams.front().header, final)), getDataStreamTraits());
input_streams.front(), generateOutputHeader(params.getHeader(input_streams.front().header, final), params.keys, use_nulls), getDataStreamTraits());
/// Aggregation keys are distinct
for (const auto & key : params.keys)

View File

@ -13,7 +13,7 @@ using AggregatingTransformParamsPtr = std::shared_ptr<AggregatingTransformParams
class CubeStep : public ITransformingStep
{
public:
CubeStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_);
CubeStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_, bool use_nulls_);
String getName() const override { return "Cube"; }
@ -26,6 +26,7 @@ private:
size_t keys_size;
Aggregator::Params params;
bool final;
bool use_nulls;
};
}

View File

@ -7,6 +7,7 @@
#include <IO/Operators.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/Context.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTSelectQuery.h>

View File

@ -1,6 +1,7 @@
#pragma once
#include <Processors/QueryPlan/ISourceStep.h>
#include <Storages/MergeTree/RangesInDataPart.h>
#include <Storages/MergeTree/RequestResponse.h>
namespace DB
{
@ -9,6 +10,8 @@ using PartitionIdToMaxBlock = std::unordered_map<String, Int64>;
class Pipe;
using MergeTreeReadTaskCallback = std::function<std::optional<PartitionReadResponse>(PartitionReadRequest)>;
struct MergeTreeDataSelectSamplingData
{
bool use_sampling = false;

View File

@ -22,18 +22,19 @@ static ITransformingStep::Traits getTraits()
};
}
RollupStep::RollupStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_)
: ITransformingStep(input_stream_, appendGroupingSetColumn(params_.getHeader(input_stream_.header, final_)), getTraits())
RollupStep::RollupStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_, bool use_nulls_)
: ITransformingStep(input_stream_, generateOutputHeader(params_.getHeader(input_stream_.header, final_), params_.keys, use_nulls_), getTraits())
, params(std::move(params_))
, keys_size(params.keys_size)
, final(final_)
, use_nulls(use_nulls_)
{
/// Aggregation keys are distinct
for (const auto & key : params.keys)
output_stream->distinct_columns.insert(key);
}
ProcessorPtr addGroupingSetForTotals(const Block & header, const BuildQueryPipelineSettings & settings, UInt64 grouping_set_number);
ProcessorPtr addGroupingSetForTotals(const Block & header, const Names & keys, bool use_nulls, const BuildQueryPipelineSettings & settings, UInt64 grouping_set_number);
void RollupStep::transformPipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings & settings)
{
@ -42,10 +43,10 @@ void RollupStep::transformPipeline(QueryPipelineBuilder & pipeline, const BuildQ
pipeline.addSimpleTransform([&](const Block & header, QueryPipelineBuilder::StreamType stream_type) -> ProcessorPtr
{
if (stream_type == QueryPipelineBuilder::StreamType::Totals)
return addGroupingSetForTotals(header, settings, keys_size);
return addGroupingSetForTotals(header, params.keys, use_nulls, settings, keys_size);
auto transform_params = std::make_shared<AggregatingTransformParams>(header, std::move(params), true);
return std::make_shared<RollupTransform>(header, std::move(transform_params));
return std::make_shared<RollupTransform>(header, std::move(transform_params), use_nulls);
});
}
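The TOTALS stream handling differs between the two modifiers only in the grouping-set number it is tagged with: `keys_size` here versus the full bitmask `(UInt64(1) << keys_size) - 1` for CUBE. The underlying reason is the different number of grouping sets each modifier materializes; a tiny illustrative sketch (helper names are made up):

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative only: ROLLUP peels keys one by one, so RollupTransform emits
// N + 1 grouping sets; CUBE walks every key subset, so CubeTransform emits 2^N
// sets and can index them with a bitmask.
size_t   rollupSetCount(size_t n_keys) { return n_keys + 1; }
uint64_t cubeSetCount(size_t n_keys)   { return uint64_t(1) << n_keys; }
```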

View File

@ -13,7 +13,7 @@ using AggregatingTransformParamsPtr = std::shared_ptr<AggregatingTransformParams
class RollupStep : public ITransformingStep
{
public:
RollupStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_);
RollupStep(const DataStream & input_stream_, Aggregator::Params params_, bool final_, bool use_nulls_);
String getName() const override { return "Rollup"; }
@ -25,6 +25,7 @@ private:
Aggregator::Params params;
size_t keys_size;
bool final;
bool use_nulls;
};
}

View File

@ -1,4 +1,5 @@
#include <Processors/TTL/TTLAggregationAlgorithm.h>
#include <Interpreters/Context.h>
namespace DB
{

View File

@ -1,6 +1,7 @@
#include <Processors/Transforms/CubeTransform.h>
#include <Processors/Transforms/TotalsHavingTransform.h>
#include <Processors/QueryPlan/AggregatingStep.h>
#include "Processors/Transforms/RollupTransform.h"
namespace DB
{
@ -9,61 +10,32 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR;
}
CubeTransform::CubeTransform(Block header, AggregatingTransformParamsPtr params_)
: IAccumulatingTransform(std::move(header), appendGroupingSetColumn(params_->getHeader()))
, params(std::move(params_))
CubeTransform::CubeTransform(Block header, AggregatingTransformParamsPtr params_, bool use_nulls_)
: GroupByModifierTransform(std::move(header), params_, use_nulls_)
, aggregates_mask(getAggregatesMask(params->getHeader(), params->params.aggregates))
{
keys.reserve(params->params.keys_size);
for (const auto & key : params->params.keys)
keys.emplace_back(input.getHeader().getPositionByName(key));
if (keys.size() >= 8 * sizeof(mask))
throw Exception("Too many keys are used for CubeTransform.", ErrorCodes::LOGICAL_ERROR);
}
Chunk CubeTransform::merge(Chunks && chunks, bool final)
{
BlocksList rollup_blocks;
for (auto & chunk : chunks)
rollup_blocks.emplace_back(getInputPort().getHeader().cloneWithColumns(chunk.detachColumns()));
auto rollup_block = params->aggregator.mergeBlocks(rollup_blocks, final);
auto num_rows = rollup_block.rows();
return Chunk(rollup_block.getColumns(), num_rows);
}
void CubeTransform::consume(Chunk chunk)
{
consumed_chunks.emplace_back(std::move(chunk));
}
MutableColumnPtr getColumnWithDefaults(Block const & header, size_t key, size_t n);
Chunk CubeTransform::generate()
{
if (!consumed_chunks.empty())
{
if (consumed_chunks.size() > 1)
cube_chunk = merge(std::move(consumed_chunks), false);
else
cube_chunk = std::move(consumed_chunks.front());
mergeConsumed();
consumed_chunks.clear();
auto num_rows = cube_chunk.getNumRows();
auto num_rows = current_chunk.getNumRows();
mask = (static_cast<UInt64>(1) << keys.size()) - 1;
current_columns = cube_chunk.getColumns();
current_columns = current_chunk.getColumns();
current_zero_columns.clear();
current_zero_columns.reserve(keys.size());
auto const & input_header = getInputPort().getHeader();
for (auto key : keys)
current_zero_columns.emplace_back(getColumnWithDefaults(input_header, key, num_rows));
current_zero_columns.emplace_back(getColumnWithDefaults(key, num_rows));
}
auto gen_chunk = std::move(cube_chunk);
auto gen_chunk = std::move(current_chunk);
if (mask)
{
@ -78,7 +50,7 @@ Chunk CubeTransform::generate()
Chunks chunks;
chunks.emplace_back(std::move(columns), current_columns.front()->size());
cube_chunk = merge(std::move(chunks), false);
current_chunk = merge(std::move(chunks), !use_nulls, false);
}
finalizeChunk(gen_chunk, aggregates_mask);
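A hedged sketch of the enumeration that `mask` drives above: each `generate()` call emits one grouping set, and keys whose bit is cleared are swapped for the precomputed default/NULL columns before the chunk is re-merged (`!use_nulls` selects whether the merge runs against the input header or the nullable intermediate header). The function below is illustrative only, not the transform's code, and the bit ordering is an assumption:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using ColumnId = size_t;

/// Illustrative: enumerate the 2^N key subsets CubeTransform walks via `mask`,
/// from "all keys kept" down to "every key aggregated away". Assumes fewer than
/// 64 keys, mirroring the "Too many keys" guard above.
std::vector<std::vector<ColumnId>> enumerateCubeSets(const std::vector<ColumnId> & keys)
{
    std::vector<std::vector<ColumnId>> sets;
    for (uint64_t mask = (uint64_t(1) << keys.size()) - 1; ; --mask)
    {
        std::vector<ColumnId> kept;
        for (size_t i = 0; i < keys.size(); ++i)
            if (mask & (uint64_t(1) << i))   // which bit maps to which key is an implementation detail
                kept.push_back(keys[i]);
        sets.push_back(std::move(kept));
        if (mask == 0)
            break;
    }
    return sets;
}
```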

View File

@ -1,6 +1,7 @@
#pragma once
#include <Processors/IInflatingTransform.h>
#include <Processors/Transforms/AggregatingTransform.h>
#include <Processors/Transforms/RollupTransform.h>
#include <Processors/Transforms/finalizeChunk.h>
@ -9,30 +10,23 @@ namespace DB
/// Takes blocks after grouping, with non-finalized aggregate functions.
/// Calculates all subsets of columns and aggregates over them.
class CubeTransform : public IAccumulatingTransform
class CubeTransform : public GroupByModifierTransform
{
public:
CubeTransform(Block header, AggregatingTransformParamsPtr params);
CubeTransform(Block header, AggregatingTransformParamsPtr params, bool use_nulls_);
String getName() const override { return "CubeTransform"; }
protected:
void consume(Chunk chunk) override;
Chunk generate() override;
private:
AggregatingTransformParamsPtr params;
ColumnNumbers keys;
const ColumnsMask aggregates_mask;
Chunks consumed_chunks;
Chunk cube_chunk;
Columns current_columns;
Columns current_zero_columns;
UInt64 mask = 0;
UInt64 grouping_set = 0;
Chunk merge(Chunks && chunks, bool final);
};
}

View File

@ -1,36 +1,80 @@
#include <Processors/Transforms/RollupTransform.h>
#include <Processors/Transforms/TotalsHavingTransform.h>
#include <Processors/QueryPlan/AggregatingStep.h>
#include <Columns/ColumnNullable.h>
namespace DB
{
RollupTransform::RollupTransform(Block header, AggregatingTransformParamsPtr params_)
: IAccumulatingTransform(std::move(header), appendGroupingSetColumn(params_->getHeader()))
GroupByModifierTransform::GroupByModifierTransform(Block header, AggregatingTransformParamsPtr params_, bool use_nulls_)
: IAccumulatingTransform(std::move(header), generateOutputHeader(params_->getHeader(), params_->params.keys, use_nulls_))
, params(std::move(params_))
, aggregates_mask(getAggregatesMask(params->getHeader(), params->params.aggregates))
, use_nulls(use_nulls_)
{
keys.reserve(params->params.keys_size);
for (const auto & key : params->params.keys)
keys.emplace_back(input.getHeader().getPositionByName(key));
intermediate_header = getOutputPort().getHeader();
intermediate_header.erase(0);
if (use_nulls)
{
auto output_aggregator_params = params->params;
output_aggregator = std::make_unique<Aggregator>(intermediate_header, output_aggregator_params);
}
}
void RollupTransform::consume(Chunk chunk)
void GroupByModifierTransform::consume(Chunk chunk)
{
consumed_chunks.emplace_back(std::move(chunk));
}
Chunk RollupTransform::merge(Chunks && chunks, bool final)
void GroupByModifierTransform::mergeConsumed()
{
BlocksList rollup_blocks;
for (auto & chunk : chunks)
rollup_blocks.emplace_back(getInputPort().getHeader().cloneWithColumns(chunk.detachColumns()));
if (consumed_chunks.size() > 1)
current_chunk = merge(std::move(consumed_chunks), true, false);
else
current_chunk = std::move(consumed_chunks.front());
auto rollup_block = params->aggregator.mergeBlocks(rollup_blocks, final);
auto num_rows = rollup_block.rows();
return Chunk(rollup_block.getColumns(), num_rows);
size_t rows = current_chunk.getNumRows();
auto columns = current_chunk.getColumns();
if (use_nulls)
{
for (auto key : keys)
columns[key] = makeNullableSafe(columns[key]);
}
current_chunk = Chunk{ columns, rows };
consumed_chunks.clear();
}
Chunk GroupByModifierTransform::merge(Chunks && chunks, bool is_input, bool final)
{
auto header = is_input ? getInputPort().getHeader() : intermediate_header;
BlocksList blocks;
for (auto & chunk : chunks)
blocks.emplace_back(header.cloneWithColumns(chunk.detachColumns()));
auto current_block = is_input ? params->aggregator.mergeBlocks(blocks, final) : output_aggregator->mergeBlocks(blocks, final);
auto num_rows = current_block.rows();
return Chunk(current_block.getColumns(), num_rows);
}
MutableColumnPtr GroupByModifierTransform::getColumnWithDefaults(size_t key, size_t n) const
{
auto const & col = intermediate_header.getByPosition(key);
auto result_column = col.column->cloneEmpty();
col.type->insertManyDefaultsInto(*result_column, n);
return result_column;
}
RollupTransform::RollupTransform(Block header, AggregatingTransformParamsPtr params_, bool use_nulls_)
: GroupByModifierTransform(std::move(header), params_, use_nulls_)
, aggregates_mask(getAggregatesMask(params->getHeader(), params->params.aggregates))
{}
MutableColumnPtr getColumnWithDefaults(Block const & header, size_t key, size_t n)
{
auto const & col = header.getByPosition(key);
@ -43,16 +87,11 @@ Chunk RollupTransform::generate()
{
if (!consumed_chunks.empty())
{
if (consumed_chunks.size() > 1)
rollup_chunk = merge(std::move(consumed_chunks), false);
else
rollup_chunk = std::move(consumed_chunks.front());
consumed_chunks.clear();
mergeConsumed();
last_removed_key = keys.size();
}
auto gen_chunk = std::move(rollup_chunk);
auto gen_chunk = std::move(current_chunk);
if (last_removed_key)
{
@ -61,11 +100,11 @@ Chunk RollupTransform::generate()
auto num_rows = gen_chunk.getNumRows();
auto columns = gen_chunk.getColumns();
columns[key] = getColumnWithDefaults(getInputPort().getHeader(), key, num_rows);
columns[key] = getColumnWithDefaults(key, num_rows);
Chunks chunks;
chunks.emplace_back(std::move(columns), num_rows);
rollup_chunk = merge(std::move(chunks), false);
current_chunk = merge(std::move(chunks), !use_nulls, false);
}
finalizeChunk(gen_chunk, aggregates_mask);
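The practical effect of `use_nulls` shows up in `mergeConsumed` and `getColumnWithDefaults`: keys that have been aggregated away are filled with the type's default value when the setting is off, and with NULL (the key columns having been wrapped via `makeNullableSafe`) when it is on. An illustrative helper, not part of the PR, condensing the two cases:

```cpp
#include <Columns/IColumn.h>
#include <Core/ColumnWithTypeAndName.h>
#include <DataTypes/DataTypeNullable.h>

using namespace DB;

/// Illustrative only: the value an aggregated-away key takes in subtotal rows.
/// Without use_nulls it is the type's default (0, '', ...); with use_nulls the
/// column is Nullable and the rows are NULL, so subtotals can no longer be
/// confused with genuine default values in the data.
MutableColumnPtr subtotalKeyColumn(const ColumnWithTypeAndName & key, size_t n, bool use_nulls)
{
    auto type = use_nulls ? makeNullable(key.type) : key.type;
    auto column = type->createColumn();
    type->insertManyDefaultsInto(*column, n);   /// default of Nullable(T) is NULL
    return column;
}
```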

View File

@ -1,4 +1,6 @@
#pragma once
#include <memory>
#include <Core/ColumnNumbers.h>
#include <Processors/IAccumulatingTransform.h>
#include <Processors/Transforms/AggregatingTransform.h>
#include <Processors/Transforms/finalizeChunk.h>
@ -6,29 +8,49 @@
namespace DB
{
/// Takes blocks after grouping, with non-finalized aggregate functions.
/// Calculates subtotals and grand totals values for a set of columns.
class RollupTransform : public IAccumulatingTransform
struct GroupByModifierTransform : public IAccumulatingTransform
{
public:
RollupTransform(Block header, AggregatingTransformParamsPtr params);
String getName() const override { return "RollupTransform"; }
GroupByModifierTransform(Block header, AggregatingTransformParamsPtr params_, bool use_nulls_);
protected:
void consume(Chunk chunk) override;
void mergeConsumed();
Chunk merge(Chunks && chunks, bool is_input, bool final);
MutableColumnPtr getColumnWithDefaults(size_t key, size_t n) const;
AggregatingTransformParamsPtr params;
bool use_nulls;
ColumnNumbers keys;
std::unique_ptr<Aggregator> output_aggregator;
Block intermediate_header;
Chunks consumed_chunks;
Chunk current_chunk;
};
/// Takes blocks after grouping, with non-finalized aggregate functions.
/// Calculates subtotals and grand totals values for a set of columns.
class RollupTransform : public GroupByModifierTransform
{
public:
RollupTransform(Block header, AggregatingTransformParamsPtr params, bool use_nulls_);
String getName() const override { return "RollupTransform"; }
protected:
Chunk generate() override;
private:
AggregatingTransformParamsPtr params;
ColumnNumbers keys;
const ColumnsMask aggregates_mask;
Chunks consumed_chunks;
Chunk rollup_chunk;
size_t last_removed_key = 0;
size_t set_counter = 0;
Chunk merge(Chunks && chunks, bool final);
};
}

View File

@ -7,6 +7,7 @@
#include <Common/CurrentThread.h>
#include <Interpreters/InternalTextLogsQueue.h>
#include <IO/ConnectionTimeouts.h>
#include <Core/Settings.h>
namespace DB

View File

@ -9,7 +9,6 @@
#include <Common/MultiVersion.h>
#include "IServer.h"
#include <Common/Stopwatch.h>
#include <Interpreters/Context.h>
#include <Common/ZooKeeper/ZooKeeperCommon.h>
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <Common/ConcurrentBoundedQueue.h>

View File

@ -430,13 +430,6 @@ public:
writer->write(getHeader().cloneWithColumns(chunk.detachColumns()));
}
void onCancel() override
{
if (!writer)
return;
onFinish();
}
void onException() override
{
if (!writer)

View File

@ -2,6 +2,7 @@
#include <Storages/MergeTree/MergeTreeData.h>
#include <Common/CurrentMetrics.h>
#include <Common/randomSeed.h>
#include <Interpreters/Context.h>
#include <pcg_random.hpp>
#include <random>

View File

@ -4,6 +4,7 @@
#include <Compression/CachedCompressedReadBuffer.h>
#include <Columns/ColumnArray.h>
#include <Interpreters/inplaceBlockConversions.h>
#include <Interpreters/Context.h>
#include <Storages/MergeTree/IMergeTreeReader.h>
#include <Common/typeid_cast.h>

View File

@ -4,6 +4,7 @@
#include <Storages/MergeTree/MergeTask.h>
#include <Storages/MutationCommands.h>
#include <Storages/MergeTree/MergeMutateSelectedEntry.h>
#include <Interpreters/MergeTreeTransactionHolder.h>
namespace DB
{

View File

@ -124,7 +124,7 @@ void MergeTreeBackgroundExecutor<Queue>::routine(TaskRuntimeDataPtr item)
/// All operations with queues are considered not to do any allocations
auto erase_from_active = [this, item]() TSA_REQUIRES(mutex)
auto erase_from_active = [this, &item]() TSA_REQUIRES(mutex)
{
active.erase(std::remove(active.begin(), active.end(), item), active.end());
};
@ -157,11 +157,10 @@ void MergeTreeBackgroundExecutor<Queue>::routine(TaskRuntimeDataPtr item)
if (need_execute_again)
{
std::lock_guard guard(mutex);
erase_from_active();
if (item->is_currently_deleting)
{
erase_from_active();
/// This is significant to order the destructors.
{
NOEXCEPT_SCOPE({
@ -179,7 +178,6 @@ void MergeTreeBackgroundExecutor<Queue>::routine(TaskRuntimeDataPtr item)
/// Otherwise the destruction of the task won't be ordered with the destruction of the
/// storage.
pending.push(std::move(item));
erase_from_active();
has_tasks.notify_one();
item = nullptr;
return;
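On the `erase_from_active` change above: capturing `item` by reference means the lambda no longer holds its own copy of the shared pointer and always sees the local variable's current value — relevant because `item = nullptr;` runs before the routine returns. A minimal standalone illustration of the difference, using plain `std::shared_ptr` (nothing ClickHouse-specific):

```cpp
#include <cassert>
#include <memory>

int main()
{
    auto item = std::make_shared<int>(42);

    auto by_value = [item]  { return item.use_count(); };            // lambda owns a copy
    auto by_ref   = [&item] { return item ? item.use_count() : 0; }; // lambda sees the local

    assert(by_value() == 2);   // local + the lambda's copy
    item = nullptr;            // as done near the end of routine()
    assert(by_value() == 1);   // the copy inside the lambda still keeps the object alive
    assert(by_ref() == 0);     // the by-reference lambda observes the reset
}
```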

View File

@ -56,6 +56,9 @@ struct ZeroCopyLock;
class IBackupEntry;
using BackupEntries = std::vector<std::pair<String, std::shared_ptr<const IBackupEntry>>>;
class MergeTreeTransaction;
using MergeTreeTransactionPtr = std::shared_ptr<MergeTreeTransaction>;
/// Auxiliary struct holding information about the future merged or mutated part.
struct EmergingPartInfo
{

View File

@ -383,6 +383,7 @@ QueryPlanPtr MergeTreeDataSelectExecutor::read(
merge_threads,
temporary_data_merge_threads,
/* storage_has_evenly_distributed_read_= */ false,
/* group_by_use_nulls */ false,
std::move(group_by_info),
std::move(group_by_sort_description),
should_produce_results_in_order_of_bucket_number);

View File

@ -2,6 +2,7 @@
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/IMergeTreeDataPart.h>
#include <IO/HashingWriteBuffer.h>
#include <Interpreters/Context.h>
#include <Common/FieldVisitors.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeTuple.h>

Some files were not shown because too many files have changed in this diff.