Merge branch 'master' into fix_settings_constraints

Nikolay Degterinsky 2022-12-15 21:54:02 +01:00 committed by GitHub
commit 1861e670e9
130 changed files with 4608 additions and 1697 deletions


@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v22.12, 2022-12-15](#2212)**<br/>
**[ClickHouse release v22.11, 2022-11-17](#2211)**<br/>
**[ClickHouse release v22.10, 2022-10-25](#2210)**<br/>
**[ClickHouse release v22.9, 2022-09-22](#229)**<br/>
@ -12,6 +13,121 @@
**[ClickHouse release v22.1, 2022-01-18](#221)**<br/>
**[Changelog for 2021](https://clickhouse.com/docs/en/whats-new/changelog/2021/)**<br/>
# 2022 Changelog
### <a id="2212"></a> ClickHouse release 22.12, 2022-12-15
#### Upgrade Notes
* Fixed backward incompatibility in (de)serialization of states of `min`, `max`, `any*`, `argMin`, `argMax` aggregate functions with `String` argument. The incompatibility affects 22.9, 22.10 and 22.11 branches (fixed since 22.9.6, 22.10.4 and 22.11.2 correspondingly). Some minor releases of 22.3, 22.7 and 22.8 branches are also affected: 22.3.13...22.3.14 (fixed since 22.3.15), 22.8.6...22.8.9 (fixed since 22.8.10), 22.7.6 and newer (will not be fixed in 22.7, we recommend upgrading from 22.7.* to 22.8.10 or newer). This release note does not concern users that have never used affected versions. Incompatible versions append an extra `'\0'` to strings when reading states of the aggregate functions mentioned above. For example, if an older version saved state of `anyState('foobar')` to `state_column` then the incompatible version will print `'foobar\0'` on `anyMerge(state_column)`. Also incompatible versions write states of the aggregate functions without trailing `'\0'`. Newer versions (that have the fix) can correctly read data written by all versions including incompatible versions, except one corner case. If an incompatible version saved a state with a string that actually ends with null character, then newer version will trim trailing `'\0'` when reading state of affected aggregate function. For example, if an incompatible version saved state of `anyState('abrac\0dabra\0')` to `state_column` then newer versions will print `'abrac\0dabra'` on `anyMerge(state_column)`. The issue also affects distributed queries when an incompatible version works in a cluster together with older or newer versions. [#43038](https://github.com/ClickHouse/ClickHouse/pull/43038) ([Alexander Tokmakov](https://github.com/tavplubix), [Raúl Marín](https://github.com/Algunenano)). Note: all the official ClickHouse builds already include the patches. This is not necessarily true for unofficial third-party builds that should be avoided.
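
The read/write behavior described in the upgrade note above can be illustrated with a small model. This is only a sketch: the function names are hypothetical, the real on-disk state format is more involved than a raw byte string, and the assumption that fixed versions strip at most one trailing null character is inferred from the examples in the note, not taken from the ClickHouse source.

```python
# Simplified model of how string states of `any`, `min`, `max`, etc. are
# stored and read by the three kinds of versions described in the note.
# All names here are hypothetical and exist only for illustration.

def write_old(value: bytes) -> bytes:
    """Older (compatible) versions store the string with a trailing NUL."""
    return value + b"\0"

def write_incompatible(value: bytes) -> bytes:
    """Incompatible versions store the string without the trailing NUL."""
    return value

def read_incompatible(stored: bytes) -> bytes:
    """Incompatible versions return the stored bytes as-is."""
    return stored

def read_fixed(stored: bytes) -> bytes:
    """Fixed versions drop one trailing NUL if present (assumption)."""
    return stored[:-1] if stored.endswith(b"\0") else stored

# State written by an older version, read by an incompatible one:
print(read_incompatible(write_old(b"foobar")))             # b'foobar\x00'
# Fixed versions read data from both older and incompatible writers:
print(read_fixed(write_old(b"foobar")))                    # b'foobar'
print(read_fixed(write_incompatible(b"foobar")))           # b'foobar'
# Corner case: a genuine trailing NUL written by an incompatible version is lost:
print(read_fixed(write_incompatible(b"abrac\0dabra\0")))   # b'abrac\x00dabra'
```

The last call shows why the corner case cannot be fixed by the reader: the stored bytes alone do not reveal which kind of version wrote them.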
#### New Feature
* Add `BSONEachRow` input/output format. In this format, ClickHouse formats/parses each row as a separate BSON document and each column is formatted/parsed as a single BSON field with the column name as the key. [#42033](https://github.com/ClickHouse/ClickHouse/pull/42033) ([mark-polokhov](https://github.com/mark-polokhov)).
* Add `grace_hash` JOIN algorithm, it can be enabled with `SET join_algorithm = 'grace_hash'`. [#38191](https://github.com/ClickHouse/ClickHouse/pull/38191) ([BigRedEye](https://github.com/BigRedEye), [Vladimir C](https://github.com/vdimir)).
* Allow configuring password complexity rules and checks for creating and changing users. [#43719](https://github.com/ClickHouse/ClickHouse/pull/43719) ([Nikolay Degterinsky](https://github.com/evillique)).
* Mask sensitive information in logs; mask secret parts in the output of queries `SHOW CREATE TABLE` and `SELECT FROM system.tables`. Also resolves [#41418](https://github.com/ClickHouse/ClickHouse/issues/41418). [#43227](https://github.com/ClickHouse/ClickHouse/pull/43227) ([Vitaly Baranov](https://github.com/vitlibar)).
* Add `GROUP BY ALL` syntax: [#37631](https://github.com/ClickHouse/ClickHouse/issues/37631). [#42265](https://github.com/ClickHouse/ClickHouse/pull/42265) ([刘陶峰](https://github.com/taofengliu)).
* Add `FROM table SELECT column` syntax. [#41095](https://github.com/ClickHouse/ClickHouse/pull/41095) ([Nikolay Degterinsky](https://github.com/evillique)).
* Added function `concatWithSeparator` and `concat_ws` as an alias for Spark SQL compatibility. A function `concatWithSeparatorAssumeInjective` added as a variant to enable GROUP BY optimization, similarly to `concatAssumeInjective`. [#43749](https://github.com/ClickHouse/ClickHouse/pull/43749) ([李扬](https://github.com/taiyang-li)).
* Added `multiplyDecimal` and `divideDecimal` functions for decimal operations with fixed precision. [#42438](https://github.com/ClickHouse/ClickHouse/pull/42438) ([Andrey Zvonov](https://github.com/zvonand)).
* Added `system.moves` table with list of currently moving parts. [#42660](https://github.com/ClickHouse/ClickHouse/pull/42660) ([Sergei Trifonov](https://github.com/serxa)).
* Add support for embedded Prometheus endpoint for ClickHouse Keeper. [#43087](https://github.com/ClickHouse/ClickHouse/pull/43087) ([Antonio Andelic](https://github.com/antonio2368)).
* Support numeric literals with `_` as the separator, for example, `1_000_000`. [#43925](https://github.com/ClickHouse/ClickHouse/pull/43925) ([jh0x](https://github.com/jh0x)).
* Added possibility to use an array as a second parameter for `cutURLParameter` function. It will cut multiple parameters. Close [#6827](https://github.com/ClickHouse/ClickHouse/issues/6827). [#43788](https://github.com/ClickHouse/ClickHouse/pull/43788) ([Roman Vasin](https://github.com/rvasin)).
* Add a column with the expression of the index in the `system.data_skipping_indices` table. [#43308](https://github.com/ClickHouse/ClickHouse/pull/43308) ([Guillaume Tassery](https://github.com/YiuRULE)).
* Add column `engine_full` to system table `databases` so that users can access the entire engine definition of a database via system tables. [#43468](https://github.com/ClickHouse/ClickHouse/pull/43468) ([凌涛](https://github.com/lingtaolf)).
* New hash function [xxh3](https://github.com/Cyan4973/xxHash) added. Also, the performance of `xxHash32` and `xxHash64` is improved on ARM thanks to a library update. [#43411](https://github.com/ClickHouse/ClickHouse/pull/43411) ([Nikita Taranov](https://github.com/nickitat)).
* Added support to define constraints for merge tree settings. For example you can forbid overriding the `storage_policy` by users. [#43903](https://github.com/ClickHouse/ClickHouse/pull/43903) ([Sergei Trifonov](https://github.com/serxa)).
* Add a new setting `input_format_json_read_objects_as_strings` that allows the parsing of nested JSON objects into Strings in all JSON input formats. This setting is disabled by default. [#44052](https://github.com/ClickHouse/ClickHouse/pull/44052) ([Kruglov Pavel](https://github.com/Avogar)).
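
The `grace_hash` entry above names the algorithm without describing it. As a rough illustration of the general textbook technique (not ClickHouse's implementation), the sketch below partitions both inputs by a hash of the join key and then joins one bucket at a time, so only one bucket's hash table needs to be held in memory. The function name, the row representation as dictionaries, and the `num_buckets` parameter are assumptions made for this example; a real implementation would spill buckets to disk.

```python
from collections import defaultdict

def grace_hash_join(left, right, key, num_buckets=4):
    """Inner-join two lists of dict rows on `key`, one bucket at a time."""
    # Phase 1: partition both inputs by the hash of the join key.
    left_buckets = defaultdict(list)
    right_buckets = defaultdict(list)
    for row in left:
        left_buckets[hash(row[key]) % num_buckets].append(row)
    for row in right:
        right_buckets[hash(row[key]) % num_buckets].append(row)

    # Phase 2: rows with equal keys always land in the same bucket,
    # so each bucket pair can be joined independently.
    result = []
    for bucket in range(num_buckets):
        table = defaultdict(list)  # hash table for this bucket only
        for row in right_buckets.get(bucket, []):
            table[row[key]].append(row)
        for lrow in left_buckets.get(bucket, []):
            for rrow in table.get(lrow[key], []):
                result.append({**rrow, **lrow})
    return result

left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 3, "b": "r"}]
for row in grace_hash_join(left, right, "id"):
    print(row)
```

The trade-off is extra partitioning work in exchange for a bounded working set per bucket, which is why the approach suits joins whose right-hand side does not fit in memory.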
#### Experimental Feature
* Support deduplication for asynchronous inserts. Before this change, async inserts did not support deduplication, because multiple small inserts coexisted in one inserted batch. Closes [#38075](https://github.com/ClickHouse/ClickHouse/issues/38075). [#43304](https://github.com/ClickHouse/ClickHouse/pull/43304) ([Han Fei](https://github.com/hanfei1991)).
* Add support for cosine distance for the experimental Annoy (vector similarity search) index. [#42778](https://github.com/ClickHouse/ClickHouse/pull/42778) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
* Add `CREATE / ALTER / DROP NAMED COLLECTION` queries. [#43252](https://github.com/ClickHouse/ClickHouse/pull/43252) ([Kseniia Sumarokova](https://github.com/kssenii)). This feature is under development and the queries are not effective as of version 22.12. This changelog entry is added only to avoid confusion. Restrict default access to named collections to the user defined in config. This requires that `show_named_collections = 1` is set to be able to see them. [#43325](https://github.com/ClickHouse/ClickHouse/pull/43325) ([Kseniia Sumarokova](https://github.com/kssenii)). The `system.named_collections` table is introduced [#43147](https://github.com/ClickHouse/ClickHouse/pull/43147) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### Performance Improvement
* Add settings `max_streams_for_merge_tree_reading` and `allow_asynchronous_read_from_io_pool_for_merge_tree`. Setting `max_streams_for_merge_tree_reading` limits the number of reading streams for MergeTree tables. Setting `allow_asynchronous_read_from_io_pool_for_merge_tree` enables a background I/O pool to read from `MergeTree` tables. This may increase performance for I/O bound queries if used together with `max_streams_to_max_threads_ratio` or `max_streams_for_merge_tree_reading`. [#43260](https://github.com/ClickHouse/ClickHouse/pull/43260) ([Nikolai Kochetov](https://github.com/KochetovNicolai)). This improves performance up to 100 times in case of high latency storage, low number of CPU and high number of data parts.
* Settings `merge_tree_min_rows_for_concurrent_read_for_remote_filesystem/merge_tree_min_bytes_for_concurrent_read_for_remote_filesystem` did not respect adaptive granularity. Fat rows did not decrease the number of read rows (as was done for `merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read`), which could lead to high memory usage when using remote filesystems. [#43965](https://github.com/ClickHouse/ClickHouse/pull/43965) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Optimized the number of list requests to ZooKeeper or ClickHouse Keeper when selecting a part to merge. Previously it could produce thousands of requests in some cases. Fixes [#43647](https://github.com/ClickHouse/ClickHouse/issues/43647). [#43675](https://github.com/ClickHouse/ClickHouse/pull/43675) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Optimization is getting skipped now if `max_size_to_preallocate_for_aggregation` has too small a value. The default value of this setting increased to `10^8`. [#43945](https://github.com/ClickHouse/ClickHouse/pull/43945) ([Nikita Taranov](https://github.com/nickitat)).
* Speed up server shutdown by skipping the cleanup of old data parts, which is unnecessary after https://github.com/ClickHouse/ClickHouse/pull/41145. [#43760](https://github.com/ClickHouse/ClickHouse/pull/43760) ([Sema Checherinda](https://github.com/CheSema)).
* Merging on initiator now uses the same memory bound approach as merging of local aggregation results if `enable_memory_bound_merging_of_aggregation_results` is set. [#40879](https://github.com/ClickHouse/ClickHouse/pull/40879) ([Nikita Taranov](https://github.com/nickitat)).
* Keeper improvement: try syncing logs to disk in parallel with replication. [#43450](https://github.com/ClickHouse/ClickHouse/pull/43450) ([Antonio Andelic](https://github.com/antonio2368)).
* Keeper improvement: requests are batched more often. The batching can be controlled with the new setting `max_requests_quick_batch_size`. [#43686](https://github.com/ClickHouse/ClickHouse/pull/43686) ([Antonio Andelic](https://github.com/antonio2368)).
#### Improvement
* Implement referential dependencies and use them to create tables in the correct order while restoring from a backup. [#43834](https://github.com/ClickHouse/ClickHouse/pull/43834) ([Vitaly Baranov](https://github.com/vitlibar)).
* Substitute UDFs in `CREATE` query to avoid failures during loading at startup. Additionally, UDFs can now be used as `DEFAULT` expressions for columns. [#43539](https://github.com/ClickHouse/ClickHouse/pull/43539) ([Antonio Andelic](https://github.com/antonio2368)).
* Change how the following queries delete parts: TRUNCATE TABLE, ALTER TABLE DROP PART, ALTER TABLE DROP PARTITION. Now, these queries make empty parts which cover the old parts. This makes the TRUNCATE query work without an exclusive lock, which means concurrent reads aren't blocked. Durability is also achieved in all those queries: if the request succeeds, then no resurrected parts appear later. Note that atomicity is achieved only with transaction scope. [#41145](https://github.com/ClickHouse/ClickHouse/pull/41145) ([Sema Checherinda](https://github.com/CheSema)).
* `SET param_x` query no longer requires manual string serialization for the value of the parameter. For example, query `SET param_a = '[\'a\', \'b\']'` can now be written like `SET param_a = ['a', 'b']`. [#41874](https://github.com/ClickHouse/ClickHouse/pull/41874) ([Nikolay Degterinsky](https://github.com/evillique)).
* Show read rows in the progress indication while reading from STDIN from client. Closes [#43423](https://github.com/ClickHouse/ClickHouse/issues/43423). [#43442](https://github.com/ClickHouse/ClickHouse/pull/43442) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Show progress bar while reading from s3 table function / engine. [#43454](https://github.com/ClickHouse/ClickHouse/pull/43454) ([Kseniia Sumarokova](https://github.com/kssenii)).
* `filesystemAvailable` and related functions support one optional argument with disk name, and change `filesystemFree` to `filesystemUnreserved`. Closes [#35076](https://github.com/ClickHouse/ClickHouse/issues/35076). [#42064](https://github.com/ClickHouse/ClickHouse/pull/42064) ([flynn](https://github.com/ucasfl)).
* Integration with LDAP: increased the default value of search_limit to 256, and added LDAP server config option to change that to an arbitrary value. Closes: [#42276](https://github.com/ClickHouse/ClickHouse/issues/42276). [#42461](https://github.com/ClickHouse/ClickHouse/pull/42461) ([Vasily Nemkov](https://github.com/Enmk)).
* Allow the removal of sensitive information (see the `query_masking_rules` in the configuration file) from the exception messages as well. Resolves [#41418](https://github.com/ClickHouse/ClickHouse/issues/41418). [#42940](https://github.com/ClickHouse/ClickHouse/pull/42940) ([filimonov](https://github.com/filimonov)).
* Support queries like `SHOW FULL TABLES ...` for MySQL compatibility. [#43910](https://github.com/ClickHouse/ClickHouse/pull/43910) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
* Keeper improvement: Add 4lw command `rqld` which can manually assign a node as leader. [#43026](https://github.com/ClickHouse/ClickHouse/pull/43026) ([JackyWoo](https://github.com/JackyWoo)).
* Apply connection timeout settings for Distributed async INSERT from the query. [#43156](https://github.com/ClickHouse/ClickHouse/pull/43156) ([Azat Khuzhin](https://github.com/azat)).
* The `unhex` function now supports `FixedString` arguments. [issue42369](https://github.com/ClickHouse/ClickHouse/issues/42369). [#43207](https://github.com/ClickHouse/ClickHouse/pull/43207) ([DR](https://github.com/freedomDR)).
* Priority is given to deleting completely expired parts according to the TTL rules, see [#42869](https://github.com/ClickHouse/ClickHouse/issues/42869). [#43222](https://github.com/ClickHouse/ClickHouse/pull/43222) ([zhongyuankai](https://github.com/zhongyuankai)).
* More precise and reactive CPU load indication in clickhouse-client. [#43307](https://github.com/ClickHouse/ClickHouse/pull/43307) ([Sergei Trifonov](https://github.com/serxa)).
* Support reading of subcolumns of nested types from storage `S3` and table function `s3` with formats `Parquet`, `Arrow` and `ORC`. [#43329](https://github.com/ClickHouse/ClickHouse/pull/43329) ([chen](https://github.com/xiedeyantu)).
* Add `table_uuid` column to the `system.parts` table. [#43404](https://github.com/ClickHouse/ClickHouse/pull/43404) ([Azat Khuzhin](https://github.com/azat)).
* Added client option to display the number of locally processed rows in non-interactive mode (`--print-num-processed-rows`). [#43407](https://github.com/ClickHouse/ClickHouse/pull/43407) ([jh0x](https://github.com/jh0x)).
* Implement `aggregation-in-order` optimization on top of a query plan. It is enabled by default (but works only together with `optimize_aggregation_in_order`, which is disabled by default). Set `query_plan_aggregation_in_order = 0` to use the previous AST-based version. [#43592](https://github.com/ClickHouse/ClickHouse/pull/43592) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Allow to collect profile events with `trace_type = 'ProfileEvent'` to `system.trace_log` on each increment with current stack, profile event name and value of the increment. It can be enabled by the setting `trace_profile_events` and used to investigate performance of queries. [#43639](https://github.com/ClickHouse/ClickHouse/pull/43639) ([Anton Popov](https://github.com/CurtizJ)).
* Add a new setting `input_format_max_binary_string_size` to limit string size in RowBinary format. [#43842](https://github.com/ClickHouse/ClickHouse/pull/43842) ([Kruglov Pavel](https://github.com/Avogar)).
* When ClickHouse requests a remote HTTP server, and it returns an error, the numeric HTTP code was not displayed correctly in the exception message. Closes [#43919](https://github.com/ClickHouse/ClickHouse/issues/43919). [#43920](https://github.com/ClickHouse/ClickHouse/pull/43920) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Correctly report errors in queries even when multiple JOINs optimization is taking place. [#43583](https://github.com/ClickHouse/ClickHouse/pull/43583) ([Salvatore](https://github.com/tbsal)).
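
The masking of exception messages mentioned in the list above (see `query_masking_rules`) can be pictured as regular-expression substitution applied to the message text before it is shown. The following is a minimal sketch under that assumption; the rule shown, the `[HIDDEN]` replacement, and the function name are hypothetical and do not reproduce the server's configuration format or implementation.

```python
import re

# Hypothetical rules for illustration only; real rules are defined in the
# server configuration file under `query_masking_rules`.
MASKING_RULES = [
    (re.compile(r"password\s*=\s*'[^']*'"), "password = '[HIDDEN]'"),
]

def mask_message(message: str) -> str:
    """Apply every masking rule, in order, to a log or exception message."""
    for pattern, replacement in MASKING_RULES:
        message = pattern.sub(replacement, message)
    return message

print(mask_message("Syntax error near: password = 'hunter2'"))
# Syntax error near: password = '[HIDDEN]'
```

Messages that match no rule pass through unchanged, so adding rules only ever hides more text.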
#### Build/Testing/Packaging Improvement
* Systemd integration now correctly notifies systemd that the service is really started and is ready to serve requests. [#43400](https://github.com/ClickHouse/ClickHouse/pull/43400) ([Коренберг Марк](https://github.com/socketpair)).
* Added the option to build ClickHouse with OpenSSL using the [OpenSSL FIPS Module](https://www.openssl.org/docs/man3.0/man7/fips_module.html). This build type has not been tested to validate security and is not supported. [#43991](https://github.com/ClickHouse/ClickHouse/pull/43991) ([Boris Kuschel](https://github.com/bkuschel)).
* Upgrade the `DeflateQpl` compression codec, which was implemented in a previous PR (details: https://github.com/ClickHouse/ClickHouse/pull/39494). This patch improves the codec in the following aspects: 1. Upgrade [Intel® Query Processing Library (QPL)](https://github.com/intel/qpl) from v0.2.0 to v0.3.0. 2. Improve the CMake file to fix build issues with QPL v0.3.0. 3. Link the QPL library with libaccel-config at build time instead of loading it at runtime (dlopen) as with QPL v0.2.0. 4. Fix a log print issue in CompressionCodecDeflateQpl.cpp. [#44024](https://github.com/ClickHouse/ClickHouse/pull/44024) ([jasperzhu](https://github.com/jinjunzh)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Fixed bug which could lead to deadlock while using asynchronous inserts. [#43233](https://github.com/ClickHouse/ClickHouse/pull/43233) ([Anton Popov](https://github.com/CurtizJ)).
* Fix some incorrect logic in AST level optimization `optimize_normalize_count_variants`. [#43873](https://github.com/ClickHouse/ClickHouse/pull/43873) ([Duc Canh Le](https://github.com/canhld94)).
* Fix a case when mutations are not making progress when checksums do not match between replicas (e.g. caused by a change in data format on an upgrade). [#36877](https://github.com/ClickHouse/ClickHouse/pull/36877) ([nvartolomei](https://github.com/nvartolomei)).
* Fix the `skip_unavailable_shards` optimization which did not work with the `hdfsCluster` table function. [#43236](https://github.com/ClickHouse/ClickHouse/pull/43236) ([chen](https://github.com/xiedeyantu)).
* Fix `s3` support for the `?` wildcard. Closes [#42731](https://github.com/ClickHouse/ClickHouse/issues/42731). [#43253](https://github.com/ClickHouse/ClickHouse/pull/43253) ([chen](https://github.com/xiedeyantu)).
* Fix functions `arrayFirstOrNull` and `arrayLastOrNull` when the array contains `Nullable` elements. [#43274](https://github.com/ClickHouse/ClickHouse/pull/43274) ([Duc Canh Le](https://github.com/canhld94)).
* Fix incorrect `UserTimeMicroseconds`/`SystemTimeMicroseconds` accounting related to Kafka tables. [#42791](https://github.com/ClickHouse/ClickHouse/pull/42791) ([Azat Khuzhin](https://github.com/azat)).
* Do not suppress exceptions in `web` disks. Fix retries for the `web` disk. [#42800](https://github.com/ClickHouse/ClickHouse/pull/42800) ([Azat Khuzhin](https://github.com/azat)).
* Fixed (logical) race condition between inserts and dropping materialized views. A race condition happened when a Materialized View was dropped at the same time as an INSERT, where the MVs were present as a dependency of the insert at the beginning of the execution, but the table had been dropped by the time the insert chain tried to access it, producing either an `UNKNOWN_TABLE` or `TABLE_IS_DROPPED` exception, and stopping the insertion. After this change, we avoid these exceptions and just continue with the insert if the dependency is gone. [#43161](https://github.com/ClickHouse/ClickHouse/pull/43161) ([AlfVII](https://github.com/AlfVII)).
* Fix undefined behavior in the `quantiles` function, which might lead to uninitialized memory. Found by fuzzer. This closes [#44066](https://github.com/ClickHouse/ClickHouse/issues/44066). [#44067](https://github.com/ClickHouse/ClickHouse/pull/44067) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Additional check on zero uncompressed size is added to `CompressionCodecDelta`. [#43255](https://github.com/ClickHouse/ClickHouse/pull/43255) ([Nikita Taranov](https://github.com/nickitat)).
* Flatten arrays from Parquet to avoid an issue with inconsistent data in arrays. These incorrect files can be generated by Apache Iceberg. [#43297](https://github.com/ClickHouse/ClickHouse/pull/43297) ([Arthur Passos](https://github.com/arthurpassos)).
* Fix bad cast from `LowCardinality` column when using short circuit function execution. [#43311](https://github.com/ClickHouse/ClickHouse/pull/43311) ([Kruglov Pavel](https://github.com/Avogar)).
* Fixed queries with `SAMPLE BY` with prewhere optimization on tables using `Merge` engine. [#43315](https://github.com/ClickHouse/ClickHouse/pull/43315) ([Antonio Andelic](https://github.com/antonio2368)).
* Check and compare the content of the `format_version` file in `MergeTreeData` so that tables can be loaded even if the storage policy was changed. [#43328](https://github.com/ClickHouse/ClickHouse/pull/43328) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix possible (very unlikely) "No column to rollback" logical error during INSERT into `Buffer` tables. [#43336](https://github.com/ClickHouse/ClickHouse/pull/43336) ([Azat Khuzhin](https://github.com/azat)).
* Fix a bug that allowed the parser to parse an unlimited amount of round brackets into one function if `allow_function_parameters` is set. [#43350](https://github.com/ClickHouse/ClickHouse/pull/43350) ([Nikolay Degterinsky](https://github.com/evillique)).
* `MaterializeMySQL` (experimental feature) now supports DDL `drop table t1, t2` and is compatible with most MySQL DROP DDL. [#43366](https://github.com/ClickHouse/ClickHouse/pull/43366) ([zzsmdfj](https://github.com/zzsmdfj)).
* `session_log` (experimental feature): Fixed the inability to log in (because of failure to create the session_log entry) in a very rare case of messed up setting profiles. [#42641](https://github.com/ClickHouse/ClickHouse/pull/42641) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix possible `Cannot create non-empty column with type Nothing` in functions `if`/`multiIf`. Closes [#43356](https://github.com/ClickHouse/ClickHouse/issues/43356). [#43368](https://github.com/ClickHouse/ClickHouse/pull/43368) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix a bug when a row level filter uses the default value of a column. [#43387](https://github.com/ClickHouse/ClickHouse/pull/43387) ([Alexander Gololobov](https://github.com/davenger)).
* Fix a query with `DISTINCT` + `LIMIT BY` + `LIMIT` that could return fewer rows than expected. Fixes [#43377](https://github.com/ClickHouse/ClickHouse/issues/43377). [#43410](https://github.com/ClickHouse/ClickHouse/pull/43410) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix `sumMap` for `Nullable(Decimal(...))`. [#43414](https://github.com/ClickHouse/ClickHouse/pull/43414) ([Azat Khuzhin](https://github.com/azat)).
* Fix `date_diff` for hour/minute on macOS. Close [#42742](https://github.com/ClickHouse/ClickHouse/issues/42742). [#43466](https://github.com/ClickHouse/ClickHouse/pull/43466) ([zzsmdfj](https://github.com/zzsmdfj)).
* Fix incorrect memory accounting because of merges/mutations. [#43516](https://github.com/ClickHouse/ClickHouse/pull/43516) ([Azat Khuzhin](https://github.com/azat)).
* Fixed primary key analysis with conditions involving `toString(enum)`. [#43596](https://github.com/ClickHouse/ClickHouse/pull/43596) ([Nikita Taranov](https://github.com/nickitat)). This error has been found by @tisonkun.
* Ensure consistency when `clickhouse-copier` updates status and `attach_is_done` in Keeper after partition attach is done. [#43602](https://github.com/ClickHouse/ClickHouse/pull/43602) ([lzydmxy](https://github.com/lzydmxy)).
* During the recovery of a lost replica of a `Replicated` database (experimental feature), there could be a situation where we need to atomically swap two table names (use EXCHANGE). Previously we tried to use two RENAME queries, which obviously failed and, moreover, failed the whole recovery process of the database replica. [#43628](https://github.com/ClickHouse/ClickHouse/pull/43628) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix the case when the `s3Cluster` function throws `NOT_FOUND_COLUMN_IN_BLOCK` error. Closes [#43534](https://github.com/ClickHouse/ClickHouse/issues/43534). [#43629](https://github.com/ClickHouse/ClickHouse/pull/43629) ([chen](https://github.com/xiedeyantu)).
* Fix possible logical error `Array sizes mismatched` while parsing JSON object with arrays with same key names but with different nesting level. Closes [#43569](https://github.com/ClickHouse/ClickHouse/issues/43569). [#43693](https://github.com/ClickHouse/ClickHouse/pull/43693) ([Kruglov Pavel](https://github.com/Avogar)).
* Fixed possible exception in the case of distributed `GROUP BY` with an `ALIAS` column among aggregation keys. [#43709](https://github.com/ClickHouse/ClickHouse/pull/43709) ([Nikita Taranov](https://github.com/nickitat)).
* Fix bug which can lead to broken projections if zero-copy replication (experimental feature) is enabled and used. [#43764](https://github.com/ClickHouse/ClickHouse/pull/43764) ([alesapin](https://github.com/alesapin)).
* Fix using multipart upload for very large S3 objects in AWS S3. [#43824](https://github.com/ClickHouse/ClickHouse/pull/43824) ([ianton-ru](https://github.com/ianton-ru)).
* Fixed `ALTER ... RESET SETTING` with `ON CLUSTER`. It could have been applied to one replica only. Fixes [#43843](https://github.com/ClickHouse/ClickHouse/issues/43843). [#43848](https://github.com/ClickHouse/ClickHouse/pull/43848) ([Elena Torró](https://github.com/elenatorro)).
* Fix a logical error in JOIN with `Join` table engine at right hand side, if `USING` is being used. [#43963](https://github.com/ClickHouse/ClickHouse/pull/43963) ([Vladimir C](https://github.com/vdimir)). Fix a bug with wrong order of keys in `Join` table engine. [#44012](https://github.com/ClickHouse/ClickHouse/pull/44012) ([Vladimir C](https://github.com/vdimir)).
* Keeper fix: throw if the interserver port for Raft is already in use. [#43984](https://github.com/ClickHouse/ClickHouse/pull/43984) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix ORDER BY positional argument (example: `ORDER BY 1, 2`) in case of unneeded columns pruning from subqueries. Closes [#43964](https://github.com/ClickHouse/ClickHouse/issues/43964). [#43987](https://github.com/ClickHouse/ClickHouse/pull/43987) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fixed exception when a subquery contains HAVING but doesn't contain an actual aggregation. [#44051](https://github.com/ClickHouse/ClickHouse/pull/44051) ([Nikita Taranov](https://github.com/nickitat)).
* Fix race in s3 multipart upload. This race could cause the error `Part number must be an integer between 1 and 10000, inclusive. (S3_ERROR)` while restoring from a backup. [#44065](https://github.com/ClickHouse/ClickHouse/pull/44065) ([Vitaly Baranov](https://github.com/vitlibar)).
### <a id="2211"></a> ClickHouse release 22.11, 2022-11-17
#### Backward Incompatible Change
@ -534,30 +650,30 @@
* Add counters (ProfileEvents) for cases when a query complexity limitation has been set and has been reached (a separate counter for `overflow_mode` = `break` and `throw`). For example, if you have set up `max_rows_to_read` with `read_overflow_mode = 'break'`, looking at the value of `OverflowBreak` counter will allow distinguishing incomplete results. [#40205](https://github.com/ClickHouse/ClickHouse/pull/40205) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix memory accounting in case of "Memory limit exceeded" errors (previously, [peak] memory usage took failed allocations into account). [#40249](https://github.com/ClickHouse/ClickHouse/pull/40249) ([Azat Khuzhin](https://github.com/azat)).
* Add metrics for filesystem cache: `FilesystemCacheSize` and `FilesystemCacheElements`. [#40260](https://github.com/ClickHouse/ClickHouse/pull/40260) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support hadoop secure RPC transfer (hadoop.rpc.protection=privacy and hadoop.rpc.protection=integrity). [#39411](https://github.com/ClickHouse/ClickHouse/pull/39411) ([michael1589](https://github.com/michael1589)).
* Support Hadoop secure RPC transfer (hadoop.rpc.protection=privacy and hadoop.rpc.protection=integrity). [#39411](https://github.com/ClickHouse/ClickHouse/pull/39411) ([michael1589](https://github.com/michael1589)).
* Avoid continuously growing memory consumption of pattern cache when using functions multi(Fuzzy)Match(Any|AllIndices|AnyIndex)(). [#40264](https://github.com/ClickHouse/ClickHouse/pull/40264) ([Robert Schulze](https://github.com/rschu1ze)).
* Add cache for schema inference for file/s3/hdfs/url table functions. Now, schema inference will be performed only on the first query to the file, all subsequent queries to the same file will use the schema from cache if data wasn't changed. Add system table system.schema_inference_cache with all current schemas in cache and system queries SYSTEM DROP SCHEMA CACHE [FOR FILE/S3/HDFS/URL] to drop schemas from cache. [#38286](https://github.com/ClickHouse/ClickHouse/pull/38286) ([Kruglov Pavel](https://github.com/Avogar)).
* Add cache for schema inference for file/s3/hdfs/url table functions. Now, schema inference will be performed only on the first query to the file, all subsequent queries to the same file will use the schema from the cache if data has not changed. Add system table system.schema_inference_cache with all current schemas in cache and system queries SYSTEM DROP SCHEMA CACHE [FOR FILE/S3/HDFS/URL] to drop schemas from cache. [#38286](https://github.com/ClickHouse/ClickHouse/pull/38286) ([Kruglov Pavel](https://github.com/Avogar)).
* Add support for LARGE_BINARY/LARGE_STRING with Arrow (Closes [#32401](https://github.com/ClickHouse/ClickHouse/issues/32401)). [#40293](https://github.com/ClickHouse/ClickHouse/pull/40293) ([Josh Taylor](https://github.com/joshuataylor)).
#### Build/Testing/Packaging Improvement
* [ClickFiddle](https://fiddle.clickhouse.com/): A new tool for testing ClickHouse versions in read/write mode (**Igor Baliuk**).
* ClickHouse binary is made self-extracting [#35775](https://github.com/ClickHouse/ClickHouse/pull/35775) ([Yakov Olkhovskiy, Arthur Filatenkov](https://github.com/yakov-olkhovskiy)).
* Update tzdata to 2022b to support the new timezone changes. See https://github.com/google/cctz/pull/226. Chile's 2022 DST start is delayed from September 4 to September 11. Iran plans to stop observing DST permanently, after it falls back on 2022-09-21. There are corrections of the historical time zone of Asia/Tehran in the year 1977: Iran adopted standard time in 1935, not 1946. In 1977 it observed DST from 03-21 23:00 to 10-20 24:00; its 1978 transitions were on 03-24 and 08-05, not 03-20 and 10-20; and its spring 1979 transition was on 05-27, not 03-21 (https://data.iana.org/time-zones/tzdb/NEWS). ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Former packages used to install systemd.service file to `/etc`. The files there are marked as `conf` and are not cleaned out, and not updated automatically. This PR cleans them out. [#39323](https://github.com/ClickHouse/ClickHouse/pull/39323) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update `tzdata` to 2022b to support the new timezone changes. See https://github.com/google/cctz/pull/226. Chile's 2022 DST start is delayed from September 4 to September 11. Iran plans to stop observing DST permanently after it falls back on 2022-09-21. There are corrections to the historical time zone of Asia/Tehran in the year 1977: Iran adopted standard time in 1935, not 1946. In 1977 it observed DST from 03-21 23:00 to 10-20 24:00; its 1978 transitions were on 03-24 and 08-05, not 03-20 and 10-20; and its spring 1979 transition was on 05-27, not 03-21 (https://data.iana.org/time-zones/tzdb/NEWS). ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Former packages used to install systemd.service file to `/etc`. The files there are marked as `conf` and are not cleaned out, and are not updated automatically. This PR cleans them out. [#39323](https://github.com/ClickHouse/ClickHouse/pull/39323) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Ensure LSan is effective. [#39430](https://github.com/ClickHouse/ClickHouse/pull/39430) ([Azat Khuzhin](https://github.com/azat)).
* TSAN has issues with clang-14 (https://github.com/google/sanitizers/issues/1552, https://github.com/google/sanitizers/issues/1540), so here we build the TSAN binaries with clang-15. [#39450](https://github.com/ClickHouse/ClickHouse/pull/39450) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Remove the option to build ClickHouse tools as separate executable programs. This fixes [#37847](https://github.com/ClickHouse/ClickHouse/issues/37847). [#39520](https://github.com/ClickHouse/ClickHouse/pull/39520) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Small preparations for build on s390x (which is big-endian). [#39627](https://github.com/ClickHouse/ClickHouse/pull/39627) ([Harry Lee](https://github.com/HarryLeeIBM)). [#39656](https://github.com/ClickHouse/ClickHouse/pull/39656) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed Endian issue in BitHelpers for s390x. [#39656](https://github.com/ClickHouse/ClickHouse/pull/39656) ([Harry Lee](https://github.com/HarryLeeIBM)). Implement a piece of code related to SipHash for s390x architecture (which is not supported by ClickHouse). [#39732](https://github.com/ClickHouse/ClickHouse/pull/39732) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed an Endian issue in Coordination snapshot code for s390x architecture (which is not supported by ClickHouse). [#39931](https://github.com/ClickHouse/ClickHouse/pull/39931) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed Endian issues in Codec code for s390x architecture (which is not supported by ClickHouse). [#40008](https://github.com/ClickHouse/ClickHouse/pull/40008) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed Endian issues in reading/writing BigEndian binary data in ReadHelpers and WriteHelpers code for s390x architecture (which is not supported by ClickHouse). [#40179](https://github.com/ClickHouse/ClickHouse/pull/40179) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Small preparations for build on s390x (which is big-endian). [#39627](https://github.com/ClickHouse/ClickHouse/pull/39627) ([Harry Lee](https://github.com/HarryLeeIBM)). [#39656](https://github.com/ClickHouse/ClickHouse/pull/39656) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed Endian issue in BitHelpers for s390x. [#39656](https://github.com/ClickHouse/ClickHouse/pull/39656) ([Harry Lee](https://github.com/HarryLeeIBM)). Implement a piece of code related to SipHash for s390x architecture (which is not supported by ClickHouse). [#39732](https://github.com/ClickHouse/ClickHouse/pull/39732) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed an Endian issue in the Coordination snapshot code for s390x architecture (which is not supported by ClickHouse). [#39931](https://github.com/ClickHouse/ClickHouse/pull/39931) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed Endian issues in Codec code for s390x architecture (which is not supported by ClickHouse). [#40008](https://github.com/ClickHouse/ClickHouse/pull/40008) ([Harry Lee](https://github.com/HarryLeeIBM)). Fixed Endian issues in reading/writing BigEndian binary data in ReadHelpers and WriteHelpers code for s390x architecture (which is not supported by ClickHouse). [#40179](https://github.com/ClickHouse/ClickHouse/pull/40179) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Support build with `clang-16` (trunk). This closes [#39949](https://github.com/ClickHouse/ClickHouse/issues/39949). [#40181](https://github.com/ClickHouse/ClickHouse/pull/40181) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Prepare RISC-V 64 build to run in CI. This is for [#40141](https://github.com/ClickHouse/ClickHouse/issues/40141). [#40197](https://github.com/ClickHouse/ClickHouse/pull/40197) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Simplified function registration macro interface (`FUNCTION_REGISTER*`) to eliminate the step to add and call an extern function in the registerFunctions.cpp, it also makes incremental builds of a new function faster. [#38615](https://github.com/ClickHouse/ClickHouse/pull/38615) ([Li Yin](https://github.com/liyinsg)).
* Docker: Now entrypoint.sh in docker image creates and executes chown for all folders it found in config for multidisk setup [#17717](https://github.com/ClickHouse/ClickHouse/issues/17717). [#39121](https://github.com/ClickHouse/ClickHouse/pull/39121) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Docker: Now entrypoint.sh in docker image creates and executes chown for all folders it finds in the config for multidisk setup [#17717](https://github.com/ClickHouse/ClickHouse/issues/17717). [#39121](https://github.com/ClickHouse/ClickHouse/pull/39121) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
#### Bug Fix
* Fix possible segfault in `CapnProto` input format. This bug was found and send through ClickHouse bug-bounty [program](https://github.com/ClickHouse/ClickHouse/issues/38986) by *kiojj*. [#40241](https://github.com/ClickHouse/ClickHouse/pull/40241) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix a very rare case of incorrect behavior of array subscript operator. This closes [#28720](https://github.com/ClickHouse/ClickHouse/issues/28720). [#40185](https://github.com/ClickHouse/ClickHouse/pull/40185) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix possible segfault in `CapnProto` input format. This bug was found and sent in through the ClickHouse bug-bounty [program](https://github.com/ClickHouse/ClickHouse/issues/38986) by *kiojj*. [#40241](https://github.com/ClickHouse/ClickHouse/pull/40241) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix a very rare case of incorrect behavior of the array subscript operator. This closes [#28720](https://github.com/ClickHouse/ClickHouse/issues/28720). [#40185](https://github.com/ClickHouse/ClickHouse/pull/40185) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix insufficient argument check for encryption functions (found by query fuzzer). This closes [#39987](https://github.com/ClickHouse/ClickHouse/issues/39987). [#40194](https://github.com/ClickHouse/ClickHouse/pull/40194) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix the case when the order of columns can be incorrect if the `IN` operator is used with a table with `ENGINE = Set` containing multiple columns. This fixes [#13014](https://github.com/ClickHouse/ClickHouse/issues/13014). [#40225](https://github.com/ClickHouse/ClickHouse/pull/40225) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix seeking while reading from encrypted disk. This PR fixes [#38381](https://github.com/ClickHouse/ClickHouse/issues/38381). [#39687](https://github.com/ClickHouse/ClickHouse/pull/39687) ([Vitaly Baranov](https://github.com/vitlibar)).


@ -16,6 +16,6 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Contacts](https://clickhouse.com/company/contact) can help to get your questions answered if there are any.
## Upcoming events
* [**v22.12 Release Webinar**](https://clickhouse.com/company/events/v22-12-release-webinar) Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release, provide live demos, and share vision into what is coming in the roadmap.
* [**v22.12 Release Webinar**](https://clickhouse.com/company/events/v22-12-release-webinar) 22.12 is the ClickHouse Christmas release. There are plenty of gifts (a new JOIN algorithm among them) and we adopted something from MongoDB. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
* [**ClickHouse Meetup at the CHEQ office in Tel Aviv**](https://www.meetup.com/clickhouse-tel-aviv-user-group/events/289599423/) - Jan 16 - We are very excited to be holding our next in-person ClickHouse meetup at the CHEQ office in Tel Aviv! Hear from CHEQ, ServiceNow and Contentsquare, as well as a deep dive presentation from ClickHouse CTO Alexey Milovidov. Join us for a fun evening of talks, food and discussion!
* **ClickHouse Meetup in Seattle* - Keep an eye on this space as we will be announcing a January meetup in Seattle soon!
* [**ClickHouse Meetup at Microsoft Office in Seattle**](https://www.meetup.com/clickhouse-seattle-user-group/events/290310025/) - Jan 18 - Keep an eye on this space as we will be announcing speakers soon!


@ -13,9 +13,10 @@ The following versions of ClickHouse server are currently being supported with s
| Version | Supported |
|:-|:-|
| 22.12 | ✔️ |
| 22.11 | ✔️ |
| 22.10 | ✔️ |
| 22.9 | ✔️ |
| 22.9 | |
| 22.8 | ✔️ |
| 22.7 | ❌ |
| 22.6 | ❌ |


@ -33,7 +33,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="22.11.2.30"
ARG VERSION="22.12.1.1752"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# user/group precreated explicitly with fixed uid/gid on purpose.


@ -21,7 +21,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="22.11.2.30"
ARG VERSION="22.12.1.1752"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image


@ -127,23 +127,24 @@ EOL
function stop()
{
local max_tries=""
if [ -n "$1" ]
then
max_tries="--max-tries $1"
fi
local pid
# Preserve the pid, since the server can hung after the PID will be deleted.
pid="$(cat /var/run/clickhouse-server/clickhouse-server.pid)"
clickhouse stop $max_tries --do-not-kill && return
if [ -n "$1" ]
then
# temporarily disable it in BC check
clickhouse stop --force
return
fi
# We failed to stop the server with SIGTERM. Maybe it hang, let's collect stacktraces.
kill -TERM "$(pidof gdb)" ||:
sleep 5
echo "thread apply all backtrace (on stop)" >> /test_output/gdb.log
gdb -batch -ex 'thread apply all backtrace' -p "$pid" | ts '%Y-%m-%d %H:%M:%S' >> /test_output/gdb.log
timeout 30m gdb -batch -ex 'thread apply all backtrace' -p "$pid" | ts '%Y-%m-%d %H:%M:%S' >> /test_output/gdb.log
clickhouse stop --force
}
@ -431,7 +432,7 @@ else
clickhouse-client --query="SELECT 'Tables count:', count() FROM system.tables"
stop 180
stop 1
mv /var/log/clickhouse-server/clickhouse-server.log /var/log/clickhouse-server/clickhouse-server.backward.stress.log
# Start new server


@ -0,0 +1,320 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.12.1.1752-stable (688e488e930) FIXME as compared to v22.11.1.1360-stable (0d211ed1984)
#### Backward Incompatible Change
* Fixed backward incompatibility in (de)serialization of states of `min`, `max`, `any*`, `argMin`, `argMax` aggregate functions with `String` argument. The incompatibility was introduced in https://github.com/ClickHouse/ClickHouse/pull/41431 and affects 22.9, 22.10 and 22.11 branches (fixed since 22.9.6, 22.10.4 and 22.11.2 correspondingly). Some minor releases of 22.3, 22.7 and 22.8 branches are also affected: 22.3.13...22.3.14 (fixed since 22.3.15), 22.8.6...22.8.9 (fixed since 22.8.10), 22.7.6 and newer (will not be fixed in 22.7, we recommend upgrading from 22.7.* to 22.8.10 or newer). This release note does not concern users that have never used affected versions. Incompatible versions append an extra `'\0'` to strings when reading states of the aggregate functions mentioned above. For example, if an older version saved state of `anyState('foobar')` to `state_column` then the incompatible version will print `'foobar\0'` on `anyMerge(state_column)`. Also, incompatible versions write states of the aggregate functions without a trailing `'\0'`. Newer versions (that have the fix) can correctly read data written by all versions including incompatible versions, except one corner case. If an incompatible version saved a state with a string that actually ends with a null character, then a newer version will trim the trailing `'\0'` when reading the state of the affected aggregate function. For example, if an incompatible version saved state of `anyState('abrac\0dabra\0')` to `state_column` then newer versions will print `'abrac\0dabra'` on `anyMerge(state_column)`. The issue also affects distributed queries when an incompatible version works in a cluster together with older or newer versions. [#43038](https://github.com/ClickHouse/ClickHouse/pull/43038) ([Raúl Marín](https://github.com/Algunenano)).
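
The reader/writer behaviour in the note above can be illustrated with a small toy model. This is only a sketch under stated assumptions: states are modelled as raw Python `bytes`, the function names (`write_older`, `write_incompatible`, `read_incompatible`, `read_fixed`) are hypothetical, and the real serialization format (length prefixes and so on) is not reproduced.

```python
# Toy model of the string-state behaviour described in the release note.
# Assumption: only the trailing NUL handling is modelled, nothing else.

def write_older(value: bytes) -> bytes:
    """Older versions: the stored string carries a trailing NUL."""
    return value + b"\0"


def write_incompatible(value: bytes) -> bytes:
    """Incompatible versions: the stored string has no trailing NUL."""
    return value


def read_incompatible(stored: bytes) -> bytes:
    """Incompatible versions: stored bytes are returned verbatim."""
    return stored


def read_fixed(stored: bytes) -> bytes:
    """Fixed versions: a single trailing NUL is trimmed if present."""
    return stored[:-1] if stored.endswith(b"\0") else stored


if __name__ == "__main__":
    # Older writer + incompatible reader: the extra NUL becomes visible.
    print(read_incompatible(write_older(b"foobar")))
    # Incompatible writer + fixed reader: a genuine trailing NUL is lost.
    print(read_fixed(write_incompatible(b"abrac\0dabra\0")))
```

The second call reproduces the corner case from the note: the fixed reader cannot tell a genuine trailing null character from the legacy terminator, so it trims it.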
#### New Feature
* Add "grace_hash" join_algorithm. [#38191](https://github.com/ClickHouse/ClickHouse/pull/38191) ([BigRedEye](https://github.com/BigRedEye)).
* Merging on initiator now uses the same memory bound approach as merging of local aggregation results if `enable_memory_bound_merging_of_aggregation_results` is set. [#40879](https://github.com/ClickHouse/ClickHouse/pull/40879) ([Nikita Taranov](https://github.com/nickitat)).
* Add BSONEachRow input/output format. In this format, ClickHouse formats/parses each row as a separate BSON document and each column is formatted/parsed as a single BSON field with the column name as a key. [#42033](https://github.com/ClickHouse/ClickHouse/pull/42033) ([mark-polokhov](https://github.com/mark-polokhov)).
* close: [#37631](https://github.com/ClickHouse/ClickHouse/issues/37631). [#42265](https://github.com/ClickHouse/ClickHouse/pull/42265) ([刘陶峰](https://github.com/taofengliu)).
* Added `multiplyDecimal` and `divideDecimal` functions for decimal operations with fixed precision. [#42438](https://github.com/ClickHouse/ClickHouse/pull/42438) ([Andrey Zvonov](https://github.com/zvonand)).
* Added `system.moves` table with a list of currently moving parts. [#42660](https://github.com/ClickHouse/ClickHouse/pull/42660) ([Sergei Trifonov](https://github.com/serxa)).
* Keeper feature: add support for embedded Prometheus endpoint. [#43087](https://github.com/ClickHouse/ClickHouse/pull/43087) ([Antonio Andelic](https://github.com/antonio2368)).
* Added `age` function to calculate the difference between two dates or dates with time values, expressed as the number of full units. Closes [#41115](https://github.com/ClickHouse/ClickHouse/issues/41115). [#43123](https://github.com/ClickHouse/ClickHouse/pull/43123) ([Roman Vasin](https://github.com/rvasin)).
* Add settings `max_streams_for_merge_tree_reading` and `allow_asynchronous_read_from_io_pool_for_merge_tree`. Setting `max_streams_for_merge_tree_reading` limits the number of reading streams for MergeTree tables. Setting `allow_asynchronous_read_from_io_pool_for_merge_tree` enables background I/O pool to read from `MergeTree` tables. This may increase performance for I/O bound queries if used together with `max_streams_to_max_threads_ratio` or `max_streams_for_merge_tree_reading`. [#43260](https://github.com/ClickHouse/ClickHouse/pull/43260) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add the expression of the index to the `data_skipping_indices` system table. [#43308](https://github.com/ClickHouse/ClickHouse/pull/43308) ([Guillaume Tassery](https://github.com/YiuRULE)).
* Added new hash function [xxh3](https://github.com/Cyan4973/xxHash). Also, the performance of `xxHash32` and `xxHash64` is improved on ARM thanks to a library update. [#43411](https://github.com/ClickHouse/ClickHouse/pull/43411) ([Nikita Taranov](https://github.com/nickitat)).
* Temporary data (for external sorting, aggregation, and JOINs) can share storage with the filesystem cache for remote disks and evict it. Closes [#42158](https://github.com/ClickHouse/ClickHouse/issues/42158). [#43457](https://github.com/ClickHouse/ClickHouse/pull/43457) ([Vladimir C](https://github.com/vdimir)).
* Add column `engine_full` to system table `databases` so that users can access the whole engine definition of a database via system tables. [#43468](https://github.com/ClickHouse/ClickHouse/pull/43468) ([凌涛](https://github.com/lingtaolf)).
* Add password complexity rules and checks for creating a new user. [#43719](https://github.com/ClickHouse/ClickHouse/pull/43719) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add function `concatWithSeparator`, like `concat_ws` in Spark. [#43749](https://github.com/ClickHouse/ClickHouse/pull/43749) ([李扬](https://github.com/taiyang-li)).
* Added constraints for merge tree settings. [#43903](https://github.com/ClickHouse/ClickHouse/pull/43903) ([Sergei Trifonov](https://github.com/serxa)).
* Support numeric literals with `_` as a separator. [#43925](https://github.com/ClickHouse/ClickHouse/pull/43925) ([jh0x](https://github.com/jh0x)).
* Add a new setting `input_format_json_read_objects_as_strings` that allows parsing nested JSON objects into Strings in all JSON input formats. This setting is disabled by default. [#44052](https://github.com/ClickHouse/ClickHouse/pull/44052) ([Kruglov Pavel](https://github.com/Avogar)).
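
As an illustration of what "number of full units" means for the `age` entry above, here is a minimal sketch. It is not the ClickHouse implementation: the helper name `full_units_between` is hypothetical, only fixed-length units are covered (calendar units such as months and years need calendar arithmetic and are left out), and the actual signature of `age` is not given in this changelog.

```python
from datetime import datetime

# Fixed-length units only; an assumption made for this sketch.
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}


def full_units_between(unit: str, start: datetime, end: datetime) -> int:
    """Count complete units elapsed between start and end.

    Partial units are discarded (truncation toward zero), so 1 day 23 h
    counts as 1 full day. Sub-second precision is ignored.
    """
    delta = end - start
    whole_seconds = delta.days * 86400 + delta.seconds
    sign = -1 if whole_seconds < 0 else 1
    return sign * (abs(whole_seconds) // _UNIT_SECONDS[unit])


if __name__ == "__main__":
    a = datetime(2022, 12, 1, 12, 0)
    b = datetime(2022, 12, 3, 11, 59)
    print(full_units_between("day", a, b))
    print(full_units_between("hour", a, b))
```

The point of the sketch is the truncation: an interval that is one minute short of two days still yields one full day.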
#### Performance Improvement
* The optimization is now skipped if `max_size_to_preallocate_for_aggregation` has too small a value. The default value of this setting is increased to `10^8`. [#43945](https://github.com/ClickHouse/ClickHouse/pull/43945) ([Nikita Taranov](https://github.com/nickitat)).
#### Improvement
* Support numeric literals with underscores. Closes [#28967](https://github.com/ClickHouse/ClickHouse/issues/28967). [#39129](https://github.com/ClickHouse/ClickHouse/pull/39129) ([unbyte](https://github.com/unbyte)).
* Add `FROM table SELECT column` syntax. [#41095](https://github.com/ClickHouse/ClickHouse/pull/41095) ([Nikolay Degterinsky](https://github.com/evillique)).
* This PR changes how the following queries delete parts: `TRUNCATE TABLE`, `ALTER TABLE DROP PART`, `ALTER TABLE DROP PARTITION`. Now these queries create empty parts that cover the old parts. This makes the `TRUNCATE` query work without an exclusive lock, which means concurrent reads are not blocked. Durability is also achieved in all those queries: if the request succeeds, no resurrected parts appear later. Note that atomicity is achieved only within transaction scope. [#41145](https://github.com/ClickHouse/ClickHouse/pull/41145) ([Sema Checherinda](https://github.com/CheSema)).
* `SET param_x` query no longer requires manual string serialization for the value of the parameter. For example, query `SET param_a = '[\'a\', \'b\']'` can now be written like `SET param_a = ['a', 'b']`. [#41874](https://github.com/ClickHouse/ClickHouse/pull/41874) ([Nikolay Degterinsky](https://github.com/evillique)).
* `filesystemAvailable` and related functions now support one optional argument with the disk name, and `filesystemFree` is changed to `filesystemUnreserved`. Closes [#35076](https://github.com/ClickHouse/ClickHouse/issues/35076). [#42064](https://github.com/ClickHouse/ClickHouse/pull/42064) ([flynn](https://github.com/ucasfl)).
* Increased the default value of `search_limit` to 256, and added an LDAP server config option to change that to an arbitrary value. Closes: [#42276](https://github.com/ClickHouse/ClickHouse/issues/42276). [#42461](https://github.com/ClickHouse/ClickHouse/pull/42461) ([Vasily Nemkov](https://github.com/Enmk)).
* Add cosine distance for annoy. [#42778](https://github.com/ClickHouse/ClickHouse/pull/42778) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
* Allow removing sensitive information from exception messages as well. Resolves [#41418](https://github.com/ClickHouse/ClickHouse/issues/41418). [#42940](https://github.com/ClickHouse/ClickHouse/pull/42940) ([filimonov](https://github.com/filimonov)).
* Keeper improvement: Add 4lw command `rqld` which can manually assign a node as leader. [#43026](https://github.com/ClickHouse/ClickHouse/pull/43026) ([JackyWoo](https://github.com/JackyWoo)).
* Apply connection timeouts settings for Distributed async INSERT from the query. [#43156](https://github.com/ClickHouse/ClickHouse/pull/43156) ([Azat Khuzhin](https://github.com/azat)).
* The `unhex` function now supports `FixedString` arguments. Issue: [#42369](https://github.com/ClickHouse/ClickHouse/issues/42369). [#43207](https://github.com/ClickHouse/ClickHouse/pull/43207) ([DR](https://github.com/freedomDR)).
* Priority is given to deleting completely expired parts. Related: [#42869](https://github.com/ClickHouse/ClickHouse/issues/42869). [#43222](https://github.com/ClickHouse/ClickHouse/pull/43222) ([zhongyuankai](https://github.com/zhongyuankai)).
* Follow-up to https://github.com/ClickHouse/ClickHouse/pull/42484. Mask sensitive information in logs better; mask secret parts in the output of queries `SHOW CREATE TABLE` and `SELECT FROM system.tables`. Also resolves [#41418](https://github.com/ClickHouse/ClickHouse/issues/41418). [#43227](https://github.com/ClickHouse/ClickHouse/pull/43227) ([Vitaly Baranov](https://github.com/vitlibar)).
* Enable compression of marks and the primary key. [#43288](https://github.com/ClickHouse/ClickHouse/pull/43288) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Resolve issue [#38075](https://github.com/ClickHouse/ClickHouse/issues/38075). Right now async insert doesn't support deduplication, because multiple small inserts coexist in one part, which corresponds to multiple `block_id`s. The solution is straightforward: 1. mark offsets for every insert in every chunk; 2. calculate multiple `block_id`s when the sinker receives a chunk; 3. get the block number lock by these `block_id`s; 3.1. if that fails, remove the duplicate insert(s) and duplicate `block_id`(s) from the block and recalculate `offsets` again; 3.2. if it succeeds, commit the `block_id`s and other items into Keeper: a. if that fails, do 3.1; b. if it succeeds, everything succeeds. [#43304](https://github.com/ClickHouse/ClickHouse/pull/43304) ([Han Fei](https://github.com/hanfei1991)).
* More precise and reactive CPU load indication on client. [#43307](https://github.com/ClickHouse/ClickHouse/pull/43307) ([Sergei Trifonov](https://github.com/serxa)).
* Restrict default access to named collections for users defined in the config. A user must have explicit `show_named_collections=1` to be able to see them. [#43325](https://github.com/ClickHouse/ClickHouse/pull/43325) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support reading of subcolumns of nested types from storage `S3` and table function `s3` with formats `Parquet`, `Arrow` and `ORC`. [#43329](https://github.com/ClickHouse/ClickHouse/pull/43329) ([chen](https://github.com/xiedeyantu)).
* Systemd integration now correctly notifies systemd that the service has really started and is ready to serve requests. [#43400](https://github.com/ClickHouse/ClickHouse/pull/43400) ([Коренберг Марк](https://github.com/socketpair)).
* Add table_uuid to system.parts. [#43404](https://github.com/ClickHouse/ClickHouse/pull/43404) ([Azat Khuzhin](https://github.com/azat)).
* Added client option to display the number of locally processed rows in non-interactive mode (--print-num-processed-rows). [#43407](https://github.com/ClickHouse/ClickHouse/pull/43407) ([jh0x](https://github.com/jh0x)).
* Show read rows while reading from stdin from client. Closes [#43423](https://github.com/ClickHouse/ClickHouse/issues/43423). [#43442](https://github.com/ClickHouse/ClickHouse/pull/43442) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Keeper improvement: try syncing logs to disk in parallel with replication. [#43450](https://github.com/ClickHouse/ClickHouse/pull/43450) ([Antonio Andelic](https://github.com/antonio2368)).
* Show progress bar while reading from s3 table function / engine. [#43454](https://github.com/ClickHouse/ClickHouse/pull/43454) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Progress bar will show both read and written rows. [#43496](https://github.com/ClickHouse/ClickHouse/pull/43496) ([Ilya Yatsishin](https://github.com/qoega)).
* Implement `aggregation-in-order` optimization on top of query plan. It is enabled by default (but works only together with `optimize_aggregation_in_order`, which is disabled by default). Set `query_plan_aggregation_in_order = 0` to use previous AST-based version. [#43592](https://github.com/ClickHouse/ClickHouse/pull/43592) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Allow sending profile events with `trace_type = 'ProfileEvent'` to `system.trace_log` on each increment, with the current stack, profile event name and value of the increment. It can be enabled by the setting `trace_profile_events` and used to debug the performance of queries. [#43639](https://github.com/ClickHouse/ClickHouse/pull/43639) ([Anton Popov](https://github.com/CurtizJ)).
* Keeper improvement: requests are batched more often. The batching can be controlled with the new setting `max_requests_quick_batch_size`. [#43686](https://github.com/ClickHouse/ClickHouse/pull/43686) ([Antonio Andelic](https://github.com/antonio2368)).
* Added the possibility to use an array as the second parameter of the `cutURLParameter` function. Closes [#6827](https://github.com/ClickHouse/ClickHouse/issues/6827). [#43788](https://github.com/ClickHouse/ClickHouse/pull/43788) ([Roman Vasin](https://github.com/rvasin)).
* Implement referential dependencies and use them to create tables in the correct order while restoring from a backup. [#43834](https://github.com/ClickHouse/ClickHouse/pull/43834) ([Vitaly Baranov](https://github.com/vitlibar)).
* Add a new setting `input_format_max_binary_string_size` to limit string size in RowBinary format. [#43842](https://github.com/ClickHouse/ClickHouse/pull/43842) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix some incorrect logic related to AST-level optimization. [#43873](https://github.com/ClickHouse/ClickHouse/pull/43873) ([Duc Canh Le](https://github.com/canhld94)).
* Support queries like `SHOW FULL TABLES ...`. [#43910](https://github.com/ClickHouse/ClickHouse/pull/43910) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
* When ClickHouse requests a remote HTTP server, and it returns an error, the numeric HTTP code was not displayed correctly in the exception message. Closes [#43919](https://github.com/ClickHouse/ClickHouse/issues/43919). [#43920](https://github.com/ClickHouse/ClickHouse/pull/43920) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Settings `merge_tree_min_rows_for_concurrent_read_for_remote_filesystem/merge_tree_min_bytes_for_concurrent_read_for_remote_filesystem` did not respect adaptive granularity. Fat rows did not decrease the number of read rows (as is done for `merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read`), which could lead to high memory usage. [#43965](https://github.com/ClickHouse/ClickHouse/pull/43965) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Support `optimize_if_transform_strings_to_enum` in new analyzer. [#43999](https://github.com/ClickHouse/ClickHouse/pull/43999) ([Antonio Andelic](https://github.com/antonio2368)).
* Upgrade the new "DeflateQpl" compression codec, which was implemented in a previous PR (details: https://github.com/ClickHouse/ClickHouse/pull/39494). This patch improves the codec in the following aspects: 1. Upgrade QPL v0.2.0 to QPL v0.3.0 ([Intel® Query Processing Library (QPL)](https://github.com/intel/qpl)). 2. Improve the CMake file to fix QPL build issues for QPL v0.3.0. 3. Link the QPL library with libaccel-config at build time instead of runtime loading (dlopen) as in QPL v0.2.0. 4. Fix a log print issue in CompressionCodecDeflateQpl.cpp. [#44024](https://github.com/ClickHouse/ClickHouse/pull/44024) ([jasperzhu](https://github.com/jinjunzh)).
* Follow-up to https://github.com/ClickHouse/ClickHouse/pull/43834. Fix review issues; dependencies from the `Distributed` table engine and from the `cluster()` function are also considered now, as well as dependencies of a dictionary defined without host & port specified. [#44158](https://github.com/ClickHouse/ClickHouse/pull/44158) ([Vitaly Baranov](https://github.com/vitlibar)).
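
The deduplication steps listed for asynchronous inserts in [#43304](https://github.com/ClickHouse/ClickHouse/pull/43304) above can be sketched roughly as follows. This is a simplified model, not the actual implementation: `FakeKeeper`, `block_id`, `split_by_offsets` and `write_chunk` are hypothetical names, the hash is an arbitrary choice, the "get block number lock" and "commit" steps are collapsed into a single `try_commit` call, and duplicates inside one chunk are not handled.

```python
import hashlib


def block_id(rows):
    """Hypothetical block id: a hash over the rows of one small insert."""
    return hashlib.sha256(repr(rows).encode()).hexdigest()


class FakeKeeper:
    """In-memory stand-in for the coordination service (an assumption)."""

    def __init__(self):
        self.committed = set()

    def try_commit(self, ids):
        """Commit all ids atomically, or return the ones that already exist."""
        conflicts = [i for i in ids if i in self.committed]
        if conflicts:
            return conflicts
        self.committed.update(ids)
        return []


def split_by_offsets(rows, offsets):
    """Step 1: offsets mark where each small insert ends inside the chunk."""
    parts, start = [], 0
    for end in offsets:
        parts.append(rows[start:end])
        start = end
    return parts


def write_chunk(keeper, rows, offsets):
    """Steps 2-3: compute ids, drop duplicate inserts on conflict, retry."""
    inserts = split_by_offsets(rows, offsets)
    while inserts:
        ids = [block_id(part) for part in inserts]
        conflicts = set(keeper.try_commit(ids))
        if not conflicts:
            # Everything committed: these rows end up in the new part.
            return [row for part in inserts for row in part]
        # Step 3.1: remove duplicate inserts; offsets are implied by `inserts`.
        inserts = [p for p, i in zip(inserts, ids) if i not in conflicts]
    return []


if __name__ == "__main__":
    keeper = FakeKeeper()
    print(write_chunk(keeper, ["a", "b", "c"], [2, 3]))
    print(write_chunk(keeper, ["c", "d"], [1, 2]))
```

In the second call the insert containing only `"c"` repeats an earlier insert, so it is dropped and only `"d"` is written.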
#### Bug Fix
* Fix mutations not making progress when checksums do not match between replicas (e.g. caused by a change in data format on an upgrade). [#36877](https://github.com/ClickHouse/ClickHouse/pull/36877) ([nvartolomei](https://github.com/nvartolomei)).
* Fix `skip_unavailable_shards` not working with the `hdfsCluster` table function. [#43236](https://github.com/ClickHouse/ClickHouse/pull/43236) ([chen](https://github.com/xiedeyantu)).
* Fix support for the question mark wildcard in `s3`. Closes [#42731](https://github.com/ClickHouse/ClickHouse/issues/42731). [#43253](https://github.com/ClickHouse/ClickHouse/pull/43253) ([chen](https://github.com/xiedeyantu)).
* Fix functions `arrayFirstOrNull` and `arrayLastOrNull` when the array is `Nullable`. [#43274](https://github.com/ClickHouse/ClickHouse/pull/43274) ([Duc Canh Le](https://github.com/canhld94)).
* A new ZooKeeper path called "async_blocks" was created for replicated tables in [#43304](https://github.com/ClickHouse/ClickHouse/issues/43304). However, for tables created in older versions, this path does not exist, which causes an error during partition operations. This PR creates the node when initializing the replicated tree. It also adds a flag `async_insert_deduplicate`, `false` by default, to control whether to use this function. As mentioned in [#38075](https://github.com/ClickHouse/ClickHouse/issues/38075), this function is not yet fully finished, so it is turned off by default. [#44223](https://github.com/ClickHouse/ClickHouse/pull/44223) ([Han Fei](https://github.com/hanfei1991)).
#### Build/Testing/Packaging Improvement
* Add support for FreeBSD/powerpc64le. [#40422](https://github.com/ClickHouse/ClickHouse/pull/40422) ([pkubaj](https://github.com/pkubaj)).
* Bump Testcontainers for Go to v0.15.0. [#43278](https://github.com/ClickHouse/ClickHouse/pull/43278) ([Manuel de la Peña](https://github.com/mdelapenya)).
* Enable base64 on s390x. [#43352](https://github.com/ClickHouse/ClickHouse/pull/43352) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* Shutdown is much faster if `clearOldPartsFromFilesystem` is not called. This is especially true for tests with zero-copy, because parts are deleted in a single thread. `clearOldPartsFromFilesystem` is unnecessary after https://github.com/ClickHouse/ClickHouse/pull/41145. [#43760](https://github.com/ClickHouse/ClickHouse/pull/43760) ([Sema Checherinda](https://github.com/CheSema)).
* Integrate skim into the client/local. [#43922](https://github.com/ClickHouse/ClickHouse/pull/43922) ([Azat Khuzhin](https://github.com/azat)).
* Allow clickhouse to use openssl as a dynamic library and in-tree for development purposes. [#43991](https://github.com/ClickHouse/ClickHouse/pull/43991) ([Boris Kuschel](https://github.com/bkuschel)).
* Closes [#43912](https://github.com/ClickHouse/ClickHouse/issues/43912). [#43992](https://github.com/ClickHouse/ClickHouse/pull/43992) ([Nikolay Degterinsky](https://github.com/evillique)).
* Bring sha512 sums back to the building step. [#44017](https://github.com/ClickHouse/ClickHouse/pull/44017) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Kill stress tests after 2.5h in case of hanging process. [#44214](https://github.com/ClickHouse/ClickHouse/pull/44214) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
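
The sha512 entry above concerns checksums computed for build outputs. As a minimal sketch, and not the project's actual CI code, a SHA-512 sum of a file can be computed as follows. The function names `sha512_of_file` and `demo`, the chunk size, and the temporary file standing in for an artifact are all illustrative assumptions.

```python
import hashlib
import os
import tempfile


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large artifacts do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    # Hypothetical artifact: a temporary file standing in for a built package.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"example artifact contents")
        path = tmp.name
    try:
        return sha512_of_file(path)
    finally:
        os.remove(path)
```

Comparing the returned hex digest with a previously recorded one is enough to detect a changed or corrupted file.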
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Fixed inability to log in (caused by a failure to create a `session_log` entry) in the rare case of misconfigured settings profiles. [#42641](https://github.com/ClickHouse/ClickHouse/pull/42641) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix incorrect UserTimeMicroseconds/SystemTimeMicroseconds accounting. [#42791](https://github.com/ClickHouse/ClickHouse/pull/42791) ([Azat Khuzhin](https://github.com/azat)).
* Do not suppress exceptions in web disk. Fix retries for web disk. [#42800](https://github.com/ClickHouse/ClickHouse/pull/42800) ([Azat Khuzhin](https://github.com/azat)).
* Fixed race condition between inserts and dropping MVs. [#43161](https://github.com/ClickHouse/ClickHouse/pull/43161) ([AlfVII](https://github.com/AlfVII)).
* Fixed bug which could lead to deadlock while using asynchronous inserts. [#43233](https://github.com/ClickHouse/ClickHouse/pull/43233) ([Anton Popov](https://github.com/CurtizJ)).
* Additional check on zero uncompressed size is added to `CompressionCodecDelta`. [#43255](https://github.com/ClickHouse/ClickHouse/pull/43255) ([Nikita Taranov](https://github.com/nickitat)).
* Fix an exception reported while reading a Parquet file from S3 into ClickHouse. [#43297](https://github.com/ClickHouse/ClickHouse/pull/43297) ([Arthur Passos](https://github.com/arthurpassos)).
* Fix bad cast from LowCardinality column when using short circuit function execution. Proper fix of https://github.com/ClickHouse/ClickHouse/pull/42937. [#43311](https://github.com/ClickHouse/ClickHouse/pull/43311) ([Kruglov Pavel](https://github.com/Avogar)).
* Fixed queries with `SAMPLE BY` with prewhere optimization on tables using `Merge` engine. [#43315](https://github.com/ClickHouse/ClickHouse/pull/43315) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix `DESCRIBE` for `deltaLake` and `hudi` table functions. [#43323](https://github.com/ClickHouse/ClickHouse/pull/43323) ([Antonio Andelic](https://github.com/antonio2368)).
* Check and compare the content of `format_version` file in `MergeTreeData` so tables can be loaded even if the storage policy was changed. [#43328](https://github.com/ClickHouse/ClickHouse/pull/43328) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix possible (very unlikely) "No column to rollback" logical error during INSERT into Buffer. [#43336](https://github.com/ClickHouse/ClickHouse/pull/43336) ([Azat Khuzhin](https://github.com/azat)).
* Fix a bug that allowed FunctionParser to parse an unlimited number of round brackets into one function if `allow_function_parameters` is set. [#43350](https://github.com/ClickHouse/ClickHouse/pull/43350) ([Nikolay Degterinsky](https://github.com/evillique)).
* MaterializeMySQL: support the DDL statement `DROP TABLE t1, t2` and be compatible with most MySQL `DROP` DDL statements. [#43366](https://github.com/ClickHouse/ClickHouse/pull/43366) ([zzsmdfj](https://github.com/zzsmdfj)).
* Fix possible `Cannot create non-empty column with type Nothing` in functions if/multiIf. Closes [#43356](https://github.com/ClickHouse/ClickHouse/issues/43356). [#43368](https://github.com/ClickHouse/ClickHouse/pull/43368) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix a bug when row level filter uses default value of column. [#43387](https://github.com/ClickHouse/ClickHouse/pull/43387) ([Alexander Gololobov](https://github.com/davenger)).
* Fix queries with DISTINCT + LIMIT BY + LIMIT that could return fewer rows than expected. Fixes [#43377](https://github.com/ClickHouse/ClickHouse/issues/43377). [#43410](https://github.com/ClickHouse/ClickHouse/pull/43410) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix sumMap() for Nullable(Decimal()). [#43414](https://github.com/ClickHouse/ClickHouse/pull/43414) ([Azat Khuzhin](https://github.com/azat)).
* Fix date_diff() for hour/minute on macOS. Closes [#42742](https://github.com/ClickHouse/ClickHouse/issues/42742). [#43466](https://github.com/ClickHouse/ClickHouse/pull/43466) ([zzsmdfj](https://github.com/zzsmdfj)).
* Fix incorrect memory accounting because of merges/mutations. [#43516](https://github.com/ClickHouse/ClickHouse/pull/43516) ([Azat Khuzhin](https://github.com/azat)).
* Substitute UDFs in `CREATE` query to avoid failures during loading at the startup. Additionally, UDFs can now be used as `DEFAULT` expressions for columns. [#43539](https://github.com/ClickHouse/ClickHouse/pull/43539) ([Antonio Andelic](https://github.com/antonio2368)).
* Correctly report errors in queries even when multiple JOINs optimization is taking place. [#43583](https://github.com/ClickHouse/ClickHouse/pull/43583) ([Salvatore](https://github.com/tbsal)).
* Fixed primary key analysis with conditions involving `toString(enum)`. [#43596](https://github.com/ClickHouse/ClickHouse/pull/43596) ([Nikita Taranov](https://github.com/nickitat)).
* Ensure consistency when the copier updates the status and `attach_is_done` in Keeper after a partition attach is done. [#43602](https://github.com/ClickHouse/ClickHouse/pull/43602) ([lizhuoyu5](https://github.com/lzydmxy)).
* During recovery of a lost replica, there could be a situation where two table names need to be swapped atomically (using EXCHANGE). Previously, two RENAME queries were used instead, which failed and also failed the whole recovery process of the database replica. [#43628](https://github.com/ClickHouse/ClickHouse/pull/43628) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix the `s3Cluster` function returning a `NOT_FOUND_COLUMN_IN_BLOCK` error. Closes [#43534](https://github.com/ClickHouse/ClickHouse/issues/43534). [#43629](https://github.com/ClickHouse/ClickHouse/pull/43629) ([chen](https://github.com/xiedeyantu)).
* Optimized number of List requests to ZooKeeper when selecting a part to merge. Previously it could produce thousands of requests in some cases. Fixes [#43647](https://github.com/ClickHouse/ClickHouse/issues/43647). [#43675](https://github.com/ClickHouse/ClickHouse/pull/43675) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix possible logical error 'Array sizes mismatched' while parsing a JSON object with arrays that have the same key names but different nesting levels. Closes [#43569](https://github.com/ClickHouse/ClickHouse/issues/43569). [#43693](https://github.com/ClickHouse/ClickHouse/pull/43693) ([Kruglov Pavel](https://github.com/Avogar)).
* Fixed possible exception in case of distributed group by with an alias column among aggregation keys. [#43709](https://github.com/ClickHouse/ClickHouse/pull/43709) ([Nikita Taranov](https://github.com/nickitat)).
* Fix bug which can lead to broken projections if zero-copy replication is enabled and used. [#43764](https://github.com/ClickHouse/ClickHouse/pull/43764) ([alesapin](https://github.com/alesapin)).
* Fix using multipart upload for large S3 objects in AWS S3. [#43824](https://github.com/ClickHouse/ClickHouse/pull/43824) ([ianton-ru](https://github.com/ianton-ru)).
* Fixed `ALTER ... RESET SETTING` with `ON CLUSTER`. It could be applied to one replica only. Fixes [#43843](https://github.com/ClickHouse/ClickHouse/issues/43843). [#43848](https://github.com/ClickHouse/ClickHouse/pull/43848) ([Elena Torró](https://github.com/elenatorro)).
* Fix logical error in right storage join with `USING`. [#43963](https://github.com/ClickHouse/ClickHouse/pull/43963) ([Vladimir C](https://github.com/vdimir)).
* Keeper fix: throw if interserver port for Raft is already in use. Fix segfault in Prometheus when Raft server failed to initialize. [#43984](https://github.com/ClickHouse/ClickHouse/pull/43984) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix `ORDER BY` with positional arguments in case of unneeded columns pruning. Closes [#43964](https://github.com/ClickHouse/ClickHouse/issues/43964). [#43987](https://github.com/ClickHouse/ClickHouse/pull/43987) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix bug with wrong order of keys in Storage Join. [#44012](https://github.com/ClickHouse/ClickHouse/pull/44012) ([Vladimir C](https://github.com/vdimir)).
* Fixed exception when a subquery contains `HAVING` but doesn't contain actual aggregation. [#44051](https://github.com/ClickHouse/ClickHouse/pull/44051) ([Nikita Taranov](https://github.com/nickitat)).
* Fix race in s3 multipart upload. This race could cause the error `Part number must be an integer between 1 and 10000, inclusive. (S3_ERROR)` while restoring from a backup. [#44065](https://github.com/ClickHouse/ClickHouse/pull/44065) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix undefined behavior in the `quantiles` function, which might lead to uninitialized memory. Found by fuzzer. This closes [#44066](https://github.com/ClickHouse/ClickHouse/issues/44066). [#44067](https://github.com/ClickHouse/ClickHouse/pull/44067) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Prevent dropping nested column if it creates empty part. [#44159](https://github.com/ClickHouse/ClickHouse/pull/44159) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix `LOGICAL_ERROR` in case when fetch of part was stopped while fetching projection to the disk with enabled zero-copy replication. [#44173](https://github.com/ClickHouse/ClickHouse/pull/44173) ([Anton Popov](https://github.com/CurtizJ)).
* Fix possible `Bad cast from type DB::IAST const* to DB::ASTLiteral const*`. Closes [#44191](https://github.com/ClickHouse/ClickHouse/issues/44191). [#44192](https://github.com/ClickHouse/ClickHouse/pull/44192) ([Kruglov Pavel](https://github.com/Avogar)).
* Prevent `ReadonlyReplica` metric from having negative values. [#44220](https://github.com/ClickHouse/ClickHouse/pull/44220) ([Antonio Andelic](https://github.com/antonio2368)).
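
For the DISTINCT + LIMIT BY + LIMIT entry in the list above, the following sketch models the expected row count in plain Python. It assumes the clauses apply in the order DISTINCT, then LIMIT BY, then LIMIT. The function `distinct_limit_by_limit` and the sample rows are hypothetical and do not reproduce ClickHouse's implementation.

```python
from collections import defaultdict


def distinct_limit_by_limit(rows, key_index, per_key, limit):
    """Model of: SELECT DISTINCT ... LIMIT per_key BY key LIMIT limit."""
    # Step 1: DISTINCT, keeping the first occurrence of each row.
    seen = set()
    distinct_rows = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            distinct_rows.append(row)

    # Step 2: LIMIT BY, keeping at most `per_key` rows per key value.
    counts = defaultdict(int)
    limited_by = []
    for row in distinct_rows:
        key = row[key_index]
        if counts[key] < per_key:
            counts[key] += 1
            limited_by.append(row)

    # Step 3: LIMIT, applied last to the already filtered rows.
    return limited_by[:limit]


rows = [(1, "a"), (1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "a"), (3, "z")]
result = distinct_limit_by_limit(rows, key_index=0, per_key=1, limit=3)
```

With this sample, `result` holds three rows, one per key. If the final limit were applied to the first three distinct rows instead, only one row would survive the LIMIT BY step, which illustrates how limiting too early can produce fewer rows than expected.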
#### Build Improvement
* Fixed Endian issues in hex string conversion on s390x (which is not supported by ClickHouse). [#41245](https://github.com/ClickHouse/ClickHouse/pull/41245) ([Harry Lee](https://github.com/HarryLeeIBM)).
* `toDateTime64` conversion generated the wrong time on the z (s390x) build; add a `bit_cast` swap fix to support `toDateTime64` on the s390x platform. [#42847](https://github.com/ClickHouse/ClickHouse/pull/42847) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* Add s390x support for IP coding functions. [#43078](https://github.com/ClickHouse/ClickHouse/pull/43078) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* Fix byte order issue of wide integers for s390x. [#43228](https://github.com/ClickHouse/ClickHouse/pull/43228) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Fixed endian issue in bloom filter serialization for s390x. [#43642](https://github.com/ClickHouse/ClickHouse/pull/43642) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Fixed setting TCP_KEEPIDLE of client connection for s390x. [#43850](https://github.com/ClickHouse/ClickHouse/pull/43850) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Fix endian issue in StringHashTable for s390x. [#44049](https://github.com/ClickHouse/ClickHouse/pull/44049) ([Harry Lee](https://github.com/HarryLeeIBM)).
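
Several entries above concern byte order on s390x, which is a big-endian platform. The sketch below shows, in Python rather than ClickHouse's C++, why writing an integer in native byte order is not portable and how fixing the byte order explicitly avoids the problem. The helper names are hypothetical and are not taken from the ClickHouse code base.

```python
import struct
import sys


def serialize_u32_portable(value):
    """Serialize a 32-bit unsigned integer as little-endian on any host."""
    return struct.pack("<I", value)


def deserialize_u32_portable(data):
    """Read a little-endian 32-bit unsigned integer on any host."""
    return struct.unpack("<I", data)[0]


value = 0x01020304
little = struct.pack("<I", value)  # byte layout of a native write on x86-64
big = struct.pack(">I", value)     # byte layout of a native write on s390x
native = struct.pack("=I", value)  # layout depends on the host (sys.byteorder)

# Reading big-endian bytes as little-endian yields a byte-swapped value.
misread = struct.unpack("<I", big)[0]
```

Here `misread` is `0x04030201` rather than `0x01020304`. Formats that are shared between machines therefore need one agreed byte order, with a swap on hosts whose native order differs.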
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Revert "S3 request per second rate throttling""'. [#43335](https://github.com/ClickHouse/ClickHouse/pull/43335) ([Sergei Trifonov](https://github.com/serxa)).
* NO CL ENTRY: 'Update version after release'. [#43348](https://github.com/ClickHouse/ClickHouse/pull/43348) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* NO CL ENTRY: 'Revert "Add table_uuid to system.parts"'. [#43571](https://github.com/ClickHouse/ClickHouse/pull/43571) ([Alexander Tokmakov](https://github.com/tavplubix)).
* NO CL ENTRY: 'Revert "Fix endian issue in integer hex string conversion"'. [#43613](https://github.com/ClickHouse/ClickHouse/pull/43613) ([Vladimir C](https://github.com/vdimir)).
* NO CL ENTRY: 'Update replication.md'. [#43643](https://github.com/ClickHouse/ClickHouse/pull/43643) ([Peignon Melvyn](https://github.com/melvynator)).
* NO CL ENTRY: 'Revert "Temporary files evict fs cache"'. [#43883](https://github.com/ClickHouse/ClickHouse/pull/43883) ([Vladimir C](https://github.com/vdimir)).
* NO CL ENTRY: 'Update html interface doc'. [#44064](https://github.com/ClickHouse/ClickHouse/pull/44064) ([San](https://github.com/santrancisco)).
* NO CL ENTRY: 'Revert "Add function 'age'"'. [#44203](https://github.com/ClickHouse/ClickHouse/pull/44203) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Builtin skim"'. [#44227](https://github.com/ClickHouse/ClickHouse/pull/44227) ([Azat Khuzhin](https://github.com/azat)).
* NO CL ENTRY: 'Revert "Add information about written rows in progress indicator"'. [#44255](https://github.com/ClickHouse/ClickHouse/pull/44255) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Build libcxx and libcxxabi from llvm-project [#42730](https://github.com/ClickHouse/ClickHouse/pull/42730) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow release only from ready commits [#43019](https://github.com/ClickHouse/ClickHouse/pull/43019) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add global flags to base/ libraries [#43082](https://github.com/ClickHouse/ClickHouse/pull/43082) ([Raúl Marín](https://github.com/Algunenano)).
* Enable strict typing check in tests/ci [#43132](https://github.com/ClickHouse/ClickHouse/pull/43132) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add server UUID for disks access checks (read/read-by-offset/write/delete) to avoid possible races [#43143](https://github.com/ClickHouse/ClickHouse/pull/43143) ([Azat Khuzhin](https://github.com/azat)).
* Do not include libcxx library for C [#43166](https://github.com/ClickHouse/ClickHouse/pull/43166) ([Azat Khuzhin](https://github.com/azat)).
* Followup fixes for FuseFunctionsPass [#43217](https://github.com/ClickHouse/ClickHouse/pull/43217) ([Vladimir C](https://github.com/vdimir)).
* Fix bug in replication queue which can lead to premature mutation finish [#43231](https://github.com/ClickHouse/ClickHouse/pull/43231) ([alesapin](https://github.com/alesapin)).
* Support `CREATE / ALTER / DROP NAMED COLLECTION` queries under according access types [#43252](https://github.com/ClickHouse/ClickHouse/pull/43252) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix race in `IColumn::dumpStructure` [#43269](https://github.com/ClickHouse/ClickHouse/pull/43269) ([Anton Popov](https://github.com/CurtizJ)).
* Sanitize thirdparty libraries for public flags [#43275](https://github.com/ClickHouse/ClickHouse/pull/43275) ([Azat Khuzhin](https://github.com/azat)).
* stress: increase timeout for server waiting after TERM [#43277](https://github.com/ClickHouse/ClickHouse/pull/43277) ([Azat Khuzhin](https://github.com/azat)).
* Fix cloning of ASTIdentifier [#43282](https://github.com/ClickHouse/ClickHouse/pull/43282) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix race on write in `ReplicatedMergeTree` [#43289](https://github.com/ClickHouse/ClickHouse/pull/43289) ([Antonio Andelic](https://github.com/antonio2368)).
* Cancel lambda api url [#43295](https://github.com/ClickHouse/ClickHouse/pull/43295) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fixed: Typo [#43312](https://github.com/ClickHouse/ClickHouse/pull/43312) ([Raevsky Rudolf](https://github.com/lanesket)).
* Analyzer small fixes [#43321](https://github.com/ClickHouse/ClickHouse/pull/43321) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix: make test_read_only_table more stable [#43326](https://github.com/ClickHouse/ClickHouse/pull/43326) ([Igor Nikonov](https://github.com/devcrafter)).
* Make insertRangeFrom() more exception safe [#43338](https://github.com/ClickHouse/ClickHouse/pull/43338) ([Azat Khuzhin](https://github.com/azat)).
* Analyzer added indexes support [#43341](https://github.com/ClickHouse/ClickHouse/pull/43341) ([Maksim Kita](https://github.com/kitaisreal)).
* Allow to "drop tables" from s3_plain disk (same as from web disk) [#43343](https://github.com/ClickHouse/ClickHouse/pull/43343) ([Azat Khuzhin](https://github.com/azat)).
* Add --max-consecutive-errors for clickhouse-benchmark [#43344](https://github.com/ClickHouse/ClickHouse/pull/43344) ([Azat Khuzhin](https://github.com/azat)).
* Add [#43072](https://github.com/ClickHouse/ClickHouse/issues/43072) [#43345](https://github.com/ClickHouse/ClickHouse/pull/43345) ([Nikita Taranov](https://github.com/nickitat)).
* Suggest users installation troubleshooting [#43346](https://github.com/ClickHouse/ClickHouse/pull/43346) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update version_date.tsv and changelogs after v22.11.1.1360-stable [#43349](https://github.com/ClickHouse/ClickHouse/pull/43349) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Provide full stacktrace in case of uncaught exception during server startup [#43364](https://github.com/ClickHouse/ClickHouse/pull/43364) ([Azat Khuzhin](https://github.com/azat)).
* Update SECURITY.md on new stable tags [#43365](https://github.com/ClickHouse/ClickHouse/pull/43365) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Splitting checks in CI more [#43373](https://github.com/ClickHouse/ClickHouse/pull/43373) ([alesapin](https://github.com/alesapin)).
* Update version_date.tsv and changelogs after v22.8.9.24-lts [#43393](https://github.com/ClickHouse/ClickHouse/pull/43393) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Fix mess with signed sizes in SingleValueDataString [#43401](https://github.com/ClickHouse/ClickHouse/pull/43401) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add a comment [#43403](https://github.com/ClickHouse/ClickHouse/pull/43403) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Avoid race condition for updating system.distribution_queue values [#43406](https://github.com/ClickHouse/ClickHouse/pull/43406) ([Azat Khuzhin](https://github.com/azat)).
* Fix flaky 01926_order_by_desc_limit [#43408](https://github.com/ClickHouse/ClickHouse/pull/43408) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible heap-use-after-free in local if history file cannot be created [#43409](https://github.com/ClickHouse/ClickHouse/pull/43409) ([Azat Khuzhin](https://github.com/azat)).
* Fix flaky test [#43435](https://github.com/ClickHouse/ClickHouse/pull/43435) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix backward compatibility check [#43436](https://github.com/ClickHouse/ClickHouse/pull/43436) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix typo [#43446](https://github.com/ClickHouse/ClickHouse/pull/43446) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove noise from logs about NetLink in Docker [#43447](https://github.com/ClickHouse/ClickHouse/pull/43447) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Modify test slightly [#43448](https://github.com/ClickHouse/ClickHouse/pull/43448) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Set run_passes to 1 by default [#43451](https://github.com/ClickHouse/ClickHouse/pull/43451) ([Dmitry Novik](https://github.com/novikd)).
* Do not reuse jemalloc memory in test_global_overcommit [#43453](https://github.com/ClickHouse/ClickHouse/pull/43453) ([Dmitry Novik](https://github.com/novikd)).
* Fix createTableSharedID again [#43458](https://github.com/ClickHouse/ClickHouse/pull/43458) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Use smaller buffer for small files [#43460](https://github.com/ClickHouse/ClickHouse/pull/43460) ([Alexander Gololobov](https://github.com/davenger)).
* Merging [#42064](https://github.com/ClickHouse/ClickHouse/issues/42064) [#43461](https://github.com/ClickHouse/ClickHouse/pull/43461) ([Anton Popov](https://github.com/CurtizJ)).
* Use all parameters with prefixes from ssm [#43467](https://github.com/ClickHouse/ClickHouse/pull/43467) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Avoid possible DROP hung due to attached web disk [#43489](https://github.com/ClickHouse/ClickHouse/pull/43489) ([Azat Khuzhin](https://github.com/azat)).
* Improve fuzzy search in clickhouse-client/clickhouse-local [#43498](https://github.com/ClickHouse/ClickHouse/pull/43498) ([Azat Khuzhin](https://github.com/azat)).
* check ast limits for create_parser_fuzzer [#43504](https://github.com/ClickHouse/ClickHouse/pull/43504) ([Sema Checherinda](https://github.com/CheSema)).
* Add another test for SingleDataValueString [#43514](https://github.com/ClickHouse/ClickHouse/pull/43514) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Move password reset message from client to server [#43517](https://github.com/ClickHouse/ClickHouse/pull/43517) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Sync everything to persistent storage to avoid writeback affecting perf tests [#43530](https://github.com/ClickHouse/ClickHouse/pull/43530) ([Azat Khuzhin](https://github.com/azat)).
* bump lib for diag [#43538](https://github.com/ClickHouse/ClickHouse/pull/43538) ([Dale McDiarmid](https://github.com/gingerwizard)).
* Temporarily disable `test_hive_query` [#43542](https://github.com/ClickHouse/ClickHouse/pull/43542) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Analyzer SumIfToCountIfPass fix [#43543](https://github.com/ClickHouse/ClickHouse/pull/43543) ([Maksim Kita](https://github.com/kitaisreal)).
* Analyzer UniqInjectiveFunctionsEliminationPass [#43547](https://github.com/ClickHouse/ClickHouse/pull/43547) ([Maksim Kita](https://github.com/kitaisreal)).
* Disable broken 00176_bson_parallel_parsing [#43550](https://github.com/ClickHouse/ClickHouse/pull/43550) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add benchmark for query interpretation with JOINs [#43556](https://github.com/ClickHouse/ClickHouse/pull/43556) ([Raúl Marín](https://github.com/Algunenano)).
* Analyzer table functions untuple fix [#43572](https://github.com/ClickHouse/ClickHouse/pull/43572) ([Maksim Kita](https://github.com/kitaisreal)).
* Prepare CI for universal runners preallocated pool [#43579](https://github.com/ClickHouse/ClickHouse/pull/43579) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Iterate list without index-based access [#43584](https://github.com/ClickHouse/ClickHouse/pull/43584) ([Alexander Gololobov](https://github.com/davenger)).
* Remove code that I do not understand [#43593](https://github.com/ClickHouse/ClickHouse/pull/43593) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add table_uuid to system.parts (resubmit) [#43595](https://github.com/ClickHouse/ClickHouse/pull/43595) ([Azat Khuzhin](https://github.com/azat)).
* Move perf tests for Aarch64 from PRs to master [#43623](https://github.com/ClickHouse/ClickHouse/pull/43623) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix flaky 01175_distributed_ddl_output_mode_long [#43626](https://github.com/ClickHouse/ClickHouse/pull/43626) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Speedup backup config loading [#43627](https://github.com/ClickHouse/ClickHouse/pull/43627) ([Alexander Gololobov](https://github.com/davenger)).
* Fix [#43478](https://github.com/ClickHouse/ClickHouse/issues/43478) [#43636](https://github.com/ClickHouse/ClickHouse/pull/43636) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not checkout submodules recursively [#43637](https://github.com/ClickHouse/ClickHouse/pull/43637) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Optimize binary-builder size [#43654](https://github.com/ClickHouse/ClickHouse/pull/43654) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix flaky `KeeperMap` integration tests [#43658](https://github.com/ClickHouse/ClickHouse/pull/43658) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix data race in `Keeper` snapshot [#43663](https://github.com/ClickHouse/ClickHouse/pull/43663) ([Antonio Andelic](https://github.com/antonio2368)).
* Use docker images cache from merged PRs in master and release branches [#43664](https://github.com/ClickHouse/ClickHouse/pull/43664) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update AsynchronousReadIndirectBufferFromRemoteFS.cpp [#43667](https://github.com/ClickHouse/ClickHouse/pull/43667) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix pagination issue in GITHUB_JOB_ID() [#43681](https://github.com/ClickHouse/ClickHouse/pull/43681) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Try fix flaky test 00176_bson_parallel_parsing [#43696](https://github.com/ClickHouse/ClickHouse/pull/43696) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix log messages in clickhouse-copier [#43707](https://github.com/ClickHouse/ClickHouse/pull/43707) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* try to remove clickhouse if already exists [#43728](https://github.com/ClickHouse/ClickHouse/pull/43728) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix 43622 [#43731](https://github.com/ClickHouse/ClickHouse/pull/43731) ([Amos Bird](https://github.com/amosbird)).
* Fix example of colored prompt in client [#43738](https://github.com/ClickHouse/ClickHouse/pull/43738) ([Azat Khuzhin](https://github.com/azat)).
* Minor fixes in annoy index documentation [#43743](https://github.com/ClickHouse/ClickHouse/pull/43743) ([Robert Schulze](https://github.com/rschu1ze)).
* Terminate lost runners [#43756](https://github.com/ClickHouse/ClickHouse/pull/43756) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update README.md [#43759](https://github.com/ClickHouse/ClickHouse/pull/43759) ([Tyler Hannan](https://github.com/tylerhannan)).
* Fix included_elements calculation in AggregateFunctionNullVariadic [#43763](https://github.com/ClickHouse/ClickHouse/pull/43763) ([Dmitry Novik](https://github.com/novikd)).
* Migrate runner_token_rotation_lambda to zip-package deployment [#43766](https://github.com/ClickHouse/ClickHouse/pull/43766) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Analyzer compound expression crash fix [#43768](https://github.com/ClickHouse/ClickHouse/pull/43768) ([Maksim Kita](https://github.com/kitaisreal)).
* Migrate termination lambda to zip-package [#43769](https://github.com/ClickHouse/ClickHouse/pull/43769) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix flaky `test_store_cleanup` [#43770](https://github.com/ClickHouse/ClickHouse/pull/43770) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Attempt to fix StyleCheck condition [#43773](https://github.com/ClickHouse/ClickHouse/pull/43773) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Rerun PullRequestCI on changed description body [#43777](https://github.com/ClickHouse/ClickHouse/pull/43777) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Yet another fix for AggregateFunctionMinMaxAny [#43778](https://github.com/ClickHouse/ClickHouse/pull/43778) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add google benchmark to contrib [#43779](https://github.com/ClickHouse/ClickHouse/pull/43779) ([Nikita Taranov](https://github.com/nickitat)).
* Fix EN doc as in [#43765](https://github.com/ClickHouse/ClickHouse/issues/43765) [#43780](https://github.com/ClickHouse/ClickHouse/pull/43780) ([Alexander Gololobov](https://github.com/davenger)).
* Detach threads from thread group [#43781](https://github.com/ClickHouse/ClickHouse/pull/43781) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Try making `test_keeper_zookeeper_converter` less flaky [#43789](https://github.com/ClickHouse/ClickHouse/pull/43789) ([Antonio Andelic](https://github.com/antonio2368)).
* Polish UDF substitution visitor [#43790](https://github.com/ClickHouse/ClickHouse/pull/43790) ([Antonio Andelic](https://github.com/antonio2368)).
* Analyzer ConstantNode refactoring [#43793](https://github.com/ClickHouse/ClickHouse/pull/43793) ([Maksim Kita](https://github.com/kitaisreal)).
* Update Poco [#43802](https://github.com/ClickHouse/ClickHouse/pull/43802) ([Alexander Gololobov](https://github.com/davenger)).
* Add another BC check suppression [#43810](https://github.com/ClickHouse/ClickHouse/pull/43810) ([Alexander Tokmakov](https://github.com/tavplubix)).
* tests: fix 01676_long_clickhouse_client_autocomplete flakiness [#43819](https://github.com/ClickHouse/ClickHouse/pull/43819) ([Azat Khuzhin](https://github.com/azat)).
* Use disk operation to serialize and deserialize meta files of StorageFilelog [#43826](https://github.com/ClickHouse/ClickHouse/pull/43826) ([flynn](https://github.com/ucasfl)).
* Add constexpr [#43827](https://github.com/ClickHouse/ClickHouse/pull/43827) ([zhanglistar](https://github.com/zhanglistar)).
* Do not postpone removal of in-memory tables [#43833](https://github.com/ClickHouse/ClickHouse/pull/43833) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Increase some logging level for keeper client. [#43835](https://github.com/ClickHouse/ClickHouse/pull/43835) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* FuseFunctionsPass small fix [#43837](https://github.com/ClickHouse/ClickHouse/pull/43837) ([Maksim Kita](https://github.com/kitaisreal)).
* Followup fixes for XML helpers [#43845](https://github.com/ClickHouse/ClickHouse/pull/43845) ([Alexander Gololobov](https://github.com/davenger)).
* Hold ProcessListEntry a bit longer in case of exception from Interpreter [#43847](https://github.com/ClickHouse/ClickHouse/pull/43847) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Slightly improve performance of PODArray [#43860](https://github.com/ClickHouse/ClickHouse/pull/43860) ([zhanglistar](https://github.com/zhanglistar)).
* Change email for robot-clickhouse to immutable one [#43861](https://github.com/ClickHouse/ClickHouse/pull/43861) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Rerun DocsCheck on edited PR description [#43862](https://github.com/ClickHouse/ClickHouse/pull/43862) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Temporarily disable misc-* slow clang-tidy checks [#43863](https://github.com/ClickHouse/ClickHouse/pull/43863) ([Robert Schulze](https://github.com/rschu1ze)).
* Do not leave tmp part on disk, do not go to Keeper to remove it [#43866](https://github.com/ClickHouse/ClickHouse/pull/43866) ([Sema Checherinda](https://github.com/CheSema)).
* Do not read part status just for logging [#43868](https://github.com/ClickHouse/ClickHouse/pull/43868) ([Sema Checherinda](https://github.com/CheSema)).
* Analyzer Context refactoring [#43884](https://github.com/ClickHouse/ClickHouse/pull/43884) ([Maksim Kita](https://github.com/kitaisreal)).
* Analyzer CTE resolution fix [#43893](https://github.com/ClickHouse/ClickHouse/pull/43893) ([Maksim Kita](https://github.com/kitaisreal)).
* Improve release script [#43894](https://github.com/ClickHouse/ClickHouse/pull/43894) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Use only PRs to our repository in pr_info on push [#43895](https://github.com/ClickHouse/ClickHouse/pull/43895) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Join engine works with analyzer [#43897](https://github.com/ClickHouse/ClickHouse/pull/43897) ([Vladimir C](https://github.com/vdimir)).
* Fix reports [#43904](https://github.com/ClickHouse/ClickHouse/pull/43904) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix vim settings (and make it compatible with neovim) [#43909](https://github.com/ClickHouse/ClickHouse/pull/43909) ([Azat Khuzhin](https://github.com/azat)).
* Fix clang tidy errors introduced in [#43834](https://github.com/ClickHouse/ClickHouse/issues/43834) [#43911](https://github.com/ClickHouse/ClickHouse/pull/43911) ([Nikita Taranov](https://github.com/nickitat)).
* Fix BACKUP TO S3 for Google Cloud Storage [#43940](https://github.com/ClickHouse/ClickHouse/pull/43940) ([Azat Khuzhin](https://github.com/azat)).
* Fix tags workflow [#43942](https://github.com/ClickHouse/ClickHouse/pull/43942) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Generate missed changelogs for latest releases [#43944](https://github.com/ClickHouse/ClickHouse/pull/43944) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix typo in tests/ci/bugfix_validate_check.py [#43973](https://github.com/ClickHouse/ClickHouse/pull/43973) ([Vladimir C](https://github.com/vdimir)).
* Remove test logging of signal "EINTR" [#44001](https://github.com/ClickHouse/ClickHouse/pull/44001) ([Kruglov Pavel](https://github.com/Avogar)).
* Some cleanup of isDeterministic(InScopeOfQuery)() [#44011](https://github.com/ClickHouse/ClickHouse/pull/44011) ([Robert Schulze](https://github.com/rschu1ze)).
* Try to keep runners alive for longer [#44015](https://github.com/ClickHouse/ClickHouse/pull/44015) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix relaxed "too many parts" threshold [#44021](https://github.com/ClickHouse/ClickHouse/pull/44021) ([Sergei Trifonov](https://github.com/serxa)).
* Correct CompressionCodecGorilla exception message [#44023](https://github.com/ClickHouse/ClickHouse/pull/44023) ([Duc Canh Le](https://github.com/canhld94)).
* Fix exception message [#44034](https://github.com/ClickHouse/ClickHouse/pull/44034) ([Nikolay Degterinsky](https://github.com/evillique)).
* Update version_date.tsv and changelogs after v22.8.11.15-lts [#44035](https://github.com/ClickHouse/ClickHouse/pull/44035) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* do not hardlink serialization.json in new part [#44036](https://github.com/ClickHouse/ClickHouse/pull/44036) ([Sema Checherinda](https://github.com/CheSema)).
* Fix tracing of profile events [#44045](https://github.com/ClickHouse/ClickHouse/pull/44045) ([Anton Popov](https://github.com/CurtizJ)).
* Slightly better clickhouse disks and remove DiskMemory [#44050](https://github.com/ClickHouse/ClickHouse/pull/44050) ([alesapin](https://github.com/alesapin)).
* Assign release PRs [#44055](https://github.com/ClickHouse/ClickHouse/pull/44055) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Merging [#36877](https://github.com/ClickHouse/ClickHouse/issues/36877) [#44059](https://github.com/ClickHouse/ClickHouse/pull/44059) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* add changelogs [#44061](https://github.com/ClickHouse/ClickHouse/pull/44061) ([Dan Roscigno](https://github.com/DanRoscigno)).
* Fix the CACHE_PATH creation for default value [#44079](https://github.com/ClickHouse/ClickHouse/pull/44079) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix aspell [#44090](https://github.com/ClickHouse/ClickHouse/pull/44090) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix another bug in AggregateFunctionMinMaxAny [#44091](https://github.com/ClickHouse/ClickHouse/pull/44091) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Analyzer aggregate function lambda crash fix [#44098](https://github.com/ClickHouse/ClickHouse/pull/44098) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix -Wshorten-64-to-32 on FreeBSD and enable -Werror [#44121](https://github.com/ClickHouse/ClickHouse/pull/44121) ([Azat Khuzhin](https://github.com/azat)).
* Fix flaky test `02497_trace_events_stress_long` [#44124](https://github.com/ClickHouse/ClickHouse/pull/44124) ([Anton Popov](https://github.com/CurtizJ)).
* Minor file renaming [#44125](https://github.com/ClickHouse/ClickHouse/pull/44125) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix typo [#44127](https://github.com/ClickHouse/ClickHouse/pull/44127) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Better descriptions of signals [#44129](https://github.com/ClickHouse/ClickHouse/pull/44129) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* make calls to be sure that parts are deleted [#44156](https://github.com/ClickHouse/ClickHouse/pull/44156) ([Sema Checherinda](https://github.com/CheSema)).
* Ignore "session expired" errors after BC check [#44157](https://github.com/ClickHouse/ClickHouse/pull/44157) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix incorrect assertion [#44160](https://github.com/ClickHouse/ClickHouse/pull/44160) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Close GRPC channels in tests [#44184](https://github.com/ClickHouse/ClickHouse/pull/44184) ([Antonio Andelic](https://github.com/antonio2368)).
* Remove misleading message from logs [#44190](https://github.com/ClickHouse/ClickHouse/pull/44190) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Minor clang-tidy fixes in fromUnixTimestamp64() [#44194](https://github.com/ClickHouse/ClickHouse/pull/44194) ([Igor Nikonov](https://github.com/devcrafter)).
* Hotfix for "check_status.tsv doesn't exists" in stress tests [#44197](https://github.com/ClickHouse/ClickHouse/pull/44197) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix documentation after [#42438](https://github.com/ClickHouse/ClickHouse/issues/42438) [#44200](https://github.com/ClickHouse/ClickHouse/pull/44200) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix an assertion in transactions [#44202](https://github.com/ClickHouse/ClickHouse/pull/44202) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add log message [#44237](https://github.com/ClickHouse/ClickHouse/pull/44237) ([Alexander Tokmakov](https://github.com/tavplubix)).


@ -34,7 +34,7 @@ For a description of request parameters, see [request description](../../../sql-
`columns` - a tuple with the names of columns where values will be summarized. Optional parameter.
The columns must be of a numeric type and must not be in the primary key.
If `columns` not specified, ClickHouse summarizes the values in all columns with a numeric data type that are not in the primary key.
If `columns` is not specified, ClickHouse summarizes the values in all columns with a numeric data type that are not in the primary key.
### Query clauses


@ -92,7 +92,7 @@ Code: 452, e.displayText() = DB::Exception: Setting force_index_by_date should n
**Note:** the `default` profile has special handling: all the constraints defined for the `default` profile become the default constraints, so they restrict all the users until they're overridden explicitly for these users.
## Constraints on Merge Tree Settings
It is possible to set constraints for [merge tree settings](merge-tree-settings.md). There constraints are applied when table with merge tree engine is created or its storage settings are altered. Name of merge tree setting must be prepended by `merge_tree_` prefix when referenced in `<constraint>` section.
It is possible to set constraints for [merge tree settings](merge-tree-settings.md). These constraints are applied when a table with a merge tree engine is created or its storage settings are altered. The name of a merge tree setting must be prefixed with `merge_tree_` when referenced in the `<constraints>` section.
**Example:** Forbid creating new tables with an explicitly specified `storage_policy`
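A minimal sketch of what such a constraint could look like. It assumes the constraint is placed in the `default` profile and uses the `<const/>` constraint type, which makes the setting read-only. The surrounding `<clickhouse>` and `<profiles>` layout is an assumption about where the profile is defined in your configuration.

``` xml
<clickhouse>
    <profiles>
        <default>
            <constraints>
                <!-- merge tree setting `storage_policy`, referenced with the `merge_tree_` prefix -->
                <merge_tree_storage_policy>
                    <!-- assumed constraint type: the setting cannot be changed by the user -->
                    <const/>
                </merge_tree_storage_policy>
            </constraints>
        </default>
    </profiles>
</clickhouse>
```

With a constraint like this in place, a `CREATE TABLE` statement that explicitly sets `storage_policy` in its `SETTINGS` clause is expected to be rejected for users of that profile.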


@ -13,6 +13,7 @@ Columns:
- `metadata_path` ([String](../../sql-reference/data-types/enum.md)) — Metadata path.
- `uuid` ([UUID](../../sql-reference/data-types/uuid.md)) — Database UUID.
- `comment` ([String](../../sql-reference/data-types/enum.md)) — Database comment.
- `engine_full` ([String](../../sql-reference/data-types/enum.md)) — Parameters of the database engine.
The `name` column from this system table is used for implementing the `SHOW DATABASES` query.
@ -31,10 +32,12 @@ SELECT * FROM system.databases;
```
``` text
┌─name───────────────┬─engine─┬─data_path──────────────────┬─metadata_path───────────────────────────────────────────────────────┬─uuid─────────────────────────────────┬─comment─┐
│ INFORMATION_SCHEMA │ Memory │ /var/lib/clickhouse/ │ │ 00000000-0000-0000-0000-000000000000 │ │
│ default │ Atomic │ /var/lib/clickhouse/store/ │ /var/lib/clickhouse/store/d31/d317b4bd-3595-4386-81ee-c2334694128a/ │ 24363899-31d7-42a0-a436-389931d752a0 │ │
│ information_schema │ Memory │ /var/lib/clickhouse/ │ │ 00000000-0000-0000-0000-000000000000 │ │
│ system │ Atomic │ /var/lib/clickhouse/store/ │ /var/lib/clickhouse/store/1d1/1d1c869d-e465-4b1b-a51f-be033436ebf9/ │ 03e9f3d1-cc88-4a49-83e9-f3d1cc881a49 │ │
└────────────────────┴────────┴────────────────────────────┴─────────────────────────────────────────────────────────────────────┴──────────────────────────────────────┴─────────┘
┌─name────────────────┬─engine─────┬─data_path────────────────────┬─metadata_path─────────────────────────────────────────────────────────┬─uuid─────────────────────────────────┬─engine_full────────────────────────────────────────────┬─comment─┐
│ INFORMATION_SCHEMA │ Memory │ /data/clickhouse_data/ │ │ 00000000-0000-0000-0000-000000000000 │ Memory │ │
│ default │ Atomic │ /data/clickhouse_data/store/ │ /data/clickhouse_data/store/f97/f97a3ceb-2e8a-4912-a043-c536e826a4d4/ │ f97a3ceb-2e8a-4912-a043-c536e826a4d4 │ Atomic │ │
│ information_schema │ Memory │ /data/clickhouse_data/ │ │ 00000000-0000-0000-0000-000000000000 │ Memory │ │
│ replicated_database │ Replicated │ /data/clickhouse_data/store/ │ /data/clickhouse_data/store/da8/da85bb71-102b-4f69-9aad-f8d6c403905e/ │ da85bb71-102b-4f69-9aad-f8d6c403905e │ Replicated('some/path/database', 'shard1', 'replica1') │ │
│ system │ Atomic │ /data/clickhouse_data/store/ │ /data/clickhouse_data/store/b57/b5770419-ac7a-4b67-8229-524122024076/ │ b5770419-ac7a-4b67-8229-524122024076 │ Atomic │ │
└─────────────────────┴────────────┴──────────────────────────────┴───────────────────────────────────────────────────────────────────────┴──────────────────────────────────────┴────────────────────────────────────────────────────────┴─────────┘
```


@ -410,35 +410,35 @@ Converts a date with time to a certain fixed date, while preserving the time.
## toRelativeYearNum
Converts a date or date with time to the number of the year, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the year, starting from a certain fixed point in the past.
## toRelativeQuarterNum
Converts a date or date with time to the number of the quarter, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the quarter, starting from a certain fixed point in the past.
## toRelativeMonthNum
Converts a date or date with time to the number of the month, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the month, starting from a certain fixed point in the past.
## toRelativeWeekNum
Converts a date or date with time to the number of the week, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the week, starting from a certain fixed point in the past.
## toRelativeDayNum
Converts a date or date with time to the number of the day, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the day, starting from a certain fixed point in the past.
## toRelativeHourNum
Converts a date or date with time to the number of the hour, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the hour, starting from a certain fixed point in the past.
## toRelativeMinuteNum
Converts a date or date with time to the number of the minute, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the minute, starting from a certain fixed point in the past.
## toRelativeSecondNum
Converts a date or date with time to the number of the second, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the second, starting from a certain fixed point in the past.
## toISOYear
@ -517,154 +517,6 @@ SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(d
└────────────┴───────────┴───────────┴───────────┘
```
## age
Returns the `unit` component of the difference between `startdate` and `enddate`. The difference is calculated using a precision of 1 second.
E.g. the difference between `2021-12-29` and `2022-01-01` is 3 days for `day` unit, 0 months for `month` unit, 0 years for `year` unit.
**Syntax**
``` sql
age('unit', startdate, enddate, [timezone])
```
**Arguments**
- `unit` — The type of interval for result. [String](../../sql-reference/data-types/string.md).
Possible values:
- `second` (possible abbreviations: `ss`, `s`)
- `minute` (possible abbreviations: `mi`, `n`)
- `hour` (possible abbreviations: `hh`, `h`)
- `day` (possible abbreviations: `dd`, `d`)
- `week` (possible abbreviations: `wk`, `ww`)
- `month` (possible abbreviations: `mm`, `m`)
- `quarter` (possible abbreviations: `qq`, `q`)
- `year` (possible abbreviations: `yyyy`, `yy`)
- `startdate` — The first time value to subtract (the subtrahend). [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `enddate` — The second time value to subtract from (the minuend). [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (optional). If specified, it is applied to both `startdate` and `enddate`. If not specified, timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified. [String](../../sql-reference/data-types/string.md).
**Returned value**
Difference between `enddate` and `startdate` expressed in `unit`.
Type: [Int](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT age('hour', toDateTime('2018-01-01 22:30:00'), toDateTime('2018-01-02 23:00:00'));
```
Result:
``` text
┌─age('hour', toDateTime('2018-01-01 22:30:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 24 │
└───────────────────────────────────────────────────────────────────────────────────┘
```
Query:
``` sql
SELECT
toDate('2022-01-01') AS e,
toDate('2021-12-29') AS s,
age('day', s, e) AS day_age,
age('month', s, e) AS month__age,
age('year', s, e) AS year_age;
```
Result:
``` text
┌──────────e─┬──────────s─┬─day_age─┬─month__age─┬─year_age─┐
│ 2022-01-01 │ 2021-12-29 │ 3 │ 0 │ 0 │
└────────────┴────────────┴─────────┴────────────┴──────────┘
```
## date\_diff
Returns the count of the specified `unit` boundaries crossed between the `startdate` and `enddate`.
The difference is calculated using relative units, e.g. the difference between `2021-12-29` and `2022-01-01` is 3 days for day unit (see [toRelativeDayNum](#torelativedaynum)), 1 month for month unit (see [toRelativeMonthNum](#torelativemonthnum)), 1 year for year unit (see [toRelativeYearNum](#torelativeyearnum)).
**Syntax**
``` sql
date_diff('unit', startdate, enddate, [timezone])
```
Aliases: `dateDiff`, `DATE_DIFF`.
**Arguments**
- `unit` — The type of interval for result. [String](../../sql-reference/data-types/string.md).
Possible values:
- `second` (possible abbreviations: `ss`, `s`)
- `minute` (possible abbreviations: `mi`, `n`)
- `hour` (possible abbreviations: `hh`, `h`)
- `day` (possible abbreviations: `dd`, `d`)
- `week` (possible abbreviations: `wk`, `ww`)
- `month` (possible abbreviations: `mm`, `m`)
- `quarter` (possible abbreviations: `qq`, `q`)
- `year` (possible abbreviations: `yyyy`, `yy`)
- `startdate` — The first time value to subtract (the subtrahend). [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `enddate` — The second time value to subtract from (the minuend). [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (optional). If specified, it is applied to both `startdate` and `enddate`. If not specified, timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified. [String](../../sql-reference/data-types/string.md).
**Returned value**
Difference between `enddate` and `startdate` expressed in `unit`.
Type: [Int](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
```
Result:
``` text
┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 25 │
└────────────────────────────────────────────────────────────────────────────────────────┘
```
Query:
``` sql
SELECT
toDate('2022-01-01') AS e,
toDate('2021-12-29') AS s,
dateDiff('day', s, e) AS day_diff,
dateDiff('month', s, e) AS month__diff,
dateDiff('year', s, e) AS year_diff;
```
Result:
``` text
┌──────────e─┬──────────s─┬─day_diff─┬─month__diff─┬─year_diff─┐
│ 2022-01-01 │ 2021-12-29 │ 3 │ 1 │ 1 │
└────────────┴────────────┴──────────┴─────────────┴───────────┘
```
## date\_trunc
Truncates date and time data to the specified part of date.
@ -785,6 +637,80 @@ Result:
└───────────────────────────────────────────────┘
```
## date\_diff
Returns the difference between two dates or dates with time values.
The difference is calculated using relative units, e.g. the difference between `2022-01-01` and `2021-12-29` is 3 days for day unit (see [toRelativeDayNum](#torelativedaynum)), 1 month for month unit (see [toRelativeMonthNum](#torelativemonthnum)), 1 year for year unit (see [toRelativeYearNum](#torelativeyearnum)).
**Syntax**
``` sql
date_diff('unit', startdate, enddate, [timezone])
```
Aliases: `dateDiff`, `DATE_DIFF`.
**Arguments**
- `unit` — The type of interval for result. [String](../../sql-reference/data-types/string.md).
Possible values:
- `second`
- `minute`
- `hour`
- `day`
- `week`
- `month`
- `quarter`
- `year`
- `startdate` — The first time value to subtract (the subtrahend). [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `enddate` — The second time value to subtract from (the minuend). [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (optional). If specified, it is applied to both `startdate` and `enddate`. If not specified, timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified. [String](../../sql-reference/data-types/string.md).
**Returned value**
Difference between `enddate` and `startdate` expressed in `unit`.
Type: [Int](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
```
Result:
``` text
┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 25 │
└────────────────────────────────────────────────────────────────────────────────────────┘
```
Query:
``` sql
SELECT
toDate('2022-01-01') AS e,
toDate('2021-12-29') AS s,
dateDiff('day', s, e) AS day_diff,
dateDiff('month', s, e) AS month__diff,
dateDiff('year', s, e) AS year_diff;
```
Result:
``` text
┌──────────e─┬──────────s─┬─day_diff─┬─month__diff─┬─year_diff─┐
│ 2022-01-01 │ 2021-12-29 │ 3 │ 1 │ 1 │
└────────────┴────────────┴──────────┴─────────────┴───────────┘
```
## date\_sub
Subtracts the time interval or date interval from the provided date or date with time.


@ -77,8 +77,9 @@ Numeric literal tries to be parsed:
Literal value has the smallest type that the value fits in.
For example, 1 is parsed as `UInt8`, but 256 is parsed as `UInt16`. For more information, see [Data types](../sql-reference/data-types/index.md).
Underscores `_` inside numeric literals are ignored and can be used for better readability.
Examples: `1`, `18446744073709551615`, `0xDEADBEEF`, `01`, `0.1`, `1e100`, `-1e-100`, `inf`, `nan`.
Examples: `1`, `10_000_000`, `0xffff_ffff`, `18446744073709551615`, `0xDEADBEEF`, `01`, `0.1`, `1e100`, `-1e-100`, `inf`, `nan`.
### String


@ -424,23 +424,23 @@ WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64 SELECT toStartOfSecond(d
## toRelativeYearNum {#torelativeyearnum}
Converts a date or date with time to the number of the year, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the year, starting from a certain fixed point in the past.
## toRelativeQuarterNum {#torelativequarternum}
Converts a date or date with time to the number of the quarter, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the quarter, starting from a certain fixed point in the past.
## toRelativeMonthNum {#torelativemonthnum}
Converts a date or date with time to the number of the month, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the month, starting from a certain fixed point in the past.
## toRelativeWeekNum {#torelativeweeknum}
Converts a date or date with time to the number of the week, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the week, starting from a certain fixed point in the past.
## toRelativeDayNum {#torelativedaynum}
Converts a date or date with time to the number of the day, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the day, starting from a certain fixed point in the past.
## toRelativeHourNum {#torelativehournum}
@ -456,7 +456,7 @@ WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64 SELECT toStartOfSecond(d
## toISOYear {#toisoyear}
Converts a date or date with time to a UInt16 number containing the ISO year number. The ISO year differs from the calendar year because, according to [ISO 8601:1988](https://en.wikipedia.org/wiki/ISO_8601), the ISO year does not necessarily start on January 1.
Converts a date with time or date to a UInt16 number containing the ISO year number. The ISO year differs from the calendar year because, according to [ISO 8601:1988](https://en.wikipedia.org/wiki/ISO_8601), the ISO year does not necessarily start on January 1.
**Example**
@ -479,7 +479,7 @@ SELECT
## toISOWeek {#toisoweek}
Converts a date or date with time to a UInt8 number containing the ISO week number.
Converts a date with time or date to a UInt8 number containing the ISO week number.
The start of the ISO year differs from the start of the calendar year because, according to [ISO 8601:1988](https://en.wikipedia.org/wiki/ISO_8601), the first week of the year is the week with four or more days in that year.
January 1, 2017 is a Sunday, so the first ISO week of 2017 started on Monday, January 2; therefore January 1, 2017 belongs to the last week of 2016.
@ -503,7 +503,7 @@ SELECT
```
## toWeek(date\[, mode\]\[, timezone\]) {#toweek}
Converts a date or date with time to a UInt8 number containing the week number. The second argument, `mode`, specifies whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the `mode` argument is omitted, mode 0 is used.
Converts a date with time or date to a UInt8 number containing the week number. The second argument, `mode`, specifies whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the `mode` argument is omitted, mode 0 is used.
`toISOWeek()` is equivalent to `toWeek(date,3)`.
@ -569,132 +569,6 @@ SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(d
└────────────┴───────────┴───────────┴───────────┘
```
## age
Returns the `unit` component of the difference between `startdate` and `enddate`. The difference is calculated with a precision of 1 second.
For example, the difference between `2021-12-29` and `2022-01-01` is 3 days for the `day` unit, 0 months for the `month` unit, and 0 years for the `year` unit.
**Syntax**
``` sql
age('unit', startdate, enddate, [timezone])
```
**Arguments**
- `unit` — The unit of time in which the returned value is expressed. [String](../../sql-reference/data-types/string.md).
Possible values:
- `second` (possible abbreviations: `ss`, `s`)
- `minute` (possible abbreviations: `mi`, `n`)
- `hour` (possible abbreviations: `hh`, `h`)
- `day` (possible abbreviations: `dd`, `d`)
- `week` (possible abbreviations: `wk`, `ww`)
- `month` (possible abbreviations: `mm`, `m`)
- `quarter` (possible abbreviations: `qq`, `q`)
- `year` (possible abbreviations: `yyyy`, `yy`)
- `startdate` — The first date or date with time, which is subtracted from `enddate`. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `enddate` — The second date or date with time, from which `startdate` is subtracted. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (optional). If specified, it is applied to both `startdate` and `enddate`. If not specified, the timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified. [String](../../sql-reference/data-types/string.md).
**Returned value**
Difference between `enddate` and `startdate` expressed in `unit`.
Type: [Int](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT age('hour', toDateTime('2018-01-01 22:30:00'), toDateTime('2018-01-02 23:00:00'));
```
Result:
``` text
┌─age('hour', toDateTime('2018-01-01 22:30:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 24 │
└───────────────────────────────────────────────────────────────────────────────────┘
```
Query:
``` sql
SELECT
toDate('2022-01-01') AS e,
toDate('2021-12-29') AS s,
age('day', s, e) AS day_age,
age('month', s, e) AS month__age,
age('year', s, e) AS year_age;
```
Result:
``` text
┌──────────e─┬──────────s─┬─day_age─┬─month__age─┬─year_age─┐
│ 2022-01-01 │ 2021-12-29 │ 3 │ 0 │ 0 │
└────────────┴────────────┴─────────┴────────────┴──────────┘
```
## date\_diff {#date_diff}
Returns the count of the specified `unit` boundaries crossed between `startdate` and `enddate`.
**Syntax**
``` sql
date_diff('unit', startdate, enddate, [timezone])
```
Aliases: `dateDiff`, `DATE_DIFF`.
**Arguments**
- `unit` — The unit of time in which the returned value is expressed. [String](../../sql-reference/data-types/string.md).
Possible values:
- `second` (possible abbreviations: `ss`, `s`)
- `minute` (possible abbreviations: `mi`, `n`)
- `hour` (possible abbreviations: `hh`, `h`)
- `day` (possible abbreviations: `dd`, `d`)
- `week` (possible abbreviations: `wk`, `ww`)
- `month` (possible abbreviations: `mm`, `m`)
- `quarter` (possible abbreviations: `qq`, `q`)
- `year` (possible abbreviations: `yyyy`, `yy`)
- `startdate` — The first date or date with time, which is subtracted from `enddate`. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `enddate` — The second date or date with time, from which `startdate` is subtracted. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (optional). If specified, it is applied to both `startdate` and `enddate`. If not specified, the timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified. [String](../../sql-reference/data-types/string.md).
**Returned value**
Difference between `enddate` and `startdate` expressed in `unit`.
Type: [Int](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
```
Result:
``` text
┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 25 │
└────────────────────────────────────────────────────────────────────────────────────────┘
```
## date_trunc {#date_trunc}
Truncates date and time data, dropping the parts smaller than the specified part.
@ -815,6 +689,60 @@ SELECT date_add(YEAR, 3, toDate('2018-01-01'));
└───────────────────────────────────────────────┘
```
## date\_diff {#date_diff}
Returns the difference between two date or date with time values.
**Syntax**
``` sql
date_diff('unit', startdate, enddate, [timezone])
```
Aliases: `dateDiff`, `DATE_DIFF`.
**Arguments**
- `unit` — The unit of time in which the returned value is expressed. [String](../../sql-reference/data-types/string.md).
Possible values:
- `second`
- `minute`
- `hour`
- `day`
- `week`
- `month`
- `quarter`
- `year`
- `startdate` — The first date or date with time, which is subtracted from `enddate`. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `enddate` — The second date or date with time, from which `startdate` is subtracted. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) (optional). If specified, it is applied to both `startdate` and `enddate`. If not specified, the timezones of `startdate` and `enddate` are used. If they are not the same, the result is unspecified. [String](../../sql-reference/data-types/string.md).
**Returned value**
Difference between `enddate` and `startdate` expressed in `unit`.
Type: [Int](../../sql-reference/data-types/int-uint.md).
**Example**
Query:
``` sql
SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));
```
Result:
``` text
┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
│ 25 │
└────────────────────────────────────────────────────────────────────────────────────────┘
```
## date\_sub {#date_sub}
Subtracts a time interval or date interval from the provided date or date with time.

View File

@ -0,0 +1,647 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/IAggregateFunction.h>
#include <AggregateFunctions/FactoryHelpers.h>
#include <Common/HashTable/HashMap.h>
#include <Common/SymbolIndex.h>
#include <Common/ArenaAllocator.h>
#include <Core/Settings.h>
#include <Columns/ColumnArray.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <IO/WriteHelpers.h>
#include <IO/Operators.h>
#include <filesystem>
namespace DB
{
namespace ErrorCodes
{
extern const int FUNCTION_NOT_ALLOWED;
extern const int NOT_IMPLEMENTED;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
struct AggregateFunctionFlameGraphTree
{
struct ListNode;
struct TreeNode
{
TreeNode * parent = nullptr;
ListNode * children = nullptr;
UInt64 ptr = 0;
size_t allocated = 0;
};
struct ListNode
{
ListNode * next = nullptr;
TreeNode * child = nullptr;
};
TreeNode root;
static ListNode * createChild(TreeNode * parent, UInt64 ptr, Arena * arena)
{
ListNode * list_node = reinterpret_cast<ListNode *>(arena->alloc(sizeof(ListNode)));
TreeNode * tree_node = reinterpret_cast<TreeNode *>(arena->alloc(sizeof(TreeNode)));
list_node->child = tree_node;
list_node->next = nullptr;
tree_node->parent = parent;
tree_node->children = nullptr;
tree_node->ptr = ptr;
tree_node->allocated = 0;
return list_node;
}
TreeNode * find(const UInt64 * stack, size_t stack_size, Arena * arena)
{
TreeNode * node = &root;
for (size_t i = 0; i < stack_size; ++i)
{
UInt64 ptr = stack[i];
if (ptr == 0)
break;
if (!node->children)
{
node->children = createChild(node, ptr, arena);
node = node->children->child;
}
else
{
ListNode * list = node->children;
while (list->child->ptr != ptr && list->next)
list = list->next;
if (list->child->ptr != ptr)
{
list->next = createChild(node, ptr, arena);
list = list->next;
}
node = list->child;
}
}
return node;
}
static void append(DB::PaddedPODArray<UInt64> & values, DB::PaddedPODArray<UInt64> & offsets, std::vector<UInt64> & frame)
{
UInt64 prev = offsets.empty() ? 0 : offsets.back();
offsets.push_back(prev + frame.size());
for (UInt64 val : frame)
values.push_back(val);
}
struct Trace
{
using Frames = std::vector<UInt64>;
Frames frames;
/// The total number of bytes allocated for traces with the same prefix.
size_t allocated_total = 0;
/// This counter is relevant in case we want to filter some traces with small amount of bytes.
/// It shows the total number of bytes for *filtered* traces with the same prefix.
/// This is the value which is used in flamegraph.
size_t allocated_self = 0;
};
using Traces = std::vector<Trace>;
Traces dump(size_t max_depth, size_t min_bytes) const
{
Traces traces;
Trace::Frames frames;
std::vector<size_t> allocated_total;
std::vector<size_t> allocated_self;
std::vector<ListNode *> nodes;
nodes.push_back(root.children);
allocated_total.push_back(root.allocated);
allocated_self.push_back(root.allocated);
while (!nodes.empty())
{
if (nodes.back() == nullptr)
{
traces.push_back({frames, allocated_total.back(), allocated_self.back()});
nodes.pop_back();
allocated_total.pop_back();
allocated_self.pop_back();
/// The root has no frame of its own, so frames are empty at the end.
if (!frames.empty())
frames.pop_back();
continue;
}
TreeNode * current = nodes.back()->child;
nodes.back() = nodes.back()->next;
bool enough_bytes = current->allocated >= min_bytes;
bool enough_depth = max_depth == 0 || nodes.size() < max_depth;
if (enough_bytes)
{
frames.push_back(current->ptr);
allocated_self.back() -= current->allocated;
if (enough_depth)
{
allocated_total.push_back(current->allocated);
allocated_self.push_back(current->allocated);
nodes.push_back(current->children);
}
else
{
traces.push_back({frames, current->allocated, current->allocated});
frames.pop_back();
}
}
}
return traces;
}
};
static void insertData(DB::PaddedPODArray<UInt8> & chars, DB::PaddedPODArray<UInt64> & offsets, const char * pos, size_t length)
{
const size_t old_size = chars.size();
const size_t new_size = old_size + length + 1;
chars.resize(new_size);
if (length)
memcpy(chars.data() + old_size, pos, length);
chars[old_size + length] = 0;
offsets.push_back(new_size);
}
/// Split str by line feed and write as separate row to ColumnString.
static void fillColumn(DB::PaddedPODArray<UInt8> & chars, DB::PaddedPODArray<UInt64> & offsets, const std::string & str)
{
size_t start = 0;
size_t end = 0;
size_t size = str.size();
while (end < size)
{
if (str[end] == '\n')
{
insertData(chars, offsets, str.data() + start, end - start);
start = end + 1;
}
++end;
}
if (start < end)
insertData(chars, offsets, str.data() + start, end - start);
}
void dumpFlameGraph(
const AggregateFunctionFlameGraphTree::Traces & traces,
DB::PaddedPODArray<UInt8> & chars,
DB::PaddedPODArray<UInt64> & offsets)
{
DB::WriteBufferFromOwnString out;
std::unordered_map<uintptr_t, size_t> mapping;
#if defined(__ELF__) && !defined(OS_FREEBSD)
auto symbol_index_ptr = DB::SymbolIndex::instance();
const DB::SymbolIndex & symbol_index = *symbol_index_ptr;
#endif
for (const auto & trace : traces)
{
if (trace.allocated_self == 0)
continue;
for (size_t i = 0; i < trace.frames.size(); ++i)
{
if (i)
out << ";";
const void * ptr = reinterpret_cast<const void *>(trace.frames[i]);
#if defined(__ELF__) && !defined(OS_FREEBSD)
if (const auto * symbol = symbol_index.findSymbol(ptr))
writeString(demangle(symbol->name), out);
else
DB::writePointerHex(ptr, out);
#else
DB::writePointerHex(ptr, out);
#endif
}
out << ' ' << trace.allocated_self << "\n";
}
fillColumn(chars, offsets, out.str());
}
struct AggregateFunctionFlameGraphData
{
struct Entry
{
AggregateFunctionFlameGraphTree::TreeNode * trace;
UInt64 size;
Entry * next = nullptr;
};
struct Pair
{
Entry * allocation = nullptr;
Entry * deallocation = nullptr;
};
using Entries = HashMap<UInt64, Pair>;
AggregateFunctionFlameGraphTree tree;
Entries entries;
Entry * free_list = nullptr;
Entry * alloc(Arena * arena)
{
if (free_list)
{
auto * res = free_list;
free_list = free_list->next;
return res;
}
return reinterpret_cast<Entry *>(arena->alloc(sizeof(Entry)));
}
void release(Entry * entry)
{
entry->next = free_list;
free_list = entry;
}
static void track(Entry * allocation)
{
auto * node = allocation->trace;
while (node)
{
node->allocated += allocation->size;
node = node->parent;
}
}
static void untrack(Entry * allocation)
{
auto * node = allocation->trace;
while (node)
{
node->allocated -= allocation->size;
node = node->parent;
}
}
static Entry * tryFindMatchAndRemove(Entry *& list, UInt64 size)
{
if (!list)
return nullptr;
if (list->size == size)
{
Entry * entry = list;
list = list->next;
return entry;
}
else
{
Entry * parent = list;
while (parent->next && parent->next->size != size)
parent = parent->next;
if (parent->next && parent->next->size == size)
{
Entry * entry = parent->next;
parent->next = entry->next;
return entry;
}
return nullptr;
}
}
void add(UInt64 ptr, Int64 size, const UInt64 * stack, size_t stack_size, Arena * arena)
{
/// If ptr is 0, there is no address to match deallocations against, so only track allocations.
if (ptr == 0)
{
if (size > 0)
{
auto * node = tree.find(stack, stack_size, arena);
Entry entry{.trace = node, .size = UInt64(size)};
track(&entry);
}
return;
}
auto & place = entries[ptr];
if (size > 0)
{
if (auto * deallocation = tryFindMatchAndRemove(place.deallocation, size))
{
release(deallocation);
}
else
{
auto * node = tree.find(stack, stack_size, arena);
auto * allocation = alloc(arena);
allocation->size = UInt64(size);
allocation->trace = node;
track(allocation);
allocation->next = place.allocation;
place.allocation = allocation;
}
}
else if (size < 0)
{
UInt64 abs_size = -size;
if (auto * allocation = tryFindMatchAndRemove(place.allocation, abs_size))
{
untrack(allocation);
release(allocation);
}
else
{
auto * deallocation = alloc(arena);
deallocation->size = abs_size;
deallocation->next = place.deallocation;
place.deallocation = deallocation;
}
}
}
void merge(const AggregateFunctionFlameGraphTree & other_tree, Arena * arena)
{
AggregateFunctionFlameGraphTree::Trace::Frames frames;
std::vector<AggregateFunctionFlameGraphTree::ListNode *> nodes;
nodes.push_back(other_tree.root.children);
while (!nodes.empty())
{
if (nodes.back() == nullptr)
{
nodes.pop_back();
/// The root has no frame of its own, so frames are empty at the end.
if (!frames.empty())
frames.pop_back();
continue;
}
AggregateFunctionFlameGraphTree::TreeNode * current = nodes.back()->child;
nodes.back() = nodes.back()->next;
frames.push_back(current->ptr);
if (current->children)
nodes.push_back(current->children);
else
{
if (current->allocated)
add(0, current->allocated, frames.data(), frames.size(), arena);
frames.pop_back();
}
}
}
void merge(const AggregateFunctionFlameGraphData & other, Arena * arena)
{
AggregateFunctionFlameGraphTree::Trace::Frames frames;
for (const auto & entry : other.entries)
{
for (auto * allocation = entry.value.second.allocation; allocation; allocation = allocation->next)
{
frames.clear();
const auto * node = allocation->trace;
while (node->ptr)
{
frames.push_back(node->ptr);
node = node->parent;
}
std::reverse(frames.begin(), frames.end());
add(entry.value.first, allocation->size, frames.data(), frames.size(), arena);
untrack(allocation);
}
for (auto * deallocation = entry.value.second.deallocation; deallocation; deallocation = deallocation->next)
{
add(entry.value.first, -Int64(deallocation->size), nullptr, 0, arena);
}
}
merge(other.tree, arena);
}
void dumpFlameGraph(
DB::PaddedPODArray<UInt8> & chars,
DB::PaddedPODArray<UInt64> & offsets,
size_t max_depth, size_t min_bytes) const
{
DB::dumpFlameGraph(tree.dump(max_depth, min_bytes), chars, offsets);
}
};
/// Aggregate function which builds a flamegraph using the list of stacktraces.
/// The output is an array of strings which can be used by flamegraph.pl util.
/// See https://github.com/brendangregg/FlameGraph
///
/// Syntax: flameGraph(trace, [size = 1], [ptr = 0])
/// - trace : Array(UInt64), a stacktrace
/// - size : Int64, an allocation size (for memory profiling)
/// - ptr : UInt64, an allocation address
/// If ptr != 0, flameGraph matches allocations (size > 0) against deallocations (size < 0) with the same size and ptr.
/// Only allocations that were not freed are shown. Unmatched deallocations are ignored.
///
/// Usage:
///
/// * Build a flamegraph based on CPU query profiler
/// set query_profiler_cpu_time_period_ns=10000000;
/// SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
/// clickhouse client --allow_introspection_functions=1
/// -q "select arrayJoin(flameGraph(arrayReverse(trace))) from system.trace_log where trace_type = 'CPU' and query_id = 'xxx'"
/// | ~/dev/FlameGraph/flamegraph.pl > flame_cpu.svg
///
/// * Build a flamegraph based on memory query profiler, showing all allocations
/// set memory_profiler_sample_probability=1, max_untracked_memory=1;
/// SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
/// clickhouse client --allow_introspection_functions=1
/// -q "select arrayJoin(flameGraph(trace, size)) from system.trace_log where trace_type = 'MemorySample' and query_id = 'xxx'"
/// | ~/dev/FlameGraph/flamegraph.pl --countname=bytes --color=mem > flame_mem.svg
///
/// * Build a flamegraph based on memory query profiler, showing allocations which were not deallocated in query context
/// set memory_profiler_sample_probability=1, max_untracked_memory=1, use_uncompressed_cache=1, merge_tree_max_rows_to_use_cache=100000000000, merge_tree_max_bytes_to_use_cache=1000000000000;
/// SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
/// clickhouse client --allow_introspection_functions=1
/// -q "select arrayJoin(flameGraph(trace, size, ptr)) from system.trace_log where trace_type = 'MemorySample' and query_id = 'xxx'"
/// | ~/dev/FlameGraph/flamegraph.pl --countname=bytes --color=mem > flame_mem_untracked.svg
///
/// * Build a flamegraph based on memory query profiler, showing active allocations at a fixed point in time
/// set memory_profiler_sample_probability=1, max_untracked_memory=1;
/// SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
/// 1. Memory usage per second
/// select event_time, m, formatReadableSize(max(s) as m) from (select event_time, sum(size) over (order by event_time) as s from system.trace_log where query_id = 'xxx' and trace_type = 'MemorySample') group by event_time order by event_time;
/// 2. Find a time point with maximal memory usage
/// select argMax(event_time, s), max(s) from (select event_time, sum(size) over (order by event_time) as s from system.trace_log where query_id = 'xxx' and trace_type = 'MemorySample');
/// 3. Capture active allocations at a fixed point in time
/// clickhouse client --allow_introspection_functions=1
/// -q "select arrayJoin(flameGraph(trace, size, ptr)) from (select * from system.trace_log where trace_type = 'MemorySample' and query_id = 'xxx' and event_time <= 'yyy' order by event_time)"
/// | ~/dev/FlameGraph/flamegraph.pl --countname=bytes --color=mem > flame_mem_time_point_pos.svg
/// 4. Find deallocations at a fixed point in time
/// clickhouse client --allow_introspection_functions=1
/// -q "select arrayJoin(flameGraph(trace, -size, ptr)) from (select * from system.trace_log where trace_type = 'MemorySample' and query_id = 'xxx' and event_time > 'yyy' order by event_time desc)"
/// | ~/dev/FlameGraph/flamegraph.pl --countname=bytes --color=mem > flame_mem_time_point_neg.svg
class AggregateFunctionFlameGraph final : public IAggregateFunctionDataHelper<AggregateFunctionFlameGraphData, AggregateFunctionFlameGraph>
{
public:
explicit AggregateFunctionFlameGraph(const DataTypes & argument_types_)
: IAggregateFunctionDataHelper<AggregateFunctionFlameGraphData, AggregateFunctionFlameGraph>(argument_types_, {})
{}
String getName() const override { return "flameGraph"; }
DataTypePtr getReturnType() const override
{
return std::make_shared<DataTypeArray>(std::make_shared<DataTypeString>());
}
bool allocatesMemoryInArena() const override { return true; }
void add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena * arena) const override
{
const auto * trace = typeid_cast<const ColumnArray *>(columns[0]);
const auto & trace_offsets = trace->getOffsets();
const auto & trace_values = typeid_cast<const ColumnUInt64 *>(&trace->getData())->getData();
UInt64 prev_offset = 0;
if (row_num)
prev_offset = trace_offsets[row_num - 1];
UInt64 trace_size = trace_offsets[row_num] - prev_offset;
Int64 allocated = 1;
if (argument_types.size() >= 2)
{
const auto & sizes = typeid_cast<const ColumnInt64 *>(columns[1])->getData();
allocated = sizes[row_num];
}
UInt64 ptr = 0;
if (argument_types.size() >= 3)
{
const auto & ptrs = typeid_cast<const ColumnUInt64 *>(columns[2])->getData();
ptr = ptrs[row_num];
}
this->data(place).add(ptr, allocated, trace_values.data() + prev_offset, trace_size, arena);
}
void addManyDefaults(
AggregateDataPtr __restrict /*place*/,
const IColumn ** /*columns*/,
size_t /*length*/,
Arena * /*arena*/) const override
{
}
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena * arena) const override
{
this->data(place).merge(this->data(rhs), arena);
}
void serialize(ConstAggregateDataPtr __restrict, WriteBuffer &, std::optional<size_t> /* version */) const override
{
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Serialization for function flameGraph is not implemented.");
}
void deserialize(AggregateDataPtr __restrict, ReadBuffer &, std::optional<size_t> /* version */, Arena *) const override
{
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Deserialization for function flameGraph is not implemented.");
}
void insertResultInto(AggregateDataPtr __restrict place, IColumn & to, Arena *) const override
{
auto & array = assert_cast<ColumnArray &>(to);
auto & str = assert_cast<ColumnString &>(array.getData());
this->data(place).dumpFlameGraph(str.getChars(), str.getOffsets(), 0, 0);
array.getOffsets().push_back(str.size());
}
};
static void check(const std::string & name, const DataTypes & argument_types, const Array & params)
{
assertNoParameters(name, params);
if (argument_types.empty() || argument_types.size() > 3)
throw Exception(
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH,
"Aggregate function {} requires 1 to 3 arguments : trace, [size = 1], [ptr = 0]",
name);
auto ptr_type = std::make_shared<DataTypeUInt64>();
auto trace_type = std::make_shared<DataTypeArray>(ptr_type);
auto size_type = std::make_shared<DataTypeInt64>();
if (!argument_types[0]->equals(*trace_type))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"First argument (trace) for function {} must be Array(UInt64), but it has type {}",
name, argument_types[0]->getName());
if (argument_types.size() >= 2 && !argument_types[1]->equals(*size_type))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Second argument (size) for function {} must be Int64, but it has type {}",
name, argument_types[1]->getName());
if (argument_types.size() >= 3 && !argument_types[2]->equals(*ptr_type))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Third argument (ptr) for function {} must be UInt64, but it has type {}",
name, argument_types[2]->getName());
}
AggregateFunctionPtr createAggregateFunctionFlameGraph(const std::string & name, const DataTypes & argument_types, const Array & params, const Settings * settings)
{
if (!settings->allow_introspection_functions)
throw Exception(ErrorCodes::FUNCTION_NOT_ALLOWED,
"Introspection functions are disabled, because setting 'allow_introspection_functions' is set to 0");
check(name, argument_types, params);
return std::make_shared<AggregateFunctionFlameGraph>(argument_types);
}
void registerAggregateFunctionFlameGraph(AggregateFunctionFactory & factory)
{
AggregateFunctionProperties properties = { .returns_default_when_only_null = true, .is_order_dependent = true };
factory.registerFunction("flameGraph", { createAggregateFunctionFlameGraph, properties });
}
}
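
To make the output format of the file above easier to follow, here is a simplified, self-contained sketch of the same idea: stacks are merged into a prefix tree, every node accumulates the bytes of all stacks passing through it, and each emitted line has the folded form `frame;frame;... self_bytes` that `flamegraph.pl` consumes. The names `Node` and `FlameTree` are hypothetical. The sketch uses `std::vector` and `std::map` instead of the arena-backed linked lists in the diff, prints raw addresses instead of resolving symbols, and has no `max_depth` or `min_bytes` filtering.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical simplified node: the real code links children through arena-allocated lists.
struct Node
{
    std::uint64_t allocated = 0;                   // bytes for this frame and every frame below it
    std::map<std::uint64_t, std::size_t> children; // frame address -> index into FlameTree::nodes
};

class FlameTree
{
public:
    FlameTree() : nodes(1) {} // nodes[0] is the root, which has no frame of its own

    // Adds `bytes` to every node on the path described by `stack` (outermost frame first).
    void add(const std::vector<std::uint64_t> & stack, std::uint64_t bytes)
    {
        std::size_t current = 0;
        nodes[current].allocated += bytes;
        for (std::uint64_t frame : stack)
        {
            std::size_t next = 0;
            auto it = nodes[current].children.find(frame);
            if (it == nodes[current].children.end())
            {
                next = nodes.size();
                nodes.emplace_back(); // may reallocate, so no references are held across this call
                nodes[current].children.emplace(frame, next);
            }
            else
            {
                next = it->second;
            }
            current = next;
            nodes[current].allocated += bytes;
        }
    }

    // Returns folded lines; "self" bytes are a node's total minus the totals of its children.
    std::vector<std::string> dump() const
    {
        std::vector<std::string> lines;
        std::vector<std::uint64_t> path;
        dumpNode(0, path, lines);
        return lines;
    }

private:
    void dumpNode(std::size_t index, std::vector<std::uint64_t> & path, std::vector<std::string> & lines) const
    {
        std::uint64_t self = nodes[index].allocated;
        for (const auto & child : nodes[index].children)
        {
            self -= nodes[child.second].allocated;
            path.push_back(child.first);
            dumpNode(child.second, path, lines);
            path.pop_back();
        }
        if (self == 0 || path.empty())
            return; // nothing to report for this prefix (the root is skipped in this sketch)
        std::ostringstream line;
        for (std::size_t i = 0; i < path.size(); ++i)
        {
            if (i)
                line << ';';
            line << "0x" << std::hex << path[i];
        }
        line << ' ' << std::dec << self;
        lines.push_back(line.str());
    }

    std::vector<Node> nodes;
};
```

Adding the stacks `{1, 2}` (10 and 5 bytes), `{1, 3}` (7 bytes) and `{1}` (4 bytes) produces one line per prefix that has bytes of its own, with children listed before their parent.
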

View File

@ -73,6 +73,7 @@ void registerAggregateFunctionExponentialMovingAverage(AggregateFunctionFactory
void registerAggregateFunctionSparkbar(AggregateFunctionFactory &);
void registerAggregateFunctionIntervalLengthSum(AggregateFunctionFactory &);
void registerAggregateFunctionAnalysisOfVariance(AggregateFunctionFactory &);
void registerAggregateFunctionFlameGraph(AggregateFunctionFactory &);
class AggregateFunctionCombinatorFactory;
void registerAggregateFunctionCombinatorIf(AggregateFunctionCombinatorFactory &);
@ -158,6 +159,7 @@ void registerAggregateFunctions()
registerAggregateFunctionExponentialMovingAverage(factory);
registerAggregateFunctionSparkbar(factory);
registerAggregateFunctionAnalysisOfVariance(factory);
registerAggregateFunctionFlameGraph(factory);
registerWindowFunctions(factory);
}
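
The `flameGraph` function registered above pairs allocations with deallocations when an address is passed as the third argument. The sketch below illustrates only that pairing rule, under simplifying assumptions: the class name `LiveAllocations` is hypothetical, sizes are kept in plain vectors rather than arena-allocated lists, and no stack traces are stored. An event cancels a pending event of the opposite sign with the same address and size, regardless of arrival order; otherwise it is remembered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical illustration of matching allocations and deallocations by (ptr, size).
class LiveAllocations
{
public:
    // size > 0 is an allocation at ptr, size < 0 is a deallocation of -size bytes.
    void add(std::uint64_t ptr, std::int64_t size)
    {
        if (size == 0)
            return;
        const std::uint64_t bytes = static_cast<std::uint64_t>(size > 0 ? size : -size);
        Pending & slot = pending[ptr];
        std::vector<std::uint64_t> & same = size > 0 ? slot.allocations : slot.deallocations;
        std::vector<std::uint64_t> & opposite = size > 0 ? slot.deallocations : slot.allocations;

        // Cancel against one pending event of the opposite kind, or remember this one.
        auto it = std::find(opposite.begin(), opposite.end(), bytes);
        if (it != opposite.end())
            opposite.erase(it);
        else
            same.push_back(bytes);
    }

    // Bytes of allocations that have not been matched by a deallocation.
    std::uint64_t liveBytes() const
    {
        std::uint64_t total = 0;
        for (const auto & entry : pending)
            for (std::uint64_t bytes : entry.second.allocations)
                total += bytes;
        return total;
    }

private:
    struct Pending
    {
        std::vector<std::uint64_t> allocations;
        std::vector<std::uint64_t> deallocations;
    };

    std::unordered_map<std::uint64_t, Pending> pending;
};
```

Remembering unmatched deallocations is what lets the result stay the same when a free is seen before its allocation, for example when states are merged in a different order than the events occurred.
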

View File

@ -0,0 +1,197 @@
#include <Analyzer/Passes/IfTransformStringsToEnumPass.h>
#include <Analyzer/ConstantNode.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/IQueryTreeNode.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeEnum.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/IDataType.h>
#include <Functions/FunctionFactory.h>
namespace DB
{
namespace
{
/// We place strings in ascending order here under the assumption it could speed up String to Enum conversion.
template <typename EnumType>
auto getDataEnumType(const std::set<std::string> & string_values)
{
using EnumValues = typename EnumType::Values;
EnumValues enum_values;
enum_values.reserve(string_values.size());
size_t number = 1;
for (const auto & value : string_values)
enum_values.emplace_back(value, number++);
return std::make_shared<EnumType>(std::move(enum_values));
}
DataTypePtr getEnumType(const std::set<std::string> & string_values)
{
if (string_values.size() >= 255)
return getDataEnumType<DataTypeEnum16>(string_values);
else
return getDataEnumType<DataTypeEnum8>(string_values);
}
QueryTreeNodePtr createCastFunction(QueryTreeNodePtr from, DataTypePtr result_type, ContextPtr context)
{
auto enum_literal = std::make_shared<ConstantValue>(result_type->getName(), std::make_shared<DataTypeString>());
auto enum_literal_node = std::make_shared<ConstantNode>(std::move(enum_literal));
auto cast_function = FunctionFactory::instance().get("_CAST", std::move(context));
QueryTreeNodes arguments{std::move(from), std::move(enum_literal_node)};
auto function_node = std::make_shared<FunctionNode>("_CAST");
function_node->resolveAsFunction(std::move(cast_function), std::move(result_type));
function_node->getArguments().getNodes() = std::move(arguments);
return function_node;
}
/// if(arg1, arg2, arg3) will be transformed to if(arg1, _CAST(arg2, Enum...), _CAST(arg3, Enum...))
/// where Enum is generated based on the possible values stored in string_values
void changeIfArguments(
QueryTreeNodePtr & first, QueryTreeNodePtr & second, const std::set<std::string> & string_values, const ContextPtr & context)
{
auto result_type = getEnumType(string_values);
first = createCastFunction(first, result_type, context);
second = createCastFunction(second, result_type, context);
}
/// transform(value, array_from, array_to, default_value) will be transformed to transform(value, array_from, _CAST(array_to, Array(Enum...)), _CAST(default_value, Enum...))
/// where Enum is generated based on the possible values stored in string_values
void changeTransformArguments(
QueryTreeNodePtr & array_to,
QueryTreeNodePtr & default_value,
const std::set<std::string> & string_values,
const ContextPtr & context)
{
auto result_type = getEnumType(string_values);
array_to = createCastFunction(array_to, std::make_shared<DataTypeArray>(result_type), context);
default_value = createCastFunction(default_value, std::move(result_type), context);
}
void wrapIntoToString(FunctionNode & function_node, QueryTreeNodePtr arg, ContextPtr context)
{
assert(isString(function_node.getResultType()));
auto to_string_function = FunctionFactory::instance().get("toString", std::move(context));
QueryTreeNodes arguments{std::move(arg)};
function_node.resolveAsFunction(std::move(to_string_function), std::make_shared<DataTypeString>());
function_node.getArguments().getNodes() = std::move(arguments);
}
class ConvertStringsToEnumVisitor : public InDepthQueryTreeVisitor<ConvertStringsToEnumVisitor>
{
public:
explicit ConvertStringsToEnumVisitor(ContextPtr context_)
: context(std::move(context_))
{
}
void visitImpl(QueryTreeNodePtr & node)
{
auto * function_node = node->as<FunctionNode>();
if (!function_node)
return;
/// to preserve return type (String) of the current function_node, we wrap the newly
/// generated function nodes into toString
std::string_view function_name = function_node->getFunctionName();
if (function_name == "if")
{
if (function_node->getArguments().getNodes().size() != 3)
return;
auto modified_if_node = function_node->clone();
auto & argument_nodes = modified_if_node->as<FunctionNode>()->getArguments().getNodes();
const auto * first_literal = argument_nodes[1]->as<ConstantNode>();
const auto * second_literal = argument_nodes[2]->as<ConstantNode>();
if (!first_literal || !second_literal)
return;
if (!isString(first_literal->getResultType()) || !isString(second_literal->getResultType()))
return;
std::set<std::string> string_values;
string_values.insert(first_literal->getValue().get<std::string>());
string_values.insert(second_literal->getValue().get<std::string>());
changeIfArguments(argument_nodes[1], argument_nodes[2], string_values, context);
wrapIntoToString(*function_node, std::move(modified_if_node), context);
return;
}
if (function_name == "transform")
{
if (function_node->getArguments().getNodes().size() != 4)
return;
auto modified_transform_node = function_node->clone();
auto & argument_nodes = modified_transform_node->as<FunctionNode>()->getArguments().getNodes();
if (!isString(function_node->getResultType()))
return;
const auto * literal_to = argument_nodes[2]->as<ConstantNode>();
const auto * literal_default = argument_nodes[3]->as<ConstantNode>();
if (!literal_to || !literal_default)
return;
if (!isArray(literal_to->getResultType()) || !isString(literal_default->getResultType()))
return;
auto array_to = literal_to->getValue().get<Array>();
if (array_to.empty())
return;
if (!std::all_of(
array_to.begin(),
array_to.end(),
[](const auto & field) { return field.getType() == Field::Types::Which::String; }))
return;
/// collect possible string values
std::set<std::string> string_values;
for (const auto & value : array_to)
string_values.insert(value.get<std::string>());
string_values.insert(literal_default->getValue().get<std::string>());
changeTransformArguments(argument_nodes[2], argument_nodes[3], string_values, context);
wrapIntoToString(*function_node, std::move(modified_transform_node), context);
return;
}
}
private:
ContextPtr context;
};
}
void IfTransformStringsToEnumPass::run(QueryTreeNodePtr query, ContextPtr context)
{
ConvertStringsToEnumVisitor visitor(context);
visitor.visit(query);
}
}
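
As a rough illustration of what the pass above generates, the following sketch renders the enum type name for a set of string literals. The function `enumTypeName` is hypothetical and not part of the analyzer. It mirrors the numbering used in `getDataEnumType` (values in ascending order, numbered from 1) and the `>= 255` threshold from `getEnumType`, and it does not escape quotes inside the values.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: builds a type name such as "Enum8('a' = 1, 'b' = 2)".
// std::set iterates in ascending order, so numbering is deterministic.
std::string enumTypeName(const std::set<std::string> & values)
{
    std::string result = values.size() >= 255 ? "Enum16(" : "Enum8(";
    std::size_t number = 1;
    for (const auto & value : values)
    {
        if (number > 1)
            result += ", ";
        result += "'" + value + "' = " + std::to_string(number);
        ++number;
    }
    result += ")";
    return result;
}
```

For the `if` example in the pass description the set is `{'a', 'b'}`, which gives `Enum8('a' = 1, 'b' = 2)`. For the `transform` example the default value `'c'` joins the set, giving `Enum8('a' = 1, 'b' = 2, 'c' = 3)`.
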

View File

@ -0,0 +1,39 @@
#pragma once
#include <Analyzer/IQueryTreePass.h>
namespace DB
{
/**
* This pass casts string-literal arguments of if and transform to an enum type.
*
* E.g.
* -------------------------------
* SELECT if(number > 5, 'a', 'b')
* FROM system.numbers;
*
* will be transformed into
*
* SELECT if(number > 5, _CAST('a', 'Enum8(\'a\' = 1, \'b\' = 2)'), _CAST('b', 'Enum8(\'a\' = 1, \'b\' = 2)'))
* FROM system.numbers;
* -------------------------------
* SELECT transform(number, [2, 4], ['a', 'b'], 'c') FROM system.numbers;
*
* will be transformed into
*
* SELECT transform(number, [2, 4], _CAST(['a', 'b'], 'Array(Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3)'), _CAST('c', 'Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3)'))
* FROM system.numbers;
* -------------------------------
*/
class IfTransformStringsToEnumPass final : public IQueryTreePass
{
public:
String getName() override { return "IfTransformStringsToEnumPass"; }
String getDescription() override { return "Replaces string-type arguments in If and Transform to enum"; }
void run(QueryTreeNodePtr query_tree_node, ContextPtr context) override;
};
}

View File

@ -14,6 +14,7 @@
#include <Analyzer/Passes/UniqInjectiveFunctionsEliminationPass.h>
#include <Analyzer/Passes/OrderByLimitByDuplicateEliminationPass.h>
#include <Analyzer/Passes/FuseFunctionsPass.h>
#include <Analyzer/Passes/IfTransformStringsToEnumPass.h>
#include <IO/WriteHelpers.h>
#include <IO/Operators.h>
@ -77,7 +78,6 @@ public:
* TODO: Support setting optimize_duplicate_order_by_and_distinct.
* TODO: Support setting optimize_redundant_functions_in_order_by.
* TODO: Support setting optimize_monotonous_functions_in_order_by.
* TODO: Support setting optimize_if_transform_strings_to_enum.
* TODO: Support settings.optimize_or_like_chain.
* TODO: Add optimizations based on function semantics. Example: SELECT * FROM test_table WHERE id != id. (id is not nullable column).
*/
@ -193,6 +193,9 @@ void addQueryTreePasses(QueryTreePassManager & manager)
if (settings.optimize_syntax_fuse_functions)
manager.addPass(std::make_unique<FuseFunctionsPass>());
if (settings.optimize_if_transform_strings_to_enum)
manager.addPass(std::make_unique<IfTransformStringsToEnumPass>());
}
}

View File

@ -181,6 +181,7 @@ OperationID BackupsWorker::startMakingBackup(const ASTPtr & query, const Context
/// For ON CLUSTER queries we will need to change some settings.
/// For ASYNC queries we have to clone the context anyway.
context_in_use = mutable_context = Context::createCopy(context);
mutable_context->makeQueryContext();
}
if (backup_settings.async)
@ -400,6 +401,7 @@ OperationID BackupsWorker::startRestoring(const ASTPtr & query, ContextMutablePt
/// For ON CLUSTER queries we will need to change some settings.
/// For ASYNC queries we have to clone the context anyway.
context_in_use = Context::createCopy(context);
context_in_use->makeQueryContext();
}
if (restore_settings.async)

View File

@ -346,7 +346,7 @@ void RestorerFromBackup::findTableInBackup(const QualifiedTableName & table_name
res_table_info.has_data = backup->hasFiles(data_path_in_backup);
res_table_info.data_path_in_backup = data_path_in_backup;
tables_dependencies.addDependencies(table_name, getDependenciesFromCreateQuery(context->getGlobalContext(), table_name, create_table_query));
tables_dependencies.addDependencies(table_name, getDependenciesFromCreateQuery(context, table_name, create_table_query));
if (partitions)
{
@ -674,6 +674,7 @@ void RestorerFromBackup::removeUnresolvedDependencies()
void RestorerFromBackup::createTables()
{
/// We need to create tables considering their dependencies.
tables_dependencies.log();
auto tables_to_create = tables_dependencies.getTablesSortedByDependency();
for (const auto & table_id : tables_to_create)
{

View File

@ -0,0 +1,16 @@
#pragma once
#include <cstddef>
/// This is a structure which is returned by MemoryTracker.
/// Methods onAlloc/onFree should be called after the actual memory allocation, if it succeeds.
/// For now, it will only collect allocation trace with sample_probability.
struct AllocationTrace
{
AllocationTrace() = default;
explicit AllocationTrace(double sample_probability_);
void onAlloc(void * ptr, size_t size) const;
void onFree(void * ptr, size_t size) const;
double sample_probability = 0;
};
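
The calling pattern that this header enables can be sketched as follows. Everything here is a toy stand-in, not ClickHouse code: `ToyAllocationTrace`, `trackAlloc`, `trackFree`, `sampledMalloc`, `sampledFree`, `sampleLog` and `trackedBytes` are hypothetical names, the sampling uses `std::rand`, and events go to an in-memory vector instead of a trace log. The point it illustrates is the order of operations: account for the bytes first, perform the allocation, and report the pointer only once the allocation has succeeded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

struct SampleEvent
{
    const void * ptr;
    long long size; // positive for allocations, negative for frees
};

// Hypothetical in-memory replacement for a trace log.
std::vector<SampleEvent> & sampleLog()
{
    static std::vector<SampleEvent> log;
    return log;
}

// Hypothetical global counter standing in for a memory tracker.
long long & trackedBytes()
{
    static long long bytes = 0;
    return bytes;
}

class ToyAllocationTrace
{
public:
    explicit ToyAllocationTrace(double sample_probability_) : sample_probability(sample_probability_) {}

    void onAlloc(const void * ptr, std::size_t size) const
    {
        if (shouldSample())
            sampleLog().push_back(SampleEvent{ptr, static_cast<long long>(size)});
    }

    void onFree(const void * ptr, std::size_t size) const
    {
        if (shouldSample())
            sampleLog().push_back(SampleEvent{ptr, -static_cast<long long>(size)});
    }

private:
    bool shouldSample() const
    {
        if (sample_probability <= 0)
            return false;
        if (sample_probability >= 1)
            return true;
        return std::rand() < sample_probability * RAND_MAX;
    }

    double sample_probability;
};

// The tracker accounts for the bytes and hands back a trace object for the caller to use.
ToyAllocationTrace trackAlloc(std::size_t size, double probability)
{
    trackedBytes() += static_cast<long long>(size);
    return ToyAllocationTrace(probability);
}

ToyAllocationTrace trackFree(std::size_t size, double probability)
{
    trackedBytes() -= static_cast<long long>(size);
    return ToyAllocationTrace(probability);
}

void * sampledMalloc(std::size_t size, double probability)
{
    ToyAllocationTrace trace = trackAlloc(size, probability);
    void * ptr = std::malloc(size);
    if (!ptr)
        throw std::bad_alloc();
    trace.onAlloc(ptr, size); // reported only after the allocation succeeded
    return ptr;
}

void sampledFree(void * ptr, std::size_t size, double probability)
{
    ToyAllocationTrace trace = trackFree(size, probability);
    trace.onFree(ptr, size); // recorded before free in this sketch, so the pointer value is still valid
    std::free(ptr);
}
```

Returning the trace object by value keeps the sampling decision next to the allocation site, so the tracker itself never needs to know the resulting address.
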

View File

@ -92,8 +92,10 @@ public:
void * alloc(size_t size, size_t alignment = 0)
{
checkSize(size);
CurrentMemoryTracker::alloc(size);
return allocNoTrack(size, alignment);
auto trace = CurrentMemoryTracker::alloc(size);
void * ptr = allocNoTrack(size, alignment);
trace.onAlloc(ptr, size);
return ptr;
}
/// Free memory range.
@ -103,7 +105,8 @@ public:
{
checkSize(size);
freeNoTrack(buf, size);
CurrentMemoryTracker::free(size);
auto trace = CurrentMemoryTracker::free(size);
trace.onFree(buf, size);
}
catch (...)
{
@ -129,13 +132,16 @@ public:
&& alignment <= MALLOC_MIN_ALIGNMENT)
{
/// Resize malloc'd memory region with no special alignment requirement.
CurrentMemoryTracker::realloc(old_size, new_size);
auto trace = CurrentMemoryTracker::realloc(old_size, new_size);
trace.onFree(buf, old_size);
void * new_buf = ::realloc(buf, new_size);
if (nullptr == new_buf)
DB::throwFromErrno(fmt::format("Allocator: Cannot realloc from {} to {}.", ReadableSize(old_size), ReadableSize(new_size)), DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY);
buf = new_buf;
trace.onAlloc(buf, new_size);
if constexpr (clear_memory)
if (new_size > old_size)
memset(reinterpret_cast<char *>(buf) + old_size, 0, new_size - old_size);
@ -143,7 +149,8 @@ public:
else if (old_size >= MMAP_THRESHOLD && new_size >= MMAP_THRESHOLD)
{
/// Resize mmap'd memory region.
CurrentMemoryTracker::realloc(old_size, new_size);
auto trace = CurrentMemoryTracker::realloc(old_size, new_size);
trace.onFree(buf, old_size);
// On Apple and FreeBSD, a self-implemented mremap is used (common/mremap.h)
buf = clickhouse_mremap(buf, old_size, new_size, MREMAP_MAYMOVE,
@ -152,14 +159,17 @@ public:
DB::throwFromErrno(fmt::format("Allocator: Cannot mremap memory chunk from {} to {}.",
ReadableSize(old_size), ReadableSize(new_size)), DB::ErrorCodes::CANNOT_MREMAP);
trace.onAlloc(buf, new_size);
/// No need for zero-fill, because mmap guarantees it.
}
else if (new_size < MMAP_THRESHOLD)
{
/// Small allocations that require a copy. Assume there's enough memory in the system. Call CurrentMemoryTracker once.
CurrentMemoryTracker::realloc(old_size, new_size);
auto trace = CurrentMemoryTracker::realloc(old_size, new_size);
trace.onFree(buf, old_size);
void * new_buf = allocNoTrack(new_size, alignment);
trace.onAlloc(new_buf, new_size);
memcpy(new_buf, buf, std::min(old_size, new_size));
freeNoTrack(buf, old_size);
buf = new_buf;


@ -30,21 +30,24 @@ struct AllocatorWithMemoryTracking
throw std::bad_alloc();
size_t bytes = n * sizeof(T);
CurrentMemoryTracker::alloc(bytes);
auto trace = CurrentMemoryTracker::alloc(bytes);
T * p = static_cast<T *>(malloc(bytes));
if (!p)
throw std::bad_alloc();
trace.onAlloc(p, bytes);
return p;
}
void deallocate(T * p, size_t n) noexcept
{
free(p);
size_t bytes = n * sizeof(T);
CurrentMemoryTracker::free(bytes);
free(p);
auto trace = CurrentMemoryTracker::free(bytes);
trace.onFree(p, bytes);
}
};


@ -37,7 +37,7 @@ MemoryTracker * getMemoryTracker()
using DB::current_thread;
void CurrentMemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded)
AllocationTrace CurrentMemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded)
{
#ifdef MEMORY_TRACKER_DEBUG_CHECKS
if (unlikely(memory_tracker_always_throw_logical_error_on_allocation))
@ -55,8 +55,9 @@ void CurrentMemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded)
if (will_be > current_thread->untracked_memory_limit)
{
memory_tracker->allocImpl(will_be, throw_if_memory_exceeded);
auto res = memory_tracker->allocImpl(will_be, throw_if_memory_exceeded);
current_thread->untracked_memory = 0;
return res;
}
else
{
@ -68,36 +69,40 @@ void CurrentMemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded)
/// total_memory_tracker only, ignore untracked_memory
else
{
memory_tracker->allocImpl(size, throw_if_memory_exceeded);
return memory_tracker->allocImpl(size, throw_if_memory_exceeded);
}
return AllocationTrace(memory_tracker->getSampleProbability());
}
return AllocationTrace(0);
}
void CurrentMemoryTracker::check()
{
if (auto * memory_tracker = getMemoryTracker())
memory_tracker->allocImpl(0, true);
std::ignore = memory_tracker->allocImpl(0, true);
}
void CurrentMemoryTracker::alloc(Int64 size)
AllocationTrace CurrentMemoryTracker::alloc(Int64 size)
{
bool throw_if_memory_exceeded = true;
allocImpl(size, throw_if_memory_exceeded);
return allocImpl(size, throw_if_memory_exceeded);
}
void CurrentMemoryTracker::allocNoThrow(Int64 size)
AllocationTrace CurrentMemoryTracker::allocNoThrow(Int64 size)
{
bool throw_if_memory_exceeded = false;
allocImpl(size, throw_if_memory_exceeded);
return allocImpl(size, throw_if_memory_exceeded);
}
void CurrentMemoryTracker::realloc(Int64 old_size, Int64 new_size)
AllocationTrace CurrentMemoryTracker::realloc(Int64 old_size, Int64 new_size)
{
Int64 addition = new_size - old_size;
addition > 0 ? alloc(addition) : free(-addition);
return addition > 0 ? alloc(addition) : free(-addition);
}
void CurrentMemoryTracker::free(Int64 size)
AllocationTrace CurrentMemoryTracker::free(Int64 size)
{
if (auto * memory_tracker = getMemoryTracker())
{
@ -106,15 +111,20 @@ void CurrentMemoryTracker::free(Int64 size)
current_thread->untracked_memory -= size;
if (current_thread->untracked_memory < -current_thread->untracked_memory_limit)
{
memory_tracker->free(-current_thread->untracked_memory);
Int64 untracked_memory = current_thread->untracked_memory;
current_thread->untracked_memory = 0;
return memory_tracker->free(-untracked_memory);
}
}
/// total_memory_tracker only, ignore untracked_memory
else
{
memory_tracker->free(size);
return memory_tracker->free(size);
}
return AllocationTrace(memory_tracker->getSampleProbability());
}
return AllocationTrace(0);
}
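
Review note: the `free` hunk above now copies `untracked_memory` into a local and resets the thread-local counter before forwarding to the parent tracker, presumably because the forwarding call has become the `return` expression. A minimal sketch of that batching pattern follows, assuming hypothetical `GlobalCounter` and `ThreadBatch` types that are not part of the ClickHouse code:

```cpp
#include <cstdint>

// Hypothetical stand-in for the shared tracker; only the accounting is modelled.
struct GlobalCounter
{
    int64_t amount = 0;
    void free(int64_t size) { amount -= size; }
};

// Hypothetical stand-in for the per-thread state (untracked_memory and its limit).
struct ThreadBatch
{
    int64_t untracked_memory = 0;
    int64_t untracked_memory_limit = 4096;

    // Accumulates frees locally and forwards them in one call once the
    // accumulated amount drops below -limit. Returns true when a flush happened.
    bool free(GlobalCounter & tracker, int64_t size)
    {
        untracked_memory -= size;
        if (untracked_memory < -untracked_memory_limit)
        {
            // Snapshot first, reset second, forward last: the forwarding call
            // can then be the final statement of the branch.
            int64_t pending = untracked_memory;
            untracked_memory = 0;
            tracker.free(-pending);
            return true;
        }
        return false;
    }
};
```

With a limit of 4096 bytes, a 4000-byte free stays local, and a following 200-byte free forwards the accumulated 4200 bytes in one call.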


@ -1,19 +1,20 @@
#pragma once
#include <base/types.h>
#include <Common/AllocationTrace.h>
/// Convenience methods, that use current thread's memory_tracker if it is available.
struct CurrentMemoryTracker
{
/// Call the following functions before calling of corresponding operations with memory allocators.
static void alloc(Int64 size);
static void allocNoThrow(Int64 size);
static void realloc(Int64 old_size, Int64 new_size);
[[nodiscard]] static AllocationTrace alloc(Int64 size);
[[nodiscard]] static AllocationTrace allocNoThrow(Int64 size);
[[nodiscard]] static AllocationTrace realloc(Int64 old_size, Int64 new_size);
/// This function should be called after memory deallocation.
static void free(Int64 size);
[[nodiscard]] static AllocationTrace free(Int64 size);
static void check();
private:
static void allocImpl(Int64 size, bool throw_if_memory_exceeded);
[[nodiscard]] static AllocationTrace allocImpl(Int64 size, bool throw_if_memory_exceeded);
};
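
Review note: the header change above makes every tracking call return a `[[nodiscard]] AllocationTrace`, and the caller is expected to report the actual pointer through `onAlloc` or `onFree`. The sketch below shows the calling convention only. `ToyAllocationTrace`, `ToyTracker`, `sampleLog`, `trackedMalloc` and `trackedFree` are hypothetical names, and the toy tracker always samples, which the real one does not:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// One sampled event: address plus signed size (negative for a free).
struct SampleRecord
{
    std::uintptr_t address;
    long long size;
};

// Hypothetical in-memory sink standing in for the trace log.
inline std::vector<SampleRecord> & sampleLog()
{
    static std::vector<SampleRecord> log;
    return log;
}

// Toy counterpart of AllocationTrace: carries the sampling decision to the caller.
struct ToyAllocationTrace
{
    explicit ToyAllocationTrace(double probability = 0) : sample_probability(probability) {}

    void onAlloc(const void * ptr, std::size_t size) const
    {
        if (sample_probability > 0)
            sampleLog().push_back({reinterpret_cast<std::uintptr_t>(ptr), static_cast<long long>(size)});
    }

    void onFree(const void * ptr, std::size_t size) const
    {
        if (sample_probability > 0)
            sampleLog().push_back({reinterpret_cast<std::uintptr_t>(ptr), -static_cast<long long>(size)});
    }

    double sample_probability;
};

// Toy tracker: [[nodiscard]] makes the compiler warn if a caller drops the trace.
struct ToyTracker
{
    [[nodiscard]] static ToyAllocationTrace alloc(std::size_t) { return ToyAllocationTrace(1.0); }
    [[nodiscard]] static ToyAllocationTrace free(std::size_t) { return ToyAllocationTrace(1.0); }
};

inline void * trackedMalloc(std::size_t size)
{
    auto trace = ToyTracker::alloc(size); // account first
    void * ptr = std::malloc(size);
    if (ptr)
        trace.onAlloc(ptr, size);         // then report the real address
    return ptr;
}

inline void trackedFree(void * ptr, std::size_t size)
{
    auto trace = ToyTracker::free(size);
    trace.onFree(ptr, size);              // report while the pointer is still valid
    std::free(ptr);
}
```

Callers that only need the accounting can discard the result explicitly, as the diff does with `std::ignore = ...` in `check()`.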


@ -103,6 +103,7 @@
M(S3Requests, "S3 requests") \
M(KeeperAliveConnections, "Number of alive connections") \
M(KeeperOutstandingRequets, "Number of outstanding requests") \
M(ThreadsInOvercommitTracker, "Number of waiting threads inside of OvercommitTracker") \
namespace CurrentMetrics
{


@ -1204,11 +1204,6 @@ public:
return res;
}
template <typename DateOrTime>
inline DateTimeComponents toDateTimeComponents(DateOrTime v) const
{
return toDateTimeComponents(lut[toLUTIndex(v)].date);
}
inline UInt64 toNumYYYYMMDDhhmmss(Time t) const
{


@ -57,7 +57,8 @@ public:
}
/// Do not count guard page in memory usage.
CurrentMemoryTracker::alloc(num_pages * page_size);
auto trace = CurrentMemoryTracker::alloc(num_pages * page_size);
trace.onAlloc(vp, num_pages * page_size);
boost::context::stack_context sctx;
sctx.size = num_bytes;
@ -77,6 +78,7 @@ public:
::munmap(vp, sctx.size);
/// Do not count guard page in memory usage.
CurrentMemoryTracker::free(sctx.size - page_size);
auto trace = CurrentMemoryTracker::free(sctx.size - page_size);
trace.onFree(vp, sctx.size - page_size);
}
};


@ -1,6 +1,7 @@
#include "MemoryTracker.h"
#include <IO/WriteHelpers.h>
#include <Common/SipHash.h>
#include <Common/VariableContext.h>
#include <Common/TraceSender.h>
#include <Common/Exception.h>
@ -82,6 +83,53 @@ inline std::string_view toDescription(OvercommitResult result)
}
}
bool shouldTrackAllocation(DB::Float64 probability, void * ptr)
{
return sipHash64(uintptr_t(ptr)) < std::numeric_limits<uint64_t>::max() * probability;
}
AllocationTrace updateAllocationTrace(AllocationTrace trace, const std::optional<double> & sample_probability)
{
if (unlikely(sample_probability))
return AllocationTrace(*sample_probability);
return trace;
}
AllocationTrace getAllocationTrace(std::optional<double> & sample_probability)
{
if (unlikely(sample_probability))
return AllocationTrace(*sample_probability);
return AllocationTrace(0);
}
}
AllocationTrace::AllocationTrace(double sample_probability_) : sample_probability(sample_probability_) {}
void AllocationTrace::onAlloc(void * ptr, size_t size) const
{
if (likely(sample_probability == 0))
return;
if (sample_probability < 1 && !shouldTrackAllocation(sample_probability, ptr))
return;
MemoryTrackerBlockerInThread untrack_lock(VariableContext::Global);
DB::TraceSender::send(DB::TraceType::MemorySample, StackTrace(), {.size = Int64(size), .ptr = ptr});
}
void AllocationTrace::onFree(void * ptr, size_t size) const
{
if (likely(sample_probability == 0))
return;
if (sample_probability < 1 && !shouldTrackAllocation(sample_probability, ptr))
return;
MemoryTrackerBlockerInThread untrack_lock(VariableContext::Global);
DB::TraceSender::send(DB::TraceType::MemorySample, StackTrace(), {.size = -Int64(size), .ptr = ptr});
}
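
Review note: `shouldTrackAllocation` above replaces the per-call `std::bernoulli_distribution` draw with a decision derived from a hash of the pointer, so an allocation and the later free of the same address get the same sampling decision. A minimal sketch of that idea follows. It uses the splitmix64 finalizer as a stand-in for `sipHash64`, and `mixPointer` and `shouldSample` are hypothetical names:

```cpp
#include <cstdint>
#include <limits>

// Stand-in hash (splitmix64 finalizer); the diff itself uses sipHash64.
inline std::uint64_t mixPointer(std::uintptr_t value)
{
    std::uint64_t x = static_cast<std::uint64_t>(value) + 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Deterministic sampling: the decision depends only on the address and the
// probability, so it is reproducible for the matching deallocation.
inline bool shouldSample(double probability, const void * ptr)
{
    if (probability <= 0)
        return false;
    if (probability >= 1)
        return true;
    const double threshold = static_cast<double>(std::numeric_limits<std::uint64_t>::max()) * probability;
    return static_cast<double>(mixPointer(reinterpret_cast<std::uintptr_t>(ptr))) < threshold;
}
```

A side effect of the threshold form is monotonicity: an address sampled at a low probability is also sampled at any higher one.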
namespace ProfileEvents
@ -135,7 +183,7 @@ void MemoryTracker::logMemoryUsage(Int64 current) const
}
void MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryTracker * query_tracker)
AllocationTrace MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryTracker * query_tracker)
{
if (size < 0)
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Negative size ({}) is passed to MemoryTracker. It is a bug.", size);
@ -154,9 +202,14 @@ void MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryT
/// Since the MemoryTrackerBlockerInThread should respect the level, we should go to the next parent.
if (auto * loaded_next = parent.load(std::memory_order_relaxed))
loaded_next->allocImpl(size, throw_if_memory_exceeded,
level == VariableContext::Process ? this : query_tracker);
return;
{
MemoryTracker * tracker = level == VariableContext::Process ? this : query_tracker;
return updateAllocationTrace(
loaded_next->allocImpl(size, throw_if_memory_exceeded, tracker),
sample_probability);
}
return getAllocationTrace(sample_probability);
}
/** Using memory_order_relaxed means that if allocations are done simultaneously,
@ -183,14 +236,6 @@ void MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryT
allocation_traced = true;
}
std::bernoulli_distribution sample(sample_probability);
if (unlikely(sample_probability > 0.0 && sample(thread_local_rng)))
{
MemoryTrackerBlockerInThread untrack_lock(VariableContext::Global);
DB::TraceSender::send(DB::TraceType::MemorySample, StackTrace(), {.size = size});
allocation_traced = true;
}
std::bernoulli_distribution fault(fault_probability);
if (unlikely(fault_probability > 0.0 && fault(thread_local_rng)))
{
@ -309,16 +354,22 @@ void MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryT
}
if (auto * loaded_next = parent.load(std::memory_order_relaxed))
loaded_next->allocImpl(size, throw_if_memory_exceeded,
level == VariableContext::Process ? this : query_tracker);
{
MemoryTracker * tracker = level == VariableContext::Process ? this : query_tracker;
return updateAllocationTrace(
loaded_next->allocImpl(size, throw_if_memory_exceeded, tracker),
sample_probability);
}
return getAllocationTrace(sample_probability);
}
void MemoryTracker::adjustWithUntrackedMemory(Int64 untracked_memory)
{
if (untracked_memory > 0)
allocImpl(untracked_memory, /*throw_if_memory_exceeded*/ false);
std::ignore = allocImpl(untracked_memory, /*throw_if_memory_exceeded*/ false);
else
free(-untracked_memory);
std::ignore = free(-untracked_memory);
}
bool MemoryTracker::updatePeak(Int64 will_be, bool log_memory_usage)
@ -337,8 +388,7 @@ bool MemoryTracker::updatePeak(Int64 will_be, bool log_memory_usage)
return false;
}
void MemoryTracker::free(Int64 size)
AllocationTrace MemoryTracker::free(Int64 size)
{
if (MemoryTrackerBlockerInThread::isBlocked(level))
{
@ -353,15 +403,9 @@ void MemoryTracker::free(Int64 size)
/// Since the MemoryTrackerBlockerInThread should respect the level, we should go to the next parent.
if (auto * loaded_next = parent.load(std::memory_order_relaxed))
loaded_next->free(size);
return;
}
return updateAllocationTrace(loaded_next->free(size), sample_probability);
std::bernoulli_distribution sample(sample_probability);
if (unlikely(sample_probability > 0.0 && sample(thread_local_rng)))
{
MemoryTrackerBlockerInThread untrack_lock(VariableContext::Global);
DB::TraceSender::send(DB::TraceType::MemorySample, StackTrace(), {.size = -size});
return getAllocationTrace(sample_probability);
}
Int64 accounted_size = size;
@ -389,12 +433,15 @@ void MemoryTracker::free(Int64 size)
if (auto * overcommit_tracker_ptr = overcommit_tracker.load(std::memory_order_relaxed))
overcommit_tracker_ptr->tryContinueQueryExecutionAfterFree(accounted_size);
AllocationTrace res = getAllocationTrace(sample_probability);
if (auto * loaded_next = parent.load(std::memory_order_relaxed))
loaded_next->free(size);
res = updateAllocationTrace(loaded_next->free(size), sample_probability);
auto metric_loaded = metric.load(std::memory_order_relaxed);
if (metric_loaded != CurrentMetrics::end())
CurrentMetrics::sub(metric_loaded, accounted_size);
return res;
}
@ -478,3 +525,14 @@ void MemoryTracker::setOrRaiseProfilerLimit(Int64 value)
while ((value == 0 || old_value < value) && !profiler_limit.compare_exchange_weak(old_value, value))
;
}
double MemoryTracker::getSampleProbability()
{
if (sample_probability)
return *sample_probability;
if (auto * loaded_next = parent.load(std::memory_order_relaxed))
return loaded_next->getSampleProbability();
return 0;
}
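
Review note: with `sample_probability` now a `std::optional<double>`, `getSampleProbability` above returns the nearest explicitly set value while walking up the parent chain, and 0 if none is set. The lookup can be sketched in isolation as follows, assuming a hypothetical `TrackerNode` type with a plain (non-atomic) parent pointer:

```cpp
#include <optional>

// Hypothetical simplified tracker node; the real class stores the parent in an atomic.
struct TrackerNode
{
    std::optional<double> sample_probability; // unset means "inherit from parent"
    const TrackerNode * parent = nullptr;

    double getSampleProbability() const
    {
        if (sample_probability)
            return *sample_probability;        // an explicit value wins, even if it is 0
        if (parent)
            return parent->getSampleProbability();
        return 0;                              // nothing set anywhere in the chain
    }
};
```

Using `std::optional` lets a child tracker distinguish "not configured" from "configured as 0", which a plain `double` defaulting to 0 cannot express.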


@ -2,9 +2,11 @@
#include <atomic>
#include <chrono>
#include <optional>
#include <base/types.h>
#include <Common/CurrentMetrics.h>
#include <Common/VariableContext.h>
#include <Common/AllocationTrace.h>
#if !defined(NDEBUG)
#define MEMORY_TRACKER_DEBUG_CHECKS
@ -65,7 +67,7 @@ private:
double fault_probability = 0;
/// To randomly sample allocations and deallocations in trace_log.
double sample_probability = 0;
std::optional<double> sample_probability;
/// Singly-linked list. All information will be passed to subsequent memory trackers also (it allows to implement trackers hierarchy).
/// In terms of tree nodes it is the list of parents. Lifetime of these trackers should "include" lifetime of current tracker.
@ -90,8 +92,8 @@ private:
/// allocImpl(...) and free(...) should not be used directly
friend struct CurrentMemoryTracker;
void allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryTracker * query_tracker = nullptr);
void free(Int64 size);
[[nodiscard]] AllocationTrace allocImpl(Int64 size, bool throw_if_memory_exceeded, MemoryTracker * query_tracker = nullptr);
[[nodiscard]] AllocationTrace free(Int64 size);
public:
static constexpr auto USAGE_EVENT_NAME = "MemoryTrackerUsage";
@ -146,6 +148,8 @@ public:
sample_probability = value;
}
double getSampleProbability();
void setProfilerStep(Int64 value)
{
profiler_step = value;


@ -28,4 +28,5 @@ public:
}
friend class MemoryTracker;
friend struct AllocationTrace;
};


@ -3,8 +3,13 @@
#include <chrono>
#include <mutex>
#include <Common/ProfileEvents.h>
#include <Common/CurrentMetrics.h>
#include <Interpreters/ProcessList.h>
namespace CurrentMetrics
{
extern const Metric ThreadsInOvercommitTracker;
}
namespace ProfileEvents
{
@ -32,6 +37,8 @@ OvercommitResult OvercommitTracker::needToStopQuery(MemoryTracker * tracker, Int
if (OvercommitTrackerBlockerInThread::isBlocked())
return OvercommitResult::NONE;
CurrentMetrics::Increment metric_increment(CurrentMetrics::ThreadsInOvercommitTracker);
// NOTE: Do not change the order of locks
//
// global mutex must be acquired before overcommit_m, because


@ -123,16 +123,13 @@ void ProgressIndication::writeFinalProgress()
if (progress.read_rows < 1000)
return;
UInt64 processed_rows = progress.read_rows + progress.written_rows;
UInt64 processed_bytes = progress.read_bytes + progress.written_bytes;
std::cout << "Processed " << formatReadableQuantity(processed_rows) << " rows, "
<< formatReadableSizeWithDecimalSuffix(processed_bytes);
std::cout << "Processed " << formatReadableQuantity(progress.read_rows) << " rows, "
<< formatReadableSizeWithDecimalSuffix(progress.read_bytes);
UInt64 elapsed_ns = getElapsedNanoseconds();
if (elapsed_ns)
std::cout << " (" << formatReadableQuantity(processed_rows * 1000000000.0 / elapsed_ns) << " rows/s., "
<< formatReadableSizeWithDecimalSuffix(processed_bytes * 1000000000.0 / elapsed_ns) << "/s.)";
std::cout << " (" << formatReadableQuantity(progress.read_rows * 1000000000.0 / elapsed_ns) << " rows/s., "
<< formatReadableSizeWithDecimalSuffix(progress.read_bytes * 1000000000.0 / elapsed_ns) << "/s.)";
else
std::cout << ". ";
}
@ -167,18 +164,16 @@ void ProgressIndication::writeProgress(WriteBufferFromFileDescriptor & message)
size_t prefix_size = message.count();
UInt64 processed_rows = progress.read_rows + progress.written_rows;
UInt64 processed_bytes = progress.read_bytes + progress.written_bytes;
message << indicator << " Progress: ";
message
<< formatReadableQuantity(processed_rows) << " rows, "
<< formatReadableSizeWithDecimalSuffix(processed_bytes);
<< formatReadableQuantity(progress.read_rows) << " rows, "
<< formatReadableSizeWithDecimalSuffix(progress.read_bytes);
UInt64 elapsed_ns = getElapsedNanoseconds();
if (elapsed_ns)
message << " ("
<< formatReadableQuantity(processed_rows * 1000000000.0 / elapsed_ns) << " rows/s., "
<< formatReadableSizeWithDecimalSuffix(processed_bytes * 1000000000.0 / elapsed_ns) << "/s.) ";
<< formatReadableQuantity(progress.read_rows * 1000000000.0 / elapsed_ns) << " rows/s., "
<< formatReadableSizeWithDecimalSuffix(progress.read_bytes * 1000000000.0 / elapsed_ns) << "/s.) ";
else
message << ". ";


@ -33,6 +33,7 @@ void TraceSender::send(TraceType trace_type, const StackTrace & stack_trace, Ext
+ sizeof(TraceType) /// trace type
+ sizeof(UInt64) /// thread_id
+ sizeof(Int64) /// size
+ sizeof(void *) /// ptr
+ sizeof(ProfileEvents::Event) /// event
+ sizeof(ProfileEvents::Count); /// increment
@ -74,6 +75,7 @@ void TraceSender::send(TraceType trace_type, const StackTrace & stack_trace, Ext
writePODBinary(trace_type, out);
writePODBinary(thread_id, out);
writePODBinary(extras.size, out);
writePODBinary(UInt64(extras.ptr), out);
writePODBinary(extras.event, out);
writePODBinary(extras.increment, out);


@ -28,8 +28,9 @@ class TraceSender
public:
struct Extras
{
/// size - for memory tracing is the amount of memory allocated; for other trace types it is 0.
/// size, ptr - for memory tracing, the amount of memory allocated and the address of the allocation; for other trace types they are 0 and nullptr.
Int64 size{};
void * ptr = nullptr;
/// Event type and increment for 'ProfileEvent' trace type; for other trace types defaults.
ProfileEvents::Event event{ProfileEvents::end()};
ProfileEvents::Count increment{};


@ -9,7 +9,11 @@ extern "C" void * clickhouse_malloc(size_t size)
{
void * res = malloc(size);
if (res)
Memory::trackMemory(size);
{
AllocationTrace trace;
size_t actual_size = Memory::trackMemory(size, trace);
trace.onAlloc(res, actual_size);
}
return res;
}
@ -17,17 +21,29 @@ extern "C" void * clickhouse_calloc(size_t number_of_members, size_t size)
{
void * res = calloc(number_of_members, size);
if (res)
Memory::trackMemory(number_of_members * size);
{
AllocationTrace trace;
size_t actual_size = Memory::trackMemory(number_of_members * size, trace);
trace.onAlloc(res, actual_size);
}
return res;
}
extern "C" void * clickhouse_realloc(void * ptr, size_t size)
{
if (ptr)
Memory::untrackMemory(ptr);
{
AllocationTrace trace;
size_t actual_size = Memory::untrackMemory(ptr, trace);
trace.onFree(ptr, actual_size);
}
void * res = realloc(ptr, size);
if (res)
Memory::trackMemory(size);
{
AllocationTrace trace;
size_t actual_size = Memory::trackMemory(size, trace);
trace.onAlloc(res, actual_size);
}
return res;
}
@ -42,7 +58,9 @@ extern "C" void * clickhouse_reallocarray(void * ptr, size_t number_of_members,
extern "C" void clickhouse_free(void * ptr)
{
Memory::untrackMemory(ptr);
AllocationTrace trace;
size_t actual_size = Memory::untrackMemory(ptr, trace);
trace.onFree(ptr, actual_size);
free(ptr);
}
@ -50,6 +68,10 @@ extern "C" int clickhouse_posix_memalign(void ** memptr, size_t alignment, size_
{
int res = posix_memalign(memptr, alignment, size);
if (res == 0)
Memory::trackMemory(size);
{
AllocationTrace trace;
size_t actual_size = Memory::trackMemory(size, trace);
trace.onAlloc(*memptr, actual_size);
}
return res;
}


@ -112,16 +112,19 @@ inline ALWAYS_INLINE size_t getActualAllocationSize(size_t size, TAlign... align
template <std::same_as<std::align_val_t>... TAlign>
requires DB::OptionalArgument<TAlign...>
inline ALWAYS_INLINE void trackMemory(std::size_t size, TAlign... align)
inline ALWAYS_INLINE size_t trackMemory(std::size_t size, AllocationTrace & trace, TAlign... align)
{
std::size_t actual_size = getActualAllocationSize(size, align...);
CurrentMemoryTracker::allocNoThrow(actual_size);
trace = CurrentMemoryTracker::allocNoThrow(actual_size);
return actual_size;
}
template <std::same_as<std::align_val_t>... TAlign>
requires DB::OptionalArgument<TAlign...>
inline ALWAYS_INLINE void untrackMemory(void * ptr [[maybe_unused]], std::size_t size [[maybe_unused]] = 0, TAlign... align [[maybe_unused]]) noexcept
inline ALWAYS_INLINE size_t untrackMemory(void * ptr [[maybe_unused]], AllocationTrace & trace, std::size_t size [[maybe_unused]] = 0, TAlign... align [[maybe_unused]]) noexcept
{
std::size_t actual_size = 0;
try
{
#if USE_JEMALLOC
@ -130,23 +133,26 @@ inline ALWAYS_INLINE void untrackMemory(void * ptr [[maybe_unused]], std::size_t
if (likely(ptr != nullptr))
{
if constexpr (sizeof...(TAlign) == 1)
CurrentMemoryTracker::free(sallocx(ptr, MALLOCX_ALIGN(alignToSizeT(align...))));
actual_size = sallocx(ptr, MALLOCX_ALIGN(alignToSizeT(align...)));
else
CurrentMemoryTracker::free(sallocx(ptr, 0));
actual_size = sallocx(ptr, 0);
}
#else
if (size)
CurrentMemoryTracker::free(size);
actual_size = size;
# if defined(_GNU_SOURCE)
/// This is an inaccurate resource free for sanitizers: the malloc_usable_size() result is greater than or equal to the allocated size.
else
CurrentMemoryTracker::free(malloc_usable_size(ptr));
actual_size = malloc_usable_size(ptr);
# endif
#endif
trace = CurrentMemoryTracker::free(actual_size);
}
catch (...)
{
}
return actual_size;
}
}


@ -50,50 +50,74 @@ static struct InitializeJemallocZoneAllocatorForOSX
void * operator new(std::size_t size)
{
Memory::trackMemory(size);
return Memory::newImpl(size);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace);
void * ptr = Memory::newImpl(size);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new(std::size_t size, std::align_val_t align)
{
Memory::trackMemory(size, align);
return Memory::newImpl(size, align);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace, align);
void * ptr = Memory::newImpl(size, align);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new[](std::size_t size)
{
Memory::trackMemory(size);
return Memory::newImpl(size);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace);
void * ptr = Memory::newImpl(size);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new[](std::size_t size, std::align_val_t align)
{
Memory::trackMemory(size, align);
return Memory::newImpl(size, align);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace, align);
void * ptr = Memory::newImpl(size, align);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new(std::size_t size, const std::nothrow_t &) noexcept
{
Memory::trackMemory(size);
return Memory::newNoExept(size);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace);
void * ptr = Memory::newNoExept(size);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new[](std::size_t size, const std::nothrow_t &) noexcept
{
Memory::trackMemory(size);
return Memory::newNoExept(size);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace);
void * ptr = Memory::newNoExept(size);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new(std::size_t size, std::align_val_t align, const std::nothrow_t &) noexcept
{
Memory::trackMemory(size, align);
return Memory::newNoExept(size, align);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace, align);
void * ptr = Memory::newNoExept(size, align);
trace.onAlloc(ptr, actual_size);
return ptr;
}
void * operator new[](std::size_t size, std::align_val_t align, const std::nothrow_t &) noexcept
{
Memory::trackMemory(size, align);
return Memory::newNoExept(size, align);
AllocationTrace trace;
std::size_t actual_size = Memory::trackMemory(size, trace, align);
void * ptr = Memory::newNoExept(size, align);
trace.onAlloc(ptr, actual_size);
return ptr;
}
/// delete
@ -109,48 +133,64 @@ void * operator new[](std::size_t size, std::align_val_t align, const std::nothr
void operator delete(void * ptr) noexcept
{
Memory::untrackMemory(ptr);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace);
trace.onFree(ptr, actual_size);
Memory::deleteImpl(ptr);
}
void operator delete(void * ptr, std::align_val_t align) noexcept
{
Memory::untrackMemory(ptr, 0, align);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace, 0, align);
trace.onFree(ptr, actual_size);
Memory::deleteImpl(ptr);
}
void operator delete[](void * ptr) noexcept
{
Memory::untrackMemory(ptr);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace);
trace.onFree(ptr, actual_size);
Memory::deleteImpl(ptr);
}
void operator delete[](void * ptr, std::align_val_t align) noexcept
{
Memory::untrackMemory(ptr, 0, align);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace, 0, align);
trace.onFree(ptr, actual_size);
Memory::deleteImpl(ptr);
}
void operator delete(void * ptr, std::size_t size) noexcept
{
Memory::untrackMemory(ptr, size);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace, size);
trace.onFree(ptr, actual_size);
Memory::deleteSized(ptr, size);
}
void operator delete(void * ptr, std::size_t size, std::align_val_t align) noexcept
{
Memory::untrackMemory(ptr, size, align);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace, size, align);
trace.onFree(ptr, actual_size);
Memory::deleteSized(ptr, size, align);
}
void operator delete[](void * ptr, std::size_t size) noexcept
{
Memory::untrackMemory(ptr, size);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace, size);
trace.onFree(ptr, actual_size);
Memory::deleteSized(ptr, size);
}
void operator delete[](void * ptr, std::size_t size, std::align_val_t align) noexcept
{
Memory::untrackMemory(ptr, size, align);
AllocationTrace trace;
std::size_t actual_size = Memory::untrackMemory(ptr, trace, size, align);
trace.onFree(ptr, actual_size);
Memory::deleteSized(ptr, size, align);
}


@ -222,6 +222,7 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
M(UInt64, max_concurrent_queries_for_user, 0, "The maximum number of concurrent requests per user.", 0) \
\
M(Bool, insert_deduplicate, true, "For INSERT queries in the replicated table, specifies that deduplication of insertings blocks should be performed", 0) \
M(Bool, async_insert_deduplicate, false, "For async INSERT queries in the replicated table, specifies that deduplication of insertings blocks should be performed", 0) \
\
M(UInt64Auto, insert_quorum, 0, "For INSERT queries in the replicated table, wait writing for the specified number of replicas and linearize the addition of the data. 0 - disabled, 'auto' - use majority", 0) \
M(Milliseconds, insert_quorum_timeout, 600000, "If the quorum of replicas did not meet in specified time (in milliseconds), exception will be thrown and insertion is aborted.", 0) \


@ -525,7 +525,7 @@ ReturnType SerializationNullable::deserializeTextCSVImpl(IColumn & column, ReadB
}
/// Check if we have enough data in buffer to check if it's a null.
if (istr.available() > null_representation.size())
if (settings.csv.custom_delimiter.empty() && istr.available() > null_representation.size())
{
auto check_for_null = [&istr, &null_representation, &settings]()
{
@ -550,8 +550,21 @@ ReturnType SerializationNullable::deserializeTextCSVImpl(IColumn & column, ReadB
{
buf.setCheckpoint();
SCOPE_EXIT(buf.dropCheckpoint());
if (checkString(null_representation, buf) && (buf.eof() || *buf.position() == settings.csv.delimiter || *buf.position() == '\r' || *buf.position() == '\n'))
return true;
if (checkString(null_representation, buf))
{
if (!settings.csv.custom_delimiter.empty())
{
if (checkString(settings.csv.custom_delimiter, buf))
{
/// Rollback to the beginning of custom delimiter.
buf.rollbackToCheckpoint();
assertString(null_representation, buf);
return true;
}
}
else if (buf.eof() || *buf.position() == settings.csv.delimiter || *buf.position() == '\r' || *buf.position() == '\n')
return true;
}
buf.rollbackToCheckpoint();
return false;
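
Review note: the CSV hunk above changes the NULL check so that, when `settings.csv.custom_delimiter` is set, a field counts as NULL only if the null representation is immediately followed by that custom delimiter. Otherwise the previous rule applies (end of input, the single-character delimiter, `\r` or `\n`). The decision itself can be sketched on a `std::string_view` without the checkpoint and rollback machinery of the read buffer. `isNullField` and its parameters are hypothetical names:

```cpp
#include <string_view>

// Returns true if `input` starts with a NULL field under the rules described above.
// `custom_delimiter` may be empty, in which case `delimiter` and line ends are used.
inline bool isNullField(
    std::string_view input,
    std::string_view null_representation,
    std::string_view custom_delimiter,
    char delimiter = ',')
{
    // The field must start with the null representation.
    if (input.substr(0, null_representation.size()) != null_representation)
        return false;

    std::string_view rest = input.substr(null_representation.size());

    // Custom delimiter set: it must follow immediately.
    if (!custom_delimiter.empty())
        return rest.substr(0, custom_delimiter.size()) == custom_delimiter;

    // Default rule: end of input, the delimiter, or a line end.
    return rest.empty() || rest.front() == delimiter || rest.front() == '\r' || rest.front() == '\n';
}
```

Note that, like the hunk, the sketch does not treat end of input as a terminator when a custom delimiter is configured.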


@ -1,12 +1,17 @@
#include <Databases/DDLDependencyVisitor.h>
#include <Dictionaries/getDictionaryConfigurationFromAST.h>
#include <Interpreters/Cluster.h>
#include <Interpreters/Context.h>
#include <Interpreters/InDepthNodeVisitor.h>
#include <Interpreters/evaluateConstantExpression.h>
#include <Interpreters/getClusterName.h>
#include <Parsers/ASTCreateQuery.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Parsers/ASTTablesInSelectQuery.h>
#include <Common/KnownObjectNames.h>
#include <Poco/String.h>
@ -15,225 +20,401 @@ namespace DB
namespace
{
/// CREATE TABLE or CREATE DICTIONARY or CREATE VIEW or CREATE TEMPORARY TABLE or CREATE DATABASE query.
void visitCreateQuery(const ASTCreateQuery & create, DDLDependencyVisitor::Data & data)
/// Data for DDLDependencyVisitor.
/// Used to visit ASTCreateQuery and extract the names of all tables explicitly referenced in the create query.
class DDLDependencyVisitorData
{
QualifiedTableName to_table{create.to_table_id.database_name, create.to_table_id.table_name};
if (!to_table.table.empty())
public:
DDLDependencyVisitorData(const ContextPtr & context_, const QualifiedTableName & table_name_, const ASTPtr & ast_)
: create_query(ast_), table_name(table_name_), current_database(context_->getCurrentDatabase()), context(context_)
{
/// TO target_table (for materialized views)
if (to_table.database.empty())
to_table.database = data.default_database;
data.dependencies.emplace(to_table);
}
QualifiedTableName as_table{create.as_database, create.as_table};
if (!as_table.table.empty())
/// Acquire the result of visiting the create query.
TableNamesSet getDependencies() &&
{
/// AS table_name
if (as_table.database.empty())
as_table.database = data.default_database;
data.dependencies.emplace(as_table);
}
}
/// ASTTableExpression represents a reference to a table in SELECT query.
/// DDLDependencyVisitor should handle ASTTableExpression because some CREATE queries can contain SELECT queries after AS
/// (for example, CREATE VIEW).
void visitTableExpression(const ASTTableExpression & expr, DDLDependencyVisitor::Data & data)
{
if (!expr.database_and_table_name)
return;
const ASTIdentifier * identifier = dynamic_cast<const ASTIdentifier *>(expr.database_and_table_name.get());
if (!identifier)
return;
auto table_identifier = identifier->createTable();
if (!table_identifier)
return;
QualifiedTableName qualified_name{table_identifier->getDatabaseName(), table_identifier->shortName()};
if (qualified_name.table.empty())
return;
if (qualified_name.database.empty())
{
/// It can be table/dictionary from default database or XML dictionary, but we cannot distinguish it here.
qualified_name.database = data.default_database;
dependencies.erase(table_name);
return std::move(dependencies);
}
data.dependencies.emplace(qualified_name);
}
bool needChildVisit(const ASTPtr & child) const { return !skip_asts.contains(child.get()); }
/// Extracts a table name with optional database written in the form db_name.table_name (as identifier) or 'db_name.table_name' (as string).
void extractQualifiedTableNameFromArgument(const ASTFunction & function, DDLDependencyVisitor::Data & data, size_t arg_idx)
{
/// Just ignore incorrect arguments, proper exception will be thrown later
if (!function.arguments || function.arguments->children.size() <= arg_idx)
return;
QualifiedTableName qualified_name;
const auto * expr_list = function.arguments->as<ASTExpressionList>();
if (!expr_list)
return;
const auto * arg = expr_list->children[arg_idx].get();
if (const auto * literal = arg->as<ASTLiteral>())
void visit(const ASTPtr & ast)
{
if (literal->value.getType() != Field::Types::String)
if (auto * create = ast->as<ASTCreateQuery>())
{
visitCreateQuery(*create);
}
else if (auto * dictionary = ast->as<ASTDictionary>())
{
visitDictionaryDef(*dictionary);
}
else if (auto * expr = ast->as<ASTTableExpression>())
{
visitTableExpression(*expr);
}
else if (const auto * function = ast->as<ASTFunction>())
{
if (function->kind == ASTFunction::Kind::TABLE_ENGINE)
visitTableEngine(*function);
else
visitFunction(*function);
}
}
private:
ASTPtr create_query;
std::unordered_set<const IAST *> skip_asts;
QualifiedTableName table_name;
String current_database;
ContextPtr context;
TableNamesSet dependencies;
/// CREATE TABLE or CREATE DICTIONARY or CREATE VIEW or CREATE TEMPORARY TABLE or CREATE DATABASE query.
void visitCreateQuery(const ASTCreateQuery & create)
{
QualifiedTableName to_table{create.to_table_id.database_name, create.to_table_id.table_name};
if (!to_table.table.empty())
{
/// TO target_table (for materialized views)
if (to_table.database.empty())
to_table.database = current_database;
dependencies.emplace(to_table);
}
QualifiedTableName as_table{create.as_database, create.as_table};
if (!as_table.table.empty())
{
/// AS table_name
if (as_table.database.empty())
as_table.database = current_database;
dependencies.emplace(as_table);
}
}
/// The definition of a dictionary: SOURCE(CLICKHOUSE(...)) LAYOUT(...) LIFETIME(...)
void visitDictionaryDef(const ASTDictionary & dictionary)
{
if (!dictionary.source || dictionary.source->name != "clickhouse" || !dictionary.source->elements)
return;
auto maybe_qualified_name = QualifiedTableName::tryParseFromString(literal->value.get<String>());
/// Just return if the name is invalid
if (!maybe_qualified_name)
auto config = getDictionaryConfigurationFromAST(create_query->as<ASTCreateQuery &>(), context);
auto info = getInfoIfClickHouseDictionarySource(config, context);
/// We consider only dependencies on local tables.
if (!info || !info->is_local)
return;
qualified_name = std::move(*maybe_qualified_name);
if (info->table_name.database.empty())
info->table_name.database = current_database;
dependencies.emplace(std::move(info->table_name));
}
else if (const auto * identifier = dynamic_cast<const ASTIdentifier *>(arg))
/// ASTTableExpression represents a reference to a table in SELECT query.
/// DDLDependencyVisitor should handle ASTTableExpression because some CREATE queries can contain SELECT queries after AS
/// (for example, CREATE VIEW).
void visitTableExpression(const ASTTableExpression & expr)
{
/// ASTIdentifier or ASTTableIdentifier
if (!expr.database_and_table_name)
return;
const ASTIdentifier * identifier = dynamic_cast<const ASTIdentifier *>(expr.database_and_table_name.get());
if (!identifier)
return;
auto table_identifier = identifier->createTable();
/// Just return if the table identifier is invalid
if (!table_identifier)
return;
qualified_name.database = table_identifier->getDatabaseName();
qualified_name.table = table_identifier->shortName();
}
else
{
/// Just return because we don't validate AST in this function.
return;
QualifiedTableName qualified_name{table_identifier->getDatabaseName(), table_identifier->shortName()};
if (qualified_name.table.empty())
return;
if (qualified_name.database.empty())
{
/// It can be table/dictionary from default database or XML dictionary, but we cannot distinguish it here.
qualified_name.database = current_database;
}
dependencies.emplace(qualified_name);
}
if (qualified_name.database.empty())
/// Finds dependencies of a table engine.
void visitTableEngine(const ASTFunction & table_engine)
{
/// It can be table/dictionary from default database or XML dictionary, but we cannot distinguish it here.
qualified_name.database = data.default_database;
}
data.dependencies.emplace(std::move(qualified_name));
}
/// Dictionary(db_name.dictionary_name)
if (table_engine.name == "Dictionary")
addQualifiedNameFromArgument(table_engine, 0);
/// Extracts a table name with database written in the form 'db_name', 'table_name' (two strings).
void extractDatabaseAndTableNameFromArguments(const ASTFunction & function, DDLDependencyVisitor::Data & data, size_t database_arg_idx, size_t table_arg_idx)
/// Buffer('db_name', 'dest_table_name')
if (table_engine.name == "Buffer")
addDatabaseAndTableNameFromArguments(table_engine, 0, 1);
/// Distributed(cluster_name, db_name, table_name, ...)
if (table_engine.name == "Distributed")
visitDistributedTableEngine(table_engine);
}
/// Distributed(cluster_name, database_name, table_name, ...)
void visitDistributedTableEngine(const ASTFunction & table_engine)
{
/// We consider only dependencies on local tables.
bool has_local_replicas = false;
if (auto cluster_name = tryGetClusterNameFromArgument(table_engine, 0))
{
auto cluster = context->tryGetCluster(*cluster_name);
if (cluster && cluster->getLocalShardCount())
has_local_replicas = true;
}
if (has_local_replicas)
addDatabaseAndTableNameFromArguments(table_engine, 1, 2);
}
/// Finds dependencies of a function.
void visitFunction(const ASTFunction & function)
{
if (function.name == "joinGet" || function.name == "dictHas" || function.name == "dictIsIn" || function.name.starts_with("dictGet"))
{
/// dictGet('dict_name', attr_names, id_expr)
/// dictHas('dict_name', id_expr)
/// joinGet(join_storage_table_name, `value_column`, join_keys)
addQualifiedNameFromArgument(function, 0);
}
else if (function.name == "in" || function.name == "notIn" || function.name == "globalIn" || function.name == "globalNotIn")
{
/// in(x, table_name) - function for evaluating (x IN table_name)
addQualifiedNameFromArgument(function, 1);
}
else if (function.name == "dictionary")
{
/// dictionary(dict_name)
addQualifiedNameFromArgument(function, 0);
}
else if (function.name == "remote" || function.name == "remoteSecure")
{
visitRemoteFunction(function, /* is_cluster_function= */ false);
}
else if (function.name == "cluster" || function.name == "clusterAllReplicas")
{
visitRemoteFunction(function, /* is_cluster_function= */ true);
}
}
/// remote('addresses_expr', db_name.table_name, ...)
/// remote('addresses_expr', 'db_name', 'table_name', ...)
/// remote('addresses_expr', table_function(), ...)
/// cluster('cluster_name', db_name.table_name, ...)
/// cluster('cluster_name', 'db_name', 'table_name', ...)
/// cluster('cluster_name', table_function(), ...)
void visitRemoteFunction(const ASTFunction & function, bool is_cluster_function)
{
/// We consider dependencies on local tables only.
bool has_local_replicas = false;
if (is_cluster_function)
{
if (auto cluster_name = tryGetClusterNameFromArgument(function, 0))
{
if (auto cluster = context->tryGetCluster(*cluster_name))
{
if (cluster->getLocalShardCount())
has_local_replicas = true;
}
}
}
else
{
/// remote() and remoteSecure() are not fully supported. To properly support them we would need to check the first
/// argument to decide whether the host & port pattern specified in the first argument contains the local host or not
/// which is not trivial. For now we just always assume that the host & port pattern doesn't contain the local host.
}
if (!function.arguments)
return;
ASTs & args = function.arguments->children;
if (args.size() < 2)
return;
const ASTFunction * table_function = nullptr;
if (const auto * second_arg_as_function = args[1]->as<ASTFunction>();
second_arg_as_function && KnownTableFunctionNames::instance().exists(second_arg_as_function->name))
{
table_function = second_arg_as_function;
}
if (has_local_replicas && !table_function)
{
auto maybe_qualified_name = tryGetQualifiedNameFromArgument(function, 1, /* apply_current_database= */ false);
if (!maybe_qualified_name)
return;
auto & qualified_name = *maybe_qualified_name;
if (qualified_name.database.empty())
{
auto table = tryGetStringFromArgument(function, 2);
if (!table)
return;
qualified_name.database = std::move(qualified_name.table);
qualified_name.table = std::move(table).value();
}
dependencies.insert(qualified_name);
}
if (!has_local_replicas && table_function)
{
/// `table function` will be executed remotely, so we won't check it or its arguments for dependencies.
skip_asts.emplace(table_function);
}
}
/// Gets an argument as a string, evaluates constants if necessary.
std::optional<String> tryGetStringFromArgument(const ASTFunction & function, size_t arg_idx) const
{
if (!function.arguments)
return {};
const ASTs & args = function.arguments->children;
if (arg_idx >= args.size())
return {};
const auto & arg = args[arg_idx];
ASTPtr evaluated;
try
{
evaluated = evaluateConstantExpressionOrIdentifierAsLiteral(arg, context);
}
catch (...)
{
return {};
}
const auto * literal = evaluated->as<ASTLiteral>();
if (!literal || (literal->value.getType() != Field::Types::String))
return {};
return literal->value.safeGet<String>();
}
/// Gets an argument as a qualified table name.
/// Accepts forms db_name.table_name (as an identifier) and 'db_name.table_name' (as a string).
/// If `apply_current_database` is false, the function doesn't replace an empty database name with `current_database` (the caller must do that).
std::optional<QualifiedTableName>
tryGetQualifiedNameFromArgument(const ASTFunction & function, size_t arg_idx, bool apply_current_database = true) const
{
if (!function.arguments)
return {};
const ASTs & args = function.arguments->children;
if (arg_idx >= args.size())
return {};
const auto & arg = args[arg_idx];
QualifiedTableName qualified_name;
if (const auto * identifier = dynamic_cast<const ASTIdentifier *>(arg.get()))
{
/// ASTIdentifier or ASTTableIdentifier
auto table_identifier = identifier->createTable();
if (!table_identifier)
return {};
qualified_name.database = table_identifier->getDatabaseName();
qualified_name.table = table_identifier->shortName();
}
else
{
auto qualified_name_as_string = tryGetStringFromArgument(function, arg_idx);
if (!qualified_name_as_string)
return {};
auto maybe_qualified_name = QualifiedTableName::tryParseFromString(*qualified_name_as_string);
if (!maybe_qualified_name)
return {};
qualified_name = std::move(maybe_qualified_name).value();
}
if (qualified_name.database.empty() && apply_current_database)
qualified_name.database = current_database;
return qualified_name;
}
/// Adds a qualified table name from an argument to the collection of dependencies.
/// Accepts forms db_name.table_name (as an identifier) and 'db_name.table_name' (as a string).
void addQualifiedNameFromArgument(const ASTFunction & function, size_t arg_idx)
{
if (auto qualified_name = tryGetQualifiedNameFromArgument(function, arg_idx))
dependencies.emplace(std::move(qualified_name).value());
}
/// Returns a database name and a table name extracted from two separate arguments.
std::optional<QualifiedTableName> tryGetDatabaseAndTableNameFromArguments(
const ASTFunction & function, size_t database_arg_idx, size_t table_arg_idx, bool apply_current_database = true) const
{
auto database = tryGetStringFromArgument(function, database_arg_idx);
if (!database)
return {};
auto table = tryGetStringFromArgument(function, table_arg_idx);
if (!table)
return {};
QualifiedTableName qualified_name;
qualified_name.database = std::move(database).value();
qualified_name.table = std::move(table).value();
if (qualified_name.database.empty() && apply_current_database)
qualified_name.database = current_database;
return qualified_name;
}
/// Adds a database name and a table name from two separate arguments to the collection of dependencies.
void addDatabaseAndTableNameFromArguments(const ASTFunction & function, size_t database_arg_idx, size_t table_arg_idx)
{
if (auto qualified_name = tryGetDatabaseAndTableNameFromArguments(function, database_arg_idx, table_arg_idx))
dependencies.emplace(std::move(qualified_name).value());
}
std::optional<String> tryGetClusterNameFromArgument(const ASTFunction & function, size_t arg_idx) const
{
if (!function.arguments)
return {};
ASTs & args = function.arguments->children;
if (arg_idx >= args.size())
return {};
auto cluster_name = ::DB::tryGetClusterName(*args[arg_idx]);
if (cluster_name)
return cluster_name;
return tryGetStringFromArgument(function, arg_idx);
}
};
/// Visits ASTCreateQuery and extracts the names of all tables explicitly referenced in the create query.
class DDLDependencyVisitor
{
/// Just ignore incorrect arguments, proper exception will be thrown later
if (!function.arguments || (function.arguments->children.size() <= database_arg_idx)
|| (function.arguments->children.size() <= table_arg_idx))
return;
public:
using Data = DDLDependencyVisitorData;
using Visitor = ConstInDepthNodeVisitor<DDLDependencyVisitor, /* top_to_bottom= */ true, /* need_child_accept_data= */ true>;
const auto * expr_list = function.arguments->as<ASTExpressionList>();
if (!expr_list)
return;
const auto * database_literal = expr_list->children[database_arg_idx]->as<ASTLiteral>();
const auto * table_name_literal = expr_list->children[table_arg_idx]->as<ASTLiteral>();
if (!database_literal || !table_name_literal || (database_literal->value.getType() != Field::Types::String)
|| (table_name_literal->value.getType() != Field::Types::String))
return;
QualifiedTableName qualified_name{database_literal->value.get<String>(), table_name_literal->value.get<String>()};
if (qualified_name.table.empty())
return;
if (qualified_name.database.empty())
qualified_name.database = data.default_database;
data.dependencies.emplace(qualified_name);
}
void visitFunction(const ASTFunction & function, DDLDependencyVisitor::Data & data)
{
if (function.name == "joinGet" || function.name == "dictHas" || function.name == "dictIsIn" || function.name.starts_with("dictGet"))
{
/// dictGet('dict_name', attr_names, id_expr)
/// dictHas('dict_name', id_expr)
/// joinGet(join_storage_table_name, `value_column`, join_keys)
extractQualifiedTableNameFromArgument(function, data, 0);
}
else if (function.name == "in" || function.name == "notIn" || function.name == "globalIn" || function.name == "globalNotIn")
{
/// in(x, table_name) - function for evaluating (x IN table_name)
extractQualifiedTableNameFromArgument(function, data, 1);
}
else if (function.name == "dictionary")
{
/// dictionary(dict_name)
extractQualifiedTableNameFromArgument(function, data, 0);
}
}
void visitTableEngine(const ASTFunction & table_engine, DDLDependencyVisitor::Data & data)
{
if (table_engine.name == "Dictionary")
extractQualifiedTableNameFromArgument(table_engine, data, 0);
if (table_engine.name == "Buffer")
extractDatabaseAndTableNameFromArguments(table_engine, data, 0, 1);
}
void visitDictionaryDef(const ASTDictionary & dictionary, DDLDependencyVisitor::Data & data)
{
if (!dictionary.source || dictionary.source->name != "clickhouse" || !dictionary.source->elements)
return;
auto config = getDictionaryConfigurationFromAST(data.create_query->as<ASTCreateQuery &>(), data.global_context);
auto info = getInfoIfClickHouseDictionarySource(config, data.global_context);
if (!info || !info->is_local)
return;
if (info->table_name.database.empty())
info->table_name.database = data.default_database;
data.dependencies.emplace(std::move(info->table_name));
}
static bool needChildVisit(const ASTPtr &, const ASTPtr & child, const Data & data) { return data.needChildVisit(child); }
static void visit(const ASTPtr & ast, Data & data) { data.visit(ast); }
};
}
TableNamesSet getDependenciesFromCreateQuery(const ContextPtr & global_context, const QualifiedTableName & table_name, const ASTPtr & ast)
TableNamesSet getDependenciesFromCreateQuery(const ContextPtr & context, const QualifiedTableName & table_name, const ASTPtr & ast)
{
assert(global_context == global_context->getGlobalContext());
DDLDependencyVisitor::Data data;
data.table_name = table_name;
data.default_database = global_context->getCurrentDatabase();
data.create_query = ast;
data.global_context = global_context;
DDLDependencyVisitor::Data data{context, table_name, ast};
DDLDependencyVisitor::Visitor visitor{data};
visitor.visit(ast);
data.dependencies.erase(data.table_name);
return data.dependencies;
}
void DDLDependencyVisitor::visit(const ASTPtr & ast, Data & data)
{
if (auto * create = ast->as<ASTCreateQuery>())
{
visitCreateQuery(*create, data);
}
else if (auto * dictionary = ast->as<ASTDictionary>())
{
visitDictionaryDef(*dictionary, data);
}
else if (auto * expr = ast->as<ASTTableExpression>())
{
visitTableExpression(*expr, data);
}
else if (const auto * function = ast->as<ASTFunction>())
{
if (function->kind == ASTFunction::Kind::TABLE_ENGINE)
visitTableEngine(*function, data);
else
visitFunction(*function, data);
}
}
bool DDLDependencyVisitor::needChildVisit(const ASTPtr &, const ASTPtr &)
{
return true;
return std::move(data).getDependencies();
}
}
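The name handling above (an argument written as `'db_name.table_name'` or as just a table name, with the current database filled in when none is given) can be sketched in isolation. This is a simplified stand-in, not the code from this commit: `SketchQualifiedName` and `tryParseQualifiedNameSketch` are hypothetical names, and the sketch assumes no quoting or escaping and at most one dot.

```cpp
#include <optional>
#include <string>

/// Hypothetical stand-in for a (database, table) pair; not the real QualifiedTableName.
struct SketchQualifiedName
{
    std::string database;
    std::string table;
};

/// Parses "table" or "db.table". Returns nothing for empty parts or more than one dot.
/// An omitted database is replaced with `current_database`.
std::optional<SketchQualifiedName>
tryParseQualifiedNameSketch(const std::string & name, const std::string & current_database)
{
    if (name.empty())
        return std::nullopt;

    SketchQualifiedName res;
    const auto pos = name.find('.');
    if (pos == std::string::npos)
    {
        /// No database part: fall back to the current database.
        res.database = current_database;
        res.table = name;
        return res;
    }

    /// Reject ".table", "db." and "a.b.c".
    if (pos == 0 || pos + 1 == name.size() || name.find('.', pos + 1) != std::string::npos)
        return std::nullopt;

    res.database = name.substr(0, pos);
    res.table = name.substr(pos + 1);
    return res;
}
```

Returning `std::optional` keeps the best-effort behaviour visible at the call site: a malformed argument produces no dependency and is left for later validation to report.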


@ -1,8 +1,9 @@
#pragma once
#include <Parsers/IAST_fwd.h>
#include <Interpreters/InDepthNodeVisitor.h>
#include <Core/QualifiedTableName.h>
#include <Interpreters/Context_fwd.h>
#include <Parsers/IAST_fwd.h>
#include <unordered_set>
namespace DB
@ -12,25 +13,6 @@ using TableNamesSet = std::unordered_set<QualifiedTableName>;
/// Returns a list of all tables explicitly referenced in the create query of a specified table.
/// For example, a column default expression can use dictGet() and thus reference a dictionary.
/// Does not validate the AST, works in a best-effort way.
TableNamesSet getDependenciesFromCreateQuery(const ContextPtr & global_context, const QualifiedTableName & table_name, const ASTPtr & ast);
/// Visits ASTCreateQuery and extracts the names of all tables explicitly referenced in the create query.
class DDLDependencyVisitor
{
public:
struct Data
{
ASTPtr create_query;
QualifiedTableName table_name;
String default_database;
ContextPtr global_context;
TableNamesSet dependencies;
};
using Visitor = ConstInDepthNodeVisitor<DDLDependencyVisitor, /* top_to_bottom= */ true>;
static void visit(const ASTPtr & ast, Data & data);
static bool needChildVisit(const ASTPtr & node, const ASTPtr & child);
};
TableNamesSet getDependenciesFromCreateQuery(const ContextPtr & context, const QualifiedTableName & table_name, const ASTPtr & ast);
}


@ -1,5 +1,6 @@
#include <Databases/TablesDependencyGraph.h>
#include <Common/logger_useful.h>
#include <IO/WriteHelpers.h>
#include <boost/range/adaptor/reversed.hpp>
@ -9,12 +10,13 @@ namespace DB
namespace ErrorCodes
{
extern const int INFINITE_LOOP;
extern const int LOGICAL_ERROR;
}
namespace
{
constexpr const size_t CYCLIC_LEVEL = static_cast<size_t>(-2);
constexpr const size_t CYCLIC_LEVEL = std::numeric_limits<size_t>::max();
}
@ -40,7 +42,7 @@ TablesDependencyGraph::TablesDependencyGraph(TablesDependencyGraph && src) noexc
TablesDependencyGraph & TablesDependencyGraph::operator=(const TablesDependencyGraph & src)
{
if (&src != this)
if (this != &src)
{
nodes = src.nodes;
nodes_by_database_and_table_names = src.nodes_by_database_and_table_names;
@ -54,11 +56,14 @@ TablesDependencyGraph & TablesDependencyGraph::operator=(const TablesDependencyG
TablesDependencyGraph & TablesDependencyGraph::operator=(TablesDependencyGraph && src) noexcept
{
nodes = std::exchange(src.nodes, decltype(nodes){});
nodes_by_database_and_table_names = std::exchange(src.nodes_by_database_and_table_names, decltype(nodes_by_database_and_table_names){});
nodes_by_uuid = std::exchange(src.nodes_by_uuid, decltype(nodes_by_uuid){});
levels_calculated = std::exchange(src.levels_calculated, false);
nodes_sorted_by_level_lazy = std::exchange(src.nodes_sorted_by_level_lazy, decltype(nodes_sorted_by_level_lazy){});
if (this != &src)
{
nodes = std::exchange(src.nodes, decltype(nodes){});
nodes_by_database_and_table_names = std::exchange(src.nodes_by_database_and_table_names, decltype(nodes_by_database_and_table_names){});
nodes_by_uuid = std::exchange(src.nodes_by_uuid, decltype(nodes_by_uuid){});
levels_calculated = std::exchange(src.levels_calculated, false);
nodes_sorted_by_level_lazy = std::exchange(src.nodes_sorted_by_level_lazy, decltype(nodes_sorted_by_level_lazy){});
}
return *this;
}
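The guarded move assignment above can be reproduced on a reduced type. The sketch below is an illustration under assumptions: `SketchGraph` is a hypothetical class with only two members. With `std::exchange`, an unguarded self-move would already restore each member, so the guard mainly makes the self-move case an explicit no-op.

```cpp
#include <utility>
#include <vector>

/// Hypothetical reduced version of a graph-like class with lazily computed state.
struct SketchGraph
{
    std::vector<int> nodes;
    bool levels_calculated = false;

    SketchGraph() = default;
    SketchGraph(const SketchGraph &) = default;
    SketchGraph & operator=(const SketchGraph &) = default;

    SketchGraph & operator=(SketchGraph && src) noexcept
    {
        if (this != &src)
        {
            /// Take the source's state and leave the source empty but valid.
            nodes = std::exchange(src.nodes, {});
            levels_calculated = std::exchange(src.levels_calculated, false);
        }
        return *this;
    }
};
```

After `dst = std::move(src)`, `src.nodes` is empty and `src.levels_calculated` is `false`, so a moved-from graph can be reused safely.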
@ -89,11 +94,13 @@ void TablesDependencyGraph::addDependency(const StorageID & table_id, const Stor
auto * table_node = addOrUpdateNode(table_id);
auto * dependency_node = addOrUpdateNode(dependency);
if (table_node->dependencies.contains(dependency_node))
return; /// Already have this dependency.
bool inserted = table_node->dependencies.insert(dependency_node).second;
if (!inserted)
return; /// Not inserted because we already had this dependency.
table_node->dependencies.insert(dependency_node);
dependency_node->dependents.insert(table_node);
/// `dependency_node` must be updated too.
[[maybe_unused]] bool inserted_to_set = dependency_node->dependents.insert(table_node).second;
chassert(inserted_to_set);
setNeedRecalculateLevels();
}
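The change above replaces a `contains()` check followed by `insert()` with a single `insert()` whose result says whether the edge is new, then asserts that the reverse edge was missing as well. A minimal sketch of that idiom, assuming a hypothetical `SketchNode` type and plain `assert` in place of `chassert`:

```cpp
#include <cassert>
#include <unordered_set>

/// Hypothetical node holding both directions of each edge.
struct SketchNode
{
    std::unordered_set<SketchNode *> dependencies;
    std::unordered_set<SketchNode *> dependents;
};

/// Adds the edge "table depends on dependency". Returns false if it already existed.
bool addDependencySketch(SketchNode & table, SketchNode & dependency)
{
    /// One lookup instead of contains() + insert().
    bool inserted = table.dependencies.insert(&dependency).second;
    if (!inserted)
        return false;

    /// Both sets must stay symmetric, so the reverse edge cannot exist yet.
    [[maybe_unused]] bool inserted_back = dependency.dependents.insert(&table).second;
    assert(inserted_back);
    return true;
}
```

The assertion documents the invariant that `dependencies` and `dependents` always mirror each other; a failure would indicate that some other code path updated only one side.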
@ -126,13 +133,19 @@ void TablesDependencyGraph::addDependencies(const StorageID & table_id, const st
for (auto * dependency_node : old_dependency_nodes)
{
if (!new_dependency_nodes.contains(dependency_node))
dependency_node->dependents.erase(table_node);
{
[[maybe_unused]] bool removed_from_set = dependency_node->dependents.erase(table_node);
chassert(removed_from_set);
}
}
for (auto * dependency_node : new_dependency_nodes)
{
if (!old_dependency_nodes.contains(dependency_node))
dependency_node->dependents.insert(table_node);
{
[[maybe_unused]] bool inserted_to_set = dependency_node->dependents.insert(table_node).second;
chassert(inserted_to_set);
}
}
table_node->dependencies = std::move(new_dependency_nodes);
@ -167,21 +180,28 @@ bool TablesDependencyGraph::removeDependency(const StorageID & table_id, const S
auto dependency_it = table_node->dependencies.find(dependency_node);
if (dependency_it == table_node->dependencies.end())
return false;
return false; /// No such dependency, nothing to remove.
table_node->dependencies.erase(dependency_it);
dependency_node->dependents.erase(table_node);
bool table_node_removed = false;
/// `dependency_node` must be updated too.
[[maybe_unused]] bool removed_from_set = dependency_node->dependents.erase(table_node);
chassert(removed_from_set);
if (remove_isolated_tables && dependency_node->dependencies.empty() && dependency_node->dependents.empty())
{
/// The dependency table has no dependencies and no dependents now, so we will remove it from the graph.
removeNode(dependency_node);
if (table_node == dependency_node)
table_node_removed = true;
}
if (remove_isolated_tables && !table_node_removed && table_node->dependencies.empty() && table_node->dependents.empty())
{
/// The table `table_id` has no dependencies and no dependents now, so we will remove it from the graph.
removeNode(table_node);
}
setNeedRecalculateLevels();
return true;
@ -203,19 +223,28 @@ std::vector<StorageID> TablesDependencyGraph::removeDependencies(const StorageID
for (auto * dependency_node : dependency_nodes)
{
/// We're gathering the list of dependencies the table `table_id` had in the graph to return from the function.
dependencies.emplace_back(dependency_node->storage_id);
dependency_node->dependents.erase(table_node);
/// Update `dependency_node`.
[[maybe_unused]] bool removed_from_set = dependency_node->dependents.erase(table_node);
chassert(removed_from_set);
if (remove_isolated_tables && dependency_node->dependencies.empty() && dependency_node->dependents.empty())
{
/// The dependency table has no dependencies and no dependents now, so we will remove it from the graph.
removeNode(dependency_node);
if (table_node == dependency_node)
table_node_removed = true;
}
}
if (remove_isolated_tables && !table_node_removed && table_node->dependencies.empty() && table_node->dependents.empty())
chassert(table_node->dependencies.empty());
if (remove_isolated_tables && !table_node_removed && table_node->dependents.empty())
{
/// The table `table_id` has no dependencies and no dependents now, so we will remove it from the graph.
removeNode(table_node);
}
setNeedRecalculateLevels();
return dependencies;
@ -251,7 +280,12 @@ TablesDependencyGraph::Node * TablesDependencyGraph::findNode(const StorageID &
{
auto * node = it->second;
if (table_id.hasUUID() && node->storage_id.hasUUID() && (table_id.uuid != node->storage_id.uuid))
return nullptr; /// UUID is different, it's not the node we're looking for.
{
/// We found a table with specified database and table names in the graph, but surprisingly it has a different UUID.
/// Maybe an "EXCHANGE TABLES" command has been executed somehow without changing the graph?
LOG_WARNING(getLogger(), "Found table {} in the graph with unexpected UUID {}", table_id, node->storage_id.uuid);
return nullptr; /// Act like it's not found.
}
return node; /// Found by table name.
}
}
@ -268,7 +302,8 @@ TablesDependencyGraph::Node * TablesDependencyGraph::addOrUpdateNode(const Stora
if (table_id.hasUUID() && !node->storage_id.hasUUID())
{
node->storage_id.uuid = table_id.uuid;
nodes_by_uuid.emplace(node->storage_id.uuid, node);
[[maybe_unused]] bool inserted_to_map = nodes_by_uuid.emplace(node->storage_id.uuid, node).second;
chassert(inserted_to_map);
}
if (!table_id.table_name.empty() && ((table_id.table_name != node->storage_id.table_name) || (table_id.database_name != node->storage_id.database_name)))
@ -283,7 +318,8 @@ TablesDependencyGraph::Node * TablesDependencyGraph::addOrUpdateNode(const Stora
nodes_by_database_and_table_names.erase(node->storage_id);
node->storage_id.database_name = table_id.database_name;
node->storage_id.table_name = table_id.table_name;
nodes_by_database_and_table_names.emplace(node->storage_id, node);
[[maybe_unused]] bool inserted_to_map = nodes_by_database_and_table_names.emplace(node->storage_id, node).second;
chassert(inserted_to_map);
}
}
else
@ -303,9 +339,15 @@ TablesDependencyGraph::Node * TablesDependencyGraph::addOrUpdateNode(const Stora
nodes.insert(node_ptr);
node = node_ptr.get();
if (table_id.hasUUID())
nodes_by_uuid.emplace(table_id.uuid, node);
{
[[maybe_unused]] bool inserted_to_map = nodes_by_uuid.emplace(table_id.uuid, node).second;
chassert(inserted_to_map);
}
if (!table_id.table_name.empty())
nodes_by_database_and_table_names.emplace(table_id, node);
{
[[maybe_unused]] bool inserted_to_map = nodes_by_database_and_table_names.emplace(table_id, node).second;
chassert(inserted_to_map);
}
}
return node;
}
@ -313,22 +355,39 @@ TablesDependencyGraph::Node * TablesDependencyGraph::addOrUpdateNode(const Stora
void TablesDependencyGraph::removeNode(Node * node)
{
chassert(node);
auto dependency_nodes = std::move(node->dependencies);
auto dependent_nodes = std::move(node->dependents);
if (node->storage_id.hasUUID())
nodes_by_uuid.erase(node->storage_id.uuid);
{
[[maybe_unused]] bool removed_from_map = nodes_by_uuid.erase(node->storage_id.uuid);
chassert(removed_from_map);
}
if (!node->storage_id.table_name.empty())
nodes_by_database_and_table_names.erase(node->storage_id);
{
[[maybe_unused]] bool removed_from_map = nodes_by_database_and_table_names.erase(node->storage_id);
chassert(removed_from_map);
}
for (auto * dependency_node : dependency_nodes)
dependency_node->dependents.erase(node);
{
[[maybe_unused]] bool removed_from_set = dependency_node->dependents.erase(node);
chassert(removed_from_set);
}
for (auto * dependent_node : dependent_nodes)
dependent_node->dependencies.erase(node);
{
[[maybe_unused]] bool removed_from_set = dependent_node->dependencies.erase(node);
chassert(removed_from_set);
}
nodes.erase(node->shared_from_this());
auto it = nodes.find(node);
chassert(it != nodes.end());
nodes.erase(it);
nodes_sorted_by_level_lazy.clear();
}
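`removeNode()` now looks up a raw `Node *` in a set of shared pointers, which relies on transparent (heterogeneous) hashing and equality. The sketch below shows one way this can work; all names in it are hypothetical. It assumes C++20 for the heterogeneous `find` and falls back to a linear scan otherwise. It also provides the mixed comparison in both argument orders, since the order in which a standard library passes the stored element and the lookup key is an implementation detail.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_set>

/// Hypothetical node type used only for this sketch.
struct SketchNode
{
    int id = 0;
};
using SketchNodePtr = std::shared_ptr<SketchNode>;

struct SketchHash
{
    using is_transparent = void;
    std::size_t operator()(const SketchNode * node) const { return std::hash<const SketchNode *>{}(node); }
    std::size_t operator()(const SketchNodePtr & ptr) const { return operator()(ptr.get()); }
};

struct SketchEqual
{
    using is_transparent = void;
    bool operator()(const SketchNodePtr & left, const SketchNodePtr & right) const { return left == right; }
    bool operator()(const SketchNodePtr & left, const SketchNode * right) const { return left.get() == right; }
    bool operator()(const SketchNode * left, const SketchNodePtr & right) const { return left == right.get(); }
};

using SketchNodeSet = std::unordered_set<SketchNodePtr, SketchHash, SketchEqual>;

/// Erases the element owning `node`. Returns false if the set does not contain it.
bool eraseByRawPointer(SketchNodeSet & nodes, const SketchNode * node)
{
#if defined(__cpp_lib_generic_unordered_lookup)
    auto it = nodes.find(node); /// C++20 heterogeneous lookup, no temporary shared_ptr needed.
#else
    auto it = std::find_if(nodes.begin(), nodes.end(), [node](const SketchNodePtr & ptr) { return ptr.get() == node; });
#endif
    if (it == nodes.end())
        return false;
    nodes.erase(it);
    return true;
}
```

Hashing the raw pointer in both overloads is what makes the lookup consistent: a `shared_ptr` and the raw pointer it owns must land in the same bucket.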
@ -533,7 +592,7 @@ String TablesDependencyGraph::describeCyclicDependencies() const
}
void TablesDependencyGraph::setNeedRecalculateLevels()
void TablesDependencyGraph::setNeedRecalculateLevels() const
{
levels_calculated = false;
nodes_sorted_by_level_lazy.clear();
@ -546,49 +605,73 @@ void TablesDependencyGraph::calculateLevels() const
return;
levels_calculated = true;
/// First find tables with no dependencies, add them to `nodes_sorted_by_level_lazy`.
/// Then remove those tables from the dependency graph (we imitate that removing by decrementing `num_dependencies_to_count`),
/// and find which tables have no dependencies now.
/// Repeat while there are still tables with no dependencies.
/// In the end we expect all nodes from `nodes` to be added to `nodes_sorted_by_level_lazy`.
/// If some nodes are still not added to `nodes_sorted_by_level_lazy` in the end then there is a cyclic dependency.
/// Complexity: O(V + E)
nodes_sorted_by_level_lazy.clear();
nodes_sorted_by_level_lazy.reserve(nodes.size());
std::unordered_set<const Node *> nodes_to_process;
for (const auto & node_ptr : nodes)
nodes_to_process.emplace(node_ptr.get());
size_t current_level = 0;
while (!nodes_to_process.empty())
/// Find tables with no dependencies.
for (const auto & node_ptr : nodes)
{
size_t old_num_sorted = nodes_sorted_by_level_lazy.size();
for (auto it = nodes_to_process.begin(); it != nodes_to_process.end();)
const Node * node = node_ptr.get();
node->num_dependencies_to_count = node->dependencies.size();
if (!node->num_dependencies_to_count)
{
const auto * current_node = *(it++);
bool has_dependencies = false;
for (const auto * dependency : current_node->dependencies)
{
if (nodes_to_process.contains(dependency))
has_dependencies = true;
}
node->level = current_level;
nodes_sorted_by_level_lazy.emplace_back(node);
}
}
if (!has_dependencies)
size_t num_nodes_without_dependencies = nodes_sorted_by_level_lazy.size();
++current_level;
while (num_nodes_without_dependencies)
{
size_t begin = nodes_sorted_by_level_lazy.size() - num_nodes_without_dependencies;
size_t end = nodes_sorted_by_level_lazy.size();
/// Decrement number of dependencies for each dependent table.
for (size_t i = begin; i != end; ++i)
{
const Node * current_node = nodes_sorted_by_level_lazy[i];
for (const Node * dependent_node : current_node->dependents)
{
current_node->level = current_level;
nodes_sorted_by_level_lazy.emplace_back(current_node);
if (!dependent_node->num_dependencies_to_count)
throw Exception(ErrorCodes::LOGICAL_ERROR, "{}: Trying to decrement 0 dependencies counter for {}. It's a bug", name_for_logging, dependent_node->storage_id);
if (!--dependent_node->num_dependencies_to_count)
{
dependent_node->level = current_level;
nodes_sorted_by_level_lazy.emplace_back(dependent_node);
}
}
}
if (nodes_sorted_by_level_lazy.size() == old_num_sorted)
break;
for (size_t i = old_num_sorted; i != nodes_sorted_by_level_lazy.size(); ++i)
nodes_to_process.erase(nodes_sorted_by_level_lazy[i]);
if (nodes_sorted_by_level_lazy.size() > nodes.size())
throw Exception(ErrorCodes::LOGICAL_ERROR, "{}: Some tables were found more than once while passing through the dependency graph. It's a bug", name_for_logging);
num_nodes_without_dependencies = nodes_sorted_by_level_lazy.size() - end;
++current_level;
}
for (const auto * node_with_cyclic_dependencies : nodes_to_process)
if (nodes_sorted_by_level_lazy.size() < nodes.size())
{
node_with_cyclic_dependencies->level = CYCLIC_LEVEL;
nodes_sorted_by_level_lazy.emplace_back(node_with_cyclic_dependencies);
for (const auto & node_ptr : nodes)
{
const Node * node = node_ptr.get();
if (node->num_dependencies_to_count)
{
node->level = CYCLIC_LEVEL;
nodes_sorted_by_level_lazy.emplace_back(node);
}
}
}
}
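The new `calculateLevels()` follows the shape of Kahn's topological sort: count the dependencies of each node, emit nodes whose counter is zero, and decrement the counters of their dependents. The sketch below restates that on an index-based graph. It is a simplified illustration: `SketchNode`, `SKETCH_CYCLIC_LEVEL` and `calculateLevelsSketch` are hypothetical names, and indices stand in for the node pointers used in the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

/// Hypothetical stand-in for a graph node; edges are stored as indices into the node vector.
struct SketchNode
{
    std::vector<std::size_t> dependencies; /// Nodes this node depends on.
    std::vector<std::size_t> dependents;   /// Nodes that depend on this node.
    std::size_t level = 0;
};

constexpr std::size_t SKETCH_CYCLIC_LEVEL = std::numeric_limits<std::size_t>::max();

/// Assigns a level to every node and returns node indices sorted by level.
/// Nodes that are part of a cycle, or depend on one, get SKETCH_CYCLIC_LEVEL and come last.
std::vector<std::size_t> calculateLevelsSketch(std::vector<SketchNode> & nodes)
{
    std::vector<std::size_t> sorted;
    sorted.reserve(nodes.size());
    std::vector<std::size_t> remaining(nodes.size());

    /// Level 0: nodes with no dependencies.
    for (std::size_t i = 0; i != nodes.size(); ++i)
    {
        remaining[i] = nodes[i].dependencies.size();
        if (!remaining[i])
        {
            nodes[i].level = 0;
            sorted.push_back(i);
        }
    }

    std::size_t begin = 0;
    std::size_t current_level = 1;
    while (begin != sorted.size())
    {
        /// [begin, end) holds the nodes that became free on the previous level.
        const std::size_t end = sorted.size();
        for (std::size_t pos = begin; pos != end; ++pos)
        {
            for (std::size_t dependent : nodes[sorted[pos]].dependents)
            {
                assert(remaining[dependent] > 0);
                if (!--remaining[dependent])
                {
                    nodes[dependent].level = current_level;
                    sorted.push_back(dependent);
                }
            }
        }
        begin = end;
        ++current_level;
    }

    /// Whatever still has a non-zero counter could never be freed: cyclic dependency.
    for (std::size_t i = 0; i != nodes.size(); ++i)
    {
        if (remaining[i])
        {
            nodes[i].level = SKETCH_CYCLIC_LEVEL;
            sorted.push_back(i);
        }
    }
    return sorted;
}
```

Each node is appended once and each edge is visited once, which gives the O(V + E) bound stated in the comment, in contrast to the removed loop that rescanned every unprocessed node on every level.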
@ -630,7 +713,7 @@ std::vector<std::vector<StorageID>> TablesDependencyGraph::getTablesSortedByDepe
void TablesDependencyGraph::log() const
{
if (empty())
if (nodes.empty())
{
LOG_TEST(getLogger(), "No tables");
return;


@ -20,11 +20,11 @@ using TableNamesSet = std::unordered_set<QualifiedTableName>;
///
/// This class is used to represent various types of table-table dependencies:
/// 1. View dependencies: "source_table -> materialized_view".
/// Data inserted to a source table is also inserted to corresponding materialized views.
/// Data inserted to a source table is also inserted to corresponding materialized views.
/// 2. Loading dependencies: specify in which order tables must be loaded during startup.
/// For example, a dictionary should be loaded after its source table, and this is written in the graph as "dictionary -> source_table".
/// For example, a dictionary should be loaded after its source table, and this is written in the graph as "dictionary -> source_table".
/// 3. Referential dependencies: "table -> all tables mentioned in its definition".
/// Referential dependencies are checked to decide if it's safe to drop a table (it can be unsafe if the table is used by another table).
/// Referential dependencies are checked to decide if it's safe to drop a table (it can be unsafe if the table is used by another table).
///
/// WARNING: This class doesn't have an embedded mutex, so it must be synchronized outside.
class TablesDependencyGraph
@ -98,8 +98,8 @@ public:
/// Cyclic dependencies are dependencies like "A->A" or "A->B->C->D->A".
void checkNoCyclicDependencies() const;
bool hasCyclicDependencies() const;
std::vector<StorageID> getTablesWithCyclicDependencies() const;
String describeCyclicDependencies() const;
std::vector<StorageID> getTablesWithCyclicDependencies() const;
/// Returns a list of tables sorted by their dependencies:
/// tables without dependencies first, then
@ -113,8 +113,12 @@ public:
/// Outputs information about this graph as a bunch of logging messages.
void log() const;
/// Calculates levels - this is required for checking cyclic dependencies, to sort tables by dependency, and to log the graph.
/// This function is called automatically by the functions which need it, but can be invoked directly.
void calculateLevels() const;
private:
struct Node : public std::enable_shared_from_this<Node>
struct Node
{
StorageID storage_id;
@ -128,28 +132,38 @@ private:
/// Calculated lazily.
mutable size_t level = 0;
/// Number of dependencies left, used only while we're calculating levels.
mutable size_t num_dependencies_to_count = 0;
explicit Node(const StorageID & storage_id_) : storage_id(storage_id_) {}
};
using NodeSharedPtr = std::shared_ptr<Node>;
struct LessByLevel
struct Hash
{
bool operator()(const Node * left, const Node * right) { return left->level < right->level; }
using is_transparent = void;
size_t operator()(const Node * node) const { return std::hash<const Node *>{}(node); }
size_t operator()(const NodeSharedPtr & node_ptr) const { return operator()(node_ptr.get()); }
};
std::unordered_set<NodeSharedPtr> nodes;
struct Equal
{
using is_transparent = void;
size_t operator()(const NodeSharedPtr & left, const Node * right) const { return left.get() == right; }
size_t operator()(const NodeSharedPtr & left, const NodeSharedPtr & right) const { return left == right; }
};
std::unordered_set<NodeSharedPtr, Hash, Equal> nodes;
/// Nodes can be found either by UUID or by database name & table name. That's why we need two maps here.
std::unordered_map<StorageID, Node *, StorageID::DatabaseAndTableNameHash, StorageID::DatabaseAndTableNameEqual> nodes_by_database_and_table_names;
std::unordered_map<UUID, Node *> nodes_by_uuid;
/// This is set if both `level` inside each node and `nodes_sorted_by_level_lazy` are calculated.
mutable bool levels_calculated = false;
/// Nodes sorted by their level. Calculated lazily.
using NodesSortedByLevel = std::vector<const Node *>;
mutable NodesSortedByLevel nodes_sorted_by_level_lazy;
mutable bool levels_calculated = false;
const String name_for_logging;
mutable Poco::Logger * logger = nullptr;
@ -161,8 +175,7 @@ private:
static std::vector<StorageID> getDependencies(const Node & node);
static std::vector<StorageID> getDependents(const Node & node);
void setNeedRecalculateLevels();
void calculateLevels() const;
void setNeedRecalculateLevels() const;
const NodesSortedByLevel & getNodesSortedByLevel() const;
Poco::Logger * getLogger() const;
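The header comments above say that levels are needed to detect cyclic dependencies and to sort tables by dependency, and the first hunk shows nodes with a non-zero `num_dependencies_to_count` being assigned `CYCLIC_LEVEL`. One common way to get that behaviour is a Kahn-style pass, sketched below. This is a minimal sketch, not the code from this commit: `calculateLevelsSketch`, the index-based adjacency list, and the value chosen for `CYCLIC_LEVEL` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical marker value; the real constant is not shown in this diff.
constexpr size_t CYCLIC_LEVEL = std::numeric_limits<size_t>::max();

// deps[i] lists the nodes that node i depends on (hypothetical stand-in for the graph).
std::vector<size_t> calculateLevelsSketch(const std::vector<std::vector<size_t>> & deps)
{
    const size_t n = deps.size();
    std::vector<size_t> level(n, 0);
    std::vector<size_t> remaining(n);               // dependencies not yet assigned a level
    std::vector<std::vector<size_t>> dependents(n); // reverse edges
    std::vector<size_t> current;                    // nodes of the level being processed

    for (size_t i = 0; i < n; ++i)
    {
        remaining[i] = deps[i].size();
        for (size_t d : deps[i])
            dependents[d].push_back(i);
        if (remaining[i] == 0)
            current.push_back(i);
    }

    for (size_t current_level = 0; !current.empty(); ++current_level)
    {
        std::vector<size_t> next;
        for (size_t node : current)
        {
            level[node] = current_level;
            for (size_t dependent : dependents[node])
                if (--remaining[dependent] == 0)
                    next.push_back(dependent);
        }
        current = std::move(next);
    }

    // Nodes that still have uncounted dependencies are on a cycle or depend on one.
    for (size_t i = 0; i < n; ++i)
        if (remaining[i] != 0)
            level[i] = CYCLIC_LEVEL;

    return level;
}
```

In this sketch, sorting node indices by `level` yields tables without dependencies first, and any node left at `CYCLIC_LEVEL` answers the "has cyclic dependencies" question without a separate traversal.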

View File

@ -624,20 +624,21 @@ getInfoIfClickHouseDictionarySource(DictionaryConfigurationPtr & config, Context
{
ClickHouseDictionarySourceInfo info;
String host = config->getString("dictionary.source.clickhouse.host", "");
UInt16 port = config->getUInt("dictionary.source.clickhouse.port", 0);
bool secure = config->getBool("dictionary.source.clickhouse.secure", false);
UInt16 default_port = secure ? global_context->getTCPPortSecure().value_or(0) : global_context->getTCPPort();
String host = config->getString("dictionary.source.clickhouse.host", "localhost");
UInt16 port = config->getUInt("dictionary.source.clickhouse.port", default_port);
String database = config->getString("dictionary.source.clickhouse.db", "");
String table = config->getString("dictionary.source.clickhouse.table", "");
bool secure = config->getBool("dictionary.source.clickhouse.secure", false);
if (host.empty() || port == 0 || table.empty())
if (table.empty())
return {};
info.table_name = {database, table};
try
{
UInt16 default_port = secure ? global_context->getTCPPortSecure().value_or(0) : global_context->getTCPPort();
if (isLocalAddress({host, port}, default_port))
info.is_local = true;
}
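The hunk above changes the defaults for a ClickHouse dictionary source: a missing host now falls back to `localhost`, and a missing port falls back to the server's own TCP port (the secure one when `secure` is set). A minimal sketch of that defaulting rule follows, assuming hypothetical names (`SourceEndpoint`, `resolveEndpoint`) and plain `std::optional` arguments in place of the real configuration and context objects.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical result type, used only for this illustration.
struct SourceEndpoint
{
    std::string host;
    uint16_t port;
};

// Applies the defaults described above: host -> "localhost",
// port -> the server's own (secure or plain) TCP port.
SourceEndpoint resolveEndpoint(
    const std::optional<std::string> & configured_host,
    const std::optional<uint16_t> & configured_port,
    bool secure,
    uint16_t tcp_port,
    const std::optional<uint16_t> & tcp_port_secure)
{
    // As in the diff, an unset secure port degrades to 0.
    const uint16_t default_port = secure ? tcp_port_secure.value_or(0) : tcp_port;
    return {configured_host.value_or("localhost"), configured_port.value_or(default_port)};
}
```

With these defaults, a dictionary definition that names only a table resolves to the local server, which is why the early return in the diff now checks only `table.empty()`.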

View File

@ -119,6 +119,7 @@ struct FormatSettings
char tuple_delimiter = ',';
bool use_best_effort_in_schema_inference = true;
UInt64 skip_first_lines = 0;
String custom_delimiter;
} csv;
struct HiveText

View File

@ -1343,30 +1343,6 @@ struct ToYYYYMMDDhhmmssImpl
using FactorTransform = ZeroTransform;
};
struct ToDateTimeComponentsImpl
{
static constexpr auto name = "toDateTimeComponents";
static inline DateLUTImpl::DateTimeComponents execute(Int64 t, const DateLUTImpl & time_zone)
{
return time_zone.toDateTimeComponents(t);
}
static inline DateLUTImpl::DateTimeComponents execute(UInt32 t, const DateLUTImpl & time_zone)
{
return time_zone.toDateTimeComponents(static_cast<DateLUTImpl::Time>(t));
}
static inline DateLUTImpl::DateTimeComponents execute(Int32 d, const DateLUTImpl & time_zone)
{
return time_zone.toDateTimeComponents(ExtendedDayNum(d));
}
static inline DateLUTImpl::DateTimeComponents execute(UInt16 d, const DateLUTImpl & time_zone)
{
return time_zone.toDateTimeComponents(DayNum(d));
}
using FactorTransform = ZeroTransform;
};
template <typename FromType, typename ToType, typename Transform, bool is_extended_result = false>
struct Transformer

View File

@ -106,7 +106,7 @@ public:
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (arguments.size() < 1 || arguments.size() > 2)
if (arguments.empty() || arguments.size() > 2)
throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH, "Function {} takes one or two arguments", name);
if (!isInteger(arguments[0].type))
@ -126,7 +126,7 @@ public:
const auto & col = *src.column;
if (!checkAndGetColumn<ColumnVector<T>>(col))
return 0;
return false;
auto & result_data = result_column->getData();
@ -135,7 +135,7 @@ public:
for (size_t i = 0; i < input_rows_count; ++i)
result_data[i] = source_data[i];
return 1;
return true;
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override

View File

@ -16,6 +16,7 @@
#include <Functions/FunctionUnaryArithmetic.h>
#include <Common/FieldVisitors.h>
#include <cstring>
#include <algorithm>
@ -94,7 +95,7 @@ void convertAnyColumnToBool(const IColumn * column, UInt8Container & res)
}
template <class Op, typename Func>
template <class Op, bool IsTernary, typename Func>
bool extractConstColumns(ColumnRawPtrs & in, UInt8 & res, Func && func)
{
bool has_res = false;
@ -112,7 +113,10 @@ bool extractConstColumns(ColumnRawPtrs & in, UInt8 & res, Func && func)
if (has_res)
{
res = Op::apply(res, x);
if constexpr (IsTernary)
res = Op::ternaryApply(res, x);
else
res = Op::apply(res, x);
}
else
{
@ -129,7 +133,7 @@ bool extractConstColumns(ColumnRawPtrs & in, UInt8 & res, Func && func)
template <class Op>
inline bool extractConstColumnsAsBool(ColumnRawPtrs & in, UInt8 & res)
{
return extractConstColumns<Op>(
return extractConstColumns<Op, false>(
in, res,
[](const Field & value)
{
@ -141,7 +145,7 @@ inline bool extractConstColumnsAsBool(ColumnRawPtrs & in, UInt8 & res)
template <class Op>
inline bool extractConstColumnsAsTernary(ColumnRawPtrs & in, UInt8 & res_3v)
{
return extractConstColumns<Op>(
return extractConstColumns<Op, true>(
in, res_3v,
[](const Field & value)
{
@ -192,47 +196,74 @@ private:
};
/// A helper class used by AssociativeGenericApplierImpl
/// Allows for on-the-fly conversion of any data type into intermediate ternary representation
using TernaryValueGetter = std::function<Ternary::ResultType (size_t)>;
template <typename ... Types>
struct ValueGetterBuilderImpl;
struct TernaryValueBuilderImpl;
template <typename Type, typename ...Types>
struct ValueGetterBuilderImpl<Type, Types...>
struct TernaryValueBuilderImpl<Type, Types...>
{
static TernaryValueGetter build(const IColumn * x)
static void build(const IColumn * x, UInt8* __restrict ternary_column_data)
{
size_t size = x->size();
if (x->onlyNull())
{
return [](size_t){ return Ternary::Null; };
memset(ternary_column_data, Ternary::Null, size);
}
else if (const auto * nullable_column = typeid_cast<const ColumnNullable *>(x))
{
if (const auto * nested_column = typeid_cast<const ColumnVector<Type> *>(nullable_column->getNestedColumnPtr().get()))
{
return [
&null_data = nullable_column->getNullMapData(),
&column_data = nested_column->getData()](size_t i)
const auto& null_data = nullable_column->getNullMapData();
const auto& column_data = nested_column->getData();
if constexpr (sizeof(Type) == 1)
{
return Ternary::makeValue(column_data[i], null_data[i]);
};
for (size_t i = 0; i < size; ++i)
{
auto has_value = static_cast<UInt8>(column_data[i] != 0);
auto is_null = !!null_data[i];
ternary_column_data[i] = ((has_value << 1) | is_null) & (1 << !is_null);
}
}
else
{
for (size_t i = 0; i < size; ++i)
{
auto has_value = static_cast<UInt8>(column_data[i] != 0);
ternary_column_data[i] = has_value;
}
for (size_t i = 0; i < size; ++i)
{
auto has_value = ternary_column_data[i];
auto is_null = !!null_data[i];
ternary_column_data[i] = ((has_value << 1) | is_null) & (1 << !is_null);
}
}
}
else
return ValueGetterBuilderImpl<Types...>::build(x);
TernaryValueBuilderImpl<Types...>::build(x, ternary_column_data);
}
else if (const auto column = typeid_cast<const ColumnVector<Type> *>(x))
return [&column_data = column->getData()](size_t i) { return Ternary::makeValue(column_data[i]); };
{
auto &column_data = column->getData();
for (size_t i = 0; i < size; ++i)
{
ternary_column_data[i] = (column_data[i] != 0) << 1;
}
}
else
return ValueGetterBuilderImpl<Types...>::build(x);
TernaryValueBuilderImpl<Types...>::build(x, ternary_column_data);
}
};
template <>
struct ValueGetterBuilderImpl<>
struct TernaryValueBuilderImpl<>
{
static TernaryValueGetter build(const IColumn * x)
[[noreturn]] static void build(const IColumn * x, UInt8 * /* nullable_ternary_column_data */)
{
throw Exception(
std::string("Unknown numeric column of type: ") + demangle(typeid(*x).name()),
@ -240,12 +271,12 @@ struct ValueGetterBuilderImpl<>
}
};
using ValueGetterBuilder =
ValueGetterBuilderImpl<UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float32, Float64>;
using TernaryValueBuilder =
TernaryValueBuilderImpl<UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float32, Float64>;
/// This class together with helper class ValueGetterBuilder can be used with columns of arbitrary data type
/// Allows for on-the-fly conversion of any type of data into intermediate ternary representation
/// and eliminates the need to materialize data columns in intermediate representation
/// This class together with helper class TernaryValueBuilder can be used with columns of arbitrary data type
/// Converts column of any data type into an intermediate UInt8Column of ternary representation for the
/// vectorized ternary logic evaluation.
template <typename Op, size_t N>
class AssociativeGenericApplierImpl
{
@ -254,20 +285,19 @@ class AssociativeGenericApplierImpl
public:
/// Remembers the last N columns from `in`.
explicit AssociativeGenericApplierImpl(const ColumnRawPtrs & in)
: val_getter{ValueGetterBuilder::build(in[in.size() - N])}, next{in} {}
: vec(in[in.size() - N]->size()), next{in}
{
TernaryValueBuilder::build(in[in.size() - N], vec.data());
}
/// Returns a combination of values in the i-th row of all columns stored in the constructor.
inline ResultValueType apply(const size_t i) const
{
const auto a = val_getter(i);
if constexpr (Op::isSaturable())
return Op::isSaturatedValueTernary(a) ? a : Op::apply(a, next.apply(i));
else
return Op::apply(a, next.apply(i));
return Op::ternaryApply(vec[i], next.apply(i));
}
private:
const TernaryValueGetter val_getter;
UInt8Container vec;
const AssociativeGenericApplierImpl<Op, N - 1> next;
};
@ -280,12 +310,15 @@ class AssociativeGenericApplierImpl<Op, 1>
public:
/// Remembers the last N columns from `in`.
explicit AssociativeGenericApplierImpl(const ColumnRawPtrs & in)
: val_getter{ValueGetterBuilder::build(in[in.size() - 1])} {}
: vec(UInt8Container(in[in.size() - 1]->size()))
{
TernaryValueBuilder::build(in[in.size() - 1], vec.data());
}
inline ResultValueType apply(const size_t i) const { return val_getter(i); }
inline ResultValueType apply(const size_t i) const { return vec[i]; }
private:
const TernaryValueGetter val_getter;
UInt8Container vec;
};
@ -318,7 +351,12 @@ struct OperationApplier
for (size_t i = 0; i < size; ++i)
{
if constexpr (CarryResult)
result_data[i] = Op::apply(result_data[i], operation_applier_impl.apply(i));
{
if constexpr (std::is_same_v<OperationApplierImpl<Op, N>, AssociativeApplierImpl<Op, N>>)
result_data[i] = Op::apply(result_data[i], operation_applier_impl.apply(i));
else
result_data[i] = Op::ternaryApply(result_data[i], operation_applier_impl.apply(i));
}
else
result_data[i] = operation_applier_impl.apply(i);
}

View File

@ -44,21 +44,29 @@ namespace Ternary
{
using ResultType = UInt8;
/** These carefully picked values magically work so bitwise "and", "or" on them
* corresponds to the expected results in three-valued logic.
/** These values are carefully picked so that they can be evaluated efficiently with bitwise operations, which
* the compiler can auto-vectorize. The ternary value is computed as:
*
* False and True are represented by all-0 and all-1 bits, so all bitwise operations on them work as expected.
* Null is represented as single 1 bit. So, it is something in between False and True.
* And "or" works like maximum and "and" works like minimum:
* "or" keeps True as is and lifts False with Null to Null.
* "and" keeps False as is and downs True with Null to Null.
* ternary_value = ((value << 1) | is_null) & (1 << !is_null)
*
* The truth table of the above formula is:
* +---------------+--------------+-------------+
* | is_null\value | 0            | 1           |
* +---------------+--------------+-------------+
* | 0             | 0b00 (False) | 0b10 (True) |
* | 1             | 0b01 (Null)  | 0b01 (Null) |
* +---------------+--------------+-------------+
*
* As the numerical values of False, Null and True are assigned in ascending order, the "and" and "or" of
* ternary logic can be implemented with minimum and maximum respectively, which are also vectorizable.
* https://en.wikipedia.org/wiki/Three-valued_logic
*
* This logic does not apply to "not" and "xor" - they work with default implementation for NULLs:
* anything with NULL returns NULL, otherwise use conventional two-valued logic.
*/
static constexpr UInt8 False = 0; /// All zero bits.
static constexpr UInt8 True = -1; /// All one bits.
static constexpr UInt8 Null = 1; /// Single one bit.
static constexpr UInt8 False = 0; /// 0b00
static constexpr UInt8 Null = 1; /// 0b01
static constexpr UInt8 True = 2; /// 0b10
template <typename T>
inline ResultType makeValue(T value)
@ -90,6 +98,8 @@ struct AndImpl
static inline constexpr ResultType apply(UInt8 a, UInt8 b) { return a & b; }
static inline constexpr ResultType ternaryApply(UInt8 a, UInt8 b) { return std::min(a, b); }
/// Will use three-valued logic for NULLs (see above) or default implementation (any operation with NULL returns NULL).
static inline constexpr bool specialImplementationForNulls() { return true; }
};
@ -102,6 +112,7 @@ struct OrImpl
static inline constexpr bool isSaturatedValue(bool a) { return a; }
static inline constexpr bool isSaturatedValueTernary(UInt8 a) { return a == Ternary::True; }
static inline constexpr ResultType apply(UInt8 a, UInt8 b) { return a | b; }
static inline constexpr ResultType ternaryApply(UInt8 a, UInt8 b) { return std::max(a, b); }
static inline constexpr bool specialImplementationForNulls() { return true; }
};
@ -113,6 +124,7 @@ struct XorImpl
static inline constexpr bool isSaturatedValue(bool) { return false; }
static inline constexpr bool isSaturatedValueTernary(UInt8) { return false; }
static inline constexpr ResultType apply(UInt8 a, UInt8 b) { return a != b; }
static inline constexpr ResultType ternaryApply(UInt8 a, UInt8 b) { return a != b; }
static inline constexpr bool specialImplementationForNulls() { return false; }
#if USE_EMBEDDED_COMPILER
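The new comment in this hunk describes a branch-free encoding of three-valued logic and says that "and" and "or" reduce to minimum and maximum. The standalone sketch below checks both claims. The namespace `ternary_sketch` and the function names `encode`, `ternaryAnd`, `ternaryOr` are hypothetical; only the formula and the constants 0, 1, 2 come from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

namespace ternary_sketch
{
constexpr uint8_t False = 0; // 0b00
constexpr uint8_t Null = 1;  // 0b01
constexpr uint8_t True = 2;  // 0b10

// Branch-free encoding from the comment: ((value << 1) | is_null) & (1 << !is_null).
constexpr uint8_t encode(bool value, bool is_null)
{
    const unsigned v = value ? 1u : 0u;
    const unsigned n = is_null ? 1u : 0u;
    return static_cast<uint8_t>(((v << 1) | n) & (1u << (1u - n)));
}

// Because False < Null < True, three-valued AND is a minimum and OR is a maximum.
constexpr uint8_t ternaryAnd(uint8_t a, uint8_t b) { return std::min(a, b); }
constexpr uint8_t ternaryOr(uint8_t a, uint8_t b) { return std::max(a, b); }

// The four cells of the encoding table, checked at compile time.
static_assert(encode(false, false) == False);
static_assert(encode(true, false) == True);
static_assert(encode(false, true) == Null);
static_assert(encode(true, true) == Null);

// Runtime spot checks of the rows that involve Null.
inline void checkNullRows()
{
    assert(ternaryAnd(Null, False) == False);
    assert(ternaryOr(Null, True) == True);
    assert(ternaryAnd(Null, True) == Null);
    assert(ternaryOr(Null, False) == Null);
}
}
```

The mask `1 << !is_null` keeps only the high bit for non-null rows and only the low bit for null rows, so a null row always encodes to `Null` regardless of the stored value.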

View File

@ -48,10 +48,6 @@ public:
: scale_multiplier(DecimalUtils::scaleMultiplier<DateTime64::NativeType>(scale_))
{}
TransformDateTime64(DateTime64::NativeType scale_multiplier_ = 1) /// NOLINT(google-explicit-constructor)
: scale_multiplier(scale_multiplier_)
{}
template <typename ... Args>
inline auto NO_SANITIZE_UNDEFINED execute(const DateTime64 & t, Args && ... args) const
{
@ -131,8 +127,6 @@ public:
return wrapped_transform.executeExtendedResult(t, std::forward<Args>(args)...);
}
DateTime64::NativeType getScaleMultiplier() const { return scale_multiplier; }
private:
DateTime64::NativeType scale_multiplier = 1;
Transform wrapped_transform = {};

View File

@ -1,7 +1,6 @@
#include <DataTypes/DataTypeDateTime.h>
#include <DataTypes/DataTypeDateTime64.h>
#include <DataTypes/DataTypesNumber.h>
#include <Common/IntervalKind.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsDateTime.h>
#include <Columns/ColumnsNumber.h>
@ -35,7 +34,6 @@ namespace ErrorCodes
namespace
{
template <bool is_diff>
class DateDiffImpl
{
public:
@ -167,92 +165,8 @@ public:
template <typename TransformX, typename TransformY, typename T1, typename T2>
Int64 calculate(const TransformX & transform_x, const TransformY & transform_y, T1 x, T2 y, const DateLUTImpl & timezone_x, const DateLUTImpl & timezone_y) const
{
if constexpr (is_diff)
return static_cast<Int64>(transform_y.execute(y, timezone_y))
return static_cast<Int64>(transform_y.execute(y, timezone_y))
- static_cast<Int64>(transform_x.execute(x, timezone_x));
else
{
auto res = static_cast<Int64>(transform_y.execute(y, timezone_y))
- static_cast<Int64>(transform_x.execute(x, timezone_x));
DateLUTImpl::DateTimeComponents a_comp;
DateLUTImpl::DateTimeComponents b_comp;
Int64 adjust_value;
auto x_seconds = TransformDateTime64<ToRelativeSecondNumImpl<ResultPrecision::Extended>>(transform_x.getScaleMultiplier()).execute(x, timezone_x);
auto y_seconds = TransformDateTime64<ToRelativeSecondNumImpl<ResultPrecision::Extended>>(transform_y.getScaleMultiplier()).execute(y, timezone_y);
if (x_seconds <= y_seconds)
{
a_comp = TransformDateTime64<ToDateTimeComponentsImpl>(transform_x.getScaleMultiplier()).execute(x, timezone_x);
b_comp = TransformDateTime64<ToDateTimeComponentsImpl>(transform_y.getScaleMultiplier()).execute(y, timezone_y);
adjust_value = -1;
}
else
{
a_comp = TransformDateTime64<ToDateTimeComponentsImpl>(transform_y.getScaleMultiplier()).execute(y, timezone_y);
b_comp = TransformDateTime64<ToDateTimeComponentsImpl>(transform_x.getScaleMultiplier()).execute(x, timezone_x);
adjust_value = 1;
}
if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeYearNumImpl<ResultPrecision::Extended>>>)
{
if ((a_comp.date.month > b_comp.date.month)
|| ((a_comp.date.month == b_comp.date.month) && ((a_comp.date.day > b_comp.date.day)
|| ((a_comp.date.day == b_comp.date.day) && ((a_comp.time.hour > b_comp.time.hour)
|| ((a_comp.time.hour == b_comp.time.hour) && ((a_comp.time.minute > b_comp.time.minute)
|| ((a_comp.time.minute == b_comp.time.minute) && (a_comp.time.second > b_comp.time.second))))
)))))
res += adjust_value;
}
else if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeQuarterNumImpl<ResultPrecision::Extended>>>)
{
auto x_month_in_quarter = (a_comp.date.month - 1) % 3;
auto y_month_in_quarter = (b_comp.date.month - 1) % 3;
if ((x_month_in_quarter > y_month_in_quarter)
|| ((x_month_in_quarter == y_month_in_quarter) && ((a_comp.date.day > b_comp.date.day)
|| ((a_comp.date.day == b_comp.date.day) && ((a_comp.time.hour > b_comp.time.hour)
|| ((a_comp.time.hour == b_comp.time.hour) && ((a_comp.time.minute > b_comp.time.minute)
|| ((a_comp.time.minute == b_comp.time.minute) && (a_comp.time.second > b_comp.time.second))))
)))))
res += adjust_value;
}
else if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeMonthNumImpl<ResultPrecision::Extended>>>)
{
if ((a_comp.date.day > b_comp.date.day)
|| ((a_comp.date.day == b_comp.date.day) && ((a_comp.time.hour > b_comp.time.hour)
|| ((a_comp.time.hour == b_comp.time.hour) && ((a_comp.time.minute > b_comp.time.minute)
|| ((a_comp.time.minute == b_comp.time.minute) && (a_comp.time.second > b_comp.time.second))))
)))
res += adjust_value;
}
else if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeWeekNumImpl<ResultPrecision::Extended>>>)
{
auto x_day_of_week = TransformDateTime64<ToDayOfWeekImpl>(transform_x.getScaleMultiplier()).execute(x, timezone_x);
auto y_day_of_week = TransformDateTime64<ToDayOfWeekImpl>(transform_y.getScaleMultiplier()).execute(y, timezone_y);
if ((x_day_of_week > y_day_of_week)
|| ((x_day_of_week == y_day_of_week) && (a_comp.time.hour > b_comp.time.hour))
|| ((a_comp.time.hour == b_comp.time.hour) && ((a_comp.time.minute > b_comp.time.minute)
|| ((a_comp.time.minute == b_comp.time.minute) && (a_comp.time.second > b_comp.time.second)))))
res += adjust_value;
}
else if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeDayNumImpl<ResultPrecision::Extended>>>)
{
if ((a_comp.time.hour > b_comp.time.hour)
|| ((a_comp.time.hour == b_comp.time.hour) && ((a_comp.time.minute > b_comp.time.minute)
|| ((a_comp.time.minute == b_comp.time.minute) && (a_comp.time.second > b_comp.time.second)))))
res += adjust_value;
}
else if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeHourNumImpl<ResultPrecision::Extended>>>)
{
if ((a_comp.time.minute > b_comp.time.minute)
|| ((a_comp.time.minute == b_comp.time.minute) && (a_comp.time.second > b_comp.time.second)))
res += adjust_value;
}
else if constexpr (std::is_same_v<TransformX, TransformDateTime64<ToRelativeMinuteNumImpl<ResultPrecision::Extended>>>)
{
if (a_comp.time.second > b_comp.time.second)
res += adjust_value;
}
return res;
}
}
template <typename T>
@ -279,8 +193,7 @@ private:
/** dateDiff('unit', t1, t2, [timezone])
* age('unit', t1, t2, [timezone])
* t1 and t2 can be Date, Date32, DateTime or DateTime64
* t1 and t2 can be Date or DateTime
*
* If timezone is specified, it is applied to both arguments.
* If not, timezones from datatypes t1 and t2 are used.
@ -288,11 +201,10 @@ private:
*
* Timezone matters because days can have different length.
*/
template <bool is_relative>
class FunctionDateDiff : public IFunction
{
public:
static constexpr auto name = is_relative ? "dateDiff" : "age";
static constexpr auto name = "dateDiff";
static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionDateDiff>(); }
String getName() const override
@ -358,21 +270,21 @@ public:
const auto & timezone_y = extractTimeZoneFromFunctionArguments(arguments, 3, 2);
if (unit == "year" || unit == "yy" || unit == "yyyy")
impl.template dispatchForColumns<ToRelativeYearNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeYearNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "quarter" || unit == "qq" || unit == "q")
impl.template dispatchForColumns<ToRelativeQuarterNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeQuarterNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "month" || unit == "mm" || unit == "m")
impl.template dispatchForColumns<ToRelativeMonthNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeMonthNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "week" || unit == "wk" || unit == "ww")
impl.template dispatchForColumns<ToRelativeWeekNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeWeekNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "day" || unit == "dd" || unit == "d")
impl.template dispatchForColumns<ToRelativeDayNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeDayNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "hour" || unit == "hh" || unit == "h")
impl.template dispatchForColumns<ToRelativeHourNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeHourNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "minute" || unit == "mi" || unit == "n")
impl.template dispatchForColumns<ToRelativeMinuteNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeMinuteNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else if (unit == "second" || unit == "ss" || unit == "s")
impl.template dispatchForColumns<ToRelativeSecondNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
impl.dispatchForColumns<ToRelativeSecondNumImpl<ResultPrecision::Extended>>(x, y, timezone_x, timezone_y, res->getData());
else
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"Function {} does not support '{}' unit", getName(), unit);
@ -380,7 +292,7 @@ public:
return res;
}
private:
DateDiffImpl<is_relative> impl{name};
DateDiffImpl impl{name};
};
@ -440,14 +352,14 @@ public:
return res;
}
private:
DateDiffImpl<true> impl{name};
DateDiffImpl impl{name};
};
}
REGISTER_FUNCTION(DateDiff)
{
factory.registerFunction<FunctionDateDiff<true>>({}, FunctionFactory::CaseInsensitive);
factory.registerFunction<FunctionDateDiff>({}, FunctionFactory::CaseInsensitive);
}
REGISTER_FUNCTION(TimeDiff)
@ -464,9 +376,4 @@ Example:
Documentation::Categories{"Dates and Times"}}, FunctionFactory::CaseInsensitive);
}
REGISTER_FUNCTION(Age)
{
factory.registerFunction<FunctionDateDiff<false>>({}, FunctionFactory::CaseInsensitive);
}
}
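After this change `calculate` is again a plain subtraction of two "relative unit numbers", so `dateDiff` counts how many unit boundaries lie between the two arguments and not how many full units have elapsed (the adjustment that the removed `age` branch performed). The sketch below illustrates this for months and years. `SimpleDate`, `relativeMonthNum`, `relativeYearNum`, `monthDiff` and `yearDiff` are hypothetical helpers, and the month numbering `year * 12 + month` is an assumption about how such a relative number can be defined.

```cpp
#include <cassert>

// Hypothetical calendar date, used only for this illustration.
struct SimpleDate
{
    int year;
    int month; // 1..12
    int day;   // 1..31, deliberately ignored by the month and year transforms
};

// Assumed definition of a relative month number: a counter that grows by one per calendar month.
constexpr long long relativeMonthNum(const SimpleDate & d) { return d.year * 12LL + d.month; }
constexpr long long relativeYearNum(const SimpleDate & d) { return d.year; }

// Same shape as `calculate` in the diff: transform(y) - transform(x).
constexpr long long monthDiff(const SimpleDate & x, const SimpleDate & y)
{
    return relativeMonthNum(y) - relativeMonthNum(x);
}

constexpr long long yearDiff(const SimpleDate & x, const SimpleDate & y)
{
    return relativeYearNum(y) - relativeYearNum(x);
}

// One day apart, but a month boundary and a year boundary are both crossed.
static_assert(monthDiff({2021, 12, 31}, {2022, 1, 1}) == 1);
static_assert(yearDiff({2021, 12, 31}, {2022, 1, 1}) == 1);
// Thirty days apart inside one month: no boundary crossed.
static_assert(monthDiff({2022, 1, 1}, {2022, 1, 31}) == 0);
```

Swapping the arguments only flips the sign, which matches the subtraction order `transform_y.execute(y, ...) - transform_x.execute(x, ...)` in the hunk.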

View File

@ -0,0 +1,354 @@
#include <algorithm>
#include <cstring>
#include <vector>
#include <string>
#include <type_traits>
#include <gtest/gtest.h>
#include <Columns/ColumnNothing.h>
#include <Columns/ColumnsNumber.h>
#include <Functions/FunctionsLogical.h>
// I know that inclusion of .cpp is not good at all
#include <Functions/FunctionsLogical.cpp> // NOLINT
using namespace DB;
using TernaryValues = std::vector<Ternary::ResultType>;
struct LinearCongruentialGenerator
{
/// Constants from `man lrand48_r`.
static constexpr UInt64 a = 0x5DEECE66D;
static constexpr UInt64 c = 0xB;
/// And this is from `head -c8 /dev/urandom | xxd -p`
UInt64 current = 0x09826f4a081cee35ULL;
UInt32 next()
{
current = current * a + c;
return static_cast<UInt32>(current >> 16);
}
};
void generateRandomTernaryValue(LinearCongruentialGenerator & gen, Ternary::ResultType * output, size_t size, double false_ratio, double null_ratio)
{
/// The LinearCongruentialGenerator generates nonnegative integers uniformly distributed over the interval [0, 2^32).
/// See https://linux.die.net/man/3/nrand48
double false_percentile = false_ratio;
double null_percentile = false_ratio + null_ratio;
false_percentile = false_percentile > 1 ? 1 : false_percentile;
null_percentile = null_percentile > 1 ? 1 : null_percentile;
UInt32 false_threshold = static_cast<UInt32>(static_cast<double>(std::numeric_limits<UInt32>::max()) * false_percentile);
UInt32 null_threshold = static_cast<UInt32>(static_cast<double>(std::numeric_limits<UInt32>::max()) * null_percentile);
for (Ternary::ResultType * end = output + size; output != end; ++output)
{
UInt32 val = gen.next();
*output = val < false_threshold ? Ternary::False : (val < null_threshold ? Ternary::Null : Ternary::True);
}
}
template<typename T>
ColumnPtr createColumnNullable(const Ternary::ResultType * ternary_values, size_t size)
{
auto nested_column = ColumnVector<T>::create(size);
auto null_map = ColumnUInt8::create(size);
auto & nested_column_data = nested_column->getData();
auto & null_map_data = null_map->getData();
for (size_t i = 0; i < size; ++i)
{
if (ternary_values[i] == Ternary::Null)
{
null_map_data[i] = 1;
nested_column_data[i] = 0;
}
else if (ternary_values[i] == Ternary::True)
{
null_map_data[i] = 0;
nested_column_data[i] = 100;
}
else
{
null_map_data[i] = 0;
nested_column_data[i] = 0;
}
}
return ColumnNullable::create(std::move(nested_column), std::move(null_map));
}
template<typename T>
ColumnPtr createColumnVector(const Ternary::ResultType * ternary_values, size_t size)
{
auto column = ColumnVector<T>::create(size);
auto & column_data = column->getData();
for (size_t i = 0; i < size; ++i)
{
if (ternary_values[i] == Ternary::True)
{
column_data[i] = 100;
}
else
{
column_data[i] = 0;
}
}
return column;
}
template<typename ColumnType, typename T>
ColumnPtr createRandomColumn(LinearCongruentialGenerator & gen, TernaryValues & ternary_values)
{
size_t size = ternary_values.size();
Ternary::ResultType * ternary_data = ternary_values.data();
if constexpr (std::is_same_v<ColumnType, ColumnNullable>)
{
generateRandomTernaryValue(gen, ternary_data, size, 0.3, 0.7);
return createColumnNullable<T>(ternary_data, size);
}
else if constexpr (std::is_same_v<ColumnType, ColumnVector<UInt8>>)
{
generateRandomTernaryValue(gen, ternary_data, size, 0.5, 0);
return createColumnVector<T>(ternary_data, size);
}
else
{
auto nested_col = ColumnNothing::create(size);
auto null_map = ColumnUInt8::create(size);
memset(ternary_data, Ternary::Null, size);
return ColumnNullable::create(std::move(nested_col), std::move(null_map));
}
}
/* The truth table of ternary And and Or operations:
* +-------+-------+---------+--------+
* | a     | b     | a And b | a Or b |
* +-------+-------+---------+--------+
* | False | False | False   | False  |
* | False | Null  | False   | Null   |
* | False | True  | False   | True   |
* | Null  | False | False   | Null   |
* | Null  | Null  | Null    | Null   |
* | Null  | True  | Null    | True   |
* | True  | False | False   | True   |
* | True  | Null  | Null    | True   |
* | True  | True  | True    | True   |
* +-------+-------+---------+--------+
*
* https://en.wikibooks.org/wiki/Structured_Query_Language/NULLs_and_the_Three_Valued_Logic
*/
template <typename Op, typename T>
bool testTernaryLogicTruthTable()
{
constexpr size_t size = 9;
Ternary::ResultType col_a_ternary[] = {Ternary::False, Ternary::False, Ternary::False, Ternary::Null, Ternary::Null, Ternary::Null, Ternary::True, Ternary::True, Ternary::True};
Ternary::ResultType col_b_ternary[] = {Ternary::False, Ternary::Null, Ternary::True, Ternary::False, Ternary::Null, Ternary::True,Ternary::False, Ternary::Null, Ternary::True};
Ternary::ResultType and_expected_ternary[] = {Ternary::False, Ternary::False, Ternary::False, Ternary::False, Ternary::Null, Ternary::Null,Ternary::False, Ternary::Null, Ternary::True};
Ternary::ResultType or_expected_ternary[] = {Ternary::False, Ternary::Null, Ternary::True, Ternary::Null, Ternary::Null, Ternary::True,Ternary::True, Ternary::True, Ternary::True};
Ternary::ResultType * expected_ternary;
if constexpr (std::is_same_v<Op, AndImpl>)
{
expected_ternary = and_expected_ternary;
}
else
{
expected_ternary = or_expected_ternary;
}
auto col_a = createColumnNullable<T>(col_a_ternary, size);
auto col_b = createColumnNullable<T>(col_b_ternary, size);
ColumnRawPtrs arguments = {col_a.get(), col_b.get()};
auto col_res = ColumnUInt8::create(size);
auto & col_res_data = col_res->getData();
OperationApplier<Op, AssociativeGenericApplierImpl>::apply(arguments, col_res->getData(), false);
for (size_t i = 0; i < size; ++i)
{
if (col_res_data[i] != expected_ternary[i]) return false;
}
return true;
}
template <typename Op, typename LeftColumn, typename RightColumn>
bool testTernaryLogicOfTwoColumns(size_t size)
{
LinearCongruentialGenerator gen;
TernaryValues left_column_ternary(size);
TernaryValues right_column_ternary(size);
TernaryValues expected_ternary(size);
ColumnPtr left = createRandomColumn<LeftColumn, UInt8>(gen, left_column_ternary);
ColumnPtr right = createRandomColumn<RightColumn, UInt8>(gen, right_column_ternary);
for (size_t i = 0; i < size; ++i)
{
/// Given that False is less than Null and Null is less than True, the And operation can be implemented
/// with std::min, and the Or operation can be implemented with std::max.
if constexpr (std::is_same_v<Op, AndImpl>)
{
expected_ternary[i] = std::min(left_column_ternary[i], right_column_ternary[i]);
}
else
{
expected_ternary[i] = std::max(left_column_ternary[i], right_column_ternary[i]);
}
}
ColumnRawPtrs arguments = {left.get(), right.get()};
auto col_res = ColumnUInt8::create(size);
auto & col_res_data = col_res->getData();
OperationApplier<Op, AssociativeGenericApplierImpl>::apply(arguments, col_res->getData(), false);
for (size_t i = 0; i < size; ++i)
{
if (col_res_data[i] != expected_ternary[i]) return false;
}
return true;
}
TEST(TernaryLogicTruthTable, NestedUInt8)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, UInt8>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, UInt8>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedUInt16)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, UInt16>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, UInt16>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedUInt32)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, UInt32>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, UInt32>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedUInt64)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, UInt64>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, UInt64>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedInt8)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, Int8>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, Int8>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedInt16)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, Int16>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, Int16>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedInt32)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, Int32>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, Int32>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedInt64)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, Int64>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, Int64>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedFloat32)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, Float32>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, Float32>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTruthTable, NestedFloat64)
{
bool test_1 = testTernaryLogicTruthTable<AndImpl, Float64>();
bool test_2 = testTernaryLogicTruthTable<OrImpl, Float64>();
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTwoColumns, TwoNullable)
{
bool test_1 = testTernaryLogicOfTwoColumns<AndImpl, ColumnNullable, ColumnNullable>(100 /*size*/);
bool test_2 = testTernaryLogicOfTwoColumns<OrImpl, ColumnNullable, ColumnNullable>(100 /*size*/);
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTwoColumns, TwoVector)
{
bool test_1 = testTernaryLogicOfTwoColumns<AndImpl, ColumnUInt8, ColumnUInt8>(100 /*size*/);
bool test_2 = testTernaryLogicOfTwoColumns<OrImpl, ColumnUInt8, ColumnUInt8>(100 /*size*/);
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTwoColumns, TwoNothing)
{
bool test_1 = testTernaryLogicOfTwoColumns<AndImpl, ColumnNothing, ColumnNothing>(100 /*size*/);
bool test_2 = testTernaryLogicOfTwoColumns<OrImpl, ColumnNothing, ColumnNothing>(100 /*size*/);
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTwoColumns, NullableVector)
{
bool test_1 = testTernaryLogicOfTwoColumns<AndImpl, ColumnNullable, ColumnUInt8>(100 /*size*/);
bool test_2 = testTernaryLogicOfTwoColumns<OrImpl, ColumnNullable, ColumnUInt8>(100 /*size*/);
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTwoColumns, NullableNothing)
{
bool test_1 = testTernaryLogicOfTwoColumns<AndImpl, ColumnNullable, ColumnNothing>(100 /*size*/);
bool test_2 = testTernaryLogicOfTwoColumns<OrImpl, ColumnNullable, ColumnNothing>(100 /*size*/);
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
TEST(TernaryLogicTwoColumns, VectorNothing)
{
bool test_1 = testTernaryLogicOfTwoColumns<AndImpl, ColumnUInt8, ColumnNothing>(100 /*size*/);
bool test_2 = testTernaryLogicOfTwoColumns<OrImpl, ColumnUInt8, ColumnNothing>(100 /*size*/);
ASSERT_EQ(test_1, true);
ASSERT_EQ(test_2, true);
}
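
Note on the tests above: `testTernaryLogicOfTwoColumns` builds its expected values from the ordering False < Null < True, so that And is `std::min` and Or is `std::max`. The following is a minimal standalone sketch of that idea. The `TernaryValue` enum and its numeric values are assumptions made for illustration only; they are not the constants behind ClickHouse's `Ternary::ResultType`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical three-valued type. The numeric values are an assumption,
// chosen only so that False < Null < True holds.
enum class TernaryValue : unsigned char { False = 0, Null = 1, True = 2 };

// With that ordering, AND is the minimum of the operands and OR is the maximum.
constexpr TernaryValue ternaryAnd(TernaryValue a, TernaryValue b) { return std::min(a, b); }
constexpr TernaryValue ternaryOr(TernaryValue a, TernaryValue b) { return std::max(a, b); }

// Compares the min/max implementation with the 3x3 truth tables listed in the test above.
bool checkTruthTables()
{
    using T = TernaryValue;
    const std::array<T, 9> a = {T::False, T::False, T::False, T::Null, T::Null, T::Null, T::True, T::True, T::True};
    const std::array<T, 9> b = {T::False, T::Null, T::True, T::False, T::Null, T::True, T::False, T::Null, T::True};
    const std::array<T, 9> and_expected = {T::False, T::False, T::False, T::False, T::Null, T::Null, T::False, T::Null, T::True};
    const std::array<T, 9> or_expected = {T::False, T::Null, T::True, T::Null, T::Null, T::True, T::True, T::True, T::True};

    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (ternaryAnd(a[i], b[i]) != and_expected[i])
            return false;
        if (ternaryOr(a[i], b[i]) != or_expected[i])
            return false;
    }
    return true;
}
```

Calling `checkTruthTables()` walks all nine operand pairs, which mirrors what the `TernaryLogicTruthTable` tests do for each nested type.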

View File

@ -35,6 +35,8 @@ namespace ErrorCodes
extern const int CANNOT_PARSE_DATE;
extern const int INCORRECT_DATA;
extern const int ATTEMPT_TO_READ_AFTER_EOF;
extern const int LOGICAL_ERROR;
extern const int BAD_ARGUMENTS;
}
template <typename IteratorSrc, typename IteratorDst>
@ -642,9 +644,10 @@ void readCSVStringInto(Vector & s, ReadBuffer & buf, const FormatSettings::CSV &
const char delimiter = settings.delimiter;
const char maybe_quote = *buf.position();
const String & custom_delimiter = settings.custom_delimiter;
/// Empty field, not even in quotation marks.
if (maybe_quote == delimiter)
if (custom_delimiter.empty() && maybe_quote == delimiter)
return;
if ((settings.allow_single_quotes && maybe_quote == '\'') || (settings.allow_double_quotes && maybe_quote == '"'))
@ -682,6 +685,42 @@ void readCSVStringInto(Vector & s, ReadBuffer & buf, const FormatSettings::CSV &
}
else
{
/// If custom_delimiter is specified, we should read until the first occurrence of
/// custom_delimiter in the buffer.
if (!custom_delimiter.empty())
{
PeekableReadBuffer * peekable_buf = dynamic_cast<PeekableReadBuffer *>(&buf);
if (!peekable_buf)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Reading CSV string with custom delimiter is allowed only when using PeekableReadBuffer");
while (true)
{
if (peekable_buf->eof())
throw Exception(ErrorCodes::INCORRECT_DATA, "Unexpected EOF while reading CSV string, expected custom delimiter \"{}\"", custom_delimiter);
char * next_pos = reinterpret_cast<char *>(memchr(peekable_buf->position(), custom_delimiter[0], peekable_buf->available()));
if (!next_pos)
next_pos = peekable_buf->buffer().end();
appendToStringOrVector(s, *peekable_buf, next_pos);
peekable_buf->position() = next_pos;
if (!buf.hasPendingData())
continue;
{
PeekableReadBufferCheckpoint checkpoint{*peekable_buf, true};
if (checkString(custom_delimiter, *peekable_buf))
return;
}
s.push_back(*peekable_buf->position());
++peekable_buf->position();
}
return;
}
/// Unquoted case. Look for delimiter or \r or \n.
while (!buf.eof())
{
@ -776,6 +815,72 @@ void readCSVField(String & s, ReadBuffer & buf, const FormatSettings::CSV & sett
s.push_back(quote);
}
void readCSVWithTwoPossibleDelimitersImpl(String & s, PeekableReadBuffer & buf, const String & first_delimiter, const String & second_delimiter)
{
/// Check that delimiters are not empty.
if (first_delimiter.empty() || second_delimiter.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Cannot read CSV field with two possible delimiters, one of delimiters '{}' and '{}' is empty", first_delimiter, second_delimiter);
/// Read all data until first_delimiter or second_delimiter
while (true)
{
if (buf.eof())
throw Exception(ErrorCodes::INCORRECT_DATA, R"(Unexpected EOF while reading CSV string, expected one of delimiters "{}" or "{}")", first_delimiter, second_delimiter);
char * next_pos = buf.position();
while (next_pos != buf.buffer().end() && *next_pos != first_delimiter[0] && *next_pos != second_delimiter[0])
++next_pos;
appendToStringOrVector(s, buf, next_pos);
buf.position() = next_pos;
if (!buf.hasPendingData())
continue;
if (*buf.position() == first_delimiter[0])
{
PeekableReadBufferCheckpoint checkpoint(buf, true);
if (checkString(first_delimiter, buf))
return;
}
if (*buf.position() == second_delimiter[0])
{
PeekableReadBufferCheckpoint checkpoint(buf, true);
if (checkString(second_delimiter, buf))
return;
}
s.push_back(*buf.position());
++buf.position();
}
}
String readCSVStringWithTwoPossibleDelimiters(PeekableReadBuffer & buf, const FormatSettings::CSV & settings, const String & first_delimiter, const String & second_delimiter)
{
String res;
/// If value is quoted, use regular CSV reading since we need to read only data inside quotes.
if (!buf.eof() && ((settings.allow_single_quotes && *buf.position() == '\'') || (settings.allow_double_quotes && *buf.position() == '"')))
readCSVStringInto(res, buf, settings);
else
readCSVWithTwoPossibleDelimitersImpl(res, buf, first_delimiter, second_delimiter);
return res;
}
String readCSVFieldWithTwoPossibleDelimiters(PeekableReadBuffer & buf, const FormatSettings::CSV & settings, const String & first_delimiter, const String & second_delimiter)
{
String res;
/// If value is quoted, use regular CSV reading since we need to read only data inside quotes.
if (!buf.eof() && ((settings.allow_single_quotes && *buf.position() == '\'') || (settings.allow_double_quotes && *buf.position() == '"')))
readCSVField(res, buf, settings);
else
readCSVWithTwoPossibleDelimitersImpl(res, buf, first_delimiter, second_delimiter);
return res;
}
template void readCSVStringInto<PaddedPODArray<UInt8>>(PaddedPODArray<UInt8> & s, ReadBuffer & buf, const FormatSettings::CSV & settings);
template void readCSVStringInto<NullOutput>(NullOutput & s, ReadBuffer & buf, const FormatSettings::CSV & settings);
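
Note on the custom-delimiter branch added to `readCSVStringInto` above: it jumps to the next occurrence of the delimiter's first character, accepts it only if the whole delimiter matches, and otherwise treats that character as data. The sketch below shows the same scan over a plain `std::string_view`. It is a simplified illustration under stated assumptions: `readUntilDelimiter` is a hypothetical helper, there is no buffer refilling or checkpointing, and a missing delimiter is reported with `std::nullopt` rather than an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not a ClickHouse function): reads an unquoted field from
// `input` starting at `pos`, up to the first full occurrence of `delimiter`.
// On success, `pos` is left at the start of the delimiter, which is not consumed.
// Returns std::nullopt if the delimiter is empty or never found.
std::optional<std::string> readUntilDelimiter(std::string_view input, std::size_t & pos, std::string_view delimiter)
{
    if (delimiter.empty())
        return std::nullopt;

    std::string field;
    while (pos < input.size())
    {
        // Jump to the next candidate: the first character of the delimiter.
        const std::size_t candidate = input.find(delimiter.front(), pos);
        if (candidate == std::string_view::npos)
            break;

        // Everything before the candidate is field data.
        field.append(input.data() + pos, candidate - pos);
        pos = candidate;

        // Accept only a full match of the delimiter.
        if (input.compare(pos, delimiter.size(), delimiter) == 0)
            return field;

        // Partial match: the character belongs to the data.
        field.push_back(input[pos]);
        ++pos;
    }
    return std::nullopt;
}
```

For the input `abc|d||rest` and the delimiter `||`, the single `|` after `abc` is kept as data and the field ends at the double `||`.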

View File

@ -558,9 +558,10 @@ void readStringUntilWhitespace(String & s, ReadBuffer & buf);
* - string could be placed in quotes; quotes could be single: ' if FormatSettings::CSV::allow_single_quotes is true
* or double: " if FormatSettings::CSV::allow_double_quotes is true;
* - or string could be unquoted - this is determined by first character;
* - if string is unquoted, then it is read until next delimiter,
* either until end of line (CR or LF),
* or until end of stream;
* - if string is unquoted, then:
* - If settings.custom_delimiter is not specified, it is read until next settings.delimiter, either until end of line (CR or LF) or until end of stream;
* - If settings.custom_delimiter is specified, it is read until the first occurrence of settings.custom_delimiter in the buffer.
* This works only if the provided buffer is a PeekableReadBuffer.
* but spaces and tabs at begin and end of unquoted string are consumed but ignored (note that this behaviour differs from RFC).
* - if string is in quotes, then it will be read until closing quote,
* but sequences of two consecutive quotes are parsed as single quote inside string;
@ -570,6 +571,13 @@ void readCSVString(String & s, ReadBuffer & buf, const FormatSettings::CSV & set
/// Differ from readCSVString in that it doesn't remove quotes around field if any.
void readCSVField(String & s, ReadBuffer & buf, const FormatSettings::CSV & settings);
/// Read string in CSV format until the first occurrence of first_delimiter or second_delimiter.
/// Similar to readCSVString: if the string is in quotes, we read only the data inside the quotes.
String readCSVStringWithTwoPossibleDelimiters(PeekableReadBuffer & buf, const FormatSettings::CSV & settings, const String & first_delimiter, const String & second_delimiter);
/// Same as above but includes quotes in the result if any.
String readCSVFieldWithTwoPossibleDelimiters(PeekableReadBuffer & buf, const FormatSettings::CSV & settings, const String & first_delimiter, const String & second_delimiter);
/// Read and append result to array of characters.
template <typename Vector>
void readStringInto(Vector & s, ReadBuffer & buf);
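
Note on the two-delimiter readers declared above: when the number of columns is unknown, the reader cannot tell whether a value ends with the field delimiter or the row delimiter, so it stops at whichever comes first. A minimal sketch over `std::string_view` follows. The name `readUntilEitherDelimiter` is hypothetical, quoted values are not handled, and reaching the end of input is reported with `std::nullopt` instead of an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper (not a ClickHouse function): reads from `input` at `pos`
// until the first full occurrence of either delimiter. On success, `pos` points
// at the start of the matched delimiter, which is not consumed.
std::optional<std::string> readUntilEitherDelimiter(
    std::string_view input, std::size_t & pos, std::string_view first, std::string_view second)
{
    if (first.empty() || second.empty())
        throw std::invalid_argument("both delimiters must be non-empty");

    std::string field;
    while (pos < input.size())
    {
        const char c = input[pos];
        // Only attempt a full comparison when the first character matches.
        if (c == first.front() && input.compare(pos, first.size(), first) == 0)
            return field;
        if (c == second.front() && input.compare(pos, second.size(), second) == 0)
            return field;
        // Not a delimiter: the character is part of the value.
        field.push_back(c);
        ++pos;
    }
    return std::nullopt;
}
```

A caller would skip the matched delimiter itself before reading the next value, which is why `pos` is left untouched on a match.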

View File

@ -141,7 +141,7 @@ void ConvertStringsToEnumMatcher::visit(ASTFunction & function_node, Data & data
if (function_node.name == "if")
{
if (function_node.arguments->children.size() != 2)
if (function_node.arguments->children.size() != 3)
return;
const ASTLiteral * literal1 = function_node.arguments->children[1]->as<ASTLiteral>();

View File

@ -1126,6 +1126,7 @@ void DatabaseCatalog::cleanupStoreDirectoryTask()
continue;
size_t affected_dirs = 0;
size_t checked_dirs = 0;
for (auto it = disk->iterateDirectory("store"); it->isValid(); it->next())
{
String prefix = it->name();
@ -1135,6 +1136,7 @@ void DatabaseCatalog::cleanupStoreDirectoryTask()
if (!expected_prefix_dir)
{
LOG_WARNING(log, "Found invalid directory {} on disk {}, will try to remove it", it->path(), disk_name);
checked_dirs += 1;
affected_dirs += maybeRemoveDirectory(disk_name, disk, it->path());
continue;
}
@ -1150,6 +1152,7 @@ void DatabaseCatalog::cleanupStoreDirectoryTask()
if (!expected_dir)
{
LOG_WARNING(log, "Found invalid directory {} on disk {}, will try to remove it", jt->path(), disk_name);
checked_dirs += 1;
affected_dirs += maybeRemoveDirectory(disk_name, disk, jt->path());
continue;
}
@ -1161,6 +1164,7 @@ void DatabaseCatalog::cleanupStoreDirectoryTask()
/// so it looks safe enough to remove directory if we don't have uuid mapping for it.
/// No table or database using this directory should concurrently appear,
/// because creation of new table would fail with "directory already exists".
checked_dirs += 1;
affected_dirs += maybeRemoveDirectory(disk_name, disk, jt->path());
}
}
@ -1168,7 +1172,7 @@ void DatabaseCatalog::cleanupStoreDirectoryTask()
if (affected_dirs)
LOG_INFO(log, "Cleaned up {} directories from store/ on disk {}", affected_dirs, disk_name);
else
if (checked_dirs == 0)
LOG_TEST(log, "Nothing to clean up from store/ on disk {}", disk_name);
}

View File

@ -225,7 +225,7 @@ HashJoin::HashJoin(std::shared_ptr<TableJoin> table_join_, const Block & right_s
, right_sample_block(right_sample_block_)
, log(&Poco::Logger::get("HashJoin"))
{
LOG_DEBUG(log, "Datatype: {}, kind: {}, strictness: {}", data->type, kind, strictness);
LOG_DEBUG(log, "Datatype: {}, kind: {}, strictness: {}, right header: {}", data->type, kind, strictness, right_sample_block.dumpStructure());
LOG_DEBUG(log, "Keys: {}", TableJoin::formatClauses(table_join->getClauses(), true));
if (isCrossOrComma(kind))

View File

@ -672,6 +672,11 @@ String TableJoin::renamedRightColumnName(const String & name) const
return name;
}
void TableJoin::setRename(const String & from, const String & to)
{
renames[from] = to;
}
void TableJoin::addKey(const String & left_name, const String & right_name, const ASTPtr & left_ast, const ASTPtr & right_ast)
{
clauses.back().key_names_left.emplace_back(left_name);

View File

@ -334,6 +334,7 @@ public:
Block getRequiredRightKeys(const Block & right_table_keys, std::vector<String> & keys_sources) const;
String renamedRightColumnName(const String & name) const;
void setRename(const String & from, const String & to);
void resetKeys();
void resetToCross();

View File

@ -97,6 +97,9 @@ void TraceCollector::run()
Int64 size;
readPODBinary(size, in);
UInt64 ptr;
readPODBinary(ptr, in);
ProfileEvents::Event event;
readPODBinary(event, in);
@ -112,7 +115,7 @@ void TraceCollector::run()
UInt64 time = static_cast<UInt64>(ts.tv_sec * 1000000000LL + ts.tv_nsec);
UInt64 time_in_microseconds = static_cast<UInt64>((ts.tv_sec * 1000000LL) + (ts.tv_nsec / 1000));
TraceLogElement element{time_t(time / 1000000000), time_in_microseconds, time, trace_type, thread_id, query_id, trace, size, event, increment};
TraceLogElement element{time_t(time / 1000000000), time_in_microseconds, time, trace_type, thread_id, query_id, trace, size, ptr, event, increment};
trace_log->add(element);
}
}

View File

@ -38,6 +38,7 @@ NamesAndTypesList TraceLogElement::getNamesAndTypes()
{"query_id", std::make_shared<DataTypeString>()},
{"trace", std::make_shared<DataTypeArray>(std::make_shared<DataTypeUInt64>())},
{"size", std::make_shared<DataTypeInt64>()},
{"ptr", std::make_shared<DataTypeUInt64>()},
{"event", std::make_shared<DataTypeLowCardinality>(std::make_shared<DataTypeString>())},
{"increment", std::make_shared<DataTypeInt64>()},
};
@ -57,6 +58,7 @@ void TraceLogElement::appendToBlock(MutableColumns & columns) const
columns[i++]->insertData(query_id.data(), query_id.size());
columns[i++]->insert(trace);
columns[i++]->insert(size);
columns[i++]->insert(ptr);
String event_name;
if (event != ProfileEvents::end())

View File

@ -27,8 +27,10 @@ struct TraceLogElement
UInt64 thread_id{};
String query_id{};
Array trace{};
/// Allocation size in bytes for TraceType::Memory.
/// Allocation size in bytes for TraceType::Memory and TraceType::MemorySample.
Int64 size{};
/// Allocation ptr for TraceType::MemorySample.
UInt64 ptr{};
/// ProfileEvent for TraceType::ProfileEvent.
ProfileEvents::Event event{ProfileEvents::end()};
/// Increment of profile event for TraceType::ProfileEvent.

View File

@ -18,18 +18,31 @@ namespace ErrorCodes
std::string getClusterName(const IAST & node)
{
auto name = tryGetClusterName(node);
if (!name)
throw Exception("Illegal expression instead of cluster name.", ErrorCodes::BAD_ARGUMENTS);
return std::move(name).value();
}
std::optional<std::string> tryGetClusterName(const IAST & node)
{
if (const auto * ast_id = node.as<ASTIdentifier>())
return ast_id->name();
if (const auto * ast_lit = node.as<ASTLiteral>())
return checkAndGetLiteralArgument<String>(*ast_lit, "cluster_name");
{
if (ast_lit->value.getType() != Field::Types::String)
return {};
return ast_lit->value.safeGet<String>();
}
/// A hack to support hyphens in cluster names.
if (const auto * ast_func = node.as<ASTFunction>())
{
if (ast_func->name != "minus" || !ast_func->arguments || ast_func->arguments->children.size() < 2)
throw Exception("Illegal expression instead of cluster name.", ErrorCodes::BAD_ARGUMENTS);
return {};
String name;
for (const auto & arg : ast_func->arguments->children)
@ -43,7 +56,7 @@ std::string getClusterName(const IAST & node)
return name;
}
throw Exception("Illegal expression instead of cluster name.", ErrorCodes::BAD_ARGUMENTS);
return {};
}
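
Note on the refactoring above: `getClusterName` becomes a thin throwing wrapper around `tryGetClusterName`, which returns an empty `std::optional` instead of throwing. The sketch below shows the same split in isolation. The `Node` type and the function names are hypothetical stand-ins, not the ClickHouse AST classes.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical, simplified node: either a name (string) or a number.
using Node = std::variant<std::string, long>;

// Non-throwing variant: an empty optional means "this node is not a name".
std::optional<std::string> tryGetName(const Node & node)
{
    if (const auto * name = std::get_if<std::string>(&node))
        return *name;
    return std::nullopt;
}

// Throwing variant built on top of the non-throwing one.
std::string getName(const Node & node)
{
    auto name = tryGetName(node);
    if (!name)
        throw std::invalid_argument("Illegal expression instead of cluster name.");
    return std::move(name).value();
}
```

Callers that can recover from a non-name node use the `try` variant and branch on the optional; callers that cannot keep the throwing one.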

View File

@ -15,6 +15,7 @@ namespace DB
* Therefore, consider this case separately.
*/
std::string getClusterName(const IAST & node);
std::optional<std::string> tryGetClusterName(const IAST & node);
std::string getClusterNameAndMakeLiteral(ASTPtr & node);

View File

@ -45,8 +45,9 @@ namespace DB
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int INCOMPATIBLE_TYPE_OF_JOIN;
extern const int INVALID_JOIN_ON_EXPRESSION;
extern const int LOGICAL_ERROR;
extern const int NOT_IMPLEMENTED;
}
@ -671,9 +672,23 @@ std::shared_ptr<IJoin> chooseJoinAlgorithm(std::shared_ptr<TableJoin> & table_jo
{
trySetStorageInTableJoin(right_table_expression, table_join);
auto & right_table_expression_data = planner_context->getTableExpressionDataOrThrow(right_table_expression);
/// JOIN with JOIN engine.
if (auto storage = table_join->getStorageJoin())
{
for (const auto & result_column : right_table_expression_header)
{
const auto * source_column_name = right_table_expression_data.getColumnNameOrNull(result_column.name);
if (!source_column_name)
throw Exception(ErrorCodes::INCOMPATIBLE_TYPE_OF_JOIN,
"JOIN with 'Join' table engine should be performed by storage keys [{}], but column '{}' was found",
fmt::join(storage->getKeyNames(), ", "), result_column.name);
table_join->setRename(*source_column_name, result_column.name);
}
return storage->getJoinLocked(table_join, planner_context->getQueryContext());
}
/** JOIN with constant.
* Example: SELECT * FROM test_table AS t1 INNER JOIN test_table AS t2 ON 1;

View File

@ -12,16 +12,6 @@ namespace ErrorCodes
extern const int BAD_ARGUMENTS;
}
static FormatSettings updateFormatSettings(const FormatSettings & settings)
{
if (settings.custom.escaping_rule != FormatSettings::EscapingRule::CSV || settings.custom.field_delimiter.empty())
return settings;
auto updated = settings;
updated.csv.delimiter = settings.custom.field_delimiter.front();
return updated;
}
CustomSeparatedRowInputFormat::CustomSeparatedRowInputFormat(
const Block & header_,
ReadBuffer & in_buf_,
@ -31,7 +21,7 @@ CustomSeparatedRowInputFormat::CustomSeparatedRowInputFormat(
bool ignore_spaces_,
const FormatSettings & format_settings_)
: CustomSeparatedRowInputFormat(
header_, std::make_unique<PeekableReadBuffer>(in_buf_), params_, with_names_, with_types_, ignore_spaces_, updateFormatSettings(format_settings_))
header_, std::make_unique<PeekableReadBuffer>(in_buf_), params_, with_names_, with_types_, ignore_spaces_, format_settings_)
{
}
@ -171,15 +161,31 @@ bool CustomSeparatedFormatReader::checkEndOfRow()
}
template <bool is_header>
String CustomSeparatedFormatReader::readFieldIntoString(bool is_first)
String CustomSeparatedFormatReader::readFieldIntoString(bool is_first, bool is_last, bool is_unknown)
{
if (!is_first)
skipFieldDelimiter();
skipSpaces();
updateFormatSettings(is_last);
if constexpr (is_header)
{
/// If the number of columns is unknown and we use CSV escaping rule,
/// we don't know what delimiter to expect after the value,
/// so we should read until we meet field_delimiter or row_after_delimiter.
if (is_unknown && format_settings.custom.escaping_rule == FormatSettings::EscapingRule::CSV)
return readCSVStringWithTwoPossibleDelimiters(
*buf, format_settings.csv, format_settings.custom.field_delimiter, format_settings.custom.row_after_delimiter);
return readStringByEscapingRule(*buf, format_settings.custom.escaping_rule, format_settings);
}
else
{
if (is_unknown && format_settings.custom.escaping_rule == FormatSettings::EscapingRule::CSV)
return readCSVFieldWithTwoPossibleDelimiters(
*buf, format_settings.csv, format_settings.custom.field_delimiter, format_settings.custom.row_after_delimiter);
return readFieldByEscapingRule(*buf, format_settings.custom.escaping_rule, format_settings);
}
}
template <bool is_header>
@ -192,14 +198,14 @@ std::vector<String> CustomSeparatedFormatReader::readRowImpl()
{
do
{
values.push_back(readFieldIntoString<is_header>(values.empty()));
values.push_back(readFieldIntoString<is_header>(values.empty(), false, true));
} while (!checkEndOfRow());
columns = values.size();
}
else
{
for (size_t i = 0; i != columns; ++i)
values.push_back(readFieldIntoString<is_header>(i == 0));
values.push_back(readFieldIntoString<is_header>(i == 0, i + 1 == columns, false));
}
skipRowEndDelimiter();
@ -223,9 +229,41 @@ void CustomSeparatedFormatReader::skipHeaderRow()
skipRowEndDelimiter();
}
bool CustomSeparatedFormatReader::readField(IColumn & column, const DataTypePtr & type, const SerializationPtr & serialization, bool, const String &)
void CustomSeparatedFormatReader::updateFormatSettings(bool is_last_column)
{
if (format_settings.custom.escaping_rule != FormatSettings::EscapingRule::CSV)
return;
/// Clear the custom delimiter set for the previous field.
format_settings.csv.custom_delimiter.clear();
/// If delimiter has length = 1, it will be more efficient to use csv.delimiter.
/// If we have some complex delimiter, normal CSV reading will not work properly if we
/// use just the first character of the delimiter (for example, if delimiter='||' and we have data 'abc|d||').
/// We have a special implementation for such a case that uses the custom delimiter; it's not as efficient,
/// but works properly.
if (is_last_column)
{
/// If field delimiter has length = 1, it will be more efficient to use csv.delimiter.
if (format_settings.custom.row_after_delimiter.size() == 1)
format_settings.csv.delimiter = format_settings.custom.row_after_delimiter.front();
else
format_settings.csv.custom_delimiter = format_settings.custom.row_after_delimiter;
}
else
{
if (format_settings.custom.field_delimiter.size() == 1)
format_settings.csv.delimiter = format_settings.custom.field_delimiter.front();
else
format_settings.csv.custom_delimiter = format_settings.custom.field_delimiter;
}
}
bool CustomSeparatedFormatReader::readField(IColumn & column, const DataTypePtr & type, const SerializationPtr & serialization, bool is_last_file_column, const String &)
{
skipSpaces();
updateFormatSettings(is_last_file_column);
return deserializeFieldByEscapingRule(type, serialization, column, *buf, format_settings.custom.escaping_rule, format_settings);
}
@ -237,6 +275,8 @@ bool CustomSeparatedFormatReader::checkForSuffixImpl(bool check_eof)
if (!check_eof)
return false;
/// Allow optional \n before eof.
checkChar('\n', *buf);
return buf->eof();
}
@ -246,6 +286,8 @@ bool CustomSeparatedFormatReader::checkForSuffixImpl(bool check_eof)
if (!check_eof)
return true;
/// Allow optional \n before eof.
checkChar('\n', *buf);
if (buf->eof())
return true;
}
@ -312,7 +354,7 @@ CustomSeparatedSchemaReader::CustomSeparatedSchemaReader(
&reader,
getDefaultDataTypeForEscapingRule(format_setting_.custom.escaping_rule))
, buf(in_)
, reader(buf, ignore_spaces_, updateFormatSettings(format_setting_))
, reader(buf, ignore_spaces_, format_setting_)
{
}
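
Note on `updateFormatSettings` above: before each field it picks the delimiter expected after that field (the row-end delimiter for the last column, the field delimiter otherwise) and stores it either as a single character or as a multi-character custom delimiter. A minimal sketch of that selection follows. `CsvDelimiterSettings` and `selectCsvDelimiter` are hypothetical names that model only the two settings involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of the CSV settings touched by the selection logic.
struct CsvDelimiterSettings
{
    char delimiter = ',';
    std::string custom_delimiter;
};

// Chooses the delimiter expected after the current field.
void selectCsvDelimiter(
    CsvDelimiterSettings & csv,
    const std::string & field_delimiter,
    const std::string & row_after_delimiter,
    bool is_last_column)
{
    // Forget the custom delimiter chosen for the previous field.
    csv.custom_delimiter.clear();

    const std::string & expected = is_last_column ? row_after_delimiter : field_delimiter;

    // A one-character delimiter can use the cheaper single-character path.
    if (expected.size() == 1)
        csv.delimiter = expected.front();
    else
        csv.custom_delimiter = expected;
}
```

With a field delimiter of `||` and a row-end delimiter of `\n`, non-final columns use the custom delimiter and the final column falls back to the single-character path.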

View File

@ -83,7 +83,9 @@ private:
std::vector<String> readRowImpl();
template <bool read_string>
String readFieldIntoString(bool is_first);
String readFieldIntoString(bool is_first, bool is_last, bool is_unknown);
void updateFormatSettings(bool is_last_column);
PeekableReadBuffer * buf;
bool ignore_spaces;

View File

@ -25,6 +25,27 @@ namespace ErrorCodes
ErrorCodes::CANNOT_READ_ALL_DATA);
}
static void updateFormatSettingsIfNeeded(FormatSettings::EscapingRule escaping_rule, FormatSettings & settings, const ParsedTemplateFormatString & row_format, char default_csv_delimiter, size_t file_column)
{
if (escaping_rule != FormatSettings::EscapingRule::CSV)
return;
/// Clear custom_delimiter left from the previous column.
settings.csv.custom_delimiter.clear();
/// If field delimiter is empty, we read until default csv delimiter.
if (row_format.delimiters[file_column + 1].empty())
settings.csv.delimiter = default_csv_delimiter;
/// If field delimiter has length = 1, it will be more efficient to use csv.delimiter.
else if (row_format.delimiters[file_column + 1].size() == 1)
settings.csv.delimiter = row_format.delimiters[file_column + 1].front();
/// If we have some complex delimiter, normal CSV reading will not work properly if we
/// use the first character of the delimiter (for example, if delimiter='||' and we have data 'abc|d||').
/// We have a special implementation for such a case that uses the custom delimiter; it's not as efficient,
/// but works properly.
else
settings.csv.custom_delimiter = row_format.delimiters[file_column + 1];
}
TemplateRowInputFormat::TemplateRowInputFormat(
const Block & header_,
ReadBuffer & in_,
@ -129,10 +150,8 @@ bool TemplateRowInputFormat::deserializeField(const DataTypePtr & type,
const SerializationPtr & serialization, IColumn & column, size_t file_column)
{
EscapingRule escaping_rule = row_format.escaping_rules[file_column];
if (escaping_rule == EscapingRule::CSV)
/// Will read unquoted string until settings.csv.delimiter
settings.csv.delimiter = row_format.delimiters[file_column + 1].empty() ? default_csv_delimiter :
row_format.delimiters[file_column + 1].front();
updateFormatSettingsIfNeeded(escaping_rule, settings, row_format, default_csv_delimiter, file_column);
try
{
return deserializeFieldByEscapingRule(type, serialization, column, *buf, escaping_rule, settings);
@ -466,6 +485,7 @@ TemplateSchemaReader::TemplateSchemaReader(
, format(format_)
, row_format(row_format_)
, format_reader(buf, ignore_spaces_, format, row_format, row_between_delimiter, format_settings)
, default_csv_delimiter(format_settings_.csv.delimiter)
{
setColumnNames(row_format.column_names);
}
@ -489,9 +509,7 @@ DataTypes TemplateSchemaReader::readRowAndGetDataTypes()
for (size_t i = 0; i != row_format.columnsCount(); ++i)
{
format_reader.skipDelimiter(i);
if (row_format.escaping_rules[i] == FormatSettings::EscapingRule::CSV)
format_settings.csv.delimiter = row_format.delimiters[i + 1].empty() ? format_settings.csv.delimiter : row_format.delimiters[i + 1].front();
updateFormatSettingsIfNeeded(row_format.escaping_rules[i], format_settings, row_format, default_csv_delimiter, i);
field = readFieldByEscapingRule(buf, row_format.escaping_rules[i], format_settings);
data_types.push_back(determineDataTypeByEscapingRule(field, format_settings, row_format.escaping_rules[i]));
}

View File

@ -128,6 +128,7 @@ private:
const ParsedTemplateFormatString row_format;
TemplateFormatReader format_reader;
bool first_row = true;
const char default_csv_delimiter;
};
bool parseDelimiterWithDiagnosticInfo(WriteBuffer & out, ReadBuffer & buf, const String & delimiter, const String & description, bool skip_spaces);

View File

@ -111,7 +111,7 @@ public:
protected:
ReadBuffer * in;
const FormatSettings format_settings;
FormatSettings format_settings;
};
/// Base class for schema inference for formats with -WithNames and -WithNamesAndTypes suffixes.

View File

@ -794,8 +794,6 @@ void Fetcher::downloadBasePartOrProjectionPartToDiskRemoteMeta(
/// NOTE The is_cancelled flag also makes sense to check every time you read over the network,
/// performing a poll with a not very large timeout.
/// And now we check it only between read chunks (in the `copyData` function).
data_part_storage->removeSharedRecursive(true);
data_part_storage->commitTransaction();
throw Exception("Fetching of part was cancelled", ErrorCodes::ABORTED);
}
@ -855,7 +853,6 @@ void Fetcher::downloadBaseOrProjectionPartToDisk(
/// NOTE The is_cancelled flag also makes sense to check every time you read over the network,
/// performing a poll with a not very large timeout.
/// And now we check it only between read chunks (in the `copyData` function).
data_part_storage->removeRecursive();
throw Exception("Fetching of part was cancelled", ErrorCodes::ABORTED);
}
@ -934,22 +931,36 @@ MergeTreeData::MutableDataPartPtr Fetcher::downloadPartToDisk(
CurrentMetrics::Increment metric_increment{CurrentMetrics::ReplicatedFetch};
for (size_t i = 0; i < projections; ++i)
try
{
String projection_name;
readStringBinary(projection_name, in);
MergeTreeData::DataPart::Checksums projection_checksum;
for (size_t i = 0; i < projections; ++i)
{
String projection_name;
readStringBinary(projection_name, in);
MergeTreeData::DataPart::Checksums projection_checksum;
auto projection_part_storage = data_part_storage->getProjection(projection_name + ".proj");
projection_part_storage->createDirectories();
downloadBaseOrProjectionPartToDisk(
replica_path, projection_part_storage, sync, in, projection_checksum, throttler);
checksums.addFile(
projection_name + ".proj", projection_checksum.getTotalSizeOnDisk(), projection_checksum.getTotalChecksumUInt128());
auto projection_part_storage = data_part_storage->getProjection(projection_name + ".proj");
projection_part_storage->createDirectories();
downloadBaseOrProjectionPartToDisk(
replica_path, projection_part_storage, sync, in, projection_checksum, throttler);
checksums.addFile(
projection_name + ".proj", projection_checksum.getTotalSizeOnDisk(), projection_checksum.getTotalChecksumUInt128());
}
// Download the base part
downloadBaseOrProjectionPartToDisk(replica_path, data_part_storage, sync, in, checksums, throttler);
}
catch (const Exception & e)
{
/// Remove the whole part directory if fetch of base
/// part or fetch of any projection was stopped.
if (e.code() == ErrorCodes::ABORTED)
{
data_part_storage->removeRecursive();
data_part_storage->commitTransaction();
}
throw;
}
// Download the base part
downloadBaseOrProjectionPartToDisk(replica_path, data_part_storage, sync, in, checksums, throttler);
assertEOF(in);
data_part_storage->commitTransaction();
@ -1007,23 +1018,37 @@ MergeTreeData::MutableDataPartPtr Fetcher::downloadPartToDiskRemoteMeta(
data_part_storage->createDirectories();
for (size_t i = 0; i < projections; ++i)
try
{
String projection_name;
readStringBinary(projection_name, in);
MergeTreeData::DataPart::Checksums projection_checksum;
for (size_t i = 0; i < projections; ++i)
{
String projection_name;
readStringBinary(projection_name, in);
MergeTreeData::DataPart::Checksums projection_checksum;
auto projection_part_storage = data_part_storage->getProjection(projection_name + ".proj");
projection_part_storage->createDirectories();
downloadBasePartOrProjectionPartToDiskRemoteMeta(
replica_path, projection_part_storage, in, projection_checksum, throttler);
checksums.addFile(
projection_name + ".proj", projection_checksum.getTotalSizeOnDisk(), projection_checksum.getTotalChecksumUInt128());
}
auto projection_part_storage = data_part_storage->getProjection(projection_name + ".proj");
projection_part_storage->createDirectories();
downloadBasePartOrProjectionPartToDiskRemoteMeta(
replica_path, projection_part_storage, in, projection_checksum, throttler);
checksums.addFile(
projection_name + ".proj", projection_checksum.getTotalSizeOnDisk(), projection_checksum.getTotalChecksumUInt128());
replica_path, data_part_storage, in, checksums, throttler);
}
catch (const Exception & e)
{
if (e.code() == ErrorCodes::ABORTED)
{
/// Remove the whole part directory if fetch of base
/// part or fetch of any projection was stopped.
data_part_storage->removeSharedRecursive(true);
data_part_storage->commitTransaction();
}
throw;
}
downloadBasePartOrProjectionPartToDiskRemoteMeta(
replica_path, data_part_storage, in, checksums, throttler);
assertEOF(in);
MergeTreeData::MutableDataPartPtr new_data_part;

View File

@ -46,6 +46,17 @@ class MarkCache;
class UncompressedCache;
class MergeTreeTransaction;
enum class DataPartRemovalState
{
NOT_ATTEMPTED,
VISIBLE_TO_TRANSACTIONS,
NON_UNIQUE_OWNERSHIP,
NOT_REACHED_REMOVAL_TIME,
HAS_SKIPPED_MUTATION_PARENT,
REMOVED,
};
/// Description of the data part.
class IMergeTreeDataPart : public std::enable_shared_from_this<IMergeTreeDataPart>, public DataPartStorageHolder
{
@ -446,6 +457,10 @@ public:
void removeDeleteOnDestroyMarker();
void removeVersionMetadata();
mutable std::atomic<DataPartRemovalState> removal_state = DataPartRemovalState::NOT_ATTEMPTED;
mutable std::atomic<time_t> last_removal_attemp_time = 0;
protected:
/// Total size of all columns, calculated once in calcuateColumnSizesOnDisk

View File

@ -88,6 +88,10 @@ MergeListElement::MergeListElement(
/// thread_group::memory_tracker, but MemoryTrackerThreadSwitcher will reset parent).
memory_tracker.setProfilerStep(settings.memory_profiler_step);
memory_tracker.setSampleProbability(settings.memory_profiler_sample_probability);
/// Specify sample probability also for current thread to track more deallocations.
if (auto * thread_memory_tracker = DB::CurrentThread::getMemoryTracker())
thread_memory_tracker->setSampleProbability(settings.memory_profiler_sample_probability);
memory_tracker.setSoftLimit(settings.memory_overcommit_ratio_denominator);
if (settings.memory_tracker_fault_probability > 0.0)
memory_tracker.setFaultProbability(settings.memory_tracker_fault_probability);

View File

@ -84,6 +84,7 @@
#include <base/sort.h>
#include <algorithm>
#include <atomic>
#include <iomanip>
#include <optional>
#include <set>
@ -1762,9 +1763,12 @@ MergeTreeData::DataPartsVector MergeTreeData::grabOldParts(bool force)
{
const DataPartPtr & part = *it;
part->last_removal_attemp_time.store(time_now, std::memory_order_relaxed);
/// Do not remove outdated part if it may be visible for some transaction
if (!part->version.canBeRemoved())
{
part->removal_state.store(DataPartRemovalState::VISIBLE_TO_TRANSACTIONS, std::memory_order_relaxed);
skipped_parts.push_back(part->info);
continue;
}
@ -1772,20 +1776,27 @@ MergeTreeData::DataPartsVector MergeTreeData::grabOldParts(bool force)
/// Grab only parts that are not used by anyone (SELECTs for example).
if (!part.unique())
{
part->removal_state.store(DataPartRemovalState::NON_UNIQUE_OWNERSHIP, std::memory_order_relaxed);
skipped_parts.push_back(part->info);
continue;
}
auto part_remove_time = part->remove_time.load(std::memory_order_relaxed);
if ((part_remove_time < time_now && time_now - part_remove_time > getSettings()->old_parts_lifetime.totalSeconds() && !has_skipped_mutation_parent(part))
bool reached_removal_time = part_remove_time < time_now && time_now - part_remove_time > getSettings()->old_parts_lifetime.totalSeconds();
if ((reached_removal_time && !has_skipped_mutation_parent(part))
|| force
|| isInMemoryPart(part) /// Remove in-memory parts immediately to not store excessive data in RAM
|| (part->version.creation_csn == Tx::RolledBackCSN && getSettings()->remove_rolled_back_parts_immediately))
{
part->removal_state.store(DataPartRemovalState::REMOVED, std::memory_order_relaxed);
parts_to_delete.emplace_back(it);
}
else
{
if (!reached_removal_time)
part->removal_state.store(DataPartRemovalState::NOT_REACHED_REMOVAL_TIME, std::memory_order_relaxed);
else
part->removal_state.store(DataPartRemovalState::HAS_SKIPPED_MUTATION_PARENT, std::memory_order_relaxed);
skipped_parts.push_back(part->info);
continue;
}
@ -2600,7 +2611,17 @@ void MergeTreeData::checkAlterIsPossible(const AlterCommands & commands, Context
}
}
dropped_columns.emplace(command.column_name);
if (old_metadata.columns.has(command.column_name))
{
dropped_columns.emplace(command.column_name);
}
else
{
const auto & nested = old_metadata.columns.getNested(command.column_name);
for (const auto & nested_column : nested)
dropped_columns.emplace(nested_column.name);
}
}
else if (command.type == AlterCommand::RESET_SETTING)
{
@ -3884,9 +3905,9 @@ MergeTreeData::DataPartsVector MergeTreeData::getVisibleDataPartsVectorInPartiti
return res;
}
MergeTreeData::DataPartPtr MergeTreeData::getPartIfExists(const MergeTreePartInfo & part_info, const MergeTreeData::DataPartStates & valid_states)
MergeTreeData::DataPartPtr MergeTreeData::getPartIfExists(const MergeTreePartInfo & part_info, const MergeTreeData::DataPartStates & valid_states, DataPartsLock * acquired_lock)
{
auto lock = lockParts();
auto lock = (acquired_lock) ? DataPartsLock() : lockParts();
auto it = data_parts_by_info.find(part_info);
if (it == data_parts_by_info.end())
@ -3899,9 +3920,9 @@ MergeTreeData::DataPartPtr MergeTreeData::getPartIfExists(const MergeTreePartInf
return nullptr;
}
MergeTreeData::DataPartPtr MergeTreeData::getPartIfExists(const String & part_name, const MergeTreeData::DataPartStates & valid_states)
MergeTreeData::DataPartPtr MergeTreeData::getPartIfExists(const String & part_name, const MergeTreeData::DataPartStates & valid_states, DataPartsLock * acquired_lock)
{
return getPartIfExists(MergeTreePartInfo::fromPartName(part_name, format_version), valid_states);
return getPartIfExists(MergeTreePartInfo::fromPartName(part_name, format_version), valid_states, acquired_lock);
}

View File

@ -514,8 +514,8 @@ public:
DataPartsVector getDataPartsVectorInPartitionForInternalUsage(const DataPartStates & affordable_states, const String & partition_id, DataPartsLock * acquired_lock = nullptr) const;
/// Returns the part with the given name and state or nullptr if no such part.
DataPartPtr getPartIfExists(const String & part_name, const DataPartStates & valid_states);
DataPartPtr getPartIfExists(const MergeTreePartInfo & part_info, const DataPartStates & valid_states);
DataPartPtr getPartIfExists(const String & part_name, const DataPartStates & valid_states, DataPartsLock * acquired_lock = nullptr);
DataPartPtr getPartIfExists(const MergeTreePartInfo & part_info, const DataPartStates & valid_states, DataPartsLock * acquired_lock = nullptr);
/// Total size of active parts in bytes.
size_t getTotalActiveSizeInBytes() const;

View File

@ -172,7 +172,7 @@ ColumnWithTypeAndName RPNBuilderTreeNode::getConstantColumn() const
if (ast_node)
{
const auto * literal = assert_cast<const ASTLiteral *>(ast_node);
const auto * literal = typeid_cast<const ASTLiteral *>(ast_node);
if (literal)
{
result.type = applyVisitor(FieldToDataType(), literal->value);

View File

@ -386,8 +386,13 @@ void ReplicatedMergeTreeRestartingThread::setReadonly(bool on_shutdown)
CurrentMetrics::add(CurrentMetrics::ReadonlyReplica);
/// Replica was already readonly, but we should decrement the metric, because we are detaching/dropping table.
if (on_shutdown)
/// If the first pass wasn't done, we don't have to decrement, because the metric wasn't incremented in the first place.
/// On a full shutdown the task should already be deactivated, so no race is present.
if (!first_time && on_shutdown)
{
CurrentMetrics::sub(CurrentMetrics::ReadonlyReplica);
assert(CurrentMetrics::get(CurrentMetrics::ReadonlyReplica) >= 0);
}
}
void ReplicatedMergeTreeRestartingThread::setNotReadonly()
@ -397,7 +402,10 @@ void ReplicatedMergeTreeRestartingThread::setNotReadonly()
/// because we don't want to change this metric if replication is started successfully.
/// So we should not decrement it when replica stopped being readonly on startup.
if (storage.is_readonly.compare_exchange_strong(old_val, false) && !first_time)
{
CurrentMetrics::sub(CurrentMetrics::ReadonlyReplica);
assert(CurrentMetrics::get(CurrentMetrics::ReadonlyReplica) >= 0);
}
}
}

View File

@ -174,6 +174,9 @@ HashJoinPtr StorageJoin::getJoinLocked(std::shared_ptr<TableJoin> analyzed_join,
"Table {} needs the same join_use_nulls setting as present in LEFT or FULL JOIN",
getStorageID().getNameForLogs());
if (analyzed_join->getClauses().size() != 1)
throw Exception(ErrorCodes::INCOMPATIBLE_TYPE_OF_JOIN, "JOIN keys should match to the Join engine keys [{}]", fmt::join(getKeyNames(), ", "));
const auto & join_on = analyzed_join->getOnlyClause();
if (join_on.on_filter_condition_left || join_on.on_filter_condition_right)
throw Exception(ErrorCodes::INCOMPATIBLE_TYPE_OF_JOIN, "ON section of JOIN with filter conditions is not implemented");
@ -211,9 +214,9 @@ HashJoinPtr StorageJoin::getJoinLocked(std::shared_ptr<TableJoin> analyzed_join,
left_key_names_resorted.push_back(key_names_left[key_position]);
}
/// Set names qualifiers: table.column -> column
/// It's required because storage join stores non-qualified names
/// Qualifies will be added by join implementation (HashJoin)
/// Set qualified identifiers to original names (table.column -> column).
/// It's required because storage join stores non-qualified names.
/// Qualifiers will be added by the join implementation (TableJoin contains a rename mapping).
analyzed_join->setRightKeys(key_names);
analyzed_join->setLeftKeys(left_key_names_resorted);

View File

@ -1370,19 +1370,21 @@ MergeTreeDataPartPtr StorageMergeTree::outdatePart(MergeTreeTransaction * txn, c
{
/// Forcefully stop merges and make part outdated
auto merge_blocker = stopMergesAndWait();
auto part = getPartIfExists(part_name, {MergeTreeDataPartState::Active});
auto parts_lock = lockParts();
auto part = getPartIfExists(part_name, {MergeTreeDataPartState::Active}, &parts_lock);
if (!part)
throw Exception(ErrorCodes::NO_SUCH_DATA_PART, "Part {} not found, won't try to drop it.", part_name);
removePartsFromWorkingSet(txn, {part}, true);
removePartsFromWorkingSet(txn, {part}, true, &parts_lock);
return part;
}
else
{
/// Wait for the merges selector
std::unique_lock lock(currently_processing_in_background_mutex);
auto parts_lock = lockParts();
auto part = getPartIfExists(part_name, {MergeTreeDataPartState::Active});
auto part = getPartIfExists(part_name, {MergeTreeDataPartState::Active}, &parts_lock);
/// It's okay, part was already removed
if (!part)
return nullptr;
@ -1392,7 +1394,7 @@ MergeTreeDataPartPtr StorageMergeTree::outdatePart(MergeTreeTransaction * txn, c
if (currently_merging_mutating_parts.contains(part))
return nullptr;
removePartsFromWorkingSet(txn, {part}, true);
removePartsFromWorkingSet(txn, {part}, true, &parts_lock);
return part;
}
}

View File

@ -633,6 +633,8 @@ void StorageReplicatedMergeTree::createNewZooKeeperNodes()
futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/pinned_part_uuids", getPinnedPartUUIDs()->toString(), zkutil::CreateMode::Persistent));
/// For ALTER PARTITION with multi-leaders
futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/alter_partition_version", String(), zkutil::CreateMode::Persistent));
/// For deduplication of async inserts
futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/async_blocks", String(), zkutil::CreateMode::Persistent));
/// As for now, "/temp" node must exist, but we want to be able to remove it in future
if (zookeeper->exists(zookeeper_path + "/temp"))
@ -4535,7 +4537,7 @@ SinkToStoragePtr StorageReplicatedMergeTree::write(const ASTPtr & /*query*/, con
const auto storage_settings_ptr = getSettings();
const Settings & query_settings = local_context->getSettingsRef();
bool deduplicate = storage_settings_ptr->replicated_deduplication_window != 0 && query_settings.insert_deduplicate;
bool async_deduplicate = query_settings.async_insert && storage_settings_ptr->replicated_deduplication_window_for_async_inserts != 0 && query_settings.insert_deduplicate;
bool async_deduplicate = query_settings.async_insert && query_settings.async_insert_deduplicate && storage_settings_ptr->replicated_deduplication_window_for_async_inserts != 0 && query_settings.insert_deduplicate;
if (async_deduplicate)
return std::make_shared<ReplicatedMergeTreeSinkWithAsyncDeduplicate>(
*this, metadata_snapshot, query_settings.insert_quorum.valueOr(0),
@ -6562,7 +6564,7 @@ void StorageReplicatedMergeTree::getClearBlocksInPartitionOpsImpl(
{
Strings blocks;
if (Coordination::Error::ZOK != zookeeper.tryGetChildren(fs::path(zookeeper_path) / blocks_dir_name, blocks))
throw Exception(zookeeper_path + "/" + blocks_dir_name + "blocks doesn't exist", ErrorCodes::NOT_FOUND_NODE);
throw Exception(ErrorCodes::NOT_FOUND_NODE, "Node {}/{} doesn't exist", zookeeper_path, blocks_dir_name);
String partition_prefix = partition_id + "_";
Strings paths_to_get;

View File

@ -4,6 +4,8 @@
#include <Interpreters/Context.h>
#include <Access/ContextAccess.h>
#include <Storages/System/StorageSystemDatabases.h>
#include <Parsers/ASTCreateQuery.h>
#include <Common/logger_useful.h>
namespace DB
@ -17,6 +19,7 @@ NamesAndTypesList StorageSystemDatabases::getNamesAndTypes()
{"data_path", std::make_shared<DataTypeString>()},
{"metadata_path", std::make_shared<DataTypeString>()},
{"uuid", std::make_shared<DataTypeUUID>()},
{"engine_full", std::make_shared<DataTypeString>()},
{"comment", std::make_shared<DataTypeString>()}
};
}
@ -28,6 +31,43 @@ NamesAndAliases StorageSystemDatabases::getNamesAndAliases()
};
}
static String getEngineFull(const DatabasePtr & database)
{
DDLGuardPtr guard;
while (true)
{
String name = database->getDatabaseName();
guard = DatabaseCatalog::instance().getDDLGuard(name, "");
/// Ensure that the database was not renamed before we acquired the lock
auto locked_database = DatabaseCatalog::instance().tryGetDatabase(name);
if (locked_database.get() == database.get())
break;
/// Database was dropped
if (name == database->getDatabaseName())
return {};
guard.reset();
LOG_TRACE(&Poco::Logger::get("StorageSystemDatabases"), "Failed to lock database {} ({}), will retry", name, database->getUUID());
}
ASTPtr ast = database->getCreateDatabaseQuery();
auto * ast_create = ast->as<ASTCreateQuery>();
if (!ast_create || !ast_create->storage)
return {};
String engine_full = ast_create->storage->formatWithSecretsHidden();
static const char * const extra_head = " ENGINE = ";
if (startsWith(engine_full, extra_head))
engine_full = engine_full.substr(strlen(extra_head));
return engine_full;
}
void StorageSystemDatabases::fillData(MutableColumns & res_columns, ContextPtr context, const SelectQueryInfo &) const
{
const auto access = context->getAccess();
@ -47,7 +87,8 @@ void StorageSystemDatabases::fillData(MutableColumns & res_columns, ContextPtr c
res_columns[2]->insert(context->getPath() + database->getDataPath());
res_columns[3]->insert(database->getMetadataPath());
res_columns[4]->insert(database->getUUID());
res_columns[5]->insert(database->getDatabaseComment());
res_columns[5]->insert(getEngineFull(database));
res_columns[6]->insert(database->getDatabaseComment());
}
}

View File

@ -1,4 +1,7 @@
#include "StorageSystemParts.h"
#include <atomic>
#include <memory>
#include <string_view>
#include <Common/escapeForFileName.h>
#include <Columns/ColumnString.h>
@ -15,6 +18,29 @@
#include <Interpreters/TransactionVersionMetadata.h>
#include <Interpreters/Context.h>
namespace
{
std::string_view getRemovalStateDescription(DB::DataPartRemovalState state)
{
switch (state)
{
case DB::DataPartRemovalState::NOT_ATTEMPTED:
return "Cleanup thread hasn't seen this part yet";
case DB::DataPartRemovalState::VISIBLE_TO_TRANSACTIONS:
return "Part maybe visible for transactions";
case DB::DataPartRemovalState::NON_UNIQUE_OWNERSHIP:
return "Part ownership is not unique";
case DB::DataPartRemovalState::NOT_REACHED_REMOVAL_TIME:
return "Part hasn't reached removal time yet";
case DB::DataPartRemovalState::HAS_SKIPPED_MUTATION_PARENT:
return "Waiting mutation parent to be removed";
case DB::DataPartRemovalState::REMOVED:
return "Part was selected to be removed";
}
}
}
namespace DB
{
@ -92,6 +118,9 @@ StorageSystemParts::StorageSystemParts(const StorageID & table_id_)
{"removal_csn", std::make_shared<DataTypeUInt64>()},
{"has_lightweight_delete", std::make_shared<DataTypeUInt8>()},
{"last_removal_attemp_time", std::make_shared<DataTypeDateTime>()},
{"removal_state", std::make_shared<DataTypeString>()},
}
)
{
@ -310,6 +339,10 @@ void StorageSystemParts::processNextStorage(
columns[res_index++]->insert(part->version.removal_csn.load(std::memory_order_relaxed));
if (columns_mask[src_index++])
columns[res_index++]->insert(part->hasLightweightDelete());
if (columns_mask[src_index++])
columns[res_index++]->insert(static_cast<UInt64>(part->last_removal_attemp_time.load(std::memory_order_relaxed)));
if (columns_mask[src_index++])
columns[res_index++]->insert(getRemovalStateDescription(part->removal_state.load(std::memory_order_relaxed)));
/// _state column should be the latest.
/// Do not use part->getState*, it can be changed from different thread

View File

@ -32,8 +32,6 @@ from version_helper import (
RELEASE_READY_STATUS = "Ready for release"
git = Git()
class Repo:
VALID = ("ssh", "https", "origin")
@ -79,7 +77,7 @@ class Release:
self.release_commit = release_commit
assert release_type in self.BIG + self.SMALL
self.release_type = release_type
self._git = git
self._git = Git()
self._version = get_version_from_repo(git=self._git)
self._release_branch = ""
self._rollback_stack = [] # type: List[str]

View File

@ -145,7 +145,7 @@ if __name__ == "__main__":
)
logging.info("Going to run func tests: %s", run_command)
with TeePopen(run_command, run_log_path) as process:
with TeePopen(run_command, run_log_path, timeout=60 * 150) as process:
retcode = process.wait()
if retcode == 0:
logging.info("Run successfully")

View File

@ -1,5 +1,6 @@
#!/usr/bin/env python3
from io import TextIOWrapper
from subprocess import Popen, PIPE, STDOUT
from threading import Thread
from time import sleep
@ -14,15 +15,23 @@ import sys
# it finishes. stderr and stdout will be redirected both to specified file and
# stdout.
class TeePopen:
# pylint: disable=W0102
def __init__(self, command, log_file, env=os.environ.copy(), timeout=None):
def __init__(
self,
command: str,
log_file: str,
env: Optional[dict] = None,
timeout: Optional[int] = None,
):
self.command = command
self.log_file = log_file
self.env = env
self._log_file_name = log_file
self._log_file = None # type: Optional[TextIOWrapper]
self.env = env or os.environ.copy()
self._process = None # type: Optional[Popen]
self.timeout = timeout
def _check_timeout(self):
def _check_timeout(self) -> None:
if self.timeout is None:
return
sleep(self.timeout)
while self.process.poll() is None:
logging.warning(
@ -33,7 +42,7 @@ class TeePopen:
os.killpg(self.process.pid, 9)
sleep(10)
def __enter__(self):
def __enter__(self) -> "TeePopen":
self.process = Popen(
self.command,
shell=True,
@ -44,25 +53,21 @@ class TeePopen:
stdout=PIPE,
bufsize=1,
)
self.log_file = open(self.log_file, "w", encoding="utf-8")
if self.timeout is not None and self.timeout > 0:
t = Thread(target=self._check_timeout)
t.daemon = True # does not block the program from exit
t.start()
return self
def __exit__(self, t, value, traceback):
for line in self.process.stdout: # type: ignore
sys.stdout.write(line)
self.log_file.write(line)
self.process.wait()
def __exit__(self, exc_type, exc_value, traceback):
self.wait()
self.log_file.close()
def wait(self):
for line in self.process.stdout: # type: ignore
sys.stdout.write(line)
self.log_file.write(line)
if self.process.stdout is not None:
for line in self.process.stdout:
sys.stdout.write(line)
self.log_file.write(line)
return self.process.wait()
@ -75,3 +80,9 @@ class TeePopen:
@process.setter
def process(self, process: Popen) -> None:
self._process = process
@property
def log_file(self) -> TextIOWrapper:
if self._log_file is None:
self._log_file = open(self._log_file_name, "w", encoding="utf-8")
return self._log_file
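
The constructor change above swaps the `env=os.environ.copy()` default, which Python evaluates once when the function is defined, for `None` plus a per-instance copy, and it moves opening the log file into a lazily evaluated property. A minimal sketch of both patterns follows. `LazyLog` and `demo` are hypothetical names used only for illustration; they are not part of the repository.

```python
import os
import tempfile
from typing import IO, Optional


class LazyLog:
    """Hypothetical illustration class; not the TeePopen implementation."""

    def __init__(self, log_file: str, env: Optional[dict] = None):
        self._log_file_name = log_file
        self._log_file: Optional[IO[str]] = None
        # Copy the environment per instance instead of once at definition time.
        # Note: an empty dict is falsy, so env={} also falls back to the copy.
        self.env = env or os.environ.copy()

    @property
    def log_file(self) -> IO[str]:
        # Open on first access only, so constructing the object creates no file.
        if self._log_file is None:
            self._log_file = open(self._log_file_name, "w", encoding="utf-8")
        return self._log_file

    def close(self) -> None:
        if self._log_file is not None:
            self._log_file.close()
            self._log_file = None


def demo() -> bool:
    """Return True if the file appears only after the first property access."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "run.log")
        log = LazyLog(path)
        existed_before = os.path.exists(path)
        log.log_file.write("hello\n")
        log.close()
        return (not existed_before) and os.path.exists(path)
```

One consequence of the lazy property is that an object that is constructed but never used leaves no empty log file behind.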

View File

@ -1180,6 +1180,9 @@ def test_tables_dependency():
t4 = random_table_names[3]
t5 = random_table_names[4]
t6 = random_table_names[5]
t7 = random_table_names[6]
t8 = random_table_names[7]
t9 = random_table_names[8]
# Create a materialized view and a dictionary with a local table as source.
instance.query(
@ -1193,7 +1196,7 @@ def test_tables_dependency():
instance.query(f"CREATE MATERIALIZED VIEW {t3} TO {t2} AS SELECT x, y FROM {t1}")
instance.query(
f"CREATE DICTIONARY {t4} (x Int64, y String) PRIMARY KEY x SOURCE(CLICKHOUSE(HOST 'localhost' PORT tcpPort() TABLE '{t1.split('.')[1]}' DB '{t1.split('.')[0]}')) LAYOUT(FLAT()) LIFETIME(0)"
f"CREATE DICTIONARY {t4} (x Int64, y String) PRIMARY KEY x SOURCE(CLICKHOUSE(HOST 'localhost' PORT tcpPort() TABLE '{t1.split('.')[1]}' DB '{t1.split('.')[0]}')) LAYOUT(FLAT()) LIFETIME(4)"
)
instance.query(f"CREATE TABLE {t5} AS dictionary({t4})")
@ -1202,12 +1205,25 @@ def test_tables_dependency():
f"CREATE TABLE {t6}(x Int64, y String DEFAULT dictGet({t4}, 'y', x)) ENGINE=MergeTree ORDER BY tuple()"
)
instance.query(f"CREATE VIEW {t7} AS SELECT sum(x) FROM (SELECT x FROM {t6})")
instance.query(
f"CREATE TABLE {t8} AS {t2} ENGINE = Buffer({t2.split('.')[0]}, {t2.split('.')[1]}, 16, 10, 100, 10000, 1000000, 10000000, 100000000)"
)
instance.query(
f"CREATE DICTIONARY {t9} (x Int64, y String) PRIMARY KEY x SOURCE(CLICKHOUSE(TABLE '{t1.split('.')[1]}' DB '{t1.split('.')[0]}')) LAYOUT(FLAT()) LIFETIME(9)"
)
# Make backup.
backup_name = new_backup_name()
instance.query(f"BACKUP DATABASE test, DATABASE test2 TO {backup_name}")
# Drop everything in reverse order.
def drop():
instance.query(f"DROP DICTIONARY {t9}")
instance.query(f"DROP TABLE {t8} NO DELAY")
instance.query(f"DROP TABLE {t7} NO DELAY")
instance.query(f"DROP TABLE {t6} NO DELAY")
instance.query(f"DROP TABLE {t5} NO DELAY")
instance.query(f"DROP DICTIONARY {t4}")
@ -1219,11 +1235,36 @@ def test_tables_dependency():
drop()
# Restore everything and check.
# Restore everything.
instance.query(f"RESTORE ALL FROM {backup_name}")
# Check everything is restored.
assert instance.query(
"SELECT concat(database, '.', name) AS c FROM system.tables WHERE database IN ['test', 'test2'] ORDER BY c"
) == TSV(sorted([t1, t2, t3, t4, t5, t6]))
) == TSV(sorted([t1, t2, t3, t4, t5, t6, t7, t8, t9]))
# Check logs.
instance.query("SYSTEM FLUSH LOGS")
expect_in_logs = [
f"Table {t1} has no dependencies (level 0)",
f"Table {t2} has no dependencies (level 0)",
(
f"Table {t3} has 2 dependencies: {t1}, {t2} (level 1)",
f"Table {t3} has 2 dependencies: {t2}, {t1} (level 1)",
),
f"Table {t4} has 1 dependencies: {t1} (level 1)",
f"Table {t5} has 1 dependencies: {t4} (level 2)",
f"Table {t6} has 1 dependencies: {t4} (level 2)",
f"Table {t7} has 1 dependencies: {t6} (level 3)",
f"Table {t8} has 1 dependencies: {t2} (level 1)",
f"Table {t9} has 1 dependencies: {t1} (level 1)",
]
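for expect in expect_in_logs:
# Wrap plain strings: tuple("abc") would split them into single characters.
variants = expect if isinstance(expect, tuple) else (expect,)
assert any(
instance.contains_in_log(f"RestorerFromBackup: {x}") for x in variants
)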
drop()
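
When a list of expected log lines mixes plain strings with tuples of acceptable alternatives, it helps to normalize each entry before iterating, because `tuple()` applied to a `str` yields its individual characters. The sketch below shows one way to do that. The helper names `as_variants` and `all_expected_found` are hypothetical, and `contains` stands in for a log lookup such as `instance.contains_in_log`.

```python
from typing import Callable, Iterable, Tuple, Union

Expectation = Union[str, Tuple[str, ...]]


def as_variants(expect: Expectation) -> Tuple[str, ...]:
    """Wrap a plain string into a 1-tuple; pass tuples through unchanged."""
    return (expect,) if isinstance(expect, str) else tuple(expect)


def all_expected_found(
    expectations: Iterable[Expectation], contains: Callable[[str], bool]
) -> bool:
    """Each expectation passes if at least one of its variants is found."""
    return all(
        any(contains(variant) for variant in as_variants(expect))
        for expect in expectations
    )
```

With this normalization, a string entry is matched as a whole line, while a tuple entry passes if any one of its alternatives is present.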

View File

@ -796,6 +796,84 @@ def test_mutation():
node1.query(f"RESTORE TABLE tbl ON CLUSTER 'cluster' FROM {backup_name}")
def test_tables_dependency():
node1.query("CREATE DATABASE mydb ON CLUSTER 'cluster3'")
node1.query(
"CREATE TABLE mydb.src ON CLUSTER 'cluster' (x Int64, y String) ENGINE=MergeTree ORDER BY tuple()"
)
node1.query(
"CREATE DICTIONARY mydb.dict ON CLUSTER 'cluster' (x Int64, y String) PRIMARY KEY x "
"SOURCE(CLICKHOUSE(HOST 'localhost' PORT tcpPort() DB 'mydb' TABLE 'src')) LAYOUT(FLAT()) LIFETIME(0)"
)
node1.query(
"CREATE TABLE mydb.dist1 (x Int64) ENGINE=Distributed('cluster', 'mydb', 'src')"
)
node3.query(
"CREATE TABLE mydb.dist2 (x Int64) ENGINE=Distributed(cluster, 'mydb', 'src')"
)
node1.query("CREATE TABLE mydb.clusterfunc1 AS cluster('cluster', 'mydb.src')")
node1.query("CREATE TABLE mydb.clusterfunc2 AS cluster(cluster, mydb.src)")
node1.query("CREATE TABLE mydb.clusterfunc3 AS cluster(cluster, 'mydb', 'src')")
node1.query(
"CREATE TABLE mydb.clusterfunc4 AS cluster(cluster, dictionary(mydb.dict))"
)
node1.query(
"CREATE TABLE mydb.clusterfunc5 AS clusterAllReplicas(cluster, dictionary(mydb.dict))"
)
node3.query("CREATE TABLE mydb.clusterfunc6 AS cluster('cluster', 'mydb.src')")
node3.query("CREATE TABLE mydb.clusterfunc7 AS cluster(cluster, mydb.src)")
node3.query("CREATE TABLE mydb.clusterfunc8 AS cluster(cluster, 'mydb', 'src')")
node3.query(
"CREATE TABLE mydb.clusterfunc9 AS cluster(cluster, dictionary(mydb.dict))"
)
node3.query(
"CREATE TABLE mydb.clusterfunc10 AS clusterAllReplicas(cluster, dictionary(mydb.dict))"
)
backup_name = new_backup_name()
node3.query(f"BACKUP DATABASE mydb ON CLUSTER 'cluster3' TO {backup_name}")
node3.query("DROP DATABASE mydb")
node3.query(f"RESTORE DATABASE mydb ON CLUSTER 'cluster3' FROM {backup_name}")
node3.query("SYSTEM FLUSH LOGS ON CLUSTER 'cluster3'")
expect_in_logs_1 = [
"Table mydb.src has no dependencies (level 0)",
"Table mydb.dict has 1 dependencies: mydb.src (level 1)",
"Table mydb.dist1 has 1 dependencies: mydb.src (level 1)",
"Table mydb.clusterfunc1 has 1 dependencies: mydb.src (level 1)",
"Table mydb.clusterfunc2 has 1 dependencies: mydb.src (level 1)",
"Table mydb.clusterfunc3 has 1 dependencies: mydb.src (level 1)",
"Table mydb.clusterfunc4 has 1 dependencies: mydb.dict (level 2)",
"Table mydb.clusterfunc5 has 1 dependencies: mydb.dict (level 2)",
]
expect_in_logs_2 = [
"Table mydb.src has no dependencies (level 0)",
"Table mydb.dict has 1 dependencies: mydb.src (level 1)",
]
expect_in_logs_3 = [
"Table mydb.dist2 has no dependencies (level 0)",
"Table mydb.clusterfunc6 has no dependencies (level 0)",
"Table mydb.clusterfunc7 has no dependencies (level 0)",
"Table mydb.clusterfunc8 has no dependencies (level 0)",
"Table mydb.clusterfunc9 has no dependencies (level 0)",
"Table mydb.clusterfunc10 has no dependencies (level 0)",
]
for expect in expect_in_logs_1:
assert node1.contains_in_log(f"RestorerFromBackup: {expect}")
for expect in expect_in_logs_2:
assert node2.contains_in_log(f"RestorerFromBackup: {expect}")
for expect in expect_in_logs_3:
assert node3.contains_in_log(f"RestorerFromBackup: {expect}")
def test_get_error_from_other_host():
node1.query("CREATE TABLE tbl (`x` UInt8) ENGINE = MergeTree ORDER BY x")
node1.query("INSERT INTO tbl VALUES (3)")

View File

@ -211,8 +211,8 @@ def test_attach_detach_partition(cluster):
node.query("ALTER TABLE hdfs_test DETACH PARTITION '2020-01-03'")
assert node.query("SELECT count(*) FROM hdfs_test FORMAT Values") == "(4096)"
wait_for_delete_inactive_parts(node, "hdfs_test")
wait_for_delete_empty_parts(node, "hdfs_test")
wait_for_delete_inactive_parts(node, "hdfs_test")
hdfs_objects = fs.listdir("/clickhouse")
assert len(hdfs_objects) == FILES_OVERHEAD + FILES_OVERHEAD_PER_PART_WIDE * 2
@ -225,8 +225,8 @@ def test_attach_detach_partition(cluster):
node.query("ALTER TABLE hdfs_test DROP PARTITION '2020-01-03'")
assert node.query("SELECT count(*) FROM hdfs_test FORMAT Values") == "(4096)"
wait_for_delete_inactive_parts(node, "hdfs_test")
wait_for_delete_empty_parts(node, "hdfs_test")
wait_for_delete_inactive_parts(node, "hdfs_test")
hdfs_objects = fs.listdir("/clickhouse")
assert len(hdfs_objects) == FILES_OVERHEAD + FILES_OVERHEAD_PER_PART_WIDE
@ -237,8 +237,8 @@ def test_attach_detach_partition(cluster):
settings={"allow_drop_detached": 1},
)
assert node.query("SELECT count(*) FROM hdfs_test FORMAT Values") == "(0)"
wait_for_delete_inactive_parts(node, "hdfs_test")
wait_for_delete_empty_parts(node, "hdfs_test")
wait_for_delete_inactive_parts(node, "hdfs_test")
hdfs_objects = fs.listdir("/clickhouse")
assert len(hdfs_objects) == FILES_OVERHEAD
@ -305,8 +305,8 @@ def test_table_manipulations(cluster):
node.query("TRUNCATE TABLE hdfs_test")
assert node.query("SELECT count(*) FROM hdfs_test FORMAT Values") == "(0)"
wait_for_delete_inactive_parts(node, "hdfs_test")
wait_for_delete_empty_parts(node, "hdfs_test")
wait_for_delete_inactive_parts(node, "hdfs_test")
hdfs_objects = fs.listdir("/clickhouse")
assert len(hdfs_objects) == FILES_OVERHEAD

View File

@ -323,8 +323,8 @@ def test_attach_detach_partition(cluster, node_name):
)
node.query("ALTER TABLE s3_test DETACH PARTITION '2020-01-03'")
wait_for_delete_inactive_parts(node, "s3_test")
wait_for_delete_empty_parts(node, "s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
assert node.query("SELECT count(*) FROM s3_test FORMAT Values") == "(4096)"
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)))
@ -339,8 +339,8 @@ def test_attach_detach_partition(cluster, node_name):
)
node.query("ALTER TABLE s3_test DROP PARTITION '2020-01-03'")
wait_for_delete_inactive_parts(node, "s3_test")
wait_for_delete_empty_parts(node, "s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
assert node.query("SELECT count(*) FROM s3_test FORMAT Values") == "(4096)"
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)))
@ -348,8 +348,8 @@ def test_attach_detach_partition(cluster, node_name):
)
node.query("ALTER TABLE s3_test DETACH PARTITION '2020-01-04'")
wait_for_delete_inactive_parts(node, "s3_test")
wait_for_delete_empty_parts(node, "s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
assert node.query("SELECT count(*) FROM s3_test FORMAT Values") == "(0)"
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/")))
@ -431,8 +431,8 @@ def test_table_manipulations(cluster, node_name):
)
node.query("TRUNCATE TABLE s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
wait_for_delete_empty_parts(node, "s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
assert node.query("SELECT count(*) FROM s3_test FORMAT Values") == "(0)"
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)))
@ -546,8 +546,8 @@ def test_freeze_unfreeze(cluster, node_name):
node.query("ALTER TABLE s3_test FREEZE WITH NAME 'backup2'")
node.query("TRUNCATE TABLE s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
wait_for_delete_empty_parts(node, "s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)))
== FILES_OVERHEAD + FILES_OVERHEAD_PER_PART_WIDE * 2
@ -586,8 +586,8 @@ def test_freeze_system_unfreeze(cluster, node_name):
node.query("ALTER TABLE s3_test_removed FREEZE WITH NAME 'backup3'")
node.query("TRUNCATE TABLE s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
wait_for_delete_empty_parts(node, "s3_test")
wait_for_delete_inactive_parts(node, "s3_test")
node.query("DROP TABLE s3_test_removed NO DELAY")
assert (
len(list(minio.list_objects(cluster.minio_bucket, "data/", recursive=True)))

View File

@ -201,8 +201,8 @@ def attach_check_all_parts_table(started_cluster):
def test_attach_check_all_parts(attach_check_all_parts_table):
q("ALTER TABLE test.attach_partition DETACH PARTITION 0")
wait_for_delete_inactive_parts(instance, "test.attach_partition")
wait_for_delete_empty_parts(instance, "test.attach_partition")
wait_for_delete_inactive_parts(instance, "test.attach_partition")
path_to_detached = path_to_data + "data/test/attach_partition/detached/"
instance.exec_in_container(["mkdir", "{}".format(path_to_detached + "0_5_5_0")])

View File

@ -112,11 +112,15 @@ def get_pgsql_client(cluster, port):
time.sleep(0.1)
@contextlib.contextmanager
def get_grpc_channel(cluster, port):
host_port = cluster.get_instance_ip("instance") + f":{port}"
channel = grpc.insecure_channel(host_port)
grpc.channel_ready_future(channel).result(timeout=10)
return channel
try:
yield channel
finally:
channel.close()
def grpc_query(channel, query_text):
@ -238,16 +242,17 @@ def test_change_postgresql_port(cluster, zk):
def test_change_grpc_port(cluster, zk):
with default_client(cluster, zk) as client:
grpc_channel = get_grpc_channel(cluster, port=9100)
assert grpc_query(grpc_channel, "SELECT 1") == "1\n"
with sync_loaded_config(client.query):
zk.set("/clickhouse/ports/grpc", b"9090")
with pytest.raises(
grpc._channel._InactiveRpcError, match="StatusCode.UNAVAILABLE"
):
grpc_query(grpc_channel, "SELECT 1")
grpc_channel_on_new_port = get_grpc_channel(cluster, port=9090)
assert grpc_query(grpc_channel_on_new_port, "SELECT 1") == "1\n"
with get_grpc_channel(cluster, port=9100) as grpc_channel:
assert grpc_query(grpc_channel, "SELECT 1") == "1\n"
with sync_loaded_config(client.query):
zk.set("/clickhouse/ports/grpc", b"9090")
with pytest.raises(
grpc._channel._InactiveRpcError, match="StatusCode.UNAVAILABLE"
):
grpc_query(grpc_channel, "SELECT 1")
with get_grpc_channel(cluster, port=9090) as grpc_channel_on_new_port:
assert grpc_query(grpc_channel_on_new_port, "SELECT 1") == "1\n"
def test_remove_tcp_port(cluster, zk):
@ -292,14 +297,14 @@ def test_remove_postgresql_port(cluster, zk):
def test_remove_grpc_port(cluster, zk):
with default_client(cluster, zk) as client:
grpc_channel = get_grpc_channel(cluster, port=9100)
assert grpc_query(grpc_channel, "SELECT 1") == "1\n"
with sync_loaded_config(client.query):
zk.delete("/clickhouse/ports/grpc")
with pytest.raises(
grpc._channel._InactiveRpcError, match="StatusCode.UNAVAILABLE"
):
grpc_query(grpc_channel, "SELECT 1")
with get_grpc_channel(cluster, port=9100) as grpc_channel:
assert grpc_query(grpc_channel, "SELECT 1") == "1\n"
with sync_loaded_config(client.query):
zk.delete("/clickhouse/ports/grpc")
with pytest.raises(
grpc._channel._InactiveRpcError, match="StatusCode.UNAVAILABLE"
):
grpc_query(grpc_channel, "SELECT 1")
def test_change_listen_host(cluster, zk):
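
The `get_grpc_channel` hunks above turn the helper into a context manager, so each test's channel is closed when its `with` block exits, even if an assertion inside fails. A minimal sketch of that pattern follows. It assumes a hypothetical `FakeChannel` class and `open_channel` helper (neither exists in the test suite) in place of a real `grpc.insecure_channel`, so that it runs without a server:

```python
import contextlib


class FakeChannel:
    """Hypothetical stand-in for a gRPC channel; it only records whether close() was called."""

    def __init__(self, target):
        self.target = target
        self.closed = False

    def close(self):
        self.closed = True


@contextlib.contextmanager
def open_channel(target):
    # Assumed helper mirroring the shape of the new get_grpc_channel:
    # create the resource, hand it to the caller, always close it afterwards.
    channel = FakeChannel(target)
    try:
        yield channel
    finally:
        channel.close()


def demo():
    """Return True if the channel was closed even though the body raised."""
    seen = []
    try:
        with open_channel("localhost:9100") as channel:
            seen.append(channel)
            raise RuntimeError("simulated test failure")
    except RuntimeError:
        pass
    return seen[0].closed
```

With the previous `return channel` form, nothing closed the channel on the failure path; the `try`/`finally` around `yield` is what guarantees cleanup here.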

View File

@ -449,8 +449,8 @@ def test_ttl_empty_parts(started_cluster):
assert node1.query("SELECT count() FROM test_ttl_empty_parts") == "3000\n"
# Wait for cleanup thread
wait_for_delete_inactive_parts(node1, "test_ttl_empty_parts")
wait_for_delete_empty_parts(node1, "test_ttl_empty_parts")
wait_for_delete_inactive_parts(node1, "test_ttl_empty_parts")
assert (
node1.query(

View File

@ -65,7 +65,7 @@ $CLICKHOUSE_CLIENT --query="INSERT INTO src VALUES (1, '0', 1);"
$CLICKHOUSE_CLIENT --query="INSERT INTO src VALUES (1, '1', 1);"
$CLICKHOUSE_CLIENT --query="INSERT INTO src VALUES (2, '0', 1);"
query_with_retry "ALTER TABLE src MOVE PARTITION 1 TO TABLE dst;" &>-
query_with_retry "ALTER TABLE src MOVE PARTITION 1 TO TABLE dst;" &>/dev/null
$CLICKHOUSE_CLIENT --query="SYSTEM SYNC REPLICA dst;"
$CLICKHOUSE_CLIENT --query="SELECT count(), sum(d) FROM src;"
@ -85,7 +85,7 @@ $CLICKHOUSE_CLIENT --query="INSERT INTO src VALUES (1, '0', 1);"
$CLICKHOUSE_CLIENT --query="INSERT INTO src VALUES (1, '1', 1);"
$CLICKHOUSE_CLIENT --query="INSERT INTO src VALUES (2, '0', 1);"
query_with_retry "ALTER TABLE src MOVE PARTITION 1 TO TABLE dst;" &>-
query_with_retry "ALTER TABLE src MOVE PARTITION 1 TO TABLE dst;" &>/dev/null
$CLICKHOUSE_CLIENT --query="SYSTEM SYNC REPLICA dst;"
$CLICKHOUSE_CLIENT --query="SELECT count(), sum(d) FROM src;"

View File

@ -39,7 +39,7 @@ RENAME TABLE test_01155_ordinary.mv1 TO test_01155_atomic.mv1;
RENAME TABLE test_01155_ordinary.mv2 TO test_01155_atomic.mv2;
RENAME TABLE test_01155_ordinary.dst TO test_01155_atomic.dst;
RENAME TABLE test_01155_ordinary.src TO test_01155_atomic.src;
SET check_table_dependencies=0; -- Otherwise we'll get error "test_01155_atomic.dict depends on test_01155_ordinary.dist" in the next line.
SET check_table_dependencies=0; -- Otherwise we'll get error "test_01155_ordinary.dict depends on test_01155_ordinary.dist" in the next line.
RENAME TABLE test_01155_ordinary.dist TO test_01155_atomic.dist;
SET check_table_dependencies=1;
RENAME DICTIONARY test_01155_ordinary.dict TO test_01155_atomic.dict;
@ -65,7 +65,7 @@ SELECT dictGet('test_01155_ordinary.dict', 'x', 'after renaming database');
SELECT database, substr(name, 1, 10) FROM system.tables WHERE database like 'test_01155_%';
-- Move tables back
SET check_table_dependencies=0; -- Otherwise we'll get error "test_01155_atomic.dict depends on test_01155_ordinary.dist" in the next line.
SET check_table_dependencies=0; -- Otherwise we'll get error "test_01155_ordinary.dict depends on test_01155_ordinary.dist" in the next line.
RENAME DATABASE test_01155_ordinary TO test_01155_atomic;
SET check_table_dependencies=1;

View File

@ -27,7 +27,6 @@ Column 2, name: d, type: Decimal(18, 10), parsed text: "123456789"ERROR
ERROR: There is no delimiter between fields: expected "<TAB>", got "7<TAB>Hello<TAB>123"
ERROR: There is no delimiter after last field: expected "<LINE FEED>", got "<TAB>1<LINE FEED>"
ERROR: There is no delimiter after last field: expected "<LINE FEED>", got "Hello<LINE FEED>"
Column 0, name: t, type: DateTime, ERROR: text "<LINE FEED>" is not like DateTime
JSONCompactEachRow
Column 2, name: d, type: Decimal(18, 10), parsed text: "123456789"ERROR
Column 0, name: t, type: DateTime, parsed text: "<DOUBLE QUOTE>2020-04-21 12:34:56"ERROR: DateTime must be in YYYY-MM-DD hh:mm:ss or NNNNNNNNNN (unix timestamp, exactly 10 digits) format.

View File

@ -37,7 +37,6 @@ echo -e '2020-04-21 12:34:56\tHello\t123456789' | "${PARSER[@]}" 2>&1| grep "ERR
echo -e '2020-04-21 12:34:567\tHello\t123456789' | "${PARSER[@]}" 2>&1| grep "ERROR"
echo -e '2020-04-21 12:34:56\tHello\t12345678\t1' | "${PARSER[@]}" 2>&1| grep "ERROR"
echo -e '2020-04-21 12:34:56\t\t123Hello' | "${PARSER[@]}" 2>&1| grep "ERROR"
echo -e '2020-04-21 12:34:56\tHello\t12345678\n' | "${PARSER[@]}" 2>&1| grep "ERROR"
PARSER=(${CLICKHOUSE_LOCAL} --query 'SELECT t, s, d FROM table' --structure 't DateTime, s String, d Decimal64(10)' --input-format JSONCompactEachRow)
echo '["2020-04-21 12:34:56", "Hello", 12345678]' | "${PARSER[@]}" 2>&1| grep "ERROR" || echo "JSONCompactEachRow"

View File

@ -21,7 +21,7 @@ censor.net
censor.net
censor.net
censor.net
SELECT if(number > 5, \'censor.net\', \'google\')
SELECT if(number > 5, _CAST(\'censor.net\', \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2)\'), _CAST(\'google\', \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2)\'))
FROM system.numbers
LIMIT 10
other

View File

@ -34,8 +34,8 @@ DROP VIEW IF EXISTS test_view_different_db;
CREATE VIEW test_view_different_db AS SELECT id, value, dictGet('2025_test_db.test_dictionary', 'value', id) FROM 2025_test_db.view_table;
SELECT * FROM test_view_different_db;
DROP TABLE 2025_test_db.test_table;
DROP DICTIONARY 2025_test_db.test_dictionary;
DROP TABLE 2025_test_db.test_table;
DROP TABLE 2025_test_db.view_table;
DROP VIEW test_view_different_db;

View File

@ -128,6 +128,7 @@ CREATE TABLE system.databases
`data_path` String,
`metadata_path` String,
`uuid` UUID,
`engine_full` String,
`comment` String,
`database` String
)
@ -503,6 +504,8 @@ CREATE TABLE system.parts
`creation_csn` UInt64,
`removal_csn` UInt64,
`has_lightweight_delete` UInt8,
`last_removal_attemp_time` DateTime,
`removal_state` String,
`bytes` UInt64,
`marks_size` UInt64
)

Some files were not shown because too many files have changed in this diff