Merge branch 'master' into patch-61

Denny Crane, 2023-03-30 09:32:59 -03:00 (committed by GitHub)
commit 3ac34fc21c
103 changed files with 1165 additions and 395 deletions


@ -1,10 +1,194 @@
### Table of Contents
**[ClickHouse release v23.3 LTS, 2023-03-30](#233)**<br/>
**[ClickHouse release v23.2, 2023-02-23](#232)**<br/>
**[ClickHouse release v23.1, 2023-01-25](#231)**<br/>
**[Changelog for 2022](https://clickhouse.com/docs/en/whats-new/changelog/2022/)**<br/>
# 2023 Changelog
### <a id="233"></a> ClickHouse release 23.3 LTS, 2023-03-30
#### Upgrade Notes
* The behavior of `*domain*RFC` and `netloc` functions is slightly changed: relaxed the set of symbols that are allowed in the URL authority for better conformance. [#46841](https://github.com/ClickHouse/ClickHouse/pull/46841) ([Azat Khuzhin](https://github.com/azat)).
* Prohibited creating tables based on KafkaEngine with DEFAULT/EPHEMERAL/ALIAS/MATERIALIZED statements for columns. [#47138](https://github.com/ClickHouse/ClickHouse/pull/47138) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* An "asynchronous connection drain" feature is removed. Related settings and metrics are removed as well. It was an internal feature, so the removal should not affect users who had never heard about that feature. [#47486](https://github.com/ClickHouse/ClickHouse/pull/47486) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Support 256-bit Decimal data type (more than 38 digits) in `arraySum`/`Min`/`Max`/`Avg`/`Product`, `arrayCumSum`/`CumSumNonNegative`, `arrayDifference`, array construction, IN operator, query parameters, `groupArrayMovingSum`, statistical functions, `min`/`max`/`any`/`argMin`/`argMax`, PostgreSQL wire protocol, MySQL table engine and function, `sumMap`, `mapAdd`, `mapSubtract`, `arrayIntersect`. Add support for big integers in `arrayIntersect`. Statistical aggregate functions involving moments (such as `corr` or various `TTest`s) will use `Float64` as their internal representation (they were using `Decimal128` before this change, but it was pointless), and these functions can return `nan` instead of `inf` in case of infinite variance. Some functions were allowed on `Decimal256` data types but returned `Decimal128` in previous versions - now it is fixed. This closes [#47569](https://github.com/ClickHouse/ClickHouse/issues/47569). This closes [#44864](https://github.com/ClickHouse/ClickHouse/issues/44864). This closes [#28335](https://github.com/ClickHouse/ClickHouse/issues/28335). [#47594](https://github.com/ClickHouse/ClickHouse/pull/47594) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Make backup_threads/restore_threads server settings (instead of user settings). [#47881](https://github.com/ClickHouse/ClickHouse/pull/47881) ([Azat Khuzhin](https://github.com/azat)).
* Do not allow const and non-deterministic secondary indices [#46839](https://github.com/ClickHouse/ClickHouse/pull/46839) ([Anton Popov](https://github.com/CurtizJ)).
#### New Feature
* Add a new mode for splitting the work on replicas using the settings `parallel_replicas_custom_key` and `parallel_replicas_custom_key_filter_type`. If the cluster consists of a single shard with multiple replicas, up to `max_parallel_replicas` replicas will be randomly picked and turned into shards. For each shard, a corresponding filter is added to the query on the initiator before it is sent to the shard. If the cluster consists of multiple shards, it will behave the same as `sample_key` but with the possibility to define an arbitrary key. [#45108](https://github.com/ClickHouse/ClickHouse/pull/45108) ([Antonio Andelic](https://github.com/antonio2368)).
* An option to display partial result on cancel: Added query setting `stop_reading_on_first_cancel` allowing the canceled query (e.g. due to Ctrl-C) to return a partial result. [#45689](https://github.com/ClickHouse/ClickHouse/pull/45689) ([Alexey Perevyshin](https://github.com/alexX512)).
* Added support for arbitrary table engines for temporary tables (except for Replicated and KeeperMap engines). Closes [#31497](https://github.com/ClickHouse/ClickHouse/issues/31497). [#46071](https://github.com/ClickHouse/ClickHouse/pull/46071) ([Roman Vasin](https://github.com/rvasin)).
* Add support for replication of user-defined SQL functions using a centralized storage in Keeper. [#46085](https://github.com/ClickHouse/ClickHouse/pull/46085) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Implement `system.server_settings` (similar to `system.settings`), which will contain server configurations. [#46550](https://github.com/ClickHouse/ClickHouse/pull/46550) ([pufit](https://github.com/pufit)).
* Support for `UNDROP TABLE` query. Closes [#46811](https://github.com/ClickHouse/ClickHouse/issues/46811). [#47241](https://github.com/ClickHouse/ClickHouse/pull/47241) ([chen](https://github.com/xiedeyantu)).
* Allow separate grants for named collections (e.g. to be able to give `SHOW/CREATE/ALTER/DROP named collection` access only to certain collections, instead of all at once). Closes [#40894](https://github.com/ClickHouse/ClickHouse/issues/40894). Adds a new access type `NAMED_COLLECTION_CONTROL`, which is not given to the default user unless explicitly added to the user config (it is required to be able to do `GRANT ALL`); also, `show_named_collections` no longer has to be specified manually for the default user to have full access rights, as was the case in 23.2. [#46241](https://github.com/ClickHouse/ClickHouse/pull/46241) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Allow nested custom disks. Previously custom disks supported only flat disk structure. [#47106](https://github.com/ClickHouse/ClickHouse/pull/47106) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Introduce a function `widthBucket` (with a `WIDTH_BUCKET` alias for compatibility); see the example after this list. [#42974](https://github.com/ClickHouse/ClickHouse/issues/42974). [#46790](https://github.com/ClickHouse/ClickHouse/pull/46790) ([avoiderboi](https://github.com/avoiderboi)).
* Add new functions `parseDateTime`/`parseDateTimeInJodaSyntax` that parse a string into a DateTime according to a specified format string: `parseDateTime` uses MySQL format syntax, while `parseDateTimeInJodaSyntax` uses Joda syntax (see the example after this list). [#46815](https://github.com/ClickHouse/ClickHouse/pull/46815) ([李扬](https://github.com/taiyang-li)).
* Use `dummy UInt8` for default structure of table function `null`. Closes [#46930](https://github.com/ClickHouse/ClickHouse/issues/46930). [#47006](https://github.com/ClickHouse/ClickHouse/pull/47006) ([flynn](https://github.com/ucasfl)).
* Support for date format with a comma, like `Dec 15, 2021` in the `parseDateTimeBestEffort` function. Closes [#46816](https://github.com/ClickHouse/ClickHouse/issues/46816). [#47071](https://github.com/ClickHouse/ClickHouse/pull/47071) ([chen](https://github.com/xiedeyantu)).
* Add settings `http_wait_end_of_query` and `http_response_buffer_size` that correspond to the URL parameters `wait_end_of_query` and `buffer_size` for the HTTP interface. This allows changing these settings in profiles. [#47108](https://github.com/ClickHouse/ClickHouse/pull/47108) ([Vladimir C](https://github.com/vdimir)).
* Add `system.dropped_tables` table that shows tables that were dropped from `Atomic` databases but were not completely removed yet. [#47364](https://github.com/ClickHouse/ClickHouse/pull/47364) ([chen](https://github.com/xiedeyantu)).
* Add `INSTR` as alias of `positionCaseInsensitive` for MySQL compatibility. Closes [#47529](https://github.com/ClickHouse/ClickHouse/issues/47529). [#47535](https://github.com/ClickHouse/ClickHouse/pull/47535) ([flynn](https://github.com/ucasfl)).
* Added the `toDecimalString` function, which converts numbers to a string with fixed precision. [#47838](https://github.com/ClickHouse/ClickHouse/pull/47838) ([Andrey Zvonov](https://github.com/zvonand)).
* Add a merge tree setting `max_number_of_mutations_for_replica`. It limits the number of part mutations per replica to the specified amount. Zero means no limit on the number of mutations per replica (the execution can still be constrained by other settings). [#48047](https://github.com/ClickHouse/ClickHouse/pull/48047) ([Vladimir C](https://github.com/vdimir)).
* Add the Map-related function `mapFromArrays`, which creates a map from a pair of arrays (see the example after this list). [#31125](https://github.com/ClickHouse/ClickHouse/pull/31125) ([李扬](https://github.com/taiyang-li)).
* Allow controlling compression in Parquet/ORC/Arrow output formats, and support more compression codecs for input formats. This closes [#13541](https://github.com/ClickHouse/ClickHouse/issues/13541). [#47114](https://github.com/ClickHouse/ClickHouse/pull/47114) ([Kruglov Pavel](https://github.com/Avogar)).
* Add SSL User Certificate authentication to the native protocol. Closes [#47077](https://github.com/ClickHouse/ClickHouse/issues/47077). [#47596](https://github.com/ClickHouse/ClickHouse/pull/47596) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add *OrNull() and *OrZero() variants for `parseDateTime`, add alias `str_to_date` for MySQL parity. [#48000](https://github.com/ClickHouse/ClickHouse/pull/48000) ([Robert Schulze](https://github.com/rschu1ze)).
* Added the operator `REGEXP` (similar to the operators `LIKE`, `IN`, `MOD`, etc.) for better compatibility with MySQL. [#47869](https://github.com/ClickHouse/ClickHouse/pull/47869) ([Robert Schulze](https://github.com/rschu1ze)).
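A few of the new functions above in action. These queries are illustrative sketches written for this changelog section, not taken from the pull requests; exact results and output formatting may differ by version and settings.

```sql
-- widthBucket(operand, low, high, count): index of the equal-width bucket the
-- operand falls into; 10.15 should land in the 3rd of 4 buckets over [0, 20).
SELECT widthBucket(10.15, 0, 20, 4);

-- parseDateTime uses MySQL-style format specifiers,
-- parseDateTimeInJodaSyntax uses Joda-style patterns.
SELECT
    parseDateTime('2023-03-30 09:32:59', '%Y-%m-%d %H:%i:%s'),
    parseDateTimeInJodaSyntax('2023-03-30 09:32:59', 'yyyy-MM-dd HH:mm:ss');

-- mapFromArrays builds a Map from a key array and a value array.
SELECT mapFromArrays(['a', 'b', 'c'], [1, 2, 3]);

-- INSTR (case-insensitive position) and the REGEXP operator, both added for MySQL compatibility.
SELECT INSTR('Hello World', 'world'), 'clickhouse' REGEXP '^click';
```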
#### Performance Improvement
* Marks in memory are now compressed, using 3-6x less memory. [#47290](https://github.com/ClickHouse/ClickHouse/pull/47290) ([Michael Kolupaev](https://github.com/al13n321)).
* Backups for large numbers of files were unbelievably slow in previous versions. Not anymore. Now they are unbelievably fast. [#47251](https://github.com/ClickHouse/ClickHouse/pull/47251) ([Alexey Milovidov](https://github.com/alexey-milovidov)). Introduced a separate thread pool for backup IO operations. This allows scaling it independently of other pools and increases performance. [#47174](https://github.com/ClickHouse/ClickHouse/pull/47174) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). Use MultiRead requests and retries for collecting metadata at the final stage of backup processing. [#47243](https://github.com/ClickHouse/ClickHouse/pull/47243) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). If a backup and the data being restored are both in S3, server-side copy is used from now on. [#47546](https://github.com/ClickHouse/ClickHouse/pull/47546) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed excessive reading in queries with `FINAL`. [#47801](https://github.com/ClickHouse/ClickHouse/pull/47801) ([Nikita Taranov](https://github.com/nickitat)).
* The setting `max_final_threads` is now set to the number of CPU cores at server startup (by the same algorithm as used for `max_threads`). This improves the concurrency of `FINAL` execution on servers with a high number of CPUs. [#47915](https://github.com/ClickHouse/ClickHouse/pull/47915) ([Nikita Taranov](https://github.com/nickitat)).
* Allow executing the reading pipeline for a DIRECT dictionary with a CLICKHOUSE source in multiple threads. To enable it, set `dictionary_use_async_executor=1` in the `SETTINGS` section for the source in the `CREATE DICTIONARY` statement (see the sketch after this list). [#47986](https://github.com/ClickHouse/ClickHouse/pull/47986) ([Vladimir C](https://github.com/vdimir)).
* Optimize the performance of aggregation with a single nullable key. [#45772](https://github.com/ClickHouse/ClickHouse/pull/45772) ([LiuNeng](https://github.com/liuneng1994)).
* Implemented lowercase `tokenbf_v1` index utilization for `hasTokenOrNull`, `hasTokenCaseInsensitive` and `hasTokenCaseInsensitiveOrNull`. [#46252](https://github.com/ClickHouse/ClickHouse/pull/46252) ([ltrk2](https://github.com/ltrk2)).
* Optimize functions `position` and `LIKE` by searching the first two chars using SIMD. [#46289](https://github.com/ClickHouse/ClickHouse/pull/46289) ([Jiebin Sun](https://github.com/jiebinn)).
* Optimize queries from the `system.detached_parts`, which could be significantly large. Added several sources with respect to the block size limitation; in each block an IO thread pool is used to calculate the part size, i.e. to make syscalls in parallel. [#46624](https://github.com/ClickHouse/ClickHouse/pull/46624) ([Sema Checherinda](https://github.com/CheSema)).
* Increase the default value of `max_replicated_merges_in_queue` for ReplicatedMergeTree tables from 16 to 1000. It allows faster background merge operation on clusters with a very large number of replicas, such as clusters with shared storage in ClickHouse Cloud. [#47050](https://github.com/ClickHouse/ClickHouse/pull/47050) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Updated `clickhouse-copier` to use `GROUP BY` instead of `DISTINCT` to get list of partitions. For large tables this reduced the select time from over 500s to under 1s. [#47386](https://github.com/ClickHouse/ClickHouse/pull/47386) ([Clayton McClure](https://github.com/cmcclure-twilio)).
* Fix performance degradation in `ASOF JOIN`. [#47544](https://github.com/ClickHouse/ClickHouse/pull/47544) ([Ongkong](https://github.com/ongkong)).
* Even more batching in Keeper. Avoid breaking batches on read requests to improve performance. [#47978](https://github.com/ClickHouse/ClickHouse/pull/47978) ([Antonio Andelic](https://github.com/antonio2368)).
* Allow PREWHERE for Merge tables with different DEFAULT expressions for a column. [#46831](https://github.com/ClickHouse/ClickHouse/pull/46831) ([Azat Khuzhin](https://github.com/azat)).
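A minimal sketch of the multi-threaded DIRECT dictionary reading mentioned above. The table and dictionary names are hypothetical, and the exact placement of the `SETTINGS` clause should be checked against the dictionary DDL documentation.

```sql
-- Hypothetical dictionary over an existing ClickHouse table `source_table`;
-- dictionary_use_async_executor = 1 enables reading the source in multiple threads.
CREATE DICTIONARY dict_example
(
    id UInt64,
    value String
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'source_table'))
LAYOUT(DIRECT())
SETTINGS(dictionary_use_async_executor = 1);
```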
#### Experimental Feature
* Parallel replicas: improved the overall performance by better utilizing the local replica, and reading with parallel replicas from non-replicated MergeTree is now forbidden by default. [#47858](https://github.com/ClickHouse/ClickHouse/pull/47858) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Support filter push down to left table for JOIN with `Join`, `Dictionary` and `EmbeddedRocksDB` tables if the experimental Analyzer is enabled. [#47280](https://github.com/ClickHouse/ClickHouse/pull/47280) ([Maksim Kita](https://github.com/kitaisreal)).
* ReplicatedMergeTree with zero-copy replication now puts less load on Keeper. [#47676](https://github.com/ClickHouse/ClickHouse/pull/47676) ([alesapin](https://github.com/alesapin)).
* Fix create materialized view with MaterializedPostgreSQL [#40807](https://github.com/ClickHouse/ClickHouse/pull/40807) ([Maksim Buren](https://github.com/maks-buren630501)).
#### Improvement
* Enable `input_format_json_ignore_unknown_keys_in_named_tuple` by default. [#46742](https://github.com/ClickHouse/ClickHouse/pull/46742) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow ignoring errors while pushing to a MATERIALIZED VIEW (new setting `materialized_views_ignore_errors`, `false` by default; it is set to `true` unconditionally for flushing logs to `system.*_log` tables). [#46658](https://github.com/ClickHouse/ClickHouse/pull/46658) ([Azat Khuzhin](https://github.com/azat)).
* Track the file queue of distributed sends in memory. [#45491](https://github.com/ClickHouse/ClickHouse/pull/45491) ([Azat Khuzhin](https://github.com/azat)).
* Now the `X-ClickHouse-Query-Id` and `X-ClickHouse-Timezone` headers are added to responses for all queries over the HTTP protocol. Previously this was done only for `SELECT` queries. [#46364](https://github.com/ClickHouse/ClickHouse/pull/46364) ([Anton Popov](https://github.com/CurtizJ)).
* External tables from `MongoDB`: support for connecting to a replica set via a URI with a host:port enumeration, and support for the `readPreference` option in MongoDB dictionaries. Example URI: `mongodb://db0.example.com:27017,db1.example.com:27017,db2.example.com:27017/?replicaSet=myRepl&readPreference=primary`. [#46524](https://github.com/ClickHouse/ClickHouse/pull/46524) ([artem-yadr](https://github.com/artem-yadr)).
* This improvement should be invisible for users. Re-implement projection analysis on top of query plan. Added setting `query_plan_optimize_projection=1` to switch between old and new version. Fixes [#44963](https://github.com/ClickHouse/ClickHouse/issues/44963). [#46537](https://github.com/ClickHouse/ClickHouse/pull/46537) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Use parquet format v2 instead of v1 in output format by default. Add setting `output_format_parquet_version` to control parquet version, possible values `1.0`, `2.4`, `2.6`, `2.latest` (default). [#46617](https://github.com/ClickHouse/ClickHouse/pull/46617) ([Kruglov Pavel](https://github.com/Avogar)).
* Using the new configuration syntax, it is now possible to configure Kafka topics with periods (`.`) in their names. [#46752](https://github.com/ClickHouse/ClickHouse/pull/46752) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix heuristics that check hyperscan patterns for problematic repeats. [#46819](https://github.com/ClickHouse/ClickHouse/pull/46819) ([Robert Schulze](https://github.com/rschu1ze)).
* Don't report ZK node exists to system.errors when a block was created concurrently by a different replica. [#46820](https://github.com/ClickHouse/ClickHouse/pull/46820) ([Raúl Marín](https://github.com/Algunenano)).
* Increase the limit for opened files in `clickhouse-local`. It will be able to read from `web` tables on servers with a huge number of CPU cores. Do not back off reading from the URL table engine in case of too many opened files. This closes [#46852](https://github.com/ClickHouse/ClickHouse/issues/46852). [#46853](https://github.com/ClickHouse/ClickHouse/pull/46853) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Exceptions thrown when numbers cannot be parsed now have an easier-to-read exception message. [#46917](https://github.com/ClickHouse/ClickHouse/pull/46917) ([Robert Schulze](https://github.com/rschu1ze)).
* Update `system.backups` after every processed task to track the progress of backups. [#46989](https://github.com/ClickHouse/ClickHouse/pull/46989) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Allow type conversions in the Native input format. Add a setting `input_format_native_allow_types_conversion` that controls it (enabled by default). [#46990](https://github.com/ClickHouse/ClickHouse/pull/46990) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow IPv4 in the `range` function to generate IP ranges. [#46995](https://github.com/ClickHouse/ClickHouse/pull/46995) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Improve the exception message when it is impossible to move a part from one volume/disk to another. [#47032](https://github.com/ClickHouse/ClickHouse/pull/47032) ([alesapin](https://github.com/alesapin)).
* Support `Bool` type in `JSONType` function. Previously `Null` type was mistakenly returned for bool values. [#47046](https://github.com/ClickHouse/ClickHouse/pull/47046) ([Anton Popov](https://github.com/CurtizJ)).
* Use `_request_body` parameter to configure predefined HTTP queries. [#47086](https://github.com/ClickHouse/ClickHouse/pull/47086) ([Constantine Peresypkin](https://github.com/pkit)).
* Automatic indentation in the built-in UI SQL editor when Enter is pressed. [#47113](https://github.com/ClickHouse/ClickHouse/pull/47113) ([Alexey Korepanov](https://github.com/alexkorep)).
* Self-extraction with 'sudo' will attempt to set uid and gid of extracted files to running user. [#47116](https://github.com/ClickHouse/ClickHouse/pull/47116) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Previously, the `repeat` function's second argument only accepted an unsigned integer type, which meant it could not accept values such as -1. This behavior differed from that of the Spark function. In this update, the repeat function has been modified to match the behavior of the Spark function. It now accepts the same types of inputs, including negative integers. Extensive testing has been performed to verify the correctness of the updated implementation. [#47134](https://github.com/ClickHouse/ClickHouse/pull/47134) ([KevinyhZou](https://github.com/KevinyhZou)). Note: the changelog entry was rewritten by ChatGPT.
* Remove `::__1` part from stacktraces. Display `std::basic_string<char, ...` as `String` in stacktraces. [#47171](https://github.com/ClickHouse/ClickHouse/pull/47171) ([Mike Kot](https://github.com/myrrc)).
* Reimplement interserver mode to avoid replay attacks (note that the change is backward compatible with older servers). [#47213](https://github.com/ClickHouse/ClickHouse/pull/47213) ([Azat Khuzhin](https://github.com/azat)).
* Better recognize regular expression groups and refine the regexp_tree dictionary. [#47218](https://github.com/ClickHouse/ClickHouse/pull/47218) ([Han Fei](https://github.com/hanfei1991)).
* Keeper improvement: Add new 4LW `clrs` to clean resources used by Keeper (e.g. release unused memory). [#47256](https://github.com/ClickHouse/ClickHouse/pull/47256) ([Antonio Andelic](https://github.com/antonio2368)).
* Add optional arguments to codecs `DoubleDelta(bytes_size)`, `Gorilla(bytes_size)`, `FPC(level, float_size)`. This allows using these codecs without a column type in `clickhouse-compressor`. Fix possible aborts and arithmetic errors in `clickhouse-compressor` with these codecs. Fixes: https://github.com/ClickHouse/ClickHouse/discussions/47262. [#47271](https://github.com/ClickHouse/ClickHouse/pull/47271) ([Kruglov Pavel](https://github.com/Avogar)).
* Add support for big integer types to the `runningDifference` function (see the example after this list). Closes [#47194](https://github.com/ClickHouse/ClickHouse/issues/47194). [#47322](https://github.com/ClickHouse/ClickHouse/pull/47322) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add an expiration window for S3 credentials that have an expiration time to avoid `ExpiredToken` errors in some edge cases. It can be controlled with `expiration_window_seconds` config, the default is 120 seconds. [#47423](https://github.com/ClickHouse/ClickHouse/pull/47423) ([Antonio Andelic](https://github.com/antonio2368)).
* Support Decimals and Date32 in `Avro` format. [#47434](https://github.com/ClickHouse/ClickHouse/pull/47434) ([Kruglov Pavel](https://github.com/Avogar)).
* Do not start the server if an interrupted conversion from `Ordinary` to `Atomic` was detected, print a better error message with troubleshooting instructions. [#47487](https://github.com/ClickHouse/ClickHouse/pull/47487) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add a new column `kind` to the `system.opentelemetry_span_log`. This column holds the value of [SpanKind](https://opentelemetry.io/docs/reference/specification/trace/api/#spankind) defined in OpenTelemetry. [#47499](https://github.com/ClickHouse/ClickHouse/pull/47499) ([Frank Chen](https://github.com/FrankChen021)).
* Allow reading/writing nested arrays in `Protobuf` format with only the root field name as the column name. Previously, the column name had to contain all nested field names (like `a.b.c Array(Array(Array(UInt32)))`); now you can use just `a Array(Array(Array(UInt32)))`. [#47650](https://github.com/ClickHouse/ClickHouse/pull/47650) ([Kruglov Pavel](https://github.com/Avogar)).
* Added an optional `STRICT` modifier for `SYSTEM SYNC REPLICA`, which makes the query wait for the replication queue to become empty (just like it worked before https://github.com/ClickHouse/ClickHouse/pull/45648); see the example after this list. [#47659](https://github.com/ClickHouse/ClickHouse/pull/47659) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Improve the names of some OpenTelemetry span logs. [#47667](https://github.com/ClickHouse/ClickHouse/pull/47667) ([Frank Chen](https://github.com/FrankChen021)).
* Prevent using too long chains of aggregate function combinators (they can lead to slow queries in the analysis stage). This closes [#47715](https://github.com/ClickHouse/ClickHouse/issues/47715). [#47716](https://github.com/ClickHouse/ClickHouse/pull/47716) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Support for subquery in parameterized views; resolves [#46741](https://github.com/ClickHouse/ClickHouse/issues/46741) [#47725](https://github.com/ClickHouse/ClickHouse/pull/47725) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix memory leak in MySQL integration (reproduces with `connection_auto_close=1`). [#47732](https://github.com/ClickHouse/ClickHouse/pull/47732) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Improved error handling in the code related to Decimal parameters, resulting in more informative error messages. Previously, when incorrect Decimal parameters were supplied, the error message generated was unclear or unhelpful. With this update, the error print message has been fixed to provide more detailed and useful information, making it easier to identify and correct issues related to Decimal parameters. [#47812](https://github.com/ClickHouse/ClickHouse/pull/47812) ([Yu Feng](https://github.com/Vigor-jpg)). Note: this changelog entry is rewritten by ChatGPT.
* The setting `exact_rows_before_limit` makes `rows_before_limit_at_least` accurately reflect the number of rows returned before the limit is reached. This pull request addresses issues encountered when the query involves distributed processing across multiple shards or sorting operations; prior to this update, these scenarios did not work as intended. [#47874](https://github.com/ClickHouse/ClickHouse/pull/47874) ([Amos Bird](https://github.com/amosbird)).
* ThreadPools metrics introspection. [#47880](https://github.com/ClickHouse/ClickHouse/pull/47880) ([Azat Khuzhin](https://github.com/azat)).
* Add `WriteBufferFromS3Microseconds` and `WriteBufferFromS3RequestsErrors` profile events. [#47885](https://github.com/ClickHouse/ClickHouse/pull/47885) ([Antonio Andelic](https://github.com/antonio2368)).
* Add `--link` and `--noninteractive` (`-y`) options to clickhouse install. Closes [#47750](https://github.com/ClickHouse/ClickHouse/issues/47750). [#47887](https://github.com/ClickHouse/ClickHouse/pull/47887) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fixed `UNKNOWN_TABLE` exception when attaching to a materialized view that has dependent tables that are not available. This might be useful when trying to restore state from a backup. [#47975](https://github.com/ClickHouse/ClickHouse/pull/47975) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* Fix case when (optional) path is not added to encrypted disk configuration. [#47981](https://github.com/ClickHouse/ClickHouse/pull/47981) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support for CTEs in parameterized views. Implementation: updated to allow query parameters while evaluating scalar subqueries. [#48065](https://github.com/ClickHouse/ClickHouse/pull/48065) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Support big integers `(U)Int128/(U)Int256`, `Map` with any key type and `DateTime64` with any precision (not only 3 and 6). [#48119](https://github.com/ClickHouse/ClickHouse/pull/48119) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow skipping errors related to unknown enum values in row input formats. [#48133](https://github.com/ClickHouse/ClickHouse/pull/48133) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
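Two of the improvements above, sketched as illustrative queries (the table name is hypothetical):

```sql
-- STRICT makes SYSTEM SYNC REPLICA wait until the replication queue is empty,
-- not only until the entries that existed when the query started are processed.
SYSTEM SYNC REPLICA mydb.replicated_table STRICT;

-- runningDifference now also accepts big integer types such as Int256.
SELECT runningDifference(x)
FROM (SELECT arrayJoin([toInt256(1), toInt256(10), toInt256(100)]) AS x);
```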
#### Build/Testing/Packaging Improvement
* ClickHouse now builds with `C++23`. [#47424](https://github.com/ClickHouse/ClickHouse/pull/47424) ([Robert Schulze](https://github.com/rschu1ze)).
* Fuzz `EXPLAIN` queries in the AST Fuzzer. [#47803](https://github.com/ClickHouse/ClickHouse/pull/47803) [#47852](https://github.com/ClickHouse/ClickHouse/pull/47852) ([flynn](https://github.com/ucasfl)).
* Split stress test and the automated backward compatibility check (now Upgrade check). [#44879](https://github.com/ClickHouse/ClickHouse/pull/44879) ([Kruglov Pavel](https://github.com/Avogar)).
* Updated the Ubuntu Image for Docker to calm down some bogus security reports. [#46784](https://github.com/ClickHouse/ClickHouse/pull/46784) ([Julio Jimenez](https://github.com/juliojimenez)). Please note that ClickHouse has no dependencies and does not require Docker.
* Adds a prompt to allow the removal of an existing `clickhouse` download when using the "curl | sh" installation of ClickHouse. The prompt is "ClickHouse binary clickhouse already exists. Overwrite? \[y/N\]". [#46859](https://github.com/ClickHouse/ClickHouse/pull/46859) ([Dan Roscigno](https://github.com/DanRoscigno)).
* Fix an error during server startup on old distros (e.g. Amazon Linux 2) and on ARM, where glibc 2.28 symbols were not found. [#47008](https://github.com/ClickHouse/ClickHouse/pull/47008) ([Robert Schulze](https://github.com/rschu1ze)).
* Prepare for clang 16. [#47027](https://github.com/ClickHouse/ClickHouse/pull/47027) ([Amos Bird](https://github.com/amosbird)).
* Added a CI check which ensures ClickHouse can run with an old glibc on ARM. [#47063](https://github.com/ClickHouse/ClickHouse/pull/47063) ([Robert Schulze](https://github.com/rschu1ze)).
* Add a style check to prevent incorrect usage of the `NDEBUG` macro. [#47699](https://github.com/ClickHouse/ClickHouse/pull/47699) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Speed up the build a little. [#47714](https://github.com/ClickHouse/ClickHouse/pull/47714) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Bump `vectorscan` to 5.4.9. [#47955](https://github.com/ClickHouse/ClickHouse/pull/47955) ([Robert Schulze](https://github.com/rschu1ze)).
* Add a unit test to assert Apache Arrow's fatal logging does not abort. It covers the changes in [ClickHouse/arrow#16](https://github.com/ClickHouse/arrow/pull/16). [#47958](https://github.com/ClickHouse/ClickHouse/pull/47958) ([Arthur Passos](https://github.com/arthurpassos)).
* Restore the ability of native macOS debug server build to start. [#48050](https://github.com/ClickHouse/ClickHouse/pull/48050) ([Robert Schulze](https://github.com/rschu1ze)). Note: this change is only relevant for development, as the ClickHouse's official builds are done with cross-compilation.
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix formats parser resetting, test processing bad messages in `Kafka` [#45693](https://github.com/ClickHouse/ClickHouse/pull/45693) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix data size calculation in Keeper [#46086](https://github.com/ClickHouse/ClickHouse/pull/46086) ([Antonio Andelic](https://github.com/antonio2368)).
* Fixed a bug in automatic retries of `DROP TABLE` query with `ReplicatedMergeTree` tables and `Atomic` databases. In rare cases it could lead to `Can't get data for node /zk_path/log_pointer` and `The specified key does not exist` errors if ZooKeeper session expired during DROP and a new replicated table with the same path in ZooKeeper was created in parallel. [#46384](https://github.com/ClickHouse/ClickHouse/pull/46384) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix incorrect alias recursion while normalizing queries, that prevented some queries to run. [#46609](https://github.com/ClickHouse/ClickHouse/pull/46609) ([Raúl Marín](https://github.com/Algunenano)).
* Fix IPv4/IPv6 serialization/deserialization in binary formats [#46616](https://github.com/ClickHouse/ClickHouse/pull/46616) ([Kruglov Pavel](https://github.com/Avogar)).
* ActionsDAG: do not change result of `and` during optimization [#46653](https://github.com/ClickHouse/ClickHouse/pull/46653) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Improve queries cancellation when a client dies [#46681](https://github.com/ClickHouse/ClickHouse/pull/46681) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix arithmetic operations in aggregate optimization [#46705](https://github.com/ClickHouse/ClickHouse/pull/46705) ([Duc Canh Le](https://github.com/canhld94)).
* Fix possible `clickhouse-local`'s abort on JSONEachRow schema inference [#46731](https://github.com/ClickHouse/ClickHouse/pull/46731) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix changing an expired role [#46772](https://github.com/ClickHouse/ClickHouse/pull/46772) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix combined PREWHERE column accumulated from multiple steps [#46785](https://github.com/ClickHouse/ClickHouse/pull/46785) ([Alexander Gololobov](https://github.com/davenger)).
* Use initial range for fetching file size in HTTP read buffer. Without this change, some remote files couldn't be processed. [#46824](https://github.com/ClickHouse/ClickHouse/pull/46824) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix the incorrect progress bar while using the URL tables [#46830](https://github.com/ClickHouse/ClickHouse/pull/46830) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix MSan report in `maxIntersections` function [#46847](https://github.com/ClickHouse/ClickHouse/pull/46847) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix a bug in `Map` data type [#46856](https://github.com/ClickHouse/ClickHouse/pull/46856) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix wrong results of some LIKE searches when the LIKE pattern contains quoted non-quotable characters [#46875](https://github.com/ClickHouse/ClickHouse/pull/46875) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix: WITH FILL would abort when the Filling Transform processes an empty block [#46897](https://github.com/ClickHouse/ClickHouse/pull/46897) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix date and int inference from string in JSON [#46972](https://github.com/ClickHouse/ClickHouse/pull/46972) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix bug in zero-copy replication disk choice during fetch [#47010](https://github.com/ClickHouse/ClickHouse/pull/47010) ([alesapin](https://github.com/alesapin)).
* Fix a typo in systemd service definition [#47051](https://github.com/ClickHouse/ClickHouse/pull/47051) ([Palash Goel](https://github.com/palash-goel)).
* Fix the NOT_IMPLEMENTED error with CROSS JOIN and algorithm = auto [#47068](https://github.com/ClickHouse/ClickHouse/pull/47068) ([Vladimir C](https://github.com/vdimir)).
* Fix the problem that a `ReplicatedMergeTree` table failed to insert two similar pieces of data when `part_type` is configured as the `InMemory` mode (experimental feature). [#47121](https://github.com/ClickHouse/ClickHouse/pull/47121) ([liding1992](https://github.com/liding1992)).
* External dictionaries / library-bridge: Fix error "unknown library method 'extDict_libClone'" [#47136](https://github.com/ClickHouse/ClickHouse/pull/47136) ([alex filatov](https://github.com/phil-88)).
* Fix race condition in a grace hash join with limit [#47153](https://github.com/ClickHouse/ClickHouse/pull/47153) ([Vladimir C](https://github.com/vdimir)).
* Fix concrete columns PREWHERE support [#47154](https://github.com/ClickHouse/ClickHouse/pull/47154) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible deadlock in Query Status [#47161](https://github.com/ClickHouse/ClickHouse/pull/47161) ([Kruglov Pavel](https://github.com/Avogar)).
* Forbid insert select for the same `Join` table, as it leads to a deadlock [#47260](https://github.com/ClickHouse/ClickHouse/pull/47260) ([Vladimir C](https://github.com/vdimir)).
* Skip merged partitions for `min_age_to_force_merge_seconds` merges [#47303](https://github.com/ClickHouse/ClickHouse/pull/47303) ([Antonio Andelic](https://github.com/antonio2368)).
* Modify find_first_symbols, so it works as expected for find_first_not_symbols [#47304](https://github.com/ClickHouse/ClickHouse/pull/47304) ([Arthur Passos](https://github.com/arthurpassos)).
* Fix big numbers inference in CSV [#47410](https://github.com/ClickHouse/ClickHouse/pull/47410) ([Kruglov Pavel](https://github.com/Avogar)).
* Disable logical expression optimizer for expression with aliases. [#47451](https://github.com/ClickHouse/ClickHouse/pull/47451) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix error in `decodeURLComponent` [#47457](https://github.com/ClickHouse/ClickHouse/pull/47457) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix explain graph with projection [#47473](https://github.com/ClickHouse/ClickHouse/pull/47473) ([flynn](https://github.com/ucasfl)).
* Fix query parameters [#47488](https://github.com/ClickHouse/ClickHouse/pull/47488) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Parameterized view: a bug fix. [#47495](https://github.com/ClickHouse/ClickHouse/pull/47495) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fuzzer of data formats, and the corresponding fixes. [#47519](https://github.com/ClickHouse/ClickHouse/pull/47519) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix monotonicity check for `DateTime64` [#47526](https://github.com/ClickHouse/ClickHouse/pull/47526) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix "block structure mismatch" for a Nullable LowCardinality column [#47537](https://github.com/ClickHouse/ClickHouse/pull/47537) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Proper fix for a bug in Apache Parquet [#45878](https://github.com/ClickHouse/ClickHouse/issues/45878) [#47538](https://github.com/ClickHouse/ClickHouse/pull/47538) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix `BSONEachRow` parallel parsing when document size is invalid [#47540](https://github.com/ClickHouse/ClickHouse/pull/47540) ([Kruglov Pavel](https://github.com/Avogar)).
* Preserve error in `system.distribution_queue` on `SYSTEM FLUSH DISTRIBUTED` [#47541](https://github.com/ClickHouse/ClickHouse/pull/47541) ([Azat Khuzhin](https://github.com/azat)).
* Check for duplicate column in `BSONEachRow` format [#47609](https://github.com/ClickHouse/ClickHouse/pull/47609) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix wait for zero copy lock during move [#47631](https://github.com/ClickHouse/ClickHouse/pull/47631) ([alesapin](https://github.com/alesapin)).
* Fix aggregation by partitions [#47634](https://github.com/ClickHouse/ClickHouse/pull/47634) ([Nikita Taranov](https://github.com/nickitat)).
* Fix bug in tuple as array serialization in `BSONEachRow` format [#47690](https://github.com/ClickHouse/ClickHouse/pull/47690) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix crash in `polygonsSymDifferenceCartesian` [#47702](https://github.com/ClickHouse/ClickHouse/pull/47702) ([pufit](https://github.com/pufit)).
* Fix reading from storage `File` compressed files with `zlib` and `gzip` compression [#47796](https://github.com/ClickHouse/ClickHouse/pull/47796) ([Anton Popov](https://github.com/CurtizJ)).
* Improve empty query detection for PostgreSQL (for pgx golang driver) [#47854](https://github.com/ClickHouse/ClickHouse/pull/47854) ([Azat Khuzhin](https://github.com/azat)).
* Fix DateTime monotonicity check for LowCardinality types [#47860](https://github.com/ClickHouse/ClickHouse/pull/47860) ([Antonio Andelic](https://github.com/antonio2368)).
* Use restore_threads (not backup_threads) for RESTORE ASYNC [#47861](https://github.com/ClickHouse/ClickHouse/pull/47861) ([Azat Khuzhin](https://github.com/azat)).
* Fix DROP COLUMN with ReplicatedMergeTree containing projections [#47883](https://github.com/ClickHouse/ClickHouse/pull/47883) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix for Replicated database recovery [#47901](https://github.com/ClickHouse/ClickHouse/pull/47901) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Hotfix for too verbose warnings in HTTP [#47903](https://github.com/ClickHouse/ClickHouse/pull/47903) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix "Field value too long" in `catboostEvaluate` [#47970](https://github.com/ClickHouse/ClickHouse/pull/47970) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix [#36971](https://github.com/ClickHouse/ClickHouse/issues/36971): Watchdog: exit with non-zero code if child process exits [#47973](https://github.com/ClickHouse/ClickHouse/pull/47973) ([Коренберг Марк](https://github.com/socketpair)).
* Fix for "index file `cidx` is unexpectedly long" [#48010](https://github.com/ClickHouse/ClickHouse/pull/48010) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix MaterializedPostgreSQL query to get attributes (replica-identity) [#48015](https://github.com/ClickHouse/ClickHouse/pull/48015) ([Solomatov Sergei](https://github.com/solomatovs)).
* parseDateTime(): Fix UB (signed integer overflow) [#48019](https://github.com/ClickHouse/ClickHouse/pull/48019) ([Robert Schulze](https://github.com/rschu1ze)).
* Use unique names for Records in Avro to avoid reusing their schema [#48057](https://github.com/ClickHouse/ClickHouse/pull/48057) ([Kruglov Pavel](https://github.com/Avogar)).
* Correctly set TCP/HTTP socket timeouts in Keeper [#48108](https://github.com/ClickHouse/ClickHouse/pull/48108) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix possible member call on null pointer in `Avro` format [#48184](https://github.com/ClickHouse/ClickHouse/pull/48184) ([Kruglov Pavel](https://github.com/Avogar)).
### <a id="232"></a> ClickHouse release 23.2, 2023-02-23
#### Backward Incompatible Change


@ -568,7 +568,7 @@ if (NATIVE_BUILD_TARGETS
COMMAND ${CMAKE_COMMAND}
"-DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}"
"-DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}"
"-DENABLE_CCACHE=${ENABLE_CCACHE}"
"-DCOMPILER_CACHE=${COMPILER_CACHE}"
# Avoid overriding .cargo/config.toml with native toolchain.
"-DENABLE_RUST=OFF"
"-DENABLE_CLICKHOUSE_SELF_EXTRACTING=${ENABLE_CLICKHOUSE_SELF_EXTRACTING}"


@ -1,5 +1,6 @@
# Setup integration with ccache to speed up builds, see https://ccache.dev/
# Matches both ccache and sccache
if (CMAKE_CXX_COMPILER_LAUNCHER MATCHES "ccache" OR CMAKE_C_COMPILER_LAUNCHER MATCHES "ccache")
# custom compiler launcher already defined, most likely because cmake was invoked with like "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache" or
# via environment variable --> respect setting and trust that the launcher was specified correctly
@ -8,45 +9,65 @@ if (CMAKE_CXX_COMPILER_LAUNCHER MATCHES "ccache" OR CMAKE_C_COMPILER_LAUNCHER MA
return()
endif()
option(ENABLE_CCACHE "Speedup re-compilations using ccache (external tool)" ON)
if (NOT ENABLE_CCACHE)
message(STATUS "Using ccache: no (disabled via configuration)")
return()
set(ENABLE_CCACHE "default" CACHE STRING "Deprecated, use COMPILER_CACHE=(auto|ccache|sccache|disabled)")
if (NOT ENABLE_CCACHE STREQUAL "default")
message(WARNING "The -DENABLE_CCACHE is deprecated in favor of -DCOMPILER_CACHE")
endif()
set(COMPILER_CACHE "auto" CACHE STRING "Speedup re-compilations using the caching tools; valid options are 'auto' (ccache, then sccache), 'ccache', 'sccache', or 'disabled'")
# It has pretty complex logic, because the ENABLE_CCACHE is deprecated, but still should
# control the COMPILER_CACHE
# After it will be completely removed, the following block will be much simpler
if (COMPILER_CACHE STREQUAL "ccache" OR (ENABLE_CCACHE AND NOT ENABLE_CCACHE STREQUAL "default"))
find_program (CCACHE_EXECUTABLE ccache)
elseif(COMPILER_CACHE STREQUAL "disabled" OR NOT ENABLE_CCACHE STREQUAL "default")
message(STATUS "Using *ccache: no (disabled via configuration)")
return()
elseif(COMPILER_CACHE STREQUAL "auto")
find_program (CCACHE_EXECUTABLE ccache sccache)
elseif(COMPILER_CACHE STREQUAL "sccache")
find_program (CCACHE_EXECUTABLE sccache)
else()
message(${RECONFIGURE_MESSAGE_LEVEL} "The COMPILER_CACHE must be one of (auto|ccache|sccache|disabled), given '${COMPILER_CACHE}'")
endif()
find_program (CCACHE_EXECUTABLE ccache)
if (NOT CCACHE_EXECUTABLE)
message(${RECONFIGURE_MESSAGE_LEVEL} "Using ccache: no (Could not find find ccache. To significantly reduce compile times for the 2nd, 3rd, etc. build, it is highly recommended to install ccache. To suppress this message, run cmake with -DENABLE_CCACHE=0)")
message(${RECONFIGURE_MESSAGE_LEVEL} "Using *ccache: no (Could not find find ccache or sccache. To significantly reduce compile times for the 2nd, 3rd, etc. build, it is highly recommended to install one of them. To suppress this message, run cmake with -DCOMPILER_CACHE=disabled)")
return()
endif()
execute_process(COMMAND ${CCACHE_EXECUTABLE} "-V" OUTPUT_VARIABLE CCACHE_VERSION)
string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION})
if (CCACHE_EXECUTABLE MATCHES "/ccache$")
execute_process(COMMAND ${CCACHE_EXECUTABLE} "-V" OUTPUT_VARIABLE CCACHE_VERSION)
string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION})
set (CCACHE_MINIMUM_VERSION 3.3)
set (CCACHE_MINIMUM_VERSION 3.3)
if (CCACHE_VERSION VERSION_LESS_EQUAL ${CCACHE_MINIMUM_VERSION})
message(${RECONFIGURE_MESSAGE_LEVEL} "Using ccache: no (found ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION}), the minimum required version is ${CCACHE_MINIMUM_VERSION}")
return()
endif()
if (CCACHE_VERSION VERSION_LESS_EQUAL ${CCACHE_MINIMUM_VERSION})
message(${RECONFIGURE_MESSAGE_LEVEL} "Using ccache: no (found ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION}), the minimum required version is ${CCACHE_MINIMUM_VERSION}")
return()
endif()
message(STATUS "Using ccache: ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION})")
set(LAUNCHER ${CCACHE_EXECUTABLE})
message(STATUS "Using ccache: ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION})")
set(LAUNCHER ${CCACHE_EXECUTABLE})
# Work around a well-intended but unfortunate behavior of ccache 4.0 & 4.1 with
# environment variable SOURCE_DATE_EPOCH. This variable provides an alternative
# to source-code embedded timestamps (__DATE__/__TIME__) and therefore helps with
# reproducible builds (*). SOURCE_DATE_EPOCH is set automatically by the
# distribution, e.g. Debian. Ccache 4.0 & 4.1 incorporate SOURCE_DATE_EPOCH into
# the hash calculation regardless they contain timestamps or not. This invalidates
# the cache whenever SOURCE_DATE_EPOCH changes. As a fix, ignore SOURCE_DATE_EPOCH.
#
# (*) https://reproducible-builds.org/specs/source-date-epoch/
if (CCACHE_VERSION VERSION_GREATER_EQUAL "4.0" AND CCACHE_VERSION VERSION_LESS "4.2")
message(STATUS "Ignore SOURCE_DATE_EPOCH for ccache 4.0 / 4.1")
set(LAUNCHER env -u SOURCE_DATE_EPOCH ${CCACHE_EXECUTABLE})
# Work around a well-intended but unfortunate behavior of ccache 4.0 & 4.1 with
# environment variable SOURCE_DATE_EPOCH. This variable provides an alternative
# to source-code embedded timestamps (__DATE__/__TIME__) and therefore helps with
# reproducible builds (*). SOURCE_DATE_EPOCH is set automatically by the
# distribution, e.g. Debian. Ccache 4.0 & 4.1 incorporate SOURCE_DATE_EPOCH into
# the hash calculation regardless they contain timestamps or not. This invalidates
# the cache whenever SOURCE_DATE_EPOCH changes. As a fix, ignore SOURCE_DATE_EPOCH.
#
# (*) https://reproducible-builds.org/specs/source-date-epoch/
if (CCACHE_VERSION VERSION_GREATER_EQUAL "4.0" AND CCACHE_VERSION VERSION_LESS "4.2")
message(STATUS "Ignore SOURCE_DATE_EPOCH for ccache 4.0 / 4.1")
set(LAUNCHER env -u SOURCE_DATE_EPOCH ${CCACHE_EXECUTABLE})
endif()
elseif(CCACHE_EXECUTABLE MATCHES "/sccache$")
message(STATUS "Using sccache: ${CCACHE_EXECUTABLE}")
set(LAUNCHER ${CCACHE_EXECUTABLE})
endif()
set (CMAKE_CXX_COMPILER_LAUNCHER ${LAUNCHER} ${CMAKE_CXX_COMPILER_LAUNCHER})

contrib/boost vendored

@ -1 +1 @@
Subproject commit 03d9ec9cd159d14bd0b17c05138098451a1ea606
Subproject commit 8fe7b3326ef482ee6ecdf5a4f698f2b8c2780f98

@ -1 +1 @@
Subproject commit 4bfaeb31dd0ef13f025221f93c138974a3e0a22a
Subproject commit e0accd517933ebb44aff84bc8db448ffd8ef1929


@ -69,13 +69,14 @@ RUN add-apt-repository ppa:ubuntu-toolchain-r/test --yes \
libc6 \
libc6-dev \
libc6-dev-arm64-cross \
python3-boto3 \
yasm \
zstd \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists
# Download toolchain and SDK for Darwin
RUN wget -nv https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz
RUN curl -sL -O https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz
# Architecture of the image when BuildKit/buildx is used
ARG TARGETARCH
@ -97,7 +98,7 @@ ENV PATH="$PATH:/usr/local/go/bin"
ENV GOPATH=/workdir/go
ENV GOCACHE=/workdir/
ARG CLANG_TIDY_SHA1=03644275e794b0587849bfc2ec6123d5ae0bdb1c
ARG CLANG_TIDY_SHA1=c191254ea00d47ade11d7170ef82fe038c213774
RUN curl -Lo /usr/bin/clang-tidy-cache \
"https://raw.githubusercontent.com/matus-chochlik/ctcache/$CLANG_TIDY_SHA1/clang-tidy-cache" \
&& chmod +x /usr/bin/clang-tidy-cache


@ -6,6 +6,7 @@ exec &> >(ts)
ccache_status () {
ccache --show-config ||:
ccache --show-stats ||:
SCCACHE_NO_DAEMON=1 sccache --show-stats ||:
}
[ -O /build ] || git config --global --add safe.directory /build


@ -5,13 +5,19 @@ import os
import argparse
import logging
import sys
from typing import List
from pathlib import Path
from typing import List, Optional
SCRIPT_PATH = os.path.realpath(__file__)
SCRIPT_PATH = Path(__file__).absolute()
IMAGE_TYPE = "binary"
IMAGE_NAME = f"clickhouse/{IMAGE_TYPE}-builder"
def check_image_exists_locally(image_name):
class BuildException(Exception):
pass
def check_image_exists_locally(image_name: str) -> bool:
try:
output = subprocess.check_output(
f"docker images -q {image_name} 2> /dev/null", shell=True
@ -21,17 +27,17 @@ def check_image_exists_locally(image_name):
return False
def pull_image(image_name):
def pull_image(image_name: str) -> bool:
try:
subprocess.check_call(f"docker pull {image_name}", shell=True)
return True
except subprocess.CalledProcessError:
logging.info(f"Cannot pull image {image_name}".format())
logging.info("Cannot pull image %s", image_name)
return False
def build_image(image_name, filepath):
context = os.path.dirname(filepath)
def build_image(image_name: str, filepath: Path) -> None:
context = filepath.parent
build_cmd = f"docker build --network=host -t {image_name} -f {filepath} {context}"
logging.info("Will build image with cmd: '%s'", build_cmd)
subprocess.check_call(
@ -40,7 +46,7 @@ def build_image(image_name, filepath):
)
def pre_build(repo_path: str, env_variables: List[str]):
def pre_build(repo_path: Path, env_variables: List[str]):
if "WITH_PERFORMANCE=1" in env_variables:
current_branch = subprocess.check_output(
"git branch --show-current", shell=True, encoding="utf-8"
@ -56,7 +62,9 @@ def pre_build(repo_path: str, env_variables: List[str]):
# conclusion is: in the current state the easiest way to go is to force
# unshallow repository for performance artifacts.
# To change it we need to rework our performance tests docker image
raise Exception("shallow repository is not suitable for performance builds")
raise BuildException(
"shallow repository is not suitable for performance builds"
)
if current_branch != "master":
cmd = (
f"git -C {repo_path} fetch --no-recurse-submodules "
@ -67,14 +75,14 @@ def pre_build(repo_path: str, env_variables: List[str]):
def run_docker_image_with_env(
image_name,
as_root,
output,
env_variables,
ch_root,
ccache_dir,
docker_image_version,
image_name: str,
as_root: bool,
output_dir: Path,
env_variables: List[str],
ch_root: Path,
ccache_dir: Optional[Path],
):
output_dir.mkdir(parents=True, exist_ok=True)
env_part = " -e ".join(env_variables)
if env_part:
env_part = " -e " + env_part
@ -89,10 +97,14 @@ def run_docker_image_with_env(
else:
user = f"{os.geteuid()}:{os.getegid()}"
ccache_mount = f"--volume={ccache_dir}:/ccache"
if ccache_dir is None:
ccache_mount = ""
cmd = (
f"docker run --network=host --user={user} --rm --volume={output}:/output "
f"--volume={ch_root}:/build --volume={ccache_dir}:/ccache {env_part} "
f"{interactive} {image_name}:{docker_image_version}"
f"docker run --network=host --user={user} --rm {ccache_mount}"
f"--volume={output_dir}:/output --volume={ch_root}:/build {env_part} "
f"{interactive} {image_name}"
)
logging.info("Will build ClickHouse pkg with cmd: '%s'", cmd)
@ -100,24 +112,25 @@ def run_docker_image_with_env(
subprocess.check_call(cmd, shell=True)
def is_release_build(build_type, package_type, sanitizer):
def is_release_build(build_type: str, package_type: str, sanitizer: str) -> bool:
return build_type == "" and package_type == "deb" and sanitizer == ""
def parse_env_variables(
build_type,
compiler,
sanitizer,
package_type,
cache,
distcc_hosts,
clang_tidy,
version,
author,
official,
additional_pkgs,
with_coverage,
with_binaries,
build_type: str,
compiler: str,
sanitizer: str,
package_type: str,
cache: str,
s3_bucket: str,
s3_directory: str,
s3_rw_access: bool,
clang_tidy: bool,
version: str,
official: bool,
additional_pkgs: bool,
with_coverage: bool,
with_binaries: str,
):
DARWIN_SUFFIX = "-darwin"
DARWIN_ARM_SUFFIX = "-darwin-aarch64"
@ -243,32 +256,43 @@ def parse_env_variables(
else:
result.append("BUILD_TYPE=None")
if cache == "distcc":
result.append(f"CCACHE_PREFIX={cache}")
if not cache:
cmake_flags.append("-DCOMPILER_CACHE=disabled")
if cache:
if cache == "ccache":
cmake_flags.append("-DCOMPILER_CACHE=ccache")
result.append("CCACHE_DIR=/ccache")
result.append("CCACHE_COMPRESSLEVEL=5")
result.append("CCACHE_BASEDIR=/build")
result.append("CCACHE_NOHASHDIR=true")
result.append("CCACHE_COMPILERCHECK=content")
cache_maxsize = "15G"
if clang_tidy:
# 15G is not enough for tidy build
cache_maxsize = "25G"
result.append("CCACHE_MAXSIZE=15G")
# `CTCACHE_DIR` has the same purpose as the `CCACHE_DIR` above.
# It's there to have the clang-tidy cache embedded into our standard `CCACHE_DIR`
if cache == "sccache":
cmake_flags.append("-DCOMPILER_CACHE=sccache")
# see https://github.com/mozilla/sccache/blob/main/docs/S3.md
result.append(f"SCCACHE_BUCKET={s3_bucket}")
sccache_dir = "sccache"
if s3_directory:
sccache_dir = f"{s3_directory}/{sccache_dir}"
result.append(f"SCCACHE_S3_KEY_PREFIX={sccache_dir}")
if not s3_rw_access:
result.append("SCCACHE_S3_NO_CREDENTIALS=true")
if clang_tidy:
# `CTCACHE_DIR` has the same purpose as the `CCACHE_DIR` above.
# It's there to have the clang-tidy cache embedded into our standard `CCACHE_DIR`
if cache == "ccache":
result.append("CTCACHE_DIR=/ccache/clang-tidy-cache")
result.append(f"CCACHE_MAXSIZE={cache_maxsize}")
if distcc_hosts:
hosts_with_params = [f"{host}/24,lzo" for host in distcc_hosts] + [
"localhost/`nproc`"
]
result.append('DISTCC_HOSTS="' + " ".join(hosts_with_params) + '"')
elif cache == "distcc":
result.append('DISTCC_HOSTS="localhost/`nproc`"')
if s3_bucket:
# see https://github.com/matus-chochlik/ctcache#environment-variables
ctcache_dir = "clang-tidy-cache"
if s3_directory:
ctcache_dir = f"{s3_directory}/{ctcache_dir}"
result.append(f"CTCACHE_S3_BUCKET={s3_bucket}")
result.append(f"CTCACHE_S3_FOLDER={ctcache_dir}")
if not s3_rw_access:
result.append("CTCACHE_S3_NO_CREDENTIALS=true")
if additional_pkgs:
# NOTE: This are the env for packages/build script
@ -300,9 +324,6 @@ def parse_env_variables(
if version:
result.append(f"VERSION_STRING='{version}'")
if author:
result.append(f"AUTHOR='{author}'")
if official:
cmake_flags.append("-DCLICKHOUSE_OFFICIAL_BUILD=1")
@ -312,14 +333,14 @@ def parse_env_variables(
return result
def dir_name(name: str) -> str:
if not os.path.isabs(name):
name = os.path.abspath(os.path.join(os.getcwd(), name))
return name
def dir_name(name: str) -> Path:
path = Path(name)
if not path.is_absolute():
path = Path.cwd() / name
return path
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter,
description="ClickHouse building script using prebuilt Docker image",
@ -331,7 +352,7 @@ if __name__ == "__main__":
)
parser.add_argument(
"--clickhouse-repo-path",
default=os.path.join(os.path.dirname(SCRIPT_PATH), os.pardir, os.pardir),
default=SCRIPT_PATH.parents[2],
type=dir_name,
help="ClickHouse git repository",
)
@ -361,17 +382,34 @@ if __name__ == "__main__":
)
parser.add_argument("--clang-tidy", action="store_true")
parser.add_argument("--cache", choices=("ccache", "distcc", ""), default="")
parser.add_argument(
"--ccache_dir",
default=os.getenv("HOME", "") + "/.ccache",
"--cache",
choices=("ccache", "sccache", ""),
default="",
help="ccache or sccache for objects caching; sccache uses only S3 buckets",
)
parser.add_argument(
"--ccache-dir",
default=Path.home() / ".ccache",
type=dir_name,
help="a directory with ccache",
)
parser.add_argument("--distcc-hosts", nargs="+")
parser.add_argument(
"--s3-bucket",
help="an S3 bucket used for sscache and clang-tidy-cache",
)
parser.add_argument(
"--s3-directory",
default="ccache",
help="an S3 directory prefix used for sscache and clang-tidy-cache",
)
parser.add_argument(
"--s3-rw-access",
action="store_true",
help="if set, the build fails on errors writing cache to S3",
)
parser.add_argument("--force-build-image", action="store_true")
parser.add_argument("--version")
parser.add_argument("--author", default="clickhouse", help="a package author")
parser.add_argument("--official", action="store_true")
parser.add_argument("--additional-pkgs", action="store_true")
parser.add_argument("--with-coverage", action="store_true")
@ -387,34 +425,54 @@ if __name__ == "__main__":
args = parser.parse_args()
image_name = f"clickhouse/{IMAGE_TYPE}-builder"
if args.additional_pkgs and args.package_type != "deb":
raise argparse.ArgumentTypeError(
"Can build additional packages only in deb build"
)
if args.cache != "ccache":
args.ccache_dir = None
if args.with_binaries != "":
if args.package_type != "deb":
raise argparse.ArgumentTypeError(
"Can add additional binaries only in deb build"
)
logging.info("Should place %s to output", args.with_binaries)
if args.cache == "sccache":
if not args.s3_bucket:
raise argparse.ArgumentTypeError("sccache must have --s3-bucket set")
return args
def main():
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
args = parse_args()
ch_root = args.clickhouse_repo_path
if args.additional_pkgs and args.package_type != "deb":
raise Exception("Can build additional packages only in deb build")
dockerfile = ch_root / "docker/packager" / IMAGE_TYPE / "Dockerfile"
image_with_version = IMAGE_NAME + ":" + args.docker_image_version
if args.force_build_image:
build_image(image_with_version, dockerfile)
elif not (
check_image_exists_locally(image_with_version) or pull_image(image_with_version)
):
build_image(image_with_version, dockerfile)
if args.with_binaries != "" and args.package_type != "deb":
raise Exception("Can add additional binaries only in deb build")
if args.with_binaries != "" and args.package_type == "deb":
logging.info("Should place %s to output", args.with_binaries)
dockerfile = os.path.join(ch_root, "docker/packager", IMAGE_TYPE, "Dockerfile")
image_with_version = image_name + ":" + args.docker_image_version
if not check_image_exists_locally(image_name) or args.force_build_image:
if not pull_image(image_with_version) or args.force_build_image:
build_image(image_with_version, dockerfile)
env_prepared = parse_env_variables(
args.build_type,
args.compiler,
args.sanitizer,
args.package_type,
args.cache,
args.distcc_hosts,
args.s3_bucket,
args.s3_directory,
args.s3_rw_access,
args.clang_tidy,
args.version,
args.author,
args.official,
args.additional_pkgs,
args.with_coverage,
@ -423,12 +481,15 @@ if __name__ == "__main__":
pre_build(args.clickhouse_repo_path, env_prepared)
run_docker_image_with_env(
image_name,
image_with_version,
args.as_root,
args.output_dir,
env_prepared,
ch_root,
args.ccache_dir,
args.docker_image_version,
)
logging.info("Output placed into %s", args.output_dir)
if __name__ == "__main__":
main()

View File

@ -20,12 +20,6 @@ RUN apt-get update \
zstd \
--yes --no-install-recommends
# Install CMake 3.20+ for Rust compilation
RUN apt purge cmake --yes
RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null
RUN apt-add-repository 'deb https://apt.kitware.com/ubuntu/ focal main'
RUN apt update && apt install cmake --yes
RUN pip3 install numpy scipy pandas Jinja2
ARG odbc_driver_url="https://github.com/ClickHouse/clickhouse-odbc/releases/download/v1.1.4.20200302/clickhouse-odbc-1.1.4-Linux.tar.gz"

View File

@ -16,7 +16,8 @@ export LLVM_VERSION=${LLVM_VERSION:-13}
# it being undefined. Also read it as an array so that we can pass an empty list
# of additional variables to cmake properly, and it doesn't generate an extra
# empty parameter.
read -ra FASTTEST_CMAKE_FLAGS <<< "${FASTTEST_CMAKE_FLAGS:-}"
# Read it as CMAKE_FLAGS to not lose the exported FASTTEST_CMAKE_FLAGS on subsequent launches
read -ra CMAKE_FLAGS <<< "${FASTTEST_CMAKE_FLAGS:-}"
# Run only matching tests.
FASTTEST_FOCUS=${FASTTEST_FOCUS:-""}
@ -37,6 +38,13 @@ export FASTTEST_DATA
export FASTTEST_OUT
export PATH
function ccache_status
{
ccache --show-config ||:
ccache --show-stats ||:
SCCACHE_NO_DAEMON=1 sccache --show-stats ||:
}
function start_server
{
set -m # Spawn server in its own process groups
@ -171,14 +179,14 @@ function run_cmake
export CCACHE_COMPILERCHECK=content
export CCACHE_MAXSIZE=15G
ccache --show-stats ||:
ccache_status
ccache --zero-stats ||:
mkdir "$FASTTEST_BUILD" ||:
(
cd "$FASTTEST_BUILD"
cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER="clang++-${LLVM_VERSION}" -DCMAKE_C_COMPILER="clang-${LLVM_VERSION}" "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt"
cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER="clang++-${LLVM_VERSION}" -DCMAKE_C_COMPILER="clang-${LLVM_VERSION}" "${CMAKE_LIBS_CONFIG[@]}" "${CMAKE_FLAGS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt"
)
}
@ -193,7 +201,7 @@ function build
strip programs/clickhouse -o "$FASTTEST_OUTPUT/clickhouse-stripped"
zstd --threads=0 "$FASTTEST_OUTPUT/clickhouse-stripped"
fi
ccache --show-stats ||:
ccache_status
ccache --evict-older-than 1d ||:
)
}

View File

@ -92,4 +92,17 @@ RUN mkdir /tmp/ccache \
&& cd / \
&& rm -rf /tmp/ccache
ARG TARGETARCH
ARG SCCACHE_VERSION=v0.4.1
RUN arch=${TARGETARCH:-amd64} \
&& case $arch in \
amd64) rarch=x86_64 ;; \
arm64) rarch=aarch64 ;; \
esac \
&& curl -Ls "https://github.com/mozilla/sccache/releases/download/$SCCACHE_VERSION/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl.tar.gz" | \
tar xz -C /tmp \
&& mv "/tmp/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl/sccache" /usr/bin \
&& rm "/tmp/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl" -r
COPY process_functional_tests_result.py /

View File

@ -85,7 +85,7 @@ The build requires the following components:
- Git (used only to check out the sources; it's not needed for the build)
- CMake 3.15 or newer
- Ninja
- C++ compiler: clang-14 or newer
- C++ compiler: clang-15 or newer
- Linker: lld
- Yasm
- Gawk

View File

@ -2769,7 +2769,7 @@ Default value: `120` seconds.
## cast_keep_nullable {#cast_keep_nullable}
Enables or disables keeping of the `Nullable` data type in [CAST](../../sql-reference/functions/type-conversion-functions.md/#type_conversion_function-cast) operations.
Enables or disables keeping of the `Nullable` data type in [CAST](../../sql-reference/functions/type-conversion-functions.md/#castx-t) operations.
When the setting is enabled and the argument of the `CAST` function is `Nullable`, the result is also transformed to `Nullable` type. When the setting is disabled, the result always has the destination type exactly.

View File

@ -47,6 +47,7 @@ The default join type can be overridden using [join_default_strictness](../../..
The behavior of ClickHouse server for `ANY JOIN` operations depends on the [any_join_distinct_right_table_keys](../../../operations/settings/settings.md#any_join_distinct_right_table_keys) setting.
**See also**
- [join_algorithm](../../../operations/settings/settings.md#settings-join_algorithm)
@ -57,6 +58,8 @@ The behavior of ClickHouse server for `ANY JOIN` operations depends on the [any_
- [join_on_disk_max_files_to_merge](../../../operations/settings/settings.md#join_on_disk_max_files_to_merge)
- [any_join_distinct_right_table_keys](../../../operations/settings/settings.md#any_join_distinct_right_table_keys)
Use the `cross_to_inner_join_rewrite` setting to define the behavior when ClickHouse fails to rewrite a `CROSS JOIN` as an `INNER JOIN`. The default value is `1`, which allows the join to continue, but it will be slower. Set `cross_to_inner_join_rewrite` to `0` if you want an error to be thrown, and set it to `2` to force a rewrite of all comma/cross joins instead of running them. If the rewriting fails when the value is `2`, you will receive an error message stating "Please, try to simplify `WHERE` section".
## ON Section Conditions
An `ON` section can contain several conditions combined using the `AND` and `OR` operators. Conditions specifying join keys must refer both left and right tables and must use the equality operator. Other conditions may use other logical operators but they must refer either the left or the right table of a query.

View File

@ -34,6 +34,7 @@
#include <Common/Config/configReadClient.h>
#include <Common/TerminalSize.h>
#include <Common/StudentTTest.h>
#include <Common/CurrentMetrics.h>
#include <filesystem>
@ -43,6 +44,12 @@ namespace fs = std::filesystem;
* The tool emulates a case with a fixed number of simultaneously executing queries.
*/
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
namespace DB
{
@ -103,7 +110,7 @@ public:
settings(settings_),
shared_context(Context::createShared()),
global_context(Context::createGlobal(shared_context.get())),
pool(concurrency)
pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, concurrency)
{
const auto secure = secure_ ? Protocol::Secure::Enable : Protocol::Secure::Disable;
size_t connections_cnt = std::max(ports_.size(), hosts_.size());

View File

@ -6,6 +6,7 @@
#include <Common/ZooKeeper/ZooKeeper.h>
#include <Common/ZooKeeper/KeeperException.h>
#include <Common/setThreadName.h>
#include <Common/CurrentMetrics.h>
#include <Interpreters/InterpreterInsertQuery.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Parsers/ASTFunction.h>
@ -19,6 +20,12 @@
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h>
#include <Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.h>
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
namespace DB
{
@ -192,7 +199,7 @@ void ClusterCopier::discoverTablePartitions(const ConnectionTimeouts & timeouts,
{
/// Fetch partitions list from a shard
{
ThreadPool thread_pool(num_threads ? num_threads : 2 * getNumberOfPhysicalCPUCores());
ThreadPool thread_pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads ? num_threads : 2 * getNumberOfPhysicalCPUCores());
for (const TaskShardPtr & task_shard : task_table.all_shards)
thread_pool.scheduleOrThrowOnError([this, timeouts, task_shard]()

View File

@ -153,9 +153,11 @@ private:
}
};
if (const auto * lhs_literal = lhs->as<ConstantNode>())
if (const auto * lhs_literal = lhs->as<ConstantNode>();
lhs_literal && !lhs_literal->getValue().isNull())
add_equals_function_if_not_present(rhs, lhs_literal);
else if (const auto * rhs_literal = rhs->as<ConstantNode>())
else if (const auto * rhs_literal = rhs->as<ConstantNode>();
rhs_literal && !rhs_literal->getValue().isNull())
add_equals_function_if_not_present(lhs, rhs_literal);
else
or_operands.push_back(argument);

View File

@ -20,10 +20,19 @@
#include <Common/Exception.h>
#include <Common/Macros.h>
#include <Common/logger_useful.h>
#include <Common/CurrentMetrics.h>
#include <Common/setThreadName.h>
#include <Common/scope_guard_safe.h>
namespace CurrentMetrics
{
extern const Metric BackupsThreads;
extern const Metric BackupsThreadsActive;
extern const Metric RestoreThreads;
extern const Metric RestoreThreadsActive;
}
namespace DB
{
@ -153,8 +162,8 @@ namespace
BackupsWorker::BackupsWorker(size_t num_backup_threads, size_t num_restore_threads, bool allow_concurrent_backups_, bool allow_concurrent_restores_)
: backups_thread_pool(num_backup_threads, /* max_free_threads = */ 0, num_backup_threads)
, restores_thread_pool(num_restore_threads, /* max_free_threads = */ 0, num_restore_threads)
: backups_thread_pool(CurrentMetrics::BackupsThreads, CurrentMetrics::BackupsThreadsActive, num_backup_threads, /* max_free_threads = */ 0, num_backup_threads)
, restores_thread_pool(CurrentMetrics::RestoreThreads, CurrentMetrics::RestoreThreadsActive, num_restore_threads, /* max_free_threads = */ 0, num_restore_threads)
, log(&Poco::Logger::get("BackupsWorker"))
, allow_concurrent_backups(allow_concurrent_backups_)
, allow_concurrent_restores(allow_concurrent_restores_)

View File

@ -5,6 +5,7 @@
#include <IO/WriteBufferFromString.h>
#include <IO/copyData.h>
#include <algorithm>
#include <stdexcept>
#include <chrono>
#include <cerrno>

View File

@ -72,6 +72,64 @@
M(GlobalThreadActive, "Number of threads in global thread pool running a task.") \
M(LocalThread, "Number of threads in local thread pools. The threads in local thread pools are taken from the global thread pool.") \
M(LocalThreadActive, "Number of threads in local thread pools running a task.") \
M(MergeTreeDataSelectExecutorThreads, "Number of threads in the MergeTreeDataSelectExecutor thread pool.") \
M(MergeTreeDataSelectExecutorThreadsActive, "Number of threads in the MergeTreeDataSelectExecutor thread pool running a task.") \
M(BackupsThreads, "Number of threads in the thread pool for BACKUP.") \
M(BackupsThreadsActive, "Number of threads in thread pool for BACKUP running a task.") \
M(RestoreThreads, "Number of threads in the thread pool for RESTORE.") \
M(RestoreThreadsActive, "Number of threads in the thread pool for RESTORE running a task.") \
M(IOThreads, "Number of threads in the IO thread pool.") \
M(IOThreadsActive, "Number of threads in the IO thread pool running a task.") \
M(ThreadPoolRemoteFSReaderThreads, "Number of threads in the thread pool for remote_filesystem_read_method=threadpool.") \
M(ThreadPoolRemoteFSReaderThreadsActive, "Number of threads in the thread pool for remote_filesystem_read_method=threadpool running a task.") \
M(ThreadPoolFSReaderThreads, "Number of threads in the thread pool for local_filesystem_read_method=threadpool.") \
M(ThreadPoolFSReaderThreadsActive, "Number of threads in the thread pool for local_filesystem_read_method=threadpool running a task.") \
M(BackupsIOThreads, "Number of threads in the BackupsIO thread pool.") \
M(BackupsIOThreadsActive, "Number of threads in the BackupsIO thread pool running a task.") \
M(DiskObjectStorageAsyncThreads, "Number of threads in the async thread pool for DiskObjectStorage.") \
M(DiskObjectStorageAsyncThreadsActive, "Number of threads in the async thread pool for DiskObjectStorage running a task.") \
M(StorageHiveThreads, "Number of threads in the StorageHive thread pool.") \
M(StorageHiveThreadsActive, "Number of threads in the StorageHive thread pool running a task.") \
M(TablesLoaderThreads, "Number of threads in the tables loader thread pool.") \
M(TablesLoaderThreadsActive, "Number of threads in the tables loader thread pool running a task.") \
M(DatabaseOrdinaryThreads, "Number of threads in the Ordinary database thread pool.") \
M(DatabaseOrdinaryThreadsActive, "Number of threads in the Ordinary database thread pool running a task.") \
M(DatabaseOnDiskThreads, "Number of threads in the DatabaseOnDisk thread pool.") \
M(DatabaseOnDiskThreadsActive, "Number of threads in the DatabaseOnDisk thread pool running a task.") \
M(DatabaseCatalogThreads, "Number of threads in the DatabaseCatalog thread pool.") \
M(DatabaseCatalogThreadsActive, "Number of threads in the DatabaseCatalog thread pool running a task.") \
M(DestroyAggregatesThreads, "Number of threads in the thread pool for destroy aggregate states.") \
M(DestroyAggregatesThreadsActive, "Number of threads in the thread pool for destroy aggregate states running a task.") \
M(HashedDictionaryThreads, "Number of threads in the HashedDictionary thread pool.") \
M(HashedDictionaryThreadsActive, "Number of threads in the HashedDictionary thread pool running a task.") \
M(CacheDictionaryThreads, "Number of threads in the CacheDictionary thread pool.") \
M(CacheDictionaryThreadsActive, "Number of threads in the CacheDictionary thread pool running a task.") \
M(ParallelFormattingOutputFormatThreads, "Number of threads in the ParallelFormattingOutputFormatThreads thread pool.") \
M(ParallelFormattingOutputFormatThreadsActive, "Number of threads in the ParallelFormattingOutputFormatThreads thread pool running a task.") \
M(ParallelParsingInputFormatThreads, "Number of threads in the ParallelParsingInputFormat thread pool.") \
M(ParallelParsingInputFormatThreadsActive, "Number of threads in the ParallelParsingInputFormat thread pool running a task.") \
M(MergeTreeBackgroundExecutorThreads, "Number of threads in the MergeTreeBackgroundExecutor thread pool.") \
M(MergeTreeBackgroundExecutorThreadsActive, "Number of threads in the MergeTreeBackgroundExecutor thread pool running a task.") \
M(AsynchronousInsertThreads, "Number of threads in the AsynchronousInsert thread pool.") \
M(AsynchronousInsertThreadsActive, "Number of threads in the AsynchronousInsert thread pool running a task.") \
M(StartupSystemTablesThreads, "Number of threads in the StartupSystemTables thread pool.") \
M(StartupSystemTablesThreadsActive, "Number of threads in the StartupSystemTables thread pool running a task.") \
M(AggregatorThreads, "Number of threads in the Aggregator thread pool.") \
M(AggregatorThreadsActive, "Number of threads in the Aggregator thread pool running a task.") \
M(DDLWorkerThreads, "Number of threads in the DDLWorker thread pool for ON CLUSTER queries.") \
M(DDLWorkerThreadsActive, "Number of threads in the DDLWorker thread pool for ON CLUSTER queries running a task.") \
M(StorageDistributedThreads, "Number of threads in the StorageDistributed thread pool.") \
M(StorageDistributedThreadsActive, "Number of threads in the StorageDistributed thread pool running a task.") \
M(StorageS3Threads, "Number of threads in the StorageS3 thread pool.") \
M(StorageS3ThreadsActive, "Number of threads in the StorageS3 thread pool running a task.") \
M(MergeTreePartsLoaderThreads, "Number of threads in the MergeTree parts loader thread pool.") \
M(MergeTreePartsLoaderThreadsActive, "Number of threads in the MergeTree parts loader thread pool running a task.") \
M(MergeTreePartsCleanerThreads, "Number of threads in the MergeTree parts cleaner thread pool.") \
M(MergeTreePartsCleanerThreadsActive, "Number of threads in the MergeTree parts cleaner thread pool running a task.") \
M(SystemReplicasThreads, "Number of threads in the system.replicas thread pool.") \
M(SystemReplicasThreadsActive, "Number of threads in the system.replicas thread pool running a task.") \
M(RestartReplicaThreads, "Number of threads in the RESTART REPLICA thread pool.") \
M(RestartReplicaThreadsActive, "Number of threads in the RESTART REPLICA thread pool running a task.") \
M(DistributedFilesToInsert, "Number of pending files to process for asynchronous insertion into Distributed tables. Number of files for every shard is summed.") \
M(BrokenDistributedFilesToInsert, "Number of files for asynchronous insertion into Distributed tables that have been marked as broken. This metric starts from 0 on server startup. Number of files for every shard is summed.") \
M(TablesToDropQueueSize, "Number of dropped tables, that are waiting for background data removal.") \
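The thread-pool entries added above all follow the same pattern, which the rest of this changeset repeats file by file: register a `<Name>Threads` / `<Name>ThreadsActive` pair here, re-declare the pair as `extern` in the translation unit that owns the pool, and pass both metrics to the `ThreadPool` constructor. A minimal sketch of that wiring, using a hypothetical `ExampleThreads` / `ExampleThreadsActive` pair (a real pool would use one of the pairs registered above):
// In CurrentMetrics.cpp, in the metric list above (hypothetical entries):
//     M(ExampleThreads, "Number of threads in the Example thread pool.")
//     M(ExampleThreadsActive, "Number of threads in the Example thread pool running a task.")
// In the .cpp file that owns the pool:
#include <cstddef>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
namespace CurrentMetrics
{
extern const Metric ExampleThreads;
extern const Metric ExampleThreadsActive;
}
void runExampleJobs(size_t num_jobs)
{
// The pool reports its size and activity through the dedicated pair
// instead of the generic LocalThread / LocalThreadActive counters.
ThreadPool pool(CurrentMetrics::ExampleThreads, CurrentMetrics::ExampleThreadsActive, num_jobs);
for (size_t i = 0; i < num_jobs; ++i)
pool.scheduleOrThrowOnError([] { /* do work */ });
pool.wait();
}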

View File

@ -4,6 +4,7 @@
#include <cstdint>
#include <utility>
#include <atomic>
#include <cassert>
#include <base/types.h>
/** Allows to count number of simultaneously happening processes or current value of some metric.
@ -73,7 +74,10 @@ namespace CurrentMetrics
public:
explicit Increment(Metric metric, Value amount_ = 1)
: Increment(&values[metric], amount_) {}
: Increment(&values[metric], amount_)
{
assert(metric < CurrentMetrics::end());
}
~Increment()
{

View File

@ -649,6 +649,7 @@
M(678, IO_URING_INIT_FAILED) \
M(679, IO_URING_SUBMIT_ERROR) \
M(690, MIXED_ACCESS_PARAMETER_TYPES) \
M(691, UNKNOWN_ELEMENT_OF_ENUM) \
\
M(999, KEEPER_EXCEPTION) \
M(1000, POCO_EXCEPTION) \

View File

@ -26,28 +26,37 @@ namespace CurrentMetrics
{
extern const Metric GlobalThread;
extern const Metric GlobalThreadActive;
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
static constexpr auto DEFAULT_THREAD_NAME = "ThreadPool";
template <typename Thread>
ThreadPoolImpl<Thread>::ThreadPoolImpl()
: ThreadPoolImpl(getNumberOfPhysicalCPUCores())
ThreadPoolImpl<Thread>::ThreadPoolImpl(Metric metric_threads_, Metric metric_active_threads_)
: ThreadPoolImpl(metric_threads_, metric_active_threads_, getNumberOfPhysicalCPUCores())
{
}
template <typename Thread>
ThreadPoolImpl<Thread>::ThreadPoolImpl(size_t max_threads_)
: ThreadPoolImpl(max_threads_, max_threads_, max_threads_)
ThreadPoolImpl<Thread>::ThreadPoolImpl(
Metric metric_threads_,
Metric metric_active_threads_,
size_t max_threads_)
: ThreadPoolImpl(metric_threads_, metric_active_threads_, max_threads_, max_threads_, max_threads_)
{
}
template <typename Thread>
ThreadPoolImpl<Thread>::ThreadPoolImpl(size_t max_threads_, size_t max_free_threads_, size_t queue_size_, bool shutdown_on_exception_)
: max_threads(max_threads_)
ThreadPoolImpl<Thread>::ThreadPoolImpl(
Metric metric_threads_,
Metric metric_active_threads_,
size_t max_threads_,
size_t max_free_threads_,
size_t queue_size_,
bool shutdown_on_exception_)
: metric_threads(metric_threads_)
, metric_active_threads(metric_active_threads_)
, max_threads(max_threads_)
, max_free_threads(std::min(max_free_threads_, max_threads))
, queue_size(queue_size_ ? std::max(queue_size_, max_threads) : 0 /* zero means the queue is unlimited */)
, shutdown_on_exception(shutdown_on_exception_)
@ -324,8 +333,7 @@ template <typename Thread>
void ThreadPoolImpl<Thread>::worker(typename std::list<Thread>::iterator thread_it)
{
DENY_ALLOCATIONS_IN_SCOPE;
CurrentMetrics::Increment metric_all_threads(
std::is_same_v<Thread, std::thread> ? CurrentMetrics::GlobalThread : CurrentMetrics::LocalThread);
CurrentMetrics::Increment metric_pool_threads(metric_threads);
/// Remove this thread from `threads` and detach it, that must be done before exiting from this worker.
/// We can't wrap the following lambda function into `SCOPE_EXIT` because it requires `mutex` to be locked.
@ -383,8 +391,7 @@ void ThreadPoolImpl<Thread>::worker(typename std::list<Thread>::iterator thread_
try
{
CurrentMetrics::Increment metric_active_threads(
std::is_same_v<Thread, std::thread> ? CurrentMetrics::GlobalThreadActive : CurrentMetrics::LocalThreadActive);
CurrentMetrics::Increment metric_active_pool_threads(metric_active_threads);
job();
@ -458,6 +465,22 @@ template class ThreadFromGlobalPoolImpl<true>;
std::unique_ptr<GlobalThreadPool> GlobalThreadPool::the_instance;
GlobalThreadPool::GlobalThreadPool(
size_t max_threads_,
size_t max_free_threads_,
size_t queue_size_,
const bool shutdown_on_exception_)
: FreeThreadPool(
CurrentMetrics::GlobalThread,
CurrentMetrics::GlobalThreadActive,
max_threads_,
max_free_threads_,
queue_size_,
shutdown_on_exception_)
{
}
void GlobalThreadPool::initialize(size_t max_threads, size_t max_free_threads, size_t queue_size)
{
if (the_instance)

View File

@ -16,6 +16,7 @@
#include <Poco/Event.h>
#include <Common/ThreadStatus.h>
#include <Common/OpenTelemetryTraceContext.h>
#include <Common/CurrentMetrics.h>
#include <base/scope_guard.h>
/** Very simple thread pool similar to boost::threadpool.
@ -33,15 +34,25 @@ class ThreadPoolImpl
{
public:
using Job = std::function<void()>;
using Metric = CurrentMetrics::Metric;
/// Maximum number of threads is based on the number of physical cores.
ThreadPoolImpl();
ThreadPoolImpl(Metric metric_threads_, Metric metric_active_threads_);
/// Size is constant. Up to num_threads are created on demand and then run until shutdown.
explicit ThreadPoolImpl(size_t max_threads_);
explicit ThreadPoolImpl(
Metric metric_threads_,
Metric metric_active_threads_,
size_t max_threads_);
/// queue_size - maximum number of running plus scheduled jobs. It can be greater than max_threads. Zero means unlimited.
ThreadPoolImpl(size_t max_threads_, size_t max_free_threads_, size_t queue_size_, bool shutdown_on_exception_ = true);
ThreadPoolImpl(
Metric metric_threads_,
Metric metric_active_threads_,
size_t max_threads_,
size_t max_free_threads_,
size_t queue_size_,
bool shutdown_on_exception_ = true);
/// Add a new job. Locks until the number of scheduled jobs is less than the maximum or an exception in one of the threads was thrown.
/// If any thread has thrown an exception, the first exception will be rethrown from this method,
@ -96,6 +107,9 @@ private:
std::condition_variable job_finished;
std::condition_variable new_job_or_shutdown;
Metric metric_threads;
Metric metric_active_threads;
size_t max_threads;
size_t max_free_threads;
size_t queue_size;
@ -159,12 +173,11 @@ class GlobalThreadPool : public FreeThreadPool, private boost::noncopyable
{
static std::unique_ptr<GlobalThreadPool> the_instance;
GlobalThreadPool(size_t max_threads_, size_t max_free_threads_,
size_t queue_size_, const bool shutdown_on_exception_)
: FreeThreadPool(max_threads_, max_free_threads_, queue_size_,
shutdown_on_exception_)
{
}
GlobalThreadPool(
size_t max_threads_,
size_t max_free_threads_,
size_t queue_size_,
bool shutdown_on_exception_);
public:
static void initialize(size_t max_threads = 10000, size_t max_free_threads = 1000, size_t queue_size = 10000);
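For call sites that previously relied on the default-sized `ThreadPool pool;`, the new two-argument constructor keeps the old sizing behaviour (number of physical CPU cores) but still requires the metric pair. A short sketch, assuming the generic `LocalThread` / `LocalThreadActive` metrics used elsewhere in this changeset:
#include <functional>
#include <vector>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
void loadInParallel(const std::vector<std::function<void()>> & tasks)
{
// Sized from the number of physical cores, exactly like the removed default constructor.
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive);
for (const auto & task : tasks)
pool.scheduleOrThrowOnError(task);
pool.wait();
}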

View File

@ -17,6 +17,7 @@
#include <Common/Stopwatch.h>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
using Key = UInt64;
@ -28,6 +29,12 @@ using Map = HashMap<Key, Value>;
using MapTwoLevel = TwoLevelHashMap<Key, Value>;
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
struct SmallLock
{
std::atomic<int> locked {false};
@ -247,7 +254,7 @@ int main(int argc, char ** argv)
std::cerr << std::fixed << std::setprecision(2);
ThreadPool pool(num_threads);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads);
Source data(n);

View File

@ -17,6 +17,7 @@
#include <Common/Stopwatch.h>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
using Key = UInt64;
@ -24,6 +25,12 @@ using Value = UInt64;
using Source = std::vector<Key>;
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
template <typename Map>
struct AggregateIndependent
{
@ -274,7 +281,7 @@ int main(int argc, char ** argv)
std::cerr << std::fixed << std::setprecision(2);
ThreadPool pool(num_threads);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads);
Source data(n);

View File

@ -6,6 +6,7 @@
#include <Common/Stopwatch.h>
#include <Common/Exception.h>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
int value = 0;
@ -14,6 +15,12 @@ static void f() { ++value; }
static void * g(void *) { f(); return {}; }
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
namespace DB
{
namespace ErrorCodes
@ -65,7 +72,7 @@ int main(int argc, char ** argv)
test(n, "Create and destroy ThreadPool each iteration", []
{
ThreadPool tp(1);
ThreadPool tp(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 1);
tp.scheduleOrThrowOnError(f);
tp.wait();
});
@ -86,7 +93,7 @@ int main(int argc, char ** argv)
});
{
ThreadPool tp(1);
ThreadPool tp(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 1);
test(n, "Schedule job for Threadpool each iteration", [&tp]
{
@ -96,7 +103,7 @@ int main(int argc, char ** argv)
}
{
ThreadPool tp(128);
ThreadPool tp(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 128);
test(n, "Schedule job for Threadpool with 128 threads each iteration", [&tp]
{

View File

@ -1,4 +1,5 @@
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
#include <gtest/gtest.h>
@ -7,6 +8,12 @@
*/
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
TEST(ThreadPool, ConcurrentWait)
{
auto worker = []
@ -18,14 +25,14 @@ TEST(ThreadPool, ConcurrentWait)
constexpr size_t num_threads = 4;
constexpr size_t num_jobs = 4;
ThreadPool pool(num_threads);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads);
for (size_t i = 0; i < num_jobs; ++i)
pool.scheduleOrThrowOnError(worker);
constexpr size_t num_waiting_threads = 4;
ThreadPool waiting_pool(num_waiting_threads);
ThreadPool waiting_pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_waiting_threads);
for (size_t i = 0; i < num_waiting_threads; ++i)
waiting_pool.scheduleOrThrowOnError([&pool] { pool.wait(); });

View File

@ -2,10 +2,17 @@
#include <Common/Exception.h>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
#include <gtest/gtest.h>
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
/// Test what happens if local ThreadPool cannot create a ThreadFromGlobalPool.
/// There was a bug: if local ThreadPool cannot allocate even a single thread,
/// the job will be scheduled but never get executed.
@ -27,7 +34,7 @@ TEST(ThreadPool, GlobalFull1)
auto func = [&] { ++counter; while (counter != num_jobs) {} };
ThreadPool pool(num_jobs);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_jobs);
for (size_t i = 0; i < capacity; ++i)
pool.scheduleOrThrowOnError(func);
@ -65,11 +72,11 @@ TEST(ThreadPool, GlobalFull2)
std::atomic<size_t> counter = 0;
auto func = [&] { ++counter; while (counter != capacity + 1) {} };
ThreadPool pool(capacity, 0, capacity);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, capacity, 0, capacity);
for (size_t i = 0; i < capacity; ++i)
pool.scheduleOrThrowOnError(func);
ThreadPool another_pool(1);
ThreadPool another_pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 1);
EXPECT_THROW(another_pool.scheduleOrThrowOnError(func), DB::Exception);
++counter;

View File

@ -1,10 +1,17 @@
#include <atomic>
#include <iostream>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
#include <gtest/gtest.h>
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
TEST(ThreadPool, Loop)
{
std::atomic<int> res{0};
@ -12,7 +19,7 @@ TEST(ThreadPool, Loop)
for (size_t i = 0; i < 1000; ++i)
{
size_t threads = 16;
ThreadPool pool(threads);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, threads);
for (size_t j = 0; j < threads; ++j)
pool.scheduleOrThrowOnError([&] { ++res; });
pool.wait();

View File

@ -1,13 +1,20 @@
#include <iostream>
#include <stdexcept>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
#include <gtest/gtest.h>
namespace CurrentMetrics
{
extern const Metric LocalThread;
extern const Metric LocalThreadActive;
}
static bool check()
{
ThreadPool pool(10);
ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 10);
/// The throwing thread.
pool.scheduleOrThrowOnError([] { throw std::runtime_error("Hello, world!"); });

View File

@ -29,7 +29,6 @@ namespace DB
M(Int32, max_connections, 1024, "Max server connections.", 0) \
M(UInt32, asynchronous_metrics_update_period_s, 1, "Period in seconds for updating asynchronous metrics.", 0) \
M(UInt32, asynchronous_heavy_metrics_update_period_s, 120, "Period in seconds for updating asynchronous metrics.", 0) \
M(UInt32, max_threads_for_connection_collector, 10, "The maximum number of threads that will be used for draining connections asynchronously in a background upon finishing executing distributed queries.", 0) \
M(String, default_database, "default", "Default database name.", 0) \
M(String, tmp_policy, "", "Policy for storage with temporary data.", 0) \
M(UInt64, max_temporary_data_on_disk_size, 0, "The maximum amount of storage that could be used for external aggregation, joins or sorting.", 0) \

View File

@ -10,7 +10,7 @@ namespace ErrorCodes
{
extern const int SYNTAX_ERROR;
extern const int EMPTY_DATA_PASSED;
extern const int BAD_ARGUMENTS;
extern const int UNKNOWN_ELEMENT_OF_ENUM;
}
template <typename T>
@ -69,7 +69,7 @@ T EnumValues<T>::getValue(StringRef field_name, bool try_treat_as_id) const
}
auto hints = this->getHints(field_name.toString());
auto hints_string = !hints.empty() ? ", maybe you meant: " + toString(hints) : "";
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Unknown element '{}' for enum{}", field_name.toString(), hints_string);
throw Exception(ErrorCodes::UNKNOWN_ELEMENT_OF_ENUM, "Unknown element '{}' for enum{}", field_name.toString(), hints_string);
}
return it->getMapped();
}

View File

@ -17,14 +17,21 @@
#include <TableFunctions/TableFunctionFactory.h>
#include <Common/escapeForFileName.h>
#include <Common/logger_useful.h>
#include <Common/filesystemHelpers.h>
#include <Common/CurrentMetrics.h>
#include <Common/assert_cast.h>
#include <Databases/DatabaseOrdinary.h>
#include <Databases/DatabaseAtomic.h>
#include <Common/assert_cast.h>
#include <filesystem>
#include <Common/filesystemHelpers.h>
namespace fs = std::filesystem;
namespace CurrentMetrics
{
extern const Metric DatabaseOnDiskThreads;
extern const Metric DatabaseOnDiskThreadsActive;
}
namespace DB
{
@ -620,7 +627,7 @@ void DatabaseOnDisk::iterateMetadataFiles(ContextPtr local_context, const Iterat
}
/// Read and parse metadata in parallel
ThreadPool pool;
ThreadPool pool(CurrentMetrics::DatabaseOnDiskThreads, CurrentMetrics::DatabaseOnDiskThreadsActive);
for (const auto & file : metadata_files)
{
pool.scheduleOrThrowOnError([&]()

View File

@ -25,9 +25,16 @@
#include <Common/quoteString.h>
#include <Common/typeid_cast.h>
#include <Common/logger_useful.h>
#include <Common/CurrentMetrics.h>
namespace fs = std::filesystem;
namespace CurrentMetrics
{
extern const Metric DatabaseOrdinaryThreads;
extern const Metric DatabaseOrdinaryThreadsActive;
}
namespace DB
{
@ -99,7 +106,7 @@ void DatabaseOrdinary::loadStoredObjects(
std::atomic<size_t> dictionaries_processed{0};
std::atomic<size_t> tables_processed{0};
ThreadPool pool;
ThreadPool pool(CurrentMetrics::DatabaseOrdinaryThreads, CurrentMetrics::DatabaseOrdinaryThreadsActive);
/// We must attach dictionaries before attaching tables
/// because while we're attaching tables we may need to have some dictionaries attached

View File

@ -8,8 +8,15 @@
#include <Poco/Util/AbstractConfiguration.h>
#include <Common/logger_useful.h>
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>
#include <numeric>
namespace CurrentMetrics
{
extern const Metric TablesLoaderThreads;
extern const Metric TablesLoaderThreadsActive;
}
namespace DB
{
@ -31,12 +38,13 @@ void logAboutProgress(Poco::Logger * log, size_t processed, size_t total, Atomic
}
TablesLoader::TablesLoader(ContextMutablePtr global_context_, Databases databases_, LoadingStrictnessLevel strictness_mode_)
: global_context(global_context_)
, databases(std::move(databases_))
, strictness_mode(strictness_mode_)
, referential_dependencies("ReferentialDeps")
, loading_dependencies("LoadingDeps")
, all_loading_dependencies("LoadingDeps")
: global_context(global_context_)
, databases(std::move(databases_))
, strictness_mode(strictness_mode_)
, referential_dependencies("ReferentialDeps")
, loading_dependencies("LoadingDeps")
, all_loading_dependencies("LoadingDeps")
, pool(CurrentMetrics::TablesLoaderThreads, CurrentMetrics::TablesLoaderThreadsActive)
{
metadata.default_database = global_context->getCurrentDatabase();
log = &Poco::Logger::get("TablesLoader");

View File

@ -2,8 +2,15 @@
#include <Dictionaries/CacheDictionaryUpdateQueue.h>
#include <Common/CurrentMetrics.h>
#include <Common/setThreadName.h>
namespace CurrentMetrics
{
extern const Metric CacheDictionaryThreads;
extern const Metric CacheDictionaryThreadsActive;
}
namespace DB
{
@ -26,7 +33,7 @@ CacheDictionaryUpdateQueue<dictionary_key_type>::CacheDictionaryUpdateQueue(
, configuration(configuration_)
, update_func(std::move(update_func_))
, update_queue(configuration.max_update_queue_size)
, update_pool(configuration.max_threads_for_updates)
, update_pool(CurrentMetrics::CacheDictionaryThreads, CurrentMetrics::CacheDictionaryThreadsActive, configuration.max_threads_for_updates)
{
for (size_t i = 0; i < configuration.max_threads_for_updates; ++i)
update_pool.scheduleOrThrowOnError([this] { updateThreadFunction(); });

View File

@ -8,6 +8,7 @@
#include <Common/setThreadName.h>
#include <Common/logger_useful.h>
#include <Common/ConcurrentBoundedQueue.h>
#include <Common/CurrentMetrics.h>
#include <Core/Defines.h>
@ -21,6 +22,11 @@
#include <Dictionaries/DictionaryFactory.h>
#include <Dictionaries/HierarchyDictionariesUtils.h>
namespace CurrentMetrics
{
extern const Metric HashedDictionaryThreads;
extern const Metric HashedDictionaryThreadsActive;
}
namespace
{
@ -60,7 +66,7 @@ public:
explicit ParallelDictionaryLoader(HashedDictionary & dictionary_)
: dictionary(dictionary_)
, shards(dictionary.configuration.shards)
, pool(shards)
, pool(CurrentMetrics::HashedDictionaryThreads, CurrentMetrics::HashedDictionaryThreadsActive, shards)
, shards_queues(shards)
{
UInt64 backlog = dictionary.configuration.shard_load_queue_backlog;
@ -213,7 +219,7 @@ HashedDictionary<dictionary_key_type, sparse, sharded>::~HashedDictionary()
return;
size_t shards = std::max<size_t>(configuration.shards, 1);
ThreadPool pool(shards);
ThreadPool pool(CurrentMetrics::HashedDictionaryThreads, CurrentMetrics::HashedDictionaryThreadsActive, shards);
size_t hash_tables_count = 0;
auto schedule_destroy = [&hash_tables_count, &pool](auto & container)

View File

@ -61,6 +61,8 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric Read;
extern const Metric ThreadPoolFSReaderThreads;
extern const Metric ThreadPoolFSReaderThreadsActive;
}
@ -85,7 +87,7 @@ static bool hasBugInPreadV2()
#endif
ThreadPoolReader::ThreadPoolReader(size_t pool_size, size_t queue_size_)
: pool(pool_size, pool_size, queue_size_)
: pool(CurrentMetrics::ThreadPoolFSReaderThreads, CurrentMetrics::ThreadPoolFSReaderThreadsActive, pool_size, pool_size, queue_size_)
{
}

View File

@ -26,6 +26,8 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric RemoteRead;
extern const Metric ThreadPoolRemoteFSReaderThreads;
extern const Metric ThreadPoolRemoteFSReaderThreadsActive;
}
namespace DB
@ -60,7 +62,7 @@ IAsynchronousReader::Result RemoteFSFileDescriptor::readInto(char * data, size_t
ThreadPoolRemoteFSReader::ThreadPoolRemoteFSReader(size_t pool_size, size_t queue_size_)
: pool(pool_size, pool_size, queue_size_)
: pool(CurrentMetrics::ThreadPoolRemoteFSReaderThreads, CurrentMetrics::ThreadPoolRemoteFSReaderThreadsActive, pool_size, pool_size, queue_size_)
{
}

View File

@ -10,6 +10,7 @@
#include <Common/quoteString.h>
#include <Common/logger_useful.h>
#include <Common/filesystemHelpers.h>
#include <Common/CurrentMetrics.h>
#include <Disks/ObjectStorages/Cached/CachedObjectStorage.h>
#include <Disks/ObjectStorages/DiskObjectStorageRemoteMetadataRestoreHelper.h>
#include <Disks/ObjectStorages/DiskObjectStorageTransaction.h>
@ -17,6 +18,13 @@
#include <Poco/Util/AbstractConfiguration.h>
#include <Interpreters/Context.h>
namespace CurrentMetrics
{
extern const Metric DiskObjectStorageAsyncThreads;
extern const Metric DiskObjectStorageAsyncThreadsActive;
}
namespace DB
{
@ -38,7 +46,8 @@ class AsyncThreadPoolExecutor : public Executor
public:
AsyncThreadPoolExecutor(const String & name_, int thread_pool_size)
: name(name_)
, pool(ThreadPool(thread_pool_size)) {}
, pool(CurrentMetrics::DiskObjectStorageAsyncThreads, CurrentMetrics::DiskObjectStorageAsyncThreadsActive, thread_pool_size)
{}
std::future<void> execute(std::function<void()> task) override
{

View File

@ -1129,7 +1129,9 @@ public:
for (size_t i = 0; i < vec_res.size(); ++i)
{
vec_res[i] = DB::parseIPv6whole(reinterpret_cast<const char *>(&vec_src[prev_offset]), reinterpret_cast<unsigned char *>(buffer));
vec_res[i] = DB::parseIPv6whole(reinterpret_cast<const char *>(&vec_src[prev_offset]),
reinterpret_cast<const char *>(&vec_src[offsets_src[i] - 1]),
reinterpret_cast<unsigned char *>(buffer));
prev_offset = offsets_src[i];
}

View File

@ -1,5 +1,12 @@
#include <IO/BackupsIOThreadPool.h>
#include "Core/Field.h"
#include <Common/CurrentMetrics.h>
#include <Core/Field.h>
namespace CurrentMetrics
{
extern const Metric BackupsIOThreads;
extern const Metric BackupsIOThreadsActive;
}
namespace DB
{
@ -18,7 +25,13 @@ void BackupsIOThreadPool::initialize(size_t max_threads, size_t max_free_threads
throw Exception(ErrorCodes::LOGICAL_ERROR, "The BackupsIO thread pool is initialized twice");
}
instance = std::make_unique<ThreadPool>(max_threads, max_free_threads, queue_size, false /*shutdown_on_exception*/);
instance = std::make_unique<ThreadPool>(
CurrentMetrics::BackupsIOThreads,
CurrentMetrics::BackupsIOThreadsActive,
max_threads,
max_free_threads,
queue_size,
/* shutdown_on_exception= */ false);
}
ThreadPool & BackupsIOThreadPool::get()

View File

@ -1,5 +1,12 @@
#include <IO/IOThreadPool.h>
#include "Core/Field.h"
#include <Common/CurrentMetrics.h>
#include <Core/Field.h>
namespace CurrentMetrics
{
extern const Metric IOThreads;
extern const Metric IOThreadsActive;
}
namespace DB
{
@ -18,7 +25,13 @@ void IOThreadPool::initialize(size_t max_threads, size_t max_free_threads, size_
throw Exception(ErrorCodes::LOGICAL_ERROR, "The IO thread pool is initialized twice");
}
instance = std::make_unique<ThreadPool>(max_threads, max_free_threads, queue_size, false /*shutdown_on_exception*/);
instance = std::make_unique<ThreadPool>(
CurrentMetrics::IOThreads,
CurrentMetrics::IOThreadsActive,
max_threads,
max_free_threads,
queue_size,
/* shutdown_on_exception= */ false);
}
ThreadPool & IOThreadPool::get()

View File

@ -24,6 +24,7 @@
#include <Common/CacheBase.h>
#include <Common/MemoryTracker.h>
#include <Common/CurrentThread.h>
#include <Common/CurrentMetrics.h>
#include <Common/typeid_cast.h>
#include <Common/assert_cast.h>
#include <Common/JSONBuilder.h>
@ -59,6 +60,8 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric TemporaryFilesForAggregation;
extern const Metric AggregatorThreads;
extern const Metric AggregatorThreadsActive;
}
namespace DB
@ -2397,7 +2400,7 @@ BlocksList Aggregator::convertToBlocks(AggregatedDataVariants & data_variants, b
std::unique_ptr<ThreadPool> thread_pool;
if (max_threads > 1 && data_variants.sizeWithoutOverflowRow() > 100000 /// TODO Make a custom threshold.
&& data_variants.isTwoLevel()) /// TODO Use the shared thread pool with the `merge` function.
thread_pool = std::make_unique<ThreadPool>(max_threads);
thread_pool = std::make_unique<ThreadPool>(CurrentMetrics::AggregatorThreads, CurrentMetrics::AggregatorThreadsActive, max_threads);
if (data_variants.without_key)
blocks.emplace_back(prepareBlockAndFillWithoutKey(
@ -2592,7 +2595,7 @@ void NO_INLINE Aggregator::mergeDataOnlyExistingKeysImpl(
void NO_INLINE Aggregator::mergeWithoutKeyDataImpl(
ManyAggregatedDataVariants & non_empty_data) const
{
ThreadPool thread_pool{params.max_threads};
ThreadPool thread_pool{CurrentMetrics::AggregatorThreads, CurrentMetrics::AggregatorThreadsActive, params.max_threads};
AggregatedDataVariantsPtr & res = non_empty_data[0];
@ -3065,7 +3068,7 @@ void Aggregator::mergeBlocks(BucketToBlocks bucket_to_blocks, AggregatedDataVari
std::unique_ptr<ThreadPool> thread_pool;
if (max_threads > 1 && total_input_rows > 100000) /// TODO Make a custom threshold.
thread_pool = std::make_unique<ThreadPool>(max_threads);
thread_pool = std::make_unique<ThreadPool>(CurrentMetrics::AggregatorThreads, CurrentMetrics::AggregatorThreadsActive, max_threads);
for (const auto & bucket_blocks : bucket_to_blocks)
{

View File

@ -32,6 +32,8 @@
namespace CurrentMetrics
{
extern const Metric PendingAsyncInsert;
extern const Metric AsynchronousInsertThreads;
extern const Metric AsynchronousInsertThreadsActive;
}
namespace ProfileEvents
@ -130,7 +132,7 @@ AsynchronousInsertQueue::AsynchronousInsertQueue(ContextPtr context_, size_t poo
: WithContext(context_)
, pool_size(pool_size_)
, queue_shards(pool_size)
, pool(pool_size)
, pool(CurrentMetrics::AsynchronousInsertThreads, CurrentMetrics::AsynchronousInsertThreadsActive, pool_size)
{
if (!pool_size)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "pool_size cannot be zero");

View File

@ -40,6 +40,12 @@
namespace fs = std::filesystem;
namespace CurrentMetrics
{
extern const Metric DDLWorkerThreads;
extern const Metric DDLWorkerThreadsActive;
}
namespace DB
{
@ -85,7 +91,7 @@ DDLWorker::DDLWorker(
{
LOG_WARNING(log, "DDLWorker is configured to use multiple threads. "
"It's not recommended because queries can be reordered. Also it may cause some unknown issues to appear.");
worker_pool = std::make_unique<ThreadPool>(pool_size);
worker_pool = std::make_unique<ThreadPool>(CurrentMetrics::DDLWorkerThreads, CurrentMetrics::DDLWorkerThreadsActive, pool_size);
}
queue_dir = zk_root_dir;
@ -1084,7 +1090,7 @@ void DDLWorker::runMainThread()
/// It will wait for all threads in pool to finish and will not rethrow exceptions (if any).
/// We create new thread pool to forget previous exceptions.
if (1 < pool_size)
worker_pool = std::make_unique<ThreadPool>(pool_size);
worker_pool = std::make_unique<ThreadPool>(CurrentMetrics::DDLWorkerThreads, CurrentMetrics::DDLWorkerThreadsActive, pool_size);
/// Clear other in-memory state, like server just started.
current_tasks.clear();
last_skipped_entry_name.reset();
@ -1123,7 +1129,7 @@ void DDLWorker::runMainThread()
initialized = false;
/// Wait for pending async tasks
if (1 < pool_size)
worker_pool = std::make_unique<ThreadPool>(pool_size);
worker_pool = std::make_unique<ThreadPool>(CurrentMetrics::DDLWorkerThreads, CurrentMetrics::DDLWorkerThreadsActive, pool_size);
LOG_INFO(log, "Lost ZooKeeper connection, will try to connect again: {}", getCurrentExceptionMessage(true));
}
else

View File

@ -1,6 +1,7 @@
#pragma once
#include <Common/CurrentThread.h>
#include <Common/CurrentMetrics.h>
#include <Common/DNSResolver.h>
#include <Common/ThreadPool.h>
#include <Common/ZooKeeper/IKeeper.h>

View File

@ -38,6 +38,8 @@
namespace CurrentMetrics
{
extern const Metric TablesToDropQueueSize;
extern const Metric DatabaseCatalogThreads;
extern const Metric DatabaseCatalogThreadsActive;
}
namespace DB
@ -852,7 +854,7 @@ void DatabaseCatalog::loadMarkedAsDroppedTables()
LOG_INFO(log, "Found {} partially dropped tables. Will load them and retry removal.", dropped_metadata.size());
ThreadPool pool;
ThreadPool pool(CurrentMetrics::DatabaseCatalogThreads, CurrentMetrics::DatabaseCatalogThreadsActive);
for (const auto & elem : dropped_metadata)
{
pool.scheduleOrThrowOnError([&]()

View File

@ -41,6 +41,7 @@ namespace ErrorCodes
extern const int INVALID_SETTING_VALUE;
extern const int UNKNOWN_SETTING;
extern const int LOGICAL_ERROR;
extern const int NOT_IMPLEMENTED;
}
namespace
@ -386,6 +387,10 @@ QueryPipeline InterpreterExplainQuery::executeImpl()
}
case ASTExplainQuery::QueryTree:
{
if (!getContext()->getSettingsRef().allow_experimental_analyzer)
throw Exception(ErrorCodes::NOT_IMPLEMENTED,
"EXPLAIN QUERY TREE is only supported with a new analyzer. Set allow_experimental_analyzer = 1.");
if (ast.getExplainedQuery()->as<ASTSelectWithUnionQuery>() == nullptr)
throw Exception(ErrorCodes::INCORRECT_QUERY, "Only SELECT is supported for EXPLAIN QUERY TREE query");

View File

@ -12,7 +12,7 @@ namespace DB
BlockIO InterpreterShowEnginesQuery::execute()
{
return executeQuery("SELECT * FROM system.table_engines", getContext(), true);
return executeQuery("SELECT * FROM system.table_engines ORDER BY name", getContext(), true);
}
}

View File

@ -28,7 +28,6 @@ InterpreterShowTablesQuery::InterpreterShowTablesQuery(const ASTPtr & query_ptr_
{
}
String InterpreterShowTablesQuery::getRewrittenQuery()
{
const auto & query = query_ptr->as<ASTShowTablesQuery &>();
@ -51,6 +50,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
if (query.limit_length)
rewritten_query << " LIMIT " << query.limit_length;
/// (*)
rewritten_query << " ORDER BY name";
return rewritten_query.str();
}
@ -69,6 +71,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
<< DB::quote << query.like;
}
/// (*)
rewritten_query << " ORDER BY cluster";
if (query.limit_length)
rewritten_query << " LIMIT " << query.limit_length;
@ -81,6 +86,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
rewritten_query << " WHERE cluster = " << DB::quote << query.cluster_str;
/// (*)
rewritten_query << " ORDER BY cluster";
return rewritten_query.str();
}
@ -101,6 +109,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
<< DB::quote << query.like;
}
/// (*)
rewritten_query << " ORDER BY name, type, value ";
return rewritten_query.str();
}
@ -146,6 +157,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
else if (query.where_expression)
rewritten_query << " AND (" << query.where_expression << ")";
/// (*)
rewritten_query << " ORDER BY name ";
if (query.limit_length)
rewritten_query << " LIMIT " << query.limit_length;
@ -176,5 +190,8 @@ BlockIO InterpreterShowTablesQuery::execute()
return executeQuery(getRewrittenQuery(), getContext(), true);
}
/// (*) Sorting is strictly speaking not necessary, but 1. it is convenient for users, 2. SQL currently does not allow
/// sorting the output of SHOW <INFO> otherwise (SELECT * FROM (SHOW <INFO> ...) ORDER BY ... is rejected), and 3. some
/// SQL tests can take advantage of this.
}

View File

@ -7,6 +7,7 @@
#include <Common/ThreadPool.h>
#include <Common/escapeForFileName.h>
#include <Common/ShellCommand.h>
#include <Common/CurrentMetrics.h>
#include <Interpreters/Cache/FileCacheFactory.h>
#include <Interpreters/Cache/FileCache.h>
#include <Interpreters/Context.h>
@ -64,6 +65,12 @@
#include "config.h"
namespace CurrentMetrics
{
extern const Metric RestartReplicaThreads;
extern const Metric RestartReplicaThreadsActive;
}
namespace DB
{
@ -685,7 +692,8 @@ void InterpreterSystemQuery::restartReplicas(ContextMutablePtr system_context)
for (auto & guard : guards)
guard.second = catalog.getDDLGuard(guard.first.database_name, guard.first.table_name);
ThreadPool pool(std::min(static_cast<size_t>(getNumberOfPhysicalCPUCores()), replica_names.size()));
size_t threads = std::min(static_cast<size_t>(getNumberOfPhysicalCPUCores()), replica_names.size());
ThreadPool pool(CurrentMetrics::RestartReplicaThreads, CurrentMetrics::RestartReplicaThreadsActive, threads);
for (auto & replica : replica_names)
{

View File

@ -1223,18 +1223,37 @@ void executeQuery(
end = begin + parse_buf.size();
}
ASTPtr ast;
BlockIO streams;
std::tie(ast, streams) = executeQueryImpl(begin, end, context, false, QueryProcessingStage::Complete, &istr);
auto & pipeline = streams.pipeline;
QueryResultDetails result_details
{
.query_id = context->getClientInfo().current_query_id,
.timezone = DateLUT::instance().getTimeZone(),
};
/// Set the result details in case of any exception raised during query execution
SCOPE_EXIT({
if (set_result_details == nullptr)
/// Either the result_details have been set in the flow below or the caller of this function does not provide this callback
return;
try
{
set_result_details(result_details);
}
catch (...)
{
/// This exception can be ignored:
/// if control reaches here, an exception was already raised during query execution,
/// and that exception will be propagated to the outer caller,
/// so there's no need to report the exception thrown here.
}
});
ASTPtr ast;
BlockIO streams;
std::tie(ast, streams) = executeQueryImpl(begin, end, context, false, QueryProcessingStage::Complete, &istr);
auto & pipeline = streams.pipeline;
std::unique_ptr<WriteBuffer> compressed_buffer;
try
{
@ -1304,7 +1323,15 @@ void executeQuery(
}
if (set_result_details)
set_result_details(result_details);
{
/// The call to set_result_details itself might throw an exception;
/// in that case there's no need to call it again in the SCOPE_EXIT defined above,
/// so the callback is cleared before it is invoked.
auto set_result_details_copy = set_result_details;
set_result_details = nullptr;
set_result_details_copy(result_details);
}
if (pipeline.initialized())
{
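The subtle point in the hunk above is that `set_result_details` can now fire from two places: the normal path and the `SCOPE_EXIT` that runs if query execution throws. A reduced sketch of the clear-before-call idiom in plain C++ (the `ResultDetails` struct and `ScopeExit` guard below are illustrative stand-ins, not the actual ClickHouse definitions):
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
struct ResultDetails { std::string query_id; };
// Illustrative stand-in for ClickHouse's SCOPE_EXIT macro.
struct ScopeExit
{
std::function<void()> fn;
~ScopeExit() { if (fn) fn(); }
};
void runQuery(std::function<void(const ResultDetails &)> set_result_details, bool fail)
{
ResultDetails details{"example-query-id"};
// Fallback path: if execution throws, the guard still reports the details exactly once.
ScopeExit guard{[&]
{
if (!set_result_details)
return;
try { set_result_details(details); } catch (...) { /* already failing, ignore */ }
}};
if (fail)
throw std::runtime_error("query failed"); // emulates an exception during execution
// Normal path: clear the callback *before* invoking it, so that if the callback itself
// throws, the guard above does not call it a second time.
auto callback = std::exchange(set_result_details, nullptr);
callback(details);
}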

View File

@ -16,16 +16,24 @@
#include <IO/ReadBufferFromFile.h>
#include <IO/ReadHelpers.h>
#include <Common/escapeForFileName.h>
#include <Common/escapeForFileName.h>
#include <Common/typeid_cast.h>
#include <filesystem>
#include <Common/logger_useful.h>
#include <Common/CurrentMetrics.h>
#include <filesystem>
#define ORDINARY_TO_ATOMIC_PREFIX ".tmp_convert."
namespace fs = std::filesystem;
namespace CurrentMetrics
{
extern const Metric StartupSystemTablesThreads;
extern const Metric StartupSystemTablesThreadsActive;
}
namespace DB
{
@ -366,7 +374,7 @@ static void maybeConvertOrdinaryDatabaseToAtomic(ContextMutablePtr context, cons
if (!tables_started)
{
/// It's not quite correct to run DDL queries while the database is not started up.
ThreadPool pool;
ThreadPool pool(CurrentMetrics::StartupSystemTablesThreads, CurrentMetrics::StartupSystemTablesThreadsActive);
DatabaseCatalog::instance().getSystemDatabase()->startupTables(pool, LoadingStrictnessLevel::FORCE_RESTORE);
}
@ -461,7 +469,7 @@ void convertDatabasesEnginesIfNeed(ContextMutablePtr context)
void startupSystemTables()
{
ThreadPool pool;
ThreadPool pool(CurrentMetrics::StartupSystemTablesThreads, CurrentMetrics::StartupSystemTablesThreadsActive);
DatabaseCatalog::instance().getSystemDatabase()->startupTables(pool, LoadingStrictnessLevel::FORCE_RESTORE);
}

View File

@ -1,6 +1,7 @@
#pragma once
#include <absl/container/inlined_vector.h>
#include <algorithm>
#include <memory>
#include <Core/Defines.h>

View File

@ -28,6 +28,7 @@ namespace ErrorCodes
extern const int CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING;
extern const int CANNOT_PARSE_IPV4;
extern const int CANNOT_PARSE_IPV6;
extern const int UNKNOWN_ELEMENT_OF_ENUM;
}
@ -48,7 +49,8 @@ bool isParseError(int code)
|| code == ErrorCodes::INCORRECT_DATA /// For some ReadHelpers
|| code == ErrorCodes::CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING
|| code == ErrorCodes::CANNOT_PARSE_IPV4
|| code == ErrorCodes::CANNOT_PARSE_IPV6;
|| code == ErrorCodes::CANNOT_PARSE_IPV6
|| code == ErrorCodes::UNKNOWN_ELEMENT_OF_ENUM;
}
IRowInputFormat::IRowInputFormat(Block header, ReadBuffer & in_, Params params_)

View File

@ -565,6 +565,11 @@ void AvroRowOutputFormat::write(const Columns & columns, size_t row_num)
void AvroRowOutputFormat::finalizeImpl()
{
/// If the file writer wasn't created, we should create it here to write the file prefix/suffix
/// even without actual data, so the file will be a valid Avro file
if (!file_writer_ptr)
createFileWriter();
file_writer_ptr->close();
}

View File

@ -7,7 +7,8 @@
#include <Common/Stopwatch.h>
#include <Common/logger_useful.h>
#include <Common/Exception.h>
#include "IO/WriteBufferFromString.h"
#include <Common/CurrentMetrics.h>
#include <IO/WriteBufferFromString.h>
#include <Formats/FormatFactory.h>
#include <Poco/Event.h>
#include <IO/BufferWithOwnMemory.h>
@ -17,6 +18,12 @@
#include <deque>
#include <atomic>
namespace CurrentMetrics
{
extern const Metric ParallelFormattingOutputFormatThreads;
extern const Metric ParallelFormattingOutputFormatThreadsActive;
}
namespace DB
{
@ -74,7 +81,7 @@ public:
explicit ParallelFormattingOutputFormat(Params params)
: IOutputFormat(params.header, params.out)
, internal_formatter_creator(params.internal_formatter_creator)
, pool(params.max_threads_for_parallel_formatting)
, pool(CurrentMetrics::ParallelFormattingOutputFormatThreads, CurrentMetrics::ParallelFormattingOutputFormatThreadsActive, params.max_threads_for_parallel_formatting)
{
LOG_TEST(&Poco::Logger::get("ParallelFormattingOutputFormat"), "Parallel formatting is being used");

View File

@ -5,14 +5,21 @@
#include <Common/CurrentThread.h>
#include <Common/ThreadPool.h>
#include <Common/setThreadName.h>
#include <Common/logger_useful.h>
#include <Common/CurrentMetrics.h>
#include <IO/BufferWithOwnMemory.h>
#include <IO/ReadBuffer.h>
#include <Processors/Formats/IRowInputFormat.h>
#include <Interpreters/Context.h>
#include <Common/logger_useful.h>
#include <Poco/Event.h>
namespace CurrentMetrics
{
extern const Metric ParallelParsingInputFormatThreads;
extern const Metric ParallelParsingInputFormatThreadsActive;
}
namespace DB
{
@ -94,7 +101,7 @@ public:
, min_chunk_bytes(params.min_chunk_bytes)
, max_block_size(params.max_block_size)
, is_server(params.is_server)
, pool(params.max_threads)
, pool(CurrentMetrics::ParallelParsingInputFormatThreads, CurrentMetrics::ParallelParsingInputFormatThreadsActive, params.max_threads)
{
// One unit for each thread, including segmentator and reader, plus a
// couple more units so that the segmentation thread doesn't spuriously

View File

@ -14,7 +14,7 @@ static ITransformingStep::Traits getTraits(const ActionsDAGPtr & expression, con
bool preserves_sorting = expression->isSortingPreserved(header, sort_description, remove_filter_column ? filter_column_name : "");
if (remove_filter_column)
{
preserves_sorting &= find_if(
preserves_sorting &= std::find_if(
begin(sort_description),
end(sort_description),
[&](const auto & column_desc) { return column_desc.column_name == filter_column_name; })

View File

@ -153,6 +153,10 @@ bool isPartitionKeySuitsGroupByKey(
if (group_by_actions->hasArrayJoin() || group_by_actions->hasStatefulFunctions() || group_by_actions->hasNonDeterministic())
return false;
/// We are interested only in calculations required to obtain group by keys (and not aggregate function arguments for example).
group_by_actions->removeUnusedActions(aggregating.getParams().keys);
const auto & gb_key_required_columns = group_by_actions->getRequiredColumnsNames();
const auto & partition_actions = reading.getStorageMetadata()->getPartitionKey().expression->getActionsDAG();
@ -202,7 +206,7 @@ size_t tryAggregatePartitionsIndependently(QueryPlan::Node * node, QueryPlan::No
return 0;
if (!reading->willOutputEachPartitionThroughSeparatePort()
&& isPartitionKeySuitsGroupByKey(*reading, expression_step->getExpression(), *aggregating_step))
&& isPartitionKeySuitsGroupByKey(*reading, expression_step->getExpression()->clone(), *aggregating_step))
{
if (reading->requestOutputEachPartitionThroughSeparatePort())
aggregating_step->skipMerging();

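The `clone()` added above matters because the check prunes the actions DAG in place (`removeUnusedActions`) purely to inspect the GROUP BY key calculations; running that on the expression still owned by the plan would corrupt it. A simplified illustration follows, with `Dag` as a stand-in for the real ActionsDAG.

```cpp
#include <memory>
#include <set>
#include <string>

/// Simplified stand-in for an actions DAG: removeUnusedActions() is destructive.
struct Dag
{
    std::set<std::string> outputs;

    std::shared_ptr<Dag> clone() const { return std::make_shared<Dag>(*this); }

    void removeUnusedActions(const std::set<std::string> & keep)
    {
        std::set<std::string> pruned;
        for (const auto & name : outputs)
            if (keep.count(name))
                pruned.insert(name);
        outputs = std::move(pruned);  // nodes not needed for the keys are dropped
    }
};

/// Analogue of the key check: it mutates its argument while inspecting it.
bool keysMatch(const std::shared_ptr<Dag> & group_by_actions, const std::set<std::string> & keys)
{
    group_by_actions->removeUnusedActions(keys);
    return group_by_actions->outputs == keys;
}

int main()
{
    auto expression = std::make_shared<Dag>(Dag{{"key", "value"}});

    /// Passing a clone keeps `expression` intact for later execution;
    /// passing it directly would silently drop the "value" output.
    bool ok = keysMatch(expression->clone(), {"key"});
    (void)ok;
}
```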

@ -6,6 +6,13 @@
#include <Common/Stopwatch.h>
#include <Common/setThreadName.h>
#include <Common/scope_guard_safe.h>
#include <Common/CurrentMetrics.h>
namespace CurrentMetrics
{
extern const Metric DestroyAggregatesThreads;
extern const Metric DestroyAggregatesThreadsActive;
}
namespace DB
{
@ -84,7 +91,10 @@ struct ManyAggregatedData
// Aggregation states destruction may be very time-consuming.
// In the case of a query with LIMIT, most states won't be destroyed during conversion to blocks.
// Without the following code, they would be destroyed in the destructor of AggregatedDataVariants in the current thread (i.e. sequentially).
const auto pool = std::make_unique<ThreadPool>(variants.size());
const auto pool = std::make_unique<ThreadPool>(
CurrentMetrics::DestroyAggregatesThreads,
CurrentMetrics::DestroyAggregatesThreadsActive,
variants.size());
for (auto && variant : variants)
{

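The comment above describes the motivation: with LIMIT, most aggregation states survive until the end of the query, and freeing them sequentially in one thread can dominate latency. A rough standard-library sketch of the idea (the real code uses AggregatedDataVariants and the metric-aware ThreadPool):

```cpp
#include <memory>
#include <thread>
#include <vector>

struct HeavyState
{
    std::vector<int> data = std::vector<int>(1 << 20);  // pretend this is a large hash table
};

int main()
{
    std::vector<std::shared_ptr<HeavyState>> variants(8);
    for (auto & v : variants)
        v = std::make_shared<HeavyState>();

    // Move each state into a worker thread; the last owner frees it there,
    // so destruction of all variants proceeds in parallel instead of
    // sequentially in the current thread.
    std::vector<std::thread> pool;
    for (auto & v : variants)
        pool.emplace_back([state = std::move(v)]() mutable { state.reset(); });
    for (auto & t : pool)
        t.join();
}
```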

@ -7,6 +7,7 @@
#include <fmt/core.h>
#include <Poco/URI.h>
#include <Common/logger_useful.h>
#include <Common/CurrentMetrics.h>
#include <Columns/IColumn.h>
#include <Core/Block.h>
@ -40,6 +41,11 @@
#include <Storages/StorageFactory.h>
#include <Storages/checkAndGetLiteralArgument.h>
namespace CurrentMetrics
{
extern const Metric StorageHiveThreads;
extern const Metric StorageHiveThreadsActive;
}
namespace DB
{
@ -844,7 +850,7 @@ HiveFiles StorageHive::collectHiveFiles(
Int64 hive_max_query_partitions = context_->getSettings().max_partitions_to_read;
/// Mutex to protect hive_files, which may be appended to by multiple threads
std::mutex hive_files_mutex;
ThreadPool pool{max_threads};
ThreadPool pool{CurrentMetrics::StorageHiveThreads, CurrentMetrics::StorageHiveThreadsActive, max_threads};
if (!partitions.empty())
{
for (const auto & partition : partitions)


@ -99,7 +99,7 @@ Pipe StorageMeiliSearch::read(
for (const auto & el : query_params->children)
{
auto str = el->getColumnName();
auto it = find(str.begin(), str.end(), '=');
auto it = std::find(str.begin(), str.end(), '=');
if (it == str.end())
throw Exception(ErrorCodes::BAD_QUERY_PARAMETER, "meiliMatch function must have parameters of the form \'key=value\'");


@ -21,6 +21,12 @@
#include <Storages/MergeTree/IExecutableTask.h>
namespace CurrentMetrics
{
extern const Metric MergeTreeBackgroundExecutorThreads;
extern const Metric MergeTreeBackgroundExecutorThreadsActive;
}
namespace DB
{
namespace ErrorCodes
@ -255,6 +261,7 @@ public:
, max_tasks_count(max_tasks_count_)
, metric(metric_)
, max_tasks_metric(max_tasks_metric_, 2 * max_tasks_count) // active + pending
, pool(CurrentMetrics::MergeTreeBackgroundExecutorThreads, CurrentMetrics::MergeTreeBackgroundExecutorThreadsActive)
{
if (max_tasks_count == 0)
throw Exception(ErrorCodes::INVALID_CONFIG_PARAMETER, "Task count for MergeTreeBackgroundExecutor must not be zero");


@ -17,6 +17,7 @@
#include <Common/Stopwatch.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/typeid_cast.h>
#include <Common/CurrentMetrics.h>
#include <Compression/CompressedReadBuffer.h>
#include <Core/QueryProcessingStage.h>
#include <DataTypes/DataTypeEnum.h>
@ -117,6 +118,10 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric DelayedInserts;
extern const Metric MergeTreePartsLoaderThreads;
extern const Metric MergeTreePartsLoaderThreadsActive;
extern const Metric MergeTreePartsCleanerThreads;
extern const Metric MergeTreePartsCleanerThreadsActive;
}
@ -1567,7 +1572,7 @@ void MergeTreeData::loadDataParts(bool skip_sanity_checks)
}
}
ThreadPool pool(disks.size());
ThreadPool pool(CurrentMetrics::MergeTreePartsLoaderThreads, CurrentMetrics::MergeTreePartsLoaderThreadsActive, disks.size());
std::vector<PartLoadingTree::PartLoadingInfos> parts_to_load_by_disk(disks.size());
for (size_t i = 0; i < disks.size(); ++i)
@ -2299,7 +2304,7 @@ void MergeTreeData::clearPartsFromFilesystemImpl(const DataPartsVector & parts_t
/// Parallel parts removal.
size_t num_threads = std::min<size_t>(settings->max_part_removal_threads, parts_to_remove.size());
std::mutex part_names_mutex;
ThreadPool pool(num_threads);
ThreadPool pool(CurrentMetrics::MergeTreePartsCleanerThreads, CurrentMetrics::MergeTreePartsCleanerThreadsActive, num_threads);
bool has_zero_copy_parts = false;
if (settings->allow_remote_fs_zero_copy_replication && dynamic_cast<StorageReplicatedMergeTree *>(this) != nullptr)


@ -35,6 +35,7 @@
#include <Processors/Transforms/AggregatingTransform.h>
#include <Core/UUID.h>
#include <Common/CurrentMetrics.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeEnum.h>
#include <DataTypes/DataTypeUUID.h>
@ -46,6 +47,12 @@
#include <Storages/MergeTree/CommonANNIndexes.h>
namespace CurrentMetrics
{
extern const Metric MergeTreeDataSelectExecutorThreads;
extern const Metric MergeTreeDataSelectExecutorThreadsActive;
}
namespace DB
{
@ -1077,7 +1084,10 @@ RangesInDataParts MergeTreeDataSelectExecutor::filterPartsByPrimaryKeyAndSkipInd
else
{
/// Parallel loading of data parts.
ThreadPool pool(num_threads);
ThreadPool pool(
CurrentMetrics::MergeTreeDataSelectExecutorThreads,
CurrentMetrics::MergeTreeDataSelectExecutorThreadsActive,
num_threads);
for (size_t part_index = 0; part_index < parts.size(); ++part_index)
pool.scheduleOrThrowOnError([&, part_index, thread_group = CurrentThread::getGroup()]


@ -29,6 +29,7 @@
#include <Common/quoteString.h>
#include <Common/randomSeed.h>
#include <Common/formatReadable.h>
#include <Common/CurrentMetrics.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTFunction.h>
@ -126,6 +127,12 @@ namespace ProfileEvents
extern const Event DistributedDelayedInsertsMilliseconds;
}
namespace CurrentMetrics
{
extern const Metric StorageDistributedThreads;
extern const Metric StorageDistributedThreadsActive;
}
namespace DB
{
@ -1436,7 +1443,7 @@ void StorageDistributed::initializeFromDisk()
const auto & disks = data_volume->getDisks();
/// Make initialization for large number of disks parallel.
ThreadPool pool(disks.size());
ThreadPool pool(CurrentMetrics::StorageDistributedThreads, CurrentMetrics::StorageDistributedThreadsActive, disks.size());
for (const DiskPtr & disk : disks)
{


@ -655,7 +655,9 @@ QueryPipelineBuilderPtr ReadFromMerge::createSources(
if (real_column_names.empty())
real_column_names.push_back(ExpressionActions::getSmallestColumn(storage_snapshot->metadata->getColumns().getAllPhysical()).name);
QueryPlan plan;
/// Steps for reading from child tables should have the same lifetime as the current step
/// because `builder` can have references to them (mainly for EXPLAIN PIPELINE).
QueryPlan & plan = child_plans.emplace_back();
StorageView * view = dynamic_cast<StorageView *>(storage.get());
if (!view || allow_experimental_analyzer)


@ -159,6 +159,10 @@ private:
StoragePtr storage_merge;
StorageSnapshotPtr merge_storage_snapshot;
/// Store the read plan for each child table.
/// Needed to guarantee that the child steps live as long as this step.
std::vector<QueryPlan> child_plans;
SelectQueryInfo query_info;
ContextMutablePtr context;
QueryProcessingStage::Enum common_processed_stage;

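The two hunks above fix a lifetime issue: the pipeline builder keeps non-owning references into the child plans, so those plans must be stored in a member (`child_plans`, declared in the header) that lives as long as the step itself, not in a local `QueryPlan`. A simplified stand-in (Plan/Builder are not the real QueryPlan/QueryPipelineBuilder):

```cpp
#include <deque>
#include <string>

struct Plan
{
    std::string description;
};

struct Builder
{
    const Plan * source = nullptr;  // non-owning reference into a plan
};

class ReadFromMergeLike
{
public:
    Builder createSource(std::string description)
    {
        /// Wrong: a local `Plan plan;` here would dangle as soon as this function returns.
        Plan & plan = child_plans.emplace_back();
        plan.description = std::move(description);
        return Builder{&plan};  // valid for as long as *this is alive
    }

private:
    /// Same lifetime as the step itself; a deque keeps references to existing
    /// elements stable when more children are appended.
    std::deque<Plan> child_plans;
};

int main()
{
    ReadFromMergeLike step;
    Builder first = step.createSource("read from child table 1");
    Builder second = step.createSource("read from child table 2");
    // Both builders still point at live plans owned by `step`.
    (void)first;
    (void)second;
}
```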

@ -29,7 +29,6 @@
#include <Storages/checkAndGetLiteralArgument.h>
#include <Storages/StorageURL.h>
#include <Storages/NamedCollectionsHelpers.h>
#include <Common/NamedCollections/NamedCollections.h>
#include <Storages/ReadFromStorageProgress.h>
#include <Disks/IO/AsynchronousReadIndirectBufferFromRemoteFS.h>
@ -53,8 +52,10 @@
#include <aws/core/auth/AWSCredentials.h>
#include <Common/NamedCollections/NamedCollections.h>
#include <Common/parseGlobs.h>
#include <Common/quoteString.h>
#include <Common/CurrentMetrics.h>
#include <re2/re2.h>
#include <Processors/ISource.h>
@ -65,6 +66,12 @@
namespace fs = std::filesystem;
namespace CurrentMetrics
{
extern const Metric StorageS3Threads;
extern const Metric StorageS3ThreadsActive;
}
namespace ProfileEvents
{
extern const Event S3DeleteObjects;
@ -150,7 +157,7 @@ public:
, object_infos(object_infos_)
, read_keys(read_keys_)
, request_settings(request_settings_)
, list_objects_pool(1)
, list_objects_pool(CurrentMetrics::StorageS3Threads, CurrentMetrics::StorageS3ThreadsActive, 1)
, list_objects_scheduler(threadPoolCallbackRunner<ListObjectsOutcome>(list_objects_pool, "ListObjects"))
{
if (globbed_uri.bucket.find_first_of("*?{") != globbed_uri.bucket.npos)
@ -574,7 +581,7 @@ StorageS3Source::StorageS3Source(
, requested_virtual_columns(requested_virtual_columns_)
, file_iterator(file_iterator_)
, download_thread_num(download_thread_num_)
, create_reader_pool(1)
, create_reader_pool(CurrentMetrics::StorageS3Threads, CurrentMetrics::StorageS3ThreadsActive, 1)
, create_reader_scheduler(threadPoolCallbackRunner<ReaderHolder>(create_reader_pool, "CreateS3Reader"))
{
reader = createReader();


@ -7,12 +7,19 @@
#include <Storages/StorageReplicatedMergeTree.h>
#include <Storages/VirtualColumnUtils.h>
#include <Access/ContextAccess.h>
#include <Common/typeid_cast.h>
#include <Databases/IDatabase.h>
#include <Processors/Sources/SourceFromSingleChunk.h>
#include <Common/typeid_cast.h>
#include <Common/CurrentMetrics.h>
#include <Common/getNumberOfPhysicalCPUCores.h>
namespace CurrentMetrics
{
extern const Metric SystemReplicasThreads;
extern const Metric SystemReplicasThreadsActive;
}
namespace DB
{
@ -160,7 +167,7 @@ Pipe StorageSystemReplicas::read(
if (settings.max_threads != 0)
thread_pool_size = std::min(thread_pool_size, static_cast<size_t>(settings.max_threads));
ThreadPool thread_pool(thread_pool_size);
ThreadPool thread_pool(CurrentMetrics::SystemReplicasThreads, CurrentMetrics::SystemReplicasThreadsActive, thread_pool_size);
for (size_t i = 0; i < tables_size; ++i)
{


@ -106,22 +106,9 @@ private:
StorageID(db_name, table_name), ColumnsDescription{tab.columns}, ConstraintsDescription{}, String{}));
}
DatabaseCatalog::instance().attachDatabase(database->getDatabaseName(), database);
// DatabaseCatalog::instance().attachDatabase("system", mockSystemDatabase());
context->setCurrentDatabase("test");
}
DatabasePtr mockSystemDatabase()
{
DatabasePtr database = std::make_shared<DatabaseMemory>("system", context);
auto tab = TableWithColumnNamesAndTypes(createDBAndTable("one", "system"), { {"dummy", std::make_shared<DataTypeUInt8>()} });
database->attachTable(context, tab.table.table,
std::make_shared<StorageMemory>(
StorageID(tab.table.database, tab.table.table),
ColumnsDescription{tab.columns}, ConstraintsDescription{}, String{}));
return database;
}
};
static void checkOld(


@ -6,15 +6,12 @@ import json
import os
import sys
import time
from shutil import rmtree
from typing import List, Tuple
from ccache_utils import get_ccache_if_not_exists, upload_ccache
from ci_config import CI_CONFIG, BuildConfig
from commit_status_helper import get_commit_filtered_statuses, get_commit
from docker_pull_helper import get_image_with_version
from env_helper import (
CACHES_PATH,
GITHUB_JOB,
IMAGES_PATH,
REPO_COPY,
@ -54,7 +51,6 @@ def get_packager_cmd(
output_path: str,
build_version: str,
image_version: str,
ccache_path: str,
official: bool,
) -> str:
package_type = build_config["package_type"]
@ -72,8 +68,9 @@ def get_packager_cmd(
if build_config["tidy"] == "enable":
cmd += " --clang-tidy"
cmd += " --cache=ccache"
cmd += f" --ccache_dir={ccache_path}"
cmd += " --cache=sccache"
cmd += " --s3-rw-access"
cmd += f" --s3-bucket={S3_BUILDS_BUCKET}"
if "additional_pkgs" in build_config and build_config["additional_pkgs"]:
cmd += " --additional-pkgs"
@ -314,29 +311,12 @@ def main():
if not os.path.exists(build_output_path):
os.makedirs(build_output_path)
ccache_path = os.path.join(CACHES_PATH, build_name + "_ccache")
logging.info("Will try to fetch cache for our build")
try:
get_ccache_if_not_exists(
ccache_path, s3_helper, pr_info.number, TEMP_PATH, pr_info.release_pr
)
except Exception as e:
# In case there are issues with ccache, remove the path and do not fail a build
logging.info("Failed to get ccache, building without it. Error: %s", e)
rmtree(ccache_path, ignore_errors=True)
if not os.path.exists(ccache_path):
logging.info("cache was not fetched, will create empty dir")
os.makedirs(ccache_path)
packager_cmd = get_packager_cmd(
build_config,
os.path.join(REPO_COPY, "docker/packager"),
build_output_path,
version.string,
image_version,
ccache_path,
official_flag,
)
@ -352,13 +332,8 @@ def main():
subprocess.check_call(
f"sudo chown -R ubuntu:ubuntu {build_output_path}", shell=True
)
subprocess.check_call(f"sudo chown -R ubuntu:ubuntu {ccache_path}", shell=True)
logging.info("Build finished with %s, log path %s", success, log_path)
# Upload the ccache first to have the least build time in case of problems
logging.info("Will upload cache")
upload_ccache(ccache_path, s3_helper, pr_info.number, TEMP_PATH)
# FIXME performance
performance_urls = []
performance_path = os.path.join(build_output_path, "performance.tar.zst")


@ -11,7 +11,6 @@ from typing import List, Tuple
from github import Github
from ccache_utils import get_ccache_if_not_exists, upload_ccache
from clickhouse_helper import (
ClickHouseHelper,
mark_flaky_tests,
@ -22,7 +21,7 @@ from commit_status_helper import (
update_mergeable_check,
)
from docker_pull_helper import get_image_with_version
from env_helper import CACHES_PATH, TEMP_PATH
from env_helper import S3_BUILDS_BUCKET, TEMP_PATH
from get_robot_token import get_best_robot_token
from pr_info import FORCE_TESTS_LABEL, PRInfo
from report import TestResults, read_test_results
@ -38,24 +37,22 @@ NAME = "Fast test"
csv.field_size_limit(sys.maxsize)
def get_fasttest_cmd(
workspace, output_path, ccache_path, repo_path, pr_number, commit_sha, image
):
def get_fasttest_cmd(workspace, output_path, repo_path, pr_number, commit_sha, image):
return (
f"docker run --cap-add=SYS_PTRACE "
"--network=host " # required to get access to IAM credentials
f"-e FASTTEST_WORKSPACE=/fasttest-workspace -e FASTTEST_OUTPUT=/test_output "
f"-e FASTTEST_SOURCE=/ClickHouse --cap-add=SYS_PTRACE "
f"-e FASTTEST_CMAKE_FLAGS='-DCOMPILER_CACHE=sccache' "
f"-e PULL_REQUEST_NUMBER={pr_number} -e COMMIT_SHA={commit_sha} "
f"-e COPY_CLICKHOUSE_BINARY_TO_OUTPUT=1 "
f"-e SCCACHE_BUCKET={S3_BUILDS_BUCKET} -e SCCACHE_S3_KEY_PREFIX=ccache/sccache "
f"--volume={workspace}:/fasttest-workspace --volume={repo_path}:/ClickHouse "
f"--volume={output_path}:/test_output "
f"--volume={ccache_path}:/fasttest-workspace/ccache {image}"
f"--volume={output_path}:/test_output {image}"
)
def process_results(
result_folder: str,
) -> Tuple[str, str, TestResults, List[str]]:
def process_results(result_folder: str) -> Tuple[str, str, TestResults, List[str]]:
test_results = [] # type: TestResults
additional_files = []
# Just upload all files from result_folder.
@ -129,21 +126,6 @@ def main():
if not os.path.exists(output_path):
os.makedirs(output_path)
if not os.path.exists(CACHES_PATH):
os.makedirs(CACHES_PATH)
subprocess.check_call(f"sudo chown -R ubuntu:ubuntu {CACHES_PATH}", shell=True)
cache_path = os.path.join(CACHES_PATH, "fasttest")
logging.info("Will try to fetch cache for our build")
ccache_for_pr = get_ccache_if_not_exists(
cache_path, s3_helper, pr_info.number, temp_path, pr_info.release_pr
)
upload_master_ccache = ccache_for_pr in (-1, 0)
if not os.path.exists(cache_path):
logging.info("cache was not fetched, will create empty dir")
os.makedirs(cache_path)
repo_path = os.path.join(temp_path, "fasttest-repo")
if not os.path.exists(repo_path):
os.makedirs(repo_path)
@ -151,7 +133,6 @@ def main():
run_cmd = get_fasttest_cmd(
workspace,
output_path,
cache_path,
repo_path,
pr_info.number,
pr_info.sha,
@ -172,7 +153,6 @@ def main():
logging.info("Run failed")
subprocess.check_call(f"sudo chown -R ubuntu:ubuntu {temp_path}", shell=True)
subprocess.check_call(f"sudo chown -R ubuntu:ubuntu {cache_path}", shell=True)
test_output_files = os.listdir(output_path)
additional_logs = []
@ -202,12 +182,6 @@ def main():
else:
state, description, test_results, additional_logs = process_results(output_path)
logging.info("Will upload cache")
upload_ccache(cache_path, s3_helper, pr_info.number, temp_path)
if upload_master_ccache:
logging.info("Will upload a fallback cache for master")
upload_ccache(cache_path, s3_helper, 0, temp_path)
ch_helper = ClickHouseHelper()
mark_flaky_tests(ch_helper, NAME, test_results)


@ -125,19 +125,29 @@ def test_concurrent_backups_on_same_node():
.query(f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name} ASYNC")
.split("\t")[0]
)
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE status == 'CREATING_BACKUP' AND id = '{id}'",
"CREATING_BACKUP",
status = (
nodes[0]
.query(f"SELECT status FROM system.backups WHERE id == '{id}'")
.rstrip("\n")
)
assert "Concurrent backups not supported" in nodes[0].query_and_get_error(
assert status in ["CREATING_BACKUP", "BACKUP_CREATED"]
error = nodes[0].query_and_get_error(
f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name}"
)
expected_errors = [
"Concurrent backups not supported",
f"Backup {backup_name} already exists",
]
assert any([expected_error in error for expected_error in expected_errors])
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE status == 'BACKUP_CREATED' AND id = '{id}'",
f"SELECT status FROM system.backups WHERE id = '{id}'",
"BACKUP_CREATED",
sleep_time=2,
retry_count=50,
)
# This restore part is added to confirm that creating an internal backup & restore works
@ -161,18 +171,29 @@ def test_concurrent_backups_on_different_nodes():
.query(f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name} ASYNC")
.split("\t")[0]
)
assert_eq_with_retry(
nodes[1],
f"SELECT status FROM system.backups WHERE status == 'CREATING_BACKUP' AND id = '{id}'",
"CREATING_BACKUP",
status = (
nodes[1]
.query(f"SELECT status FROM system.backups WHERE id == '{id}'")
.rstrip("\n")
)
assert "Concurrent backups not supported" in nodes[0].query_and_get_error(
assert status in ["CREATING_BACKUP", "BACKUP_CREATED"]
error = nodes[0].query_and_get_error(
f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name}"
)
expected_errors = [
"Concurrent backups not supported",
f"Backup {backup_name} already exists",
]
assert any([expected_error in error for expected_error in expected_errors])
assert_eq_with_retry(
nodes[1],
f"SELECT status FROM system.backups WHERE status == 'BACKUP_CREATED' AND id = '{id}'",
f"SELECT status FROM system.backups WHERE id = '{id}'",
"BACKUP_CREATED",
sleep_time=2,
retry_count=50,
)
@ -181,22 +202,7 @@ def test_concurrent_restores_on_same_node():
backup_name = new_backup_name()
id = (
nodes[0]
.query(f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name} ASYNC")
.split("\t")[0]
)
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE status == 'CREATING_BACKUP' AND id = '{id}'",
"CREATING_BACKUP",
)
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE status == 'BACKUP_CREATED' AND id = '{id}'",
"BACKUP_CREATED",
)
nodes[0].query(f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name}")
nodes[0].query(
f"DROP TABLE tbl ON CLUSTER 'cluster' NO DELAY",
@ -218,22 +224,21 @@ def test_concurrent_restores_on_same_node():
)
assert status in ["RESTORING", "RESTORED"]
concurrent_error = nodes[0].query_and_get_error(
error = nodes[0].query_and_get_error(
f"RESTORE TABLE tbl ON CLUSTER 'cluster' FROM {backup_name}"
)
expected_errors = [
"Concurrent restores not supported",
"Cannot restore the table default.tbl because it already contains some data",
]
assert any(
[expected_error in concurrent_error for expected_error in expected_errors]
)
assert any([expected_error in error for expected_error in expected_errors])
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE id == '{restore_id}'",
"RESTORED",
sleep_time=2,
retry_count=50,
)
@ -242,22 +247,7 @@ def test_concurrent_restores_on_different_node():
backup_name = new_backup_name()
id = (
nodes[0]
.query(f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name} ASYNC")
.split("\t")[0]
)
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE status == 'CREATING_BACKUP' AND id = '{id}'",
"CREATING_BACKUP",
)
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE status == 'BACKUP_CREATED' AND id = '{id}'",
"BACKUP_CREATED",
)
nodes[0].query(f"BACKUP TABLE tbl ON CLUSTER 'cluster' TO {backup_name}")
nodes[0].query(
f"DROP TABLE tbl ON CLUSTER 'cluster' NO DELAY",
@ -279,20 +269,19 @@ def test_concurrent_restores_on_different_node():
)
assert status in ["RESTORING", "RESTORED"]
concurrent_error = nodes[1].query_and_get_error(
error = nodes[1].query_and_get_error(
f"RESTORE TABLE tbl ON CLUSTER 'cluster' FROM {backup_name}"
)
expected_errors = [
"Concurrent restores not supported",
"Cannot restore the table default.tbl because it already contains some data",
]
assert any(
[expected_error in concurrent_error for expected_error in expected_errors]
)
assert any([expected_error in error for expected_error in expected_errors])
assert_eq_with_retry(
nodes[0],
f"SELECT status FROM system.backups WHERE id == '{restore_id}'",
"RESTORED",
sleep_time=2,
retry_count=50,
)


@ -12,6 +12,6 @@ INSERT INTO cast_enums SELECT 2 AS type, toDate('2017-01-01') AS date, number AS
SELECT type, date, id FROM cast_enums ORDER BY type, id;
INSERT INTO cast_enums VALUES ('wrong_value', '2017-01-02', 7); -- { clientError 36 }
INSERT INTO cast_enums VALUES ('wrong_value', '2017-01-02', 7); -- { clientError 691 }
DROP TABLE IF EXISTS cast_enums;


@ -62,49 +62,49 @@ SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explai
--- EXPLAIN QUERY TREE
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2 WHERE t1.a = t2.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2 WHERE t1.a = t2.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2 WHERE t1.b = t2.b);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2 WHERE t1.b = t2.b SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3 WHERE t1.a = t2.a AND t1.a = t3.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3 WHERE t1.a = t2.a AND t1.a = t3.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3 WHERE t1.b = t2.b AND t1.b = t3.b);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3 WHERE t1.b = t2.b AND t1.b = t3.b SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t1.a = t2.a AND t1.a = t3.a AND t1.a = t4.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t1.a = t2.a AND t1.a = t3.a AND t1.a = t4.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t1.b = t2.b AND t1.b = t3.b AND t1.b = t4.b);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t1.b = t2.b AND t1.b = t3.b AND t1.b = t4.b SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t2.a = t1.a AND t2.a = t3.a AND t2.a = t4.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t2.a = t1.a AND t2.a = t3.a AND t2.a = t4.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t3.a = t1.a AND t3.a = t2.a AND t3.a = t4.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t3.a = t1.a AND t3.a = t2.a AND t3.a = t4.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t4.a = t1.a AND t4.a = t2.a AND t4.a = t3.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t4.a = t1.a AND t4.a = t2.a AND t4.a = t3.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t1.a = t2.a AND t2.a = t3.a AND t3.a = t4.a);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 WHERE t1.a = t2.a AND t2.a = t3.a AND t3.a = t4.a SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2, t3, t4 SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1 CROSS JOIN t2 CROSS JOIN t3 CROSS JOIN t4);
EXPLAIN QUERY TREE SELECT t1.a FROM t1 CROSS JOIN t2 CROSS JOIN t3 CROSS JOIN t4 SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2 CROSS JOIN t3);
EXPLAIN QUERY TREE SELECT t1.a FROM t1, t2 CROSS JOIN t3 SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1 JOIN t2 USING a CROSS JOIN t3);
EXPLAIN QUERY TREE SELECT t1.a FROM t1 JOIN t2 USING a CROSS JOIN t3 SETTINGS allow_experimental_analyzer = 1);
SELECT countIf(explain like '%COMMA%' OR explain like '%CROSS%'), countIf(explain like '%INNER%') FROM (
EXPLAIN QUERY TREE SELECT t1.a FROM t1 JOIN t2 ON t1.a = t2.a CROSS JOIN t3);
EXPLAIN QUERY TREE SELECT t1.a FROM t1 JOIN t2 ON t1.a = t2.a CROSS JOIN t3 SETTINGS allow_experimental_analyzer = 1);
INSERT INTO t1 values (1,1), (2,2), (3,3), (4,4);
INSERT INTO t2 values (1,1), (1, Null);


@ -1,2 +1,3 @@
test_shard_localhost
test_shard_localhost 1 1 1 localhost ::1 9000 1 default
test_cluster_one_shard_two_replicas 1 1 1 127.0.0.1 127.0.0.1 9000 1 default
test_cluster_one_shard_two_replicas 1 1 2 127.0.0.2 127.0.0.2 9000 0 default


@ -6,4 +6,5 @@ CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
$CLICKHOUSE_CLIENT -q "show clusters like 'test_shard%' limit 1"
# cluster,shard_num,shard_weight,replica_num,host_name,host_address,port,is_local,user,default_database[,errors_count,slowdowns_count,estimated_recovery_time]
$CLICKHOUSE_CLIENT -q "show cluster 'test_shard_localhost'" | cut -f-10
# use a cluster with static IPv4
$CLICKHOUSE_CLIENT -q "show cluster 'test_cluster_one_shard_two_replicas'" | cut -f-10


@ -3,6 +3,6 @@ connect_timeout Seconds 10
connect_timeout_with_failover_ms Milliseconds 2000
connect_timeout_with_failover_secure_ms Milliseconds 3000
external_storage_connect_timeout_sec UInt64 10
filesystem_prefetch_max_memory_usage UInt64 1073741824
max_untracked_memory UInt64 1048576
memory_profiler_step UInt64 1048576
filesystem_prefetch_max_memory_usage UInt64 1073741824


@ -1,5 +1,5 @@
set optimize_group_by_function_keys = 1;
set allow_experimental_analyzer = 1;
-- { echoOn }
SELECT round(avg(log(2) * number), 6) AS k FROM numbers(10000000) GROUP BY (number % 2) * (number % 3), number % 3, number % 2 HAVING avg(log(2) * number) > 3465735.3 ORDER BY k;


@ -3,4 +3,4 @@ INSERT INTO enum VALUES ('hello');
SELECT count() FROM enum WHERE x = 'hello';
SELECT count() FROM enum WHERE x = 'world';
SELECT count() FROM enum WHERE x = 'xyz'; -- { serverError 36 }
SELECT count() FROM enum WHERE x = 'xyz'; -- { serverError 691 }


@ -65,6 +65,7 @@ QUERY id: 0
SORT id: 12, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 7, column_name: number, result_type: UInt64, source_id: 8
SETTINGS allow_experimental_analyzer=1
SELECT groupArray(x)
FROM
(
@ -98,6 +99,7 @@ QUERY id: 0
SORT id: 12, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 7, column_name: number, result_type: UInt64, source_id: 8
SETTINGS allow_experimental_analyzer=1
SELECT groupArray(x)
FROM
(
@ -139,6 +141,7 @@ QUERY id: 0
SORT id: 15, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 7, column_name: number, result_type: UInt64, source_id: 8
SETTINGS allow_experimental_analyzer=1
SELECT
key,
a,
@ -200,6 +203,7 @@ QUERY id: 0
SORT id: 25, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 26, column_name: key, result_type: UInt64, source_id: 5
SETTINGS allow_experimental_analyzer=1
SELECT
key,
a
@ -225,6 +229,7 @@ QUERY id: 0
SORT id: 7, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 4, column_name: a, result_type: UInt8, source_id: 3
SETTINGS allow_experimental_analyzer=1
SELECT
key,
a
@ -257,6 +262,7 @@ QUERY id: 0
LIST id: 11, nodes: 2
COLUMN id: 2, column_name: key, result_type: UInt64, source_id: 3
COLUMN id: 4, column_name: a, result_type: UInt8, source_id: 3
SETTINGS allow_experimental_analyzer=1
QUERY id: 0
PROJECTION COLUMNS
key UInt64
@ -279,6 +285,7 @@ QUERY id: 0
SORT id: 10, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 2, column_name: key, result_type: UInt64, source_id: 3
SETTINGS allow_experimental_analyzer=1
QUERY id: 0
PROJECTION COLUMNS
t1.id UInt64
@ -307,6 +314,7 @@ QUERY id: 0
SORT id: 14, sort_direction: ASCENDING, with_fill: 0
EXPRESSION
COLUMN id: 15, column_name: id, result_type: UInt64, source_id: 5
SETTINGS allow_experimental_analyzer=1
[0,1,2]
[0,1,2]
[0,1,2]


@ -20,25 +20,25 @@ SELECT key, a FROM test ORDER BY key, a, exp(key + a) SETTINGS allow_experimenta
SELECT key, a FROM test ORDER BY key, exp(key + a);
SELECT key, a FROM test ORDER BY key, exp(key + a) SETTINGS allow_experimental_analyzer=1;
EXPLAIN SYNTAX SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY x, exp(x));
EXPLAIN QUERY TREE run_passes=1 SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY x, exp(x));
EXPLAIN QUERY TREE run_passes=1 SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY x, exp(x)) settings allow_experimental_analyzer=1;
EXPLAIN SYNTAX SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY x, exp(exp(x)));
EXPLAIN QUERY TREE run_passes=1 SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY x, exp(exp(x)));
EXPLAIN QUERY TREE run_passes=1 SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY x, exp(exp(x))) settings allow_experimental_analyzer=1;
EXPLAIN SYNTAX SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY exp(x), x);
EXPLAIN QUERY TREE run_passes=1 SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY exp(x), x);
EXPLAIN QUERY TREE run_passes=1 SELECT groupArray(x) from (SELECT number as x FROM numbers(3) ORDER BY exp(x), x) settings allow_experimental_analyzer=1;
EXPLAIN SYNTAX SELECT * FROM (SELECT number + 2 AS key FROM numbers(4)) s FULL JOIN test t USING(key) ORDER BY s.key, t.key;
EXPLAIN QUERY TREE run_passes=1 SELECT * FROM (SELECT number + 2 AS key FROM numbers(4)) s FULL JOIN test t USING(key) ORDER BY s.key, t.key;
EXPLAIN QUERY TREE run_passes=1 SELECT * FROM (SELECT number + 2 AS key FROM numbers(4)) s FULL JOIN test t USING(key) ORDER BY s.key, t.key settings allow_experimental_analyzer=1;
EXPLAIN SYNTAX SELECT key, a FROM test ORDER BY key, a, exp(key + a);
EXPLAIN QUERY TREE run_passes=1 SELECT key, a FROM test ORDER BY key, a, exp(key + a);
EXPLAIN QUERY TREE run_passes=1 SELECT key, a FROM test ORDER BY key, a, exp(key + a) settings allow_experimental_analyzer=1;
EXPLAIN SYNTAX SELECT key, a FROM test ORDER BY key, exp(key + a);
EXPLAIN QUERY TREE run_passes=1 SELECT key, a FROM test ORDER BY key, exp(key + a);
EXPLAIN QUERY TREE run_passes=1 SELECT key FROM test GROUP BY key ORDER BY avg(a), key;
EXPLAIN QUERY TREE run_passes=1 SELECT key, a FROM test ORDER BY key, exp(key + a) settings allow_experimental_analyzer=1;
EXPLAIN QUERY TREE run_passes=1 SELECT key FROM test GROUP BY key ORDER BY avg(a), key settings allow_experimental_analyzer=1;
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
CREATE TABLE t1 (id UInt64) ENGINE = MergeTree() ORDER BY id;
CREATE TABLE t2 (id UInt64) ENGINE = MergeTree() ORDER BY id;
EXPLAIN QUERY TREE run_passes=1 SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id ORDER BY t1.id, t2.id;
EXPLAIN QUERY TREE run_passes=1 SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id ORDER BY t1.id, t2.id settings allow_experimental_analyzer=1;
set optimize_redundant_functions_in_order_by = 0;


@ -5,8 +5,8 @@ SELECT CAST(CAST(NULL AS Nullable(String)) AS Nullable(Enum8('Hello' = 1)));
SELECT CAST(CAST(NULL AS Nullable(FixedString(1))) AS Nullable(Enum8('Hello' = 1)));
-- empty string still not acceptable
SELECT CAST(CAST('' AS Nullable(String)) AS Nullable(Enum8('Hello' = 1))); -- { serverError 36 }
SELECT CAST(CAST('' AS Nullable(FixedString(1))) AS Nullable(Enum8('Hello' = 1))); -- { serverError 36 }
SELECT CAST(CAST('' AS Nullable(String)) AS Nullable(Enum8('Hello' = 1))); -- { serverError 691 }
SELECT CAST(CAST('' AS Nullable(FixedString(1))) AS Nullable(Enum8('Hello' = 1))); -- { serverError 691 }
-- non-Nullable Enum() still not acceptable
SELECT CAST(CAST(NULL AS Nullable(String)) AS Enum8('Hello' = 1)); -- { serverError 349 }


@ -1,3 +1,5 @@
set allow_experimental_analyzer = 1;
EXPLAIN QUERY TREE run_passes=1
SELECT avg(log(2) * number) AS k FROM numbers(10000000)
GROUP BY GROUPING SETS (((number % 2) * (number % 3)), number % 3, number % 2)


@ -20,3 +20,7 @@ DROP TABLE t_query_id_header
< Content-Type: text/plain; charset=UTF-8
< X-ClickHouse-Query-Id: query_id
< X-ClickHouse-Timezone: timezone
BAD SQL
< Content-Type: text/plain; charset=UTF-8
< X-ClickHouse-Query-Id: query_id
< X-ClickHouse-Timezone: timezone


@ -28,3 +28,4 @@ run_and_check_headers "INSERT INTO t_query_id_header VALUES (1)"
run_and_check_headers "EXISTS TABLE t_query_id_header"
run_and_check_headers "SELECT * FROM t_query_id_header"
run_and_check_headers "DROP TABLE t_query_id_header"
run_and_check_headers "BAD SQL"


@ -0,0 +1 @@
0


@ -0,0 +1 @@
SELECT isIPv6String('1234::1234:');


@ -0,0 +1 @@
1000000


@ -0,0 +1,10 @@
-- Tags: no-random-merge-tree-settings
set max_threads = 16;
create table t(a UInt32) engine=MergeTree order by tuple() partition by a % 16;
insert into t select * from numbers_mt(1e6);
set allow_aggregate_partitions_independently=1, force_aggregate_partitions_independently=1;
select count(distinct a) from t;


@ -0,0 +1,2 @@
Hello
World


@ -0,0 +1,15 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CURDIR"/../shell_config.sh
$CLICKHOUSE_CLIENT --multiquery --query "DROP TABLE IF EXISTS t; CREATE TABLE t (x Enum('Hello' = 1, 'World' = 2)) ENGINE = Memory;"
$CLICKHOUSE_CLIENT --input_format_allow_errors_num 1 --query "INSERT INTO t FORMAT CSV" <<END
Hello
Goodbye
World
END
$CLICKHOUSE_CLIENT --query "SELECT x FROM t ORDER BY x"
$CLICKHOUSE_CLIENT --query "DROP TABLE t"


@ -0,0 +1,58 @@
1 test
3 another
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02702_logical_optimizer
WHERE
FUNCTION id: 5, function_name: or, function_type: ordinary, result_type: Nullable(UInt8)
ARGUMENTS
LIST id: 6, nodes: 3
FUNCTION id: 7, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 8, nodes: 2
COLUMN id: 9, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 10, constant_value: UInt64_1, constant_value_type: UInt8
FUNCTION id: 11, function_name: equals, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 12, nodes: 2
CONSTANT id: 13, constant_value: UInt64_3, constant_value_type: UInt8
COLUMN id: 9, column_name: a, result_type: Int32, source_id: 3
FUNCTION id: 14, function_name: equals, function_type: ordinary, result_type: Nullable(Nothing)
ARGUMENTS
LIST id: 15, nodes: 2
CONSTANT id: 16, constant_value: NULL, constant_value_type: Nullable(Nothing)
COLUMN id: 9, column_name: a, result_type: Int32, source_id: 3
1 test
2 test2
3 another
QUERY id: 0
PROJECTION COLUMNS
a Int32
b LowCardinality(String)
PROJECTION
LIST id: 1, nodes: 2
COLUMN id: 2, column_name: a, result_type: Int32, source_id: 3
COLUMN id: 4, column_name: b, result_type: LowCardinality(String), source_id: 3
JOIN TREE
TABLE id: 3, table_name: default.02702_logical_optimizer
WHERE
FUNCTION id: 5, function_name: or, function_type: ordinary, result_type: Nullable(UInt8)
ARGUMENTS
LIST id: 6, nodes: 2
FUNCTION id: 7, function_name: equals, function_type: ordinary, result_type: Nullable(Nothing)
ARGUMENTS
LIST id: 8, nodes: 2
COLUMN id: 9, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 10, constant_value: NULL, constant_value_type: Nullable(Nothing)
FUNCTION id: 11, function_name: in, function_type: ordinary, result_type: UInt8
ARGUMENTS
LIST id: 12, nodes: 2
COLUMN id: 9, column_name: a, result_type: Int32, source_id: 3
CONSTANT id: 13, constant_value: Tuple_(UInt64_1, UInt64_3, UInt64_2), constant_value_type: Tuple(UInt8, UInt8, UInt8)


@ -0,0 +1,17 @@
SET allow_experimental_analyzer = 1;
DROP TABLE IF EXISTS 02702_logical_optimizer;
CREATE TABLE 02702_logical_optimizer
(a Int32, b LowCardinality(String))
ENGINE=Memory;
INSERT INTO 02702_logical_optimizer VALUES (1, 'test'), (2, 'test2'), (3, 'another');
SET optimize_min_equality_disjunction_chain_length = 3;
SELECT * FROM 02702_logical_optimizer WHERE a = 1 OR 3 = a OR NULL = a;
EXPLAIN QUERY TREE SELECT * FROM 02702_logical_optimizer WHERE a = 1 OR 3 = a OR NULL = a;
SELECT * FROM 02702_logical_optimizer WHERE a = 1 OR 3 = a OR 2 = a OR a = NULL;
EXPLAIN QUERY TREE SELECT * FROM 02702_logical_optimizer WHERE a = 1 OR 3 = a OR 2 = a OR a = NULL;


@ -0,0 +1,2 @@
set allow_experimental_analyzer=0;
EXPLAIN QUERY TREE run_passes = true, dump_passes = true SELECT 1; -- { serverError NOT_IMPLEMENTED }

Some files were not shown because too many files have changed in this diff.