diff --git a/CHANGELOG.md b/CHANGELOG.md
index 01bbafe2f16..a09e4d4fe24 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,10 +1,194 @@
### Table of Contents
+**[ClickHouse release v23.3 LTS, 2023-03-30](#233)**
**[ClickHouse release v23.2, 2023-02-23](#232)**
**[ClickHouse release v23.1, 2023-01-25](#231)**
**[Changelog for 2022](https://clickhouse.com/docs/en/whats-new/changelog/2022/)**
# 2023 Changelog
+### ClickHouse release 23.3 LTS, 2023-03-30
+
+#### Upgrade Notes
+* The behavior of `*domain*RFC` and `netloc` functions is slightly changed: relaxed the set of symbols that are allowed in the URL authority for better conformance. [#46841](https://github.com/ClickHouse/ClickHouse/pull/46841) ([Azat Khuzhin](https://github.com/azat)).
+* Prohibited creating tables based on KafkaEngine with DEFAULT/EPHEMERAL/ALIAS/MATERIALIZED statements for columns. [#47138](https://github.com/ClickHouse/ClickHouse/pull/47138) ([Aleksandr Musorin](https://github.com/AVMusorin)).
+* An "asynchronous connection drain" feature is removed. Related settings and metrics are removed as well. It was an internal feature, so the removal should not affect users who had never heard about that feature. [#47486](https://github.com/ClickHouse/ClickHouse/pull/47486) ([Alexander Tokmakov](https://github.com/tavplubix)).
+* Support 256-bit Decimal data type (more than 38 digits) in `arraySum`/`Min`/`Max`/`Avg`/`Product`, `arrayCumSum`/`CumSumNonNegative`, `arrayDifference`, array construction, IN operator, query parameters, `groupArrayMovingSum`, statistical functions, `min`/`max`/`any`/`argMin`/`argMax`, PostgreSQL wire protocol, MySQL table engine and function, `sumMap`, `mapAdd`, `mapSubtract`, `arrayIntersect`. Add support for big integers in `arrayIntersect`. Statistical aggregate functions involving moments (such as `corr` or various `TTest`s) will use `Float64` as their internal representation (they were using `Decimal128` before this change, but it was pointless), and these functions can return `nan` instead of `inf` in case of infinite variance. Some functions were allowed on `Decimal256` data types but returned `Decimal128` in previous versions - now it is fixed. This closes [#47569](https://github.com/ClickHouse/ClickHouse/issues/47569). This closes [#44864](https://github.com/ClickHouse/ClickHouse/issues/44864). This closes [#28335](https://github.com/ClickHouse/ClickHouse/issues/28335). [#47594](https://github.com/ClickHouse/ClickHouse/pull/47594) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
+* Make `backup_threads`/`restore_threads` server settings (instead of user settings). [#47881](https://github.com/ClickHouse/ClickHouse/pull/47881) ([Azat Khuzhin](https://github.com/azat)).
+* Do not allow secondary indices with constant or non-deterministic expressions. [#46839](https://github.com/ClickHouse/ClickHouse/pull/46839) ([Anton Popov](https://github.com/CurtizJ)).
+
+#### New Feature
+* Add new mode for splitting the work on replicas using settings `parallel_replicas_custom_key` and `parallel_replicas_custom_key_filter_type`. If the cluster consists of a single shard with multiple replicas, up to `max_parallel_replicas` replicas will be randomly picked and turned into shards. For each shard, a corresponding filter is added to the query on the initiator before being sent to the shard. If the cluster consists of multiple shards, it will behave the same as `sample_key` but with the possibility to define an arbitrary key. [#45108](https://github.com/ClickHouse/ClickHouse/pull/45108) ([Antonio Andelic](https://github.com/antonio2368)).
+* An option to display partial result on cancel: Added query setting `stop_reading_on_first_cancel` allowing the canceled query (e.g. due to Ctrl-C) to return a partial result. [#45689](https://github.com/ClickHouse/ClickHouse/pull/45689) ([Alexey Perevyshin](https://github.com/alexX512)).
+* Added support for arbitrary table engines for temporary tables (except for Replicated and KeeperMap engines). Closes [#31497](https://github.com/ClickHouse/ClickHouse/issues/31497). [#46071](https://github.com/ClickHouse/ClickHouse/pull/46071) ([Roman Vasin](https://github.com/rvasin)).
+* Add support for replication of user-defined SQL functions using a centralized storage in Keeper. [#46085](https://github.com/ClickHouse/ClickHouse/pull/46085) ([Aleksei Filatov](https://github.com/aalexfvk)).
+* Implement `system.server_settings` (similar to `system.settings`), which will contain server configurations. [#46550](https://github.com/ClickHouse/ClickHouse/pull/46550) ([pufit](https://github.com/pufit)).
+* Support for `UNDROP TABLE` query. Closes [#46811](https://github.com/ClickHouse/ClickHouse/issues/46811). [#47241](https://github.com/ClickHouse/ClickHouse/pull/47241) ([chen](https://github.com/xiedeyantu)).
+* Allow separate grants for named collections (e.g. to be able to give `SHOW/CREATE/ALTER/DROP named collection` access only to certain collections, instead of all at once). Closes [#40894](https://github.com/ClickHouse/ClickHouse/issues/40894). Adds a new access type `NAMED_COLLECTION_CONTROL`, which is not granted to the default user unless explicitly added to the user config (it is required to be able to do `GRANT ALL`). Also, `show_named_collections` no longer needs to be specified manually for the default user to have full access rights, as was the case in 23.2. [#46241](https://github.com/ClickHouse/ClickHouse/pull/46241) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Allow nested custom disks. Previously custom disks supported only flat disk structure. [#47106](https://github.com/ClickHouse/ClickHouse/pull/47106) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Introduce a function `widthBucket` (with a `WIDTH_BUCKET` alias for compatibility); see the examples after this list. [#42974](https://github.com/ClickHouse/ClickHouse/issues/42974). [#46790](https://github.com/ClickHouse/ClickHouse/pull/46790) ([avoiderboi](https://github.com/avoiderboi)).
+* Add new functions `parseDateTime`/`parseDateTimeInJodaSyntax` that parse a string into a DateTime according to a specified format string: `parseDateTime` uses MySQL format syntax, while `parseDateTimeInJodaSyntax` uses Joda syntax. [#46815](https://github.com/ClickHouse/ClickHouse/pull/46815) ([李扬](https://github.com/taiyang-li)).
+* Use `dummy UInt8` for default structure of table function `null`. Closes [#46930](https://github.com/ClickHouse/ClickHouse/issues/46930). [#47006](https://github.com/ClickHouse/ClickHouse/pull/47006) ([flynn](https://github.com/ucasfl)).
+* Support for date format with a comma, like `Dec 15, 2021` in the `parseDateTimeBestEffort` function. Closes [#46816](https://github.com/ClickHouse/ClickHouse/issues/46816). [#47071](https://github.com/ClickHouse/ClickHouse/pull/47071) ([chen](https://github.com/xiedeyantu)).
+* Add settings `http_wait_end_of_query` and `http_response_buffer_size` that correspond to the URL params `wait_end_of_query` and `buffer_size` for the HTTP interface. This allows changing these settings in profiles. [#47108](https://github.com/ClickHouse/ClickHouse/pull/47108) ([Vladimir C](https://github.com/vdimir)).
+* Add `system.dropped_tables` table that shows tables that were dropped from `Atomic` databases but were not completely removed yet. [#47364](https://github.com/ClickHouse/ClickHouse/pull/47364) ([chen](https://github.com/xiedeyantu)).
+* Add `INSTR` as alias of `positionCaseInsensitive` for MySQL compatibility. Closes [#47529](https://github.com/ClickHouse/ClickHouse/issues/47529). [#47535](https://github.com/ClickHouse/ClickHouse/pull/47535) ([flynn](https://github.com/ucasfl)).
+* Added `toDecimalString` function that converts numbers to strings with fixed precision. [#47838](https://github.com/ClickHouse/ClickHouse/pull/47838) ([Andrey Zvonov](https://github.com/zvonand)).
+* Add a merge tree setting `max_number_of_mutations_for_replica`. It limits the number of part mutations per replica to the specified amount. Zero means no limit on the number of mutations per replica (the execution can still be constrained by other settings). [#48047](https://github.com/ClickHouse/ClickHouse/pull/48047) ([Vladimir C](https://github.com/vdimir)).
+* Add Map-related function `mapFromArrays`, which allows creating a map from a pair of arrays. [#31125](https://github.com/ClickHouse/ClickHouse/pull/31125) ([李扬](https://github.com/taiyang-li)).
+* Allow controlling compression in Parquet/ORC/Arrow output formats, and support more compression methods for input formats. This closes [#13541](https://github.com/ClickHouse/ClickHouse/issues/13541). [#47114](https://github.com/ClickHouse/ClickHouse/pull/47114) ([Kruglov Pavel](https://github.com/Avogar)).
+* Add SSL User Certificate authentication to the native protocol. Closes [#47077](https://github.com/ClickHouse/ClickHouse/issues/47077). [#47596](https://github.com/ClickHouse/ClickHouse/pull/47596) ([Nikolay Degterinsky](https://github.com/evillique)).
+* Add `*OrNull()` and `*OrZero()` variants for `parseDateTime`; add alias `str_to_date` for MySQL parity. [#48000](https://github.com/ClickHouse/ClickHouse/pull/48000) ([Robert Schulze](https://github.com/rschu1ze)).
+* Added operator `REGEXP` (similar to the operators `LIKE`, `IN`, `MOD`, etc.) for better compatibility with MySQL. [#47869](https://github.com/ClickHouse/ClickHouse/pull/47869) ([Robert Schulze](https://github.com/rschu1ze)).
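+
+A few illustrative queries for the new functions above (a minimal sketch; the results in the comments are indicative only):
+
+```sql
+-- widthBucket: which of 4 equal-width buckets over [0, 20) contains 10.15
+SELECT widthBucket(10.15, 0, 20, 4);                                  -- 3
+-- parseDateTime: MySQL-style format string
+SELECT parseDateTime('2023-03-30 23:59:59', '%Y-%m-%d %H:%i:%s');
+-- mapFromArrays: build a Map from a pair of arrays
+SELECT mapFromArrays(['a', 'b'], [1, 2]);                             -- {'a':1,'b':2}
+-- INSTR and REGEXP for MySQL compatibility
+SELECT INSTR('ClickHouse', 'house'), 'ClickHouse' REGEXP '^Click';
+-- toDecimalString: number to string with fixed precision
+SELECT toDecimalString(pi(), 4);                                      -- '3.1416'
+```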
+
+#### Performance Improvement
+* Marks in memory are now compressed, using 3-6x less memory. [#47290](https://github.com/ClickHouse/ClickHouse/pull/47290) ([Michael Kolupaev](https://github.com/al13n321)).
+* Backups for large numbers of files were unbelievably slow in previous versions. Not anymore. Now they are unbelievably fast. [#47251](https://github.com/ClickHouse/ClickHouse/pull/47251) ([Alexey Milovidov](https://github.com/alexey-milovidov)). Introduced a separate thread pool for backup IO operations. This allows scaling it independently of other pools and increases performance. [#47174](https://github.com/ClickHouse/ClickHouse/pull/47174) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). Use MultiRead requests and retries for collecting metadata at the final stage of backup processing. [#47243](https://github.com/ClickHouse/ClickHouse/pull/47243) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). If both the backup and the data being restored are in S3, server-side copy is now used. [#47546](https://github.com/ClickHouse/ClickHouse/pull/47546) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Fixed excessive reading in queries with `FINAL`. [#47801](https://github.com/ClickHouse/ClickHouse/pull/47801) ([Nikita Taranov](https://github.com/nickitat)).
+* The setting `max_final_threads` is now set to the number of cores at server startup (using the same algorithm as for `max_threads`). This improves the concurrency of `FINAL` execution on servers with a high number of CPUs. [#47915](https://github.com/ClickHouse/ClickHouse/pull/47915) ([Nikita Taranov](https://github.com/nickitat)).
+* Allow executing the reading pipeline for a DIRECT dictionary with a CLICKHOUSE source in multiple threads. To enable it, set `dictionary_use_async_executor=1` in the `SETTINGS` section of the source in the `CREATE DICTIONARY` statement. [#47986](https://github.com/ClickHouse/ClickHouse/pull/47986) ([Vladimir C](https://github.com/vdimir)).
+* Optimize aggregation performance when there is a single nullable key. [#45772](https://github.com/ClickHouse/ClickHouse/pull/45772) ([LiuNeng](https://github.com/liuneng1994)).
+* Implemented lowercase `tokenbf_v1` index utilization for `hasTokenOrNull`, `hasTokenCaseInsensitive` and `hasTokenCaseInsensitiveOrNull`. [#46252](https://github.com/ClickHouse/ClickHouse/pull/46252) ([ltrk2](https://github.com/ltrk2)).
+* Optimize functions `position` and `LIKE` by searching the first two chars using SIMD. [#46289](https://github.com/ClickHouse/ClickHouse/pull/46289) ([Jiebin Sun](https://github.com/jiebinn)).
+* Optimize queries from the `system.detached_parts` table, which can be significantly large. Added several sources with respect to the block size limitation; in each block, an IO thread pool is used to calculate the part size, i.e. to make syscalls in parallel. [#46624](https://github.com/ClickHouse/ClickHouse/pull/46624) ([Sema Checherinda](https://github.com/CheSema)).
+* Increase the default value of `max_replicated_merges_in_queue` for ReplicatedMergeTree tables from 16 to 1000. It allows faster background merge operation on clusters with a very large number of replicas, such as clusters with shared storage in ClickHouse Cloud. [#47050](https://github.com/ClickHouse/ClickHouse/pull/47050) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
+* Updated `clickhouse-copier` to use `GROUP BY` instead of `DISTINCT` to get the list of partitions. For large tables, this reduced the select time from over 500s to under 1s. [#47386](https://github.com/ClickHouse/ClickHouse/pull/47386) ([Clayton McClure](https://github.com/cmcclure-twilio)).
+* Fix performance degradation in `ASOF JOIN`. [#47544](https://github.com/ClickHouse/ClickHouse/pull/47544) ([Ongkong](https://github.com/ongkong)).
+* Even more batching in Keeper. Avoid breaking batches on read requests to improve performance. [#47978](https://github.com/ClickHouse/ClickHouse/pull/47978) ([Antonio Andelic](https://github.com/antonio2368)).
+* Allow `PREWHERE` for `Merge` tables with different `DEFAULT` expressions for a column. [#46831](https://github.com/ClickHouse/ClickHouse/pull/46831) ([Azat Khuzhin](https://github.com/azat)).
+
+#### Experimental Feature
+* Parallel replicas: Improved the overall performance by better utilizing the local replica, and forbid reading with parallel replicas from non-replicated MergeTree by default. [#47858](https://github.com/ClickHouse/ClickHouse/pull/47858) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Support filter push down to left table for JOIN with `Join`, `Dictionary` and `EmbeddedRocksDB` tables if the experimental Analyzer is enabled. [#47280](https://github.com/ClickHouse/ClickHouse/pull/47280) ([Maksim Kita](https://github.com/kitaisreal)).
+* ReplicatedMergeTree with zero-copy replication now puts less load on Keeper. [#47676](https://github.com/ClickHouse/ClickHouse/pull/47676) ([alesapin](https://github.com/alesapin)).
+* Fix creating a materialized view with MaterializedPostgreSQL. [#40807](https://github.com/ClickHouse/ClickHouse/pull/40807) ([Maksim Buren](https://github.com/maks-buren630501)).
+
+#### Improvement
+* Enable `input_format_json_ignore_unknown_keys_in_named_tuple` by default. [#46742](https://github.com/ClickHouse/ClickHouse/pull/46742) ([Kruglov Pavel](https://github.com/Avogar)).
+* Allow ignoring errors while pushing to MATERIALIZED VIEW (add new setting `materialized_views_ignore_errors`, `false` by default; it is unconditionally set to `true` for flushing logs to `system.*_log` tables). [#46658](https://github.com/ClickHouse/ClickHouse/pull/46658) ([Azat Khuzhin](https://github.com/azat)).
+* Track the file queue of distributed sends in memory. [#45491](https://github.com/ClickHouse/ClickHouse/pull/45491) ([Azat Khuzhin](https://github.com/azat)).
+* Now the `X-ClickHouse-Query-Id` and `X-ClickHouse-Timezone` headers are added to the response for all queries over the HTTP protocol. Previously this was done only for `SELECT` queries. [#46364](https://github.com/ClickHouse/ClickHouse/pull/46364) ([Anton Popov](https://github.com/CurtizJ)).
+* External tables from `MongoDB`: support for connection to a replica set via a URI with a host:port enum and support for the readPreference option in MongoDB dictionaries. Example URI: mongodb://db0.example.com:27017,db1.example.com:27017,db2.example.com:27017/?replicaSet=myRepl&readPreference=primary. [#46524](https://github.com/ClickHouse/ClickHouse/pull/46524) ([artem-yadr](https://github.com/artem-yadr)).
+* This improvement should be invisible for users. Re-implement projection analysis on top of query plan. Added setting `query_plan_optimize_projection=1` to switch between old and new version. Fixes [#44963](https://github.com/ClickHouse/ClickHouse/issues/44963). [#46537](https://github.com/ClickHouse/ClickHouse/pull/46537) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Use parquet format v2 instead of v1 in output format by default. Add setting `output_format_parquet_version` to control parquet version, possible values `1.0`, `2.4`, `2.6`, `2.latest` (default). [#46617](https://github.com/ClickHouse/ClickHouse/pull/46617) ([Kruglov Pavel](https://github.com/Avogar)).
+* Using the new configuration syntax, it is now possible to configure Kafka topics with periods (`.`) in their names. [#46752](https://github.com/ClickHouse/ClickHouse/pull/46752) ([Robert Schulze](https://github.com/rschu1ze)).
+* Fix heuristics that check hyperscan patterns for problematic repeats. [#46819](https://github.com/ClickHouse/ClickHouse/pull/46819) ([Robert Schulze](https://github.com/rschu1ze)).
+* Don't report ZK node exists to system.errors when a block was created concurrently by a different replica. [#46820](https://github.com/ClickHouse/ClickHouse/pull/46820) ([Raúl Marín](https://github.com/Algunenano)).
+* Increase the limit for opened files in `clickhouse-local`. It will be able to read from `web` tables on servers with a huge number of CPU cores. Do not back off reading from the URL table engine in case of too many opened files. This closes [#46852](https://github.com/ClickHouse/ClickHouse/issues/46852). [#46853](https://github.com/ClickHouse/ClickHouse/pull/46853) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
+* Exceptions thrown when numbers cannot be parsed now have an easier-to-read exception message. [#46917](https://github.com/ClickHouse/ClickHouse/pull/46917) ([Robert Schulze](https://github.com/rschu1ze)).
+* Update `system.backups` after every processed task to track the progress of backups. [#46989](https://github.com/ClickHouse/ClickHouse/pull/46989) ([Aleksandr Musorin](https://github.com/AVMusorin)).
+* Allow types conversion in Native input format. Add setting `input_format_native_allow_types_conversion` that controls it (enabled by default). [#46990](https://github.com/ClickHouse/ClickHouse/pull/46990) ([Kruglov Pavel](https://github.com/Avogar)).
+* Allow IPv4 in the `range` function to generate IP ranges. [#46995](https://github.com/ClickHouse/ClickHouse/pull/46995) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
+* Improve the exception message when it is impossible to move a part from one volume/disk to another. [#47032](https://github.com/ClickHouse/ClickHouse/pull/47032) ([alesapin](https://github.com/alesapin)).
+* Support `Bool` type in `JSONType` function. Previously `Null` type was mistakenly returned for bool values. [#47046](https://github.com/ClickHouse/ClickHouse/pull/47046) ([Anton Popov](https://github.com/CurtizJ)).
+* Use `_request_body` parameter to configure predefined HTTP queries. [#47086](https://github.com/ClickHouse/ClickHouse/pull/47086) ([Constantine Peresypkin](https://github.com/pkit)).
+* Automatic indentation in the built-in UI SQL editor when Enter is pressed. [#47113](https://github.com/ClickHouse/ClickHouse/pull/47113) ([Alexey Korepanov](https://github.com/alexkorep)).
+* Self-extraction with 'sudo' will attempt to set the uid and gid of extracted files to the running user. [#47116](https://github.com/ClickHouse/ClickHouse/pull/47116) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
+* Previously, the second argument of the `repeat` function only accepted an unsigned integer type, so it could not accept values such as -1, which differed from the behavior of the corresponding Spark function. The `repeat` function now matches the Spark function and accepts the same types of inputs, including negative integers. [#47134](https://github.com/ClickHouse/ClickHouse/pull/47134) ([KevinyhZou](https://github.com/KevinyhZou)). Note: the changelog entry was rewritten by ChatGPT.
+* Remove `::__1` part from stacktraces. Display `std::basic_string` …

### ClickHouse release 23.2, 2023-02-23
#### Backward Incompatible Change
diff --git a/CMakeLists.txt b/CMakeLists.txt
index bd7de46474b..ce615c11f2b 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -568,7 +568,7 @@ if (NATIVE_BUILD_TARGETS
COMMAND ${CMAKE_COMMAND}
"-DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}"
"-DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}"
- "-DENABLE_CCACHE=${ENABLE_CCACHE}"
+ "-DCOMPILER_CACHE=${COMPILER_CACHE}"
# Avoid overriding .cargo/config.toml with native toolchain.
"-DENABLE_RUST=OFF"
"-DENABLE_CLICKHOUSE_SELF_EXTRACTING=${ENABLE_CLICKHOUSE_SELF_EXTRACTING}"
diff --git a/cmake/ccache.cmake b/cmake/ccache.cmake
index f0769f337d0..285c8efb9e1 100644
--- a/cmake/ccache.cmake
+++ b/cmake/ccache.cmake
@@ -1,5 +1,6 @@
# Setup integration with ccache to speed up builds, see https://ccache.dev/
+# Matches both ccache and sccache
if (CMAKE_CXX_COMPILER_LAUNCHER MATCHES "ccache" OR CMAKE_C_COMPILER_LAUNCHER MATCHES "ccache")
# custom compiler launcher already defined, most likely because cmake was invoked with like "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache" or
# via environment variable --> respect setting and trust that the launcher was specified correctly
@@ -8,45 +9,65 @@ if (CMAKE_CXX_COMPILER_LAUNCHER MATCHES "ccache" OR CMAKE_C_COMPILER_LAUNCHER MA
return()
endif()
-option(ENABLE_CCACHE "Speedup re-compilations using ccache (external tool)" ON)
-
-if (NOT ENABLE_CCACHE)
- message(STATUS "Using ccache: no (disabled via configuration)")
- return()
+set(ENABLE_CCACHE "default" CACHE STRING "Deprecated, use COMPILER_CACHE=(auto|ccache|sccache|disabled)")
+if (NOT ENABLE_CCACHE STREQUAL "default")
+ message(WARNING "The -DENABLE_CCACHE is deprecated in favor of -DCOMPILER_CACHE")
+endif()
+
+set(COMPILER_CACHE "auto" CACHE STRING "Speedup re-compilations using the caching tools; valid options are 'auto' (ccache, then sccache), 'ccache', 'sccache', or 'disabled'")
+
+# The logic is rather complex because ENABLE_CCACHE is deprecated yet still has to
+# control COMPILER_CACHE.
+# Once it is completely removed, the following block will become much simpler.
+if (COMPILER_CACHE STREQUAL "ccache" OR (ENABLE_CCACHE AND NOT ENABLE_CCACHE STREQUAL "default"))
+ find_program (CCACHE_EXECUTABLE ccache)
+elseif(COMPILER_CACHE STREQUAL "disabled" OR NOT ENABLE_CCACHE STREQUAL "default")
+ message(STATUS "Using *ccache: no (disabled via configuration)")
+ return()
+elseif(COMPILER_CACHE STREQUAL "auto")
+ find_program (CCACHE_EXECUTABLE ccache sccache)
+elseif(COMPILER_CACHE STREQUAL "sccache")
+ find_program (CCACHE_EXECUTABLE sccache)
+else()
+ message(${RECONFIGURE_MESSAGE_LEVEL} "The COMPILER_CACHE must be one of (auto|ccache|sccache|disabled), given '${COMPILER_CACHE}'")
endif()
-find_program (CCACHE_EXECUTABLE ccache)
if (NOT CCACHE_EXECUTABLE)
- message(${RECONFIGURE_MESSAGE_LEVEL} "Using ccache: no (Could not find find ccache. To significantly reduce compile times for the 2nd, 3rd, etc. build, it is highly recommended to install ccache. To suppress this message, run cmake with -DENABLE_CCACHE=0)")
+ message(${RECONFIGURE_MESSAGE_LEVEL} "Using *ccache: no (Could not find ccache or sccache. To significantly reduce compile times for the 2nd, 3rd, etc. build, it is highly recommended to install one of them. To suppress this message, run cmake with -DCOMPILER_CACHE=disabled)")
return()
endif()
-execute_process(COMMAND ${CCACHE_EXECUTABLE} "-V" OUTPUT_VARIABLE CCACHE_VERSION)
-string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION})
+if (CCACHE_EXECUTABLE MATCHES "/ccache$")
+ execute_process(COMMAND ${CCACHE_EXECUTABLE} "-V" OUTPUT_VARIABLE CCACHE_VERSION)
+ string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION})
-set (CCACHE_MINIMUM_VERSION 3.3)
+ set (CCACHE_MINIMUM_VERSION 3.3)
-if (CCACHE_VERSION VERSION_LESS_EQUAL ${CCACHE_MINIMUM_VERSION})
- message(${RECONFIGURE_MESSAGE_LEVEL} "Using ccache: no (found ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION}), the minimum required version is ${CCACHE_MINIMUM_VERSION}")
- return()
-endif()
+ if (CCACHE_VERSION VERSION_LESS_EQUAL ${CCACHE_MINIMUM_VERSION})
+ message(${RECONFIGURE_MESSAGE_LEVEL} "Using ccache: no (found ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION}), the minimum required version is ${CCACHE_MINIMUM_VERSION}")
+ return()
+ endif()
-message(STATUS "Using ccache: ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION})")
-set(LAUNCHER ${CCACHE_EXECUTABLE})
+ message(STATUS "Using ccache: ${CCACHE_EXECUTABLE} (version ${CCACHE_VERSION})")
+ set(LAUNCHER ${CCACHE_EXECUTABLE})
-# Work around a well-intended but unfortunate behavior of ccache 4.0 & 4.1 with
-# environment variable SOURCE_DATE_EPOCH. This variable provides an alternative
-# to source-code embedded timestamps (__DATE__/__TIME__) and therefore helps with
-# reproducible builds (*). SOURCE_DATE_EPOCH is set automatically by the
-# distribution, e.g. Debian. Ccache 4.0 & 4.1 incorporate SOURCE_DATE_EPOCH into
-# the hash calculation regardless they contain timestamps or not. This invalidates
-# the cache whenever SOURCE_DATE_EPOCH changes. As a fix, ignore SOURCE_DATE_EPOCH.
-#
-# (*) https://reproducible-builds.org/specs/source-date-epoch/
-if (CCACHE_VERSION VERSION_GREATER_EQUAL "4.0" AND CCACHE_VERSION VERSION_LESS "4.2")
- message(STATUS "Ignore SOURCE_DATE_EPOCH for ccache 4.0 / 4.1")
- set(LAUNCHER env -u SOURCE_DATE_EPOCH ${CCACHE_EXECUTABLE})
+ # Work around a well-intended but unfortunate behavior of ccache 4.0 & 4.1 with
+ # environment variable SOURCE_DATE_EPOCH. This variable provides an alternative
+ # to source-code embedded timestamps (__DATE__/__TIME__) and therefore helps with
+ # reproducible builds (*). SOURCE_DATE_EPOCH is set automatically by the
+ # distribution, e.g. Debian. Ccache 4.0 & 4.1 incorporate SOURCE_DATE_EPOCH into
+ # the hash calculation regardless they contain timestamps or not. This invalidates
+ # the cache whenever SOURCE_DATE_EPOCH changes. As a fix, ignore SOURCE_DATE_EPOCH.
+ #
+ # (*) https://reproducible-builds.org/specs/source-date-epoch/
+ if (CCACHE_VERSION VERSION_GREATER_EQUAL "4.0" AND CCACHE_VERSION VERSION_LESS "4.2")
+ message(STATUS "Ignore SOURCE_DATE_EPOCH for ccache 4.0 / 4.1")
+ set(LAUNCHER env -u SOURCE_DATE_EPOCH ${CCACHE_EXECUTABLE})
+ endif()
+elseif(CCACHE_EXECUTABLE MATCHES "/sccache$")
+ message(STATUS "Using sccache: ${CCACHE_EXECUTABLE}")
+ set(LAUNCHER ${CCACHE_EXECUTABLE})
endif()
set (CMAKE_CXX_COMPILER_LAUNCHER ${LAUNCHER} ${CMAKE_CXX_COMPILER_LAUNCHER})
diff --git a/contrib/boost b/contrib/boost
index 03d9ec9cd15..8fe7b3326ef 160000
--- a/contrib/boost
+++ b/contrib/boost
@@ -1 +1 @@
-Subproject commit 03d9ec9cd159d14bd0b17c05138098451a1ea606
+Subproject commit 8fe7b3326ef482ee6ecdf5a4f698f2b8c2780f98
diff --git a/contrib/llvm-project b/contrib/llvm-project
index 4bfaeb31dd0..e0accd51793 160000
--- a/contrib/llvm-project
+++ b/contrib/llvm-project
@@ -1 +1 @@
-Subproject commit 4bfaeb31dd0ef13f025221f93c138974a3e0a22a
+Subproject commit e0accd517933ebb44aff84bc8db448ffd8ef1929
diff --git a/docker/packager/binary/Dockerfile b/docker/packager/binary/Dockerfile
index 62e6d47c183..fa860b2207f 100644
--- a/docker/packager/binary/Dockerfile
+++ b/docker/packager/binary/Dockerfile
@@ -69,13 +69,14 @@ RUN add-apt-repository ppa:ubuntu-toolchain-r/test --yes \
libc6 \
libc6-dev \
libc6-dev-arm64-cross \
+ python3-boto3 \
yasm \
zstd \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists
# Download toolchain and SDK for Darwin
-RUN wget -nv https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz
+RUN curl -sL -O https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz
# Architecture of the image when BuildKit/buildx is used
ARG TARGETARCH
@@ -97,7 +98,7 @@ ENV PATH="$PATH:/usr/local/go/bin"
ENV GOPATH=/workdir/go
ENV GOCACHE=/workdir/
-ARG CLANG_TIDY_SHA1=03644275e794b0587849bfc2ec6123d5ae0bdb1c
+ARG CLANG_TIDY_SHA1=c191254ea00d47ade11d7170ef82fe038c213774
RUN curl -Lo /usr/bin/clang-tidy-cache \
"https://raw.githubusercontent.com/matus-chochlik/ctcache/$CLANG_TIDY_SHA1/clang-tidy-cache" \
&& chmod +x /usr/bin/clang-tidy-cache
diff --git a/docker/packager/binary/build.sh b/docker/packager/binary/build.sh
index 24dca72e946..2cd0a011013 100755
--- a/docker/packager/binary/build.sh
+++ b/docker/packager/binary/build.sh
@@ -6,6 +6,7 @@ exec &> >(ts)
ccache_status () {
ccache --show-config ||:
ccache --show-stats ||:
+ SCCACHE_NO_DAEMON=1 sccache --show-stats ||:
}
[ -O /build ] || git config --global --add safe.directory /build
diff --git a/docker/packager/packager b/docker/packager/packager
index 58dd299fd6d..7d022df52e6 100755
--- a/docker/packager/packager
+++ b/docker/packager/packager
@@ -5,13 +5,19 @@ import os
import argparse
import logging
import sys
-from typing import List
+from pathlib import Path
+from typing import List, Optional
-SCRIPT_PATH = os.path.realpath(__file__)
+SCRIPT_PATH = Path(__file__).absolute()
IMAGE_TYPE = "binary"
+IMAGE_NAME = f"clickhouse/{IMAGE_TYPE}-builder"
-def check_image_exists_locally(image_name):
+class BuildException(Exception):
+ pass
+
+
+def check_image_exists_locally(image_name: str) -> bool:
try:
output = subprocess.check_output(
f"docker images -q {image_name} 2> /dev/null", shell=True
@@ -21,17 +27,17 @@ def check_image_exists_locally(image_name):
return False
-def pull_image(image_name):
+def pull_image(image_name: str) -> bool:
try:
subprocess.check_call(f"docker pull {image_name}", shell=True)
return True
except subprocess.CalledProcessError:
- logging.info(f"Cannot pull image {image_name}".format())
+ logging.info("Cannot pull image %s", image_name)
return False
-def build_image(image_name, filepath):
- context = os.path.dirname(filepath)
+def build_image(image_name: str, filepath: Path) -> None:
+ context = filepath.parent
build_cmd = f"docker build --network=host -t {image_name} -f {filepath} {context}"
logging.info("Will build image with cmd: '%s'", build_cmd)
subprocess.check_call(
@@ -40,7 +46,7 @@ def build_image(image_name, filepath):
)
-def pre_build(repo_path: str, env_variables: List[str]):
+def pre_build(repo_path: Path, env_variables: List[str]):
if "WITH_PERFORMANCE=1" in env_variables:
current_branch = subprocess.check_output(
"git branch --show-current", shell=True, encoding="utf-8"
@@ -56,7 +62,9 @@ def pre_build(repo_path: str, env_variables: List[str]):
# conclusion is: in the current state the easiest way to go is to force
# unshallow repository for performance artifacts.
# To change it we need to rework our performance tests docker image
- raise Exception("shallow repository is not suitable for performance builds")
+ raise BuildException(
+ "shallow repository is not suitable for performance builds"
+ )
if current_branch != "master":
cmd = (
f"git -C {repo_path} fetch --no-recurse-submodules "
@@ -67,14 +75,14 @@ def pre_build(repo_path: str, env_variables: List[str]):
def run_docker_image_with_env(
- image_name,
- as_root,
- output,
- env_variables,
- ch_root,
- ccache_dir,
- docker_image_version,
+ image_name: str,
+ as_root: bool,
+ output_dir: Path,
+ env_variables: List[str],
+ ch_root: Path,
+ ccache_dir: Optional[Path],
):
+ output_dir.mkdir(parents=True, exist_ok=True)
env_part = " -e ".join(env_variables)
if env_part:
env_part = " -e " + env_part
@@ -89,10 +97,14 @@ def run_docker_image_with_env(
else:
user = f"{os.geteuid()}:{os.getegid()}"
+ ccache_mount = f"--volume={ccache_dir}:/ccache"
+ if ccache_dir is None:
+ ccache_mount = ""
+
cmd = (
- f"docker run --network=host --user={user} --rm --volume={output}:/output "
- f"--volume={ch_root}:/build --volume={ccache_dir}:/ccache {env_part} "
- f"{interactive} {image_name}:{docker_image_version}"
+ f"docker run --network=host --user={user} --rm {ccache_mount}"
+ f"--volume={output_dir}:/output --volume={ch_root}:/build {env_part} "
+ f"{interactive} {image_name}"
)
logging.info("Will build ClickHouse pkg with cmd: '%s'", cmd)
@@ -100,24 +112,25 @@ def run_docker_image_with_env(
subprocess.check_call(cmd, shell=True)
-def is_release_build(build_type, package_type, sanitizer):
+def is_release_build(build_type: str, package_type: str, sanitizer: str) -> bool:
return build_type == "" and package_type == "deb" and sanitizer == ""
def parse_env_variables(
- build_type,
- compiler,
- sanitizer,
- package_type,
- cache,
- distcc_hosts,
- clang_tidy,
- version,
- author,
- official,
- additional_pkgs,
- with_coverage,
- with_binaries,
+ build_type: str,
+ compiler: str,
+ sanitizer: str,
+ package_type: str,
+ cache: str,
+ s3_bucket: str,
+ s3_directory: str,
+ s3_rw_access: bool,
+ clang_tidy: bool,
+ version: str,
+ official: bool,
+ additional_pkgs: bool,
+ with_coverage: bool,
+ with_binaries: str,
):
DARWIN_SUFFIX = "-darwin"
DARWIN_ARM_SUFFIX = "-darwin-aarch64"
@@ -243,32 +256,43 @@ def parse_env_variables(
else:
result.append("BUILD_TYPE=None")
- if cache == "distcc":
- result.append(f"CCACHE_PREFIX={cache}")
+ if not cache:
+ cmake_flags.append("-DCOMPILER_CACHE=disabled")
- if cache:
+ if cache == "ccache":
+ cmake_flags.append("-DCOMPILER_CACHE=ccache")
result.append("CCACHE_DIR=/ccache")
result.append("CCACHE_COMPRESSLEVEL=5")
result.append("CCACHE_BASEDIR=/build")
result.append("CCACHE_NOHASHDIR=true")
result.append("CCACHE_COMPILERCHECK=content")
- cache_maxsize = "15G"
- if clang_tidy:
- # 15G is not enough for tidy build
- cache_maxsize = "25G"
+ result.append("CCACHE_MAXSIZE=15G")
- # `CTCACHE_DIR` has the same purpose as the `CCACHE_DIR` above.
- # It's there to have the clang-tidy cache embedded into our standard `CCACHE_DIR`
+ if cache == "sccache":
+ cmake_flags.append("-DCOMPILER_CACHE=sccache")
+ # see https://github.com/mozilla/sccache/blob/main/docs/S3.md
+ result.append(f"SCCACHE_BUCKET={s3_bucket}")
+ sccache_dir = "sccache"
+ if s3_directory:
+ sccache_dir = f"{s3_directory}/{sccache_dir}"
+ result.append(f"SCCACHE_S3_KEY_PREFIX={sccache_dir}")
+ if not s3_rw_access:
+ result.append("SCCACHE_S3_NO_CREDENTIALS=true")
+
+ if clang_tidy:
+ # `CTCACHE_DIR` has the same purpose as the `CCACHE_DIR` above.
+ # It's there to have the clang-tidy cache embedded into our standard `CCACHE_DIR`
+ if cache == "ccache":
result.append("CTCACHE_DIR=/ccache/clang-tidy-cache")
- result.append(f"CCACHE_MAXSIZE={cache_maxsize}")
-
- if distcc_hosts:
- hosts_with_params = [f"{host}/24,lzo" for host in distcc_hosts] + [
- "localhost/`nproc`"
- ]
- result.append('DISTCC_HOSTS="' + " ".join(hosts_with_params) + '"')
- elif cache == "distcc":
- result.append('DISTCC_HOSTS="localhost/`nproc`"')
+ if s3_bucket:
+ # see https://github.com/matus-chochlik/ctcache#environment-variables
+ ctcache_dir = "clang-tidy-cache"
+ if s3_directory:
+ ctcache_dir = f"{s3_directory}/{ctcache_dir}"
+ result.append(f"CTCACHE_S3_BUCKET={s3_bucket}")
+ result.append(f"CTCACHE_S3_FOLDER={ctcache_dir}")
+ if not s3_rw_access:
+ result.append("CTCACHE_S3_NO_CREDENTIALS=true")
if additional_pkgs:
# NOTE: This are the env for packages/build script
@@ -300,9 +324,6 @@ def parse_env_variables(
if version:
result.append(f"VERSION_STRING='{version}'")
- if author:
- result.append(f"AUTHOR='{author}'")
-
if official:
cmake_flags.append("-DCLICKHOUSE_OFFICIAL_BUILD=1")
@@ -312,14 +333,14 @@ def parse_env_variables(
return result
-def dir_name(name: str) -> str:
- if not os.path.isabs(name):
- name = os.path.abspath(os.path.join(os.getcwd(), name))
- return name
+def dir_name(name: str) -> Path:
+ path = Path(name)
+ if not path.is_absolute():
+ path = Path.cwd() / name
+ return path
-if __name__ == "__main__":
- logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
+def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter,
description="ClickHouse building script using prebuilt Docker image",
@@ -331,7 +352,7 @@ if __name__ == "__main__":
)
parser.add_argument(
"--clickhouse-repo-path",
- default=os.path.join(os.path.dirname(SCRIPT_PATH), os.pardir, os.pardir),
+ default=SCRIPT_PATH.parents[2],
type=dir_name,
help="ClickHouse git repository",
)
@@ -361,17 +382,34 @@ if __name__ == "__main__":
)
parser.add_argument("--clang-tidy", action="store_true")
- parser.add_argument("--cache", choices=("ccache", "distcc", ""), default="")
parser.add_argument(
- "--ccache_dir",
- default=os.getenv("HOME", "") + "/.ccache",
+ "--cache",
+ choices=("ccache", "sccache", ""),
+ default="",
+ help="ccache or sccache for objects caching; sccache uses only S3 buckets",
+ )
+ parser.add_argument(
+ "--ccache-dir",
+ default=Path.home() / ".ccache",
type=dir_name,
help="a directory with ccache",
)
- parser.add_argument("--distcc-hosts", nargs="+")
+ parser.add_argument(
+ "--s3-bucket",
+ help="an S3 bucket used for sscache and clang-tidy-cache",
+ )
+ parser.add_argument(
+ "--s3-directory",
+ default="ccache",
+ help="an S3 directory prefix used for sscache and clang-tidy-cache",
+ )
+ parser.add_argument(
+ "--s3-rw-access",
+ action="store_true",
+ help="if set, the build fails on errors writing cache to S3",
+ )
parser.add_argument("--force-build-image", action="store_true")
parser.add_argument("--version")
- parser.add_argument("--author", default="clickhouse", help="a package author")
parser.add_argument("--official", action="store_true")
parser.add_argument("--additional-pkgs", action="store_true")
parser.add_argument("--with-coverage", action="store_true")
@@ -387,34 +425,54 @@ if __name__ == "__main__":
args = parser.parse_args()
- image_name = f"clickhouse/{IMAGE_TYPE}-builder"
+ if args.additional_pkgs and args.package_type != "deb":
+ raise argparse.ArgumentTypeError(
+ "Can build additional packages only in deb build"
+ )
+
+ if args.cache != "ccache":
+ args.ccache_dir = None
+
+ if args.with_binaries != "":
+ if args.package_type != "deb":
+ raise argparse.ArgumentTypeError(
+ "Can add additional binaries only in deb build"
+ )
+ logging.info("Should place %s to output", args.with_binaries)
+
+ if args.cache == "sccache":
+ if not args.s3_bucket:
+ raise argparse.ArgumentTypeError("sccache must have --s3-bucket set")
+
+ return args
+
+
+def main():
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
+ args = parse_args()
ch_root = args.clickhouse_repo_path
- if args.additional_pkgs and args.package_type != "deb":
- raise Exception("Can build additional packages only in deb build")
+ dockerfile = ch_root / "docker/packager" / IMAGE_TYPE / "Dockerfile"
+ image_with_version = IMAGE_NAME + ":" + args.docker_image_version
+ if args.force_build_image:
+ build_image(image_with_version, dockerfile)
+ elif not (
+ check_image_exists_locally(image_with_version) or pull_image(image_with_version)
+ ):
+ build_image(image_with_version, dockerfile)
- if args.with_binaries != "" and args.package_type != "deb":
- raise Exception("Can add additional binaries only in deb build")
-
- if args.with_binaries != "" and args.package_type == "deb":
- logging.info("Should place %s to output", args.with_binaries)
-
- dockerfile = os.path.join(ch_root, "docker/packager", IMAGE_TYPE, "Dockerfile")
- image_with_version = image_name + ":" + args.docker_image_version
- if not check_image_exists_locally(image_name) or args.force_build_image:
- if not pull_image(image_with_version) or args.force_build_image:
- build_image(image_with_version, dockerfile)
env_prepared = parse_env_variables(
args.build_type,
args.compiler,
args.sanitizer,
args.package_type,
args.cache,
- args.distcc_hosts,
+ args.s3_bucket,
+ args.s3_directory,
+ args.s3_rw_access,
args.clang_tidy,
args.version,
- args.author,
args.official,
args.additional_pkgs,
args.with_coverage,
@@ -423,12 +481,15 @@ if __name__ == "__main__":
pre_build(args.clickhouse_repo_path, env_prepared)
run_docker_image_with_env(
- image_name,
+ image_with_version,
args.as_root,
args.output_dir,
env_prepared,
ch_root,
args.ccache_dir,
- args.docker_image_version,
)
logging.info("Output placed into %s", args.output_dir)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/docker/test/fasttest/Dockerfile b/docker/test/fasttest/Dockerfile
index 32546b71eb8..ffb13fc774d 100644
--- a/docker/test/fasttest/Dockerfile
+++ b/docker/test/fasttest/Dockerfile
@@ -20,12 +20,6 @@ RUN apt-get update \
zstd \
--yes --no-install-recommends
-# Install CMake 3.20+ for Rust compilation
-RUN apt purge cmake --yes
-RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null
-RUN apt-add-repository 'deb https://apt.kitware.com/ubuntu/ focal main'
-RUN apt update && apt install cmake --yes
-
RUN pip3 install numpy scipy pandas Jinja2
ARG odbc_driver_url="https://github.com/ClickHouse/clickhouse-odbc/releases/download/v1.1.4.20200302/clickhouse-odbc-1.1.4-Linux.tar.gz"
diff --git a/docker/test/fasttest/run.sh b/docker/test/fasttest/run.sh
index 086276bed55..e0e30a63bb4 100755
--- a/docker/test/fasttest/run.sh
+++ b/docker/test/fasttest/run.sh
@@ -16,7 +16,8 @@ export LLVM_VERSION=${LLVM_VERSION:-13}
# it being undefined. Also read it as array so that we can pass an empty list
# of additional variable to cmake properly, and it doesn't generate an extra
# empty parameter.
-read -ra FASTTEST_CMAKE_FLAGS <<< "${FASTTEST_CMAKE_FLAGS:-}"
+# Read it as CMAKE_FLAGS to not lose the exported FASTTEST_CMAKE_FLAGS on subsequent launches
+read -ra CMAKE_FLAGS <<< "${FASTTEST_CMAKE_FLAGS:-}"
# Run only matching tests.
FASTTEST_FOCUS=${FASTTEST_FOCUS:-""}
@@ -37,6 +38,13 @@ export FASTTEST_DATA
export FASTTEST_OUT
export PATH
+function ccache_status
+{
+ ccache --show-config ||:
+ ccache --show-stats ||:
+ SCCACHE_NO_DAEMON=1 sccache --show-stats ||:
+}
+
function start_server
{
set -m # Spawn server in its own process groups
@@ -171,14 +179,14 @@ function run_cmake
export CCACHE_COMPILERCHECK=content
export CCACHE_MAXSIZE=15G
- ccache --show-stats ||:
+ ccache_status
ccache --zero-stats ||:
mkdir "$FASTTEST_BUILD" ||:
(
cd "$FASTTEST_BUILD"
- cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER="clang++-${LLVM_VERSION}" -DCMAKE_C_COMPILER="clang-${LLVM_VERSION}" "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt"
+ cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER="clang++-${LLVM_VERSION}" -DCMAKE_C_COMPILER="clang-${LLVM_VERSION}" "${CMAKE_LIBS_CONFIG[@]}" "${CMAKE_FLAGS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt"
)
}
@@ -193,7 +201,7 @@ function build
strip programs/clickhouse -o "$FASTTEST_OUTPUT/clickhouse-stripped"
zstd --threads=0 "$FASTTEST_OUTPUT/clickhouse-stripped"
fi
- ccache --show-stats ||:
+ ccache_status
ccache --evict-older-than 1d ||:
)
}
diff --git a/docker/test/util/Dockerfile b/docker/test/util/Dockerfile
index 0ee426f4e4d..911cadc3c58 100644
--- a/docker/test/util/Dockerfile
+++ b/docker/test/util/Dockerfile
@@ -92,4 +92,17 @@ RUN mkdir /tmp/ccache \
&& cd / \
&& rm -rf /tmp/ccache
+ARG TARGETARCH
+ARG SCCACHE_VERSION=v0.4.1
+RUN arch=${TARGETARCH:-amd64} \
+ && case $arch in \
+ amd64) rarch=x86_64 ;; \
+ arm64) rarch=aarch64 ;; \
+ esac \
+ && curl -Ls "https://github.com/mozilla/sccache/releases/download/$SCCACHE_VERSION/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl.tar.gz" | \
+ tar xz -C /tmp \
+ && mv "/tmp/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl/sccache" /usr/bin \
+ && rm "/tmp/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl" -r
+
+
COPY process_functional_tests_result.py /
diff --git a/docs/en/development/build.md b/docs/en/development/build.md
index 804aa8a3dc5..00a8a54f80a 100644
--- a/docs/en/development/build.md
+++ b/docs/en/development/build.md
@@ -85,7 +85,7 @@ The build requires the following components:
- Git (is used only to checkout the sources, it’s not needed for the build)
- CMake 3.15 or newer
- Ninja
-- C++ compiler: clang-14 or newer
+- C++ compiler: clang-15 or newer
- Linker: lld
- Yasm
- Gawk
diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md
index 9f5b8dbda58..4d638964c6f 100644
--- a/docs/en/operations/settings/settings.md
+++ b/docs/en/operations/settings/settings.md
@@ -2769,7 +2769,7 @@ Default value: `120` seconds.
## cast_keep_nullable {#cast_keep_nullable}
-Enables or disables keeping of the `Nullable` data type in [CAST](../../sql-reference/functions/type-conversion-functions.md/#type_conversion_function-cast) operations.
+Enables or disables keeping of the `Nullable` data type in [CAST](../../sql-reference/functions/type-conversion-functions.md/#castx-t) operations.
When the setting is enabled and the argument of `CAST` function is `Nullable`, the result is also transformed to `Nullable` type. When the setting is disabled, the result always has the destination type exactly.
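+
+For example (an illustrative sketch):
+
+```sql
+-- Default (cast_keep_nullable = 0): the result has exactly the destination type
+SELECT CAST(toNullable(toInt32(0)) AS Int32) AS x, toTypeName(x);   -- Int32
+
+-- With the setting enabled, the Nullable wrapper is kept
+SET cast_keep_nullable = 1;
+SELECT CAST(toNullable(toInt32(0)) AS Int32) AS x, toTypeName(x);   -- Nullable(Int32)
+```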
diff --git a/docs/en/sql-reference/statements/select/join.md b/docs/en/sql-reference/statements/select/join.md
index 49bd2672874..ece60961aaf 100644
--- a/docs/en/sql-reference/statements/select/join.md
+++ b/docs/en/sql-reference/statements/select/join.md
@@ -47,6 +47,7 @@ The default join type can be overridden using [join_default_strictness](../../..
The behavior of ClickHouse server for `ANY JOIN` operations depends on the [any_join_distinct_right_table_keys](../../../operations/settings/settings.md#any_join_distinct_right_table_keys) setting.
+
**See also**
- [join_algorithm](../../../operations/settings/settings.md#settings-join_algorithm)
@@ -57,6 +58,8 @@ The behavior of ClickHouse server for `ANY JOIN` operations depends on the [any_
- [join_on_disk_max_files_to_merge](../../../operations/settings/settings.md#join_on_disk_max_files_to_merge)
- [any_join_distinct_right_table_keys](../../../operations/settings/settings.md#any_join_distinct_right_table_keys)
+Use the `cross_to_inner_join_rewrite` setting to define the behavior when ClickHouse fails to rewrite a `CROSS JOIN` as an `INNER JOIN`. The default value is `1`, which allows the join to continue, but it will be slower. Set `cross_to_inner_join_rewrite` to `0` if you want an error to be thrown, or set it to `2` to force a rewrite of all comma/cross joins instead of running them as cross joins. If rewriting fails when the value is `2`, you will receive an error message stating "Please, try to simplify `WHERE` section".
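+
+An illustrative sketch (the tables `t1` and `t2` are hypothetical):
+
+```sql
+-- With the default value 1, the comma join is rewritten to INNER JOIN where possible
+-- and otherwise executed as a (slower) CROSS JOIN.
+SELECT * FROM t1, t2 WHERE t1.id = t2.id;
+
+-- Force rewriting of all comma/cross joins; if no suitable condition can be extracted
+-- from WHERE, the query fails with "Please, try to simplify `WHERE` section".
+SET cross_to_inner_join_rewrite = 2;
+SELECT * FROM t1, t2 WHERE t1.id = t2.id;
+```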
+
## ON Section Conditions
An `ON` section can contain several conditions combined using the `AND` and `OR` operators. Conditions specifying join keys must refer both left and right tables and must use the equality operator. Other conditions may use other logical operators but they must refer either the left or the right table of a query.
diff --git a/programs/benchmark/Benchmark.cpp b/programs/benchmark/Benchmark.cpp
index 994f9b7ac4d..466a0c194f7 100644
--- a/programs/benchmark/Benchmark.cpp
+++ b/programs/benchmark/Benchmark.cpp
@@ -34,6 +34,7 @@
#include
#include
#include
+#include <Common/CurrentMetrics.h>
#include
@@ -43,6 +44,12 @@ namespace fs = std::filesystem;
* The tool emulates a case with fixed amount of simultaneously executing queries.
*/
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
namespace DB
{
@@ -103,7 +110,7 @@ public:
settings(settings_),
shared_context(Context::createShared()),
global_context(Context::createGlobal(shared_context.get())),
- pool(concurrency)
+ pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, concurrency)
{
const auto secure = secure_ ? Protocol::Secure::Enable : Protocol::Secure::Disable;
size_t connections_cnt = std::max(ports_.size(), hosts_.size());
diff --git a/programs/copier/ClusterCopier.cpp b/programs/copier/ClusterCopier.cpp
index 3a6261974d1..079c70596a6 100644
--- a/programs/copier/ClusterCopier.cpp
+++ b/programs/copier/ClusterCopier.cpp
@@ -6,6 +6,7 @@
#include
#include
#include
+#include <Common/CurrentMetrics.h>
#include
#include
#include
@@ -19,6 +20,12 @@
#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
namespace DB
{
@@ -192,7 +199,7 @@ void ClusterCopier::discoverTablePartitions(const ConnectionTimeouts & timeouts,
{
/// Fetch partitions list from a shard
{
- ThreadPool thread_pool(num_threads ? num_threads : 2 * getNumberOfPhysicalCPUCores());
+ ThreadPool thread_pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads ? num_threads : 2 * getNumberOfPhysicalCPUCores());
for (const TaskShardPtr & task_shard : task_table.all_shards)
thread_pool.scheduleOrThrowOnError([this, timeouts, task_shard]()
diff --git a/src/Analyzer/Passes/LogicalExpressionOptimizerPass.cpp b/src/Analyzer/Passes/LogicalExpressionOptimizerPass.cpp
index 3d65035f9fd..97669f3924f 100644
--- a/src/Analyzer/Passes/LogicalExpressionOptimizerPass.cpp
+++ b/src/Analyzer/Passes/LogicalExpressionOptimizerPass.cpp
@@ -153,9 +153,11 @@ private:
}
};
- if (const auto * lhs_literal = lhs->as<ConstantNode>())
+ if (const auto * lhs_literal = lhs->as<ConstantNode>();
+ lhs_literal && !lhs_literal->getValue().isNull())
add_equals_function_if_not_present(rhs, lhs_literal);
- else if (const auto * rhs_literal = rhs->as<ConstantNode>())
+ else if (const auto * rhs_literal = rhs->as<ConstantNode>();
+ rhs_literal && !rhs_literal->getValue().isNull())
add_equals_function_if_not_present(lhs, rhs_literal);
else
or_operands.push_back(argument);
diff --git a/src/Backups/BackupsWorker.cpp b/src/Backups/BackupsWorker.cpp
index b82629eadf7..4572e481829 100644
--- a/src/Backups/BackupsWorker.cpp
+++ b/src/Backups/BackupsWorker.cpp
@@ -20,10 +20,19 @@
#include
#include
#include
+#include <Common/CurrentMetrics.h>
#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric BackupsThreads;
+ extern const Metric BackupsThreadsActive;
+ extern const Metric RestoreThreads;
+ extern const Metric RestoreThreadsActive;
+}
+
namespace DB
{
@@ -153,8 +162,8 @@ namespace
BackupsWorker::BackupsWorker(size_t num_backup_threads, size_t num_restore_threads, bool allow_concurrent_backups_, bool allow_concurrent_restores_)
- : backups_thread_pool(num_backup_threads, /* max_free_threads = */ 0, num_backup_threads)
- , restores_thread_pool(num_restore_threads, /* max_free_threads = */ 0, num_restore_threads)
+ : backups_thread_pool(CurrentMetrics::BackupsThreads, CurrentMetrics::BackupsThreadsActive, num_backup_threads, /* max_free_threads = */ 0, num_backup_threads)
+ , restores_thread_pool(CurrentMetrics::RestoreThreads, CurrentMetrics::RestoreThreadsActive, num_restore_threads, /* max_free_threads = */ 0, num_restore_threads)
, log(&Poco::Logger::get("BackupsWorker"))
, allow_concurrent_backups(allow_concurrent_backups_)
, allow_concurrent_restores(allow_concurrent_restores_)
diff --git a/src/Client/ReplxxLineReader.cpp b/src/Client/ReplxxLineReader.cpp
index fa5ba8b8ba9..1979b37a94b 100644
--- a/src/Client/ReplxxLineReader.cpp
+++ b/src/Client/ReplxxLineReader.cpp
@@ -5,6 +5,7 @@
#include
#include
+#include
#include
#include
#include
diff --git a/src/Common/CurrentMetrics.cpp b/src/Common/CurrentMetrics.cpp
index 5b20d98aa01..4c773048597 100644
--- a/src/Common/CurrentMetrics.cpp
+++ b/src/Common/CurrentMetrics.cpp
@@ -72,6 +72,64 @@
M(GlobalThreadActive, "Number of threads in global thread pool running a task.") \
M(LocalThread, "Number of threads in local thread pools. The threads in local thread pools are taken from the global thread pool.") \
M(LocalThreadActive, "Number of threads in local thread pools running a task.") \
+ M(MergeTreeDataSelectExecutorThreads, "Number of threads in the MergeTreeDataSelectExecutor thread pool.") \
+ M(MergeTreeDataSelectExecutorThreadsActive, "Number of threads in the MergeTreeDataSelectExecutor thread pool running a task.") \
+ M(BackupsThreads, "Number of threads in the thread pool for BACKUP.") \
+ M(BackupsThreadsActive, "Number of threads in thread pool for BACKUP running a task.") \
+ M(RestoreThreads, "Number of threads in the thread pool for RESTORE.") \
+ M(RestoreThreadsActive, "Number of threads in the thread pool for RESTORE running a task.") \
+ M(IOThreads, "Number of threads in the IO thread pool.") \
+ M(IOThreadsActive, "Number of threads in the IO thread pool running a task.") \
+ M(ThreadPoolRemoteFSReaderThreads, "Number of threads in the thread pool for remote_filesystem_read_method=threadpool.") \
+ M(ThreadPoolRemoteFSReaderThreadsActive, "Number of threads in the thread pool for remote_filesystem_read_method=threadpool running a task.") \
+ M(ThreadPoolFSReaderThreads, "Number of threads in the thread pool for local_filesystem_read_method=threadpool.") \
+ M(ThreadPoolFSReaderThreadsActive, "Number of threads in the thread pool for local_filesystem_read_method=threadpool running a task.") \
+ M(BackupsIOThreads, "Number of threads in the BackupsIO thread pool.") \
+ M(BackupsIOThreadsActive, "Number of threads in the BackupsIO thread pool running a task.") \
+ M(DiskObjectStorageAsyncThreads, "Number of threads in the async thread pool for DiskObjectStorage.") \
+ M(DiskObjectStorageAsyncThreadsActive, "Number of threads in the async thread pool for DiskObjectStorage running a task.") \
+ M(StorageHiveThreads, "Number of threads in the StorageHive thread pool.") \
+ M(StorageHiveThreadsActive, "Number of threads in the StorageHive thread pool running a task.") \
+ M(TablesLoaderThreads, "Number of threads in the tables loader thread pool.") \
+ M(TablesLoaderThreadsActive, "Number of threads in the tables loader thread pool running a task.") \
+ M(DatabaseOrdinaryThreads, "Number of threads in the Ordinary database thread pool.") \
+ M(DatabaseOrdinaryThreadsActive, "Number of threads in the Ordinary database thread pool running a task.") \
+ M(DatabaseOnDiskThreads, "Number of threads in the DatabaseOnDisk thread pool.") \
+ M(DatabaseOnDiskThreadsActive, "Number of threads in the DatabaseOnDisk thread pool running a task.") \
+ M(DatabaseCatalogThreads, "Number of threads in the DatabaseCatalog thread pool.") \
+ M(DatabaseCatalogThreadsActive, "Number of threads in the DatabaseCatalog thread pool running a task.") \
+ M(DestroyAggregatesThreads, "Number of threads in the thread pool for destroy aggregate states.") \
+ M(DestroyAggregatesThreadsActive, "Number of threads in the thread pool for destroy aggregate states running a task.") \
+ M(HashedDictionaryThreads, "Number of threads in the HashedDictionary thread pool.") \
+ M(HashedDictionaryThreadsActive, "Number of threads in the HashedDictionary thread pool running a task.") \
+ M(CacheDictionaryThreads, "Number of threads in the CacheDictionary thread pool.") \
+ M(CacheDictionaryThreadsActive, "Number of threads in the CacheDictionary thread pool running a task.") \
+ M(ParallelFormattingOutputFormatThreads, "Number of threads in the ParallelFormattingOutputFormatThreads thread pool.") \
+ M(ParallelFormattingOutputFormatThreadsActive, "Number of threads in the ParallelFormattingOutputFormatThreads thread pool running a task.") \
+ M(ParallelParsingInputFormatThreads, "Number of threads in the ParallelParsingInputFormat thread pool.") \
+ M(ParallelParsingInputFormatThreadsActive, "Number of threads in the ParallelParsingInputFormat thread pool running a task.") \
+ M(MergeTreeBackgroundExecutorThreads, "Number of threads in the MergeTreeBackgroundExecutor thread pool.") \
+ M(MergeTreeBackgroundExecutorThreadsActive, "Number of threads in the MergeTreeBackgroundExecutor thread pool running a task.") \
+ M(AsynchronousInsertThreads, "Number of threads in the AsynchronousInsert thread pool.") \
+ M(AsynchronousInsertThreadsActive, "Number of threads in the AsynchronousInsert thread pool running a task.") \
+ M(StartupSystemTablesThreads, "Number of threads in the StartupSystemTables thread pool.") \
+ M(StartupSystemTablesThreadsActive, "Number of threads in the StartupSystemTables thread pool running a task.") \
+ M(AggregatorThreads, "Number of threads in the Aggregator thread pool.") \
+ M(AggregatorThreadsActive, "Number of threads in the Aggregator thread pool running a task.") \
+ M(DDLWorkerThreads, "Number of threads in the DDLWorker thread pool for ON CLUSTER queries.") \
+ M(DDLWorkerThreadsActive, "Number of threads in the DDLWORKER thread pool for ON CLUSTER queries running a task.") \
+ M(StorageDistributedThreads, "Number of threads in the StorageDistributed thread pool.") \
+ M(StorageDistributedThreadsActive, "Number of threads in the StorageDistributed thread pool running a task.") \
+ M(StorageS3Threads, "Number of threads in the StorageS3 thread pool.") \
+ M(StorageS3ThreadsActive, "Number of threads in the StorageS3 thread pool running a task.") \
+ M(MergeTreePartsLoaderThreads, "Number of threads in the MergeTree parts loader thread pool.") \
+ M(MergeTreePartsLoaderThreadsActive, "Number of threads in the MergeTree parts loader thread pool running a task.") \
+ M(MergeTreePartsCleanerThreads, "Number of threads in the MergeTree parts cleaner thread pool.") \
+ M(MergeTreePartsCleanerThreadsActive, "Number of threads in the MergeTree parts cleaner thread pool running a task.") \
+ M(SystemReplicasThreads, "Number of threads in the system.replicas thread pool.") \
+ M(SystemReplicasThreadsActive, "Number of threads in the system.replicas thread pool running a task.") \
+ M(RestartReplicaThreads, "Number of threads in the RESTART REPLICA thread pool.") \
+ M(RestartReplicaThreadsActive, "Number of threads in the RESTART REPLICA thread pool running a task.") \
M(DistributedFilesToInsert, "Number of pending files to process for asynchronous insertion into Distributed tables. Number of files for every shard is summed.") \
M(BrokenDistributedFilesToInsert, "Number of files for asynchronous insertion into Distributed tables that have been marked as broken. This metric starts from 0 on server start. Number of files for every shard is summed.") \
M(TablesToDropQueueSize, "Number of dropped tables that are waiting for background data removal.") \
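
The practical effect of this list is that every pool in the server now reports its size and activity under a dedicated pair of metrics, and the `ThreadPool` constructor takes that pair as its first two arguments (as the test changes further down in this diff show). A minimal usage sketch, assuming the ClickHouse source tree and the generic `LocalThread`/`LocalThreadActive` metrics:

```cpp
#include <Common/ThreadPool.h>
#include <Common/CurrentMetrics.h>

namespace CurrentMetrics
{
    extern const Metric LocalThread;
    extern const Metric LocalThreadActive;
}

int main()
{
    /// The metric pair comes first, then the usual sizing arguments.
    ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, /* max_threads */ 4);
    for (size_t i = 0; i < 16; ++i)
        pool.scheduleOrThrowOnError([] { /* job body */ });
    pool.wait();
}
```
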
diff --git a/src/Common/CurrentMetrics.h b/src/Common/CurrentMetrics.h
index c184ee1e7f2..0ae16e2d08d 100644
--- a/src/Common/CurrentMetrics.h
+++ b/src/Common/CurrentMetrics.h
@@ -4,6 +4,7 @@
#include
#include
#include
+#include
#include
/** Allows to count number of simultaneously happening processes or current value of some metric.
@@ -73,7 +74,10 @@ namespace CurrentMetrics
public:
explicit Increment(Metric metric, Value amount_ = 1)
- : Increment(&values[metric], amount_) {}
+ : Increment(&values[metric], amount_)
+ {
+ assert(metric < CurrentMetrics::end());
+ }
~Increment()
{
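
For readers unfamiliar with `CurrentMetrics::Increment`, it is a RAII guard: the constructor bumps a gauge and the destructor undoes it, so a pool's thread count cannot leak on any exit path. A simplified, standalone sketch of the idea (standard C++ only, not the actual ClickHouse class, which additionally asserts the metric index as in the hunk above):

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr size_t METRICS_END = 2;         /// stand-in for CurrentMetrics::end()
std::atomic<long> values[METRICS_END];    /// stand-in for the metric value array

struct Increment
{
    std::atomic<long> * what;
    long amount;

    explicit Increment(size_t metric, long amount_ = 1)
        : what(&values[metric]), amount(amount_)
    {
        assert(metric < METRICS_END);     /// mirrors the assert added in this diff
        what->fetch_add(amount, std::memory_order_relaxed);
    }

    ~Increment() { what->fetch_sub(amount, std::memory_order_relaxed); }
};

int main()
{
    {
        Increment guard(0);               /// gauge 0 is 1 inside this scope
    }                                     /// and back to 0 here
    return values[0].load() == 0 ? 0 : 1;
}
```
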
diff --git a/src/Common/ErrorCodes.cpp b/src/Common/ErrorCodes.cpp
index 549112f8c84..9abf3bba8ff 100644
--- a/src/Common/ErrorCodes.cpp
+++ b/src/Common/ErrorCodes.cpp
@@ -649,6 +649,7 @@
M(678, IO_URING_INIT_FAILED) \
M(679, IO_URING_SUBMIT_ERROR) \
M(690, MIXED_ACCESS_PARAMETER_TYPES) \
+ M(691, UNKNOWN_ELEMENT_OF_ENUM) \
\
M(999, KEEPER_EXCEPTION) \
M(1000, POCO_EXCEPTION) \
diff --git a/src/Common/ThreadPool.cpp b/src/Common/ThreadPool.cpp
index b742e2df06e..bcf4fce27f8 100644
--- a/src/Common/ThreadPool.cpp
+++ b/src/Common/ThreadPool.cpp
@@ -26,28 +26,37 @@ namespace CurrentMetrics
{
extern const Metric GlobalThread;
extern const Metric GlobalThreadActive;
- extern const Metric LocalThread;
- extern const Metric LocalThreadActive;
}
static constexpr auto DEFAULT_THREAD_NAME = "ThreadPool";
template
-ThreadPoolImpl::ThreadPoolImpl()
- : ThreadPoolImpl(getNumberOfPhysicalCPUCores())
+ThreadPoolImpl::ThreadPoolImpl(Metric metric_threads_, Metric metric_active_threads_)
+ : ThreadPoolImpl(metric_threads_, metric_active_threads_, getNumberOfPhysicalCPUCores())
{
}
template
-ThreadPoolImpl::ThreadPoolImpl(size_t max_threads_)
- : ThreadPoolImpl(max_threads_, max_threads_, max_threads_)
+ThreadPoolImpl::ThreadPoolImpl(
+ Metric metric_threads_,
+ Metric metric_active_threads_,
+ size_t max_threads_)
+ : ThreadPoolImpl(metric_threads_, metric_active_threads_, max_threads_, max_threads_, max_threads_)
{
}
template
-ThreadPoolImpl::ThreadPoolImpl(size_t max_threads_, size_t max_free_threads_, size_t queue_size_, bool shutdown_on_exception_)
- : max_threads(max_threads_)
+ThreadPoolImpl::ThreadPoolImpl(
+ Metric metric_threads_,
+ Metric metric_active_threads_,
+ size_t max_threads_,
+ size_t max_free_threads_,
+ size_t queue_size_,
+ bool shutdown_on_exception_)
+ : metric_threads(metric_threads_)
+ , metric_active_threads(metric_active_threads_)
+ , max_threads(max_threads_)
, max_free_threads(std::min(max_free_threads_, max_threads))
, queue_size(queue_size_ ? std::max(queue_size_, max_threads) : 0 /* zero means the queue is unlimited */)
, shutdown_on_exception(shutdown_on_exception_)
@@ -324,8 +333,7 @@ template
void ThreadPoolImpl::worker(typename std::list::iterator thread_it)
{
DENY_ALLOCATIONS_IN_SCOPE;
- CurrentMetrics::Increment metric_all_threads(
- std::is_same_v ? CurrentMetrics::GlobalThread : CurrentMetrics::LocalThread);
+ CurrentMetrics::Increment metric_pool_threads(metric_threads);
/// Remove this thread from `threads` and detach it, that must be done before exiting from this worker.
/// We can't wrap the following lambda function into `SCOPE_EXIT` because it requires `mutex` to be locked.
@@ -383,8 +391,7 @@ void ThreadPoolImpl::worker(typename std::list::iterator thread_
try
{
- CurrentMetrics::Increment metric_active_threads(
- std::is_same_v ? CurrentMetrics::GlobalThreadActive : CurrentMetrics::LocalThreadActive);
+ CurrentMetrics::Increment metric_active_pool_threads(metric_active_threads);
job();
@@ -458,6 +465,22 @@ template class ThreadFromGlobalPoolImpl;
std::unique_ptr GlobalThreadPool::the_instance;
+
+GlobalThreadPool::GlobalThreadPool(
+ size_t max_threads_,
+ size_t max_free_threads_,
+ size_t queue_size_,
+ const bool shutdown_on_exception_)
+ : FreeThreadPool(
+ CurrentMetrics::GlobalThread,
+ CurrentMetrics::GlobalThreadActive,
+ max_threads_,
+ max_free_threads_,
+ queue_size_,
+ shutdown_on_exception_)
+{
+}
+
void GlobalThreadPool::initialize(size_t max_threads, size_t max_free_threads, size_t queue_size)
{
if (the_instance)
diff --git a/src/Common/ThreadPool.h b/src/Common/ThreadPool.h
index a1ca79a1e4b..b2f77f9693c 100644
--- a/src/Common/ThreadPool.h
+++ b/src/Common/ThreadPool.h
@@ -16,6 +16,7 @@
#include
#include
#include
+#include
#include
/** Very simple thread pool similar to boost::threadpool.
@@ -33,15 +34,25 @@ class ThreadPoolImpl
{
public:
using Job = std::function;
+ using Metric = CurrentMetrics::Metric;
/// Maximum number of threads is based on the number of physical cores.
- ThreadPoolImpl();
+ ThreadPoolImpl(Metric metric_threads_, Metric metric_active_threads_);
/// Size is constant. Up to num_threads are created on demand and then run until shutdown.
- explicit ThreadPoolImpl(size_t max_threads_);
+ explicit ThreadPoolImpl(
+ Metric metric_threads_,
+ Metric metric_active_threads_,
+ size_t max_threads_);
/// queue_size - maximum number of running plus scheduled jobs. It can be greater than max_threads. Zero means unlimited.
- ThreadPoolImpl(size_t max_threads_, size_t max_free_threads_, size_t queue_size_, bool shutdown_on_exception_ = true);
+ ThreadPoolImpl(
+ Metric metric_threads_,
+ Metric metric_active_threads_,
+ size_t max_threads_,
+ size_t max_free_threads_,
+ size_t queue_size_,
+ bool shutdown_on_exception_ = true);
/// Add a new job. Locks until the number of scheduled jobs is less than the maximum or an exception was thrown in one of the threads.
/// If any thread has thrown an exception, the first exception will be rethrown from this method,
@@ -96,6 +107,9 @@ private:
std::condition_variable job_finished;
std::condition_variable new_job_or_shutdown;
+ Metric metric_threads;
+ Metric metric_active_threads;
+
size_t max_threads;
size_t max_free_threads;
size_t queue_size;
@@ -159,12 +173,11 @@ class GlobalThreadPool : public FreeThreadPool, private boost::noncopyable
{
static std::unique_ptr the_instance;
- GlobalThreadPool(size_t max_threads_, size_t max_free_threads_,
- size_t queue_size_, const bool shutdown_on_exception_)
- : FreeThreadPool(max_threads_, max_free_threads_, queue_size_,
- shutdown_on_exception_)
- {
- }
+ GlobalThreadPool(
+ size_t max_threads_,
+ size_t max_free_threads_,
+ size_t queue_size_,
+ bool shutdown_on_exception_);
public:
static void initialize(size_t max_threads = 10000, size_t max_free_threads = 1000, size_t queue_size = 10000);
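
To make the header change concrete: inside `worker()` the first metric is held for the whole lifetime of a pool thread, while the second is held only while a job is actually executing, which is what the two `CurrentMetrics::Increment` guards in the ThreadPool.cpp hunk above do. A standalone sketch of that scoping with plain counters (hypothetical names, standard C++ only):

```cpp
#include <atomic>
#include <functional>

std::atomic<int> pool_threads{0};      /// e.g. StorageS3Threads
std::atomic<int> active_threads{0};    /// e.g. StorageS3ThreadsActive

struct ScopedCount
{
    std::atomic<int> & counter;
    explicit ScopedCount(std::atomic<int> & c) : counter(c) { ++counter; }
    ~ScopedCount() { --counter; }
};

void worker(const std::function<void()> & job)
{
    ScopedCount thread_alive(pool_threads);    /// counted for the thread's whole lifetime
    {
        ScopedCount running(active_threads);   /// counted only while a job runs
        job();
    }
    /// ... a real worker would loop back here and wait for the next job
}

int main()
{
    worker([] { /* job body */ });
    return pool_threads.load() + active_threads.load();   /// both are back to 0
}
```
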
diff --git a/src/Common/examples/parallel_aggregation.cpp b/src/Common/examples/parallel_aggregation.cpp
index bd252b330f3..cf7a3197fef 100644
--- a/src/Common/examples/parallel_aggregation.cpp
+++ b/src/Common/examples/parallel_aggregation.cpp
@@ -17,6 +17,7 @@
#include
#include
+#include
using Key = UInt64;
@@ -28,6 +29,12 @@ using Map = HashMap;
using MapTwoLevel = TwoLevelHashMap;
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
struct SmallLock
{
std::atomic locked {false};
@@ -247,7 +254,7 @@ int main(int argc, char ** argv)
std::cerr << std::fixed << std::setprecision(2);
- ThreadPool pool(num_threads);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads);
Source data(n);
diff --git a/src/Common/examples/parallel_aggregation2.cpp b/src/Common/examples/parallel_aggregation2.cpp
index 6c20f46ab0e..1b0ad760490 100644
--- a/src/Common/examples/parallel_aggregation2.cpp
+++ b/src/Common/examples/parallel_aggregation2.cpp
@@ -17,6 +17,7 @@
#include
#include
+#include
using Key = UInt64;
@@ -24,6 +25,12 @@ using Value = UInt64;
using Source = std::vector;
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
template
struct AggregateIndependent
{
@@ -274,7 +281,7 @@ int main(int argc, char ** argv)
std::cerr << std::fixed << std::setprecision(2);
- ThreadPool pool(num_threads);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads);
Source data(n);
diff --git a/src/Common/examples/thread_creation_latency.cpp b/src/Common/examples/thread_creation_latency.cpp
index 351f709013a..2434759c968 100644
--- a/src/Common/examples/thread_creation_latency.cpp
+++ b/src/Common/examples/thread_creation_latency.cpp
@@ -6,6 +6,7 @@
#include
#include
#include
+#include
int value = 0;
@@ -14,6 +15,12 @@ static void f() { ++value; }
static void * g(void *) { f(); return {}; }
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
namespace DB
{
namespace ErrorCodes
@@ -65,7 +72,7 @@ int main(int argc, char ** argv)
test(n, "Create and destroy ThreadPool each iteration", []
{
- ThreadPool tp(1);
+ ThreadPool tp(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 1);
tp.scheduleOrThrowOnError(f);
tp.wait();
});
@@ -86,7 +93,7 @@ int main(int argc, char ** argv)
});
{
- ThreadPool tp(1);
+ ThreadPool tp(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 1);
test(n, "Schedule job for Threadpool each iteration", [&tp]
{
@@ -96,7 +103,7 @@ int main(int argc, char ** argv)
}
{
- ThreadPool tp(128);
+ ThreadPool tp(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 128);
test(n, "Schedule job for Threadpool with 128 threads each iteration", [&tp]
{
diff --git a/src/Common/tests/gtest_thread_pool_concurrent_wait.cpp b/src/Common/tests/gtest_thread_pool_concurrent_wait.cpp
index f5f14739e39..f93017129dd 100644
--- a/src/Common/tests/gtest_thread_pool_concurrent_wait.cpp
+++ b/src/Common/tests/gtest_thread_pool_concurrent_wait.cpp
@@ -1,4 +1,5 @@
#include
+#include
#include
@@ -7,6 +8,12 @@
*/
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
TEST(ThreadPool, ConcurrentWait)
{
auto worker = []
@@ -18,14 +25,14 @@ TEST(ThreadPool, ConcurrentWait)
constexpr size_t num_threads = 4;
constexpr size_t num_jobs = 4;
- ThreadPool pool(num_threads);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_threads);
for (size_t i = 0; i < num_jobs; ++i)
pool.scheduleOrThrowOnError(worker);
constexpr size_t num_waiting_threads = 4;
- ThreadPool waiting_pool(num_waiting_threads);
+ ThreadPool waiting_pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_waiting_threads);
for (size_t i = 0; i < num_waiting_threads; ++i)
waiting_pool.scheduleOrThrowOnError([&pool] { pool.wait(); });
diff --git a/src/Common/tests/gtest_thread_pool_global_full.cpp b/src/Common/tests/gtest_thread_pool_global_full.cpp
index 583d43be1bb..1b2ded9c7e1 100644
--- a/src/Common/tests/gtest_thread_pool_global_full.cpp
+++ b/src/Common/tests/gtest_thread_pool_global_full.cpp
@@ -2,10 +2,17 @@
#include
#include
+#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
/// Test what happens if local ThreadPool cannot create a ThreadFromGlobalPool.
/// There was a bug: if local ThreadPool cannot allocate even a single thread,
/// the job will be scheduled but never get executed.
@@ -27,7 +34,7 @@ TEST(ThreadPool, GlobalFull1)
auto func = [&] { ++counter; while (counter != num_jobs) {} };
- ThreadPool pool(num_jobs);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, num_jobs);
for (size_t i = 0; i < capacity; ++i)
pool.scheduleOrThrowOnError(func);
@@ -65,11 +72,11 @@ TEST(ThreadPool, GlobalFull2)
std::atomic counter = 0;
auto func = [&] { ++counter; while (counter != capacity + 1) {} };
- ThreadPool pool(capacity, 0, capacity);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, capacity, 0, capacity);
for (size_t i = 0; i < capacity; ++i)
pool.scheduleOrThrowOnError(func);
- ThreadPool another_pool(1);
+ ThreadPool another_pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 1);
EXPECT_THROW(another_pool.scheduleOrThrowOnError(func), DB::Exception);
++counter;
diff --git a/src/Common/tests/gtest_thread_pool_loop.cpp b/src/Common/tests/gtest_thread_pool_loop.cpp
index 15915044652..556c39df949 100644
--- a/src/Common/tests/gtest_thread_pool_loop.cpp
+++ b/src/Common/tests/gtest_thread_pool_loop.cpp
@@ -1,10 +1,17 @@
#include
#include
#include
+#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
TEST(ThreadPool, Loop)
{
std::atomic res{0};
@@ -12,7 +19,7 @@ TEST(ThreadPool, Loop)
for (size_t i = 0; i < 1000; ++i)
{
size_t threads = 16;
- ThreadPool pool(threads);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, threads);
for (size_t j = 0; j < threads; ++j)
pool.scheduleOrThrowOnError([&] { ++res; });
pool.wait();
diff --git a/src/Common/tests/gtest_thread_pool_schedule_exception.cpp b/src/Common/tests/gtest_thread_pool_schedule_exception.cpp
index 69362c34cd2..176e469d5ef 100644
--- a/src/Common/tests/gtest_thread_pool_schedule_exception.cpp
+++ b/src/Common/tests/gtest_thread_pool_schedule_exception.cpp
@@ -1,13 +1,20 @@
#include
#include
#include
+#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric LocalThread;
+ extern const Metric LocalThreadActive;
+}
+
static bool check()
{
- ThreadPool pool(10);
+ ThreadPool pool(CurrentMetrics::LocalThread, CurrentMetrics::LocalThreadActive, 10);
/// The throwing thread.
pool.scheduleOrThrowOnError([] { throw std::runtime_error("Hello, world!"); });
diff --git a/src/Core/ServerSettings.h b/src/Core/ServerSettings.h
index e780424507c..efa41a5ec13 100644
--- a/src/Core/ServerSettings.h
+++ b/src/Core/ServerSettings.h
@@ -29,7 +29,6 @@ namespace DB
M(Int32, max_connections, 1024, "Max server connections.", 0) \
M(UInt32, asynchronous_metrics_update_period_s, 1, "Period in seconds for updating asynchronous metrics.", 0) \
M(UInt32, asynchronous_heavy_metrics_update_period_s, 120, "Period in seconds for updating asynchronous metrics.", 0) \
- M(UInt32, max_threads_for_connection_collector, 10, "The maximum number of threads that will be used for draining connections asynchronously in a background upon finishing executing distributed queries.", 0) \
M(String, default_database, "default", "Default database name.", 0) \
M(String, tmp_policy, "", "Policy for storage with temporary data.", 0) \
M(UInt64, max_temporary_data_on_disk_size, 0, "The maximum amount of storage that could be used for external aggregation, joins or sorting., ", 0) \
diff --git a/src/DataTypes/EnumValues.cpp b/src/DataTypes/EnumValues.cpp
index b0f51a54ccb..9df49e765a7 100644
--- a/src/DataTypes/EnumValues.cpp
+++ b/src/DataTypes/EnumValues.cpp
@@ -10,7 +10,7 @@ namespace ErrorCodes
{
extern const int SYNTAX_ERROR;
extern const int EMPTY_DATA_PASSED;
- extern const int BAD_ARGUMENTS;
+ extern const int UNKNOWN_ELEMENT_OF_ENUM;
}
template
@@ -69,7 +69,7 @@ T EnumValues::getValue(StringRef field_name, bool try_treat_as_id) const
}
auto hints = this->getHints(field_name.toString());
auto hints_string = !hints.empty() ? ", maybe you meant: " + toString(hints) : "";
- throw Exception(ErrorCodes::BAD_ARGUMENTS, "Unknown element '{}' for enum{}", field_name.toString(), hints_string);
+ throw Exception(ErrorCodes::UNKNOWN_ELEMENT_OF_ENUM, "Unknown element '{}' for enum{}", field_name.toString(), hints_string);
}
return it->getMapped();
}
diff --git a/src/Databases/DatabaseOnDisk.cpp b/src/Databases/DatabaseOnDisk.cpp
index 4d9e22bd15d..01afbdcaa57 100644
--- a/src/Databases/DatabaseOnDisk.cpp
+++ b/src/Databases/DatabaseOnDisk.cpp
@@ -17,14 +17,21 @@
#include
#include
#include
+#include
+#include
+#include
#include
#include
-#include
#include
-#include
namespace fs = std::filesystem;
+namespace CurrentMetrics
+{
+ extern const Metric DatabaseOnDiskThreads;
+ extern const Metric DatabaseOnDiskThreadsActive;
+}
+
namespace DB
{
@@ -620,7 +627,7 @@ void DatabaseOnDisk::iterateMetadataFiles(ContextPtr local_context, const Iterat
}
/// Read and parse metadata in parallel
- ThreadPool pool;
+ ThreadPool pool(CurrentMetrics::DatabaseOnDiskThreads, CurrentMetrics::DatabaseOnDiskThreadsActive);
for (const auto & file : metadata_files)
{
pool.scheduleOrThrowOnError([&]()
diff --git a/src/Databases/DatabaseOrdinary.cpp b/src/Databases/DatabaseOrdinary.cpp
index 49250602132..0db16f80656 100644
--- a/src/Databases/DatabaseOrdinary.cpp
+++ b/src/Databases/DatabaseOrdinary.cpp
@@ -25,9 +25,16 @@
#include
#include
#include
+#include
namespace fs = std::filesystem;
+namespace CurrentMetrics
+{
+ extern const Metric DatabaseOrdinaryThreads;
+ extern const Metric DatabaseOrdinaryThreadsActive;
+}
+
namespace DB
{
@@ -99,7 +106,7 @@ void DatabaseOrdinary::loadStoredObjects(
std::atomic dictionaries_processed{0};
std::atomic tables_processed{0};
- ThreadPool pool;
+ ThreadPool pool(CurrentMetrics::DatabaseOrdinaryThreads, CurrentMetrics::DatabaseOrdinaryThreadsActive);
/// We must attach dictionaries before attaching tables
/// because while we're attaching tables we may need to have some dictionaries attached
diff --git a/src/Databases/TablesLoader.cpp b/src/Databases/TablesLoader.cpp
index 5d66f49554d..177b32daa2e 100644
--- a/src/Databases/TablesLoader.cpp
+++ b/src/Databases/TablesLoader.cpp
@@ -8,8 +8,15 @@
#include
#include
#include
+#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric TablesLoaderThreads;
+ extern const Metric TablesLoaderThreadsActive;
+}
+
namespace DB
{
@@ -31,12 +38,13 @@ void logAboutProgress(Poco::Logger * log, size_t processed, size_t total, Atomic
}
TablesLoader::TablesLoader(ContextMutablePtr global_context_, Databases databases_, LoadingStrictnessLevel strictness_mode_)
-: global_context(global_context_)
-, databases(std::move(databases_))
-, strictness_mode(strictness_mode_)
-, referential_dependencies("ReferentialDeps")
-, loading_dependencies("LoadingDeps")
-, all_loading_dependencies("LoadingDeps")
+ : global_context(global_context_)
+ , databases(std::move(databases_))
+ , strictness_mode(strictness_mode_)
+ , referential_dependencies("ReferentialDeps")
+ , loading_dependencies("LoadingDeps")
+ , all_loading_dependencies("LoadingDeps")
+ , pool(CurrentMetrics::TablesLoaderThreads, CurrentMetrics::TablesLoaderThreadsActive)
{
metadata.default_database = global_context->getCurrentDatabase();
log = &Poco::Logger::get("TablesLoader");
diff --git a/src/Dictionaries/CacheDictionaryUpdateQueue.cpp b/src/Dictionaries/CacheDictionaryUpdateQueue.cpp
index 1fdaf10c57c..09d5bed18b8 100644
--- a/src/Dictionaries/CacheDictionaryUpdateQueue.cpp
+++ b/src/Dictionaries/CacheDictionaryUpdateQueue.cpp
@@ -2,8 +2,15 @@
#include
+#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric CacheDictionaryThreads;
+ extern const Metric CacheDictionaryThreadsActive;
+}
+
namespace DB
{
@@ -26,7 +33,7 @@ CacheDictionaryUpdateQueue::CacheDictionaryUpdateQueue(
, configuration(configuration_)
, update_func(std::move(update_func_))
, update_queue(configuration.max_update_queue_size)
- , update_pool(configuration.max_threads_for_updates)
+ , update_pool(CurrentMetrics::CacheDictionaryThreads, CurrentMetrics::CacheDictionaryThreadsActive, configuration.max_threads_for_updates)
{
for (size_t i = 0; i < configuration.max_threads_for_updates; ++i)
update_pool.scheduleOrThrowOnError([this] { updateThreadFunction(); });
diff --git a/src/Dictionaries/HashedDictionary.cpp b/src/Dictionaries/HashedDictionary.cpp
index d6c9ac50dbe..0e5d18363e9 100644
--- a/src/Dictionaries/HashedDictionary.cpp
+++ b/src/Dictionaries/HashedDictionary.cpp
@@ -8,6 +8,7 @@
#include
#include
#include
+#include
#include
@@ -21,6 +22,11 @@
#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric HashedDictionaryThreads;
+ extern const Metric HashedDictionaryThreadsActive;
+}
namespace
{
@@ -60,7 +66,7 @@ public:
explicit ParallelDictionaryLoader(HashedDictionary & dictionary_)
: dictionary(dictionary_)
, shards(dictionary.configuration.shards)
- , pool(shards)
+ , pool(CurrentMetrics::HashedDictionaryThreads, CurrentMetrics::HashedDictionaryThreadsActive, shards)
, shards_queues(shards)
{
UInt64 backlog = dictionary.configuration.shard_load_queue_backlog;
@@ -213,7 +219,7 @@ HashedDictionary::~HashedDictionary()
return;
size_t shards = std::max(configuration.shards, 1);
- ThreadPool pool(shards);
+ ThreadPool pool(CurrentMetrics::HashedDictionaryThreads, CurrentMetrics::HashedDictionaryThreadsActive, shards);
size_t hash_tables_count = 0;
auto schedule_destroy = [&hash_tables_count, &pool](auto & container)
diff --git a/src/Disks/IO/ThreadPoolReader.cpp b/src/Disks/IO/ThreadPoolReader.cpp
index 18b283b0ff3..3a071d13122 100644
--- a/src/Disks/IO/ThreadPoolReader.cpp
+++ b/src/Disks/IO/ThreadPoolReader.cpp
@@ -61,6 +61,8 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric Read;
+ extern const Metric ThreadPoolFSReaderThreads;
+ extern const Metric ThreadPoolFSReaderThreadsActive;
}
@@ -85,7 +87,7 @@ static bool hasBugInPreadV2()
#endif
ThreadPoolReader::ThreadPoolReader(size_t pool_size, size_t queue_size_)
- : pool(pool_size, pool_size, queue_size_)
+ : pool(CurrentMetrics::ThreadPoolFSReaderThreads, CurrentMetrics::ThreadPoolFSReaderThreadsActive, pool_size, pool_size, queue_size_)
{
}
diff --git a/src/Disks/IO/ThreadPoolRemoteFSReader.cpp b/src/Disks/IO/ThreadPoolRemoteFSReader.cpp
index c2d3ee8b53d..1980f57c876 100644
--- a/src/Disks/IO/ThreadPoolRemoteFSReader.cpp
+++ b/src/Disks/IO/ThreadPoolRemoteFSReader.cpp
@@ -26,6 +26,8 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric RemoteRead;
+ extern const Metric ThreadPoolRemoteFSReaderThreads;
+ extern const Metric ThreadPoolRemoteFSReaderThreadsActive;
}
namespace DB
@@ -60,7 +62,7 @@ IAsynchronousReader::Result RemoteFSFileDescriptor::readInto(char * data, size_t
ThreadPoolRemoteFSReader::ThreadPoolRemoteFSReader(size_t pool_size, size_t queue_size_)
- : pool(pool_size, pool_size, queue_size_)
+ : pool(CurrentMetrics::ThreadPoolRemoteFSReaderThreads, CurrentMetrics::ThreadPoolRemoteFSReaderThreadsActive, pool_size, pool_size, queue_size_)
{
}
diff --git a/src/Disks/ObjectStorages/DiskObjectStorage.cpp b/src/Disks/ObjectStorages/DiskObjectStorage.cpp
index 44cb80558af..6143f0620b8 100644
--- a/src/Disks/ObjectStorages/DiskObjectStorage.cpp
+++ b/src/Disks/ObjectStorages/DiskObjectStorage.cpp
@@ -10,6 +10,7 @@
#include
#include
#include
+#include
#include
#include
#include
@@ -17,6 +18,13 @@
#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric DiskObjectStorageAsyncThreads;
+ extern const Metric DiskObjectStorageAsyncThreadsActive;
+}
+
+
namespace DB
{
@@ -38,7 +46,8 @@ class AsyncThreadPoolExecutor : public Executor
public:
AsyncThreadPoolExecutor(const String & name_, int thread_pool_size)
: name(name_)
- , pool(ThreadPool(thread_pool_size)) {}
+ , pool(CurrentMetrics::DiskObjectStorageAsyncThreads, CurrentMetrics::DiskObjectStorageAsyncThreadsActive, thread_pool_size)
+ {}
std::future execute(std::function task) override
{
diff --git a/src/Functions/FunctionsCodingIP.cpp b/src/Functions/FunctionsCodingIP.cpp
index fb54fb951d1..f10ea24f4a6 100644
--- a/src/Functions/FunctionsCodingIP.cpp
+++ b/src/Functions/FunctionsCodingIP.cpp
@@ -1129,7 +1129,9 @@ public:
for (size_t i = 0; i < vec_res.size(); ++i)
{
- vec_res[i] = DB::parseIPv6whole(reinterpret_cast(&vec_src[prev_offset]), reinterpret_cast(buffer));
+ vec_res[i] = DB::parseIPv6whole(reinterpret_cast(&vec_src[prev_offset]),
+ reinterpret_cast(&vec_src[offsets_src[i] - 1]),
+ reinterpret_cast(buffer));
prev_offset = offsets_src[i];
}
diff --git a/src/IO/BackupIOThreadPool.cpp b/src/IO/BackupsIOThreadPool.cpp
similarity index 59%
rename from src/IO/BackupIOThreadPool.cpp
rename to src/IO/BackupsIOThreadPool.cpp
index 067fc54b1ae..0829553945a 100644
--- a/src/IO/BackupIOThreadPool.cpp
+++ b/src/IO/BackupsIOThreadPool.cpp
@@ -1,5 +1,12 @@
#include
-#include "Core/Field.h"
+#include
+#include
+
+namespace CurrentMetrics
+{
+ extern const Metric BackupsIOThreads;
+ extern const Metric BackupsIOThreadsActive;
+}
namespace DB
{
@@ -18,7 +25,13 @@ void BackupsIOThreadPool::initialize(size_t max_threads, size_t max_free_threads
throw Exception(ErrorCodes::LOGICAL_ERROR, "The BackupsIO thread pool is initialized twice");
}
- instance = std::make_unique(max_threads, max_free_threads, queue_size, false /*shutdown_on_exception*/);
+ instance = std::make_unique(
+ CurrentMetrics::BackupsIOThreads,
+ CurrentMetrics::BackupsIOThreadsActive,
+ max_threads,
+ max_free_threads,
+ queue_size,
+ /* shutdown_on_exception= */ false);
}
ThreadPool & BackupsIOThreadPool::get()
diff --git a/src/IO/IOThreadPool.cpp b/src/IO/IOThreadPool.cpp
index 4014d00d8b8..98bb6ffe6a7 100644
--- a/src/IO/IOThreadPool.cpp
+++ b/src/IO/IOThreadPool.cpp
@@ -1,5 +1,12 @@
#include
-#include "Core/Field.h"
+#include
+#include
+
+namespace CurrentMetrics
+{
+ extern const Metric IOThreads;
+ extern const Metric IOThreadsActive;
+}
namespace DB
{
@@ -18,7 +25,13 @@ void IOThreadPool::initialize(size_t max_threads, size_t max_free_threads, size_
throw Exception(ErrorCodes::LOGICAL_ERROR, "The IO thread pool is initialized twice");
}
- instance = std::make_unique(max_threads, max_free_threads, queue_size, false /*shutdown_on_exception*/);
+ instance = std::make_unique(
+ CurrentMetrics::IOThreads,
+ CurrentMetrics::IOThreadsActive,
+ max_threads,
+ max_free_threads,
+ queue_size,
+ /* shutdown_on_exception= */ false);
}
ThreadPool & IOThreadPool::get()
diff --git a/src/Interpreters/Aggregator.cpp b/src/Interpreters/Aggregator.cpp
index b0bcea23449..d6fbf072d05 100644
--- a/src/Interpreters/Aggregator.cpp
+++ b/src/Interpreters/Aggregator.cpp
@@ -24,6 +24,7 @@
#include
#include
#include
+#include
#include
#include
#include
@@ -59,6 +60,8 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric TemporaryFilesForAggregation;
+ extern const Metric AggregatorThreads;
+ extern const Metric AggregatorThreadsActive;
}
namespace DB
@@ -2397,7 +2400,7 @@ BlocksList Aggregator::convertToBlocks(AggregatedDataVariants & data_variants, b
std::unique_ptr thread_pool;
if (max_threads > 1 && data_variants.sizeWithoutOverflowRow() > 100000 /// TODO Make a custom threshold.
&& data_variants.isTwoLevel()) /// TODO Use the shared thread pool with the `merge` function.
- thread_pool = std::make_unique(max_threads);
+ thread_pool = std::make_unique(CurrentMetrics::AggregatorThreads, CurrentMetrics::AggregatorThreadsActive, max_threads);
if (data_variants.without_key)
blocks.emplace_back(prepareBlockAndFillWithoutKey(
@@ -2592,7 +2595,7 @@ void NO_INLINE Aggregator::mergeDataOnlyExistingKeysImpl(
void NO_INLINE Aggregator::mergeWithoutKeyDataImpl(
ManyAggregatedDataVariants & non_empty_data) const
{
- ThreadPool thread_pool{params.max_threads};
+ ThreadPool thread_pool{CurrentMetrics::AggregatorThreads, CurrentMetrics::AggregatorThreadsActive, params.max_threads};
AggregatedDataVariantsPtr & res = non_empty_data[0];
@@ -3065,7 +3068,7 @@ void Aggregator::mergeBlocks(BucketToBlocks bucket_to_blocks, AggregatedDataVari
std::unique_ptr thread_pool;
if (max_threads > 1 && total_input_rows > 100000) /// TODO Make a custom threshold.
- thread_pool = std::make_unique(max_threads);
+ thread_pool = std::make_unique(CurrentMetrics::AggregatorThreads, CurrentMetrics::AggregatorThreadsActive, max_threads);
for (const auto & bucket_blocks : bucket_to_blocks)
{
diff --git a/src/Interpreters/AsynchronousInsertQueue.cpp b/src/Interpreters/AsynchronousInsertQueue.cpp
index 590cbc9ba83..2dd2409442e 100644
--- a/src/Interpreters/AsynchronousInsertQueue.cpp
+++ b/src/Interpreters/AsynchronousInsertQueue.cpp
@@ -32,6 +32,8 @@
namespace CurrentMetrics
{
extern const Metric PendingAsyncInsert;
+ extern const Metric AsynchronousInsertThreads;
+ extern const Metric AsynchronousInsertThreadsActive;
}
namespace ProfileEvents
@@ -130,7 +132,7 @@ AsynchronousInsertQueue::AsynchronousInsertQueue(ContextPtr context_, size_t poo
: WithContext(context_)
, pool_size(pool_size_)
, queue_shards(pool_size)
- , pool(pool_size)
+ , pool(CurrentMetrics::AsynchronousInsertThreads, CurrentMetrics::AsynchronousInsertThreadsActive, pool_size)
{
if (!pool_size)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "pool_size cannot be zero");
diff --git a/src/Interpreters/DDLWorker.cpp b/src/Interpreters/DDLWorker.cpp
index f70c0010585..22bece0ef04 100644
--- a/src/Interpreters/DDLWorker.cpp
+++ b/src/Interpreters/DDLWorker.cpp
@@ -40,6 +40,12 @@
namespace fs = std::filesystem;
+namespace CurrentMetrics
+{
+ extern const Metric DDLWorkerThreads;
+ extern const Metric DDLWorkerThreadsActive;
+}
+
namespace DB
{
@@ -85,7 +91,7 @@ DDLWorker::DDLWorker(
{
LOG_WARNING(log, "DDLWorker is configured to use multiple threads. "
"It's not recommended because queries can be reordered. Also it may cause some unknown issues to appear.");
- worker_pool = std::make_unique(pool_size);
+ worker_pool = std::make_unique(CurrentMetrics::DDLWorkerThreads, CurrentMetrics::DDLWorkerThreadsActive, pool_size);
}
queue_dir = zk_root_dir;
@@ -1084,7 +1090,7 @@ void DDLWorker::runMainThread()
/// It will wait for all threads in the pool to finish and will not rethrow exceptions (if any).
/// We create a new thread pool to forget previous exceptions.
if (1 < pool_size)
- worker_pool = std::make_unique(pool_size);
+ worker_pool = std::make_unique(CurrentMetrics::DDLWorkerThreads, CurrentMetrics::DDLWorkerThreadsActive, pool_size);
/// Clear other in-memory state, like server just started.
current_tasks.clear();
last_skipped_entry_name.reset();
@@ -1123,7 +1129,7 @@ void DDLWorker::runMainThread()
initialized = false;
/// Wait for pending async tasks
if (1 < pool_size)
- worker_pool = std::make_unique(pool_size);
+ worker_pool = std::make_unique(CurrentMetrics::DDLWorkerThreads, CurrentMetrics::DDLWorkerThreadsActive, pool_size);
LOG_INFO(log, "Lost ZooKeeper connection, will try to connect again: {}", getCurrentExceptionMessage(true));
}
else
diff --git a/src/Interpreters/DDLWorker.h b/src/Interpreters/DDLWorker.h
index 65ef4b440a1..6cf034edae8 100644
--- a/src/Interpreters/DDLWorker.h
+++ b/src/Interpreters/DDLWorker.h
@@ -1,6 +1,7 @@
#pragma once
#include
+#include
#include
#include
#include
diff --git a/src/Interpreters/DatabaseCatalog.cpp b/src/Interpreters/DatabaseCatalog.cpp
index ac9b8615fe5..f37e41614b0 100644
--- a/src/Interpreters/DatabaseCatalog.cpp
+++ b/src/Interpreters/DatabaseCatalog.cpp
@@ -38,6 +38,8 @@
namespace CurrentMetrics
{
extern const Metric TablesToDropQueueSize;
+ extern const Metric DatabaseCatalogThreads;
+ extern const Metric DatabaseCatalogThreadsActive;
}
namespace DB
@@ -852,7 +854,7 @@ void DatabaseCatalog::loadMarkedAsDroppedTables()
LOG_INFO(log, "Found {} partially dropped tables. Will load them and retry removal.", dropped_metadata.size());
- ThreadPool pool;
+ ThreadPool pool(CurrentMetrics::DatabaseCatalogThreads, CurrentMetrics::DatabaseCatalogThreadsActive);
for (const auto & elem : dropped_metadata)
{
pool.scheduleOrThrowOnError([&]()
diff --git a/src/Interpreters/InterpreterExplainQuery.cpp b/src/Interpreters/InterpreterExplainQuery.cpp
index 3c225522cc4..3a381cd8dab 100644
--- a/src/Interpreters/InterpreterExplainQuery.cpp
+++ b/src/Interpreters/InterpreterExplainQuery.cpp
@@ -41,6 +41,7 @@ namespace ErrorCodes
extern const int INVALID_SETTING_VALUE;
extern const int UNKNOWN_SETTING;
extern const int LOGICAL_ERROR;
+ extern const int NOT_IMPLEMENTED;
}
namespace
@@ -386,6 +387,10 @@ QueryPipeline InterpreterExplainQuery::executeImpl()
}
case ASTExplainQuery::QueryTree:
{
+ if (!getContext()->getSettingsRef().allow_experimental_analyzer)
+ throw Exception(ErrorCodes::NOT_IMPLEMENTED,
+ "EXPLAIN QUERY TREE is only supported with the new analyzer. Set allow_experimental_analyzer = 1.");
+
if (ast.getExplainedQuery()->as() == nullptr)
throw Exception(ErrorCodes::INCORRECT_QUERY, "Only SELECT is supported for EXPLAIN QUERY TREE query");
diff --git a/src/Interpreters/InterpreterShowEngineQuery.cpp b/src/Interpreters/InterpreterShowEngineQuery.cpp
index 5aae6ad5d28..8fd829f39ec 100644
--- a/src/Interpreters/InterpreterShowEngineQuery.cpp
+++ b/src/Interpreters/InterpreterShowEngineQuery.cpp
@@ -12,7 +12,7 @@ namespace DB
BlockIO InterpreterShowEnginesQuery::execute()
{
- return executeQuery("SELECT * FROM system.table_engines", getContext(), true);
+ return executeQuery("SELECT * FROM system.table_engines ORDER BY name", getContext(), true);
}
}
diff --git a/src/Interpreters/InterpreterShowTablesQuery.cpp b/src/Interpreters/InterpreterShowTablesQuery.cpp
index 4e0dfdc9236..a631cb72722 100644
--- a/src/Interpreters/InterpreterShowTablesQuery.cpp
+++ b/src/Interpreters/InterpreterShowTablesQuery.cpp
@@ -28,7 +28,6 @@ InterpreterShowTablesQuery::InterpreterShowTablesQuery(const ASTPtr & query_ptr_
{
}
-
String InterpreterShowTablesQuery::getRewrittenQuery()
{
const auto & query = query_ptr->as();
@@ -51,6 +50,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
if (query.limit_length)
rewritten_query << " LIMIT " << query.limit_length;
+ /// (*)
+ rewritten_query << " ORDER BY name";
+
return rewritten_query.str();
}
@@ -69,6 +71,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
<< DB::quote << query.like;
}
+ /// (*)
+ rewritten_query << " ORDER BY cluster";
+
if (query.limit_length)
rewritten_query << " LIMIT " << query.limit_length;
@@ -81,6 +86,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
rewritten_query << " WHERE cluster = " << DB::quote << query.cluster_str;
+ /// (*)
+ rewritten_query << " ORDER BY cluster";
+
return rewritten_query.str();
}
@@ -101,6 +109,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
<< DB::quote << query.like;
}
+ /// (*)
+ rewritten_query << " ORDER BY name, type, value ";
+
return rewritten_query.str();
}
@@ -146,6 +157,9 @@ String InterpreterShowTablesQuery::getRewrittenQuery()
else if (query.where_expression)
rewritten_query << " AND (" << query.where_expression << ")";
+ /// (*)
+ rewritten_query << " ORDER BY name ";
+
if (query.limit_length)
rewritten_query << " LIMIT " << query.limit_length;
@@ -176,5 +190,8 @@ BlockIO InterpreterShowTablesQuery::execute()
return executeQuery(getRewrittenQuery(), getContext(), true);
}
+/// (*) Sorting is, strictly speaking, not necessary, but 1. it is convenient for users, 2. SQL currently does not allow
+/// sorting the output of SHOW in any other way (SELECT * FROM (SHOW ...) ORDER BY ... is rejected), and 3. some
+/// SQL tests can take advantage of this.
}
diff --git a/src/Interpreters/InterpreterSystemQuery.cpp b/src/Interpreters/InterpreterSystemQuery.cpp
index fb6b1635f28..cd3e5579e12 100644
--- a/src/Interpreters/InterpreterSystemQuery.cpp
+++ b/src/Interpreters/InterpreterSystemQuery.cpp
@@ -7,6 +7,7 @@
#include
#include
#include
+#include
#include
#include
#include
@@ -64,6 +65,12 @@
#include "config.h"
+namespace CurrentMetrics
+{
+ extern const Metric RestartReplicaThreads;
+ extern const Metric RestartReplicaThreadsActive;
+}
+
namespace DB
{
@@ -685,7 +692,8 @@ void InterpreterSystemQuery::restartReplicas(ContextMutablePtr system_context)
for (auto & guard : guards)
guard.second = catalog.getDDLGuard(guard.first.database_name, guard.first.table_name);
- ThreadPool pool(std::min(static_cast(getNumberOfPhysicalCPUCores()), replica_names.size()));
+ size_t threads = std::min(static_cast(getNumberOfPhysicalCPUCores()), replica_names.size());
+ ThreadPool pool(CurrentMetrics::RestartReplicaThreads, CurrentMetrics::RestartReplicaThreadsActive, threads);
for (auto & replica : replica_names)
{
diff --git a/src/Interpreters/executeQuery.cpp b/src/Interpreters/executeQuery.cpp
index 74394879191..e60510dc5f3 100644
--- a/src/Interpreters/executeQuery.cpp
+++ b/src/Interpreters/executeQuery.cpp
@@ -1223,18 +1223,37 @@ void executeQuery(
end = begin + parse_buf.size();
}
- ASTPtr ast;
- BlockIO streams;
-
- std::tie(ast, streams) = executeQueryImpl(begin, end, context, false, QueryProcessingStage::Complete, &istr);
- auto & pipeline = streams.pipeline;
-
QueryResultDetails result_details
{
.query_id = context->getClientInfo().current_query_id,
.timezone = DateLUT::instance().getTimeZone(),
};
+ /// Set the result details in case of any exception raised during query execution
+ SCOPE_EXIT({
+ if (set_result_details == nullptr)
+ /// Either the result details have already been set in the flow below, or the caller of this function did not provide this callback
+ return;
+
+ try
+ {
+ set_result_details(result_details);
+ }
+ catch (...)
+ {
+ /// This exception can be ignored: if the code reaches this point, an exception
+ /// was already raised during query execution, and that exception will be
+ /// propagated to the outer caller, so there is no need to report the exception
+ /// thrown here.
+ }
+ });
+
+ ASTPtr ast;
+ BlockIO streams;
+
+ std::tie(ast, streams) = executeQueryImpl(begin, end, context, false, QueryProcessingStage::Complete, &istr);
+ auto & pipeline = streams.pipeline;
+
std::unique_ptr compressed_buffer;
try
{
@@ -1304,7 +1323,15 @@ void executeQuery(
}
if (set_result_details)
- set_result_details(result_details);
+ {
+ /// The call to set_result_details itself might throw an exception, in which case
+ /// there is no need to call it again in the SCOPE_EXIT defined above.
+ /// So the callback is cleared before it is invoked.
+ auto set_result_details_copy = set_result_details;
+ set_result_details = nullptr;
+
+ set_result_details_copy(result_details);
+ }
if (pipeline.initialized())
{
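
The control flow here is subtle enough to deserve a distilled example: a scope-exit guard guarantees that the result details are reported even if query execution throws, and the normal path clears the callback before invoking it so that a throwing callback is not invoked a second time from the guard. A standalone sketch of the pattern under those assumptions (standard C++, hypothetical names, not ClickHouse's actual `SCOPE_EXIT` macro):

```cpp
#include <functional>
#include <iostream>
#include <utility>

/// Minimal stand-in for SCOPE_EXIT: runs a callable from its destructor.
template <typename F>
struct ScopeExit
{
    F fn;
    ~ScopeExit() { fn(); }
};
template <typename F> ScopeExit(F) -> ScopeExit<F>;

void executeSomething(std::function<void(int)> set_result_details)
{
    int result_details = 42;

    /// Report the details even if an exception escapes the code below.
    ScopeExit guard{[&]
    {
        if (!set_result_details)
            return;                                /// already reported, or never requested
        try { set_result_details(result_details); }
        catch (...) { /* already unwinding, so swallow */ }
    }};

    /// ... query execution would happen here and may throw ...

    if (set_result_details)
    {
        /// Clear the callback before calling it so that, if the call itself throws,
        /// the guard above does not invoke it a second time.
        auto to_call = std::move(set_result_details);
        set_result_details = nullptr;
        to_call(result_details);
    }
}

int main()
{
    executeSomething([](int details) { std::cout << "result details: " << details << '\n'; });
}
```
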
diff --git a/src/Interpreters/loadMetadata.cpp b/src/Interpreters/loadMetadata.cpp
index 47ccff57419..83af2684322 100644
--- a/src/Interpreters/loadMetadata.cpp
+++ b/src/Interpreters/loadMetadata.cpp
@@ -16,16 +16,24 @@
#include
#include
-#include
+#include
#include
-#include
#include
+#include
+
+#include
#define ORDINARY_TO_ATOMIC_PREFIX ".tmp_convert."
namespace fs = std::filesystem;
+namespace CurrentMetrics
+{
+ extern const Metric StartupSystemTablesThreads;
+ extern const Metric StartupSystemTablesThreadsActive;
+}
+
namespace DB
{
@@ -366,7 +374,7 @@ static void maybeConvertOrdinaryDatabaseToAtomic(ContextMutablePtr context, cons
if (!tables_started)
{
/// It's not quite correct to run DDL queries while database is not started up.
- ThreadPool pool;
+ ThreadPool pool(CurrentMetrics::StartupSystemTablesThreads, CurrentMetrics::StartupSystemTablesThreadsActive);
DatabaseCatalog::instance().getSystemDatabase()->startupTables(pool, LoadingStrictnessLevel::FORCE_RESTORE);
}
@@ -461,7 +469,7 @@ void convertDatabasesEnginesIfNeed(ContextMutablePtr context)
void startupSystemTables()
{
- ThreadPool pool;
+ ThreadPool pool(CurrentMetrics::StartupSystemTablesThreads, CurrentMetrics::StartupSystemTablesThreadsActive);
DatabaseCatalog::instance().getSystemDatabase()->startupTables(pool, LoadingStrictnessLevel::FORCE_RESTORE);
}
diff --git a/src/Parsers/IParser.h b/src/Parsers/IParser.h
index dbd292bcd9f..d53b58baa7c 100644
--- a/src/Parsers/IParser.h
+++ b/src/Parsers/IParser.h
@@ -1,6 +1,7 @@
#pragma once
#include
+#include
#include
#include
diff --git a/src/Processors/Formats/IRowInputFormat.cpp b/src/Processors/Formats/IRowInputFormat.cpp
index 81c818e3334..948e6f4cb82 100644
--- a/src/Processors/Formats/IRowInputFormat.cpp
+++ b/src/Processors/Formats/IRowInputFormat.cpp
@@ -28,6 +28,7 @@ namespace ErrorCodes
extern const int CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING;
extern const int CANNOT_PARSE_IPV4;
extern const int CANNOT_PARSE_IPV6;
+ extern const int UNKNOWN_ELEMENT_OF_ENUM;
}
@@ -48,7 +49,8 @@ bool isParseError(int code)
|| code == ErrorCodes::INCORRECT_DATA /// For some ReadHelpers
|| code == ErrorCodes::CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING
|| code == ErrorCodes::CANNOT_PARSE_IPV4
- || code == ErrorCodes::CANNOT_PARSE_IPV6;
+ || code == ErrorCodes::CANNOT_PARSE_IPV6
+ || code == ErrorCodes::UNKNOWN_ELEMENT_OF_ENUM;
}
IRowInputFormat::IRowInputFormat(Block header, ReadBuffer & in_, Params params_)
diff --git a/src/Processors/Formats/Impl/AvroRowOutputFormat.cpp b/src/Processors/Formats/Impl/AvroRowOutputFormat.cpp
index cd1c1b096aa..c743b2c1766 100644
--- a/src/Processors/Formats/Impl/AvroRowOutputFormat.cpp
+++ b/src/Processors/Formats/Impl/AvroRowOutputFormat.cpp
@@ -565,6 +565,11 @@ void AvroRowOutputFormat::write(const Columns & columns, size_t row_num)
void AvroRowOutputFormat::finalizeImpl()
{
+ /// If the file writer wasn't created yet, create it here to write the file prefix/suffix
+ /// even without actual data, so that the file is still a valid Avro file
+ if (!file_writer_ptr)
+ createFileWriter();
+
file_writer_ptr->close();
}
diff --git a/src/Processors/Formats/Impl/ParallelFormattingOutputFormat.h b/src/Processors/Formats/Impl/ParallelFormattingOutputFormat.h
index 7ea19f01e01..790d05e83dd 100644
--- a/src/Processors/Formats/Impl/ParallelFormattingOutputFormat.h
+++ b/src/Processors/Formats/Impl/ParallelFormattingOutputFormat.h
@@ -7,7 +7,8 @@
#include
#include
#include
-#include "IO/WriteBufferFromString.h"
+#include
+#include
#include
#include
#include
@@ -17,6 +18,12 @@
#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric ParallelFormattingOutputFormatThreads;
+ extern const Metric ParallelFormattingOutputFormatThreadsActive;
+}
+
namespace DB
{
@@ -74,7 +81,7 @@ public:
explicit ParallelFormattingOutputFormat(Params params)
: IOutputFormat(params.header, params.out)
, internal_formatter_creator(params.internal_formatter_creator)
- , pool(params.max_threads_for_parallel_formatting)
+ , pool(CurrentMetrics::ParallelFormattingOutputFormatThreads, CurrentMetrics::ParallelFormattingOutputFormatThreadsActive, params.max_threads_for_parallel_formatting)
{
LOG_TEST(&Poco::Logger::get("ParallelFormattingOutputFormat"), "Parallel formatting is being used");
diff --git a/src/Processors/Formats/Impl/ParallelParsingInputFormat.h b/src/Processors/Formats/Impl/ParallelParsingInputFormat.h
index 03fb2d650dc..97df9308dbf 100644
--- a/src/Processors/Formats/Impl/ParallelParsingInputFormat.h
+++ b/src/Processors/Formats/Impl/ParallelParsingInputFormat.h
@@ -5,14 +5,21 @@
#include
#include
#include
+#include
+#include
#include
#include
#include
#include
-#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric ParallelParsingInputFormatThreads;
+ extern const Metric ParallelParsingInputFormatThreadsActive;
+}
+
namespace DB
{
@@ -94,7 +101,7 @@ public:
, min_chunk_bytes(params.min_chunk_bytes)
, max_block_size(params.max_block_size)
, is_server(params.is_server)
- , pool(params.max_threads)
+ , pool(CurrentMetrics::ParallelParsingInputFormatThreads, CurrentMetrics::ParallelParsingInputFormatThreadsActive, params.max_threads)
{
// One unit for each thread, including segmentator and reader, plus a
// couple more units so that the segmentation thread doesn't spuriously
diff --git a/src/Processors/QueryPlan/FilterStep.cpp b/src/Processors/QueryPlan/FilterStep.cpp
index dc837446a96..9137b2d1998 100644
--- a/src/Processors/QueryPlan/FilterStep.cpp
+++ b/src/Processors/QueryPlan/FilterStep.cpp
@@ -14,7 +14,7 @@ static ITransformingStep::Traits getTraits(const ActionsDAGPtr & expression, con
bool preserves_sorting = expression->isSortingPreserved(header, sort_description, remove_filter_column ? filter_column_name : "");
if (remove_filter_column)
{
- preserves_sorting &= find_if(
+ preserves_sorting &= std::find_if(
begin(sort_description),
end(sort_description),
[&](const auto & column_desc) { return column_desc.column_name == filter_column_name; })
diff --git a/src/Processors/QueryPlan/Optimizations/useDataParallelAggregation.cpp b/src/Processors/QueryPlan/Optimizations/useDataParallelAggregation.cpp
index 13a749bd0b6..dc801d10514 100644
--- a/src/Processors/QueryPlan/Optimizations/useDataParallelAggregation.cpp
+++ b/src/Processors/QueryPlan/Optimizations/useDataParallelAggregation.cpp
@@ -153,6 +153,10 @@ bool isPartitionKeySuitsGroupByKey(
if (group_by_actions->hasArrayJoin() || group_by_actions->hasStatefulFunctions() || group_by_actions->hasNonDeterministic())
return false;
+
+ /// We are interested only in the calculations required to obtain the GROUP BY keys (and not, for example, aggregate function arguments).
+ group_by_actions->removeUnusedActions(aggregating.getParams().keys);
+
const auto & gb_key_required_columns = group_by_actions->getRequiredColumnsNames();
const auto & partition_actions = reading.getStorageMetadata()->getPartitionKey().expression->getActionsDAG();
@@ -202,7 +206,7 @@ size_t tryAggregatePartitionsIndependently(QueryPlan::Node * node, QueryPlan::No
return 0;
if (!reading->willOutputEachPartitionThroughSeparatePort()
- && isPartitionKeySuitsGroupByKey(*reading, expression_step->getExpression(), *aggregating_step))
+ && isPartitionKeySuitsGroupByKey(*reading, expression_step->getExpression()->clone(), *aggregating_step))
{
if (reading->requestOutputEachPartitionThroughSeparatePort())
aggregating_step->skipMerging();
diff --git a/src/Processors/Transforms/AggregatingTransform.h b/src/Processors/Transforms/AggregatingTransform.h
index 3abd2ac3346..048b69adae6 100644
--- a/src/Processors/Transforms/AggregatingTransform.h
+++ b/src/Processors/Transforms/AggregatingTransform.h
@@ -6,6 +6,13 @@
#include
#include
#include
+#include
+
+namespace CurrentMetrics
+{
+ extern const Metric DestroyAggregatesThreads;
+ extern const Metric DestroyAggregatesThreadsActive;
+}
namespace DB
{
@@ -84,7 +91,10 @@ struct ManyAggregatedData
// Aggregation states destruction may be very time-consuming.
// In the case of a query with LIMIT, most states won't be destroyed during conversion to blocks.
// Without the following code, they would be destroyed in the destructor of AggregatedDataVariants in the current thread (i.e. sequentially).
- const auto pool = std::make_unique(variants.size());
+ const auto pool = std::make_unique(
+ CurrentMetrics::DestroyAggregatesThreads,
+ CurrentMetrics::DestroyAggregatesThreadsActive,
+ variants.size());
for (auto && variant : variants)
{
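
The comment above describes the trick rather tersely: instead of destroying many large aggregation states sequentially in one destructor, ownership of each state is handed to a job on a pool and the states are freed concurrently. A standalone sketch of the same pattern with `std::thread` (hypothetical types, not the actual ClickHouse executor):

```cpp
#include <memory>
#include <thread>
#include <vector>

struct BigState
{
    std::vector<int> payload = std::vector<int>(1'000'000, 1);   /// expensive to free
};

int main()
{
    std::vector<std::unique_ptr<BigState>> variants(8);
    for (auto & variant : variants)
        variant = std::make_unique<BigState>();

    /// Move each state into its own job; destruction then happens concurrently
    /// instead of sequentially in the owning thread.
    std::vector<std::thread> workers;
    for (auto & variant : variants)
        workers.emplace_back([state = std::move(variant)]() mutable { state.reset(); });

    for (auto & worker : workers)
        worker.join();
}
```
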
diff --git a/src/Storages/Hive/StorageHive.cpp b/src/Storages/Hive/StorageHive.cpp
index 85e6341eb5a..71830e8bc86 100644
--- a/src/Storages/Hive/StorageHive.cpp
+++ b/src/Storages/Hive/StorageHive.cpp
@@ -7,6 +7,7 @@
#include
#include
#include
+#include
#include
#include
@@ -40,6 +41,11 @@
#include
#include
+namespace CurrentMetrics
+{
+ extern const Metric StorageHiveThreads;
+ extern const Metric StorageHiveThreadsActive;
+}
namespace DB
{
@@ -844,7 +850,7 @@ HiveFiles StorageHive::collectHiveFiles(
Int64 hive_max_query_partitions = context_->getSettings().max_partitions_to_read;
/// Mutex to protect hive_files, which may be appended to from multiple threads
std::mutex hive_files_mutex;
- ThreadPool pool{max_threads};
+ ThreadPool pool{CurrentMetrics::StorageHiveThreads, CurrentMetrics::StorageHiveThreadsActive, max_threads};
if (!partitions.empty())
{
for (const auto & partition : partitions)
diff --git a/src/Storages/MeiliSearch/StorageMeiliSearch.cpp b/src/Storages/MeiliSearch/StorageMeiliSearch.cpp
index 62a6c471070..fe8cc74f577 100644
--- a/src/Storages/MeiliSearch/StorageMeiliSearch.cpp
+++ b/src/Storages/MeiliSearch/StorageMeiliSearch.cpp
@@ -99,7 +99,7 @@ Pipe StorageMeiliSearch::read(
for (const auto & el : query_params->children)
{
auto str = el->getColumnName();
- auto it = find(str.begin(), str.end(), '=');
+ auto it = std::find(str.begin(), str.end(), '=');
if (it == str.end())
throw Exception(ErrorCodes::BAD_QUERY_PARAMETER, "meiliMatch function must have parameters of the form \'key=value\'");
diff --git a/src/Storages/MergeTree/MergeTreeBackgroundExecutor.h b/src/Storages/MergeTree/MergeTreeBackgroundExecutor.h
index 5c47d20865b..a27fb18c0fe 100644
--- a/src/Storages/MergeTree/MergeTreeBackgroundExecutor.h
+++ b/src/Storages/MergeTree/MergeTreeBackgroundExecutor.h
@@ -21,6 +21,12 @@
#include
+namespace CurrentMetrics
+{
+ extern const Metric MergeTreeBackgroundExecutorThreads;
+ extern const Metric MergeTreeBackgroundExecutorThreadsActive;
+}
+
namespace DB
{
namespace ErrorCodes
@@ -255,6 +261,7 @@ public:
, max_tasks_count(max_tasks_count_)
, metric(metric_)
, max_tasks_metric(max_tasks_metric_, 2 * max_tasks_count) // active + pending
+ , pool(CurrentMetrics::MergeTreeBackgroundExecutorThreads, CurrentMetrics::MergeTreeBackgroundExecutorThreadsActive)
{
if (max_tasks_count == 0)
throw Exception(ErrorCodes::INVALID_CONFIG_PARAMETER, "Task count for MergeTreeBackgroundExecutor must not be zero");
diff --git a/src/Storages/MergeTree/MergeTreeData.cpp b/src/Storages/MergeTree/MergeTreeData.cpp
index fef6ccbdaeb..f9848b572f9 100644
--- a/src/Storages/MergeTree/MergeTreeData.cpp
+++ b/src/Storages/MergeTree/MergeTreeData.cpp
@@ -17,6 +17,7 @@
#include
#include
#include
+#include
#include
#include
#include
@@ -117,6 +118,10 @@ namespace ProfileEvents
namespace CurrentMetrics
{
extern const Metric DelayedInserts;
+ extern const Metric MergeTreePartsLoaderThreads;
+ extern const Metric MergeTreePartsLoaderThreadsActive;
+ extern const Metric MergeTreePartsCleanerThreads;
+ extern const Metric MergeTreePartsCleanerThreadsActive;
}
@@ -1567,7 +1572,7 @@ void MergeTreeData::loadDataParts(bool skip_sanity_checks)
}
}
- ThreadPool pool(disks.size());
+ ThreadPool pool(CurrentMetrics::MergeTreePartsLoaderThreads, CurrentMetrics::MergeTreePartsLoaderThreadsActive, disks.size());
std::vector parts_to_load_by_disk(disks.size());
for (size_t i = 0; i < disks.size(); ++i)
@@ -2299,7 +2304,7 @@ void MergeTreeData::clearPartsFromFilesystemImpl(const DataPartsVector & parts_t
/// Parallel parts removal.
size_t num_threads = std::min(settings->max_part_removal_threads, parts_to_remove.size());
std::mutex part_names_mutex;
- ThreadPool pool(num_threads);
+ ThreadPool pool(CurrentMetrics::MergeTreePartsCleanerThreads, CurrentMetrics::MergeTreePartsCleanerThreadsActive, num_threads);
bool has_zero_copy_parts = false;
if (settings->allow_remote_fs_zero_copy_replication && dynamic_cast(this) != nullptr)
diff --git a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp
index ff8862f0f36..2039912106c 100644
--- a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp
+++ b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp
@@ -35,6 +35,7 @@
#include
#include