Merge branch 'master' into zvonand-issue-49290

This commit is contained in:
Andrey Zvonov 2023-06-14 14:22:32 +02:00 committed by GitHub
commit 2f572b7211
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1023 changed files with 29813 additions and 10848 deletions

View File

@ -74,6 +74,7 @@ ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
DerivePointerAlignment: false
DisableFormat: false
IndentRequiresClause: false
IndentWidth: 4
IndentWrappedFunctionNames: false
MacroBlockBegin: ''

View File

@ -46,7 +46,12 @@ jobs:
- name: Python unit tests
run: |
cd "$GITHUB_WORKSPACE/tests/ci"
echo "Testing the main ci directory"
python3 -m unittest discover -s . -p '*_test.py'
for dir in *_lambda/; do
echo "Testing $dir"
python3 -m unittest discover -s "$dir" -p '*_test.py'
done
DockerHubPushAarch64:
needs: CheckLabels
runs-on: [self-hosted, style-checker-aarch64]

13
.gitmodules vendored
View File

@ -35,10 +35,9 @@
[submodule "contrib/unixodbc"]
path = contrib/unixodbc
url = https://github.com/ClickHouse/UnixODBC
[submodule "contrib/protobuf"]
path = contrib/protobuf
url = https://github.com/ClickHouse/protobuf
branch = v3.13.0.1
[submodule "contrib/google-protobuf"]
path = contrib/google-protobuf
url = https://github.com/ClickHouse/google-protobuf.git
[submodule "contrib/boost"]
path = contrib/boost
url = https://github.com/ClickHouse/boost
@ -268,9 +267,6 @@
[submodule "contrib/vectorscan"]
path = contrib/vectorscan
url = https://github.com/VectorCamp/vectorscan.git
[submodule "contrib/c-ares"]
path = contrib/c-ares
url = https://github.com/ClickHouse/c-ares
[submodule "contrib/llvm-project"]
path = contrib/llvm-project
url = https://github.com/ClickHouse/llvm-project
@ -344,3 +340,6 @@
[submodule "contrib/isa-l"]
path = contrib/isa-l
url = https://github.com/ClickHouse/isa-l.git
[submodule "contrib/c-ares"]
path = contrib/c-ares
url = https://github.com/c-ares/c-ares.git

View File

@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v23.5, 2023-06-08](#235)**<br/>
**[ClickHouse release v23.4, 2023-04-26](#234)**<br/>
**[ClickHouse release v23.3 LTS, 2023-03-30](#233)**<br/>
**[ClickHouse release v23.2, 2023-02-23](#232)**<br/>
@ -7,6 +8,258 @@
# 2023 Changelog
### <a id="235"></a> ClickHouse release 23.5, 2023-06-08
#### Upgrade Notes
* Compress marks and primary key by default. It significantly reduces the cold query time. Upgrade notes: the support for compressed marks and primary key has been added in version 22.9. If you turned on compressed marks or primary key or installed version 23.5 or newer, which has compressed marks or primary key on by default, you will not be able to downgrade to version 22.8 or earlier. You can also explicitly disable compressed marks or primary keys by specifying the `compress_marks` and `compress_primary_key` settings in the `<merge_tree>` section of the server configuration file. **Upgrade notes:** If you upgrade from versions prior to 22.9, you should either upgrade all replicas at once or disable the compression before upgrade, or upgrade through an intermediate version, where the compressed marks are supported but not enabled by default, such as 23.3. [#42587](https://github.com/ClickHouse/ClickHouse/pull/42587) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Make local object storage work consistently with s3 object storage, fix problem with append (closes [#48465](https://github.com/ClickHouse/ClickHouse/issues/48465)), make it configurable as independent storage. The change is backward incompatible because the cache on top of local object storage is not compatible to previous versions. [#48791](https://github.com/ClickHouse/ClickHouse/pull/48791) ([Kseniia Sumarokova](https://github.com/kssenii)).
* The experimental feature "in-memory data parts" is removed. The data format is still supported, but the settings are no-op, and compact or wide parts will be used instead. This closes [#45409](https://github.com/ClickHouse/ClickHouse/issues/45409). [#49429](https://github.com/ClickHouse/ClickHouse/pull/49429) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Changed default values of settings `parallelize_output_from_storages` and `input_format_parquet_preserve_order`. This allows ClickHouse to reorder rows when reading from files (e.g. CSV or Parquet), greatly improving performance in many cases. To restore the old behavior of preserving order, use `parallelize_output_from_storages = 0`, `input_format_parquet_preserve_order = 1`. [#49479](https://github.com/ClickHouse/ClickHouse/pull/49479) ([Michael Kolupaev](https://github.com/al13n321)).
* Make projections production-ready. Add the `optimize_use_projections` setting to control whether the projections will be selected for SELECT queries. The setting `allow_experimental_projection_optimization` is obsolete and does nothing. [#49719](https://github.com/ClickHouse/ClickHouse/pull/49719) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Mark `joinGet` as non-deterministic (so as `dictGet`). It allows using them in mutations without an extra setting. [#49843](https://github.com/ClickHouse/ClickHouse/pull/49843) ([Azat Khuzhin](https://github.com/azat)).
* Revert the "`groupArray` returns cannot be nullable" change (due to binary compatibility breakage for `groupArray`/`groupArrayLast`/`groupArraySample` over `Nullable` types, which likely will lead to `TOO_LARGE_ARRAY_SIZE` or `CANNOT_READ_ALL_DATA`). [#49971](https://github.com/ClickHouse/ClickHouse/pull/49971) ([Azat Khuzhin](https://github.com/azat)).
* Setting `enable_memory_bound_merging_of_aggregation_results` is enabled by default. If you update from version prior to 22.12, we recommend to set this flag to `false` until update is finished. [#50319](https://github.com/ClickHouse/ClickHouse/pull/50319) ([Nikita Taranov](https://github.com/nickitat)).
#### New Feature
* Added storage engine AzureBlobStorage and azureBlobStorage table function. The supported set of features is very similar to storage/table function S3 [#50604] (https://github.com/ClickHouse/ClickHouse/pull/50604) ([alesapin](https://github.com/alesapin)) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni).
* Added native ClickHouse Keeper CLI Client, it is available as `clickhouse keeper-client` [#47414](https://github.com/ClickHouse/ClickHouse/pull/47414) ([pufit](https://github.com/pufit)).
* Add `urlCluster` table function. Refactor all *Cluster table functions to reduce code duplication. Make schema inference work for all possible *Cluster function signatures and for named collections. Closes [#38499](https://github.com/ClickHouse/ClickHouse/issues/38499). [#45427](https://github.com/ClickHouse/ClickHouse/pull/45427) ([attack204](https://github.com/attack204)), Pavel Kruglov.
* The query cache can now be used for production workloads. [#47977](https://github.com/ClickHouse/ClickHouse/pull/47977) ([Robert Schulze](https://github.com/rschu1ze)). The query cache can now support queries with totals and extremes modifier. [#48853](https://github.com/ClickHouse/ClickHouse/pull/48853) ([Robert Schulze](https://github.com/rschu1ze)). Make `allow_experimental_query_cache` setting as obsolete for backward-compatibility. It was removed in https://github.com/ClickHouse/ClickHouse/pull/47977. [#49934](https://github.com/ClickHouse/ClickHouse/pull/49934) ([Timur Solodovnikov](https://github.com/tsolodov)).
* Geographical data types (`Point`, `Ring`, `Polygon`, and `MultiPolygon`) are production-ready. [#50022](https://github.com/ClickHouse/ClickHouse/pull/50022) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add schema inference to PostgreSQL, MySQL, MeiliSearch, and SQLite table engines. Closes [#49972](https://github.com/ClickHouse/ClickHouse/issues/49972). [#50000](https://github.com/ClickHouse/ClickHouse/pull/50000) ([Nikolay Degterinsky](https://github.com/evillique)).
* Password type in queries like `CREATE USER u IDENTIFIED BY 'p'` will be automatically set according to the setting `default_password_type` in the `config.xml` on the server. Closes [#42915](https://github.com/ClickHouse/ClickHouse/issues/42915). [#44674](https://github.com/ClickHouse/ClickHouse/pull/44674) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add bcrypt password authentication type. Closes [#34599](https://github.com/ClickHouse/ClickHouse/issues/34599). [#44905](https://github.com/ClickHouse/ClickHouse/pull/44905) ([Nikolay Degterinsky](https://github.com/evillique)).
* Introduces new keyword `INTO OUTFILE 'file.txt' APPEND`. [#48880](https://github.com/ClickHouse/ClickHouse/pull/48880) ([alekar](https://github.com/alekar)).
* Added `system.zookeeper_connection` table that shows information about Keeper connections. [#45245](https://github.com/ClickHouse/ClickHouse/pull/45245) ([mateng915](https://github.com/mateng0915)).
* Add new function `generateRandomStructure` that generates random table structure. It can be used in combination with table function `generateRandom`. [#47409](https://github.com/ClickHouse/ClickHouse/pull/47409) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow the use of `CASE` without an `ELSE` branch and extended `transform` to deal with more types. Also fix some issues that made transform() return incorrect results when decimal types were mixed with other numeric types. [#48300](https://github.com/ClickHouse/ClickHouse/pull/48300) ([Salvatore Mesoraca](https://github.com/aiven-sal)). This closes #2655. This closes #9596. This closes #38666.
* Added [server-side encryption using KMS keys](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html) with S3 tables, and the `header` setting with S3 disks. Closes [#48723](https://github.com/ClickHouse/ClickHouse/issues/48723). [#48724](https://github.com/ClickHouse/ClickHouse/pull/48724) ([Johann Gan](https://github.com/johanngan)).
* Add MemoryTracker for the background tasks (merges and mutation). Introduces `merges_mutations_memory_usage_soft_limit` and `merges_mutations_memory_usage_to_ram_ratio` settings that represent the soft memory limit for merges and mutations. If this limit is reached ClickHouse won't schedule new merge or mutation tasks. Also `MergesMutationsMemoryTracking` metric is introduced to allow observing current memory usage of background tasks. Resubmit [#46089](https://github.com/ClickHouse/ClickHouse/issues/46089). Closes [#48774](https://github.com/ClickHouse/ClickHouse/issues/48774). [#48787](https://github.com/ClickHouse/ClickHouse/pull/48787) ([Dmitry Novik](https://github.com/novikd)).
* Function `dotProduct` work for array. [#49050](https://github.com/ClickHouse/ClickHouse/pull/49050) ([FFFFFFFHHHHHHH](https://github.com/FFFFFFFHHHHHHH)).
* Support statement `SHOW INDEX` to improve compatibility with MySQL. [#49158](https://github.com/ClickHouse/ClickHouse/pull/49158) ([Robert Schulze](https://github.com/rschu1ze)).
* Add virtual column `_file` and `_path` support to table function `url`. - Impove error message for table function `url`. - resolves [#49231](https://github.com/ClickHouse/ClickHouse/issues/49231) - resolves [#49232](https://github.com/ClickHouse/ClickHouse/issues/49232). [#49356](https://github.com/ClickHouse/ClickHouse/pull/49356) ([Ziyi Tan](https://github.com/Ziy1-Tan)).
* Adding the `grants` field in the users.xml file, which allows specifying grants for users. [#49381](https://github.com/ClickHouse/ClickHouse/pull/49381) ([pufit](https://github.com/pufit)).
* Support full/right join by using grace hash join algorithm. [#49483](https://github.com/ClickHouse/ClickHouse/pull/49483) ([lgbo](https://github.com/lgbo-ustc)).
* `WITH FILL` modifier groups filling by sorting prefix. Controlled by `use_with_fill_by_sorting_prefix` setting (enabled by default). Related to [#33203](https://github.com/ClickHouse/ClickHouse/issues/33203)#issuecomment-1418736794. [#49503](https://github.com/ClickHouse/ClickHouse/pull/49503) ([Igor Nikonov](https://github.com/devcrafter)).
* Clickhouse-client now accepts queries after "--multiquery" when "--query" (or "-q") is absent. example: clickhouse-client --multiquery "select 1; select 2;". [#49870](https://github.com/ClickHouse/ClickHouse/pull/49870) ([Alexey Gerasimchuk](https://github.com/Demilivor)).
* Add separate `handshake_timeout` for receiving Hello packet from replica. Closes [#48854](https://github.com/ClickHouse/ClickHouse/issues/48854). [#49948](https://github.com/ClickHouse/ClickHouse/pull/49948) ([Kruglov Pavel](https://github.com/Avogar)).
* Added a function "space" which repeats a space as many times as specified. [#50103](https://github.com/ClickHouse/ClickHouse/pull/50103) ([Robert Schulze](https://github.com/rschu1ze)).
* Added --input_format_csv_trim_whitespaces option. [#50215](https://github.com/ClickHouse/ClickHouse/pull/50215) ([Alexey Gerasimchuk](https://github.com/Demilivor)).
* Allow the `dictGetAll` function for regexp tree dictionaries to return values from multiple matches as arrays. Closes [#50254](https://github.com/ClickHouse/ClickHouse/issues/50254). [#50255](https://github.com/ClickHouse/ClickHouse/pull/50255) ([Johann Gan](https://github.com/johanngan)).
* Added `toLastDayOfWeek` function to round a date or a date with time up to the nearest Saturday or Sunday. [#50315](https://github.com/ClickHouse/ClickHouse/pull/50315) ([Victor Krasnov](https://github.com/sirvickr)).
* Ability to ignore a skip index by specifying `ignore_data_skipping_indices`. [#50329](https://github.com/ClickHouse/ClickHouse/pull/50329) ([Boris Kuschel](https://github.com/bkuschel)).
* Add `system.user_processes` table and `SHOW USER PROCESSES` query to show memory info and ProfileEvents on user level. [#50492](https://github.com/ClickHouse/ClickHouse/pull/50492) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Add server and format settings `display_secrets_in_show_and_select` for displaying secrets of tables, databases, table functions, and dictionaries. Add privilege `displaySecretsInShowAndSelect` controlling which users can view secrets. [#46528](https://github.com/ClickHouse/ClickHouse/pull/46528) ([Mike Kot](https://github.com/myrrc)).
* Allow to set up a ROW POLICY for all tables that belong to a DATABASE. [#47640](https://github.com/ClickHouse/ClickHouse/pull/47640) ([Ilya Golshtein](https://github.com/ilejn)).
#### Performance Improvement
* Compress marks and primary key by default. It significantly reduces the cold query time. Upgrade notes: the support for compressed marks and primary key has been added in version 22.9. If you turned on compressed marks or primary key or installed version 23.5 or newer, which has compressed marks or primary key on by default, you will not be able to downgrade to version 22.8 or earlier. You can also explicitly disable compressed marks or primary keys by specifying the `compress_marks` and `compress_primary_key` settings in the `<merge_tree>` section of the server configuration file. [#42587](https://github.com/ClickHouse/ClickHouse/pull/42587) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* New setting s3_max_inflight_parts_for_one_file sets the limit of concurrently loaded parts with multipart upload request in scope of one file. [#49961](https://github.com/ClickHouse/ClickHouse/pull/49961) ([Sema Checherinda](https://github.com/CheSema)).
* When reading from multiple files reduce parallel parsing threads for each file. Resolves [#42192](https://github.com/ClickHouse/ClickHouse/issues/42192). [#46661](https://github.com/ClickHouse/ClickHouse/pull/46661) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Use aggregate projection only if it reads fewer granules than normal reading. It should help in case if query hits the PK of the table, but not the projection. Fixes [#49150](https://github.com/ClickHouse/ClickHouse/issues/49150). [#49417](https://github.com/ClickHouse/ClickHouse/pull/49417) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not store blocks in `ANY` hash join if nothing is inserted. [#48633](https://github.com/ClickHouse/ClickHouse/pull/48633) ([vdimir](https://github.com/vdimir)).
* Fixes aggregate combinator `-If` when JIT compiled, and enable JIT compilation for aggregate functions. Closes [#48120](https://github.com/ClickHouse/ClickHouse/issues/48120). [#49083](https://github.com/ClickHouse/ClickHouse/pull/49083) ([Igor Nikonov](https://github.com/devcrafter)).
* For reading from remote tables we use smaller tasks (instead of reading the whole part) to make tasks stealing work * task size is determined by size of columns to read * always use 1mb buffers for reading from s3 * boundaries of cache segments aligned to 1mb so they have decent size even with small tasks. it also should prevent fragmentation. [#49287](https://github.com/ClickHouse/ClickHouse/pull/49287) ([Nikita Taranov](https://github.com/nickitat)).
* Introduced settings: - `merge_max_block_size_bytes` to limit the amount of memory used for background operations. - `vertical_merge_algorithm_min_bytes_to_activate` to add another condition to activate vertical merges. [#49313](https://github.com/ClickHouse/ClickHouse/pull/49313) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Default size of a read buffer for reading from local filesystem changed to a slightly better value. Also two new settings are introduced: `max_read_buffer_size_local_fs` and `max_read_buffer_size_remote_fs`. [#49321](https://github.com/ClickHouse/ClickHouse/pull/49321) ([Nikita Taranov](https://github.com/nickitat)).
* Improve memory usage and speed of `SPARSE_HASHED`/`HASHED` dictionaries (e.g. `SPARSE_HASHED` now eats 2.6x less memory, and is ~2x faster). [#49380](https://github.com/ClickHouse/ClickHouse/pull/49380) ([Azat Khuzhin](https://github.com/azat)).
* Optimize the `system.query_log` and `system.query_thread_log` tables by applying `LowCardinality` when appropriate. The queries over these tables will be faster. [#49530](https://github.com/ClickHouse/ClickHouse/pull/49530) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Better performance when reading local `Parquet` files (through parallel reading). [#49539](https://github.com/ClickHouse/ClickHouse/pull/49539) ([Michael Kolupaev](https://github.com/al13n321)).
* Improve the performance of `RIGHT/FULL JOIN` by up to 2 times in certain scenarios, especially when joining a small left table with a large right table. [#49585](https://github.com/ClickHouse/ClickHouse/pull/49585) ([lgbo](https://github.com/lgbo-ustc)).
* Improve performance of BLAKE3 by 11% by enabling LTO for Rust. [#49600](https://github.com/ClickHouse/ClickHouse/pull/49600) ([Azat Khuzhin](https://github.com/azat)). Now it is on par with C++.
* Optimize the structure of the `system.opentelemetry_span_log`. Use `LowCardinality` where appropriate. Although this table is generally stupid (it is using the Map data type even for common attributes), it will be slightly better. [#49647](https://github.com/ClickHouse/ClickHouse/pull/49647) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Try to reserve hash table's size in `grace_hash` join. [#49816](https://github.com/ClickHouse/ClickHouse/pull/49816) ([lgbo](https://github.com/lgbo-ustc)).
* Parallel merge of `uniqExactIf` states. Closes [#49885](https://github.com/ClickHouse/ClickHouse/issues/49885). [#50285](https://github.com/ClickHouse/ClickHouse/pull/50285) ([flynn](https://github.com/ucasfl)).
* Keeper improvement: add `CheckNotExists` request to Keeper, which allows to improve the performance of Replicated tables. [#48897](https://github.com/ClickHouse/ClickHouse/pull/48897) ([Antonio Andelic](https://github.com/antonio2368)).
* Keeper performance improvements: avoid serializing same request twice while processing. Cache deserialization results of large requests. Controlled by new coordination setting `min_request_size_for_cache`. [#49004](https://github.com/ClickHouse/ClickHouse/pull/49004) ([Antonio Andelic](https://github.com/antonio2368)).
* Reduced number of `List` ZooKeeper requests when selecting parts to merge and a lot of partitions do not have anything to merge. [#49637](https://github.com/ClickHouse/ClickHouse/pull/49637) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Rework locking in the FS cache [#44985](https://github.com/ClickHouse/ClickHouse/pull/44985) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Disable pure parallel replicas if trivial count optimization is possible. [#50594](https://github.com/ClickHouse/ClickHouse/pull/50594) ([Raúl Marín](https://github.com/Algunenano)).
* Don't send head request for all keys in Iceberg schema inference, only for keys that are used for reaing data. [#50203](https://github.com/ClickHouse/ClickHouse/pull/50203) ([Kruglov Pavel](https://github.com/Avogar)).
* Setting `enable_memory_bound_merging_of_aggregation_results` is enabled by default. [#50319](https://github.com/ClickHouse/ClickHouse/pull/50319) ([Nikita Taranov](https://github.com/nickitat)).
#### Experimental Feature
* `DEFLATE_QPL` codec lower the minimum simd version to SSE 4.2. [doc change in qpl](https://github.com/intel/qpl/commit/3f8f5cea27739f5261e8fd577dc233ffe88bf679) - Intel® QPL relies on a run-time kernels dispatcher and cpuid check to choose the best available implementation(sse/avx2/avx512) - restructured cmakefile for qpl build in clickhouse to align with latest upstream qpl. [#49811](https://github.com/ClickHouse/ClickHouse/pull/49811) ([jasperzhu](https://github.com/jinjunzh)).
* Add initial support to do JOINs with pure parallel replicas. [#49544](https://github.com/ClickHouse/ClickHouse/pull/49544) ([Raúl Marín](https://github.com/Algunenano)).
* More parallelism on `Outdated` parts removal with "zero-copy replication". [#49630](https://github.com/ClickHouse/ClickHouse/pull/49630) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Parallel Replicas: 1) Fixed an error `NOT_FOUND_COLUMN_IN_BLOCK` in case of using parallel replicas with non-replicated storage with disabled setting `parallel_replicas_for_non_replicated_merge_tree` 2) Now `allow_experimental_parallel_reading_from_replicas` have 3 possible values - 0, 1 and 2. 0 - disabled, 1 - enabled, silently disable them in case of failure (in case of FINAL or JOIN), 2 - enabled, throw an expection in case of failure. 3) If FINAL modifier is used in SELECT query and parallel replicas are enabled, ClickHouse will try to disable them if `allow_experimental_parallel_reading_from_replicas` is set to 1 and throw an exception otherwise. [#50195](https://github.com/ClickHouse/ClickHouse/pull/50195) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* When parallel replicas are enabled they will always skip unavailable servers (the behavior is controlled by the setting `skip_unavailable_shards`, enabled by default and can be only disabled). This closes: [#48565](https://github.com/ClickHouse/ClickHouse/issues/48565). [#50293](https://github.com/ClickHouse/ClickHouse/pull/50293) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
#### Improvement
* The `BACKUP` command will not decrypt data from encrypted disks while making a backup. Instead the data will be stored in a backup in encrypted form. Such backups can be restored only to an encrypted disk with the same (or extended) list of encryption keys. [#48896](https://github.com/ClickHouse/ClickHouse/pull/48896) ([Vitaly Baranov](https://github.com/vitlibar)).
* Added possibility to use temporary tables in FROM part of ATTACH PARTITION FROM and REPLACE PARTITION FROM. [#49436](https://github.com/ClickHouse/ClickHouse/pull/49436) ([Roman Vasin](https://github.com/rvasin)).
* Added setting `async_insert` for `MergeTree` tables. It has the same meaning as query-level setting `async_insert` and enables asynchronous inserts for specific table. Note: it doesn't take effect for insert queries from `clickhouse-client`, use query-level setting in that case. [#49122](https://github.com/ClickHouse/ClickHouse/pull/49122) ([Anton Popov](https://github.com/CurtizJ)).
* Add support for size suffixes in quota creation statement parameters. [#49087](https://github.com/ClickHouse/ClickHouse/pull/49087) ([Eridanus](https://github.com/Eridanus117)).
* Extend `first_value` and `last_value` to accept NULL. [#46467](https://github.com/ClickHouse/ClickHouse/pull/46467) ([lgbo](https://github.com/lgbo-ustc)).
* Add alias `str_to_map` and `mapFromString` for `extractKeyValuePairs`. closes https://github.com/clickhouse/clickhouse/issues/47185. [#49466](https://github.com/ClickHouse/ClickHouse/pull/49466) ([flynn](https://github.com/ucasfl)).
* Add support for CGroup version 2 for asynchronous metrics about the memory usage and availability. This closes [#37983](https://github.com/ClickHouse/ClickHouse/issues/37983). [#45999](https://github.com/ClickHouse/ClickHouse/pull/45999) ([sichenzhao](https://github.com/sichenzhao)).
* Cluster table functions should always skip unavailable shards. close [#46314](https://github.com/ClickHouse/ClickHouse/issues/46314). [#46765](https://github.com/ClickHouse/ClickHouse/pull/46765) ([zk_kiger](https://github.com/zk-kiger)).
* Allow CSV file to contain empty columns in its header. [#47496](https://github.com/ClickHouse/ClickHouse/pull/47496) ([你不要过来啊](https://github.com/iiiuwioajdks)).
* Add Google Cloud Storage S3 compatible table function `gcs`. Like the `oss` and `cosn` functions, it is just an alias over the `s3` table function, and it does not bring any new features. [#47815](https://github.com/ClickHouse/ClickHouse/pull/47815) ([Kuba Kaflik](https://github.com/jkaflik)).
* Add ability to use strict parts size for S3 (compatibility with CloudFlare R2 S3 Storage). [#48492](https://github.com/ClickHouse/ClickHouse/pull/48492) ([Azat Khuzhin](https://github.com/azat)).
* Added new columns with info about `Replicated` database replicas to `system.clusters`: `database_shard_name`, `database_replica_name`, `is_active`. Added an optional `FROM SHARD` clause to `SYSTEM DROP DATABASE REPLICA` query. [#48548](https://github.com/ClickHouse/ClickHouse/pull/48548) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add a new column `zookeeper_name` in system.replicas, to indicate on which (auxiliary) zookeeper cluster the replicated table's metadata is stored. [#48549](https://github.com/ClickHouse/ClickHouse/pull/48549) ([cangyin](https://github.com/cangyin)).
* `IN` operator support the comparison of `Date` and `Date32`. Closes [#48736](https://github.com/ClickHouse/ClickHouse/issues/48736). [#48806](https://github.com/ClickHouse/ClickHouse/pull/48806) ([flynn](https://github.com/ucasfl)).
* Support for erasure codes in `HDFS`, author: @M1eyu2018, @tomscut. [#48833](https://github.com/ClickHouse/ClickHouse/pull/48833) ([M1eyu](https://github.com/M1eyu2018)).
* Implement SYSTEM DROP REPLICA from auxillary ZooKeeper clusters, may be close [#48931](https://github.com/ClickHouse/ClickHouse/issues/48931). [#48932](https://github.com/ClickHouse/ClickHouse/pull/48932) ([wangxiaobo](https://github.com/wzb5212)).
* Add Array data type to MongoDB. Closes [#48598](https://github.com/ClickHouse/ClickHouse/issues/48598). [#48983](https://github.com/ClickHouse/ClickHouse/pull/48983) ([Nikolay Degterinsky](https://github.com/evillique)).
* Support storing `Interval` data types in tables. [#49085](https://github.com/ClickHouse/ClickHouse/pull/49085) ([larryluogit](https://github.com/larryluogit)).
* Allow using `ntile` window function without explicit window frame definition: `ntile(3) OVER (ORDER BY a)`, close [#46763](https://github.com/ClickHouse/ClickHouse/issues/46763). [#49093](https://github.com/ClickHouse/ClickHouse/pull/49093) ([vdimir](https://github.com/vdimir)).
* Added settings (`number_of_mutations_to_delay`, `number_of_mutations_to_throw`) to delay or throw `ALTER` queries that create mutations (`ALTER UPDATE`, `ALTER DELETE`, `ALTER MODIFY COLUMN`, ...) in case when table already has a lot of unfinished mutations. [#49117](https://github.com/ClickHouse/ClickHouse/pull/49117) ([Anton Popov](https://github.com/CurtizJ)).
* Catch exception from `create_directories` in filesystem cache. [#49203](https://github.com/ClickHouse/ClickHouse/pull/49203) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Copies embedded examples to a new field `example` in `system.functions` to supplement the field `description`. [#49222](https://github.com/ClickHouse/ClickHouse/pull/49222) ([Dan Roscigno](https://github.com/DanRoscigno)).
* Enable connection options for the MongoDB dictionary. Example: ``` xml <source> <mongodb> <host>localhost</host> <port>27017</port> <user></user> <password></password> <db>test</db> <collection>dictionary_source</collection> <options>ssl=true</options> </mongodb> </source> ``` ### Documentation entry for user-facing changes. [#49225](https://github.com/ClickHouse/ClickHouse/pull/49225) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* Added an alias `asymptotic` for `asymp` computational method for `kolmogorovSmirnovTest`. Improved documentation. [#49286](https://github.com/ClickHouse/ClickHouse/pull/49286) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Aggregation function groupBitAnd/Or/Xor now work on signed integer data. This makes them consistent with the behavior of scalar functions bitAnd/Or/Xor. [#49292](https://github.com/ClickHouse/ClickHouse/pull/49292) ([exmy](https://github.com/exmy)).
* Split function-documentation into more fine-granular fields. [#49300](https://github.com/ClickHouse/ClickHouse/pull/49300) ([Robert Schulze](https://github.com/rschu1ze)).
* Use multiple threads shared between all tables within a server to load outdated data parts. The the size of the pool and its queue is controlled by `max_outdated_parts_loading_thread_pool_size` and `outdated_part_loading_thread_pool_queue_size` settings. [#49317](https://github.com/ClickHouse/ClickHouse/pull/49317) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Don't overestimate the size of processed data for `LowCardinality` columns when they share dictionaries between blocks. This closes [#49322](https://github.com/ClickHouse/ClickHouse/issues/49322). See also [#48745](https://github.com/ClickHouse/ClickHouse/issues/48745). [#49323](https://github.com/ClickHouse/ClickHouse/pull/49323) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Parquet writer now uses reasonable row group size when invoked through `OUTFILE`. [#49325](https://github.com/ClickHouse/ClickHouse/pull/49325) ([Michael Kolupaev](https://github.com/al13n321)).
* Allow restricted keywords like `ARRAY` as an alias if the alias is quoted. Closes [#49324](https://github.com/ClickHouse/ClickHouse/issues/49324). [#49360](https://github.com/ClickHouse/ClickHouse/pull/49360) ([Nikolay Degterinsky](https://github.com/evillique)).
* Data parts loading and deletion jobs were moved to shared server-wide pools instead of per-table pools. Pools sizes are controlled via settings `max_active_parts_loading_thread_pool_size`, `max_outdated_parts_loading_thread_pool_size` and `max_parts_cleaning_thread_pool_size` in top-level config. Table-level settings `max_part_loading_threads` and `max_part_removal_threads` became obsolete. [#49474](https://github.com/ClickHouse/ClickHouse/pull/49474) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Allow `?password=pass` in URL of the Play UI. Password is replaced in browser history. [#49505](https://github.com/ClickHouse/ClickHouse/pull/49505) ([Mike Kot](https://github.com/myrrc)).
* Allow reading zero-size objects from remote filesystems. (because empty files are not backup'd, so we might end up with zero blobs in metadata file). Closes [#49480](https://github.com/ClickHouse/ClickHouse/issues/49480). [#49519](https://github.com/ClickHouse/ClickHouse/pull/49519) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Attach thread MemoryTracker to `total_memory_tracker` after `ThreadGroup` detached. [#49527](https://github.com/ClickHouse/ClickHouse/pull/49527) ([Dmitry Novik](https://github.com/novikd)).
* Fix parameterized views when a query parameter is used multiple times in the query. [#49556](https://github.com/ClickHouse/ClickHouse/pull/49556) ([Azat Khuzhin](https://github.com/azat)).
* Release memory allocated for the last sent ProfileEvents snapshot in the context of a query. Followup [#47564](https://github.com/ClickHouse/ClickHouse/issues/47564). [#49561](https://github.com/ClickHouse/ClickHouse/pull/49561) ([Dmitry Novik](https://github.com/novikd)).
* Function "makeDate" now provides a MySQL-compatible overload (year & day of the year argument). [#49603](https://github.com/ClickHouse/ClickHouse/pull/49603) ([Robert Schulze](https://github.com/rschu1ze)).
* Support `dictionary` table function for `RegExpTreeDictionary`. [#49666](https://github.com/ClickHouse/ClickHouse/pull/49666) ([Han Fei](https://github.com/hanfei1991)).
* Added weighted fair IO scheduling policy. Added dynamic resource manager, which allows IO scheduling hierarchy to be updated in runtime w/o server restarts. [#49671](https://github.com/ClickHouse/ClickHouse/pull/49671) ([Sergei Trifonov](https://github.com/serxa)).
* Add compose request after multipart upload to GCS. This enables the usage of copy operation on objects uploaded with the multipart upload. It's recommended to set `s3_strict_upload_part_size` to some value because compose request can fail on objects created with parts of different sizes. [#49693](https://github.com/ClickHouse/ClickHouse/pull/49693) ([Antonio Andelic](https://github.com/antonio2368)).
* For the `extractKeyValuePairs` function: improve the "best-effort" parsing logic to accept `key_value_delimiter` as a valid part of the value. This also simplifies branching and might even speed up things a bit. [#49760](https://github.com/ClickHouse/ClickHouse/pull/49760) ([Arthur Passos](https://github.com/arthurpassos)).
* Add `initial_query_id` field for system.processors_profile_log [#49777](https://github.com/ClickHouse/ClickHouse/pull/49777) ([helifu](https://github.com/helifu)).
* System log tables can now have custom sorting keys. [#49778](https://github.com/ClickHouse/ClickHouse/pull/49778) ([helifu](https://github.com/helifu)).
* A new field `partitions` to `system.query_log` is used to indicate which partitions are participating in the calculation. [#49779](https://github.com/ClickHouse/ClickHouse/pull/49779) ([helifu](https://github.com/helifu)).
* Added `enable_the_endpoint_id_with_zookeeper_name_prefix` setting for `ReplicatedMergeTree` (disabled by default). When enabled, it adds ZooKeeper cluster name to table's interserver communication endpoint. It avoids `Duplicate interserver IO endpoint` errors when having replicated tables with the same path, but different auxiliary ZooKeepers. [#49780](https://github.com/ClickHouse/ClickHouse/pull/49780) ([helifu](https://github.com/helifu)).
* Add query parameters to `clickhouse-local`. Closes [#46561](https://github.com/ClickHouse/ClickHouse/issues/46561). [#49785](https://github.com/ClickHouse/ClickHouse/pull/49785) ([Nikolay Degterinsky](https://github.com/evillique)).
* Allow loading dictionaries and functions from YAML by default. In previous versions, it required editing the `dictionaries_config` or `user_defined_executable_functions_config` in the configuration file, as they expected `*.xml` files. [#49812](https://github.com/ClickHouse/ClickHouse/pull/49812) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The Kafka table engine now allows to use alias columns. [#49824](https://github.com/ClickHouse/ClickHouse/pull/49824) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Add setting to limit the max number of pairs produced by `extractKeyValuePairs`, a safeguard to avoid using way too much memory. [#49836](https://github.com/ClickHouse/ClickHouse/pull/49836) ([Arthur Passos](https://github.com/arthurpassos)).
* Add support for (an unusual) case where the arguments in the `IN` operator are single-element tuples. [#49844](https://github.com/ClickHouse/ClickHouse/pull/49844) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* `bitHammingDistance` function support `String` and `FixedString` data type. Closes [#48827](https://github.com/ClickHouse/ClickHouse/issues/48827). [#49858](https://github.com/ClickHouse/ClickHouse/pull/49858) ([flynn](https://github.com/ucasfl)).
* Fix timeout resetting errors in the client on OS X. [#49863](https://github.com/ClickHouse/ClickHouse/pull/49863) ([alekar](https://github.com/alekar)).
* Add support for big integers, such as UInt128, Int128, UInt256, and Int256 in the function `bitCount`. This enables Hamming distance over large bit masks for AI applications. [#49867](https://github.com/ClickHouse/ClickHouse/pull/49867) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fingerprints to be used instead of key IDs in encrypted disks. This simplifies the configuration of encrypted disks. [#49882](https://github.com/ClickHouse/ClickHouse/pull/49882) ([Vitaly Baranov](https://github.com/vitlibar)).
* Add UUID data type to PostgreSQL. Closes [#49739](https://github.com/ClickHouse/ClickHouse/issues/49739). [#49894](https://github.com/ClickHouse/ClickHouse/pull/49894) ([Nikolay Degterinsky](https://github.com/evillique)).
* Function `toUnixTimestamp` now accepts `Date` and `Date32` arguments. [#49989](https://github.com/ClickHouse/ClickHouse/pull/49989) ([Victor Krasnov](https://github.com/sirvickr)).
* Charge only server memory for dictionaries. [#49995](https://github.com/ClickHouse/ClickHouse/pull/49995) ([Azat Khuzhin](https://github.com/azat)).
* The server will allow using the `SQL_*` settings such as `SQL_AUTO_IS_NULL` as no-ops for MySQL compatibility. This closes [#49927](https://github.com/ClickHouse/ClickHouse/issues/49927). [#50013](https://github.com/ClickHouse/ClickHouse/pull/50013) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Preserve initial_query_id for ON CLUSTER queries, which is useful for introspection (under `distributed_ddl_entry_format_version=5`). [#50015](https://github.com/ClickHouse/ClickHouse/pull/50015) ([Azat Khuzhin](https://github.com/azat)).
* Preserve backward incompatibility for renamed settings by using aliases (`allow_experimental_projection_optimization` for `optimize_use_projections`, `allow_experimental_lightweight_delete` for `enable_lightweight_delete`). [#50044](https://github.com/ClickHouse/ClickHouse/pull/50044) ([Azat Khuzhin](https://github.com/azat)).
* Support passing FQDN through setting my_hostname to register cluster node in keeper. Add setting of invisible to support multi compute groups. A compute group as a cluster, is invisible to other compute groups. [#50186](https://github.com/ClickHouse/ClickHouse/pull/50186) ([Yangkuan Liu](https://github.com/LiuYangkuan)).
* Fix PostgreSQL reading all the data even though `LIMIT n` could be specified. [#50187](https://github.com/ClickHouse/ClickHouse/pull/50187) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add new profile events for queries with subqueries (`QueriesWithSubqueries`/`SelectQueriesWithSubqueries`/`InsertQueriesWithSubqueries`). [#50204](https://github.com/ClickHouse/ClickHouse/pull/50204) ([Azat Khuzhin](https://github.com/azat)).
* Adding the roles field in the users.xml file, which allows specifying roles with grants via a config file. [#50278](https://github.com/ClickHouse/ClickHouse/pull/50278) ([pufit](https://github.com/pufit)).
* Report `CGroupCpuCfsPeriod` and `CGroupCpuCfsQuota` in AsynchronousMetrics. - Respect cgroup v2 memory limits during server startup. [#50379](https://github.com/ClickHouse/ClickHouse/pull/50379) ([alekar](https://github.com/alekar)).
* Add a signal handler for SIGQUIT to work the same way as SIGINT. Closes [#50298](https://github.com/ClickHouse/ClickHouse/issues/50298). [#50435](https://github.com/ClickHouse/ClickHouse/pull/50435) ([Nikolay Degterinsky](https://github.com/evillique)).
* In case JSON parse fails due to the large size of the object output the last position to allow debugging. [#50474](https://github.com/ClickHouse/ClickHouse/pull/50474) ([Valentin Alexeev](https://github.com/valentinalexeev)).
* Support decimals with not fixed size. Closes [#49130](https://github.com/ClickHouse/ClickHouse/issues/49130). [#50586](https://github.com/ClickHouse/ClickHouse/pull/50586) ([Kruglov Pavel](https://github.com/Avogar)).
#### Build/Testing/Packaging Improvement
* New and improved `keeper-bench`. Everything can be customized from YAML/XML file: - request generator - each type of request generator can have a specific set of fields - multi requests can be generated just by doing the same under `multi` key - for each request or subrequest in multi a `weight` field can be defined to control distribution - define trees that need to be setup for a test run - hosts can be defined with all timeouts customizable and it's possible to control how many sessions to generate for each host - integers defined with `min_value` and `max_value` fields are random number generators. [#48547](https://github.com/ClickHouse/ClickHouse/pull/48547) ([Antonio Andelic](https://github.com/antonio2368)).
* Io_uring is not supported on macos, don't choose it when running tests on local to avoid occassional failures. [#49250](https://github.com/ClickHouse/ClickHouse/pull/49250) ([Frank Chen](https://github.com/FrankChen021)).
* Support named fault injection for testing. [#49361](https://github.com/ClickHouse/ClickHouse/pull/49361) ([Han Fei](https://github.com/hanfei1991)).
* Allow running ClickHouse in the OS where the `prctl` (process control) syscall is not available, such as AWS Lambda. [#49538](https://github.com/ClickHouse/ClickHouse/pull/49538) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fixed the issue of build conflict between contrib/isa-l and isa-l in qpl [49296](https://github.com/ClickHouse/ClickHouse/issues/49296). [#49584](https://github.com/ClickHouse/ClickHouse/pull/49584) ([jasperzhu](https://github.com/jinjunzh)).
* Utilities are now only build if explicitly requested ("-DENABLE_UTILS=1") instead of by default, this reduces link times in typical development builds. [#49620](https://github.com/ClickHouse/ClickHouse/pull/49620) ([Robert Schulze](https://github.com/rschu1ze)).
* Pull build description of idxd-config into a separate CMake file to avoid accidental removal in future. [#49651](https://github.com/ClickHouse/ClickHouse/pull/49651) ([jasperzhu](https://github.com/jinjunzh)).
* Add CI check with an enabled analyzer in the master. Follow-up [#49562](https://github.com/ClickHouse/ClickHouse/issues/49562). [#49668](https://github.com/ClickHouse/ClickHouse/pull/49668) ([Dmitry Novik](https://github.com/novikd)).
* Switch to LLVM/clang 16. [#49678](https://github.com/ClickHouse/ClickHouse/pull/49678) ([Azat Khuzhin](https://github.com/azat)).
* Allow building ClickHouse with clang-17. [#49851](https://github.com/ClickHouse/ClickHouse/pull/49851) ([Alexey Milovidov](https://github.com/alexey-milovidov)). [#50410](https://github.com/ClickHouse/ClickHouse/pull/50410) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* ClickHouse is now easier to be integrated into other cmake projects. [#49991](https://github.com/ClickHouse/ClickHouse/pull/49991) ([Amos Bird](https://github.com/amosbird)). (Which is strongly discouraged - Alexey Milovidov).
* Fix strange additional QEMU logging after [#47151](https://github.com/ClickHouse/ClickHouse/issues/47151), see https://s3.amazonaws.com/clickhouse-test-reports/50078/a4743996ee4f3583884d07bcd6501df0cfdaa346/stateless_tests__release__databasereplicated__[3_4].html. [#50442](https://github.com/ClickHouse/ClickHouse/pull/50442) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* ClickHouse can work on Linux RISC-V 6.1.22. This closes [#50456](https://github.com/ClickHouse/ClickHouse/issues/50456). [#50457](https://github.com/ClickHouse/ClickHouse/pull/50457) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Bump internal protobuf to v3.18 (fixes bogus CVE-2022-1941). [#50400](https://github.com/ClickHouse/ClickHouse/pull/50400) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump internal libxml2 to v2.10.4 (fixes bogus CVE-2023-28484 and bogus CVE-2023-29469). [#50402](https://github.com/ClickHouse/ClickHouse/pull/50402) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump c-ares to v1.19.1 (bogus CVE-2023-32067, bogus CVE-2023-31130, bogus CVE-2023-31147). [#50403](https://github.com/ClickHouse/ClickHouse/pull/50403) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix bogus CVE-2022-2469 in libgsasl. [#50404](https://github.com/ClickHouse/ClickHouse/pull/50404) ([Robert Schulze](https://github.com/rschu1ze)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* ActionsDAG: fix wrong optimization [#47584](https://github.com/ClickHouse/ClickHouse/pull/47584) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Correctly handle concurrent snapshots in Keeper [#48466](https://github.com/ClickHouse/ClickHouse/pull/48466) ([Antonio Andelic](https://github.com/antonio2368)).
* MergeTreeMarksLoader holds DataPart instead of DataPartStorage [#48515](https://github.com/ClickHouse/ClickHouse/pull/48515) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Sequence state fix [#48603](https://github.com/ClickHouse/ClickHouse/pull/48603) ([Ilya Golshtein](https://github.com/ilejn)).
* Back/Restore concurrency check on previous fails [#48726](https://github.com/ClickHouse/ClickHouse/pull/48726) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix Attaching a table with non-existent ZK path does not increase the ReadonlyReplica metric [#48954](https://github.com/ClickHouse/ClickHouse/pull/48954) ([wangxiaobo](https://github.com/wzb5212)).
* Fix possible terminate called for uncaught exception in some places [#49112](https://github.com/ClickHouse/ClickHouse/pull/49112) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix key not found error for queries with multiple StorageJoin [#49137](https://github.com/ClickHouse/ClickHouse/pull/49137) ([vdimir](https://github.com/vdimir)).
* Fix wrong query result when using nullable primary key [#49172](https://github.com/ClickHouse/ClickHouse/pull/49172) ([Duc Canh Le](https://github.com/canhld94)).
* Fix reinterpretAs*() on big endian machines [#49198](https://github.com/ClickHouse/ClickHouse/pull/49198) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* (Experimental zero-copy replication) Lock zero copy parts more atomically [#49211](https://github.com/ClickHouse/ClickHouse/pull/49211) ([alesapin](https://github.com/alesapin)).
* Fix race on Outdated parts loading [#49223](https://github.com/ClickHouse/ClickHouse/pull/49223) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix all key value is null and group use rollup return wrong answer [#49282](https://github.com/ClickHouse/ClickHouse/pull/49282) ([Shuai li](https://github.com/loneylee)).
* Fix calculating load_factor for HASHED dictionaries with SHARDS [#49319](https://github.com/ClickHouse/ClickHouse/pull/49319) ([Azat Khuzhin](https://github.com/azat)).
* Disallow configuring compression CODECs for alias columns [#49363](https://github.com/ClickHouse/ClickHouse/pull/49363) ([Timur Solodovnikov](https://github.com/tsolodov)).
* Fix bug in removal of existing part directory [#49365](https://github.com/ClickHouse/ClickHouse/pull/49365) ([alesapin](https://github.com/alesapin)).
* Properly fix GCS when HMAC is used [#49390](https://github.com/ClickHouse/ClickHouse/pull/49390) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix fuzz bug when subquery set is not built when reading from remote() [#49425](https://github.com/ClickHouse/ClickHouse/pull/49425) ([Alexander Gololobov](https://github.com/davenger)).
* Invert `shutdown_wait_unfinished_queries` [#49427](https://github.com/ClickHouse/ClickHouse/pull/49427) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* (Experimental zero-copy replication) Fix another zero copy bug [#49473](https://github.com/ClickHouse/ClickHouse/pull/49473) ([alesapin](https://github.com/alesapin)).
* Fix postgres database setting [#49481](https://github.com/ClickHouse/ClickHouse/pull/49481) ([Mal Curtis](https://github.com/snikch)).
* Correctly handle `s3Cluster` arguments [#49490](https://github.com/ClickHouse/ClickHouse/pull/49490) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix bug in TraceCollector destructor. [#49508](https://github.com/ClickHouse/ClickHouse/pull/49508) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix AsynchronousReadIndirectBufferFromRemoteFS breaking on short seeks [#49525](https://github.com/ClickHouse/ClickHouse/pull/49525) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix dictionaries loading order [#49560](https://github.com/ClickHouse/ClickHouse/pull/49560) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Forbid the change of data type of Object('json') column [#49563](https://github.com/ClickHouse/ClickHouse/pull/49563) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix stress test (Logical error: Expected 7134 >= 11030) [#49623](https://github.com/ClickHouse/ClickHouse/pull/49623) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix bug in DISTINCT [#49628](https://github.com/ClickHouse/ClickHouse/pull/49628) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix: DISTINCT in order with zero values in non-sorted columns [#49636](https://github.com/ClickHouse/ClickHouse/pull/49636) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix one-off error in big integers found by UBSan with fuzzer [#49645](https://github.com/ClickHouse/ClickHouse/pull/49645) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix reading from sparse columns after restart [#49660](https://github.com/ClickHouse/ClickHouse/pull/49660) ([Anton Popov](https://github.com/CurtizJ)).
* Fix assert in SpanHolder::finish() with fibers [#49673](https://github.com/ClickHouse/ClickHouse/pull/49673) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix short circuit functions and mutations with sparse arguments [#49716](https://github.com/ClickHouse/ClickHouse/pull/49716) ([Anton Popov](https://github.com/CurtizJ)).
* Fix writing appended files to incremental backups [#49725](https://github.com/ClickHouse/ClickHouse/pull/49725) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix "There is no physical column _row_exists in table" error occurring during lightweight delete mutation on a table with Object column. [#49737](https://github.com/ClickHouse/ClickHouse/pull/49737) ([Alexander Gololobov](https://github.com/davenger)).
* Fix msan issue in randomStringUTF8(uneven number) [#49750](https://github.com/ClickHouse/ClickHouse/pull/49750) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix aggregate function kolmogorovSmirnovTest [#49768](https://github.com/ClickHouse/ClickHouse/pull/49768) ([FFFFFFFHHHHHHH](https://github.com/FFFFFFFHHHHHHH)).
* Fix settings aliases in native protocol [#49776](https://github.com/ClickHouse/ClickHouse/pull/49776) ([Azat Khuzhin](https://github.com/azat)).
* Fix `arrayMap` with array of tuples with single argument [#49789](https://github.com/ClickHouse/ClickHouse/pull/49789) ([Anton Popov](https://github.com/CurtizJ)).
* Fix per-query IO/BACKUPs throttling settings [#49797](https://github.com/ClickHouse/ClickHouse/pull/49797) ([Azat Khuzhin](https://github.com/azat)).
* Fix setting NULL in profile definition [#49831](https://github.com/ClickHouse/ClickHouse/pull/49831) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix a bug with projections and the aggregate_functions_null_for_empty setting (for query_plan_optimize_projection) [#49873](https://github.com/ClickHouse/ClickHouse/pull/49873) ([Amos Bird](https://github.com/amosbird)).
* Fix processing pending batch for Distributed async INSERT after restart [#49884](https://github.com/ClickHouse/ClickHouse/pull/49884) ([Azat Khuzhin](https://github.com/azat)).
* Fix assertion in CacheMetadata::doCleanup [#49914](https://github.com/ClickHouse/ClickHouse/pull/49914) ([Kseniia Sumarokova](https://github.com/kssenii)).
* fix `is_prefix` in OptimizeRegularExpression [#49919](https://github.com/ClickHouse/ClickHouse/pull/49919) ([Han Fei](https://github.com/hanfei1991)).
* Fix metrics `WriteBufferFromS3Bytes`, `WriteBufferFromS3Microseconds` and `WriteBufferFromS3RequestsErrors` [#49930](https://github.com/ClickHouse/ClickHouse/pull/49930) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Fix IPv6 encoding in protobuf [#49933](https://github.com/ClickHouse/ClickHouse/pull/49933) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix possible Logical error on bad Nullable parsing for text formats [#49960](https://github.com/ClickHouse/ClickHouse/pull/49960) ([Kruglov Pavel](https://github.com/Avogar)).
* Add setting output_format_parquet_compliant_nested_types to produce more compatible Parquet files [#50001](https://github.com/ClickHouse/ClickHouse/pull/50001) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix logical error in stress test "Not enough space to add ..." [#50021](https://github.com/ClickHouse/ClickHouse/pull/50021) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Avoid deadlock when starting table in attach thread of `ReplicatedMergeTree` [#50026](https://github.com/ClickHouse/ClickHouse/pull/50026) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix assert in SpanHolder::finish() with fibers attempt 2 [#50034](https://github.com/ClickHouse/ClickHouse/pull/50034) ([Kruglov Pavel](https://github.com/Avogar)).
* Add proper escaping for DDL OpenTelemetry context serialization [#50045](https://github.com/ClickHouse/ClickHouse/pull/50045) ([Azat Khuzhin](https://github.com/azat)).
* Fix reporting broken projection parts [#50052](https://github.com/ClickHouse/ClickHouse/pull/50052) ([Amos Bird](https://github.com/amosbird)).
* JIT compilation not equals NaN fix [#50056](https://github.com/ClickHouse/ClickHouse/pull/50056) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crashing in case of Replicated database without arguments [#50058](https://github.com/ClickHouse/ClickHouse/pull/50058) ([Azat Khuzhin](https://github.com/azat)).
* Fix crash with `multiIf` and constant condition and nullable arguments [#50123](https://github.com/ClickHouse/ClickHouse/pull/50123) ([Anton Popov](https://github.com/CurtizJ)).
* Fix invalid index analysis for date related keys [#50153](https://github.com/ClickHouse/ClickHouse/pull/50153) ([Amos Bird](https://github.com/amosbird)).
* do not allow modify order by when there are no order by cols [#50154](https://github.com/ClickHouse/ClickHouse/pull/50154) ([Han Fei](https://github.com/hanfei1991)).
* Fix broken index analysis when binary operator contains a null constant argument [#50177](https://github.com/ClickHouse/ClickHouse/pull/50177) ([Amos Bird](https://github.com/amosbird)).
* clickhouse-client: disallow usage of `--query` and `--queries-file` at the same time [#50210](https://github.com/ClickHouse/ClickHouse/pull/50210) ([Alexey Gerasimchuk](https://github.com/Demilivor)).
* Fix UB for INTO OUTFILE extensions (APPEND / AND STDOUT) and WATCH EVENTS [#50216](https://github.com/ClickHouse/ClickHouse/pull/50216) ([Azat Khuzhin](https://github.com/azat)).
* Fix skipping spaces at end of row in CustomSeparatedIgnoreSpaces format [#50224](https://github.com/ClickHouse/ClickHouse/pull/50224) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix iceberg metadata parsing [#50232](https://github.com/ClickHouse/ClickHouse/pull/50232) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix nested distributed SELECT in WITH clause [#50234](https://github.com/ClickHouse/ClickHouse/pull/50234) ([Azat Khuzhin](https://github.com/azat)).
* Fix msan issue in keyed siphash [#50245](https://github.com/ClickHouse/ClickHouse/pull/50245) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix bugs in Poco sockets in non-blocking mode, use true non-blocking sockets [#50252](https://github.com/ClickHouse/ClickHouse/pull/50252) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix checksum calculation for backup entries [#50264](https://github.com/ClickHouse/ClickHouse/pull/50264) ([Vitaly Baranov](https://github.com/vitlibar)).
* Comparison functions NaN fix [#50287](https://github.com/ClickHouse/ClickHouse/pull/50287) ([Maksim Kita](https://github.com/kitaisreal)).
* JIT aggregation nullable key fix [#50291](https://github.com/ClickHouse/ClickHouse/pull/50291) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix clickhouse-local crashing when writing empty Arrow or Parquet output [#50328](https://github.com/ClickHouse/ClickHouse/pull/50328) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix crash when Pool::Entry::disconnect() is called [#50334](https://github.com/ClickHouse/ClickHouse/pull/50334) ([Val Doroshchuk](https://github.com/valbok)).
* Improved fetch part by holding directory lock longer [#50339](https://github.com/ClickHouse/ClickHouse/pull/50339) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix bitShift* functions with both constant arguments [#50343](https://github.com/ClickHouse/ClickHouse/pull/50343) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix Keeper deadlock on exception when preprocessing requests. [#50387](https://github.com/ClickHouse/ClickHouse/pull/50387) ([frinkr](https://github.com/frinkr)).
* Fix hashing of const integer values [#50421](https://github.com/ClickHouse/ClickHouse/pull/50421) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix merge_tree_min_rows_for_seek/merge_tree_min_bytes_for_seek for data skipping indexes [#50432](https://github.com/ClickHouse/ClickHouse/pull/50432) ([Azat Khuzhin](https://github.com/azat)).
* Limit the number of in-flight tasks for loading outdated parts [#50450](https://github.com/ClickHouse/ClickHouse/pull/50450) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Keeper fix: apply uncommitted state after snapshot install [#50483](https://github.com/ClickHouse/ClickHouse/pull/50483) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix incorrect constant folding [#50536](https://github.com/ClickHouse/ClickHouse/pull/50536) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix logical error in stress test (Not enough space to add ...) [#50583](https://github.com/ClickHouse/ClickHouse/pull/50583) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix converting Null to LowCardinality(Nullable) in values table function [#50637](https://github.com/ClickHouse/ClickHouse/pull/50637) ([Kruglov Pavel](https://github.com/Avogar)).
* Revert invalid RegExpTreeDictionary optimization [#50642](https://github.com/ClickHouse/ClickHouse/pull/50642) ([Johann Gan](https://github.com/johanngan)).
### <a id="234"></a> ClickHouse release 23.4, 2023-04-26
#### Backward Incompatible Change

View File

@ -22,11 +22,10 @@ curl https://clickhouse.com/ | sh
## Upcoming Events
* [**v23.5 Release Webinar**](https://clickhouse.com/company/events/v23-5-release-webinar?utm_source=github&utm_medium=social&utm_campaign=release-webinar-2023-05) - May 31 - 23.5 is rapidly approaching. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
* [**ClickHouse Meetup in Barcelona**](https://www.meetup.com/clickhouse-barcelona-user-group/events/292892669) - May 25
* [**ClickHouse Meetup in London**](https://www.meetup.com/clickhouse-london-user-group/events/292892824) - May 25
* [**v23.5 Release Webinar**](https://clickhouse.com/company/events/v23-5-release-webinar?utm_source=github&utm_medium=social&utm_campaign=release-webinar-2023-05) - Jun 8 - 23.5 is rapidly approaching. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
* [**ClickHouse Meetup in Bangalore**](https://www.meetup.com/clickhouse-bangalore-user-group/events/293740066/) - Jun 7
* [**ClickHouse Meetup in San Francisco**](https://www.meetup.com/clickhouse-silicon-valley-meetup-group/events/293426725/) - Jun 7
* [**ClickHouse Meetup in Stockholm**](https://www.meetup.com/clickhouse-berlin-user-group/events/292892466) - Jun 13
Also, keep an eye out for upcoming meetups in Amsterdam, Boston, NYC, Beijing, and Toronto. Somewhere else you want us to be? Please feel free to reach out to tyler <at> clickhouse <dot> com.

View File

@ -13,9 +13,10 @@ The following versions of ClickHouse server are currently being supported with s
| Version | Supported |
|:-|:-|
| 23.5 | ✔️ |
| 23.4 | ✔️ |
| 23.3 | ✔️ |
| 23.2 | ✔️ |
| 23.2 | |
| 23.1 | ❌ |
| 22.12 | ❌ |
| 22.11 | ❌ |

View File

@ -28,6 +28,19 @@ uint64_t getMemoryAmountOrZero()
#if defined(OS_LINUX)
// Try to lookup at the Cgroup limit
// CGroups v2
std::ifstream cgroupv2_limit("/sys/fs/cgroup/memory.max");
if (cgroupv2_limit.is_open())
{
uint64_t memory_limit = 0;
cgroupv2_limit >> memory_limit;
if (memory_limit > 0 && memory_limit < memory_amount)
memory_amount = memory_limit;
}
else
{
// CGroups v1
std::ifstream cgroup_limit("/sys/fs/cgroup/memory/memory.limit_in_bytes");
if (cgroup_limit.is_open())
{
@ -36,6 +49,7 @@ uint64_t getMemoryAmountOrZero()
if (memory_limit > 0 && memory_limit < memory_amount)
memory_amount = memory_limit;
}
}
#endif
return memory_amount;

View File

@ -117,6 +117,9 @@ public:
void readRaw(char * buffer, std::streamsize length);
/// Reads length bytes of raw data into buffer.
void readCString(std::string& value);
/// Reads zero-terminated C-string into value.
void readBOM();
/// Reads a byte-order mark from the stream and configures
/// the reader for the encountered byte order.

View File

@ -56,6 +56,8 @@ public:
LITTLE_ENDIAN_BYTE_ORDER = 3 /// little-endian byte-order
};
static const std::streamsize DEFAULT_MAX_CSTR_LENGTH { 1024 };
BinaryWriter(std::ostream & ostr, StreamByteOrder byteOrder = NATIVE_BYTE_ORDER);
/// Creates the BinaryWriter.
@ -131,6 +133,9 @@ public:
void writeRaw(const char * buffer, std::streamsize length);
/// Writes length raw bytes from the given buffer to the stream.
void writeCString(const char* cString, std::streamsize maxLength = DEFAULT_MAX_CSTR_LENGTH);
/// Writes zero-terminated C-string.
void writeBOM();
/// Writes a byte-order mark to the stream. A byte order mark is
/// a 16-bit integer with a value of 0xFEFF, written in host byte-order.

View File

@ -274,6 +274,31 @@ void BinaryReader::readRaw(char* buffer, std::streamsize length)
}
void BinaryReader::readCString(std::string& value)
{
value.clear();
if (!_istr.good())
{
return;
}
value.reserve(256);
while (true)
{
char c;
_istr.get(c);
if (!_istr.good())
{
break;
}
if (c == '\0')
{
break;
}
value += c;
}
}
void BinaryReader::readBOM()
{
UInt16 bom;

View File

@ -332,6 +332,15 @@ void BinaryWriter::writeRaw(const char* buffer, std::streamsize length)
}
void BinaryWriter::writeCString(const char* cString, std::streamsize maxLength)
{
const std::size_t len = ::strnlen(cString, maxLength);
writeRaw(cString, len);
static const char zero = '\0';
_ostr.write(&zero, sizeof(zero));
}
void BinaryWriter::writeBOM()
{
UInt16 value = 0xFEFF;

View File

@ -13,3 +13,4 @@ target_compile_options (_poco_mongodb
target_include_directories (_poco_mongodb SYSTEM PUBLIC "include")
target_link_libraries (_poco_mongodb PUBLIC Poco::Net)

View File

@ -33,7 +33,7 @@ namespace MongoDB
/// This class represents a BSON Array.
{
public:
typedef SharedPtr<Array> Ptr;
using Ptr = SharedPtr<Array>;
Array();
/// Creates an empty Array.
@ -41,8 +41,31 @@ namespace MongoDB
virtual ~Array();
/// Destroys the Array.
// Document template functions available for backward compatibility
using Document::add;
using Document::get;
template <typename T>
T get(int pos) const
Document & add(T value)
/// Creates an element with the name from the current pos and value and
/// adds it to the array document.
///
/// The active document is returned to allow chaining of the add methods.
{
return Document::add<T>(Poco::NumberFormatter::format(size()), value);
}
Document & add(const char * value)
/// Creates an element with a name from the current pos and value and
/// adds it to the array document.
///
/// The active document is returned to allow chaining of the add methods.
{
return Document::add(Poco::NumberFormatter::format(size()), value);
}
template <typename T>
T get(std::size_t pos) const
/// Returns the element at the given index and tries to convert
/// it to the template type. If the element is not found, a
/// Poco::NotFoundException will be thrown. If the element cannot be
@ -52,7 +75,7 @@ namespace MongoDB
}
template <typename T>
T get(int pos, const T & deflt) const
T get(std::size_t pos, const T & deflt) const
/// Returns the element at the given index and tries to convert
/// it to the template type. If the element is not found, or
/// has the wrong type, the deflt argument will be returned.
@ -60,12 +83,12 @@ namespace MongoDB
return Document::get<T>(Poco::NumberFormatter::format(pos), deflt);
}
Element::Ptr get(int pos) const;
Element::Ptr get(std::size_t pos) const;
/// Returns the element at the given index.
/// An empty element will be returned if the element is not found.
template <typename T>
bool isType(int pos) const
bool isType(std::size_t pos) const
/// Returns true if the type of the element equals the TypeId of ElementTrait,
/// otherwise false.
{
@ -74,6 +97,9 @@ namespace MongoDB
std::string toString(int indent = 0) const;
/// Returns a string representation of the Array.
private:
friend void BSONReader::read<Array::Ptr>(Array::Ptr & to);
};

View File

@ -40,7 +40,7 @@ namespace MongoDB
/// A Binary stores its data in a Poco::Buffer<unsigned char>.
{
public:
typedef SharedPtr<Binary> Ptr;
using Ptr = SharedPtr<Binary>;
Binary();
/// Creates an empty Binary with subtype 0.

View File

@ -18,6 +18,7 @@
#define MongoDB_Connection_INCLUDED
#include "Poco/MongoDB/OpMsgMessage.h"
#include "Poco/MongoDB/RequestMessage.h"
#include "Poco/MongoDB/ResponseMessage.h"
#include "Poco/Mutex.h"
@ -39,7 +40,7 @@ namespace MongoDB
/// for more information on the wire protocol.
{
public:
typedef Poco::SharedPtr<Connection> Ptr;
using Ptr = Poco::SharedPtr<Connection>;
class MongoDB_API SocketFactory
{
@ -145,6 +146,21 @@ namespace MongoDB
/// Use this when a response is expected: only a "query" or "getmore"
/// request will return a response.
void sendRequest(OpMsgMessage & request, OpMsgMessage & response);
/// Sends a request to the MongoDB server and receives the response
/// using newer wire protocol with OP_MSG.
void sendRequest(OpMsgMessage & request);
/// Sends an unacknowledged request to the MongoDB server using newer
/// wire protocol with OP_MSG.
/// No response is sent by the server.
void readResponse(OpMsgMessage & response);
/// Reads additional response data when previous message's flag moreToCome
/// indicates that server will send more data.
/// NOTE: See comments in OpMsgCursor code.
protected:
void connect();

View File

@ -40,6 +40,9 @@ namespace MongoDB
Cursor(const std::string & fullCollectionName, QueryRequest::Flags flags = QueryRequest::QUERY_DEFAULT);
/// Creates a Cursor for the given database and collection ("database.collection"), using the specified flags.
Cursor(const Document & aggregationResponse);
/// Creates a Cursor for the given aggregation query response.
virtual ~Cursor();
/// Destroys the Cursor.

View File

@ -26,6 +26,8 @@
#include "Poco/MongoDB/QueryRequest.h"
#include "Poco/MongoDB/UpdateRequest.h"
#include "Poco/MongoDB/OpMsgCursor.h"
#include "Poco/MongoDB/OpMsgMessage.h"
namespace Poco
{
@ -45,6 +47,9 @@ namespace MongoDB
virtual ~Database();
/// Destroys the Database.
const std::string & name() const;
/// Database name
bool authenticate(
Connection & connection,
const std::string & username,
@ -62,34 +67,49 @@ namespace MongoDB
/// May throw a Poco::ProtocolException if authentication fails for a reason other than
/// invalid credentials.
Document::Ptr queryBuildInfo(Connection & connection) const;
/// Queries server build info (all wire protocols)
Document::Ptr queryServerHello(Connection & connection) const;
/// Queries hello response from server (all wire protocols)
Int64 count(Connection & connection, const std::string & collectionName) const;
/// Sends a count request for the given collection to MongoDB.
/// Sends a count request for the given collection to MongoDB. (old wire protocol)
///
/// If the command fails, -1 is returned.
Poco::SharedPtr<Poco::MongoDB::QueryRequest> createCommand() const;
/// Creates a QueryRequest for a command.
/// Creates a QueryRequest for a command. (old wire protocol)
Poco::SharedPtr<Poco::MongoDB::QueryRequest> createCountRequest(const std::string & collectionName) const;
/// Creates a QueryRequest to count the given collection.
/// The collectionname must not contain the database name.
/// The collectionname must not contain the database name. (old wire protocol)
Poco::SharedPtr<Poco::MongoDB::DeleteRequest> createDeleteRequest(const std::string & collectionName) const;
/// Creates a DeleteRequest to delete documents in the given collection.
/// The collectionname must not contain the database name.
/// The collectionname must not contain the database name. (old wire protocol)
Poco::SharedPtr<Poco::MongoDB::InsertRequest> createInsertRequest(const std::string & collectionName) const;
/// Creates an InsertRequest to insert new documents in the given collection.
/// The collectionname must not contain the database name.
/// The collectionname must not contain the database name. (old wire protocol)
Poco::SharedPtr<Poco::MongoDB::QueryRequest> createQueryRequest(const std::string & collectionName) const;
/// Creates a QueryRequest.
/// Creates a QueryRequest. (old wire protocol)
/// The collectionname must not contain the database name.
Poco::SharedPtr<Poco::MongoDB::UpdateRequest> createUpdateRequest(const std::string & collectionName) const;
/// Creates an UpdateRequest.
/// Creates an UpdateRequest. (old wire protocol)
/// The collectionname must not contain the database name.
Poco::SharedPtr<Poco::MongoDB::OpMsgMessage> createOpMsgMessage(const std::string & collectionName) const;
/// Creates OpMsgMessage. (new wire protocol)
Poco::SharedPtr<Poco::MongoDB::OpMsgMessage> createOpMsgMessage() const;
/// Creates OpMsgMessage for database commands that do not require collection as an argument. (new wire protocol)
Poco::SharedPtr<Poco::MongoDB::OpMsgCursor> createOpMsgCursor(const std::string & collectionName) const;
/// Creates OpMsgCursor. (new wire protocol)
Poco::MongoDB::Document::Ptr ensureIndex(
Connection & connection,
const std::string & collection,
@ -100,14 +120,16 @@ namespace MongoDB
int version = 0,
int ttl = 0);
/// Creates an index. The document returned is the result of a getLastError call.
/// For more info look at the ensureIndex information on the MongoDB website.
/// For more info look at the ensureIndex information on the MongoDB website. (old wire protocol)
Document::Ptr getLastErrorDoc(Connection & connection) const;
/// Sends the getLastError command to the database and returns the error document.
/// (old wire protocol)
std::string getLastError(Connection & connection) const;
/// Sends the getLastError command to the database and returns the err element
/// from the error document. When err is null, an empty string is returned.
/// (old wire protocol)
static const std::string AUTH_MONGODB_CR;
/// Default authentication mechanism prior to MongoDB 3.0.
@ -115,6 +137,27 @@ namespace MongoDB
static const std::string AUTH_SCRAM_SHA1;
/// Default authentication mechanism for MongoDB 3.0.
enum WireVersion
/// Wire version as reported by the command hello.
/// See details in MongoDB github, repository specifications.
/// @see queryServerHello
{
VER_26 = 1,
VER_26_2 = 2,
VER_30 = 3,
VER_32 = 4,
VER_34 = 5,
VER_36 = 6, ///< First wire version that supports OP_MSG
VER_40 = 7,
VER_42 = 8,
VER_44 = 9,
VER_50 = 13,
VER_51 = 14, ///< First wire version that supports only OP_MSG
VER_52 = 15,
VER_53 = 16,
VER_60 = 17
};
protected:
bool authCR(Connection & connection, const std::string & username, const std::string & password);
bool authSCRAM(Connection & connection, const std::string & username, const std::string & password);
@ -127,6 +170,12 @@ namespace MongoDB
//
// inlines
//
inline const std::string & Database::name() const
{
return _dbname;
}
inline Poco::SharedPtr<Poco::MongoDB::QueryRequest> Database::createCommand() const
{
Poco::SharedPtr<Poco::MongoDB::QueryRequest> cmd = createQueryRequest("$cmd");
@ -158,6 +207,24 @@ namespace MongoDB
return new Poco::MongoDB::UpdateRequest(_dbname + '.' + collectionName);
}
// -- New wire protocol commands
inline Poco::SharedPtr<Poco::MongoDB::OpMsgMessage> Database::createOpMsgMessage(const std::string & collectionName) const
{
return new Poco::MongoDB::OpMsgMessage(_dbname, collectionName);
}
inline Poco::SharedPtr<Poco::MongoDB::OpMsgMessage> Database::createOpMsgMessage() const
{
// Collection name for database commands is not needed.
return createOpMsgMessage("");
}
inline Poco::SharedPtr<Poco::MongoDB::OpMsgCursor> Database::createOpMsgCursor(const std::string & collectionName) const
{
return new Poco::MongoDB::OpMsgCursor(_dbname, collectionName);
}
}
} // namespace Poco::MongoDB

View File

@ -31,6 +31,7 @@ namespace Poco
namespace MongoDB
{
class Array;
class ElementFindByName
{
@ -48,8 +49,8 @@ namespace MongoDB
/// Represents a MongoDB (BSON) document.
{
public:
typedef SharedPtr<Document> Ptr;
typedef std::vector<Document::Ptr> Vector;
using Ptr = SharedPtr<Document>;
using Vector = std::vector<Document::Ptr>;
Document();
/// Creates an empty Document.
@ -86,6 +87,10 @@ namespace MongoDB
/// Unlike the other add methods, this method returns
/// a reference to the new document.
Array & addNewArray(const std::string & name);
/// Create a new array and add it to this document.
/// Method returns a reference to the new array.
void clear();
/// Removes all elements from the document.
@ -95,7 +100,7 @@ namespace MongoDB
bool empty() const;
/// Returns true if the document doesn't contain any documents.
bool exists(const std::string & name);
bool exists(const std::string & name) const;
/// Returns true if the document has an element with the given name.
template <typename T>
@ -158,6 +163,9 @@ namespace MongoDB
/// return an Int64. When the element is not found, a
/// Poco::NotFoundException will be thrown.
bool remove(const std::string & name);
/// Removes an element from the document.
template <typename T>
bool isType(const std::string & name) const
/// Returns true when the type of the element equals the TypeId of ElementTrait.
@ -227,12 +235,23 @@ namespace MongoDB
}
inline bool Document::exists(const std::string & name)
inline bool Document::exists(const std::string & name) const
{
return std::find_if(_elements.begin(), _elements.end(), ElementFindByName(name)) != _elements.end();
}
inline bool Document::remove(const std::string & name)
{
auto it = std::find_if(_elements.begin(), _elements.end(), ElementFindByName(name));
if (it == _elements.end())
return false;
_elements.erase(it);
return true;
}
inline std::size_t Document::size() const
{
return _elements.size();

View File

@ -45,7 +45,7 @@ namespace MongoDB
/// Represents an Element of a Document or an Array.
{
public:
typedef Poco::SharedPtr<Element> Ptr;
using Ptr = Poco::SharedPtr<Element>;
explicit Element(const std::string & name);
/// Creates the Element with the given name.
@ -80,7 +80,7 @@ namespace MongoDB
}
typedef std::list<Element::Ptr> ElementSet;
using ElementSet = std::list<Element::Ptr>;
template <typename T>
@ -266,7 +266,7 @@ namespace MongoDB
}
typedef Nullable<unsigned char> NullValue;
using NullValue = Nullable<unsigned char>;
// BSON Null Value

View File

@ -35,7 +35,7 @@ namespace MongoDB
/// Represents JavaScript type in BSON.
{
public:
typedef SharedPtr<JavaScriptCode> Ptr;
using Ptr = SharedPtr<JavaScriptCode>;
JavaScriptCode();
/// Creates an empty JavaScriptCode object.

View File

@ -28,6 +28,9 @@ namespace MongoDB
{
class Message; // Required to disambiguate friend declaration in MessageHeader.
class MongoDB_API MessageHeader
/// Represents the message header which is always prepended to a
/// MongoDB request or response message.
@ -37,14 +40,18 @@ namespace MongoDB
enum OpCode
{
// Opcodes deprecated in MongoDB 5.0
OP_REPLY = 1,
OP_MSG = 1000,
OP_UPDATE = 2001,
OP_INSERT = 2002,
OP_QUERY = 2004,
OP_GET_MORE = 2005,
OP_DELETE = 2006,
OP_KILL_CURSORS = 2007
OP_KILL_CURSORS = 2007,
/// Opcodes supported in MongoDB 5.1 and later
OP_COMPRESSED = 2012,
OP_MSG = 2013
};
explicit MessageHeader(OpCode);

View File

@ -33,6 +33,13 @@
//
#if defined(_WIN32) && defined(POCO_DLL)
# if defined(MongoDB_EXPORTS)
# define MongoDB_API __declspec(dllexport)
# else
# define MongoDB_API __declspec(dllimport)
# endif
#endif
#if !defined(MongoDB_API)
@ -47,6 +54,11 @@
//
// Automatically link MongoDB library.
//
#if defined(_MSC_VER)
# if !defined(POCO_NO_AUTOMATIC_LIBS) && !defined(MongoDB_EXPORTS)
# pragma comment(lib, "PocoMongoDB" POCO_LIB_SUFFIX)
# endif
#endif
#endif // MongoDBMongoDB_INCLUDED

View File

@ -44,7 +44,7 @@ namespace MongoDB
/// as its value.
{
public:
typedef SharedPtr<ObjectId> Ptr;
using Ptr = SharedPtr<ObjectId>;
explicit ObjectId(const std::string & id);
/// Creates an ObjectId from a string.

View File

@ -0,0 +1,96 @@
//
// OpMsgCursor.h
//
// Library: MongoDB
// Package: MongoDB
// Module: OpMsgCursor
//
// Definition of the OpMsgCursor class.
//
// Copyright (c) 2012, Applied Informatics Software Engineering GmbH.
// and Contributors.
//
// SPDX-License-Identifier: BSL-1.0
//
#ifndef MongoDB_OpMsgCursor_INCLUDED
#define MongoDB_OpMsgCursor_INCLUDED
#include "Poco/MongoDB/Connection.h"
#include "Poco/MongoDB/MongoDB.h"
#include "Poco/MongoDB/OpMsgMessage.h"
namespace Poco
{
namespace MongoDB
{
class MongoDB_API OpMsgCursor : public Document
/// OpMsgCursor is an helper class for querying multiple documents using OpMsgMessage.
{
public:
OpMsgCursor(const std::string & dbname, const std::string & collectionName);
/// Creates a OpMsgCursor for the given database and collection.
virtual ~OpMsgCursor();
/// Destroys the OpMsgCursor.
void setEmptyFirstBatch(bool empty);
/// Empty first batch is used to get error response faster with little server processing
bool emptyFirstBatch() const;
void setBatchSize(Int32 batchSize);
/// Set non-default batch size
Int32 batchSize() const;
/// Current batch size (zero or negative number indicates default batch size)
Int64 cursorID() const;
OpMsgMessage & next(Connection & connection);
/// Tries to get the next documents. As long as response message has a
/// cursor ID next can be called to retrieve the next bunch of documents.
///
/// The cursor must be killed (see kill()) when not all documents are needed.
OpMsgMessage & query();
/// Returns the associated query.
void kill(Connection & connection);
/// Kills the cursor and reset it so that it can be reused.
private:
OpMsgMessage _query;
OpMsgMessage _response;
bool _emptyFirstBatch{false};
Int32 _batchSize{-1};
/// Batch size used in the cursor. Zero or negative value means that default shall be used.
Int64 _cursorID{0};
};
//
// inlines
//
inline OpMsgMessage & OpMsgCursor::query()
{
return _query;
}
inline Int64 OpMsgCursor::cursorID() const
{
return _cursorID;
}
}
} // namespace Poco::MongoDB
#endif // MongoDB_OpMsgCursor_INCLUDED

View File

@ -0,0 +1,163 @@
//
// OpMsgMessage.h
//
// Library: MongoDB
// Package: MongoDB
// Module: OpMsgMessage
//
// Definition of the OpMsgMessage class.
//
// Copyright (c) 2022, Applied Informatics Software Engineering GmbH.
// and Contributors.
//
// SPDX-License-Identifier: BSL-1.0
//
#ifndef MongoDB_OpMsgMessage_INCLUDED
#define MongoDB_OpMsgMessage_INCLUDED
#include "Poco/MongoDB/Document.h"
#include "Poco/MongoDB/Message.h"
#include "Poco/MongoDB/MongoDB.h"
#include <string>
namespace Poco
{
namespace MongoDB
{
class MongoDB_API OpMsgMessage : public Message
/// This class represents a request/response (OP_MSG) to send requests and receive responses to/from MongoDB.
{
public:
// Constants for most often used MongoDB commands that can be sent using OP_MSG
// For complete list see: https://www.mongodb.com/docs/manual/reference/command/
// Query and write
static const std::string CMD_INSERT;
static const std::string CMD_DELETE;
static const std::string CMD_UPDATE;
static const std::string CMD_FIND;
static const std::string CMD_FIND_AND_MODIFY;
static const std::string CMD_GET_MORE;
// Aggregation
static const std::string CMD_AGGREGATE;
static const std::string CMD_COUNT;
static const std::string CMD_DISTINCT;
static const std::string CMD_MAP_REDUCE;
// Replication and administration
static const std::string CMD_HELLO;
static const std::string CMD_REPL_SET_GET_STATUS;
static const std::string CMD_REPL_SET_GET_CONFIG;
static const std::string CMD_CREATE;
static const std::string CMD_CREATE_INDEXES;
static const std::string CMD_DROP;
static const std::string CMD_DROP_DATABASE;
static const std::string CMD_KILL_CURSORS;
static const std::string CMD_LIST_DATABASES;
static const std::string CMD_LIST_INDEXES;
// Diagnostic
static const std::string CMD_BUILD_INFO;
static const std::string CMD_COLL_STATS;
static const std::string CMD_DB_STATS;
static const std::string CMD_HOST_INFO;
enum Flags : UInt32
{
MSG_FLAGS_DEFAULT = 0,
MSG_CHECKSUM_PRESENT = (1 << 0),
MSG_MORE_TO_COME = (1 << 1),
/// Sender will send another message and is not prepared for overlapping messages
MSG_EXHAUST_ALLOWED = (1 << 16)
/// Client is prepared for multiple replies (using the moreToCome bit) to this request
};
OpMsgMessage();
/// Creates an OpMsgMessage for response.
OpMsgMessage(const std::string & databaseName, const std::string & collectionName, UInt32 flags = MSG_FLAGS_DEFAULT);
/// Creates an OpMsgMessage for requests.
virtual ~OpMsgMessage();
const std::string & databaseName() const;
const std::string & collectionName() const;
void setCommandName(const std::string & command);
/// Sets the command name and clears the command document
void setCursor(Poco::Int64 cursorID, Poco::Int32 batchSize = -1);
/// Sets the command "getMore" for the cursor id with batch size (if it is not negative).
const std::string & commandName() const;
/// Current command name.
void setAcknowledgedRequest(bool ack);
/// Set false to create request that does not return response.
/// It has effect only for commands that write or delete documents.
/// Default is true (request returns acknowledge response).
bool acknowledgedRequest() const;
UInt32 flags() const;
Document & body();
/// Access to body document.
/// Additional query arguments shall be added after setting the command name.
const Document & body() const;
Document::Vector & documents();
/// Documents prepared for request or retrieved in response.
const Document::Vector & documents() const;
/// Documents prepared for request or retrieved in response.
bool responseOk() const;
/// Reads "ok" status from the response message.
void clear();
/// Clears the message.
void send(std::ostream & ostr);
/// Writes the request to stream.
void read(std::istream & istr);
/// Reads the response from the stream.
private:
enum PayloadType : UInt8
{
PAYLOAD_TYPE_0 = 0,
PAYLOAD_TYPE_1 = 1
};
std::string _databaseName;
std::string _collectionName;
UInt32 _flags{MSG_FLAGS_DEFAULT};
std::string _commandName;
bool _acknowledged{true};
Document _body;
Document::Vector _documents;
};
}
} // namespace Poco::MongoDB
#endif // MongoDB_OpMsgMessage_INCLUDED

View File

@ -94,7 +94,23 @@ namespace MongoDB
operator Connection::Ptr() { return _connection; }
#if defined(POCO_ENABLE_CPP11)
// Disable copy to prevent unwanted release of resources: C++11 way
PooledConnection(const PooledConnection &) = delete;
PooledConnection & operator=(const PooledConnection &) = delete;
// Enable move semantics
PooledConnection(PooledConnection && other) = default;
PooledConnection & operator=(PooledConnection &&) = default;
#endif
private:
#if !defined(POCO_ENABLE_CPP11)
// Disable copy to prevent unwanted release of resources: pre C++11 way
PooledConnection(const PooledConnection &);
PooledConnection & operator=(const PooledConnection &);
#endif
Poco::ObjectPool<Connection, Connection::Ptr> & _pool;
Connection::Ptr _connection;
};

View File

@ -33,7 +33,7 @@ namespace MongoDB
/// Represents a regular expression in BSON format.
{
public:
typedef SharedPtr<RegularExpression> Ptr;
using Ptr = SharedPtr<RegularExpression>;
RegularExpression();
/// Creates an empty RegularExpression.

View File

@ -38,6 +38,9 @@ namespace MongoDB
ResponseMessage();
/// Creates an empty ResponseMessage.
ResponseMessage(const Int64 & cursorID);
/// Creates an ResponseMessage for existing cursor ID.
virtual ~ResponseMessage();
/// Destroys the ResponseMessage.

View File

@ -31,7 +31,7 @@ Array::~Array()
}
Element::Ptr Array::get(int pos) const
Element::Ptr Array::get(std::size_t pos) const
{
std::string name = Poco::NumberFormatter::format(pos);
return Document::get(name);

View File

@ -319,4 +319,30 @@ void Connection::sendRequest(RequestMessage& request, ResponseMessage& response)
}
void Connection::sendRequest(OpMsgMessage& request, OpMsgMessage& response)
{
Poco::Net::SocketOutputStream sos(_socket);
request.send(sos);
response.clear();
readResponse(response);
}
void Connection::sendRequest(OpMsgMessage& request)
{
request.setAcknowledgedRequest(false);
Poco::Net::SocketOutputStream sos(_socket);
request.send(sos);
}
void Connection::readResponse(OpMsgMessage& response)
{
Poco::Net::SocketInputStream sis(_socket);
response.read(sis);
}
} } // Poco::MongoDB

View File

@ -33,6 +33,12 @@ Cursor::Cursor(const std::string& fullCollectionName, QueryRequest::Flags flags)
}
Cursor::Cursor(const Document& aggregationResponse) :
_query(aggregationResponse.get<Poco::MongoDB::Document::Ptr>("cursor")->get<std::string>("ns")),
_response(aggregationResponse.get<Poco::MongoDB::Document::Ptr>("cursor")->get<Int64>("id"))
{
}
Cursor::~Cursor()
{
try

View File

@ -334,6 +334,50 @@ bool Database::authSCRAM(Connection& connection, const std::string& username, co
}
Document::Ptr Database::queryBuildInfo(Connection& connection) const
{
// build info can be issued on "config" system database
Poco::SharedPtr<Poco::MongoDB::QueryRequest> request = createCommand();
request->selector().add("buildInfo", 1);
Poco::MongoDB::ResponseMessage response;
connection.sendRequest(*request, response);
Document::Ptr buildInfo;
if ( response.documents().size() > 0 )
{
buildInfo = response.documents()[0];
}
else
{
throw Poco::ProtocolException("Didn't get a response from the buildinfo command");
}
return buildInfo;
}
Document::Ptr Database::queryServerHello(Connection& connection) const
{
// hello can be issued on "config" system database
Poco::SharedPtr<Poco::MongoDB::QueryRequest> request = createCommand();
request->selector().add("hello", 1);
Poco::MongoDB::ResponseMessage response;
connection.sendRequest(*request, response);
Document::Ptr hello;
if ( response.documents().size() > 0 )
{
hello = response.documents()[0];
}
else
{
throw Poco::ProtocolException("Didn't get a response from the hello command");
}
return hello;
}
Int64 Database::count(Connection& connection, const std::string& collectionName) const
{
Poco::SharedPtr<Poco::MongoDB::QueryRequest> countRequest = createCountRequest(collectionName);
@ -390,7 +434,7 @@ Document::Ptr Database::getLastErrorDoc(Connection& connection) const
{
Document::Ptr errorDoc;
Poco::SharedPtr<Poco::MongoDB::QueryRequest> request = createQueryRequest("$cmd");
Poco::SharedPtr<Poco::MongoDB::QueryRequest> request = createCommand();
request->setNumberToReturn(1);
request->selector().add("getLastError", 1);
@ -420,7 +464,7 @@ std::string Database::getLastError(Connection& connection) const
Poco::SharedPtr<Poco::MongoDB::QueryRequest> Database::createCountRequest(const std::string& collectionName) const
{
Poco::SharedPtr<Poco::MongoDB::QueryRequest> request = createQueryRequest("$cmd");
Poco::SharedPtr<Poco::MongoDB::QueryRequest> request = createCommand();
request->setNumberToReturn(1);
request->selector().add("count", collectionName);
return request;

View File

@ -35,6 +35,14 @@ Document::~Document()
}
Array& Document::addNewArray(const std::string& name)
{
Array::Ptr newArray = new Array();
add(name, newArray);
return *newArray;
}
Element::Ptr Document::get(const std::string& name) const
{
Element::Ptr element;
@ -198,7 +206,7 @@ void Document::write(BinaryWriter& writer)
else
{
std::stringstream sstream;
Poco::BinaryWriter tempWriter(sstream);
Poco::BinaryWriter tempWriter(sstream, BinaryWriter::LITTLE_ENDIAN_BYTE_ORDER);
for (ElementSet::iterator it = _elements.begin(); it != _elements.end(); ++it)
{
tempWriter << static_cast<unsigned char>((*it)->type());

View File

@ -42,7 +42,7 @@ void MessageHeader::read(BinaryReader& reader)
Int32 opCode;
reader >> opCode;
_opCode = (OpCode) opCode;
_opCode = static_cast<OpCode>(opCode);
if (!reader.good())
{
@ -56,7 +56,7 @@ void MessageHeader::write(BinaryWriter& writer)
writer << _messageLength;
writer << _requestID;
writer << _responseTo;
writer << (Int32) _opCode;
writer << static_cast<Int32>(_opCode);
}

View File

@ -0,0 +1,187 @@
//
// OpMsgCursor.cpp
//
// Library: MongoDB
// Package: MongoDB
// Module: OpMsgCursor
//
// Copyright (c) 2022, Applied Informatics Software Engineering GmbH.
// and Contributors.
//
// SPDX-License-Identifier: BSL-1.0
//
#include "Poco/MongoDB/OpMsgCursor.h"
#include "Poco/MongoDB/Array.h"
//
// NOTE:
//
// MongoDB specification indicates that the flag MSG_EXHAUST_ALLOWED shall be
// used in the request when the receiver is ready to receive multiple messages
// without sending additional requests in between. Sender (MongoDB) indicates
// that more messages follow with flag MSG_MORE_TO_COME.
//
// It seems that this does not work properly. MSG_MORE_TO_COME is set and reading
// next messages sometimes works, however often the data is missing in response
// or the message header contains wrong message length and reading blocks.
// Opcode in the header is correct.
//
// Using MSG_EXHAUST_ALLOWED is therefore currently disabled.
//
// It seems that related JIRA ticket is:
//
// https://jira.mongodb.org/browse/SERVER-57297
//
// https://github.com/mongodb/specifications/blob/master/source/message/OP_MSG.rst
//
#define MONGODB_EXHAUST_ALLOWED_WORKS false
namespace Poco {
namespace MongoDB {
static const std::string keyCursor {"cursor"};
static const std::string keyFirstBatch {"firstBatch"};
static const std::string keyNextBatch {"nextBatch"};
static Poco::Int64 cursorIdFromResponse(const MongoDB::Document& doc);
OpMsgCursor::OpMsgCursor(const std::string& db, const std::string& collection):
#if MONGODB_EXHAUST_ALLOWED_WORKS
_query(db, collection, OpMsgMessage::MSG_EXHAUST_ALLOWED)
#else
_query(db, collection)
#endif
{
}
OpMsgCursor::~OpMsgCursor()
{
try
{
poco_assert_dbg(_cursorID == 0);
}
catch (...)
{
}
}
void OpMsgCursor::setEmptyFirstBatch(bool empty)
{
_emptyFirstBatch = empty;
}
bool OpMsgCursor::emptyFirstBatch() const
{
return _emptyFirstBatch;
}
void OpMsgCursor::setBatchSize(Int32 batchSize)
{
_batchSize = batchSize;
}
Int32 OpMsgCursor::batchSize() const
{
return _batchSize;
}
OpMsgMessage& OpMsgCursor::next(Connection& connection)
{
if (_cursorID == 0)
{
_response.clear();
if (_emptyFirstBatch || _batchSize > 0)
{
Int32 bsize = _emptyFirstBatch ? 0 : _batchSize;
if (_query.commandName() == OpMsgMessage::CMD_FIND)
{
_query.body().add("batchSize", bsize);
}
else if (_query.commandName() == OpMsgMessage::CMD_AGGREGATE)
{
auto& cursorDoc = _query.body().addNewDocument("cursor");
cursorDoc.add("batchSize", bsize);
}
}
connection.sendRequest(_query, _response);
const auto& rdoc = _response.body();
_cursorID = cursorIdFromResponse(rdoc);
}
else
{
#if MONGODB_EXHAUST_ALLOWED_WORKS
std::cout << "Response flags: " << _response.flags() << std::endl;
if (_response.flags() & OpMsgMessage::MSG_MORE_TO_COME)
{
std::cout << "More to come. Reading more response: " << std::endl;
_response.clear();
connection.readResponse(_response);
}
else
#endif
{
_response.clear();
_query.setCursor(_cursorID, _batchSize);
connection.sendRequest(_query, _response);
}
}
const auto& rdoc = _response.body();
_cursorID = cursorIdFromResponse(rdoc);
return _response;
}
void OpMsgCursor::kill(Connection& connection)
{
_response.clear();
if (_cursorID != 0)
{
_query.setCommandName(OpMsgMessage::CMD_KILL_CURSORS);
MongoDB::Array::Ptr cursors = new MongoDB::Array();
cursors->add<Poco::Int64>(_cursorID);
_query.body().add("cursors", cursors);
connection.sendRequest(_query, _response);
const auto killed = _response.body().get<MongoDB::Array::Ptr>("cursorsKilled", nullptr);
if (!killed || killed->size() != 1 || killed->get<Poco::Int64>(0, -1) != _cursorID)
{
throw Poco::ProtocolException("Cursor not killed as expected: " + std::to_string(_cursorID));
}
_cursorID = 0;
_query.clear();
_response.clear();
}
}
Poco::Int64 cursorIdFromResponse(const MongoDB::Document& doc)
{
Poco::Int64 id {0};
auto cursorDoc = doc.get<Document::Ptr>(keyCursor, nullptr);
if(cursorDoc)
{
id = cursorDoc->get<Poco::Int64>("id", 0);
}
return id;
}
} } // Namespace Poco::MongoDB

View File

@ -0,0 +1,412 @@
//
// OpMsgMessage.cpp
//
// Library: MongoDB
// Package: MongoDB
// Module: OpMsgMessage
//
// Copyright (c) 2022, Applied Informatics Software Engineering GmbH.
// and Contributors.
//
// SPDX-License-Identifier: BSL-1.0
//
#include "Poco/MongoDB/OpMsgMessage.h"
#include "Poco/MongoDB/MessageHeader.h"
#include "Poco/MongoDB/Array.h"
#include "Poco/StreamCopier.h"
#include "Poco/Logger.h"
#define POCO_MONGODB_DUMP false
namespace Poco {
namespace MongoDB {
// Query and write
const std::string OpMsgMessage::CMD_INSERT { "insert" };
const std::string OpMsgMessage::CMD_DELETE { "delete" };
const std::string OpMsgMessage::CMD_UPDATE { "update" };
const std::string OpMsgMessage::CMD_FIND { "find" };
const std::string OpMsgMessage::CMD_FIND_AND_MODIFY { "findAndModify" };
const std::string OpMsgMessage::CMD_GET_MORE { "getMore" };
// Aggregation
const std::string OpMsgMessage::CMD_AGGREGATE { "aggregate" };
const std::string OpMsgMessage::CMD_COUNT { "count" };
const std::string OpMsgMessage::CMD_DISTINCT { "distinct" };
const std::string OpMsgMessage::CMD_MAP_REDUCE { "mapReduce" };
// Replication and administration
const std::string OpMsgMessage::CMD_HELLO { "hello" };
const std::string OpMsgMessage::CMD_REPL_SET_GET_STATUS { "replSetGetStatus" };
const std::string OpMsgMessage::CMD_REPL_SET_GET_CONFIG { "replSetGetConfig" };
const std::string OpMsgMessage::CMD_CREATE { "create" };
const std::string OpMsgMessage::CMD_CREATE_INDEXES { "createIndexes" };
const std::string OpMsgMessage::CMD_DROP { "drop" };
const std::string OpMsgMessage::CMD_DROP_DATABASE { "dropDatabase" };
const std::string OpMsgMessage::CMD_KILL_CURSORS { "killCursors" };
const std::string OpMsgMessage::CMD_LIST_DATABASES { "listDatabases" };
const std::string OpMsgMessage::CMD_LIST_INDEXES { "listIndexes" };
// Diagnostic
const std::string OpMsgMessage::CMD_BUILD_INFO { "buildInfo" };
const std::string OpMsgMessage::CMD_COLL_STATS { "collStats" };
const std::string OpMsgMessage::CMD_DB_STATS { "dbStats" };
const std::string OpMsgMessage::CMD_HOST_INFO { "hostInfo" };
static const std::string& commandIdentifier(const std::string& command);
/// Commands have different names for the payload that is sent in a separate section
static const std::string keyCursor {"cursor"};
static const std::string keyFirstBatch {"firstBatch"};
static const std::string keyNextBatch {"nextBatch"};
OpMsgMessage::OpMsgMessage() :
Message(MessageHeader::OP_MSG)
{
}
OpMsgMessage::OpMsgMessage(const std::string& databaseName, const std::string& collectionName, UInt32 flags) :
Message(MessageHeader::OP_MSG),
_databaseName(databaseName),
_collectionName(collectionName),
_flags(flags)
{
}
OpMsgMessage::~OpMsgMessage()
{
}
const std::string& OpMsgMessage::databaseName() const
{
return _databaseName;
}
const std::string& OpMsgMessage::collectionName() const
{
return _collectionName;
}
void OpMsgMessage::setCommandName(const std::string& command)
{
_commandName = command;
_body.clear();
// IMPORTANT: Command name must be first
if (_collectionName.empty())
{
// Collection is not specified. It is assumed that this particular command does
// not need it.
_body.add(_commandName, Int32(1));
}
else
{
_body.add(_commandName, _collectionName);
}
_body.add("$db", _databaseName);
}
void OpMsgMessage::setCursor(Poco::Int64 cursorID, Poco::Int32 batchSize)
{
_commandName = OpMsgMessage::CMD_GET_MORE;
_body.clear();
// IMPORTANT: Command name must be first
_body.add(_commandName, cursorID);
_body.add("$db", _databaseName);
_body.add("collection", _collectionName);
if (batchSize > 0)
{
_body.add("batchSize", batchSize);
}
}
const std::string& OpMsgMessage::commandName() const
{
return _commandName;
}
void OpMsgMessage::setAcknowledgedRequest(bool ack)
{
const auto& id = commandIdentifier(_commandName);
if (id.empty())
return;
_acknowledged = ack;
auto writeConcern = _body.get<Document::Ptr>("writeConcern", nullptr);
if (writeConcern)
writeConcern->remove("w");
if (ack)
{
_flags = _flags & (~MSG_MORE_TO_COME);
}
else
{
_flags = _flags | MSG_MORE_TO_COME;
if (!writeConcern)
_body.addNewDocument("writeConcern").add("w", 0);
else
writeConcern->add("w", 0);
}
}
bool OpMsgMessage::acknowledgedRequest() const
{
return _acknowledged;
}
UInt32 OpMsgMessage::flags() const
{
return _flags;
}
Document& OpMsgMessage::body()
{
return _body;
}
const Document& OpMsgMessage::body() const
{
return _body;
}
Document::Vector& OpMsgMessage::documents()
{
return _documents;
}
const Document::Vector& OpMsgMessage::documents() const
{
return _documents;
}
bool OpMsgMessage::responseOk() const
{
Poco::Int64 ok {false};
if (_body.exists("ok"))
{
ok = _body.getInteger("ok");
}
return (ok != 0);
}
void OpMsgMessage::clear()
{
_flags = MSG_FLAGS_DEFAULT;
_commandName.clear();
_body.clear();
_documents.clear();
}
void OpMsgMessage::send(std::ostream& ostr)
{
BinaryWriter socketWriter(ostr, BinaryWriter::LITTLE_ENDIAN_BYTE_ORDER);
// Serialise the body
std::stringstream ss;
BinaryWriter writer(ss, BinaryWriter::LITTLE_ENDIAN_BYTE_ORDER);
writer << _flags;
writer << PAYLOAD_TYPE_0;
_body.write(writer);
if (!_documents.empty())
{
// Serialise attached documents
std::stringstream ssdoc;
BinaryWriter wdoc(ssdoc, BinaryWriter::LITTLE_ENDIAN_BYTE_ORDER);
for (auto& doc: _documents)
{
doc->write(wdoc);
}
wdoc.flush();
const std::string& identifier = commandIdentifier(_commandName);
const Poco::Int32 size = static_cast<Poco::Int32>(sizeof(size) + identifier.size() + 1 + ssdoc.tellp());
writer << PAYLOAD_TYPE_1;
writer << size;
writer.writeCString(identifier.c_str());
StreamCopier::copyStream(ssdoc, ss);
}
writer.flush();
#if POCO_MONGODB_DUMP
const std::string section = ss.str();
std::string dump;
Logger::formatDump(dump, section.data(), section.length());
std::cout << dump << std::endl;
#endif
messageLength(static_cast<Poco::Int32>(ss.tellp()));
_header.write(socketWriter);
StreamCopier::copyStream(ss, ostr);
ostr.flush();
}
void OpMsgMessage::read(std::istream& istr)
{
std::string message;
{
BinaryReader reader(istr, BinaryReader::LITTLE_ENDIAN_BYTE_ORDER);
_header.read(reader);
poco_assert_dbg(_header.opCode() == _header.OP_MSG);
const std::streamsize remainingSize {_header.getMessageLength() - _header.MSG_HEADER_SIZE };
message.reserve(remainingSize);
#if POCO_MONGODB_DUMP
std::cout
<< "Message hdr: " << _header.getMessageLength() << " " << remainingSize << " "
<< _header.opCode() << " " << _header.getRequestID() << " " << _header.responseTo()
<< std::endl;
#endif
reader.readRaw(remainingSize, message);
#if POCO_MONGODB_DUMP
std::string dump;
Logger::formatDump(dump, message.data(), message.length());
std::cout << dump << std::endl;
#endif
}
// Read complete message and then interpret it.
std::istringstream msgss(message);
BinaryReader reader(msgss, BinaryReader::LITTLE_ENDIAN_BYTE_ORDER);
Poco::UInt8 payloadType {0xFF};
reader >> _flags;
reader >> payloadType;
poco_assert_dbg(payloadType == PAYLOAD_TYPE_0);
_body.read(reader);
// Read next sections from the buffer
while (msgss.good())
{
// NOTE: Not tested yet with database, because it returns everything in the body.
// Does MongoDB ever return documents as Payload type 1?
reader >> payloadType;
if (!msgss.good())
{
break;
}
poco_assert_dbg(payloadType == PAYLOAD_TYPE_1);
#if POCO_MONGODB_DUMP
std::cout << "section payload: " << payloadType << std::endl;
#endif
Poco::Int32 sectionSize {0};
reader >> sectionSize;
poco_assert_dbg(sectionSize > 0);
#if POCO_MONGODB_DUMP
std::cout << "section size: " << sectionSize << std::endl;
#endif
std::streamoff offset = sectionSize - sizeof(sectionSize);
std::streampos endOfSection = msgss.tellg() + offset;
std::string identifier;
reader.readCString(identifier);
#if POCO_MONGODB_DUMP
std::cout << "section identifier: " << identifier << std::endl;
#endif
// Loop to read documents from this section.
while (msgss.tellg() < endOfSection)
{
#if POCO_MONGODB_DUMP
std::cout << "section doc: " << msgss.tellg() << " " << endOfSection << std::endl;
#endif
Document::Ptr doc = new Document();
doc->read(reader);
_documents.push_back(doc);
if (msgss.tellg() < 0)
{
break;
}
}
}
// Extract documents from the cursor batch if they are there.
MongoDB::Array::Ptr batch;
auto curDoc = _body.get<MongoDB::Document::Ptr>(keyCursor, nullptr);
if (curDoc)
{
batch = curDoc->get<MongoDB::Array::Ptr>(keyFirstBatch, nullptr);
if (!batch)
{
batch = curDoc->get<MongoDB::Array::Ptr>(keyNextBatch, nullptr);
}
}
if (batch)
{
for(std::size_t i = 0; i < batch->size(); i++)
{
const auto& d = batch->get<MongoDB::Document::Ptr>(i, nullptr);
if (d)
{
_documents.push_back(d);
}
}
}
}
const std::string& commandIdentifier(const std::string& command)
{
// Names of identifiers for commands that send bulk documents in the request
// The identifier is set in the section type 1.
static std::map<std::string, std::string> identifiers {
{ OpMsgMessage::CMD_INSERT, "documents" },
{ OpMsgMessage::CMD_DELETE, "deletes" },
{ OpMsgMessage::CMD_UPDATE, "updates" },
// Not sure if create index can send document section
{ OpMsgMessage::CMD_CREATE_INDEXES, "indexes" }
};
const auto i = identifiers.find(command);
if (i != identifiers.end())
{
return i->second;
}
// This likely means that documents are incorrectly set for a command
// that does not send list of documents in section type 1.
static const std::string emptyIdentifier;
return emptyIdentifier;
}
} } // namespace Poco::MongoDB

View File

@ -35,7 +35,7 @@ RequestMessage::~RequestMessage()
void RequestMessage::send(std::ostream& ostr)
{
std::stringstream ss;
BinaryWriter requestWriter(ss);
BinaryWriter requestWriter(ss, BinaryWriter::LITTLE_ENDIAN_BYTE_ORDER);
buildRequest(requestWriter);
requestWriter.flush();

View File

@ -30,6 +30,16 @@ ResponseMessage::ResponseMessage():
}
ResponseMessage::ResponseMessage(const Int64& cursorID):
Message(MessageHeader::OP_REPLY),
_responseFlags(0),
_cursorID(cursorID),
_startingFrom(0),
_numberReturned(0)
{
}
ResponseMessage::~ResponseMessage()
{
}

View File

@ -127,6 +127,9 @@ namespace Net
void setResolvedHost(std::string resolved_host) { _resolved_host.swap(resolved_host); }
std::string getResolvedHost() const { return _resolved_host; }
/// Returns the resolved IP address of the target HTTP server.
Poco::UInt16 getPort() const;
/// Returns the port number of the target HTTP server.

View File

@ -274,7 +274,9 @@ void SocketImpl::shutdown()
int SocketImpl::sendBytes(const void* buffer, int length, int flags)
{
if (_isBrokenTimeout)
bool blocking = _blocking && (flags & MSG_DONTWAIT) == 0;
if (_isBrokenTimeout && blocking)
{
if (_sndTimeout.totalMicroseconds() != 0)
{
@ -289,11 +291,13 @@ int SocketImpl::sendBytes(const void* buffer, int length, int flags)
if (_sockfd == POCO_INVALID_SOCKET) throw InvalidSocketException();
rc = ::send(_sockfd, reinterpret_cast<const char*>(buffer), length, flags);
}
while (_blocking && rc < 0 && lastError() == POCO_EINTR);
while (blocking && rc < 0 && lastError() == POCO_EINTR);
if (rc < 0)
{
int err = lastError();
if (err == POCO_EAGAIN || err == POCO_ETIMEDOUT)
if ((err == POCO_EAGAIN || err == POCO_EWOULDBLOCK) && !blocking)
;
else if (err == POCO_EAGAIN || err == POCO_ETIMEDOUT)
throw TimeoutException();
else
error(err);

View File

@ -183,6 +183,16 @@ namespace Net
/// Returns true iff a reused session was negotiated during
/// the handshake.
virtual void setBlocking(bool flag);
/// Sets the socket in blocking mode if flag is true,
/// disables blocking mode if flag is false.
virtual bool getBlocking() const;
/// Returns the blocking mode of the socket.
/// This method will only work if the blocking modes of
/// the socket are changed via the setBlocking method!
protected:
void acceptSSL();
/// Assume per-object mutex is locked.

View File

@ -201,6 +201,16 @@ namespace Net
/// Returns true iff a reused session was negotiated during
/// the handshake.
virtual void setBlocking(bool flag);
/// Sets the socket in blocking mode if flag is true,
/// disables blocking mode if flag is false.
virtual bool getBlocking() const;
/// Returns the blocking mode of the socket.
/// This method will only work if the blocking modes of
/// the socket are changed via the setBlocking method!
protected:
void acceptSSL();
/// Performs a SSL server-side handshake.

View File

@ -629,5 +629,15 @@ bool SecureSocketImpl::sessionWasReused()
return false;
}
void SecureSocketImpl::setBlocking(bool flag)
{
_pSocket->setBlocking(flag);
}
bool SecureSocketImpl::getBlocking() const
{
return _pSocket->getBlocking();
}
} } // namespace Poco::Net

View File

@ -237,5 +237,15 @@ int SecureStreamSocketImpl::completeHandshake()
return _impl.completeHandshake();
}
bool SecureStreamSocketImpl::getBlocking() const
{
return _impl.getBlocking();
}
void SecureStreamSocketImpl::setBlocking(bool flag)
{
_impl.setBlocking(flag);
}
} } // namespace Poco::Net

View File

@ -2,11 +2,11 @@
# NOTE: has nothing common with DBMS_TCP_PROTOCOL_VERSION,
# only DBMS_TCP_PROTOCOL_VERSION should be incremented on protocol changes.
SET(VERSION_REVISION 54474)
SET(VERSION_REVISION 54475)
SET(VERSION_MAJOR 23)
SET(VERSION_MINOR 5)
SET(VERSION_MINOR 6)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH 3920eb987f7ed837ada5de8907284adf123f0583)
SET(VERSION_DESCRIBE v23.5.1.1-testing)
SET(VERSION_STRING 23.5.1.1)
SET(VERSION_GITHASH 2fec796e73efda10a538a03af3205ce8ffa1b2de)
SET(VERSION_DESCRIBE v23.6.1.1-testing)
SET(VERSION_STRING 23.6.1.1)
# end of autochange

View File

@ -1,2 +1,2 @@
wget https://github.com/phracker/MacOSX-SDKs/releases/download/10.15/MacOSX10.15.sdk.tar.xz
tar xJf MacOSX10.15.sdk.tar.xz --strip-components=1
wget https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz
tar xJf MacOSX11.0.sdk.tar.xz --strip-components=1

View File

@ -88,7 +88,7 @@ add_contrib (thrift-cmake thrift)
# parquet/arrow/orc
add_contrib (arrow-cmake arrow) # requires: snappy, thrift, double-conversion
add_contrib (avro-cmake avro) # requires: snappy
add_contrib (protobuf-cmake protobuf)
add_contrib (google-protobuf-cmake google-protobuf)
add_contrib (openldap-cmake openldap)
add_contrib (grpc-cmake grpc)
add_contrib (msgpack-c-cmake msgpack-c)
@ -156,7 +156,7 @@ add_contrib (libgsasl-cmake libgsasl) # requires krb5
add_contrib (librdkafka-cmake librdkafka) # requires: libgsasl
add_contrib (nats-io-cmake nats-io)
add_contrib (isa-l-cmake isa-l)
add_contrib (libhdfs3-cmake libhdfs3) # requires: protobuf, krb5, isa-l
add_contrib (libhdfs3-cmake libhdfs3) # requires: google-protobuf, krb5, isa-l
add_contrib (hive-metastore-cmake hive-metastore) # requires: thrift/avro/arrow/libhdfs3
add_contrib (cppkafka-cmake cppkafka)
add_contrib (libpqxx-cmake libpqxx)

2
contrib/NuRaft vendored

@ -1 +1 @@
Subproject commit b56784be1aec568fb72aff47f281097c017623cb
Subproject commit 491eaf592d950e0e37accbe8b3f217e068c9fecf

View File

@ -1,6 +1,6 @@
option (ENABLE_AZURE_BLOB_STORAGE "Enable Azure blob storage" ${ENABLE_LIBRARIES})
if (NOT ENABLE_AZURE_BLOB_STORAGE OR BUILD_STANDALONE_KEEPER OR OS_FREEBSD)
if (NOT ENABLE_AZURE_BLOB_STORAGE OR BUILD_STANDALONE_KEEPER OR OS_FREEBSD OR (NOT ARCH_AMD64))
message(STATUS "Not using Azure blob storage")
return()
endif()

2
contrib/c-ares vendored

@ -1 +1 @@
Subproject commit afee6748b0b99acf4509d42fa37ac8422262f91b
Subproject commit 6360e96b5cf8e5980c887ce58ef727e53d77243a

View File

@ -48,6 +48,7 @@ SET(SRCS
"${LIBRARY_DIR}/src/lib/ares_platform.c"
"${LIBRARY_DIR}/src/lib/ares_process.c"
"${LIBRARY_DIR}/src/lib/ares_query.c"
"${LIBRARY_DIR}/src/lib/ares_rand.c"
"${LIBRARY_DIR}/src/lib/ares_search.c"
"${LIBRARY_DIR}/src/lib/ares_send.c"
"${LIBRARY_DIR}/src/lib/ares_strcasecmp.c"

2
contrib/capnproto vendored

@ -1 +1 @@
Subproject commit dc8b50b999777bcb23c89bb5907c785c3f654441
Subproject commit 976209a6d18074804f60d18ef99b6a809d27dadf

1
contrib/google-protobuf vendored Submodule

@ -0,0 +1 @@
Subproject commit c47efe2d8f6a60022b49ecd6cc23660687c8598f

View File

@ -5,7 +5,7 @@ if(NOT ENABLE_PROTOBUF)
return()
endif()
set(Protobuf_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/protobuf/src")
set(Protobuf_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/google-protobuf/src")
if(OS_FREEBSD AND SANITIZE STREQUAL "address")
# ../contrib/protobuf/src/google/protobuf/arena_impl.h:45:10: fatal error: 'sanitizer/asan_interface.h' file not found
# #include <sanitizer/asan_interface.h>
@ -17,8 +17,8 @@ if(OS_FREEBSD AND SANITIZE STREQUAL "address")
endif()
endif()
set(protobuf_source_dir "${ClickHouse_SOURCE_DIR}/contrib/protobuf")
set(protobuf_binary_dir "${ClickHouse_BINARY_DIR}/contrib/protobuf")
set(protobuf_source_dir "${ClickHouse_SOURCE_DIR}/contrib/google-protobuf")
set(protobuf_binary_dir "${ClickHouse_BINARY_DIR}/contrib/google-protobuf")
add_definitions(-DGOOGLE_PROTOBUF_CMAKE_BUILD)
@ -35,7 +35,6 @@ set(libprotobuf_lite_files
${protobuf_source_dir}/src/google/protobuf/arena.cc
${protobuf_source_dir}/src/google/protobuf/arenastring.cc
${protobuf_source_dir}/src/google/protobuf/extension_set.cc
${protobuf_source_dir}/src/google/protobuf/field_access_listener.cc
${protobuf_source_dir}/src/google/protobuf/generated_enum_util.cc
${protobuf_source_dir}/src/google/protobuf/generated_message_table_driven_lite.cc
${protobuf_source_dir}/src/google/protobuf/generated_message_util.cc
@ -86,6 +85,7 @@ set(libprotobuf_files
${protobuf_source_dir}/src/google/protobuf/empty.pb.cc
${protobuf_source_dir}/src/google/protobuf/extension_set_heavy.cc
${protobuf_source_dir}/src/google/protobuf/field_mask.pb.cc
${protobuf_source_dir}/src/google/protobuf/generated_message_bases.cc
${protobuf_source_dir}/src/google/protobuf/generated_message_reflection.cc
${protobuf_source_dir}/src/google/protobuf/generated_message_table_driven.cc
${protobuf_source_dir}/src/google/protobuf/io/gzip_stream.cc
@ -316,7 +316,7 @@ else ()
add_dependencies(protoc "${PROTOC_BUILD_DIR}/protoc")
endif ()
include("${ClickHouse_SOURCE_DIR}/contrib/protobuf-cmake/protobuf_generate.cmake")
include("${ClickHouse_SOURCE_DIR}/contrib/google-protobuf-cmake/protobuf_generate.cmake")
add_library(_protobuf INTERFACE)
target_link_libraries(_protobuf INTERFACE _libprotobuf)

2
contrib/libgsasl vendored

@ -1 +1 @@
Subproject commit f4e7bf0bb068030d57266f87ccac4c8c012fb5c4
Subproject commit 0fb79e7609ae5a5e015a41d24bcbadd48f8f5469

2
contrib/libxml2 vendored

@ -1 +1 @@
Subproject commit f507d167f1755b7eaea09fb1a44d29aab828b6d1
Subproject commit 223cb03a5d27b1b2393b266a8657443d046139d6

2
contrib/lz4 vendored

@ -1 +1 @@
Subproject commit 4c9431e9af596af0556e5da0ae99305bafb2b10b
Subproject commit e82198428c8061372d5adef1f9bfff4203f6081e

View File

@ -12,6 +12,7 @@ add_library (_lz4 ${SRCS})
add_library (ch_contrib::lz4 ALIAS _lz4)
target_compile_definitions (_lz4 PUBLIC LZ4_DISABLE_DEPRECATE_WARNINGS=1)
target_compile_definitions (_lz4 PUBLIC LZ4_FAST_DEC_LOOP=1)
if (SANITIZE STREQUAL "undefined")
target_compile_options (_lz4 PRIVATE -fno-sanitize=undefined)
endif ()

1
contrib/protobuf vendored

@ -1 +0,0 @@
Subproject commit 6bb70196c5360268d9f021bb7936fb0b551724c2

View File

@ -32,7 +32,7 @@ RUN arch=${TARGETARCH:-amd64} \
esac
ARG REPOSITORY="https://s3.amazonaws.com/clickhouse-builds/22.4/31c367d3cd3aefd316778601ff6565119fe36682/package_release"
ARG VERSION="23.4.2.11"
ARG VERSION="23.5.2.7"
ARG PACKAGES="clickhouse-keeper"
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -46,10 +46,12 @@ ENV CXX=clang++-${LLVM_VERSION}
# Rust toolchain and libraries
ENV RUSTUP_HOME=/rust/rustup
ENV CARGO_HOME=/rust/cargo
ENV PATH="/rust/cargo/env:${PATH}"
ENV PATH="/rust/cargo/bin:${PATH}"
RUN curl https://sh.rustup.rs -sSf | bash -s -- -y && \
chmod 777 -R /rust && \
rustup toolchain install nightly && \
rustup default nightly && \
rustup component add rust-src && \
rustup target add aarch64-unknown-linux-gnu && \
rustup target add x86_64-apple-darwin && \
rustup target add x86_64-unknown-freebsd && \

View File

@ -11,9 +11,11 @@ ccache_status () {
[ -O /build ] || git config --global --add safe.directory /build
mkdir -p /build/cmake/toolchain/darwin-x86_64
tar xJf /MacOSX11.0.sdk.tar.xz -C /build/cmake/toolchain/darwin-x86_64 --strip-components=1
ln -sf darwin-x86_64 /build/cmake/toolchain/darwin-aarch64
if [ "$EXTRACT_TOOLCHAIN_DARWIN" = "1" ]; then
mkdir -p /build/cmake/toolchain/darwin-x86_64
tar xJf /MacOSX11.0.sdk.tar.xz -C /build/cmake/toolchain/darwin-x86_64 --strip-components=1
ln -sf darwin-x86_64 /build/cmake/toolchain/darwin-aarch64
fi
# Uncomment to debug ccache. Don't put ccache log in /output right away, or it
# will be confusingly packed into the "performance" package.

View File

@ -167,6 +167,7 @@ def parse_env_variables(
cmake_flags.append(
"-DCMAKE_TOOLCHAIN_FILE=/build/cmake/darwin/toolchain-x86_64.cmake"
)
result.append("EXTRACT_TOOLCHAIN_DARWIN=1")
elif is_cross_darwin_arm:
cc = compiler[: -len(DARWIN_ARM_SUFFIX)]
cmake_flags.append("-DCMAKE_AR:FILEPATH=/cctools/bin/aarch64-apple-darwin-ar")
@ -181,6 +182,7 @@ def parse_env_variables(
cmake_flags.append(
"-DCMAKE_TOOLCHAIN_FILE=/build/cmake/darwin/toolchain-aarch64.cmake"
)
result.append("EXTRACT_TOOLCHAIN_DARWIN=1")
elif is_cross_arm:
cc = compiler[: -len(ARM_SUFFIX)]
cmake_flags.append(

View File

@ -33,7 +33,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="23.4.2.11"
ARG VERSION="23.5.2.7"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -22,7 +22,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="23.4.2.11"
ARG VERSION="23.5.2.7"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image

View File

@ -16,6 +16,11 @@ For more information and documentation see https://clickhouse.com/.
- The tag `head` is built from the latest commit to the default branch.
- Each tag has optional `-alpine` suffix to reflect that it's built on top of `alpine`.
### Compatibility
- The amd64 image requires support for [SSE3 instructions](https://en.wikipedia.org/wiki/SSE3). Virtually all x86 CPUs after 2005 support SSE3.
- The arm64 image requires support for the [ARMv8.2-A architecture](https://en.wikipedia.org/wiki/AArch64#ARMv8.2-A). Most ARM CPUs after 2017 support ARMv8.2-A. A notable exception is Raspberry Pi 4 from 2019 whose CPU only supports ARMv8.0-A.
## How to use this image
### start server instance

View File

@ -626,7 +626,9 @@ if args.report == "main":
message_array.append(str(faster_queries) + " faster")
if slower_queries:
if slower_queries > 3:
# This threshold should be synchronized with the value in https://github.com/ClickHouse/ClickHouse/blob/master/tests/ci/performance_comparison_check.py#L225
# False positives rate should be < 1%: https://shorturl.at/CDEK8
if slower_queries > 5:
status = "failure"
message_array.append(str(slower_queries) + " slower")

View File

@ -15,6 +15,9 @@ dpkg -i package_folder/clickhouse-client_*.deb
ln -s /usr/share/clickhouse-test/clickhouse-test /usr/bin/clickhouse-test
# shellcheck disable=SC1091
source /usr/share/clickhouse-test/ci/attach_gdb.lib || true # FIXME: to not break old builds, clean on 2023-09-01
# install test configs
/usr/share/clickhouse-test/config/install.sh
@ -85,6 +88,8 @@ fi
sleep 5
attach_gdb_to_clickhouse || true # FIXME: to not break old builds, clean on 2023-09-01
function run_tests()
{
set -x

View File

@ -3,5 +3,5 @@
set -x
service zookeeper start && sleep 7 && /usr/share/zookeeper/bin/zkCli.sh -server localhost:2181 -create create /clickhouse_test '';
gdb -q -ex 'set print inferior-events off' -ex 'set confirm off' -ex 'set print thread-events off' -ex run -ex bt -ex quit --args ./unit_tests_dbms | tee test_output/test_result.txt
timeout 40m gdb -q -ex 'set print inferior-events off' -ex 'set confirm off' -ex 'set print thread-events off' -ex run -ex bt -ex quit --args ./unit_tests_dbms | tee test_output/test_result.txt
./process_unit_tests_result.py || echo -e "failure\tCannot parse results" > /test_output/check_status.tsv

View File

@ -59,12 +59,6 @@ install_packages previous_release_package_folder
# available for dump via clickhouse-local
configure
# local_blob_storage disk type does not exist in older versions
sudo cat /etc/clickhouse-server/config.d/storage_conf.xml \
| sed "s|<type>local_blob_storage</type>|<type>local</type>|" \
> /etc/clickhouse-server/config.d/storage_conf.xml.tmp
sudo mv /etc/clickhouse-server/config.d/storage_conf.xml.tmp /etc/clickhouse-server/config.d/storage_conf.xml
# it contains some new settings, but we can safely remove it
rm /etc/clickhouse-server/config.d/merge_tree.xml
@ -92,11 +86,6 @@ export USE_S3_STORAGE_FOR_MERGE_TREE=1
export ZOOKEEPER_FAULT_INJECTION=0
configure
sudo cat /etc/clickhouse-server/config.d/storage_conf.xml \
| sed "s|<type>local_blob_storage</type>|<type>local</type>|" \
> /etc/clickhouse-server/config.d/storage_conf.xml.tmp
sudo mv /etc/clickhouse-server/config.d/storage_conf.xml.tmp /etc/clickhouse-server/config.d/storage_conf.xml
# it contains some new settings, but we can safely remove it
rm /etc/clickhouse-server/config.d/merge_tree.xml

View File

@ -1,6 +0,0 @@
# ARM (AArch64) build works on Amazon Graviton, Oracle Cloud, Huawei Cloud ARM machines.
# The support for AArch64 is pre-production ready.
wget 'https://builds.clickhouse.com/master/aarch64/clickhouse'
chmod a+x ./clickhouse
sudo ./clickhouse install

View File

@ -1,3 +0,0 @@
fetch 'https://builds.clickhouse.com/master/freebsd/clickhouse'
chmod a+x ./clickhouse
su -m root -c './clickhouse install'

View File

@ -1,3 +0,0 @@
wget 'https://builds.clickhouse.com/master/macos-aarch64/clickhouse'
chmod a+x ./clickhouse
./clickhouse

View File

@ -1,3 +0,0 @@
wget 'https://builds.clickhouse.com/master/macos/clickhouse'
chmod a+x ./clickhouse
./clickhouse

View File

@ -0,0 +1,32 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v22.8.18.31-lts (4de7a95a544) FIXME as compared to v22.8.17.17-lts (df7f2ef0b41)
#### Performance Improvement
* Backported in [#49214](https://github.com/ClickHouse/ClickHouse/issues/49214): Fixed excessive reading in queries with `FINAL`. [#47801](https://github.com/ClickHouse/ClickHouse/pull/47801) ([Nikita Taranov](https://github.com/nickitat)).
#### Build/Testing/Packaging Improvement
* Backported in [#49079](https://github.com/ClickHouse/ClickHouse/issues/49079): Update time zones. The following were updated: Africa/Cairo, Africa/Casablanca, Africa/El_Aaiun, America/Bogota, America/Cambridge_Bay, America/Ciudad_Juarez, America/Godthab, America/Inuvik, America/Iqaluit, America/Nuuk, America/Ojinaga, America/Pangnirtung, America/Rankin_Inlet, America/Resolute, America/Whitehorse, America/Yellowknife, Asia/Gaza, Asia/Hebron, Asia/Kuala_Lumpur, Asia/Singapore, Canada/Yukon, Egypt, Europe/Kirov, Europe/Volgograd, Singapore. [#48572](https://github.com/ClickHouse/ClickHouse/pull/48572) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix bad cast from LowCardinality column when using short circuit function execution [#43311](https://github.com/ClickHouse/ClickHouse/pull/43311) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix msan issue in randomStringUTF8(<uneven number>) [#49750](https://github.com/ClickHouse/ClickHouse/pull/49750) ([Robert Schulze](https://github.com/rschu1ze)).
* JIT compilation not equals NaN fix [#50056](https://github.com/ClickHouse/ClickHouse/pull/50056) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crash with `multiIf` and constant condition and nullable arguments [#50123](https://github.com/ClickHouse/ClickHouse/pull/50123) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed type conversion from Date/Date32 to DateTime64 when querying with DateTime64 index [#50280](https://github.com/ClickHouse/ClickHouse/pull/50280) ([Lucas Chang](https://github.com/lucas-tubi)).
* Fix Keeper deadlock on exception when preprocessing requests. [#50387](https://github.com/ClickHouse/ClickHouse/pull/50387) ([frinkr](https://github.com/frinkr)).
* Fix Log family table return wrong rows count after truncate [#50585](https://github.com/ClickHouse/ClickHouse/pull/50585) ([flynn](https://github.com/ucasfl)).
* Do not read all the columns from right GLOBAL JOIN table. [#50721](https://github.com/ClickHouse/ClickHouse/pull/50721) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Improve test reports [#49151](https://github.com/ClickHouse/ClickHouse/pull/49151) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update github.com/distribution/distribution [#50114](https://github.com/ClickHouse/ClickHouse/pull/50114) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Catch issues with dockerd during the build [#50700](https://github.com/ClickHouse/ClickHouse/pull/50700) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -0,0 +1,35 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.2.7.32-stable (934f6a2aa0e) FIXME as compared to v23.2.6.34-stable (570190045b0)
#### Performance Improvement
* Backported in [#49218](https://github.com/ClickHouse/ClickHouse/issues/49218): Fixed excessive reading in queries with `FINAL`. [#47801](https://github.com/ClickHouse/ClickHouse/pull/47801) ([Nikita Taranov](https://github.com/nickitat)).
#### Build/Testing/Packaging Improvement
* Backported in [#49208](https://github.com/ClickHouse/ClickHouse/issues/49208): Fix glibc compatibility check: replace `preadv` from musl. [#49144](https://github.com/ClickHouse/ClickHouse/pull/49144) ([alesapin](https://github.com/alesapin)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix key not found error for queries with multiple StorageJoin [#49137](https://github.com/ClickHouse/ClickHouse/pull/49137) ([vdimir](https://github.com/vdimir)).
* Fix race on Outdated parts loading [#49223](https://github.com/ClickHouse/ClickHouse/pull/49223) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix bug in DISTINCT [#49628](https://github.com/ClickHouse/ClickHouse/pull/49628) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix msan issue in randomStringUTF8(<uneven number>) [#49750](https://github.com/ClickHouse/ClickHouse/pull/49750) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix IPv6 encoding in protobuf [#49933](https://github.com/ClickHouse/ClickHouse/pull/49933) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Avoid deadlock when starting table in attach thread of `ReplicatedMergeTree` [#50026](https://github.com/ClickHouse/ClickHouse/pull/50026) ([Antonio Andelic](https://github.com/antonio2368)).
* JIT compilation not equals NaN fix [#50056](https://github.com/ClickHouse/ClickHouse/pull/50056) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crash with `multiIf` and constant condition and nullable arguments [#50123](https://github.com/ClickHouse/ClickHouse/pull/50123) ([Anton Popov](https://github.com/CurtizJ)).
* Fix Keeper deadlock on exception when preprocessing requests. [#50387](https://github.com/ClickHouse/ClickHouse/pull/50387) ([frinkr](https://github.com/frinkr)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Improve test reports [#49151](https://github.com/ClickHouse/ClickHouse/pull/49151) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fallback auth gh api [#49314](https://github.com/ClickHouse/ClickHouse/pull/49314) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Improve CI: status commit, auth for get_gh_api [#49388](https://github.com/ClickHouse/ClickHouse/pull/49388) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update github.com/distribution/distribution [#50114](https://github.com/ClickHouse/ClickHouse/pull/50114) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Catch issues with dockerd during the build [#50700](https://github.com/ClickHouse/ClickHouse/pull/50700) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -0,0 +1,45 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.3.3.52-lts (cb963c474db) FIXME as compared to v23.3.2.37-lts (1b144bcd101)
#### Improvement
* Backported in [#49954](https://github.com/ClickHouse/ClickHouse/issues/49954): Add support for (an unusual) case where the arguments in the `IN` operator are single-element tuples. [#49844](https://github.com/ClickHouse/ClickHouse/pull/49844) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
#### Build/Testing/Packaging Improvement
* Backported in [#49210](https://github.com/ClickHouse/ClickHouse/issues/49210): Fix glibc compatibility check: replace `preadv` from musl. [#49144](https://github.com/ClickHouse/ClickHouse/pull/49144) ([alesapin](https://github.com/alesapin)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix aggregate empty string error [#48999](https://github.com/ClickHouse/ClickHouse/pull/48999) ([LiuNeng](https://github.com/liuneng1994)).
* Fix key not found error for queries with multiple StorageJoin [#49137](https://github.com/ClickHouse/ClickHouse/pull/49137) ([vdimir](https://github.com/vdimir)).
* Fix race on Outdated parts loading [#49223](https://github.com/ClickHouse/ClickHouse/pull/49223) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix bug in DISTINCT [#49628](https://github.com/ClickHouse/ClickHouse/pull/49628) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix msan issue in randomStringUTF8(<uneven number>) [#49750](https://github.com/ClickHouse/ClickHouse/pull/49750) ([Robert Schulze](https://github.com/rschu1ze)).
* fix `is_prefix` in OptimizeRegularExpression [#49919](https://github.com/ClickHouse/ClickHouse/pull/49919) ([Han Fei](https://github.com/hanfei1991)).
* Fix IPv6 encoding in protobuf [#49933](https://github.com/ClickHouse/ClickHouse/pull/49933) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Avoid deadlock when starting table in attach thread of `ReplicatedMergeTree` [#50026](https://github.com/ClickHouse/ClickHouse/pull/50026) ([Antonio Andelic](https://github.com/antonio2368)).
* JIT compilation not equals NaN fix [#50056](https://github.com/ClickHouse/ClickHouse/pull/50056) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crash with `multiIf` and constant condition and nullable arguments [#50123](https://github.com/ClickHouse/ClickHouse/pull/50123) ([Anton Popov](https://github.com/CurtizJ)).
* Fix reconnecting of HTTPS session when target host IP was changed [#50240](https://github.com/ClickHouse/ClickHouse/pull/50240) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Fixed type conversion from Date/Date32 to DateTime64 when querying with DateTime64 index [#50280](https://github.com/ClickHouse/ClickHouse/pull/50280) ([Lucas Chang](https://github.com/lucas-tubi)).
* Fix Keeper deadlock on exception when preprocessing requests. [#50387](https://github.com/ClickHouse/ClickHouse/pull/50387) ([frinkr](https://github.com/frinkr)).
* Fix incorrect constant folding [#50536](https://github.com/ClickHouse/ClickHouse/pull/50536) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix Log family table return wrong rows count after truncate [#50585](https://github.com/ClickHouse/ClickHouse/pull/50585) ([flynn](https://github.com/ucasfl)).
* Fix bug in `uniqExact` parallel merging [#50590](https://github.com/ClickHouse/ClickHouse/pull/50590) ([Nikita Taranov](https://github.com/nickitat)).
* Do not read all the columns from right GLOBAL JOIN table. [#50721](https://github.com/ClickHouse/ClickHouse/pull/50721) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Implement status comment [#48468](https://github.com/ClickHouse/ClickHouse/pull/48468) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update curl to 8.0.1 (for CVEs) [#48765](https://github.com/ClickHouse/ClickHouse/pull/48765) ([Boris Kuschel](https://github.com/bkuschel)).
* Improve test reports [#49151](https://github.com/ClickHouse/ClickHouse/pull/49151) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fallback auth gh api [#49314](https://github.com/ClickHouse/ClickHouse/pull/49314) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Improve CI: status commit, auth for get_gh_api [#49388](https://github.com/ClickHouse/ClickHouse/pull/49388) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update github.com/distribution/distribution [#50114](https://github.com/ClickHouse/ClickHouse/pull/50114) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Catch issues with dockerd during the build [#50700](https://github.com/ClickHouse/ClickHouse/pull/50700) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -0,0 +1,42 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.4.3.48-stable (d9199f8d3cc) FIXME as compared to v23.4.2.11-stable (b6442320f9d)
#### Backward Incompatible Change
* Backported in [#49981](https://github.com/ClickHouse/ClickHouse/issues/49981): Revert "`groupArray` returns cannot be nullable" (due to binary compatibility breakage for `groupArray`/`groupArrayLast`/`groupArraySample` over `Nullable` types, which likely will lead to `TOO_LARGE_ARRAY_SIZE` or `CANNOT_READ_ALL_DATA`). [#49971](https://github.com/ClickHouse/ClickHouse/pull/49971) ([Azat Khuzhin](https://github.com/azat)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix key not found error for queries with multiple StorageJoin [#49137](https://github.com/ClickHouse/ClickHouse/pull/49137) ([vdimir](https://github.com/vdimir)).
* Fix fuzz bug when subquery set is not built when reading from remote() [#49425](https://github.com/ClickHouse/ClickHouse/pull/49425) ([Alexander Gololobov](https://github.com/davenger)).
* Fix postgres database setting [#49481](https://github.com/ClickHouse/ClickHouse/pull/49481) ([Mal Curtis](https://github.com/snikch)).
* Fix AsynchronousReadIndirectBufferFromRemoteFS breaking on short seeks [#49525](https://github.com/ClickHouse/ClickHouse/pull/49525) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix bug in DISTINCT [#49628](https://github.com/ClickHouse/ClickHouse/pull/49628) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix assert in SpanHolder::finish() with fibers [#49673](https://github.com/ClickHouse/ClickHouse/pull/49673) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix msan issue in randomStringUTF8(<uneven number>) [#49750](https://github.com/ClickHouse/ClickHouse/pull/49750) ([Robert Schulze](https://github.com/rschu1ze)).
* fix `is_prefix` in OptimizeRegularExpression [#49919](https://github.com/ClickHouse/ClickHouse/pull/49919) ([Han Fei](https://github.com/hanfei1991)).
* Fix IPv6 encoding in protobuf [#49933](https://github.com/ClickHouse/ClickHouse/pull/49933) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Avoid deadlock when starting table in attach thread of `ReplicatedMergeTree` [#50026](https://github.com/ClickHouse/ClickHouse/pull/50026) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix assert in SpanHolder::finish() with fibers attempt 2 [#50034](https://github.com/ClickHouse/ClickHouse/pull/50034) ([Kruglov Pavel](https://github.com/Avogar)).
* JIT compilation not equals NaN fix [#50056](https://github.com/ClickHouse/ClickHouse/pull/50056) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crashing in case of Replicated database without arguments [#50058](https://github.com/ClickHouse/ClickHouse/pull/50058) ([Azat Khuzhin](https://github.com/azat)).
* Fix crash with `multiIf` and constant condition and nullable arguments [#50123](https://github.com/ClickHouse/ClickHouse/pull/50123) ([Anton Popov](https://github.com/CurtizJ)).
* Fix iceberg metadata parsing [#50232](https://github.com/ClickHouse/ClickHouse/pull/50232) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix bugs in Poco sockets in non-blocking mode, use true non-blocking sockets [#50252](https://github.com/ClickHouse/ClickHouse/pull/50252) ([Kruglov Pavel](https://github.com/Avogar)).
* Fixed type conversion from Date/Date32 to DateTime64 when querying with DateTime64 index [#50280](https://github.com/ClickHouse/ClickHouse/pull/50280) ([Lucas Chang](https://github.com/lucas-tubi)).
* Fix Keeper deadlock on exception when preprocessing requests. [#50387](https://github.com/ClickHouse/ClickHouse/pull/50387) ([frinkr](https://github.com/frinkr)).
* Fix Log family table return wrong rows count after truncate [#50585](https://github.com/ClickHouse/ClickHouse/pull/50585) ([flynn](https://github.com/ucasfl)).
* Fix bug in `uniqExact` parallel merging [#50590](https://github.com/ClickHouse/ClickHouse/pull/50590) ([Nikita Taranov](https://github.com/nickitat)).
* Do not read all the columns from right GLOBAL JOIN table. [#50721](https://github.com/ClickHouse/ClickHouse/pull/50721) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Improve CI: status commit, auth for get_gh_api [#49388](https://github.com/ClickHouse/ClickHouse/pull/49388) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update github.com/distribution/distribution [#50114](https://github.com/ClickHouse/ClickHouse/pull/50114) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Catch issues with dockerd during the build [#50700](https://github.com/ClickHouse/ClickHouse/pull/50700) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -0,0 +1,599 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.5.1.3174-stable (2fec796e73e) FIXME as compared to v23.4.1.1943-stable (3920eb987f7)
#### Backward Incompatible Change
* Make local object storage work consistently with s3 object storage, fix problem with append (closes [#48465](https://github.com/ClickHouse/ClickHouse/issues/48465)), make it configurable as independent storage. The change is backward incompatible because cache on top of local object storage is not incompatible to previous versions. [#48791](https://github.com/ClickHouse/ClickHouse/pull/48791) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Date_trunc function to always return datetime type. [#48851](https://github.com/ClickHouse/ClickHouse/pull/48851) ([Shane Andrade](https://github.com/mauidude)).
* Remove the experimental feature "in-memory data parts". The data format is still supported, but the settings are no-op, and compact or wide parts will be used instead. This closes [#45409](https://github.com/ClickHouse/ClickHouse/issues/45409). [#49429](https://github.com/ClickHouse/ClickHouse/pull/49429) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Changed default values of settings parallelize_output_from_storages and input_format_parquet_preserve_order. This allows ClickHouse to reorder rows when reading from files (e.g. CSV or Parquet), greatly improving performance in many cases. To restore the old behavior of preserving order, use `parallelize_output_from_storages = 0`, `input_format_parquet_preserve_order = 1`. [#49479](https://github.com/ClickHouse/ClickHouse/pull/49479) ([Michael Kolupaev](https://github.com/al13n321)).
* Make projections production-ready. Add the `optimize_use_projections` setting to control whether the projections will be selected for SELECT queries. The setting `allow_experimental_projection_optimization` is obsolete and does nothing. [#49719](https://github.com/ClickHouse/ClickHouse/pull/49719) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Mark joinGet() as non deterministic (so as dictGet). [#49843](https://github.com/ClickHouse/ClickHouse/pull/49843) ([Azat Khuzhin](https://github.com/azat)).
* Revert "`groupArray` returns cannot be nullable" (due to binary compatibility breakage for `groupArray`/`groupArrayLast`/`groupArraySample` over `Nullable` types, which likely will lead to `TOO_LARGE_ARRAY_SIZE` or `CANNOT_READ_ALL_DATA`). [#49971](https://github.com/ClickHouse/ClickHouse/pull/49971) ([Azat Khuzhin](https://github.com/azat)).
#### New Feature
* Password type in queries like `CREATE USER u IDENTIFIED BY 'p'` will be automatically set according to the setting `default_password_type` in the `config.xml` on the server. Closes [#42915](https://github.com/ClickHouse/ClickHouse/issues/42915). [#44674](https://github.com/ClickHouse/ClickHouse/pull/44674) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add bcrypt password authentication type. Closes [#34599](https://github.com/ClickHouse/ClickHouse/issues/34599). [#44905](https://github.com/ClickHouse/ClickHouse/pull/44905) ([Nikolay Degterinsky](https://github.com/evillique)).
* Added `system.zookeeper_connection` table that shows information about ZooKeeper connections. [#45245](https://github.com/ClickHouse/ClickHouse/pull/45245) ([mateng915](https://github.com/mateng0915)).
* Add urlCluster table function. Refactor all *Cluster table functions to reduce code duplication. Make schema inference work for all possible *Cluster function signatures and for named collections. Closes [#38499](https://github.com/ClickHouse/ClickHouse/issues/38499). [#45427](https://github.com/ClickHouse/ClickHouse/pull/45427) ([attack204](https://github.com/attack204)).
* Extend `first_value` and `last_value` to accept null. [#46467](https://github.com/ClickHouse/ClickHouse/pull/46467) ([lgbo](https://github.com/lgbo-ustc)).
* Add server and format settings `display_secrets_in_show_and_select` for displaying secrets of tables, databases, table functions, and dictionaries. Add privilege `displaySecretsInShowAndSelect` controlling which users can view secrets. [#46528](https://github.com/ClickHouse/ClickHouse/pull/46528) ([Mike Kot](https://github.com/myrrc)).
* Add new function `generateRandomStructure` that generates random table structure. It can be used in combination with table function `generateRandom`. [#47409](https://github.com/ClickHouse/ClickHouse/pull/47409) ([Kruglov Pavel](https://github.com/Avogar)).
* Added native ClickHouse Keeper CLI Client. [#47414](https://github.com/ClickHouse/ClickHouse/pull/47414) ([pufit](https://github.com/pufit)).
* The query cache can now be used for production workloads. [#47977](https://github.com/ClickHouse/ClickHouse/pull/47977) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix a bug that prevented the use of `CASE` without an `ELSE` branch and extended `transform` to deal with more types. Also fix some bugs that made transform() return incorrect results when decimal types were mixed with other numeric types. [#48300](https://github.com/ClickHouse/ClickHouse/pull/48300) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Added [server-side encryption using KMS keys](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html) with S3 tables, and the `header` setting with S3 disks. Closes [#48723](https://github.com/ClickHouse/ClickHouse/issues/48723). [#48724](https://github.com/ClickHouse/ClickHouse/pull/48724) ([Johann Gan](https://github.com/johanngan)).
* Add MemoryTracker for the background tasks (merges and mutation). Introduces `merges_mutations_memory_usage_soft_limit` and `merges_mutations_memory_usage_to_ram_ratio` settings that represent the soft memory limit for merges and mutations. If this limit is reached ClickHouse won't schedule new merge or mutation tasks. Also `MergesMutationsMemoryTracking` metric is introduced to allow observing current memory usage of background tasks. Resubmit [#46089](https://github.com/ClickHouse/ClickHouse/issues/46089). Closes [#48774](https://github.com/ClickHouse/ClickHouse/issues/48774). [#48787](https://github.com/ClickHouse/ClickHouse/pull/48787) ([Dmitry Novik](https://github.com/novikd)).
* Function `dotProduct` work for array. [#49050](https://github.com/ClickHouse/ClickHouse/pull/49050) ([FFFFFFFHHHHHHH](https://github.com/FFFFFFFHHHHHHH)).
* Support statement `SHOW INDEX` to improve compatibility with MySQL. [#49158](https://github.com/ClickHouse/ClickHouse/pull/49158) ([Robert Schulze](https://github.com/rschu1ze)).
* Add virtual column `_file` and `_path` support to table function `url`. - Impove error message for table function `url`. - resolves [#49231](https://github.com/ClickHouse/ClickHouse/issues/49231) - resolves [#49232](https://github.com/ClickHouse/ClickHouse/issues/49232). [#49356](https://github.com/ClickHouse/ClickHouse/pull/49356) ([Ziyi Tan](https://github.com/Ziy1-Tan)).
* Adding the `grants` field in the users.xml file, which allows specifying grants for users. [#49381](https://github.com/ClickHouse/ClickHouse/pull/49381) ([pufit](https://github.com/pufit)).
* Add alias `str_to_map` and `mapfromstring` for `extractkeyvaluepairs`. closes [#47185](https://github.com/ClickHouse/ClickHouse/issues/47185). [#49466](https://github.com/ClickHouse/ClickHouse/pull/49466) ([flynn](https://github.com/ucasfl)).
* Support full/right join by using grace hash join algorithm. [#49483](https://github.com/ClickHouse/ClickHouse/pull/49483) ([lgbo](https://github.com/lgbo-ustc)).
* `WITH FILL` modifier groups filling by sorting prefix. Controlled by `use_with_fill_by_sorting_prefix` setting (enabled by default). Related to [#33203](https://github.com/ClickHouse/ClickHouse/issues/33203)#issuecomment-1418736794. [#49503](https://github.com/ClickHouse/ClickHouse/pull/49503) ([Igor Nikonov](https://github.com/devcrafter)).
* Add SQL functions for entropy-learned hashing. [#49656](https://github.com/ClickHouse/ClickHouse/pull/49656) ([Robert Schulze](https://github.com/rschu1ze)).
* Clickhouse-client now accepts queries after "--multiquery" when "--query" (or "-q") is absent. example: clickhouse-client --multiquery "select 1; select 2;". [#49870](https://github.com/ClickHouse/ClickHouse/pull/49870) ([Alexey Gerasimchuck](https://github.com/Demilivor)).
* Add separate `handshake_timeout` for receiving Hello packet from replica. Closes [#48854](https://github.com/ClickHouse/ClickHouse/issues/48854). [#49948](https://github.com/ClickHouse/ClickHouse/pull/49948) ([Kruglov Pavel](https://github.com/Avogar)).
* New setting s3_max_inflight_parts_for_one_file sets the limit of concurrently loaded parts with multipart upload request in scope of one file. [#49961](https://github.com/ClickHouse/ClickHouse/pull/49961) ([Sema Checherinda](https://github.com/CheSema)).
* Geographical data types (`Point`, `Ring`, `Polygon`, and `MultiPolygon`) are production-ready. [#50022](https://github.com/ClickHouse/ClickHouse/pull/50022) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Added a function "space()" which repeats a space as many times as specified. [#50103](https://github.com/ClickHouse/ClickHouse/pull/50103) ([Robert Schulze](https://github.com/rschu1ze)).
* Added --input_format_csv_trim_whitespaces option. [#50215](https://github.com/ClickHouse/ClickHouse/pull/50215) ([Alexey Gerasimchuck](https://github.com/Demilivor)).
* Added the dictGetAll function for regexp tree dictionaries to return values from multiple matches as arrays. Closes [#50254](https://github.com/ClickHouse/ClickHouse/issues/50254). [#50255](https://github.com/ClickHouse/ClickHouse/pull/50255) ([Johann Gan](https://github.com/johanngan)).
* Added toLastDayOfWeek() function to round a date or a date with time up to the nearest Saturday or Sunday. [#50315](https://github.com/ClickHouse/ClickHouse/pull/50315) ([Victor Krasnov](https://github.com/sirvickr)).
* Ability to ignore a skip index by specifying `ignore_data_skipping_indices`. [#50329](https://github.com/ClickHouse/ClickHouse/pull/50329) ([Boris Kuschel](https://github.com/bkuschel)).
* Revert 'Add SQL functions for entropy-learned hashing'. [#50416](https://github.com/ClickHouse/ClickHouse/pull/50416) ([Robert Schulze](https://github.com/rschu1ze)).
* Add `system.user_processes` table and `SHOW USER PROCESSES` query to show memory info and ProfileEvents on user level. [#50492](https://github.com/ClickHouse/ClickHouse/pull/50492) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Added storage engine `AzureBlobStorage` and `azure_blob_storage` table function. The supported set of features is very similar to storage/table function `S3`. Implements [#19307](https://github.com/ClickHouse/ClickHouse/issues/19307). [#50604](https://github.com/ClickHouse/ClickHouse/pull/50604) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
#### Performance Improvement
* Compress marks and primary key by default. It significantly reduces the cold query time. Upgrade notes: the support for compressed marks and primary key has been added in version 22.9. If you turned on compressed marks or primary key or installed version 23.5 or newer, which has compressed marks or primary key on by default, you will not be able to downgrade to version 22.8 or earlier. You can also explicitly disable compressed marks or primary keys by specifying the `compress_marks` and `compress_primary_key` settings in the `<merge_tree>` section of the server configuration file. **Upgrade notes:** If you upgrade from versions prior to 22.9, you should either upgrade all replicas at once or disable the compression before upgrade, or upgrade through an intermediate version, where the compressed marks are supported but not enabled by default, such as 23.3. [#42587](https://github.com/ClickHouse/ClickHouse/pull/42587) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* When reading from multiple files reduce parallel parsing threads for each file resolves [#42192](https://github.com/ClickHouse/ClickHouse/issues/42192). [#46661](https://github.com/ClickHouse/ClickHouse/pull/46661) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Do not store blocks in `ANY` hash join if nothing is inserted. [#48633](https://github.com/ClickHouse/ClickHouse/pull/48633) ([vdimir](https://github.com/vdimir)).
* Fixes aggregate combinator `-If` when JIT compiled. Closes [#48120](https://github.com/ClickHouse/ClickHouse/issues/48120). [#49083](https://github.com/ClickHouse/ClickHouse/pull/49083) ([Igor Nikonov](https://github.com/devcrafter)).
* For reading from remote tables we use smaller tasks (instead of reading the whole part) to make tasks stealing work * task size is determined by size of columns to read * always use 1mb buffers for reading from s3 * boundaries of cache segments aligned to 1mb so they have decent size even with small tasks. it also should prevent fragmentation. [#49287](https://github.com/ClickHouse/ClickHouse/pull/49287) ([Nikita Taranov](https://github.com/nickitat)).
* Default size of a read buffer for reading from local filesystem changed to a slightly better value. Also two new settings are introduced: `max_read_buffer_size_local_fs` and `max_read_buffer_size_remote_fs`. [#49321](https://github.com/ClickHouse/ClickHouse/pull/49321) ([Nikita Taranov](https://github.com/nickitat)).
* Improve memory usage and speed of `SPARSE_HASHED`/`HASHED` dictionaries (e.g. `SPARSE_HASHED` now eats 2.6x less memory, and is ~2x faster). [#49380](https://github.com/ClickHouse/ClickHouse/pull/49380) ([Azat Khuzhin](https://github.com/azat)).
* Use aggregate projection only if it reads fewer granules than normal reading. It should help in case if query hits the PK of the table, but not the projection. Fixes [#49150](https://github.com/ClickHouse/ClickHouse/issues/49150). [#49417](https://github.com/ClickHouse/ClickHouse/pull/49417) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Optimize PODArray::resize_fill() callers. [#49459](https://github.com/ClickHouse/ClickHouse/pull/49459) ([Azat Khuzhin](https://github.com/azat)).
* Optimize the system.query_log and system.query_thread_log tables by applying LowCardinality when appropriate. The queries over these tables will be faster. [#49530](https://github.com/ClickHouse/ClickHouse/pull/49530) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Better performance when reading local Parquet files (through parallel reading). [#49539](https://github.com/ClickHouse/ClickHouse/pull/49539) ([Michael Kolupaev](https://github.com/al13n321)).
* Improve the performance of `RIGHT/FULL JOIN` by up to 2 times in certain scenarios, especially when joining a small left table with a large right table. [#49585](https://github.com/ClickHouse/ClickHouse/pull/49585) ([lgbo](https://github.com/lgbo-ustc)).
* Improve performance of BLAKE3 by 11% by enabling LTO for Rust. [#49600](https://github.com/ClickHouse/ClickHouse/pull/49600) ([Azat Khuzhin](https://github.com/azat)).
* Optimize the structure of the `system.opentelemetry_span_log`. Use `LowCardinality` where appropriate. Although this table is generally stupid (it is using the Map data type even for common attributes), it will be slightly better. [#49647](https://github.com/ClickHouse/ClickHouse/pull/49647) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Try to reserve hash table's size in `grace_hash` join. [#49816](https://github.com/ClickHouse/ClickHouse/pull/49816) ([lgbo](https://github.com/lgbo-ustc)).
* As is addresed in issue [#49748](https://github.com/ClickHouse/ClickHouse/issues/49748), the predicates with date converters, such as **toYear, toYYYYMM**, could be rewritten with the equivalent date (YYYY-MM-DD) comparisons at the AST level. And this transformation could bring performance improvement as it is free from the expensive date converter and the comparison between dates (or integers in the low level representation) is quite low-cost. The [prototype](https://github.com/ZhiguoZh/ClickHouse/commit/c7f1753f0c9363a19d95fa46f1cfed1d9f505ee0) shows that, with all identified date converters optimized, the overall QPS of the 13 queries is enhanced by **~11%** on the ICX server (Intel Xeon Platinum 8380 CPU, 80 cores, 160 threads). [#50062](https://github.com/ClickHouse/ClickHouse/pull/50062) ([Zhiguo Zhou](https://github.com/ZhiguoZh)).
* Parallel merge of `uniqExactIf` states. Closes [#49885](https://github.com/ClickHouse/ClickHouse/issues/49885). [#50285](https://github.com/ClickHouse/ClickHouse/pull/50285) ([flynn](https://github.com/ucasfl)).
* As is addresed in issue [#49748](https://github.com/ClickHouse/ClickHouse/issues/49748), the predicates with date converters, such as toYear, toYYYYMM, could be rewritten with the equivalent date (YYYY-MM-DD) comparisons at the AST level. And this transformation could bring performance improvement as it is free from the expensive date converter and the comparison between dates (or integers in the low level representation) is quite low-cost. [#50307](https://github.com/ClickHouse/ClickHouse/pull/50307) ([Zhiguo Zhou](https://github.com/ZhiguoZh)).
* Parallel merging supported for `uniqExact` with modifiers `-Array`, `-Merge`, `-OrNull`, `-State`. [#50413](https://github.com/ClickHouse/ClickHouse/pull/50413) ([flynn](https://github.com/ucasfl)).
* Enable LZ4_FAST_DEC_LOOP for Arm LZ4 to get 5% of decompression speed. [#50588](https://github.com/ClickHouse/ClickHouse/pull/50588) ([Daniel Kutenin](https://github.com/danlark1)).
#### Improvement
* Add support for CGroup version 2 for asynchronous metrics about the memory usage and availability. This closes [#37983](https://github.com/ClickHouse/ClickHouse/issues/37983). [#45999](https://github.com/ClickHouse/ClickHouse/pull/45999) ([sichenzhao](https://github.com/sichenzhao)).
* Cluster table functions should always skip unavailable shards. close [#46314](https://github.com/ClickHouse/ClickHouse/issues/46314). [#46765](https://github.com/ClickHouse/ClickHouse/pull/46765) ([zk_kiger](https://github.com/zk-kiger)).
* When your csv file contains empty columns, like ```. [#47496](https://github.com/ClickHouse/ClickHouse/pull/47496) ([你不要过来啊](https://github.com/iiiuwioajdks)).
* ROW POLICY for all tables that belong to a DATABASE. [#47640](https://github.com/ClickHouse/ClickHouse/pull/47640) ([Ilya Golshtein](https://github.com/ilejn)).
* Add Google Cloud Storage S3 compatible table function `gcs`. Like the `oss` and `cosn` functions, it is just an alias over the `s3` table function, and it does not bring any new features. [#47815](https://github.com/ClickHouse/ClickHouse/pull/47815) ([Kuba Kaflik](https://github.com/jkaflik)).
* Add ability to use strict parts size for S3 (compatibility with CloudFlare R2 S3 Storage). [#48492](https://github.com/ClickHouse/ClickHouse/pull/48492) ([Azat Khuzhin](https://github.com/azat)).
* Added new columns with info about `Replicated` database replicas to `system.clusters`: `database_shard_name`, `database_replica_name`, `is_active`. Added an optional `FROM SHARD` clause to `SYSTEM DROP DATABASE REPLICA` query. [#48548](https://github.com/ClickHouse/ClickHouse/pull/48548) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add a new column `zookeeper_name` in system.replicas, to indicate on which (auxiliary) zookeeper cluster the replicated table's metadata is stored. [#48549](https://github.com/ClickHouse/ClickHouse/pull/48549) ([cangyin](https://github.com/cangyin)).
* `IN` operator support compare `Date` and `Date32`. Closes [#48736](https://github.com/ClickHouse/ClickHouse/issues/48736). [#48806](https://github.com/ClickHouse/ClickHouse/pull/48806) ([flynn](https://github.com/ucasfl)).
* Support for erasure codes in HDFS, author: @M1eyu2018, @tomscut. [#48833](https://github.com/ClickHouse/ClickHouse/pull/48833) ([M1eyu](https://github.com/M1eyu2018)).
* The query cache can now supports queries with totals and extremes modifier. [#48853](https://github.com/ClickHouse/ClickHouse/pull/48853) ([Robert Schulze](https://github.com/rschu1ze)).
* Introduces new keyword `INTO OUTFILE 'file.txt' APPEND`. [#48880](https://github.com/ClickHouse/ClickHouse/pull/48880) ([alekar](https://github.com/alekar)).
* The `BACKUP` command will not decrypt data from encrypted disks while making a backup. Instead the data will be stored in a backup in encrypted form. Such backups can be restored only to an encrypted disk with the same (or extended) list of encryption keys. [#48896](https://github.com/ClickHouse/ClickHouse/pull/48896) ([Vitaly Baranov](https://github.com/vitlibar)).
* Keeper improvement: add `CheckNotExists` request to Keeper. [#48897](https://github.com/ClickHouse/ClickHouse/pull/48897) ([Antonio Andelic](https://github.com/antonio2368)).
* Implement SYSTEM DROP REPLICA from auxillary ZooKeeper clusters, may be close [#48931](https://github.com/ClickHouse/ClickHouse/issues/48931). [#48932](https://github.com/ClickHouse/ClickHouse/pull/48932) ([wangxiaobo](https://github.com/wzb5212)).
* Add Array data type to MongoDB. Closes [#48598](https://github.com/ClickHouse/ClickHouse/issues/48598). [#48983](https://github.com/ClickHouse/ClickHouse/pull/48983) ([Nikolay Degterinsky](https://github.com/evillique)).
* Keeper performance improvements: avoid serializing same request twice while processing. Cache deserialization results of large requests. Controlled by new coordination setting `min_request_size_for_cache`. [#49004](https://github.com/ClickHouse/ClickHouse/pull/49004) ([Antonio Andelic](https://github.com/antonio2368)).
* Support storing `Interval` data types in tables. [#49085](https://github.com/ClickHouse/ClickHouse/pull/49085) ([larryluogit](https://github.com/larryluogit)).
* Add support for size suffixes in quota creation statement parameters. [#49087](https://github.com/ClickHouse/ClickHouse/pull/49087) ([Eridanus](https://github.com/Eridanus117)).
* Allow using `ntile` window function without explicit window frame definition: `ntile(3) OVER (ORDER BY a)`, close [#46763](https://github.com/ClickHouse/ClickHouse/issues/46763). [#49093](https://github.com/ClickHouse/ClickHouse/pull/49093) ([vdimir](https://github.com/vdimir)).
* Added settings (`number_of_mutations_to_delay`, `number_of_mutations_to_throw`) to delay or throw `ALTER` queries that create mutations (`ALTER UPDATE`, `ALTER DELETE`, `ALTER MODIFY COLUMN`, ...) in case when table already has a lot of unfinished mutations. [#49117](https://github.com/ClickHouse/ClickHouse/pull/49117) ([Anton Popov](https://github.com/CurtizJ)).
* Added setting `async_insert` for `MergeTables`. It has the same meaning as query-level setting `async_insert` and enables asynchronous inserts for specific table. Note: it doesn't take effect for insert queries from `clickhouse-client`, use query-level setting in that case. [#49122](https://github.com/ClickHouse/ClickHouse/pull/49122) ([Anton Popov](https://github.com/CurtizJ)).
* Catch exception from `create_directories` in filesystem cache. [#49203](https://github.com/ClickHouse/ClickHouse/pull/49203) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Copies embedded examples to a new field `example` in `system.functions` to supplement the field `description`. [#49222](https://github.com/ClickHouse/ClickHouse/pull/49222) ([Dan Roscigno](https://github.com/DanRoscigno)).
* Enable connection options for the MongoDB dictionary. Example: ``` xml <source> <mongodb> <host>localhost</host> <port>27017</port> <user></user> <password></password> <db>test</db> <collection>dictionary_source</collection> <options>ssl=true</options> </mongodb> </source> ``` ### Documentation entry for user-facing changes. [#49225](https://github.com/ClickHouse/ClickHouse/pull/49225) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* Added an alias `asymptotic` for `asymp` computational method for `kolmogorovSmirnovTest`. Improved documentation. [#49286](https://github.com/ClickHouse/ClickHouse/pull/49286) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Aggregation function groupBitAnd/Or/Xor now work on signed integer data. This makes them consistent with the behavior of scalar functions bitAnd/Or/Xor. [#49292](https://github.com/ClickHouse/ClickHouse/pull/49292) ([exmy](https://github.com/exmy)).
* Split function-documentation into more fine-granular fields. [#49300](https://github.com/ClickHouse/ClickHouse/pull/49300) ([Robert Schulze](https://github.com/rschu1ze)).
* Introduced settings: - `merge_max_block_size_bytes` to limit the amount of memory used for background operations. - `vertical_merge_algorithm_min_bytes_to_activate` to add another condition to activate vertical merges. [#49313](https://github.com/ClickHouse/ClickHouse/pull/49313) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Use multiple threads shared between all tables within a server to load outdated data parts. The the size of the pool and its queue is controlled by `max_outdated_parts_loading_thread_pool_size` and `outdated_part_loading_thread_pool_queue_size` settings. [#49317](https://github.com/ClickHouse/ClickHouse/pull/49317) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Don't overestimate the size of processed data for `LowCardinality` columns when they share dictionaries between blocks. This closes [#49322](https://github.com/ClickHouse/ClickHouse/issues/49322). See also [#48745](https://github.com/ClickHouse/ClickHouse/issues/48745). [#49323](https://github.com/ClickHouse/ClickHouse/pull/49323) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Parquet writer now uses reasonable row group size when invoked through OUTFILE. [#49325](https://github.com/ClickHouse/ClickHouse/pull/49325) ([Michael Kolupaev](https://github.com/al13n321)).
* Allow restricted keywords like `ARRAY` as an alias if the alias is quoted. Closes [#49324](https://github.com/ClickHouse/ClickHouse/issues/49324). [#49360](https://github.com/ClickHouse/ClickHouse/pull/49360) ([Nikolay Degterinsky](https://github.com/evillique)).
* Added possibility to use temporary tables in FROM part of ATTACH PARTITION FROM and REPLACE PARTITION FROM. [#49436](https://github.com/ClickHouse/ClickHouse/pull/49436) ([Roman Vasin](https://github.com/rvasin)).
* Data parts loading and deletion jobs were moved to shared server-wide pools instead of per-table pools. Pools sizes are controlled via settings `max_active_parts_loading_thread_pool_size`, `max_outdated_parts_loading_thread_pool_size` and `max_parts_cleaning_thread_pool_size` in top-level config. Table-level settings `max_part_loading_threads` and `max_part_removal_threads` became obsolete. [#49474](https://github.com/ClickHouse/ClickHouse/pull/49474) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Allow `?password=pass` in URL. Password is replaced in browser history. [#49505](https://github.com/ClickHouse/ClickHouse/pull/49505) ([Mike Kot](https://github.com/myrrc)).
* Allow zero objects in ReadBufferFromRemoteFSGather (because empty files are not backuped, so we might end up with zero blobs in metadata file). Closes [#49480](https://github.com/ClickHouse/ClickHouse/issues/49480). [#49519](https://github.com/ClickHouse/ClickHouse/pull/49519) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Attach thread MemoryTracker to `total_memory_tracker` after `ThreadGroup` detached. [#49527](https://github.com/ClickHouse/ClickHouse/pull/49527) ([Dmitry Novik](https://github.com/novikd)).
* Make `Pretty` formats prettier: squash blocks if not much time passed since the output of the previous block. This is controlled by a new setting `output_format_pretty_squash_ms` (100ms by default). This closes [#49153](https://github.com/ClickHouse/ClickHouse/issues/49153). [#49537](https://github.com/ClickHouse/ClickHouse/pull/49537) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add initial support to do JOINs with pure parallel replicas. [#49544](https://github.com/ClickHouse/ClickHouse/pull/49544) ([Raúl Marín](https://github.com/Algunenano)).
* Fix parameterized views when query parameter used multiple times in the query. [#49556](https://github.com/ClickHouse/ClickHouse/pull/49556) ([Azat Khuzhin](https://github.com/azat)).
* Release memory allocated for the last sent ProfileEvents snapshot in the context of a query. Followup [#47564](https://github.com/ClickHouse/ClickHouse/issues/47564). [#49561](https://github.com/ClickHouse/ClickHouse/pull/49561) ([Dmitry Novik](https://github.com/novikd)).
* Function "makeDate" now provides a MySQL-compatible overload (year & day of the year argument). [#49603](https://github.com/ClickHouse/ClickHouse/pull/49603) ([Robert Schulze](https://github.com/rschu1ze)).
* More parallelism on `Outdated` parts removal with "zero-copy replication". [#49630](https://github.com/ClickHouse/ClickHouse/pull/49630) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Reduced number of `List` ZooKeeper requests when selecting parts to merge and a lot of partitions do not have anything to merge. [#49637](https://github.com/ClickHouse/ClickHouse/pull/49637) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Support `dictionary` table function for `RegExpTreeDictionary`. [#49666](https://github.com/ClickHouse/ClickHouse/pull/49666) ([Han Fei](https://github.com/hanfei1991)).
* Added weighted fair IO scheduling policy. Added dynamic resource manager, which allows IO scheduling hierarchy to be updated in runtime w/o server restarts. [#49671](https://github.com/ClickHouse/ClickHouse/pull/49671) ([Sergei Trifonov](https://github.com/serxa)).
* Add compose request after multipart upload to GCS. This enables the usage of copy operation on objects uploaded with the multipart upload. It's recommended to set `s3_strict_upload_part_size` to some value because compose request can fail on objects created with parts of different sizes. [#49693](https://github.com/ClickHouse/ClickHouse/pull/49693) ([Antonio Andelic](https://github.com/antonio2368)).
* Improve the "best-effort" parsing logic to accept `key_value_delimiter` as a valid part of the value. This also simplifies branching and might even speed up things a bit. [#49760](https://github.com/ClickHouse/ClickHouse/pull/49760) ([Arthur Passos](https://github.com/arthurpassos)).
* Facilitate profile data association and aggregation for the same query. [#49777](https://github.com/ClickHouse/ClickHouse/pull/49777) ([helifu](https://github.com/helifu)).
* System log tables can now have custom sorting keys. [#49778](https://github.com/ClickHouse/ClickHouse/pull/49778) ([helifu](https://github.com/helifu)).
* A new field 'partitions' is used to indicate which partitions are participating in the calculation. [#49779](https://github.com/ClickHouse/ClickHouse/pull/49779) ([helifu](https://github.com/helifu)).
* Added `enable_the_endpoint_id_with_zookeeper_name_prefix` setting for `ReplicatedMergeTree` (disabled by default). When enabled, it adds ZooKeeper cluster name to table's interserver communication endpoint. It avoids `Duplicate interserver IO endpoint` errors when having replicated tables with the same path, but different auxiliary ZooKeepers. [#49780](https://github.com/ClickHouse/ClickHouse/pull/49780) ([helifu](https://github.com/helifu)).
* Add query parameters to clickhouse-local. Closes [#46561](https://github.com/ClickHouse/ClickHouse/issues/46561). [#49785](https://github.com/ClickHouse/ClickHouse/pull/49785) ([Nikolay Degterinsky](https://github.com/evillique)).
* Qpl_deflate codec lower the minimum simd version to sse 4.2. [doc change in qpl](https://github.com/intel/qpl/commit/3f8f5cea27739f5261e8fd577dc233ffe88bf679) - intel® qpl relies on a run-time kernels dispatcher and cpuid check to choose the best available implementation(sse/avx2/avx512) - restructured cmakefile for qpl build in clickhouse to align with latest upstream qpl. [#49811](https://github.com/ClickHouse/ClickHouse/pull/49811) ([jasperzhu](https://github.com/jinjunzh)).
* Allow loading dictionaries and functions from YAML by default. In previous versions, it required editing the `dictionaries_config` or `user_defined_executable_functions_config` in the configuration file, as they expected `*.xml` files. [#49812](https://github.com/ClickHouse/ClickHouse/pull/49812) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The Kafka table engine now allows to use alias columns. [#49824](https://github.com/ClickHouse/ClickHouse/pull/49824) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Add setting to limit the max number of pairs produced by extractKeyValuePairs, safeguard to avoid using way too much memory. [#49836](https://github.com/ClickHouse/ClickHouse/pull/49836) ([Arthur Passos](https://github.com/arthurpassos)).
* Add support for (an unusual) case where the arguments in the `IN` operator are single-element tuples. [#49844](https://github.com/ClickHouse/ClickHouse/pull/49844) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* `bitHammingDistance` function support `String` and `FixedString` data type. Closes [#48827](https://github.com/ClickHouse/ClickHouse/issues/48827). [#49858](https://github.com/ClickHouse/ClickHouse/pull/49858) ([flynn](https://github.com/ucasfl)).
* Fix timeout resetting errors in the client on OS X. [#49863](https://github.com/ClickHouse/ClickHouse/pull/49863) ([alekar](https://github.com/alekar)).
* Add support for big integers, such as UInt128, Int128, UInt256, and Int256 in the function `bitCount`. This enables Hamming distance over large bit masks for AI applications. [#49867](https://github.com/ClickHouse/ClickHouse/pull/49867) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* This PR makes fingerprints to be used instead of key IDs in encrypted disks. [#49882](https://github.com/ClickHouse/ClickHouse/pull/49882) ([Vitaly Baranov](https://github.com/vitlibar)).
* Add UUID data type to PostgreSQL. Closes [#49739](https://github.com/ClickHouse/ClickHouse/issues/49739). [#49894](https://github.com/ClickHouse/ClickHouse/pull/49894) ([Nikolay Degterinsky](https://github.com/evillique)).
* Make `allow_experimental_query_cache` setting as obsolete for backward-compatibility. It was removed in https://github.com/ClickHouse/ClickHouse/pull/47977. [#49934](https://github.com/ClickHouse/ClickHouse/pull/49934) ([Timur Solodovnikov](https://github.com/tsolodov)).
* Function toUnixTimestamp() now accepts Date and Date32 arguments. [#49989](https://github.com/ClickHouse/ClickHouse/pull/49989) ([Victor Krasnov](https://github.com/sirvickr)).
* Charge only server memory for dictionaries. [#49995](https://github.com/ClickHouse/ClickHouse/pull/49995) ([Azat Khuzhin](https://github.com/azat)).
* Add schema inference to PostgreSQL, MySQL, MeiliSearch, and SQLite table engines. Closes [#49972](https://github.com/ClickHouse/ClickHouse/issues/49972). [#50000](https://github.com/ClickHouse/ClickHouse/pull/50000) ([Nikolay Degterinsky](https://github.com/evillique)).
* The server will allow using the `SQL_*` settings such as `SQL_AUTO_IS_NULL` as no-ops for MySQL compatibility. This closes [#49927](https://github.com/ClickHouse/ClickHouse/issues/49927). [#50013](https://github.com/ClickHouse/ClickHouse/pull/50013) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Preserve initial_query_id for ON CLUSTER queries, which is useful for introspection (under `distributed_ddl_entry_format_version=5`). [#50015](https://github.com/ClickHouse/ClickHouse/pull/50015) ([Azat Khuzhin](https://github.com/azat)).
* Preserve backward incompatibility for renamed settings by using aliases (`allow_experimental_projection_optimization` for `optimize_use_projections`, `allow_experimental_lightweight_delete` for `enable_lightweight_delete`). [#50044](https://github.com/ClickHouse/ClickHouse/pull/50044) ([Azat Khuzhin](https://github.com/azat)).
* Support cross-replication in distributed queries using the new infrastructure. [#50097](https://github.com/ClickHouse/ClickHouse/pull/50097) ([Dmitry Novik](https://github.com/novikd)).
* Support passing fqdn through setting my_hostname to register cluster node in keeper. Add setting of invisible to support multi compute groups. A compute group as a cluster, is invisible to other compute groups. [#50186](https://github.com/ClickHouse/ClickHouse/pull/50186) ([Yangkuan Liu](https://github.com/LiuYangkuan)).
* Fix PostgreSQL reading all the data even though `LIMIT n` could be specified. [#50187](https://github.com/ClickHouse/ClickHouse/pull/50187) ([Kseniia Sumarokova](https://github.com/kssenii)).
* 1) Fixed an error `NOT_FOUND_COLUMN_IN_BLOCK` in case of using parallel replicas with non-replicated storage with disabled setting `parallel_replicas_for_non_replicated_merge_tree` 2) Now `allow_experimental_parallel_reading_from_replicas` have 3 possible values - 0, 1 and 2. 0 - disabled, 1 - enabled, silently disable them in case of failure (in case of FINAL or JOIN), 2 - enabled, throw an expection in case of failure. 3) If FINAL modifier is used in SELECT query and parallel replicas are enabled, ClickHouse will try to disable them if `allow_experimental_parallel_reading_from_replicas` is set to 1 and throw an exception otherwise. [#50195](https://github.com/ClickHouse/ClickHouse/pull/50195) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Don't send head request for all keys in Iceberg schema inference, only for keys that are used for reaing data. [#50203](https://github.com/ClickHouse/ClickHouse/pull/50203) ([Kruglov Pavel](https://github.com/Avogar)).
* Add new profile events for queries with subqueries (`QueriesWithSubqueries`/`SelectQueriesWithSubqueries`/`InsertQueriesWithSubqueries`). [#50204](https://github.com/ClickHouse/ClickHouse/pull/50204) ([Azat Khuzhin](https://github.com/azat)).
* Adding the roles field in the users.xml file, which allows specifying roles with grants via a config file. [#50278](https://github.com/ClickHouse/ClickHouse/pull/50278) ([pufit](https://github.com/pufit)).
* When parallel replicas are enabled they will always skip unavailable servers (the behavior is controlled by the setting `skip_unavailable_shards`, enabled by default and can be only disabled). This closes: [#48565](https://github.com/ClickHouse/ClickHouse/issues/48565). [#50293](https://github.com/ClickHouse/ClickHouse/pull/50293) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix a typo. [#50306](https://github.com/ClickHouse/ClickHouse/pull/50306) ([helifu](https://github.com/helifu)).
* Setting `enable_memory_bound_merging_of_aggregation_results` is enabled by default. If you update from version prior to 22.12, we recommend to set this flag to `false` until update is finished. [#50319](https://github.com/ClickHouse/ClickHouse/pull/50319) ([Nikita Taranov](https://github.com/nickitat)).
* Report `CGroupCpuCfsPeriod` and `CGroupCpuCfsQuota` in AsynchronousMetrics. - Respect cgroup v2 memory limits during server startup. [#50379](https://github.com/ClickHouse/ClickHouse/pull/50379) ([alekar](https://github.com/alekar)).
* Bump internal protobuf to v3.18 (fixes CVE-2022-1941). [#50400](https://github.com/ClickHouse/ClickHouse/pull/50400) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump internal libxml2 to v2.10.4 (fixes CVE-2023-28484 and CVE-2023-29469). [#50402](https://github.com/ClickHouse/ClickHouse/pull/50402) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump c-ares to v1.19.1 (CVE-2023-32067, CVE-2023-31130, CVE-2023-31147). [#50403](https://github.com/ClickHouse/ClickHouse/pull/50403) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix CVE-2022-2469 in libgsasl. [#50404](https://github.com/ClickHouse/ClickHouse/pull/50404) ([Robert Schulze](https://github.com/rschu1ze)).
* Make filter push down through cross join. [#50430](https://github.com/ClickHouse/ClickHouse/pull/50430) ([Han Fei](https://github.com/hanfei1991)).
* Add a signal handler for SIGQUIT to work the same way as SIGINT. Closes [#50298](https://github.com/ClickHouse/ClickHouse/issues/50298). [#50435](https://github.com/ClickHouse/ClickHouse/pull/50435) ([Nikolay Degterinsky](https://github.com/evillique)).
* In case JSON parse fails due to the large size of the object output the last position to allow debugging. [#50474](https://github.com/ClickHouse/ClickHouse/pull/50474) ([Valentin Alexeev](https://github.com/valentinalexeev)).
* Support decimals with not fixed size. Closes [#49130](https://github.com/ClickHouse/ClickHouse/issues/49130). [#50586](https://github.com/ClickHouse/ClickHouse/pull/50586) ([Kruglov Pavel](https://github.com/Avogar)).
* Disable pure parallel replicas if trivial count optimization is possible. [#50594](https://github.com/ClickHouse/ClickHouse/pull/50594) ([Raúl Marín](https://github.com/Algunenano)).
* Added support of TRUNCATE db.table additional to TRUNCATE TABLE db.table in MaterializedMySQL. [#50624](https://github.com/ClickHouse/ClickHouse/pull/50624) ([Val Doroshchuk](https://github.com/valbok)).
* Disable parallel replicas automatically when the estimated number of granules is less than threshold. The behavior is controlled by a setting `parallel_replicas_min_number_of_granules_to_enable`. [#50639](https://github.com/ClickHouse/ClickHouse/pull/50639) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* When creating skipping indexes via "ALTER TABLE table ADD INDEX", the "GRANULARITY" clause can now be omitted. In that case, GRANULARITY is assumed to be 1. [#50658](https://github.com/ClickHouse/ClickHouse/pull/50658) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix slow cache in presence of big inserts. [#50680](https://github.com/ClickHouse/ClickHouse/pull/50680) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Set default max_elements limit in filesystem cache to 10000000. [#50682](https://github.com/ClickHouse/ClickHouse/pull/50682) ([Kseniia Sumarokova](https://github.com/kssenii)).
* SHOW INDICES is now an alias of statement SHOW INDEX/INDEXES/KEYS. [#50713](https://github.com/ClickHouse/ClickHouse/pull/50713) ([Robert Schulze](https://github.com/rschu1ze)).
#### Build/Testing/Packaging Improvement
* New and improved keeper-bench. Everything can be customized from yaml/XML file: - request generator - each type of request generator can have a specific set of fields - multi requests can be generated just by doing the same under `multi` key - for each request or subrequest in multi a `weight` field can be defined to control distribution - define trees that need to be setup for a test run - hosts can be defined with all timeouts customizable and it's possible to control how many sessions to generate for each host - integers defined with `min_value` and `max_value` fields are random number generators. [#48547](https://github.com/ClickHouse/ClickHouse/pull/48547) ([Antonio Andelic](https://github.com/antonio2368)).
* ... Add a test to check max_rows_to_read_leaf behaviour. [#48950](https://github.com/ClickHouse/ClickHouse/pull/48950) ([Sean Haynes](https://github.com/seandhaynes)).
* Io_uring is not supported on macos, don't choose it when running tests on local to avoid occassional failures. [#49250](https://github.com/ClickHouse/ClickHouse/pull/49250) ([Frank Chen](https://github.com/FrankChen021)).
* Support named fault injection for testing. [#49361](https://github.com/ClickHouse/ClickHouse/pull/49361) ([Han Fei](https://github.com/hanfei1991)).
* Fix the 01193_metadata_loading test to match the query execution time specific to s390x. [#49455](https://github.com/ClickHouse/ClickHouse/pull/49455) ([MeenaRenganathan22](https://github.com/MeenaRenganathan22)).
* Use the RapidJSONParser library to parse the JSON float values in case of s390x. [#49457](https://github.com/ClickHouse/ClickHouse/pull/49457) ([MeenaRenganathan22](https://github.com/MeenaRenganathan22)).
* Allow running ClickHouse in the OS where the `prctl` (process control) syscall is not available, such as AWS Lambda. [#49538](https://github.com/ClickHouse/ClickHouse/pull/49538) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Improve CI check with an enabled analyzer. Now it should be green if only tests from `tests/broken_tests.txt` are broken. [#49562](https://github.com/ClickHouse/ClickHouse/pull/49562) ([Dmitry Novik](https://github.com/novikd)).
* Fixed the issue of build conflict between contrib/isa-l and isa-l in qpl [49296](https://github.com/ClickHouse/ClickHouse/issues/49296). [#49584](https://github.com/ClickHouse/ClickHouse/pull/49584) ([jasperzhu](https://github.com/jinjunzh)).
* Utilities are now only build if explicitly requested ("-DENABLE_UTILS=1") instead of by default, this reduces link times in typical development builds. [#49620](https://github.com/ClickHouse/ClickHouse/pull/49620) ([Robert Schulze](https://github.com/rschu1ze)).
* Pull build description of idxd-config into a separate CMake file to avoid accidental removal in future. [#49651](https://github.com/ClickHouse/ClickHouse/pull/49651) ([jasperzhu](https://github.com/jinjunzh)).
* Add CI check with an enabled analyzer in the master. Followup [#49562](https://github.com/ClickHouse/ClickHouse/issues/49562). [#49668](https://github.com/ClickHouse/ClickHouse/pull/49668) ([Dmitry Novik](https://github.com/novikd)).
* Switch to LLVM/clang 16. [#49678](https://github.com/ClickHouse/ClickHouse/pull/49678) ([Azat Khuzhin](https://github.com/azat)).
* Fixed DefaultHash64 for non-64 bit integers on s390x. [#49833](https://github.com/ClickHouse/ClickHouse/pull/49833) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Allow building ClickHouse with clang-17. [#49851](https://github.com/ClickHouse/ClickHouse/pull/49851) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* ClickHouse is now easier to be integrated into other cmake projects. [#49991](https://github.com/ClickHouse/ClickHouse/pull/49991) ([Amos Bird](https://github.com/amosbird)).
* Link `boost::context` library to `clickhouse_common_io`. This closes: [#50381](https://github.com/ClickHouse/ClickHouse/issues/50381). [#50385](https://github.com/ClickHouse/ClickHouse/pull/50385) ([HaiBo Li](https://github.com/marising)).
* Add support for building with clang-17. [#50410](https://github.com/ClickHouse/ClickHouse/pull/50410) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix strange additional QEMU logging after [#47151](https://github.com/ClickHouse/ClickHouse/issues/47151), see https://s3.amazonaws.com/clickhouse-test-reports/50078/a4743996ee4f3583884d07bcd6501df0cfdaa346/stateless_tests__release__databasereplicated__[3_4].html. [#50442](https://github.com/ClickHouse/ClickHouse/pull/50442) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* ClickHouse can work on Linux RISC-V 6.1.22. This closes [#50456](https://github.com/ClickHouse/ClickHouse/issues/50456). [#50457](https://github.com/ClickHouse/ClickHouse/pull/50457) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* ActionsDAG: fix wrong optimization [#47584](https://github.com/ClickHouse/ClickHouse/pull/47584) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Correctly handle concurrent snapshots in Keeper [#48466](https://github.com/ClickHouse/ClickHouse/pull/48466) ([Antonio Andelic](https://github.com/antonio2368)).
* MergeTreeMarksLoader holds DataPart instead of DataPartStorage [#48515](https://github.com/ClickHouse/ClickHouse/pull/48515) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* sequence state fix [#48603](https://github.com/ClickHouse/ClickHouse/pull/48603) ([Ilya Golshtein](https://github.com/ilejn)).
* Back/Restore concurrency check on previous fails [#48726](https://github.com/ClickHouse/ClickHouse/pull/48726) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix Attaching a table with non-existent ZK path does not increase the ReadonlyReplica metric [#48954](https://github.com/ClickHouse/ClickHouse/pull/48954) ([wangxiaobo](https://github.com/wzb5212)).
* Fix possible terminate called for uncaught exception in some places [#49112](https://github.com/ClickHouse/ClickHouse/pull/49112) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix key not found error for queries with multiple StorageJoin [#49137](https://github.com/ClickHouse/ClickHouse/pull/49137) ([vdimir](https://github.com/vdimir)).
* Fix wrong query result when using nullable primary key [#49172](https://github.com/ClickHouse/ClickHouse/pull/49172) ([Duc Canh Le](https://github.com/canhld94)).
* Revert "Fix GCS native copy ([#48981](https://github.com/ClickHouse/ClickHouse/issues/48981))" [#49194](https://github.com/ClickHouse/ClickHouse/pull/49194) ([Raúl Marín](https://github.com/Algunenano)).
* Fix reinterpretAs*() on big endian machines [#49198](https://github.com/ClickHouse/ClickHouse/pull/49198) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* Lock zero copy parts more atomically [#49211](https://github.com/ClickHouse/ClickHouse/pull/49211) ([alesapin](https://github.com/alesapin)).
* Fix race on Outdated parts loading [#49223](https://github.com/ClickHouse/ClickHouse/pull/49223) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix all key value is null and group use rollup return wrong answer [#49282](https://github.com/ClickHouse/ClickHouse/pull/49282) ([Shuai li](https://github.com/loneylee)).
* Fix calculating load_factor for HASHED dictionaries with SHARDS [#49319](https://github.com/ClickHouse/ClickHouse/pull/49319) ([Azat Khuzhin](https://github.com/azat)).
* Disallow configuring compression CODECs for alias columns [#49363](https://github.com/ClickHouse/ClickHouse/pull/49363) ([Timur Solodovnikov](https://github.com/tsolodov)).
* Fix bug in removal of existing part directory [#49365](https://github.com/ClickHouse/ClickHouse/pull/49365) ([alesapin](https://github.com/alesapin)).
* Properly fix GCS when HMAC is used [#49390](https://github.com/ClickHouse/ClickHouse/pull/49390) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix fuzz bug when subquery set is not built when reading from remote() [#49425](https://github.com/ClickHouse/ClickHouse/pull/49425) ([Alexander Gololobov](https://github.com/davenger)).
* Invert `shutdown_wait_unfinished_queries` [#49427](https://github.com/ClickHouse/ClickHouse/pull/49427) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix another zero copy bug [#49473](https://github.com/ClickHouse/ClickHouse/pull/49473) ([alesapin](https://github.com/alesapin)).
* Fix postgres database setting [#49481](https://github.com/ClickHouse/ClickHouse/pull/49481) ([Mal Curtis](https://github.com/snikch)).
* Correctly handle s3Cluster arguments [#49490](https://github.com/ClickHouse/ClickHouse/pull/49490) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix bug in TraceCollector destructor. [#49508](https://github.com/ClickHouse/ClickHouse/pull/49508) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix AsynchronousReadIndirectBufferFromRemoteFS breaking on short seeks [#49525](https://github.com/ClickHouse/ClickHouse/pull/49525) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix dictionaries loading order [#49560](https://github.com/ClickHouse/ClickHouse/pull/49560) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Forbid the change of data type of Object('json') column [#49563](https://github.com/ClickHouse/ClickHouse/pull/49563) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix stress test (Logical error: Expected 7134 >= 11030) [#49623](https://github.com/ClickHouse/ClickHouse/pull/49623) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix bug in DISTINCT [#49628](https://github.com/ClickHouse/ClickHouse/pull/49628) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix: DISTINCT in order with zero values in non-sorted columns [#49636](https://github.com/ClickHouse/ClickHouse/pull/49636) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix one-off error in big integers found by UBSan with fuzzer [#49645](https://github.com/ClickHouse/ClickHouse/pull/49645) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix reading from sparse columns after restart [#49660](https://github.com/ClickHouse/ClickHouse/pull/49660) ([Anton Popov](https://github.com/CurtizJ)).
* Fix assert in SpanHolder::finish() with fibers [#49673](https://github.com/ClickHouse/ClickHouse/pull/49673) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix short circuit functions and mutations with sparse arguments [#49716](https://github.com/ClickHouse/ClickHouse/pull/49716) ([Anton Popov](https://github.com/CurtizJ)).
* Fix writing appended files to incremental backups [#49725](https://github.com/ClickHouse/ClickHouse/pull/49725) ([Vitaly Baranov](https://github.com/vitlibar)).
* Ignore LWD column in checkPartDynamicColumns [#49737](https://github.com/ClickHouse/ClickHouse/pull/49737) ([Alexander Gololobov](https://github.com/davenger)).
* Fix msan issue in randomStringUTF8(<uneven number>) [#49750](https://github.com/ClickHouse/ClickHouse/pull/49750) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix aggregate function kolmogorovSmirnovTest [#49768](https://github.com/ClickHouse/ClickHouse/pull/49768) ([FFFFFFFHHHHHHH](https://github.com/FFFFFFFHHHHHHH)).
* Fix settings aliases in native protocol [#49776](https://github.com/ClickHouse/ClickHouse/pull/49776) ([Azat Khuzhin](https://github.com/azat)).
* Fix `arrayMap` with array of tuples with single argument [#49789](https://github.com/ClickHouse/ClickHouse/pull/49789) ([Anton Popov](https://github.com/CurtizJ)).
* Fix per-query IO/BACKUPs throttling settings [#49797](https://github.com/ClickHouse/ClickHouse/pull/49797) ([Azat Khuzhin](https://github.com/azat)).
* Fix setting NULL in profile definition [#49831](https://github.com/ClickHouse/ClickHouse/pull/49831) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix a bug with projections and the aggregate_functions_null_for_empty setting (for query_plan_optimize_projection) [#49873](https://github.com/ClickHouse/ClickHouse/pull/49873) ([Amos Bird](https://github.com/amosbird)).
* Fix processing pending batch for Distributed async INSERT after restart [#49884](https://github.com/ClickHouse/ClickHouse/pull/49884) ([Azat Khuzhin](https://github.com/azat)).
* Fix assertion in CacheMetadata::doCleanup [#49914](https://github.com/ClickHouse/ClickHouse/pull/49914) ([Kseniia Sumarokova](https://github.com/kssenii)).
* fix `is_prefix` in OptimizeRegularExpression [#49919](https://github.com/ClickHouse/ClickHouse/pull/49919) ([Han Fei](https://github.com/hanfei1991)).
* Fix metrics `WriteBufferFromS3Bytes`, `WriteBufferFromS3Microseconds` and `WriteBufferFromS3RequestsErrors` [#49930](https://github.com/ClickHouse/ClickHouse/pull/49930) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Fix IPv6 encoding in protobuf [#49933](https://github.com/ClickHouse/ClickHouse/pull/49933) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix possible Logical error on bad Nullable parsing for text formats [#49960](https://github.com/ClickHouse/ClickHouse/pull/49960) ([Kruglov Pavel](https://github.com/Avogar)).
* Add setting output_format_parquet_compliant_nested_types to produce more compatible Parquet files [#50001](https://github.com/ClickHouse/ClickHouse/pull/50001) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix logical error in stress test "Not enough space to add ..." [#50021](https://github.com/ClickHouse/ClickHouse/pull/50021) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Avoid deadlock when starting table in attach thread of `ReplicatedMergeTree` [#50026](https://github.com/ClickHouse/ClickHouse/pull/50026) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix assert in SpanHolder::finish() with fibers attempt 2 [#50034](https://github.com/ClickHouse/ClickHouse/pull/50034) ([Kruglov Pavel](https://github.com/Avogar)).
* Add proper escaping for DDL OpenTelemetry context serialization [#50045](https://github.com/ClickHouse/ClickHouse/pull/50045) ([Azat Khuzhin](https://github.com/azat)).
* Fix reporting broken projection parts [#50052](https://github.com/ClickHouse/ClickHouse/pull/50052) ([Amos Bird](https://github.com/amosbird)).
* JIT compilation not equals NaN fix [#50056](https://github.com/ClickHouse/ClickHouse/pull/50056) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crashing in case of Replicated database without arguments [#50058](https://github.com/ClickHouse/ClickHouse/pull/50058) ([Azat Khuzhin](https://github.com/azat)).
* Fix crash with `multiIf` and constant condition and nullable arguments [#50123](https://github.com/ClickHouse/ClickHouse/pull/50123) ([Anton Popov](https://github.com/CurtizJ)).
* Fix invalid index analysis for date related keys [#50153](https://github.com/ClickHouse/ClickHouse/pull/50153) ([Amos Bird](https://github.com/amosbird)).
* do not allow modify order by when there are no order by cols [#50154](https://github.com/ClickHouse/ClickHouse/pull/50154) ([Han Fei](https://github.com/hanfei1991)).
* Fix broken index analysis when binary operator contains a null constant argument [#50177](https://github.com/ClickHouse/ClickHouse/pull/50177) ([Amos Bird](https://github.com/amosbird)).
* clickhouse-client: disallow usage of `--query` and `--queries-file` at the same time [#50210](https://github.com/ClickHouse/ClickHouse/pull/50210) ([Alexey Gerasimchuck](https://github.com/Demilivor)).
* Fix UB for INTO OUTFILE extensions (APPEND / AND STDOUT) and WATCH EVENTS [#50216](https://github.com/ClickHouse/ClickHouse/pull/50216) ([Azat Khuzhin](https://github.com/azat)).
* Fix skipping spaces at end of row in CustomSeparatedIgnoreSpaces format [#50224](https://github.com/ClickHouse/ClickHouse/pull/50224) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix iceberg metadata parsing [#50232](https://github.com/ClickHouse/ClickHouse/pull/50232) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix nested distributed SELECT in WITH clause [#50234](https://github.com/ClickHouse/ClickHouse/pull/50234) ([Azat Khuzhin](https://github.com/azat)).
* Fix reconnecting of HTTPS session when target host IP was changed [#50240](https://github.com/ClickHouse/ClickHouse/pull/50240) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Fix msan issue in keyed siphash [#50245](https://github.com/ClickHouse/ClickHouse/pull/50245) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix bugs in Poco sockets in non-blocking mode, use true non-blocking sockets [#50252](https://github.com/ClickHouse/ClickHouse/pull/50252) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix checksum calculation for backup entries [#50264](https://github.com/ClickHouse/ClickHouse/pull/50264) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed type conversion from Date/Date32 to DateTime64 when querying with DateTime64 index [#50280](https://github.com/ClickHouse/ClickHouse/pull/50280) ([Lucas Chang](https://github.com/lucas-tubi)).
* Comparison functions NaN fix [#50287](https://github.com/ClickHouse/ClickHouse/pull/50287) ([Maksim Kita](https://github.com/kitaisreal)).
* JIT aggregation nullable key fix [#50291](https://github.com/ClickHouse/ClickHouse/pull/50291) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix clickhouse-local crashing when writing empty Arrow or Parquet output [#50328](https://github.com/ClickHouse/ClickHouse/pull/50328) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix crash when Pool::Entry::disconnect() is called [#50334](https://github.com/ClickHouse/ClickHouse/pull/50334) ([Val Doroshchuk](https://github.com/valbok)).
* Improved fetch part by holding directory lock longer [#50339](https://github.com/ClickHouse/ClickHouse/pull/50339) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix bitShift* functions with both constant arguments [#50343](https://github.com/ClickHouse/ClickHouse/pull/50343) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix Keeper deadlock on exception when preprocessing requests. [#50387](https://github.com/ClickHouse/ClickHouse/pull/50387) ([frinkr](https://github.com/frinkr)).
* Fix hashing of const integer values [#50421](https://github.com/ClickHouse/ClickHouse/pull/50421) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix excessive memory usage for FINAL (due to too much streams usage) [#50429](https://github.com/ClickHouse/ClickHouse/pull/50429) ([Azat Khuzhin](https://github.com/azat)).
* Fix merge_tree_min_rows_for_seek/merge_tree_min_bytes_for_seek for data skipping indexes [#50432](https://github.com/ClickHouse/ClickHouse/pull/50432) ([Azat Khuzhin](https://github.com/azat)).
* Limit the number of in-flight tasks for loading outdated parts [#50450](https://github.com/ClickHouse/ClickHouse/pull/50450) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Keeper fix: apply uncommitted state after snapshot install [#50483](https://github.com/ClickHouse/ClickHouse/pull/50483) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix incorrect constant folding [#50536](https://github.com/ClickHouse/ClickHouse/pull/50536) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix logical error in stress test (Not enough space to add ...) [#50583](https://github.com/ClickHouse/ClickHouse/pull/50583) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix converting Null to LowCardinality(Nullable) in values table function [#50637](https://github.com/ClickHouse/ClickHouse/pull/50637) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix crash in anti/semi join [#50638](https://github.com/ClickHouse/ClickHouse/pull/50638) ([vdimir](https://github.com/vdimir)).
* Revert invalid RegExpTreeDictionary optimization [#50642](https://github.com/ClickHouse/ClickHouse/pull/50642) ([Johann Gan](https://github.com/johanngan)).
* Correctly disable async insert with deduplication when it's not needed [#50663](https://github.com/ClickHouse/ClickHouse/pull/50663) ([Antonio Andelic](https://github.com/antonio2368)).
#### Build Improvement
* Fixed Functional Test 00870_t64_codec, 00871_t64_codec_signed, 00872_t64_bit_codec. [#49658](https://github.com/ClickHouse/ClickHouse/pull/49658) ([Sanjam Panda](https://github.com/saitama951)).
#### NO CL ENTRY
* NO CL ENTRY: 'Fix user MemoryTracker counter in async inserts'. [#47630](https://github.com/ClickHouse/ClickHouse/pull/47630) ([Dmitry Novik](https://github.com/novikd)).
* NO CL ENTRY: 'Revert "Make `Pretty` formats even prettier."'. [#49850](https://github.com/ClickHouse/ClickHouse/pull/49850) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Update first_value.md:remove redundant 's''. [#50331](https://github.com/ClickHouse/ClickHouse/pull/50331) ([sslouis](https://github.com/savezed)).
* NO CL ENTRY: 'Revert "less logs in WriteBufferFromS3"'. [#50390](https://github.com/ClickHouse/ClickHouse/pull/50390) ([Alexander Tokmakov](https://github.com/tavplubix)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Attempt to fix the "system.stack_trace" test [#44627](https://github.com/ClickHouse/ClickHouse/pull/44627) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* rework WriteBufferFromS3, add tests, add abortion [#44869](https://github.com/ClickHouse/ClickHouse/pull/44869) ([Sema Checherinda](https://github.com/CheSema)).
* Rework locking in fs cache [#44985](https://github.com/ClickHouse/ClickHouse/pull/44985) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update ubuntu_ami_for_ci.sh [#47151](https://github.com/ClickHouse/ClickHouse/pull/47151) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Implement status comment [#48468](https://github.com/ClickHouse/ClickHouse/pull/48468) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update curl to 8.0.1 (for CVEs) [#48765](https://github.com/ClickHouse/ClickHouse/pull/48765) ([Boris Kuschel](https://github.com/bkuschel)).
* Fix some tests [#48792](https://github.com/ClickHouse/ClickHouse/pull/48792) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Bug Fix for 02432_s3_parallel_parts_cleanup.sql with zero copy replication [#48865](https://github.com/ClickHouse/ClickHouse/pull/48865) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Add AsyncLoader with dependency tracking and runtime prioritization [#48923](https://github.com/ClickHouse/ClickHouse/pull/48923) ([Sergei Trifonov](https://github.com/serxa)).
* Fix incorrect createColumn call on join clause [#48998](https://github.com/ClickHouse/ClickHouse/pull/48998) ([Ongkong](https://github.com/ongkong)).
* Try fix flaky 01346_alter_enum_partition_key_replicated_zookeeper_long [#49099](https://github.com/ClickHouse/ClickHouse/pull/49099) ([Sergei Trifonov](https://github.com/serxa)).
* Fix possible logical error "Cannot cancel. Either no query sent or already cancelled" [#49106](https://github.com/ClickHouse/ClickHouse/pull/49106) ([Kruglov Pavel](https://github.com/Avogar)).
* Refactor ColumnLowCardinality::cutAndCompact [#49111](https://github.com/ClickHouse/ClickHouse/pull/49111) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix tests with enabled analyzer [#49116](https://github.com/ClickHouse/ClickHouse/pull/49116) ([Dmitry Novik](https://github.com/novikd)).
* Use `SharedMutex` instead of `UpgradableMutex` [#49139](https://github.com/ClickHouse/ClickHouse/pull/49139) ([Sergei Trifonov](https://github.com/serxa)).
* Don't add metadata_version file if it doesn't exist [#49146](https://github.com/ClickHouse/ClickHouse/pull/49146) ([alesapin](https://github.com/alesapin)).
* clearing s3 between tests in a robust way [#49157](https://github.com/ClickHouse/ClickHouse/pull/49157) ([Sema Checherinda](https://github.com/CheSema)).
* Align connect timeout with aws sdk default [#49161](https://github.com/ClickHouse/ClickHouse/pull/49161) ([Nikita Taranov](https://github.com/nickitat)).
* Fix test_encrypted_disk_replication [#49193](https://github.com/ClickHouse/ClickHouse/pull/49193) ([Vitaly Baranov](https://github.com/vitlibar)).
* Allow using function `concat` with `Map` type [#49200](https://github.com/ClickHouse/ClickHouse/pull/49200) ([Anton Popov](https://github.com/CurtizJ)).
* Slight improvements to coordinator logging [#49204](https://github.com/ClickHouse/ClickHouse/pull/49204) ([Raúl Marín](https://github.com/Algunenano)).
* Fix some typos in conversion functions [#49221](https://github.com/ClickHouse/ClickHouse/pull/49221) ([Raúl Marín](https://github.com/Algunenano)).
* CMake: Remove some GCC-specific code [#49224](https://github.com/ClickHouse/ClickHouse/pull/49224) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix oss-fuzz build errors [#49236](https://github.com/ClickHouse/ClickHouse/pull/49236) ([Nikita Taranov](https://github.com/nickitat)).
* Update version after release [#49237](https://github.com/ClickHouse/ClickHouse/pull/49237) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update version_date.tsv and changelogs after v23.4.1.1943-stable [#49239](https://github.com/ClickHouse/ClickHouse/pull/49239) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Merge [#24050](https://github.com/ClickHouse/ClickHouse/issues/24050) [#49240](https://github.com/ClickHouse/ClickHouse/pull/49240) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add file name to exception raised during decompression [#49241](https://github.com/ClickHouse/ClickHouse/pull/49241) ([Nikolay Degterinsky](https://github.com/evillique)).
* Disable ISA-L on aarch64 architectures [#49256](https://github.com/ClickHouse/ClickHouse/pull/49256) ([Jordi Villar](https://github.com/jrdi)).
* Add a comment in FileCache.cpp [#49260](https://github.com/ClickHouse/ClickHouse/pull/49260) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix garbage [#48719](https://github.com/ClickHouse/ClickHouse/issues/48719) [#49263](https://github.com/ClickHouse/ClickHouse/pull/49263) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update build for nasm [#49288](https://github.com/ClickHouse/ClickHouse/pull/49288) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix race in `waitForProcessingQueue` [#49302](https://github.com/ClickHouse/ClickHouse/pull/49302) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix stress test [#49309](https://github.com/ClickHouse/ClickHouse/pull/49309) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix 02516_join_with_totals_and_subquery_bug with new analyzer [#49310](https://github.com/ClickHouse/ClickHouse/pull/49310) ([Dmitry Novik](https://github.com/novikd)).
* Fallback auth gh api [#49314](https://github.com/ClickHouse/ClickHouse/pull/49314) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Unpoison stack frame ptrs from libunwind for msan [#49316](https://github.com/ClickHouse/ClickHouse/pull/49316) ([Robert Schulze](https://github.com/rschu1ze)).
* Respect projections in 01600_parts [#49318](https://github.com/ClickHouse/ClickHouse/pull/49318) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* move pipe compute into initializePipeline [#49326](https://github.com/ClickHouse/ClickHouse/pull/49326) ([Konstantin Morozov](https://github.com/k-morozov)).
* Fix compiling average example (suppress -Wframe-larger-than) [#49358](https://github.com/ClickHouse/ClickHouse/pull/49358) ([Azat Khuzhin](https://github.com/azat)).
* Fix join_use_nulls in analyzer [#49359](https://github.com/ClickHouse/ClickHouse/pull/49359) ([vdimir](https://github.com/vdimir)).
* Fix 02680_mysql_ast_logical_err in analyzer [#49362](https://github.com/ClickHouse/ClickHouse/pull/49362) ([vdimir](https://github.com/vdimir)).
* Remove wrong assertion in cache [#49376](https://github.com/ClickHouse/ClickHouse/pull/49376) ([Kseniia Sumarokova](https://github.com/kssenii)).
* A better way of excluding ISA-L on non-x86 [#49378](https://github.com/ClickHouse/ClickHouse/pull/49378) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix decimal aggregates test for s390x [#49382](https://github.com/ClickHouse/ClickHouse/pull/49382) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Move logging one line higher [#49387](https://github.com/ClickHouse/ClickHouse/pull/49387) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Improve CI: status commit, auth for get_gh_api [#49388](https://github.com/ClickHouse/ClickHouse/pull/49388) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix printing hung queries in clickhouse-test. [#49389](https://github.com/ClickHouse/ClickHouse/pull/49389) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Correctly stop CNF convert for too many atomics in new analyzer [#49402](https://github.com/ClickHouse/ClickHouse/pull/49402) ([Antonio Andelic](https://github.com/antonio2368)).
* Remove 02707_complex_query_fails_analyzer test [#49403](https://github.com/ClickHouse/ClickHouse/pull/49403) ([Dmitry Novik](https://github.com/novikd)).
* Update FileSegment.cpp [#49411](https://github.com/ClickHouse/ClickHouse/pull/49411) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Switch Block::NameMap to google::dense_hash_map over HashMap [#49412](https://github.com/ClickHouse/ClickHouse/pull/49412) ([Azat Khuzhin](https://github.com/azat)).
* Slightly reduce inter-header dependencies [#49413](https://github.com/ClickHouse/ClickHouse/pull/49413) ([Azat Khuzhin](https://github.com/azat)).
* Update WithFileName.cpp [#49414](https://github.com/ClickHouse/ClickHouse/pull/49414) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix some assertions failing in stress test [#49415](https://github.com/ClickHouse/ClickHouse/pull/49415) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Correctly cleanup sequential node in ZooKeeperWithFaultInjection [#49418](https://github.com/ClickHouse/ClickHouse/pull/49418) ([vdimir](https://github.com/vdimir)).
* Throw an exception for non-parametric functions in new analyzer [#49419](https://github.com/ClickHouse/ClickHouse/pull/49419) ([Dmitry Novik](https://github.com/novikd)).
* Fix some bad error messages [#49420](https://github.com/ClickHouse/ClickHouse/pull/49420) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update version_date.tsv and changelogs after v23.4.2.11-stable [#49422](https://github.com/ClickHouse/ClickHouse/pull/49422) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Remove trash [#49423](https://github.com/ClickHouse/ClickHouse/pull/49423) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Whitespaces [#49424](https://github.com/ClickHouse/ClickHouse/pull/49424) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove dependency from DB::Context in remote/cache readers [#49426](https://github.com/ClickHouse/ClickHouse/pull/49426) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Merging [#49066](https://github.com/ClickHouse/ClickHouse/issues/49066) (Better error handling during loading of parts) [#49430](https://github.com/ClickHouse/ClickHouse/pull/49430) ([Anton Popov](https://github.com/CurtizJ)).
* all s3-blobs removed when merge aborted, remove part from failed fetch without unlock keper [#49432](https://github.com/ClickHouse/ClickHouse/pull/49432) ([Sema Checherinda](https://github.com/CheSema)).
* Make INSERT do more things in parallel to avoid getting bottlenecked on one thread [#49434](https://github.com/ClickHouse/ClickHouse/pull/49434) ([Michael Kolupaev](https://github.com/al13n321)).
* Make 'exceptions shorter than 30' test less noisy [#49435](https://github.com/ClickHouse/ClickHouse/pull/49435) ([Michael Kolupaev](https://github.com/al13n321)).
* Build fixes for ENABLE_LIBRARIES=OFF [#49437](https://github.com/ClickHouse/ClickHouse/pull/49437) ([Azat Khuzhin](https://github.com/azat)).
* Add image for docker-server jepsen [#49452](https://github.com/ClickHouse/ClickHouse/pull/49452) ([alesapin](https://github.com/alesapin)).
* Follow-up to [#48792](https://github.com/ClickHouse/ClickHouse/issues/48792) [#49458](https://github.com/ClickHouse/ClickHouse/pull/49458) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add method `getCurrentAvailabilityZone` to `AWSEC2MetadataClient` [#49464](https://github.com/ClickHouse/ClickHouse/pull/49464) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add an integration test for `shutdown_wait_unfinished_queries` [#49469](https://github.com/ClickHouse/ClickHouse/pull/49469) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Replace `NO DELAY` with `SYNC` in tests [#49470](https://github.com/ClickHouse/ClickHouse/pull/49470) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Check the PRs body directly in lambda, without rerun. Fix RCE in the CI [#49475](https://github.com/ClickHouse/ClickHouse/pull/49475) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Minor changes for setThreadName [#49476](https://github.com/ClickHouse/ClickHouse/pull/49476) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Static cast std::atomic<size_t> to uint64_t to serialize. [#49482](https://github.com/ClickHouse/ClickHouse/pull/49482) ([alekar](https://github.com/alekar)).
* Fix logical error in stress test, add some logging [#49491](https://github.com/ClickHouse/ClickHouse/pull/49491) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fixes in server jepsen image [#49492](https://github.com/ClickHouse/ClickHouse/pull/49492) ([alesapin](https://github.com/alesapin)).
* Fix UserTimeMicroseconds and SystemTimeMicroseconds descriptions [#49521](https://github.com/ClickHouse/ClickHouse/pull/49521) ([Sergei Trifonov](https://github.com/serxa)).
* Remove garbage from HDFS [#49531](https://github.com/ClickHouse/ClickHouse/pull/49531) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Split ReadWriteBufferFromHTTP.h into .h and .cpp file [#49533](https://github.com/ClickHouse/ClickHouse/pull/49533) ([Michael Kolupaev](https://github.com/al13n321)).
* Remove garbage from Pretty format [#49534](https://github.com/ClickHouse/ClickHouse/pull/49534) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Make input_format_parquet_preserve_order imply !parallelize_output_from_storages [#49536](https://github.com/ClickHouse/ClickHouse/pull/49536) ([Michael Kolupaev](https://github.com/al13n321)).
* Remove extra semicolons [#49545](https://github.com/ClickHouse/ClickHouse/pull/49545) ([Bulat Gaifullin](https://github.com/bgaifullin)).
* Fix 00597_push_down_predicate_long for analyzer [#49551](https://github.com/ClickHouse/ClickHouse/pull/49551) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix stress test (assertion 'key_metadata.lock()') [#49554](https://github.com/ClickHouse/ClickHouse/pull/49554) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix writeAnyEscapedString if quote_character is a meta character [#49558](https://github.com/ClickHouse/ClickHouse/pull/49558) ([Robert Schulze](https://github.com/rschu1ze)).
* Add CMake option for BOOST_USE_UCONTEXT [#49564](https://github.com/ClickHouse/ClickHouse/pull/49564) ([ltrk2](https://github.com/ltrk2)).
* Fix 01655_plan_optimizations_optimize_read_in_window_order for analyzer [#49565](https://github.com/ClickHouse/ClickHouse/pull/49565) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix `ThreadPool::wait` [#49572](https://github.com/ClickHouse/ClickHouse/pull/49572) ([Anton Popov](https://github.com/CurtizJ)).
* Query cache: disable for internal queries [#49573](https://github.com/ClickHouse/ClickHouse/pull/49573) ([Robert Schulze](https://github.com/rschu1ze)).
* Remove `test_merge_tree_s3_restore` [#49576](https://github.com/ClickHouse/ClickHouse/pull/49576) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix bad test [#49578](https://github.com/ClickHouse/ClickHouse/pull/49578) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove obsolete test about deprecated feature [#49579](https://github.com/ClickHouse/ClickHouse/pull/49579) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Avoid error found by AST Fuzzer [#49580](https://github.com/ClickHouse/ClickHouse/pull/49580) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix wrong assert [#49581](https://github.com/ClickHouse/ClickHouse/pull/49581) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Flaky test 02723_zookeeper_name.sql [#49592](https://github.com/ClickHouse/ClickHouse/pull/49592) ([Sema Checherinda](https://github.com/CheSema)).
* Query Cache: Safeguard against empty chunks [#49593](https://github.com/ClickHouse/ClickHouse/pull/49593) ([Robert Schulze](https://github.com/rschu1ze)).
* 02723_zookeeper_name: Force a deterministic result order [#49594](https://github.com/ClickHouse/ClickHouse/pull/49594) ([Robert Schulze](https://github.com/rschu1ze)).
* Remove dangerous code (stringstream) [#49595](https://github.com/ClickHouse/ClickHouse/pull/49595) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove some code [#49596](https://github.com/ClickHouse/ClickHouse/pull/49596) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove "locale" [#49597](https://github.com/ClickHouse/ClickHouse/pull/49597) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* CMake: Cleanup utils build [#49598](https://github.com/ClickHouse/ClickHouse/pull/49598) ([Robert Schulze](https://github.com/rschu1ze)).
* Follow-up for [#49580](https://github.com/ClickHouse/ClickHouse/issues/49580) [#49604](https://github.com/ClickHouse/ClickHouse/pull/49604) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix typo [#49605](https://github.com/ClickHouse/ClickHouse/pull/49605) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix bad test 01660_system_parts_smoke [#49611](https://github.com/ClickHouse/ClickHouse/pull/49611) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Minor changes [#49612](https://github.com/ClickHouse/ClickHouse/pull/49612) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Follow-up for [#49576](https://github.com/ClickHouse/ClickHouse/issues/49576) [#49615](https://github.com/ClickHouse/ClickHouse/pull/49615) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix error in [#48300](https://github.com/ClickHouse/ClickHouse/issues/48300) [#49616](https://github.com/ClickHouse/ClickHouse/pull/49616) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix typo: "as much slots" -> "as many slots" [#49617](https://github.com/ClickHouse/ClickHouse/pull/49617) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Better concurrent parts removal with zero copy [#49619](https://github.com/ClickHouse/ClickHouse/pull/49619) ([alesapin](https://github.com/alesapin)).
* CMake: Remove legacy switch for ccache [#49627](https://github.com/ClickHouse/ClickHouse/pull/49627) ([Robert Schulze](https://github.com/rschu1ze)).
* Try to fix integration test 'test_ssl_cert_authentication' [#49632](https://github.com/ClickHouse/ClickHouse/pull/49632) ([Nikolay Degterinsky](https://github.com/evillique)).
* Unflake 01660_system_parts_smoke [#49633](https://github.com/ClickHouse/ClickHouse/pull/49633) ([Robert Schulze](https://github.com/rschu1ze)).
* Add trash [#49634](https://github.com/ClickHouse/ClickHouse/pull/49634) ([Robert Schulze](https://github.com/rschu1ze)).
* Remove commented code [#49635](https://github.com/ClickHouse/ClickHouse/pull/49635) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add flaky test [#49646](https://github.com/ClickHouse/ClickHouse/pull/49646) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix race in `Context::createCopy` [#49663](https://github.com/ClickHouse/ClickHouse/pull/49663) ([Anton Popov](https://github.com/CurtizJ)).
* Disable 01710_projection_aggregation_in_order.sql [#49667](https://github.com/ClickHouse/ClickHouse/pull/49667) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix flaky 02684_bson.sql [#49674](https://github.com/ClickHouse/ClickHouse/pull/49674) ([Kruglov Pavel](https://github.com/Avogar)).
* Some cache cleanup after rework locking [#49675](https://github.com/ClickHouse/ClickHouse/pull/49675) ([Igor Nikonov](https://github.com/devcrafter)).
* Correctly update log pointer during database replica recovery [#49676](https://github.com/ClickHouse/ClickHouse/pull/49676) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Enable distinct in order after fix [#49636](https://github.com/ClickHouse/ClickHouse/issues/49636) [#49677](https://github.com/ClickHouse/ClickHouse/pull/49677) ([Igor Nikonov](https://github.com/devcrafter)).
* Build fixes for RISCV64 [#49688](https://github.com/ClickHouse/ClickHouse/pull/49688) ([Azat Khuzhin](https://github.com/azat)).
* Add some logging [#49690](https://github.com/ClickHouse/ClickHouse/pull/49690) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix a wrong built generator removal, use `depth=1` [#49692](https://github.com/ClickHouse/ClickHouse/pull/49692) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix member call on null pointer in AST fuzzer [#49696](https://github.com/ClickHouse/ClickHouse/pull/49696) ([Nikolay Degterinsky](https://github.com/evillique)).
* Improve woboq codebrowser pipeline [#49701](https://github.com/ClickHouse/ClickHouse/pull/49701) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Enable `do_not_evict_index_and_mark_files` by default [#49702](https://github.com/ClickHouse/ClickHouse/pull/49702) ([Nikita Taranov](https://github.com/nickitat)).
* Backport fix for UBSan error in musl/logf.c [#49705](https://github.com/ClickHouse/ClickHouse/pull/49705) ([Nikita Taranov](https://github.com/nickitat)).
* Fix flaky test for `kolmogorovSmirnovTest` function [#49710](https://github.com/ClickHouse/ClickHouse/pull/49710) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Update clickhouse-test [#49712](https://github.com/ClickHouse/ClickHouse/pull/49712) ([Alexander Tokmakov](https://github.com/tavplubix)).
* IBM s390x: ip encoding fix [#49713](https://github.com/ClickHouse/ClickHouse/pull/49713) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* Remove not used ErrorCodes [#49715](https://github.com/ClickHouse/ClickHouse/pull/49715) ([Sergei Trifonov](https://github.com/serxa)).
* Disable mmap for StorageFile in clickhouse-server [#49717](https://github.com/ClickHouse/ClickHouse/pull/49717) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix typo [#49718](https://github.com/ClickHouse/ClickHouse/pull/49718) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Do not launch workflows for PRs w/o "can be tested" [#49726](https://github.com/ClickHouse/ClickHouse/pull/49726) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Move assertions after logging [#49729](https://github.com/ClickHouse/ClickHouse/pull/49729) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Docs: Fix sidebar label for dictionary table function [#49730](https://github.com/ClickHouse/ClickHouse/pull/49730) ([Robert Schulze](https://github.com/rschu1ze)).
* Do not allocate own buffer in CachedOnDiskReadBufferFromFile when `use_external_buffer == true` [#49733](https://github.com/ClickHouse/ClickHouse/pull/49733) ([Nikita Taranov](https://github.com/nickitat)).
* fix convertation [#49749](https://github.com/ClickHouse/ClickHouse/pull/49749) ([Sema Checherinda](https://github.com/CheSema)).
* fix flaky test 02504_regexp_dictionary_ua_parser [#49753](https://github.com/ClickHouse/ClickHouse/pull/49753) ([Han Fei](https://github.com/hanfei1991)).
* Fix unit test `ExceptionFromWait` [#49755](https://github.com/ClickHouse/ClickHouse/pull/49755) ([Anton Popov](https://github.com/CurtizJ)).
* Add forgotten lock (addition to [#49117](https://github.com/ClickHouse/ClickHouse/issues/49117)) [#49757](https://github.com/ClickHouse/ClickHouse/pull/49757) ([Anton Popov](https://github.com/CurtizJ)).
* Fix typo [#49762](https://github.com/ClickHouse/ClickHouse/pull/49762) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix build of `libfiu` on clang-16 [#49766](https://github.com/ClickHouse/ClickHouse/pull/49766) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update README.md [#49782](https://github.com/ClickHouse/ClickHouse/pull/49782) ([Tyler Hannan](https://github.com/tylerhannan)).
* Analyzer: fix column not found for optimized prewhere with sample by [#49784](https://github.com/ClickHouse/ClickHouse/pull/49784) ([vdimir](https://github.com/vdimir)).
* Typo: demange.cpp --> demangle.cpp [#49799](https://github.com/ClickHouse/ClickHouse/pull/49799) ([Robert Schulze](https://github.com/rschu1ze)).
* Analyzer: apply _CAST to constants only once [#49800](https://github.com/ClickHouse/ClickHouse/pull/49800) ([Dmitry Novik](https://github.com/novikd)).
* Use CLOCK_MONOTONIC_RAW over CLOCK_MONOTONIC on Linux (fixes non monotonic clock) [#49819](https://github.com/ClickHouse/ClickHouse/pull/49819) ([Azat Khuzhin](https://github.com/azat)).
* README.md: 4 --> 5 [#49822](https://github.com/ClickHouse/ClickHouse/pull/49822) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow ASOF JOIN over nullable right column [#49826](https://github.com/ClickHouse/ClickHouse/pull/49826) ([vdimir](https://github.com/vdimir)).
* Make 01533_multiple_nested test more reliable [#49828](https://github.com/ClickHouse/ClickHouse/pull/49828) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* What happens if I remove everything in msan_suppressions? [#49829](https://github.com/ClickHouse/ClickHouse/pull/49829) ([Robert Schulze](https://github.com/rschu1ze)).
* Update README.md [#49832](https://github.com/ClickHouse/ClickHouse/pull/49832) ([AnneClickHouse](https://github.com/AnneClickHouse)).
* Randomize enable_multiple_prewhere_read_steps setting [#49834](https://github.com/ClickHouse/ClickHouse/pull/49834) ([Alexander Gololobov](https://github.com/davenger)).
* Analyzer: do not optimize GROUP BY keys with ROLLUP and CUBE [#49838](https://github.com/ClickHouse/ClickHouse/pull/49838) ([Dmitry Novik](https://github.com/novikd)).
* Clearable hash table and zero values [#49846](https://github.com/ClickHouse/ClickHouse/pull/49846) ([Igor Nikonov](https://github.com/devcrafter)).
* Reset vectorscan reference to an "official" repo [#49848](https://github.com/ClickHouse/ClickHouse/pull/49848) ([Robert Schulze](https://github.com/rschu1ze)).
* Enable few slow clang-tidy checks for clangd [#49855](https://github.com/ClickHouse/ClickHouse/pull/49855) ([Azat Khuzhin](https://github.com/azat)).
* Update QPL docs [#49857](https://github.com/ClickHouse/ClickHouse/pull/49857) ([Robert Schulze](https://github.com/rschu1ze)).
* Small-ish .clang-tidy update [#49859](https://github.com/ClickHouse/ClickHouse/pull/49859) ([Robert Schulze](https://github.com/rschu1ze)).
* Follow-up for clang-tidy [#49861](https://github.com/ClickHouse/ClickHouse/pull/49861) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix "reference to local binding" after fixes for clang-17 [#49874](https://github.com/ClickHouse/ClickHouse/pull/49874) ([Azat Khuzhin](https://github.com/azat)).
* fix typo [#49876](https://github.com/ClickHouse/ClickHouse/pull/49876) ([JackyWoo](https://github.com/JackyWoo)).
* Log with warning if the server was terminated forcefully [#49881](https://github.com/ClickHouse/ClickHouse/pull/49881) ([Azat Khuzhin](https://github.com/azat)).
* Fix some tests [#49889](https://github.com/ClickHouse/ClickHouse/pull/49889) ([Alexander Tokmakov](https://github.com/tavplubix)).
* use chassert in MergeTreeDeduplicationLog to have better log info [#49891](https://github.com/ClickHouse/ClickHouse/pull/49891) ([Han Fei](https://github.com/hanfei1991)).
* Multiple pools support for AsyncLoader [#49893](https://github.com/ClickHouse/ClickHouse/pull/49893) ([Sergei Trifonov](https://github.com/serxa)).
* Fix stack-use-after-scope in resource manager test [#49908](https://github.com/ClickHouse/ClickHouse/pull/49908) ([Sergei Trifonov](https://github.com/serxa)).
* Retry connection expired in test_rename_column/test.py [#49911](https://github.com/ClickHouse/ClickHouse/pull/49911) ([alesapin](https://github.com/alesapin)).
* Try to fix flaky test_distributed_load_balancing tests [#49912](https://github.com/ClickHouse/ClickHouse/pull/49912) ([Kruglov Pavel](https://github.com/Avogar)).
* Remove unused code [#49918](https://github.com/ClickHouse/ClickHouse/pull/49918) ([alesapin](https://github.com/alesapin)).
* Fix flakiness of test_distributed_load_balancing test [#49921](https://github.com/ClickHouse/ClickHouse/pull/49921) ([Azat Khuzhin](https://github.com/azat)).
* Add some logging [#49925](https://github.com/ClickHouse/ClickHouse/pull/49925) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support hardlinking parts transactionally [#49931](https://github.com/ClickHouse/ClickHouse/pull/49931) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix for analyzer: 02377_ optimize_sorting_by_input_stream_properties_e… [#49943](https://github.com/ClickHouse/ClickHouse/pull/49943) ([Igor Nikonov](https://github.com/devcrafter)).
* Follow up to [#49429](https://github.com/ClickHouse/ClickHouse/issues/49429) [#49964](https://github.com/ClickHouse/ClickHouse/pull/49964) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix flaky test_ssl_cert_authentication to use urllib3 [#49982](https://github.com/ClickHouse/ClickHouse/pull/49982) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix woboq codebrowser build with -Wno-poison-system-directories [#49992](https://github.com/ClickHouse/ClickHouse/pull/49992) ([Azat Khuzhin](https://github.com/azat)).
* test for [#46128](https://github.com/ClickHouse/ClickHouse/issues/46128) [#49993](https://github.com/ClickHouse/ClickHouse/pull/49993) ([Denny Crane](https://github.com/den-crane)).
* Fix test_insert_same_partition_and_merge failing if one Azure request attempt fails [#49996](https://github.com/ClickHouse/ClickHouse/pull/49996) ([Michael Kolupaev](https://github.com/al13n321)).
* Check return value of `ftruncate` in Keeper [#50020](https://github.com/ClickHouse/ClickHouse/pull/50020) ([Antonio Andelic](https://github.com/antonio2368)).
* Add some assertions [#50025](https://github.com/ClickHouse/ClickHouse/pull/50025) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update 02441_alter_delete_and_drop_column.sql [#50027](https://github.com/ClickHouse/ClickHouse/pull/50027) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Move some common code to common [#50028](https://github.com/ClickHouse/ClickHouse/pull/50028) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add method getCredentials() to S3::Client [#50030](https://github.com/ClickHouse/ClickHouse/pull/50030) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update query_log.md [#50032](https://github.com/ClickHouse/ClickHouse/pull/50032) ([Sergei Trifonov](https://github.com/serxa)).
* Get rid of indirect write buffer in object storages [#50033](https://github.com/ClickHouse/ClickHouse/pull/50033) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Load balancing bugfixes [#50036](https://github.com/ClickHouse/ClickHouse/pull/50036) ([Sergei Trifonov](https://github.com/serxa)).
* Update S3 sdk to v1.11.61 [#50037](https://github.com/ClickHouse/ClickHouse/pull/50037) ([Nikita Taranov](https://github.com/nickitat)).
* Fix 02735_system_zookeeper_connection for DatabaseReplicated [#50047](https://github.com/ClickHouse/ClickHouse/pull/50047) ([Azat Khuzhin](https://github.com/azat)).
* Add more profile events for distributed connections [#50051](https://github.com/ClickHouse/ClickHouse/pull/50051) ([Sergei Trifonov](https://github.com/serxa)).
* FileCache: simple tryReserve() cleanup [#50059](https://github.com/ClickHouse/ClickHouse/pull/50059) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix hashed/sparse_hashed dictionaries max_load_factor upper range [#50065](https://github.com/ClickHouse/ClickHouse/pull/50065) ([Azat Khuzhin](https://github.com/azat)).
* Clearer coordinator log [#50101](https://github.com/ClickHouse/ClickHouse/pull/50101) ([Raúl Marín](https://github.com/Algunenano)).
* Analyzer: Do not execute table functions multiple times [#50105](https://github.com/ClickHouse/ClickHouse/pull/50105) ([Dmitry Novik](https://github.com/novikd)).
* Update default settings for Replicated database [#50108](https://github.com/ClickHouse/ClickHouse/pull/50108) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Make async prefetched buffer work with arbitrary impl [#50109](https://github.com/ClickHouse/ClickHouse/pull/50109) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update github.com/distribution/distribution [#50114](https://github.com/ClickHouse/ClickHouse/pull/50114) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Docs: Update clickhouse-local arguments [#50138](https://github.com/ClickHouse/ClickHouse/pull/50138) ([Robert Schulze](https://github.com/rschu1ze)).
* Change fields destruction order in AsyncTaskExecutor [#50151](https://github.com/ClickHouse/ClickHouse/pull/50151) ([Kruglov Pavel](https://github.com/Avogar)).
* Follow-up to [#49889](https://github.com/ClickHouse/ClickHouse/issues/49889) [#50152](https://github.com/ClickHouse/ClickHouse/pull/50152) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Clarification comment on retries controller behavior [#50155](https://github.com/ClickHouse/ClickHouse/pull/50155) ([Igor Nikonov](https://github.com/devcrafter)).
* Switch to upstream repository of vectorscan [#50159](https://github.com/ClickHouse/ClickHouse/pull/50159) ([Azat Khuzhin](https://github.com/azat)).
* Refactor lambdas, prepare to prio runners [#50160](https://github.com/ClickHouse/ClickHouse/pull/50160) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Speed-up the shellcheck with parallel xargs [#50164](https://github.com/ClickHouse/ClickHouse/pull/50164) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update an exception message [#50180](https://github.com/ClickHouse/ClickHouse/pull/50180) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Upgrade boost submodule [#50188](https://github.com/ClickHouse/ClickHouse/pull/50188) ([ltrk2](https://github.com/ltrk2)).
* Implement a uniform way to query processor core IDs [#50190](https://github.com/ClickHouse/ClickHouse/pull/50190) ([ltrk2](https://github.com/ltrk2)).
* Don't replicate delete through DDL worker if there is just 1 shard [#50193](https://github.com/ClickHouse/ClickHouse/pull/50193) ([Alexander Gololobov](https://github.com/davenger)).
* Fix codebrowser by using clang-15 image [#50197](https://github.com/ClickHouse/ClickHouse/pull/50197) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add comments to build reports [#50200](https://github.com/ClickHouse/ClickHouse/pull/50200) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Automatic backports of important fixes to cloud-release [#50202](https://github.com/ClickHouse/ClickHouse/pull/50202) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Unify priorities: lower value means higher priority [#50205](https://github.com/ClickHouse/ClickHouse/pull/50205) ([Sergei Trifonov](https://github.com/serxa)).
* Use transactions for encrypted disks [#50206](https://github.com/ClickHouse/ClickHouse/pull/50206) ([alesapin](https://github.com/alesapin)).
* Get detailed error instead of unknown error for function test [#50207](https://github.com/ClickHouse/ClickHouse/pull/50207) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* README.md: Remove Berlin Meetup from upcoming events [#50218](https://github.com/ClickHouse/ClickHouse/pull/50218) ([Robert Schulze](https://github.com/rschu1ze)).
* Minor adjustment of clickhouse-client/local parameter docs [#50219](https://github.com/ClickHouse/ClickHouse/pull/50219) ([Robert Schulze](https://github.com/rschu1ze)).
* Unify priorities: rework IO scheduling subsystem [#50231](https://github.com/ClickHouse/ClickHouse/pull/50231) ([Sergei Trifonov](https://github.com/serxa)).
* Add new metrics BrokenDistributedBytesToInsert/DistributedBytesToInsert [#50238](https://github.com/ClickHouse/ClickHouse/pull/50238) ([Azat Khuzhin](https://github.com/azat)).
* Fix URL in backport comment [#50241](https://github.com/ClickHouse/ClickHouse/pull/50241) ([pufit](https://github.com/pufit)).
* Fix `02535_max_parallel_replicas_custom_key` [#50242](https://github.com/ClickHouse/ClickHouse/pull/50242) ([Antonio Andelic](https://github.com/antonio2368)).
* Fixes for MergeTree with readonly disks [#50244](https://github.com/ClickHouse/ClickHouse/pull/50244) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Yet another refactoring [#50257](https://github.com/ClickHouse/ClickHouse/pull/50257) ([Anton Popov](https://github.com/CurtizJ)).
* Unify priorities: rework AsyncLoader [#50272](https://github.com/ClickHouse/ClickHouse/pull/50272) ([Sergei Trifonov](https://github.com/serxa)).
* buffers d-tor finalize free [#50275](https://github.com/ClickHouse/ClickHouse/pull/50275) ([Sema Checherinda](https://github.com/CheSema)).
* Fix 02767_into_outfile_extensions_msan under analyzer [#50290](https://github.com/ClickHouse/ClickHouse/pull/50290) ([Azat Khuzhin](https://github.com/azat)).
* QPL: Add a comment about isal [#50308](https://github.com/ClickHouse/ClickHouse/pull/50308) ([Robert Schulze](https://github.com/rschu1ze)).
* Avoid clang 15 crash [#50310](https://github.com/ClickHouse/ClickHouse/pull/50310) ([Raúl Marín](https://github.com/Algunenano)).
* Cleanup Annoy index [#50312](https://github.com/ClickHouse/ClickHouse/pull/50312) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix flaky `AsyncLoader.StaticPriorities` unit test [#50313](https://github.com/ClickHouse/ClickHouse/pull/50313) ([Sergei Trifonov](https://github.com/serxa)).
* Update gtest_async_loader.cpp [#50317](https://github.com/ClickHouse/ClickHouse/pull/50317) ([Nikita Taranov](https://github.com/nickitat)).
* Fix IS (NOT) NULL operator priority [#50327](https://github.com/ClickHouse/ClickHouse/pull/50327) ([Nikolay Degterinsky](https://github.com/evillique)).
* Update README.md [#50340](https://github.com/ClickHouse/ClickHouse/pull/50340) ([Tyler Hannan](https://github.com/tylerhannan)).
* do not fix the event list in test [#50342](https://github.com/ClickHouse/ClickHouse/pull/50342) ([Sema Checherinda](https://github.com/CheSema)).
* less logs in WriteBufferFromS3 [#50347](https://github.com/ClickHouse/ClickHouse/pull/50347) ([Sema Checherinda](https://github.com/CheSema)).
* Remove legacy install scripts superseded by universal.sh [#50360](https://github.com/ClickHouse/ClickHouse/pull/50360) ([Robert Schulze](https://github.com/rschu1ze)).
* Fail perf tests when too many queries slowed down [#50361](https://github.com/ClickHouse/ClickHouse/pull/50361) ([Nikita Taranov](https://github.com/nickitat)).
* Fix after [#50109](https://github.com/ClickHouse/ClickHouse/issues/50109) [#50362](https://github.com/ClickHouse/ClickHouse/pull/50362) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix log message [#50363](https://github.com/ClickHouse/ClickHouse/pull/50363) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Compare functions NaN update test [#50366](https://github.com/ClickHouse/ClickHouse/pull/50366) ([Maksim Kita](https://github.com/kitaisreal)).
* Add re-creation for cherry-pick PRs [#50373](https://github.com/ClickHouse/ClickHouse/pull/50373) ([pufit](https://github.com/pufit)).
* Without applying `prepareRightBlock` will cause mismatch block structrue [#50383](https://github.com/ClickHouse/ClickHouse/pull/50383) ([lgbo](https://github.com/lgbo-ustc)).
* fix hung in unit tests [#50391](https://github.com/ClickHouse/ClickHouse/pull/50391) ([Sema Checherinda](https://github.com/CheSema)).
* Fix poll timeout in MaterializedMySQL [#50392](https://github.com/ClickHouse/ClickHouse/pull/50392) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Compile aggregate expressions enable by default [#50401](https://github.com/ClickHouse/ClickHouse/pull/50401) ([Maksim Kita](https://github.com/kitaisreal)).
* Update app.py [#50407](https://github.com/ClickHouse/ClickHouse/pull/50407) ([Nikita Taranov](https://github.com/nickitat)).
* reuse s3_mocks, rewrite test test_paranoid_check_in_logs [#50408](https://github.com/ClickHouse/ClickHouse/pull/50408) ([Sema Checherinda](https://github.com/CheSema)).
* test for [#42610](https://github.com/ClickHouse/ClickHouse/issues/42610) [#50409](https://github.com/ClickHouse/ClickHouse/pull/50409) ([Denny Crane](https://github.com/den-crane)).
* Remove something [#50411](https://github.com/ClickHouse/ClickHouse/pull/50411) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Mark the builds without results as pending [#50415](https://github.com/ClickHouse/ClickHouse/pull/50415) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Revert "Fix msan issue in keyed siphash" [#50426](https://github.com/ClickHouse/ClickHouse/pull/50426) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Revert "Revert "less logs in WriteBufferFromS3" ([#50390](https://github.com/ClickHouse/ClickHouse/issues/50390))" [#50444](https://github.com/ClickHouse/ClickHouse/pull/50444) ([Sema Checherinda](https://github.com/CheSema)).
* Paranoid fix for removing parts from ZooKeeper [#50448](https://github.com/ClickHouse/ClickHouse/pull/50448) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add timeout for unit tests [#50449](https://github.com/ClickHouse/ClickHouse/pull/50449) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Changes related to an internal feature [#50453](https://github.com/ClickHouse/ClickHouse/pull/50453) ([Michael Kolupaev](https://github.com/al13n321)).
* Don't crash if config doesn't have logger section [#50455](https://github.com/ClickHouse/ClickHouse/pull/50455) ([Michael Kolupaev](https://github.com/al13n321)).
* Update function docs [#50466](https://github.com/ClickHouse/ClickHouse/pull/50466) ([Robert Schulze](https://github.com/rschu1ze)).
* Revert "make filter push down through cross join" [#50467](https://github.com/ClickHouse/ClickHouse/pull/50467) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add some assertions [#50470](https://github.com/ClickHouse/ClickHouse/pull/50470) ([Kseniia Sumarokova](https://github.com/kssenii)).
* CI: Enable aspell on nested docs [#50476](https://github.com/ClickHouse/ClickHouse/pull/50476) ([Robert Schulze](https://github.com/rschu1ze)).
* Try fix flaky test test_async_query_sending [#50480](https://github.com/ClickHouse/ClickHouse/pull/50480) ([Kruglov Pavel](https://github.com/Avogar)).
* Disable 00534_functions_bad_arguments with msan [#50481](https://github.com/ClickHouse/ClickHouse/pull/50481) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Typos: Follow-up to [#50476](https://github.com/ClickHouse/ClickHouse/issues/50476) [#50482](https://github.com/ClickHouse/ClickHouse/pull/50482) ([Robert Schulze](https://github.com/rschu1ze)).
* Remove unneeded Keeper test [#50485](https://github.com/ClickHouse/ClickHouse/pull/50485) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix KeyError in cherry-pick [#50493](https://github.com/ClickHouse/ClickHouse/pull/50493) ([pufit](https://github.com/pufit)).
* Make typeid_cast for pointers noexcept [#50495](https://github.com/ClickHouse/ClickHouse/pull/50495) ([Sergey Kazmin ](https://github.com/yerseg)).
* less traces in logs [#50518](https://github.com/ClickHouse/ClickHouse/pull/50518) ([Sema Checherinda](https://github.com/CheSema)).
* Implement endianness-independent serialization for UUID [#50519](https://github.com/ClickHouse/ClickHouse/pull/50519) ([ltrk2](https://github.com/ltrk2)).
* Remove strange object storage methods [#50521](https://github.com/ClickHouse/ClickHouse/pull/50521) ([alesapin](https://github.com/alesapin)).
* Fix low quality code around metadata in RocksDB (experimental feature never used in production) [#50527](https://github.com/ClickHouse/ClickHouse/pull/50527) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Function if constant folding [#50529](https://github.com/ClickHouse/ClickHouse/pull/50529) ([Maksim Kita](https://github.com/kitaisreal)).
* Add profile events for fs cache eviction [#50533](https://github.com/ClickHouse/ClickHouse/pull/50533) ([Kseniia Sumarokova](https://github.com/kssenii)).
* QueryNode small fix [#50535](https://github.com/ClickHouse/ClickHouse/pull/50535) ([Maksim Kita](https://github.com/kitaisreal)).
* Control memory usage in generateRandom [#50538](https://github.com/ClickHouse/ClickHouse/pull/50538) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Disable skim (Rust library) under memory sanitizer [#50539](https://github.com/ClickHouse/ClickHouse/pull/50539) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* MSan support for Rust [#50541](https://github.com/ClickHouse/ClickHouse/pull/50541) ([Azat Khuzhin](https://github.com/azat)).
* Make 01565_query_loop_after_client_error slightly more robust [#50542](https://github.com/ClickHouse/ClickHouse/pull/50542) ([Azat Khuzhin](https://github.com/azat)).
* Resize BufferFromVector underlying vector only pos_offset == vector.size() [#50546](https://github.com/ClickHouse/ClickHouse/pull/50546) ([auxten](https://github.com/auxten)).
* Add async iteration to object storage [#50548](https://github.com/ClickHouse/ClickHouse/pull/50548) ([alesapin](https://github.com/alesapin)).
* skip extracting darwin toolchain in builder when unncessary [#50550](https://github.com/ClickHouse/ClickHouse/pull/50550) ([SuperDJY](https://github.com/cmsxbc)).
* Remove flaky test [#50558](https://github.com/ClickHouse/ClickHouse/pull/50558) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Revert "Disable skim (Rust library) under memory sanitizer" [#50574](https://github.com/ClickHouse/ClickHouse/pull/50574) ([Azat Khuzhin](https://github.com/azat)).
* Analyzer: fix 01487_distributed_in_not_default_db [#50587](https://github.com/ClickHouse/ClickHouse/pull/50587) ([Dmitry Novik](https://github.com/novikd)).
* Fix commit for DiskObjectStorage [#50599](https://github.com/ClickHouse/ClickHouse/pull/50599) ([alesapin](https://github.com/alesapin)).
* Fix Jepsen runs in PRs [#50615](https://github.com/ClickHouse/ClickHouse/pull/50615) ([Antonio Andelic](https://github.com/antonio2368)).
* Revert incorrect optimizations [#50629](https://github.com/ClickHouse/ClickHouse/pull/50629) ([Raúl Marín](https://github.com/Algunenano)).
* Disable 01676_clickhouse_client_autocomplete under UBSan [#50636](https://github.com/ClickHouse/ClickHouse/pull/50636) ([Nikita Taranov](https://github.com/nickitat)).
* Merging [#50329](https://github.com/ClickHouse/ClickHouse/issues/50329) [#50660](https://github.com/ClickHouse/ClickHouse/pull/50660) ([Anton Popov](https://github.com/CurtizJ)).
* Revert "date_trunc function to always return DateTime type" [#50670](https://github.com/ClickHouse/ClickHouse/pull/50670) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix flaky test 02461_prewhere_row_level_policy_lightweight_delete [#50674](https://github.com/ClickHouse/ClickHouse/pull/50674) ([Alexander Gololobov](https://github.com/davenger)).
* Fix asan issue with analyzer and prewhere [#50685](https://github.com/ClickHouse/ClickHouse/pull/50685) ([Alexander Gololobov](https://github.com/davenger)).
* Catch issues with dockerd during the build [#50700](https://github.com/ClickHouse/ClickHouse/pull/50700) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Temporarily disable annoy index tests (flaky for analyzer) [#50714](https://github.com/ClickHouse/ClickHouse/pull/50714) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix assertion from stress test [#50718](https://github.com/ClickHouse/ClickHouse/pull/50718) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix flaky unit test [#50719](https://github.com/ClickHouse/ClickHouse/pull/50719) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Show correct sharing state in system.query_cache [#50728](https://github.com/ClickHouse/ClickHouse/pull/50728) ([Robert Schulze](https://github.com/rschu1ze)).

View File

@ -0,0 +1,18 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.5.2.7-stable (5751aa1ab9f) FIXME as compared to v23.5.1.3174-stable (2fec796e73e)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Do not read all the columns from right GLOBAL JOIN table. [#50721](https://github.com/ClickHouse/ClickHouse/pull/50721) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Fix build for aarch64 (temporary disable azure) [#50770](https://github.com/ClickHouse/ClickHouse/pull/50770) ([alesapin](https://github.com/alesapin)).
* Rename azure_blob_storage to azureBlobStorage [#50812](https://github.com/ClickHouse/ClickHouse/pull/50812) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).

View File

@ -43,7 +43,7 @@ sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
For other Linux distribution - check the availability of LLVM's [prebuild packages](https://releases.llvm.org/download.html).
As of April 2023, any version of Clang >= 15 will work.
GCC as a compiler is not supported
GCC as a compiler is not supported.
To build with a specific Clang version:
:::tip
@ -114,18 +114,3 @@ mkdir build
cmake -S . -B build
cmake --build build
```
## You Dont Have to Build ClickHouse {#you-dont-have-to-build-clickhouse}
ClickHouse is available in pre-built binaries and packages. Binaries are portable and can be run on any Linux flavour.
The CI checks build the binaries on each commit to [ClickHouse](https://github.com/clickhouse/clickhouse/). To download them:
1. Open the [commits list](https://github.com/ClickHouse/ClickHouse/commits/master)
1. Choose a **Merge pull request** commit that includes the new feature, or was added after the new feature
1. Click the status symbol (yellow dot, red x, green check) to open the CI check list
1. Scroll through the list until you find **ClickHouse build check x/x artifact groups are OK**
1. Click **Details**
1. Find the type of package for your operating system that you need and download the files.
![build artifact check](images/find-build-artifact.png)

View File

@ -119,7 +119,7 @@ When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](
The data of TIME type in MySQL is converted to microseconds in ClickHouse.
Other types are not supported. If MySQL table contains a column of such type, ClickHouse throws exception "Unhandled data type" and stops replication.
Other types are not supported. If MySQL table contains a column of such type, ClickHouse throws an exception and stops replication.
## Specifics and Recommendations {#specifics-and-recommendations}

View File

@ -55,7 +55,7 @@ ATTACH TABLE postgres_database.new_table;
```
:::warning
Before version 22.1, adding a table to replication left an unremoved temporary replication slot (named `{db_name}_ch_replication_slot_tmp`). If attaching tables in ClickHouse version before 22.1, make sure to delete it manually (`SELECT pg_drop_replication_slot('{db_name}_ch_replication_slot_tmp')`). Otherwise disk usage will grow. This issue is fixed in 22.1.
Before version 22.1, adding a table to replication left a non-removed temporary replication slot (named `{db_name}_ch_replication_slot_tmp`). If attaching tables in ClickHouse version before 22.1, make sure to delete it manually (`SELECT pg_drop_replication_slot('{db_name}_ch_replication_slot_tmp')`). Otherwise disk usage will grow. This issue is fixed in 22.1.
:::
## Dynamically removing tables from replication {#dynamically-removing-table-from-replication}
@ -257,7 +257,7 @@ Please note that this should be used only if it is actually needed. If there is
1. [CREATE PUBLICATION](https://postgrespro.ru/docs/postgresql/14/sql-createpublication) -- create query privilege.
2. [CREATE_REPLICATION_SLOT](https://postgrespro.ru/docs/postgrespro/10/protocol-replication#PROTOCOL-REPLICATION-CREATE-SLOT) -- replication privelege.
2. [CREATE_REPLICATION_SLOT](https://postgrespro.ru/docs/postgrespro/10/protocol-replication#PROTOCOL-REPLICATION-CREATE-SLOT) -- replication privilege.
3. [pg_drop_replication_slot](https://postgrespro.ru/docs/postgrespro/9.5/functions-admin#functions-replication) -- replication privilege or superuser.

View File

@ -30,7 +30,7 @@ Allows to connect to [SQLite](https://www.sqlite.org/index.html) database and pe
## Specifics and Recommendations {#specifics-and-recommendations}
SQLite stores the entire database (definitions, tables, indices, and the data itself) as a single cross-platform file on a host machine. During writing SQLite locks the entire database file, therefore write operations are performed sequentially. Read operations can be multitasked.
SQLite stores the entire database (definitions, tables, indices, and the data itself) as a single cross-platform file on a host machine. During writing SQLite locks the entire database file, therefore write operations are performed sequentially. Read operations can be multi-tasked.
SQLite does not require service management (such as startup scripts) or access control based on `GRANT` and passwords. Access control is handled by means of file-system permissions given to the database file itself.
## Usage Example {#usage-example}

View File

@ -53,6 +53,7 @@ Engines in the family:
- [JDBC](../../engines/table-engines/integrations/jdbc.md)
- [MySQL](../../engines/table-engines/integrations/mysql.md)
- [MongoDB](../../engines/table-engines/integrations/mongodb.md)
- [Redis](../../engines/table-engines/integrations/redis.md)
- [HDFS](../../engines/table-engines/integrations/hdfs.md)
- [S3](../../engines/table-engines/integrations/s3.md)
- [Kafka](../../engines/table-engines/integrations/kafka.md)

View File

@ -0,0 +1,51 @@
---
slug: /en/engines/table-engines/integrations/azureBlobStorage
sidebar_label: Azure Blob Storage
---
# AzureBlobStorage Table Engine
This engine provides an integration with [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs) ecosystem.
## Create Table
``` sql
CREATE TABLE azure_blob_storage_table (name String, value UInt32)
ENGINE = AzureBlobStorage(connection_string|storage_account_url, container_name, blobpath, [account_name, account_key, format, compression])
[PARTITION BY expr]
[SETTINGS ...]
```
### Engine parameters
- `connection_string|storage_account_url` — connection_string includes account name & key ([Create connection string](https://learn.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json#configure-a-connection-string-for-an-azure-storage-account)) or you could also provide the storage account url here and account name & account key as separate parameters (see parameters account_name & account_key)
- `container_name` - Container name
- `blobpath` - file path. Supports following wildcards in readonly mode: `*`, `?`, `{abc,def}` and `{N..M}` where `N`, `M` — numbers, `'abc'`, `'def'` — strings.
- `account_name` - if storage_account_url is used, then account name can be specified here
- `account_key` - if storage_account_url is used, then account key can be specified here
- `format` — The [format](/docs/en/interfaces/formats.md) of the file.
- `compression` — Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. By default, it will autodetect compression by file extension. (same as setting to `auto`).
**Example**
``` sql
CREATE TABLE test_table (key UInt64, data String)
ENGINE = AzureBlobStorage('DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://azurite1:10000/devstoreaccount1/;',
'test_container', 'test_table', 'CSV');
INSERT INTO test_table VALUES (1, 'a'), (2, 'b'), (3, 'c');
SELECT * FROM test_table;
```
```text
┌─key──┬─data──┐
│ 1 │ a │
│ 2 │ b │
│ 3 │ c │
└──────┴───────┘
```
## See also
[Azure Blob Storage Table Function](/docs/en/sql-reference/table-functions/azureBlobStorage)

View File

@ -120,3 +120,93 @@ Values can be updated using the `ALTER TABLE` query. The primary key cannot be u
```sql
ALTER TABLE test UPDATE v1 = v1 * 10 + 2 WHERE key LIKE 'some%' AND v3 > 3.1;
```
### Joins
A special `direct` join with EmbeddedRocksDB tables is supported.
This direct join avoids forming a hash table in memory and accesses
the data directly from the EmbeddedRocksDB.
With large joins you may see much lower memory usage with direct joins
because the hash table is not created.
To enable direct joins:
```sql
SET join_algorithm = 'direct, hash'
```
:::tip
When the `join_algorithm` is set to `direct, hash`, direct joins will be used
when possible, and hash otherwise.
:::
#### Example
##### Create and populate an EmbeddedRocksDB table:
```sql
CREATE TABLE rdb
(
`key` UInt32,
`value` Array(UInt32),
`value2` String
)
ENGINE = EmbeddedRocksDB
PRIMARY KEY key
```
```sql
INSERT INTO rdb
SELECT
toUInt32(sipHash64(number) % 10) as key,
[key, key+1] as value,
('val2' || toString(key)) as value2
FROM numbers_mt(10);
```
##### Create and populate a table to join with table `rdb`:
```sql
CREATE TABLE t2
(
`k` UInt16
)
ENGINE = TinyLog
```
```sql
INSERT INTO t2 SELECT number AS k
FROM numbers_mt(10)
```
##### Set the join algorithm to `direct`:
```sql
SET join_algorithm = 'direct'
```
##### An INNER JOIN:
```sql
SELECT *
FROM
(
SELECT k AS key
FROM t2
) AS t2
INNER JOIN rdb ON rdb.key = t2.key
ORDER BY key ASC
```
```response
┌─key─┬─rdb.key─┬─value──┬─value2─┐
│ 0 │ 0 │ [0,1] │ val20 │
│ 2 │ 2 │ [2,3] │ val22 │
│ 3 │ 3 │ [3,4] │ val23 │
│ 6 │ 6 │ [6,7] │ val26 │
│ 7 │ 7 │ [7,8] │ val27 │
│ 8 │ 8 │ [8,9] │ val28 │
│ 9 │ 9 │ [9,10] │ val29 │
└─────┴─────────┴────────┴────────┘
```
### More information on Joins
- [`join_algorithm` setting](/docs/en/operations/settings/settings.md#settings-join_algorithm)
- [JOIN clause](/docs/en/sql-reference/statements/select/join.md)

View File

@ -156,7 +156,7 @@ Similar to GraphiteMergeTree, the HDFS engine supports extended configuration us
| rpc\_client\_connect\_timeout | 600 * 1000 |
| rpc\_client\_read\_timeout | 3600 * 1000 |
| rpc\_client\_write\_timeout | 3600 * 1000 |
| rpc\_client\_socekt\_linger\_timeout | -1 |
| rpc\_client\_socket\_linger\_timeout | -1 |
| rpc\_client\_connect\_retry | 10 |
| rpc\_client\_timeout | 3600 * 1000 |
| dfs\_default\_replica | 3 |
@ -176,7 +176,7 @@ Similar to GraphiteMergeTree, the HDFS engine supports extended configuration us
| output\_write\_timeout | 3600 * 1000 |
| output\_close\_timeout | 3600 * 1000 |
| output\_packetpool\_size | 1024 |
| output\_heeartbeat\_interval | 10 * 1000 |
| output\_heartbeat\_interval | 10 * 1000 |
| dfs\_client\_failover\_max\_attempts | 15 |
| dfs\_client\_read\_shortcircuit\_streams\_cache\_size | 256 |
| dfs\_client\_socketcache\_expiryMsec | 3000 |

View File

@ -6,7 +6,7 @@ sidebar_label: Hive
# Hive
The Hive engine allows you to perform `SELECT` quries on HDFS Hive table. Currently it supports input formats as below:
The Hive engine allows you to perform `SELECT` queries on HDFS Hive table. Currently it supports input formats as below:
- Text: only supports simple scalar column types except `binary`

View File

@ -6,24 +6,4 @@ sidebar_label: Integrations
# Table Engines for Integrations
ClickHouse provides various means for integrating with external systems, including table engines. Like with all other table engines, the configuration is done using `CREATE TABLE` or `ALTER TABLE` queries. Then from a user perspective, the configured integration looks like a normal table, but queries to it are proxied to the external system. This transparent querying is one of the key advantages of this approach over alternative integration methods, like dictionaries or table functions, which require to use custom query methods on each use.
List of supported integrations:
- [ODBC](../../../engines/table-engines/integrations/odbc.md)
- [JDBC](../../../engines/table-engines/integrations/jdbc.md)
- [MySQL](../../../engines/table-engines/integrations/mysql.md)
- [MongoDB](../../../engines/table-engines/integrations/mongodb.md)
- [HDFS](../../../engines/table-engines/integrations/hdfs.md)
- [S3](../../../engines/table-engines/integrations/s3.md)
- [Kafka](../../../engines/table-engines/integrations/kafka.md)
- [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md)
- [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md)
- [PostgreSQL](../../../engines/table-engines/integrations/postgresql.md)
- [SQLite](../../../engines/table-engines/integrations/sqlite.md)
- [Hive](../../../engines/table-engines/integrations/hive.md)
- [ExternalDistributed](../../../engines/table-engines/integrations/ExternalDistributed.md)
- [MaterializedPostgreSQL](../../../engines/table-engines/integrations/materialized-postgresql.md)
- [NATS](../../../engines/table-engines/integrations/nats.md)
- [DeltaLake](../../../engines/table-engines/integrations/deltalake.md)
- [Hudi](../../../engines/table-engines/integrations/hudi.md)
ClickHouse provides various means for integrating with external systems, including table engines. Like with all other table engines, the configuration is done using `CREATE TABLE` or `ALTER TABLE` queries. Then from a user perspective, the configured integration looks like a normal table, but queries to it are proxied to the external system. This transparent querying is one of the key advantages of this approach over alternative integration methods, like dictionaries or table functions, which require the use of custom query methods on each use.

View File

@ -10,7 +10,7 @@ This engine allows integrating ClickHouse with [NATS](https://nats.io/).
`NATS` lets you:
- Publish or subcribe to message subjects.
- Publish or subscribe to message subjects.
- Process new messages as they become available.
## Creating a Table {#table_engine-redisstreams-creating-a-table}
@ -46,7 +46,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
Required parameters:
- `nats_url` host:port (for example, `localhost:5672`)..
- `nats_subjects` List of subject for NATS table to subscribe/publsh to. Supports wildcard subjects like `foo.*.bar` or `baz.>`
- `nats_subjects` List of subject for NATS table to subscribe/publish to. Supports wildcard subjects like `foo.*.bar` or `baz.>`
- `nats_format` Message format. Uses the same notation as the SQL `FORMAT` function, such as `JSONEachRow`. For more information, see the [Formats](../../../interfaces/formats.md) section.
Optional parameters:

View File

@ -57,7 +57,7 @@ or via config (since version 21.11):
</named_collections>
```
Some parameters can be overriden by key value arguments:
Some parameters can be overridden by key value arguments:
``` sql
SELECT * FROM postgresql(postgres1, schema='schema1', table='table1');
```

View File

@ -42,7 +42,6 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
[rabbitmq_queue_consume = false,]
[rabbitmq_address = '',]
[rabbitmq_vhost = '/',]
[rabbitmq_queue_consume = false,]
[rabbitmq_username = '',]
[rabbitmq_password = '',]
[rabbitmq_commit_on_select = false,]

View File

@ -0,0 +1,119 @@
---
slug: /en/sql-reference/table-functions/redis
sidebar_position: 43
sidebar_label: Redis
---
# Redis
This engine allows integrating ClickHouse with [Redis](https://redis.io/). For Redis takes kv model, we strongly recommend you only query it in a point way, such as `where k=xx` or `where k in (xx, xx)`.
## Creating a Table {#creating-a-table}
``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
name1 [type1],
name2 [type2],
...
) ENGINE = Redis(host:port[, db_index[, password[, pool_size]]]) PRIMARY KEY(primary_key_name);
```
**Engine Parameters**
- `host:port` — Redis server address, you can ignore port and default Redis port 6379 will be used.
- `db_index` — Redis db index range from 0 to 15, default is 0.
- `password` — User password, default is blank string.
- `pool_size` — Redis max connection pool size, default is 16.
- `primary_key_name` - any column name in the column list.
- `primary` must be specified, it supports only one column in the primary key. The primary key will be serialized in binary as a Redis key.
- columns other than the primary key will be serialized in binary as Redis value in corresponding order.
- queries with key equals or in filtering will be optimized to multi keys lookup from Redis. If queries without filtering key full table scan will happen which is a heavy operation.
## Usage Example {#usage-example}
Create a table in ClickHouse which allows to read data from Redis:
``` sql
CREATE TABLE redis_table
(
`k` String,
`m` String,
`n` UInt32
)
ENGINE = Redis('redis1:6379') PRIMARY KEY(k);
```
Insert:
```sql
INSERT INTO redis_table Values('1', 1, '1', 1.0), ('2', 2, '2', 2.0);
```
Query:
``` sql
SELECT COUNT(*) FROM redis_table;
```
``` text
┌─count()─┐
│ 2 │
└─────────┘
```
``` sql
SELECT * FROM redis_table WHERE key='1';
```
```text
┌─key─┬─v1─┬─v2─┬─v3─┐
│ 1 │ 1 │ 1 │ 1 │
└─────┴────┴────┴────┘
```
``` sql
SELECT * FROM redis_table WHERE v1=2;
```
```text
┌─key─┬─v1─┬─v2─┬─v3─┐
│ 2 │ 2 │ 2 │ 2 │
└─────┴────┴────┴────┘
```
Update:
Note that the primary key cannot be updated.
```sql
ALTER TABLE redis_table UPDATE v1=2 WHERE key='1';
```
Delete:
```sql
ALTER TABLE redis_table DELETE WHERE key='1';
```
Truncate:
Flush Redis db asynchronously. Also `Truncate` support SYNC mode.
```sql
TRUNCATE TABLE redis_table SYNC;
```
## Limitations {#limitations}
Redis engine also supports scanning queries, such as `where k > xx`, but it has some limitations:
1. Scanning query may produce some duplicated keys in a very rare case when it is rehashing. See details in [Redis Scan](https://github.com/redis/redis/blob/e4d183afd33e0b2e6e8d1c79a832f678a04a7886/src/dict.c#L1186-L1269)
2. During the scanning, keys could be created and deleted, so the resulting dataset can not represent a valid point in time.

View File

@ -23,7 +23,7 @@ CREATE TABLE s3_engine_table (name String, value UInt32)
- `NOSIGN` - If this keyword is provided in place of credentials, all the requests will not be signed.
- `format` — The [format](../../../interfaces/formats.md#formats) of the file.
- `aws_access_key_id`, `aws_secret_access_key` - Long-term credentials for the [AWS](https://aws.amazon.com/) account user. You can use these to authenticate your requests. Parameter is optional. If credentials are not specified, they are used from the configuration file. For more information see [Using S3 for Data Storage](../mergetree-family/mergetree.md#table_engine-mergetree-s3).
- `compression` — Compression type. Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. Parameter is optional. By default, it will autodetect compression by file extension.
- `compression` — Compression type. Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. Parameter is optional. By default, it will auto-detect compression by file extension.
### PARTITION BY
@ -140,8 +140,8 @@ The following settings can be set before query execution or placed into configur
- `s3_max_get_rps` — Maximum GET requests per second rate before throttling. Default value is `0` (unlimited).
- `s3_max_get_burst` — Max number of requests that can be issued simultaneously before hitting request per second limit. By default (`0` value) equals to `s3_max_get_rps`.
- `s3_upload_part_size_multiply_factor` - Multiply `s3_min_upload_part_size` by this factor each time `s3_multiply_parts_count_threshold` parts were uploaded from a single write to S3. Default values is `2`.
- `s3_upload_part_size_multiply_parts_count_threshold` - Each time this number of parts was uploaded to S3 `s3_min_upload_part_size multiplied` by `s3_upload_part_size_multiply_factor`. DEfault value us `500`.
- `s3_max_inflight_parts_for_one_file` - Limits the number of put requests that can be run concurenly for one object. Its number should be limited. The value `0` means unlimited. Default value is `20`. Each inflight part has a buffer with size `s3_min_upload_part_size` for the first `s3_upload_part_size_multiply_factor` parts and more when file is big enought, see `upload_part_size_multiply_factor`. With default settings one uploaded file consumes not more than `320Mb` for a file which is less than `8G`. The consumption is greater for a larger file.
- `s3_upload_part_size_multiply_parts_count_threshold` - Each time this number of parts was uploaded to S3 `s3_min_upload_part_size multiplied` by `s3_upload_part_size_multiply_factor`. Default value us `500`.
- `s3_max_inflight_parts_for_one_file` - Limits the number of put requests that can be run concurrently for one object. Its number should be limited. The value `0` means unlimited. Default value is `20`. Each in-flight part has a buffer with size `s3_min_upload_part_size` for the first `s3_upload_part_size_multiply_factor` parts and more when file is big enough, see `upload_part_size_multiply_factor`. With default settings one uploaded file consumes not more than `320Mb` for a file which is less than `8G`. The consumption is greater for a larger file.
Security consideration: if malicious user can specify arbitrary S3 URLs, `s3_max_redirects` must be set to zero to avoid [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery) attacks; or alternatively, `remote_host_filter` must be specified in server configuration.

View File

@ -109,7 +109,7 @@ INSERT INTO test.visits (StartDate, CounterID, Sign, UserID)
VALUES (1667446031, 1, 6, 3)
```
The data are inserted in both the table and the materialized view `test.mv_visits`.
The data is inserted in both the table and the materialized view `test.mv_visits`.
To get the aggregated data, we need to execute a query such as `SELECT ... GROUP BY ...` from the materialized view `test.mv_visits`:

View File

@ -1,147 +1,211 @@
# Approximate Nearest Neighbor Search Indexes [experimental] {#table_engines-ANNIndex}
The main task that indexes achieve is to quickly find nearest neighbors for multidimensional data. An example of such a problem can be finding similar pictures (texts) for a given picture (text). That problem can be reduced to finding the nearest [embeddings](https://cloud.google.com/architecture/overview-extracting-and-serving-feature-embeddings-for-machine-learning). They can be created from data using [UDF](/docs/en/sql-reference/functions/index.md/#executable-user-defined-functions).
Nearest neighborhood search is the problem of finding the M closest points for a given point in an N-dimensional vector space. The most
straightforward approach to solve this problem is a brute force search where the distance between all points in the vector space and the
reference point is computed. This method guarantees perfect accuracy but it is usually too slow for practical applications. Thus, nearest
neighborhood search problems are often solved with [approximative algorithms](https://github.com/erikbern/ann-benchmarks). Approximative
nearest neighborhood search techniques, in conjunction with [embedding
methods](https://cloud.google.com/architecture/overview-extracting-and-serving-feature-embeddings-for-machine-learning) allow to search huge
amounts of media (pictures, songs, articles, etc.) in milliseconds.
The next queries find the closest neighbors in N-dimensional space using the L2 (Euclidean) distance:
``` sql
SELECT *
FROM table_name
WHERE L2Distance(Column, Point) < MaxDistance
LIMIT N
```
Blogs:
- [Vector Search with ClickHouse - Part 1](https://clickhouse.com/blog/vector-search-clickhouse-p1)
- [Vector Search with ClickHouse - Part 2](https://clickhouse.com/blog/vector-search-clickhouse-p2)
In terms of SQL, the nearest neighborhood problem can be expressed as follows:
``` sql
SELECT *
FROM table_name
ORDER BY L2Distance(Column, Point)
FROM table
ORDER BY Distance(vectors, Point)
LIMIT N
```
But it will take some time for execution because of the long calculation of the distance between `TargetEmbedding` and all other vectors. This is where ANN indexes can help. They store a compact approximation of the search space (e.g. using clustering, search trees, etc.) and are able to compute approximate neighbors quickly.
## Indexes Structure
`vectors` contains N-dimensional values of type [Array](../../../sql-reference/data-types/array.md) or
[Tuple](../../../sql-reference/data-types/tuple.md), for example embeddings. Function `Distance` computes the distance between two vectors.
Often, the the Euclidean (L2) distance is chosen as distance function but [other
distance functions](/docs/en/sql-reference/functions/distance-functions.md) are also possible. `Point` is the reference point, e.g. `(0.17,
0.33, ...)`, and `N` limits the number of search results.
Approximate Nearest Neighbor Search Indexes (`ANNIndexes`) are similar to skip indexes. They are constructed by some granules and determine which of them should be skipped. Compared to skip indices, ANN indices use their results not only to skip some group of granules, but also to select particular granules from a set of granules.
An alternative formulation of the nearest neighborhood search problem looks as follows:
`ANNIndexes` are designed to speed up two types of queries:
``` sql
SELECT *
FROM table
WHERE Distance(vectors, Point) < MaxDistance
LIMIT N
```
- ###### Type 1: Where
``` sql
SELECT *
FROM table_name
WHERE DistanceFunction(Column, Point) < MaxDistance
LIMIT N
```
- ###### Type 2: Order by
``` sql
SELECT *
FROM table_name [WHERE ...]
ORDER BY DistanceFunction(Column, Point)
LIMIT N
```
While the first query returns the top-`N` closest points to the reference point, the second query returns all points closer to the reference
point than a maximally allowed radius `MaxDistance`. Parameter `N` limits the number of returned values which is useful for situations where
`MaxDistance` is difficult to determine in advance.
In these queries, `DistanceFunction` is selected from [distance functions](/docs/en/sql-reference/functions/distance-functions.md). `Point` is a known vector (something like `(0.1, 0.1, ... )`). To avoid writing large vectors, use [client parameters](/docs/en//interfaces/cli.md#queries-with-parameters-cli-queries-with-parameters). `Value` - a float value that will bound the neighbourhood.
With brute force search, both queries are expensive (linear in the number of points) because the distance between all points in `vectors` and
`Point` must be computed. To speed this process up, Approximate Nearest Neighbor Search Indexes (ANN indexes) store a compact representation
of the search space (using clustering, search trees, etc.) which allows to compute an approximate answer much quicker (in sub-linear time).
:::note
ANN index can't speed up query that satisfies both types (`where + order by`, only one of them). All queries must have the limit, as algorithms are used to find nearest neighbors and need a specific number of them.
:::
# Creating and Using ANN Indexes
:::note
Indexes are applied only to queries with a limit less than the `max_limit_for_ann_queries` setting. This helps to avoid memory overflows in queries with a large limit. `max_limit_for_ann_queries` setting can be changed if you know you can provide enough memory. The default value is `1000000`.
:::
Both types of queries are handled the same way. The indexes get `n` neighbors (where `n` is taken from the `LIMIT` clause) and work with them. In `ORDER BY` query they remember the numbers of all parts of the granule that have at least one of neighbor. In `WHERE` query they remember only those parts that satisfy the requirements.
## Create table with ANNIndex
This feature is disabled by default. To enable it, set `allow_experimental_annoy_index` to 1. Also, this feature is disabled on ARM, due to likely problems with the algorithm.
Syntax to create an ANN index over an [Array](../../../sql-reference/data-types/array.md) column:
```sql
CREATE TABLE t
CREATE TABLE table
(
`id` Int64,
`data` Tuple(Float32, Float32, Float32),
INDEX ann_index_name data TYPE ann_index_type(ann_index_parameters) GRANULARITY N
`vectors` Array(Float32),
INDEX [ann_index_name vectors TYPE [ann_index_type]([ann_index_parameters]) [GRANULARITY [N]]
)
ENGINE = MergeTree
ORDER BY id;
```
Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
```sql
CREATE TABLE t
CREATE TABLE table
(
`id` Int64,
`data` Array(Float32),
INDEX ann_index_name data TYPE ann_index_type(ann_index_parameters) GRANULARITY N
`vectors` Tuple(Float32[, Float32[, ...]]),
INDEX [ann_index_name] vectors TYPE [ann_index_type]([ann_index_parameters]) [GRANULARITY [N]]
)
ENGINE = MergeTree
ORDER BY id;
```
With greater `GRANULARITY` indexes remember the data structure better. The `GRANULARITY` indicates how many granules will be used to construct the index. The more data is provided for the index, the more of it can be handled by one index and the more chances that with the right hyperparameters the index will remember the data structure better. But some indexes can't be built if they don't have enough data, so this granule will always participate in the query. For more information, see the description of indexes.
ANN indexes are built during column insertion and merge. As a result, `INSERT` and `OPTIMIZE` statements will be slower than for ordinary
tables. ANNIndexes are ideally used only with immutable or rarely changed data, respectively when are far more read requests than write
requests.
As the indexes are built only during insertions into table, `INSERT` and `OPTIMIZE` queries are slower than for ordinary table. At this stage indexes remember all the information about the given data. ANNIndexes should be used if you have immutable or rarely changed data and many read requests.
ANN indexes support two types of queries:
You can create your table with index which uses certain algorithm. Now only indices based on the following algorithms are supported:
- ORDER BY queries:
``` sql
SELECT *
FROM table
[WHERE ...]
ORDER BY Distance(vectors, Point)
LIMIT N
```
- WHERE queries:
``` sql
SELECT *
FROM table
WHERE Distance(vectors, Point) < MaxDistance
LIMIT N
```
:::tip
To avoid writing out large vectors, you can use [query
parameters](/docs/en/interfaces/cli.md#queries-with-parameters-cli-queries-with-parameters), e.g.
```bash
clickhouse-client --param_vec='hello' --query="SELECT * FROM table WHERE L2Distance(vectors, {vec: Array(Float32)}) < 1.0"
```
:::
**Restrictions**: Queries that contain both a `WHERE Distance(vectors, Point) < MaxDistance` and an `ORDER BY Distance(vectors, Point)`
clause cannot use ANN indexes. Also, the approximate algorithms used to determine the nearest neighbors require a limit, hence queries
without `LIMIT` clause cannot utilize ANN indexes. Also ANN indexes are only used if the query has a `LIMIT` value smaller than setting
`max_limit_for_ann_queries` (default: 1 million rows). This is a safeguard to prevent large memory allocations by external libraries for
approximate neighbor search.
**Differences to Skip Indexes** Similar to regular [skip indexes](https://clickhouse.com/docs/en/optimize/skipping-indexes), ANN indexes are
constructed over granules and each indexed block consists of `GRANULARITY = [N]`-many granules (`[N]` = 1 by default for normal skip
indexes). For example, if the primary index granularity of the table is 8192 (setting `index_granularity = 8192`) and `GRANULARITY = 2`,
then each indexed block will contain 16384 rows. However, data structures and algorithms for approximate neighborhood search (usually
provided by external libraries) are inherently row-oriented. They store a compact representation of a set of rows and also return rows for
ANN queries. This causes some rather unintuitive differences in the way ANN indexes behave compared to normal skip indexes.
When a user defines a ANN index on a column, ClickHouse internally creates a ANN "sub-index" for each index block. The sub-index is "local"
in the sense that it only knows about the rows of its containing index block. In the previous example and assuming that a column has 65536
rows, we obtain four index blocks (spanning eight granules) and a ANN sub-index for each index block. A sub-index is theoretically able to
return the rows with the N closest points within its index block directly. However, since ClickHouse loads data from disk to memory at the
granularity of granules, sub-indexes extrapolate matching rows to granule granularity. This is different from regular skip indexes which
skip data at the granularity of index blocks.
The `GRANULARITY` parameter determines how many ANN sub-indexes are created. Bigger `GRANULARITY` values mean fewer but larger ANN
sub-indexes, up to the point where a column (or a column's data part) has only a single sub-index. In that case, the sub-index has a
"global" view of all column rows and can directly return all granules of the column (part) with relevant rows (there are at most
`LIMIT [N]`-many such granules). In a second step, ClickHouse will load these granules and identify the actually best rows by performing a
brute-force distance calculation over all rows of the granules. With a small `GRANULARITY` value, each of the sub-indexes returns up to
`LIMIT N`-many granules. As a result, more granules need to be loaded and post-filtered. Note that the search accuracy is with both cases
equally good, only the processing performance differs. It is generally recommended to use a large `GRANULARITY` for ANN indexes and fall
back to a smaller `GRANULARITY` values only in case of problems like excessive memory consumption of the ANN structures. If no `GRANULARITY`
was specified for ANN indexes, the default value is 100 million.
# Available ANN Indexes
# Index list
- [Annoy](/docs/en/engines/table-engines/mergetree-family/annindexes.md#annoy-annoy)
# Annoy {#annoy}
Implementation of the algorithm was taken from [this repository](https://github.com/spotify/annoy).
## Annoy {#annoy}
Short description of the algorithm:
The algorithm recursively divides in half all space by random linear surfaces (lines in 2D, planes in 3D etc.). Thus it makes tree of polyhedrons and points that they contains. Repeating the operation several times for greater accuracy it creates a forest.
To find K Nearest Neighbours it goes down through the trees and fills the buffer of closest points using the priority queue of polyhedrons. Next, it sorts buffer and return the nearest K points.
Annoy indexes are currently experimental, to use them you first need to `SET allow_experimental_annoy_index = 1`. They are also currently
disabled on ARM due to memory safety problems with the algorithm.
This type of ANN index implements [the Annoy algorithm](https://github.com/spotify/annoy) which is based on a recursive division of the
space in random linear surfaces (lines in 2D, planes in 3D etc.).
<div class='vimeo-container'>
<iframe src="//www.youtube.com/embed/QkCCyLW0ehU"
width="640"
height="360"
frameborder="0"
allow="autoplay;
fullscreen;
picture-in-picture"
allowfullscreen>
</iframe>
</div>
Syntax to create an Annoy index over an [Array](../../../sql-reference/data-types/array.md) column:
__Examples__:
```sql
CREATE TABLE t
CREATE TABLE table
(
id Int64,
data Tuple(Float32, Float32, Float32),
INDEX ann_index_name data TYPE annoy(NumTrees, DistanceName) GRANULARITY N
vectors Array(Float32),
INDEX [ann_index_name] vectors TYPE annoy([Distance[, NumTrees]]) [GRANULARITY N]
)
ENGINE = MergeTree
ORDER BY id;
```
Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
```sql
CREATE TABLE t
CREATE TABLE table
(
id Int64,
data Array(Float32),
INDEX ann_index_name data TYPE annoy(NumTrees, DistanceName) GRANULARITY N
vectors Tuple(Float32[, Float32[, ...]]),
INDEX [ann_index_name] vectors TYPE annoy([Distance[, NumTrees]]) [GRANULARITY N]
)
ENGINE = MergeTree
ORDER BY id;
```
Annoy currently supports `L2Distance` and `cosineDistance` as distance function `Distance`. If no distance function was specified during
index creation, `L2Distance` is used as default. Parameter `NumTrees` is the number of trees which the algorithm creates (default if not
specified: 100). Higher values of `NumTree` mean more accurate search results but slower index creation / query times (approximately
linearly) as well as larger index sizes.
:::note
Table with array field will work faster, but all arrays **must** have same length. Use [CONSTRAINT](/docs/en/sql-reference/statements/create/table.md#constraints) to avoid errors. For example, `CONSTRAINT constraint_name_1 CHECK length(data) = 256`.
Indexes over columns of type `Array` will generally work faster than indexes on `Tuple` columns. All arrays **must** have same length. Use
[CONSTRAINT](/docs/en/sql-reference/statements/create/table.md#constraints) to avoid errors. For example, `CONSTRAINT constraint_name_1
CHECK length(vectors) = 256`.
:::
Parameter `NumTrees` is the number of trees which the algorithm will create. The bigger it is, the slower (approximately linear) it works (in both `CREATE` and `SELECT` requests), but the better accuracy you get (adjusted for randomness). By default it is set to `100`. Parameter `DistanceName` is name of distance function. By default it is set to `L2Distance`. It can be set without changing first parameter, for example
Setting `annoy_index_search_k_nodes` (default: `NumTrees * LIMIT`) determines how many tree nodes are inspected during SELECTs. Larger
values mean more accurate results at the cost of longer query runtime:
```sql
CREATE TABLE t
(
id Int64,
data Array(Float32),
INDEX ann_index_name data TYPE annoy('cosineDistance') GRANULARITY N
)
ENGINE = MergeTree
ORDER BY id;
```
Annoy supports `L2Distance` and `cosineDistance`.
In the `SELECT` in the settings (`ann_index_select_query_params`) you can specify the size of the internal buffer (more details in the description above or in the [original repository](https://github.com/spotify/annoy)). During the query it will inspect up to `search_k` nodes which defaults to `n_trees * n` if not provided. `search_k` gives you a run-time tradeoff between better accuracy and speed.
__Example__:
``` sql
SELECT *
FROM table_name [WHERE ...]
ORDER BY L2Distance(Column, Point)
FROM table_name
ORDER BY L2Distance(vectors, Point)
LIMIT N
SETTING ann_index_select_query_params=`k_search=100`
SETTINGS annoy_index_search_k_nodes=100;
```

View File

@ -165,7 +165,7 @@ Performance of such a query heavily depends on the table layout. Because of that
The key factors for a good performance:
- number of partitions involved in the query should be sufficiently large (more than `max_threads / 2`), otherwise query will underutilize the machine
- number of partitions involved in the query should be sufficiently large (more than `max_threads / 2`), otherwise query will under-utilize the machine
- partitions shouldn't be too small, so batch processing won't degenerate into row-by-row processing
- partitions should be comparable in size, so all threads will do roughly the same amount of work

View File

@ -15,6 +15,18 @@ tokenized cells of the string column. For example, the string cell "I will be a
" wi", "wil", "ill", "ll ", "l b", " be" etc. The more fine-granular the input strings are tokenized, the bigger but also the more
useful the resulting inverted index will be.
<div class='vimeo-container'>
<iframe src="//www.youtube.com/embed/O_MnyUkrIq8"
width="640"
height="360"
frameborder="0"
allow="autoplay;
fullscreen;
picture-in-picture"
allowfullscreen>
</iframe>
</div>
:::note
Inverted indexes are experimental and should not be used in production environments yet. They may change in the future in backward-incompatible
ways, for example with respect to their DDL/DQL syntax or performance/compression characteristics.

Some files were not shown because too many files have changed in this diff Show More