Merge branch 'master' into rocks-db-analyzer

This commit is contained in:
Nikita Mikhaylov 2023-11-07 00:44:49 +01:00 committed by GitHub
commit b696540a5a
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
280 changed files with 5351 additions and 1776 deletions

View File

@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v23.10, 2023-11-02](#2310)**<br/>
**[ClickHouse release v23.9, 2023-09-28](#239)**<br/>
**[ClickHouse release v23.8 LTS, 2023-08-31](#238)**<br/>
**[ClickHouse release v23.7, 2023-07-27](#237)**<br/>
@ -12,6 +13,184 @@
# 2023 Changelog
### ClickHouse release 23.10, 2023-11-02
#### Backward Incompatible Change
* There is no longer an option to automatically remove broken data parts. This closes [#55174](https://github.com/ClickHouse/ClickHouse/issues/55174). [#55184](https://github.com/ClickHouse/ClickHouse/pull/55184) ([Alexey Milovidov](https://github.com/alexey-milovidov)). [#55557](https://github.com/ClickHouse/ClickHouse/pull/55557) ([Jihyuk Bok](https://github.com/tomahawk28)).
* The obsolete in-memory data parts can no longer be read from the write-ahead log. If you have configured in-memory parts before, they have to be removed before the upgrade. [#55186](https://github.com/ClickHouse/ClickHouse/pull/55186) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove the integration with Meilisearch. Reason: it was compatible only with the old version 0.18. The recent version of Meilisearch changed the protocol and does not work anymore. Note: we would appreciate it if you help to return it back. [#55189](https://github.com/ClickHouse/ClickHouse/pull/55189) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Rename directory monitor concept into background INSERT. All the settings `*directory_monitor*` had been renamed to `distributed_background_insert*`. *Backward compatibility should be preserved* (since old settings had been added as an alias). [#55978](https://github.com/ClickHouse/ClickHouse/pull/55978) ([Azat Khuzhin](https://github.com/azat)).
* Do not interpret the `send_timeout` set on the client side as the `receive_timeout` on the server side and vise-versa. [#56035](https://github.com/ClickHouse/ClickHouse/pull/56035) ([Azat Khuzhin](https://github.com/azat)).
* Comparison of time intervals with different units will throw an exception. This closes [#55942](https://github.com/ClickHouse/ClickHouse/issues/55942). You might have occasionally rely on the previous behavior when the underlying numeric values were compared regardless of the units. [#56090](https://github.com/ClickHouse/ClickHouse/pull/56090) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Rewrited the experimental `S3Queue` table engine completely: changed the way we keep information in zookeeper which allows to make less zookeeper requests, added caching of zookeeper state in cases when we know the state will not change, improved the polling from s3 process to make it less aggressive, changed the way ttl and max set for trached files is maintained, now it is a background process. Added `system.s3queue` and `system.s3queue_log` tables. Closes [#54998](https://github.com/ClickHouse/ClickHouse/issues/54998). [#54422](https://github.com/ClickHouse/ClickHouse/pull/54422) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### New Feature
* Add function `arrayFold(accumulator, x1, ..., xn -> expression, initial, array1, ..., arrayn)` which applies a lambda function to multiple arrays of the same cardinality and collects the result in an accumulator. [#49794](https://github.com/ClickHouse/ClickHouse/pull/49794) ([Lirikl](https://github.com/Lirikl)).
* Support for `Npy` format. `SELECT * FROM file('example_array.npy', Npy)`. [#55982](https://github.com/ClickHouse/ClickHouse/pull/55982) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* If a table has a space-filling curve in its key, e.g., `ORDER BY mortonEncode(x, y)`, the conditions on its arguments, e.g., `x >= 10 AND x <= 20 AND y >= 20 AND y <= 30` can be used for indexing. A setting `analyze_index_with_space_filling_curves` is added to enable or disable this analysis. This closes [#41195](https://github.com/ClickHouse/ClickHouse/issue/41195). Continuation of [#4538](https://github.com/ClickHouse/ClickHouse/pull/4538). Continuation of [#6286](https://github.com/ClickHouse/ClickHouse/pull/6286). Continuation of [#28130](https://github.com/ClickHouse/ClickHouse/pull/28130). Continuation of [#41753](https://github.com/ClickHouse/ClickHouse/pull/#41753). [#55642](https://github.com/ClickHouse/ClickHouse/pull/55642) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* A new setting called `force_optimize_projection_name`, it takes a name of projection as an argument. If it's value set to a non-empty string, ClickHouse checks that this projection is used in the query at least once. Closes [#55331](https://github.com/ClickHouse/ClickHouse/issues/55331). [#56134](https://github.com/ClickHouse/ClickHouse/pull/56134) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Support asynchronous inserts with external data via native protocol. Previously it worked only if data is inlined into query. [#54730](https://github.com/ClickHouse/ClickHouse/pull/54730) ([Anton Popov](https://github.com/CurtizJ)).
* Added aggregation function `lttb` which uses the [Largest-Triangle-Three-Buckets](https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf) algorithm for downsampling data for visualization. [#53145](https://github.com/ClickHouse/ClickHouse/pull/53145) ([Sinan](https://github.com/sinsinan)).
* Query`CHECK TABLE` has better performance and usability (sends progress updates, cancellable). Support checking particular part with `CHECK TABLE ... PART 'part_name'`. [#53404](https://github.com/ClickHouse/ClickHouse/pull/53404) ([vdimir](https://github.com/vdimir)).
* Added function `jsonMergePatch`. When working with JSON data as strings, it provides a way to merge these strings (of JSON objects) together to form a single string containing a single JSON object. [#54364](https://github.com/ClickHouse/ClickHouse/pull/54364) ([Memo](https://github.com/Joeywzr)).
* The second part of Kusto Query Language dialect support. [Phase 1 implementation ](https://github.com/ClickHouse/ClickHouse/pull/37961) has been merged. [#42510](https://github.com/ClickHouse/ClickHouse/pull/42510) ([larryluogit](https://github.com/larryluogit)).
* Added a new SQL function, `arrayRandomSample(arr, k)` which returns a sample of k elements from the input array. Similar functionality could previously be achieved only with less convenient syntax, e.g. "SELECT arrayReduce('groupArraySample(3)', range(10))". [#54391](https://github.com/ClickHouse/ClickHouse/pull/54391) ([itayisraelov](https://github.com/itayisraelov)).
* Introduce `-ArgMin`/`-ArgMax` aggregate combinators which allow to aggregate by min/max values only. One use case can be found in [#54818](https://github.com/ClickHouse/ClickHouse/issues/54818). This PR also reorganize combinators into dedicated folder. [#54947](https://github.com/ClickHouse/ClickHouse/pull/54947) ([Amos Bird](https://github.com/amosbird)).
* Allow to drop cache for Protobuf format with `SYSTEM DROP SCHEMA FORMAT CACHE [FOR Protobuf]`. [#55064](https://github.com/ClickHouse/ClickHouse/pull/55064) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Add external HTTP Basic authenticator. [#55199](https://github.com/ClickHouse/ClickHouse/pull/55199) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Added function `byteSwap` which reverses the bytes of unsigned integers. This is particularly useful for reversing values of types which are represented as unsigned integers internally such as IPv4. [#55211](https://github.com/ClickHouse/ClickHouse/pull/55211) ([Priyansh Agrawal](https://github.com/Priyansh121096)).
* Added function `formatQuery()` which returns a formatted version (possibly spanning multiple lines) of a SQL query string. Also added function `formatQuerySingleLine()` which does the same but the returned string will not contain linebreaks. [#55239](https://github.com/ClickHouse/ClickHouse/pull/55239) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Added `DWARF` input format that reads debug symbols from an ELF executable/library/object file. [#55450](https://github.com/ClickHouse/ClickHouse/pull/55450) ([Michael Kolupaev](https://github.com/al13n321)).
* Allow to save unparsed records and errors in RabbitMQ, NATS and FileLog engines. Add virtual columns `_error` and `_raw_message`(for NATS and RabbitMQ), `_raw_record` (for FileLog) that are filled when ClickHouse fails to parse new record. The behaviour is controlled under storage settings `nats_handle_error_mode` for NATS, `rabbitmq_handle_error_mode` for RabbitMQ, `handle_error_mode` for FileLog similar to `kafka_handle_error_mode`. If it's set to `default`, en exception will be thrown when ClickHouse fails to parse a record, if it's set to `stream`, erorr and raw record will be saved into virtual columns. Closes [#36035](https://github.com/ClickHouse/ClickHouse/issues/36035). [#55477](https://github.com/ClickHouse/ClickHouse/pull/55477) ([Kruglov Pavel](https://github.com/Avogar)).
* Keeper client improvement: add `get_all_children_number command` that returns number of all children nodes under a specific path. [#55485](https://github.com/ClickHouse/ClickHouse/pull/55485) ([guoxiaolong](https://github.com/guoxiaolongzte)).
* Keeper client improvement: add `get_direct_children_number` command that returns number of direct children nodes under a path. [#55898](https://github.com/ClickHouse/ClickHouse/pull/55898) ([xuzifu666](https://github.com/xuzifu666)).
* Add statement `SHOW SETTING setting_name` which is a simpler version of existing statement `SHOW SETTINGS`. [#55979](https://github.com/ClickHouse/ClickHouse/pull/55979) ([Maksim Kita](https://github.com/kitaisreal)).
* Added fields `substreams` and `filenames` to the `system.parts_columns` table. [#55108](https://github.com/ClickHouse/ClickHouse/pull/55108) ([Anton Popov](https://github.com/CurtizJ)).
* Add support for `SHOW MERGES` query. [#55815](https://github.com/ClickHouse/ClickHouse/pull/55815) ([megao](https://github.com/jetgm)).
* Introduce a setting `create_table_empty_primary_key_by_default` for default `ORDER BY ()`. [#55899](https://github.com/ClickHouse/ClickHouse/pull/55899) ([Srikanth Chekuri](https://github.com/srikanthccv)).
#### Performance Improvement
* Add option `query_plan_preserve_num_streams_after_window_functions` to preserve the number of streams after evaluating window functions to allow parallel stream processing. [#50771](https://github.com/ClickHouse/ClickHouse/pull/50771) ([frinkr](https://github.com/frinkr)).
* Release more streams if data is small. [#53867](https://github.com/ClickHouse/ClickHouse/pull/53867) ([Jiebin Sun](https://github.com/jiebinn)).
* RoaringBitmaps being optimized before serialization. [#55044](https://github.com/ClickHouse/ClickHouse/pull/55044) ([UnamedRus](https://github.com/UnamedRus)).
* Posting lists in inverted indexes are now optimized to use the smallest possible representation for internal bitmaps. Depending on the repetitiveness of the data, this may significantly reduce the space consumption of inverted indexes. [#55069](https://github.com/ClickHouse/ClickHouse/pull/55069) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Fix contention on Context lock, this significantly improves performance for a lot of short-running concurrent queries. [#55121](https://github.com/ClickHouse/ClickHouse/pull/55121) ([Maksim Kita](https://github.com/kitaisreal)).
* Improved the performance of inverted index creation by 30%. This was achieved by replacing `std::unordered_map` with `absl::flat_hash_map`. [#55210](https://github.com/ClickHouse/ClickHouse/pull/55210) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Support ORC filter push down (rowgroup level). [#55330](https://github.com/ClickHouse/ClickHouse/pull/55330) ([李扬](https://github.com/taiyang-li)).
* Improve performance of external aggregation with a lot of temporary files. [#55489](https://github.com/ClickHouse/ClickHouse/pull/55489) ([Maksim Kita](https://github.com/kitaisreal)).
* Set a reasonable size for the marks cache for secondary indices by default to avoid loading the marks over and over again. [#55654](https://github.com/ClickHouse/ClickHouse/pull/55654) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Avoid unnecessary reconstruction of index granules when reading skip indexes. This addresses [#55653](https://github.com/ClickHouse/ClickHouse/issues/55653#issuecomment-1763766009). [#55683](https://github.com/ClickHouse/ClickHouse/pull/55683) ([Amos Bird](https://github.com/amosbird)).
* Cache CAST function in set during execution to improve the performance of function `IN` when set element type doesn't exactly match column type. [#55712](https://github.com/ClickHouse/ClickHouse/pull/55712) ([Duc Canh Le](https://github.com/canhld94)).
* Performance improvement for `ColumnVector::insertMany` and `ColumnVector::insertManyFrom`. [#55714](https://github.com/ClickHouse/ClickHouse/pull/55714) ([frinkr](https://github.com/frinkr)).
* Optimized Map subscript operations by predicting the next row's key position and reduce the comparisons. [#55929](https://github.com/ClickHouse/ClickHouse/pull/55929) ([lgbo](https://github.com/lgbo-ustc)).
* Support struct fields pruning in Parquet (in previous versions it didn't work in some cases). [#56117](https://github.com/ClickHouse/ClickHouse/pull/56117) ([lgbo](https://github.com/lgbo-ustc)).
* Add the ability to tune the number of parallel replicas used in a query execution based on the estimation of rows to read. [#51692](https://github.com/ClickHouse/ClickHouse/pull/51692) ([Raúl Marín](https://github.com/Algunenano)).
* Optimized external aggregation memory consumption in case many temporary files were generated. [#54798](https://github.com/ClickHouse/ClickHouse/pull/54798) ([Nikita Taranov](https://github.com/nickitat)).
* Distributed queries executed in `async_socket_for_remote` mode (default) now respect `max_threads` limit. Previously, some queries could create excessive threads (up to `max_distributed_connections`), causing server performance issues. [#53504](https://github.com/ClickHouse/ClickHouse/pull/53504) ([filimonov](https://github.com/filimonov)).
* Caching skip-able entries while executing DDL from Zookeeper distributed DDL queue. [#54828](https://github.com/ClickHouse/ClickHouse/pull/54828) ([Duc Canh Le](https://github.com/canhld94)).
* Experimental inverted indexes do not store tokens with too many matches (i.e. row ids in the posting list). This saves space and avoids ineffective index lookups when sequential scans would be equally fast or faster. The previous heuristics (`density` parameter passed to the index definition) that controlled when tokens would not be stored was too confusing for users. A much simpler heuristics based on parameter `max_rows_per_postings_list` (default: 64k) is introduced which directly controls the maximum allowed number of row ids in a postings list. [#55616](https://github.com/ClickHouse/ClickHouse/pull/55616) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Improve write performance to `EmbeddedRocksDB` tables. [#55732](https://github.com/ClickHouse/ClickHouse/pull/55732) ([Duc Canh Le](https://github.com/canhld94)).
* Improved overall resilience for ClickHouse in case of many parts within partition (more than 1000). It might reduce the number of `TOO_MANY_PARTS` errors. [#55526](https://github.com/ClickHouse/ClickHouse/pull/55526) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Reduced memory consumption during loading of hierarchical dictionaries. [#55838](https://github.com/ClickHouse/ClickHouse/pull/55838) ([Nikita Taranov](https://github.com/nickitat)).
* All dictionaries support setting `dictionary_use_async_executor`. [#55839](https://github.com/ClickHouse/ClickHouse/pull/55839) ([vdimir](https://github.com/vdimir)).
* Prevent excesive memory usage when deserializing AggregateFunctionTopKGenericData. [#55947](https://github.com/ClickHouse/ClickHouse/pull/55947) ([Raúl Marín](https://github.com/Algunenano)).
* On a Keeper with lots of watches AsyncMetrics threads can consume 100% of CPU for noticable time in `DB::KeeperStorage::getSessionsWithWatchesCount()`. The fix is to avoid traversing heavy `watches` and `list_watches` sets. [#56054](https://github.com/ClickHouse/ClickHouse/pull/56054) ([Alexander Gololobov](https://github.com/davenger)).
* Add setting `optimize_trivial_approximate_count_query` to use `count()` approximation for storage EmbeddedRocksDB. Enable trivial count for StorageJoin. [#55806](https://github.com/ClickHouse/ClickHouse/pull/55806) ([Duc Canh Le](https://github.com/canhld94)).
#### Improvement
* Functions `toDayOfWeek()` (MySQL alias: `DAYOFWEEK()`), `toYearWeek()` (`YEARWEEK()`) and `toWeek()` (`WEEK()`) now supports `String` arguments. This makes its behavior consistent with MySQL's behavior. [#55589](https://github.com/ClickHouse/ClickHouse/pull/55589) ([Robert Schulze](https://github.com/rschu1ze)).
* Introduced setting `date_time_overflow_behavior` with possible values `ignore`, `throw`, `saturate` that controls the overflow behavior when converting from Date, Date32, DateTime64, Integer or Float to Date, Date32, DateTime or DateTime64. [#55696](https://github.com/ClickHouse/ClickHouse/pull/55696) ([Andrey Zvonov](https://github.com/zvonand)).
* Implement query parameters support for `ALTER TABLE ... ACTION PARTITION [ID] {parameter_name:ParameterType}`. Merges [#49516](https://github.com/ClickHouse/ClickHouse/issues/49516). Closes [#49449](https://github.com/ClickHouse/ClickHouse/issues/49449). [#55604](https://github.com/ClickHouse/ClickHouse/pull/55604) ([alesapin](https://github.com/alesapin)).
* Print processor ids in a prettier manner in EXPLAIN. [#48852](https://github.com/ClickHouse/ClickHouse/pull/48852) ([Vlad Seliverstov](https://github.com/behebot)).
* Creating a direct dictionary with a lifetime field will be rejected at create time (as the lifetime does not make sense for direct dictionaries). Fixes: [#27861](https://github.com/ClickHouse/ClickHouse/issues/27861). [#49043](https://github.com/ClickHouse/ClickHouse/pull/49043) ([Rory Crispin](https://github.com/RoryCrispin)).
* Allow parameters in queries with partitions like `ALTER TABLE t DROP PARTITION`. Closes [#49449](https://github.com/ClickHouse/ClickHouse/issues/49449). [#49516](https://github.com/ClickHouse/ClickHouse/pull/49516) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add a new column `xid` for `system.zookeeper_connection`. [#50702](https://github.com/ClickHouse/ClickHouse/pull/50702) ([helifu](https://github.com/helifu)).
* Display the correct server settings in `system.server_settings` after configuration reload. [#53774](https://github.com/ClickHouse/ClickHouse/pull/53774) ([helifu](https://github.com/helifu)).
* Add support for mathematical minus `` character in queries, similar to `-`. [#54100](https://github.com/ClickHouse/ClickHouse/pull/54100) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add replica groups to the experimental `Replicated` database engine. Closes [#53620](https://github.com/ClickHouse/ClickHouse/issues/53620). [#54421](https://github.com/ClickHouse/ClickHouse/pull/54421) ([Nikolay Degterinsky](https://github.com/evillique)).
* It is better to retry retriable s3 errors than totally fail the query. Set bigger value to the s3_retry_attempts by default. [#54770](https://github.com/ClickHouse/ClickHouse/pull/54770) ([Sema Checherinda](https://github.com/CheSema)).
* Add load balancing mode `hostname_levenshtein_distance`. [#54826](https://github.com/ClickHouse/ClickHouse/pull/54826) ([JackyWoo](https://github.com/JackyWoo)).
* Improve hiding secrets in logs. [#55089](https://github.com/ClickHouse/ClickHouse/pull/55089) ([Vitaly Baranov](https://github.com/vitlibar)).
* For now the projection analysis will be performed only on top of query plan. The setting `query_plan_optimize_projection` became obsolete (it was enabled by default long time ago). [#55112](https://github.com/ClickHouse/ClickHouse/pull/55112) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* When function `untuple` is now called on a tuple with named elements and itself has an alias (e.g. `select untuple(tuple(1)::Tuple(element_alias Int)) AS untuple_alias`), then the result column name is now generated from the untuple alias and the tuple element alias (in the example: "untuple_alias.element_alias"). [#55123](https://github.com/ClickHouse/ClickHouse/pull/55123) ([garcher22](https://github.com/garcher22)).
* Added setting `describe_include_virtual_columns`, which allows to include virtual columns of table into result of `DESCRIBE` query. Added setting `describe_compact_output`. If it is set to `true`, `DESCRIBE` query returns only names and types of columns without extra information. [#55129](https://github.com/ClickHouse/ClickHouse/pull/55129) ([Anton Popov](https://github.com/CurtizJ)).
* Sometimes `OPTIMIZE` with `optimize_throw_if_noop=1` may fail with an error `unknown reason` while the real cause of it - different projections in different parts. This behavior is fixed. [#55130](https://github.com/ClickHouse/ClickHouse/pull/55130) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Allow to have several `MaterializedPostgreSQL` tables following the same Postgres table. By default this behaviour is not enabled (for compatibility, because it is a backward-incompatible change), but can be turned on with setting `materialized_postgresql_use_unique_replication_consumer_identifier`. Closes [#54918](https://github.com/ClickHouse/ClickHouse/issues/54918). [#55145](https://github.com/ClickHouse/ClickHouse/pull/55145) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Allow to parse negative `DateTime64` and `DateTime` with fractional part from short strings. [#55146](https://github.com/ClickHouse/ClickHouse/pull/55146) ([Andrey Zvonov](https://github.com/zvonand)).
* To improve compatibility with MySQL, 1. `information_schema.tables` now includes the new field `table_rows`, and 2. `information_schema.columns` now includes the new field `extra`. [#55215](https://github.com/ClickHouse/ClickHouse/pull/55215) ([Robert Schulze](https://github.com/rschu1ze)).
* Clickhouse-client won't show "0 rows in set" if it is zero and if exception was thrown. [#55240](https://github.com/ClickHouse/ClickHouse/pull/55240) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Support rename table without keyword `TABLE` like `RENAME db.t1 to db.t2`. [#55373](https://github.com/ClickHouse/ClickHouse/pull/55373) ([凌涛](https://github.com/lingtaolf)).
* Add `internal_replication` to `system.clusters`. [#55377](https://github.com/ClickHouse/ClickHouse/pull/55377) ([Konstantin Morozov](https://github.com/k-morozov)).
* Select remote proxy resolver based on request protocol, add proxy feature docs and remove `DB::ProxyConfiguration::Protocol::ANY`. [#55430](https://github.com/ClickHouse/ClickHouse/pull/55430) ([Arthur Passos](https://github.com/arthurpassos)).
* Avoid retrying keeper operations on INSERT after table shutdown. [#55519](https://github.com/ClickHouse/ClickHouse/pull/55519) ([Azat Khuzhin](https://github.com/azat)).
* `SHOW COLUMNS` now correctly reports type `FixedString` as `BLOB` if setting `use_mysql_types_in_show_columns` is on. Also added two new settings, `mysql_map_string_to_text_in_show_columns` and `mysql_map_fixed_string_to_text_in_show_columns` to switch the output for types `String` and `FixedString` as `TEXT` or `BLOB`. [#55617](https://github.com/ClickHouse/ClickHouse/pull/55617) ([Serge Klochkov](https://github.com/slvrtrn)).
* During ReplicatedMergeTree tables startup clickhouse server checks set of parts for unexpected parts (exists locally, but not in zookeeper). All unexpected parts move to detached directory and instead of them server tries to restore some ancestor (covered) parts. Now server tries to restore closest ancestors instead of random covered parts. [#55645](https://github.com/ClickHouse/ClickHouse/pull/55645) ([alesapin](https://github.com/alesapin)).
* The advanced dashboard now supports draggable charts on touch devices. This closes [#54206](https://github.com/ClickHouse/ClickHouse/issues/54206). [#55649](https://github.com/ClickHouse/ClickHouse/pull/55649) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Use the default query format if declared when outputting exception with `http_write_exception_in_output_format`. [#55739](https://github.com/ClickHouse/ClickHouse/pull/55739) ([Raúl Marín](https://github.com/Algunenano)).
* Provide a better message for common MATERIALIZED VIEW pitfalls. [#55826](https://github.com/ClickHouse/ClickHouse/pull/55826) ([Raúl Marín](https://github.com/Algunenano)).
* If you dropped the current database, you will still be able to run some queries in `clickhouse-local` and switch to another database. This makes the behavior consistent with `clickhouse-client`. This closes [#55834](https://github.com/ClickHouse/ClickHouse/issues/55834). [#55853](https://github.com/ClickHouse/ClickHouse/pull/55853) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Functions `(add|subtract)(Year|Quarter|Month|Week|Day|Hour|Minute|Second|Millisecond|Microsecond|Nanosecond)` now support string-encoded date arguments, e.g. `SELECT addDays('2023-10-22', 1)`. This increases compatibility with MySQL and is needed by Tableau Online. [#55869](https://github.com/ClickHouse/ClickHouse/pull/55869) ([Robert Schulze](https://github.com/rschu1ze)).
* The setting `apply_deleted_mask` when disabled allows to read rows that where marked as deleted by lightweight DELETE queries. This is useful for debugging. [#55952](https://github.com/ClickHouse/ClickHouse/pull/55952) ([Alexander Gololobov](https://github.com/davenger)).
* Allow skipping `null` values when serailizing Tuple to json objects, which makes it possible to keep compatiability with Spark's `to_json` function, which is also useful for gluten. [#55956](https://github.com/ClickHouse/ClickHouse/pull/55956) ([李扬](https://github.com/taiyang-li)).
* Functions `(add|sub)Date()` now support string-encoded date arguments, e.g. `SELECT addDate('2023-10-22 11:12:13', INTERVAL 5 MINUTE)`. The same support for string-encoded date arguments is added to the plus and minus operators, e.g. `SELECT '2023-10-23' + INTERVAL 1 DAY`. This increases compatibility with MySQL and is needed by Tableau Online. [#55960](https://github.com/ClickHouse/ClickHouse/pull/55960) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow unquoted strings with CR (`\r`) in CSV format. Closes [#39930](https://github.com/ClickHouse/ClickHouse/issues/39930). [#56046](https://github.com/ClickHouse/ClickHouse/pull/56046) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow to run `clickhouse-keeper` using embedded config. [#56086](https://github.com/ClickHouse/ClickHouse/pull/56086) ([Maksim Kita](https://github.com/kitaisreal)).
* Set limit of the maximum configuration value for `queued.min.messages` to avoid problem with start fetching data with Kafka. [#56121](https://github.com/ClickHouse/ClickHouse/pull/56121) ([Stas Morozov](https://github.com/r3b-fish)).
* Fixed a typo in SQL function `minSampleSizeContinous` (renamed `minSampleSizeContinuous`). Old name is preserved for backward compatibility. This closes: [#56139](https://github.com/ClickHouse/ClickHouse/issues/56139). [#56143](https://github.com/ClickHouse/ClickHouse/pull/56143) ([Dorota Szeremeta](https://github.com/orotaday)).
* Print path for broken parts on disk before shutting down the server. Before this change if a part is corrupted on disk and server cannot start, it was almost impossible to understand which part is broken. This is fixed. [#56181](https://github.com/ClickHouse/ClickHouse/pull/56181) ([Duc Canh Le](https://github.com/canhld94)).
#### Build/Testing/Packaging Improvement
* If the database in Docker is already initialized, it doesn't need to be initialized again upon subsequent launches. This can potentially fix the issue of infinite container restarts when the database fails to load within 1000 attempts (relevant for very large databases and multi-node setups). [#50724](https://github.com/ClickHouse/ClickHouse/pull/50724) ([Alexander Nikolaev](https://github.com/AlexNik)).
* Resource with source code including submodules is built in Darwin special build task. It may be used to build ClickHouse without checking out the submodules. [#51435](https://github.com/ClickHouse/ClickHouse/pull/51435) ([Ilya Yatsishin](https://github.com/qoega)).
* An error was occuring when building ClickHouse with the AVX series of instructions enabled globally (which isn't recommended). The reason is that snappy does not enable `SNAPPY_HAVE_X86_CRC32`. [#55049](https://github.com/ClickHouse/ClickHouse/pull/55049) ([monchickey](https://github.com/monchickey)).
* Solve issue with launching standalone `clickhouse-keeper` from `clickhouse-server` package. [#55226](https://github.com/ClickHouse/ClickHouse/pull/55226) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* In the tests, RabbitMQ version is updated to 3.12.6. Improved logs collection for RabbitMQ tests. [#55424](https://github.com/ClickHouse/ClickHouse/pull/55424) ([Ilya Yatsishin](https://github.com/qoega)).
* Modified the error message difference between openssl and boringssl to fix the functional test. [#55975](https://github.com/ClickHouse/ClickHouse/pull/55975) ([MeenaRenganathan22](https://github.com/MeenaRenganathan22)).
* Use upstream repo for apache datasketches. [#55787](https://github.com/ClickHouse/ClickHouse/pull/55787) ([Nikita Taranov](https://github.com/nickitat)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Skip hardlinking inverted index files in mutation [#47663](https://github.com/ClickHouse/ClickHouse/pull/47663) ([cangyin](https://github.com/cangyin)).
* Fixed bug of `match` function (regex) with pattern containing alternation produces incorrect key condition. Closes #53222. [#54696](https://github.com/ClickHouse/ClickHouse/pull/54696) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix 'Cannot find column' in read-in-order optimization with ARRAY JOIN [#51746](https://github.com/ClickHouse/ClickHouse/pull/51746) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Support missed experimental `Object(Nullable(json))` subcolumns in query. [#54052](https://github.com/ClickHouse/ClickHouse/pull/54052) ([zps](https://github.com/VanDarkholme7)).
* Re-add fix for `accurateCastOrNull()` [#54629](https://github.com/ClickHouse/ClickHouse/pull/54629) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Fix detecting `DEFAULT` for columns of a Distributed table created without AS [#55060](https://github.com/ClickHouse/ClickHouse/pull/55060) ([Vitaly Baranov](https://github.com/vitlibar)).
* Proper cleanup in case of exception in ctor of ShellCommandSource [#55103](https://github.com/ClickHouse/ClickHouse/pull/55103) ([Alexander Gololobov](https://github.com/davenger)).
* Fix deadlock in LDAP assigned role update [#55119](https://github.com/ClickHouse/ClickHouse/pull/55119) ([Julian Maicher](https://github.com/jmaicher)).
* Suppress error statistics update for internal exceptions [#55128](https://github.com/ClickHouse/ClickHouse/pull/55128) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix deadlock in backups [#55132](https://github.com/ClickHouse/ClickHouse/pull/55132) ([alesapin](https://github.com/alesapin)).
* Fix storage Iceberg files retrieval [#55144](https://github.com/ClickHouse/ClickHouse/pull/55144) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix partition pruning of extra columns in set. [#55172](https://github.com/ClickHouse/ClickHouse/pull/55172) ([Amos Bird](https://github.com/amosbird)).
* Fix recalculation of skip indexes in ALTER UPDATE queries when table has adaptive granularity [#55202](https://github.com/ClickHouse/ClickHouse/pull/55202) ([Duc Canh Le](https://github.com/canhld94)).
* Fix for background download in fs cache [#55252](https://github.com/ClickHouse/ClickHouse/pull/55252) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Avoid possible memory leaks in compressors in case of missing buffer finalization [#55262](https://github.com/ClickHouse/ClickHouse/pull/55262) ([Azat Khuzhin](https://github.com/azat)).
* Fix functions execution over sparse columns [#55275](https://github.com/ClickHouse/ClickHouse/pull/55275) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect merging of Nested for SELECT FINAL FROM SummingMergeTree [#55276](https://github.com/ClickHouse/ClickHouse/pull/55276) ([Azat Khuzhin](https://github.com/azat)).
* Fix bug with inability to drop detached partition in replicated merge tree on top of S3 without zero copy [#55309](https://github.com/ClickHouse/ClickHouse/pull/55309) ([alesapin](https://github.com/alesapin)).
* Fix a crash in MergeSortingPartialResultTransform (due to zero chunks after `remerge`) [#55335](https://github.com/ClickHouse/ClickHouse/pull/55335) ([Azat Khuzhin](https://github.com/azat)).
* Fix data-race in CreatingSetsTransform (on errors) due to throwing shared exception [#55338](https://github.com/ClickHouse/ClickHouse/pull/55338) ([Azat Khuzhin](https://github.com/azat)).
* Fix trash optimization (up to a certain extent) [#55353](https://github.com/ClickHouse/ClickHouse/pull/55353) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix leak in StorageHDFS [#55370](https://github.com/ClickHouse/ClickHouse/pull/55370) ([Azat Khuzhin](https://github.com/azat)).
* Fix parsing of arrays in cast operator [#55417](https://github.com/ClickHouse/ClickHouse/pull/55417) ([Anton Popov](https://github.com/CurtizJ)).
* Fix filtering by virtual columns with OR filter in query [#55418](https://github.com/ClickHouse/ClickHouse/pull/55418) ([Azat Khuzhin](https://github.com/azat)).
* Fix MongoDB connection issues [#55419](https://github.com/ClickHouse/ClickHouse/pull/55419) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix MySQL interface boolean representation [#55427](https://github.com/ClickHouse/ClickHouse/pull/55427) ([Serge Klochkov](https://github.com/slvrtrn)).
* Fix MySQL text protocol DateTime formatting and LowCardinality(Nullable(T)) types reporting [#55479](https://github.com/ClickHouse/ClickHouse/pull/55479) ([Serge Klochkov](https://github.com/slvrtrn)).
* Make `use_mysql_types_in_show_columns` affect only `SHOW COLUMNS` [#55481](https://github.com/ClickHouse/ClickHouse/pull/55481) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix stack symbolizer parsing `DW_FORM_ref_addr` incorrectly and sometimes crashing [#55483](https://github.com/ClickHouse/ClickHouse/pull/55483) ([Michael Kolupaev](https://github.com/al13n321)).
* Destroy fiber in case of exception in cancelBefore in AsyncTaskExecutor [#55516](https://github.com/ClickHouse/ClickHouse/pull/55516) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix Query Parameters not working with custom HTTP handlers [#55521](https://github.com/ClickHouse/ClickHouse/pull/55521) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix checking of non handled data for Values format [#55527](https://github.com/ClickHouse/ClickHouse/pull/55527) ([Azat Khuzhin](https://github.com/azat)).
* Fix 'Invalid cursor state' in odbc interacting with MS SQL Server [#55558](https://github.com/ClickHouse/ClickHouse/pull/55558) ([vdimir](https://github.com/vdimir)).
* Fix max execution time and 'break' overflow mode [#55577](https://github.com/ClickHouse/ClickHouse/pull/55577) ([Alexander Gololobov](https://github.com/davenger)).
* Fix crash in QueryNormalizer with cyclic aliases [#55602](https://github.com/ClickHouse/ClickHouse/pull/55602) ([vdimir](https://github.com/vdimir)).
* Disable wrong optimization and add a test [#55609](https://github.com/ClickHouse/ClickHouse/pull/55609) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Merging [#52352](https://github.com/ClickHouse/ClickHouse/issues/52352) [#55621](https://github.com/ClickHouse/ClickHouse/pull/55621) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add a test to avoid incorrect decimal sorting [#55662](https://github.com/ClickHouse/ClickHouse/pull/55662) ([Amos Bird](https://github.com/amosbird)).
* Fix progress bar for s3 and azure Cluster functions with url without globs [#55666](https://github.com/ClickHouse/ClickHouse/pull/55666) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix filtering by virtual columns with OR filter in query (resubmit) [#55678](https://github.com/ClickHouse/ClickHouse/pull/55678) ([Azat Khuzhin](https://github.com/azat)).
* Fixes and improvements for Iceberg storage [#55695](https://github.com/ClickHouse/ClickHouse/pull/55695) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix data race in CreatingSetsTransform (v2) [#55786](https://github.com/ClickHouse/ClickHouse/pull/55786) ([Azat Khuzhin](https://github.com/azat)).
* Throw exception when parsing illegal string as float if precise_float_parsing is true [#55861](https://github.com/ClickHouse/ClickHouse/pull/55861) ([李扬](https://github.com/taiyang-li)).
* Disable predicate pushdown if the CTE contains stateful functions [#55871](https://github.com/ClickHouse/ClickHouse/pull/55871) ([Raúl Marín](https://github.com/Algunenano)).
* Fix normalize ASTSelectWithUnionQuery, as it was stripping `FORMAT` from the query [#55887](https://github.com/ClickHouse/ClickHouse/pull/55887) ([flynn](https://github.com/ucasfl)).
* Try to fix possible segfault in Native ORC input format [#55891](https://github.com/ClickHouse/ClickHouse/pull/55891) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix window functions in case of sparse columns. [#55895](https://github.com/ClickHouse/ClickHouse/pull/55895) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* fix: StorageNull supports subcolumns [#55912](https://github.com/ClickHouse/ClickHouse/pull/55912) ([FFish](https://github.com/wxybear)).
* Do not write retriable errors for Replicated mutate/merge into error log [#55944](https://github.com/ClickHouse/ClickHouse/pull/55944) ([Azat Khuzhin](https://github.com/azat)).
* Fix `SHOW DATABASES LIMIT <N>` [#55962](https://github.com/ClickHouse/ClickHouse/pull/55962) ([Raúl Marín](https://github.com/Algunenano)).
* Fix autogenerated Protobuf schema with fields with underscore [#55974](https://github.com/ClickHouse/ClickHouse/pull/55974) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix dateTime64ToSnowflake64() with non-default scale [#55983](https://github.com/ClickHouse/ClickHouse/pull/55983) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix output/input of Arrow dictionary column [#55989](https://github.com/ClickHouse/ClickHouse/pull/55989) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix fetching schema from schema registry in AvroConfluent [#55991](https://github.com/ClickHouse/ClickHouse/pull/55991) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix 'Block structure mismatch' on concurrent ALTER and INSERTs in Buffer table [#55995](https://github.com/ClickHouse/ClickHouse/pull/55995) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix incorrect free space accounting for least_used JBOD policy [#56030](https://github.com/ClickHouse/ClickHouse/pull/56030) ([Azat Khuzhin](https://github.com/azat)).
* Fix missing scalar issue when evaluating subqueries inside table functions [#56057](https://github.com/ClickHouse/ClickHouse/pull/56057) ([Amos Bird](https://github.com/amosbird)).
* Fix wrong query result when http_write_exception_in_output_format=1 [#56135](https://github.com/ClickHouse/ClickHouse/pull/56135) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix schema cache for fallback JSON->JSONEachRow with changed settings [#56172](https://github.com/ClickHouse/ClickHouse/pull/56172) ([Kruglov Pavel](https://github.com/Avogar)).
* Add error handler to odbc-bridge [#56185](https://github.com/ClickHouse/ClickHouse/pull/56185) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
### ClickHouse release 23.9, 2023-09-28
#### Backward Incompatible Change

View File

@ -22,8 +22,6 @@ curl https://clickhouse.com/ | sh
## Upcoming Events
* [**v23.9 Community Call**]([https://clickhouse.com/company/events/v23-8-community-release-call](https://clickhouse.com/company/events/v23-9-community-release-call)?utm_source=github&utm_medium=social&utm_campaign=release-webinar-2023-08) - Sep 28 - 23.9 is rapidly approaching. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
* [**ClickHouse Meetup in Amsterdam**](https://www.meetup.com/clickhouse-netherlands-user-group/events/296334590/) - Oct 31
* [**ClickHouse Meetup in Beijing**](https://www.meetup.com/clickhouse-beijing-user-group/events/296334856/) - Nov 4
* [**ClickHouse Meetup in San Francisco**](https://www.meetup.com/clickhouse-silicon-valley-meetup-group/events/296334923/) - Nov 8
* [**ClickHouse Meetup in Singapore**](https://www.meetup.com/clickhouse-singapore-meetup-group/events/296334976/) - Nov 15

View File

@ -13,9 +13,10 @@ The following versions of ClickHouse server are currently being supported with s
| Version | Supported |
|:-|:-|
| 23.10 | ✔️ |
| 23.9 | ✔️ |
| 23.8 | ✔️ |
| 23.7 | ✔️ |
| 23.7 | |
| 23.6 | ❌ |
| 23.5 | ❌ |
| 23.4 | ❌ |

View File

@ -45,6 +45,10 @@ else ()
target_compile_definitions(common PUBLIC WITH_COVERAGE=0)
endif ()
if (TARGET ch_contrib::crc32_s390x)
target_link_libraries(common PUBLIC ch_contrib::crc32_s390x)
endif()
target_include_directories(common PUBLIC .. "${CMAKE_CURRENT_BINARY_DIR}/..")
target_link_libraries (common

View File

@ -35,6 +35,10 @@
#pragma clang diagnostic ignored "-Wreserved-identifier"
#endif
#if defined(__s390x__)
#include <base/crc32c_s390x.h>
#define CRC_INT s390x_crc32c
#endif
/**
* The std::string_view-like container to avoid creating strings to find substrings in the hash table.
@ -264,8 +268,8 @@ inline size_t hashLessThan8(const char * data, size_t size)
if (size >= 4)
{
UInt64 a = unalignedLoad<uint32_t>(data);
return hashLen16(size + (a << 3), unalignedLoad<uint32_t>(data + size - 4));
UInt64 a = unalignedLoadLittleEndian<uint32_t>(data);
return hashLen16(size + (a << 3), unalignedLoadLittleEndian<uint32_t>(data + size - 4));
}
if (size > 0)
@ -285,8 +289,8 @@ inline size_t hashLessThan16(const char * data, size_t size)
{
if (size > 8)
{
UInt64 a = unalignedLoad<UInt64>(data);
UInt64 b = unalignedLoad<UInt64>(data + size - 8);
UInt64 a = unalignedLoadLittleEndian<UInt64>(data);
UInt64 b = unalignedLoadLittleEndian<UInt64>(data + size - 8);
return hashLen16(a, rotateByAtLeast1(b + size, static_cast<UInt8>(size))) ^ b;
}
@ -315,13 +319,13 @@ struct CRC32Hash
do
{
UInt64 word = unalignedLoad<UInt64>(pos);
UInt64 word = unalignedLoadLittleEndian<UInt64>(pos);
res = static_cast<unsigned>(CRC_INT(res, word));
pos += 8;
} while (pos + 8 < end);
UInt64 word = unalignedLoad<UInt64>(end - 8); /// I'm not sure if this is normal.
UInt64 word = unalignedLoadLittleEndian<UInt64>(end - 8); /// I'm not sure if this is normal.
res = static_cast<unsigned>(CRC_INT(res, word));
return res;

26
base/base/crc32c_s390x.h Normal file
View File

@ -0,0 +1,26 @@
#pragma once
#include <crc32-s390x.h>
inline uint32_t s390x_crc32c_u8(uint32_t crc, uint8_t v)
{
return crc32c_le_vx(crc, reinterpret_cast<unsigned char *>(&v), sizeof(v));
}
inline uint32_t s390x_crc32c_u16(uint32_t crc, uint16_t v)
{
v = __builtin_bswap16(v);
return crc32c_le_vx(crc, reinterpret_cast<unsigned char *>(&v), sizeof(v));
}
inline uint32_t s390x_crc32c_u32(uint32_t crc, uint32_t v)
{
v = __builtin_bswap32(v);
return crc32c_le_vx(crc, reinterpret_cast<unsigned char *>(&v), sizeof(v));
}
inline uint64_t s390x_crc32c(uint64_t crc, uint64_t v)
{
v = __builtin_bswap64(v);
return crc32c_le_vx(static_cast<uint32_t>(crc), reinterpret_cast<unsigned char *>(&v), sizeof(uint64_t));
}

View File

@ -68,7 +68,7 @@ namespace Net
struct ProxyConfig
/// HTTP proxy server configuration.
{
ProxyConfig() : port(HTTP_PORT), protocol("http"), tunnel(true) { }
ProxyConfig() : port(HTTP_PORT), protocol("http"), tunnel(true), originalRequestProtocol("http") { }
std::string host;
/// Proxy server host name or IP address.
@ -87,6 +87,9 @@ namespace Net
/// A regular expression defining hosts for which the proxy should be bypassed,
/// e.g. "localhost|127\.0\.0\.1|192\.168\.0\.\d+". Can also be an empty
/// string to disable proxy bypassing.
std::string originalRequestProtocol;
/// Original request protocol (http or https).
/// Required in the case of: HTTPS request over HTTP proxy with tunneling (CONNECT) off.
};
HTTPClientSession();

View File

@ -418,7 +418,7 @@ void HTTPClientSession::reconnect()
std::string HTTPClientSession::proxyRequestPrefix() const
{
std::string result("http://");
std::string result(_proxyConfig.originalRequestProtocol + "://");
result.append(_host);
/// Do not append default by default, since this may break some servers.
/// One example of such server is GCS (Google Cloud Storage).

View File

@ -2,11 +2,11 @@
# NOTE: has nothing common with DBMS_TCP_PROTOCOL_VERSION,
# only DBMS_TCP_PROTOCOL_VERSION should be incremented on protocol changes.
SET(VERSION_REVISION 54479)
SET(VERSION_REVISION 54480)
SET(VERSION_MAJOR 23)
SET(VERSION_MINOR 10)
SET(VERSION_MINOR 11)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH 8f9a227de1f530cdbda52c145d41a6b0f1d29961)
SET(VERSION_DESCRIBE v23.10.1.1-testing)
SET(VERSION_STRING 23.10.1.1)
SET(VERSION_GITHASH 13adae0e42fd48de600486fc5d4b64d39f80c43e)
SET(VERSION_DESCRIBE v23.11.1.1-testing)
SET(VERSION_STRING 23.11.1.1)
# end of autochange

View File

@ -3,4 +3,4 @@ It allows to integrate JEMalloc into CMake project.
- Remove JEMALLOC_HAVE_ATTR_FORMAT_GNU_PRINTF because it's non standard.
- Added JEMALLOC_CONFIG_MALLOC_CONF substitution
- Add musl support (USE_MUSL)
- Also note, that darwin build requires JEMALLOC_PREFIX, while others don not
- Also note, that darwin build requires JEMALLOC_PREFIX, while others do not

View File

@ -1,3 +1,10 @@
option (ENABLE_SSH "Enable support for SSH keys and protocol" ON)
if (NOT ENABLE_SSH)
message(STATUS "Not using SSH")
return()
endif()
set(LIB_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libssh")
set(LIB_BINARY_DIR "${ClickHouse_BINARY_DIR}/contrib/libssh")
# Specify search path for CMake modules to be loaded by include()

2
contrib/orc vendored

@ -1 +1 @@
Subproject commit f31c271110a2f0dac908a152f11708193ae209ee
Subproject commit e24f2c2a3ca0769c96704ab20ad6f512a83ea2ad

View File

@ -34,7 +34,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="23.9.3.12"
ARG VERSION="23.10.1.1976"
ARG PACKAGES="clickhouse-keeper"
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -32,7 +32,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="23.9.3.12"
ARG VERSION="23.10.1.1976"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -30,7 +30,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="23.9.3.12"
ARG VERSION="23.10.1.1976"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image

View File

@ -19,7 +19,7 @@ For more information and documentation see https://clickhouse.com/.
### Compatibility
- The amd64 image requires support for [SSE3 instructions](https://en.wikipedia.org/wiki/SSE3). Virtually all x86 CPUs after 2005 support SSE3.
- The arm64 image requires support for the [ARMv8.2-A architecture](https://en.wikipedia.org/wiki/AArch64#ARMv8.2-A). Most ARM CPUs after 2017 support ARMv8.2-A. A notable exception is Raspberry Pi 4 from 2019 whose CPU only supports ARMv8.0-A.
- The arm64 image requires support for the [ARMv8.2-A architecture](https://en.wikipedia.org/wiki/AArch64#ARMv8.2-A) and additionally the Load-Acquire RCpc register. The register is optional in version ARMv8.2-A and mandatory in [ARMv8.3-A](https://en.wikipedia.org/wiki/AArch64#ARMv8.3-A). Supported in Graviton >=2, Azure and GCP instances. Examples for unsupported devices are Raspberry Pi 4 (ARMv8.0-A) and Jetson AGX Xavier/Orin (ARMv8.2-A).
## How to use this image

View File

@ -75,7 +75,7 @@ sidebar_label: 2022
* Fix usage of nested columns with non-array columns with the same prefix [2] [#28762](https://github.com/ClickHouse/ClickHouse/pull/28762) ([Anton Popov](https://github.com/CurtizJ)).
* Lower compiled_expression_cache_size to 128MB [#28816](https://github.com/ClickHouse/ClickHouse/pull/28816) ([Maksim Kita](https://github.com/kitaisreal)).
* Column default dictGet identifier fix [#28863](https://github.com/ClickHouse/ClickHouse/pull/28863) ([Maksim Kita](https://github.com/kitaisreal)).
* Don not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Merging [#27963](https://github.com/ClickHouse/ClickHouse/issues/27963) [#29063](https://github.com/ClickHouse/ClickHouse/pull/29063) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix terminate on uncaught exception [#29216](https://github.com/ClickHouse/ClickHouse/pull/29216) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix arcadia (pre)stable 21.10 build [#29250](https://github.com/ClickHouse/ClickHouse/pull/29250) ([DimasKovas](https://github.com/DimasKovas)).

View File

@ -29,7 +29,7 @@ sidebar_label: 2022
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Don not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Merging [#27963](https://github.com/ClickHouse/ClickHouse/issues/27963) [#29063](https://github.com/ClickHouse/ClickHouse/pull/29063) ([Maksim Kita](https://github.com/kitaisreal)).
* May be fix s3 tests [#29762](https://github.com/ClickHouse/ClickHouse/pull/29762) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix ca-bundle.crt in kerberized_hadoop/Dockerfile [#30358](https://github.com/ClickHouse/ClickHouse/pull/30358) ([Vladimir C](https://github.com/vdimir)).

View File

@ -14,6 +14,6 @@ sidebar_label: 2022
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Don not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Merging [#27963](https://github.com/ClickHouse/ClickHouse/issues/27963) [#29063](https://github.com/ClickHouse/ClickHouse/pull/29063) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix terminate on uncaught exception [#29216](https://github.com/ClickHouse/ClickHouse/pull/29216) ([Alexander Tokmakov](https://github.com/tavplubix)).

View File

@ -15,6 +15,6 @@ sidebar_label: 2022
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Don not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Merging [#27963](https://github.com/ClickHouse/ClickHouse/issues/27963) [#29063](https://github.com/ClickHouse/ClickHouse/pull/29063) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix terminate on uncaught exception [#29216](https://github.com/ClickHouse/ClickHouse/pull/29216) ([Alexander Tokmakov](https://github.com/tavplubix)).

View File

@ -13,6 +13,6 @@ sidebar_label: 2022
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Don not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Do not add const group by key for query with only having. [#28975](https://github.com/ClickHouse/ClickHouse/pull/28975) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Merging [#27963](https://github.com/ClickHouse/ClickHouse/issues/27963) [#29063](https://github.com/ClickHouse/ClickHouse/pull/29063) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix terminate on uncaught exception [#29216](https://github.com/ClickHouse/ClickHouse/pull/29216) ([Alexander Tokmakov](https://github.com/tavplubix)).

View File

@ -0,0 +1,406 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.10.1.1976-stable (13adae0e42f) FIXME as compared to v23.9.1.1854-stable (8f9a227de1f)
#### Backward Incompatible Change
* Rewrited storage S3Queue completely: changed the way we keep information in zookeeper which allows to make less zookeeper requests, added caching of zookeeper state in cases when we know the state will not change, improved the polling from s3 process to make it less aggressive, changed the way ttl and max set for trached files is maintained, now it is a background process. Added `system.s3queue` and `system.s3queue_log` tables. Closes [#54998](https://github.com/ClickHouse/ClickHouse/issues/54998). [#54422](https://github.com/ClickHouse/ClickHouse/pull/54422) ([Kseniia Sumarokova](https://github.com/kssenii)).
* There is no longer an option to automatically remove broken data parts. This closes [#55174](https://github.com/ClickHouse/ClickHouse/issues/55174). [#55184](https://github.com/ClickHouse/ClickHouse/pull/55184) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The obsolete in-memory data parts can no longer be read from the write-ahead log. If you have configured in-memory parts before, they have to be removed before the upgrade. [#55186](https://github.com/ClickHouse/ClickHouse/pull/55186) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove the integration with Meilisearch. Reason: it was compatible only with the old version 0.18. The recent version of Meilisearch changed the protocol and does not work anymore. Note: we would appreciate it if you help to return it back. [#55189](https://github.com/ClickHouse/ClickHouse/pull/55189) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Rename directory monitor concept into background INSERT. All settings `*directory_monitor*` had been renamed to `distributed_background_insert*`. **Backward compatibility should be preserved** (since old settings had been added as an alias). [#55978](https://github.com/ClickHouse/ClickHouse/pull/55978) ([Azat Khuzhin](https://github.com/azat)).
* Do not mix-up send_timeout and receive_timeout. [#56035](https://github.com/ClickHouse/ClickHouse/pull/56035) ([Azat Khuzhin](https://github.com/azat)).
* Comparison of time intervals with different units will throw an exception. This closes [#55942](https://github.com/ClickHouse/ClickHouse/issues/55942). You might have occasionally rely on the previous behavior when the underlying numeric values were compared regardless of the units. [#56090](https://github.com/ClickHouse/ClickHouse/pull/56090) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### New Feature
* Add function "arrayFold(x1, ..., xn, accum -> expression, array1, ..., arrayn, init_accum)" which applies a lambda function to multiple arrays of the same cardinality and collects the result in an accumulator. [#49794](https://github.com/ClickHouse/ClickHouse/pull/49794) ([Lirikl](https://github.com/Lirikl)).
* Added aggregation function lttb which uses the [Largest-Triangle-Three-Buckets](https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf) algorithm for downsampling data for visualization. [#53145](https://github.com/ClickHouse/ClickHouse/pull/53145) ([Sinan](https://github.com/sinsinan)).
* Query`CHECK TABLE` has better performance and usability (sends progress updates, cancellable). Support checking particular part with `CHECK TABLE ... PART 'part_name'`. [#53404](https://github.com/ClickHouse/ClickHouse/pull/53404) ([vdimir](https://github.com/vdimir)).
* Added function `jsonMergePatch`. When working with JSON data as strings, it provides a way to merge these strings (of JSON objects) together to form a single string containing a single JSON object. [#54364](https://github.com/ClickHouse/ClickHouse/pull/54364) ([Memo](https://github.com/Joeywzr)).
* Added a new SQL function, "arrayRandomSample(arr, k)" which returns a sample of k elements from the input array. Similar functionality could previously be achieved only with less convenient syntax, e.g. "SELECT arrayReduce('groupArraySample(3)', range(10))". [#54391](https://github.com/ClickHouse/ClickHouse/pull/54391) ([itayisraelov](https://github.com/itayisraelov)).
* Added new function `getHttpHeader` to get HTTP request header value used for a request to ClickHouse server. Return empty string if the request is not done over HTTP protocol or there is no such header. [#54813](https://github.com/ClickHouse/ClickHouse/pull/54813) ([凌涛](https://github.com/lingtaolf)).
* Introduce -ArgMin/-ArgMax aggregate combinators which allow to aggregate by min/max values only. One use case can be found in [#54818](https://github.com/ClickHouse/ClickHouse/issues/54818). This PR also reorganize combinators into dedicated folder. [#54947](https://github.com/ClickHouse/ClickHouse/pull/54947) ([Amos Bird](https://github.com/amosbird)).
* Allow to drop cache for Protobuf format with `SYSTEM DROP SCHEMA FORMAT CACHE [FOR Protobuf]`. [#55064](https://github.com/ClickHouse/ClickHouse/pull/55064) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Add external HTTP Basic authenticator. [#55199](https://github.com/ClickHouse/ClickHouse/pull/55199) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Added function `byteSwap` which reverses the bytes of unsigned integers. This is particularly useful for reversing values of types which are represented as unsigned integers internally such as IPv4. [#55211](https://github.com/ClickHouse/ClickHouse/pull/55211) ([Priyansh Agrawal](https://github.com/Priyansh121096)).
* Added function `formatQuery()` which returns a formatted version (possibly spanning multiple lines) of a SQL query string. Also added function `formatQuerySingleLine()` which does the same but the returned string will not contain linebreaks. [#55239](https://github.com/ClickHouse/ClickHouse/pull/55239) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Added DWARF input format that reads debug symbols from an ELF executable/library/object file. [#55450](https://github.com/ClickHouse/ClickHouse/pull/55450) ([Michael Kolupaev](https://github.com/al13n321)).
* Allow to save unparsed records and errors in RabbitMQ, NATS and FileLog engines. Add virtual columns `_error` and `_raw_message`(for NATS and RabbitMQ), `_raw_record` (for FileLog) that are filled when ClickHouse fails to parse new record. The behaviour is controlled under storage settings `nats_handle_error_mode` for NATS, `rabbitmq_handle_error_mode` for RabbitMQ, `handle_error_mode` for FileLog similar to `kafka_handle_error_mode`. If it's set to `default`, en exception will be thrown when ClickHouse fails to parse a record, if it's set to `stream`, erorr and raw record will be saved into virtual columns. Closes [#36035](https://github.com/ClickHouse/ClickHouse/issues/36035). [#55477](https://github.com/ClickHouse/ClickHouse/pull/55477) ([Kruglov Pavel](https://github.com/Avogar)).
* Keeper client improvement: add get_all_children_number command that returns number of all children nodes under a specific path. [#55485](https://github.com/ClickHouse/ClickHouse/pull/55485) ([guoxiaolong](https://github.com/guoxiaolongzte)).
* If a table has a space-filling curve in its key, e.g., `ORDER BY mortonEncode(x, y)`, the conditions on its arguments, e.g., `x >= 10 AND x <= 20 AND y >= 20 AND y <= 30` can be used for indexing. A setting `analyze_index_with_space_filling_curves` is added to enable or disable this analysis. This closes [#41195](https://github.com/ClickHouse/ClickHouse/issues/41195). Continuation of [#4538](https://github.com/ClickHouse/ClickHouse/issues/4538). Continuation of [#6286](https://github.com/ClickHouse/ClickHouse/issues/6286). Continuation of [#28130](https://github.com/ClickHouse/ClickHouse/issues/28130). Continuation of [#41753](https://github.com/ClickHouse/ClickHouse/issues/41753). [#55642](https://github.com/ClickHouse/ClickHouse/pull/55642) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add setting `optimize_trivial_approximate_count_query` to use `count()` approximation for storage EmbeddedRocksDB. Enable trivial count for StorageJoin. [#55806](https://github.com/ClickHouse/ClickHouse/pull/55806) ([Duc Canh Le](https://github.com/canhld94)).
* Keeper client improvement: add get_direct_children_number command that returns number of direct children nodes under a path. [#55898](https://github.com/ClickHouse/ClickHouse/pull/55898) ([xuzifu666](https://github.com/xuzifu666)).
* Add statement `SHOW SETTING setting_name` which is a simpler version of existing statement `SHOW SETTINGS`. [#55979](https://github.com/ClickHouse/ClickHouse/pull/55979) ([Maksim Kita](https://github.com/kitaisreal)).
* This pr gives possibility to pass data in Npy format to Clickhouse. ``` SELECT * FROM file('example_array.npy', Npy). [#55982](https://github.com/ClickHouse/ClickHouse/pull/55982) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* This PR impleents a new setting called `force_optimize_projection_name`, it takes a name of projection as an argument. If it's value set to a non-empty string, ClickHouse checks that this projection is used in the query at least once. Closes [#55331](https://github.com/ClickHouse/ClickHouse/issues/55331). [#56134](https://github.com/ClickHouse/ClickHouse/pull/56134) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
#### Performance Improvement
* Add option `query_plan_preserve_num_streams_after_window_functions` to preserve the number of streams after evaluating window functions to allow parallel stream processing. [#50771](https://github.com/ClickHouse/ClickHouse/pull/50771) ([frinkr](https://github.com/frinkr)).
* Release more num_streams if data is small. [#53867](https://github.com/ClickHouse/ClickHouse/pull/53867) ([Jiebin Sun](https://github.com/jiebinn)).
* RoaringBitmaps being optimized before serialization. [#55044](https://github.com/ClickHouse/ClickHouse/pull/55044) ([UnamedRus](https://github.com/UnamedRus)).
* Posting lists in inverted indexes are now optimized to use the smallest possible representation for internal bitmaps. Depending on the repetitiveness of the data, this may significantly reduce the space consumption of inverted indexes. [#55069](https://github.com/ClickHouse/ClickHouse/pull/55069) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Fix contention on Context lock, this significantly improves performance for a lot of short-running concurrent queries. [#55121](https://github.com/ClickHouse/ClickHouse/pull/55121) ([Maksim Kita](https://github.com/kitaisreal)).
* Improved the performance of inverted index creation by 30%. This was achieved by replacing `std::unordered_map` with `absl::flat_hash_map`. [#55210](https://github.com/ClickHouse/ClickHouse/pull/55210) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Support orc filter push down (rowgroup level). [#55330](https://github.com/ClickHouse/ClickHouse/pull/55330) ([李扬](https://github.com/taiyang-li)).
* Improve performance of external aggregation with a lot of temporary files. [#55489](https://github.com/ClickHouse/ClickHouse/pull/55489) ([Maksim Kita](https://github.com/kitaisreal)).
* Set a reasonable size for the marks cache for secondary indices by default to avoid loading the marks over and over again. [#55654](https://github.com/ClickHouse/ClickHouse/pull/55654) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Avoid unnecessary reconstruction of index granules when reading skip indexes. This addresses [#55653](https://github.com/ClickHouse/ClickHouse/issues/55653)#issuecomment-1763766009 . [#55683](https://github.com/ClickHouse/ClickHouse/pull/55683) ([Amos Bird](https://github.com/amosbird)).
* Cache cast function in set during execution to improve the performance of function `IN` when set element type doesn't exactly match column type. [#55712](https://github.com/ClickHouse/ClickHouse/pull/55712) ([Duc Canh Le](https://github.com/canhld94)).
* Performance improvement for `ColumnVector::insertMany` and `ColumnVector::insertManyFrom`. [#55714](https://github.com/ClickHouse/ClickHouse/pull/55714) ([frinkr](https://github.com/frinkr)).
* ... Getting values from a map is widely used. In practice, the key structrues are usally the same in the same map column, we could try to predict the next row's key position and reduce the comparisons. [#55929](https://github.com/ClickHouse/ClickHouse/pull/55929) ([lgbo](https://github.com/lgbo-ustc)).
* ... Fix an issue that struct field prune doesn't work in some cases. For example ```sql INSERT INTO FUNCTION file('test_parquet_struct', Parquet, 'x Tuple(a UInt32, b UInt32, c String)') SELECT tuple(number, rand(), concat('testxxxxxxx' toString(number))) FROM numbers(10);. [#56117](https://github.com/ClickHouse/ClickHouse/pull/56117) ([lgbo](https://github.com/lgbo-ustc)).
#### Improvement
* This is the second part of Kusto Query Language dialect support. [Phase 1 implementation ](https://github.com/ClickHouse/ClickHouse/pull/37961) has been merged. [#42510](https://github.com/ClickHouse/ClickHouse/pull/42510) ([larryluogit](https://github.com/larryluogit)).
* Op processors IDs are raw ptrs casted to UInt64. Print it in a prettier manner:. [#48852](https://github.com/ClickHouse/ClickHouse/pull/48852) ([Vlad Seliverstov](https://github.com/behebot)).
* Creating a direct dictionary with a lifetime field set will be rejected at create time. Fixes: [#27861](https://github.com/ClickHouse/ClickHouse/issues/27861). [#49043](https://github.com/ClickHouse/ClickHouse/pull/49043) ([Rory Crispin](https://github.com/RoryCrispin)).
* Allow parameters in queries with partitions like `ALTER TABLE t DROP PARTITION`. Closes [#49449](https://github.com/ClickHouse/ClickHouse/issues/49449). [#49516](https://github.com/ClickHouse/ClickHouse/pull/49516) ([Nikolay Degterinsky](https://github.com/evillique)).
* 1.Refactor the code about zookeeper_connection 2.Add a new column xid for zookeeper_connection. [#50702](https://github.com/ClickHouse/ClickHouse/pull/50702) ([helifu](https://github.com/helifu)).
* Add the ability to tune the number of parallel replicas used in a query execution based on the estimation of rows to read. [#51692](https://github.com/ClickHouse/ClickHouse/pull/51692) ([Raúl Marín](https://github.com/Algunenano)).
* Distributed queries executed in `async_socket_for_remote` mode (default) now respect `max_threads` limit. Previously, some queries could create excessive threads (up to `max_distributed_connections`), causing server performance issues. [#53504](https://github.com/ClickHouse/ClickHouse/pull/53504) ([filimonov](https://github.com/filimonov)).
* Display the correct server settings after reload. [#53774](https://github.com/ClickHouse/ClickHouse/pull/53774) ([helifu](https://github.com/helifu)).
* Add support for mathematical minus `` character in queries, similar to `-`. [#54100](https://github.com/ClickHouse/ClickHouse/pull/54100) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add replica groups to the Replicated database engine. Closes [#53620](https://github.com/ClickHouse/ClickHouse/issues/53620). [#54421](https://github.com/ClickHouse/ClickHouse/pull/54421) ([Nikolay Degterinsky](https://github.com/evillique)).
* This PR will fix UBsan test error [here](https://github.com/ClickHouse/ClickHouse/pull/54566). [#54568](https://github.com/ClickHouse/ClickHouse/pull/54568) ([JackyWoo](https://github.com/JackyWoo)).
* Support asynchronous inserts with external data via native protocol. Previously it worked only if data is inlined into query. [#54730](https://github.com/ClickHouse/ClickHouse/pull/54730) ([Anton Popov](https://github.com/CurtizJ)).
* It is better to retry retriable s3 errors than totally fail the query. Set bigger value to the s3_retry_attempts by default. [#54770](https://github.com/ClickHouse/ClickHouse/pull/54770) ([Sema Checherinda](https://github.com/CheSema)).
* Optimised external aggregation memory consumption in case many temporary files were generated. [#54798](https://github.com/ClickHouse/ClickHouse/pull/54798) ([Nikita Taranov](https://github.com/nickitat)).
* Add load balancing test_hostname_levenshtein_distance. [#54826](https://github.com/ClickHouse/ClickHouse/pull/54826) ([JackyWoo](https://github.com/JackyWoo)).
* Caching skip-able entries while executing DDL from Zookeeper distributed DDL queue. [#54828](https://github.com/ClickHouse/ClickHouse/pull/54828) ([Duc Canh Le](https://github.com/canhld94)).
* Improve hiding secrets in logs. [#55089](https://github.com/ClickHouse/ClickHouse/pull/55089) ([Vitaly Baranov](https://github.com/vitlibar)).
* Added fields `substreams` and `filenames` to the `system.parts_columns` table. [#55108](https://github.com/ClickHouse/ClickHouse/pull/55108) ([Anton Popov](https://github.com/CurtizJ)).
* For now the projection analysis will be performed only on top of query plan. The setting `query_plan_optimize_projection` became obsolete (it was enabled by default long time ago). [#55112](https://github.com/ClickHouse/ClickHouse/pull/55112) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* When function "untuple()" is now called on a tuple with named elements and itself has an alias (e.g. "select untuple(tuple(1)::Tuple(element_alias Int)) AS untuple_alias"), then the result column name is now generated from the untuple alias and the tuple element alias (in the example: "untuple_alias.element_alias"). [#55123](https://github.com/ClickHouse/ClickHouse/pull/55123) ([garcher22](https://github.com/garcher22)).
* Added setting `describe_include_virtual_columns`, which allows to include virtual columns of table into result of `DESCRIBE` query. Added setting `describe_compact_output`. If it is set to `true`, `DESCRIBE` query returns only names and types of columns without extra information. [#55129](https://github.com/ClickHouse/ClickHouse/pull/55129) ([Anton Popov](https://github.com/CurtizJ)).
* Sometimes `OPTIMIZE` with `optimize_throw_if_noop=1` may fail with an error `unknown reason` while the real cause of it - different projections in different parts. This behavior is fixed. [#55130](https://github.com/ClickHouse/ClickHouse/pull/55130) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Allow to have several MaterializedPostgreSQL tables following the same Postgres table. By default this behaviour is not enabled (for compatibility, because it is backward-incompatible change), but can be turned on with setting `materialized_postgresql_use_unique_replication_consumer_identifier`. Closes [#54918](https://github.com/ClickHouse/ClickHouse/issues/54918). [#55145](https://github.com/ClickHouse/ClickHouse/pull/55145) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Allow to parse negative DateTime64 and DateTime with fractional part from short strings. [#55146](https://github.com/ClickHouse/ClickHouse/pull/55146) ([Andrey Zvonov](https://github.com/zvonand)).
* To improve compatibility with MySQL, 1. "information_schema.tables" now includes the new field "table_rows", and 2. "information_schema.columns" now includes the new field "extra". [#55215](https://github.com/ClickHouse/ClickHouse/pull/55215) ([Robert Schulze](https://github.com/rschu1ze)).
* Clickhouse-client won't show "0 rows in set" if it is zero and if exception was thrown. [#55240](https://github.com/ClickHouse/ClickHouse/pull/55240) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Support rename table without keyword `TABLE` like `RENAME db.t1 to db.t2`. [#55373](https://github.com/ClickHouse/ClickHouse/pull/55373) ([凌涛](https://github.com/lingtaolf)).
* Add internal_replication to system.clusters. [#55377](https://github.com/ClickHouse/ClickHouse/pull/55377) ([Konstantin Morozov](https://github.com/k-morozov)).
* Select remote proxy resolver based on request protocol, add proxy feature docs and remove `DB::ProxyConfiguration::Protocol::ANY`. [#55430](https://github.com/ClickHouse/ClickHouse/pull/55430) ([Arthur Passos](https://github.com/arthurpassos)).
* Avoid retrying keeper operations on INSERT after table shutdown. [#55519](https://github.com/ClickHouse/ClickHouse/pull/55519) ([Azat Khuzhin](https://github.com/azat)).
* Improved overall resilience for ClickHouse in case of many parts within partition (more than 1000). It might reduce the number of `TOO_MANY_PARTS` errors. [#55526](https://github.com/ClickHouse/ClickHouse/pull/55526) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Follow up https://github.com/ClickHouse/ClickHouse/pull/55184 to not to fall into `UNKNOWN_SETTING` error when the user uses the deleted MergeTree settings. [#55557](https://github.com/ClickHouse/ClickHouse/pull/55557) ([Jihyuk Bok](https://github.com/tomahawk28)).
* Updated `dashboard.html()` added a `if` to check if there is one chart, then "maximize" and "drag" buttons are not shown. [#55581](https://github.com/ClickHouse/ClickHouse/pull/55581) ([bhavuk2002](https://github.com/bhavuk2002)).
* Functions `toDayOfWeek()` (MySQL alias: `DAYOFWEEK()`), `toYearWeek()` (`YEARWEEK()`) and `toWeek()` (`WEEK()`) now supports `String` arguments. This makes its behavior consistent with MySQL's behavior. [#55589](https://github.com/ClickHouse/ClickHouse/pull/55589) ([Robert Schulze](https://github.com/rschu1ze)).
* Implement query parameters support for `ALTER TABLE ... ACTION PARTITION [ID] {parameter_name:ParameterType}`. Merges [#49516](https://github.com/ClickHouse/ClickHouse/issues/49516). Closes [#49449](https://github.com/ClickHouse/ClickHouse/issues/49449). [#55604](https://github.com/ClickHouse/ClickHouse/pull/55604) ([alesapin](https://github.com/alesapin)).
* Inverted indexes do not store tokens with too many matches (i.e. row ids in the posting list). This saves space and avoids ineffective index lookups when sequential scans would be equally fast or faster. The previous heuristics (`density` parameter passed to the index definition) that controlled when tokens would not be stored was too confusing for users. A much simpler heuristics based on parameter `max_rows_per_postings_list` (default: 64k) is introduced which directly controls the maximum allowed number of row ids in a postings list. [#55616](https://github.com/ClickHouse/ClickHouse/pull/55616) ([Harry Lee](https://github.com/HarryLeeIBM)).
* `SHOW COLUMNS` now correctly reports type `FixedString` as `BLOB` if setting `use_mysql_types_in_show_columns` is on. Also added two new settings, `mysql_map_string_to_text_in_show_columns` and `mysql_map_fixed_string_to_text_in_show_columns` to switch the output for types `String` and `FixedString` as `TEXT` or `BLOB`. [#55617](https://github.com/ClickHouse/ClickHouse/pull/55617) ([Serge Klochkov](https://github.com/slvrtrn)).
* During ReplicatedMergeTree tables startup clickhouse server checks set of parts for unexpected parts (exists locally, but not in zookeeper). All unexpected parts move to detached directory and instead of them server tries to restore some ancestor (covered) parts. Now server tries to restore closest ancestors instead of random covered parts. [#55645](https://github.com/ClickHouse/ClickHouse/pull/55645) ([alesapin](https://github.com/alesapin)).
* The advanced dashboard now supports draggable charts on touch devices. This closes [#54206](https://github.com/ClickHouse/ClickHouse/issues/54206). [#55649](https://github.com/ClickHouse/ClickHouse/pull/55649) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Introduced setting `date_time_overflow_behavior` with possible values `ignore`, `throw`, `saturate` that controls the overflow behavior when converting from Date, Date32, DateTime64, Integer or Float to Date, Date32, DateTime or DateTime64. [#55696](https://github.com/ClickHouse/ClickHouse/pull/55696) ([Andrey Zvonov](https://github.com/zvonand)).
* Improve write performance to rocksdb. [#55732](https://github.com/ClickHouse/ClickHouse/pull/55732) ([Duc Canh Le](https://github.com/canhld94)).
* Use the default query format if declared when outputting exception with http_write_exception_in_output_format. [#55739](https://github.com/ClickHouse/ClickHouse/pull/55739) ([Raúl Marín](https://github.com/Algunenano)).
* Use upstream repo for apache datasketches. [#55787](https://github.com/ClickHouse/ClickHouse/pull/55787) ([Nikita Taranov](https://github.com/nickitat)).
* Add support for SHOW MERGES query. [#55815](https://github.com/ClickHouse/ClickHouse/pull/55815) ([megao](https://github.com/jetgm)).
* Provide a better message for common MV pitfalls. [#55826](https://github.com/ClickHouse/ClickHouse/pull/55826) ([Raúl Marín](https://github.com/Algunenano)).
* Reduced memory consumption during loading of hierarchical dictionaries. [#55838](https://github.com/ClickHouse/ClickHouse/pull/55838) ([Nikita Taranov](https://github.com/nickitat)).
* All dictionaries support setting `dictionary_use_async_executor`. [#55839](https://github.com/ClickHouse/ClickHouse/pull/55839) ([vdimir](https://github.com/vdimir)).
* If you dropped the current database, you will still be able to run some queries in `clickhouse-local` and switch to another database. This makes the behavior consistent with `clickhouse-client`. This closes [#55834](https://github.com/ClickHouse/ClickHouse/issues/55834). [#55853](https://github.com/ClickHouse/ClickHouse/pull/55853) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Functions `(add|subtract)(Year|Quarter|Month|Week|Day|Hour|Minute|Second|Millisecond|Microsecond|Nanosecond)` now support string-encoded date arguments, e.g. `SELECT addDays('2023-10-22', 1)`. This increases compatibility with MySQL and is needed by Tableau Online. [#55869](https://github.com/ClickHouse/ClickHouse/pull/55869) ([Robert Schulze](https://github.com/rschu1ze)).
* Introduce setting `create_table_empty_primary_key_by_default` for default `ORDER BY ()`. [#55899](https://github.com/ClickHouse/ClickHouse/pull/55899) ([Srikanth Chekuri](https://github.com/srikanthccv)).
* Prevent excesive memory usage when deserializing AggregateFunctionTopKGenericData. [#55947](https://github.com/ClickHouse/ClickHouse/pull/55947) ([Raúl Marín](https://github.com/Algunenano)).
* The setting `apply_deleted_mask` when disabled allows to read rows that where marked as deleted by lightweight DELETE queries. This is useful for debugging. [#55952](https://github.com/ClickHouse/ClickHouse/pull/55952) ([Alexander Gololobov](https://github.com/davenger)).
* Allow skip null values when serailize tuple to json objects, which makes it possible to keep compatiability with spark to_json function, which is also useful for gluten. [#55956](https://github.com/ClickHouse/ClickHouse/pull/55956) ([李扬](https://github.com/taiyang-li)).
* Functions `(add|sub)Date()` now support string-encoded date arguments, e.g. `SELECT addDate('2023-10-22 11:12:13', INTERVAL 5 MINUTE)`. The same support for string-encoded date arguments is added to the plus and minus operators, e.g. `SELECT '2023-10-23' + INTERVAL 1 DAY`. This increases compatibility with MySQL and is needed by Tableau Online. [#55960](https://github.com/ClickHouse/ClickHouse/pull/55960) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow unquoted strings with CR in CSV format. Closes [#39930](https://github.com/ClickHouse/ClickHouse/issues/39930). [#56046](https://github.com/ClickHouse/ClickHouse/pull/56046) ([Kruglov Pavel](https://github.com/Avogar)).
* On a Keeper with lots of watches AsyncMetrics threads can consume 100% of CPU for noticable time in `DB::KeeperStorage::getSessionsWithWatchesCount()`. The fix is to avoid traversing heavy `watches` and `list_watches` sets. [#56054](https://github.com/ClickHouse/ClickHouse/pull/56054) ([Alexander Gololobov](https://github.com/davenger)).
* Allow to run `clickhouse-keeper` using embedded config. [#56086](https://github.com/ClickHouse/ClickHouse/pull/56086) ([Maksim Kita](https://github.com/kitaisreal)).
* Set limit of the maximum configuration value for queued.min.messages to avoid problem with start fetching data with Kafka. [#56121](https://github.com/ClickHouse/ClickHouse/pull/56121) ([Stas Morozov](https://github.com/r3b-fish)).
* Fixed typo in SQL function `minSampleSizeContinous` (renamed `minSampleSizeContinuous`). Old name is preserved for backward compatibility. This closes: [#56139](https://github.com/ClickHouse/ClickHouse/issues/56139). [#56143](https://github.com/ClickHouse/ClickHouse/pull/56143) ([Dorota Szeremeta](https://github.com/orotaday)).
* Print corrupted part path on disk before shutdown server. Before this change if a part is corrupted on disk and server cannot start, it was almost impossible to understand which part is broken. This is fixed. [#56181](https://github.com/ClickHouse/ClickHouse/pull/56181) ([Duc Canh Le](https://github.com/canhld94)).
#### Build/Testing/Packaging Improvement
* If the database is already initialized, it doesn't need to be initialized again upon subsequent launches. This can potentially fix the issue of infinite container restarts when the database fails to load within 1000 attempts (relevant for very large databases and multi-node setups). [#50724](https://github.com/ClickHouse/ClickHouse/pull/50724) ([Alexander Nikolaev](https://github.com/AlexNik)).
* Resource with source code including submodules is built in Darwin special build task. It may be used to build ClickHouse without checkouting submodules. [#51435](https://github.com/ClickHouse/ClickHouse/pull/51435) ([Ilya Yatsishin](https://github.com/qoega)).
* An error will occur when building ClickHouse with the avx series of instructions enabled. CMake command ```shell cmake .. -DCMAKE_BUILD_TYPE=Release -DENABLE_AVX=ON -DENABLE_AVX2=ON -DENABLE_AVX2_FOR_SPEC_OP=ON ``` Failed message ``` [1558/11630] Building CXX object contrib/snappy-cmake/CMakeFiles/_snappy.dir/__/snappy/snappy.cc.o FAILED: contrib/snappy-cmake/CMakeFiles/_snappy.dir/__/snappy/snappy.cc.o /usr/bin/ccache /usr/bin/clang++-17 --target=x86_64-linux-gnu --sysroot=/opt/ClickHouse/cmake/linux/../../contrib/sysroot/linux-x86_64/x86_64-linux-gnu/libc -DHAVE_CONFIG_H -DSTD_EXCEPTION_HAS_STACK_TRACE=1 -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -I/opt/ClickHouse/base/glibc-compatibility/memcpy -isystem /opt/ClickHouse/contrib/snappy -isystem /opt/ClickHouse/build/contrib/snappy-cmake -isystem /opt/ClickHouse/contrib/llvm-project/libcxx/include -isystem /opt/ClickHouse/contrib/llvm-project/libcxxabi/include -isystem /opt/ClickHouse/contrib/libunwind/include --gcc-toolchain=/opt/ClickHouse/cmake/linux/../../contrib/sysroot/linux-x86_64 -fdiagnostics-color=always -Wno-enum-constexpr-conversion -fsized-deallocation -pipe -mssse3 -msse4.1 -msse4.2 -mpclmul -mpopcnt -mavx -mavx2 -fasynchronous-unwind-tables -ffile-prefix-map=/opt/ClickHouse=. -falign-functions=32 -mbranches-within-32B-boundaries -fdiagnostics-absolute-paths -fstrict-vtable-pointers -w -O3 -DNDEBUG -D OS_LINUX -Werror -nostdinc++ -std=c++2b -MD -MT contrib/snappy-cmake/CMakeFiles/_snappy.dir/__/snappy/snappy.cc.o -MF contrib/snappy-cmake/CMakeFiles/_snappy.dir/__/snappy/snappy.cc.o.d -o contrib/snappy-cmake/CMakeFiles/_snappy.dir/__/snappy/snappy.cc.o -c /opt/ClickHouse/contrib/snappy/snappy.cc /opt/ClickHouse/contrib/snappy/snappy.cc:1061:3: error: unknown type name '__m256i' 1061 | __m256i data = _mm256_lddqu_si256(static_cast<const __m256i *>(src)); | ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1061:55: error: unknown type name '__m256i' 1061 | __m256i data = _mm256_lddqu_si256(static_cast<const __m256i *>(src)); | ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1061:18: error: use of undeclared identifier '_mm256_lddqu_si256'; did you mean '_mm_lddqu_si128'? 1061 | __m256i data = _mm256_lddqu_si256(static_cast<const __m256i *>(src)); | ^~~~~~~~~~~~~~~~~~ | _mm_lddqu_si128 /usr/lib/llvm-17/lib/clang/17/include/pmmintrin.h:38:1: note: '_mm_lddqu_si128' declared here 38 | _mm_lddqu_si128(__m128i_u const *__p) | ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1061:66: error: cannot initialize a parameter of type 'const __m128i_u *' with an lvalue of type 'const void *' 1061 | __m256i data = _mm256_lddqu_si256(static_cast<const __m256i *>(src)); | ^~~ /usr/lib/llvm-17/lib/clang/17/include/pmmintrin.h:38:34: note: passing argument to parameter '__p' here 38 | _mm_lddqu_si128(__m128i_u const *__p) | ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1062:40: error: unknown type name '__m256i' 1062 | _mm256_storeu_si256(reinterpret_cast<__m256i *>(dst), data); | ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1065:49: error: unknown type name '__m256i' 1065 | data = _mm256_lddqu_si256(static_cast<const __m256i *>(src) + 1); | ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1065:65: error: arithmetic on a pointer to void 1065 | data = _mm256_lddqu_si256(static_cast<const __m256i *>(src) + 1); | ~~~ ^ /opt/ClickHouse/contrib/snappy/snappy.cc:1066:42: error: unknown type name '__m256i' 1066 | _mm256_storeu_si256(reinterpret_cast<__m256i *>(dst) + 1, data); | ^ 8 errors generated. [1567/11630] Building CXX object contrib/rocksdb-cma...rocksdb.dir/__/rocksdb/db/arena_wrapped_db_iter.cc.o ninja: build stopped: subcommand failed. ``` The reason is that snappy does not enable SNAPPY_HAVE_X86_CRC32. [#55049](https://github.com/ClickHouse/ClickHouse/pull/55049) ([monchickey](https://github.com/monchickey)).
* Add `instance_env_variables` option to integration tests. [#55208](https://github.com/ClickHouse/ClickHouse/pull/55208) ([Arthur Passos](https://github.com/arthurpassos)).
* Solve issue with launching standalone clickhouse-keeper from clickhouse-server package. [#55226](https://github.com/ClickHouse/ClickHouse/pull/55226) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* In tests RabbitMQ version is updated to 3.12.6. Improved logs collection for RabbitMQ tests. [#55424](https://github.com/ClickHouse/ClickHouse/pull/55424) ([Ilya Yatsishin](https://github.com/qoega)).
* Fix integration check python script to use gh api url - Add Readme for CI tests. [#55476](https://github.com/ClickHouse/ClickHouse/pull/55476) ([Max K.](https://github.com/mkaynov)).
* Fix integration check python script to use gh api url - Add Readme for CI tests. [#55716](https://github.com/ClickHouse/ClickHouse/pull/55716) ([Max K.](https://github.com/mkaynov)).
* Check sha512 for tgz; use a proper repository for keeper; write only filenames to TGZ.sha512 files for tarball packages. Prerequisite for [#31473](https://github.com/ClickHouse/ClickHouse/issues/31473). [#55717](https://github.com/ClickHouse/ClickHouse/pull/55717) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Updated to get free port for azurite. [#55796](https://github.com/ClickHouse/ClickHouse/pull/55796) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Reduce the ouput info. [#55938](https://github.com/ClickHouse/ClickHouse/pull/55938) ([helifu](https://github.com/helifu)).
* Modified the error message difference between openssl and boringssl to fix the functional test. [#55975](https://github.com/ClickHouse/ClickHouse/pull/55975) ([MeenaRenganathan22](https://github.com/MeenaRenganathan22)).
* Changes to support the HDFS for s390x. [#56128](https://github.com/ClickHouse/ClickHouse/pull/56128) ([MeenaRenganathan22](https://github.com/MeenaRenganathan22)).
* Fix flaky test of jbod balancer by relaxing the Gini coefficient and introducing more determinism in insertions. [#56175](https://github.com/ClickHouse/ClickHouse/pull/56175) ([Amos Bird](https://github.com/amosbird)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* skip hardlinking inverted index files in mutation [#47663](https://github.com/ClickHouse/ClickHouse/pull/47663) ([cangyin](https://github.com/cangyin)).
* Fix 'Cannot find column' in read-in-order optimization with ARRAY JOIN [#51746](https://github.com/ClickHouse/ClickHouse/pull/51746) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Support missed Object(Nullable(json)) subcolumns in query. [#54052](https://github.com/ClickHouse/ClickHouse/pull/54052) ([zps](https://github.com/VanDarkholme7)).
* Re-add fix for `accurateCastOrNull()` [#54629](https://github.com/ClickHouse/ClickHouse/pull/54629) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Fix detecting DEFAULT for columns of a Distributed table created without AS [#55060](https://github.com/ClickHouse/ClickHouse/pull/55060) ([Vitaly Baranov](https://github.com/vitlibar)).
* Proper cleanup in case of exception in ctor of ShellCommandSource [#55103](https://github.com/ClickHouse/ClickHouse/pull/55103) ([Alexander Gololobov](https://github.com/davenger)).
* Fix deadlock in LDAP assigned role update [#55119](https://github.com/ClickHouse/ClickHouse/pull/55119) ([Julian Maicher](https://github.com/jmaicher)).
* Suppress error statistics update for internal exceptions [#55128](https://github.com/ClickHouse/ClickHouse/pull/55128) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix deadlock in backups [#55132](https://github.com/ClickHouse/ClickHouse/pull/55132) ([alesapin](https://github.com/alesapin)).
* Fix storage Iceberg files retrieval [#55144](https://github.com/ClickHouse/ClickHouse/pull/55144) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix partition pruning of extra columns in set. [#55172](https://github.com/ClickHouse/ClickHouse/pull/55172) ([Amos Bird](https://github.com/amosbird)).
* Fix recalculation of skip indexes in ALTER UPDATE queries when table has adaptive granularity [#55202](https://github.com/ClickHouse/ClickHouse/pull/55202) ([Duc Canh Le](https://github.com/canhld94)).
* Fix for background download in fs cache [#55252](https://github.com/ClickHouse/ClickHouse/pull/55252) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Avoid possible memory leaks in compressors in case of missing buffer finalization [#55262](https://github.com/ClickHouse/ClickHouse/pull/55262) ([Azat Khuzhin](https://github.com/azat)).
* Fix functions execution over sparse columns [#55275](https://github.com/ClickHouse/ClickHouse/pull/55275) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect merging of Nested for SELECT FINAL FROM SummingMergeTree [#55276](https://github.com/ClickHouse/ClickHouse/pull/55276) ([Azat Khuzhin](https://github.com/azat)).
* Fix bug with inability to drop detached partition in replicated merge tree on top of S3 without zero copy [#55309](https://github.com/ClickHouse/ClickHouse/pull/55309) ([alesapin](https://github.com/alesapin)).
* Fix SIGSEGV in MergeSortingPartialResultTransform (due to zero chunks after remerge()) [#55335](https://github.com/ClickHouse/ClickHouse/pull/55335) ([Azat Khuzhin](https://github.com/azat)).
* Fix data-race in CreatingSetsTransform (on errors) due to throwing shared exception [#55338](https://github.com/ClickHouse/ClickHouse/pull/55338) ([Azat Khuzhin](https://github.com/azat)).
* Fix trash optimization (up to a certain extent) [#55353](https://github.com/ClickHouse/ClickHouse/pull/55353) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix leak in StorageHDFS [#55370](https://github.com/ClickHouse/ClickHouse/pull/55370) ([Azat Khuzhin](https://github.com/azat)).
* Fix parsing of arrays in cast operator [#55417](https://github.com/ClickHouse/ClickHouse/pull/55417) ([Anton Popov](https://github.com/CurtizJ)).
* Fix filtering by virtual columns with OR filter in query [#55418](https://github.com/ClickHouse/ClickHouse/pull/55418) ([Azat Khuzhin](https://github.com/azat)).
* Fix MongoDB connection issues [#55419](https://github.com/ClickHouse/ClickHouse/pull/55419) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix MySQL interface boolean representation [#55427](https://github.com/ClickHouse/ClickHouse/pull/55427) ([Serge Klochkov](https://github.com/slvrtrn)).
* Fix MySQL text protocol DateTime formatting and LowCardinality(Nullable(T)) types reporting [#55479](https://github.com/ClickHouse/ClickHouse/pull/55479) ([Serge Klochkov](https://github.com/slvrtrn)).
* Make `use_mysql_types_in_show_columns` affect only `SHOW COLUMNS` [#55481](https://github.com/ClickHouse/ClickHouse/pull/55481) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix stack symbolizer parsing DW_FORM_ref_addr incorrectly and sometimes crashing [#55483](https://github.com/ClickHouse/ClickHouse/pull/55483) ([Michael Kolupaev](https://github.com/al13n321)).
* Destroy fiber in case of exception in cancelBefore in AsyncTaskExecutor [#55516](https://github.com/ClickHouse/ClickHouse/pull/55516) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix Query Parameters not working with custom HTTP handlers [#55521](https://github.com/ClickHouse/ClickHouse/pull/55521) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix checking of non handled data for Values format [#55527](https://github.com/ClickHouse/ClickHouse/pull/55527) ([Azat Khuzhin](https://github.com/azat)).
* Fix 'Invalid cursor state' in odbc interacting with MS SQL Server [#55558](https://github.com/ClickHouse/ClickHouse/pull/55558) ([vdimir](https://github.com/vdimir)).
* Fix max execution time and 'break' overflow mode [#55577](https://github.com/ClickHouse/ClickHouse/pull/55577) ([Alexander Gololobov](https://github.com/davenger)).
* Fix crash in QueryNormalizer with cyclic aliases [#55602](https://github.com/ClickHouse/ClickHouse/pull/55602) ([vdimir](https://github.com/vdimir)).
* Disable wrong optimization and add a test [#55609](https://github.com/ClickHouse/ClickHouse/pull/55609) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Merging [#52352](https://github.com/ClickHouse/ClickHouse/issues/52352) [#55621](https://github.com/ClickHouse/ClickHouse/pull/55621) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add a test to avoid incorrect decimal sorting [#55662](https://github.com/ClickHouse/ClickHouse/pull/55662) ([Amos Bird](https://github.com/amosbird)).
* Fix progress bar for s3 and azure Cluster functions with url without globs [#55666](https://github.com/ClickHouse/ClickHouse/pull/55666) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix filtering by virtual columns with OR filter in query (resubmit) [#55678](https://github.com/ClickHouse/ClickHouse/pull/55678) ([Azat Khuzhin](https://github.com/azat)).
* Fixes and improvements for Iceberg storage [#55695](https://github.com/ClickHouse/ClickHouse/pull/55695) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix data race in CreatingSetsTransform (v2) [#55786](https://github.com/ClickHouse/ClickHouse/pull/55786) ([Azat Khuzhin](https://github.com/azat)).
* Throw exception when parsing illegal string as float if precise_float_parsing is true [#55861](https://github.com/ClickHouse/ClickHouse/pull/55861) ([李扬](https://github.com/taiyang-li)).
* Disable predicate pushdown if the CTE contains stateful functions [#55871](https://github.com/ClickHouse/ClickHouse/pull/55871) ([Raúl Marín](https://github.com/Algunenano)).
* Fix normalize ASTSelectWithUnionQuery strip FORMAT of the query [#55887](https://github.com/ClickHouse/ClickHouse/pull/55887) ([flynn](https://github.com/ucasfl)).
* Try to fix possible segfault in Native ORC input format [#55891](https://github.com/ClickHouse/ClickHouse/pull/55891) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix window functions in case of sparse columns. [#55895](https://github.com/ClickHouse/ClickHouse/pull/55895) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* fix: StorageNull supports subcolumns [#55912](https://github.com/ClickHouse/ClickHouse/pull/55912) ([FFish](https://github.com/wxybear)).
* Do not write retriable errors for Replicated mutate/merge into error log [#55944](https://github.com/ClickHouse/ClickHouse/pull/55944) ([Azat Khuzhin](https://github.com/azat)).
* Fix `SHOW DATABASES LIMIT <N>` [#55962](https://github.com/ClickHouse/ClickHouse/pull/55962) ([Raúl Marín](https://github.com/Algunenano)).
* Fix autogenerated Protobuf schema with fields with underscore [#55974](https://github.com/ClickHouse/ClickHouse/pull/55974) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix dateTime64ToSnowflake64() with non-default scale [#55983](https://github.com/ClickHouse/ClickHouse/pull/55983) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix output/input of Arrow dictionary column [#55989](https://github.com/ClickHouse/ClickHouse/pull/55989) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix fetching schema from schema registry in AvroConfluent [#55991](https://github.com/ClickHouse/ClickHouse/pull/55991) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix 'Block structure mismatch' on concurrent ALTER and INSERTs in Buffer table [#55995](https://github.com/ClickHouse/ClickHouse/pull/55995) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix incorrect free space accounting for least_used JBOD policy [#56030](https://github.com/ClickHouse/ClickHouse/pull/56030) ([Azat Khuzhin](https://github.com/azat)).
* Fix missing scalar issue when evaluating subqueries inside table functions [#56057](https://github.com/ClickHouse/ClickHouse/pull/56057) ([Amos Bird](https://github.com/amosbird)).
* Fix wrong query result when http_write_exception_in_output_format=1 [#56135](https://github.com/ClickHouse/ClickHouse/pull/56135) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix schema cache for fallback JSON->JSONEachRow with changed settings [#56172](https://github.com/ClickHouse/ClickHouse/pull/56172) ([Kruglov Pavel](https://github.com/Avogar)).
* Add error handler to odbc-bridge [#56185](https://github.com/ClickHouse/ClickHouse/pull/56185) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Fix libssh+openssl3 & s390x (part 2)"'. [#55188](https://github.com/ClickHouse/ClickHouse/pull/55188) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Support SAMPLE BY for VIEW"'. [#55357](https://github.com/ClickHouse/ClickHouse/pull/55357) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Revert "refine error code of duplicated index in create query""'. [#55467](https://github.com/ClickHouse/ClickHouse/pull/55467) ([Han Fei](https://github.com/hanfei1991)).
* NO CL ENTRY: 'Update mysql.md - Remove the Private Preview Note'. [#55486](https://github.com/ClickHouse/ClickHouse/pull/55486) ([Ryadh DAHIMENE](https://github.com/Ryado)).
* NO CL ENTRY: 'Revert "Removed "maximize" and "drag" buttons from `dashboard` in case of single chart"'. [#55623](https://github.com/ClickHouse/ClickHouse/pull/55623) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Fix filtering by virtual columns with OR filter in query"'. [#55657](https://github.com/ClickHouse/ClickHouse/pull/55657) ([Antonio Andelic](https://github.com/antonio2368)).
* NO CL ENTRY: 'Revert "Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort"'. [#55682](https://github.com/ClickHouse/ClickHouse/pull/55682) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Integration check script fix ups"'. [#55694](https://github.com/ClickHouse/ClickHouse/pull/55694) ([alesapin](https://github.com/alesapin)).
* NO CL ENTRY: 'Revert "Fix 'Block structure mismatch' on concurrent ALTER and INSERTs in Buffer table"'. [#56103](https://github.com/ClickHouse/ClickHouse/pull/56103) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Add function getHttpHeader"'. [#56109](https://github.com/ClickHouse/ClickHouse/pull/56109) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "Fix output/input of Arrow dictionary column"'. [#56150](https://github.com/ClickHouse/ClickHouse/pull/56150) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Set defaults_for_omitted_fields to true for hive text format [#49486](https://github.com/ClickHouse/ClickHouse/pull/49486) ([李扬](https://github.com/taiyang-li)).
* Fixing join tests with analyzer [#49555](https://github.com/ClickHouse/ClickHouse/pull/49555) ([vdimir](https://github.com/vdimir)).
* Make exception about `ALTER TABLE ... DROP COLUMN|INDEX|PROJECTION` more clear [#50181](https://github.com/ClickHouse/ClickHouse/pull/50181) ([Alexander Gololobov](https://github.com/davenger)).
* ANTI JOIN: Invalid number of rows in Chunk [#50944](https://github.com/ClickHouse/ClickHouse/pull/50944) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Analyzer: fix row policy [#53170](https://github.com/ClickHouse/ClickHouse/pull/53170) ([Dmitry Novik](https://github.com/novikd)).
* Add a test with Block structure mismatch in grace hash join. [#53278](https://github.com/ClickHouse/ClickHouse/pull/53278) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Support skip_unused_shards in Analyzer [#53282](https://github.com/ClickHouse/ClickHouse/pull/53282) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Revert "Revert "Planner prepare filters for analysis"" [#53792](https://github.com/ClickHouse/ClickHouse/pull/53792) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add setting allow_experimental_partial_result [#54514](https://github.com/ClickHouse/ClickHouse/pull/54514) ([vdimir](https://github.com/vdimir)).
* Fix CI skip build and skip tests checks [#54532](https://github.com/ClickHouse/ClickHouse/pull/54532) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* `MergeTreeData::forcefullyMovePartToDetachedAndRemoveFromMemory` does not respect mutations [#54653](https://github.com/ClickHouse/ClickHouse/pull/54653) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix azure test by using unique names [#54738](https://github.com/ClickHouse/ClickHouse/pull/54738) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Use `--filter` to reduce checkout time [#54857](https://github.com/ClickHouse/ClickHouse/pull/54857) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Remove tests [#54873](https://github.com/ClickHouse/ClickHouse/pull/54873) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Remove useless path from lockSharedData in StorageReplicatedMergeTree [#54989](https://github.com/ClickHouse/ClickHouse/pull/54989) ([Mike Kot](https://github.com/myrrc)).
* Fix broken test [#55002](https://github.com/ClickHouse/ClickHouse/pull/55002) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update version_date.tsv and changelogs after v23.8.3.48-lts [#55063](https://github.com/ClickHouse/ClickHouse/pull/55063) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Use `source` instead of `bash` for pre-build script [#55071](https://github.com/ClickHouse/ClickHouse/pull/55071) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Update s3queue.md to add experimental flag [#55093](https://github.com/ClickHouse/ClickHouse/pull/55093) ([Peignon Melvyn](https://github.com/melvynator)).
* Clean data dir and always start an old server version in aggregate functions compatibility test. [#55105](https://github.com/ClickHouse/ClickHouse/pull/55105) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Update version_date.tsv and changelogs after v23.9.1.1854-stable [#55118](https://github.com/ClickHouse/ClickHouse/pull/55118) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Add links to check reports in status comment [#55122](https://github.com/ClickHouse/ClickHouse/pull/55122) ([vdimir](https://github.com/vdimir)).
* Bump croaring to 2.0.2 [#55127](https://github.com/ClickHouse/ClickHouse/pull/55127) ([Robert Schulze](https://github.com/rschu1ze)).
* check if block is empty after async insert retries [#55143](https://github.com/ClickHouse/ClickHouse/pull/55143) ([Han Fei](https://github.com/hanfei1991)).
* Improve linker detection on macOS [#55147](https://github.com/ClickHouse/ClickHouse/pull/55147) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix libssh+openssl3 & s390x [#55154](https://github.com/ClickHouse/ClickHouse/pull/55154) ([Boris Kuschel](https://github.com/bkuschel)).
* Fix file cache temporary file segment range in FileSegment::reserve [#55164](https://github.com/ClickHouse/ClickHouse/pull/55164) ([vdimir](https://github.com/vdimir)).
* Fix libssh+openssl3 & s390x (part 2) [#55187](https://github.com/ClickHouse/ClickHouse/pull/55187) ([Boris Kuschel](https://github.com/bkuschel)).
* Fix wrong test name [#55190](https://github.com/ClickHouse/ClickHouse/pull/55190) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix libssh+openssl3 & s390x (Part 2.1) [#55192](https://github.com/ClickHouse/ClickHouse/pull/55192) ([Boris Kuschel](https://github.com/bkuschel)).
* Reset signals caught by clickhouse-client if a pager is in use [#55193](https://github.com/ClickHouse/ClickHouse/pull/55193) ([Azat Khuzhin](https://github.com/azat)).
* Update README.md [#55209](https://github.com/ClickHouse/ClickHouse/pull/55209) ([Tyler Hannan](https://github.com/tylerhannan)).
* remove the blocker to grow the metadata file version [#55218](https://github.com/ClickHouse/ClickHouse/pull/55218) ([Sema Checherinda](https://github.com/CheSema)).
* Fix syntax highlight in client for spaceship operator [#55224](https://github.com/ClickHouse/ClickHouse/pull/55224) ([Azat Khuzhin](https://github.com/azat)).
* Fix mypy errors [#55228](https://github.com/ClickHouse/ClickHouse/pull/55228) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Upgrade MinIO to support accepting non signed requests [#55245](https://github.com/ClickHouse/ClickHouse/pull/55245) ([Azat Khuzhin](https://github.com/azat)).
* tests: switch test_throttling to S3 over https to make it more production like [#55247](https://github.com/ClickHouse/ClickHouse/pull/55247) ([Azat Khuzhin](https://github.com/azat)).
* Evaluate defaults during async insert safer [#55253](https://github.com/ClickHouse/ClickHouse/pull/55253) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix data race in context [#55260](https://github.com/ClickHouse/ClickHouse/pull/55260) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Do not allow tests with state ERROR be overwritten by PASSED [#55261](https://github.com/ClickHouse/ClickHouse/pull/55261) ([Azat Khuzhin](https://github.com/azat)).
* Fix query formatting for SYSTEM queries [#55277](https://github.com/ClickHouse/ClickHouse/pull/55277) ([Azat Khuzhin](https://github.com/azat)).
* Context added TSA [#55278](https://github.com/ClickHouse/ClickHouse/pull/55278) ([Maksim Kita](https://github.com/kitaisreal)).
* Improve logging in query cache [#55296](https://github.com/ClickHouse/ClickHouse/pull/55296) ([Robert Schulze](https://github.com/rschu1ze)).
* MaterializedPostgreSQL: remove back check [#55297](https://github.com/ClickHouse/ClickHouse/pull/55297) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Make `use_mysql_types_in_show_columns` independent from the connection [#55298](https://github.com/ClickHouse/ClickHouse/pull/55298) ([Robert Schulze](https://github.com/rschu1ze)).
* Make `HandlingRuleHTTPHandlerFactory` more stupid, but less error prone [#55307](https://github.com/ClickHouse/ClickHouse/pull/55307) ([alesapin](https://github.com/alesapin)).
* Fix tsan issue in croaring [#55311](https://github.com/ClickHouse/ClickHouse/pull/55311) ([Robert Schulze](https://github.com/rschu1ze)).
* Refactorings and better documentation for `toStartOfInterval()` [#55327](https://github.com/ClickHouse/ClickHouse/pull/55327) ([Robert Schulze](https://github.com/rschu1ze)).
* Review [#51946](https://github.com/ClickHouse/ClickHouse/issues/51946) and partially revert it [#55336](https://github.com/ClickHouse/ClickHouse/pull/55336) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update README.md [#55339](https://github.com/ClickHouse/ClickHouse/pull/55339) ([Tyler Hannan](https://github.com/tylerhannan)).
* Context locks small fixes [#55352](https://github.com/ClickHouse/ClickHouse/pull/55352) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix bad test `01605_dictinct_two_level` [#55354](https://github.com/ClickHouse/ClickHouse/pull/55354) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix test_max_rows_to_read_leaf_with_view flakiness (due to prefer_localhost_replica) [#55355](https://github.com/ClickHouse/ClickHouse/pull/55355) ([Azat Khuzhin](https://github.com/azat)).
* Better exception messages [#55356](https://github.com/ClickHouse/ClickHouse/pull/55356) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update CHANGELOG.md [#55359](https://github.com/ClickHouse/ClickHouse/pull/55359) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Better recursion depth check [#55361](https://github.com/ClickHouse/ClickHouse/pull/55361) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Merging [#43085](https://github.com/ClickHouse/ClickHouse/issues/43085) [#55362](https://github.com/ClickHouse/ClickHouse/pull/55362) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix data-race in web disk [#55372](https://github.com/ClickHouse/ClickHouse/pull/55372) ([Azat Khuzhin](https://github.com/azat)).
* Disable skim under TSan (Rust does not supports ThreadSanitizer) [#55378](https://github.com/ClickHouse/ClickHouse/pull/55378) ([Azat Khuzhin](https://github.com/azat)).
* Fix missing thread accounting for insert_distributed_sync=1 [#55392](https://github.com/ClickHouse/ClickHouse/pull/55392) ([Azat Khuzhin](https://github.com/azat)).
* Improve tests for untuple() [#55425](https://github.com/ClickHouse/ClickHouse/pull/55425) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix test that never worked test_rabbitmq_random_detach [#55453](https://github.com/ClickHouse/ClickHouse/pull/55453) ([Ilya Yatsishin](https://github.com/qoega)).
* Fix out of bound error in system.remote_data_paths + disk web [#55468](https://github.com/ClickHouse/ClickHouse/pull/55468) ([alesapin](https://github.com/alesapin)).
* Remove existing moving/ dir if allow_remove_stale_moving_parts is off [#55480](https://github.com/ClickHouse/ClickHouse/pull/55480) ([Mike Kot](https://github.com/myrrc)).
* Bump curl to 8.4 [#55492](https://github.com/ClickHouse/ClickHouse/pull/55492) ([Robert Schulze](https://github.com/rschu1ze)).
* Minor fixes for 02882_replicated_fetch_checksums_doesnt_match.sql [#55493](https://github.com/ClickHouse/ClickHouse/pull/55493) ([vdimir](https://github.com/vdimir)).
* AggregatingTransform initGenerate race condition fix [#55495](https://github.com/ClickHouse/ClickHouse/pull/55495) ([Maksim Kita](https://github.com/kitaisreal)).
* HashTable resize exception handle fix [#55497](https://github.com/ClickHouse/ClickHouse/pull/55497) ([Maksim Kita](https://github.com/kitaisreal)).
* fix lots of 'Structure does not match' warnings in ci [#55503](https://github.com/ClickHouse/ClickHouse/pull/55503) ([Han Fei](https://github.com/hanfei1991)).
* Cleanup: parallel replica coordinator usage [#55515](https://github.com/ClickHouse/ClickHouse/pull/55515) ([Igor Nikonov](https://github.com/devcrafter)).
* add k-morozov to trusted contributors [#55523](https://github.com/ClickHouse/ClickHouse/pull/55523) ([Mike Kot](https://github.com/myrrc)).
* Forbid create inverted index if setting not enabled [#55529](https://github.com/ClickHouse/ClickHouse/pull/55529) ([flynn](https://github.com/ucasfl)).
* Better exception messages but without SEGFAULT [#55541](https://github.com/ClickHouse/ClickHouse/pull/55541) ([Antonio Andelic](https://github.com/antonio2368)).
* Avoid setting same promise twice [#55553](https://github.com/ClickHouse/ClickHouse/pull/55553) ([Antonio Andelic](https://github.com/antonio2368)).
* Better error message in case when merge selecting task failed. [#55554](https://github.com/ClickHouse/ClickHouse/pull/55554) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Add a test [#55564](https://github.com/ClickHouse/ClickHouse/pull/55564) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Added healthcheck for LDAP [#55571](https://github.com/ClickHouse/ClickHouse/pull/55571) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix global context for tests with --gtest_filter [#55583](https://github.com/ClickHouse/ClickHouse/pull/55583) ([Azat Khuzhin](https://github.com/azat)).
* Fix replica groups for Replicated database engine [#55587](https://github.com/ClickHouse/ClickHouse/pull/55587) ([Azat Khuzhin](https://github.com/azat)).
* Remove unused protobuf includes [#55590](https://github.com/ClickHouse/ClickHouse/pull/55590) ([Raúl Marín](https://github.com/Algunenano)).
* Apply Context changes to standalone Keeper [#55591](https://github.com/ClickHouse/ClickHouse/pull/55591) ([Antonio Andelic](https://github.com/antonio2368)).
* Do not fail if label-to-remove does not exists in PR [#55592](https://github.com/ClickHouse/ClickHouse/pull/55592) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* CI: cast extra column expression `pull_request_number` to Int32 [#55599](https://github.com/ClickHouse/ClickHouse/pull/55599) ([Han Fei](https://github.com/hanfei1991)).
* Add back a test that was removed by mistake [#55605](https://github.com/ClickHouse/ClickHouse/pull/55605) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Bump croaring to v2.0.4 [#55606](https://github.com/ClickHouse/ClickHouse/pull/55606) ([Robert Schulze](https://github.com/rschu1ze)).
* byteswap: Add 16/32-byte integer support [#55607](https://github.com/ClickHouse/ClickHouse/pull/55607) ([Robert Schulze](https://github.com/rschu1ze)).
* Revert [#54421](https://github.com/ClickHouse/ClickHouse/issues/54421) [#55613](https://github.com/ClickHouse/ClickHouse/pull/55613) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix: race condition in kusto implementation [#55615](https://github.com/ClickHouse/ClickHouse/pull/55615) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Remove passed tests from `analyzer_tech_debt.txt` [#55618](https://github.com/ClickHouse/ClickHouse/pull/55618) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Enable 02161_addressToLineWithInlines [#55622](https://github.com/ClickHouse/ClickHouse/pull/55622) ([Michael Kolupaev](https://github.com/al13n321)).
* KeyCondition: preparation [#55625](https://github.com/ClickHouse/ClickHouse/pull/55625) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix flakiness of test_system_merges (by increasing sleep interval properly) [#55627](https://github.com/ClickHouse/ClickHouse/pull/55627) ([Azat Khuzhin](https://github.com/azat)).
* fix `structure does not match` logs again [#55628](https://github.com/ClickHouse/ClickHouse/pull/55628) ([Han Fei](https://github.com/hanfei1991)).
* KeyCondition: small changes [#55640](https://github.com/ClickHouse/ClickHouse/pull/55640) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Resubmit [#54421](https://github.com/ClickHouse/ClickHouse/issues/54421) [#55641](https://github.com/ClickHouse/ClickHouse/pull/55641) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix some typos [#55646](https://github.com/ClickHouse/ClickHouse/pull/55646) ([alesapin](https://github.com/alesapin)).
* Show move/maximize only if there is more than a single chart [#55648](https://github.com/ClickHouse/ClickHouse/pull/55648) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Enable test_query_is_lock_free[detach table] for the analyzer [#55668](https://github.com/ClickHouse/ClickHouse/pull/55668) ([Raúl Marín](https://github.com/Algunenano)).
* Allow FINAL with parallel replicas with custom key [#55679](https://github.com/ClickHouse/ClickHouse/pull/55679) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix StorageMaterializedView::isRemote [#55681](https://github.com/ClickHouse/ClickHouse/pull/55681) ([vdimir](https://github.com/vdimir)).
* Bump gRPC to 1.34.1 [#55693](https://github.com/ClickHouse/ClickHouse/pull/55693) ([Robert Schulze](https://github.com/rschu1ze)).
* Randomize block_number column setting in ci [#55713](https://github.com/ClickHouse/ClickHouse/pull/55713) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Use pool for proxied S3 disk http sessions [#55718](https://github.com/ClickHouse/ClickHouse/pull/55718) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Analyzer: fix block stucture mismatch in matview with engine distributed [#55741](https://github.com/ClickHouse/ClickHouse/pull/55741) ([vdimir](https://github.com/vdimir)).
* Use diff object again, since JSON API limits the files [#55750](https://github.com/ClickHouse/ClickHouse/pull/55750) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Big endian platform max intersection fix [#55756](https://github.com/ClickHouse/ClickHouse/pull/55756) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
* Remove temporary debug logging in MultiplexedConnections [#55764](https://github.com/ClickHouse/ClickHouse/pull/55764) ([Michael Kolupaev](https://github.com/al13n321)).
* Check if partition ID is `nullptr` [#55765](https://github.com/ClickHouse/ClickHouse/pull/55765) ([Antonio Andelic](https://github.com/antonio2368)).
* Control Keeper feature flag randomization with env [#55766](https://github.com/ClickHouse/ClickHouse/pull/55766) ([Antonio Andelic](https://github.com/antonio2368)).
* Added test to check CapnProto cache [#55769](https://github.com/ClickHouse/ClickHouse/pull/55769) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Query Cache: Only cache initial query [#55771](https://github.com/ClickHouse/ClickHouse/pull/55771) ([zhongyuankai](https://github.com/zhongyuankai)).
* Temporarily disable flaky test [#55772](https://github.com/ClickHouse/ClickHouse/pull/55772) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix test test_postgresql_replica_database_engine_2/test.py::test_replica_consumer [#55774](https://github.com/ClickHouse/ClickHouse/pull/55774) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort fix [#55775](https://github.com/ClickHouse/ClickHouse/pull/55775) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix black check [#55779](https://github.com/ClickHouse/ClickHouse/pull/55779) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Correctly grep fuzzer.log [#55780](https://github.com/ClickHouse/ClickHouse/pull/55780) ([Antonio Andelic](https://github.com/antonio2368)).
* Parallel replicas: cleanup, less copying during announcement [#55781](https://github.com/ClickHouse/ClickHouse/pull/55781) ([Igor Nikonov](https://github.com/devcrafter)).
* Enable test_mutation_simple with the analyzer [#55791](https://github.com/ClickHouse/ClickHouse/pull/55791) ([Raúl Marín](https://github.com/Algunenano)).
* Improve enrich image [#55793](https://github.com/ClickHouse/ClickHouse/pull/55793) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Bump gRPC to v1.36.4 [#55811](https://github.com/ClickHouse/ClickHouse/pull/55811) ([Robert Schulze](https://github.com/rschu1ze)).
* Attemp to fix test_dictionaries_redis flakiness [#55813](https://github.com/ClickHouse/ClickHouse/pull/55813) ([Raúl Marín](https://github.com/Algunenano)).
* Add diagnostic checks for issue [#55041](https://github.com/ClickHouse/ClickHouse/issues/55041) [#55835](https://github.com/ClickHouse/ClickHouse/pull/55835) ([Robert Schulze](https://github.com/rschu1ze)).
* Correct aggregate functions ser/deserialization to be endianness-independent. [#55837](https://github.com/ClickHouse/ClickHouse/pull/55837) ([Austin Kothig](https://github.com/kothiga)).
* Bump gRPC to v1.37.1 [#55840](https://github.com/ClickHouse/ClickHouse/pull/55840) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix 00002_log_and_exception_messages_formatting [#55844](https://github.com/ClickHouse/ClickHouse/pull/55844) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix caching objects in pygithub, and changelogs [#55845](https://github.com/ClickHouse/ClickHouse/pull/55845) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update version_date.tsv and changelogs after v23.3.14.78-lts [#55847](https://github.com/ClickHouse/ClickHouse/pull/55847) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v23.9.2.56-stable [#55848](https://github.com/ClickHouse/ClickHouse/pull/55848) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v23.8.4.69-lts [#55849](https://github.com/ClickHouse/ClickHouse/pull/55849) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* fix node setting in the test [#55850](https://github.com/ClickHouse/ClickHouse/pull/55850) ([Sema Checherinda](https://github.com/CheSema)).
* Add load_metadata_threads to describe filesystem cache [#55863](https://github.com/ClickHouse/ClickHouse/pull/55863) ([Jordi Villar](https://github.com/jrdi)).
* One final leftover in diff_urls of PRInfo [#55874](https://github.com/ClickHouse/ClickHouse/pull/55874) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix digest check in replicated ddl worker [#55877](https://github.com/ClickHouse/ClickHouse/pull/55877) ([Sergei Trifonov](https://github.com/serxa)).
* Test parallel replicas with rollup [#55886](https://github.com/ClickHouse/ClickHouse/pull/55886) ([Raúl Marín](https://github.com/Algunenano)).
* Fix some tests with Replicated database [#55889](https://github.com/ClickHouse/ClickHouse/pull/55889) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update stress.py [#55890](https://github.com/ClickHouse/ClickHouse/pull/55890) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Revert "Revert "Revert "Add settings for real-time updates during query execution""" [#55893](https://github.com/ClickHouse/ClickHouse/pull/55893) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Make test `system_zookeeper_connection` better [#55900](https://github.com/ClickHouse/ClickHouse/pull/55900) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* A test `01019_alter_materialized_view_consistent` is unstable with Analyzer [#55901](https://github.com/ClickHouse/ClickHouse/pull/55901) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove C++ templates, because they are stupid [#55910](https://github.com/ClickHouse/ClickHouse/pull/55910) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Bump gRPC to v1.39.1 [#55914](https://github.com/ClickHouse/ClickHouse/pull/55914) ([Robert Schulze](https://github.com/rschu1ze)).
* Add sanity check to RPNBuilderFunctionTreeNode [#55915](https://github.com/ClickHouse/ClickHouse/pull/55915) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump gRPC to v1.42.0 [#55916](https://github.com/ClickHouse/ClickHouse/pull/55916) ([Robert Schulze](https://github.com/rschu1ze)).
* Set storage.has_lightweight_delete_parts flag when a part has been loaded [#55935](https://github.com/ClickHouse/ClickHouse/pull/55935) ([Alexander Gololobov](https://github.com/davenger)).
* Include information about supported versions in bug report issue template [#55937](https://github.com/ClickHouse/ClickHouse/pull/55937) ([Nikita Taranov](https://github.com/nickitat)).
* arrayFold: Switch accumulator and array arguments [#55948](https://github.com/ClickHouse/ClickHouse/pull/55948) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix overrides via connections_credentials in case of root directives exists [#55949](https://github.com/ClickHouse/ClickHouse/pull/55949) ([Azat Khuzhin](https://github.com/azat)).
* Test for Bug 43644 [#55955](https://github.com/ClickHouse/ClickHouse/pull/55955) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump protobuf to v3.19.6 [#55963](https://github.com/ClickHouse/ClickHouse/pull/55963) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix possible performance test error [#55964](https://github.com/ClickHouse/ClickHouse/pull/55964) ([Azat Khuzhin](https://github.com/azat)).
* Avoid counting lost parts twice [#55987](https://github.com/ClickHouse/ClickHouse/pull/55987) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Convert unnecessary std::scoped_lock usage to std::lock_guard [#56006](https://github.com/ClickHouse/ClickHouse/pull/56006) ([Robert Schulze](https://github.com/rschu1ze)).
* Stress tests: Try to wait until server is responsive after gdb detach [#56009](https://github.com/ClickHouse/ClickHouse/pull/56009) ([Raúl Marín](https://github.com/Algunenano)).
* test_storage_s3_queue - add debug info [#56011](https://github.com/ClickHouse/ClickHouse/pull/56011) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Bump protobuf to v21.9 [#56014](https://github.com/ClickHouse/ClickHouse/pull/56014) ([Robert Schulze](https://github.com/rschu1ze)).
* Correct the implementation of function `jsonMergePatch` [#56020](https://github.com/ClickHouse/ClickHouse/pull/56020) ([Anton Popov](https://github.com/CurtizJ)).
* Fix 02438_sync_replica_lightweight [#56023](https://github.com/ClickHouse/ClickHouse/pull/56023) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix some bad code by making it worse [#56026](https://github.com/ClickHouse/ClickHouse/pull/56026) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix bash completion for mawk (and update format list and add one more delimiter) [#56050](https://github.com/ClickHouse/ClickHouse/pull/56050) ([Azat Khuzhin](https://github.com/azat)).
* Analyzer: Fix crash on window resolve [#56055](https://github.com/ClickHouse/ClickHouse/pull/56055) ([Dmitry Novik](https://github.com/novikd)).
* Fix function_json_value_return_type_allow_nullable setting name in doc [#56056](https://github.com/ClickHouse/ClickHouse/pull/56056) ([vdimir](https://github.com/vdimir)).
* Force shutdown in upgrade test [#56074](https://github.com/ClickHouse/ClickHouse/pull/56074) ([Raúl Marín](https://github.com/Algunenano)).
* Try enable `01154_move_partition_long` with s3 [#56080](https://github.com/ClickHouse/ClickHouse/pull/56080) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix race condition between DROP_RANGE and committing existing block [#56083](https://github.com/ClickHouse/ClickHouse/pull/56083) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix flakiness of 02263_lazy_mark_load [#56087](https://github.com/ClickHouse/ClickHouse/pull/56087) ([Michael Kolupaev](https://github.com/al13n321)).
* Make the code less bloated [#56091](https://github.com/ClickHouse/ClickHouse/pull/56091) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Maybe smaller binary [#56112](https://github.com/ClickHouse/ClickHouse/pull/56112) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove old trash from unit tests [#56113](https://github.com/ClickHouse/ClickHouse/pull/56113) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove some bloat [#56114](https://github.com/ClickHouse/ClickHouse/pull/56114) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix test_format_schema_on_server flakiness [#56116](https://github.com/ClickHouse/ClickHouse/pull/56116) ([Azat Khuzhin](https://github.com/azat)).
* Fix: schedule delayed part checks correctly [#56123](https://github.com/ClickHouse/ClickHouse/pull/56123) ([Igor Nikonov](https://github.com/devcrafter)).
* Beautify `show merges` [#56124](https://github.com/ClickHouse/ClickHouse/pull/56124) ([Denny Crane](https://github.com/den-crane)).
* Better options for disabling frame pointer omitting [#56130](https://github.com/ClickHouse/ClickHouse/pull/56130) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix: incorrect brace style in clang-format [#56133](https://github.com/ClickHouse/ClickHouse/pull/56133) ([Igor Nikonov](https://github.com/devcrafter)).
* Do not try to activate covered parts when handilng unexpected parts [#56137](https://github.com/ClickHouse/ClickHouse/pull/56137) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Re-fix 'Block structure mismatch' on concurrent ALTER and INSERTs in Buffer table [#56140](https://github.com/ClickHouse/ClickHouse/pull/56140) ([Michael Kolupaev](https://github.com/al13n321)).
* Update version_date.tsv and changelogs after v23.3.15.29-lts [#56145](https://github.com/ClickHouse/ClickHouse/pull/56145) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v23.8.5.16-lts [#56146](https://github.com/ClickHouse/ClickHouse/pull/56146) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v23.9.3.12-stable [#56147](https://github.com/ClickHouse/ClickHouse/pull/56147) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v23.7.6.111-stable [#56148](https://github.com/ClickHouse/ClickHouse/pull/56148) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Fasttest timeout setting [#56160](https://github.com/ClickHouse/ClickHouse/pull/56160) ([Max K.](https://github.com/mkaynov)).
* Use monotonic clock for part check scheduling [#56162](https://github.com/ClickHouse/ClickHouse/pull/56162) ([Igor Nikonov](https://github.com/devcrafter)).
* More metrics for fs cache [#56165](https://github.com/ClickHouse/ClickHouse/pull/56165) ([Kseniia Sumarokova](https://github.com/kssenii)).
* FileCache minor changes [#56168](https://github.com/ClickHouse/ClickHouse/pull/56168) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update 01414_mutations_and_errors_zookeeper.sh [#56176](https://github.com/ClickHouse/ClickHouse/pull/56176) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Shard fs cache keys [#56194](https://github.com/ClickHouse/ClickHouse/pull/56194) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Do less work when there lots of read requests and watches for same paths [#56197](https://github.com/ClickHouse/ClickHouse/pull/56197) ([Alexander Gololobov](https://github.com/davenger)).
* Remove skip_unused_shards tests from analyzer skiplist [#56200](https://github.com/ClickHouse/ClickHouse/pull/56200) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Easy tests fix for analyzer [#56211](https://github.com/ClickHouse/ClickHouse/pull/56211) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add a warning if delayed accounting is not enabled (breaks OSIOWaitMicroseconds) [#56227](https://github.com/ClickHouse/ClickHouse/pull/56227) ([Azat Khuzhin](https://github.com/azat)).
* Do not remove part if `Too many open files` is thrown [#56238](https://github.com/ClickHouse/ClickHouse/pull/56238) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix ORC commit [#56261](https://github.com/ClickHouse/ClickHouse/pull/56261) ([Raúl Marín](https://github.com/Algunenano)).
* Fix typo in largestTriangleThreeBuckets.md [#56263](https://github.com/ClickHouse/ClickHouse/pull/56263) ([Nikita Taranov](https://github.com/nickitat)).

View File

@ -11,6 +11,8 @@ This is intended for continuous integration checks that run on Linux servers. If
The cross-build for macOS is based on the [Build instructions](../development/build.md), follow them first.
The following sections provide a walk-through for building ClickHouse for `x86_64` macOS. If youre targeting ARM architecture, simply substitute all occurrences of `x86_64` with `aarch64`. For example, replace `x86_64-apple-darwin` with `aarch64-apple-darwin` throughout the steps.
## Install Clang-17
Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup.
@ -30,13 +32,13 @@ export CCTOOLS=$(cd ~/cctools && pwd)
mkdir ${CCTOOLS}
cd ${CCTOOLS}
git clone https://github.com/tpoechtrager/apple-libtapi.git
git clone --depth=1 https://github.com/tpoechtrager/apple-libtapi.git
cd apple-libtapi
INSTALLPREFIX=${CCTOOLS} ./build.sh
./install.sh
cd ..
git clone https://github.com/tpoechtrager/cctools-port.git
git clone --depth=1 https://github.com/tpoechtrager/cctools-port.git
cd cctools-port/cctools
./configure --prefix=$(readlink -f ${CCTOOLS}) --with-libtapi=$(readlink -f ${CCTOOLS}) --target=x86_64-apple-darwin
make install
@ -46,7 +48,7 @@ Also, we need to download macOS X SDK into the working tree.
``` bash
cd ClickHouse/cmake/toolchain/darwin-x86_64
curl -L 'https://github.com/phracker/MacOSX-SDKs/releases/download/10.15/MacOSX10.15.sdk.tar.xz' | tar xJ --strip-components=1
curl -L 'https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz' | tar xJ --strip-components=1
```
## Build ClickHouse {#build-clickhouse}

View File

@ -23,7 +23,7 @@ sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
``` bash
cd ClickHouse
mkdir build-riscv64
CC=clang-16 CXX=clang++-16 cmake . -Bbuild-riscv64 -G Ninja -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-riscv64.cmake -DGLIBC_COMPATIBILITY=OFF -DENABLE_LDAP=OFF -DOPENSSL_NO_ASM=ON -DENABLE_JEMALLOC=ON -DENABLE_PARQUET=OFF -DENABLE_GRPC=OFF -DENABLE_HDFS=OFF -DENABLE_MYSQL=OFF
CC=clang-17 CXX=clang++-17 cmake . -Bbuild-riscv64 -G Ninja -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-riscv64.cmake -DGLIBC_COMPATIBILITY=OFF -DENABLE_LDAP=OFF -DOPENSSL_NO_ASM=ON -DENABLE_JEMALLOC=ON -DENABLE_PARQUET=OFF -DENABLE_GRPC=OFF -DENABLE_HDFS=OFF -DENABLE_MYSQL=OFF
ninja -C build-riscv64
```

View File

@ -2762,3 +2762,7 @@ Proxy settings are determined in the following order:
ClickHouse will check the highest priority resolver type for the request protocol. If it is not defined,
it will check the next highest priority resolver type, until it reaches the environment resolver.
This also allows a mix of resolver types can be used.
### disable_tunneling_for_https_requests_over_http_proxy {#disable_tunneling_for_https_requests_over_http_proxy}
By default, tunneling (i.e, `HTTP CONNECT`) is used to make `HTTPS` requests over `HTTP` proxy. This setting can be used to disable it.

View File

@ -172,7 +172,27 @@ If you set `timeout_before_checking_execution_speed `to 0, ClickHouse will use c
## timeout_overflow_mode {#timeout-overflow-mode}
What to do if the query is run longer than max_execution_time: throw or break. By default, throw.
What to do if the query is run longer than `max_execution_time`: `throw` or `break`. By default, `throw`.
# max_execution_time_leaf
Similar semantic to `max_execution_time` but only apply on leaf node for distributed or remote queries.
For example, if we want to limit execution time on leaf node to `10s` but no limit on the initial node, instead of having `max_execution_time` in the nested subquery settings:
``` sql
SELECT count() FROM cluster(cluster, view(SELECT * FROM t SETTINGS max_execution_time = 10));
```
We can use `max_execution_time_leaf` as the query settings:
``` sql
SELECT count() FROM cluster(cluster, view(SELECT * FROM t)) SETTINGS max_execution_time_leaf = 10;
```
# timeout_overflow_mode_leaf
What to do when the query in leaf node run longer than `max_execution_time_leaf`: `throw` or `break`. By default, `throw`.
## min_execution_speed {#min-execution-speed}

View File

@ -3310,22 +3310,11 @@ Possible values:
Default value: `0`.
## use_mysql_types_in_show_columns {#use_mysql_types_in_show_columns}
Show the names of MySQL data types corresponding to ClickHouse data types in [SHOW COLUMNS](../../sql-reference/statements/show.md#show_columns).
Possible values:
- 0 - Show names of native ClickHouse data types.
- 1 - Show names of MySQL data types corresponding to ClickHouse data types.
Default value: `0`.
## mysql_map_string_to_text_in_show_columns {#mysql_map_string_to_text_in_show_columns}
When enabled, [String](../../sql-reference/data-types/string.md) ClickHouse data type will be displayed as `TEXT` in [SHOW COLUMNS](../../sql-reference/statements/show.md#show_columns).
Has effect only when [use_mysql_types_in_show_columns](#use_mysql_types_in_show_columns) is enabled.
Has an effect only when the connection is made through the MySQL wire protocol.
- 0 - Use `BLOB`.
- 1 - Use `TEXT`.
@ -3336,7 +3325,7 @@ Default value: `0`.
When enabled, [FixedString](../../sql-reference/data-types/fixedstring.md) ClickHouse data type will be displayed as `TEXT` in [SHOW COLUMNS](../../sql-reference/statements/show.md#show_columns).
Has effect only when [use_mysql_types_in_show_columns](#use_mysql_types_in_show_columns) is enabled.
Has an effect only when the connection is made through the MySQL wire protocol.
- 0 - Use `BLOB`.
- 1 - Use `TEXT`.
@ -3966,6 +3955,10 @@ Possible values:
Default value: `1`.
:::note
`alter_sync` is applicable to `Replicated` tables only, it does nothing to alters of not `Replicated` tables.
:::
## replication_wait_for_inactive_replica_timeout {#replication-wait-for-inactive-replica-timeout}
Specifies how long (in seconds) to wait for inactive replicas to execute [ALTER](../../sql-reference/statements/alter/index.md), [OPTIMIZE](../../sql-reference/statements/optimize.md) or [TRUNCATE](../../sql-reference/statements/truncate.md) queries.
@ -4808,3 +4801,10 @@ LIFETIME(MIN 0 MAX 3600)
LAYOUT(COMPLEX_KEY_HASHED_ARRAY())
SETTINGS(dictionary_use_async_executor=1, max_threads=8);
```
## storage_metadata_write_full_object_key {#storage_metadata_write_full_object_key}
When set to `true` the metadata files are written with `VERSION_FULL_OBJECT_KEY` format version. With that format full object storage key names are written to the metadata files.
When set to `false` the metadata files are written with the previous format version, `VERSION_INLINE_DATA`. With that format only suffixes of object storage key names are are written to the metadata files. The prefix for all of object storage key names is set in configurations files at `storage_configuration.disks` section.
Default value: `false`.

View File

@ -60,7 +60,7 @@ SELECT largestTriangleThreeBuckets(4)(x, y) FROM largestTriangleThreeBuckets_tes
Result:
``` text
┌────────largestTriangleThreeBuckets(3)(x, y)───────────┐
┌────────largestTriangleThreeBuckets(4)(x, y)───────────┐
│ [(1,10),(3,15),(5,40),(10,70)] │
└───────────────────────────────────────────────────────┘
```

View File

@ -1557,10 +1557,10 @@ Returns for a given date, the number of days passed since [1 January 0000](https
toDaysSinceYearZero(date[, time_zone])
```
Aliases: `TO_DAYS`
Alias: `TO_DAYS`
**Arguments**
- `date` — The date to calculate the number of days passed since year zero from. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `time_zone` — A String type const value or a expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
@ -1584,6 +1584,56 @@ Result:
└────────────────────────────────────────────┘
```
**See Also**
- [fromDaysSinceYearZero](#fromDaysSinceYearZero)
## fromDaysSinceYearZero
Returns for a given number of days passed since [1 January 0000](https://en.wikipedia.org/wiki/Year_zero) the corresponding date in the [proleptic Gregorian calendar defined by ISO 8601](https://en.wikipedia.org/wiki/Gregorian_calendar#Proleptic_Gregorian_calendar). The calculation is the same as in MySQL's [`FROM_DAYS()`](https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_from-days) function.
The result is undefined if it cannot be represented within the bounds of the [Date](../../sql-reference/data-types/date.md) type.
**Syntax**
``` sql
fromDaysSinceYearZero(days)
```
Alias: `FROM_DAYS`
**Arguments**
- `days` — The number of days passed since year zero.
**Returned value**
The date corresponding to the number of days passed since year zero.
Type: [Date](../../sql-reference/data-types/date.md).
**Example**
``` sql
SELECT fromDaysSinceYearZero(739136), fromDaysSinceYearZero(toDaysSinceYearZero(toDate('2023-09-08')));
```
Result:
``` text
┌─fromDaysSinceYearZero(739136)─┬─fromDaysSinceYearZero(toDaysSinceYearZero(toDate('2023-09-08')))─┐
│ 2023-09-08 │ 2023-09-08 │
└───────────────────────────────┴──────────────────────────────────────────────────────────────────┘
```
**See Also**
- [toDaysSinceYearZero](#toDaysSinceYearZero)
## fromDaysSinceYearZero32
Like [fromDaysSinceYearZero](#fromDaysSinceYearZero) but returns a [Date32](../../sql-reference/data-types/date32.md).
## age
Returns the `unit` component of the difference between `startdate` and `enddate`. The difference is calculated using a precision of 1 microsecond.

View File

@ -2760,10 +2760,13 @@ message Root
Returns a formatted, possibly multi-line, version of the given SQL query.
Throws an exception if the query is not well-formed. To return `NULL` instead, function `formatQueryOrNull()` may be used.
**Syntax**
```sql
formatQuery(query)
formatQueryOrNull(query)
```
**Arguments**
@ -2796,10 +2799,13 @@ WHERE (a > 3) AND (b < 3) │
Like formatQuery() but the returned formatted string contains no line breaks.
Throws an exception if the query is not well-formed. To return `NULL` instead, function `formatQuerySingleLineOrNull()` may be used.
**Syntax**
```sql
formatQuerySingleLine(query)
formatQuerySingleLineOrNull(query)
```
**Arguments**

View File

@ -689,47 +689,47 @@ Calculates the [hamming distance](https://en.wikipedia.org/wiki/Hamming_distance
**Syntax**
```sql
byteHammingDistance(string2, string2)
byteHammingDistance(string1, string2)
```
**Examples**
``` sql
SELECT byteHammingDistance('abc', 'ab') ;
SELECT byteHammingDistance('karolin', 'kathrin');
```
Result:
``` text
┌─byteHammingDistance('abc', 'ab')─┐
1
└──────────────────────────────────┘
┌─byteHammingDistance('karolin', 'kathrin')─┐
3
└───────────────────────────────────────────
```
- Alias: mismatches
Alias: mismatches
## jaccardIndex
## stringJaccardIndex
Calculates the [Jaccard similarity index](https://en.wikipedia.org/wiki/Jaccard_index) between two byte strings.
**Syntax**
```sql
byteJaccardIndex(string1, string2)
stringJaccardIndex(string1, string2)
```
**Examples**
``` sql
SELECT jaccardIndex('clickhouse', 'mouse');
SELECT stringJaccardIndex('clickhouse', 'mouse');
```
Result:
``` text
┌─jaccardIndex('clickhouse', 'mouse')─┐
│ 0.4 │
└─────────────────────────────────────────┘
┌─stringJaccardIndex('clickhouse', 'mouse')─┐
0.4 │
└───────────────────────────────────────────
```
## editDistance
@ -752,8 +752,8 @@ Result:
``` text
┌─editDistance('clickhouse', 'mouse')─┐
6 │
└─────────────────────────────────────────
│ 6 │
└─────────────────────────────────────┘
```
- Alias: levenshteinDistance
Alias: levenshteinDistance

View File

@ -10,10 +10,10 @@ Attaches a table or a dictionary, for example, when moving a database to another
**Syntax**
``` sql
ATTACH TABLE|DICTIONARY [IF NOT EXISTS] [db.]name [ON CLUSTER cluster] ...
ATTACH TABLE|DICTIONARY|DATABASE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster] ...
```
The query does not create data on the disk, but assumes that data is already in the appropriate places, and just adds information about the table or the dictionary to the server. After executing the `ATTACH` query, the server will know about the existence of the table or the dictionary.
The query does not create data on the disk, but assumes that data is already in the appropriate places, and just adds information about the specified table, dictionary or database to the server. After executing the `ATTACH` query, the server will know about the existence of the table, dictionary or database.
If a table was previously detached ([DETACH](../../sql-reference/statements/detach.md) query), meaning that its structure is known, you can use shorthand without defining the structure.
@ -79,3 +79,13 @@ Attaches a previously detached dictionary.
``` sql
ATTACH DICTIONARY [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
```
## Attach Existing Database
Attaches a previously detached database.
**Syntax**
``` sql
ATTACH DATABASE [IF NOT EXISTS] name [ENGINE=<database engine>] [ON CLUSTER cluster]
```

View File

@ -136,8 +136,30 @@ Output:
└──────────────┴───────────┴──────────────────────────────────────────┘
```
If the checksums.txt file is missing, it can be restored. It will be recalculated and rewritten during the execution of the CHECK TABLE command for the specific partition, and the status will still be reported as 'success.'"
If the checksums.txt file is missing, it can be restored. It will be recalculated and rewritten during the execution of the CHECK TABLE command for the specific partition, and the status will still be reported as 'is_passed = 1'.
You can check all existing `(Replicated)MergeTree` tables at once by using the `CHECK ALL TABLES` query.
```sql
CHECK ALL TABLES
FORMAT PrettyCompactMonoBlock
SETTINGS check_query_single_value_result = 0
```
```text
┌─database─┬─table────┬─part_path───┬─is_passed─┬─message─┐
│ default │ t2 │ all_1_95_3 │ 1 │ │
│ db1 │ table_01 │ all_39_39_0 │ 1 │ │
│ default │ t1 │ all_39_39_0 │ 1 │ │
│ db1 │ t1 │ all_39_39_0 │ 1 │ │
│ db1 │ table_01 │ all_1_6_1 │ 1 │ │
│ default │ t1 │ all_1_6_1 │ 1 │ │
│ db1 │ t1 │ all_1_6_1 │ 1 │ │
│ db1 │ table_01 │ all_7_38_2 │ 1 │ │
│ db1 │ t1 │ all_7_38_2 │ 1 │ │
│ default │ t1 │ all_7_38_2 │ 1 │ │
└──────────┴──────────┴─────────────┴───────────┴─────────┘
```
## If the Data Is Corrupted

View File

@ -228,7 +228,7 @@ hex(hexed): 5A90B714
Calculated columns (synonym). Column of this type are not stored in the table and it is not possible to INSERT values into them.
When SELECT queries explicitly reference columns of this type, the value is computed at query time from `expr`. By default, `SELECT *` excludes ALIAS columns. This behavior can be disabled with setting `asteriks_include_alias_columns`.
When SELECT queries explicitly reference columns of this type, the value is computed at query time from `expr`. By default, `SELECT *` excludes ALIAS columns. This behavior can be disabled with setting `asterisk_include_alias_columns`.
When using the ALTER query to add new columns, old data for these columns is not written. Instead, when reading old data that does not have values for the new columns, expressions are computed on the fly by default. However, if running the expressions requires different columns that are not indicated in the query, these columns will additionally be read, but only for the blocks of data that need it.

View File

@ -5,17 +5,17 @@ sidebar_label: DETACH
title: "DETACH Statement"
---
Makes the server "forget" about the existence of a table, a materialized view, or a dictionary.
Makes the server "forget" about the existence of a table, a materialized view, a dictionary, or a database.
**Syntax**
``` sql
DETACH TABLE|VIEW|DICTIONARY [IF EXISTS] [db.]name [ON CLUSTER cluster] [PERMANENTLY] [SYNC]
DETACH TABLE|VIEW|DICTIONARY|DATABASE [IF EXISTS] [db.]name [ON CLUSTER cluster] [PERMANENTLY] [SYNC]
```
Detaching does not delete the data or metadata of a table, a materialized view or a dictionary. If an entity was not detached `PERMANENTLY`, on the next server launch the server will read the metadata and recall the table/view/dictionary again. If an entity was detached `PERMANENTLY`, there will be no automatic recall.
Detaching does not delete the data or metadata of a table, a materialized view, a dictionary or a database. If an entity was not detached `PERMANENTLY`, on the next server launch the server will read the metadata and recall the table/view/dictionary/database again. If an entity was detached `PERMANENTLY`, there will be no automatic recall.
Whether a table or a dictionary was detached permanently or not, in both cases you can reattach them using the [ATTACH](../../sql-reference/statements/attach.md) query.
Whether a table, a dictionary or a database was detached permanently or not, in both cases you can reattach them using the [ATTACH](../../sql-reference/statements/attach.md) query.
System log tables can be also attached back (e.g. `query_log`, `text_log`, etc). Other system tables can't be reattached. On the next server launch the server will recall those tables again.
`ATTACH MATERIALIZED VIEW` does not work with short syntax (without `SELECT`), but you can attach it using the `ATTACH TABLE` query.

View File

@ -207,7 +207,7 @@ The optional keyword `FULL` causes the output to include the collation, comment
The statement produces a result table with the following structure:
- `field` - The name of the column (String)
- `type` - The column data type. If setting `[use_mysql_types_in_show_columns](../../operations/settings/settings.md#use_mysql_types_in_show_columns) = 1` (default: 0), then the equivalent type name in MySQL is shown. (String)
- `type` - The column data type. If the query was made through the MySQL wire protocol, then the equivalent type name in MySQL is shown. (String)
- `null` - `YES` if the column data type is Nullable, `NO` otherwise (String)
- `key` - `PRI` if the column is part of the primary key, `SOR` if the column is part of the sorting key, empty otherwise (String)
- `default` - Default expression of the column if it is of type `ALIAS`, `DEFAULT`, or `MATERIALIZED`, otherwise `NULL`. (Nullable(String))

View File

@ -536,6 +536,16 @@ static void sanityChecks(Server & server)
{
}
try
{
const char * filename = "/proc/sys/kernel/task_delayacct";
if (readNumber(filename) == 0)
server.context()->addWarningMessage("Delay accounting is not enabled, OSIOWaitMicroseconds will not be gathered. Check " + String(filename));
}
catch (...) // NOLINT(bugprone-empty-catch)
{
}
std::string dev_id = getBlockDeviceId(data_path);
if (getBlockDeviceType(dev_id) == BlockDeviceType::ROT && getBlockDeviceReadAheadBytes(dev_id) == 0)
server.context()->addWarningMessage("Rotational disk with disabled readahead is in use. Performance can be degraded. Used for data: " + String(data_path));

View File

@ -1472,6 +1472,15 @@
<!-- <disable_internal_dns_cache>1</disable_internal_dns_cache> -->
<!-- You can also configure rocksdb like this: -->
<!-- Full list of options:
- options:
- https://github.com/facebook/rocksdb/blob/4b013dcbed2df84fde3901d7655b9b91c557454d/include/rocksdb/options.h#L1452
- column_family_options:
- https://github.com/facebook/rocksdb/blob/4b013dcbed2df84fde3901d7655b9b91c557454d/include/rocksdb/options.h#L66
- block_based_table_options:
- https://github.com/facebook/rocksdb/blob/4b013dcbed2df84fde3901d7655b9b91c557454d/table/block_based/block_based_table_factory.cc#L228
- https://github.com/facebook/rocksdb/blob/4b013dcbed2df84fde3901d7655b9b91c557454d/include/rocksdb/table.h#L129
-->
<!--
<rocksdb>
<options>
@ -1480,6 +1489,9 @@
<column_family_options>
<num_levels>2</num_levels>
</column_family_options>
<block_based_table_options>
<block_size>1024</block_size>
</block_based_table_options>
<tables>
<table>
<name>TABLE</name>
@ -1489,6 +1501,9 @@
<column_family_options>
<num_levels>2</num_levels>
</column_family_options>
<block_based_table_options>
<block_size>1024</block_size>
</block_based_table_options>
</table>
</tables>
</rocksdb>

View File

@ -95,7 +95,7 @@ public:
size_t size = set.size();
writeVarUInt(size, buf);
for (const auto & elem : set)
writeIntBinary(elem, buf);
writeBinaryLittleEndian(elem.key, buf);
}
void deserialize(AggregateDataPtr __restrict place, ReadBuffer & buf, std::optional<size_t> /* version */, Arena *) const override

View File

@ -72,14 +72,14 @@ public:
{
writeBinary(has(), buf);
if (has())
writeBinary(value, buf);
writeBinaryLittleEndian(value, buf);
}
void read(ReadBuffer & buf, const ISerialization & /*serialization*/, Arena *)
{
readBinary(has_value, buf);
if (has())
readBinary(value, buf);
readBinaryLittleEndian(value, buf);
}
@ -1275,13 +1275,13 @@ struct AggregateFunctionAnyHeavyData : Data
void write(WriteBuffer & buf, const ISerialization & serialization) const
{
Data::write(buf, serialization);
writeBinary(counter, buf);
writeBinaryLittleEndian(counter, buf);
}
void read(ReadBuffer & buf, const ISerialization & serialization, Arena * arena)
{
Data::read(buf, serialization, arena);
readBinary(counter, buf);
readBinaryLittleEndian(counter, buf);
}
static const char * name() { return "anyHeavy"; }

View File

@ -34,7 +34,7 @@ struct TheilsUData : CrossTabData
for (const auto & [key, value] : count_ab)
{
Float64 value_ab = value;
Float64 value_b = count_b.at(key.items[1]);
Float64 value_b = count_b.at(key.items[UInt128::_impl::little(1)]);
dep += (value_ab / count) * log(value_ab / value_b);
}

View File

@ -35,8 +35,8 @@ struct AggregateFunctionMapCombinatorData
using SearchType = KeyType;
std::unordered_map<KeyType, AggregateDataPtr> merged_maps;
static void writeKey(KeyType key, WriteBuffer & buf) { writeBinary(key, buf); }
static void readKey(KeyType & key, ReadBuffer & buf) { readBinary(key, buf); }
static void writeKey(KeyType key, WriteBuffer & buf) { writeBinaryLittleEndian(key, buf); }
static void readKey(KeyType & key, ReadBuffer & buf) { readBinaryLittleEndian(key, buf); }
};
template <>

View File

@ -98,8 +98,8 @@ struct CrossTabData
Float64 chi_squared = 0;
for (const auto & [key, value_ab] : count_ab)
{
Float64 value_a = count_a.at(key.items[0]);
Float64 value_b = count_b.at(key.items[1]);
Float64 value_a = count_a.at(key.items[UInt128::_impl::little(0)]);
Float64 value_b = count_b.at(key.items[UInt128::_impl::little(1)]);
Float64 expected_value_ab = (value_a * value_b) / count;

View File

@ -44,7 +44,7 @@ using FunctionOverloadResolverPtr = std::shared_ptr<IFunctionOverloadResolver>;
class FunctionNode;
using FunctionNodePtr = std::shared_ptr<FunctionNode>;
enum class FunctionKind
enum class FunctionKind : UInt8
{
UNKNOWN,
ORDINARY,

View File

@ -15,6 +15,11 @@
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
/** Visitor that traverse query tree in depth.
* Derived class must implement `visitImpl` method.
* Additionally subclass can control if child need to be visited using `needChildVisit` method, by
@ -99,7 +104,7 @@ using ConstInDepthQueryTreeVisitor = InDepthQueryTreeVisitor<Derived, true /*con
* 2. enterImpl This method is called before children are processed.
* 3. leaveImpl This method is called after children are processed.
*/
template <typename Derived, bool const_visitor = false>
template <typename Derived>
class InDepthQueryTreeVisitorWithContext
{
public:
@ -169,31 +174,45 @@ private:
return *static_cast<Derived *>(this);
}
bool shouldSkipSubtree(
void visitChildIfNeeded(
VisitQueryTreeNodeType & parent,
VisitQueryTreeNodeType & child,
size_t subtree_index)
VisitQueryTreeNodeType & child)
{
bool need_visit_child = getDerived().needChildVisit(parent, child);
if (!need_visit_child)
return true;
return;
// We do not visit ListNode with arguments of TableFunctionNode directly, because
// we need to know which arguments are in the unresolved state.
// It must be safe because we do not modify table function arguments list in any visitor.
if (auto * table_function_node = parent->as<TableFunctionNode>())
{
if (child != table_function_node->getArgumentsNode())
throw Exception(ErrorCodes::LOGICAL_ERROR, "TableFunctioNode is expected to have only one child node");
const auto & unresolved_indexes = table_function_node->getUnresolvedArgumentIndexes();
return std::find(unresolved_indexes.begin(), unresolved_indexes.end(), subtree_index) != unresolved_indexes.end();
size_t index = 0;
for (auto & argument : table_function_node->getArguments())
{
if (std::find(unresolved_indexes.begin(),
unresolved_indexes.end(),
index) == unresolved_indexes.end())
visit(argument);
++index;
}
return;
}
return false;
else
visit(child);
}
void visitChildren(VisitQueryTreeNodeType & expression)
{
size_t index = 0;
for (auto & child : expression->getChildren())
{
if (child && !shouldSkipSubtree(expression, child, index))
visit(child);
++index;
if (child)
visitChildIfNeeded(expression, child);
}
}

View File

@ -5,6 +5,7 @@
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Analyzer/ConstantNode.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/Utils.h>
namespace DB
{
@ -12,10 +13,13 @@ namespace DB
namespace
{
class IfConstantConditionVisitor : public InDepthQueryTreeVisitor<IfConstantConditionVisitor>
class IfConstantConditionVisitor : public InDepthQueryTreeVisitorWithContext<IfConstantConditionVisitor>
{
public:
static void visitImpl(QueryTreeNodePtr & node)
using Base = InDepthQueryTreeVisitorWithContext<IfConstantConditionVisitor>;
using Base::Base;
void enterImpl(QueryTreeNodePtr & node)
{
auto * function_node = node->as<FunctionNode>();
if (!function_node || (function_node->getFunctionName() != "if" && function_node->getFunctionName() != "multiIf"))
@ -40,18 +44,22 @@ public:
else
return;
QueryTreeNodePtr argument_node;
if (condition_boolean_value)
node = function_node->getArguments().getNodes()[1];
argument_node = function_node->getArguments().getNodes()[1];
else
node = function_node->getArguments().getNodes()[2];
argument_node = function_node->getArguments().getNodes()[2];
if (node->getResultType()->equals(*argument_node->getResultType()))
node = argument_node;
}
};
}
void IfConstantConditionPass::run(QueryTreeNodePtr query_tree_node, ContextPtr)
void IfConstantConditionPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
IfConstantConditionVisitor visitor;
IfConstantConditionVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}

View File

@ -10,6 +10,8 @@
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/FunctionFactory.h>
@ -513,6 +515,31 @@ private:
bool has_function = false;
};
inline AggregateFunctionPtr resolveAggregateFunction(FunctionNode * function_node)
{
Array parameters;
for (const auto & param : function_node->getParameters())
{
auto * constant = param->as<ConstantNode>();
parameters.push_back(constant->getValue());
}
const auto & function_node_argument_nodes = function_node->getArguments().getNodes();
DataTypes argument_types;
argument_types.reserve(function_node_argument_nodes.size());
for (const auto & function_node_argument : function_node_argument_nodes)
argument_types.emplace_back(function_node_argument->getResultType());
AggregateFunctionProperties properties;
return AggregateFunctionFactory::instance().get(
function_node->getFunctionName(),
argument_types,
parameters,
properties);
}
}
bool hasFunctionNode(const QueryTreeNodePtr & node, std::string_view function_name)
@ -573,6 +600,32 @@ void replaceColumns(QueryTreeNodePtr & node,
visitor.visit(node);
}
void rerunFunctionResolve(FunctionNode * function_node, ContextPtr context)
{
if (!function_node->isResolved())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Trying to rerun resolve of unresolved function '{}'", function_node->getFunctionName());
const auto & name = function_node->getFunctionName();
if (function_node->isOrdinaryFunction())
{
// Special case, don't need to be resolved. It must be processed by GroupingFunctionsResolvePass.
if (name == "grouping")
return;
auto function = FunctionFactory::instance().get(name, context);
function_node->resolveAsFunction(function->build(function_node->getArgumentColumns()));
}
else if (function_node->isAggregateFunction())
{
if (name == "nothing")
return;
function_node->resolveAsAggregateFunction(resolveAggregateFunction(function_node));
}
else if (function_node->isWindowFunction())
{
function_node->resolveAsWindowFunction(resolveAggregateFunction(function_node));
}
}
namespace
{
@ -605,4 +658,5 @@ NameSet collectIdentifiersFullNames(const QueryTreeNodePtr & node)
visitor.visit(node);
return out;
}
}

View File

@ -7,6 +7,8 @@
namespace DB
{
class FunctionNode;
/// Returns true if node part of root tree, false otherwise
bool isNodePartOfTree(const IQueryTreeNode * node, const IQueryTreeNode * root);
@ -83,6 +85,10 @@ void replaceColumns(QueryTreeNodePtr & node,
const QueryTreeNodePtr & table_expression_node,
const std::unordered_map<std::string, QueryTreeNodePtr> & column_name_to_node);
/** Resolve function node again using it's content.
* This function should be called when arguments or parameters are changed.
*/
void rerunFunctionResolve(FunctionNode * function_node, ContextPtr context);
/// Just collect all identifiers from query tree
NameSet collectIdentifiersFullNames(const QueryTreeNodePtr & node);

View File

@ -43,6 +43,7 @@ public:
std::shared_ptr<IBackupCoordination> getBackupCoordination() const { return backup_coordination; }
const ReadSettings & getReadSettings() const { return read_settings; }
ContextPtr getContext() const { return context; }
const ZooKeeperRetriesInfo & getZooKeeperRetriesInfo() const { return global_zookeeper_retries_info; }
/// Adds a backup entry which will be later returned by run().
/// These function can be called by implementations of IStorage::backupData() in inherited storage classes.

View File

@ -330,10 +330,6 @@ if (TARGET ch_contrib::cpuid)
target_link_libraries(clickhouse_common_io PRIVATE ch_contrib::cpuid)
endif()
if (TARGET ch_contrib::crc32_s390x)
target_link_libraries(clickhouse_common_io PUBLIC ch_contrib::crc32_s390x)
endif()
if (TARGET ch_contrib::crc32-vpmsum)
target_link_libraries(clickhouse_common_io PUBLIC ch_contrib::crc32-vpmsum)
endif()

View File

@ -13,8 +13,9 @@ namespace DB
static constexpr auto PROXY_HTTP_ENVIRONMENT_VARIABLE = "http_proxy";
static constexpr auto PROXY_HTTPS_ENVIRONMENT_VARIABLE = "https_proxy";
EnvironmentProxyConfigurationResolver::EnvironmentProxyConfigurationResolver(Protocol protocol_)
: protocol(protocol_)
EnvironmentProxyConfigurationResolver::EnvironmentProxyConfigurationResolver(
Protocol request_protocol_, bool disable_tunneling_for_https_requests_over_http_proxy_)
: ProxyConfigurationResolver(request_protocol_, disable_tunneling_for_https_requests_over_http_proxy_)
{}
namespace
@ -37,7 +38,7 @@ namespace
ProxyConfiguration EnvironmentProxyConfigurationResolver::resolve()
{
const auto * proxy_host = getProxyHost(protocol);
const auto * proxy_host = getProxyHost(request_protocol);
if (!proxy_host)
{
@ -54,7 +55,9 @@ ProxyConfiguration EnvironmentProxyConfigurationResolver::resolve()
return ProxyConfiguration {
host,
ProxyConfiguration::protocolFromString(scheme),
port
port,
useTunneling(request_protocol, ProxyConfiguration::protocolFromString(scheme), disable_tunneling_for_https_requests_over_http_proxy),
request_protocol
};
}

View File

@ -11,13 +11,10 @@ namespace DB
class EnvironmentProxyConfigurationResolver : public ProxyConfigurationResolver
{
public:
explicit EnvironmentProxyConfigurationResolver(Protocol protocol_);
EnvironmentProxyConfigurationResolver(Protocol request_protocol, bool disable_tunneling_for_https_requests_over_http_proxy_ = false);
ProxyConfiguration resolve() override;
void errorReport(const ProxyConfiguration &) override {}
private:
Protocol protocol;
};
}

View File

@ -51,7 +51,7 @@ void abortOnFailedAssertion(const String & description)
}
bool terminate_on_any_exception = false;
static int terminate_status_code = 128 + SIGABRT;
thread_local bool update_error_statistics = true;
/// - Aborts the process if error code is LOGICAL_ERROR.
@ -92,7 +92,7 @@ Exception::Exception(const MessageMasked & msg_masked, int code, bool remote_)
, remote(remote_)
{
if (terminate_on_any_exception)
std::terminate();
std::_Exit(terminate_status_code);
capture_thread_frame_pointers = thread_frame_pointers;
handle_error_code(msg_masked.msg, code, remote, getStackFramePointers());
}
@ -102,7 +102,7 @@ Exception::Exception(MessageMasked && msg_masked, int code, bool remote_)
, remote(remote_)
{
if (terminate_on_any_exception)
std::terminate();
std::_Exit(terminate_status_code);
capture_thread_frame_pointers = thread_frame_pointers;
handle_error_code(message(), code, remote, getStackFramePointers());
}
@ -111,7 +111,7 @@ Exception::Exception(CreateFromPocoTag, const Poco::Exception & exc)
: Poco::Exception(exc.displayText(), ErrorCodes::POCO_EXCEPTION)
{
if (terminate_on_any_exception)
std::terminate();
std::_Exit(terminate_status_code);
capture_thread_frame_pointers = thread_frame_pointers;
#ifdef STD_EXCEPTION_HAS_STACK_TRACE
auto * stack_trace_frames = exc.get_stack_trace_frames();
@ -125,7 +125,7 @@ Exception::Exception(CreateFromSTDTag, const std::exception & exc)
: Poco::Exception(demangle(typeid(exc).name()) + ": " + String(exc.what()), ErrorCodes::STD_EXCEPTION)
{
if (terminate_on_any_exception)
std::terminate();
std::_Exit(terminate_status_code);
capture_thread_frame_pointers = thread_frame_pointers;
#ifdef STD_EXCEPTION_HAS_STACK_TRACE
auto * stack_trace_frames = exc.get_stack_trace_frames();

View File

@ -37,7 +37,9 @@ static struct InitFiu
ONCE(replicated_merge_tree_insert_quorum_fail_0) \
REGULAR(use_delayed_remote_source) \
REGULAR(cluster_discovery_faults) \
REGULAR(check_table_query_delay_for_part) \
REGULAR(dummy_failpoint) \
REGULAR(prefetched_reader_pool_failpoint) \
PAUSEABLE_ONCE(dummy_pausable_failpoint_once) \
PAUSEABLE(dummy_pausable_failpoint)

View File

@ -54,30 +54,7 @@ inline DB::UInt64 intHash64(DB::UInt64 x)
#endif
#if defined(__s390x__) && __BYTE_ORDER__==__ORDER_BIG_ENDIAN__
#include <crc32-s390x.h>
inline uint32_t s390x_crc32_u8(uint32_t crc, uint8_t v)
{
return crc32c_le_vx(crc, reinterpret_cast<unsigned char *>(&v), sizeof(v));
}
inline uint32_t s390x_crc32_u16(uint32_t crc, uint16_t v)
{
v = std::byteswap(v);
return crc32c_le_vx(crc, reinterpret_cast<unsigned char *>(&v), sizeof(v));
}
inline uint32_t s390x_crc32_u32(uint32_t crc, uint32_t v)
{
v = std::byteswap(v);
return crc32c_le_vx(crc, reinterpret_cast<unsigned char *>(&v), sizeof(v));
}
inline uint64_t s390x_crc32(uint64_t crc, uint64_t v)
{
v = std::byteswap(v);
return crc32c_le_vx(static_cast<uint32_t>(crc), reinterpret_cast<unsigned char *>(&v), sizeof(uint64_t));
}
#include <base/crc32c_s390x.h>
#endif
/// NOTE: Intel intrinsic can be confusing.
@ -92,7 +69,7 @@ inline DB::UInt64 intHashCRC32(DB::UInt64 x)
#elif (defined(__PPC64__) || defined(__powerpc64__)) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
return crc32_ppc(-1U, reinterpret_cast<const unsigned char *>(&x), sizeof(x));
#elif defined(__s390x__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
return s390x_crc32(-1U, x);
return s390x_crc32c(-1U, x);
#else
/// On other platforms we do not have CRC32. NOTE This can be confusing.
/// NOTE: consider using intHash32()
@ -108,7 +85,7 @@ inline DB::UInt64 intHashCRC32(DB::UInt64 x, DB::UInt64 updated_value)
#elif (defined(__PPC64__) || defined(__powerpc64__)) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
return crc32_ppc(updated_value, reinterpret_cast<const unsigned char *>(&x), sizeof(x));
#elif defined(__s390x__) && __BYTE_ORDER__==__ORDER_BIG_ENDIAN__
return s390x_crc32(updated_value, x);
return s390x_crc32c(updated_value, x);
#else
/// On other platforms we do not have CRC32. NOTE This can be confusing.
return intHash64(x) ^ updated_value;
@ -358,7 +335,7 @@ struct UInt128Hash
{
size_t operator()(UInt128 x) const
{
return CityHash_v1_0_2::Hash128to64({x.items[0], x.items[1]});
return CityHash_v1_0_2::Hash128to64({x.items[UInt128::_impl::little(0)], x.items[UInt128::_impl::little(1)]});
}
};
@ -403,8 +380,8 @@ struct UInt128HashCRC32
size_t operator()(UInt128 x) const
{
UInt64 crc = -1ULL;
crc = s390x_crc32(crc, x.items[UInt128::_impl::little(0)]);
crc = s390x_crc32(crc, x.items[UInt128::_impl::little(1)]);
crc = s390x_crc32c(crc, x.items[UInt128::_impl::little(0)]);
crc = s390x_crc32c(crc, x.items[UInt128::_impl::little(1)]);
return crc;
}
};
@ -472,10 +449,10 @@ struct UInt256HashCRC32
size_t operator()(UInt256 x) const
{
UInt64 crc = -1ULL;
crc = s390x_crc32(crc, x.items[UInt256::_impl::little(0)]);
crc = s390x_crc32(crc, x.items[UInt256::_impl::little(1)]);
crc = s390x_crc32(crc, x.items[UInt256::_impl::little(2)]);
crc = s390x_crc32(crc, x.items[UInt256::_impl::little(3)]);
crc = s390x_crc32c(crc, x.items[UInt256::_impl::little(0)]);
crc = s390x_crc32c(crc, x.items[UInt256::_impl::little(1)]);
crc = s390x_crc32c(crc, x.items[UInt256::_impl::little(2)]);
crc = s390x_crc32c(crc, x.items[UInt256::_impl::little(3)]);
return crc;
}
};
@ -557,7 +534,10 @@ struct IntHash32
else if constexpr (sizeof(T) <= sizeof(UInt64))
{
DB::UInt64 out {0};
std::memcpy(&out, &key, sizeof(T));
if constexpr (std::endian::native == std::endian::little)
std::memcpy(&out, &key, sizeof(T));
else
std::memcpy(reinterpret_cast<char*>(&out) + sizeof(DB::UInt64) - sizeof(T), &key, sizeof(T));
return intHash32<salt>(out);
}

View File

@ -75,22 +75,22 @@ struct StringHashTableHash
size_t ALWAYS_INLINE operator()(StringKey8 key) const
{
size_t res = -1ULL;
res = s390x_crc32(res, key);
res = s390x_crc32c(res, key);
return res;
}
size_t ALWAYS_INLINE operator()(StringKey16 key) const
{
size_t res = -1ULL;
res = s390x_crc32(res, key.items[UInt128::_impl::little(0)]);
res = s390x_crc32(res, key.items[UInt128::_impl::little(1)]);
res = s390x_crc32c(res, key.items[UInt128::_impl::little(0)]);
res = s390x_crc32c(res, key.items[UInt128::_impl::little(1)]);
return res;
}
size_t ALWAYS_INLINE operator()(StringKey24 key) const
{
size_t res = -1ULL;
res = s390x_crc32(res, key.a);
res = s390x_crc32(res, key.b);
res = s390x_crc32(res, key.c);
res = s390x_crc32c(res, key.a);
res = s390x_crc32c(res, key.b);
res = s390x_crc32c(res, key.c);
return res;
}
#else

View File

@ -4,6 +4,7 @@
#include <Common/HyperLogLogBiasEstimator.h>
#include <Common/CompactArray.h>
#include <Common/HashTable/Hash.h>
#include <Common/TransformEndianness.hpp>
#include <IO/ReadBuffer.h>
#include <IO/WriteBuffer.h>
@ -330,7 +331,26 @@ public:
void read(DB::ReadBuffer & in)
{
in.readStrict(reinterpret_cast<char *>(this), sizeof(*this));
if constexpr (std::endian::native == std::endian::little)
in.readStrict(reinterpret_cast<char *>(this), sizeof(*this));
else
{
in.readStrict(reinterpret_cast<char *>(&rank_store), sizeof(RankStore));
constexpr size_t denom_size = sizeof(DenominatorCalculatorType);
std::array<char, denom_size> denominator_copy;
in.readStrict(denominator_copy.begin(), denom_size);
for (size_t i = 0; i < denominator_copy.size(); i += (sizeof(UInt32) / sizeof(char)))
{
UInt32 * cur = reinterpret_cast<UInt32 *>(&denominator_copy[i]);
DB::transformEndianness<std::endian::native, std::endian::little>(*cur);
}
memcpy(reinterpret_cast<char *>(&denominator), denominator_copy.begin(), denom_size);
in.readStrict(reinterpret_cast<char *>(&zeros), sizeof(ZerosCounterType));
DB::transformEndianness<std::endian::native, std::endian::little>(zeros);
}
}
void readAndMerge(DB::ReadBuffer & in)
@ -352,7 +372,27 @@ public:
void write(DB::WriteBuffer & out) const
{
out.write(reinterpret_cast<const char *>(this), sizeof(*this));
if constexpr (std::endian::native == std::endian::little)
out.write(reinterpret_cast<const char *>(this), sizeof(*this));
else
{
out.write(reinterpret_cast<const char *>(&rank_store), sizeof(RankStore));
constexpr size_t denom_size = sizeof(DenominatorCalculatorType);
std::array<char, denom_size> denominator_copy;
memcpy(denominator_copy.begin(), reinterpret_cast<const char *>(&denominator), denom_size);
for (size_t i = 0; i < denominator_copy.size(); i += (sizeof(UInt32) / sizeof(char)))
{
UInt32 * cur = reinterpret_cast<UInt32 *>(&denominator_copy[i]);
DB::transformEndianness<std::endian::little, std::endian::native>(*cur);
}
out.write(denominator_copy.begin(), denom_size);
auto zeros_copy = zeros;
DB::transformEndianness<std::endian::little, std::endian::native>(zeros_copy);
out.write(reinterpret_cast<const char *>(&zeros_copy), sizeof(ZerosCounterType));
}
}
/// Read and write in text mode is suboptimal (but compatible with OLAPServer and Metrage).

View File

@ -1,18 +1,19 @@
#include "MemoryTracker.h"
#include <IO/WriteHelpers.h>
#include <Common/HashTable/Hash.h>
#include <Common/VariableContext.h>
#include <Common/TraceSender.h>
#include <Common/Exception.h>
#include <Common/HashTable/Hash.h>
#include <Common/LockMemoryExceptionInThread.h>
#include <Common/MemoryTrackerBlockerInThread.h>
#include <Common/formatReadable.h>
#include <Common/ProfileEvents.h>
#include <Common/thread_local_rng.h>
#include <Common/OvercommitTracker.h>
#include <Common/ProfileEvents.h>
#include <Common/Stopwatch.h>
#include <Common/ThreadStatus.h>
#include <Common/TraceSender.h>
#include <Common/VariableContext.h>
#include <Common/formatReadable.h>
#include <Common/logger_useful.h>
#include <Common/thread_local_rng.h>
#include "config.h"
@ -589,6 +590,16 @@ bool MemoryTracker::isSizeOkForSampling(UInt64 size) const
return ((max_allocation_size_bytes == 0 || size <= max_allocation_size_bytes) && size >= min_allocation_size_bytes);
}
void MemoryTracker::setParent(MemoryTracker * elem)
{
/// Untracked memory shouldn't be accounted to a query or a user if it was allocated before the thread was attached
/// to a query thread group or a user group, because this memory will be (🤞) freed outside of these scopes.
if (level == VariableContext::Thread && DB::current_thread)
DB::current_thread->flushUntrackedMemory();
parent.store(elem, std::memory_order_relaxed);
}
bool canEnqueueBackgroundTask()
{
auto limit = background_memory_tracker.getSoftLimit();

View File

@ -200,10 +200,7 @@ public:
/// next should be changed only once: from nullptr to some value.
/// NOTE: It is not true in MergeListElement
void setParent(MemoryTracker * elem)
{
parent.store(elem, std::memory_order_relaxed);
}
void setParent(MemoryTracker * elem);
MemoryTracker * getParent()
{

View File

@ -0,0 +1,68 @@
#include "ObjectStorageKey.h"
#include <Common/Exception.h>
#include <filesystem>
namespace fs = std::filesystem;
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
const String & ObjectStorageKey::getPrefix() const
{
if (!is_relative)
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "object key has no prefix, key: {}", key);
return prefix;
}
const String & ObjectStorageKey::getSuffix() const
{
if (!is_relative)
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "object key has no suffix, key: {}", key);
return suffix;
}
const String & ObjectStorageKey::serialize() const
{
return key;
}
ObjectStorageKey ObjectStorageKey::createAsRelative(String key_)
{
ObjectStorageKey object_key;
object_key.suffix = std::move(key_);
object_key.key = object_key.suffix;
object_key.is_relative = true;
return object_key;
}
ObjectStorageKey ObjectStorageKey::createAsRelative(String prefix_, String suffix_)
{
ObjectStorageKey object_key;
object_key.prefix = std::move(prefix_);
object_key.suffix = std::move(suffix_);
if (object_key.prefix.empty())
object_key.key = object_key.suffix;
else
object_key.key = fs::path(object_key.prefix) / object_key.suffix;
object_key.is_relative = true;
return object_key;
}
ObjectStorageKey ObjectStorageKey::createAsAbsolute(String key_)
{
ObjectStorageKey object_key;
object_key.key = std::move(key_);
object_key.is_relative = false;
return object_key;
}
}

View File

@ -0,0 +1,29 @@
#pragma once
#include <base/types.h>
#include <memory>
namespace DB
{
struct ObjectStorageKey
{
ObjectStorageKey() = default;
bool hasPrefix() const { return is_relative; }
const String & getPrefix() const;
const String & getSuffix() const;
const String & serialize() const;
static ObjectStorageKey createAsRelative(String prefix_, String suffix_);
static ObjectStorageKey createAsRelative(String key_);
static ObjectStorageKey createAsAbsolute(String key_);
private:
String prefix;
String suffix;
String key;
bool is_relative = false;
};
}

View File

@ -11,7 +11,6 @@ namespace ErrorCodes
extern const int BAD_ARGUMENTS;
}
struct ProxyConfiguration
{
enum class Protocol
@ -48,6 +47,8 @@ struct ProxyConfiguration
std::string host;
Protocol protocol;
uint16_t port;
bool tunneling;
Protocol original_request_protocol;
};
}

View File

@ -9,9 +9,25 @@ struct ProxyConfigurationResolver
{
using Protocol = ProxyConfiguration::Protocol;
explicit ProxyConfigurationResolver(Protocol request_protocol_, bool disable_tunneling_for_https_requests_over_http_proxy_ = false)
: request_protocol(request_protocol_), disable_tunneling_for_https_requests_over_http_proxy(disable_tunneling_for_https_requests_over_http_proxy_)
{
}
virtual ~ProxyConfigurationResolver() = default;
virtual ProxyConfiguration resolve() = 0;
virtual void errorReport(const ProxyConfiguration & config) = 0;
protected:
static bool useTunneling(Protocol request_protocol, Protocol proxy_protocol, bool disable_tunneling_for_https_requests_over_http_proxy)
{
bool is_https_request_over_http_proxy = request_protocol == Protocol::HTTPS && proxy_protocol == Protocol::HTTP;
return is_https_request_over_http_proxy && !disable_tunneling_for_https_requests_over_http_proxy;
}
Protocol request_protocol;
bool disable_tunneling_for_https_requests_over_http_proxy = false;
};
}

View File

@ -17,8 +17,16 @@ namespace ErrorCodes
namespace
{
bool isTunnelingDisabledForHTTPSRequestsOverHTTPProxy(
const Poco::Util::AbstractConfiguration & configuration)
{
return configuration.getBool("proxy.disable_tunneling_for_https_requests_over_http_proxy", false);
}
std::shared_ptr<ProxyConfigurationResolver> getRemoteResolver(
const String & config_prefix, const Poco::Util::AbstractConfiguration & configuration)
ProxyConfiguration::Protocol request_protocol,
const String & config_prefix,
const Poco::Util::AbstractConfiguration & configuration)
{
auto resolver_prefix = config_prefix + ".resolver";
auto endpoint = Poco::URI(configuration.getString(resolver_prefix + ".endpoint"));
@ -31,7 +39,17 @@ namespace
LOG_DEBUG(&Poco::Logger::get("ProxyConfigurationResolverProvider"), "Configured remote proxy resolver: {}, Scheme: {}, Port: {}",
endpoint.toString(), proxy_scheme, proxy_port);
return std::make_shared<RemoteProxyConfigurationResolver>(endpoint, proxy_scheme, proxy_port, cache_ttl);
auto server_configuration = RemoteProxyConfigurationResolver::RemoteServerConfiguration {
endpoint,
proxy_scheme,
proxy_port,
cache_ttl
};
return std::make_shared<RemoteProxyConfigurationResolver>(
server_configuration,
request_protocol,
isTunnelingDisabledForHTTPSRequestsOverHTTPProxy(configuration));
}
auto extractURIList(const String & config_prefix, const Poco::Util::AbstractConfiguration & configuration)
@ -61,13 +79,15 @@ namespace
}
std::shared_ptr<ProxyConfigurationResolver> getListResolver(
ProxyConfiguration::Protocol request_protocol,
const String & config_prefix,
const Poco::Util::AbstractConfiguration & configuration
)
const Poco::Util::AbstractConfiguration & configuration)
{
auto uris = extractURIList(config_prefix, configuration);
return uris.empty() ? nullptr : std::make_shared<ProxyListConfigurationResolver>(uris);
return uris.empty()
? nullptr
: std::make_shared<ProxyListConfigurationResolver>(uris, request_protocol, isTunnelingDisabledForHTTPSRequestsOverHTTPProxy(configuration));
}
bool hasRemoteResolver(const String & config_prefix, const Poco::Util::AbstractConfiguration & configuration)
@ -119,14 +139,18 @@ namespace
}
}
std::shared_ptr<ProxyConfigurationResolver> ProxyConfigurationResolverProvider::get(Protocol protocol, const Poco::Util::AbstractConfiguration & configuration)
std::shared_ptr<ProxyConfigurationResolver> ProxyConfigurationResolverProvider::get(
Protocol request_protocol,
const Poco::Util::AbstractConfiguration & configuration)
{
if (auto resolver = getFromSettings(protocol, "proxy", configuration))
if (auto resolver = getFromSettings(request_protocol, "proxy", configuration))
{
return resolver;
}
return std::make_shared<EnvironmentProxyConfigurationResolver>(protocol);
return std::make_shared<EnvironmentProxyConfigurationResolver>(
request_protocol,
isTunnelingDisabledForHTTPSRequestsOverHTTPProxy(configuration));
}
template <bool is_new_syntax>
@ -142,16 +166,15 @@ std::shared_ptr<ProxyConfigurationResolver> ProxyConfigurationResolverProvider::
{
return nullptr;
}
auto prefix = *prefix_opt;
if (hasRemoteResolver(prefix, configuration))
{
return getRemoteResolver(prefix, configuration);
return getRemoteResolver(request_protocol, prefix, configuration);
}
else if (hasListResolver(prefix, configuration))
{
return getListResolver(prefix, configuration);
return getListResolver(request_protocol, prefix, configuration);
}
return nullptr;

View File

@ -7,8 +7,10 @@
namespace DB
{
ProxyListConfigurationResolver::ProxyListConfigurationResolver(std::vector<Poco::URI> proxies_)
: proxies(std::move(proxies_))
ProxyListConfigurationResolver::ProxyListConfigurationResolver(
std::vector<Poco::URI> proxies_,
Protocol request_protocol_, bool disable_tunneling_for_https_requests_over_http_proxy_)
: ProxyConfigurationResolver(request_protocol_, disable_tunneling_for_https_requests_over_http_proxy_), proxies(std::move(proxies_))
{
}
@ -25,7 +27,14 @@ ProxyConfiguration ProxyListConfigurationResolver::resolve()
auto & proxy = proxies[index];
LOG_DEBUG(&Poco::Logger::get("ProxyListConfigurationResolver"), "Use proxy: {}", proxies[index].toString());
return ProxyConfiguration {proxy.getHost(), ProxyConfiguration::protocolFromString(proxy.getScheme()), proxy.getPort()};
return ProxyConfiguration {
proxy.getHost(),
ProxyConfiguration::protocolFromString(proxy.getScheme()),
proxy.getPort(),
useTunneling(request_protocol, ProxyConfiguration::protocolFromString(proxy.getScheme()), disable_tunneling_for_https_requests_over_http_proxy),
request_protocol
};
}
}

View File

@ -14,7 +14,7 @@ namespace DB
class ProxyListConfigurationResolver : public ProxyConfigurationResolver
{
public:
explicit ProxyListConfigurationResolver(std::vector<Poco::URI> proxies_);
ProxyListConfigurationResolver(std::vector<Poco::URI> proxies_, Protocol request_protocol_, bool disable_tunneling_for_https_requests_over_http_proxy_ = false);
ProxyConfiguration resolve() override;

View File

@ -17,12 +17,11 @@ namespace ErrorCodes
}
RemoteProxyConfigurationResolver::RemoteProxyConfigurationResolver(
const Poco::URI & endpoint_,
String proxy_protocol_,
unsigned proxy_port_,
unsigned cache_ttl_
const RemoteServerConfiguration & remote_server_configuration_,
Protocol request_protocol_,
bool disable_tunneling_for_https_requests_over_http_proxy_
)
: endpoint(endpoint_), proxy_protocol(std::move(proxy_protocol_)), proxy_port(proxy_port_), cache_ttl(cache_ttl_)
: ProxyConfigurationResolver(request_protocol_, disable_tunneling_for_https_requests_over_http_proxy_), remote_server_configuration(remote_server_configuration_)
{
}
@ -30,6 +29,8 @@ ProxyConfiguration RemoteProxyConfigurationResolver::resolve()
{
auto * logger = &Poco::Logger::get("RemoteProxyConfigurationResolver");
auto & [endpoint, proxy_protocol, proxy_port, cache_ttl_] = remote_server_configuration;
LOG_DEBUG(logger, "Obtain proxy using resolver: {}", endpoint.toString());
std::lock_guard lock(cache_mutex);
@ -39,11 +40,11 @@ ProxyConfiguration RemoteProxyConfigurationResolver::resolve()
if (cache_ttl.count() && cache_valid && now <= cache_timestamp + cache_ttl && now >= cache_timestamp)
{
LOG_DEBUG(logger,
"Use cached proxy: {}://{}:{}",
"Use cached proxy: {}://{}:{}. Tunneling: {}",
cached_config.protocol,
cached_config.host,
cached_config.port
);
cached_config.port,
cached_config.tunneling);
return cached_config;
}
@ -95,9 +96,16 @@ ProxyConfiguration RemoteProxyConfigurationResolver::resolve()
LOG_DEBUG(logger, "Use proxy: {}://{}:{}", proxy_protocol, proxy_host, proxy_port);
bool use_tunneling_for_https_requests_over_http_proxy = useTunneling(
request_protocol,
cached_config.protocol,
disable_tunneling_for_https_requests_over_http_proxy);
cached_config.protocol = ProxyConfiguration::protocolFromString(proxy_protocol);
cached_config.host = proxy_host;
cached_config.port = proxy_port;
cached_config.tunneling = use_tunneling_for_https_requests_over_http_proxy;
cached_config.original_request_protocol = request_protocol;
cache_timestamp = std::chrono::system_clock::now();
cache_valid = true;

View File

@ -16,25 +16,26 @@ namespace DB
class RemoteProxyConfigurationResolver : public ProxyConfigurationResolver
{
public:
struct RemoteServerConfiguration
{
Poco::URI endpoint;
String proxy_protocol;
unsigned proxy_port;
unsigned cache_ttl_;
};
RemoteProxyConfigurationResolver(
const Poco::URI & endpoint_,
String proxy_protocol_,
unsigned proxy_port_,
unsigned cache_ttl_
);
const RemoteServerConfiguration & remote_server_configuration_,
Protocol request_protocol_,
bool disable_tunneling_for_https_requests_over_http_proxy_ = true);
ProxyConfiguration resolve() override;
void errorReport(const ProxyConfiguration & config) override;
private:
/// Endpoint to obtain a proxy host.
const Poco::URI endpoint;
/// Scheme for obtained proxy.
const String proxy_protocol;
/// Port for obtained proxy.
const unsigned proxy_port;
RemoteServerConfiguration remote_server_configuration;
std::mutex cache_mutex;
bool cache_valid = false;

View File

@ -96,14 +96,20 @@ public:
void write(WriteBuffer & wb) const
{
writeBinary(key, wb);
if constexpr (std::is_same_v<TKey, StringRef>)
writeBinary(key, wb);
else
writeBinaryLittleEndian(key, wb);
writeVarUInt(count, wb);
writeVarUInt(error, wb);
}
void read(ReadBuffer & rb)
{
readBinary(key, rb);
if constexpr (std::is_same_v<TKey, StringRef>)
readBinary(key, rb);
else
readBinaryLittleEndian(key, rb);
readVarUInt(count, rb);
readVarUInt(error, rb);
}

View File

@ -198,7 +198,7 @@ namespace Hashes
#ifdef __SSE4_2__
return _mm_crc32_u64(-1ULL, x);
#elif defined(__s390x__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
return s390x_crc32(-1ULL, x);
return s390x_crc32c(-1ULL, x);
#else
/// On other platforms we do not have CRC32. NOTE This can be confusing.
return intHash64(x);

View File

@ -195,3 +195,37 @@ TEST_F(ProxyConfigurationResolverProviderTests, RemoteResolverIsBasedOnProtocolC
}
// remote resolver is tricky to be tested in unit tests
template <bool DISABLE_TUNNELING_FOR_HTTPS_REQUESTS_OVER_HTTP_PROXY, bool STRING>
void test_tunneling(DB::ContextMutablePtr context)
{
EnvironmentProxySetter setter(http_env_proxy_server, https_env_proxy_server);
ConfigurationPtr config = Poco::AutoPtr(new Poco::Util::MapConfiguration());
config->setString("proxy", "");
if constexpr (STRING)
{
config->setString("proxy.disable_tunneling_for_https_requests_over_http_proxy", DISABLE_TUNNELING_FOR_HTTPS_REQUESTS_OVER_HTTP_PROXY ? "true" : "false");
}
else
{
config->setBool("proxy.disable_tunneling_for_https_requests_over_http_proxy", DISABLE_TUNNELING_FOR_HTTPS_REQUESTS_OVER_HTTP_PROXY);
}
context->setConfig(config);
auto https_configuration = DB::ProxyConfigurationResolverProvider::get(DB::ProxyConfiguration::Protocol::HTTPS, *config)->resolve();
ASSERT_EQ(https_configuration.tunneling, !DISABLE_TUNNELING_FOR_HTTPS_REQUESTS_OVER_HTTP_PROXY);
}
TEST_F(ProxyConfigurationResolverProviderTests, TunnelingForHTTPSRequestsOverHTTPProxySetting)
{
test_tunneling<false, false>(context);
test_tunneling<false, true>(context);
test_tunneling<true, false>(context);
test_tunneling<true, true>(context);
}

View File

@ -4,6 +4,9 @@
#include <Common/tests/gtest_helper_functions.h>
#include <Poco/URI.h>
namespace DB
{
namespace
{
auto http_proxy_server = Poco::URI("http://proxy_server:3128");
@ -14,23 +17,23 @@ TEST(EnvironmentProxyConfigurationResolver, TestHTTP)
{
EnvironmentProxySetter setter(http_proxy_server, {});
DB::EnvironmentProxyConfigurationResolver resolver(DB::ProxyConfiguration::Protocol::HTTP);
EnvironmentProxyConfigurationResolver resolver(ProxyConfiguration::Protocol::HTTP);
auto configuration = resolver.resolve();
ASSERT_EQ(configuration.host, http_proxy_server.getHost());
ASSERT_EQ(configuration.port, http_proxy_server.getPort());
ASSERT_EQ(configuration.protocol, DB::ProxyConfiguration::protocolFromString(http_proxy_server.getScheme()));
ASSERT_EQ(configuration.protocol, ProxyConfiguration::protocolFromString(http_proxy_server.getScheme()));
}
TEST(EnvironmentProxyConfigurationResolver, TestHTTPNoEnv)
{
DB::EnvironmentProxyConfigurationResolver resolver(DB::ProxyConfiguration::Protocol::HTTP);
EnvironmentProxyConfigurationResolver resolver(ProxyConfiguration::Protocol::HTTP);
auto configuration = resolver.resolve();
ASSERT_EQ(configuration.host, "");
ASSERT_EQ(configuration.protocol, DB::ProxyConfiguration::Protocol::HTTP);
ASSERT_EQ(configuration.protocol, ProxyConfiguration::Protocol::HTTP);
ASSERT_EQ(configuration.port, 0u);
}
@ -38,22 +41,42 @@ TEST(EnvironmentProxyConfigurationResolver, TestHTTPs)
{
EnvironmentProxySetter setter({}, https_proxy_server);
DB::EnvironmentProxyConfigurationResolver resolver(DB::ProxyConfiguration::Protocol::HTTPS);
EnvironmentProxyConfigurationResolver resolver(ProxyConfiguration::Protocol::HTTPS);
auto configuration = resolver.resolve();
ASSERT_EQ(configuration.host, https_proxy_server.getHost());
ASSERT_EQ(configuration.port, https_proxy_server.getPort());
ASSERT_EQ(configuration.protocol, DB::ProxyConfiguration::protocolFromString(https_proxy_server.getScheme()));
ASSERT_EQ(configuration.protocol, ProxyConfiguration::protocolFromString(https_proxy_server.getScheme()));
}
TEST(EnvironmentProxyConfigurationResolver, TestHTTPsNoEnv)
{
DB::EnvironmentProxyConfigurationResolver resolver(DB::ProxyConfiguration::Protocol::HTTPS);
EnvironmentProxyConfigurationResolver resolver(ProxyConfiguration::Protocol::HTTPS);
auto configuration = resolver.resolve();
ASSERT_EQ(configuration.host, "");
ASSERT_EQ(configuration.protocol, DB::ProxyConfiguration::Protocol::HTTP);
ASSERT_EQ(configuration.protocol, ProxyConfiguration::Protocol::HTTP);
ASSERT_EQ(configuration.port, 0u);
}
TEST(EnvironmentProxyConfigurationResolver, TestHTTPsOverHTTPTunnelingDisabled)
{
// use http proxy for https, this would use connect protocol by default
EnvironmentProxySetter setter({}, http_proxy_server);
bool disable_tunneling_for_https_requests_over_http_proxy = true;
EnvironmentProxyConfigurationResolver resolver(
ProxyConfiguration::Protocol::HTTPS, disable_tunneling_for_https_requests_over_http_proxy);
auto configuration = resolver.resolve();
ASSERT_EQ(configuration.host, http_proxy_server.getHost());
ASSERT_EQ(configuration.port, http_proxy_server.getPort());
ASSERT_EQ(configuration.protocol, ProxyConfiguration::protocolFromString(http_proxy_server.getScheme()));
ASSERT_EQ(configuration.tunneling, false);
}
}

View File

@ -3,6 +3,9 @@
#include <Common/ProxyListConfigurationResolver.h>
#include <Poco/URI.h>
namespace DB
{
namespace
{
auto proxy_server1 = Poco::URI("http://proxy_server1:3128");
@ -11,16 +14,58 @@ namespace
TEST(ProxyListConfigurationResolver, SimpleTest)
{
DB::ProxyListConfigurationResolver resolver({proxy_server1, proxy_server2});
ProxyListConfigurationResolver resolver(
{proxy_server1, proxy_server2},
ProxyConfiguration::Protocol::HTTP);
auto configuration1 = resolver.resolve();
auto configuration2 = resolver.resolve();
ASSERT_EQ(configuration1.host, proxy_server1.getHost());
ASSERT_EQ(configuration1.port, proxy_server1.getPort());
ASSERT_EQ(configuration1.protocol, DB::ProxyConfiguration::protocolFromString(proxy_server1.getScheme()));
ASSERT_EQ(configuration1.protocol, ProxyConfiguration::protocolFromString(proxy_server1.getScheme()));
ASSERT_EQ(configuration2.host, proxy_server2.getHost());
ASSERT_EQ(configuration2.port, proxy_server2.getPort());
ASSERT_EQ(configuration2.protocol, DB::ProxyConfiguration::protocolFromString(proxy_server2.getScheme()));
ASSERT_EQ(configuration2.protocol, ProxyConfiguration::protocolFromString(proxy_server2.getScheme()));
}
TEST(ProxyListConfigurationResolver, HTTPSRequestsOverHTTPProxyDefault)
{
ProxyListConfigurationResolver resolver(
{proxy_server1, proxy_server2},
ProxyConfiguration::Protocol::HTTPS);
auto configuration1 = resolver.resolve();
auto configuration2 = resolver.resolve();
ASSERT_EQ(configuration1.host, proxy_server1.getHost());
ASSERT_EQ(configuration1.port, proxy_server1.getPort());
ASSERT_EQ(configuration1.protocol, ProxyConfiguration::protocolFromString(proxy_server1.getScheme()));
ASSERT_EQ(configuration1.tunneling, true);
ASSERT_EQ(configuration2.host, proxy_server2.getHost());
ASSERT_EQ(configuration2.port, proxy_server2.getPort());
ASSERT_EQ(configuration2.protocol, ProxyConfiguration::protocolFromString(proxy_server2.getScheme()));
ASSERT_EQ(configuration1.tunneling, true);
}
TEST(ProxyListConfigurationResolver, SimpleTestTunnelingDisabled)
{
bool disable_tunneling_for_https_requests_over_http_proxy = true;
ProxyListConfigurationResolver resolver(
{proxy_server1, proxy_server2},
ProxyConfiguration::Protocol::HTTPS,
disable_tunneling_for_https_requests_over_http_proxy);
auto configuration1 = resolver.resolve();
ASSERT_EQ(configuration1.host, proxy_server1.getHost());
ASSERT_EQ(configuration1.port, proxy_server1.getPort());
ASSERT_EQ(configuration1.protocol, ProxyConfiguration::protocolFromString(proxy_server1.getScheme()));
ASSERT_EQ(configuration1.tunneling, false);
}
}

View File

@ -16,8 +16,8 @@ private:
bool nextImpl() override;
public:
explicit CompressedReadBuffer(ReadBuffer & in_, bool allow_different_codecs_ = false)
: CompressedReadBufferBase(&in_, allow_different_codecs_), BufferWithOwnMemory<ReadBuffer>(0)
explicit CompressedReadBuffer(ReadBuffer & in_, bool allow_different_codecs_ = false, bool external_data_ = false)
: CompressedReadBufferBase(&in_, allow_different_codecs_, external_data_), BufferWithOwnMemory<ReadBuffer>(0)
{
}

View File

@ -114,7 +114,8 @@ static void readHeaderAndGetCodecAndSize(
CompressionCodecPtr & codec,
size_t & size_decompressed,
size_t & size_compressed_without_checksum,
bool allow_different_codecs)
bool allow_different_codecs,
bool external_data)
{
uint8_t method = ICompressionCodec::readMethod(compressed_buffer);
@ -136,8 +137,11 @@ static void readHeaderAndGetCodecAndSize(
}
}
size_compressed_without_checksum = ICompressionCodec::readCompressedBlockSize(compressed_buffer);
size_decompressed = ICompressionCodec::readDecompressedBlockSize(compressed_buffer);
if (external_data)
codec->setExternalDataFlag();
size_compressed_without_checksum = codec->readCompressedBlockSize(compressed_buffer);
size_decompressed = codec->readDecompressedBlockSize(compressed_buffer);
/// This is for clang static analyzer.
assert(size_decompressed > 0);
@ -170,7 +174,8 @@ size_t CompressedReadBufferBase::readCompressedData(size_t & size_decompressed,
codec,
size_decompressed,
size_compressed_without_checksum,
allow_different_codecs);
allow_different_codecs,
external_data);
auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer();
@ -221,7 +226,8 @@ size_t CompressedReadBufferBase::readCompressedDataBlockForAsynchronous(size_t &
codec,
size_decompressed,
size_compressed_without_checksum,
allow_different_codecs);
allow_different_codecs,
external_data);
auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer();
@ -254,7 +260,8 @@ size_t CompressedReadBufferBase::readCompressedDataBlockForAsynchronous(size_t &
}
}
static void readHeaderAndGetCodec(const char * compressed_buffer, size_t size_decompressed, CompressionCodecPtr & codec, bool allow_different_codecs)
static void readHeaderAndGetCodec(const char * compressed_buffer, size_t size_decompressed, CompressionCodecPtr & codec,
bool allow_different_codecs, bool external_data)
{
ProfileEvents::increment(ProfileEvents::CompressedReadBufferBlocks);
ProfileEvents::increment(ProfileEvents::CompressedReadBufferBytes, size_decompressed);
@ -278,17 +285,20 @@ static void readHeaderAndGetCodec(const char * compressed_buffer, size_t size_de
getHexUIntLowercase(method), getHexUIntLowercase(codec->getMethodByte()));
}
}
if (external_data)
codec->setExternalDataFlag();
}
void CompressedReadBufferBase::decompressTo(char * to, size_t size_decompressed, size_t size_compressed_without_checksum)
{
readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs);
readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs, external_data);
codec->decompress(compressed_buffer, static_cast<UInt32>(size_compressed_without_checksum), to);
}
void CompressedReadBufferBase::decompress(BufferBase::Buffer & to, size_t size_decompressed, size_t size_compressed_without_checksum)
{
readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs);
readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs, external_data);
if (codec->isNone())
{
@ -320,8 +330,8 @@ void CompressedReadBufferBase::setDecompressMode(ICompressionCodec::CodecMode mo
}
/// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'.
CompressedReadBufferBase::CompressedReadBufferBase(ReadBuffer * in, bool allow_different_codecs_)
: compressed_in(in), own_compressed_buffer(0), allow_different_codecs(allow_different_codecs_)
CompressedReadBufferBase::CompressedReadBufferBase(ReadBuffer * in, bool allow_different_codecs_, bool external_data_)
: compressed_in(in), own_compressed_buffer(0), allow_different_codecs(allow_different_codecs_), external_data(external_data_)
{
}

View File

@ -30,6 +30,9 @@ protected:
/// Allow reading data, compressed by different codecs from one file.
bool allow_different_codecs;
/// Report decompression errors as CANNOT_DECOMPRESS, not CORRUPTED_DATA
bool external_data;
/// Read compressed data into compressed_buffer. Get size of decompressed data from block header. Checksum if need.
///
/// If always_copy is true then even if the compressed block is already stored in compressed_in.buffer()
@ -67,7 +70,7 @@ protected:
public:
/// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'.
explicit CompressedReadBufferBase(ReadBuffer * in = nullptr, bool allow_different_codecs_ = false);
explicit CompressedReadBufferBase(ReadBuffer * in = nullptr, bool allow_different_codecs_ = false, bool external_data_ = false);
virtual ~CompressedReadBufferBase();
/** Disable checksums.

View File

@ -14,12 +14,6 @@
namespace DB
{
namespace ErrorCodes
{
extern const int CORRUPTED_DATA;
}
CompressionCodecMultiple::CompressionCodecMultiple(Codecs codecs_)
: codecs(codecs_)
{
@ -79,7 +73,7 @@ UInt32 CompressionCodecMultiple::doCompressData(const char * source, UInt32 sour
void CompressionCodecMultiple::doDecompressData(const char * source, UInt32 source_size, char * dest, UInt32 decompressed_size) const
{
if (source_size < 1 || !source[0])
throw Exception(ErrorCodes::CORRUPTED_DATA, "Wrong compression methods list");
throw Exception(decompression_error_code, "Wrong compression methods list");
UInt8 compression_methods_size = source[0];
@ -95,10 +89,10 @@ void CompressionCodecMultiple::doDecompressData(const char * source, UInt32 sour
auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer();
compressed_buf.resize(compressed_buf.size() + additional_size_at_the_end_of_buffer);
UInt32 uncompressed_size = ICompressionCodec::readDecompressedBlockSize(compressed_buf.data());
UInt32 uncompressed_size = readDecompressedBlockSize(compressed_buf.data());
if (idx == 0 && uncompressed_size != decompressed_size)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Wrong final decompressed size in codec Multiple, got {}, expected {}",
throw Exception(decompression_error_code, "Wrong final decompressed size in codec Multiple, got {}, expected {}",
uncompressed_size, decompressed_size);
uncompressed_buf.resize(uncompressed_size + additional_size_at_the_end_of_buffer);

View File

@ -15,8 +15,6 @@ namespace DB
namespace ErrorCodes
{
extern const int CANNOT_DECOMPRESS;
extern const int CORRUPTED_DATA;
extern const int LOGICAL_ERROR;
}
@ -97,14 +95,14 @@ UInt32 ICompressionCodec::decompress(const char * source, UInt32 source_size, ch
UInt8 header_size = getHeaderSize();
if (source_size < header_size)
throw Exception(ErrorCodes::CORRUPTED_DATA,
throw Exception(decompression_error_code,
"Can't decompress data: the compressed data size ({}, this should include header size) "
"is less than the header size ({})", source_size, static_cast<size_t>(header_size));
uint8_t our_method = getMethodByte();
uint8_t method = source[0];
if (method != our_method)
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Can't decompress data with codec byte {} using codec with byte {}", method, our_method);
throw Exception(decompression_error_code, "Can't decompress data with codec byte {} using codec with byte {}", method, our_method);
UInt32 decompressed_size = readDecompressedBlockSize(source);
doDecompressData(&source[header_size], source_size - header_size, dest, decompressed_size);
@ -112,20 +110,20 @@ UInt32 ICompressionCodec::decompress(const char * source, UInt32 source_size, ch
return decompressed_size;
}
UInt32 ICompressionCodec::readCompressedBlockSize(const char * source)
UInt32 ICompressionCodec::readCompressedBlockSize(const char * source) const
{
UInt32 compressed_block_size = unalignedLoadLittleEndian<UInt32>(&source[1]);
if (compressed_block_size == 0)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Can't decompress data: header is corrupt with compressed block size 0");
throw Exception(decompression_error_code, "Can't decompress data: header is corrupt with compressed block size 0");
return compressed_block_size;
}
UInt32 ICompressionCodec::readDecompressedBlockSize(const char * source)
UInt32 ICompressionCodec::readDecompressedBlockSize(const char * source) const
{
UInt32 decompressed_block_size = unalignedLoadLittleEndian<UInt32>(&source[5]);
if (decompressed_block_size == 0)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Can't decompress data: header is corrupt with decompressed block size 0");
throw Exception(decompression_error_code, "Can't decompress data: header is corrupt with decompressed block size 0");
return decompressed_block_size;
}

View File

@ -13,6 +13,12 @@ namespace DB
extern "C" int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size);
namespace ErrorCodes
{
extern const int CANNOT_DECOMPRESS;
extern const int CORRUPTED_DATA;
}
/**
* Represents interface for compression codecs like LZ4, ZSTD, etc.
*/
@ -59,7 +65,10 @@ public:
CodecMode getDecompressMode() const{ return decompressMode; }
/// if set mode to CodecMode::Asynchronous, must be followed with flushAsynchronousDecompressRequests
void setDecompressMode(CodecMode mode){ decompressMode = mode; }
void setDecompressMode(CodecMode mode) { decompressMode = mode; }
/// Report decompression errors as CANNOT_DECOMPRESS, not CORRUPTED_DATA
void setExternalDataFlag() { decompression_error_code = ErrorCodes::CANNOT_DECOMPRESS; }
/// Flush result for previous asynchronous decompression requests.
/// This function must be called following several requests offload to HW.
@ -82,10 +91,10 @@ public:
static constexpr UInt8 getHeaderSize() { return COMPRESSED_BLOCK_HEADER_SIZE; }
/// Read size of compressed block from compressed source
static UInt32 readCompressedBlockSize(const char * source);
UInt32 readCompressedBlockSize(const char * source) const;
/// Read size of decompressed block from compressed source
static UInt32 readDecompressedBlockSize(const char * source);
UInt32 readDecompressedBlockSize(const char * source) const;
/// Read method byte from compressed source
static uint8_t readMethod(const char * source);
@ -131,6 +140,8 @@ protected:
/// Construct and set codec description from codec name and arguments. Must be called in codec constructor.
void setCodecDescription(const String & name, const ASTs & arguments = {});
int decompression_error_code = ErrorCodes::CORRUPTED_DATA;
private:
ASTPtr full_codec_desc;
CodecMode decompressMode{CodecMode::Synchronous};

View File

@ -54,6 +54,10 @@ public:
constexpr KeeperDispatcher * getDispatcher() const { return dispatcher; }
/// set to true when we have preprocessed or committed all the logs
/// that were already present locally during startup
std::atomic<bool> local_logs_preprocessed = false;
std::atomic<bool> shutdown_called = false;
private:
/// local disk defined using path or disk name
using Storage = std::variant<DiskPtr, std::string>;

View File

@ -68,6 +68,8 @@ void KeeperDispatcher::requestThread()
/// to send errors to the client.
KeeperStorage::RequestsForSessions prev_batch;
auto & shutdown_called = keeper_context->shutdown_called;
while (!shutdown_called)
{
KeeperStorage::RequestForSession request;
@ -204,6 +206,9 @@ void KeeperDispatcher::requestThread()
/// we will wake up this thread on each commit so we need to run it in loop until the last request of batch is committed
while (true)
{
if (shutdown_called)
return;
auto current_last_committed_idx = our_last_committed_log_idx.load(std::memory_order_relaxed);
if (current_last_committed_idx >= log_idx)
break;
@ -236,6 +241,7 @@ void KeeperDispatcher::requestThread()
void KeeperDispatcher::responseThread()
{
setThreadName("KeeperRspT");
auto & shutdown_called = keeper_context->shutdown_called;
while (!shutdown_called)
{
KeeperStorage::ResponseForSession response_for_session;
@ -262,6 +268,7 @@ void KeeperDispatcher::responseThread()
void KeeperDispatcher::snapshotThread()
{
setThreadName("KeeperSnpT");
auto & shutdown_called = keeper_context->shutdown_called;
while (!shutdown_called)
{
CreateSnapshotTask task;
@ -339,7 +346,7 @@ bool KeeperDispatcher::putRequest(const Coordination::ZooKeeperRequestPtr & requ
request_info.time = duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
request_info.session_id = session_id;
if (shutdown_called)
if (keeper_context->shutdown_called)
return false;
/// Put close requests without timeouts
@ -361,17 +368,17 @@ void KeeperDispatcher::initialize(const Poco::Util::AbstractConfiguration & conf
LOG_DEBUG(log, "Initializing storage dispatcher");
configuration_and_settings = KeeperConfigurationAndSettings::loadFromConfig(config, standalone_keeper);
requests_queue = std::make_unique<RequestsQueue>(configuration_and_settings->coordination_settings->max_request_queue_size);
keeper_context = std::make_shared<KeeperContext>(standalone_keeper);
keeper_context->initialize(config, this);
requests_queue = std::make_unique<RequestsQueue>(configuration_and_settings->coordination_settings->max_request_queue_size);
request_thread = ThreadFromGlobalPool([this] { requestThread(); });
responses_thread = ThreadFromGlobalPool([this] { responseThread(); });
snapshot_thread = ThreadFromGlobalPool([this] { snapshotThread(); });
snapshot_s3.startup(config, macros);
keeper_context = std::make_shared<KeeperContext>(standalone_keeper);
keeper_context->initialize(config, this);
server = std::make_unique<KeeperServer>(
configuration_and_settings,
config,
@ -445,11 +452,16 @@ void KeeperDispatcher::shutdown()
try
{
{
if (shutdown_called.exchange(true))
if (keeper_context->shutdown_called.exchange(true))
return;
LOG_DEBUG(log, "Shutting down storage dispatcher");
our_last_committed_log_idx = std::numeric_limits<uint64_t>::max();
our_last_committed_log_idx.notify_all();
keeper_context->local_logs_preprocessed = true;
if (session_cleaner_thread.joinable())
session_cleaner_thread.join();
@ -576,6 +588,7 @@ void KeeperDispatcher::registerSession(int64_t session_id, ZooKeeperResponseCall
void KeeperDispatcher::sessionCleanerTask()
{
auto & shutdown_called = keeper_context->shutdown_called;
while (true)
{
if (shutdown_called)
@ -626,7 +639,7 @@ void KeeperDispatcher::sessionCleanerTask()
void KeeperDispatcher::finishSession(int64_t session_id)
{
/// shutdown() method will cleanup sessions if needed
if (shutdown_called)
if (keeper_context->shutdown_called)
return;
{
@ -737,6 +750,7 @@ int64_t KeeperDispatcher::getSessionID(int64_t session_timeout_ms)
void KeeperDispatcher::clusterUpdateWithReconfigDisabledThread()
{
auto & shutdown_called = keeper_context->shutdown_called;
while (!shutdown_called)
{
try
@ -789,6 +803,7 @@ void KeeperDispatcher::clusterUpdateWithReconfigDisabledThread()
void KeeperDispatcher::clusterUpdateThread()
{
auto & shutdown_called = keeper_context->shutdown_called;
while (!shutdown_called)
{
ClusterUpdateAction action;
@ -808,7 +823,7 @@ void KeeperDispatcher::clusterUpdateThread()
void KeeperDispatcher::pushClusterUpdates(ClusterUpdateActions && actions)
{
if (shutdown_called) return;
if (keeper_context->shutdown_called) return;
for (auto && action : actions)
{
if (!cluster_update_queue.push(std::move(action)))

View File

@ -39,8 +39,6 @@ private:
/// More than 1k updates is definitely misconfiguration.
ClusterUpdateQueue cluster_update_queue{1000};
std::atomic<bool> shutdown_called{false};
mutable std::mutex session_to_response_callback_mutex;
/// These two maps looks similar, but serves different purposes.
/// The first map is subscription map for normal responses like

View File

@ -17,6 +17,7 @@
#include <boost/algorithm/string.hpp>
#include <libnuraft/cluster_config.hxx>
#include <libnuraft/log_val_type.hxx>
#include <libnuraft/msg_type.hxx>
#include <libnuraft/ptr.hxx>
#include <libnuraft/raft_server.hxx>
#include <Poco/Util/AbstractConfiguration.h>
@ -28,6 +29,7 @@
#include <Common/getMultipleKeysFromConfig.h>
#include <Disks/DiskLocal.h>
#include <fmt/chrono.h>
#include <libnuraft/req_msg.hxx>
namespace DB
{
@ -195,6 +197,15 @@ struct KeeperServer::KeeperRaftServer : public nuraft::raft_server
nuraft::raft_server::commit_in_bg();
}
void commitLogs(uint64_t index_to_commit, bool initial_commit_exec)
{
leader_commit_index_.store(index_to_commit);
quick_commit_index_ = index_to_commit;
lagging_sm_target_index_ = index_to_commit;
commit_in_bg_exec(0, initial_commit_exec);
}
using nuraft::raft_server::raft_server;
// peers are initially marked as responding because at least one cycle
@ -398,27 +409,10 @@ void KeeperServer::startup(const Poco::Util::AbstractConfiguration & config, boo
state_manager->loadLogStore(state_machine->last_commit_index() + 1, coordination_settings->reserved_log_items);
auto log_store = state_manager->load_log_store();
auto next_log_idx = log_store->next_slot();
if (next_log_idx > 0 && next_log_idx > state_machine->last_commit_index())
{
auto log_entries = log_store->log_entries(state_machine->last_commit_index() + 1, next_log_idx);
size_t preprocessed = 0;
LOG_INFO(log, "Preprocessing {} log entries", log_entries->size());
auto idx = state_machine->last_commit_index() + 1;
for (const auto & entry : *log_entries)
{
if (entry && entry->get_val_type() == nuraft::log_val_type::app_log)
state_machine->pre_commit(idx, entry->get_buf());
++idx;
++preprocessed;
if (preprocessed % 50000 == 0)
LOG_TRACE(log, "Preprocessed {}/{} entries", preprocessed, log_entries->size());
}
LOG_INFO(log, "Preprocessing done");
}
last_log_idx_on_disk = log_store->next_slot() - 1;
LOG_TRACE(log, "Last local log idx {}", last_log_idx_on_disk);
if (state_machine->last_commit_index() >= last_log_idx_on_disk)
keeper_context->local_logs_preprocessed = true;
loadLatestConfig();
@ -619,6 +613,84 @@ nuraft::cb_func::ReturnCode KeeperServer::callbackFunc(nuraft::cb_func::Type typ
}
}
if (!keeper_context->local_logs_preprocessed)
{
const auto preprocess_logs = [&]
{
auto log_store = state_manager->load_log_store();
if (last_log_idx_on_disk > 0 && last_log_idx_on_disk > state_machine->last_commit_index())
{
auto log_entries = log_store->log_entries(state_machine->last_commit_index() + 1, last_log_idx_on_disk + 1);
size_t preprocessed = 0;
LOG_INFO(log, "Preprocessing {} log entries", log_entries->size());
auto idx = state_machine->last_commit_index() + 1;
for (const auto & entry : *log_entries)
{
if (entry && entry->get_val_type() == nuraft::log_val_type::app_log)
state_machine->pre_commit(idx, entry->get_buf());
++idx;
++preprocessed;
if (preprocessed % 50000 == 0)
LOG_TRACE(log, "Preprocessed {}/{} entries", preprocessed, log_entries->size());
}
LOG_INFO(log, "Preprocessing done");
}
else
{
LOG_INFO(log, "All local log entries preprocessed");
}
keeper_context->local_logs_preprocessed = true;
};
switch (type)
{
case nuraft::cb_func::InitialBatchCommited:
{
preprocess_logs();
break;
}
case nuraft::cb_func::GotAppendEntryReqFromLeader:
{
auto & req = *static_cast<nuraft::req_msg *>(param->ctx);
if (req.get_commit_idx() == 0 || req.log_entries().empty())
break;
auto last_committed_index = state_machine->last_commit_index();
// Actual log number.
auto index_to_commit = std::min({last_log_idx_on_disk, req.get_last_log_idx(), req.get_commit_idx()});
if (index_to_commit > last_committed_index)
{
LOG_TRACE(log, "Trying to commit local log entries, committing upto {}", index_to_commit);
raft_instance->commitLogs(index_to_commit, true);
/// after we manually committed all the local logs we can, we assert that all of the local logs are either
/// committed or preprocessed
if (!keeper_context->local_logs_preprocessed)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Local logs are not preprocessed");
}
else if (last_log_idx_on_disk <= last_committed_index)
{
keeper_context->local_logs_preprocessed = true;
}
else if
(
index_to_commit == 0 ||
(index_to_commit == last_committed_index && last_log_idx_on_disk > index_to_commit) /// we need to rollback all the logs so we preprocess all of them
)
{
preprocess_logs();
}
break;
}
default:
break;
}
}
const auto follower_preappend = [&](const auto & entry)
{
if (entry->get_val_type() != nuraft::app_log)

View File

@ -44,6 +44,8 @@ private:
std::condition_variable initialized_cv;
std::atomic<bool> initial_batch_committed = false;
uint64_t last_log_idx_on_disk = 0;
nuraft::ptr<nuraft::cluster_config> last_local_config;
Poco::Logger * log;
@ -64,7 +66,7 @@ private:
std::atomic_bool is_recovering = false;
std::shared_ptr<KeeperContext> keeper_context;
KeeperContextPtr keeper_context;
const bool create_snapshot_on_exit;
const bool enable_reconfiguration;

View File

@ -389,6 +389,9 @@ nuraft::ptr<nuraft::buffer> KeeperStateMachine::commit(const uint64_t log_idx, n
request_for_session->log_idx = log_idx;
if (!keeper_context->local_logs_preprocessed)
preprocess(*request_for_session);
auto try_push = [this](const KeeperStorage::ResponseForSession& response)
{
if (!responses_queue.push(response))
@ -493,11 +496,12 @@ bool KeeperStateMachine::apply_snapshot(nuraft::snapshot & s)
}
void KeeperStateMachine::commit_config(const uint64_t /* log_idx */, nuraft::ptr<nuraft::cluster_config> & new_conf)
void KeeperStateMachine::commit_config(const uint64_t log_idx, nuraft::ptr<nuraft::cluster_config> & new_conf)
{
std::lock_guard lock(cluster_config_lock);
auto tmp = new_conf->serialize();
cluster_config = ClusterConfig::deserialize(*tmp);
last_committed_idx = log_idx;
}
void KeeperStateMachine::rollback(uint64_t log_idx, nuraft::buffer & data)

View File

@ -76,6 +76,8 @@ protected:
Poco::AutoPtr<Poco::ConsoleChannel> channel(new Poco::ConsoleChannel(std::cerr));
Poco::Logger::root().setChannel(channel);
Poco::Logger::root().setLevel("trace");
keeper_context->local_logs_preprocessed = true;
}
void setLogDirectory(const std::string & path) { keeper_context->setLogDisk(std::make_shared<DB::DiskLocal>("LogDisk", path)); }

View File

@ -30,7 +30,7 @@
#define DBMS_CLUSTER_PROCESSING_PROTOCOL_VERSION 1
#define DBMS_PARALLEL_REPLICAS_PROTOCOL_VERSION 2
#define DBMS_PARALLEL_REPLICAS_PROTOCOL_VERSION 3
#define DBMS_MIN_REVISION_WITH_PARALLEL_REPLICAS 54453
#define DBMS_MERGE_TREE_PART_INFO_VERSION 1

View File

@ -208,9 +208,8 @@ class IColumn;
M(Bool, allow_experimental_inverted_index, false, "If it is set to true, allow to use experimental inverted index.", 0) \
\
M(UInt64, mysql_max_rows_to_insert, 65536, "The maximum number of rows in MySQL batch insertion of the MySQL storage engine", 0) \
M(Bool, use_mysql_types_in_show_columns, false, "Show native MySQL types in SHOW [FULL] COLUMNS", 0) \
M(Bool, mysql_map_string_to_text_in_show_columns, false, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Will only take effect if use_mysql_types_in_show_columns is enabled too", 0) \
M(Bool, mysql_map_fixed_string_to_text_in_show_columns, false, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Will only take effect if use_mysql_types_in_show_columns is enabled too", 0) \
M(Bool, mysql_map_string_to_text_in_show_columns, false, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
M(Bool, mysql_map_fixed_string_to_text_in_show_columns, false, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
\
M(UInt64, optimize_min_equality_disjunction_chain_length, 3, "The minimum length of the expression `expr = x1 OR ... expr = xN` for optimization ", 0) \
\
@ -288,7 +287,8 @@ class IColumn;
M(Bool, http_write_exception_in_output_format, true, "Write exception in output format to produce valid output. Works with JSON and XML formats.", 0) \
M(UInt64, http_response_buffer_size, 0, "The number of bytes to buffer in the server memory before sending a HTTP response to the client or flushing to disk (when http_wait_end_of_query is enabled).", 0) \
\
M(Bool, fsync_metadata, true, "Do fsync after changing metadata for tables and databases (.sql files). Could be disabled in case of poor latency on server with high load of DDL queries and high load of disk subsystem.", 0) \
M(Bool, fsync_metadata, true, "Do fsync after changing metadata for tables and databases (.sql files). Could be disabled in case of poor latency on server with high load of DDL queries and high load of disk subsystem.", 0) \
M(Bool, storage_metadata_write_full_object_key, false, "Write disk metadata files with VERSION_FULL_OBJECT_KEY format", 0) \
\
M(Bool, join_use_nulls, false, "Use NULLs for non-joined rows of outer JOINs for types that can be inside Nullable. If false, use default value of corresponding columns data type.", IMPORTANT) \
\
@ -365,16 +365,16 @@ class IColumn;
M(UInt64, max_bytes_to_read, 0, "Limit on read bytes (after decompression) from the most 'deep' sources. That is, only in the deepest subquery. When reading from a remote server, it is only checked on a remote server.", 0) \
M(OverflowMode, read_overflow_mode, OverflowMode::THROW, "What to do when the limit is exceeded.", 0) \
\
M(UInt64, max_rows_to_read_leaf, 0, "Limit on read rows on the leaf nodes for distributed queries. Limit is applied for local reads only excluding the final merge stage on the root node. Note, the setting is unstable with prefer_localhost_replica=1.", 0) \
M(UInt64, max_bytes_to_read_leaf, 0, "Limit on read bytes (after decompression) on the leaf nodes for distributed queries. Limit is applied for local reads only excluding the final merge stage on the root node. Note, the setting is unstable with prefer_localhost_replica=1.", 0) \
M(UInt64, max_rows_to_read_leaf, 0, "Limit on read rows on the leaf nodes for distributed queries. Limit is applied for local reads only, excluding the final merge stage on the root node. Note, the setting is unstable with prefer_localhost_replica=1.", 0) \
M(UInt64, max_bytes_to_read_leaf, 0, "Limit on read bytes (after decompression) on the leaf nodes for distributed queries. Limit is applied for local reads only, excluding the final merge stage on the root node. Note, the setting is unstable with prefer_localhost_replica=1.", 0) \
M(OverflowMode, read_overflow_mode_leaf, OverflowMode::THROW, "What to do when the leaf limit is exceeded.", 0) \
\
M(UInt64, max_rows_to_group_by, 0, "If aggregation during GROUP BY is generating more than specified number of rows (unique GROUP BY keys), the behavior will be determined by the 'group_by_overflow_mode' which by default is - throw an exception, but can be also switched to an approximate GROUP BY mode.", 0) \
M(UInt64, max_rows_to_group_by, 0, "If aggregation during GROUP BY is generating more than the specified number of rows (unique GROUP BY keys), the behavior will be determined by the 'group_by_overflow_mode' which by default is - throw an exception, but can be also switched to an approximate GROUP BY mode.", 0) \
M(OverflowModeGroupBy, group_by_overflow_mode, OverflowMode::THROW, "What to do when the limit is exceeded.", 0) \
M(UInt64, max_bytes_before_external_group_by, 0, "If memory usage during GROUP BY operation is exceeding this threshold in bytes, activate the 'external aggregation' mode (spill data to disk). Recommended value is half of available system memory.", 0) \
\
M(UInt64, max_rows_to_sort, 0, "If more than specified amount of records have to be processed for ORDER BY operation, the behavior will be determined by the 'sort_overflow_mode' which by default is - throw an exception", 0) \
M(UInt64, max_bytes_to_sort, 0, "If more than specified amount of (uncompressed) bytes have to be processed for ORDER BY operation, the behavior will be determined by the 'sort_overflow_mode' which by default is - throw an exception", 0) \
M(UInt64, max_rows_to_sort, 0, "If more than the specified amount of records have to be processed for ORDER BY operation, the behavior will be determined by the 'sort_overflow_mode' which by default is - throw an exception", 0) \
M(UInt64, max_bytes_to_sort, 0, "If more than the specified amount of (uncompressed) bytes have to be processed for ORDER BY operation, the behavior will be determined by the 'sort_overflow_mode' which by default is - throw an exception", 0) \
M(OverflowMode, sort_overflow_mode, OverflowMode::THROW, "What to do when the limit is exceeded.", 0) \
M(UInt64, max_bytes_before_external_sort, 0, "If memory usage during ORDER BY operation is exceeding this threshold in bytes, activate the 'external sorting' mode (spill data to disk). Recommended value is half of available system memory.", 0) \
M(UInt64, max_bytes_before_remerge_sort, 1000000000, "In case of ORDER BY with LIMIT, when memory usage is higher than specified threshold, perform additional steps of merging blocks before final merge to keep just top LIMIT rows.", 0) \
@ -385,8 +385,10 @@ class IColumn;
M(OverflowMode, result_overflow_mode, OverflowMode::THROW, "What to do when the limit is exceeded.", 0) \
\
/* TODO: Check also when merging and finalizing aggregate functions. */ \
M(Seconds, max_execution_time, 0, "If query run time exceeded the specified number of seconds, the behavior will be determined by the 'timeout_overflow_mode' which by default is - throw an exception. Note that the timeout is checked and query can stop only in designated places during data processing. It currently cannot stop during merging of aggregation states or during query analysis, and the actual run time will be higher than the value of this setting.", 0) \
M(Seconds, max_execution_time, 0, "If query runtime exceeds the specified number of seconds, the behavior will be determined by the 'timeout_overflow_mode', which by default is - throw an exception. Note that the timeout is checked and query can stop only in designated places during data processing. It currently cannot stop during merging of aggregation states or during query analysis, and the actual run time will be higher than the value of this setting.", 0) \
M(OverflowMode, timeout_overflow_mode, OverflowMode::THROW, "What to do when the limit is exceeded.", 0) \
M(Seconds, max_execution_time_leaf, 0, "Similar semantic to max_execution_time but only apply on leaf node for distributed queries, the time out behavior will be determined by 'timeout_overflow_mode_leaf' which by default is - throw an exception", 0) \
M(OverflowMode, timeout_overflow_mode_leaf, OverflowMode::THROW, "What to do when the leaf limit is exceeded.", 0) \
\
M(UInt64, min_execution_speed, 0, "Minimum number of execution rows per second.", 0) \
M(UInt64, max_execution_speed, 0, "Maximum number of execution rows per second.", 0) \
@ -400,7 +402,7 @@ class IColumn;
\
M(UInt64, max_sessions_for_user, 0, "Maximum number of simultaneous sessions for a user.", 0) \
\
M(UInt64, max_subquery_depth, 100, "If a query has more than specified number of nested subqueries, throw an exception. This allows you to have a sanity check to protect the users of your cluster from going insane with their queries.", 0) \
M(UInt64, max_subquery_depth, 100, "If a query has more than the specified number of nested subqueries, throw an exception. This allows you to have a sanity check to protect the users of your cluster from going insane with their queries.", 0) \
M(UInt64, max_analyze_depth, 5000, "Maximum number of analyses performed by interpreter.", 0) \
M(UInt64, max_ast_depth, 1000, "Maximum depth of query syntax tree. Checked after parsing.", 0) \
M(UInt64, max_ast_elements, 50000, "Maximum size of query syntax tree in number of nodes. Checked after parsing.", 0) \
@ -835,6 +837,7 @@ class IColumn;
MAKE_OBSOLETE(M, Bool, allow_experimental_bigint_types, true) \
MAKE_OBSOLETE(M, Bool, allow_experimental_window_functions, true) \
MAKE_OBSOLETE(M, Bool, allow_experimental_geo_types, true) \
MAKE_OBSOLETE(M, Bool, allow_experimental_query_cache, true) \
\
MAKE_OBSOLETE(M, Milliseconds, async_insert_stale_timeout_ms, 0) \
MAKE_OBSOLETE(M, StreamingHandleErrorMode, handle_kafka_error_mode, StreamingHandleErrorMode::DEFAULT) \
@ -846,6 +849,7 @@ class IColumn;
MAKE_OBSOLETE(M, UInt64, merge_tree_clear_old_parts_interval_seconds, 1) \
MAKE_OBSOLETE(M, UInt64, partial_merge_join_optimizations, 0) \
MAKE_OBSOLETE(M, MaxThreads, max_alter_threads, 0) \
MAKE_OBSOLETE(M, Bool, use_mysql_types_in_show_columns, false) \
/* moved to config.xml: see also src/Core/ServerSettings.h */ \
MAKE_DEPRECATED_BY_SERVER_CONFIG(M, UInt64, background_buffer_flush_schedule_pool_size, 16) \
MAKE_DEPRECATED_BY_SERVER_CONFIG(M, UInt64, background_pool_size, 16) \

View File

@ -458,6 +458,8 @@ inline bool isUInt32(const T & data_type) { return WhichDataType(data_type).isUI
template <typename T>
inline bool isUInt64(const T & data_type) { return WhichDataType(data_type).isUInt64(); }
template <typename T>
inline bool isNativeUnsignedInteger(const T & data_type) { return WhichDataType(data_type).isNativeUInt(); }
template <typename T>
inline bool isUnsignedInteger(const T & data_type) { return WhichDataType(data_type).isUInt(); }
template <typename T>

View File

@ -86,12 +86,18 @@ public:
void serializeBinary(const Field & field, WriteBuffer & ostr, const FormatSettings &) const override
{
IPv x = field.get<IPv>();
writeBinaryLittleEndian(x, ostr);
if constexpr (std::is_same_v<IPv, IPv6>)
writeBinary(x, ostr);
else
writeBinaryLittleEndian(x, ostr);
}
void deserializeBinary(Field & field, ReadBuffer & istr, const FormatSettings &) const override
{
IPv x;
readBinaryLittleEndian(x.toUnderType(), istr);
if constexpr (std::is_same_v<IPv, IPv6>)
readBinary(x, istr);
else
readBinaryLittleEndian(x, istr);
field = NearestFieldType<IPv>(x);
}
void serializeBinary(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override

View File

@ -236,6 +236,9 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
throw Exception(ErrorCodes::ALL_CONNECTION_TRIES_FAILED, "Cannot get consistent cluster snapshot,"
"because replicas are created or removed concurrently");
LOG_TRACE(log, "Got a list of hosts after {} iterations. All hosts: [{}], filtered: [{}], ids: [{}]", iteration,
fmt::join(unfiltered_hosts, ", "), fmt::join(hosts, ", "), fmt::join(host_ids, ", "));
assert(!hosts.empty());
assert(hosts.size() == host_ids.size());
String current_shard = parseFullReplicaName(hosts.front()).first;

View File

@ -396,18 +396,20 @@ std::string ExternalQueryBuilder::composeLoadKeysQuery(
}
else
{
writeString(query, out);
auto condition_position = query.find(CONDITION_PLACEHOLDER_TO_REPLACE_VALUE);
if (condition_position == std::string::npos)
{
writeString(" WHERE ", out);
writeString("SELECT * FROM (", out);
writeString(query, out);
writeString(") WHERE ", out);
composeKeysCondition(key_columns, requested_rows, method, partition_key_prefix, out);
writeString(";", out);
return out.str();
}
writeString(query, out);
WriteBufferFromOwnString condition_value_buffer;
composeKeysCondition(key_columns, requested_rows, method, partition_key_prefix, condition_value_buffer);
const auto & condition_value = condition_value_buffer.str();

View File

@ -302,12 +302,14 @@ public:
struct LocalPathWithObjectStoragePaths
{
std::string local_path;
std::string common_prefix_for_objects;
StoredObjects objects;
LocalPathWithObjectStoragePaths(
const std::string & local_path_, const std::string & common_prefix_for_objects_, StoredObjects && objects_)
: local_path(local_path_), common_prefix_for_objects(common_prefix_for_objects_), objects(std::move(objects_)) {}
const std::string & local_path_,
StoredObjects && objects_)
: local_path(local_path_)
, objects(std::move(objects_))
{}
};
virtual void getRemotePathsRecursive(const String &, std::vector<LocalPathWithObjectStoragePaths> &)

View File

@ -102,9 +102,9 @@ AzureObjectStorage::AzureObjectStorage(
data_source_description.is_encrypted = false;
}
std::string AzureObjectStorage::generateBlobNameForPath(const std::string & /* path */)
ObjectStorageKey AzureObjectStorage::generateObjectKeyForPath(const std::string & /* path */) const
{
return getRandomASCIIString(32);
return ObjectStorageKey::createAsRelative(getRandomASCIIString(32));
}
bool AzureObjectStorage::exists(const StoredObject & object) const
@ -320,18 +320,7 @@ void AzureObjectStorage::removeObjectsIfExist(const StoredObjects & objects)
auto client_ptr = client.get();
for (const auto & object : objects)
{
try
{
auto delete_info = client_ptr->DeleteBlob(object.remote_path);
}
catch (const Azure::Storage::StorageException & e)
{
/// If object doesn't exist...
if (e.StatusCode == Azure::Core::Http::HttpStatusCode::NotFound)
return;
tryLogCurrentException(__PRETTY_FUNCTION__);
throw;
}
removeObjectIfExists(object);
}
}

View File

@ -121,7 +121,7 @@ public:
const std::string & config_prefix,
ContextPtr context) override;
std::string generateBlobNameForPath(const std::string & path) override;
ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isRemote() const override { return true; }

View File

@ -31,11 +31,12 @@ void registerDiskAzureBlobStorage(DiskFactory & factory, bool global_skip_access
getAzureBlobContainerClient(config, config_prefix),
getAzureBlobStorageSettings(config, config_prefix, context));
auto metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, "");
String key_prefix;
auto metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, key_prefix);
std::shared_ptr<IDisk> azure_blob_storage_disk = std::make_shared<DiskObjectStorage>(
name,
/* no namespaces */"",
/* no namespaces */ key_prefix,
"DiskAzureBlobStorage",
std::move(metadata_storage),
std::move(azure_object_storage),

View File

@ -42,9 +42,9 @@ FileCache::Key CachedObjectStorage::getCacheKey(const std::string & path) const
return cache->createKeyForPath(path);
}
std::string CachedObjectStorage::generateBlobNameForPath(const std::string & path)
ObjectStorageKey CachedObjectStorage::generateObjectKeyForPath(const std::string & path) const
{
return object_storage->generateBlobNameForPath(path);
return object_storage->generateObjectKeyForPath(path);
}
ReadSettings CachedObjectStorage::patchSettings(const ReadSettings & read_settings) const

View File

@ -92,7 +92,7 @@ public:
const std::string & getCacheName() const override { return cache_config_name; }
std::string generateBlobNameForPath(const std::string & path) override;
ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isRemote() const override { return object_storage->isRemote(); }

View File

@ -48,14 +48,14 @@ DiskTransactionPtr DiskObjectStorage::createObjectStorageTransaction()
DiskObjectStorage::DiskObjectStorage(
const String & name_,
const String & object_storage_root_path_,
const String & object_key_prefix_,
const String & log_name,
MetadataStoragePtr metadata_storage_,
ObjectStoragePtr object_storage_,
const Poco::Util::AbstractConfiguration & config,
const String & config_prefix)
: IDisk(name_, config, config_prefix)
, object_storage_root_path(object_storage_root_path_)
, object_key_prefix(object_key_prefix_)
, log (&Poco::Logger::get("DiskObjectStorage(" + log_name + ")"))
, metadata_storage(std::move(metadata_storage_))
, object_storage(std::move(object_storage_))
@ -80,7 +80,7 @@ void DiskObjectStorage::getRemotePathsRecursive(const String & local_path, std::
{
try
{
paths_map.emplace_back(local_path, metadata_storage->getObjectStorageRootPath(), getStorageObjects(local_path));
paths_map.emplace_back(local_path, getStorageObjects(local_path));
}
catch (const Exception & e)
{
@ -243,9 +243,9 @@ String DiskObjectStorage::getUniqueId(const String & path) const
bool DiskObjectStorage::checkUniqueId(const String & id) const
{
if (!id.starts_with(object_storage_root_path))
if (!id.starts_with(object_key_prefix))
{
LOG_DEBUG(log, "Blob with id {} doesn't start with blob storage prefix {}, Stack {}", id, object_storage_root_path, StackTrace().toString());
LOG_DEBUG(log, "Blob with id {} doesn't start with blob storage prefix {}, Stack {}", id, object_key_prefix, StackTrace().toString());
return false;
}
@ -470,7 +470,7 @@ DiskObjectStoragePtr DiskObjectStorage::createDiskObjectStorage()
const auto config_prefix = "storage_configuration.disks." + name;
return std::make_shared<DiskObjectStorage>(
getName(),
object_storage_root_path,
object_key_prefix,
getName(),
metadata_storage,
object_storage,
@ -586,7 +586,7 @@ void DiskObjectStorage::restoreMetadataIfNeeded(
{
metadata_helper->restore(config, config_prefix, context);
auto current_schema_version = metadata_helper->readSchemaVersion(object_storage.get(), object_storage_root_path);
auto current_schema_version = metadata_helper->readSchemaVersion(object_storage.get(), object_key_prefix);
if (current_schema_version < DiskObjectStorageRemoteMetadataRestoreHelper::RESTORABLE_SCHEMA_VERSION)
metadata_helper->migrateToRestorableSchema();

View File

@ -37,7 +37,7 @@ friend class DiskObjectStorageRemoteMetadataRestoreHelper;
public:
DiskObjectStorage(
const String & name_,
const String & object_storage_root_path_,
const String & object_key_prefix_,
const String & log_name,
MetadataStoragePtr metadata_storage_,
ObjectStoragePtr object_storage_,
@ -224,7 +224,7 @@ private:
String getReadResourceName() const;
String getWriteResourceName() const;
const String object_storage_root_path;
const String object_key_prefix;
Poco::Logger * log;
MetadataStoragePtr metadata_storage;

Some files were not shown because too many files have changed in this diff Show More