ClickHouse/docs/changelogs/v24.6.1.4423-stable.md

sidebar_position: 1
sidebar_label: 2024

2024 Changelog

ClickHouse release v24.6.1.4423-stable (dcced7c847) FIXME as compared to v24.4.1.2088-stable (6d4b31322d)

Backward Incompatible Change

  • Enable asynchronous load of databases and tables by default. See the async_load_databases in config.xml. While this change is fully compatible, it can introduce a difference in behavior. When async_load_databases is false, as in the previous versions, the server will not accept connections until all tables are loaded. When async_load_databases is true, as in the new version, the server can accept connections before all the tables are loaded. If a query is made to a table that is not yet loaded, it will wait for the table's loading, which can take considerable time. It can change the behavior of the server if it is part of a large distributed system under a load balancer. In the first case, the load balancer can get a connection refusal and quickly failover to another server. In the second case, the load balancer can connect to a server that is still loading the tables, and the query will have a higher latency. Moreover, if many queries accumulate in the waiting state, it can lead to a "thundering herd" problem when they start processing simultaneously. This can make a difference only for highly loaded distributed backends. You can set the value of async_load_databases to false to avoid this problem. #57695 (Alexey Milovidov).
  • Some invalid queries will fail earlier during parsing. Note: disabled the support for inline KQL expressions (the experimental Kusto language) when they are put into a kql table function without a string literal, e.g. kql(garbage | trash) instead of kql('garbage | trash') or kql($$garbage | trash$$). This feature was introduced unintentionally and should not exist. #61500 (Alexey Milovidov).
  • Renamed "inverted indexes" to "full-text indexes" which is a less technical / more user-friendly name. This also changes internal table metadata and breaks tables with existing (experimental) inverted indexes. Please make sure to drop such indexes before the upgrade and re-create them after the upgrade. #62884 (Robert Schulze).
  • Usage of functions neighbor, runningAccumulate, runningDifferenceStartingWithFirstValue, runningDifference is deprecated (because they are error-prone). Proper window functions should be used instead. To enable them back, set allow_deprecated_functions=1. #63132 (Nikita Taranov).
  • Queries from system.columns will work faster if there is a large number of columns, but many databases or tables are not granted for SHOW TABLES. Note that in previous versions, if you grant SHOW COLUMNS to individual columns without granting SHOW TABLES to the corresponding tables, the system.columns table will show these columns, but in a new version, it will skip the table entirely. Remove trace log messages "Access granted" and "Access denied" that slowed down queries. #63439 (Alexey Milovidov).
  • Rework parallel processing in Ordered mode of storage S3Queue. This PR is backward incompatible for Ordered mode if you used settings s3queue_processing_threads_num or s3queue_total_shards_num. Setting s3queue_total_shards_num is deleted, previously it was allowed to use only under s3queue_allow_experimental_sharded_mode, which is now deprecated. A new setting is added - s3queue_buckets. #64349 (Kseniia Sumarokova).
  • New functions snowflakeIDToDateTime, snowflakeIDToDateTime64, dateTimeToSnowflakeID, and dateTime64ToSnowflakeID were added. Unlike the existing functions snowflakeToDateTime, snowflakeToDateTime64, dateTimeToSnowflake, and dateTime64ToSnowflake, the new functions are compatible with function generateSnowflakeID, i.e. they accept the snowflake IDs generated by generateSnowflakeID and produce snowflake IDs of the same type as generateSnowflakeID (i.e. UInt64). Furthermore, the new functions default to the UNIX epoch (aka. 1970-01-01), just like generateSnowflakeID. If necessary, a different epoch, e.g. Twitter's/X's epoch 2010-11-04 aka. 1288834974657 msec since UNIX epoch, can be passed. The old conversion functions are deprecated and will be removed after a transition period: to use them regardless, enable setting allow_deprecated_snowflake_conversion_functions. #64948 (Robert Schulze).
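
The timestamp arithmetic behind the Snowflake ID conversions in the last item can be sketched in Python. This is an illustration only, not ClickHouse's implementation: it assumes the usual Twitter-style layout in which the millisecond timestamp sits above 22 low bits (10 machine-ID bits plus 12 sequence bits), and all helper names are hypothetical. The Twitter/X epoch constant is the value quoted in the entry.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Twitter-style layout, timestamp stored above 22 low bits
# (10 machine-ID bits + 12 sequence bits).
TIMESTAMP_SHIFT = 22
TWITTER_EPOCH_MS = 1288834974657  # value quoted in the changelog entry
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_ms_to_snowflake_id(unix_ms, epoch_ms=0):
    """Smallest Snowflake ID for a timestamp (machine ID and sequence are 0)."""
    return (unix_ms - epoch_ms) << TIMESTAMP_SHIFT


def snowflake_id_to_unix_ms(snowflake_id, epoch_ms=0):
    """Milliseconds since the UNIX epoch encoded in a Snowflake ID."""
    return (snowflake_id >> TIMESTAMP_SHIFT) + epoch_ms


def snowflake_id_to_datetime(snowflake_id, epoch_ms=0):
    """UTC datetime encoded in a Snowflake ID."""
    unix_ms = snowflake_id_to_unix_ms(snowflake_id, epoch_ms)
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)
```

With the default `epoch_ms=0` both directions use the UNIX epoch, matching the default described for the new functions; passing `TWITTER_EPOCH_MS` shifts both directions consistently.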

New Feature

  • Provide support for AzureBlobStorage function in ClickHouse server to use Azure Workload identity to authenticate against Azure blob storage. If use_workload_identity parameter is set in config, workload identity is used for authentication. #57881 (Vinay Suryadevara).
  • Introduce bulk loading to StorageEmbeddedRocksDB by creating and ingesting SST files instead of relying on the rocksdb built-in memtable. This helps to increase importing speed, especially for long-running insert queries to StorageEmbeddedRocksDB tables. Also, introduce StorageEmbeddedRocksDB table settings. #59163 (Duc Canh Le).
  • Introduce statistics of type "number of distinct values". #59357 (Han Fei).
  • User can now parse CRLF with TSV format using a setting input_format_tsv_crlf_end_of_line. Closes #56257. #59747 (Shaun Struwig).
  • Add Hilbert Curve encode and decode functions. #60156 (Artem Mustafin).
  • Adds the Form Format to read/write a single record in the application/x-www-form-urlencoded format. #60199 (Shaun Struwig).
  • Added possibility to compress in CROSS JOIN. #60459 (p1rattttt).
  • New setting input_format_force_null_for_omitted_fields that forces NULL values for omitted fields. #60887 (Constantine Peresypkin).
  • Support joins with inequality conditions that involve columns from both the left and right tables, e.g. t1.y < t2.y. To enable, SET allow_experimental_join_condition = 1. #60920 (lgbo).
  • The s3 storage and s3 table function previously did not support selecting from archive files. It is now possible to iterate over files inside archives in S3. #62259 (Daniil Ivanik).
  • Support for conditional function clamp. #62377 (skyoct).
  • Add npy output format. #62430 (豪肥肥).
  • Added support for reading LINESTRING geometry in WKT format using function readWKTLineString. #62519 (Nikita Mikhaylov).
  • Added SQL functions generateUUIDv7, generateUUIDv7ThreadMonotonic, generateUUIDv7NonMonotonic (with different monotonicity/performance trade-offs) to generate version 7 UUIDs aka. timestamp-based UUIDs with random component. Also added a new function UUIDToNum to extract bytes from a UUID and a new function UUIDv7ToDateTime to extract timestamp component from a UUID version 7. #62852 (Alexey Petrunyaka).
  • Implement Dynamic data type that allows to store values of any type inside it without knowing all of them in advance. Dynamic type is available under a setting allow_experimental_dynamic_type. #63058 (Kruglov Pavel).
  • Allow to attach parts from a different disk. #63087 (Unalian).
  • Allow proxy to be bypassed for hosts specified in no_proxy env variable and ClickHouse proxy configuration. #63314 (Arthur Passos).
  • Introduce bulk loading to StorageEmbeddedRocksDB by creating and ingesting SST files instead of relying on the rocksdb built-in memtable. This helps to increase importing speed, especially for long-running insert queries to StorageEmbeddedRocksDB tables. Also, introduce StorageEmbeddedRocksDB table settings. #63324 (Duc Canh Le).
  • Raw as a synonym for TSVRaw. #63394 (Unalian).
  • Added possibility to do cross join in temporary file if size exceeds limits. #63432 (p1rattttt).
  • Added a new table function loop to support returning query results in an infinite loop. #63452 (Sariel).
  • Added a new SQL function generateSnowflakeID for generating Twitter-style Snowflake IDs. #63577 (Danila Puzov).
  • Add the ability to reshuffle rows during insert to optimize for size without violating the order set by PRIMARY KEY. It's controlled by the setting optimize_row_order (off by default). #63578 (Igor Markelov).
  • On Linux and macOS, if the program has STDOUT redirected to a file with a compression extension, use the corresponding compression method instead of nothing (making it behave similarly to INTO OUTFILE). #63662 (v01dXYZ).
  • Added merge_workload and mutation_workload settings to regulate how resources are utilized and shared between merges, mutations and other workloads. #64061 (Sergei Trifonov).
  • Change warning on high number of attached tables to differentiate tables, views and dictionaries. #64180 (Francisco J. Jurado Moreno).
  • Add support for comparing IPv4 and IPv6 types using the = operator. #64292 (Francisco J. Jurado Moreno).
  • Allow to store named collections in zookeeper. #64574 (Kseniia Sumarokova).
  • Support decimal arguments in binary math functions (pow(), atan2(), max2(), min2(), hypot()). #64582 (Mikhail Gorshkov).
  • Add support for index analysis over hilbertEncode. #64662 (Artem Mustafin).
  • Added SQL functions parseReadableSize (along with OrNull and OrZero variants). #64742 (Francisco J. Jurado Moreno).
  • Add server settings max_table_num_to_throw and max_database_num_to_throw to limit the number of databases or tables on CREATE queries. #64781 (Xu Jia).
  • Add _time virtual column to file-like storages (s3/file/hdfs/url/azureBlobStorage). #64947 (Ilya Golshtein).
  • Introduced new functions base64URLEncode, base64URLDecode and tryBase64URLDecode. #64991 (Mikhail Gorshkov).
  • Add new function editDistanceUTF8, which calculates the edit distance between two UTF8 strings. #65269 (LiuNeng).
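
The difference between a byte-wise edit distance and one computed over UTF-8 code points (as in the editDistanceUTF8 item above) can be sketched in Python. This assumes the classic Levenshtein definition of edit distance; the helper names are hypothetical and the code is not taken from ClickHouse.

```python
def levenshtein(a, b):
    """Classic Levenshtein distance between two sequences (str or bytes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[len(b)]


def edit_distance_code_points(s, t):
    """Distance counted in Unicode code points (a Python str iterates over them)."""
    return levenshtein(s, t)


def edit_distance_bytes(s, t):
    """Distance counted in raw UTF-8 bytes."""
    return levenshtein(s.encode("utf-8"), t.encode("utf-8"))
```

For ASCII-only input the two functions agree; they diverge as soon as a multi-byte character is edited, because one changed character can touch several bytes.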

Performance Improvement

  • Skip merging of newly created projection blocks during INSERT-s. #59405 (Nikita Taranov).
  • Add a native parquet reader, which can read parquet binary to ClickHouse Columns directly. It's controlled by the setting input_format_parquet_use_native_reader (disabled by default). #60361 (ZhiHong Zhang).
  • Reduce the number of virtual function calls in ColumnNullable::size(). #60556 (HappenLee).
  • Process string functions XXXUTF8 'asciily' if input strings are all ascii chars. Inspired by https://github.com/apache/doris/pull/29799. Overall speed up by 1.07x~1.62x. Notice that peak memory usage had been decreased in some cases. #61632 (李扬).
  • Improved performance of selection ({}) globs in StorageS3. #62120 (Andrey Zvonov).
  • HostResolver keeps each IP address several times. If a remote host has several IPs and for some reason (firewall rules, for example) access is allowed on some IPs and forbidden on others, then only the first record of a forbidden IP is marked as failed, and on each try these IPs have a chance to be chosen (and fail again). Even if this is fixed, the DNS cache is dropped every 120 seconds, and the IPs can be chosen again. #62652 (Anton Ivashkin).
  • Speedup splitByRegexp when the regular expression argument is a single-character. #62696 (Robert Schulze).
  • Speed up FixedHashTable by keeping track of the min and max keys used. This allows to reduce the number of cells that need to be verified. #62746 (Jiebin Sun).
  • Add a new configuration prefer_merge_sort_block_bytes to control the memory usage and speed up sorting by 2 times during merges when there are many columns. #62904 (LiuNeng).
  • clickhouse-local will start faster. In previous versions, it was not deleting temporary directories by mistake. Now it will. This closes #62941. #63074 (Alexey Milovidov).
  • Micro-optimizations for the new analyzer. #63429 (Raúl Marín).
  • Index analysis will work if DateTime is compared to DateTime64. This closes #63441. #63443 (Alexey Milovidov).
  • Index analysis will work if DateTime is compared to DateTime64. This closes #63441. #63532 (Raúl Marín).
  • Optimize the resolution of in(LowCardinality, ConstantSet). #64060 (Zhiguo Zhou).
  • Speed up indices of type set a little (around 1.5 times) by removing garbage. #64098 (Alexey Milovidov).
  • Use a thread pool to initialize and destroy hash tables inside ConcurrentHashJoin. #64241 (Nikita Taranov).
  • Optimized vertical merges in tables with sparse columns. #64311 (Anton Popov).
  • Enabled prefetches of data from remote filesystem during vertical merges. It improves latency of vertical merges in tables with data stored on remote filesystem. #64314 (Anton Popov).
  • Reduce redundant calls to isDefault() of ColumnSparse::filter to improve performance. #64426 (Jiebin Sun).
  • Speedup find_super_nodes and find_big_family keeper-client commands by making multiple asynchronous getChildren requests. #64628 (Alexander Gololobov).
  • Improve function least/greatest for nullable numeric type arguments. #64668 (KevinyhZou).
  • Allow merging two consecutive FilterSteps of a query plan. This improves filter-push-down optimization if the filter condition can be pushed down from the parent step. #64760 (Nikolai Kochetov).
  • Remove bad optimization in vertical final implementation and re-enable vertical final algorithm by default. #64783 (Duc Canh Le).
  • Remove ALIAS nodes from the filter expression. This slightly improves performance for queries with PREWHERE (with new analyzer). #64793 (Nikolai Kochetov).
  • Fix performance regression in cross join introduced in #60459 (24.5). #65243 (Nikita Taranov).
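
The idea in the FixedHashTable item above (#62746), tracking the smallest and largest key in use so that fewer cells have to be checked, can be sketched in Python. This is a toy direct-indexed table under stated assumptions (small non-negative integer keys, hypothetical class and method names), not ClickHouse's data structure.

```python
_EMPTY = object()  # marker for an unused cell


class FixedTable:
    """Toy direct-indexed table for small non-negative integer keys."""

    def __init__(self, capacity=65536):
        self.cells = [_EMPTY] * capacity
        self.min_key = capacity  # sentinel: above every valid key
        self.max_key = -1        # sentinel: below every valid key

    def insert(self, key, value):
        self.cells[key] = value
        self.min_key = min(self.min_key, key)
        self.max_key = max(self.max_key, key)

    def cells_to_check(self):
        """Number of cells a full scan has to look at."""
        return max(0, self.max_key - self.min_key + 1)

    def items(self):
        # Only cells inside [min_key, max_key] can be occupied.
        for key in range(self.min_key, self.max_key + 1):
            if self.cells[key] is not _EMPTY:
                yield key, self.cells[key]
```

With keys clustered in a narrow range, a scan visits only that range instead of the whole capacity; an empty table scans nothing because the sentinels produce an empty range.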

Improvement

  • Support empty tuples. #55061 (Amos Bird).
  • Hot reload storage policy for distributed tables when adding a new disk. #58285 (Duc Canh Le).
  • Maps can now have Float32, Float64, Array(T), Map(K,V) and Tuple(T1, T2, ...) as keys. Closes #54537. #59318 (李扬).
  • Avoid possible deadlock during MergeTree index analysis when scheduling threads in a saturated service. #59427 (Sean Haynes).
  • Multiline strings with border preservation and column width change. #59940 (Volodyachan).
  • Make rabbitmq nack broken messages. Closes #45350. #60312 (Kseniia Sumarokova).
  • Support partial trivial count optimization when the query filter is able to select exact ranges from merge tree tables. #60463 (Amos Bird).
  • Fix a crash in asynchronous stack unwinding (such as when using the sampling query profiler) while interpreting debug info. This closes #60460. #60468 (Alexey Milovidov).
  • Reduce max memory usage of multithreaded INSERTs by collecting chunks of multiple threads in a single transform. #61047 (Yarik Briukhovetskyi).
  • Use distinct messages for the s3 error 'no key' in the disk and storage cases. #61108 (Sema Checherinda).
  • Less contention in filesystem cache (part 4). Allow to keep filesystem cache not filled to the limit by doing additional eviction in the background (controlled by keep_free_space_size(elements)_ratio). This allows to release pressure from space reservation for queries (on tryReserve method). Also this is done in a lock free way as much as possible, e.g. should not block normal cache usage. #61250 (Kseniia Sumarokova).
  • The progress bar will work for trivial queries with LIMIT from system.zeros, system.zeros_mt (it already works for system.numbers and system.numbers_mt), and the generateRandom table function. As a bonus, if the total number of records is greater than the max_rows_to_read limit, it will throw an exception earlier. This closes #58183. #61823 (Alexey Milovidov).
  • YAML Merge Key support. #62685 (Azat Khuzhin).
  • Enhance error message when non-deterministic function is used with Replicated source. #62896 (Grégoire Pineau).
  • Fix interserver secret for Distributed over Distributed from remote. #63013 (Azat Khuzhin).
  • Allow using clickhouse-local and its shortcuts clickhouse and ch with a query or queries file as a positional argument. Examples: ch "SELECT 1", ch --param_test Hello "SELECT {test:String}", ch query.sql. This closes #62361. #63081 (Alexey Milovidov).
  • Support configuration substitutions from YAML files. #63106 (Eduard Karacharov).
  • Reduce the memory usage when using Azure object storage by using fixed memory allocation, avoiding the allocation of an extra buffer. #63160 (SmitaRKulkarni).
  • Add TTL information to the system.parts_columns table. #63200 (litlig).
  • Keep previous data in terminal after picking from skim suggestions. #63261 (FlameFactory).
  • Field widths are now calculated correctly, ignoring ANSI escape sequences. #63270 (Shaun Struwig).
  • Enable plain_rewritable metadata for local and Azure (azure_blob_storage) object storages. #63365 (Julia Kartseva).
  • Support English-style Unicode quotes, e.g. “Hello”, world. This is questionable in general but helpful when you type your query in a word processor, such as Google Docs. This closes #58634. #63381 (Alexey Milovidov).
  • Allowed to create MaterializedMySQL database without connection to MySQL. #63397 (Kirill).
  • Remove copying data when writing to filesystem cache. #63401 (Kseniia Sumarokova).
  • Update the usage of error code NUMBER_OF_ARGUMENTS_DOESNT_MATCH by more accurate error codes when appropriate. #63406 (Yohann Jardin).
  • Several minor corner case fixes to proxy support & tunneling. #63427 (Arthur Passos).
  • os_user and client_hostname are now correctly set up for queries for command line suggestions in clickhouse-client. This closes #63430. #63433 (Alexey Milovidov).
  • Fixed tabulation from line numbering, correct handling of length when moving a line if the value has a tab, added tests. #63493 (Volodyachan).
  • Add the aggregate_function_group_array_has_limit_size setting to support discarding data in some scenarios. #63516 (zhongyuankai).
  • Automatically mark a replica of Replicated database as lost and start recovery if some DDL task fails more than max_retries_before_automatic_recovery (100 by default) times in a row with the same error. Also, fixed a bug that could cause skipping DDL entries when an exception is thrown during an early stage of entry execution. #63549 (Alexander Tokmakov).
  • Add http_response_headers setting to support custom response headers in custom HTTP handlers. #63562 (Grigorii).
  • Automatically correct max_block_size=0 to default value. #63587 (Antonio Andelic).
  • Account failed files in s3queue_tracked_file_ttl_sec and s3queue_tracked_files_limit for StorageS3Queue. #63638 (Kseniia Sumarokova).
  • Add a build_id ALIAS column to trace_log to facilitate auto renaming upon detecting binary changes. This is to address #52086. #63656 (Zimu Li).
  • Enable truncate operation for object storage disks. #63693 (MikhailBurdukov).
  • Improve io_uring resubmits visibility. Rename profile event IOUringSQEsResubmits -> IOUringSQEsResubmitsAsync and add a new one IOUringSQEsResubmitsSync. #63699 (Tomer Shafir).
  • Introduce assertions to verify all functions are called with columns of the right size. #63723 (Raúl Marín).
  • The loading of the keywords list is now dependent on the server revision and will be disabled for the old versions of ClickHouse server. CC @azat. #63786 (Nikita Mikhaylov).
  • SHOW CREATE TABLE executed on top of system tables will now show the super handy comment unique for each table which will explain why this table is needed. #63788 (Nikita Mikhaylov).
  • Allow trailing commas in the columns list in the INSERT query. For example, INSERT INTO test (a, b, c, ) VALUES .... #63803 (Alexey Milovidov).
  • Better exception messages for the Regexp format. #63804 (Alexey Milovidov).
  • Allow trailing commas in the Values format. For example, this query is allowed: INSERT INTO test (a, b, c) VALUES (4, 5, 6,);. #63810 (Alexey Milovidov).
  • Clickhouse disks have to read server setting to obtain actual metadata format version. #63831 (Sema Checherinda).
  • Disable pretty format restrictions (output_format_pretty_max_rows/output_format_pretty_max_value_width) when stdout is not TTY. #63942 (Azat Khuzhin).
  • Exception handling now works when ClickHouse is used inside AWS Lambda. Author: Alexey Coolnev. #64014 (Alexey Milovidov).
  • Throw CANNOT_DECOMPRESS instead of CORRUPTED_DATA on invalid compressed data passed via HTTP. #64036 (vdimir).
  • A tip for a single large number in Pretty formats now works for Nullable and LowCardinality. This closes #61993. #64084 (Alexey Milovidov).
  • Now backups with azure blob storage will use multicopy. #64116 (alesapin).
  • Added a new setting, metadata_keep_free_space_bytes to keep free space on the metadata storage disk. #64128 (MikhailBurdukov).
  • Add metrics, logs, and thread names around parts filtering with indices. #64130 (Alexey Milovidov).
  • Allow to use native copy for azure even with different containers. #64154 (alesapin).
  • Add metrics to track the number of directories created and removed by the plain_rewritable metadata storage, and the number of entries in the local-to-remote in-memory map. #64175 (Julia Kartseva).
  • Finally enable native copy for azure. #64182 (alesapin).
  • Ignore allow_suspicious_primary_key on ATTACH and verify on ALTER. #64202 (Azat Khuzhin).
  • The query cache now considers identical queries with different settings as different. This increases robustness in cases where different settings (e.g. limit or additional_table_filters) would affect the query result. #64205 (Robert Schulze).
  • Better exception message for deleting from a table with a projection, so that users can understand the error and the steps that should be taken. #64212 (jsc0218).
  • Support the non standard error code QpsLimitExceeded in object storage as a retryable error. #64225 (Sema Checherinda).
  • Forbid converting a MergeTree table to replicated if the zookeeper path for this table already exists. #64244 (Kirill).
  • If "replica group" is configured for a Replicated database, automatically create a cluster that includes replicas from all groups. #64312 (Alexander Tokmakov).
  • Added settings to disable materialization of skip indexes and statistics on inserts (materialize_skip_indexes_on_insert and materialize_statistics_on_insert). #64391 (Anton Popov).
  • Use the allocated memory size to calculate the row group size and reduce the peak memory of the parquet writer in single-threaded mode. #64424 (LiuNeng).
  • Added new configuration input_format_parquet_prefer_block_bytes to control the average output block bytes, and modified the default value of input_format_parquet_max_block_size to 65409. #64427 (LiuNeng).
  • Always start Keeper with sufficient amount of threads in global thread pool. #64444 (Duc Canh Le).
  • Settings from user config don't affect merges and mutations for MergeTree on top of object storage. #64456 (alesapin).
  • Setting replace_long_file_name_to_hash is enabled by default for MergeTree tables. #64457 (Anton Popov).
  • Improve the iterator of sparse column to reduce call of size(). #64497 (Jiebin Sun).
  • Update condition to use copy for azure blob storage. #64518 (SmitaRKulkarni).
  • Support the non standard error code TotalQpsLimitExceeded in object storage as a retryable error. #64520 (Sema Checherinda).
  • Optimized memory usage of vertical merges for tables with high number of skip indexes. #64580 (Anton Popov).
  • Introduced two additional columns in the system.query_log: used_privileges and missing_privileges. used_privileges is populated with the privileges that were checked during query execution, and missing_privileges contains required privileges that are missing. #64597 (Alexey Katsman).
  • Add settings parallel_replicas_custom_key_range_lower and parallel_replicas_custom_key_range_upper to control how parallel replicas with dynamic shards parallelizes queries when using a range filter. #64604 (josh-hildred).
  • Updated Advanced Dashboard for both open-source and ClickHouse Cloud versions to include a chart for 'Maximum concurrent network connections'. #64610 (Thom O'Connor).
  • The second argument (scale) of functions round(), roundBankers(), floor(), ceil() and trunc() can now be non-const. #64798 (Mikhail Gorshkov).
  • Improve progress report on zeros_mt and generateRandom. #64804 (Raúl Marín).
  • Add an asynchronous metric jemalloc.profile.active to show whether sampling is currently active. This is an activation mechanism in addition to prof.active; both must be active for the calling thread to sample. #64842 (Unalian).
  • Support statistics with ReplicatedMergeTree. #64934 (Han Fei).
  • Don't mark allow_experimental_join_condition as IMPORTANT. This may have prevented distributed queries in a mixed versions cluster from being executed successfully. #65008 (Nikita Mikhaylov).
  • Backported in #65716: StorageS3Queue related fixes and improvements. Deduce a default value of s3queue_processing_threads_num according to the number of physical cpu cores on the server (instead of the previous default value of 1). Set default value of s3queue_loading_retries to 10. Fix possible vague "Uncaught exception" in exception column of system.s3queue. Do not increment retry count on MEMORY_LIMIT_EXCEEDED exception. Move files commit to a stage after insertion into table fully finished to avoid files being committed while not inserted. Add settings s3queue_max_processed_files_before_commit, s3queue_max_processed_rows_before_commit, s3queue_max_processed_bytes_before_commit, s3queue_max_processing_time_sec_before_commit, to better control commit and flush time. #65046 (Kseniia Sumarokova).
  • Added server Asynchronous metrics DiskGetObjectThrottler* and DiskPutObjectThrottler* reflecting the requests-per-second rate limit defined with the s3_max_get_rps and s3_max_put_rps disk settings and the currently available number of requests that could be sent without hitting the throttling limit on the disk. Metrics are defined for every disk that has a configured limit. #65050 (Sergei Trifonov).
  • Added a setting output_format_pretty_display_footer_column_names which when enabled displays column names at the end of the table for long tables (50 rows by default), with the threshold value for minimum number of rows controlled by output_format_pretty_display_footer_column_names_min_rows. #65144 (Shaun Struwig).
  • Restored the previous behaviour of how ClickHouse interprets Tuples in CSV format. This change effectively reverts https://github.com/ClickHouse/ClickHouse/pull/60994 and makes it available only under a few settings: output_format_csv_serialize_tuple_into_separate_columns, input_format_csv_deserialize_separate_columns_into_tuple and input_format_csv_try_infer_strings_from_quoted_tuples. #65170 (Nikita Mikhaylov).
  • Initialize global trace collector for Poco::ThreadPool (needed for keeper, etc). #65239 (Kseniia Sumarokova).
  • Add validation when creating a user with bcrypt_hash. #65242 (Raúl Marín).
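
The query cache item above (#64205) says identical queries with different settings are now treated as different. A minimal Python sketch of that idea follows: the cache key combines the query text with the settings, so a changed setting such as limit cannot return a stale result. Everything here (class, function names, the fake executor) is hypothetical and only illustrates the keying scheme, not ClickHouse's query cache.

```python
def cache_key(query, settings):
    """Key on the query text plus settings, sorted so dict order is irrelevant."""
    return (query, tuple(sorted(settings.items())))


class QueryCache:
    """Toy result cache (hypothetical; not ClickHouse's query cache)."""

    def __init__(self):
        self._entries = {}
        self.misses = 0

    def get_or_compute(self, query, settings, compute):
        key = cache_key(query, settings)
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = compute(query, settings)
        return self._entries[key]


def fake_execute(query, settings):
    """Stand-in for query execution: the 'limit' setting truncates the result."""
    rows = list(range(10))
    return rows[: settings.get("limit", len(rows))]
```

Keying on the query text alone would hand the `limit = 3` result to a later run with `limit = 5`; including the settings in the key keeps the two entries apart.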

Critical Bug Fix (crash, LOGICAL_ERROR, data loss, RBAC)

  • Fix a permission error where a user in a specific situation can escalate their privileges on the default database without necessary grants. #64769 (pufit).
  • Fix crash with UniqInjectiveFunctionsEliminationPass and uniqCombined. #65188 (Raúl Marín).
  • Fix a bug in ClickHouse Keeper that causes digest mismatch during closing session. #65198 (Aleksei Filatov).
  • Forbid QUALIFY clause in the old analyzer. The old analyzer ignored QUALIFY, so it could lead to unexpected data removal in mutations. #65356 (Dmitry Novik).
  • Use correct memory alignment for Distinct combinator. Previously, crash could happen because of invalid memory allocation when the combinator was used. #65379 (Antonio Andelic).
  • Backported in #65846: Check cyclic dependencies on CREATE/REPLACE/RENAME/EXCHANGE queries and throw an exception if there is a cyclic dependency. Previously such cyclic dependencies could lead to a deadlock during server startup. Closes #65355. Also fix some bugs in dependencies creation. #65405 (Kruglov Pavel).
  • Backported in #65714: Fix crash in maxIntersections. #65689 (Raúl Marín).
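
The cyclic dependency check described above (#65405) amounts to cycle detection in a directed graph of table dependencies. The following Python sketch shows one common way to do it, a depth-first search with three node states; the function names and the dictionary representation are assumptions for illustration and do not reflect ClickHouse's internals.

```python
def has_cycle(dependencies):
    """dependencies maps a table name to the names it depends on."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for dep in dependencies.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:  # back edge: dep is already on the current path
                return True
            if state == WHITE and visit(dep):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n)
               for n in list(dependencies))


def would_create_cycle(dependencies, table, new_deps):
    """Check a proposed CREATE/REPLACE-style change before applying it."""
    candidate = dict(dependencies)
    candidate[table] = list(new_deps)
    return has_cycle(candidate)
```

Running the check on a copy of the graph before applying a change lets the change be rejected with an error instead of leaving a cycle behind to be discovered at startup.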

Bug Fix (user-visible misbehavior in an official stable release)

  • Fix making backup when multiple shards are used. This PR fixes #56566. #57684 (Vitaly Baranov).
  • Fix passing projections/indexes from CREATE query into inner table of MV. #59183 (Azat Khuzhin).
  • Fix boundRatio incorrect merge. #60532 (Tao Wang).
  • Fix crash when using some functions with low-cardinality columns. #61966 (Michael Kolupaev).
  • Fixed 'set' skip index not working with IN and indexHint(). #62083 (Michael Kolupaev).
  • Fix queries with FINAL giving wrong results when the table does not use adaptive granularity. #62432 (Duc Canh Le).
  • Improve the detection of cgroups v2 memory controller in unusual locations. This fixes a warning that the cgroup memory observer was disabled because no cgroups v1 or v2 current memory file could be found. #62903 (Robert Schulze).
  • Fix subsequent use of external tables in client. #62964 (Azat Khuzhin).
  • Fix crash with untuple and unresolved lambda. #63131 (Raúl Marín).
  • Fix a bug which could lead to the server accepting connections before it is actually loaded. #63181 (alesapin).
  • Fix intersect parts when restart after drop range. #63202 (Han Fei).
  • Fix a misbehavior when SQL security defaults don't load for old tables during server startup. #63209 (pufit).
  • JOIN filter push down filled join fix. Closes #63228. #63234 (Maksim Kita).
  • Fix infinite loop while listing objects in Azure blob storage. #63257 (Julia Kartseva).
  • CROSS join can be executed with any value of the join_algorithm setting. Closes #62431. #63273 (vdimir).
  • Fixed a potential crash caused by a no space left error when temporary data in the cache is used. #63346 (vdimir).
  • Fix bug which could potentially lead to rare LOGICAL_ERROR during SELECT query with message: Unexpected return type from materialize. Expected type_XXX. Got type_YYY. Introduced in #59379. #63353 (alesapin).
  • Fix X-ClickHouse-Timezone header returning wrong timezone when using session_timezone as query level setting. #63377 (Andrey Zvonov).
  • Fix debug assert when using grouping WITH ROLLUP and LowCardinality types. #63398 (Raúl Marín).
  • Fix logical errors in queries with GROUPING SETS and WHERE and group_by_use_nulls = true, close #60538. #63405 (vdimir).
  • Fix backup of projection part in case projection was removed from table metadata, but part still has projection. #63426 (Kseniia Sumarokova).
  • Fix 'Every derived table must have its own alias' error for MYSQL dictionary source, close #63341. #63481 (vdimir).
  • Insert QueryFinish on AsyncInsertFlush with no data. #63483 (Raúl Marín).
  • Fix system.query_log.used_dictionaries logging. #63487 (Eduard Karacharov).
  • Support executing function during assignment of parameterized view value. #63502 (SmitaRKulkarni).
  • Avoid segfault in MergeTreePrefetchedReadPool while fetching projection parts. #63513 (Antonio Andelic).
  • Fix rabbitmq heap-use-after-free found by clang-18, which can happen if an error is thrown from RabbitMQ during initialization of exchange and queues. #63515 (Kseniia Sumarokova).
  • Fix crash on exit with sentry enabled (due to openssl destroyed before sentry). #63548 (Azat Khuzhin).
  • Fixed parquet memory tracking. #63584 (Michael Kolupaev).
  • Fix support for Array and Map with Keyed hashing functions and materialized keys. #63628 (Salvatore Mesoraca).
  • Fixed Parquet filter pushdown not working with Analyzer. #63642 (Michael Kolupaev).
  • It is forbidden to convert MergeTree to replicated if the zookeeper path for this table already exists. #63670 (Kirill).
  • Read only the necessary columns from VIEW (new analyzer). Closes #62594. #63688 (Maksim Kita).
  • Fix rare case with missing data in the result of distributed query. #63691 (vdimir).
  • Fix #63539. Forbid WINDOW redefinition in new analyzer. #63694 (Dmitry Novik).
  • Fix flatten_nested being broken with Replicated database. #63695 (Nikolai Kochetov).
  • Fix SIZES_OF_COLUMNS_DOESNT_MATCH error for queries with arrayJoin function in WHERE. Fixes #63653. #63722 (Nikolai Kochetov).
  • Fix Not found column and CAST AS Map from array requires nested tuple of 2 elements exceptions for distributed queries which use Map(Nothing, Nothing) type. Fixes #63637. #63753 (Nikolai Kochetov).
  • Fix possible ILLEGAL_COLUMN error in partial_merge join, close #37928. #63755 (vdimir).
  • Fix query_plan_remove_redundant_distinct breaking queries with window functions (when allow_experimental_analyzer is on). Fixes #62820. #63776 (Igor Nikonov).
  • Fix possible crash with SYSTEM UNLOAD PRIMARY KEY. #63778 (Raúl Marín).
  • Fix a query with a duplicating cycling alias. Fixes #63320. #63791 (Nikolai Kochetov).
  • Fixed performance degradation of parsing data formats in INSERT query. This closes #62918. This partially reverts #42284, which breaks the original design and introduces more problems. #63801 (Alexey Milovidov).
  • Add 'endpoint_subpath' S3 URI setting to allow plain_rewritable disks to share the same endpoint. #63806 (Julia Kartseva).
  • Fix queries using parallel read buffer (e.g. with max_download_threads > 0) getting stuck when threads cannot be allocated. #63814 (Antonio Andelic).
  • Allow JOIN filter push down to both streams if only single equivalent column is used in query. Closes #63799. #63819 (Maksim Kita).
  • Remove the data from all disks after DROP with the Lazy database engines. Without this change, orphaned data would remain on the disks. #63848 (MikhailBurdukov).
  • Fix incorrect select query result when parallel replicas were used to read from a Materialized View. #63861 (Nikita Taranov).
  • Fixes in the find_super_nodes and find_big_family commands of keeper-client: do not fail on ZNONODE errors; find super nodes inside super nodes; properly calculate the subtree node count. #63862 (Alexander Gololobov).
  • Fix the error Database name is empty for remote queries with lambdas over the cluster with modified default database. Fixes #63471. #63864 (Nikolai Kochetov).
  • Fix SIGSEGV due to CPU/Real (query_profiler_real_time_period_ns/query_profiler_cpu_time_period_ns) profiler (has been an issue since 2022, that leads to periodic server crashes, especially if you were using distributed engine). #63865 (Azat Khuzhin).
  • Fixed EXPLAIN CURRENT TRANSACTION query. #63926 (Anton Popov).
  • Fix analyzer: the IN function with arbitrarily deep sub-selects in a materialized view now uses the insertion block. #63930 (Yakov Olkhovskiy).
  • Allow ALTER TABLE .. MODIFY|RESET SETTING and ALTER TABLE .. MODIFY COMMENT for plain_rewritable disk. #63933 (Julia Kartseva).
  • Fix Recursive CTE with distributed queries. Closes #63790. #63939 (Maksim Kita).
  • Fixed reading of columns of type Tuple(Map(LowCardinality(String), String), ...). #63956 (Anton Popov).
  • Fix resolve of unqualified COLUMNS matcher. Preserve the input columns order and forbid usage of unknown identifiers. #63962 (Dmitry Novik).
  • Fix the Not found column error for queries with skip_unused_shards = 1, LIMIT BY, and the new analyzer. Fixes #63943. #63983 (Nikolai Kochetov).
  • (Low-quality third-party Kusto Query Language). Resolve Client Abortion Issue When Using KQL Table Function in Interactive Mode. #63992 (Yong Wang).
  • Fix a Cyclic aliases error for cyclic aliases of different type (expression and function). #63993 (Nikolai Kochetov).
  • Deserialize untrusted binary inputs in a safer way. #64024 (Robert Schulze).
  • Do not throw Storage doesn't support FINAL error for remote queries over non-MergeTree tables with final = true and new analyzer. Fixes #63960. #64037 (Nikolai Kochetov).
  • Add missing settings to recoverLostReplica. #64040 (Raúl Marín).
  • Fix unwind on SIGSEGV on aarch64 (due to small stack for signal). #64058 (Azat Khuzhin).
  • Use a proper redefined context with the correct definer for each individual view in the query pipeline. #64079 (pufit).
  • Fix the "Not found column" error when using INTERPOLATE with the analyzer. #64096 (Yakov Olkhovskiy).
  • Fix azure backup writing multipart blocks as 1mb (read buffer size) instead of max_upload_part_size. #64117 (Kseniia Sumarokova).
  • Fix creating backups to S3 buckets with different credentials from the disk containing the file. #64153 (Antonio Andelic).
  • Prevent LOGICAL_ERROR on CREATE TABLE as MaterializedView. #64174 (Raúl Marín).
  • The query cache now considers two identical queries against different databases as different. The previous behavior could be used to bypass missing privileges to read from a table. #64199 (Robert Schulze).
  • Fix possible abort on uncaught exception in ~WriteBufferFromFileDescriptor in StatusFile. #64206 (Kruglov Pavel).
  • Ignore text_log config when using Keeper. #64218 (Antonio Andelic).
  • Fix duplicate alias error for distributed queries with ARRAY JOIN. #64226 (Nikolai Kochetov).
  • Fix unexpected accurateCast from string to integer. #64255 (wudidapaopao).
  • Fixed CNF simplification, in case any OR group contains mutually exclusive atoms. #64256 (Eduard Karacharov).
  • Fix Query Tree size validation. #64377 (Dmitry Novik).
  • Fix Logical error: Bad cast for Buffer table with PREWHERE. #64388 (Nikolai Kochetov).
  • Prevent recursive logging in blob_storage_log when it's stored on object storage. #64393 (vdimir).
  • Fixed CREATE TABLE AS queries for tables with default expressions. #64455 (Anton Popov).
  • Fixed optimize_read_in_order behaviour for ORDER BY ... NULLS FIRST / LAST on tables with nullable keys. #64483 (Eduard Karacharov).
  • Fix the Expression nodes list expected 1 projection names and Unknown expression or identifier errors for queries with aliases to GLOBAL IN. #64517 (Nikolai Kochetov).
  • Fix an error Cannot find column in distributed queries with constant CTE in the GROUP BY key. #64519 (Nikolai Kochetov).
  • Fixed ORC statistics calculation, when writing, for unsigned types on all platforms and Int8 on ARM. #64563 (Michael Kolupaev).
  • Fix the crash loop when restoring from backup is blocked by creating an MV with a definer that hasn't been restored yet. #64595 (pufit).
  • Fix the output of function formatDateTimeInJodaSyntax when a formatter generates an uneven number of characters and the last character is 0. For example, SELECT formatDateTimeInJodaSyntax(toDate('2012-05-29'), 'D') now correctly returns 150 instead of previously 15. #64614 (LiuNeng).
  • Do not rewrite aggregation if -If combinator is already used. #64638 (Dmitry Novik).
  • Fix type inference for float (in case of small buffer, i.e. --max_read_buffer_size 1). #64641 (Azat Khuzhin).
  • Fix bug which could lead to non-working TTLs with expressions. #64694 (alesapin).
  • Fix removing the WHERE and PREWHERE expressions, which are always true (for the new analyzer). #64695 (Nikolai Kochetov).
  • Fixed excessive part elimination by token-based text indexes (ngrambf, full_text) when filtering by result of startsWith, endsWith, match, multiSearchAny. #64720 (Eduard Karacharov).
  • Fixes incorrect behaviour of ANSI CSI escaping in the UTF8::computeWidth function. #64756 (Shaun Struwig).
  • Fix a case of incorrect removal of ORDER BY / LIMIT BY across subqueries. #64766 (Raúl Marín).
  • Fix (experimental) unequal join with subqueries for sets which are in the mixed join conditions. #64775 (lgbo).
  • Fix crash in a local cache over plain_rewritable disk. #64778 (Julia Kartseva).
  • Keeper fix: return correct value for zk_latest_snapshot_size in mntr command. #64784 (Antonio Andelic).
  • Fix Cannot find column in distributed query with ARRAY JOIN by Nested column. Fixes #64755. #64801 (Nikolai Kochetov).
  • Fix memory leak in slru cache policy. #64803 (Kseniia Sumarokova).
  • Fixed possible incorrect memory tracking in several kinds of queries: queries that read any data from S3, queries via http protocol, asynchronous inserts. #64844 (Anton Popov).
  • Fix the Block structure mismatch error for queries reading with PREWHERE from the materialized view when the materialized view has columns of different types than the source table. Fixes #64611. #64855 (Nikolai Kochetov).
  • Fix rare crash when table has TTL with subquery + database replicated + parallel replicas + analyzer. It's really rare, but please don't use TTLs with subqueries. #64858 (alesapin).
  • Fix duplicating Delete events in blob_storage_log in case of large batch to delete. #64924 (vdimir).
  • Backported in #65544: Fix crash for ALTER TABLE ... ON CLUSTER ... MODIFY SQL SECURITY. #64957 (pufit).
  • Fixed Session moved to another server error from [Zoo]Keeper that might happen after server startup when the config has includes from [Zoo]Keeper. #64986 (Alexander Tokmakov).
  • Backported in #65582: Fix crash on destroying AccessControl: add explicit shutdown. #64993 (Vitaly Baranov).
  • Fix ALTER MODIFY COMMENT query that was broken for parameterized VIEWs in https://github.com/ClickHouse/ClickHouse/pull/54211. #65031 (Nikolay Degterinsky).
  • Fix host_id in DatabaseReplicated when cluster_secure_connection parameter is enabled. Previously all the connections within the cluster created by DatabaseReplicated were not secure, even if the parameter was enabled. #65054 (Nikolay Degterinsky).
  • Fixing the Not-ready Set error after the PREWHERE optimization for StorageMerge. #65057 (Nikolai Kochetov).
  • Avoid writing to finalized buffer in File-like storages. #65063 (Kruglov Pavel).
  • Fix possible infinite query duration in case of cyclic aliases. Fixes #64849. #65081 (Nikolai Kochetov).
  • Fix the Unknown expression identifier error for remote queries with INTERPOLATE (alias) (new analyzer). Fixes #64636. #65090 (Nikolai Kochetov).
  • Fix pushing arithmetic operations out of aggregation. In the new analyzer, optimization was applied only once. #65104 (Dmitry Novik).
  • Fix aggregate function name rewriting in the new analyzer. #65110 (Dmitry Novik).
  • Respond with 5xx instead of 200 OK in case of receive timeout while reading (parts of) the request body from the client socket. #65118 (Julian Maicher).
  • Backported in #65734: Eliminate injective function in argument of functions uniq* recursively. This used to work correctly but was broken in the new analyzer. #65140 (Duc Canh Le).
  • Fix possible crash for hedged requests. #65206 (Azat Khuzhin).
  • Fix the bug in Hashed and Hashed_Array dictionary short circuit evaluation, which may read an uninitialized number, leading to various errors. #65256 (jsc0218).
  • Ensure that the type of the constant (the IN operator's second argument) is always visible during the IN operator's type conversion. Otherwise, losing type information may cause some conversions to fail, such as the conversion from DateTime to Date. Fixes #64487. #65315 (pn).
  • Backported in #65665: Disable non-intersecting-parts optimization for queries with FINAL in case the read-in-order optimization was enabled. This could lead to an incorrect query result. As a workaround, disable do_not_merge_across_partitions_select_final and split_parts_ranges_into_intersecting_and_non_intersecting_final before this fix is merged. #65505 (Nikolai Kochetov).
  • Backported in #65606: Fix getting exception Index out of bound for blob metadata in case all files from list batch were filtered out. #65523 (Kseniia Sumarokova).
  • Backported in #65790: Fixed bug in MergeJoin. Column in sparse serialisation might be treated as a column of its nested type though the required conversion wasn't performed. #65632 (Nikita Taranov).
  • Backported in #65814: Fix invalid exceptions in function parseDateTime with %F and %D placeholders. #65768 (Antonio Andelic).
  • Backported in #65830: Fix a bug in short circuit logic when the old analyzer and dictGetOrDefault are used. #65802 (jsc0218).
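
As an independent sanity check of the formatDateTimeInJodaSyntax entry in the list above (#64614), the following Python sketch recomputes the day of the year for 2012-05-29, which is what the Joda pattern D is expected to produce for that input. The helper name joda_day_of_year is hypothetical and is not part of ClickHouse; the sketch only confirms the expected value and does not reproduce the ClickHouse implementation or the original bug.

```python
from datetime import date


def joda_day_of_year(d: date) -> str:
    """Return the day of the year as an unpadded decimal string.

    Hypothetical helper: it mirrors the expected output of the Joda
    pattern 'D' for a date, not ClickHouse's internal formatter.
    """
    return str(d.timetuple().tm_yday)


if __name__ == "__main__":
    # 2012 is a leap year: 31 + 29 + 31 + 30 days precede May, plus 29.
    print(joda_day_of_year(date(2012, 5, 29)))
```

The result has an odd number of characters and ends in 0, which matches the case the changelog entry describes.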

Build/Testing/Packaging Improvement

NO CL CATEGORY

NO CL ENTRY

  • NO CL ENTRY: 'Revert "Do not remove server constants from GROUP BY key for secondary query."'. #63297 (Alexey Milovidov).
  • NO CL ENTRY: 'Revert "Introduce bulk loading to StorageEmbeddedRocksDB"'. #63316 (Alexey Milovidov).
  • NO CL ENTRY: 'Revert "Revert "Do not remove server constants from GROUP BY key for secondary query.""'. #63415 (Nikolai Kochetov).
  • NO CL ENTRY: 'Revert "Fix index analysis for DateTime64"'. #63525 (Raúl Marín).
  • NO CL ENTRY: 'Revert "Update gui.md - Add ch-ui to open-source available tools."'. #64064 (Alexey Milovidov).
  • NO CL ENTRY: 'Revert "Prevent conversion to Replicated if zookeeper path already exists"'. #64214 (Sergei Trifonov).
  • NO CL ENTRY: 'Revert "Refactoring of Server.h: Isolate server management from other logic"'. #64425 (Alexander Tokmakov).
  • NO CL ENTRY: 'Revert "Remove some unnecessary UNREACHABLEs"'. #64430 (Alexander Tokmakov).
  • NO CL ENTRY: 'Revert "CI: fix build_report selection in case of job reuse"'. #64516 (Max K.).
  • NO CL ENTRY: 'Revert "Revert "CI: fix build_report selection in case of job reuse""'. #64531 (Max K.).
  • NO CL ENTRY: 'Revert "Add fromReadableSize function"'. #64616 (Robert Schulze).
  • NO CL ENTRY: 'Update CHANGELOG.md'. #64816 (Paweł Kudzia).
  • NO CL ENTRY: 'Revert "Reduce lock contention for MergeTree tables (by renaming parts without holding lock)"'. #64899 (alesapin).
  • NO CL ENTRY: 'Revert "Add dynamic untracked memory limits for more precise memory tracking"'. #64969 (Sergei Trifonov).
  • NO CL ENTRY: 'Revert "Fix duplicating Delete events in blob_storage_log"'. #65049 (Alexander Tokmakov).
  • NO CL ENTRY: 'Revert "Revert "Fix duplicating Delete events in blob_storage_log""'. #65053 (vdimir).
  • NO CL ENTRY: 'Revert "S3: reduce retires time for queries, increase retries count for backups"'. #65148 (Raúl Marín).
  • NO CL ENTRY: 'Revert "Small fix for 02340_parts_refcnt_mergetree"'. #65149 (Raúl Marín).
  • NO CL ENTRY: 'Revert "Change default s3_throw_on_zero_files_match to true, document that presigned S3 URLs are not supported"'. #65250 (Max K.).
  • NO CL ENTRY: 'Revert "Fix AWS ECS"'. #65361 (Alexander Tokmakov).

NOT FOR CHANGELOG / INSIGNIFICANT