sidebar_position: 1
sidebar_label: 2022

2022 Changelog

ClickHouse release v21.2.1.5869-prestable FIXME as compared to v21.1.1.5646-prestable

Backward Incompatible Change

  • Fix memory tracking for OPTIMIZE TABLE/merges; account for query memory limits and sampling for OPTIMIZE TABLE/merges. #18772 (Azat Khuzhin).
  • Forbid lcm/gcd for floats. #19532 (Azat Khuzhin).
  • Bitwise functions (bitAnd, bitOr, etc.) are forbidden for floating-point arguments. You now have to cast the arguments to an integer type explicitly. #19853 (Azat Khuzhin).

New Feature

  • Add support for the zstd long option for better compression of string columns to save space. #17184 (ygrek).
  • Added support for mapping LDAP group names, and attribute values in general, to local roles for users from LDAP user directories. #17211 (Denis Glazachev).
  • Data type Nested now supports arbitrary levels of nesting. Introduced subcolumns of complex types, such as size0 in Array, null in Nullable, and names of Tuple elements, which can be read without reading the whole column. #17310 (Anton Popov).
  • Add support for a tuple argument to the argMin and argMax functions. #17359 (Ildus Kurbangaliev).
  • Added Nullable support for FlatDictionary, HashedDictionary, ComplexKeyHashedDictionary, DirectDictionary, ComplexKeyDirectDictionary, RangeHashedDictionary. #18236 (Maksim Kita).
  • Disallow floating-point columns as partition keys. Related to #18421#event-4147046255. #18464 (hexiaoting).
  • Add function decodeXMLComponent to decode characters for XML. Example: SELECT decodeXMLComponent('Hello,"world"!'); #17659. #18542 (nauta).
  • Added PostgreSQL table engine (both select/insert, with support for multidimensional arrays), also as table function. Added PostgreSQL dictionary source. Added PostgreSQL database engine. #18554 (Kseniia Sumarokova).
  • Add SELECT ALL syntax. Closes #18706. #18723 (flynn).
  • Add three functions for the Map data type: mapContains(map, key) checks whether map.keys includes the given key; mapKeys(map) returns all the keys as an Array; mapValues(map) returns all the values as an Array. #18788 (hexiaoting).
  • Support the MetaKey+Enter hotkey binding in the Play UI. #19012 (sundyli).
  • Function formatDateTime supports the %Q modifier, which formats a date as its quarter. ... #19224 (Jianmei Zhang).
  • ... #19261 (RegulusZ).
  • Add the names of factory objects created during a query to system.query_log. Closes #18495. #19371 (Kseniia Sumarokova).
  • Add sign math function. #19527 (flynn).
  • Added functions parseDateTimeBestEffortUSOrZero, parseDateTimeBestEffortUSOrNull. #19712 (Maksim Kita).
  • ... #19764 (emhlbmc).
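
The semantics of a few functions listed above can be modelled outside ClickHouse. The Python sketch below is not ClickHouse code: a dict stands in for the Map type, the helper names are hypothetical mirrors of mapContains, mapKeys, mapValues, sign, and the %Q quarter modifier, and details such as return types, key ordering, and NaN handling are assumptions.

```python
from datetime import date

def map_contains(m: dict, key) -> bool:
    """Model of mapContains(map, key): is `key` among the map's keys?"""
    return key in m

def map_keys(m: dict) -> list:
    """Model of mapKeys(map): all keys as a list (stand-in for Array)."""
    return list(m.keys())

def map_values(m: dict) -> list:
    """Model of mapValues(map): all values as a list (stand-in for Array)."""
    return list(m.values())

def sign(x) -> int:
    """Model of sign(x): -1 for negative, 0 for zero, 1 for positive."""
    return (x > 0) - (x < 0)

def quarter(d: date) -> int:
    """Model of the %Q modifier: quarter of the year, 1 through 4."""
    return (d.month - 1) // 3 + 1

if __name__ == "__main__":
    m = {"a": 1, "b": 2}
    print(map_contains(m, "a"), map_keys(m), map_values(m))
    print(sign(-3.5), sign(0), sign(7))
    print(quarter(date(2021, 2, 4)))
```

In ClickHouse itself these are called directly in a query; the sketch only pins down what each function is described as returning.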

Performance Improvement

  • Use a connection pool for S3 connections, controlled by the s3_max_connections setting. #13405 (Vladimir Chebotarev).
  • Rewrite the sumIf() and sum(if()) functions to the countIf() function when logically equivalent. #17041 (flynn).
  • Update libcxx and use unstable ABI to provide better performance. #18914 (Daniel Kutenin).
  • Faster parts removal by lowering the number of stat syscalls. This restores an optimization that existed a while ago. Safer interface for IDisk. This closes #19065. #19086 (Alexey Milovidov).
  • Speed up the aggregate function sum. The improvement is only visible in synthetic benchmarks and is not very practical. #19216 (Alexey Milovidov).
  • Support splitting the Filter step of a query plan into an Expression + Filter pair. Together with the Expression + Expression merging optimization (#17458), this may delay execution of some expressions until after the Filter step. #19253 (Nikolai Kochetov).
  • Reduce lock contention for multiple layers of the Buffer engine. #19379 (Azat Khuzhin).
  • Slightly improve server latency by removing access to configuration on every connection. #19863 (Alexey Milovidov).
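
The sumIf()/sum(if()) rewrite listed above works because summing a constant 1 over the rows that match a condition is the same as counting those rows. The Python sketch below models that equivalence on a small list. The helper names are hypothetical, and which patterns the optimizer actually recognises (a literal 1, or if(cond, 1, 0)) is an assumption here.

```python
def sum_if(values, conds):
    """Model of sumIf(value, cond): sum the values whose condition holds."""
    return sum(v for v, c in zip(values, conds) if c)

def count_if(conds):
    """Model of countIf(cond): number of rows whose condition holds."""
    return sum(1 for c in conds if c)

rows = [5, 12, 7, 30, 1]
conds = [x > 6 for x in rows]

# Assumed rewritable form 1: sumIf(1, cond)
ones_sum = sum_if([1] * len(rows), conds)

# Assumed rewritable form 2: sum(if(cond, 1, 0))
if_sum = sum(1 if c else 0 for c in conds)

# Both equal the count of matching rows, so countIf can replace them.
assert ones_sum == if_sum == count_if(conds)

# Summing a non-constant column is NOT equivalent to counting.
assert sum_if(rows, conds) != count_if(conds)
```

The last assertion shows why the rewrite is restricted to logically equivalent cases: once the summed value is not the constant 1, the sum and the count differ.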

Improvement

  • Added support for WITH ... [AND] [PERIODIC] REFRESH [interval_in_sec] clause when creating LIVE VIEW tables. #14822 (vzakaznikov).
  • Add optimize_alias_column_prediction (on by default), which will: (1) respect aliased columns in WHERE during partition pruning and when skipping data using secondary indexes; (2) respect aliased columns in WHERE for trivial count queries for optimize_trivial_count; (3) respect aliased columns in GROUP BY/ORDER BY for optimize_aggregation_in_order/optimize_read_in_order. #16995 (sundyli).
  • Updated AWS C++ SDK in order to utilize global regions in S3. #17870 (Vladimir Chebotarev).
  • Support INSERT into the table function cluster, and for both table functions remote and cluster, support distributing data across nodes by specifying a sharding key. Closes #16752. #18264 (flynn).
  • Support EXISTS VIEW syntax. #18552 (Du Chuan).
  • Update librdkafka to v1.6.0-RC2. Fixes #18668. #18671 (filimonov).
  • Allow CTE to be further aliased. Propagate CSE to subqueries in the same level when enable_global_with_statement = 1. This fixes #17378. This fixes https://github.com/ClickHouse/ClickHouse/pull/16575#issuecomment-753416235. #18684 (Amos Bird).
  • Add support for [UInt8, UInt16, UInt32, UInt64] argument types in the bitmapTransform, bitmapSubsetInRange, bitmapSubsetLimit, and bitmapContains functions. This closes #18713. #18791 (sundyli).
  • Added prefix-based S3 endpoint settings. #18812 (Vladimir Chebotarev).
  • Fix issues with RIGHT and FULL JOIN of tables with aggregate function states. In previous versions, an exception about the cloneResized method was thrown. #18818 (templarzq).
  • Check the per-block checksum of the distributed batch on the sender before sending (without reading the file twice; the checksums are verified while reading). This avoids a stuck INSERT on the receiver when the .bin file on the sender is truncated. Avoid reading .bin files twice for batched INSERT (this was required to calculate rows/bytes to take squashing into account; this information is now included in the header, and backward compatibility is preserved). #18853 (Azat Khuzhin).
  • Add normalizeQueryKeepNames and normalizedQueryHashKeepNames to normalize queries without masking long names with ?. This helps better analyze complex query logs. #18910 (Amos Bird).
  • Docker image: several improvements for clickhouse-server entrypoint. #18954 (filimonov).
  • Fixed PeekableReadBuffer: Memory limit exceed error when inserting data with huge strings. Fixes #18690. #18979 (Alexander Tokmakov).
  • Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server images. #19096 (filimonov).
  • The exception thrown when function bar is called with a certain NaN argument could be slightly misleading in previous versions. This fixes #19088. #19107 (Alexey Milovidov).
  • Allow changing max_server_memory_usage without restart. This closes #18154. #19186 (Alexey Milovidov).
  • Fix wrong alignment of values of IPv4 data type in Pretty formats. They were aligned to the right, not to the left. This closes #19184. #19339 (Alexey Milovidov).
  • Allow Docker to be executed with an arbitrary uid. #19374 (filimonov).
  • Add metrics for MergeTree parts (Wide/Compact/InMemory) types. #19381 (Azat Khuzhin).
  • Improve MySQL compatibility. #19387 (Daniil Kondratyev).
  • Add http_referer field to system.query_log, system.processes, etc. This closes #19389. #19390 (Alexey Milovidov).
  • toIPv6 function parses IPv4 addresses. #19518 (Bharat Nallan).
  • Support using the new location of .debug file. This fixes #19348. #19520 (Amos Bird).
  • Enable the functions length, empty, and notEmpty for the Map data type, based on the number of keys in the map. #19530 (李扬).
  • Support constant result in function multiIf. #19533 (Maksim Kita).
  • Add an option to disable validation of checksums on reading. It should never be used in production. Please do not expect any benefits from disabling it. It may only be used for experiments and benchmarks. The setting is only applicable to tables of the MergeTree family. Checksums are always validated for other table engines and when receiving data over the network. In my observations there is no performance difference or it is less than 0.5%. #19588 (Alexey Milovidov).
  • Better error message for dictionaries during attribute parsing. #19678 (Maksim Kita).
  • Fix rare max_number_of_merges_with_ttl_in_pool limit overrun (more merges with TTL can be assigned) for non-replicated MergeTree. #19708 (alesapin).
  • Insufficient arguments check in the positionCaseInsensitiveUTF8 function triggered address sanitizer. #19720 (Alexey Milovidov).
  • Add separate pool for message brokers (RabbitMQ and Kafka). #19722 (Azat Khuzhin).
  • In distributed queries, if the setting async_socket_for_remote is enabled, it was possible to get a stack overflow, at least in the debug build configuration, if a very deeply nested data type is used in a table (e.g. Array(Array(Array(...more...)))). This fixes #19108. This change introduces a minor backward incompatibility: excessive parentheses in type definitions are no longer supported, for example Array((UInt8)). #19736 (Alexey Milovidov).
  • Table function S3 will use global region if the region can't be determined exactly. This closes #10998. #19750 (Vladimir Chebotarev).
  • Added a test for ClickHouse client query parameters with CTE. #19762 (Maksim Kita).
  • Correctly output infinite arguments for formatReadableTimeDelta function. In previous versions, there was implicit conversion to implementation specific integer value. #19791 (Alexey Milovidov).
  • S3 table function now supports auto compression mode (autodetect). This closes #18754. #19793 (Vladimir Chebotarev).
  • Set charset to utf8mb4 when interacting with remote MySQL servers. Fixes #19795. #19800 (Alexey Milovidov).
  • Add --reconnect option to clickhouse-benchmark. When this option is specified, it will reconnect before every request. This is needed for testing. #19872 (Alexey Milovidov).
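
The toIPv6 entry above says the function now parses IPv4 addresses. One common way to carry an IPv4 address in an IPv6 value is the IPv4-mapped form ::ffff:a.b.c.d. The Python sketch below models that conversion with the standard ipaddress module. That ClickHouse uses exactly this representation is an assumption here, and to_ipv6 is a hypothetical helper, not the ClickHouse function.

```python
import ipaddress

def to_ipv6(text: str) -> ipaddress.IPv6Address:
    """Parse an IPv4 or IPv6 string and return an IPv6 address.

    IPv4 input is converted to the IPv4-mapped form ::ffff:a.b.c.d
    (assumed representation); IPv6 input is returned unchanged.
    """
    addr = ipaddress.ip_address(text)
    if isinstance(addr, ipaddress.IPv4Address):
        # Put 0xFFFF just above the 32 bits of the IPv4 address.
        return ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr

if __name__ == "__main__":
    mapped = to_ipv6("192.168.0.1")
    print(mapped, mapped.ipv4_mapped)
    print(to_ipv6("2001:db8::1"))
```

The ipv4_mapped property recovers the original IPv4 address, which makes the round trip easy to check.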

Bug Fix

Build/Testing/Packaging Improvement

NO CL ENTRY

NOT FOR CHANGELOG / INSIGNIFICANT