ClickHouse/docs/changelogs/v22.11.1.1360-stable.md
Alexey Milovidov 26501178e6 Fix analyzer
2024-05-17 10:23:32 +02:00

2022 Changelog

ClickHouse release v22.11.1.1360-stable (0d211ed198) FIXME as compared to v22.10.1.1877-stable (98ab5a3c18)

Backward Incompatible Change

  • JSONExtract family of functions will now attempt to coerce to the requested type. #41502 (Márcio Martins).
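
The coercion behavior described for the JSONExtract family can be pictured with a small Python sketch. This is only an illustration of the idea of "extract, then try to convert to the requested type"; the exact coercion rules ClickHouse applies are not spelled out in this changelog, so the fallback to a type's zero value and the helper name json_extract are assumptions made for the example.

```python
import json


def json_extract(document: str, key: str, target: type):
    """Extract `key` from a JSON object and try to coerce it to `target`.

    Hypothetical helper: on a missing key, a null, or a failed conversion it
    returns the zero value of `target` (0, 0.0, ""), which is an assumption.
    """
    parsed = json.loads(document)
    value = parsed.get(key) if isinstance(parsed, dict) else None
    if value is None:
        return target()
    try:
        # The coercion step: a string "123" requested as int becomes 123.
        return target(value)
    except (TypeError, ValueError, OverflowError):
        return target()
```

For instance, json_extract('{"a": "123"}', "a", int) converts the string to a number instead of treating the type mismatch as a failure.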

New Feature

  • Added applied row-level policies to system.query_log. #39819 (Vladimir Chebotaryov).
  • Add Hudi and DeltaLake table engines, read-only, only for tables on S3. #41054 (Daniil Rubin).
  • Add 4LW command csnp for manually creating snapshots. Additionally, lgif was added to get Raft information for a specific node (e.g. index of last created snapshot, last committed log index). #41766 (JackyWoo).
  • Support for keeper request retries during insert into replicated merge trees. Apart from fault tolerance, it aims to provide a better user experience: it avoids returning an error to the user during insert if keeper is restarted (for example, due to an upgrade). #42607 (Igor Nikonov).
  • Add function ascii, as in Spark: https://spark.apache.org/docs/latest/api/sql/#ascii. #42670 (李扬).
  • Add function pmod, which returns a non-negative result based on modulo. #42755 (李扬).
  • Published function formatReadableDecimalSize. #42774 (Alejandro).
  • Added rate throttling for S3 PUT and GET requests per second. Settings s3_max_get_rps, s3_max_get_burst, s3_max_put_rps, s3_max_put_burst are used to configure the token bucket throttler. Can be used with both S3 ObjectStorage and the S3 table function. Different limits can be configured for different S3 disks or endpoints. #43014 (Sergei Trifonov).
  • Add table function hudi and deltaLake. #43080 (flynn).
  • Add function factorial, as in Impala or Spark. #43110 (李扬).
  • Add function randCanonical, which is similar to the rand function in Spark or Impala. The function generates pseudo-random, independent and identically distributed values drawn uniformly from [0, 1). #43124 (李扬).
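
The S3 throttling entry above (#43014) mentions a token bucket configured by a rate and a burst. As a rough illustration of that general technique, the following Python sketch models a bucket where `rate` plays the role of a setting like s3_max_get_rps and `burst` the role of s3_max_get_burst. The class and function names are hypothetical and this is not ClickHouse's implementation; a real throttler would typically make the caller wait, whereas this sketch only reports whether a request fits at a given time.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token bucket; all names here are hypothetical."""
    rate: float          # tokens refilled per second (compare s3_max_get_rps)
    burst: float         # bucket capacity (compare s3_max_get_burst)
    tokens: float = 0.0  # current number of tokens
    last: float = 0.0    # timestamp of the previous refill, in seconds

    def __post_init__(self) -> None:
        # Start full so that an initial burst of requests is allowed.
        self.tokens = self.burst

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def simulate(rate: float, burst: float, request_times):
    """Return, for each request timestamp, whether it passes the throttler."""
    bucket = TokenBucket(rate=rate, burst=burst)
    return [bucket.try_acquire(t) for t in request_times]
```

With a rate of 2 and a burst of 3, three simultaneous requests pass, a fourth does not, and one more passes after half a second once a single token has been refilled.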

Performance Improvement

  • Currently, the only saturable operators are And and Or, and their code paths are affected by this change. #42214 (Zhiguo Zhou).
  • The match function can use the index when the condition is on a string prefix. This closes #37333. #42458 (clarkcaoliu).
  • Fixed slowness in JSONExtract with LowCardinality(String) tuples. #42761 (AlfVII).
  • Support parallel parsing for LineAsString input format. This improves performance just slightly. This closes #42502. #42780 (Kruglov Pavel).
  • Keeper performance improvement: improve commit performance for cases when many different nodes have uncommitted states. This should help with cases when a follower node can't sync fast enough. #42926 (Antonio Andelic).
  • Parallelized merging of uniqExact states for aggregation without a key, i.e. queries like SELECT uniqExact(number) FROM table. The improvement becomes noticeable when the number of unique keys approaches 10^6. Also uniq performance is slightly optimized. This closes #4510. #43072 (Nikita Taranov).
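
One way to see why a match condition on a string prefix (#42458) can use an index: an anchored pattern that begins with literal characters can only match strings inside one contiguous byte range. The Python sketch below illustrates that idea under stated assumptions. It is not ClickHouse's implementation, literal_prefix and prefix_range are hypothetical helper names, and the prefix extraction is deliberately conservative (it gives up on alternation and escapes).

```python
REGEX_META = set(".^$*+?()[]{}|\\")


def literal_prefix(pattern: str) -> str:
    """Return a literal prefix that every match of an anchored pattern starts with.

    Returns "" whenever the prefix cannot be determined safely.
    """
    if not pattern.startswith("^") or "|" in pattern:
        return ""
    prefix = []
    for ch in pattern[1:]:
        if ch in REGEX_META:
            # A quantifier that may allow zero repetitions makes the
            # preceding literal optional, so drop it from the prefix.
            if ch in "*?{" and prefix:
                prefix.pop()
            break
        prefix.append(ch)
    return "".join(prefix)


def prefix_range(prefix: str):
    """Half-open byte range [lo, hi) containing every string that starts with `prefix`.

    `hi` is None when no finite upper bound exists (for example, an empty prefix).
    """
    lo = prefix.encode()
    upper = bytearray(lo)
    # Drop trailing 0xFF bytes, then increment the last remaining byte.
    while upper and upper[-1] == 0xFF:
        upper.pop()
    if not upper:
        return lo, None
    upper[-1] += 1
    return lo, bytes(upper)
```

For the pattern '^abc.*' the prefix is 'abc', so only keys in the range from 'abc' (inclusive) to 'abd' (exclusive) need to be scanned.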

Improvement

  • Support type Object inside other types, e.g. Array(JSON). #36969 (Anton Popov).
  • Remove covered parts for fetched part (to avoid possible growth of the replication delay). #39737 (Azat Khuzhin).
  • ClickHouse Client and ClickHouse Local will show progress by default even in non-interactive mode. If /dev/tty is available, the progress will be rendered directly to the terminal, without writing to stderr. This makes it possible to see progress even if stderr is redirected to a file, and the file will not be polluted by terminal escape sequences. The progress can be disabled by --progress false. This closes #32238. #42003 (Alexey Milovidov).
  • 1. Add, subtract and negate operations are now available on Intervals. When the types of the Intervals differ, they are transformed into a Tuple of those types. 2. A tuple of intervals can be added to or subtracted from a Date/DateTime field. 3. Added parsing of Intervals with different types, for example: INTERVAL '1 HOUR 1 MINUTE 1 SECOND'. #42195 (Nikolay Degterinsky).
  • Add notLike to key condition atom map, so a condition like NOT LIKE 'prefix%' can use the primary index. #42209 (Duc Canh Le).
  • Add support for FixedString input to base64 coding functions. #42285 (ltrk2).
  • Add columns bytes_on_disk and path to system.detached_parts. Closes #42264. #42303 (chen).
  • Improve the use of structure from the insertion table in table functions. The setting use_structure_from_insertion_table_in_table_functions has a new possible value, 2, which means that ClickHouse will try to determine automatically whether the structure from the insertion table can be used. Closes #40028. #42320 (Kruglov Pavel).
  • Added ** glob support for recursive directory traversal to filesystem and S3. Resolves #36316. #42376 (SmitaRKulkarni).
  • Mask passwords and secret keys both in system.query_log and /var/log/clickhouse-server/*.log and also in error messages. #42484 (Vitaly Baranov).
  • Add a new variable called limit in query_info, indicating whether this query is a limit-trivial query. If so, we will adjust the approximate total rows for later estimation. Closes #7071. #42580 (Han Fei).
  • Implement ATTACH of MergeTree table for s3_plain disk (plus some fixes for s3_plain). #42628 (Azat Khuzhin).
  • Fix no progress indication on INSERT FROM INFILE. Closes #42548. #42634 (chen).
  • Add min_age_to_force_merge_on_partition_only setting to optimize old parts for the entire partition only. #42659 (Antonio Andelic).
  • Throttling algorithm changed to token bucket. #42665 (Sergei Trifonov).
  • Refactor FunctionTokens to enable max tokens returned for related functions (disabled by default). #42673 (李扬).
  • Added a new field allow_readonly in system.table_functions to allow using table functions in readonly mode. Resolves #42414. Implementation: added a new field allow_readonly to table system.table_functions; updated the code to use the new field allow_readonly to allow using table functions in readonly mode. Testing: added a test for filesystem, tests/queries/0_stateless/02473_functions_in_readonly_mode.sh. Documentation: updated the English documentation for Table Functions. #42708 (SmitaRKulkarni).
  • Allow to use Date32 arguments for formatDateTime and FROM_UNIXTIME functions. #42737 (Roman Vasin).
  • Update tzdata to 2022f. Mexico will no longer observe DST except near the US border: https://www.timeanddate.com/news/time/mexico-abolishes-dst-2022.html. Chihuahua moves to year-round UTC-6 on 2022-10-30. Fiji no longer observes DST. See https://github.com/google/cctz/pull/235 and https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/1995209. #42796 (Alexey Milovidov).
  • Add FailedAsyncInsertQuery event metric for async inserts. #42814 (Krzysztof Góralski).
  • Implement read-in-order optimization on top of query plan. It is enabled by default. Set query_plan_read_in_order = 0 to use previous AST-based version. #42829 (Nikolai Kochetov).
  • Increase the size of upload part exponentially for backup to S3. #42833 (Vitaly Baranov).
  • When merge tasks are continuously busy and disk space is insufficient, completely expired parts cannot be selected and dropped, so the disk space shortage persists. Because dropping an entirely expired part requires no additional disk space, no disk space guarantee is needed in that case, which ensures the normal execution of TTL. #42869 (zhongyuankai).
  • Fix #42856: ignore MySQL binlog SAVEPOINT events. #42931 (zzsmdfj).
  • Add support for interactive parameters in INSERT VALUES queries. #43077 (Nikolay Degterinsky).
  • Add generic implementation for arbitrary structured named collections, access type and system.named_collections. #43147 (Kseniia Sumarokova).
  • Add the oss function and StorageOSS for user convenience. oss is fully compatible with s3. #43155 (zzsmdfj).
  • Improve error reporting in the collection of OS-related info for the system.asynchronous_metrics table. #43192 (Alexey Milovidov).
  • The system.asynchronous_metrics gets embedded documentation. This documentation is also exported to Prometheus. Fixed an error with the metrics about cache disks: they were calculated only for one arbitrary cache disk instead of all of them. This closes #7644. #43194 (Alexey Milovidov).
  • Modify the INFORMATION_SCHEMA tables in a way so that now ClickHouse can connect to itself using the MySQL compatibility protocol. Add columns instead of aliases (related to #9769). It will improve the compatibility with various MySQL clients. #43198 (Filatenkov Artur).
  • Disable deltaLake and hudi table functions in readonly mode. #43316 (Antonio Andelic).
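
The Interval changes listed above (#42195) describe parsing a mixed-unit string such as INTERVAL '1 HOUR 1 MINUTE 1 SECOND' into several intervals and adding or subtracting them from a DateTime. The Python sketch below mimics that behavior for illustration only. The names parse_intervals, add_intervals and UNIT_SECONDS are hypothetical, only fixed-length units are covered, and none of this reflects how ClickHouse implements it.

```python
from datetime import datetime, timedelta

# Only fixed-length units are handled in this sketch; MONTH, QUARTER and YEAR
# would need calendar-aware arithmetic.
UNIT_SECONDS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600, "DAY": 86400, "WEEK": 604800}


def parse_intervals(text: str):
    """Parse '1 HOUR 1 MINUTE 1 SECOND' into [(1, 'HOUR'), (1, 'MINUTE'), (1, 'SECOND')]."""
    tokens = text.split()
    if not tokens or len(tokens) % 2 != 0:
        raise ValueError(f"expected '<number> <unit>' pairs: {text!r}")
    parts = []
    for number, unit in zip(tokens[0::2], tokens[1::2]):
        unit = unit.upper()
        if unit not in UNIT_SECONDS:
            raise ValueError(f"unsupported unit: {unit}")
        parts.append((int(number), unit))
    return parts


def add_intervals(moment: datetime, parts, sign: int = 1) -> datetime:
    """Add (sign=1) or subtract (sign=-1) each interval of the tuple in turn."""
    for amount, unit in parts:
        moment += timedelta(seconds=sign * amount * UNIT_SECONDS[unit])
    return moment
```

Keeping the parsed result as a list of (amount, unit) pairs mirrors the changelog's point that intervals of different types are kept as a tuple rather than being collapsed into a single unit.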

Bug Fix

  • Updated the normalizer to clone the alias AST. Resolves #42452. Implementation: updated QueryNormalizer to clone the alias AST when it is replaced. Previously, assigning the same AST led to an exception in LogicalExpressionsOptimizer, because the same parent was being inserted again. This bug is not seen with the new analyzer (allow_experimental_analyzer), so there are no changes for it. A test was added for this case. #42827 (SmitaRKulkarni).
  • Fix race for backup of tables in Lazy databases. #43104 (Vitaly Baranov).
  • Fix skip_unavailable_shards not working with the s3Cluster table function. #43131 (chen).

Build/Testing/Packaging Improvement

Bug Fix (user-visible misbehavior in official stable release)

  • Fix schema inference in s3Cluster and improve in hdfsCluster. #41979 (Kruglov Pavel).
  • Fix retries while reading from http table engines / table function. (Retriable errors could be retried more times than needed; non-retriable errors resulted in a failed assertion in the code.) #42224 (Kseniia Sumarokova).
  • A segmentation fault related to DNS & c-ares has been reported. The error below occurred in multiple threads:
      2022-09-28 15:41:19.008,2022.09.28 15:41:19.008088 [ 356 ] {} <Fatal> BaseDaemon: ########################################
      2022-09-28 15:41:19.008,"2022.09.28 15:41:19.008147 [ 356 ] {} <Fatal> BaseDaemon: (version 22.8.5.29 (official build), build id: 92504ACA0B8E2267) (from thread 353) (no query) Received signal Segmentation fault (11)"
      2022-09-28 15:41:19.008,2022.09.28 15:41:19.008196 [ 356 ] {} <Fatal> BaseDaemon: Address: 0xf Access: write. Address not mapped to object.
      2022-09-28 15:41:19.008,2022.09.28 15:41:19.008216 [ 356 ] {} <Fatal> BaseDaemon: Stack trace: 0x188f8212 0x1626851b 0x1626a69e 0x16269b3f 0x16267eab 0x13cf8284 0x13d24afc 0x13c5217e 0x14ec2495 0x15ba440f 0x15b9d13b 0x15bb2699 0x1891ccb3 0x1891e00d 0x18ae0769 0x18ade022 0x7f76aa985609 0x7f76aa8aa133
      2022-09-28 15:41:19.008,2022.09.28 15:41:19.008274 [ 356 ] {} <Fatal> BaseDaemon: 2. Poco::Net::IPAddress::family() const @ 0x188f8212 in /usr/bin/clickhouse
      2022-09-28 15:41:19.008,2022.09.28 15:41:19.008297 [ 356 ] {} <Fatal> BaseDaemon: 3. ? @ 0x1626851b in /usr/bin/clickhouse
      2022-09-28 15:41:19.008,2022.09.28 15:41:19.008309 [ 356 ] {} <Fatal> BaseDaemon: 4. ? @ 0x1626a69e in /usr/bin/clickhouse.
    #42234 (Arthur Passos).
  • Fix LOGICAL_ERROR Arguments of 'plus' have incorrect data types which may happen in PK analysis (monotonicity check). Fix invalid PK analysis for monotonic binary functions with first constant argument. #42410 (Nikolai Kochetov).
  • Fix incorrect key analysis when key types cannot be inside Nullable. This fixes #42456. #42469 (Amos Bird).
  • Fix typo in setting name that led to bad usage of schema inference cache while using setting input_format_csv_use_best_effort_in_schema_inference. Closes #41735. #42536 (Kruglov Pavel).
  • Fix create Set with wrong header when data type is LowCardinality. Closes #42460. #42579 (flynn).
  • (U)Int128 and (U)Int256 values are correctly checked in PREWHERE. #42605 (Antonio Andelic).
  • Fix a bug in ParserFunction that could have led to a segmentation fault. #42724 (Nikolay Degterinsky).
  • Fix TRUNCATE TABLE not holding the lock correctly. #42728 (flynn).
  • Fix possible SIGSEGV for web disks when a file does not exist (or for OPTIMIZE TABLE FINAL, which could eventually hit the same error). #42767 (Azat Khuzhin).
  • Fix auth_type mapping in system.session_log, by including SSL_CERTIFICATE for the enum values. #42782 (Miel Donkers).
  • Fix stack-use-after-return under ASAN build in ParserCreateUserQuery. #42804 (Nikolay Degterinsky).
  • Fix lowerUTF8()/upperUTF8() when a symbol straddled a 16-byte boundary (a very frequent case if you have strings longer than 16 bytes). #42812 (Azat Khuzhin).
  • Additional bound check was added to lz4 decompression routine to fix misbehaviour in case of malformed input. #42868 (Nikita Taranov).
  • Fix rare possible hang on query cancellation. #42874 (Azat Khuzhin).
  • Fix a null pointer generated when selecting an if expression with an alias from a three-table join. For example, the SQL:. #42883 (zzsmdfj).
  • Fix memory sanitizer report in ClusterDiscovery, close #42763. #42905 (Vladimir C).
  • Fix datetime schema inference in case of empty string. #42911 (Kruglov Pavel).
  • Fix rare NOT_FOUND_COLUMN_IN_BLOCK error when projection is possible to use but there is no projection available. This fixes #42771. The bug was introduced in https://github.com/ClickHouse/ClickHouse/pull/25563. #42938 (Amos Bird).
  • Fixes for s3_plain disk that will allow to attach Wide parts. #42950 (Azat Khuzhin).
  • Fix ATTACH TABLE in PostgreSQL database engine if the table contains DATETIME data type. Closes #42817. #42960 (Kseniia Sumarokova).
  • Fix lambda parsing. Closes #41848. #42979 (Nikolay Degterinsky).
  • Handle (ignore) SAVEPOINT queries in MaterializedMySQL. #43086 (Stig Bakken).
  • Fix incorrect key analysis when nullable keys appear in the middle of a hyperrectangle. This fixes #43111. #43133 (Amos Bird).
  • Fix function if in case of NULL and const Nullable arguments. Closes #43069. #43178 (Kruglov Pavel).
  • Fix decimal math overflow in parsing datetime with 'best effort' algorithm. Closes #43061. #43180 (Kruglov Pavel).
  • The indent field produced by the git-import tool was miscalculated. See https://clickhouse.com/docs/en/getting-started/example-datasets/github/. #43191 (Alexey Milovidov).
  • Fixed unexpected behaviour of Interval types with subquery and casting. #43193 (jh0x).
  • Fix logical error in sumMap/minMap/maxMap functions executing TOTALS/ROLLUP/CUBE on NULL values. Close #43022. #43232 (Vladimir C).
  • Fix IS (NOT) NULL operator priority in regard to other operators. #43265 (Nikolay Degterinsky).

Build Improvement

NO CL ENTRY

NOT FOR CHANGELOG / INSIGNIFICANT