ClickHouse/docs/changelogs/v20.11.1.5109-prestable.md
2022-05-25 00:05:55 +02:00

ClickHouse release v20.11.1.5109-prestable FIXME as compared to v20.10.1.4881-prestable

Backward Incompatible Change

  • Make rankCorr function return nan on insufficient data #16124. #16135 (hexiaoting).
  • Aggregate functions boundingRatio, rankCorr, retention, timeSeriesGroupSum, timeSeriesGroupRateSum, windowFunnel were erroneously made case-insensitive. Now their names are made case sensitive as designed. Only functions that are specified in SQL standard or made for compatibility with other DBMS or functions similar to those should be case-insensitive. #16407 (Alexey Milovidov).
  • Remove ANALYZE and AST queries, and make the setting enable_debug_queries obsolete, since it is now part of the full-featured EXPLAIN query. #16536 (Ivan).
  • Restrict the use of non-comparable data types (like AggregateFunction) in keys (Sorting key, Primary key, Partition key, and so on). #16601 (alesapin).
  • If some profile was specified in distributed_ddl config section, then this profile could overwrite settings of default profile on server startup. It's fixed, now settings of distributed DDL queries should not affect global server settings. #16635 (Alexander Tokmakov).
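
To illustrate the rankCorr change above, here is a minimal Python sketch of a Spearman rank correlation that returns NaN instead of a number when it has too little data. It is not ClickHouse's implementation: the names rank_corr and average_ranks are hypothetical, and treating "fewer than two pairs" and "constant input" as the insufficient cases is an assumption.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j get one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_corr(xs, ys):
    """Spearman rank correlation; NaN when the input is too small or degenerate."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(xs) < 2:  # assumption: fewer than two pairs counts as insufficient data
        return math.nan
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:  # assumption: constant input is also undefined
        return math.nan
    return cov / math.sqrt(var_x * var_y)


print(rank_corr([1.0], [2.0]))  # nan
```

Callers that previously relied on a numeric result for tiny inputs would need an explicit NaN check (math.isnan in this sketch).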

New Feature

  • WelchTTest aggregate function implementation. #10351 (antikvist).
  • New functions encrypt, aes_encrypt_mysql, decrypt, aes_decrypt_mysql. These functions are slow (below ClickHouse standards), so we consider them an experimental feature. #11844 (Vasily Nemkov).
  • Added disable_merges option for volumes in multi-disk configuration. #13956 (Vladimir Chebotarev).
  • Added initial OpenTelemetry support. ClickHouse now accepts OpenTelemetry traceparent headers over Native and HTTP protocols, and passes them downstream in some cases. The trace spans for executed queries are saved into the system.opentelemetry_span_log table. #14195 (Alexander Kuzmenkov).
  • Allow reading/writing a single protobuf message at once (without length delimiters). #15199 (filimonov).
  • Add function formatReadableTimeDelta that formats a time delta as a human-readable string ... #15497 (Filipe Caixeta).
  • Add tid and logTrace function. This closes #9434. #15803 (flynn).
  • Add a new option print_query_id to clickhouse-client. It helps generate arbitrary strings with the current query id generated by the client. #15809 (Amos Bird).
  • Allow specifying the primary key in the column list of a CREATE TABLE query. #15823 (Maksim Kita).
  • Added setting date_time_output_format. #15845 (Maksim Kita).
  • Implement OFFSET offset_row_count {ROW | ROWS} FETCH {FIRST | NEXT} fetch_row_count {ROW | ROWS} {ONLY | WITH TIES} in SELECT queries with ORDER BY. Related issue: #15367. #15855 (hexiaoting).
  • Added an aggregate function, which calculates the p-value used for Welch's t-test. #15874 (Nikita Mikhaylov).
  • Add max_concurrent_queries_for_all_users setting, see #6636 for use cases. #16154 (nvartolomei).
  • Added minimal web UI to ClickHouse. #16158 (Alexey Milovidov).
  • Added untuple, a special function that can introduce new columns to the SELECT list by flattening a named tuple. #16242 (Nikolai Kochetov).
  • Added toUUIDOrNull, toUUIDOrZero cast functions. #16337 (Maksim Kita).
  • Add system.replicated_fetches table which shows currently running background fetches. #16428 (alesapin).
  • Add errorCodeToName() function, which returns the variable name of an error (useful for analyzing query_log and similar), and the system.errors table, which shows how many times each error has happened (respects system_events_show_zero_values). #16438 (Azat Khuzhin).
  • Ability to create a docker image on the top of alpine. Uses precompiled binary and glibc components from ubuntu 20.04. #16479 (filimonov).
  • Add log_queries_min_query_duration_ms: only queries slower than the value of this setting will go to query_log/query_thread_log (i.e. something like slow_query_log in MySQL). #16529 (Azat Khuzhin).
  • Add farmFingerprint64 function. #16570 (Jacob Hayes).
  • Identifiers can now be provided via query parameters, and these parameters can be used as table objects or columns. #3815. #16594 (Amos Bird).
  • Added big integers (UInt256, Int128, Int256) and UUID data types support for MergeTree BloomFilter index. #16642 (Maksim Kita).
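
The OFFSET ... FETCH ... {ONLY | WITH TIES} entry above can be illustrated with a small Python sketch of the standard SQL semantics: rows are ordered, the first offset rows are skipped, fetch rows are taken, and WITH TIES additionally keeps any following rows whose sort key equals that of the last fetched row. The function name offset_fetch and the sample rows are hypothetical, and this models the clause's behavior only, not ClickHouse internals.

```python
def offset_fetch(rows, key, offset, fetch, with_ties=False):
    """Emulate ORDER BY key OFFSET offset ROWS FETCH FIRST fetch ROWS {ONLY | WITH TIES}."""
    ordered = sorted(rows, key=key)  # ORDER BY (stable sort keeps input order among ties)
    end = offset + fetch
    page = ordered[offset:end]
    if with_ties and page:
        last_key = key(page[-1])
        # WITH TIES: also return following rows that tie with the last fetched row.
        while end < len(ordered) and key(ordered[end]) == last_key:
            page.append(ordered[end])
            end += 1
    return page


rows = [("a", 1), ("b", 2), ("c", 2), ("d", 2), ("e", 3)]
only = offset_fetch(rows, key=lambda r: r[1], offset=1, fetch=1)
ties = offset_fetch(rows, key=lambda r: r[1], offset=1, fetch=1, with_ties=True)
print(only)  # [('b', 2)]
print(ties)  # [('b', 2), ('c', 2), ('d', 2)]
```

With ONLY the result size never exceeds fetch_row_count; with WITH TIES it can, which is why the clause requires an ORDER BY to define what a tie is.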

Performance Improvement

Improvement

  • Allow explicitly specifying the column list in CREATE TABLE table AS table_function(...) query. Fixes #9249. Fixes #14214. #14295 (Alexander Tokmakov).
  • Now trivial count optimization becomes slightly non-trivial. Predicates that contain exact partition expr can be optimized too. This also fixes #11092 which returns wrong count when max_parallel_replicas > 1. #15074 (Amos Bird).
  • Enable parsing enum values by their ids for CSV, TSV and JSON input formats. #15685 (vivarum).
  • Add reconnects to zookeeper-dump-tree tool. #15711 (Alexey Milovidov).
  • Remove MemoryTrackingInBackground* metrics to avoid potentially misleading results. This fixes #15684. #15813 (Alexey Milovidov).
  • Change level of some log messages from information to debug, so information messages will not appear for every query. This closes #5293. #15816 (Alexey Milovidov).
  • Fix query hang (endless loop) in case of misconfiguration (connections_with_failover_max_tries set to 0). #15876 (Azat Khuzhin).
  • Added boost::program_options to db_generator in order to increase its usability. This closes #15940. #15973 (Nikita Mikhaylov).
  • Treat INTERVAL '1 hour' as equivalent to INTERVAL 1 HOUR, to be compatible with Postgres. This fixes #15637. #15978 (flynn).
  • Simplify the implementation of background tasks processing for the MergeTree table engines family. There should be no visible changes for user. #15983 (alesapin).
  • Add support of cache layout for Redis dictionaries with complex key. #15985 (Anton Popov).
  • Fix rare issue when clickhouse-client may abort on exit due to loading of suggestions. This fixes #16035. #16047 (Alexey Milovidov).
  • Now it's allowed to execute ALTER ... ON CLUSTER queries regardless of the <internal_replication> setting in cluster config. #16075 (alesapin).
  • Fix memory_profiler_step/max_untracked_memory for queries via HTTP (test included). Adjusting these values globally in the XML config did not help either, since the settings were not applied and only the default (4MB) value was used. Also fix query_id for the most root ThreadStatus of the HTTP query (by initializing QueryScope after reading query_id). #16101 (Azat Khuzhin).
  • Add allow_nondeterministic_optimize_skip_unused_shards (to allow non-deterministic functions like rand() or dictGet() in the sharding key). #16105 (Azat Khuzhin).
  • Support database_atomic_wait_for_drop_and_detach_synchronously/NO DELAY/SYNC for DROP DATABASE. #16127 (Azat Khuzhin).
  • Add support for nested data types (like named tuple) as sub-types. Fixes #15587. #16262 (Ivan).
  • If there is no tmp folder in the system (chroot, misconfiguration, etc.), clickhouse-local will create a temporary subfolder in the current directory. #16280 (filimonov).
  • Now it's possible to specify PRIMARY KEY without ORDER BY for MergeTree table engines family. Closes #15591. #16284 (alesapin).
  • Try to use the CMake version of croaring instead of amalgamation.sh. #16285 (sundyli).
  • Add total_rows/total_bytes (from system.tables) support for Set/Join table engines. #16306 (Azat Khuzhin).
  • Better diagnostics when the client has dropped the connection. In previous versions, Attempt to read after EOF and Broken pipe exceptions were logged on the server. The new version logs an information message instead: Client has dropped the connection, cancel the query. #16329 (Alexey Milovidov).
  • Add TablesToDropQueueSize metric. It's equal to number of dropped tables, that are waiting for background data removal. #16364 (Alexander Tokmakov).
  • Fix debug assertion in quantileDeterministic function. In previous versions it could also transfer up to two times more data over the network, although no bug existed. This fixes #15683. #16410 (Alexey Milovidov).
  • Better read task scheduling for JBOD architecture and MergeTree storage. New setting read_backoff_min_concurrency which serves as the lower limit to the number of reading threads. #16423 (Amos Bird).
  • Fixed bug for #16263. Also minimized event loop lifetime. Added more efficient queues setup. #16426 (Kseniia Sumarokova).
  • Allow to fetch parts that are already committed or outdated in the current instance into the detached directory. It's useful when migrating tables from another cluster and having N to 1 shards mapping. It's also consistent with the current fetchPartition implementation. #16538 (Amos Bird).
  • Add current_database into query_thread_log. #16558 (Azat Khuzhin).
  • Subqueries in WITH section (CTE) can reference previous subqueries in WITH section by their name. #16575 (Amos Bird).
  • Improve scheduling of the background task that removes data of dropped tables in Atomic databases. Atomic databases no longer create a broken symlink to the table data directory if the table actually has no data directory. #16584 (Alexander Tokmakov).
  • Now parameterized functions can be used in the APPLY column transformer. #16589 (Amos Bird).
  • Now the event_time_microseconds field is stored in Decimal64, not UInt64. Removed an incorrect check from Field::get(). #16617 (Nikita Mikhaylov).
  • Apply SETTINGS clause as early as possible. It allows to modify more settings in the query. This closes #3178. #16619 (Alexey Milovidov).
  • Better update of ZooKeeper configuration in runtime. #16630 (sundyli).
  • Make the behaviour of minMap and maxMap more desirable. They will not skip zero values in the result. Fixes #16087. #16631 (Ildus Kurbangaliev).
  • Better diagnostics on parse errors in input data. Provide row number on Cannot read all data errors. #16644 (Alexey Milovidov).
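
As a rough illustration of the minMap/maxMap entry above, the following Python sketch computes a per-key minimum or maximum over rows of (keys, values) arrays and keeps keys whose result is zero. The names reduce_map, min_map and max_map are hypothetical, and the result shape (sorted keys plus matching values) is assumed from the functions' documented behavior rather than taken from this changelog.

```python
def reduce_map(rows, pick):
    """Combine (keys, values) array pairs per key using pick (min or max)."""
    best = {}
    for keys, values in rows:
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
        for k, v in zip(keys, values):
            best[k] = v if k not in best else pick(best[k], v)
    sorted_keys = sorted(best)
    # Zero results are kept on purpose: no key is dropped from the output.
    return sorted_keys, [best[k] for k in sorted_keys]


def min_map(rows):
    return reduce_map(rows, min)


def max_map(rows):
    return reduce_map(rows, max)


rows = [([1, 2], [0, 5]), ([1, 3], [4, 0])]
print(min_map(rows))  # ([1, 2, 3], [0, 5, 0])
print(max_map(rows))  # ([1, 2, 3], [4, 5, 0])
```

Keys 1 and 3 stay in the min_map output even though their minimum is zero, which is the behavior the entry describes.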

Bug Fix

  • Update jemalloc to fix percpu_arena with affinity mask. #15035 (Azat Khuzhin).
  • Decrement the ReadonlyReplica metric when detaching read-only tables. This fixes #15598. #15592 (sundyli).
  • Fixed bug with globs in S3 table function, region from URL was not applied to S3 client configuration. #15646 (Vladimir Chebotarev).
  • Fix error Cannot add simple transform to empty Pipe which happened while reading from Buffer table which has different structure than destination table. It was possible if destination table returned empty result for query. Fixes #15529. #15662 (Nikolai Kochetov).
  • Possibility to move a part to another disk/volume if the first attempt failed. #15723 (Pavel Kovalenko).
  • Fix drop of materialized view with inner table in Atomic database (hangs all subsequent DROP TABLE due to hang of the worker thread, due to recursive DROP TABLE for inner table of MV). #15743 (Azat Khuzhin).
  • Fix select count() inaccuracy for MaterializeMySQL. #15767 (Alexander Tokmakov).
  • Fix exception Block structure mismatch in SELECT ... ORDER BY DESC queries which were executed after ALTER MODIFY COLUMN query. Fixes #15800. #15852 (alesapin).
  • Now exception will be thrown when ALTER MODIFY COLUMN ... DEFAULT ... has incompatible default with column type. Fixes #15854. #15858 (alesapin).
  • Fix possible deadlocks in RBAC. #15875 (Vitaly Baranov).
  • Fixes #12513: fix different expressions with the same alias when queries are analyzed again. #15886 (Winter Zhang).
  • Fix incorrect empty result for query from Distributed table if query has WHERE, PREWHERE and GLOBAL IN. Fixes #15792. #15933 (Nikolai Kochetov).
  • Fixed DROP TABLE IF EXISTS failure with Table ... doesn't exist error when table is concurrently renamed (for Atomic database engine). Fixed rare deadlock when concurrently executing some DDL queries with multiple tables (like DROP DATABASE and RENAME TABLE) Fixed DROP/DETACH DATABASE failure with Table ... doesn't exist when concurrently executing DROP/DETACH TABLE. #15934 (Alexander Tokmakov).
  • Fix a crash when database creation fails. #15954 (Winter Zhang).
  • Fix ambiguity in parsing of settings profiles: CREATE USER ... SETTINGS profile readonly is now considered as using a profile named readonly, not a setting named profile with the readonly constraint. This fixes #15628. #15982 (Vitaly Baranov).
  • Fix rare segfaults when inserting into or selecting from MaterializedView and concurrently dropping target table (for Atomic database engine). #15984 (Alexander Tokmakov).
  • Prevent replica hang for 5-10 mins when replication error happens after a period of inactivity. #15987 (filimonov).
  • Allow to use direct layout for dictionaries with complex keys. #16007 (Anton Popov).
  • Fix collate name & charset name parser and support length = 0 for string type. #16008 (Winter Zhang).
  • Fix ALTER MODIFY ... ORDER BY query hang for ReplicatedVersionedCollapsingMergeTree. This fixes #15980. #16011 (alesapin).
  • Fix bug with MySQL database. When the MySQL server used as a database engine is down, some queries raise an exception because they try to get tables from the disabled server, although this is unnecessary. For example, the query SELECT ... FROM system.parts should work only with MergeTree tables and not touch the MySQL database at all. #16032 (Kruglov Pavel).
  • Fixes #15780 regression, e.g. indexOf([1, 2, 3], toLowCardinality(1)) now is prohibited but it should not be. #16038 (Mike Kot).
  • Fix segfault in some cases of wrong aggregation in lambdas. #16082 (Anton Popov).
  • Fix the clickhouse-local crash when trying to do OPTIMIZE command. Fixes #16076. #16192 (filimonov).
  • Fix dictGet in sharding_key (and similar places, i.e. when the function context is stored permanently). #16205 (Azat Khuzhin).
  • Fix the case when memory can be overallocated regardless of the limit. This closes #14560. #16206 (Alexey Milovidov).
  • Fix a possible memory leak during GROUP BY with string keys, caused by an error in TwoLevelStringHashTable implementation. #16264 (Amos Bird).
  • Fixed the inconsistent behaviour when a part of return data could be dropped because the set for its filtration wasn't created. #16308 (Nikita Mikhaylov).
  • Fix processing of very large entries in replication queue. Very large entries may appear in ALTER queries if table structure is extremely large (near 1 MB). This fixes #16307. #16332 (Alexey Milovidov).
  • Fix async Distributed INSERT w/ prefer_localhost_replica=0 and internal_replication. #16358 (Azat Khuzhin).
  • Fix GROUP BY with totals/rollup/cube modifiers and min/max functions over GROUP BY keys. Fixes #16393. #16397 (Anton Popov).
  • Fix DROP TABLE for Distributed (racy with INSERT). #16409 (Azat Khuzhin).
  • Fix double free in case of exception in function dictGet. It could have happened if dictionary was loaded with error. #16429 (Nikolai Kochetov).
  • A specifically crafted argument of the round function with Decimal was leading to integer division by zero. This fixes #13338. #16451 (Alexey Milovidov).
  • Fix rapid growth of metadata when using MySQL Master -> MySQL Slave -> ClickHouse MaterializeMySQL Engine, and slave_parallel_worker enabled on MySQL Slave, by properly shrinking GTID sets. This fixes #15951. #16504 (TCeason).
  • Now when parsing AVRO from input the LowCardinality is removed from type. Fixes #16188. #16521 (Mike Kot).
  • Fix query_thread_log.query_duration_ms unit. #16563 (Azat Khuzhin).
  • Calculation of DEFAULT expressions was involving possible name collisions (that was very unlikely to encounter). This fixes #9359. #16612 (Alexey Milovidov).
  • The setting max_parallel_replicas worked incorrectly if the queried table has no sampling. This fixes #5733. #16675 (Alexey Milovidov).

Build/Testing/Packaging Improvement

Other

  • Use only |name_parts| as primary name source and auto-generate full name. #16149 (Ivan).
  • Rename struct NullSink from ReadHelpers to NullOutput, because class NullSink exists in Processors/NullSink.h. It's needed to prevent redefinition of 'NullSink' error. #16520 (Kruglov Pavel).

NO CL CATEGORY

NO CL ENTRY