ClickHouse/CHANGELOG.md
Nikolai Kochetov 1a500bad78
Update CHANGELOG.md
Move old changelog to separate files.
2020-03-17 20:24:36 +03:00

ClickHouse release v20.3

ClickHouse release v20.3.3.6, 2020-03-17

Bug Fix

  • Fixed incorrect internal function names for sumKahan and sumWithOverflow. This led to an exception when these functions were used in remote queries. #9636 (Azat Khuzhin)
  • Fixed the issue: timezone was not preserved if you write a simple arithmetic expression like time + 1 (in contrast to an expression like time + INTERVAL 1 SECOND). This fixes #5743. #9323 (alexey-milovidov)
  • Fix possible exceptions Size of filter doesn't match size of column and Invalid number of rows in Chunk in MergeTreeRangeReader. They could appear while executing PREWHERE in some cases. Fixes #9132. #9612 (Anton Popov)
  • Allow ALTER ON CLUSTER of Distributed tables with internal replication. This fixes #3268. #9617 (shinoi2)
  • Fix a bug in replication that prevented replication from working if the user had executed mutations on the previous version. This fixes #9645. #9652 (alesapin)
  • Add setting use_compact_format_in_distributed_parts_names, which allows files for INSERT queries into Distributed tables to be written in a more compact format. This fixes #9647. #9653 (alesapin)

ClickHouse release v20.3.2.1, 2020-03-12

Backward Incompatible Change

  • Fixed the issue file name too long when sending data for Distributed tables for a large number of replicas. Fixed the issue that replica credentials were exposed in the server log. The format of directory name on disk was changed to [shard{shard_index}[_replica{replica_index}]]. #8911 (Mikhail Korotov) After you upgrade to the new version, you will not be able to downgrade without manual intervention, because old server version does not recognize the new directory format. If you want to downgrade, you have to manually rename the corresponding directories to the old format. This change is relevant only if you have used asynchronous INSERTs to Distributed tables. In the version 20.3.3 we will introduce a setting that will allow you to enable the new format gradually.
  • Changed the format of replication log entries for mutation commands. You have to wait for old mutations to process before installing the new version.
  • Implement simple memory profiler that dumps stacktraces to system.trace_log every N bytes over soft allocation limit #8765 (Ivan) #9472 (alexey-milovidov) The column of system.trace_log was renamed from timer_type to trace_type. This will require changes in third-party performance analysis and flamegraph processing tools.
  • Use OS thread id everywhere instead of internal thread number. This fixes #7477. Old clickhouse-client cannot receive logs that are sent from the server when the setting send_logs_level is enabled, because the names and types of the structured log messages were changed. On the other hand, different server versions can send logs with different types to each other. If you don't use the send_logs_level setting, this does not affect you. #8954 (alexey-milovidov)
  • Remove indexHint function #9542 (alexey-milovidov)
  • Remove findClusterIndex, findClusterValue functions. This fixes #8641. If you were using these functions, send an email to clickhouse-feedback@yandex-team.com #9543 (alexey-milovidov)
  • Now it's not allowed to create columns or add columns with SELECT subquery as default expression. #9481 (alesapin)
  • Require aliases for subqueries in JOIN. #9274 (Artem Zuikov)
  • Improved ALTER MODIFY/ADD queries logic. Now you cannot ADD a column without a type, MODIFY default expression doesn't change the type of the column, and MODIFY type doesn't lose the default expression value. Fixes #8669. #9227 (alesapin)
  • Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
  • The setting experimental_use_processors is enabled by default. This setting enables usage of the new query pipeline. This is internal refactoring and we expect no visible changes. If you see any issues, set it back to zero. #8768 (alexey-milovidov)

New Feature

  • Add Avro and AvroConfluent input/output formats #8571 (Andrew Onyshchuk) #8957 (Andrew Onyshchuk) #8717 (alexey-milovidov)
  • Multi-threaded and non-blocking updates of expired keys in cache dictionaries (with optional permission to read old ones). #8303 (Nikita Mikhaylov)
  • Add query ALTER ... MATERIALIZE TTL. It runs a mutation that forces removal of data expired by TTL and recalculates meta-information about TTL in all parts. #8775 (Anton Popov)
  • Switch from HashJoin to MergeJoin (on disk) if needed #9082 (Artem Zuikov)
  • Added MOVE PARTITION command for ALTER TABLE #4729 #6168 (Guillaume Tassery)
  • Reloading storage configuration from configuration file on the fly. #8594 (Vladimir Chebotarev)
  • Allowed changing storage_policy to one that is not less rich. #8107 (Vladimir Chebotarev)
  • Added support for globs/wildcards for S3 storage and table function. #8851 (Vladimir Chebotarev)
  • Implement bitAnd, bitOr, bitXor, bitNot for FixedString(N) datatype. #9091 (Guillaume Tassery)
  • Added function bitCount. This fixes #8702. #8708 (alexey-milovidov) #8749 (ikopylov)
  • Add generateRandom table function to generate random rows with given schema. Allows to populate arbitrary test table with data. #8994 (Ilya Yatsishin)
  • JSONEachRowFormat: support the special case when objects are enclosed in a top-level array. #8860 (Kruglov Pavel)
  • Now it's possible to create a column with DEFAULT expression which depends on a column with default ALIAS expression. #9489 (alesapin)
  • Allow to specify --limit more than the source data size in clickhouse-obfuscator. The data will repeat itself with different random seed. #9155 (alexey-milovidov)
  • Added groupArraySample function (similar to groupArray) with reservoir sampling algorithm. #8286 (Amos Bird)
  • Now you can monitor the size of update queue in cache/complex_key_cache dictionaries via system metrics. #9413 (Nikita Mikhaylov)
  • Allow to use CRLF as a line separator in CSV output format when the setting output_format_csv_crlf_end_of_line is set to 1 #8934 #8935 #8963 (Mikhail Korotov)
  • Implement more functions of the H3 API: h3GetBaseCell, h3HexAreaM2, h3IndexesAreNeighbors, h3ToChildren, h3ToString and stringToH3 #8938 (Nico Mandery)
  • New setting introduced: max_parser_depth to control maximum stack size and allow large complex queries. This fixes #6681 and #7668. #8647 (Maxim Smirnov)
  • Add a setting force_optimize_skip_unused_shards to throw if skipping of unused shards is not possible #8805 (Azat Khuzhin)
  • Allow to configure multiple disks/volumes for storing data for send in Distributed engine #8756 (Azat Khuzhin)
  • Support storage policy (<tmp_policy>) for storing temporary data. #8750 (Azat Khuzhin)
  • Added X-ClickHouse-Exception-Code HTTP header that is set if exception was thrown before sending data. This implements #4971. #8786 (Mikhail Korotov)
  • Added function ifNotFinite. It is just a syntactic sugar: ifNotFinite(x, y) = isFinite(x) ? x : y. #8710 (alexey-milovidov)
  • Added last_successful_update_time column in system.dictionaries table #9394 (Nikita Mikhaylov)
  • Add blockSerializedSize function (size on disk without compression) #8952 (Azat Khuzhin)
  • Add function moduloOrZero #9358 (hcz)
  • Added system tables system.zeros and system.zeros_mt as well as table functions zeros() and zeros_mt(). Tables (and table functions) contain a single column with name zero and type UInt8. This column contains zeros. It is needed for test purposes as the fastest method to generate many rows. This fixes #6604 #9593 (Nikolai Kochetov)
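Two of the entries above describe behaviour that is easy to illustrate outside the server. The following is a minimal Python sketch, not ClickHouse code: if_not_finite mirrors the stated identity ifNotFinite(x, y) = isFinite(x) ? x : y, and reservoir_sample shows the classic "Algorithm R" form of reservoir sampling. The changelog does not say which reservoir variant groupArraySample uses, so the second function is an assumption for illustration only; both function names are hypothetical.

```python
import math
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def if_not_finite(x: float, y: float) -> float:
    """Return x when it is finite, otherwise the fallback y."""
    return x if math.isfinite(x) else y


def reservoir_sample(items: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Keep a uniform random sample of at most k items from a stream.

    Classic Algorithm R (an assumption; the variant used by
    groupArraySample is not specified in the changelog).
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(if_not_finite(1.5, 0.0))
    print(if_not_finite(float("inf"), 0.0))
    print(reservoir_sample(range(1000), 5, seed=42))
```

The sample never holds more than k elements regardless of stream length, which is what makes the approach suitable for an aggregate function that sees rows one at a time.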

Experimental Feature

  • Add new compact format of parts in MergeTree-family tables in which all columns are stored in one file. It helps to increase performance of small and frequent inserts. The old format (one file per column) is now called wide. Data storing format is controlled by settings min_bytes_for_wide_part and min_rows_for_wide_part. #8290 (Anton Popov)
  • Support for S3 storage for Log, TinyLog and StripeLog tables. #8862 (Pavel Kovalenko)

Bug Fix

  • Fixed inconsistent whitespaces in log messages. #9322 (alexey-milovidov)
  • Fix bug in which arrays of unnamed tuples were flattened as Nested structures on table creation. #8866 (achulkov2)
  • Fixed the issue when "Too many open files" error may happen if there are too many files matching glob pattern in File table or file table function. Now files are opened lazily. This fixes #8857 #8861 (alexey-milovidov)
  • DROP TEMPORARY TABLE now drops only temporary table. #8907 (Vitaly Baranov)
  • Remove outdated partition when we shutdown the server or DETACH/ATTACH a table. #8602 (Guillaume Tassery)
  • Fix how the default disk calculates the free space from data subdirectory. Fixed the issue when the amount of free space is not calculated correctly if the data directory is mounted to a separate device (rare case). This fixes #7441 #9257 (Mikhail Korotov)
  • Allow comma (cross) join with IN () inside. #9251 (Artem Zuikov)
  • Allow to rewrite CROSS to INNER JOIN if there's [NOT] LIKE operator in WHERE section. #9229 (Artem Zuikov)
  • Fix possible incorrect result after GROUP BY with enabled setting distributed_aggregation_memory_efficient. Fixes #9134. #9289 (Nikolai Kochetov)
  • Found keys were counted as missed in metrics of cache dictionaries. #9411 (Nikita Mikhaylov)
  • Fix replication protocol incompatibility introduced in #8598. #9412 (alesapin)
  • Fixed race condition on queue_task_handle at the startup of ReplicatedMergeTree tables. #9552 (alexey-milovidov)
  • The token NOT didn't work in SHOW TABLES NOT LIKE query #8727 #8940 (alexey-milovidov)
  • Added range check to function h3EdgeLengthM. Without this check, buffer overflow is possible. #8945 (alexey-milovidov)
  • Fixed up a bug in batched calculations of ternary logical OPs on multiple arguments (more than 10). #8718 (Alexander Kazakov)
  • Fix error of PREWHERE optimization, which could lead to segfaults or Inconsistent number of columns got from MergeTreeRangeReader exception. #9024 (Anton Popov)
  • Fix unexpected Timeout exceeded while reading from socket exception, which randomly happens on secure connections before the timeout is actually exceeded and when the query profiler is enabled. Also add the connect_timeout_with_failover_secure_ms setting (default 100ms), which is similar to connect_timeout_with_failover_ms but is used for secure connections (because an SSL handshake is slower than an ordinary TCP connection). #9026 (tavplubix)
  • Fix bug with mutations finalization, when mutation may hang in state with parts_to_do=0 and is_done=0. #9022 (alesapin)
  • Use new ANY JOIN logic with partial_merge_join setting. It's possible to make ANY|ALL|SEMI LEFT and ALL INNER joins with partial_merge_join=1 now. #8932 (Artem Zuikov)
  • Shard now clamps the settings received from the initiator to the shard's constraints instead of throwing an exception. This fix allows sending queries to a shard with different constraints. #9447 (Vitaly Baranov)
  • Fixed memory management problem in MergeTreeReadPool. #8791 (Vladimir Chebotarev)
  • Fix toDecimal*OrNull() functions family when called with string e. Fixes #8312 #8764 (Artem Zuikov)
  • Make sure that FORMAT Null sends no data to the client. #8767 (Alexander Kuzmenkov)
  • Fix a bug where the timestamp in LiveViewBlockInputStream was not updated. LIVE VIEW is an experimental feature. #8644 (vxider) #8625 (vxider)
  • Fixed ALTER MODIFY TTL wrong behavior which did not allow to delete old TTL expressions. #8422 (Vladimir Chebotarev)
  • Fixed UBSan report in MergeTreeIndexSet. This fixes #9250 #9365 (alexey-milovidov)
  • Fixed the behaviour of match and extract functions when haystack has zero bytes. The behaviour was wrong when haystack was constant. This fixes #9160 #9163 (alexey-milovidov) #9345 (alexey-milovidov)
  • Avoid throwing from destructor in Apache Avro 3rd-party library. #9066 (Andrew Onyshchuk)
  • Don't commit a batch polled from Kafka partially as it can lead to holes in data. #8876 (filimonov)
  • Fix joinGet with nullable return types. https://github.com/ClickHouse/ClickHouse/issues/8919 #9014 (Amos Bird)
  • Fix data incompatibility when compressed with T64 codec. #9016 (Artem Zuikov) Fix data type ids in T64 compression codec that leads to wrong (de)compression in affected versions. #9033 (Artem Zuikov)
  • Add setting enable_early_constant_folding and disable it in some cases that lead to errors. #9010 (Artem Zuikov)
  • Fix pushdown predicate optimizer with VIEW and enable the test #9011 (Winter Zhang)
  • Fix segfault in Merge tables, that can happen when reading from File storages #9387 (tavplubix)
  • Added a check for storage policy in ATTACH PARTITION FROM, REPLACE PARTITION, MOVE TO TABLE. Otherwise it could make data of a part inaccessible after restart and prevent ClickHouse from starting. #9383 (Vladimir Chebotarev)
  • Fix alters if there is TTL set for table. #8800 (Anton Popov)
  • Fix race condition that can happen when SYSTEM RELOAD ALL DICTIONARIES is executed while some dictionary is being modified/added/removed. #8801 (Vitaly Baranov)
  • In previous versions the Memory database engine used an empty data path, so tables were created in the path directory (e.g. /var/lib/clickhouse/), not in the data directory of the database (e.g. /var/lib/clickhouse/db_name). #8753 (tavplubix)
  • Fixed wrong log messages about missing default disk or policy. #9530 (Vladimir Chebotarev)
  • Fix not(has()) for the bloom_filter index of array types. #9407 (achimbab)
  • Allow first column(s) in a table with Log engine be an alias #9231 (Ivan)
  • Fix order of ranges while reading from MergeTree table in one thread. It could lead to exceptions from MergeTreeRangeReader or wrong query results. #9050 (Anton Popov)
  • Make reinterpretAsFixedString to return FixedString instead of String. #9052 (Andrew Onyshchuk)
  • Avoid extremely rare cases when the user can get wrong error message (Success instead of detailed error description). #9457 (alexey-milovidov)
  • Do not crash when using Template format with empty row template. #8785 (Alexander Kuzmenkov)
  • Metadata files for system tables could be created in wrong place #8653 (tavplubix) Fixes #8581.
  • Fix data race on exception_ptr in cache dictionary #8303. #9379 (Nikita Mikhaylov)
  • Do not throw an exception for query ATTACH TABLE IF NOT EXISTS. Previously it was thrown if table already exists, despite the IF NOT EXISTS clause. #8967 (Anton Popov)
  • Fixed missing closing paren in exception message. #8811 (alexey-milovidov)
  • Avoid message Possible deadlock avoided at the startup of clickhouse-client in interactive mode. #9455 (alexey-milovidov)
  • Fixed the issue when padding at the end of base64 encoded value can be malformed. Update base64 library. This fixes #9491, closes #9492 #9500 (alexey-milovidov)
  • Prevent losing data in Kafka in rare cases when exception happens after reading suffix but before commit. Fixes #9378 #9507 (filimonov)
  • Fixed exception in DROP TABLE IF EXISTS #8663 (Nikita Vasilev)
  • Fix crash when a user tries to ALTER MODIFY SETTING for old-formatted MergeTree table engines family. #9435 (alesapin)
  • Support for UInt64 numbers that don't fit in Int64 in JSON-related functions. Update SIMDJSON to master. This fixes #9209 #9344 (alexey-milovidov)
  • Fixed execution of inversed predicates when non-strictly monotonic functional index is used. #9223 (Alexander Kazakov)
  • Don't try to fold IN constant in GROUP BY #8868 (Amos Bird)
  • Fix bug in ALTER DELETE mutations which leads to index corruption. This fixes #9019 and #8982. Additionally fix extremely rare race conditions in ReplicatedMergeTree ALTER queries. #9048 (alesapin)
  • When the setting compile_expressions is enabled, you can get unexpected column in LLVMExecutableFunction when we use Nullable type #8910 (Guillaume Tassery)
  • Multiple fixes for Kafka engine: 1) Fix duplicates that were appearing during consumer group rebalance. 2) Fix rare 'holes' that appeared when data were polled from several partitions with one poll and committed partially (now we always process / commit the whole polled block of messages). 3) Fix flushes by block size (before that only flushing by timeout was working properly). 4) Better subscription procedure (with assignment feedback). 5) Make tests work faster (with default intervals and timeouts). Because data was not flushed by block size before (as it should have been according to the documentation), that PR may lead to some performance degradation with default settings (due to more frequent and smaller flushes, which are less optimal). If you encounter a performance issue after that change, please increase kafka_max_block_size in the table to a bigger value (for example CREATE TABLE ...Engine=Kafka ... SETTINGS ... kafka_max_block_size=524288). Fixes #7259 #8917 (filimonov)
  • Fix Parameter out of bound exception in some queries after PREWHERE optimizations. #8914 (Baudouin Giard)
  • Fixed the case of mixed-constness of arguments of function arrayZip. #8705 (alexey-milovidov)
  • When executing CREATE query, fold constant expressions in storage engine arguments. Replace empty database name with current database. Fixes #6508, #3492 #9262 (tavplubix)
  • Now it's not possible to create or add columns with simple cyclic aliases like a DEFAULT b, b DEFAULT a. #9603 (alesapin)
  • Fixed a bug with double move which may corrupt original part. This is relevant if you use ALTER TABLE MOVE #8680 (Vladimir Chebotarev)
  • Allow interval identifier to correctly parse without backticks. Fixed issue when a query cannot be executed even if the interval identifier is enclosed in backticks or double quotes. This fixes #9124. #9142 (alexey-milovidov)
  • Fixed fuzz test and incorrect behaviour of bitTestAll/bitTestAny functions. #9143 (alexey-milovidov)
  • Fix possible crash/wrong number of rows in LIMIT n WITH TIES when there are a lot of rows equal to n'th row. #9464 (tavplubix)
  • Fix mutations with parts written with enabled insert_quorum. #9463 (alesapin)
  • Fix data race at destruction of Poco::HTTPServer. It could happen when server is started and immediately shut down. #9468 (Anton Popov)
  • Fix bug in which a misleading error message was shown when running SHOW CREATE TABLE a_table_that_does_not_exist. #8899 (achulkov2)
  • Fixed Parameters are out of bound exception in some rare cases when we have a constant in the SELECT clause when we have an ORDER BY and a LIMIT clause. #8892 (Guillaume Tassery)
  • Fix mutations finalization, when already done mutation can have status is_done=0. #9217 (alesapin)
  • Prevent from executing ALTER ADD INDEX for MergeTree tables with old syntax, because it doesn't work. #8822 (Mikhail Korotov)
  • During server startup do not access table, which LIVE VIEW depends on, so server will be able to start. Also remove LIVE VIEW dependencies when detaching LIVE VIEW. LIVE VIEW is an experimental feature. #8824 (tavplubix)
  • Fix possible segfault in MergeTreeRangeReader, while executing PREWHERE. #9106 (Anton Popov)
  • Fix possible mismatched checksums with column TTLs. #9451 (Anton Popov)
  • Fixed a bug when parts were not being moved in background by TTL rules in case when there is only one volume. #8672 (Vladimir Chebotarev)
  • Fixed the issue Method createColumn() is not implemented for data type Set. This fixes #7799. #8674 (alexey-milovidov)
  • Now we will try to finalize mutations more frequently. #9427 (alesapin)
  • Fix intDiv by minus one constant #9351 (hcz)
  • Fix possible race condition in BlockIO. #9356 (Nikolai Kochetov)
  • Fix bug leading to server termination when trying to use / drop Kafka table created with wrong parameters. #9513 (filimonov)
  • Added workaround if OS returns wrong result for timer_create function. #8837 (alexey-milovidov)
  • Fixed error in usage of min_marks_for_seek parameter. Fixed the error message when there is no sharding key in Distributed table and we try to skip unused shards. #8908 (Azat Khuzhin)
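The LIMIT n WITH TIES fix in the list above concerns a clause whose result can legitimately contain more than n rows. As a rough illustration of the intended semantics (a Python sketch over an already sorted list, not the server's implementation; the function name is hypothetical):

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def limit_with_ties(sorted_rows: Sequence[T], n: int,
                    key: Callable[[T], object] = lambda row: row) -> List[T]:
    """Return the first n rows plus every following row whose sort key
    equals the key of the n-th row. Input must already be sorted by key."""
    if n <= 0:
        return []
    result = list(sorted_rows[:n])
    if len(result) < n:
        # Fewer than n rows in total: nothing can tie with the n-th row.
        return result
    last_key = key(result[-1])
    for row in sorted_rows[n:]:
        if key(row) != last_key:
            break
        result.append(row)
    return result


if __name__ == "__main__":
    # Three rows tie with the 2nd row, so four rows come back for n = 2.
    print(limit_with_ties([1, 2, 2, 2, 3], 2))
```

When many rows equal the n-th row, the output grows well beyond n, which is the situation the fix addresses.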

Improvement

  • Implement ALTER MODIFY/DROP queries on top of mutations for ReplicatedMergeTree* engines family. Now ALTERs block only at the metadata update stage and don't block after that. #8701 (alesapin)
  • Add ability to rewrite CROSS to INNER JOINs with WHERE section containing unqualified names. #9512 (Artem Zuikov)
  • Make SHOW TABLES and SHOW DATABASES queries support the WHERE expressions and FROM/IN #9076 (sundyli)
  • Added a setting deduplicate_blocks_in_dependent_materialized_views. #9070 (urykhy)
  • After recent changes MySQL client started to print binary strings in hex thereby making them not readable (#9032). The workaround in ClickHouse is to mark string columns as UTF-8, which is not always, but usually the case. #9079 (Yuriy Baranov)
  • Add support of String and FixedString keys for sumMap #8903 (Baudouin Giard)
  • Support string keys in SummingMergeTree maps #8933 (Baudouin Giard)
  • Signal termination of thread to the thread pool even if the thread has thrown exception #8736 (Ding Xiang Fei)
  • Allow to set query_id in clickhouse-benchmark #9416 (Anton Popov)
  • Don't allow strange expressions in ALTER TABLE ... PARTITION partition query. This addresses #7192 #8835 (alexey-milovidov)
  • The table system.table_engines now provides information about feature support (like supports_ttl or supports_sort_order). #8830 (Max Akhmedov)
  • Enable system.metric_log by default. It will contain rows with values of ProfileEvents, CurrentMetrics collected with "collect_interval_milliseconds" interval (one second by default). The table is very small (usually in order of megabytes) and collecting this data by default is reasonable. #9225 (alexey-milovidov)
  • Initialize query profiler for all threads in a group, e.g. it allows to fully profile insert-queries. Fixes #6964 #8874 (Ivan)
  • Now temporary LIVE VIEW is created by CREATE LIVE VIEW name WITH TIMEOUT [42] ... instead of CREATE TEMPORARY LIVE VIEW ..., because the previous syntax was not consistent with CREATE TEMPORARY TABLE ... #9131 (tavplubix)
  • Add text_log.level configuration parameter to limit entries that go to system.text_log table #8809 (Azat Khuzhin)
  • Allow to put downloaded part to disks/volumes according to TTL rules #8598 (Vladimir Chebotarev)
  • For external MySQL dictionaries, allow to mutualize MySQL connection pool to "share" them among dictionaries. This option significantly reduces the number of connections to MySQL servers. #9409 (Clément Rodriguez)
  • Show nearest query execution time for quantiles in clickhouse-benchmark output instead of interpolated values. It's better to show values that correspond to the execution time of some queries. #8712 (alexey-milovidov)
  • Possibility to add key & timestamp for the message when inserting data to Kafka. Fixes #7198 #8969 (filimonov)
  • If server is run from terminal, highlight thread number, query id and log priority by colors. This is for improved readability of correlated log messages for developers. #8961 (alexey-milovidov)
  • Better exception message while loading tables for Ordinary database. #9527 (alexey-milovidov)
  • Implement arraySlice for arrays with aggregate function states. This fixes #9388 #9391 (alexey-milovidov)
  • Allow constant functions and constant arrays to be used on the right side of IN operator. #8813 (Anton Popov)
  • If zookeeper exception has happened while fetching data for system.replicas, display it in a separate column. This implements #9137 #9138 (alexey-milovidov)
  • Atomically remove MergeTree data parts on destroy. #8402 (Vladimir Chebotarev)
  • Support row-level security for Distributed tables. #8926 (Ivan)
  • Now we recognize suffix (like KB, KiB...) in settings values. #8072 (Mikhail Korotov)
  • Prevent out of memory while constructing result of a large JOIN. #8637 (Artem Zuikov)
  • Added names of clusters to suggestions in interactive mode in clickhouse-client. #8709 (alexey-milovidov)
  • Initialize query profiler for all threads in a group, e.g. it allows to fully profile insert-queries #8820 (Ivan)
  • Added column exception_code in system.query_log table. #8770 (Mikhail Korotov)
  • Enabled MySQL compatibility server on port 9004 in the default server configuration file. Fixed password generation command in the example in configuration. #8771 (Yuriy Baranov)
  • Prevent abort on shutdown if the filesystem is readonly. This fixes #9094 #9100 (alexey-milovidov)
  • Better exception message when length is required in HTTP POST query. #9453 (alexey-milovidov)
  • Add _path and _file virtual columns to HDFS and File engines and hdfs and file table functions #8489 (Olga Khvostikova)
  • Fix error Cannot find column while inserting into MATERIALIZED VIEW in case if new column was added to view's internal table. #8766 #8788 (vzakaznikov) #8788 #8806 (Nikolai Kochetov) #8803 (Nikolai Kochetov)
  • Fix progress over native client-server protocol, by sending progress after final update (like logs). This may be relevant only to some third-party tools that are using native protocol. #9495 (Azat Khuzhin)
  • Add a system metric tracking the number of client connections using MySQL protocol (#9013). #9015 (Eugene Klimov)
  • From now on, HTTP responses will have X-ClickHouse-Timezone header set to the same timezone value that SELECT timezone() would report. #9493 (Denis Glazachev)
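The clickhouse-benchmark entry above distinguishes "nearest" quantiles from interpolated ones. The sketch below contrasts the two ideas in Python. It assumes a nearest-rank definition and simple linear interpolation; the changelog does not state the exact indexing clickhouse-benchmark uses, and both function names are hypothetical.

```python
import math
from typing import List, Sequence


def nearest_rank_quantile(samples: Sequence[float], q: float) -> float:
    """Quantile that is always one of the observed samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be within [0, 1]")
    ordered: List[float] = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))  # 1-based rank
    return ordered[rank - 1]


def interpolated_quantile(samples: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile; may fall between observed samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be within [0, 1]")
    ordered: List[float] = sorted(samples)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


if __name__ == "__main__":
    times_ms = [100, 200, 400, 800]  # made-up query execution times
    print(nearest_rank_quantile(times_ms, 0.5))
    print(interpolated_quantile(times_ms, 0.5))
```

With the made-up timings in the usage lines, the nearest-rank median is a value some query actually took, while the interpolated median is a number no query produced, which is the motivation given in the entry.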

Performance Improvement

  • Improve performance of analysing index with IN #9261 (Anton Popov)
  • Simpler and more efficient code in Logical Functions + code cleanups. A followup to #8718 #8728 (Alexander Kazakov)
  • Overall performance improvement (in range of 5%..200% for affected queries) by ensuring even more strict aliasing with C++20 features. #9304 (Amos Bird)
  • More strict aliasing for inner loops of comparison functions. #9327 (alexey-milovidov)
  • More strict aliasing for inner loops of arithmetic functions. #9325 (alexey-milovidov)
  • A ~3 times faster implementation for ColumnVector::replicate(), via which ColumnConst::convertToFullColumn() is implemented. Also will be useful in tests when materializing constants. #9293 (Alexander Kazakov)
  • Another minor performance improvement to ColumnVector::replicate() (this speeds up the materialize function and higher order functions) an even further improvement to #9293 #9442 (Alexander Kazakov)
  • Improved performance of stochasticLinearRegression aggregate function. This patch is contributed by Intel. #8652 (alexey-milovidov)
  • Improve performance of reinterpretAsFixedString function. #9342 (alexey-milovidov)
  • Do not send blocks to client for Null format in processors pipeline. #8797 (Nikolai Kochetov) #8767 (Alexander Kuzmenkov)

Build/Testing/Packaging Improvement

ClickHouse release v20.1

ClickHouse release v20.1.6.30, 2020-03-05

Bug Fix

  • Fix data incompatibility when compressed with T64 codec. #9039 (abyss7)
  • Fix order of ranges while reading from MergeTree table in one thread. Fixes #8964. #9050 (CurtizJ)
  • Fix possible segfault in MergeTreeRangeReader, while executing PREWHERE. Fixes #9064. #9106 (CurtizJ)
  • Fix reinterpretAsFixedString to return FixedString instead of String. #9052 (oandrew)
  • Fix joinGet with nullable return types. Fixes #8919 #9014 (amosbird)
  • Fix fuzz test and incorrect behaviour of bitTestAll/bitTestAny functions. #9143 (alexey-milovidov)
  • Fix the behaviour of match and extract functions when haystack has zero bytes. The behaviour was wrong when haystack was constant. Fixes #9160 #9163 (alexey-milovidov)
  • Fixed execution of inversed predicates when non-strictly monotonic functional index is used. Fixes #9034 #9223 (Akazz)
  • Allow to rewrite CROSS to INNER JOIN if there's [NOT] LIKE operator in WHERE section. Fixes #9191 #9229 (4ertus2)
  • Allow first column(s) in a table with Log engine be an alias. #9231 (abyss7)
  • Allow comma join with IN() inside. Fixes #7314. #9251 (4ertus2)
  • Improve ALTER MODIFY/ADD queries logic. Now you cannot ADD a column without a type, MODIFY default expression doesn't change the type of the column, and MODIFY type doesn't lose the default expression value. Fixes #8669. #9227 (alesapin)
  • Fix mutations finalization, when already done mutation can have status is_done=0. #9217 (alesapin)
  • Support "Processors" pipeline for system.numbers and system.numbers_mt. This also fixes the bug when max_execution_time is not respected. #7796 (KochetovNicolai)
  • Fix wrong counting of DictCacheKeysRequestedFound metric. #9411 (nikitamikhaylov)
  • Added a check for storage policy in ATTACH PARTITION FROM, REPLACE PARTITION, MOVE TO TABLE which otherwise could make data of a part inaccessible after restart and prevent ClickHouse from starting. #9383 (excitoon)
  • Fixed UBSan report in MergeTreeIndexSet. This fixes #9250 #9365 (alexey-milovidov)
  • Fix possible datarace in BlockIO. #9356 (KochetovNicolai)
  • Support for UInt64 numbers that don't fit in Int64 in JSON-related functions. Update SIMDJSON to master. This fixes #9209 #9344 (alexey-milovidov)
  • Fix the issue when the amount of free space is not calculated correctly if the data directory is mounted to a separate device. For default disk calculate the free space from data subdirectory. This fixes #7441 #9257 (millb)
  • Fix the issue when TLS connections may fail with the message OpenSSL SSL_read: error:14094438:SSL routines:ssl3_read_bytes:tlsv1 alert internal error and SSL Exception: error:2400006E:random number generator::error retrieving entropy. Update OpenSSL to upstream master. #8956 (alexey-milovidov)
  • When executing CREATE query, fold constant expressions in storage engine arguments. Replace empty database name with current database. Fixes #6508, #3492. Also fix check for local address in ClickHouseDictionarySource. #9262 (tavplubix)
  • Fix segfault in StorageMerge, which can happen when reading from StorageFile. #9387 (tavplubix)
  • Prevent losing data in Kafka in rare cases when exception happens after reading suffix but before commit. Fixes #9378. Related: #7175 #9507 (filimonov)
  • Fix bug leading to server termination when trying to use / drop Kafka table created with wrong parameters. Fixes #9494. Incorporates #9507. #9513 (filimonov)

New Feature

  • Add deduplicate_blocks_in_dependent_materialized_views option to control the behaviour of idempotent inserts into tables with materialized views. This new feature was added to the bugfix release by a special request from Altinity. #9070 (urykhy)

ClickHouse release v20.1.2.4, 2020-01-22

Backward Incompatible Change

  • Make the setting merge_tree_uniform_read_distribution obsolete. The server still recognizes this setting but it has no effect. #8308 (alexey-milovidov)
  • Changed return type of the function greatCircleDistance to Float32 because now the result of calculation is Float32. #7993 (alexey-milovidov)
  • Now it's expected that query parameters are represented in "escaped" format. For example, to pass string a<tab>b you have to write a\tb or a\<tab>b and respectively, a%5Ctb or a%5C%09b in URL. This is needed to add the possibility to pass NULL as \N. This fixes #7488. #8517 (alexey-milovidov)
  • Enable use_minimalistic_part_header_in_zookeeper setting for ReplicatedMergeTree by default. This will significantly reduce amount of data stored in ZooKeeper. This setting is supported since version 19.1 and we already use it in production in multiple services without any issues for more than half a year. Disable this setting if you have a chance to downgrade to versions older than 19.1. #6850 (alexey-milovidov)
  • Data skipping indices are production ready and enabled by default. The settings allow_experimental_data_skipping_indices, allow_experimental_cross_to_join_conversion and allow_experimental_multiple_joins_emulation are now obsolete and do nothing. #7974 (alexey-milovidov)
  • Add new ANY JOIN logic for StorageJoin consistent with JOIN operation. To upgrade without changes in behaviour you need to add SETTINGS any_join_distinct_right_table_keys = 1 to Engine Join tables metadata or recreate these tables after upgrade. #8400 (Artem Zuikov)
  • Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
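
The query parameter escaping rule above has two layers: the value is first backslash-escaped, then percent-encoded for the URL. A minimal Python sketch of that idea follows. The helper name `encode_query_param` is hypothetical and not part of any ClickHouse client, and the handling of backslashes and newlines is an assumption modeled on tab-separated escaping (the entry above only mentions tab and NULL).

```python
from urllib.parse import quote


# Hypothetical helper for illustration; not part of any ClickHouse client.
def encode_query_param(value):
    """Backslash-escape a parameter value, then percent-encode it for a URL."""
    if value is None:
        escaped = "\\N"  # NULL is passed as \N
    else:
        escaped = (
            value.replace("\\", "\\\\")  # assumption: backslashes are doubled
            .replace("\t", "\\t")        # tab becomes backslash + t
            .replace("\n", "\\n")        # assumption: newline handled like tab
        )
    # safe="" so that every reserved character is percent-encoded
    return quote(escaped, safe="")


print(encode_query_param("a\tb"))  # a%5Ctb
print(encode_query_param(None))    # %5CN
```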

New Feature

  • Added information about part paths to system.merges. #8043 (Vladimir Chebotarev)
  • Add ability to execute SYSTEM RELOAD DICTIONARY query in ON CLUSTER mode. #8288 (Guillaume Tassery)
  • Add ability to execute CREATE DICTIONARY queries in ON CLUSTER mode. #8163 (alesapin)
  • Now user's profile in users.xml can inherit multiple profiles. #8343 (Mikhail f. Shiryaev)
  • Added system.stack_trace table that allows to look at stack traces of all server threads. This is useful for developers to introspect server state. This fixes #7576. #8344 (alexey-milovidov)
  • Add DateTime64 datatype with configurable sub-second precision. #7170 (Vasily Nemkov)
  • Add table function clusterAllReplicas which allows to query all the nodes in the cluster. #8493 (kiran sunkari)
  • Add aggregate function categoricalInformationValue which calculates the information value of a discrete feature. #8117 (hcz)
  • Speed up parsing of data files in CSV, TSV and JSONEachRow format by doing it in parallel. #7780 (Alexander Kuzmenkov)
  • Add function bankerRound which performs banker's rounding. #8112 (hcz)
  • Support more languages in embedded dictionary for region names: 'ru', 'en', 'ua', 'uk', 'by', 'kz', 'tr', 'de', 'uz', 'lv', 'lt', 'et', 'pt', 'he', 'vi'. #8189 (alexey-milovidov)
  • Improvements in consistency of ANY JOIN logic. Now t1 ANY LEFT JOIN t2 equals t2 ANY RIGHT JOIN t1. #7665 (Artem Zuikov)
  • Add setting any_join_distinct_right_table_keys which enables old behaviour for ANY INNER JOIN. #7665 (Artem Zuikov)
  • Add new SEMI and ANTI JOIN. Old ANY INNER JOIN behaviour now available as SEMI LEFT JOIN. #7665 (Artem Zuikov)
  • Added Distributed format for File engine and file table function which allows to read from .bin files generated by asynchronous inserts into Distributed table. #8535 (Nikolai Kochetov)
  • Add optional reset column argument for runningAccumulate which allows to reset aggregation results for each new key value. #8326 (Sergey Kononenko)
  • Add ability to use ClickHouse as Prometheus endpoint. #7900 (vdimir)
  • Add section <remote_url_allow_hosts> in config.xml which restricts allowed hosts for remote table engines and table functions URL, S3, HDFS. #7154 (Mikhail Korotov)
  • Added function greatCircleAngle which calculates the distance on a sphere in degrees. #8105 (alexey-milovidov)
  • Changed Earth radius to be consistent with H3 library. #8105 (alexey-milovidov)
  • Added JSONCompactEachRow and JSONCompactEachRowWithNamesAndTypes formats for input and output. #7841 (Mikhail Korotov)
  • Added feature for file-related table engines and table functions (File, S3, URL, HDFS) which allows to read and write gzip files based on additional engine parameter or file extension. #7840 (Andrey Bodrov)
  • Added the randomASCII(length) function, generating a string with a random set of ASCII printable characters. #8401 (BayoNet)
  • Added function JSONExtractArrayRaw which returns an array of unparsed JSON array elements from a JSON string. #8081 (Oleg Matrokhin)
  • Add arrayZip function which allows to combine multiple arrays of equal lengths into one array of tuples. #8149 (Winter Zhang)
  • Add ability to move data between disks according to configured TTL-expressions for *MergeTree table engines family. #8140 (Vladimir Chebotarev)
  • Added new aggregate function avgWeighted which allows to calculate weighted average. #7898 (Andrey Bodrov)
  • Now parallel parsing is enabled by default for TSV, TSKV, CSV and JSONEachRow formats. #7894 (Nikita Mikhaylov)
  • Add several geo functions from H3 library: h3GetResolution, h3EdgeAngle, h3EdgeLength, h3IsValid and h3kRing. #8034 (Konstantin Malanchev)
  • Added support for brotli (br) compression in file-related storages and table functions. This fixes #8156. #8526 (alexey-milovidov)
  • Add groupBit* functions for the SimpleAggregationFunction type. #8485 (Guillaume Tassery)
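
Three of the new functions listed above (banker's rounding, arrayZip and avgWeighted) have simple semantics that can be modeled outside the server. The Python sketch below only illustrates those semantics under stated assumptions: the function names are hypothetical Python equivalents, rounding is shown to the nearest integer only, and none of this reflects how ClickHouse implements them.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def bankers_round(x):
    """Round to the nearest integer; exact ties go to the even neighbour."""
    return int(Decimal(str(x)).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


def array_zip(*arrays):
    """Combine arrays of equal length into one list of tuples."""
    if len({len(a) for a in arrays}) > 1:
        raise ValueError("all arrays must have the same length")
    return list(zip(*arrays))


def avg_weighted(values, weights):
    """Weighted average: sum(value * weight) / sum(weight)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


print(bankers_round(2.5), bankers_round(3.5))  # 2 4
print(array_zip([1, 2], ["a", "b"]))           # [(1, 'a'), (2, 'b')]
print(avg_weighted([1, 2, 3], [1, 1, 2]))      # 2.25
```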

Bug Fix

  • Fix rename of tables with Distributed engine. Fixes issue #7868. #8306 (tavplubix)
  • Now dictionaries support EXPRESSION for attributes as an arbitrary string in a non-ClickHouse SQL dialect. #8098 (alesapin)
  • Fix broken INSERT SELECT FROM mysql(...) query. This fixes #8070 and #7960. #8234 (tavplubix)
  • Fix error "Mismatch column sizes" when inserting default Tuple from JSONEachRow. This fixes #5653. #8606 (tavplubix)
  • Now an exception will be thrown in case of using WITH TIES alongside LIMIT BY. Also add ability to use TOP with LIMIT BY. This fixes #7472. #7637 (Nikita Mikhaylov)
  • Fix unintended dependency on a fresh glibc version in clickhouse-odbc-bridge binary. #8046 (Amos Bird)
  • Fix bug in check function of *MergeTree engines family. Now it doesn't fail in case when we have equal amount of rows in last granule and last mark (non-final). #8047 (alesapin)
  • Fix insert into Enum* columns after ALTER query, when underlying numeric type is equal to table specified type. This fixes #7836. #7908 (Anton Popov)
  • Allowed non-constant negative "size" argument for function substring. It was not allowed by mistake. This fixes #4832. #7703 (alexey-milovidov)
  • Fix parsing bug when wrong number of arguments passed to (O|J)DBC table engine. #7709 (alesapin)
  • Use the command name of the running clickhouse process when sending logs to syslog. In previous versions, an empty string was used instead of the command name. #8460 (Michael Nacharov)
  • Fix check of allowed hosts for localhost. This PR fixes the solution provided in #8241. #8342 (Vitaly Baranov)
  • Fix rare crash in argMin and argMax functions for long string arguments, when result is used in runningAccumulate function. This fixes #8325 #8341 (dinosaur)
  • Fix memory overcommit for tables with Buffer engine. #8345 (Azat Khuzhin)
  • Fixed potential bug in functions that can take NULL as one of the arguments and return non-NULL. #8196 (alexey-milovidov)
  • Better metrics calculations in thread pool for background processes for MergeTree table engines. #8194 (Vladimir Chebotarev)
  • Fix function IN inside WHERE statement when row-level table filter is present. Fixes #6687 #8357 (Ivan)
  • Now an exception is thrown if the integral value is not parsed completely for settings values. #7678 (Mikhail Korotov)
  • Fix exception when aggregate function is used in query to distributed table with more than two local shards. #8164 (小路)
  • Now bloom filter can handle zero length arrays and doesn't perform redundant calculations. #8242 (achimbab)
  • Fixed checking if a client host is allowed by matching the client host to host_regexp specified in users.xml. #8241 (Vitaly Baranov)
  • Relax ambiguous column check that leads to false positives in multiple JOIN ON section. #8385 (Artem Zuikov)
  • Fixed possible server crash (std::terminate) when the server cannot send or write data in JSON or XML format with values of String data type (that require UTF-8 validation) or when compressing result data with Brotli algorithm or in some other rare cases. This fixes #7603 #8384 (alexey-milovidov)
  • Fix race condition in StorageDistributedDirectoryMonitor found by CI. This fixes #8364. #8383 (Nikolai Kochetov)
  • Now background merges in *MergeTree table engines family preserve storage policy volume order more accurately. #8549 (Vladimir Chebotarev)
  • Now table engine Kafka works properly with Native format. This fixes #6731 #7337 #8003. #8016 (filimonov)
  • Fixed formats with headers (like CSVWithNames) which were throwing exception about EOF for table engine Kafka. #8016 (filimonov)
  • Fixed a bug with making set from subquery in right part of IN section. This fixes #5767 and #2542. #7755 (Nikita Mikhaylov)
  • Fix possible crash while reading from storage File. #7756 (Nikolai Kochetov)
  • Fixed reading of the files in Parquet format containing columns of type list. #8334 (maxulan)
  • Fix error Not found column for distributed queries with PREWHERE condition dependent on sampling key if max_parallel_replicas > 1. #7913 (Nikolai Kochetov)
  • Fix error Not found column if query used PREWHERE dependent on table's alias and the result set was empty because of primary key condition. #7911 (Nikolai Kochetov)
  • Fixed return type for functions rand and randConstant in case of Nullable argument. Now functions always return UInt32 and never Nullable(UInt32). #8204 (Nikolai Kochetov)
  • Disabled predicate push-down for WITH FILL expression. This fixes #7784. #7789 (Winter Zhang)
  • Fixed incorrect count() result for SummingMergeTree when FINAL section is used. #3280 #7786 (Nikita Mikhaylov)
  • Fix possible incorrect result for constant functions from remote servers. It happened for queries with functions like version(), uptime(), etc. which return different constant values for different servers. This fixes #7666. #7689 (Nikolai Kochetov)
  • Fix complicated bug in push-down predicate optimization which leads to wrong results. This fixes a lot of issues on push-down predicate optimization. #8503 (Winter Zhang)
  • Fix crash in CREATE TABLE .. AS dictionary query. #8508 (Azat Khuzhin)
  • Several improvements to ClickHouse grammar in .g4 file. #8294 (taiyang-li)
  • Fix bug that leads to crashes in JOINs with tables with engine Join. This fixes #7556 #8254 #7915 #8100. #8298 (Artem Zuikov)
  • Fix redundant dictionaries reload on CREATE DATABASE. #7916 (Azat Khuzhin)
  • Limit maximum number of streams for read from StorageFile and StorageHDFS. Fixes https://github.com/ClickHouse/ClickHouse/issues/7650. #7981 (alesapin)
  • Fix bug in ALTER ... MODIFY ... CODEC query, when the user specifies both default expression and codec. Fixes #8593. #8614 (alesapin)
  • Fix error in background merge of columns with SimpleAggregateFunction(LowCardinality) type. #8613 (Nikolai Kochetov)
  • Fixed type check in function toDateTime64. #8375 (Vasily Nemkov)
  • Now the server does not crash on LEFT or FULL JOIN with the Join engine and unsupported join_use_nulls settings. #8479 (Artem Zuikov)
  • Now DROP DICTIONARY IF EXISTS db.dict query doesn't throw exception if db doesn't exist. #8185 (Vitaly Baranov)
  • Fix possible crashes in table functions (file, mysql, remote) caused by usage of reference to removed IStorage object. Fix incorrect parsing of columns specified at insertion into table function. #7762 (tavplubix)
  • Ensure the network is up before starting clickhouse-server. This fixes #7507. #8570 (Zhichang Yu)
  • Fix timeouts handling for secure connections, so queries don't hang indefinitely. This fixes #8126. #8128 (alexey-milovidov)
  • Fix clickhouse-copier's redundant contention between concurrent workers. #7816 (Ding Xiang Fei)
  • Now mutations don't skip attached parts, even if their mutation version was larger than the current mutation version. #7812 (Zhichang Yu) #8250 (alesapin)
  • Ignore redundant copies of *MergeTree data parts after move to another disk and server restart. #7810 (Vladimir Chebotarev)
  • Fix crash in FULL JOIN with LowCardinality in JOIN key. #8252 (Artem Zuikov)
  • Forbidden to use column name more than once in insert query like INSERT INTO tbl (x, y, x). This fixes #5465, #7681. #7685 (alesapin)
  • Added fallback for detecting the number of physical CPU cores for unknown CPUs (using the number of logical CPU cores). This fixes #5239. #7726 (alexey-milovidov)
  • Fix There's no column error for materialized and alias columns. #8210 (Artem Zuikov)
  • Fixed server crash when EXISTS query was used without TABLE or DICTIONARY qualifier, such as EXISTS t. This fixes #8172. This bug was introduced in version 19.17. #8213 (alexey-milovidov)
  • Fix rare bug with error "Sizes of columns doesn't match" that might appear when using SimpleAggregateFunction column. #7790 (Boris Granveaud)
  • Fix bug where user with empty allow_databases got access to all databases (and same for allow_dictionaries). #7793 (DeifyTheGod)
  • Fix client crash when server already disconnected from client. #8071 (Azat Khuzhin)
  • Fix ORDER BY behaviour in case of sorting by primary key prefix and non primary key suffix. #7759 (Anton Popov)
  • Check if qualified column present in the table. This fixes #6836. #7758 (Artem Zuikov)
  • Fixed the behavior where ALTER MOVE, run immediately after a merge finishes, moved a superpart of the specified part. Fixes #8103. #8104 (Vladimir Chebotarev)
  • Fix possible server crash while using UNION with different number of columns. Fixes #7279. #7929 (Nikolai Kochetov)
  • Fix size of result substring for function substr with negative size. #8589 (Nikolai Kochetov)
  • Now server does not execute part mutation in MergeTree if there are not enough free threads in background pool. #8588 (tavplubix)
  • Fix a minor typo on formatting UNION ALL AST. #7999 (litao91)
  • Fixed incorrect bloom filter results for negative numbers. This fixes #8317. #8566 (Winter Zhang)
  • Fixed potential buffer overflow in decompress. Malicious user can pass fabricated compressed data that will cause read after buffer. This issue was found by Eldar Zaitov from Yandex information security team. #8404 (alexey-milovidov)
  • Fix incorrect result because of integers overflow in arrayIntersect. #7777 (Nikolai Kochetov)
  • Now OPTIMIZE TABLE query will not wait for offline replicas to perform the operation. #8314 (javi santana)
  • Fixed ALTER TTL parser for Replicated*MergeTree tables. #8318 (Vladimir Chebotarev)
  • Fix communication between server and client, so that the server reads temporary tables info after a query failure. #8084 (Azat Khuzhin)
  • Fix bitmapAnd function error when intersecting an aggregated bitmap and a scalar bitmap. #8082 (Yue Huang)
  • Refine the definition of ZXid according to the ZooKeeper Programmer's Guide which fixes bug in clickhouse-cluster-copier. #8088 (Ding Xiang Fei)
  • odbc table function now respects external_table_functions_use_nulls setting. #7506 (Vasily Nemkov)
  • Fixed bug that led to a rare data race. #8143 (Alexander Kazakov)
  • Now SYSTEM RELOAD DICTIONARY reloads a dictionary completely, ignoring update_field. This fixes #7440. #8037 (Vitaly Baranov)
  • Add ability to check if dictionary exists in create query. #8032 (alesapin)
  • Fix Float* parsing in Values format. This fixes #7817. #7870 (tavplubix)
  • Fix crash when we cannot reserve space in some background operations of *MergeTree table engines family. #7873 (Vladimir Chebotarev)
  • Fix crash of merge operation when table contains SimpleAggregateFunction(LowCardinality) column. This fixes #8515. #8522 (Azat Khuzhin)
  • Restore support of all ICU locales and add the ability to apply collations for constant expressions. Also add language name to system.collations table. #8051 (alesapin)
  • Fix bug when external dictionaries with zero minimal lifetime (LIFETIME(MIN 0 MAX N), LIFETIME(N)) don't update in background. #7983 (alesapin)
  • Fix crash when external dictionary with ClickHouse source has subquery in query. #8351 (Nikolai Kochetov)
  • Fix incorrect parsing of file extension in table with engine URL. This fixes #8157. #8419 (Andrey Bodrov)
  • Fix CHECK TABLE query for *MergeTree tables without key. Fixes #7543. #7979 (alesapin)
  • Fixed conversion of Float64 to MySQL type. #8079 (Yuriy Baranov)
  • Now if table was not completely dropped because of server crash, server will try to restore and load it. #8176 (tavplubix)
  • Fixed crash in table function file while inserting into file that doesn't exist. Now in this case file would be created and then insert would be processed. #8177 (Olga Khvostikova)
  • Fix rare deadlock which can happen when trace_log is enabled. #7838 (filimonov)
  • Add ability to work with different types besides Date in RangeHashed external dictionary created from DDL query. Fixes #7899. #8275 (alesapin)
  • Fix crash when now64() is called with the result of another function. #8270 (Vasily Nemkov)
  • Fixed bug with detecting client IP for connections through mysql wire protocol. #7743 (Dmitry Muzyka)
  • Fix empty array handling in arraySplit function. This fixes #7708. #7747 (hcz)
  • Fixed the issue when pid-file of another running clickhouse-server may be deleted. #8487 (Weiqing Xu)
  • Fix reload of a dictionary that has invalidate_query: such a dictionary stopped updating after an exception on a previous update attempt. #8029 (alesapin)
  • Fixed error in function arrayReduce that may lead to "double free" and error in aggregate function combinator Resample that may lead to memory leak. Added aggregate function aggThrow. This function can be used for testing purposes. #8446 (alexey-milovidov)

Improvement

  • Improved logging when working with S3 table engine. #8251 (Grigory Pervakov)
  • Printed help message when no arguments are passed when calling clickhouse-local. This fixes #5335. #8230 (Andrey Nagorny)
  • Add setting mutations_sync which allows waiting for ALTER UPDATE/DELETE queries synchronously. #8237 (alesapin)
  • Allow to set up relative user_files_path in config.xml (in the way similar to format_schema_path). #7632 (hcz)
  • Add exception for illegal types for conversion functions with -OrZero postfix. #7880 (Andrey Konyaev)
  • Simplify format of the header of data sending to a shard in a distributed query. #8044 (Vitaly Baranov)
  • Live View table engine refactoring. #8519 (vzakaznikov)
  • Add additional checks for external dictionaries created from DDL-queries. #8127 (alesapin)
  • Fix error Column ... already exists while using FINAL and SAMPLE together, e.g. select count() from table final sample 1/2. Fixes #5186. #7907 (Nikolai Kochetov)
  • Now the first argument of the joinGet function can be a table identifier. #7707 (Amos Bird)
  • Allow using MaterializedView with subqueries above Kafka tables. #8197 (filimonov)
  • Now background moves between disks run in a separate thread pool. #7670 (Vladimir Chebotarev)
  • SYSTEM RELOAD DICTIONARY now executes synchronously. #8240 (Vitaly Baranov)
  • Stack traces now display physical addresses (offsets in object file) instead of virtual memory addresses (where the object file was loaded). That allows the use of addr2line when binary is position independent and ASLR is active. This fixes #8360. #8387 (alexey-milovidov)
  • Support new syntax for row-level security filters: <table name='table_name'>…</table>. Fixes #5779. #8381 (Ivan)
  • Now cityHash function can work with Decimal and UUID types. Fixes #5184. #7693 (Mikhail Korotov)
  • Removed fixed index granularity (it was 1024) from system logs because it's obsolete after implementation of adaptive granularity. #7698 (alexey-milovidov)
  • Enabled MySQL compatibility server when ClickHouse is compiled without SSL. #7852 (Yuriy Baranov)
  • Now server checksums distributed batches, which gives more verbose errors in case of corrupted data in batch. #7914 (Azat Khuzhin)
  • Support DROP DATABASE, DETACH TABLE, DROP TABLE and ATTACH TABLE for MySQL database engine. #8202 (Winter Zhang)
  • Add authentication in S3 table function and table engine. #7623 (Vladimir Chebotarev)
  • Added check for extra parts of MergeTree at different disks, in order to not allow to miss data parts at undefined disks. #8118 (Vladimir Chebotarev)
  • Enable SSL support for Mac client and server. #8297 (Ivan)
  • Now ClickHouse can work as MySQL federated server (see https://dev.mysql.com/doc/refman/5.7/en/federated-create-server.html). #7717 (Maxim Fedotov)
  • clickhouse-client now only enables bracketed-paste when multiquery is on and multiline is off. This fixes #7757. #7761 (Amos Bird)
  • Support Array(Decimal) in if function. #7721 (Artem Zuikov)
  • Support Decimals in arrayDifference, arrayCumSum and arrayCumSumNegative functions. #7724 (Artem Zuikov)
  • Added lifetime column to system.dictionaries table. #6820 #7727 (kekekekule)
  • Improved check for existing parts on different disks for *MergeTree table engines. Addresses #7660. #8440 (Vladimir Chebotarev)
  • Integration with AWS SDK for S3 interactions which allows to use all S3 features out of the box. #8011 (Pavel Kovalenko)
  • Added support for subqueries in Live View tables. #7792 (vzakaznikov)
  • Check for using Date or DateTime column from TTL expressions was removed. #7920 (Vladimir Chebotarev)
  • Information about disk was added to system.detached_parts table. #7833 (Vladimir Chebotarev)
  • Now settings max_(table|partition)_size_to_drop can be changed without a restart. #7779 (Grigory Pervakov)
  • Slightly better usability of error messages. Ask user not to remove the lines below Stack trace:. #7897 (alexey-milovidov)
  • Better reading messages from Kafka engine in various formats after #7935. #8035 (Ivan)
  • Better compatibility with MySQL clients which don't support sha2_password auth plugin. #8036 (Yuriy Baranov)
  • Support more column types in MySQL compatibility server. #7975 (Yuriy Baranov)
  • Implement ORDER BY optimization for Merge, Buffer and Materialized View storages with underlying MergeTree tables. #8130 (Anton Popov)
  • Now we always use POSIX implementation of getrandom to have better compatibility with old kernels (< 3.17). #7940 (Amos Bird)
  • Better check for valid destination in a move TTL rule. #8410 (Vladimir Chebotarev)
  • Better checks for broken insert batches for Distributed table engine. #7933 (Azat Khuzhin)
  • Add column with array of parts name which mutations must process in future to system.mutations table. #8179 (alesapin)
  • Parallel merge sort optimization for processors. #8552 (Nikolai Kochetov)
  • The setting mark_cache_min_lifetime is now obsolete and does nothing. In previous versions, the mark cache could grow in memory larger than mark_cache_size to accommodate data within mark_cache_min_lifetime seconds. That led to confusion and higher memory usage than expected, which is especially bad on memory-constrained systems. If you see performance degradation after installing this release, you should increase mark_cache_size. #8484 (alexey-milovidov)
  • Preparation to use tid everywhere. This is needed for #7477. #8276 (alexey-milovidov)

Performance Improvement

  • Performance optimizations in processors pipeline. #7988 (Nikolai Kochetov)
  • Non-blocking updates of expired keys in cache dictionaries (with permission to read old ones). #8303 (Nikita Mikhaylov)
  • Compile ClickHouse without -fno-omit-frame-pointer globally to spare one more register. #8097 (Amos Bird)
  • Speedup greatCircleDistance function and add performance tests for it. #7307 (Olga Khvostikova)
  • Improved performance of function roundDown. #8465 (alexey-milovidov)
  • Improved performance of max, min, argMin, argMax for DateTime64 data type. #8199 (Vasily Nemkov)
  • Improved performance of sorting without a limit or with big limit and external sorting. #8545 (alexey-milovidov)
  • Improved performance of formatting floating point numbers up to 6 times. #8542 (alexey-milovidov)
  • Improved performance of modulo function. #7750 (Amos Bird)
  • Optimized ORDER BY and merging with single column key. #8335 (alexey-milovidov)
  • Better implementation for arrayReduce, -Array and -State combinators. #7710 (Amos Bird)
  • Now PREWHERE should be optimized to be at least as efficient as WHERE. #7769 (Amos Bird)
  • Improve the way round and roundBankers handle negative numbers. #8229 (hcz)
  • Improved decoding performance of DoubleDelta and Gorilla codecs by roughly 30-40%. This fixes #7082. #8019 (Vasily Nemkov)
  • Improved performance of base64 related functions. #8444 (alexey-milovidov)
  • Added a function geoDistance. It is similar to greatCircleDistance but uses approximation to WGS-84 ellipsoid model. The performance of both functions is nearly the same. #8086 (alexey-milovidov)
  • Faster min and max aggregation functions for Decimal data type. #8144 (Artem Zuikov)
  • Vectorize processing arrayReduce. #7608 (Amos Bird)
  • if chains are now optimized as multiIf. #8355 (kamalov-ruslan)
  • Fix performance regression of Kafka table engine introduced in 19.15. This fixes #7261. #7935 (filimonov)
  • Removed "pie" code generation that gcc from Debian packages occasionally brings by default. #8483 (alexey-milovidov)
  • Parallel parsing of data formats. #6553 (Nikita Mikhaylov)
  • Enable optimized parser of Values with expressions by default (input_format_values_deduce_templates_of_expressions=1). #8231 (tavplubix)
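
The entry above about if chains being optimized as multiIf describes flattening nested conditionals into one multi-branch call. The Python sketch below illustrates that rewrite on a toy expression tree. The tuple representation and the function `flatten_if_chain` are hypothetical, and following only the else branch is an assumption; the real optimization works on the ClickHouse AST.

```python
def flatten_if_chain(expr):
    """Rewrite if(c1, a, if(c2, b, c)) as multiIf(c1, a, c2, b, c).

    Expressions are toy tuples of the form ("if", cond, then, otherwise);
    anything else is treated as a leaf.
    """
    args = []
    node = expr
    # Walk down the chain through the else branch only.
    while isinstance(node, tuple) and node and node[0] == "if":
        _, cond, then, otherwise = node
        args += [cond, then]
        node = otherwise
    if len(args) < 4:  # fewer than two nested ifs: nothing to flatten
        return expr
    return ("multiIf", *args, node)


chain = ("if", "x < 10", "'low'", ("if", "x < 100", "'mid'", "'high'"))
print(flatten_if_chain(chain))
# ('multiIf', 'x < 10', "'low'", 'x < 100', "'mid'", "'high'")
```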

Build/Testing/Packaging Improvement

Experimental Feature

  • Added experimental setting min_bytes_to_use_mmap_io. It allows to read big files without copying data from kernel to userspace. The setting is disabled by default. Recommended threshold is about 64 MB, because mmap/munmap is slow. #8520 (alexey-milovidov)
  • Reworked quotas as a part of access control system. Added new table system.quotas, new functions currentQuota, currentQuotaKey, new SQL syntax CREATE QUOTA, ALTER QUOTA, DROP QUOTA, SHOW QUOTA. #7257 (Vitaly Baranov)
  • Allow skipping unknown settings with warnings instead of throwing exceptions. #7653 (Vitaly Baranov)
  • Reworked row policies as a part of access control system. Added new table system.row_policies, new function currentRowPolicies(), new SQL syntax CREATE POLICY, ALTER POLICY, DROP POLICY, SHOW CREATE POLICY, SHOW POLICIES. #7808 (Vitaly Baranov)
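
The min_bytes_to_use_mmap_io entry above describes a size threshold: files at or above it are read through a memory mapping, smaller ones through ordinary reads. The Python sketch below illustrates only that threshold decision. The function `count_newlines` is hypothetical, treating 0 as "disabled" is an assumption, and the 64 MB default is borrowed from the recommended threshold rather than from the server's actual default.

```python
import mmap
import os


def count_newlines(path, min_bytes_to_use_mmap_io=64 * 1024 * 1024):
    """Count newline bytes, using mmap only for files at or above the threshold."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Assumption: a threshold of 0 disables memory-mapped reads.
        if 0 < min_bytes_to_use_mmap_io <= size:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                count = 0
                pos = mapped.find(b"\n")
                while pos != -1:
                    count += 1
                    pos = mapped.find(b"\n", pos + 1)
                return count
        # Small file (or mmap disabled): ordinary buffered read.
        return f.read().count(b"\n")
```

Both paths return the same result; the threshold only decides which I/O method is used, which reflects the trade-off named in the entry (mapping and unmapping is slow, so it pays off only for big files).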

Security Fix

  • Fixed the possibility of reading directories structure in tables with File table engine. This fixes #8536. #8537 (alexey-milovidov)