ClickHouse/CHANGELOG.md
2020-03-16 06:16:39 +03:00

ClickHouse release v20.3

ClickHouse release v20.3.2.1, 2020-03-12

Backward Incompatible Change

  • Fixed the "file name too long" issue when sending data for Distributed tables with a large number of replicas. Fixed the issue that replica credentials were exposed in the server log. The format of the directory name on disk was changed to [shard{shard_index}[_replica{replica_index}]]. #8911 (Mikhail Korotov) After you upgrade to the new version, you will not be able to downgrade without manual intervention, because the old server version does not recognize the new directory format. If you want to downgrade, you have to manually rename the corresponding directories to the old format. This change is relevant only if you have used asynchronous INSERTs to Distributed tables. In version 20.3.3 we will introduce a setting that will allow you to enable the new format gradually.
  • Changed the format of replication log entries for mutation commands. You have to wait for old mutations to process before installing the new version.
  • Implement simple memory profiler that dumps stacktraces to system.trace_log every N bytes over soft allocation limit #8765 (Ivan) #9472 (alexey-milovidov) The column of system.trace_log was renamed from timer_type to trace_type. This will require changes in third-party performance analysis and flamegraph processing tools.
  • Use OS thread id everywhere instead of internal thread number. This fixes #7477. Old clickhouse-client cannot receive logs that are sent from the server when the setting send_logs_level is enabled, because the names and types of the structured log messages were changed. On the other hand, different server versions can send logs with different types to each other. If you don't use the send_logs_level setting, this change does not affect you. #8954 (alexey-milovidov)
  • Remove indexHint function #9542 (alexey-milovidov)
  • Remove findClusterIndex, findClusterValue functions. This fixes #8641. If you were using these functions, send an email to clickhouse-feedback@yandex-team.com #9543 (alexey-milovidov)
  • Now it's not allowed to create columns or add columns with SELECT subquery as default expression. #9481 (alesapin)
  • Require aliases for subqueries in JOIN. #9274 (Artem Zuikov)
  • Improved ALTER MODIFY/ADD queries logic. Now you cannot ADD column without type, MODIFY default expression doesn't change type of column and MODIFY type doesn't lose default expression value. Fixes #8669. #9227 (alesapin)
  • Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
  • The setting experimental_use_processors is enabled by default. This setting enables usage of the new query pipeline. This is internal refactoring and we expect no visible changes. If you see any issues, set it back to zero. #8768 (alexey-milovidov)
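
The rename of the system.trace_log column from timer_type to trace_type mentioned above affects third-party tools that read this table. As a minimal sketch, assuming such a tool receives each row as a Python dict keyed by column name, it could accept both column names while servers of both versions are still in use. The helper name and the sample values are hypothetical and are not part of ClickHouse.

```python
def trace_kind(row):
    """Return the trace kind of a system.trace_log row.

    Assumption: `row` is a dict keyed by column name, as produced by some
    client library. Servers from 20.3 on expose `trace_type`; older servers
    expose the same information as `timer_type`.
    """
    if "trace_type" in row:
        return row["trace_type"]
    if "timer_type" in row:
        return row["timer_type"]
    raise KeyError("row has neither trace_type nor timer_type")


# Hypothetical rows; the values are only placeholders.
new_row = {"query_id": "q1", "trace_type": "Memory"}
old_row = {"query_id": "q2", "timer_type": "Real"}
print(trace_kind(new_row), trace_kind(old_row))
```

Checking the new name first keeps the tool working unchanged once all servers are upgraded.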

New Feature

  • Add Avro and AvroConfluent input/output formats #8571 (Andrew Onyshchuk) #8957 (Andrew Onyshchuk) #8717 (alexey-milovidov)
  • Multi-threaded and non-blocking updates of expired keys in cache dictionaries (with optional permission to read old ones). #8303 (Nikita Mikhaylov)
  • Add query ALTER ... MATERIALIZE TTL. It runs mutation that forces to remove expired data by TTL and recalculates meta-information about TTL in all parts. #8775 (Anton Popov)
  • Switch from HashJoin to MergeJoin (on disk) if needed #9082 (Artem Zuikov)
  • Added MOVE PARTITION command for ALTER TABLE #4729 #6168 (Guillaume Tassery)
  • Reloading storage configuration from configuration file on the fly. #8594 (Vladimir Chebotarev)
  • Allowed changing storage_policy to a not less rich one. #8107 (Vladimir Chebotarev)
  • Added support for globs/wildcards for S3 storage and table function. #8851 (Vladimir Chebotarev)
  • Implement bitAnd, bitOr, bitXor, bitNot for FixedString(N) datatype. #9091 (Guillaume Tassery)
  • Added function bitCount. This fixes #8702. #8708 (alexey-milovidov) #8749 (ikopylov)
  • Add generateRandom table function to generate random rows with given schema. Allows to populate arbitrary test table with data. #8994 (Ilya Yatsishin)
  • JSONEachRowFormat: support special case when objects enclosed in top-level array. #8860 (Kruglov Pavel)
  • Now it's possible to create a column with DEFAULT expression which depends on a column with default ALIAS expression. #9489 (alesapin)
  • Allow to specify --limit more than the source data size in clickhouse-obfuscator. The data will repeat itself with different random seed. #9155 (alexey-milovidov)
  • Added groupArraySample function (similar to groupArray) with reservoir sampling algorithm. #8286 (Amos Bird)
  • Now you can monitor the size of update queue in cache/complex_key_cache dictionaries via system metrics. #9413 (Nikita Mikhaylov)
  • Allow to use CRLF as a line separator in CSV output format when the setting output_format_csv_crlf_end_of_line is set to 1 #8934 #8935 #8963 (Mikhail Korotov)
  • Implement more functions of the H3 API: h3GetBaseCell, h3HexAreaM2, h3IndexesAreNeighbors, h3ToChildren, h3ToString and stringToH3 #8938 (Nico Mandery)
  • New setting introduced: max_parser_depth to control maximum stack size and allow large complex queries. This fixes #6681 and #7668. #8647 (Maxim Smirnov)
  • Add a setting force_optimize_skip_unused_shards to throw if skipping of unused shards is not possible #8805 (Azat Khuzhin)
  • Allow to configure multiple disks/volumes for storing data to be sent in Distributed engine #8756 (Azat Khuzhin)
  • Support storage policy (<tmp_policy>) for storing temporary data. #8750 (Azat Khuzhin)
  • Added X-ClickHouse-Exception-Code HTTP header that is set if exception was thrown before sending data. This implements #4971. #8786 (Mikhail Korotov)
  • Added function ifNotFinite. It is just a syntactic sugar: ifNotFinite(x, y) = isFinite(x) ? x : y. #8710 (alexey-milovidov)
  • Added last_successful_update_time column in system.dictionaries table #9394 (Nikita Mikhaylov)
  • Add blockSerializedSize function (size on disk without compression) #8952 (Azat Khuzhin)
  • Add function moduloOrZero #9358 (hcz)
  • Added system tables system.zeros and system.zeros_mt as well as table functions zeros() and zeros_mt(). Tables (and table functions) contain a single column with name zero and type UInt8. This column contains zeros. It is needed for test purposes as the fastest method to generate many rows. This fixes #6604 #9593 (Nikolai Kochetov)
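
Two of the functions listed above can be illustrated outside the server. The sketch below mirrors the stated identity ifNotFinite(x, y) = isFinite(x) ? x : y, and shows one textbook form of reservoir sampling (often called Algorithm R), the family of algorithms groupArraySample is described as using. It illustrates the idea only; the Python function names are hypothetical, and nothing here is claimed to match the ClickHouse implementation.

```python
import math
import random


def if_not_finite(x, y):
    """Return x when it is finite, otherwise the fallback y."""
    return x if math.isfinite(x) else y


def reservoir_sample(values, k, seed=None):
    """Keep a uniform random sample of at most k items from an iterable.

    Textbook reservoir sampling: the first k items fill the reservoir,
    then item number i (0-based) replaces a random slot with
    probability k / (i + 1).
    """
    rng = random.Random(seed)
    sample = []
    for i, value in enumerate(values):
        if i < k:
            sample.append(value)
        else:
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                sample[j] = value
    return sample


print(if_not_finite(float("inf"), 0.0))
print(reservoir_sample(range(1000), 5, seed=42))
```

The sample is built in a single pass without knowing the input length in advance, which is why this approach suits an aggregate function.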

Experimental Feature

  • Add new compact format of parts in MergeTree-family tables in which all columns are stored in one file. It helps to increase performance of small and frequent inserts. The old format (one file per column) is now called wide. Data storing format is controlled by settings min_bytes_for_wide_part and min_rows_for_wide_part. #8290 (Anton Popov)
  • Support for S3 storage for Log, TinyLog and StripeLog tables. #8862 (Pavel Kovalenko)

Bug Fix

  • Fixed inconsistent whitespaces in log messages. #9322 (alexey-milovidov)
  • Fix bug in which arrays of unnamed tuples were flattened as Nested structures on table creation. #8866 (achulkov2)
  • Fixed the issue when "Too many open files" error may happen if there are too many files matching glob pattern in File table or file table function. Now files are opened lazily. This fixes #8857 #8861 (alexey-milovidov)
  • DROP TEMPORARY TABLE now drops only temporary table. #8907 (Vitaly Baranov)
  • Remove outdated partition when we shutdown the server or DETACH/ATTACH a table. #8602 (Guillaume Tassery)
  • For the default disk, calculate the free space from the data subdirectory. Fixed the issue when the amount of free space is not calculated correctly if the data directory is mounted to a separate device (rare case). This fixes #7441 #9257 (Mikhail Korotov)
  • Allow comma (cross) join with IN () inside. #9251 (Artem Zuikov)
  • Allow to rewrite CROSS to INNER JOIN if there's [NOT] LIKE operator in WHERE section. #9229 (Artem Zuikov)
  • Fix possible incorrect result after GROUP BY with enabled setting distributed_aggregation_memory_efficient. Fixes #9134. #9289 (Nikolai Kochetov)
  • Found keys were counted as missed in metrics of cache dictionaries. #9411 (Nikita Mikhaylov)
  • Fix replication protocol incompatibility introduced in #8598. #9412 (alesapin)
  • Fixed race condition on queue_task_handle at the startup of ReplicatedMergeTree tables. #9552 (alexey-milovidov)
  • The token NOT didn't work in SHOW TABLES NOT LIKE query #8727 #8940 (alexey-milovidov)
  • Added range check to function h3EdgeLengthM. Without this check, buffer overflow is possible. #8945 (alexey-milovidov)
  • Fixed up a bug in batched calculations of ternary logical OPs on multiple arguments (more than 10). #8718 (Alexander Kazakov)
  • Fix error of PREWHERE optimization, which could lead to segfaults or Inconsistent number of columns got from MergeTreeRangeReader exception. #9024 (Anton Popov)
  • Fix unexpected Timeout exceeded while reading from socket exception, which randomly happens on secure connection before the timeout is actually exceeded and when query profiler is enabled. Also add connect_timeout_with_failover_secure_ms setting (default 100ms), which is similar to connect_timeout_with_failover_ms, but is used for secure connections (because SSL handshake is slower than ordinary TCP connection) #9026 (tavplubix)
  • Fix bug with mutations finalization, when mutation may hang in state with parts_to_do=0 and is_done=0. #9022 (alesapin)
  • Use new ANY JOIN logic with partial_merge_join setting. It's possible to make ANY|ALL|SEMI LEFT and ALL INNER joins with partial_merge_join=1 now. #8932 (Artem Zuikov)
  • Shard now clamps the settings received from the initiator to the shard's constraints instead of throwing an exception. This fix allows sending queries to a shard with different constraints. #9447 (Vitaly Baranov)
  • Fixed memory management problem in MergeTreeReadPool. #8791 (Vladimir Chebotarev)
  • Fix toDecimal*OrNull() functions family when called with string e. Fixes #8312 #8764 (Artem Zuikov)
  • Make sure that FORMAT Null sends no data to the client. #8767 (Alexander Kuzmenkov)
  • Fix bug that timestamp in LiveViewBlockInputStream was not updated. LIVE VIEW is an experimental feature. #8644 (vxider) #8625 (vxider)
  • Fixed ALTER MODIFY TTL wrong behavior which did not allow to delete old TTL expressions. #8422 (Vladimir Chebotarev)
  • Fixed UBSan report in MergeTreeIndexSet. This fixes #9250 #9365 (alexey-milovidov)
  • Fixed the behaviour of match and extract functions when haystack has zero bytes. The behaviour was wrong when haystack was constant. This fixes #9160 #9163 (alexey-milovidov) #9345 (alexey-milovidov)
  • Avoid throwing from destructor in Apache Avro 3rd-party library. #9066 (Andrew Onyshchuk)
  • Don't commit a batch polled from Kafka partially as it can lead to holes in data. #8876 (filimonov)
  • Fix joinGet with nullable return types. https://github.com/ClickHouse/ClickHouse/issues/8919 #9014 (Amos Bird)
  • Fix data incompatibility when compressed with T64 codec. #9016 (Artem Zuikov) Fix data type ids in T64 compression codec that leads to wrong (de)compression in affected versions. #9033 (Artem Zuikov)
  • Add setting enable_early_constant_folding and disable it in some cases that lead to errors. #9010 (Artem Zuikov)
  • Fix pushdown predicate optimizer with VIEW and enable the test #9011 (Winter Zhang)
  • Fix segfault in Merge tables, that can happen when reading from File storages #9387 (tavplubix)
  • Added a check for storage policy in ATTACH PARTITION FROM, REPLACE PARTITION, MOVE TO TABLE. Otherwise it could make data of part inaccessible after restart and prevent ClickHouse from starting. #9383 (Vladimir Chebotarev)
  • Fix alters if there is TTL set for table. #8800 (Anton Popov)
  • Fix race condition that can happen when SYSTEM RELOAD ALL DICTIONARIES is executed while some dictionary is being modified/added/removed. #8801 (Vitaly Baranov)
  • In previous versions the Memory database engine used an empty data path, so tables were created in the path directory (e.g. /var/lib/clickhouse/), not in the data directory of the database (e.g. /var/lib/clickhouse/db_name). #8753 (tavplubix)
  • Fixed wrong log messages about missing default disk or policy. #9530 (Vladimir Chebotarev)
  • Fix not(has()) for the bloom_filter index of array types. #9407 (achimbab)
  • Allow first column(s) in a table with Log engine be an alias #9231 (Ivan)
  • Fix order of ranges while reading from MergeTree table in one thread. It could lead to exceptions from MergeTreeRangeReader or wrong query results. #9050 (Anton Popov)
  • Make reinterpretAsFixedString to return FixedString instead of String. #9052 (Andrew Onyshchuk)
  • Avoid extremely rare cases when the user can get wrong error message (Success instead of detailed error description). #9457 (alexey-milovidov)
  • Do not crash when using Template format with empty row template. #8785 (Alexander Kuzmenkov)
  • Metadata files for system tables could be created in wrong place #8653 (tavplubix) Fixes #8581.
  • Fix data race on exception_ptr in cache dictionary #8303. #9379 (Nikita Mikhaylov)
  • Do not throw an exception for query ATTACH TABLE IF NOT EXISTS. Previously it was thrown if table already exists, despite the IF NOT EXISTS clause. #8967 (Anton Popov)
  • Fixed missing closing paren in exception message. #8811 (alexey-milovidov)
  • Avoid message Possible deadlock avoided at the startup of clickhouse-client in interactive mode. #9455 (alexey-milovidov)
  • Fixed the issue when padding at the end of base64 encoded value can be malformed. Update base64 library. This fixes #9491, closes #9492 #9500 (alexey-milovidov)
  • Prevent losing data in Kafka in rare cases when exception happens after reading suffix but before commit. Fixes #9378 #9507 (filimonov)
  • Fixed exception in DROP TABLE IF EXISTS #8663 (Nikita Vasilev)
  • Fix crash when a user tries to ALTER MODIFY SETTING for old-formatted MergeTree table engines family. #9435 (alesapin)
  • Support for UInt64 numbers that don't fit in Int64 in JSON-related functions. Update SIMDJSON to master. This fixes #9209 #9344 (alexey-milovidov)
  • Fixed execution of inversed predicates when non-strictly monotonic functional index is used. #9223 (Alexander Kazakov)
  • Don't try to fold IN constant in GROUP BY #8868 (Amos Bird)
  • Fix bug in ALTER DELETE mutations which leads to index corruption. This fixes #9019 and #8982. Additionally fix extremely rare race conditions in ReplicatedMergeTree ALTER queries. #9048 (alesapin)
  • When the setting compile_expressions is enabled, you can get unexpected column in LLVMExecutableFunction when we use Nullable type #8910 (Guillaume Tassery)
  • Multiple fixes for Kafka engine: 1) Fix duplicates that were appearing during consumer group rebalance. 2) Fix rare 'holes' that appeared when data were polled from several partitions with one poll and committed partially (now we always process / commit the whole polled block of messages). 3) Fix flushes by block size (before that, only flushing by timeout was working properly). 4) Better subscription procedure (with assignment feedback). 5) Make tests work faster (with default intervals and timeouts). Because data was not flushed by block size before (as it should be according to the documentation), that PR may lead to some performance degradation with default settings (due to more frequent and smaller flushes, which are less optimal). If you encounter a performance issue after that change, please increase kafka_max_block_size in the table to a bigger value (for example CREATE TABLE ...Engine=Kafka ... SETTINGS ... kafka_max_block_size=524288). Fixes #7259 #8917 (filimonov)
  • Fix Parameter out of bound exception in some queries after PREWHERE optimizations. #8914 (Baudouin Giard)
  • Fixed the case of mixed-constness of arguments of function arrayZip. #8705 (alexey-milovidov)
  • When executing CREATE query, fold constant expressions in storage engine arguments. Replace empty database name with current database. Fixes #6508, #3492 #9262 (tavplubix)
  • Now it's not possible to create or add columns with simple cyclic aliases like a DEFAULT b, b DEFAULT a. #9603 (alesapin)
  • Fixed a bug with double move which may corrupt original part. This is relevant if you use ALTER TABLE MOVE #8680 (Vladimir Chebotarev)
  • Allow interval identifier to correctly parse without backticks. Fixed issue when a query cannot be executed even if the interval identifier is enclosed in backticks or double quotes. This fixes #9124. #9142 (alexey-milovidov)
  • Fixed fuzz test and incorrect behaviour of bitTestAll/bitTestAny functions. #9143 (alexey-milovidov)
  • Fix possible crash/wrong number of rows in LIMIT n WITH TIES when there are a lot of rows equal to n'th row. #9464 (tavplubix)
  • Fix mutations with parts written with enabled insert_quorum. #9463 (alesapin)
  • Fix data race at destruction of Poco::HTTPServer. It could happen when server is started and immediately shut down. #9468 (Anton Popov)
  • Fix bug in which a misleading error message was shown when running SHOW CREATE TABLE a_table_that_does_not_exist. #8899 (achulkov2)
  • Fixed Parameters are out of bound exception in some rare cases when we have a constant in the SELECT clause when we have an ORDER BY and a LIMIT clause. #8892 (Guillaume Tassery)
  • Fix mutations finalization, when already done mutation can have status is_done=0. #9217 (alesapin)
  • Prevent from executing ALTER ADD INDEX for MergeTree tables with old syntax, because it doesn't work. #8822 (Mikhail Korotov)
  • During server startup do not access table, which LIVE VIEW depends on, so server will be able to start. Also remove LIVE VIEW dependencies when detaching LIVE VIEW. LIVE VIEW is an experimental feature. #8824 (tavplubix)
  • Fix possible segfault in MergeTreeRangeReader, while executing PREWHERE. #9106 (Anton Popov)
  • Fix possible mismatched checksums with column TTLs. #9451 (Anton Popov)
  • Fixed a bug when parts were not being moved in background by TTL rules in case when there is only one volume. #8672 (Vladimir Chebotarev)
  • Fixed the issue Method createColumn() is not implemented for data type Set. This fixes #7799. #8674 (alexey-milovidov)
  • Now we will try to finalize mutations more frequently. #9427 (alesapin)
  • Fix intDiv by minus one constant #9351 (hcz)
  • Fix possible race condition in BlockIO. #9356 (Nikolai Kochetov)
  • Fix bug leading to server termination when trying to use / drop Kafka table created with wrong parameters. #9513 (filimonov)
  • Added workaround if OS returns wrong result for timer_create function. #8837 (alexey-milovidov)
  • Fixed error in usage of min_marks_for_seek parameter. Fixed the error message when there is no sharding key in Distributed table and we try to skip unused shards. #8908 (Azat Khuzhin)
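
The change in how a shard handles settings sent by the initiator (clamping to the shard's constraints instead of throwing) can be pictured with a small sketch. It assumes a constraint is a simple optional minimum and maximum for a numeric setting; the function name and the example values are hypothetical and do not describe ClickHouse internals.

```python
def clamp_setting(value, minimum=None, maximum=None):
    """Bring a setting value into the [minimum, maximum] range.

    Assumption: a missing bound (None) means "no constraint on that side".
    Instead of rejecting an out-of-range value, the nearest allowed value
    is used.
    """
    if minimum is not None and value < minimum:
        return minimum
    if maximum is not None and value > maximum:
        return maximum
    return value


# The initiator asks for 16 threads, the shard allows at most 8.
print(clamp_setting(16, minimum=1, maximum=8))
```

With this behaviour a query sent from an initiator with looser limits still runs on the shard, only with the shard's own limits applied.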

Improvement

  • Implement ALTER MODIFY/DROP queries on top of mutations for ReplicatedMergeTree* engines family. Now ALTERs block only at the metadata update stage, and don't block after that. #8701 (alesapin)
  • Add ability to rewrite CROSS to INNER JOINs with WHERE section containing unqualified names. #9512 (Artem Zuikov)
  • Make SHOW TABLES and SHOW DATABASES queries support the WHERE expressions and FROM/IN #9076 (sundyli)
  • Added a setting deduplicate_blocks_in_dependent_materialized_views. #9070 (urykhy)
  • After recent changes MySQL client started to print binary strings in hex thereby making them not readable (#9032). The workaround in ClickHouse is to mark string columns as UTF-8, which is not always, but usually the case. #9079 (Yuriy Baranov)
  • Add support of String and FixedString keys for sumMap #8903 (Baudouin Giard)
  • Support string keys in SummingMergeTree maps #8933 (Baudouin Giard)
  • Signal termination of thread to the thread pool even if the thread has thrown exception #8736 (Ding Xiang Fei)
  • Allow to set query_id in clickhouse-benchmark #9416 (Anton Popov)
  • Don't allow strange expressions in ALTER TABLE ... PARTITION partition query. This addresses #7192 #8835 (alexey-milovidov)
  • The table system.table_engines now provides information about feature support (like supports_ttl or supports_sort_order). #8830 (Max Akhmedov)
  • Enable system.metric_log by default. It will contain rows with values of ProfileEvents, CurrentMetrics collected with "collect_interval_milliseconds" interval (one second by default). The table is very small (usually in order of megabytes) and collecting this data by default is reasonable. #9225 (alexey-milovidov)
  • Initialize query profiler for all threads in a group, e.g. it allows to fully profile insert-queries. Fixes #6964 #8874 (Ivan)
  • Now temporary LIVE VIEW is created by CREATE LIVE VIEW name WITH TIMEOUT [42] ... instead of CREATE TEMPORARY LIVE VIEW ..., because the previous syntax was not consistent with CREATE TEMPORARY TABLE ... #9131 (tavplubix)
  • Add text_log.level configuration parameter to limit entries that go to system.text_log table #8809 (Azat Khuzhin)
  • Allow to put downloaded part to disks/volumes according to TTL rules #8598 (Vladimir Chebotarev)
  • For external MySQL dictionaries, allow to mutualize MySQL connection pool to "share" them among dictionaries. This option significantly reduces the number of connections to MySQL servers. #9409 (Clément Rodriguez)
  • Show nearest query execution time for quantiles in clickhouse-benchmark output instead of interpolated values. It's better to show values that correspond to the execution time of some queries. #8712 (alexey-milovidov)
  • Possibility to add key & timestamp for the message when inserting data to Kafka. Fixes #7198 #8969 (filimonov)
  • If server is run from terminal, highlight thread number, query id and log priority by colors. This is for improved readability of correlated log messages for developers. #8961 (alexey-milovidov)
  • Better exception message while loading tables for Ordinary database. #9527 (alexey-milovidov)
  • Implement arraySlice for arrays with aggregate function states. This fixes #9388 #9391 (alexey-milovidov)
  • Allow constant functions and constant arrays to be used on the right side of IN operator. #8813 (Anton Popov)
  • If zookeeper exception has happened while fetching data for system.replicas, display it in a separate column. This implements #9137 #9138 (alexey-milovidov)
  • Atomically remove MergeTree data parts on destroy. #8402 (Vladimir Chebotarev)
  • Support row-level security for Distributed tables. #8926 (Ivan)
  • Now we recognize suffix (like KB, KiB...) in settings values. #8072 (Mikhail Korotov)
  • Prevent out of memory while constructing result of a large JOIN. #8637 (Artem Zuikov)
  • Added names of clusters to suggestions in interactive mode in clickhouse-client. #8709 (alexey-milovidov)
  • Initialize query profiler for all threads in a group, e.g. it allows to fully profile insert-queries #8820 (Ivan)
  • Added column exception_code in system.query_log table. #8770 (Mikhail Korotov)
  • Enabled MySQL compatibility server on port 9004 in the default server configuration file. Fixed password generation command in the example in configuration. #8771 (Yuriy Baranov)
  • Prevent abort on shutdown if the filesystem is readonly. This fixes #9094 #9100 (alexey-milovidov)
  • Better exception message when length is required in HTTP POST query. #9453 (alexey-milovidov)
  • Add _path and _file virtual columns to HDFS and File engines and hdfs and file table functions #8489 (Olga Khvostikova)
  • Fix error Cannot find column while inserting into MATERIALIZED VIEW in case if new column was added to view's internal table. #8766 #8788 (vzakaznikov) #8788 #8806 (Nikolai Kochetov) #8803 (Nikolai Kochetov)
  • Fix progress over native client-server protocol, by sending progress after final update (like logs). This may be relevant only to some third-party tools that are using native protocol. #9495 (Azat Khuzhin)
  • Add a system metric tracking the number of client connections using MySQL protocol (#9013). #9015 (Eugene Klimov)
  • From now on, HTTP responses will have X-ClickHouse-Timezone header set to the same timezone value that SELECT timezone() would report. #9493 (Denis Glazachev)
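
The clickhouse-benchmark change above (showing the nearest observed execution time for a quantile instead of an interpolated value) rests on the difference between two quantile definitions. The sketch below contrasts a nearest-rank quantile, which always returns one of the measured times, with a linearly interpolated one. Both definitions are common textbook choices and are assumptions here; the exact rule used by clickhouse-benchmark is not stated in this changelog.

```python
import math


def nearest_rank_quantile(times, q):
    """Return the measured value at the nearest rank for quantile q."""
    if not times:
        raise ValueError("no measurements")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be within [0, 1]")
    ordered = sorted(times)
    rank = max(1, math.ceil(q * len(ordered)))  # 1-based rank
    return ordered[rank - 1]


def interpolated_quantile(times, q):
    """Return a linearly interpolated quantile (may not be a measured value)."""
    if not times:
        raise ValueError("no measurements")
    ordered = sorted(times)
    position = q * (len(ordered) - 1)
    low = math.floor(position)
    high = math.ceil(position)
    fraction = position - low
    return ordered[low] + (ordered[high] - ordered[low]) * fraction


times = [0.1, 0.2, 0.4, 0.8]  # hypothetical query times in seconds
print(nearest_rank_quantile(times, 0.5))
print(interpolated_quantile(times, 0.5))
```

For these four measurements the nearest-rank median is one of the recorded times, while the interpolated median falls between two of them and corresponds to no actual query.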

Performance Improvement

  • Improve performance of analysing index with IN #9261 (Anton Popov)
  • Simpler and more efficient code in Logical Functions + code cleanups. A followup to #8718 #8728 (Alexander Kazakov)
  • Overall performance improvement (in range of 5%..200% for affected queries) by ensuring even more strict aliasing with C++20 features. #9304 (Amos Bird)
  • More strict aliasing for inner loops of comparison functions. #9327 (alexey-milovidov)
  • More strict aliasing for inner loops of arithmetic functions. #9325 (alexey-milovidov)
  • A ~3 times faster implementation for ColumnVector::replicate(), via which ColumnConst::convertToFullColumn() is implemented. Also will be useful in tests when materializing constants. #9293 (Alexander Kazakov)
  • Another minor performance improvement to ColumnVector::replicate() (this speeds up the materialize function and higher order functions), an even further improvement to #9293. #9442 (Alexander Kazakov)
  • Improved performance of stochasticLinearRegression aggregate function. This patch is contributed by Intel. #8652 (alexey-milovidov)
  • Improve performance of reinterpretAsFixedString function. #9342 (alexey-milovidov)
  • Do not send blocks to client for Null format in processors pipeline. #8797 (Nikolai Kochetov) #8767 (Alexander Kuzmenkov)

Build/Testing/Packaging Improvement

ClickHouse release v20.1

ClickHouse release v20.1.6.30, 2020-03-05

Bug Fix

  • Fix data incompatibility when compressed with T64 codec. #9039 (abyss7)
  • Fix order of ranges while reading from MergeTree table in one thread. Fixes #8964. #9050 (CurtizJ)
  • Fix possible segfault in MergeTreeRangeReader, while executing PREWHERE. Fixes #9064. #9106 (CurtizJ)
  • Fix reinterpretAsFixedString to return FixedString instead of String. #9052 (oandrew)
  • Fix joinGet with nullable return types. Fixes #8919 #9014 (amosbird)
  • Fix fuzz test and incorrect behaviour of bitTestAll/bitTestAny functions. #9143 (alexey-milovidov)
  • Fix the behaviour of match and extract functions when haystack has zero bytes. The behaviour was wrong when haystack was constant. Fixes #9160 #9163 (alexey-milovidov)
  • Fixed execution of inversed predicates when non-strictly monotonic functional index is used. Fixes #9034 #9223 (Akazz)
  • Allow to rewrite CROSS to INNER JOIN if there's [NOT] LIKE operator in WHERE section. Fixes #9191 #9229 (4ertus2)
  • Allow first column(s) in a table with Log engine be an alias. #9231 (abyss7)
  • Allow comma join with IN() inside. Fixes #7314. #9251 (4ertus2)
  • Improve ALTER MODIFY/ADD queries logic. Now you cannot ADD column without type, MODIFY default expression doesn't change type of column and MODIFY type doesn't lose default expression value. Fixes #8669. #9227 (alesapin)
  • Fix mutations finalization, when already done mutation can have status is_done=0. #9217 (alesapin)
  • Support "Processors" pipeline for system.numbers and system.numbers_mt. This also fixes the bug when max_execution_time is not respected. #7796 (KochetovNicolai)
  • Fix wrong counting of DictCacheKeysRequestedFound metric. #9411 (nikitamikhaylov)
  • Added a check for storage policy in ATTACH PARTITION FROM, REPLACE PARTITION, MOVE TO TABLE which otherwise could make data of part inaccessible after restart and prevent ClickHouse from starting. #9383 (excitoon)
  • Fixed UBSan report in MergeTreeIndexSet. This fixes #9250 #9365 (alexey-milovidov)
  • Fix possible datarace in BlockIO. #9356 (KochetovNicolai)
  • Support for UInt64 numbers that don't fit in Int64 in JSON-related functions. Update SIMDJSON to master. This fixes #9209 #9344 (alexey-milovidov)
  • Fix the issue when the amount of free space is not calculated correctly if the data directory is mounted to a separate device. For default disk calculate the free space from data subdirectory. This fixes #7441 #9257 (millb)
  • Fix the issue when TLS connections may fail with the message OpenSSL SSL_read: error:14094438:SSL routines:ssl3_read_bytes:tlsv1 alert internal error and SSL Exception: error:2400006E:random number generator::error retrieving entropy. Update OpenSSL to upstream master. #8956 (alexey-milovidov)
  • When executing CREATE query, fold constant expressions in storage engine arguments. Replace empty database name with current database. Fixes #6508, #3492. Also fix check for local address in ClickHouseDictionarySource. #9262 (tavplubix)
  • Fix segfault in StorageMerge, which can happen when reading from StorageFile. #9387 (tavplubix)
  • Prevent losing data in Kafka in rare cases when exception happens after reading suffix but before commit. Fixes #9378. Related: #7175 #9507 (filimonov)
  • Fix bug leading to server termination when trying to use / drop Kafka table created with wrong parameters. Fixes #9494. Incorporates #9507. #9513 (filimonov)

New Feature

  • Add deduplicate_blocks_in_dependent_materialized_views option to control the behaviour of idempotent inserts into tables with materialized views. This new feature was added to the bugfix release by a special request from Altinity. #9070 (urykhy)

ClickHouse release v20.1.2.4, 2020-01-22

Backward Incompatible Change

  • Make the setting merge_tree_uniform_read_distribution obsolete. The server still recognizes this setting but it has no effect. #8308 (alexey-milovidov)
  • Changed return type of the function greatCircleDistance to Float32 because now the result of calculation is Float32. #7993 (alexey-milovidov)
  • Now it's expected that query parameters are represented in "escaped" format. For example, to pass string a<tab>b you have to write a\tb or a\<tab>b and respectively, a%5Ctb or a%5C%09b in URL. This is needed to add the possibility to pass NULL as \N. This fixes #7488. #8517 (alexey-milovidov)
  • Enable use_minimalistic_part_header_in_zookeeper setting for ReplicatedMergeTree by default. This will significantly reduce amount of data stored in ZooKeeper. This setting is supported since version 19.1 and we already use it in production in multiple services without any issues for more than half a year. Disable this setting if you have a chance to downgrade to versions older than 19.1. #6850 (alexey-milovidov)
  • Data skipping indices are production ready and enabled by default. The settings allow_experimental_data_skipping_indices, allow_experimental_cross_to_join_conversion and allow_experimental_multiple_joins_emulation are now obsolete and do nothing. #7974 (alexey-milovidov)
  • Add new ANY JOIN logic for StorageJoin consistent with JOIN operation. To upgrade without changes in behaviour you need to add SETTINGS any_join_distinct_right_table_keys = 1 to Engine Join tables metadata or recreate these tables after upgrade. #8400 (Artem Zuikov)
  • Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
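
The "escaped" query parameter format described above can be sketched on the client side. The example reproduces the cases given in the changelog: a<tab>b is written as a\tb, which becomes a%5Ctb in a URL, and NULL is passed as \N. Which other characters need escaping is an assumption here (backslash and newline are handled by analogy), and the helper names are hypothetical.

```python
from urllib.parse import quote


def escape_param(value):
    """Render a query parameter value in the escaped text form.

    Assumptions: None stands for NULL and is written as \\N; backslash,
    tab and newline are the characters that get a backslash escape.
    """
    if value is None:
        return "\\N"
    # Escape the backslash first so later replacements are not doubled.
    return (
        value.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
    )


def url_param(value):
    """Percent-encode the escaped value for use in a URL query string."""
    return quote(escape_param(value), safe="")


print(escape_param("a\tb"))  # a, backslash, t, b
print(url_param("a\tb"))     # a%5Ctb
print(url_param(None))       # %5CN
```

The alternative spelling from the changelog, a backslash followed by a literal tab, percent-encodes to a%5C%09b, as quote("a\\\tb", safe="") shows.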

New Feature

  • Added information about part paths to system.merges. #8043 (Vladimir Chebotarev)
  • Add ability to execute SYSTEM RELOAD DICTIONARY query in ON CLUSTER mode. #8288 (Guillaume Tassery)
  • Add ability to execute CREATE DICTIONARY queries in ON CLUSTER mode. #8163 (alesapin)
  • Now user's profile in users.xml can inherit multiple profiles. #8343 (Mikhail f. Shiryaev)
  • Added system.stack_trace table that allows to look at stack traces of all server threads. This is useful for developers to introspect server state. This fixes #7576. #8344 (alexey-milovidov)
  • Add DateTime64 datatype with configurable sub-second precision. #7170 (Vasily Nemkov)
  • Add table function clusterAllReplicas which allows to query all the nodes in the cluster. #8493 (kiran sunkari)
  • Add aggregate function categoricalInformationValue which calculates the information value of a discrete feature. #8117 (hcz)
  • Speed up parsing of data files in CSV, TSV and JSONEachRow format by doing it in parallel. #7780 (Alexander Kuzmenkov)
  • Add function bankerRound which performs banker's rounding. #8112 (hcz)
  • Support more languages in embedded dictionary for region names: 'ru', 'en', 'ua', 'uk', 'by', 'kz', 'tr', 'de', 'uz', 'lv', 'lt', 'et', 'pt', 'he', 'vi'. #8189 (alexey-milovidov)
  • Improvements in consistency of ANY JOIN logic. Now t1 ANY LEFT JOIN t2 equals t2 ANY RIGHT JOIN t1. #7665 (Artem Zuikov)
  • Add setting any_join_distinct_right_table_keys which enables old behaviour for ANY INNER JOIN. #7665 (Artem Zuikov)
  • Add new SEMI and ANTI JOIN. Old ANY INNER JOIN behaviour now available as SEMI LEFT JOIN. #7665 (Artem Zuikov)
  • Added Distributed format for File engine and file table function which allows to read from .bin files generated by asynchronous inserts into Distributed table. #8535 (Nikolai Kochetov)
  • Add optional reset column argument for runningAccumulate which allows to reset aggregation results for each new key value. #8326 (Sergey Kononenko)
  • Add ability to use ClickHouse as Prometheus endpoint. #7900 (vdimir)
  • Add section <remote_url_allow_hosts> in config.xml which restricts allowed hosts for remote table engines and table functions URL, S3, HDFS. #7154 (Mikhail Korotov)
  • Added function greatCircleAngle which calculates the distance on a sphere in degrees. #8105 (alexey-milovidov)
  • Changed Earth radius to be consistent with H3 library. #8105 (alexey-milovidov)
  • Added JSONCompactEachRow and JSONCompactEachRowWithNamesAndTypes formats for input and output. #7841 (Mikhail Korotov)
  • Added feature for file-related table engines and table functions (File, S3, URL, HDFS) which allows to read and write gzip files based on additional engine parameter or file extension. #7840 (Andrey Bodrov)
  • Added the randomASCII(length) function, generating a string with a random set of ASCII printable characters. #8401 (BayoNet)
  • Added function JSONExtractArrayRaw which returns an array of unparsed JSON array elements from a JSON string. #8081 (Oleg Matrokhin)
  • Add arrayZip function which allows to combine multiple arrays of equal lengths into one array of tuples. #8149 (Winter Zhang)
  • Add ability to move data between disks according to configured TTL-expressions for *MergeTree table engines family. #8140 (Vladimir Chebotarev)
  • Added new aggregate function avgWeighted which allows to calculate weighted average. #7898 (Andrey Bodrov)
  • Now parallel parsing is enabled by default for TSV, TSKV, CSV and JSONEachRow formats. #7894 (Nikita Mikhaylov)
  • Add several geo functions from H3 library: h3GetResolution, h3EdgeAngle, h3EdgeLength, h3IsValid and h3kRing. #8034 (Konstantin Malanchev)
  • Added support for brotli (br) compression in file-related storages and table functions. This fixes #8156. #8526 (alexey-milovidov)
  • Add groupBit* functions for the SimpleAggregationFunction type. #8485 (Guillaume Tassery)
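
Several of the functions listed above have simple reference semantics. The following Python sketch shows what a weighted average, zipping of equal-length arrays, and banker's rounding compute. The Python names are hypothetical stand-ins, and edge-case behaviour (empty input, type promotion) in ClickHouse itself may differ.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def avg_weighted(values, weights):
    """Weighted average: sum(v * w) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def array_zip(*arrays):
    """Combine equal-length arrays into one list of tuples."""
    if len({len(a) for a in arrays}) > 1:
        raise ValueError("all arrays must have the same length")
    return list(zip(*arrays))


def round_bankers(x):
    """Round to the nearest integer, with ties going to the even neighbour."""
    return int(Decimal(str(x)).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


print(avg_weighted([1, 2, 3], [1, 1, 2]))
print(array_zip([1, 2], ["a", "b"]))
print(round_bankers(2.5), round_bankers(3.5))
```

Banker's rounding differs from the usual "round half up" only on exact ties, which is what makes it unbiased over many values.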

Bug Fix

  • Fix rename of tables with Distributed engine. Fixes issue #7868. #8306 (tavplubix)
  • Now dictionaries support EXPRESSION for attributes in arbitrary string in non-ClickHouse SQL dialect. #8098 (alesapin)
  • Fix broken INSERT SELECT FROM mysql(...) query. This fixes #8070 and #7960. #8234 (tavplubix)
  • Fix error "Mismatch column sizes" when inserting default Tuple from JSONEachRow. This fixes #5653. #8606 (tavplubix)
  • Now an exception will be thrown in case of using WITH TIES alongside LIMIT BY. Also add ability to use TOP with LIMIT BY. This fixes #7472. #7637 (Nikita Mikhaylov)
  • Fix unintended dependency on a fresh glibc version in clickhouse-odbc-bridge binary. #8046 (Amos Bird)
  • Fix bug in check function of *MergeTree engines family. Now it doesn't fail in case when we have equal amount of rows in last granule and last mark (non-final). #8047 (alesapin)
  • Fix insert into Enum* columns after ALTER query, when underlying numeric type is equal to table specified type. This fixes #7836. #7908 (Anton Popov)
  • Allowed non-constant negative "size" argument for function substring. It was not allowed by mistake. This fixes #4832. #7703 (alexey-milovidov)
  • Fix parsing bug when wrong number of arguments passed to (O|J)DBC table engine. #7709 (alesapin)
  • Using command name of the running clickhouse process when sending logs to syslog. In previous versions, empty string was used instead of command name. #8460 (Michael Nacharov)
  • Fix check of allowed hosts for localhost. This PR fixes the solution provided in #8241. #8342 (Vitaly Baranov)
  • Fix rare crash in argMin and argMax functions for long string arguments, when result is used in runningAccumulate function. This fixes #8325 #8341 (dinosaur)
  • Fix memory overcommit for tables with Buffer engine. #8345 (Azat Khuzhin)
  • Fixed potential bug in functions that can take NULL as one of the arguments and return non-NULL. #8196 (alexey-milovidov)
  • Better metrics calculations in thread pool for background processes for MergeTree table engines. #8194 (Vladimir Chebotarev)
  • Fix function IN inside WHERE statement when row-level table filter is present. Fixes #6687 #8357 (Ivan)
  • Now an exception is thrown if the integral value is not parsed completely for settings values. #7678 (Mikhail Korotov)
  • Fix exception when aggregate function is used in query to distributed table with more than two local shards. #8164 (小路)
  • Now bloom filter can handle zero length arrays and doesn't perform redundant calculations. #8242 (achimbab)
  • Fixed checking if a client host is allowed by matching the client host to host_regexp specified in users.xml. #8241 (Vitaly Baranov)
  • Relax ambiguous column check that leads to false positives in multiple JOIN ON section. #8385 (Artem Zuikov)
  • Fixed possible server crash (std::terminate) when the server cannot send or write data in JSON or XML format with values of String data type (that require UTF-8 validation) or when compressing result data with Brotli algorithm or in some other rare cases. This fixes #7603 #8384 (alexey-milovidov)
  • Fix race condition in StorageDistributedDirectoryMonitor found by CI. This fixes #8364. #8383 (Nikolai Kochetov)
  • Now background merges in *MergeTree table engines family preserve storage policy volume order more accurately. #8549 (Vladimir Chebotarev)
  • Now table engine Kafka works properly with Native format. This fixes #6731 #7337 #8003. #8016 (filimonov)
  • Fixed formats with headers (like CSVWithNames) which were throwing exception about EOF for table engine Kafka. #8016 (filimonov)
  • Fixed a bug with making set from subquery in right part of IN section. This fixes #5767 and #2542. #7755 (Nikita Mikhaylov)
  • Fix possible crash while reading from storage File. #7756 (Nikolai Kochetov)
  • Fixed reading of the files in Parquet format containing columns of type list. #8334 (maxulan)
  • Fix error Not found column for distributed queries with PREWHERE condition dependent on sampling key if max_parallel_replicas > 1. #7913 (Nikolai Kochetov)
  • Fix error Not found column if query used PREWHERE dependent on table's alias and the result set was empty because of primary key condition. #7911 (Nikolai Kochetov)
  • Fixed return type for functions rand and randConstant in case of Nullable argument. Now functions always return UInt32 and never Nullable(UInt32). #8204 (Nikolai Kochetov)
  • Disabled predicate push-down for WITH FILL expression. This fixes #7784. #7789 (Winter Zhang)
  • Fixed incorrect count() result for SummingMergeTree when FINAL section is used. #3280 #7786 (Nikita Mikhaylov)
  • Fix possible incorrect result for constant functions from remote servers. It happened for queries with functions like version(), uptime(), etc. which return different constant values for different servers. This fixes #7666. #7689 (Nikolai Kochetov)
  • Fix complicated bug in push-down predicate optimization which leads to wrong results. This fixes a lot of issues on push-down predicate optimization. #8503 (Winter Zhang)
  • Fix crash in CREATE TABLE .. AS dictionary query. #8508 (Azat Khuzhin)
  • Several improvements to the ClickHouse grammar in the .g4 file. #8294 (taiyang-li)
  • Fix bug that leads to crashes in JOINs with tables with engine Join. This fixes #7556 #8254 #7915 #8100. #8298 (Artem Zuikov)
  • Fix redundant dictionaries reload on CREATE DATABASE. #7916 (Azat Khuzhin)
  • Limit maximum number of streams for read from StorageFile and StorageHDFS. Fixes https://github.com/ClickHouse/ClickHouse/issues/7650. #7981 (alesapin)
  • Fix bug in ALTER ... MODIFY ... CODEC query when the user specifies both a default expression and a codec. Fixes 8593. #8614 (alesapin)
  • Fix error in background merge of columns with SimpleAggregateFunction(LowCardinality) type. #8613 (Nikolai Kochetov)
  • Fixed type check in function toDateTime64. #8375 (Vasily Nemkov)
  • Now the server does not crash on LEFT or FULL JOIN with the Join engine and unsupported join_use_nulls settings. #8479 (Artem Zuikov)
  • Now DROP DICTIONARY IF EXISTS db.dict query doesn't throw exception if db doesn't exist. #8185 (Vitaly Baranov)
  • Fix possible crashes in table functions (file, mysql, remote) caused by usage of reference to removed IStorage object. Fix incorrect parsing of columns specified at insertion into table function. #7762 (tavplubix)
  • Ensure the network is up before starting clickhouse-server. This fixes #7507. #8570 (Zhichang Yu)
  • Fix timeouts handling for secure connections, so queries don't hang indefinitely. This fixes #8126. #8128 (alexey-milovidov)
  • Fix clickhouse-copier's redundant contention between concurrent workers. #7816 (Ding Xiang Fei)
  • Now mutations don't skip attached parts, even if their mutation version was larger than the current mutation version. #7812 (Zhichang Yu) #8250 (alesapin)
  • Ignore redundant copies of *MergeTree data parts after move to another disk and server restart. #7810 (Vladimir Chebotarev)
  • Fix crash in FULL JOIN with LowCardinality in JOIN key. #8252 (Artem Zuikov)
  • Forbidden to use column name more than once in insert query like INSERT INTO tbl (x, y, x). This fixes #5465, #7681. #7685 (alesapin)
  • Added fallback for detecting the number of physical CPU cores for unknown CPUs (using the number of logical CPU cores). This fixes #5239. #7726 (alexey-milovidov)
  • Fix There's no column error for materialized and alias columns. #8210 (Artem Zuikov)
  • Fixed server crash when EXISTS query was used without TABLE or DICTIONARY qualifier, as in EXISTS t. This fixes #8172. This bug was introduced in version 19.17. #8213 (alexey-milovidov)
  • Fix rare bug with error "Sizes of columns doesn't match" that might appear when using SimpleAggregateFunction column. #7790 (Boris Granveaud)
  • Fix bug where user with empty allow_databases got access to all databases (and same for allow_dictionaries). #7793 (DeifyTheGod)
  • Fix client crash when server already disconnected from client. #8071 (Azat Khuzhin)
  • Fix ORDER BY behaviour in case of sorting by primary key prefix and non primary key suffix. #7759 (Anton Popov)
  • Check if qualified column present in the table. This fixes #6836. #7758 (Artem Zuikov)
  • Fixed the behavior where ALTER MOVE, run immediately after a merge finished, moved the superpart of the specified part. Fixes #8103. #8104 (Vladimir Chebotarev)
  • Fix possible server crash while using UNION with different number of columns. Fixes #7279. #7929 (Nikolai Kochetov)
  • Fix size of result substring for function substr with negative size. #8589 (Nikolai Kochetov)
  • Now server does not execute part mutation in MergeTree if there are not enough free threads in background pool. #8588 (tavplubix)
  • Fix a minor typo on formatting UNION ALL AST. #7999 (litao91)
  • Fixed incorrect bloom filter results for negative numbers. This fixes #8317. #8566 (Winter Zhang)
  • Fixed potential buffer overflow in decompress. Malicious user can pass fabricated compressed data that will cause read after buffer. This issue was found by Eldar Zaitov from Yandex information security team. #8404 (alexey-milovidov)
  • Fix incorrect result because of integers overflow in arrayIntersect. #7777 (Nikolai Kochetov)
  • Now OPTIMIZE TABLE query will not wait for offline replicas to perform the operation. #8314 (javi santana)
  • Fixed ALTER TTL parser for Replicated*MergeTree tables. #8318 (Vladimir Chebotarev)
  • Fix communication between server and client, so that the server reads temporary tables info after a query failure. #8084 (Azat Khuzhin)
  • Fix bitmapAnd function error when intersecting an aggregated bitmap and a scalar bitmap. #8082 (Yue Huang)
  • Refine the definition of ZXid according to the ZooKeeper Programmer's Guide which fixes bug in clickhouse-cluster-copier. #8088 (Ding Xiang Fei)
  • odbc table function now respects external_table_functions_use_nulls setting. #7506 (Vasily Nemkov)
  • Fixed bug that led to a rare data race. #8143 (Alexander Kazakov)
  • Now SYSTEM RELOAD DICTIONARY reloads a dictionary completely, ignoring update_field. This fixes #7440. #8037 (Vitaly Baranov)
  • Add ability to check if dictionary exists in create query. #8032 (alesapin)
  • Fix Float* parsing in Values format. This fixes #7817. #7870 (tavplubix)
  • Fix crash when we cannot reserve space in some background operations of *MergeTree table engines family. #7873 (Vladimir Chebotarev)
  • Fix crash of merge operation when table contains SimpleAggregateFunction(LowCardinality) column. This fixes #8515. #8522 (Azat Khuzhin)
  • Restore support of all ICU locales and add the ability to apply collations for constant expressions. Also add language name to system.collations table. #8051 (alesapin)
  • Fix bug when external dictionaries with zero minimal lifetime (LIFETIME(MIN 0 MAX N), LIFETIME(N)) don't update in background. #7983 (alesapin)
  • Fix crash when external dictionary with ClickHouse source has subquery in query. #8351 (Nikolai Kochetov)
  • Fix incorrect parsing of file extension in table with engine URL. This fixes #8157. #8419 (Andrey Bodrov)
  • Fix CHECK TABLE query for *MergeTree tables without key. Fixes #7543. #7979 (alesapin)
  • Fixed conversion of Float64 to MySQL type. #8079 (Yuriy Baranov)
  • Now if table was not completely dropped because of server crash, server will try to restore and load it. #8176 (tavplubix)
  • Fixed crash in table function file while inserting into file that doesn't exist. Now in this case file would be created and then insert would be processed. #8177 (Olga Khvostikova)
  • Fix rare deadlock which can happen when trace_log is enabled. #7838 (filimonov)
  • Add ability to work with different types besides Date in RangeHashed external dictionary created from DDL query. Fixes 7899. #8275 (alesapin)
  • Fixes crash when now64() is called with result of another function. #8270 (Vasily Nemkov)
  • Fixed bug with detecting client IP for connections through mysql wire protocol. #7743 (Dmitry Muzyka)
  • Fix empty array handling in arraySplit function. This fixes #7708. #7747 (hcz)
  • Fixed the issue when pid-file of another running clickhouse-server may be deleted. #8487 (Weiqing Xu)
  • Fix reload of a dictionary that has invalidate_query, where updates stopped after an exception on a previous update attempt. #8029 (alesapin)
  • Fixed error in function arrayReduce that may lead to "double free" and error in aggregate function combinator Resample that may lead to memory leak. Added aggregate function aggThrow. This function can be used for testing purposes. #8446 (alexey-milovidov)

Improvement

  • Improved logging when working with S3 table engine. #8251 (Grigory Pervakov)
  • Print a help message when clickhouse-local is called with no arguments. This fixes #5335. #8230 (Andrey Nagorny)
  • Add setting mutations_sync which allows waiting for ALTER UPDATE/DELETE queries synchronously. #8237 (alesapin)
  • Allow to set up relative user_files_path in config.xml (in the way similar to format_schema_path). #7632 (hcz)
  • Add exception for illegal types for conversion functions with -OrZero postfix. #7880 (Andrey Konyaev)
  • Simplify format of the header of data sending to a shard in a distributed query. #8044 (Vitaly Baranov)
  • Live View table engine refactoring. #8519 (vzakaznikov)
  • Add additional checks for external dictionaries created from DDL-queries. #8127 (alesapin)
  • Fix error Column ... already exists while using FINAL and SAMPLE together, e.g. select count() from table final sample 1/2. Fixes #5186. #7907 (Nikolai Kochetov)
  • Now the first argument of the joinGet function can be a table identifier. #7707 (Amos Bird)
  • Allow using MaterializedView with subqueries above Kafka tables. #8197 (filimonov)
  • Now background moves between disks run in a separate thread pool. #7670 (Vladimir Chebotarev)
  • SYSTEM RELOAD DICTIONARY now executes synchronously. #8240 (Vitaly Baranov)
  • Stack traces now display physical addresses (offsets in object file) instead of virtual memory addresses (where the object file was loaded). That allows the use of addr2line when binary is position independent and ASLR is active. This fixes #8360. #8387 (alexey-milovidov)
  • Support new syntax for row-level security filters: <table name='table_name'>…</table>. Fixes #5779. #8381 (Ivan)
  • Now cityHash function can work with Decimal and UUID types. Fixes #5184. #7693 (Mikhail Korotov)
  • Removed fixed index granularity (it was 1024) from system logs because it's obsolete after implementation of adaptive granularity. #7698 (alexey-milovidov)
  • Enabled MySQL compatibility server when ClickHouse is compiled without SSL. #7852 (Yuriy Baranov)
  • Now server checksums distributed batches, which gives more verbose errors in case of corrupted data in batch. #7914 (Azat Khuzhin)
  • Support DROP DATABASE, DETACH TABLE, DROP TABLE and ATTACH TABLE for MySQL database engine. #8202 (Winter Zhang)
  • Add authentication in S3 table function and table engine. #7623 (Vladimir Chebotarev)
  • Added a check for extra MergeTree parts on different disks, so that data parts on undefined disks are not missed. #8118 (Vladimir Chebotarev)
  • Enable SSL support for Mac client and server. #8297 (Ivan)
  • Now ClickHouse can work as MySQL federated server (see https://dev.mysql.com/doc/refman/5.7/en/federated-create-server.html). #7717 (Maxim Fedotov)
  • clickhouse-client now only enables bracketed-paste when multiquery is on and multiline is off. This fixes #7757. #7761 (Amos Bird)
  • Support Array(Decimal) in if function. #7721 (Artem Zuikov)
  • Support Decimals in arrayDifference, arrayCumSum and arrayCumSumNegative functions. #7724 (Artem Zuikov)
  • Added lifetime column to system.dictionaries table. #6820 #7727 (kekekekule)
  • Improved check for existing parts on different disks for *MergeTree table engines. Addresses #7660. #8440 (Vladimir Chebotarev)
  • Integration with AWS SDK for S3 interactions which allows to use all S3 features out of the box. #8011 (Pavel Kovalenko)
  • Added support for subqueries in Live View tables. #7792 (vzakaznikov)
  • Check for using Date or DateTime column from TTL expressions was removed. #7920 (Vladimir Chebotarev)
  • Information about disk was added to system.detached_parts table. #7833 (Vladimir Chebotarev)
  • Now settings max_(table|partition)_size_to_drop can be changed without a restart. #7779 (Grigory Pervakov)
  • Slightly better usability of error messages. Ask user not to remove the lines below Stack trace:. #7897 (alexey-milovidov)
  • Better reading messages from Kafka engine in various formats after #7935. #8035 (Ivan)
  • Better compatibility with MySQL clients which don't support sha2_password auth plugin. #8036 (Yuriy Baranov)
  • Support more column types in MySQL compatibility server. #7975 (Yuriy Baranov)
  • Implement ORDER BY optimization for Merge, Buffer and Materialized View storages with underlying MergeTree tables. #8130 (Anton Popov)
  • Now we always use POSIX implementation of getrandom to have better compatibility with old kernels (< 3.17). #7940 (Amos Bird)
  • Better check for valid destination in a move TTL rule. #8410 (Vladimir Chebotarev)
  • Better checks for broken insert batches for Distributed table engine. #7933 (Azat Khuzhin)
  • Add a column to the system.mutations table with an array of names of the parts that mutations must process in the future. #8179 (alesapin)
  • Parallel merge sort optimization for processors. #8552 (Nikolai Kochetov)
  • The setting mark_cache_min_lifetime is now obsolete and does nothing. In previous versions, the mark cache could grow in memory larger than mark_cache_size to accommodate data within mark_cache_min_lifetime seconds. That led to confusion and higher memory usage than expected, which is especially bad on memory-constrained systems. If you see performance degradation after installing this release, you should increase mark_cache_size. #8484 (alexey-milovidov)
  • Preparation to use tid everywhere. This is needed for #7477. #8276 (alexey-milovidov)
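
The stack trace change above means a printed address is an offset inside the object file rather than a runtime virtual address. A minimal sketch of that relationship follows. The function name and the addresses are hypothetical, and the sketch ignores segment layout details that a real symbolizer has to account for.

```python
def to_file_offset(virtual_address, load_base):
    """Convert a runtime virtual address into an offset inside the object file."""
    if virtual_address < load_base:
        raise ValueError("address is below the object's load base")
    return virtual_address - load_base


# Hypothetical values: with ASLR the load base differs between runs,
# but the offset of a given instruction stays the same.
run1 = to_file_offset(0x7F3A1C2045D0, 0x7F3A1C000000)
run2 = to_file_offset(0x7F91E82045D0, 0x7F91E8000000)
print(hex(run1), hex(run2))
```

Because the offset is stable across runs, it can be passed to a tool such as addr2line without knowing where the binary was loaded.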

Performance Improvement

  • Performance optimizations in processors pipeline. #7988 (Nikolai Kochetov)
  • Non-blocking updates of expired keys in cache dictionaries (with permission to read old ones). #8303 (Nikita Mikhaylov)
  • Compile ClickHouse without -fno-omit-frame-pointer globally to spare one more register. #8097 (Amos Bird)
  • Speedup greatCircleDistance function and add performance tests for it. #7307 (Olga Khvostikova)
  • Improved performance of function roundDown. #8465 (alexey-milovidov)
  • Improved performance of max, min, argMin, argMax for DateTime64 data type. #8199 (Vasily Nemkov)
  • Improved performance of sorting without a limit or with big limit and external sorting. #8545 (alexey-milovidov)
  • Improved performance of formatting floating point numbers up to 6 times. #8542 (alexey-milovidov)
  • Improved performance of modulo function. #7750 (Amos Bird)
  • Optimized ORDER BY and merging with single column key. #8335 (alexey-milovidov)
  • Better implementation for arrayReduce, -Array and -State combinators. #7710 (Amos Bird)
  • Now PREWHERE should be optimized to be at least as efficient as WHERE. #7769 (Amos Bird)
  • Improve the way round and roundBankers handle negative numbers. #8229 (hcz)
  • Improved decoding performance of DoubleDelta and Gorilla codecs by roughly 30-40%. This fixes #7082. #8019 (Vasily Nemkov)
  • Improved performance of base64 related functions. #8444 (alexey-milovidov)
  • Added a function geoDistance. It is similar to greatCircleDistance but uses an approximation to the WGS-84 ellipsoid model. The performance of both functions is nearly the same. #8086 (alexey-milovidov)
  • Faster min and max aggregation functions for Decimal data type. #8144 (Artem Zuikov)
  • Vectorize processing arrayReduce. #7608 (Amos Bird)
  • if chains are now optimized as multiIf. #8355 (kamalov-ruslan)
  • Fix performance regression of Kafka table engine introduced in 19.15. This fixes #7261. #7935 (filimonov)
  • Removed "pie" code generation that gcc from Debian packages occasionally brings by default. #8483 (alexey-milovidov)
  • Parallel parsing data formats #6553 (Nikita Mikhaylov)
  • Enable optimized parser of Values with expressions by default (input_format_values_deduce_templates_of_expressions=1). #8231 (tavplubix)
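
The rewrite of if chains into multiIf mentioned above can be sketched on a toy expression tree. Here expressions are modeled as Python tuples, which is purely an assumption for illustration; the actual optimizer works on the query AST, and the function name `flatten_if` is hypothetical.

```python
def flatten_if(expr):
    """Rewrite nested if(c, a, if(...)) tuples into a single multiIf tuple.

    Expressions are modeled as ("if", cond, then, else); anything else is an
    opaque leaf. Only chains nested in the else branch are flattened.
    """
    if not (isinstance(expr, tuple) and expr and expr[0] == "if"):
        return expr
    args = []
    node = expr
    # Walk down the else branches, collecting (condition, result) pairs.
    while isinstance(node, tuple) and node and node[0] == "if":
        _, cond, then, otherwise = node
        args += [cond, then]
        node = otherwise
    args.append(node)  # the final else value
    if len(args) == 3:
        return expr    # a single if: nothing to flatten
    return ("multiIf", *args)


chain = ("if", "c1", "a", ("if", "c2", "b", "c"))
print(flatten_if(chain))
```

The flattened form evaluates the same conditions in the same order, so the result is unchanged while the nesting depth drops to one call.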

Build/Testing/Packaging Improvement

Experimental Feature

  • Added experimental setting min_bytes_to_use_mmap_io. It allows to read big files without copying data from kernel to userspace. The setting is disabled by default. Recommended threshold is about 64 MB, because mmap/munmap is slow. #8520 (alexey-milovidov)
  • Reworked quotas as a part of access control system. Added new table system.quotas, new functions currentQuota, currentQuotaKey, new SQL syntax CREATE QUOTA, ALTER QUOTA, DROP QUOTA, SHOW QUOTA. #7257 (Vitaly Baranov)
  • Allow skipping unknown settings with warnings instead of throwing exceptions. #7653 (Vitaly Baranov)
  • Reworked row policies as a part of access control system. Added new table system.row_policies, new function currentRowPolicies(), new SQL syntax CREATE POLICY, ALTER POLICY, DROP POLICY, SHOW CREATE POLICY, SHOW POLICIES. #7808 (Vitaly Baranov)
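
The idea behind min_bytes_to_use_mmap_io, choosing memory-mapped reads only for files above a size threshold, can be sketched in Python. This is an illustration of the threshold logic under stated assumptions, not ClickHouse's implementation: the function name is hypothetical, and treating 0 as "disabled" is an assumption.

```python
import mmap
import os
import tempfile


def read_file(path, min_bytes_to_use_mmap=0):
    """Return (data, used_mmap); mmap is used only at or above the threshold.

    A threshold of 0 disables mmap (assumed convention for this sketch).
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if min_bytes_to_use_mmap and size >= min_bytes_to_use_mmap:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                # bytes(m) copies the mapping so the data outlives the map;
                # a real reader would consume the mapping in place instead.
                return bytes(m), True
        return f.read(), False


# Demo on a small temporary file with a deliberately tiny threshold.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1024)
try:
    data_plain, used_plain = read_file(tmp.name)
    data_mmap, used_mmap = read_file(tmp.name, min_bytes_to_use_mmap=512)
finally:
    os.remove(tmp.name)
print(used_plain, used_mmap)
```

In practice the threshold would be far larger (the entry above recommends about 64 MB), since setting up and tearing down a mapping is slow relative to reading a small file.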

Security Fix

  • Fixed the possibility of reading directories structure in tables with File table engine. This fixes #8536. #8537 (alexey-milovidov)

ClickHouse release v19.17

ClickHouse release v19.17.6.36, 2019-12-27

Bug Fix

  • Fixed potential buffer overflow in decompress. Malicious user can pass fabricated compressed data that could cause read after buffer. This issue was found by Eldar Zaitov from Yandex information security team. #8404 (alexey-milovidov)
  • Fixed possible server crash (std::terminate) when the server cannot send or write data in JSON or XML format with values of String data type (that require UTF-8 validation) or when compressing result data with Brotli algorithm or in some other rare cases. #8384 (alexey-milovidov)
  • Fixed dictionaries with source from a clickhouse VIEW, now reading such dictionaries doesn't cause the error There is no query. #8351 (Nikolai Kochetov)
  • Fixed checking if a client host is allowed by host_regexp specified in users.xml. #8241, #8342 (Vitaly Baranov)
  • RENAME TABLE for a distributed table now renames the folder containing inserted data before sending to shards. This fixes an issue with successive renames tableA->tableB, tableC->tableA. #8306 (tavplubix)
  • range_hashed external dictionaries created by DDL queries now allow ranges of arbitrary numeric types. #8275 (alesapin)
  • Fixed INSERT INTO table SELECT ... FROM mysql(...) table function. #8234 (tavplubix)
  • Fixed segfault in INSERT INTO TABLE FUNCTION file() while inserting into a file which doesn't exist. Now in this case file would be created and then insert would be processed. #8177 (Olga Khvostikova)
  • Fixed bitmapAnd error when intersecting an aggregated bitmap and a scalar bitmap. #8082 (Yue Huang)
  • Fixed segfault when EXISTS query was used without TABLE or DICTIONARY qualifier, just like EXISTS t. #8213 (alexey-milovidov)
  • Fixed return type for functions rand and randConstant in case of nullable argument. Now functions always return UInt32 and never Nullable(UInt32). #8204 (Nikolai Kochetov)
  • Fixed DROP DICTIONARY IF EXISTS db.dict, now it doesn't throw exception if db doesn't exist. #8185 (Vitaly Baranov)
  • If a table wasn't completely dropped because of server crash, the server will try to restore and load it #8176 (tavplubix)
  • Fixed a trivial count query for a distributed table when there are more than two local shards. #8164 (小路)
  • Fixed bug that led to a data race in DB::BlockStreamProfileInfo::calculateRowsBeforeLimit() #8143 (Alexander Kazakov)
  • Fixed ALTER table MOVE part executed immediately after merging the specified part, which could cause moving a part which the specified part merged into. Now it correctly moves the specified part. #8104 (Vladimir Chebotarev)
  • Expressions for dictionaries can be specified as strings now. This is useful for calculation of attributes while extracting data from non-ClickHouse sources because it allows to use non-ClickHouse syntax for those expressions. #8098 (alesapin)
  • Fixed a very rare race in clickhouse-copier because of an overflow in ZXid. #8088 (Ding Xiang Fei)
  • Fixed the bug when after the query failed (due to "Too many simultaneous queries" for example) it would not read external tables info, and the next request would interpret this info as the beginning of the next query causing an error like Unknown packet from client. #8084 (Azat Khuzhin)
  • Avoid null dereference after "Unknown packet X from server" #8071 (Azat Khuzhin)
  • Restore support of all ICU locales, add the ability to apply collations for constant expressions and add language name to system.collations table. #8051 (alesapin)
  • Number of streams for read from StorageFile and StorageHDFS is now limited, to avoid exceeding the memory limit. #7981 (alesapin)
  • Fixed CHECK TABLE query for *MergeTree tables without key. #7979 (alesapin)
  • Removed the mutation number from a part name in case there were no mutations. This removal improved compatibility with older versions. #8250 (alesapin)
  • Fixed the bug where mutations were skipped for some attached parts because their data_version was larger than the table mutation version. #7812 (Zhichang Yu)
  • Allow starting the server with redundant copies of parts after moving them to another device. #7810 (Vladimir Chebotarev)
  • Fixed the error "Sizes of columns doesn't match" that might appear when using aggregate function columns. #7790 (Boris Granveaud)
  • Now an exception will be thrown in case of using WITH TIES alongside LIMIT BY. And now it's possible to use TOP with LIMIT BY. #7637 (Nikita Mikhaylov)
  • Fix reload of a dictionary that has invalidate_query, where updates stopped after an exception on a previous update attempt. #8029 (alesapin)

ClickHouse release v19.17.4.11, 2019-11-22

Backward Incompatible Change

  • Using column instead of AST to store scalar subquery results for better performance. Setting enable_scalar_subquery_optimization was added in 19.17 and it was enabled by default. It leads to errors like this during upgrade to 19.17.2 or 19.17.3 from previous versions. This setting was disabled by default in 19.17.4, to make possible upgrading from 19.16 and older versions without errors. #7392 (Amos Bird)

New Feature

  • Add the ability to create dictionaries with DDL queries. #7360 (alesapin)
  • Make bloom_filter type of index supporting LowCardinality and Nullable #7363 #7561 (Nikolai Kochetov)
  • Add function isValidJSON to check that passed string is a valid json. #5910 #7293 (Vdimir)
  • Implement arrayCompact function #7328 (Memo)
  • Created function hex for Decimal numbers. It works like hex(reinterpretAsString()), but doesn't delete last zero bytes. #7355 (Mikhail Korotov)
  • Add arrayFill and arrayReverseFill functions, which replace elements by other elements in front/back of them in the array. #7380 (hcz)
  • Add CRC32IEEE()/CRC64() support #7480 (Azat Khuzhin)
  • Implement char function similar to one in mysql #7486 (sundyli)
  • Add bitmapTransform function. It transforms an array of values in a bitmap to another array of values, the result is a new bitmap #7598 (Zhichang Yu)
  • Implemented javaHashUTF16LE() function #7651 (achimbab)
  • Add _shard_num virtual column for the Distributed engine #7624 (Azat Khuzhin)
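
The array functions introduced above can be illustrated with a short Python sketch. The Python names are hypothetical stand-ins, and the fill semantics shown (an element is replaced by its neighbour when the predicate is false, and the boundary element is never replaced) are an assumption based on the wording of the entry.

```python
from itertools import groupby


def array_compact(arr):
    """Drop consecutive duplicate elements, keeping the first of each run."""
    return [key for key, _ in groupby(arr)]


def array_fill(keep, arr):
    """Scan left to right; replace arr[i] with the previous result element
    when keep(arr[i]) is false. The first element is never replaced."""
    out = []
    for i, x in enumerate(arr):
        out.append(x if i == 0 or keep(x) else out[-1])
    return out


def array_reverse_fill(keep, arr):
    """Same as array_fill, but scanning right to left."""
    return array_fill(keep, arr[::-1])[::-1]


not_null = lambda x: x is not None
print(array_compact([1, 1, 2, 3, 3, 3, 1]))
print(array_fill(not_null, [1, None, 3, None, None, 5]))
print(array_reverse_fill(not_null, [1, None, 3, None, None, 5]))
```

Forward fill carries the last kept value to the right, while reverse fill carries the next kept value to the left, which is the "in front/back" distinction in the entry.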

Experimental Feature

Bug Fix

  • Fix incorrect float parsing in Values #7817 #7870 (tavplubix)
  • Fix rare deadlock which can happen when trace_log is enabled. #7838 (filimonov)
  • Prevent message duplication when producing Kafka table has any MVs selecting from it #7265 (Ivan)
  • Support for Array(LowCardinality(Nullable(String))) in IN. Resolves #7364 #7366 (achimbab)
  • Add handling of SQL_TINYINT and SQL_BIGINT, and fix handling of SQL_FLOAT data source types in ODBC Bridge. #7491 (Denis Glazachev)
  • Fix aggregation (avg and quantiles) over empty decimal columns #7431 (Andrey Konyaev)
  • Fix INSERT into Distributed with MATERIALIZED columns #7377 (Azat Khuzhin)
  • Make MOVE PARTITION work if some parts of partition are already on destination disk or volume #7434 (Vladimir Chebotarev)
  • Fixed bug with hardlinks failing to be created during mutations in ReplicatedMergeTree in multi-disk configurations. #7558 (Vladimir Chebotarev)
  • Fixed a bug with a mutation on a MergeTree when whole part remains unchanged and best space is being found on another disk #7602 (Vladimir Chebotarev)
  • Fixed bug with keep_free_space_ratio not being read from disks configuration #7645 (Vladimir Chebotarev)
  • Fix bug with tables that contain only Tuple columns or columns with complex paths. Fixes 7541. #7545 (alesapin)
  • Do not account memory for Buffer engine in max_memory_usage limit #7552 (Azat Khuzhin)
  • Fix final mark usage in MergeTree tables ordered by tuple(). In rare cases it could lead to Can't adjust last granule error during SELECT. #7639 (Anton Popov)
  • Fix bug in mutations that have predicate with actions that require context (for example functions for json), which may lead to crashes or strange exceptions. #7664 (alesapin)
  • Fix mismatch of database and table names escaping in data/ and shadow/ directories #7575 (Alexander Burmak)
  • Support duplicated keys in RIGHT|FULL JOINs, e.g. ON t.x = u.x AND t.x = u.y. Fix crash in this case. #7586 (Artem Zuikov)
  • Fix Not found column <expression> in block when joining on expression with RIGHT or FULL JOIN. #7641 (Artem Zuikov)
  • One more attempt to fix infinite loop in PrettySpace format #7591 (Olga Khvostikova)
  • Fix bug in concat function when all arguments were FixedString of the same size. #7635 (alesapin)
  • Fixed exception in case of using 1 argument while defining S3, URL and HDFS storages. #7618 (Vladimir Chebotarev)
  • Fix scope of the InterpreterSelectQuery for views with query #7601 (Azat Khuzhin)

Improvement

  • Nullable columns recognized and NULL-values handled correctly by ODBC-bridge #7402 (Vasily Nemkov)
  • Write current batch for distributed send atomically #7600 (Azat Khuzhin)
  • Throw an exception if we cannot detect table for column name in query. #7358 (Artem Zuikov)
  • Add merge_max_block_size setting to MergeTreeSettings #7412 (Artem Zuikov)
  • Queries with HAVING and without GROUP BY assume group by constant. So, SELECT 1 HAVING 1 now returns a result. #7496 (Amos Bird)
  • Support parsing (X,) as tuple similar to python. #7501, #7562 (Amos Bird)
  • Make the range function behave almost like the Python one. #7518 (sundyli)
  • Add constraints columns to table system.settings #7553 (Vitaly Baranov)
  • Better Null format for tcp handler, so that it's possible to use select ignore(<expression>) from table format Null for perf measure via clickhouse-client #7606 (Amos Bird)
  • Queries like CREATE TABLE ... AS (SELECT (1, 2)) are parsed correctly #7542 (hcz)

Performance Improvement

Build/Testing/Packaging Improvement

  • Add support for cross-compiling to the CPU architecture AARCH64. Refactor packager script. #7370 #7539 (Ivan)
  • Unpack darwin-x86_64 and linux-aarch64 toolchains into mounted Docker volume when building packages #7534 (Ivan)
  • Update Docker Image for Binary Packager #7474 (Ivan)
  • Fixed compile errors on MacOS Catalina #7585 (Ernest Poletaev)
  • Some refactoring in query analysis logic: split complex class into several simple ones. #7454 (Artem Zuikov)
  • Fix build without submodules #7295 (proller)
  • Better add_globs in CMake files #7418 (Amos Bird)
  • Remove hardcoded paths in unwind target #7460 (Konstantin Podshumok)
  • Allow to use mysql format without ssl #7524 (proller)

Other

ClickHouse release v19.16

ClickHouse release v19.16.14.65, 2020-03-05

  • Fix distributed subqueries incompatibility with older CH versions. Fixes #7851 (tavplubix)
  • When executing CREATE query, fold constant expressions in storage engine arguments. Replace empty database name with current database. Fixes #6508, #3492. Also fix check for local address in ClickHouseDictionarySource. #9262 (tavplubix)
  • Now background merges in *MergeTree table engines family preserve storage policy volume order more accurately. #8549 (Vladimir Chebotarev)
  • Prevent losing data in Kafka in rare cases when exception happens after reading suffix but before commit. Fixes #9378. Related: #7175 #9507 (filimonov)
  • Fix bug leading to server termination when trying to use / drop Kafka table created with wrong parameters. Fixes #9494. Incorporates #9507. #9513 (filimonov)
  • Allow using MaterializedView with subqueries above Kafka tables. #8197 (filimonov)

New Feature

  • Add deduplicate_blocks_in_dependent_materialized_views option to control the behaviour of idempotent inserts into tables with materialized views. This new feature was added to the bugfix release by a special request from Altinity. #9070 (urykhy)

ClickHouse release v19.16.2.2, 2019-10-30

Backward Incompatible Change

  • Add missing arity validation for count/countIf. #7095 #7298 (Vdimir)
  • Remove legacy asterisk_left_columns_only setting (it was disabled by default). #7335 (Artem Zuikov)
  • Format strings for Template data format are now specified in files. #7118 (tavplubix)

New Feature

  • Introduce uniqCombined64() to calculate cardinality greater than UINT_MAX. #7213, #7222 (Azat Khuzhin)
  • Support Bloom filter indexes on Array columns. #6984 (achimbab)
  • Add a function getMacro(name) that returns String with the value of corresponding <macros> from server configuration. #7240 (alexey-milovidov)
  • Set two configuration options for a dictionary based on an HTTP source: credentials and http-headers. #7092 (Guillaume Tassery)
  • Add a new ProfileEvent Merge that counts the number of launched background merges. #7093 (Mikhail Korotov)
  • Add fullHostName function that returns a fully qualified domain name. #7263 #7291 (sundyli)
  • Add function arraySplit and arrayReverseSplit which split an array by "cut off" conditions. They are useful in time sequence handling. #7294 (hcz)
  • Add new functions that return the Array of all matched indices in multiMatch family of functions. #7299 (Danila Kutenin)
  • Add a new database engine Lazy that is optimized for storing a large number of small -Log tables. #7171 (Nikita Vasilev)
  • Add aggregate functions groupBitmapAnd, -Or, -Xor for bitmap columns. #7109 (Zhichang Yu)
  • Add aggregate function combinators -OrNull and -OrDefault, which return null or default values when there is nothing to aggregate. #7331 (hcz)
  • Introduce CustomSeparated data format that supports custom escaping and delimiter rules. #7118 (tavplubix)
  • Support Redis as source of external dictionary. #4361 #6962 (comunodi, Anton Popov)
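
The "cut off" behavior of arraySplit and arrayReverseSplit is easiest to see on a small input. Below is a Python sketch of the assumed semantics, not ClickHouse code: `flags` stands for the per-element results of the lambda, arraySplit is assumed to cut on the left side of a flagged element, and arrayReverseSplit on the right side. The function names are hypothetical.

```python
def array_split(flags, values):
    """Start a new chunk before every flagged element (never before the first)."""
    chunks = []
    for index, (value, flag) in enumerate(zip(values, flags)):
        if index == 0 or flag:
            chunks.append([])
        chunks[-1].append(value)
    return chunks


def array_reverse_split(flags, values):
    """Close the current chunk after every flagged element (never after the last)."""
    chunks = []
    current = []
    for value, flag in zip(values, flags):
        current.append(value)
        if flag:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


print(array_split([1, 0, 0, 1, 0], [1, 2, 3, 4, 5]))
print(array_reverse_split([1, 0, 0, 1, 0], [1, 2, 3, 4, 5]))
```

With the same flags, the first call produces `[[1, 2, 3], [4, 5]]` and the second `[[1], [2, 3, 4], [5]]`, which is why the pair is handy for cutting time sequences into sessions.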

Bug Fix

  • Fix wrong query result if it has WHERE IN (SELECT ...) section and optimize_read_in_order is used. #7371 (Anton Popov)
  • Disabled MariaDB authentication plugin, which depends on files outside of project. #7140 (Yuriy Baranov)
  • Fix exception Cannot convert column ... because it is constant but values of constants are different in source and result which could rarely happen when functions now(), today(), yesterday(), randConstant() are used. #7156 (Nikolai Kochetov)
  • Fixed issue of using HTTP keep alive timeout instead of TCP keep alive timeout. #7351 (Vasily Nemkov)
  • Fixed a segmentation fault in groupBitmapOr (issue #7109). #7289 (Zhichang Yu)
  • For materialized views the commit for Kafka is called after all data were written. #7175 (Ivan)
  • Fixed wrong duration_ms value in system.part_log table. It was ten times off. #7172 (Vladimir Chebotarev)
  • A quick fix to resolve crash in LIVE VIEW table and re-enabling all LIVE VIEW tests. #7201 (vzakaznikov)
  • Serialize NULL values correctly in min/max indexes of MergeTree parts. #7234 (Alexander Kuzmenkov)
  • Don't put virtual columns to .sql metadata when table is created as CREATE TABLE AS. #7183 (Ivan)
  • Fix segmentation fault in ATTACH PART query. #7185 (alesapin)
  • Fix wrong result for some queries given by the optimization of empty IN subqueries and empty INNER/RIGHT JOIN. #7284 (Nikolai Kochetov)
  • Fixing AddressSanitizer error in the LIVE VIEW getHeader() method. #7271 (vzakaznikov)

Improvement

  • Add a message when a queue_wait_max_ms wait takes place. #7390 (Azat Khuzhin)
  • Made setting s3_min_upload_part_size table-level. #7059 (Vladimir Chebotarev)
  • Check TTL in StorageFactory. #7304 (sundyli)
  • Squash left-hand blocks in partial merge join (optimization). #7122 (Artem Zuikov)
  • Do not allow non-deterministic functions in mutations of Replicated table engines, because this can introduce inconsistencies between replicas. #7247 (Alexander Kazakov)
  • Disable memory tracker while converting exception stack trace to string. It can prevent the loss of error messages of type Memory limit exceeded on server, which caused the Attempt to read after eof exception on client. #7264 (Nikolai Kochetov)
  • Miscellaneous format improvements. Resolves #6033, #2633, #6611, #6742 #7215 (tavplubix)
  • ClickHouse ignores values on the right side of IN operator that are not convertible to the left side type. Make it work properly for compound types -- Array and Tuple. #7283 (Alexander Kuzmenkov)
  • Support missing inequalities for ASOF JOIN. It's possible to join less-or-equal variant and strict greater and less variants for ASOF column in ON syntax. #7282 (Artem Zuikov)
  • Optimize partial merge join. #7070 (Artem Zuikov)
  • Do not use more than 98K of memory in uniqCombined functions. #7236, #7270 (Azat Khuzhin)
  • Flush parts of right-hand joining table on disk in PartialMergeJoin (if there is not enough memory). Load data back when needed. #7186 (Artem Zuikov)
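
The ASOF JOIN inequalities mentioned above differ only in which neighbor of the left-hand value is picked. The sketch below is a Python model under the assumption that the join condition is written as `left op right` and that the closest right-hand value satisfying it wins; `asof_match` is a hypothetical helper, not part of ClickHouse.

```python
from bisect import bisect_left, bisect_right


def asof_match(sorted_right, left_value, op):
    """Return the closest value in sorted_right that satisfies
    `left_value op right_value`, or None when nothing qualifies."""
    if op == ">=":    # largest right value that is <= left_value
        pos = bisect_right(sorted_right, left_value) - 1
    elif op == ">":   # largest right value that is < left_value
        pos = bisect_left(sorted_right, left_value) - 1
    elif op == "<=":  # smallest right value that is >= left_value
        pos = bisect_left(sorted_right, left_value)
    elif op == "<":   # smallest right value that is > left_value
        pos = bisect_right(sorted_right, left_value)
    else:
        raise ValueError(f"unsupported operator: {op}")
    return sorted_right[pos] if 0 <= pos < len(sorted_right) else None


timestamps = [10, 20, 30]
for op in (">=", ">", "<=", "<"):
    print(op, asof_match(timestamps, 20, op))
```

For a left-hand value of 20 the non-strict variants match 20 itself, while the strict variants match 10 and 30.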

Performance Improvement

  • Speed up joinGet with const arguments by avoiding data duplication. #7359 (Amos Bird)
  • Return early if the subquery is empty. #7007 (小路)
  • Optimize parsing of SQL expression in Values. #6781 (tavplubix)

Build/Testing/Packaging Improvement

Code cleanup

  • Generalize configuration repository to prepare for DDL for Dictionaries. #7155 (alesapin)
  • Parser for dictionaries DDL without any semantic. #7209 (alesapin)
  • Split ParserCreateQuery into different smaller parsers. #7253 (alesapin)
  • Small refactoring and renaming near external dictionaries. #7111 (alesapin)
  • Refactor some code to prepare for role-based access control. #7235 (Vitaly Baranov)
  • Some improvements in DatabaseOrdinary code. #7086 (Nikita Vasilev)
  • Do not use iterators in find() and emplace() methods of hash tables. #7026 (Alexander Kuzmenkov)
  • Fix getMultipleValuesFromConfig in case when parameter root is not empty. #7374 (Mikhail Korotov)
  • Remove some copy-paste (TemporaryFile and TemporaryFileStream) #7166 (Artem Zuikov)
  • Improved code readability a little bit (MergeTreeData::getActiveContainingPart). #7361 (Vladimir Chebotarev)
  • Wait for all scheduled jobs, which are using local objects, if ThreadPool::schedule(...) throws an exception. Rename ThreadPool::schedule(...) to ThreadPool::scheduleOrThrowOnError(...) and fix comments to make obvious that it may throw. #7350 (tavplubix)

ClickHouse release 19.15

ClickHouse release 19.15.4.10, 2019-10-31

Bug Fix

ClickHouse release 19.15.3.6, 2019-10-09

Bug Fix

  • Fixed bad_variant in hashed dictionary. (alesapin)
  • Fixed up bug with segmentation fault in ATTACH PART query. (alesapin)
  • Fixed time calculation in MergeTreeData. (Vladimir Chebotarev)
  • Commit to Kafka explicitly after the writing is finalized. #7175 (Ivan)
  • Serialize NULL values correctly in min/max indexes of MergeTree parts. #7234 (Alexander Kuzmenkov)

ClickHouse release 19.15.2.2, 2019-10-01

New Feature

  • Tiered storage: support to use multiple storage volumes for tables with MergeTree engine. It's possible to store fresh data on SSD and automatically move old data to HDD. (example). #4918 (Igr) #6489 (alesapin)
  • Add table function input for reading incoming data in INSERT SELECT query. #5450 (palasonic1) #6832 (Anton Popov)
  • Add a sparse_hashed dictionary layout that is functionally equivalent to the hashed layout, but is more memory efficient. It uses about half as much memory at the cost of slower value retrieval. #6894 (Azat Khuzhin)
  • Implement the ability to define a list of users with access to dictionaries. Only the currently connected database is used. #6907 (Guillaume Tassery)
  • Add LIMIT option to SHOW query. #6944 (Philipp Malkovsky)
  • Add bitmapSubsetLimit(bitmap, range_start, limit) function, that returns subset of the smallest limit values in set that is no smaller than range_start. #6957 (Zhichang Yu)
  • Add bitmapMin and bitmapMax functions. #6970 (Zhichang Yu)
  • Add function repeat related to issue-6648 #6999 (flynn)
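
As a rough illustration of bitmapSubsetLimit as described above, here is a Python model that treats a bitmap as a set of integers. The helper name is hypothetical and the sketch only mirrors the stated behavior (the `limit` smallest values that are no smaller than `range_start`), not the actual implementation.

```python
def bitmap_subset_limit(bitmap, range_start, limit):
    """Return the `limit` smallest values of `bitmap` that are >= range_start."""
    candidates = sorted(value for value in bitmap if value >= range_start)
    return set(candidates[:limit])


bitmap = {1, 5, 7, 9, 12}
print(sorted(bitmap_subset_limit(bitmap, 6, 2)))
print(min(bitmap), max(bitmap))  # what bitmapMin and bitmapMax would report for this set
```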

Experimental Feature

  • Implement (in memory) Merge Join variant that does not change current pipeline. Result is partially sorted by merge key. Set partial_merge_join = 1 to use this feature. The Merge Join is still in development. #6940 (Artem Zuikov)
  • Add S3 engine and table function. It is still in development (no authentication support yet). #5596 (Vladimir Chebotarev)

Improvement

  • Every message read from Kafka is inserted atomically. This resolves almost all known issues with Kafka engine. #6950 (Ivan)
  • Improvements for failover of Distributed queries. Shorten recovery time, also it is now configurable and can be seen in system.clusters. #6399 (Vasily Nemkov)
  • Support numeric values for Enums directly in IN section. #6766 #6941 (dimarub2000)
  • Support (optional, disabled by default) redirects on URL storage. #6914 (maqroll)
  • Add information message when client with an older version connects to a server. #6893 (Philipp Malkovsky)
  • Remove maximum backoff sleep time limit for sending data in Distributed tables #6895 (Azat Khuzhin)
  • Add ability to send profile events (counters) with cumulative values to graphite. It can be enabled under <events_cumulative> in server config.xml. #6969 (Azat Khuzhin)
  • Automatically cast type T to LowCardinality(T) when inserting data into a column of type LowCardinality(T) in Native format via HTTP. #6891 (Nikolai Kochetov)
  • Add ability to use function hex without using reinterpretAsString for Float32, Float64. #7024 (Mikhail Korotov)

Build/Testing/Packaging Improvement

  • Add gdb-index to clickhouse binary with debug info. It will speed up startup time of gdb. #6947 (alesapin)
  • Speed up deb packaging with patched dpkg-deb which uses pigz. #6960 (alesapin)
  • Set enable_fuzzing = 1 to enable libfuzzer instrumentation of all the project code. #7042 (kyprizel)
  • Add split build smoke test in CI. #7061 (alesapin)
  • Add build with MemorySanitizer to CI. #7066 (Alexander Kuzmenkov)
  • Replace libsparsehash with sparsehash-c11 #6965 (Azat Khuzhin)

Bug Fix

  • Fixed performance degradation of index analysis on complex keys on large tables. This fixes #6924. #7075 (alexey-milovidov)
  • Fix logical error causing segfaults when selecting from Kafka empty topic. #6909 (Ivan)
  • Fix too early MySQL connection close in MySQLBlockInputStream.cpp. #6882 (Clément Rodriguez)
  • Restored support for very old Linux kernels (fix #6841) #6853 (alexey-milovidov)
  • Fix possible data loss in insert select query in case of empty block in input stream. #6834 #6862 #6911 (Nikolai Kochetov)
  • Fix for function arrayEnumerateUniqRanked with empty arrays in params #6928 (proller)
  • Fix complex queries with array joins and global subqueries. #6934 (Ivan)
  • Fix Unknown identifier error in ORDER BY and GROUP BY with multiple JOINs #7022 (Artem Zuikov)
  • Fixed MSan warning while executing function with LowCardinality argument. #7062 (Nikolai Kochetov)

Backward Incompatible Change

  • Changed serialization format of bitmap* aggregate function states to improve performance. Serialized states of bitmap* from previous versions cannot be read. #6908 (Zhichang Yu)

ClickHouse release 19.14

ClickHouse release 19.14.7.15, 2019-10-02

Bug Fix

  • This release also contains all bug fixes from 19.11.12.69.
  • Fixed compatibility for distributed queries between 19.14 and earlier versions. This fixes #7068. #7069 (alexey-milovidov)

ClickHouse release 19.14.6.12, 2019-09-19

Bug Fix

  • Fix for function arrayEnumerateUniqRanked with empty arrays in params. #6928 (proller)
  • Fixed subquery name in queries with ARRAY JOIN and GLOBAL IN subquery with alias. Use subquery alias for external table name if it is specified. #6934 (Ivan)

Build/Testing/Packaging Improvement

  • Fix flapping test 00715_fetch_merged_or_mutated_part_zookeeper by rewriting it as a shell script because it needs to wait for mutations to apply. #6977 (Alexander Kazakov)
  • Fixed UBSan and MemSan failure in function groupUniqArray with empty array argument. It was caused by placing of empty PaddedPODArray into hash table zero cell because constructor for zero cell value was not called. #6937 (Amos Bird)

ClickHouse release 19.14.3.3, 2019-09-10

New Feature

  • WITH FILL modifier for ORDER BY. (continuation of #5069) #6610 (Anton Popov)
  • WITH TIES modifier for LIMIT. (continuation of #5069) #6610 (Anton Popov)
  • Parse unquoted NULL literal as NULL (if setting format_csv_unquoted_null_literal_as_null=1). Initialize null fields with default values if data type of this field is not nullable (if setting input_format_null_as_default=1). #5990 #6055 (tavplubix)
  • Support for wildcards in paths of table functions file and hdfs. If the path contains wildcards, the table will be readonly. Example of usage: select * from hdfs('hdfs://hdfs1:9000/some_dir/another_dir/*/file{0..9}{0..9}') and select * from file('some_dir/{some_file,another_file,yet_another}.tsv', 'TSV', 'value UInt32'). #6092 (Olga Khvostikova)
  • New system.metric_log table which stores values of system.events and system.metrics with specified time interval. #6363 #6467 (Nikita Mikhaylov) #6530 (alexey-milovidov)
  • Allow to write ClickHouse text logs to system.text_log table. #6037 #6103 (Nikita Mikhaylov) #6164 (alexey-milovidov)
  • Show private symbols in stack traces (this is done via parsing symbol tables of ELF files). Added information about file and line number in stack traces if debug info is present. Speedup symbol name lookup with indexing symbols present in program. Added new SQL functions for introspection: demangle and addressToLine. Renamed function symbolizeAddress to addressToSymbol for consistency. Function addressToSymbol will return mangled name for performance reasons and you have to apply demangle. Added setting allow_introspection_functions which is turned off by default. #6201 (alexey-milovidov)
  • Table function values (the name is case-insensitive). It allows to read from VALUES list proposed in #5984. Example: SELECT * FROM VALUES('a UInt64, s String', (1, 'one'), (2, 'two'), (3, 'three')). #6217. #6209 (dimarub2000)
  • Added an ability to alter storage settings. Syntax: ALTER TABLE <table> MODIFY SETTING <setting> = <value>. #6366 #6669 #6685 (alesapin)
  • Support for removing of detached parts. Syntax: ALTER TABLE <table_name> DROP DETACHED PART '<part_id>'. #6158 (tavplubix)
  • Table constraints. Allows to add constraint to table definition which will be checked at insert. #5273 (Gleb Novikov) #6652 (alexey-milovidov)
  • Support for cascaded materialized views. #6324 (Amos Bird)
  • Turn on query profiler by default to sample every query execution thread once a second. #6283 (alexey-milovidov)
  • Input format ORC. #6454 #6703 (akonyaev90)
  • Added two new functions: sigmoid and tanh (that are useful for machine learning applications). #6254 (alexey-milovidov)
  • Function hasToken(haystack, token), hasTokenCaseInsensitive(haystack, token) to check if given token is in haystack. Token is a maximal length substring between two non alphanumeric ASCII characters (or boundaries of haystack). Token must be a constant string. Supported by tokenbf_v1 index specialization. #6596, #6662 (Vasily Nemkov)
  • New function neighbor(value, offset[, default_value]). Allows to reach prev/next value within column in a block of data. #5925 (Alex Krash) 6685365ab8c5b74f9650492c88a012596eb1b0c6 341e2e4587a18065c2da1ca888c73389f48ce36c Alexey Milovidov
  • Created a function currentUser(), returning login of authorized user. Added alias user() for compatibility with MySQL. #6470 (Alex Krash)
  • New aggregate functions quantilesExactInclusive and quantilesExactExclusive which were proposed in #5885. #6477 (dimarub2000)
  • Function bitmapRange(bitmap, range_begin, range_end) which returns a new set with the specified range (not including range_end). #6314 (Zhichang Yu)
  • Function geohashesInBox(longitude_min, latitude_min, longitude_max, latitude_max, precision) which creates array of precision-long strings of geohash-boxes covering provided area. #6127 (Vasily Nemkov)
  • Implement support for INSERT query with Kafka tables. #6012 (Ivan)
  • Added support for _partition and _timestamp virtual columns to Kafka engine. #6400 (Ivan)
  • Possibility to remove sensitive data from query_log, server logs, process list with regexp-based rules. #5710 (filimonov)
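
Two of the features above have compact definitions that a small model makes concrete. The Python sketch below is an illustration under stated assumptions, not ClickHouse code: `has_token` treats every ASCII character that is not a letter or digit as a separator (non-ASCII characters stay inside tokens), and `limit_with_ties` assumes that WITH TIES returns the first `limit` rows plus any following rows whose ORDER BY key equals that of the last returned row. Both function names are hypothetical.

```python
import re

# ASCII characters that are neither digits nor letters act as token separators.
_SEPARATORS = re.compile(r"[\x00-\x2f\x3a-\x40\x5b-\x60\x7b-\x7f]+")


def has_token(haystack, token):
    """True if `token` is one of the maximal alphanumeric runs in `haystack`."""
    tokens = [part for part in _SEPARATORS.split(haystack) if part]
    return token in tokens


def limit_with_ties(rows, key, limit):
    """Model of ORDER BY key LIMIT limit WITH TIES."""
    ordered = sorted(rows, key=key)
    if limit <= 0 or not ordered:
        return []
    head = ordered[:limit]
    last_key = key(head[-1])
    for row in ordered[limit:]:
        if key(row) != last_key:
            break
        head.append(row)  # same key as the last row inside the limit
    return head


print(has_token("Hello, world!", "world"), has_token("Hello, world!", "wor"))
print(limit_with_ties([3, 2, 1, 2, 2], key=lambda row: row, limit=2))
```

In the last call the limit is 2, yet four rows come back because the second row's key (2) is shared by two more rows.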

Experimental Feature

Bug Fix

  • This release also contains all bug fixes from 19.13 and 19.11.
  • Fix segmentation fault when the table has skip indices and vertical merge happens. #6723 (alesapin)
  • Fix per-column TTL with non-trivial column defaults. Previously, in case of a forced TTL merge with an OPTIMIZE ... FINAL query, expired values were replaced by type defaults instead of user-specified column defaults. #6796 (Anton Popov)
  • Fix Kafka messages duplication problem on normal server restart. #6597 (Ivan)
  • Fixed infinite loop when reading Kafka messages. Do not pause/resume consumer on subscription at all - otherwise it may get paused indefinitely in some scenarios. #6354 (Ivan)
  • Fix Key expression contains comparison between inconvertible types exception in bitmapContains function. #6136 #6146 #6156 (dimarub2000)
  • Fix segfault with enabled optimize_skip_unused_shards and missing sharding key. #6384 (Anton Popov)
  • Fixed wrong code in mutations that may lead to memory corruption. Fixed segfault with read of address 0x14c0 that may happen due to concurrent DROP TABLE and SELECT from system.parts or system.parts_columns. Fixed race condition in preparation of mutation queries. Fixed deadlock caused by OPTIMIZE of Replicated tables and concurrent modification operations like ALTERs. #6514 (alexey-milovidov)
  • Removed extra verbose logging in MySQL interface #6389 (alexey-milovidov)
  • Restore the ability to parse boolean settings from 'true' and 'false' in the configuration file. #6278 (alesapin)
  • Fix crash in quantile and median function over Nullable(Decimal128). #6378 (Artem Zuikov)
  • Fixed possible incomplete result returned by SELECT query with WHERE condition on primary key contained conversion to Float type. It was caused by incorrect checking of monotonicity in toFloat function. #6248 #6374 (dimarub2000)
  • Check max_expanded_ast_elements setting for mutations. Clear mutations after TRUNCATE TABLE. #6205 (Winter Zhang)
  • Fix JOIN results for key columns when used with join_use_nulls. Attach Nulls instead of columns defaults. #6249 (Artem Zuikov)
  • Fix for skip indices with vertical merge and alter. Fix for Bad size of marks file exception. #6594 #6713 (alesapin)
  • Fix rare crash in ALTER MODIFY COLUMN and vertical merge when one of merged/altered parts is empty (0 rows) #6746 #6780 (alesapin)
  • Fixed bug in conversion of LowCardinality types in AggregateFunctionFactory. This fixes #6257. #6281 (Nikolai Kochetov)
  • Fix wrong behavior and possible segfaults in topK and topKWeighted aggregated functions. #6404 (Anton Popov)
  • Fixed unsafe code around getIdentifier function. #6401 #6409 (alexey-milovidov)
  • Fixed bug in MySQL wire protocol (used while connecting to ClickHouse from a MySQL client). Caused by heap buffer overflow in PacketPayloadWriteBuffer. #6212 (Yuriy Baranov)
  • Fixed memory leak in bitmapSubsetInRange function. #6819 (Zhichang Yu)
  • Fix rare bug when mutation executed after granularity change. #6816 (alesapin)
  • Allow protobuf message with all fields by default. #6132 (Vitaly Baranov)
  • Resolve a bug with the nullIf function when NULL is passed as the second argument. #6446 (Guillaume Tassery)
  • Fix rare bug with wrong memory allocation/deallocation in complex key cache dictionaries with string fields which leads to infinite memory consumption (looks like memory leak). Bug reproduces when string size was a power of two starting from eight (8, 16, 32, etc). #6447 (alesapin)
  • Fixed Gorilla encoding on small sequences which caused exception Cannot write after end of buffer. #6398 #6444 (Vasily Nemkov)
  • Allow to use not nullable types in JOINs with join_use_nulls enabled. #6705 (Artem Zuikov)
  • Disable Poco::AbstractConfiguration substitutions in query in clickhouse-client. #6706 (alexey-milovidov)
  • Avoid deadlock in REPLACE PARTITION. #6677 (alexey-milovidov)
  • Using arrayReduce for constant arguments may lead to segfault. #6242 #6326 (alexey-milovidov)
  • Fix inconsistent parts which can appear if replica was restored after DROP PARTITION. #6522 #6523 (tavplubix)
  • Fixed hang in JSONExtractRaw function. #6195 #6198 (alexey-milovidov)
  • Fix bug with incorrect skip indices serialization and aggregation with adaptive granularity. #6594. #6748 (alesapin)
  • Fix WITH ROLLUP and WITH CUBE modifiers of GROUP BY with two-level aggregation. #6225 (Anton Popov)
  • Fix bug with writing secondary indices marks with adaptive granularity. #6126 (alesapin)
  • Fix initialization order while server startup. Since StorageMergeTree::background_task_handle is initialized in startup() the MergeTreeBlockOutputStream::write() may try to use it before initialization. Just check if it is initialized. #6080 (Ivan)
  • Clearing the data buffer from the previous read operation that was completed with an error. #6026 (Nikolay)
  • Fix bug with enabling adaptive granularity when creating a new replica for Replicated*MergeTree table. #6394 #6452 (alesapin)
  • Fixed possible crash during server startup in case of exception happened in libunwind during exception at access to uninitialized ThreadStatus structure. #6456 (Nikita Mikhaylov)
  • Fix crash in yandexConsistentHash function. Found by fuzz test. #6304 #6305 (alexey-milovidov)
  • Fixed the possibility of hanging queries when the server is overloaded and the global thread pool becomes nearly full. This has a higher chance of happening on clusters with a large number of shards (hundreds), because distributed queries allocate a thread per connection to each shard. For example, this issue may reproduce if a cluster of 330 shards is processing 30 concurrent distributed queries. This issue affects all versions starting from 19.2. #6301 (alexey-milovidov)
  • Fixed logic of arrayEnumerateUniqRanked function. #6423 (alexey-milovidov)
  • Fix segfault when decoding symbol table. #6603 (Amos Bird)
  • Fixed irrelevant exception in cast of LowCardinality(Nullable) to not-Nullable column in case if it doesn't contain Nulls (e.g. in query like SELECT CAST(CAST('Hello' AS LowCardinality(Nullable(String))) AS String). #6094 #6119 (Nikolai Kochetov)
  • Removed extra quoting of description in system.settings table. #6696 #6699 (alexey-milovidov)
  • Avoid possible deadlock in TRUNCATE of Replicated table. #6695 (alexey-milovidov)
  • Fix reading in order of sorting key. #6189 (Anton Popov)
  • Fix ALTER TABLE ... UPDATE query for tables with enable_mixed_granularity_parts=1. #6543 (alesapin)
  • Fix bug opened by #4405 (since 19.4.0). Reproduces in queries to Distributed tables over MergeTree tables when no columns are queried (SELECT 1). #6236 (alesapin)
  • Fixed overflow in integer division of signed type to unsigned type. The behaviour was exactly as in C or C++ language (integer promotion rules) that may be surprising. Please note that the overflow is still possible when dividing large signed number to large unsigned number or vice-versa (but that case is less usual). The issue existed in all server versions. #6214 #6233 (alexey-milovidov)
  • Limit maximum sleep time for throttling when max_execution_speed or max_execution_speed_bytes is set. Fixed false errors like Estimated query execution time (inf seconds) is too long. #5547 #6232 (alexey-milovidov)
  • Fixed issues about using MATERIALIZED columns and aliases in MaterializedView. #448 #3484 #3450 #2878 #2285 #3796 (Amos Bird) #6316 (alexey-milovidov)
  • Fix FormatFactory behaviour for input streams which are not implemented as processor. #6495 (Nikolai Kochetov)
  • Fixed typo. #6631 (Alex Ryndin)
  • Typo in the error message ( is -> are ). #6839 (Denis Zhuravlev)
  • Fixed error while parsing of columns list from string if type contained a comma (this issue was relevant for File, URL, HDFS storages) #6217. #6209 (dimarub2000)
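
The signed-by-unsigned division fix above refers to C-style conversion rules. As a hedged illustration of why that behavior is surprising, the Python sketch below emulates what C does for operands of equal width (assumed here to be 32 bits): the signed operand is reinterpreted as unsigned before dividing. Both helpers are hypothetical and only model the arithmetic, not the server code.

```python
def c_style_divide(signed_value, unsigned_value, bits=32):
    """Divide the way C does for int / unsigned of the same width:
    the signed operand is first converted to unsigned (modulo 2**bits)."""
    modulus = 1 << bits
    return (signed_value % modulus) // unsigned_value


def mathematical_divide(signed_value, unsigned_value):
    """Truncating division on the true integer values."""
    quotient = abs(signed_value) // unsigned_value
    return -quotient if signed_value < 0 else quotient


print(c_style_divide(-1, 2))       # -1 becomes 4294967295 before the division
print(mathematical_divide(-1, 2))  # the result most users expect
```

Dividing -1 by 2 gives 2147483647 under the C rules and 0 under ordinary truncating division.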

Security Fix

  • This release also contains all security fixes from 19.13 and 19.11.
  • Fixed the possibility of a fabricated query to cause server crash due to stack overflow in SQL parser. Fixed the possibility of stack overflow in Merge and Distributed tables, materialized views and conditions for row-level security that involve subqueries. #6433 (alexey-milovidov)

Improvement

  • Correct implementation of ternary logic for AND/OR. #6048 (Alexander Kazakov)
  • Now values and rows with expired TTL will be removed after OPTIMIZE ... FINAL query from old parts without TTL infos or with outdated TTL infos, e.g. after ALTER ... MODIFY TTL query. Added queries SYSTEM STOP/START TTL MERGES to disallow/allow assign merges with TTL and filter expired values in all merges. #6274 (Anton Popov)
  • Possibility to change the location of ClickHouse history file for client using CLICKHOUSE_HISTORY_FILE env. #6840 (filimonov)
  • Remove dry_run flag from InterpreterSelectQuery. ... #6375 (Nikolai Kochetov)
  • Support ASOF JOIN with ON section. #6211 (Artem Zuikov)
  • Better support of skip indexes for mutations and replication. Support for MATERIALIZE/CLEAR INDEX ... IN PARTITION query. UPDATE x = x recalculates all indices that use column x. #5053 (Nikita Vasilev)
  • Allow to ATTACH live views (for example, at the server startup) regardless to allow_experimental_live_view setting. #6754 (alexey-milovidov)
  • For stack traces gathered by query profiler, do not include stack frames generated by the query profiler itself. #6250 (alexey-milovidov)
  • Now table functions values, file, url, hdfs have support for ALIAS columns. #6255 (alexey-milovidov)
  • Throw an exception if config.d file doesn't have the corresponding root element as the config file. #6123 (dimarub2000)
  • Print extra info in exception message for no space left on device. #6182, #6252 #6352 (tavplubix)
  • When determining shards of a Distributed table to be covered by a read query (for optimize_skip_unused_shards = 1) ClickHouse now checks conditions from both prewhere and where clauses of select statement. #6521 (Alexander Kazakov)
  • Enabled SIMDJSON for machines without AVX2 but with SSE 4.2 and PCLMUL instruction set. #6285 #6320 (alexey-milovidov)
  • ClickHouse can work on filesystems without O_DIRECT support (such as ZFS and BtrFS) without additional tuning. #4449 #6730 (alexey-milovidov)
  • Support push down predicate for final subquery. #6120 (TCeason) #6162 (alexey-milovidov)
  • Better JOIN ON keys extraction #6131 (Artem Zuikov)
  • Updated SIMDJSON. #6285. #6306 (alexey-milovidov)
  • Optimize selecting of smallest column for SELECT count() query. #6344 (Amos Bird)
  • Added strict parameter in windowFunnel(). When strict is set, windowFunnel() applies conditions only for the unique values. #6548 (achimbab)
  • Safer interface of mysqlxx::Pool. #6150 (avasiliev)
  • Options line size when executing with --help option now corresponds with terminal size. #6590 (dimarub2000)
  • Disable "read in order" optimization for aggregation without keys. #6599 (Anton Popov)
  • HTTP status code for INCORRECT_DATA and TYPE_MISMATCH error codes was changed from default 500 Internal Server Error to 400 Bad Request. #6271 (Alexander Rodin)
  • Move Join object from ExpressionAction into AnalyzedJoin. ExpressionAnalyzer and ExpressionAction do not know about Join class anymore. Its logic is hidden by AnalyzedJoin iface. #6801 (Artem Zuikov)
  • Fixed possible deadlock of distributed queries when one of shards is localhost but the query is sent via network connection. #6759 (alexey-milovidov)
  • Changed semantic of multiple tables RENAME to avoid possible deadlocks. #6757. #6756 (alexey-milovidov)
  • Rewritten MySQL compatibility server to prevent loading full packet payload in memory. Decreased memory consumption for each connection to approximately 2 * DBMS_DEFAULT_BUFFER_SIZE (read/write buffers). #5811 (Yuriy Baranov)
  • Move AST alias interpreting logic out of parser that doesn't have to know anything about query semantics. #6108 (Artem Zuikov)
  • Slightly safer parsing of NamesAndTypesList. #6408. #6410 (alexey-milovidov)
  • clickhouse-copier: Allow using where_condition from config with partition_key alias in the query that checks partition existence (earlier it was used only in queries that read data). #6577 (proller)
  • Added optional message argument in throwIf. (#5772) #6329 (Vdimir)
  • A server exception received while sending insertion data is now processed in the client as well. #5891 #6711 (dimarub2000)
  • Added a metric DistributedFilesToInsert that shows the total number of files in filesystem that are selected to send to remote servers by Distributed tables. The number is summed across all shards. #6600 (alexey-milovidov)
  • Move most of JOINs prepare logic from ExpressionAction/ExpressionAnalyzer to AnalyzedJoin. #6785 (Artem Zuikov)
  • Fix TSan warning 'lock-order-inversion'. #6740 (Vasily Nemkov)
  • Better information messages about lack of Linux capabilities. Fatal errors are now logged with the "fatal" level, which makes them easier to find in system.text_log. #6441 (alexey-milovidov)
  • When dumping temporary data to disk was enabled to restrict memory usage during GROUP BY or ORDER BY, the free disk space was not checked. The fix adds a new setting min_free_disk_space: when the free disk space is smaller than the threshold, the query stops and throws ErrorCodes::NOT_ENOUGH_SPACE. #6678 (Weiqing Xu) #6691 (alexey-milovidov)
  • Removed recursive rwlock by thread. It makes no sense, because threads are reused between queries. SELECT query may acquire a lock in one thread, hold a lock from another thread and exit from first thread. At the same time, the first thread can be reused by a DROP query. This will lead to false "Attempt to acquire exclusive lock recursively" messages. #6771 (alexey-milovidov)
  • Split ExpressionAnalyzer.appendJoin(). Prepare a place in ExpressionAnalyzer for MergeJoin. #6524 (Artem Zuikov)
  • Added mysql_native_password authentication plugin to MySQL compatibility server. #6194 (Yuriy Baranov)
  • Fewer clock_gettime calls; fixed ABI compatibility between debug/release in Allocator (insignificant issue). #6197 (alexey-milovidov)
  • Move collectUsedColumns from ExpressionAnalyzer to SyntaxAnalyzer. SyntaxAnalyzer makes required_source_columns itself now. #6416 (Artem Zuikov)
  • Add setting joined_subquery_requires_alias to require aliases for subselects and table functions in FROM when more than one table is present (i.e. queries with JOINs). #6733 (Artem Zuikov)
  • Extract GetAggregatesVisitor class from ExpressionAnalyzer. #6458 (Artem Zuikov)
  • system.query_log: change data type of type column to Enum. #6265 (Nikita Mikhaylov)
  • Static linking of sha256_password authentication plugin. #6512 (Yuriy Baranov)
  • Avoid extra dependency for the setting compile to work. In previous versions, the user could get errors like cannot open crti.o, unable to find library -lc etc. #6309 (alexey-milovidov)
  • More validation of the input that may come from malicious replica. #6303 (alexey-milovidov)
  • Now clickhouse-obfuscator file is available in clickhouse-client package. In previous versions it was available as clickhouse obfuscator (with whitespace). #5816 #6609 (dimarub2000)
  • Fixed deadlock when we have at least two queries that read at least two tables in different order and another query that performs DDL operation on one of tables. Fixed another very rare deadlock. #6764 (alexey-milovidov)
  • Added os_thread_ids column to system.processes and system.query_log for better debugging possibilities. #6763 (alexey-milovidov)
  • A workaround for PHP mysqlnd extension bugs which occur when sha256_password is used as a default authentication plugin (described in #6031). #6113 (Yuriy Baranov)
  • Remove unneeded place with changed nullability columns. #6693 (Artem Zuikov)
  • Set default value of queue_max_wait_ms to zero, because current value (five seconds) makes no sense. There are rare circumstances when this setting has any use. Added settings replace_running_query_max_wait_ms, kafka_max_wait_ms and connection_pool_max_wait_ms for disambiguation. #6692 (alexey-milovidov)
  • Extract SelectQueryExpressionAnalyzer from ExpressionAnalyzer. Keep the last one for non-select queries. #6499 (Artem Zuikov)
  • Removed duplicating input and output formats. #6239 (Nikolai Kochetov)
  • Allow user to override poll_interval and idle_connection_timeout settings on connection. #6230 (alexey-milovidov)
  • MergeTree now has an additional option ttl_only_drop_parts (disabled by default) to avoid partial pruning of parts, so that they are dropped completely when all the rows in a part are expired. #6191 (Sergi Vladykin)
  • Type checks for set index functions. Throw exception if function got a wrong type. This fixes fuzz test with UBSan. #6511 (Nikita Vasilev)
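
The min_free_disk_space check described in the list above can be illustrated with a minimal Python sketch. This is not ClickHouse's implementation: the function names and the exception class are hypothetical, and only the setting name and the error name come from the changelog entry.

```python
import shutil


class NotEnoughSpace(Exception):
    """Hypothetical stand-in for ErrorCodes::NOT_ENOUGH_SPACE."""


def ensure_free_space(free_bytes: int, min_free_disk_space: int) -> None:
    # Refuse to spill temporary data when free space is below the threshold.
    if free_bytes < min_free_disk_space:
        raise NotEnoughSpace(
            f"free space {free_bytes} B is below min_free_disk_space "
            f"{min_free_disk_space} B"
        )


def check_tmp_path(tmp_path: str, min_free_disk_space: int) -> None:
    # shutil.disk_usage reports total, used and free bytes for the filesystem
    # that contains tmp_path.
    ensure_free_space(shutil.disk_usage(tmp_path).free, min_free_disk_space)
```

Passing the free byte count as an argument keeps the threshold comparison separate from the filesystem query, so the comparison can be exercised without touching a real disk.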

Performance Improvement

  • Optimize queries with ORDER BY expressions clause, where expressions have coinciding prefix with sorting key in MergeTree tables. This optimization is controlled by optimize_read_in_order setting. #6054 #6629 (Anton Popov)
  • Allow to use multiple threads during parts loading and removal. #6372 #6074 #6438 (alexey-milovidov)
  • Implemented batch variant of updating aggregate function states. It may lead to performance benefits. #6435 (alexey-milovidov)
  • Using FastOps library for functions exp, log, sigmoid, tanh. FastOps is a fast vector math library from Michael Parakhin (Yandex CTO). Improved performance of exp and log functions more than 6 times. The functions exp and log from Float32 argument will return Float32 (in previous versions they always returned Float64). Now exp(nan) may return inf. The result of exp and log functions may be not the nearest machine representable number to the true answer. #6254 (alexey-milovidov) Using Danila Kutenin's variant to make fastops work #6317 (alexey-milovidov)
  • Disable consecutive key optimization for UInt8/16. #6298 #6701 (akuzm)
  • Improved performance of simdjson library by getting rid of dynamic allocation in ParsedJson::Iterator. #6479 (Vitaly Baranov)
  • Pre-fault pages when allocating memory with mmap(). #6667 (akuzm)
  • Fix performance bug in Decimal comparison. #6380 (Artem Zuikov)
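
For the optimize_read_in_order entry above, the notion of ORDER BY expressions having a "coinciding prefix" with the sorting key can be sketched as follows. This is only an illustration under the assumption that expressions are compared as plain strings; the function names and column names are hypothetical and do not reflect how ClickHouse analyzes queries internally.

```python
from typing import Sequence


def matching_prefix_length(order_by: Sequence[str], sorting_key: Sequence[str]) -> int:
    """Count the leading ORDER BY expressions that coincide with the sorting key."""
    length = 0
    for query_expr, key_expr in zip(order_by, sorting_key):
        if query_expr != key_expr:
            break
        length += 1
    return length


def has_coinciding_prefix(order_by: Sequence[str], sorting_key: Sequence[str]) -> bool:
    # A non-empty common prefix is the precondition named in the entry.
    return matching_prefix_length(order_by, sorting_key) > 0
```

With a hypothetical sorting key (date, user_id, event), a query ordered by (date, user_id, ts) shares a prefix of length 2, while one ordered by (ts, date) shares none.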

Build/Testing/Packaging Improvement

Backward Incompatible Change

  • Removed rarely used table function catBoostPool and storage CatBoostPool. If you have used this table function, please write email to clickhouse-feedback@yandex-team.com. Note that CatBoost integration remains and will be supported. #6279 (alexey-milovidov)
  • Disable ANY RIGHT JOIN and ANY FULL JOIN by default. Set any_join_distinct_right_table_keys setting to enable them. #5126 #6351 (Artem Zuikov)

ClickHouse release 19.13

ClickHouse release 19.13.6.51, 2019-10-02

Bug Fix

  • This release also contains all bug fixes from 19.11.12.69.

ClickHouse release 19.13.5.44, 2019-09-20

Bug Fix

ClickHouse release 19.13.4.32, 2019-09-10

Bug Fix

  • This release also contains all bug and security fixes from 19.11.9.52 and 19.11.10.54.
  • Fixed data race in system.parts table and ALTER query. #6245 #6513 (alexey-milovidov)
  • Fixed mismatched header in streams that happened when reading from an empty distributed table with sample and prewhere. #6167 (Lixiang Qian) #6823 (Nikolai Kochetov)
  • Fixed crash when using IN clause with a subquery with a tuple. #6125 #6550 (tavplubix)
  • Fix case with same column names in GLOBAL JOIN ON section. #6181 (Artem Zuikov)
  • Fix crash when casting types to Decimal that do not support it. Throw exception instead. #6297 (Artem Zuikov)
  • Fixed crash in extractAll() function. #6644 (Artem Zuikov)
  • Query transformation for MySQL, ODBC, JDBC table functions now works properly for SELECT WHERE queries with multiple AND expressions. #6381 #6676 (dimarub2000)
  • Added previous declaration checks for MySQL 8 integration. #6569 (Rafael David Tinoco)

Security Fix

  • Fix two vulnerabilities in codecs in decompression phase (malicious user can fabricate compressed data that will lead to buffer overflow in decompression). #6670 (Artem Zuikov)

ClickHouse release 19.13.3.26, 2019-08-22

Bug Fix

  • Fix ALTER TABLE ... UPDATE query for tables with enable_mixed_granularity_parts=1. #6543 (alesapin)
  • Fix NPE when using IN clause with a subquery with a tuple. #6125 #6550 (tavplubix)
  • Fixed an issue that if a stale replica becomes alive, it may still have data parts that were removed by DROP PARTITION. #6522 #6523 (tavplubix)
  • Fixed issue with parsing CSV #6426 #6559 (tavplubix)
  • Fixed data race in system.parts table and ALTER query. This fixes #6245. #6513 (alexey-milovidov)
  • Fixed wrong code in mutations that may lead to memory corruption. Fixed segfault with read of address 0x14c0 that may happen due to concurrent DROP TABLE and SELECT from system.parts or system.parts_columns. Fixed race condition in preparation of mutation queries. Fixed deadlock caused by OPTIMIZE of Replicated tables and concurrent modification operations like ALTERs. #6514 (alexey-milovidov)
  • Fixed possible data loss after ALTER DELETE query on table with skipping index. #6224 #6282 (Nikita Vasilev)

Security Fix

  • If the attacker has write access to ZooKeeper and is able to run custom server available from the network where ClickHouse runs, it can create custom-built malicious server that will act as ClickHouse replica and register it in ZooKeeper. When another replica will fetch data part from malicious replica, it can force clickhouse-server to write to arbitrary path on filesystem. Found by Eldar Zaitov, information security team at Yandex. #6247 (alexey-milovidov)

ClickHouse release 19.13.2.19, 2019-08-14

New Feature

  • Sampling profiler on query level. Example. #4247 (laplab) #6124 (alexey-milovidov) #6250 #6283 #6386
  • Allow to specify a list of columns with COLUMNS('regexp') expression that works like a more sophisticated variant of * asterisk. #5951 (mfridental), (alexey-milovidov)
  • CREATE TABLE AS table_function() is now possible #6057 (dimarub2000)
  • Adam optimizer for stochastic gradient descent is used by default in stochasticLinearRegression() and stochasticLogisticRegression() aggregate functions, because it shows good quality with almost no tuning. #6000 (Quid37)
  • Added functions for working with the custom week number #5212 (Andy Yang)
  • RENAME queries now work with all storages. #5953 (Ivan)
  • The client can now receive logs from the server at any desired level by setting send_logs_level, regardless of the log level specified in server settings. #5964 (Nikita Mikhaylov)
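
The COLUMNS('regexp') entry above can be pictured as a filter over the table's column names. The sketch below is a hedged illustration, not ClickHouse code: it uses Python's re module rather than the regular expression engine ClickHouse uses, assumes that a column is selected when the pattern matches anywhere in its name, and the function name is hypothetical.

```python
import re


def select_columns(column_names, pattern):
    """Return the columns whose names match the pattern, keeping table order."""
    regex = re.compile(pattern)
    # search() accepts a match anywhere in the name (an assumption of this sketch).
    return [name for name in column_names if regex.search(name)]
```

For a hypothetical table with columns aa, ab and bc, the pattern 'a' keeps aa and ab, whereas * would keep all three.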

Backward Incompatible Change

  • The setting input_format_defaults_for_omitted_fields is enabled by default. Inserts in Distributed tables need this setting to be the same on cluster (you need to set it before rolling update). It enables calculation of complex default expressions for omitted fields in JSONEachRow and CSV* formats. It should be the expected behavior but may lead to negligible performance difference. #6043 (Artem Zuikov), #5625 (akuzm)
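
The effect of calculating default expressions for omitted fields can be sketched in Python for a JSONEachRow-like input. Everything here is illustrative: the function name, the table layout and the representation of default expressions as callables are assumptions, not the ClickHouse implementation.

```python
import json


def parse_row(line, defaults):
    """Parse one JSON line and fill omitted fields from default expressions."""
    row = json.loads(line)
    for column, default_expr in defaults.items():
        if column not in row:
            # A default may depend on other columns, so it receives the row so far.
            row[column] = default_expr(row)
    return row


# Hypothetical table: column a defaults to 0, column b defaults to a * 2.
defaults = {"a": lambda row: 0, "b": lambda row: row["a"] * 2}
```

With these defaults, the input line {"a": 5} yields a row where b is computed as 10, while an explicitly supplied b is left untouched.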

Experimental features

  • New query processing pipeline. Use experimental_use_processors=1 option to enable it. Use at your own risk. #4914 (Nikolai Kochetov)

Bug Fix

  • Kafka integration has been fixed in this version.
  • Fixed DoubleDelta encoding of Int64 for large DoubleDelta values, improved DoubleDelta encoding for random data for Int32. #5998 (Vasily Nemkov)
  • Fixed overestimation of max_rows_to_read if the setting merge_tree_uniform_read_distribution is set to 0. #6019 (alexey-milovidov)

Improvement

  • Throws an exception if config.d file doesn't have the corresponding root element as the config file #6123 (dimarub2000)

Performance Improvement

  • Optimize count(). Now it uses the smallest column (if possible). #6028 (Amos Bird)

Build/Testing/Packaging Improvement

  • Report memory usage in performance tests. #5899 (akuzm)
  • Fix build with external libcxx #6010 (Ivan)
  • Fix shared build with rdkafka library #6101 (Ivan)

ClickHouse release 19.11

ClickHouse release 19.11.13.74, 2019-11-01

Bug Fix

  • Fixed rare crash in ALTER MODIFY COLUMN and vertical merge when one of merged/altered parts is empty (0 rows). #6780 (alesapin)
  • Manual update of SIMDJSON. This fixes possible flooding of stderr files with bogus json diagnostic messages. #7548 (Alexander Kazakov)
  • Fixed bug with mrk file extension for mutations (alesapin)

ClickHouse release 19.11.12.69, 2019-10-02

Bug Fix

  • Fixed performance degradation of index analysis on complex keys on large tables. This fixes #6924. #7075 (alexey-milovidov)
  • Avoid rare SIGSEGV while sending data in tables with Distributed engine (Failed to send batch: file with index XXXXX is absent). #7032 (Azat Khuzhin)
  • Fix Unknown identifier with multiple joins. This fixes #5254. #7022 (Artem Zuikov)

ClickHouse release 19.11.11.57, 2019-09-13

  • Fix logical error causing segfaults when selecting from Kafka empty topic. #6902 #6909 (Ivan)
  • Fix for function arrayEnumerateUniqRanked with empty arrays in params. #6928 (proller)

ClickHouse release 19.11.10.54, 2019-09-10

Bug Fix

  • Do store offsets for Kafka messages manually to be able to commit them all at once for all partitions. Fixes potential duplication in "one consumer - many partitions" scenario. #6872 (Ivan)

ClickHouse release 19.11.9.52, 2019-09-06

Security Fix

  • If the attacker has write access to ZooKeeper and is able to run custom server available from the network where ClickHouse runs, it can create custom-built malicious server that will act as ClickHouse replica and register it in ZooKeeper. When another replica will fetch data part from malicious replica, it can force clickhouse-server to write to arbitrary path on filesystem. Found by Eldar Zaitov, information security team at Yandex. #6247 (alexey-milovidov)

ClickHouse release 19.11.8.46, 2019-08-22

Bug Fix

  • Fix ALTER TABLE ... UPDATE query for tables with enable_mixed_granularity_parts=1. #6543 (alesapin)
  • Fix NPE when using IN clause with a subquery with a tuple. #6125 #6550 (tavplubix)
  • Fixed an issue that if a stale replica becomes alive, it may still have data parts that were removed by DROP PARTITION. #6522 #6523 (tavplubix)
  • Fixed issue with parsing CSV #6426 #6559 (tavplubix)
  • Fixed data race in system.parts table and ALTER query. This fixes #6245. #6513 (alexey-milovidov)
  • Fixed wrong code in mutations that may lead to memory corruption. Fixed segfault with read of address 0x14c0 that may happen due to concurrent DROP TABLE and SELECT from system.parts or system.parts_columns. Fixed race condition in preparation of mutation queries. Fixed deadlock caused by OPTIMIZE of Replicated tables and concurrent modification operations like ALTERs. #6514 (alexey-milovidov)

ClickHouse release 19.11.7.40, 2019-08-14

Bug Fix

  • Kafka integration has been fixed in this version.
  • Fix segfault when using arrayReduce for constant arguments. #6326 (alexey-milovidov)
  • Fixed toFloat() monotonicity. #6374 (dimarub2000)
  • Fix segfault with enabled optimize_skip_unused_shards and missing sharding key. #6384 (CurtizJ)
  • Fixed logic of arrayEnumerateUniqRanked function. #6423 (alexey-milovidov)
  • Removed extra verbose logging from MySQL handler. #6389 (alexey-milovidov)
  • Fix wrong behavior and possible segfaults in topK and topKWeighted aggregate functions. #6404 (CurtizJ)
  • Do not expose virtual columns in system.columns table. This is required for backward compatibility. #6406 (alexey-milovidov)
  • Fix bug with memory allocation for string fields in complex key cache dictionary. #6447 (alesapin)
  • Fix bug with enabling adaptive granularity when creating new replica for Replicated*MergeTree table. #6452 (alesapin)
  • Fix infinite loop when reading Kafka messages. #6354 (abyss7)
  • Fixed the possibility of a fabricated query to cause server crash due to stack overflow in SQL parser and possibility of stack overflow in Merge and Distributed tables #6433 (alexey-milovidov)
  • Fixed Gorilla encoding error on small sequences. #6444 (Enmk)

Improvement

  • Allow user to override poll_interval and idle_connection_timeout settings on connection. #6230 (alexey-milovidov)

ClickHouse release 19.11.5.28, 2019-08-05

Bug Fix

  • Fixed the possibility of hanging queries when server is overloaded. #6301 (alexey-milovidov)
  • Fix FPE in yandexConsistentHash function. This fixes #6304. #6126 (alexey-milovidov)
  • Fixed bug in conversion of LowCardinality types in AggregateFunctionFactory. This fixes #6257. #6281 (Nikolai Kochetov)
  • Fix parsing of bool settings from true and false strings in configuration files. #6278 (alesapin)
  • Fix rare bug with incompatible stream headers in queries to Distributed table over MergeTree table when part of WHERE moves to PREWHERE. #6236 (alesapin)
  • Fixed overflow in integer division of signed type to unsigned type. This fixes #6214. #6233 (alexey-milovidov)

Backward Incompatible Change

  • Kafka is still broken in this version.

ClickHouse release 19.11.4.24, 2019-08-01

Bug Fix

  • Fix bug with writing secondary indices marks with adaptive granularity. #6126 (alesapin)
  • Fix WITH ROLLUP and WITH CUBE modifiers of GROUP BY with two-level aggregation. #6225 (Anton Popov)
  • Fixed hang in JSONExtractRaw function. Fixed #6195 #6198 (alexey-milovidov)
  • Fix segfault in ExternalLoader::reloadOutdated(). #6082 (Vitaly Baranov)
  • Fixed the case when server may close listening sockets but not shutdown and continue serving remaining queries. You may end up with two running clickhouse-server processes. Sometimes, the server may return an error bad_function_call for remaining queries. #6231 (alexey-milovidov)
  • Fixed useless and incorrect condition on update field for initial loading of external dictionaries via ODBC, MySQL, ClickHouse and HTTP. This fixes #6069 #6083 (alexey-milovidov)
  • Fixed irrelevant exception in cast of LowCardinality(Nullable) to not-Nullable column in case if it doesn't contain Nulls (e.g. in query like SELECT CAST(CAST('Hello' AS LowCardinality(Nullable(String))) AS String). #6094 #6119 (Nikolai Kochetov)
  • Fix non-deterministic result of "uniq" aggregate function in extremely rare cases. The bug was present in all ClickHouse versions. #6058 (alexey-milovidov)
  • Fixed segfault when a slightly too high CIDR is set in the function IPv6CIDRToRange. #6068 (Guillaume Tassery)
  • Fixed small memory leak when the server throws many exceptions from many different contexts. #6144 (alexey-milovidov)
  • Fix the situation when consumer got paused before subscription and not resumed afterwards. #6075 (Ivan) Note that Kafka is broken in this version.
  • Clearing the Kafka data buffer from the previous read operation that was completed with an error #6026 (Nikolay) Note that Kafka is broken in this version.
  • Since StorageMergeTree::background_task_handle is initialized in startup() the MergeTreeBlockOutputStream::write() may try to use it before initialization. Just check if it is initialized. #6080 (Ivan)

Build/Testing/Packaging Improvement

Backward Incompatible Change

  • Kafka is broken in this version.

ClickHouse release 19.11.3.11, 2019-07-18

New Feature

  • Added support for prepared statements. #5331 (Alexander) #5630 (alexey-milovidov)
  • DoubleDelta and Gorilla column codecs #5600 (Vasily Nemkov)
  • Added os_thread_priority setting that allows to control the "nice" value of query processing threads that is used by OS to adjust dynamic scheduling priority. It requires CAP_SYS_NICE capabilities to work. This implements #5858 #5909 (alexey-milovidov)
  • Implement _topic, _offset, _key columns for Kafka engine #5382 (Ivan) Note that Kafka is broken in this version.
  • Add aggregate function combinator -Resample #5590 (hcz)
  • Aggregate functions groupArrayMovingSum(win_size)(x) and groupArrayMovingAvg(win_size)(x), which calculate moving sum/avg with or without window-size limitation. #5595 (inv2004)
  • Add synonym arrayFlatten <-> flatten #5764 (hcz)
  • Integrate H3 function geoToH3 from Uber. #4724 (Remen Ivan) #5805 (alexey-milovidov)
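
The moving-sum behaviour of groupArrayMovingSum(win_size)(x) from the list above can be sketched in Python. This is an illustrative reimplementation under the assumption that each output element is the sum of the current value and up to win_size - 1 preceding values, and that omitting the window gives a running total; the function name and signature are hypothetical.

```python
from collections import deque


def moving_sum(values, win_size=None):
    """Sum of the last win_size values at each position (all values if None)."""
    window = deque()
    total = 0
    result = []
    for value in values:
        window.append(value)
        total += value
        # Drop the oldest value once the window is over-full.
        if win_size is not None and len(window) > win_size:
            total -= window.popleft()
        result.append(total)
    return result
```

Keeping a running total and subtracting the value that leaves the window makes each step constant time, instead of re-summing the window for every element.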

Bug Fix

  • Implement DNS cache with asynchronous update. A separate thread resolves all hosts and updates the DNS cache periodically (setting dns_cache_update_period). This should help when the IP addresses of hosts change frequently. #5857 (Anton Popov)
  • Fix segfault in Delta codec which affects columns with values less than 32 bits size. The bug led to random memory corruption. #5786 (alesapin)
  • Fix segfault in TTL merge with non-physical columns in block. #5819 (Anton Popov)
  • Fix rare bug in checking of part with LowCardinality column. Previously checkDataPart always failed for part with LowCardinality column. #5832 (alesapin)
  • Avoid hanging connections when server thread pool is full. It is important for connections from remote table function or connections to a shard without replicas when there is long connection timeout. This fixes #5878 #5881 (alexey-milovidov)
  • Support for constant arguments to evalMLModel function. This fixes #5817 #5820 (alexey-milovidov)
  • Fixed the issue when ClickHouse determines default time zone as UCT instead of UTC. This fixes #5804. #5828 (alexey-milovidov)
  • Fixed buffer underflow in visitParamExtractRaw. This fixes #5901 #5902 (alexey-milovidov)
  • Now distributed DROP/ALTER/TRUNCATE/OPTIMIZE ON CLUSTER queries will be executed directly on leader replica. #5757 (alesapin)
  • Fix coalesce for ColumnConst with ColumnNullable + related changes. #5755 (Artem Zuikov)
  • Fix the ReadBufferFromKafkaConsumer so that it keeps reading new messages after commit() even if it was stalled before #5852 (Ivan)
  • Fix FULL and RIGHT JOIN results when joining on Nullable keys in right table. #5859 (Artem Zuikov)
  • Possible fix of infinite sleeping of low-priority queries. #5842 (alexey-milovidov)
  • Fix race condition that caused some queries not to appear in query_log after SYSTEM FLUSH LOGS query. #5456 #5685 (Anton Popov)
  • Fixed heap-use-after-free ASan warning in ClusterCopier caused by a watch that tried to use an already removed copier object. #5871 (Nikolai Kochetov)
  • Fixed wrong StringRef pointer returned by some implementations of IColumn::deserializeAndInsertFromArena. This bug affected only unit-tests. #5973 (Nikolai Kochetov)
  • Prevent source and intermediate array join columns from masking columns with the same name. #5941 (Artem Zuikov)
  • Fix insert and select query to MySQL engine with MySQL style identifier quoting. #5704 (Winter Zhang)
  • Now CHECK TABLE query can work with MergeTree engine family. It returns check status and message if any for each part (or file in case of simpler engines). Also, fix bug in fetch of a broken part. #5865 (alesapin)
  • Fix SPLIT_SHARED_LIBRARIES runtime #5793 (Danila Kutenin)
  • Fixed time zone initialization when /etc/localtime is a relative symlink like ../usr/share/zoneinfo/Europe/Moscow #5922 (alexey-milovidov)
  • clickhouse-copier: Fix use-after-free on shutdown #5752 (proller)
  • Updated simdjson. Fixed the issue that some invalid JSONs with zero bytes successfully parse. #5938 (alexey-milovidov)
  • Fix shutdown of SystemLogs #5802 (Anton Popov)
  • Fix hanging when condition in invalidate_query depends on a dictionary. #6011 (Vitaly Baranov)

Improvement

  • Allow unresolvable addresses in cluster configuration. They will be considered unavailable and tried to resolve at every connection attempt. This is especially useful for Kubernetes. This fixes #5714 #5924 (alexey-milovidov)
  • Close idle TCP connections (with one hour timeout by default). This is especially important for large clusters with multiple distributed tables on every server, because every server can possibly keep a connection pool to every other server, and after peak query concurrency, connections will stall. This fixes #5879 #5880 (alexey-milovidov)
  • Better quality of topK function. Changed the SavingSpace set behavior to remove the last element if the new element has a bigger weight. #5833 #5850 (Guillaume Tassery)
  • URL functions to work with domains now can work for incomplete URLs without scheme #5725 (alesapin)
  • Checksums added to the system.parts_columns table. #5874 (Nikita Mikhaylov)
  • Added Enum data type as a synonym for Enum8 or Enum16. #5886 (dimarub2000)
  • Full bit transpose variant for T64 codec. Could lead to better compression with zstd. #5742 (Artem Zuikov)
  • Condition on startsWith function can now use the primary key. This fixes #5310 and #5882 #5919 (dimarub2000)
  • Allow to use clickhouse-copier with cross-replication cluster topology by permitting empty database name. #5745 (nvartolomei)
  • Use UTC as default timezone on a system without tzdata (e.g. bare Docker container). Before this patch, error message Could not determine local time zone was printed and server or client refused to start. #5827 (alexey-milovidov)
  • Returned back support for floating point argument in function quantileTiming for backward compatibility. #5911 (alexey-milovidov)
  • Show which table is missing column in error messages. #5768 (Ivan)
  • Disallow running queries with the same query_id by different users #5430 (proller)
  • More robust code for sending metrics to Graphite. It will work even during long multiple RENAME TABLE operation. #5875 (alexey-milovidov)
  • More informative error messages will be displayed when ThreadPool cannot schedule a task for execution. This fixes #5305 #5801 (alexey-milovidov)
  • Inverting ngramSearch to be more intuitive #5807 (Danila Kutenin)
  • Add user parsing in HDFS engine builder #5946 (akonyaev90)
  • Update default value of max_ast_elements parameter #5933 (Artem Konovalov)
  • Added a notion of obsolete settings. The obsolete setting allow_experimental_low_cardinality_type can be used with no effect. 0f15c01c6802f7ce1a1494c12c846be8c98944cd Alexey Milovidov
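
One way to see why a startsWith condition (mentioned in the list above) can use a sorted primary key is that a prefix condition corresponds to a key range. The sketch below is an assumption about the general technique, not a description of ClickHouse's index analysis; the function names are hypothetical and keys are modelled as sorted Python bytes values.

```python
from bisect import bisect_left


def prefix_range(prefix: bytes):
    """Return (lower, upper) with lower <= s < upper for every s starting with prefix.

    upper is None when no finite upper bound exists (empty or all-0xFF prefix).
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, None
    # Increment the last byte that can still be incremented.
    upper = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, upper


def rows_with_prefix(sorted_keys, prefix: bytes):
    # Two binary searches replace a scan over every key.
    lower, upper = prefix_range(prefix)
    start = bisect_left(sorted_keys, lower)
    end = len(sorted_keys) if upper is None else bisect_left(sorted_keys, upper)
    return sorted_keys[start:end]
```

For the prefix b"ab" the range is from b"ab" (inclusive) to b"ac" (exclusive), so only the matching slice of the sorted keys needs to be read.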

Performance Improvement

  • Increase number of streams to SELECT from Merge table for more uniform distribution of threads. Added setting max_streams_multiplier_for_merge_tables. This fixes #5797 #5915 (alexey-milovidov)

Build/Testing/Packaging Improvement

Backward Incompatible Change

  • Kafka is broken in this version.
  • Enable adaptive_index_granularity = 10MB by default for new MergeTree tables. If you created new MergeTree tables on version 19.11+, downgrade to versions prior to 19.6 will be impossible. #5628 (alesapin)
  • Removed obsolete undocumented embedded dictionaries that were used by Yandex.Metrica. The functions OSIn, SEIn, OSToRoot, SEToRoot, OSHierarchy, SEHierarchy are no longer available. If you are using these functions, write email to clickhouse-feedback@yandex-team.com. Note: at the last moment we decided to keep these functions for a while. #5780 (alexey-milovidov)

ClickHouse release 19.10

ClickHouse release 19.10.1.5, 2019-07-12

New Feature

  • Add new column codec: T64. Made for (U)IntX/EnumX/Date(Time)/DecimalX columns. It should be good for columns with constant or small range values. Codec itself allows enlarging or shrinking the data type without re-compression. #5557 (Artem Zuikov)
  • Add database engine MySQL that allows viewing all the tables in a remote MySQL server #5599 (Winter Zhang)
  • bitmapContains implementation. It's 2x faster than bitmapHasAny if the second bitmap contains one element. #5535 (Zhichang Yu)
  • Support for crc32 function (with behaviour exactly as in MySQL or PHP). Do not use it if you need a hash function. #5661 (Remen Ivan)
  • Implemented SYSTEM START/STOP DISTRIBUTED SENDS queries to control asynchronous inserts into Distributed tables. #4935 (Winter Zhang)
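
For the crc32 entry above, the following Python sketch shows the checksum in question, under the assumption that MySQL's CRC32() and PHP's crc32() compute the standard CRC-32 (the same one provided by zlib). The wrapper function is hypothetical and is not the ClickHouse implementation.

```python
import zlib


def crc32(data: bytes) -> int:
    """Standard CRC-32 of data as an unsigned 32-bit integer."""
    # The mask keeps the result in the unsigned 32-bit range.
    return zlib.crc32(data) & 0xFFFFFFFF


# 0xCBF43926 is the well-known check value of CRC-32 for the ASCII string "123456789".
assert crc32(b"123456789") == 0xCBF43926
```

CRC-32 is designed to detect accidental corruption, not to distribute keys uniformly or to resist deliberate collisions, which is consistent with the entry's advice not to use it as a hash function.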

Bug Fix

  • Ignore query execution limits and max parts size for merge limits while executing mutations. #5659 (Anton Popov)
  • Fix bug which may lead to deduplication of normal blocks (extremely rare) and insertion of duplicate blocks (more often). #5549 (alesapin)
  • Fix of function arrayEnumerateUniqRanked for arguments with empty arrays #5559 (proller)
  • Don't subscribe to Kafka topics without intent to poll any messages. #5698 (Ivan)
  • Make setting join_use_nulls have no effect for types that cannot be inside Nullable #5700 (Olga Khvostikova)
  • Fixed Incorrect size of index granularity errors #5720 (coraxster)
  • Fix Float to Decimal conversion overflow #5607 (coraxster)
  • Flush buffer when WriteBufferFromHDFS's destructor is called. This fixes writing into HDFS. #5684 (Xindong Peng)

Improvement

  • Treat empty cells in CSV as default values when the setting input_format_defaults_for_omitted_fields is enabled. #5625 (akuzm)
  • Non-blocking loading of external dictionaries. #5567 (Vitaly Baranov)
  • Network timeouts can be dynamically changed for already established connections according to the settings. #4558 (Konstantin Podshumok)
  • Using "public_suffix_list" for functions firstSignificantSubdomain, cutToFirstSignificantSubdomain. It's using a perfect hash table generated by gperf with a list generated from the file: https://publicsuffix.org/list/public_suffix_list.dat. (for example, now we recognize the domain ac.uk as non-significant). #5030 (Guillaume Tassery)
  • Adopted IPv6 data type in system tables; unified client info columns in system.processes and system.query_log #5640 (alexey-milovidov)
  • Using sessions for connections with MySQL compatibility protocol. #5476 #5646 (Yuriy Baranov)
  • Support more ALTER queries ON CLUSTER. #5593 #5613 (sundyli)
  • Support <logger> section in clickhouse-local config file. #5540 (proller)
  • Allow running queries with remote table function in clickhouse-local #5627 (proller)
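
The public suffix list entry above (where ac.uk is recognized as non-significant) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the suffix set is a tiny hypothetical excerpt, wildcard and exception rules of the real list are ignored, the function is not ClickHouse's firstSignificantSubdomain, and returning an empty string for unknown suffixes is a choice made only for this example.

```python
def first_significant_subdomain(domain, public_suffixes):
    """Return the label immediately left of the longest matching public suffix."""
    labels = domain.lower().split(".")
    # Longest candidate suffix first; index 0 is skipped because a suffix that
    # covers the whole domain leaves no label to its left.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in public_suffixes:
            return labels[i - 1]
    return ""


# Hypothetical excerpt; the real list is far larger.
SUFFIXES = {"uk", "ac.uk", "com"}
```

With ac.uk in the set, www.cam.ac.uk yields cam; with only uk present, the same domain would yield ac, which is the kind of result the public suffix list is meant to avoid.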

Performance Improvement

  • Add the possibility to write the final mark at the end of MergeTree columns. It allows to avoid useless reads for keys that are out of table data range. It is enabled only if adaptive index granularity is in use. #5624 (alesapin)
  • Improved performance of MergeTree tables on very slow filesystems by reducing number of stat syscalls. #5648 (alexey-milovidov)
  • Fixed performance degradation in reading from MergeTree tables that was introduced in version 19.6. Fixes #5631. #5633 (alexey-milovidov)

Build/Testing/Packaging Improvement

  • Implemented TestKeeper as an implementation of ZooKeeper interface used for testing #5643 (alexey-milovidov) (levushkin aleksej)
  • From now on .sql tests can be run isolated by server, in parallel, with random database. It allows to run them faster, add new tests with custom server configurations, and be sure that different tests don't affect each other. #5554 (Ivan)
  • Remove <name> and <metrics> from performance tests #5672 (Olga Khvostikova)
  • Fixed "select_format" performance test for Pretty formats #5642 (alexey-milovidov)

ClickHouse release 19.9

ClickHouse release 19.9.3.31, 2019-07-05

Bug Fix

  • Fix segfault in Delta codec which affects columns with values less than 32 bits size. The bug led to random memory corruption. #5786 (alesapin)
  • Fix rare bug in checking of part with LowCardinality column. #5832 (alesapin)
  • Fix segfault in TTL merge with non-physical columns in block. #5819 (Anton Popov)
  • Fix potential infinite sleeping of low-priority queries. #5842 (alexey-milovidov)
  • Fixed the issue when ClickHouse determines default time zone as UCT instead of UTC. #5828 (alexey-milovidov)
  • Fix bug about executing distributed DROP/ALTER/TRUNCATE/OPTIMIZE ON CLUSTER queries on follower replica before leader replica. Now they will be executed directly on leader replica. #5757 (alesapin)
  • Fix race condition that caused some queries not to appear in query_log instantly after SYSTEM FLUSH LOGS query. #5685 (Anton Popov)
  • Added missing support for constant arguments to evalMLModel function. #5820 (alexey-milovidov)

ClickHouse release 19.9.2.4, 2019-06-24

New Feature

  • Print information about frozen parts in system.parts table. #5471 (proller)
  • Ask for the client password when clickhouse-client starts on a tty if it is not set in arguments #5092 (proller)
  • Implement dictGet and dictGetOrDefault functions for Decimal types. #5394 (Artem Zuikov)

Improvement

  • Debian init: Add service stop timeout #5522 (proller)
  • Add a setting that forbids, by default, creating tables with suspicious types for LowCardinality #5448 (Olga Khvostikova)
  • Regression functions return model weights when not used as State in function evalMLMethod. #5411 (Quid37)
  • Rename and improve regression methods. #5492 (Quid37)
  • Clearer interfaces of string searchers. #5586 (Danila Kutenin)

Bug Fix

  • Fix potential data loss in Kafka #5445 (Ivan)
  • Fix potential infinite loop in PrettySpace format when called with zero columns #5560 (Olga Khvostikova)
  • Fixed UInt32 overflow bug in linear models. Allow eval ML model for non-const model argument. #5516 (Nikolai Kochetov)
  • ALTER TABLE ... DROP INDEX IF EXISTS ... should not raise an exception if provided index does not exist #5524 (Gleb Novikov)
  • Fix segfault with bitmapHasAny in scalar subquery #5528 (Zhichang Yu)
  • Fixed error when replication connection pool doesn't retry to resolve host, even when DNS cache was dropped. #5534 (alesapin)
  • Fixed ALTER ... MODIFY TTL on ReplicatedMergeTree. #5539 (Anton Popov)
  • Fix INSERT into Distributed table with MATERIALIZED column #5429 (Azat Khuzhin)
  • Fix bad alloc when truncating Join storage #5437 (TCeason)
  • In recent versions of the tzdata package, some of the files are now symlinks. The current mechanism for detecting the default timezone gets broken and gives wrong names for some timezones. Now at least we force the timezone name to the contents of TZ if provided. #5443 (Ivan)
  • Fix some extremely rare cases with MultiVolnitsky searcher when the constant needles in sum are at least 16KB long. The algorithm missed or overwrote the previous results which can lead to the incorrect result of multiSearchAny. #5588 (Danila Kutenin)
  • Fix the issue when settings for ExternalData requests couldn't use ClickHouse settings. Also, for now, settings date_time_input_format and low_cardinality_allow_in_native_format cannot be used because of the ambiguity of names (in external data it can be interpreted as table format and in the query it can be a setting). #5455 (Danila Kutenin)
  • Fix bug when parts were removed only from FS without dropping them from Zookeeper. #5520 (alesapin)
  • Remove debug logging from MySQL protocol #5478 (alexey-milovidov)
  • Skip ZNONODE during DDL query processing #5489 (Azat Khuzhin)
  • Fix mix UNION ALL result column type. There were cases with inconsistent data and column types of resulting columns. #5503 (Artem Zuikov)
  • Throw an exception on wrong integers in dictGetT functions instead of crashing. #5446 (Artem Zuikov)
  • Fix wrong element_count and load_factor for hashed dictionary in system.dictionaries table. #5440 (Azat Khuzhin)

Build/Testing/Packaging Improvement

ClickHouse release 19.8

ClickHouse release 19.8.3.8, 2019-06-11

New Features

  • Added functions to work with JSON #4686 (hcz) #5124 (Vitaly Baranov)
  • Add a function basename, with behaviour similar to the basename function that exists in a lot of languages (os.path.basename in Python, basename in PHP, etc.). It works with both UNIX-like and Windows paths. #5136 (Guillaume Tassery)
  • Added LIMIT n, m BY or LIMIT m OFFSET n BY syntax to set offset of n for LIMIT BY clause. #5138 (Anton Popov)
  • Added new data type SimpleAggregateFunction, which allows columns with light aggregation in an AggregatingMergeTree. This can only be used with simple functions like any, anyLast, sum, min, max. #4629 (Boris Granveaud)
  • Added support for non-constant arguments in function ngramDistance #5198 (Danila Kutenin)
  • Added functions skewPop, skewSamp, kurtPop and kurtSamp to compute the skewness, sample skewness, kurtosis and sample kurtosis of a sequence, respectively. #5200 (hcz)
  • Support rename operation for MaterializedView storage. #5209 (Guillaume Tassery)
  • Added server which allows connecting to ClickHouse using MySQL client. #4715 (Yuriy Baranov)
  • Add toDecimal*OrZero and toDecimal*OrNull functions. #5291 (Artem Zuikov)
  • Support Decimal types in functions: quantile, quantiles, median, quantileExactWeighted, quantilesExactWeighted, medianExactWeighted. #5304 (Artem Zuikov)
  • Added toValidUTF8 function, which replaces all invalid UTF-8 characters by replacement character � (U+FFFD). #5322 (Danila Kutenin)
  • Added format function. Formatting constant pattern (simplified Python format pattern) with the strings listed in the arguments. #5330 (Danila Kutenin)
  • Added system.detached_parts table containing information about detached parts of MergeTree tables. #5353 (akuzm)
  • Added ngramSearch function to calculate the non-symmetric difference between needle and haystack. #5418 #5422 (Danila Kutenin)
  • Implementation of basic machine learning methods (stochastic linear regression and logistic regression) using aggregate functions interface. Has different strategies for updating model weights (simple gradient descent, momentum method, Nesterov method). Also supports mini-batches of custom size. #4943 (Quid37)
  • Implementation of geohashEncode and geohashDecode functions. #5003 (Vasily Nemkov)
  • Added aggregate function timeSeriesGroupSum, which can aggregate different time series whose sample timestamps are not aligned. It uses linear interpolation between two sample timestamps and then sums the time series together. Added aggregate function timeSeriesGroupRateSum, which calculates the rate of each time series and then sums the rates together. #4542 (Yangkuan Liu)
  • Added functions IPv4CIDRtoIPv4Range and IPv6CIDRtoIPv6Range to calculate the lower and upper bounds for an IP in the subnet using a CIDR. #5095 (Guillaume Tassery)
  • Add an X-ClickHouse-Summary header when we send a query using HTTP with enabled setting send_progress_in_http_headers. It returns the usual information of X-ClickHouse-Progress, with additional information like how many rows and bytes were inserted in the query. #5116 (Guillaume Tassery)
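
The bounds computed by a function such as IPv4CIDRtoIPv4Range can be illustrated outside ClickHouse. The following is a minimal Python sketch using the standard ipaddress module; the helper name ipv4_cidr_to_range is hypothetical and this is not the server's implementation.

```python
import ipaddress
from typing import Tuple


def ipv4_cidr_to_range(address: str, prefix_len: int) -> Tuple[str, str]:
    """Return the lowest and highest addresses of the subnet containing `address`."""
    # strict=False accepts a host address instead of requiring the network address.
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address), str(network.broadcast_address)


print(ipv4_cidr_to_range("192.168.5.2", 16))
```

For 192.168.5.2 with a 16-bit prefix, the sketch yields the pair 192.168.0.0 and 192.168.255.255.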

Improvements

  • Added max_parts_in_total setting for MergeTree family of tables (default: 100 000) that prevents unsafe specification of partition key #5166. #5171 (alexey-milovidov)
  • clickhouse-obfuscator: derive seed for individual columns by combining initial seed with column name, not column position. This is intended to transform datasets with multiple related tables, so that tables will remain JOINable after transformation. #5178 (alexey-milovidov)
  • Added functions JSONExtractRaw, JSONExtractKeyAndValues. Renamed functions jsonExtract<type> to JSONExtract<type>. When something goes wrong these functions return the corresponding values, not NULL. Modified function JSONExtract, now it gets the return type from its last parameter and doesn't inject nullables. Implemented fallback to RapidJSON in case AVX2 instructions are not available. Simdjson library updated to a new version. #5235 (Vitaly Baranov)
  • Now if and multiIf functions don't rely on the condition's Nullable, but rely on the branches for SQL compatibility. #5238 (Jian Wu)
  • In predicate now generates Null result from Null input like the Equal function. #5152 (Jian Wu)
  • Check the time limit every (flush_interval / poll_timeout) number of rows from Kafka. This allows breaking the reading from the Kafka consumer more frequently and checking the time limits for the top-level streams #5249 (Ivan)
  • Link rdkafka with bundled SASL. It should allow using SASL SCRAM authentication #5253 (Ivan)
  • Batched version of RowRefList for ALL JOINS. #5267 (Artem Zuikov)
  • clickhouse-server: more informative listen error messages. #5268 (proller)
  • Support dictionaries in clickhouse-copier for functions in <sharding_key> #5270 (proller)
  • Add new setting kafka_commit_every_batch to regulate Kafka committing policy. It allows setting the commit mode: after every batch of messages is handled, or after the whole block is written to the storage. It's a trade-off between losing some messages or reading them twice in some extreme situations. #5308 (Ivan)
  • Make windowFunnel support other Unsigned Integer Types. #5320 (sundyli)
  • Allow to shadow virtual column _table in Merge engine. #5325 (Ivan)
  • Make sequenceMatch aggregate functions support other unsigned Integer types #5339 (sundyli)
  • Better error messages if checksum mismatch is most likely caused by hardware failures. #5355 (alexey-milovidov)
  • Check that underlying tables support sampling for StorageMerge #5366 (Ivan)
  • Close MySQL connections after their usage in external dictionaries. It is related to issue #893. #5395 (Clément Rodriguez)
  • Improvements of MySQL Wire Protocol. Changed name of format to MySQLWire. Using RAII for calling RSA_free. Disabling SSL if context cannot be created. #5419 (Yuriy Baranov)
  • clickhouse-client: allow to run with inaccessible history file (read-only, no disk space, file is directory, ...). #5431 (proller)
  • Respect query settings in asynchronous INSERTs into Distributed tables. #4936 (TCeason)
  • Renamed functions leastSqr to simpleLinearRegression, LinearRegression to linearRegression, LogisticRegression to logisticRegression. #5391 (Nikolai Kochetov)
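
The clickhouse-obfuscator change above (deriving each column's seed from the initial seed and the column name rather than the column position) can be sketched as follows. This is an illustration only: SHA-256 is a stand-in, not the hash the tool uses, and column_seed is a hypothetical helper.

```python
import hashlib


def column_seed(initial_seed: int, column_name: str) -> int:
    """Derive a per-column seed from the initial seed and the column name."""
    # Assumption: any stable hash works for the illustration; SHA-256 is a stand-in.
    data = f"{initial_seed}:{column_name}".encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "little")


# The same column name gets the same seed regardless of its position in a table,
# so values of a shared key column are transformed identically in every table.
users_columns = ["user_id", "name"]
orders_columns = ["order_id", "amount", "user_id"]
print(column_seed(42, users_columns[0]) == column_seed(42, orders_columns[2]))
```

Because the seed depends only on the name, a key column shared by several tables is obfuscated consistently, which is what keeps the tables JOINable.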

Performance Improvements

  • Parallelize processing of parts of non-replicated MergeTree tables in ALTER MODIFY query. #4639 (Ivan Kush)
  • Optimizations in regular expressions extraction. #5193 #5191 (Danila Kutenin)
  • Do not add right join key column to join result if it's used only in join on section. #5260 (Artem Zuikov)
  • Freeze the Kafka buffer after first empty response. It avoids multiple invocations of ReadBuffer::next() for empty result in some row-parsing streams. #5283 (Ivan)
  • concat function optimization for multiple arguments. #5357 (Danila Kutenin)
  • Query optimisation. Allow pushing down IN statement while rewriting comma/cross join into inner one. #5396 (Artem Zuikov)
  • Upgrade our LZ4 implementation with the reference one to have faster decompression. #5070 (Danila Kutenin)
  • Implemented MSD radix sort (based on kxsort), and partial sorting. #5129 (Evgenii Pravda)

Bug Fixes

  • Fix push require columns with join #5192 (Winter Zhang)
  • Fixed a bug where, when ClickHouse is run by systemd, the command sudo service clickhouse-server forcerestart did not work as expected. #5204 (proller)
  • Fix http error codes in DataPartsExchange (interserver http server on 9009 port always returned code 200, even on errors). #5216 (proller)
  • Fix SimpleAggregateFunction for String longer than MAX_SMALL_STRING_SIZE #5311 (Azat Khuzhin)
  • Fix error for Decimal to Nullable(Decimal) conversion in IN. Support other Decimal to Decimal conversions (including different scales). #5350 (Artem Zuikov)
  • Fixed FPU clobbering in simdjson library that led to wrong calculation of uniqHLL and uniqCombined aggregate function and math functions such as log. #5354 (alexey-milovidov)
  • Fixed handling mixed const/nonconst cases in JSON functions. #5435 (Vitaly Baranov)
  • Fix retention function. Now all conditions that are satisfied in a row of data are added to the data state. #5119 (小路)
  • Fix result type for quantileExact with Decimals. #5304 (Artem Zuikov)

Documentation

Build/Testing/Packaging Improvements

ClickHouse release 19.7

ClickHouse release 19.7.5.29, 2019-07-05

Bug Fix

ClickHouse release 19.7.5.27, 2019-06-09

New features

  • Added bitmap related functions bitmapHasAny and bitmapHasAll analogous to hasAny and hasAll functions for arrays. #5279 (Sergi Vladykin)
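
The semantics that bitmapHasAny and bitmapHasAll share with hasAny and hasAll can be modelled with plain Python sets. This is a sketch of the set logic only, with hypothetical helper names; it says nothing about the Roaring Bitmap storage used by the server.

```python
def bitmap_has_any(a: set, b: set) -> bool:
    """True if the two sets share at least one element."""
    return not a.isdisjoint(b)


def bitmap_has_all(a: set, b: set) -> bool:
    """True if every element of b is also in a."""
    return b <= a


print(bitmap_has_any({1, 2, 3}, {3, 4}), bitmap_has_all({1, 2, 3}, {3, 4}))
```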

Bug Fixes

  • Fix segfault on minmax INDEX with Null value. #5246 (Nikita Vasilev)
  • Mark all input columns in LIMIT BY as required output. It fixes 'Not found column' error in some distributed queries. #5407 (Constantin S. Pan)
  • Fix "Column '0' already exists" error in SELECT .. PREWHERE on column with DEFAULT #5397 (proller)
  • Fix ALTER MODIFY TTL query on ReplicatedMergeTree. #5539 (Anton Popov)
  • Don't crash the server when Kafka consumers have failed to start. #5285 (Ivan)
  • Fixed bitmap functions producing wrong results. #5359 (Andy Yang)
  • Fix element_count for hashed dictionary (do not include duplicates) #5440 (Azat Khuzhin)
  • Use contents of environment variable TZ as the name for timezone. It helps to correctly detect default timezone in some cases. #5443 (Ivan)
  • Do not try to convert integers in dictGetT functions, because it doesn't work correctly. Throw an exception instead. #5446 (Artem Zuikov)
  • Fix settings in ExternalData HTTP request. #5455 (Danila Kutenin)
  • Fix bug when parts were removed only from FS without dropping them from Zookeeper. #5520 (alesapin)
  • Fix segmentation fault in bitmapHasAny function. #5528 (Zhichang Yu)
  • Fixed error when replication connection pool doesn't retry to resolve host, even when DNS cache was dropped. #5534 (alesapin)
  • Fixed DROP INDEX IF EXISTS query. Now ALTER TABLE ... DROP INDEX IF EXISTS ... query doesn't raise an exception if provided index does not exist. #5524 (Gleb Novikov)
  • Fix union all supertype column. There were cases with inconsistent data and column types of resulting columns. #5503 (Artem Zuikov)
  • Skip ZNONODE during DDL query processing. Previously, if another node removed the znode in the task queue, a node that had not processed it but had already got the list of children would terminate the DDLWorker thread. #5489 (Azat Khuzhin)
  • Fix INSERT into Distributed() table with MATERIALIZED column. #5429 (Azat Khuzhin)

ClickHouse release 19.7.3.9, 2019-05-30

New Features

  • Allow to limit the range of a setting that can be specified by user. These constraints can be set up in user settings profile. #4931 (Vitaly Baranov)
  • Add a second version of the function groupUniqArray with an optional max_size parameter that limits the size of the resulting array. This behavior is similar to groupArray(max_size)(x) function. #5026 (Guillaume Tassery)
  • For TSVWithNames/CSVWithNames input file formats, column order can now be determined from file header. This is controlled by input_format_with_names_use_header parameter. #5081 (Alexander)
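
The effect of taking column order from the file header can be sketched in Python. This is an illustration under simplifying assumptions (all table columns are present in the file, values stay as strings); read_with_names and its use_header flag are hypothetical stand-ins for the behaviour controlled by input_format_with_names_use_header.

```python
import csv
import io
from typing import List, Tuple


def read_with_names(text: str, table_columns: List[str], use_header: bool = True) -> List[Tuple[str, ...]]:
    """Read CSV with a header line and return rows in table column order."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # the header line is always consumed
    # With use_header, the file's own column order is trusted;
    # otherwise the header is skipped and the table order is assumed.
    file_order = header if use_header else table_columns
    rows = []
    for record in reader:
        by_name = dict(zip(file_order, record))
        rows.append(tuple(by_name[column] for column in table_columns))
    return rows


data = "b,a\n1,2\n"
print(read_with_names(data, ["a", "b"], use_header=True))
print(read_with_names(data, ["a", "b"], use_header=False))
```

With the header honoured, the value 2 lands in column a; with the header ignored, the values are assigned by position and 1 lands in column a.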

Bug Fixes

  • Crash with uncompressed_cache + JOIN during merge (#5197) #5133 (Danila Kutenin)
  • Segmentation fault on a clickhouse-client query to system tables. #5066 #5127 (Ivan)
  • Data loss on heavy load via KafkaEngine (#4736) #5080 (Ivan)
  • Fixed very rare data race condition that could happen when executing a query with UNION ALL involving at least two SELECTs from system.columns, system.tables, system.parts, system.parts_tables or tables of Merge family and performing ALTER of columns of the related tables concurrently. #5189 (alexey-milovidov)

Performance Improvements

Documentation

Build/Testing/Packaging Improvements

ClickHouse release 19.6

ClickHouse release 19.6.3.18, 2019-06-13

Bug Fixes

  • Fixed IN condition pushdown for queries from table functions mysql and odbc and corresponding table engines. This fixes #3540 and #2384. #5313 (alexey-milovidov)
  • Fix deadlock in Zookeeper. #5297 (github1youlc)
  • Allow quoted decimals in CSV. #5284 (Artem Zuikov)
  • Disallow conversion from float Inf/NaN into Decimals (throw exception). #5282 (Artem Zuikov)
  • Fix data race in rename query. #5247 (Winter Zhang)
  • Temporarily disable LFAlloc. Usage of LFAlloc might lead to a lot of MAP_FAILED in allocating UncompressedCache and, as a result, to crashes of queries on highly loaded servers. cfdba93 (Danila Kutenin)

ClickHouse release 19.6.2.11, 2019-05-13

New Features

  • TTL expressions for columns and tables. #4212 (Anton Popov)
  • Added support for brotli compression for HTTP responses (Accept-Encoding: br) #4388 (Mikhail)
  • Added new function isValidUTF8 for checking whether a set of bytes is correctly UTF-8 encoded. #4934 (Danila Kutenin)
  • Add new load balancing policy first_or_random which sends queries to the first specified host and, if it's inaccessible, sends queries to random hosts of the shard. Useful for cross-replication topology setups. #5012 (nvartolomei)
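
The first_or_random policy described above can be sketched as a small selection function. This is a simplified model, not the server's connection pool logic; pick_replica and the is_accessible callback are hypothetical.

```python
import random
from typing import Callable, List, Optional


def pick_replica(hosts: List[str], is_accessible: Callable[[str], bool], rng=random) -> Optional[str]:
    """Prefer the first configured host; otherwise pick a random accessible one."""
    if is_accessible(hosts[0]):
        return hosts[0]
    candidates = [host for host in hosts[1:] if is_accessible(host)]
    # None signals that no replica of the shard is reachable.
    return rng.choice(candidates) if candidates else None


hosts = ["replica-1", "replica-2", "replica-3"]
print(pick_replica(hosts, lambda host: True))
print(pick_replica(hosts, lambda host: host != "replica-1"))
```

The first call always returns replica-1; the second returns one of the remaining replicas at random.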

Experimental Features

  • Add setting index_granularity_bytes (adaptive index granularity) for MergeTree* tables family. #4826 (alesapin)

Improvements

  • Added support for non-constant and negative size and length arguments for function substringUTF8. #4989 (alexey-milovidov)
  • Disable push-down to right table in left join, left table in right join, and both tables in full join. This fixes wrong JOIN results in some cases. #4846 (Ivan)
  • clickhouse-copier: auto upload task configuration from --task-file option #4876 (proller)
  • Added typos handler for storage factory and table functions factory. #4891 (Danila Kutenin)
  • Support asterisks and qualified asterisks for multiple joins without subqueries #4898 (Artem Zuikov)
  • Make missing column error message more user friendly. #4915 (Artem Zuikov)

Performance Improvements

Backward Incompatible Changes

  • HTTP header Query-Id was renamed to X-ClickHouse-Query-Id for consistency. #4972 (Mikhail)
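
Clients that talk to servers both before and after this rename may want to accept either header. A minimal Python sketch, assuming the response headers are available as a plain dictionary; query_id_from_headers is a hypothetical helper.

```python
from typing import Mapping, Optional


def query_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the query id, preferring the new header name over the old one."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-clickhouse-query-id", lowered.get("query-id"))


print(query_id_from_headers({"X-ClickHouse-Query-Id": "abc"}))
print(query_id_from_headers({"Query-Id": "old"}))
```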

Bug Fixes

  • Fixed potential null pointer dereference in clickhouse-copier. #4900 (proller)
  • Fixed error on query with JOIN + ARRAY JOIN #4938 (Artem Zuikov)
  • Fixed hanging on start of the server when a dictionary depends on another dictionary via a database with engine=Dictionary. #4962 (Vitaly Baranov)
  • Partially fix distributed_product_mode = local. It's possible to allow columns of local tables in where/having/order by/... via table aliases. Throw exception if table does not have alias. It's not yet possible to access the columns without table aliases. #4986 (Artem Zuikov)
  • Fix potentially wrong result for SELECT DISTINCT with JOIN #5001 (Artem Zuikov)
  • Fixed very rare data race condition that could happen when executing a query with UNION ALL involving at least two SELECTs from system.columns, system.tables, system.parts, system.parts_tables or tables of Merge family and performing ALTER of columns of the related tables concurrently. #5189 (alexey-milovidov)

Build/Testing/Packaging Improvements

  • Fixed test failures when running clickhouse-server on different host #4713 (Vasily Nemkov)
  • clickhouse-test: Disable color control sequences in non tty environment. #4937 (alesapin)
  • clickhouse-test: Allow using any test database (remove test. qualification where possible) #5008 (proller)
  • Fix ubsan errors #5037 (Vitaly Baranov)
  • Yandex LFAlloc was added to ClickHouse to allocate MarkCache and UncompressedCache data in different ways to catch segfaults more reliably #4995 (Danila Kutenin)
  • Python util to help with backports and changelogs. #4949 (Ivan)

ClickHouse release 19.5

ClickHouse release 19.5.4.22, 2019-05-13

Bug fixes

  • Fixed possible crash in bitmap* functions #5220 #5228 (Andy Yang)
  • Fixed very rare data race condition that could happen when executing a query with UNION ALL involving at least two SELECTs from system.columns, system.tables, system.parts, system.parts_tables or tables of Merge family and performing ALTER of columns of the related tables concurrently. #5189 (alexey-milovidov)
  • Fixed error Set for IN is not created yet in case of using single LowCardinality column in the left part of IN. This error happened if LowCardinality column was the part of primary key. #5031 #5154 (Nikolai Kochetov)
  • Modification of retention function: previously, if a row satisfied both the first and the Nth condition, only the first satisfied condition was added to the data state. Now all conditions satisfied in a row of data are added to the data state. #5119 (小路)
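
The retention change can be illustrated with a toy model of the aggregate state. The sketch assumes the state is simply the set of condition indexes seen so far and that a later condition counts only when the first one was also met; the function names are hypothetical and the real state layout may differ.

```python
from typing import List, Set


def add_row(state: Set[int], conditions: List[bool], only_first: bool = False) -> None:
    """Record which conditions a row satisfies in the aggregate state."""
    for index, satisfied in enumerate(conditions):
        if satisfied:
            state.add(index)
            if only_first:  # models the old behaviour: stop at the first match
                break


def retention_result(state: Set[int], n: int) -> List[int]:
    """Condition i counts only if condition 0 was also satisfied."""
    if 0 not in state:
        return [0] * n
    return [1 if index in state else 0 for index in range(n)]


row = [True, False, True]  # one row satisfies the first and the third condition
old_state, new_state = set(), set()
add_row(old_state, row, only_first=True)
add_row(new_state, row)
print(retention_result(old_state, 3), retention_result(new_state, 3))
```

Under the old behaviour the third condition is lost and the result is [1, 0, 0]; with the fix it is [1, 0, 1].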

ClickHouse release 19.5.3.8, 2019-04-18

Bug fixes

ClickHouse release 19.5.2.6, 2019-04-15

New Features

Improvement

  • topK and topKWeighted now support custom loadFactor (fixes issue #4252). #4634 (Kirill Danshin)
  • Allow to use parallel_replicas_count > 1 even for tables without sampling (the setting is simply ignored for them). In previous versions it led to an exception. #4637 (Alexey Elymanov)
  • Support for CREATE OR REPLACE VIEW. Allow to create a view or set a new definition in a single statement. #4654 (Boris Granveaud)
  • Buffer table engine now supports PREWHERE. #4671 (Yangkuan Liu)
  • Add ability to start replicated table without metadata in zookeeper in readonly mode. #4691 (alesapin)
  • Fixed flicker of progress bar in clickhouse-client. The issue was most noticeable when using FORMAT Null with streaming queries. #4811 (alexey-milovidov)
  • Allow to disable functions with hyperscan library on per user basis to limit potentially excessive and uncontrolled resource usage. #4816 (alexey-milovidov)
  • Add version number logging in all errors. #4824 (proller)
  • Added restriction to the multiMatch functions which requires string size to fit into unsigned int. Also added the number of arguments limit to the multiSearch functions. #4834 (Danila Kutenin)
  • Improved usage of scratch space and error handling in Hyperscan. #4866 (Danila Kutenin)
  • Fill system.graphite_retentions from a table config of *GraphiteMergeTree engine tables. #4584 (Mikhail f. Shiryaev)
  • Rename trigramDistance function to ngramDistance and add more functions with CaseInsensitive and UTF. #4602 (Danila Kutenin)
  • Improved data skipping indices calculation. #4640 (Nikita Vasilev)
  • Keep ordinary, DEFAULT, MATERIALIZED and ALIAS columns in a single list (fixes issue #2867). #4707 (Alex Zatelepin)

Bug Fix

  • Avoid std::terminate in case of memory allocation failure. Now std::bad_alloc exception is thrown as expected. #4665 (alexey-milovidov)
  • Fix capnproto reading from buffer. Sometimes files weren't loaded successfully by HTTP. #4674 (Vladislav)
  • Fix error Unknown log entry type: 0 after OPTIMIZE TABLE FINAL query. #4683 (Amos Bird)
  • Wrong arguments to hasAny or hasAll functions may lead to segfault. #4698 (alexey-milovidov)
  • Deadlock may happen while executing DROP DATABASE dictionary query. #4701 (alexey-milovidov)
  • Fix undefined behavior in median and quantile functions. #4702 (hcz)
  • Fix compression level detection when network_compression_method is in lowercase. Broken in v19.1. #4706 (proller)
  • Fixed the <timezone>UTC</timezone> setting being ignored (fixes issue #4658). #4718 (proller)
  • Fix histogram function behaviour with Distributed tables. #4741 (olegkv)
  • Fixed TSan report "destroy of a locked mutex". #4742 (alexey-milovidov)
  • Fixed TSan report on shutdown due to race condition in system logs usage. Fixed potential use-after-free on shutdown when part_log is enabled. #4758 (alexey-milovidov)
  • Fix recheck parts in ReplicatedMergeTreeAlterThread in case of error. #4772 (Nikolai Kochetov)
  • Arithmetic operations on intermediate aggregate function states were not working for constant arguments (such as subquery results). #4776 (alexey-milovidov)
  • Always backquote column names in metadata. Otherwise it's impossible to create a table with column named index (server won't restart due to malformed ATTACH query in metadata). #4782 (alexey-milovidov)
  • Fix crash in ALTER ... MODIFY ORDER BY on Distributed table. #4790 (TCeason)
  • Fix segfault in JOIN ON with enabled enable_optimize_predicate_expression. #4794 (Winter Zhang)
  • Fix bug with adding an extraneous row after consuming a protobuf message from Kafka. #4808 (Vitaly Baranov)
  • Fix crash of JOIN on not-nullable vs nullable column. Fix NULLs in right keys in ANY JOIN + join_use_nulls. #4815 (Artem Zuikov)
  • Fix segmentation fault in clickhouse-copier. #4835 (proller)
  • Fixed race condition in SELECT from system.tables if the table is renamed or altered concurrently. #4836 (alexey-milovidov)
  • Fixed data race when fetching data part that is already obsolete. #4839 (alexey-milovidov)
  • Fixed rare data race that can happen during RENAME table of MergeTree family. #4844 (alexey-milovidov)
  • Fixed segmentation fault in function arrayIntersect. Segmentation fault could happen if function was called with mixed constant and ordinary arguments. #4847 (Lixiang Qian)
  • Fixed reading from Array(LowCardinality) column in rare case when column contained a long sequence of empty arrays. #4850 (Nikolai Kochetov)
  • Fix crash in FULL/RIGHT JOIN when joining on nullable vs not nullable. #4855 (Artem Zuikov)
  • Fix No message received exception while fetching parts between replicas. #4856 (alesapin)
  • Fixed arrayIntersect function wrong result in case of several repeated values in single array. #4871 (Nikolai Kochetov)
  • Fix a race condition during concurrent ALTER COLUMN queries that could lead to a server crash (fixes issue #3421). #4592 (Alex Zatelepin)
  • Fix incorrect result in FULL/RIGHT JOIN with const column. #4723 (Artem Zuikov)
  • Fix duplicates in GLOBAL JOIN with asterisk. #4705 (Artem Zuikov)
  • Fix parameter deduction in ALTER MODIFY of column CODEC when column type is not specified. #4883 (alesapin)
  • Functions cutQueryStringAndFragment() and queryStringAndFragment() now work correctly when URL contains a fragment and no query. #4894 (Vitaly Baranov)
  • Fix rare bug when setting min_bytes_to_use_direct_io is greater than zero, which occurs when a thread has to seek backward in column file. #4897 (alesapin)
  • Fix wrong argument types for aggregate functions with LowCardinality arguments (fixes issue #4919). #4922 (Nikolai Kochetov)
  • Fix wrong name qualification in GLOBAL JOIN. #4969 (Artem Zuikov)
  • Fix function toISOWeek result for year 1970. #4988 (alexey-milovidov)
  • Fix DROP, TRUNCATE and OPTIMIZE queries duplication, when executed with ON CLUSTER for ReplicatedMergeTree* tables family. #4991 (alesapin)

Backward Incompatible Change

  • Rename setting insert_sample_with_metadata to setting input_format_defaults_for_omitted_fields. #4771 (Artem Zuikov)
  • Added setting max_partitions_per_insert_block (with value 100 by default). If inserted block contains larger number of partitions, an exception is thrown. Set it to 0 if you want to remove the limit (not recommended). #4845 (alexey-milovidov)
  • Multi-search functions were renamed (multiPosition to multiSearchAllPositions, multiSearch to multiSearchAny, firstMatch to multiSearchFirstIndex). #4780 (Danila Kutenin)
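
The check introduced by max_partitions_per_insert_block can be sketched as counting distinct partition keys in a block. This is a simplified model with hypothetical names; the exception type and message are placeholders, not the server's.

```python
from typing import Callable, Hashable, Iterable


def check_partition_count(rows: Iterable, partition_key: Callable[[object], Hashable], max_partitions: int = 100) -> int:
    """Count distinct partitions in a block and reject the block if there are too many."""
    partitions = {partition_key(row) for row in rows}
    # A limit of 0 disables the check, mirroring the setting's documented meaning.
    if max_partitions and len(partitions) > max_partitions:
        raise ValueError(
            f"Too many partitions for single INSERT block: {len(partitions)} > {max_partitions}"
        )
    return len(partitions)


# 1000 rows spread over 12 monthly partitions pass the default limit.
print(check_partition_count(range(1000), lambda row: row % 12))
```

A block touching 101 distinct partitions would raise with the default limit and pass with the limit set to 0.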

Performance Improvement

  • Optimize Volnitsky searcher by inlining, giving about 5-10% search improvement for queries with many needles or many similar bigrams. #4862 (Danila Kutenin)
  • Fix performance issue when setting use_uncompressed_cache is greater than zero, which appeared when all read data was contained in the cache. #4913 (alesapin)

Build/Testing/Packaging Improvement

  • Hardening debug build: more granular memory mappings and ASLR; add memory protection for mark cache and index. This allows finding more memory stomping bugs in cases when ASan and MSan cannot do it. #4632 (alexey-milovidov)
  • Add support for cmake variables ENABLE_PROTOBUF, ENABLE_PARQUET and ENABLE_BROTLI which allow enabling/disabling the above features (same as we can do for librdkafka, mysql, etc). #4669 (Silviu Caragea)
  • Add ability to print process list and stacktraces of all threads if some queries are hung after test run. #4675 (alesapin)
  • Add retries on Connection loss error in clickhouse-test. #4682 (alesapin)
  • Add freebsd build with vagrant and build with thread sanitizer to packager script. #4712 #4748 (alesapin)
  • Now the user is asked for a password for user 'default' during installation. #4725 (proller)
  • Suppress warning in rdkafka library. #4740 (alexey-milovidov)
  • Allow building without SSL. #4750 (proller)
  • Add a way to launch clickhouse-server image from a custom user. #4753 (Mikhail f. Shiryaev)
  • Upgrade contrib boost to 1.69. #4793 (proller)
  • Disable usage of mremap when compiled with Thread Sanitizer. Surprisingly enough, TSan does not intercept mremap (though it does intercept mmap, munmap), which leads to false positives. Fixed TSan report in stateful tests. #4859 (alexey-milovidov)
  • Add test checking using format schema via HTTP interface. #4864 (Vitaly Baranov)

ClickHouse release 19.4

ClickHouse release 19.4.4.33, 2019-04-17

Bug Fixes

  • Avoid std::terminate in case of memory allocation failure. Now std::bad_alloc exception is thrown as expected. #4665 (alexey-milovidov)
  • Fix capnproto reading from buffer. Sometimes files weren't loaded successfully by HTTP. #4674 (Vladislav)
  • Fix error Unknown log entry type: 0 after OPTIMIZE TABLE FINAL query. #4683 (Amos Bird)
  • Wrong arguments to hasAny or hasAll functions may lead to segfault. #4698 (alexey-milovidov)
  • Deadlock may happen while executing DROP DATABASE dictionary query. #4701 (alexey-milovidov)
  • Fix undefined behavior in median and quantile functions. #4702 (hcz)
  • Fix compression level detection when network_compression_method is in lowercase. Broken in v19.1. #4706 (proller)
  • Fixed the <timezone>UTC</timezone> setting being ignored (fixes issue #4658). #4718 (proller)
  • Fix histogram function behaviour with Distributed tables. #4741 (olegkv)
  • Fixed TSan report "destroy of a locked mutex". #4742 (alexey-milovidov)
  • Fixed TSan report on shutdown due to race condition in system logs usage. Fixed potential use-after-free on shutdown when part_log is enabled. #4758 (alexey-milovidov)
  • Fix recheck parts in ReplicatedMergeTreeAlterThread in case of error. #4772 (Nikolai Kochetov)
  • Arithmetic operations on intermediate aggregate function states were not working for constant arguments (such as subquery results). #4776 (alexey-milovidov)
  • Always backquote column names in metadata. Otherwise it's impossible to create a table with column named index (server won't restart due to malformed ATTACH query in metadata). #4782 (alexey-milovidov)
  • Fix crash in ALTER ... MODIFY ORDER BY on Distributed table. #4790 (TCeason)
  • Fix segfault in JOIN ON with enabled enable_optimize_predicate_expression. #4794 (Winter Zhang)
  • Fix bug with adding an extraneous row after consuming a protobuf message from Kafka. #4808 (Vitaly Baranov)
  • Fix segmentation fault in clickhouse-copier. #4835 (proller)
  • Fixed race condition in SELECT from system.tables if the table is renamed or altered concurrently. #4836 (alexey-milovidov)
  • Fixed data race when fetching data part that is already obsolete. #4839 (alexey-milovidov)
  • Fixed rare data race that can happen during RENAME table of MergeTree family. #4844 (alexey-milovidov)
  • Fixed segmentation fault in function arrayIntersect. Segmentation fault could happen if function was called with mixed constant and ordinary arguments. #4847 (Lixiang Qian)
  • Fixed reading from Array(LowCardinality) column in rare case when column contained a long sequence of empty arrays. #4850 (Nikolai Kochetov)
  • Fix No message received exception while fetching parts between replicas. #4856 (alesapin)
  • Fixed arrayIntersect function wrong result in case of several repeated values in single array. #4871 (Nikolai Kochetov)
  • Fix a race condition during concurrent ALTER COLUMN queries that could lead to a server crash (fixes issue #3421). #4592 (Alex Zatelepin)
  • Fix parameter deduction in ALTER MODIFY of column CODEC when column type is not specified. #4883 (alesapin)
  • Functions cutQueryStringAndFragment() and queryStringAndFragment() now work correctly when URL contains a fragment and no query. #4894 (Vitaly Baranov)
  • Fix rare bug when setting min_bytes_to_use_direct_io is greater than zero, which occurs when a thread has to seek backward in column file. #4897 (alesapin)
  • Fix wrong argument types for aggregate functions with LowCardinality arguments (fixes issue #4919). #4922 (Nikolai Kochetov)
  • Fix function toISOWeek result for year 1970. #4988 (alexey-milovidov)
  • Fix duplication of DROP, TRUNCATE and OPTIMIZE queries when executed with ON CLUSTER for the ReplicatedMergeTree* family of tables. #4991 (alesapin)
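
For the toISOWeek fix listed above, ISO 8601 week numbers around the Unix epoch can be cross-checked outside ClickHouse. The following is a minimal sketch using only Python's standard library; the helper name `iso_week` is illustrative and is not a ClickHouse API.

```python
from datetime import date


def iso_week(d: date) -> int:
    """Return the ISO 8601 week number (week 1 contains the year's first Thursday)."""
    return d.isocalendar()[1]


# 1970-01-01 was a Thursday, so it falls into ISO week 1 of 1970.
print(iso_week(date(1970, 1, 1)))  # 1
# The following Monday starts ISO week 2.
print(iso_week(date(1970, 1, 5)))  # 2
```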

ClickHouse release 19.4.3.11, 2019-04-02

Bug Fixes

  • Fix crash in FULL/RIGHT JOIN when joining on nullable vs. non-nullable columns. #4855 (Artem Zuikov)
  • Fix segmentation fault in clickhouse-copier. #4835 (proller)

ClickHouse release 19.4.2.7, 2019-03-30

Bug Fixes

  • Fixed reading from Array(LowCardinality) column in rare case when column contained a long sequence of empty arrays. #4850 (Nikolai Kochetov)

ClickHouse release 19.4.1.3, 2019-03-19

Bug Fixes

  • Fixed remote queries which contain both LIMIT BY and LIMIT. Previously, if LIMIT BY and LIMIT were used in a remote query, LIMIT could be applied before LIMIT BY, which led to an over-filtered result. #4708 (Constantin S. Pan)
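
To see why the order of the two clauses matters, here is a small Python sketch that models LIMIT BY and LIMIT on an in-memory list of rows. It only illustrates the semantics described above; the function names and sample rows are made up, and nothing here reflects how ClickHouse executes remote queries internally.

```python
from collections import defaultdict


def limit_by(rows, n, key):
    """Keep at most n rows per distinct key, preserving row order (models LIMIT n BY key)."""
    seen = defaultdict(int)
    out = []
    for row in rows:
        k = key(row)
        if seen[k] < n:
            seen[k] += 1
            out.append(row)
    return out


def limit(rows, n):
    """Keep the first n rows (models LIMIT n)."""
    return rows[:n]


rows = [("a", 1), ("a", 2), ("a", 3), ("b", 4), ("b", 5)]
first_column = lambda row: row[0]

# Intended order: LIMIT 1 BY key first, then LIMIT 2.
correct = limit(limit_by(rows, 1, first_column), 2)
# Swapped order: LIMIT 2 cuts the input before LIMIT BY sees key "b".
swapped = limit_by(limit(rows, 2), 1, first_column)

print(correct)  # [('a', 1), ('b', 4)]
print(swapped)  # [('a', 1)]
```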

ClickHouse release 19.4.0.49, 2019-03-09

New Features

  • Added full support for Protobuf format (input and output, nested data structures). #4174 #4493 (Vitaly Baranov)
  • Added bitmap functions with Roaring Bitmaps. #4207 (Andy Yang) #4568 (Vitaly Baranov)
  • Parquet format support. #4448 (proller)
  • N-gram distance was added for fuzzy string comparison. It is similar to q-gram metrics in R language. #4466 (Danila Kutenin)
  • Combine rules for graphite rollup from dedicated aggregation and retention patterns. #4426 (Mikhail f. Shiryaev)
  • Added max_execution_speed and max_execution_speed_bytes to limit resource usage. Added min_execution_speed_bytes setting to complement the min_execution_speed. #4430 (Winter Zhang)
  • Implemented function flatten. #4555 #4409 (alexey-milovidov, kzon)
  • Added functions arrayEnumerateDenseRanked and arrayEnumerateUniqRanked (like arrayEnumerateUniq, but they allow fine-tuning the array depth to look inside multidimensional arrays). #4475 (proller) #4601 (alexey-milovidov)
  • Multiple JOINs with some restrictions: no asterisks, no complex aliases in ON/WHERE/GROUP BY/... #4462 (Artem Zuikov)
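
As an illustration of the n-gram distance mentioned in the list above, the sketch below computes a normalized symmetric difference between the n-gram multisets of two strings. This is an assumption-laden approximation of the idea, not the ClickHouse implementation: the default n = 4, the character-level n-grams, and the handling of strings shorter than n are all choices made for this example.

```python
from collections import Counter


def ngrams(s: str, n: int = 4) -> Counter:
    """Multiset of all length-n substrings of s (empty if s is shorter than n)."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_distance(a: str, b: str, n: int = 4) -> float:
    """Size of the symmetric difference of the two n-gram multisets,
    divided by their total size. 0.0 means identical multisets, 1.0 means disjoint."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:
        return 0.0  # assumption for this sketch: two too-short strings count as equal
    # Counter subtraction keeps only positive counts, so this is the multiset symmetric difference.
    diff = sum(((ga - gb) + (gb - ga)).values())
    return diff / total


print(ngram_distance("clickhouse", "clickhouse"))  # 0.0
print(ngram_distance("abcd", "wxyz"))              # 1.0
```

A value close to 0 suggests the strings share most of their n-grams, which is what makes the metric usable for fuzzy comparison.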

Bug Fixes

  • This release also contains all bug fixes from 19.3 and 19.1.
  • Fixed bug in data skipping indices: order of granules after INSERT was incorrect. #4407 (Nikita Vasilev)
  • Fixed set index for Nullable and LowCardinality columns. Before it, set index with Nullable or LowCardinality column led to error Data type must be deserialized with multiple streams while selecting. #4594 (Nikolai Kochetov)
  • Correctly set update_time on full executable dictionary update. #4551 (Tema Novikov)
  • Fix broken progress bar in 19.3. #4627 (filimonov)
  • Fixed inconsistent values of MemoryTracker when memory region was shrunk, in certain cases. #4619 (alexey-milovidov)
  • Fixed undefined behaviour in ThreadPool. #4612 (alexey-milovidov)
  • Fixed a very rare crash with the message mutex lock failed: Invalid argument that could happen when a MergeTree table was dropped concurrently with a SELECT. #4608 (Alex Zatelepin)
  • ODBC driver compatibility with LowCardinality data type. #4381 (proller)
  • FreeBSD: Fixup for AIOcontextPool: Found io_event with unknown id 0 error. #4438 (urgordeadbeef)
  • system.part_log table was created regardless of configuration. #4483 (alexey-milovidov)
  • Fix undefined behaviour in dictIsIn function for cache dictionaries. #4515 (alesapin)
  • Fixed a deadlock when a SELECT query locks the same table multiple times (e.g. from different threads or when executing multiple subqueries) and there is a concurrent DDL query. #4535 (Alex Zatelepin)
  • Disable compile_expressions by default until we get own llvm contrib and can test it with clang and asan. #4579 (alesapin)
  • Prevent std::terminate when invalidate_query for clickhouse external dictionary source has returned a wrong result set (empty, more than one row, or more than one column). Fixed an issue where the invalidate_query was performed every five seconds regardless of the lifetime. #4583 (alexey-milovidov)
  • Avoid deadlock when the invalidate_query for a dictionary with clickhouse source was involving system.dictionaries table or Dictionaries database (rare case). #4599 (alexey-milovidov)
  • Fixes for CROSS JOIN with empty WHERE. #4598 (Artem Zuikov)
  • Fixed segfault in function "replicate" when constant argument is passed. #4603 (alexey-milovidov)
  • Fix lambda function with predicate optimizer. #4408 (Winter Zhang)
  • Multiple fixes for multiple JOINs. #4595 (Artem Zuikov)

Improvements

  • Support aliases in JOIN ON section for right table columns. #4412 (Artem Zuikov)
  • The result of multiple JOINs needs correct result names to be used in subselects. Replace flat aliases with source names in the result. #4474 (Artem Zuikov)
  • Improve push-down logic for joined statements. #4387 (Ivan)

Performance Improvements

  • Improved heuristics of "move to PREWHERE" optimization. #4405 (alexey-milovidov)
  • Use proper lookup tables that use HashTable's API for 8-bit and 16-bit keys. #4536 (Amos Bird)
  • Improved performance of string comparison. #4564 (alexey-milovidov)
  • Cleanup distributed DDL queue in a separate thread so that it doesn't slow down the main loop that processes distributed DDL tasks. #4502 (Alex Zatelepin)
  • When min_bytes_to_use_direct_io is set to 1, not every file was opened with O_DIRECT mode because the data size to read was sometimes underestimated by the size of one compressed block. #4526 (alexey-milovidov)

Build/Testing/Packaging Improvement

  • Added support for clang-9 #4604 (alexey-milovidov)
  • Fix wrong __asm__ instructions (again) #4621 (Konstantin Podshumok)
  • Add ability to specify settings for clickhouse-performance-test from command line. #4437 (alesapin)
  • Add dictionaries tests to integration tests. #4477 (alesapin)
  • Added queries from the benchmark on the website to automated performance tests. #4496 (alexey-milovidov)
  • xxhash.h does not exist in external lz4 because it is an implementation detail and its symbols are namespaced with XXH_NAMESPACE macro. When lz4 is external, xxHash has to be external too, and the dependents have to link to it. #4495 (Orivej Desh)
  • Fixed a case when quantileTiming aggregate function can be called with negative or floating point argument (this fixes fuzz test with undefined behaviour sanitizer). #4506 (alexey-milovidov)
  • Spelling error correction. #4531 (sdk2)
  • Fix compilation on Mac. #4371 (Vitaly Baranov)
  • Build fixes for FreeBSD and various unusual build configurations. #4444 (proller)

ClickHouse release 19.3

ClickHouse release 19.3.9.1, 2019-04-02

Bug Fixes

  • Fix crash in FULL/RIGHT JOIN when joining on nullable vs. non-nullable columns. #4855 (Artem Zuikov)
  • Fix segmentation fault in clickhouse-copier. #4835 (proller)
  • Fixed reading from Array(LowCardinality) column in rare case when column contained a long sequence of empty arrays. #4850 (Nikolai Kochetov)

ClickHouse release 19.3.7, 2019-03-12

Bug fixes

  • Fixed error in #3920. This error manifests itself as random cache corruption (messages Unknown codec family code, Cannot seek through file) and segfaults. This bug first appeared in version 19.1 and is present in versions up to 19.1.10 and 19.3.6. #4623 (alexey-milovidov)

ClickHouse release 19.3.6, 2019-03-02

Bug fixes

  • When there are more than 1000 threads in a thread pool, std::terminate may happen on thread exit. Azat Khuzhin #4485 #4505 (alexey-milovidov)
  • Now it's possible to create ReplicatedMergeTree* tables with comments on columns without defaults and tables with columns codecs without comments and defaults. Also fix comparison of codecs. #4523 (alesapin)
  • Fixed crash on JOIN with array or tuple. #4552 (Artem Zuikov)
  • Fixed crash in clickhouse-copier with the message ThreadStatus not created. #4540 (Artem Zuikov)
  • Fixed hangup on server shutdown if distributed DDLs were used. #4472 (Alex Zatelepin)
  • Incorrect column numbers were printed in error message about text format parsing for columns with number greater than 10. #4484 (alexey-milovidov)

Build/Testing/Packaging Improvements

  • Fixed build with AVX enabled. #4527 (alexey-milovidov)
  • Enable extended accounting and IO accounting based on a known-good version instead of the kernel under which it is compiled. #4541 (nvartolomei)
  • Allow skipping the setting of core_dump.size_limit; warn instead of throwing if setting the limit fails. #4473 (proller)
  • Removed the inline tags of void readBinary(...) in Field.cpp. Also merged redundant namespace DB blocks. #4530 (hcz)

ClickHouse release 19.3.5, 2019-02-21

Bug fixes

  • Fixed bug with large http insert queries processing. #4454 (alesapin)
  • Fixed backward incompatibility with old versions due to wrong implementation of send_logs_level setting. #4445 (alexey-milovidov)
  • Fixed backward incompatibility of table function remote introduced with column comments. #4446 (alexey-milovidov)

ClickHouse release 19.3.4, 2019-02-16

Improvements

  • Table index size is not accounted for memory limits when doing ATTACH TABLE query. Avoided the possibility that a table cannot be attached after being detached. #4396 (alexey-milovidov)
  • Slightly raised the limit on the maximum string and array size received from ZooKeeper. This allows continuing to work with an increased size of CLIENT_JVMFLAGS=-Djute.maxbuffer=... on ZooKeeper. #4398 (alexey-milovidov)
  • Allow repairing an abandoned replica even if it already has a huge number of nodes in its queue. #4399 (alexey-milovidov)
  • Add one required argument to SET index (max stored rows number). #4386 (Nikita Vasilev)

Build/Testing/Packaging Improvements

  • Add ability to run clickhouse-server for stateless tests in docker image. #4347 (Vasily Nemkov)

ClickHouse release 19.3.3, 2019-02-13

New Features

  • Added the KILL MUTATION statement that allows removing mutations that are for some reasons stuck. Added latest_failed_part, latest_fail_time, latest_fail_reason fields to the system.mutations table for easier troubleshooting. #4287 (Alex Zatelepin)
  • Added aggregate function entropy which computes Shannon entropy. #4238 (Quid37)
  • Added ability to send queries INSERT INTO tbl VALUES (.... to server without splitting on query and data parts. #4301 (alesapin)
  • Generic implementation of arrayWithConstant function was added. #4322 (alexey-milovidov)
  • Implemented NOT BETWEEN comparison operator. #4228 (Dmitry Naumov)
  • Implement sumMapFiltered in order to be able to limit the number of keys for which values will be summed by sumMap. #4129 (Léo Ercolanelli)
  • Added support of Nullable types in mysql table function. #4198 (Emmanuel Donin de Rosière)
  • Support for arbitrary constant expressions in LIMIT clause. #4246 (k3box)
  • Added topKWeighted aggregate function that takes additional argument with (unsigned integer) weight. #4245 (Andrew Golman)
  • StorageJoin now supports join_any_take_last_row setting that allows overwriting existing values of the same key. #3973 (Amos Bird)
  • Added function toStartOfInterval. #4304 (Vitaly Baranov)
  • Added RowBinaryWithNamesAndTypes format. #4200 (Oleg V. Kozlyuk)
  • Added IPv4 and IPv6 data types. More effective implementations of IPv* functions. #3669 (Vasily Nemkov)
  • Added function toStartOfTenMinutes(). #4298 (Vitaly Baranov)
  • Added Protobuf output format. #4005 #4158 (Vitaly Baranov)
  • Added brotli support for HTTP interface for data import (INSERTs). #4235 (Mikhail)
  • Added hints when the user makes a typo in a function name or type in the command-line client. #4239 (Danila Kutenin)
  • Added Query-Id to the server's HTTP response header. #4231 (Mikhail)
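
For readers unfamiliar with the quantity computed by the entropy aggregate function listed above, the sketch below calculates Shannon entropy over a column of values in plain Python. The logarithm base 2 (result in bits) is an assumption of this sketch, and the function name `shannon_entropy` is illustrative only.

```python
from collections import Counter
from math import log2


def shannon_entropy(values) -> float:
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption for this sketch: an empty column has zero entropy
    # sum over distinct values of p * log2(1 / p), with p = count / total
    return sum((c / total) * log2(total / c) for c in counts.values())


print(shannon_entropy([1, 1, 2, 2]))     # 1.0 (two equally likely values)
print(shannon_entropy(["a", "a", "a"]))  # 0.0 (a constant column carries no information)
```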

Bug Fixes

  • Fixed Not found column for duplicate columns in JOIN ON section. #4279 (Artem Zuikov)
  • Make START REPLICATED SENDS command start replicated sends. #4229 (nvartolomei)
  • Fixed aggregate functions execution with Array(LowCardinality) arguments. #4055 (KochetovNicolai)
  • Fixed wrong behaviour when doing INSERT ... SELECT ... FROM file(...) query and file has CSVWithNames or TSVWithNames format and the first data row is missing. #4297 (alexey-milovidov)
  • Fixed crash on dictionary reload if dictionary not available. This bug appeared in 19.1.6. #4188 (proller)
  • Fixed ALL JOIN with duplicates in right table. #4184 (Artem Zuikov)
  • Fixed segmentation fault with use_uncompressed_cache=1 and exception with wrong uncompressed size. This bug appeared in 19.1.6. #4186 (alesapin)
  • Fixed compile_expressions bug with comparison of big (more than int16) dates. #4341 (alesapin)
  • Fixed infinite loop when selecting from table function numbers(0). #4280 (alexey-milovidov)
  • Temporarily disable predicate optimization for ORDER BY. #3890 (Winter Zhang)
  • Fixed Illegal instruction error when using base64 functions on old CPUs. This error has been reproduced only when ClickHouse was compiled with gcc-8. #4275 (alexey-milovidov)
  • Fixed No message received error when interacting with PostgreSQL ODBC Driver through TLS connection. Also fixes segfault when using MySQL ODBC Driver. #4170 (alexey-milovidov)
  • Fixed incorrect result when Date and DateTime arguments are used in branches of conditional operator (function if). Added generic case for function if. #4243 (alexey-milovidov)
  • ClickHouse dictionaries now load within clickhouse process. #4166 (alexey-milovidov)
  • Fixed deadlock when SELECT from a table with File engine was retried after No such file or directory error. #4161 (alexey-milovidov)
  • Fixed race condition when selecting from system.tables may give table doesn't exist error. #4313 (alexey-milovidov)
  • clickhouse-client can segfault on exit while loading data for command line suggestions if it was run in interactive mode. #4317 (alexey-milovidov)
  • Fixed a bug when the execution of mutations containing IN operators was producing incorrect results. #4099 (Alex Zatelepin)
  • Fixed error: if there is a database with Dictionary engine, all dictionaries forced to load at server startup, and if there is a dictionary with ClickHouse source from localhost, the dictionary cannot load. #4255 (alexey-milovidov)
  • Fixed an error where the server tried to create system logs again at shutdown. #4254 (alexey-milovidov)
  • Correctly return the right type and properly handle locks in joinGet function. #4153 (Amos Bird)
  • Added sumMapWithOverflow function. #4151 (Léo Ercolanelli)
  • Fixed segfault with allow_experimental_multiple_joins_emulation. 52de2c (Artem Zuikov)
  • Fixed bug with incorrect Date and DateTime comparison. #4237 (valexey)
  • Fixed fuzz test under undefined behavior sanitizer: added parameter type check for quantile*Weighted family of functions. #4145 (alexey-milovidov)
  • Fixed rare race condition when removing of old data parts can fail with File not found error. #4378 (alexey-milovidov)
  • Fix install package with missing /etc/clickhouse-server/config.xml. #4343 (proller)

Backward Incompatible Changes

  • Removed allow_experimental_low_cardinality_type setting. LowCardinality data types are production ready. #4323 (alexey-milovidov)
  • Reduce mark cache size and uncompressed cache size according to the available memory amount. #4240 (Lopatin Konstantin)
  • Added keyword INDEX in CREATE TABLE query. A column with name index must be quoted with backticks or double quotes: `index`. #4143 (Nikita Vasilev)
  • sumMap now promotes the result type instead of overflowing. The old sumMap behavior can be obtained by using the sumMapWithOverflow function. #4151 (Léo Ercolanelli)
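
The difference between promoting the result type and overflowing, as described for sumMap above, can be sketched in Python. This is a simplified model: it assumes unsigned 8-bit values for the overflowing variant, and the `wrap_bits` parameter and function name are hypothetical, introduced only for this illustration.

```python
def sum_map(rows, wrap_bits=None):
    """Sum values per key across rows of (keys, values) arrays.

    With wrap_bits=None the sums grow without bound (models type promotion).
    With wrap_bits=8 each sum wraps modulo 2**8 (models overflow in an unsigned 8-bit type).
    Returns (sorted_keys, sums).
    """
    totals = {}
    for keys, values in rows:
        for k, v in zip(keys, values):
            totals[k] = totals.get(k, 0) + v
            if wrap_bits is not None:
                totals[k] %= 1 << wrap_bits
    sorted_keys = sorted(totals)
    return sorted_keys, [totals[k] for k in sorted_keys]


rows = [([1, 2], [200, 100]), ([1, 2], [200, 100])]
print(sum_map(rows))               # ([1, 2], [400, 200])
print(sum_map(rows, wrap_bits=8))  # ([1, 2], [144, 200])
```

Queries that relied on wrap-around sums would therefore see different results after upgrading unless they switch to sumMapWithOverflow.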

Performance Improvements

  • std::sort replaced by pdqsort for queries without LIMIT. #4236 (Evgenii Pravda)
  • The server now reuses threads from the global thread pool. This affects performance in some corner cases. #4150 (alexey-milovidov)

Improvements

  • Implemented AIO support for FreeBSD. #4305 (urgordeadbeef)
  • SELECT * FROM a JOIN b USING a, b now returns a and b columns only from the left table. #4141 (Artem Zuikov)
  • Allow -C option of client to work as -c option. #4232 (syominsergey)
  • Now option --password used without value requires password from stdin. #4230 (BSD_Conqueror)
  • Added highlighting of unescaped metacharacters in string literals that contain LIKE expressions or regexps. #4327 (alexey-milovidov)
  • Added cancelling of HTTP read only queries if client socket goes away. #4213 (nvartolomei)
  • Now server reports progress to keep client connections alive. #4215 (Ivan)
  • Slightly better message with reason for OPTIMIZE query with optimize_throw_if_noop setting enabled. #4294 (alexey-milovidov)
  • Added support of --version option for clickhouse server. #4251 (Lopatin Konstantin)
  • Added --help/-h option to clickhouse-server. #4233 (Yuriy Baranov)
  • Added support for scalar subqueries with aggregate function state result. #4348 (Nikolai Kochetov)
  • Improved server shutdown time and ALTERs waiting time. #4372 (alexey-milovidov)
  • Added info about the replicated_can_become_leader setting to system.replicas and add logging if the replica won't try to become leader. #4379 (Alex Zatelepin)

ClickHouse release 19.1

ClickHouse release 19.1.14, 2019-03-14

  • Fixed error Column ... queried more than once that may happen if the setting asterisk_left_columns_only is set to 1 in case of using GLOBAL JOIN with SELECT * (rare case). The issue does not exist in 19.3 and newer. 6bac7d8d (Artem Zuikov)

ClickHouse release 19.1.13, 2019-03-12

This release contains exactly the same set of patches as 19.3.7.

ClickHouse release 19.1.10, 2019-03-03

This release contains exactly the same set of patches as 19.3.6.

ClickHouse release 19.1.9, 2019-02-21

Bug fixes

  • Fixed backward incompatibility with old versions due to wrong implementation of send_logs_level setting. #4445 (alexey-milovidov)
  • Fixed backward incompatibility of table function remote introduced with column comments. #4446 (alexey-milovidov)

ClickHouse release 19.1.8, 2019-02-16

Bug Fixes

  • Fix install package with missing /etc/clickhouse-server/config.xml. #4343 (proller)

ClickHouse release 19.1.7, 2019-02-15

Bug Fixes

  • Correctly return the right type and properly handle locks in joinGet function. #4153 (Amos Bird)
  • Fixed an error where the server tried to create system logs again at shutdown. #4254 (alexey-milovidov)
  • Fixed error: if there is a database with Dictionary engine, all dictionaries forced to load at server startup, and if there is a dictionary with ClickHouse source from localhost, the dictionary cannot load. #4255 (alexey-milovidov)
  • Fixed a bug when the execution of mutations containing IN operators was producing incorrect results. #4099 (Alex Zatelepin)
  • clickhouse-client can segfault on exit while loading data for command line suggestions if it was run in interactive mode. #4317 (alexey-milovidov)
  • Fixed race condition when selecting from system.tables may give table doesn't exist error. #4313 (alexey-milovidov)
  • Fixed deadlock when SELECT from a table with File engine was retried after No such file or directory error. #4161 (alexey-milovidov)
  • Fixed an issue: local ClickHouse dictionaries are loaded via TCP, but should load within process. #4166 (alexey-milovidov)
  • Fixed No message received error when interacting with PostgreSQL ODBC Driver through TLS connection. Also fixes segfault when using MySQL ODBC Driver. #4170 (alexey-milovidov)
  • Temporarily disable predicate optimization for ORDER BY. #3890 (Winter Zhang)
  • Fixed infinite loop when selecting from table function numbers(0). #4280 (alexey-milovidov)
  • Fixed compile_expressions bug with comparison of big (more than int16) dates. #4341 (alesapin)
  • Fixed segmentation fault with use_uncompressed_cache=1 and exception with wrong uncompressed size. #4186 (alesapin)
  • Fixed ALL JOIN with duplicates in right table. #4184 (Artem Zuikov)
  • Fixed wrong behaviour when doing INSERT ... SELECT ... FROM file(...) query and file has CSVWithNames or TSVWithNames format and the first data row is missing. #4297 (alexey-milovidov)
  • Fixed aggregate functions execution with Array(LowCardinality) arguments. #4055 (KochetovNicolai)
  • Debian package: correct /etc/clickhouse-server/preprocessed link according to config. #4205 (proller)
  • Fixed fuzz test under undefined behavior sanitizer: added parameter type check for quantile*Weighted family of functions. #4145 (alexey-milovidov)
  • Make START REPLICATED SENDS command start replicated sends. #4229 (nvartolomei)
  • Fixed Not found column for duplicate columns in JOIN ON section. #4279 (Artem Zuikov)
  • Now /etc/ssl is used as default directory with SSL certificates. #4167 (alexey-milovidov)
  • Fixed crash on dictionary reload if dictionary not available. #4188 (proller)
  • Fixed bug with incorrect Date and DateTime comparison. #4237 (valexey)
  • Fixed incorrect result when Date and DateTime arguments are used in branches of conditional operator (function if). Added generic case for function if. #4243 (alexey-milovidov)

ClickHouse release 19.1.6, 2019-01-24

New Features

  • Custom per column compression codecs for tables. #3899 #4111 (alesapin, Winter Zhang, Anatoly)
  • Added compression codec Delta. #4052 (alesapin)
  • Allow to ALTER compression codecs. #4054 (alesapin)
  • Added functions left, right, trim, ltrim, rtrim, timestampadd, timestampsub for SQL standard compatibility. #3826 (Ivan Blinkov)
  • Support for write in HDFS tables and hdfs table function. #4084 (alesapin)
  • Added functions to search for multiple constant strings in a big haystack: multiPosition, multiSearch, firstMatch, also with -UTF8, -CaseInsensitive, and -CaseInsensitiveUTF8 variants. #4053 (Danila Kutenin)
  • Pruning of unused shards if SELECT query filters by sharding key (setting optimize_skip_unused_shards). #3851 (Gleb Kanterov, Ivan)
  • Allow Kafka engine to ignore some number of parsing errors per block. #4094 (Ivan)
  • Added support for CatBoost multiclass models evaluation. Function modelEvaluate returns tuple with per-class raw predictions for multiclass models. libcatboostmodel.so should be built with #607. #3959 (KochetovNicolai)
  • Added functions filesystemAvailable, filesystemFree, filesystemCapacity. #4097 (Boris Granveaud)
  • Added hashing functions xxHash64 and xxHash32. #3905 (filimonov)
  • Added gccMurmurHash hashing function (GCC flavoured Murmur hash) which uses the same hash seed as gcc #4000 (sundyli)
  • Added hashing functions javaHash, hiveHash. #3811 (shangshujie365)
  • Added table function remoteSecure. Function works as remote, but uses secure connection. #4088 (proller)
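
The idea behind the Delta compression codec listed above is to store differences between neighboring values instead of the raw values, which tends to produce small, repetitive numbers for slowly changing columns such as timestamps. The following Python sketch shows only that transformation on a list of integers; it ignores byte widths and the on-disk format, and the function names are illustrative.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value unchanged and replace each later value with its difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode by taking running sums."""
    return list(accumulate(deltas))


timestamps = [1548288000, 1548288060, 1548288120, 1548288181]
encoded = delta_encode(timestamps)
print(encoded)  # [1548288000, 60, 60, 61]
print(delta_decode(encoded) == timestamps)  # True
```

The encoded sequence is usually passed on to a general-purpose compressor, which benefits from the small repeated deltas.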

Experimental features

  • Added multiple JOINs emulation (allow_experimental_multiple_joins_emulation setting). #3946 (Artem Zuikov)

Bug Fixes

  • Make compiled_expression_cache_size setting limited by default to lower memory consumption. #4041 (alesapin)
  • Fix a bug that led to hangups in threads that perform ALTERs of Replicated tables and in the thread that updates configuration from ZooKeeper. #2947 #3891 #3934 (Alex Zatelepin)
  • Fixed a race condition when executing a distributed ALTER task. The race condition led to more than one replica trying to execute the task and all replicas except one failing with a ZooKeeper error. #3904 (Alex Zatelepin)
  • Fix a bug when from_zk config elements weren't refreshed after a request to ZooKeeper timed out. #2947 #3947 (Alex Zatelepin)
  • Fix bug with wrong prefix for IPv4 subnet masks. #3945 (alesapin)
  • Fixed crash (std::terminate) in rare cases when a new thread cannot be created due to exhausted resources. #3956 (alexey-milovidov)
  • Fix bug in remote table function execution when wrong restrictions were used in getStructureOfRemoteTable. #4009 (alesapin)
  • Fix a leak of netlink sockets. They were placed in a pool where they were never deleted and new sockets were created at the start of a new thread when all current sockets were in use. #4017 (Alex Zatelepin)
  • Fix bug with closing /proc/self/fd directory earlier than all fds were read from /proc after forking odbc-bridge subprocess. #4120 (alesapin)
  • Fixed String to UInt monotonic conversion in case of usage String in primary key. #3870 (Winter Zhang)
  • Fixed error in calculation of integer conversion function monotonicity. #3921 (alexey-milovidov)
  • Fixed segfault in arrayEnumerateUniq, arrayEnumerateDense functions in case of some invalid arguments. #3909 (alexey-milovidov)
  • Fix UB in StorageMerge. #3910 (Amos Bird)
  • Fixed segfault in functions addDays, subtractDays. #3913 (alexey-milovidov)
  • Fixed error: functions round, floor, trunc, ceil may return bogus result when executed on integer argument and large negative scale. #3914 (alexey-milovidov)
  • Fixed a bug induced by 'kill query sync' which leads to a core dump. #3916 (muVulDeePecker)
  • Fix bug with long delay after empty replication queue. #3928 #3932 (alesapin)
  • Fixed excessive memory usage in case of inserting into table with LowCardinality primary key. #3955 (KochetovNicolai)
  • Fixed LowCardinality serialization for Native format in case of empty arrays. #3907 #4011 (KochetovNicolai)
  • Fixed incorrect result while using distinct by single LowCardinality numeric column. #3895 #4012 (KochetovNicolai)
  • Fixed specialized aggregation with LowCardinality key (in case when compile setting is enabled). #3886 (KochetovNicolai)
  • Fix user and password forwarding for replicated tables queries. #3957 (alesapin) (小路)
  • Fixed very rare race condition that can happen when listing tables in Dictionary database while reloading dictionaries. #3970 (alexey-milovidov)
  • Fixed incorrect result when HAVING was used with ROLLUP or CUBE. #3756 #3837 (Sam Chou)
  • Fixed column aliases for query with JOIN ON syntax and distributed tables. #3980 (Winter Zhang)
  • Fixed error in internal implementation of quantileTDigest (found by Artem Vakhrushev). This error never happens in ClickHouse and was relevant only for those who use ClickHouse codebase as a library directly. #3935 (alexey-milovidov)

Improvements

  • Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN statements along with IF EXISTS in DROP/MODIFY/CLEAR/COMMENT COLUMN. #3900 (Boris Granveaud)
  • Function parseDateTimeBestEffort: support for formats DD.MM.YYYY, DD.MM.YY, DD-MM-YYYY, DD-Mon-YYYY, DD/Month/YYYY and similar. #3922 (alexey-milovidov)
  • CapnProtoInputStream now supports jagged structures. #4063 (Odin Hultgren Van Der Horst)
  • Usability improvement: added a check that the server process is started by the data directory's owner. Do not allow starting the server as root if the data belongs to a non-root user. #3785 (sergey-v-galtsev)
  • Better logic of checking required columns during analysis of queries with JOINs. #3930 (Artem Zuikov)
  • Decreased the number of connections in case of large number of Distributed tables in a single server. #3726 (Winter Zhang)
  • Supported totals row for WITH TOTALS query for ODBC driver. #3836 (Maksim Koritckiy)
  • Allowed to use Enums as integers inside if function. #3875 (Ivan)
  • Added low_cardinality_allow_in_native_format setting. If disabled, do not use LowCardinality type in Native format. #3879 (KochetovNicolai)
  • Removed some redundant objects from compiled expressions cache to lower memory usage. #4042 (alesapin)
  • Add a check that the SET send_logs_level = 'value' query accepts an appropriate value. #3873 (Sabyanin Maxim)
  • Fixed data type check in type conversion functions. #3896 (Winter Zhang)
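
To make the date layouts named for parseDateTimeBestEffort above concrete, the sketch below accepts the same five layouts using Python's `datetime.strptime`. It is a rough stand-in for trying the formats yourself, not a model of ClickHouse's parser; the format list, the function name, and the reliance on English month names (default C locale) are assumptions of this example.

```python
from datetime import date, datetime

# DD.MM.YYYY, DD.MM.YY, DD-MM-YYYY, DD-Mon-YYYY, DD/Month/YYYY
FORMATS = ["%d.%m.%Y", "%d.%m.%y", "%d-%m-%Y", "%d-%b-%Y", "%d/%B/%Y"]


def parse_date_best_effort(text: str) -> date:
    """Try each known layout in turn and return the first successful parse."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # layout did not match, try the next one
    raise ValueError(f"unrecognized date layout: {text!r}")


for sample in ["24.12.2018", "24.12.18", "24-12-2018", "24-Dec-2018", "24/December/2018"]:
    print(sample, "->", parse_date_best_effort(sample))  # each parses to 2018-12-24
```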

Performance Improvements

  • Add a MergeTree setting use_minimalistic_part_header_in_zookeeper. If enabled, Replicated tables will store compact part metadata in a single part znode. This can dramatically reduce ZooKeeper snapshot size (especially if the tables have a lot of columns). Note that after enabling this setting you will not be able to downgrade to a version that doesn't support it. #3960 (Alex Zatelepin)
  • Add a DFA-based implementation for functions sequenceMatch and sequenceCount in case the pattern doesn't contain time. #4004 (Léo Ercolanelli)
  • Performance improvement for integer numbers serialization. #3968 (Amos Bird)
  • Zero left padding PODArray so that -1 element is always valid and zeroed. It's used for branchless calculation of offsets. #3920 (Amos Bird)
  • Reverted jemalloc version which led to performance degradation. #4018 (alexey-milovidov)

Backward Incompatible Changes

  • Removed undocumented feature ALTER MODIFY PRIMARY KEY because it was superseded by the ALTER MODIFY ORDER BY command. #3887 (Alex Zatelepin)
  • Removed function shardByHash. #3833 (alexey-milovidov)
  • Forbid using scalar subqueries with result of type AggregateFunction. #3865 (Ivan)

Build/Testing/Packaging Improvements

  • Added support for PowerPC (ppc64le) build. #4132 (Danila Kutenin)
  • Stateful functional tests are run on a publicly available dataset. #3969 (alexey-milovidov)
  • Fixed error when the server cannot start with the bash: /usr/bin/clickhouse-extract-from-config: Operation not permitted message within Docker or systemd-nspawn. #4136 (alexey-milovidov)
  • Updated rdkafka library to v1.0.0-RC5. Used cppkafka instead of raw C interface. #4025 (Ivan)
  • Updated mariadb-client library. Fixed one of issues found by UBSan. #3924 (alexey-milovidov)
  • Some fixes for UBSan builds. #3926 #3021 #3948 (alexey-milovidov)
  • Added per-commit runs of tests with UBSan build.
  • Added per-commit runs of PVS-Studio static analyzer.
  • Fixed bugs found by PVS-Studio. #4013 (alexey-milovidov)
  • Fixed glibc compatibility issues. #4100 (alexey-milovidov)
  • Move Docker images to 18.10 and add compatibility file for glibc >= 2.28 #3965 (alesapin)
  • Add env variable if the user doesn't want to chown directories in the server Docker image. #3967 (alesapin)
  • Enabled most of the warnings from -Weverything in clang. Enabled -Wpedantic. #3986 (alexey-milovidov)
  • Added a few more warnings that are available only in clang 8. #3993 (alexey-milovidov)
  • Link to libLLVM rather than to individual LLVM libs when using shared linking. #3989 (Orivej Desh)
  • Added sanitizer variables for test images. #4072 (alesapin)
  • clickhouse-server debian package will recommend libcap2-bin package to use setcap tool for setting capabilities. This is optional. #4093 (alexey-milovidov)
  • Improved compilation time, fixed includes. #3898 (proller)
  • Added performance tests for hash functions. #3918 (filimonov)
  • Fixed cyclic library dependencies. #3958 (proller)
  • Improved compilation with low available memory. #4030 (proller)
  • Added test script to reproduce performance degradation in jemalloc. #4036 (alexey-milovidov)
  • Fixed misspellings in comments and string literals under dbms. #4122 (maiha)
  • Fixed typos in comments. #4089 (Evgenii Pravda)

ClickHouse release 18.16

ClickHouse release 18.16.1, 2018-12-21

Bug fixes:

  • Fixed an error that led to problems with updating dictionaries with the ODBC source. #3825, #3829
  • JIT compilation of aggregate functions now works with LowCardinality columns. #3838

Improvements:

  • Added the low_cardinality_allow_in_native_format setting (enabled by default). When disabled, LowCardinality columns will be converted to ordinary columns for SELECT queries and ordinary columns will be expected for INSERT queries. #3879

Build improvements:

  • Fixes for builds on macOS and ARM.

ClickHouse release 18.16.0, 2018-12-14

New features:

  • DEFAULT expressions are evaluated for missing fields when loading data in semi-structured input formats (JSONEachRow, TSKV). The feature is enabled with the insert_sample_with_metadata setting. #3555
  • The ALTER TABLE query now has the MODIFY ORDER BY action for changing the sorting key when adding or removing a table column. This is useful for tables in the MergeTree family that perform additional tasks when merging based on this sorting key, such as SummingMergeTree, AggregatingMergeTree, and so on. #3581 #3755
  • For tables in the MergeTree family, now you can specify a different sorting key (ORDER BY) and index (PRIMARY KEY). The sorting key can be longer than the index. #3581
  • Added the hdfs table function and the HDFS table engine for importing and exporting data to HDFS. chenxing-xc
  • Added functions for working with base64: base64Encode, base64Decode, tryBase64Decode. Alexander Krasheninnikov
  • Now you can use a parameter to configure the precision of the uniqCombined aggregate function (select the number of HyperLogLog cells). #3406
  • Added the system.contributors table that contains the names of everyone who made commits in ClickHouse. #3452
  • Added the ability to omit the partition for the ALTER TABLE ... FREEZE query in order to back up all partitions at once. #3514
  • Added dictGet and dictGetOrDefault functions that don't require specifying the type of return value. The type is determined automatically from the dictionary description. Amos Bird
  • Now you can specify comments for a column in the table description and change it using ALTER. #3377
  • Reading is supported for Join type tables with simple keys. Amos Bird
  • Now you can specify the options join_use_nulls, max_rows_in_join, max_bytes_in_join, and join_overflow_mode when creating a Join type table. Amos Bird
  • Added the joinGet function that allows you to use a Join type table like a dictionary. Amos Bird
  • Added the partition_key, sorting_key, primary_key, and sampling_key columns to the system.tables table in order to provide information about table keys. #3609
  • Added the is_in_partition_key, is_in_sorting_key, is_in_primary_key, and is_in_sampling_key columns to the system.columns table. #3609
  • Added the min_time and max_time columns to the system.parts table. These columns are populated when the partitioning key is an expression consisting of DateTime columns. Emmanuel Donin de Rosière
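
The three base64 functions listed above can be mimicked outside the server for a quick sanity check. This is a minimal Python sketch, not ClickHouse code: the helper names are hypothetical mirrors of the SQL functions, and the empty-string result of `try_base64_decode` on malformed input is an assumption about how tryBase64Decode reports errors.

```python
import base64


def base64_encode(text: str) -> str:
    """Rough analogue of base64Encode: encode UTF-8 text as base64."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def base64_decode(encoded: str) -> str:
    """Rough analogue of base64Decode: raises ValueError on malformed input."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")


def try_base64_decode(encoded: str) -> str:
    """Rough analogue of tryBase64Decode (assumed to yield '' on malformed input)."""
    try:
        return base64_decode(encoded)
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return ""


print(base64_encode("clickhouse"))       # Y2xpY2tob3VzZQ==
print(try_base64_decode("not base64!"))  # prints an empty line
```

The strict variant fails loudly, while the `try` variant is convenient when a column mixes valid and invalid values and the query should not be aborted.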

Bug fixes:

  • Fixes and performance improvements for the LowCardinality data type: GROUP BY using LowCardinality(Nullable(...)); getting the values of extremes; processing higher-order functions; LEFT ARRAY JOIN; distributed GROUP BY; functions that return Array; execution of ORDER BY; writing to Distributed tables (nicelulu); backward compatibility for INSERT queries from old clients that implement the Native protocol; support for LowCardinality for JOIN; improved performance when working in a single stream. #3823 #3803 #3799 #3769 #3744 #3681 #3651 #3649 #3641 #3632 #3568 #3523 #3518
  • Fixed how the select_sequential_consistency option works. Previously, when this setting was enabled, an incomplete result was sometimes returned after beginning to write to a new partition. #2863
  • Databases are correctly specified when executing DDL ON CLUSTER queries and ALTER UPDATE/DELETE. #3772 #3460
  • Databases are correctly specified for subqueries inside a VIEW. #3521
  • Fixed a bug in PREWHERE with FINAL for VersionedCollapsingMergeTree. 7167bfd7
  • Now you can use KILL QUERY to cancel queries that have not started yet because they are waiting for the table to be locked. #3517
  • Corrected date and time calculations if the clocks were moved back at midnight (this happens in Iran, and happened in Moscow from 1981 to 1983). Previously, this led to the time being reset a day earlier than necessary, and also caused incorrect formatting of the date and time in text format. #3819
  • Fixed bugs in some cases of VIEW and subqueries that omit the database. Winter Zhang
  • Fixed a race condition when simultaneously reading from a MATERIALIZED VIEW and deleting a MATERIALIZED VIEW due to not locking the internal MATERIALIZED VIEW. #3404 #3694
  • Fixed the error Lock handler cannot be nullptr. #3689
  • Fixed query processing when the compile_expressions option is enabled (it's enabled by default). Nondeterministic constant expressions like the now function are no longer unfolded. #3457
  • Fixed a crash when specifying a non-constant scale argument in toDecimal32/64/128 functions.
  • Fixed an error when trying to insert an array with NULL elements in the Values format into a column of type Array without Nullable (if input_format_values_interpret_expressions = 1). #3487 #3503
  • Fixed continuous error logging in DDLWorker if ZooKeeper is not available. 8f50c620
  • Fixed the return type for quantile* functions from Date and DateTime types of arguments. #3580
  • Fixed the WITH clause if it specifies a simple alias without expressions. #3570
  • Fixed processing of queries with named sub-queries and qualified column names when enable_optimize_predicate_expression is enabled. Winter Zhang
  • Fixed the error Attempt to attach to nullptr thread group when working with materialized views. Marek Vavruša
  • Fixed a crash when passing certain incorrect arguments to the arrayReverse function. 73e3a7b6
  • Fixed the buffer overflow in the extractURLParameter function. Improved performance. Added correct processing of strings containing zero bytes. 141e9799
  • Fixed buffer overflow in the lowerUTF8 and upperUTF8 functions. Removed the ability to execute these functions over FixedString type arguments. #3662
  • Fixed a rare race condition when deleting MergeTree tables. #3680
  • Fixed a race condition when reading from Buffer tables and simultaneously performing ALTER or DROP on the target tables. #3719
  • Fixed a segfault if the max_temporary_non_const_columns limit was exceeded. #3788

Improvements:

  • The server no longer writes the processed configuration files to the /etc/clickhouse-server/ directory. Instead, it saves them in the preprocessed_configs directory inside path. This means that the /etc/clickhouse-server/ directory does not need to be writable by the clickhouse user, which improves security. #2443
  • The min_merge_bytes_to_use_direct_io option is set to 10 GiB by default. A merge that forms large parts of tables from the MergeTree family will be performed in O_DIRECT mode, which prevents excessive page cache eviction. #3504
  • Accelerated server start when there is a very large number of tables. #3398
  • Added a connection pool and HTTP Keep-Alive for connections between replicas. #3594
  • If the query syntax is invalid, the 400 Bad Request code is returned in the HTTP interface (500 was returned previously). 31bc680a
  • The join_default_strictness option is set to ALL by default for compatibility. 120e2cbe
  • Removed logging to stderr from the re2 library for invalid or complex regular expressions. #3723
  • Added for the Kafka table engine: checks for subscriptions before beginning to read from Kafka; the kafka_max_block_size setting for the table. Marek Vavruša
  • The cityHash64, farmHash64, metroHash64, sipHash64, halfMD5, murmurHash2_32, murmurHash2_64, murmurHash3_32, and murmurHash3_64 functions now work for any number of arguments and for arguments in the form of tuples. #3451 #3519
  • The arrayReverse function now works with any types of arrays. 73e3a7b6
  • Added an optional parameter: the slot size for the timeSlots function. Kirill Shvakov
  • For FULL and RIGHT JOIN, the max_block_size setting is used for a stream of non-joined data from the right table. Amos Bird
  • Added the --secure command line parameter in clickhouse-benchmark and clickhouse-performance-test to enable TLS. #3688 #3690
  • Added type conversion for cases when the structure of a Buffer type table does not match the structure of the destination table. Vitaly Baranov
  • Added the tcp_keep_alive_timeout option to enable keep-alive packets after inactivity for the specified time interval. #3441
  • Removed unnecessary quoting of values for the partition key in the system.parts table if it consists of a single column. #3652
  • The modulo function works for Date and DateTime data types. #3385
  • Added synonyms for the POWER, LN, LCASE, UCASE, REPLACE, LOCATE, SUBSTR, and MID functions. #3774 #3763 Some function names are case-insensitive for compatibility with the SQL standard. Added syntactic sugar SUBSTRING(expr FROM start FOR length) for compatibility with SQL. #3804
  • Added the ability to mlock memory pages corresponding to clickhouse-server executable code to prevent it from being forced out of memory. This feature is disabled by default. #3553
  • Improved performance when reading from O_DIRECT (with the min_bytes_to_use_direct_io option enabled). #3405
  • Improved performance of the dictGet...OrDefault function for a constant key argument and a non-constant default argument. Amos Bird
  • The firstSignificantSubdomain function now processes the domains gov, mil, and edu. Igor Hatarist. Its performance was also improved. #3628
  • Ability to specify custom environment variables for starting clickhouse-server using the SYS-V init.d script by defining CLICKHOUSE_PROGRAM_ENV in /etc/default/clickhouse. Pavlo Bashynskyi
  • Correct return code for the clickhouse-server init script. #3516
  • The system.metrics table now has the VersionInteger metric, and system.build_options has the added line VERSION_INTEGER, which contains the numeric form of the ClickHouse version, such as 18016000. #3644
  • Removed the ability to compare the Date type with a number to avoid potential errors like date = 2018-12-17, where quotes around the date are omitted by mistake. #3687
  • Fixed the behavior of stateful functions like rowNumberInAllBlocks. They previously output a result that was one number larger due to starting during query analysis. Amos Bird
  • If the force_restore_data file can't be deleted, an error message is displayed. Amos Bird
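
The numeric version form mentioned above (18016000 for release 18.16.0) looks like a positional encoding with three decimal digits each for the minor and patch components. The Python sketch below works under that assumption, which is inferred only from the single example in this changelog; the function names are hypothetical.

```python
def version_integer(major: int, minor: int, patch: int) -> int:
    """Pack a version as major * 1_000_000 + minor * 1_000 + patch (assumed layout)."""
    if not (0 <= minor < 1000 and 0 <= patch < 1000):
        raise ValueError("minor and patch must fit in three decimal digits")
    return major * 1_000_000 + minor * 1_000 + patch


def split_version_integer(value: int) -> tuple:
    """Inverse of version_integer: recover (major, minor, patch)."""
    major, rest = divmod(value, 1_000_000)
    minor, patch = divmod(rest, 1_000)
    return major, minor, patch


print(version_integer(18, 16, 0))       # 18016000
print(split_version_integer(18016000))  # (18, 16, 0)
```

A single integer like this makes version comparisons in monitoring rules a plain numeric comparison instead of a string parse.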

Build improvements:

  • Updated the jemalloc library, which fixes a potential memory leak. Amos Bird
  • Profiling with jemalloc is enabled by default for debug builds. 2cc82f5c
  • Added the ability to run integration tests when only Docker is installed on the system. #3650
  • Added the fuzz expression test in SELECT queries. #3442
  • Added a stress test for commits, which performs functional tests in parallel and in random order to detect more race conditions. #3438
  • Improved the method for starting clickhouse-server in a Docker image. Elghazal Ahmed
  • For a Docker image, added support for initializing databases using files in the /docker-entrypoint-initdb.d directory. Konstantin Lebedev
  • Fixes for builds on ARM. #3709

Backward incompatible changes:

  • Removed the ability to compare the Date type with a number. Instead of toDate('2018-12-18') = 17883, you must use explicit type conversion: toDate('2018-12-18') = toDate(17883). #3687
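
The number 17883 in the entry above is the count of days since the Unix epoch, and the comparison was removed because an unquoted date is silently parsed as integer subtraction. A small Python sketch of both points follows; the helper names are hypothetical and only approximate what toDate does with a numeric argument.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def to_date(day_number: int) -> date:
    """Interpret an integer as a number of days since 1970-01-01."""
    return EPOCH + timedelta(days=day_number)


def to_day_number(d: date) -> int:
    """Inverse conversion: days elapsed since 1970-01-01."""
    return (d - EPOCH).days


print(to_date(17883))                    # 2018-12-18
print(to_day_number(date(2018, 12, 18)))  # 17883

# The pitfall the change guards against: without quotes, 2018-12-17 is
# arithmetic, so a filter would compare the date column with 1989.
print(2018 - 12 - 17)                    # 1989
```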

ClickHouse release 18.14

ClickHouse release 18.14.19, 2018-12-19

Bug fixes:

  • Fixed an error that led to problems with updating dictionaries with the ODBC source. #3825, #3829
  • Databases are correctly specified when executing DDL ON CLUSTER queries. #3460
  • Fixed a segfault if the max_temporary_non_const_columns limit was exceeded. #3788

Build improvements:

  • Fixes for builds on ARM.

ClickHouse release 18.14.18, 2018-12-04

Bug fixes:

  • Fixed an error in the dictGet... functions for dictionaries of type range that occurred if one of the arguments was constant and the other was not. #3751
  • Fixed an error that caused messages like netlink: '...': attribute type 1 has an invalid length to be printed in the Linux kernel log. This happened only on sufficiently recent versions of the Linux kernel. #3749
  • Fixed segfault in function empty for argument of FixedString type. Daniel, Dao Quang Minh
  • Fixed excessive memory allocation when using large value of max_query_size setting (a memory chunk of max_query_size bytes was preallocated at once). #3720

Build changes:

  • Fixed build with LLVM/Clang libraries of version 7 from the OS packages (these libraries are used for runtime query compilation). #3582

ClickHouse release 18.14.17, 2018-11-30

Bug fixes:

  • Fixed cases when the ODBC bridge process did not terminate with the main server process. #3642
  • Fixed synchronous insertion into the Distributed table with a columns list that differs from the column list of the remote table. #3673
  • Fixed a rare race condition that can lead to a crash when dropping a MergeTree table. #3643
  • Fixed a query deadlock that occurred when query thread creation failed with the Resource temporarily unavailable error. #3643
  • Fixed parsing of the ENGINE clause when the CREATE AS table syntax was used and the ENGINE clause was specified before the AS table (the error resulted in ignoring the specified engine). #3692

ClickHouse release 18.14.15, 2018-11-21

Bug fixes:

  • Fixed overestimation of the memory chunk size while deserializing a column of type Array(String), which led to "Memory limit exceeded" errors. The issue appeared in version 18.12.13. #3589

ClickHouse release 18.14.14, 2018-11-20

Bug fixes:

  • Fixed ON CLUSTER queries when the cluster is configured as secure (the <secure> flag). #3599

Build changes:

  • Fixed build problems (llvm-7 from system, macOS). #3582

ClickHouse release 18.14.13, 2018-11-08

Bug fixes:

  • Fixed the Block structure mismatch in MergingSorted stream error. #3162
  • Fixed ON CLUSTER queries in case when secure connections were turned on in the cluster config (the <secure> flag). #3465
  • Fixed an error in queries that used SAMPLE, PREWHERE and alias columns. #3543
  • Fixed a rare unknown compression method error when the min_bytes_to_use_direct_io setting was enabled. #3544

Performance improvements:

  • Fixed performance regression of queries with GROUP BY of columns of UInt16 or Date type when executing on AMD EPYC processors. Igor Lapko
  • Fixed performance regression of queries that process long strings. #3530

Build improvements:

  • Improvements for simplifying the Arcadia build. #3475, #3535

ClickHouse release 18.14.12, 2018-11-02

Bug fixes:

  • Fixed a crash on joining two unnamed subqueries. #3505
  • Fixed generating incorrect queries (with an empty WHERE clause) when querying external databases. hotid
  • Fixed using an incorrect timeout value in ODBC dictionaries. Marek Vavruša

ClickHouse release 18.14.11, 2018-10-29

Bug fixes:

  • Fixed the error Block structure mismatch in UNION stream: different number of columns in LIMIT queries. #2156
  • Fixed errors when merging data in tables containing arrays inside Nested structures. #3397
  • Fixed incorrect query results if the merge_tree_uniform_read_distribution setting is disabled (it is enabled by default). #3429
  • Fixed an error on inserts to a Distributed table in Native format. #3411

ClickHouse release 18.14.10, 2018-10-23

  • The compile_expressions setting (JIT compilation of expressions) is disabled by default. #3410
  • The enable_optimize_predicate_expression setting is disabled by default.

ClickHouse release 18.14.9, 2018-10-16

New features:

  • The WITH CUBE modifier for GROUP BY (the alternative syntax GROUP BY CUBE(...) is also available). #3172
  • Added the formatDateTime function. Alexandr Krasheninnikov
  • Added the JDBC table engine and jdbc table function (requires installing clickhouse-jdbc-bridge). Alexandr Krasheninnikov
  • Added functions for working with the ISO week number: toISOWeek, toISOYear, toStartOfISOYear, and toDayOfYear. #3146
  • Now you can use Nullable columns for MySQL and ODBC tables. #3362
  • Nested data structures can be read as nested objects in JSONEachRow format. Added the input_format_import_nested_json setting. Veloman Yunkan
  • Parallel processing is available for many MATERIALIZED VIEWs when inserting data. See the parallel_view_processing setting. Marek Vavruša
  • Added the SYSTEM FLUSH LOGS query (forced log flushes to system tables such as query_log) #3321
  • Now you can use pre-defined database and table macros when declaring Replicated tables. #3251
  • Added the ability to read Decimal type values in engineering notation (indicating powers of ten). #3153
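
The ISO week functions listed above follow ISO 8601 week numbering, where a date near a year boundary can belong to the ISO year before or after its calendar year. The Python sketch below illustrates the expected results with the standard library; the helper names are hypothetical stand-ins for toISOYear, toISOWeek, toStartOfISOYear, and toDayOfYear, and `date.fromisocalendar` requires Python 3.8 or later.

```python
from datetime import date


def to_iso_year(d: date) -> int:
    """ISO 8601 year of the week that contains d."""
    return d.isocalendar()[0]


def to_iso_week(d: date) -> int:
    """ISO 8601 week number (1..53)."""
    return d.isocalendar()[1]


def to_start_of_iso_year(d: date) -> date:
    """Monday of ISO week 1 in the ISO year that contains d."""
    return date.fromisocalendar(to_iso_year(d), 1, 1)


def to_day_of_year(d: date) -> int:
    """Ordinal day within the calendar year (1..366)."""
    return d.timetuple().tm_yday


d = date(2018, 12, 31)                # a Monday
print(to_iso_year(d), to_iso_week(d))  # 2019 1
print(to_start_of_iso_year(d))         # 2018-12-31
print(to_day_of_year(d))               # 365
```

The example date shows why toISOYear and toYear can disagree: 2018-12-31 is still in calendar year 2018 but already in ISO week 1 of 2019.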

Experimental features:

  • Optimization of the GROUP BY clause for LowCardinality data types. #3138
  • Optimized calculation of expressions for LowCardinality data types. #3200

Improvements:

  • Significantly reduced memory consumption for queries with ORDER BY and LIMIT. See the max_bytes_before_remerge_sort setting. #3205
  • If the JOIN type (LEFT, INNER, ...) is not specified, INNER JOIN is assumed. #3147
  • Qualified asterisks work correctly in queries with JOIN. Winter Zhang
  • The ODBC table engine correctly chooses the method for quoting identifiers in the SQL dialect of a remote database. Alexandr Krasheninnikov
  • The compile_expressions setting (JIT compilation of expressions) is enabled by default.
  • Fixed behavior for simultaneous DROP DATABASE/TABLE IF EXISTS and CREATE DATABASE/TABLE IF NOT EXISTS. Previously, a CREATE DATABASE ... IF NOT EXISTS query could return the error message "File ... already exists", and the CREATE TABLE ... IF NOT EXISTS and DROP TABLE IF EXISTS queries could return Table ... is creating or attaching right now. #3101
  • LIKE and IN expressions with a constant right half are passed to the remote server when querying from MySQL or ODBC tables. #3182
  • Comparisons with constant expressions in a WHERE clause are passed to the remote server when querying from MySQL and ODBC tables. Previously, only comparisons with constants were passed. #3182
  • Correct calculation of row width in the terminal for Pretty formats, including strings with hieroglyphs. Amos Bird.
  • ON CLUSTER can be specified for ALTER UPDATE queries.
  • Improved performance for reading data in JSONEachRow format. #3332
  • Added synonyms for the LENGTH and CHARACTER_LENGTH functions for compatibility. The CONCAT function is no longer case-sensitive. #3306
  • Added the TIMESTAMP synonym for the DateTime type. #3390
  • There is always space reserved for query_id in the server logs, even if the log line is not related to a query. This makes it easier to parse server text logs with third-party tools.
  • Memory consumption by a query is logged when it exceeds the next level of an integer number of gigabytes. #3205
  • Added compatibility mode for the case when the client library that uses the Native protocol sends fewer columns by mistake than the server expects for the INSERT query. This scenario was possible when using the clickhouse-cpp library. Previously, this scenario caused the server to crash. #3171
  • In a user-defined WHERE expression in clickhouse-copier, you can now use a partition_key alias (for additional filtering by source table partition). This is useful if the partitioning scheme changes during copying, but only changes slightly. #3166
  • The workflow of the Kafka engine has been moved to a background thread pool in order to automatically reduce the speed of data reading at high loads. Marek Vavruša.
  • Support for reading Tuple and Nested values of structures like struct in the Cap'n'Proto format. Marek Vavruša
  • The list of top-level domains for the firstSignificantSubdomain function now includes the domain biz. decaseal
  • In the configuration of external dictionaries, null_value is interpreted as the value of the default data type. #3330
  • Support for the intDiv and intDivOrZero functions for Decimal. b48402e8
  • Support for the Date, DateTime, UUID, and Decimal types as a key for the sumMap aggregate function. #3281
  • Support for the Decimal data type in external dictionaries. #3324
  • Support for the Decimal data type in SummingMergeTree tables. #3348
  • Added specializations for UUID in if. #3366
  • Reduced the number of open and close system calls when reading from a MergeTree table. #3283
  • A TRUNCATE TABLE query can be executed on any replica (the query is passed to the leader replica). Kirill Shvakov
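
To make the sumMap entry above concrete: the aggregate takes parallel arrays of keys and values from each row, sums the values per key, and returns the keys in sorted order together with the totals. Below is a minimal Python sketch of that behavior using Date keys; `sum_map` is a hypothetical helper, not the server implementation.

```python
from collections import defaultdict
from datetime import date


def sum_map(rows):
    """Aggregate (keys, values) array pairs: sum values per key, sort by key."""
    totals = defaultdict(int)
    for keys, values in rows:
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
        for key, value in zip(keys, values):
            totals[key] += value
    sorted_keys = sorted(totals)
    return sorted_keys, [totals[k] for k in sorted_keys]


rows = [
    ([date(2018, 10, 2), date(2018, 10, 1)], [10, 1]),
    ([date(2018, 10, 1)], [5]),
]
keys, sums = sum_map(rows)
print(keys)  # [datetime.date(2018, 10, 1), datetime.date(2018, 10, 2)]
print(sums)  # [6, 10]
```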

Bug fixes:

  • Fixed an issue with Dictionary tables for range_hashed dictionaries. This error occurred in version 18.12.17. #1702
  • Fixed an error when loading range_hashed dictionaries (the message Unsupported type Nullable (...)). This error occurred in version 18.12.17. #3362
  • Fixed errors in the pointInPolygon function due to the accumulation of inaccurate calculations for polygons with a large number of vertices located close to each other. #3331 #3341
  • If after merging data parts, the checksum for the resulting part differs from the result of the same merge in another replica, the result of the merge is deleted and the data part is downloaded from the other replica (this is the correct behavior). But after downloading the data part, it couldn't be added to the working set because of an error that the part already exists (because the data part was deleted with some delay after the merge). This led to cyclical attempts to download the same data. #3194
  • Fixed incorrect calculation of total memory consumption by queries (because of incorrect calculation, the max_memory_usage_for_all_queries setting worked incorrectly and the MemoryTracking metric had an incorrect value). This error occurred in version 18.12.13. Marek Vavruša
  • Fixed the functionality of CREATE TABLE ... ON CLUSTER ... AS SELECT ... This error occurred in version 18.12.13. #3247
  • Fixed unnecessary preparation of data structures for JOINs on the server that initiates the query if the JOIN is only performed on remote servers. #3340
  • Fixed bugs in the Kafka engine: deadlocks after exceptions when starting to read data, and locks upon completion. Marek Vavruša
  • For Kafka tables, the optional schema parameter was not passed (the schema of the Cap'n'Proto format). Vojtech Splichal
  • If the ensemble of ZooKeeper servers has servers that accept the connection but then immediately close it instead of responding to the handshake, ClickHouse now connects to another server. Previously, this produced the error Cannot read all data. Bytes read: 0. Bytes expected: 4. and the server couldn't start. 8218cf3a
  • If the ensemble of ZooKeeper servers contains servers for which the DNS query returns an error, these servers are ignored. 17b8e209
  • Fixed type conversion between Date and DateTime when inserting data in the VALUES format (if input_format_values_interpret_expressions = 1). Previously, the conversion was performed between the numerical value of the number of days in Unix Epoch time and the Unix timestamp, which led to unexpected results. #3229
  • Corrected type conversion between Decimal and integer numbers. #3211
  • Fixed errors in the enable_optimize_predicate_expression setting. Winter Zhang
  • Fixed a parsing error in CSV format with floating-point numbers if a non-default CSV separator is used, such as ; #3155
  • Fixed the arrayCumSumNonNegative function (it does not accumulate negative values if the accumulator is less than zero). Aleksey Studnev
  • Fixed how Merge tables work on top of Distributed tables when using PREWHERE. #3165
  • Bug fixes in the ALTER UPDATE query.
  • Fixed bugs in the odbc table function that appeared in version 18.12. #3197
  • Fixed the operation of aggregate functions with StateArray combinators. #3188
  • Fixed a crash when dividing a Decimal value by zero. 69dd6609
  • Fixed output of types for operations using Decimal and integer arguments. #3224
  • Fixed the segfault during GROUP BY on Decimal128. 3359ba06
  • The log_query_threads setting (logging information about each thread of query execution) now takes effect only if the log_queries option (logging information about queries) is set to 1. Since the log_query_threads option is enabled by default, information about threads was previously logged even if query logging was disabled. #3241
  • Fixed an error in the distributed operation of the quantiles aggregate function (the error message Not found column quantile...). 292a8855
  • Fixed the compatibility problem when working on a cluster of version 18.12.17 servers and older servers at the same time. For distributed queries with GROUP BY keys of both fixed and non-fixed length, if there was a large amount of data to aggregate, the returned data was not always fully aggregated (two different rows contained the same aggregation keys). #3254
  • Fixed handling of substitutions in clickhouse-performance-test, if the query contains only part of the substitutions declared in the test. #3263
  • Fixed an error when using FINAL with PREWHERE. #3298
  • Fixed an error when using PREWHERE over columns that were added during ALTER. #3298
  • Added a check for the absence of arrayJoin for DEFAULT and MATERIALIZED expressions. Previously, arrayJoin led to an error when inserting data. #3337
  • Added a check for the absence of arrayJoin in a PREWHERE clause. Previously, this led to messages like Size ... doesn't match or Unknown compression method when executing queries. #3357
  • Fixed segfault that could occur in rare cases after optimization that replaced AND chains from equality evaluations with the corresponding IN expression. liuyimin-bytedance
  • Minor corrections to clickhouse-benchmark: previously, client information was not sent to the server; now the number of queries executed is calculated more accurately when shutting down and for limiting the number of iterations. #3351 #3352
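
The arrayCumSumNonNegative fix above concerns a running sum that is never allowed to drop below zero. The Python sketch below shows that intended behavior as it is commonly described: whenever the accumulator would become negative it is reset to zero. The function name is a hypothetical mirror of the SQL function.

```python
def array_cum_sum_non_negative(values):
    """Running sum that is clamped at zero after every step."""
    result = []
    total = 0
    for value in values:
        total = max(total + value, 0)  # a negative accumulator is replaced with 0
        result.append(total)
    return result


print(array_cum_sum_non_negative([1, 1, -4, 1]))  # [1, 2, 0, 1]
```

Compared with a plain cumulative sum, which would give [1, 2, -2, -1] for the same input, the clamped variant forgets any deficit once it reaches zero.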

Backward incompatible changes:

  • Removed the allow_experimental_decimal_type option. The Decimal data type is available for default use. #3329

ClickHouse release 18.12

ClickHouse release 18.12.17, 2018-09-16

New features:

  • invalidate_query (the ability to specify a query to check whether an external dictionary needs to be updated) is implemented for the clickhouse source. #3126
  • Added the ability to use UInt*, Int*, and DateTime data types (along with the Date type) as a range_hashed external dictionary key that defines the boundaries of ranges. Now NULL can be used to designate an open range. Vasily Nemkov
  • The Decimal type now supports var* and stddev* aggregate functions. #3129
  • The Decimal type now supports mathematical functions (exp, sin, and so on). #3129
  • The system.part_log table now has the partition_id column. #3089

Bug fixes:

  • Merge now works correctly on Distributed tables. Winter Zhang
  • Fixed incompatibility (unnecessary dependency on the glibc version) that made it impossible to run ClickHouse on Ubuntu Precise and older versions. The incompatibility arose in version 18.12.13. #3130
  • Fixed errors in the enable_optimize_predicate_expression setting. Winter Zhang
  • Fixed a minor issue with backwards compatibility that appeared when working with a cluster of replicas on versions earlier than 18.12.13 and simultaneously creating a new replica of a table on a server with a newer version (shown in the message Can not clone replica, because the ... updated to new ClickHouse version, which is logical, but shouldn't happen). #3122

Backward incompatible changes:

  • The enable_optimize_predicate_expression option is enabled by default (which is rather optimistic). If query analysis errors occur that are related to searching for the column names, set enable_optimize_predicate_expression to 0. Winter Zhang

ClickHouse release 18.12.14, 2018-09-13

New features:

  • Added support for ALTER UPDATE queries. #3035
  • Added the allow_ddl option, which restricts the user's access to DDL queries. #3104
  • Added the min_merge_bytes_to_use_direct_io option for MergeTree engines, which allows you to set a threshold for the total size of the merge (when above the threshold, data part files will be handled using O_DIRECT). #3117
  • The system.merges system table now contains the partition_id column. #3099

Improvements:

  • If a data part remains unchanged during mutation, it isn't downloaded by replicas. #3103
  • Autocomplete is available for names of settings when working with clickhouse-client. #3106

Bug fixes:

  • Added a check for the sizes of arrays that are elements of Nested type fields when inserting. #3118
  • Fixed an error updating external dictionaries with the ODBC source and hashed storage. This error occurred in version 18.12.13.
  • Fixed a crash when creating a temporary table from a query with an IN condition. Winter Zhang
  • Fixed an error in aggregate functions for arrays that can have NULL elements. Winter Zhang

ClickHouse release 18.12.13, 2018-09-10

New features:

  • Added the DECIMAL(digits, scale) data type (Decimal32(scale), Decimal64(scale), Decimal128(scale)). To enable it, use the setting allow_experimental_decimal_type. #2846 #2970 #3008 #3047
  • New WITH ROLLUP modifier for GROUP BY (alternative syntax: GROUP BY ROLLUP(...)). #2948
  • In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting asterisk_left_columns_only to 1 on the user configuration level. Winter Zhang
  • Added support for JOIN with table functions. Winter Zhang
  • Autocomplete by pressing Tab in clickhouse-client. Sergey Shcherbin
  • Ctrl+C in clickhouse-client clears a query that was entered. #2877
  • Added the join_default_strictness setting (values: '', 'any', 'all'). This allows you to not specify ANY or ALL for JOIN. #2982
  • Each line of the server log related to query processing shows the query ID. #2482
  • Now you can get query execution logs in clickhouse-client (use the send_logs_level setting). With distributed query processing, logs are cascaded from all the servers. #2482
  • The system.query_log and system.processes (SHOW PROCESSLIST) tables now have information about all changed settings when you run a query (the nested structure of the Settings data). Added the log_query_settings setting. #2482
  • The system.query_log and system.processes tables now show information about the number of threads that are participating in query execution (see the thread_numbers column). #2482
  • Added ProfileEvents counters that measure the time spent on reading and writing over the network and reading and writing to disk, the number of network errors, and the time spent waiting when network bandwidth is limited. #2482
  • Added ProfileEvents counters that contain the system metrics from rusage (you can use them to get information about CPU usage in userspace and the kernel, page faults, and context switches), as well as taskstats metrics (use these to obtain information about I/O wait time, CPU wait time, and the amount of data read and written, both with and without page cache). #2482
  • The ProfileEvents counters are applied globally and for each query, as well as for each query execution thread, which allows you to profile resource consumption by query in detail. #2482
  • Added the system.query_thread_log table, which contains information about each query execution thread. Added the log_query_threads setting. #2482
  • The system.metrics and system.events tables now have built-in documentation. #3016
  • Added the arrayEnumerateDense function. Amos Bird
  • Added the arrayCumSumNonNegative and arrayDifference functions. Aleksey Studnev
  • Added the retention aggregate function. Sundy Li
  • Now you can add (merge) states of aggregate functions by using the plus operator, and multiply the states of aggregate functions by a nonnegative constant. #3062 #3034
  • Tables in the MergeTree family now have the virtual column _partition_id. #3089
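
The WITH ROLLUP modifier introduced above, and the WITH CUBE modifier listed under release 18.14.9, differ only in which combinations of GROUP BY keys receive subtotals. The Python sketch below enumerates those grouping sets under the standard SQL meaning of ROLLUP and CUBE; the function names are hypothetical, and the order in which ClickHouse emits the sets is not specified here.

```python
from itertools import combinations


def rollup_sets(keys):
    """ROLLUP: every prefix of the key list, from the full list down to ()."""
    return [tuple(keys[:n]) for n in range(len(keys), -1, -1)]


def cube_sets(keys):
    """CUBE: every subset of the key list, largest subsets first."""
    return [
        subset
        for n in range(len(keys), -1, -1)
        for subset in combinations(keys, n)
    ]


print(rollup_sets(["year", "month"]))
# [('year', 'month'), ('year',), ()]
print(cube_sets(["year", "month"]))
# [('year', 'month'), ('year',), ('month',), ()]
```

For k keys, ROLLUP produces k + 1 grouping sets while CUBE produces 2^k, which is worth keeping in mind for queries that group by many columns.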

Experimental features:

  • Added the LowCardinality(T) data type. This data type automatically creates a local dictionary of values and allows data processing without unpacking the dictionary. #2830
  • Added a cache of JIT-compiled functions and a counter for the number of uses before compiling. To JIT compile expressions, enable the compile_expressions setting. #2990 #3077

Improvements:

  • Fixed the problem with unlimited accumulation of the replication log when there are abandoned replicas. Added an effective recovery mode for replicas with a long lag.
  • Improved performance of GROUP BY with multiple aggregation fields when one of them is string and the others are fixed length.
  • Improved performance when using PREWHERE and when expressions are implicitly moved to PREWHERE.
  • Improved parsing performance for text formats (CSV, TSV). Amos Bird #2980
  • Improved performance of reading strings and arrays in binary formats. Amos Bird
  • Increased performance and reduced memory consumption for queries to system.tables and system.columns when there is a very large number of tables on a single server. #2953
  • Fixed a performance problem in the case of a large stream of queries that result in an error (the _dl_addr function is visible in perf top, but the server isn't using much CPU). #2938
  • Conditions are pushed down into the View (when enable_optimize_predicate_expression is enabled). Winter Zhang
  • Improvements to the functionality for the UUID data type. #3074 #2985
  • The UUID data type is supported in dictionaries. The-Alchemist #2822
  • The visitParamExtractRaw function works correctly with nested structures. Winter Zhang
  • When the input_format_skip_unknown_fields setting is enabled, object fields in JSONEachRow format are skipped correctly. BlahGeek
  • For a CASE expression with conditions, you can now omit ELSE, which is equivalent to ELSE NULL. #2920
  • The operation timeout can now be configured when working with ZooKeeper. urykhy
  • You can specify an offset with the LIMIT n OFFSET m syntax, which is equivalent to LIMIT m, n. #2840
  • You can use the SELECT TOP n syntax as an alternative for LIMIT. #2840
  • Increased the size of the queue to write to system tables, so the SystemLog parameter queue is full error doesn't happen as often.
  • The windowFunnel aggregate function now supports events that meet multiple conditions. Amos Bird
  • Duplicate columns can be used in a USING clause for JOIN. #3006
  • Pretty formats now have a limit on column alignment by width. Use the output_format_pretty_max_column_pad_width setting. If a value is wider, it will still be displayed in its entirety, but the other cells in the table will not be too wide. #3003
  • The odbc table function now allows you to specify the database/schema name. Amos Bird
  • Added the ability to use a username specified in the clickhouse-client config file. Vladimir Kozbin
  • The ZooKeeperExceptions counter has been split into three counters: ZooKeeperUserExceptions, ZooKeeperHardwareExceptions, and ZooKeeperOtherExceptions.
  • ALTER DELETE queries work for materialized views.
  • Added randomization when running the cleanup thread periodically for ReplicatedMergeTree tables in order to avoid periodic load spikes when there are a very large number of ReplicatedMergeTree tables.
  • Support for ATTACH TABLE ... ON CLUSTER queries. #3025
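
The two LIMIT spellings mentioned in this list take their arguments in opposite order, which is easy to get wrong. The sketch below models both with Python slicing, assuming the usual semantics (the comma form skips the first number of rows and returns the second; the OFFSET form returns the first number of rows after skipping the second). The function names are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def limit_comma(rows: Sequence[T], skip: int, count: int) -> List[T]:
    """Model of `LIMIT skip, count`: skip `skip` rows, return at most `count` rows."""
    return list(rows[skip:skip + count])


def limit_offset(rows: Sequence[T], count: int, skip: int) -> List[T]:
    """Model of `LIMIT count OFFSET skip`: the same rows, arguments swapped."""
    return list(rows[skip:skip + count])


rows = list(range(10))
print(limit_comma(rows, 2, 3))   # [2, 3, 4]
print(limit_offset(rows, 3, 2))  # [2, 3, 4]
```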

Bug fixes:

  • Fixed an issue with Dictionary tables (throws the Size of offsets doesn't match size of column or Unknown compression method exception). This bug appeared in version 18.10.3. #2913
  • Fixed a bug when merging CollapsingMergeTree tables if one of the data parts is empty (these parts are formed during merge or ALTER DELETE if all data was deleted), and the vertical algorithm was used for the merge. #3049
  • Fixed a race condition during DROP or TRUNCATE for Memory tables with a simultaneous SELECT, which could lead to server crashes. This bug appeared in version 1.1.54388. #3038
  • Fixed the possibility of data loss when inserting in Replicated tables if the Session is expired error is returned (data loss can be detected by the ReplicatedDataLoss metric). This error occurred in version 1.1.54378. #2939 #2949 #2964
  • Fixed a segfault during JOIN ... ON. #3000
  • Fixed an error in searching column names when the WHERE expression consists entirely of a qualified column name, such as WHERE table.column. #2994
  • Fixed the "Not found column" error that occurred when executing distributed queries if a single column consisting of an IN expression with a subquery is requested from a remote server. #3087
  • Fixed the Block structure mismatch in UNION stream: different number of columns error that occurred for distributed queries if one of the shards is local and the other is not, and optimization of the move to PREWHERE is triggered. #2226 #3037 #3055 #3065 #3073 #3090 #3093
  • Fixed the pointInPolygon function for certain cases of non-convex polygons. #2910
  • Fixed the incorrect result when comparing nan with integers. #3024
  • Fixed an error in the zlib-ng library that could lead to segfault in rare cases. #2854
  • Fixed a memory leak when inserting into a table with AggregateFunction columns, if the state of the aggregate function is not simple (allocates memory separately), and if a single insertion request results in multiple small blocks. #3084
  • Fixed a race condition when creating and deleting the same Buffer or MergeTree table simultaneously.
  • Fixed the possibility of a segfault when comparing tuples made up of certain non-trivial types, such as tuples. #2989
  • Fixed the possibility of a segfault when running certain ON CLUSTER queries. Winter Zhang
  • Fixed an error in the arrayDistinct function for Nullable array elements. #2845 #2937
  • The enable_optimize_predicate_expression option now correctly supports cases with SELECT *. Winter Zhang
  • Fixed the segfault when re-initializing the ZooKeeper session. #2917
  • Fixed potential blocking when working with ZooKeeper.
  • Fixed incorrect code for adding nested data structures in a SummingMergeTree.
  • When allocating memory for states of aggregate functions, alignment is correctly taken into account, which makes it possible to use operations that require alignment when implementing states of aggregate functions. chenxing-xc

Security fix:

  • Safe use of ODBC data sources. Interaction with ODBC drivers uses a separate clickhouse-odbc-bridge process. Errors in third-party ODBC drivers no longer cause problems with server stability or vulnerabilities. #2828 #2879 #2886 #2893 #2921
  • Fixed incorrect validation of the file path in the catBoostPool table function. #2894
  • The contents of system tables (tables, databases, parts, columns, parts_columns, merges, mutations, replicas, and replication_queue) are filtered according to the user's configured access to databases (allow_databases). Winter Zhang

Backward incompatible changes:

  • In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting asterisk_left_columns_only to 1 on the user configuration level.

Build changes:

  • Most integration tests can now be run by commit.
  • Code style checks can also be run by commit.
  • The memcpy implementation is chosen correctly when building on CentOS7/Fedora. Etienne Champetier
  • When using clang to build, some warnings from -Weverything have been added, in addition to the regular -Wall -Wextra -Werror. #2957
  • Debug builds use the jemalloc debug option.
  • The interface of the library for interacting with ZooKeeper is declared abstract. #2950

ClickHouse release 18.10

ClickHouse release 18.10.3, 2018-08-13

New features:

  • HTTPS can be used for replication. #2760
  • Added the functions murmurHash2_64, murmurHash3_32, murmurHash3_64, and murmurHash3_128 in addition to the existing murmurHash2_32. #2791
  • Support for Nullable types in the ClickHouse ODBC driver (ODBCDriver2 output format). #2834
  • Support for UUID in the key columns.

Improvements:

  • Clusters can be removed without restarting the server when they are deleted from the config files. #2777
  • External dictionaries can be removed without restarting the server when they are removed from config files. #2779
  • Added SETTINGS support for the Kafka table engine. Alexander Marshalov
  • Improvements for the UUID data type (not yet complete). #2618
  • Support for empty parts after merges in the SummingMergeTree, CollapsingMergeTree and VersionedCollapsingMergeTree engines. #2815
  • Old records of completed mutations are deleted (ALTER DELETE). #2784
  • Added the system.merge_tree_settings table. Kirill Shvakov
  • The system.tables table now has dependency columns: dependencies_database and dependencies_table. Winter Zhang
  • Added the max_partition_size_to_drop config option. #2782
  • Added the output_format_json_escape_forward_slashes option. Alexander Bocharov
  • Added the max_fetch_partition_retries_count setting. #2831
  • Added the prefer_localhost_replica setting, which lets you disable the preference for a local replica and the direct access to a local replica without inter-process interaction. #2832
  • The quantileExact aggregate function returns nan in the case of aggregation on an empty Float32 or Float64 set. Sundy Li

Bug fixes:

  • Removed unnecessary escaping of the connection string parameters for ODBC, which made it impossible to establish a connection. This error occurred in version 18.6.0.
  • Fixed the logic for processing REPLACE PARTITION commands in the replication queue. If there are two REPLACE commands for the same partition, the incorrect logic could cause one of them to remain in the replication queue and not be executed. #2814
  • Fixed a merge bug when all data parts were empty (parts that were formed from a merge or from ALTER DELETE if all data was deleted). This bug appeared in version 18.1.0. #2930
  • Fixed an error for concurrent Set or Join. Amos Bird
  • Fixed the Block structure mismatch in UNION stream: different number of columns error that occurred for UNION ALL queries inside a sub-query if one of the SELECT queries contains duplicate column names. Winter Zhang
  • Fixed a memory leak if an exception occurred when connecting to a MySQL server.
  • Fixed incorrect clickhouse-client response code in case of a query error.
  • Fixed incorrect behavior of materialized views containing DISTINCT. #2795

Backward incompatible changes:

  • Removed support for CHECK TABLE queries for Distributed tables.

Build changes:

  • The allocator has been replaced: jemalloc is now used instead of tcmalloc. In some scenarios, this increases speed by up to 20%. However, there are queries that have slowed by up to 20%. Memory consumption has been reduced by approximately 10% in some scenarios, with improved stability. With highly concurrent loads, CPU usage in userspace and in system shows just a slight increase. #2773
  • Use of libressl from a submodule. #1983 #2807
  • Use of unixodbc from a submodule. #2789
  • Use of mariadb-connector-c from a submodule. #2785
  • Added functional test files to the repository that depend on the availability of test data (for the time being, without the test data itself).

ClickHouse release 18.6

ClickHouse release 18.6.0, 2018-08-02

New features:

  • Added support for arbitrary expressions in the ON section of the JOIN ON syntax: JOIN ON Expr([table.]column, ...) = Expr([table.]column, ...) [AND Expr([table.]column, ...) = Expr([table.]column, ...) ...]. The expression must be a chain of equalities joined by the AND operator. Each side of the equality can be an arbitrary expression over the columns of one of the tables. The use of fully qualified column names is supported (table.name, database.table.name, table_alias.name, subquery_alias.name) for the right table. #2742
  • HTTPS can be enabled for replication. #2760
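
To illustrate what "a chain of equalities joined by AND" over expressions means, here is a minimal Python sketch of an inner hash join whose key is a tuple of per-side expressions. It is only a model of the query shape; the changelog does not describe how ClickHouse executes such joins, and every name below (hash_join, the sample rows and columns) is hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Sequence, Tuple

Row = Dict[str, Any]
KeyExpr = Callable[[Row], Any]


def hash_join(
    left: Sequence[Row],
    right: Sequence[Row],
    left_keys: Sequence[KeyExpr],
    right_keys: Sequence[KeyExpr],
) -> List[Tuple[Row, Row]]:
    """Inner join on left_keys[0] = right_keys[0] AND left_keys[1] = right_keys[1] ..."""
    # Build a hash table over the right side, keyed by the tuple of its expressions.
    index: Dict[Tuple[Any, ...], List[Row]] = defaultdict(list)
    for r in right:
        index[tuple(expr(r) for expr in right_keys)].append(r)
    # Probe with the tuple of left-side expressions; every equality must hold at once.
    joined: List[Tuple[Row, Row]] = []
    for l in left:
        for r in index.get(tuple(expr(l) for expr in left_keys), []):
            joined.append((l, r))
    return joined


# Models: ... ON lower(l.name) = r.login AND l.id % 10 = r.bucket
left_rows = [{"id": 17, "name": "Alice"}, {"id": 28, "name": "BOB"}]
right_rows = [
    {"bucket": 7, "login": "alice"},
    {"bucket": 8, "login": "bob"},
    {"bucket": 1, "login": "bob"},
]
pairs = hash_join(
    left_rows,
    right_rows,
    [lambda r: r["name"].lower(), lambda r: r["id"] % 10],
    [lambda r: r["login"], lambda r: r["bucket"]],
)
print(len(pairs))  # 2
```

Each expression reads columns from only one side, which is what allows the key tuples to be computed independently for the two tables.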

Improvements:

  • The server passes the patch component of its version to the client. Data about the patch version component is in system.processes and query_log. #2646

ClickHouse release 18.5

ClickHouse release 18.5.1, 2018-07-31

New features:

  • Added the hash function murmurHash2_32 #2756.

Improvements:

  • Now you can use the from_env attribute to set values in config files from environment variables #2741.
  • Added case-insensitive versions of the coalesce, ifNull, and nullIf functions #2752.

Bug fixes:

  • Fixed a possible bug when starting a replica #2759.

ClickHouse release 18.4

ClickHouse release 18.4.0, 2018-07-28

New features:

  • Added system tables: formats, data_type_families, aggregate_function_combinators, table_functions, table_engines, collations #2721.
  • Added the ability to use a table function instead of a table as an argument of a remote or cluster table function #2708.
  • Support for HTTP Basic authentication in the replication protocol #2727.
  • The has function now allows searching for a numeric value in an array of Enum values Maxim Khrisanfov.
  • Support for adding arbitrary message separators when reading from Kafka Amos Bird.

Improvements:

  • The ALTER TABLE t DELETE WHERE query does not rewrite data parts that were not affected by the WHERE condition #2694.
  • The use_minimalistic_checksums_in_zookeeper option for ReplicatedMergeTree tables is enabled by default. This setting was added in version 1.1.54378, 2018-04-16. Versions that are older than 1.1.54378 can no longer be installed.
  • Support for running KILL and OPTIMIZE queries that specify ON CLUSTER Winter Zhang.

Bug fixes:

  • Fixed the error Column ... is not under an aggregate function and not in GROUP BY for aggregation with an IN expression. This bug appeared in version 18.1.0. (bbdd780b)
  • Fixed a bug in the windowFunnel aggregate function Winter Zhang.
  • Fixed a bug in the anyHeavy aggregate function (a2101df2)
  • Fixed server crash when using the countArray() aggregate function.

Backward incompatible changes:

  • Parameters for the Kafka engine were changed from Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format[, kafka_schema, kafka_num_consumers]) to Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format[, kafka_row_delimiter, kafka_schema, kafka_num_consumers]). If your tables use the kafka_schema or kafka_num_consumers parameters, you have to manually edit the metadata files path/metadata/database/table.sql and add the kafka_row_delimiter parameter with the '' value.

ClickHouse release 18.1

ClickHouse release 18.1.0, 2018-07-23

New features:

  • Support for the ALTER TABLE t DELETE WHERE query for non-replicated MergeTree tables (#2634).
  • Support for arbitrary types for the uniq* family of aggregate functions (#2010).
  • Support for arbitrary types in comparison operators (#2026).
  • The users.xml file allows setting a subnet mask in the format 10.0.0.1/255.255.255.0. This is necessary for using masks for IPv6 networks with zeros in the middle (#2637).
  • Added the arrayDistinct function (#2670).
  • The SummingMergeTree engine can now work with AggregateFunction type columns (Constantin S. Pan).
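
The address/mask notation above can express masks that prefix-length notation cannot. The Python sketch below shows the underlying bitwise check under one reading of the IPv6 remark, namely a mask whose set bits are not contiguous. It is not ClickHouse's implementation; matches_mask and the sample addresses are hypothetical, and the sketch assumes all three arguments belong to the same IP version.

```python
import ipaddress


def matches_mask(client: str, network: str, mask: str) -> bool:
    """True if `client` and `network` agree on every bit that is set in `mask`."""
    client_bits = int(ipaddress.ip_address(client))
    network_bits = int(ipaddress.ip_address(network))
    mask_bits = int(ipaddress.ip_address(mask))
    return (client_bits & mask_bits) == (network_bits & mask_bits)


# IPv4 with a dotted mask, as in 10.0.0.1/255.255.255.0
print(matches_mask("10.0.0.77", "10.0.0.1", "255.255.255.0"))  # True
print(matches_mask("10.0.1.77", "10.0.0.1", "255.255.255.0"))  # False

# IPv6 mask with zero bits in the middle: only the first 32 and last 16 bits are compared
print(matches_mask("2a02:6b8:1234::3", "2a02:6b8::3", "ffff:ffff::ffff"))  # True
print(matches_mask("2a02:6b8::4", "2a02:6b8::3", "ffff:ffff::ffff"))       # False
```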

Improvements:

  • Changed the numbering scheme for release versions. Now the first part contains the year of release (A.D., Moscow timezone, minus 2000), the second part contains the number for major changes (increases for most releases), and the third part is the patch version. Releases are still backward compatible, unless otherwise stated in the changelog.
  • Faster conversions of floating-point numbers to a string (Amos Bird).
  • If some rows were skipped during an insert due to parsing errors (this is possible with the input_allow_errors_num and input_allow_errors_ratio settings enabled), the number of skipped rows is now written to the server log (Leonardo Cecchi).
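
The numbering scheme described in this list can be decoded mechanically. Below is a small Python sketch, assuming a three-part version string of the form used from 18.1.0 onward; the type and function names are hypothetical, and older 1.1.x version numbers do not follow this scheme.

```python
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    year: int   # calendar year of the release
    major: int  # counter for major changes
    patch: int  # patch version


def parse_release_version(version: str) -> ReleaseVersion:
    """Decode 'YY.major.patch', where YY is the release year minus 2000."""
    year_part, major, patch = (int(part) for part in version.split("."))
    return ReleaseVersion(2000 + year_part, major, patch)


print(parse_release_version("18.10.3"))
# ReleaseVersion(year=2018, major=10, patch=3)
```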

Bug fixes:

  • Fixed the TRUNCATE command for temporary tables (Amos Bird).
  • Fixed a rare deadlock in the ZooKeeper client library that occurred when there was a network error while reading the response (c315200).
  • Fixed an error during a CAST to Nullable types (#1322).
  • Fixed the incorrect result of the maxIntersection() function when the boundaries of intervals coincided (Michael Furmur).
  • Fixed incorrect transformation of the OR expression chain in a function argument (chenxing-xc).
  • Fixed performance degradation for queries containing IN (subquery) expressions inside another subquery (#2571).
  • Fixed incompatibility between servers with different versions in distributed queries that use a CAST function that isn't in uppercase letters (fe8c4d6).
  • Added missing quoting of identifiers for queries to an external DBMS (#2635).

Backward incompatible changes:

  • Converting a string containing the number zero to DateTime does not work. Example: SELECT toDateTime('0'). This is also the reason that DateTime DEFAULT '0' does not work in tables, as well as <null_value>0</null_value> in dictionaries. Solution: replace 0 with 0000-00-00 00:00:00.

ClickHouse release 1.1

ClickHouse release 1.1.54394, 2018-07-12

New features:

  • Added the histogram aggregate function (Mikhail Surin).
  • Now OPTIMIZE TABLE ... FINAL can be used without specifying partitions for ReplicatedMergeTree (Amos Bird).

Bug fixes:

  • Fixed a problem with a very small timeout for sockets (one second) for reading and writing when sending and downloading replicated data, which made it impossible to download larger parts if there is a load on the network or disk (it resulted in cyclical attempts to download parts). This error occurred in version 1.1.54388.
  • Fixed issues when using chroot in ZooKeeper if you inserted duplicate data blocks in the table.
  • The has function now works correctly for an array with Nullable elements (#2115).
  • The system.tables table now works correctly when used in distributed queries. The metadata_modification_time and engine_full columns are now non-virtual. Fixed an error that occurred if only these columns were queried from the table.
  • Fixed how an empty TinyLog table works after inserting an empty data block (#2563).
  • The system.zookeeper table works if the value of the node in ZooKeeper is NULL.

ClickHouse release 1.1.54390, 2018-07-06

New features:

  • Queries can be sent in multipart/form-data format (in the query field), which is useful if external data is also sent for query processing (Olga Hvostikova).
  • Added the ability to enable or disable processing single or double quotes when reading data in CSV format. You can configure this in the format_csv_allow_single_quotes and format_csv_allow_double_quotes settings (Amos Bird).
  • Now OPTIMIZE TABLE ... FINAL can be used without specifying the partition for non-replicated variants of MergeTree (Amos Bird).

Improvements:

  • Improved performance, reduced memory consumption, and correct memory consumption tracking with use of the IN operator when a table index could be used (#2584).
  • Removed redundant checking of checksums when adding a data part. This is important when there are a large number of replicas, because in these cases the total number of checks was equal to N^2.
  • Added support for Array(Tuple(...)) arguments for the arrayEnumerateUniq function (#2573).
  • Added Nullable support for the runningDifference function (#2594).
  • Improved query analysis performance when there is a very large number of expressions (#2572).
  • Faster selection of data parts for merging in ReplicatedMergeTree tables. Faster recovery of the ZooKeeper session (#2597).
  • The format_version.txt file for MergeTree tables is re-created if it is missing, which makes sense if ClickHouse is launched after copying the directory structure without files (Ciprian Hacman).

Bug fixes:

  • Fixed a bug when working with ZooKeeper that could make it impossible to recover the session and readonly states of tables before restarting the server.
  • Fixed a bug when working with ZooKeeper that could result in old nodes not being deleted if the session is interrupted.
  • Fixed an error in the quantileTDigest function for Float arguments (this bug was introduced in version 1.1.54388) (Mikhail Surin).
  • Fixed a bug in the index for MergeTree tables if the primary key column is located inside the function for converting types between signed and unsigned integers of the same size (#2603).
  • Fixed segfault if macros are used but they aren't in the config file (#2570).
  • Fixed switching to the default database when reconnecting the client (#2583).
  • Fixed a bug that occurred when the use_index_for_in_with_subqueries setting was disabled.

Security fix:

  • Sending files is no longer possible when connected to MySQL (LOAD DATA LOCAL INFILE).

ClickHouse release 1.1.54388, 2018-06-28

New features:

  • Support for the ALTER TABLE t DELETE WHERE query for replicated tables. Added the system.mutations table to track progress of this type of queries.
  • Support for the ALTER TABLE t [REPLACE|ATTACH] PARTITION query for *MergeTree tables.
  • Support for the TRUNCATE TABLE query (Winter Zhang)
  • Several new SYSTEM queries for replicated tables (RESTART REPLICAS, SYNC REPLICA, [STOP|START] [MERGES|FETCHES|REPLICATED SENDS|REPLICATION QUEUES]).
  • Added the ability to write to a table with the MySQL engine and the corresponding table function (sundy-li).
  • Added the url() table function and the URL table engine (Alexander Sapin).
  • Added the windowFunnel aggregate function (sundy-li).
  • New startsWith and endsWith functions for strings (Vadim Plakhtinsky).
  • The numbers() table function now allows you to specify the offset (Winter Zhang).
  • The password to clickhouse-client can be entered interactively.
  • Server logs can now be sent to syslog (Alexander Krasheninnikov).
  • Support for logging in dictionaries with a shared library source (Alexander Sapin).
  • Support for custom CSV delimiters (Ivan Zhukov)
  • Added the date_time_input_format setting. If you switch this setting to 'best_effort', DateTime values will be read in a wide range of formats.
  • Added the clickhouse-obfuscator utility for data obfuscation. Usage example: publishing data used in performance tests.

Experimental features:

  • Added the ability to calculate the arguments of the and function only where they are needed (Anastasia Tsarkova)
  • JIT compilation to native code is now available for some expressions (pyos).

Bug fixes:

  • Duplicates no longer appear for a query with DISTINCT and ORDER BY.
  • Queries with ARRAY JOIN and arrayFilter no longer return an incorrect result.
  • Fixed an error when reading an array column from a Nested structure (#2066).
  • Fixed an error when analyzing queries with a HAVING clause like HAVING tuple IN (...).
  • Fixed an error when analyzing queries with recursive aliases.
  • Fixed an error when reading from ReplacingMergeTree with a condition in PREWHERE that filters all rows (#2525).
  • Fixed user profile settings not being applied when using sessions in the HTTP interface.
  • Fixed how settings are applied from the command line parameters in clickhouse-local.
  • The ZooKeeper client library now uses the session timeout received from the server.
  • Fixed a bug in the ZooKeeper client library when the client waited for the server response longer than the timeout.
  • Fixed pruning of parts for queries with conditions on partition key columns (#2342).
  • Merges are now possible after CLEAR COLUMN IN PARTITION (#2315).
  • Type mapping in the ODBC table function has been fixed (sundy-li).
  • Type comparisons have been fixed for DateTime with and without the time zone (Alexander Bocharov).
  • Fixed syntactic parsing and formatting of the CAST operator.
  • Fixed insertion into a materialized view for the Distributed table engine (Babacar Diassé).
  • Fixed a race condition when writing data from the Kafka engine to materialized views (Yangkuan Liu).
  • Fixed SSRF in the remote() table function.
  • Fixed exit behavior of clickhouse-client in multiline mode (#2510).

Improvements:

  • Background tasks in replicated tables are now performed in a thread pool instead of in separate threads (Silviu Caragea).
  • Improved LZ4 compression performance.
  • Faster analysis for queries with a large number of JOINs and sub-queries.
  • The DNS cache is now updated automatically when there are too many network errors.
  • Table inserts no longer occur if the insert into one of the materialized views is not possible because it has too many parts.
  • Corrected the discrepancy in the event counters Query, SelectQuery, and InsertQuery.
  • Expressions like tuple IN (SELECT tuple) are allowed if the tuple types match.
  • A server with replicated tables can start even if you haven't configured ZooKeeper.
  • When calculating the number of available CPU cores, limits on cgroups are now taken into account (Atri Sharma).
  • Added chown for config directories in the systemd config file (Mikhail Shiryaev).

Build changes:

  • The gcc8 compiler can be used for builds.
  • Added the ability to build llvm from submodule.
  • The version of the librdkafka library has been updated to v0.11.4.
  • Added the ability to use the system libcpuid library. The library version has been updated to 0.4.0.
  • Fixed the build using the vectorclass library (Babacar Diassé).
  • Cmake now generates files for ninja by default (like when using -G Ninja).
  • Added the ability to use the libtinfo library instead of libtermcap (Georgy Kondratiev).
  • Fixed a header file conflict in Fedora Rawhide (#2520).

Backward incompatible changes:

  • Removed escaping in Vertical and Pretty* formats and deleted the VerticalRaw format.
  • If servers with version 1.1.54388 (or newer) and servers with an older version are used simultaneously in a distributed query and the query has the cast(x, 'Type') expression without the AS keyword and doesn't have the word cast in uppercase, an exception will be thrown with a message like Not found column cast(0, 'UInt8') in block. Solution: Update the server on the entire cluster.

ClickHouse release 1.1.54385, 2018-06-01

Bug fixes:

  • Fixed an error that in some cases caused ZooKeeper operations to block.

ClickHouse release 1.1.54383, 2018-05-22

Bug fixes:

  • Fixed a slowdown of the replication queue if a table has many replicas.

ClickHouse release 1.1.54381, 2018-05-14

Bug fixes:

  • Fixed a node leak in ZooKeeper when ClickHouse loses its connection to the ZooKeeper server.

ClickHouse release 1.1.54380, 2018-04-21

New features:

  • Added the table function file(path, format, structure). An example that reads bytes from /dev/urandom: first run ln -s /dev/urandom /var/lib/clickhouse/user_files/random, then run clickhouse-client -q "SELECT * FROM file('random', 'RowBinary', 'd UInt8') LIMIT 10".

Improvements:

  • Subqueries can be wrapped in () brackets to enhance query readability. For example: (SELECT 1) UNION ALL (SELECT 1).
  • Simple SELECT queries from the system.processes table are not included in the max_concurrent_queries limit.

Bug fixes:

  • Fixed incorrect behavior of the IN operator when selecting from a MATERIALIZED VIEW.
  • Fixed incorrect filtering by partition index in expressions like partition_key_column IN (...).
  • Fixed the inability to execute an OPTIMIZE query on a non-leader replica if RENAME was performed on the table.
  • Fixed the authorization error when executing OPTIMIZE or ALTER queries on a non-leader replica.
  • Fixed freezing of KILL QUERY.
  • Fixed an error in ZooKeeper client library which led to loss of watches, freezing of distributed DDL queue, and slowdowns in the replication queue if a non-empty chroot prefix is used in the ZooKeeper configuration.

Backward incompatible changes:

  • Removed support for expressions like (a, b) IN (SELECT (a, b)) (you can use the equivalent expression (a, b) IN (SELECT a, b)). In previous releases, these expressions led to undetermined WHERE filtering or caused errors.

ClickHouse release 1.1.54378, 2018-04-16

New features:

  • Logging level can be changed without restarting the server.
  • Added the SHOW CREATE DATABASE query.
  • The query_id can be passed to clickhouse-client (elBroom).
  • New setting: max_network_bandwidth_for_all_users.
  • Added support for ALTER TABLE ... PARTITION ... for MATERIALIZED VIEW.
  • Added information about the size of data parts in uncompressed form in the system table.
  • Server-to-server encryption support for distributed tables (<secure>1</secure> in the replica config in <remote_servers>).
  • Configuration of the table level for the ReplicatedMergeTree family in order to minimize the amount of data stored in ZooKeeper: use_minimalistic_checksums_in_zookeeper = 1
  • Configuration of the clickhouse-client prompt. By default, server names are now output to the prompt. The server's display name can be changed. It's also sent in the X-ClickHouse-Display-Name HTTP header (Kirill Shvakov).
  • Multiple comma-separated topics can be specified for the Kafka engine (Tobias Adamson)
  • When a query is stopped by KILL QUERY or replace_running_query, the client receives the Query was canceled exception instead of an incomplete result.

Improvements:

  • ALTER TABLE ... DROP/DETACH PARTITION queries are run at the front of the replication queue.
  • SELECT ... FINAL and OPTIMIZE ... FINAL can be used even when the table has a single data part.
  • A query_log table is recreated on the fly if it was deleted manually (Kirill Shvakov).
  • The lengthUTF8 function runs faster (zhang2014).
  • Improved performance of synchronous inserts in Distributed tables (insert_distributed_sync = 1) when there is a very large number of shards.
  • The server accepts the send_timeout and receive_timeout settings from the client and applies them when connecting to the client (they are applied in reverse order: the server socket's send_timeout is set to the receive_timeout value received from the client, and vice versa).
  • More robust crash recovery for asynchronous insertion into Distributed tables.
  • The return type of the countEqual function changed from UInt32 to UInt64 (谢磊).
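
The "reverse order" rule for send_timeout and receive_timeout in this list is simply a swap: what the client waits to receive bounds what the server may take to send, and vice versa. A minimal Python sketch of that mapping follows; the Timeouts type and the function name are hypothetical and do not correspond to ClickHouse internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    send_timeout: float     # seconds allowed for sending
    receive_timeout: float  # seconds allowed for receiving


def server_socket_timeouts(client: Timeouts) -> Timeouts:
    """Mirror the client's settings onto the server side of the same connection."""
    return Timeouts(
        send_timeout=client.receive_timeout,     # server sends what the client receives
        receive_timeout=client.send_timeout,     # server receives what the client sends
    )


print(server_socket_timeouts(Timeouts(send_timeout=300, receive_timeout=30)))
# Timeouts(send_timeout=30, receive_timeout=300)
```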

Bug fixes:

  • Fixed an error with IN when the left side of the expression is Nullable.
  • Correct results are now returned when using tuples with IN when some of the tuple components are in the table index.
  • The max_execution_time limit now works correctly with distributed queries.
  • Fixed errors when calculating the size of composite columns in the system.columns table.
  • Fixed an error when creating a temporary table with CREATE TEMPORARY TABLE IF NOT EXISTS.
  • Fixed errors in StorageKafka (#2075)
  • Fixed server crashes from invalid arguments of certain aggregate functions.
  • Fixed the error that prevented the DETACH DATABASE query from stopping background tasks for ReplicatedMergeTree tables.
  • Too many parts state is less likely to happen when inserting into aggregated materialized views (#2084).
  • Corrected recursive handling of substitutions in the config if a substitution must be followed by another substitution on the same level.
  • Corrected the syntax in the metadata file when creating a VIEW that uses a query with UNION ALL.
  • SummingMergeTree now works correctly for summation of nested data structures with a composite key.
  • Fixed the possibility of a race condition when choosing the leader for ReplicatedMergeTree tables.

Build changes:

  • The build supports ninja instead of make and uses ninja by default for building releases.
  • Renamed packages: clickhouse-server-base to clickhouse-common-static; clickhouse-server-common to clickhouse-server; clickhouse-common-dbg to clickhouse-common-static-dbg. To install, use clickhouse-server clickhouse-client. Packages with the old names will still load in the repositories for backward compatibility.

Backward incompatible changes:

  • Removed the special interpretation of an IN expression if an array is specified on the left side. Previously, the expression arr IN (set) was interpreted as "at least one arr element belongs to the set". To get the same behavior in the new version, write arrayExists(x -> x IN (set), arr).
  • Disabled the incorrect use of the socket option SO_REUSEPORT, which was incorrectly enabled by default in the Poco library. Note that on Linux there is no longer any reason to simultaneously specify the addresses :: and 0.0.0.0 for listen. Use just ::, which allows listening for connections over both IPv4 and IPv6 (with the default kernel config settings). You can also revert to the behavior from previous versions by specifying <listen_reuse_port>1</listen_reuse_port> in the config.
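
The old meaning of arr IN (set), "at least one arr element belongs to the set", and its replacement arrayExists(x -> x IN (set), arr) can be modeled directly. The following Python sketch mirrors that rewrite; array_exists is a hypothetical stand-in for the arrayExists function, not ClickHouse code.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def array_exists(predicate: Callable[[T], bool], arr: Iterable[T]) -> bool:
    """Model of arrayExists(lambda, arr): true if the lambda holds for any element."""
    return any(predicate(x) for x in arr)


allowed = {1, 2, 3}
print(array_exists(lambda x: x in allowed, [5, 3, 9]))  # True
print(array_exists(lambda x: x in allowed, [5, 9]))     # False
```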

ClickHouse release 1.1.54370, 2018-03-16

New features:

  • Added the system.macros table and auto updating of macros when the config file is changed.
  • Added the SYSTEM RELOAD CONFIG query.
  • Added the maxIntersections(left_col, right_col) aggregate function, which returns the maximum number of simultaneously intersecting intervals [left; right]. The maxIntersectionsPosition(left, right) function returns the beginning of the "maximum" interval. (Michael Furmur).
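
A common way to compute the maximum number of simultaneously intersecting intervals is a sweep over sorted endpoints. The Python sketch below shows that technique, not ClickHouse's actual implementation. It assumes closed intervals [left; right], so intervals that merely touch at a boundary count as intersecting; whether ClickHouse treats coinciding boundaries the same way is not stated here. All names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def max_intersections(
    intervals: Iterable[Tuple[float, float]],
) -> Tuple[int, Optional[float]]:
    """Return (maximum overlap, coordinate where that maximum is first reached)."""
    events: List[Tuple[float, int]] = []
    for left, right in intervals:
        events.append((left, 0))   # 0 = start; sorts before an end at the same coordinate
        events.append((right, 1))  # 1 = end
    events.sort()

    best, current = 0, 0
    position: Optional[float] = None
    for coordinate, kind in events:
        if kind == 0:
            current += 1
            if current > best:
                best, position = current, coordinate
        else:
            current -= 1
    return best, position


print(max_intersections([(1, 5), (2, 6), (4, 8), (10, 12)]))  # (3, 4)
```

The first element models maxIntersections and the second models maxIntersectionsPosition under the stated assumption.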

Improvements:

  • When inserting data in a Replicated table, fewer requests are made to ZooKeeper (and most of the user-level errors have disappeared from the ZooKeeper log).
  • Added the ability to create aliases for data sets. Example: WITH (1, 2, 3) AS set SELECT number IN set FROM system.numbers LIMIT 10.

Bug fixes:

  • Fixed the Illegal PREWHERE error when reading from Merge tables for Distributed tables.
  • Added fixes that allow you to start clickhouse-server in IPv4-only Docker containers.
  • Fixed a race condition when reading from the system.parts_columns table.
  • Removed double buffering during a synchronous insert to a Distributed table, which could have caused the connection to time out.
  • Fixed a bug that caused excessively long waits for an unavailable replica before beginning a SELECT query.
  • Fixed incorrect dates in the system.parts table.
  • Fixed a bug that made it impossible to insert data in a Replicated table if chroot was non-empty in the configuration of the ZooKeeper cluster.
  • Fixed the vertical merging algorithm for an empty ORDER BY table.
  • Restored the ability to use dictionaries in queries to remote tables, even if these dictionaries are not present on the requestor server. This functionality was lost in release 1.1.54362.
  • Restored the behavior for queries like SELECT * FROM remote('server2', default.table) WHERE col IN (SELECT col2 FROM default.table) when the right side of the IN should use a remote default.table instead of a local one. This behavior was broken in version 1.1.54358.
  • Removed extraneous error-level logging of Not found column ... in block.

ClickHouse release 1.1.54362, 2018-03-11

New features:

  • Aggregation without GROUP BY for an empty set (such as SELECT count(*) FROM table WHERE 0) now returns a result with one row with null values for aggregate functions, in compliance with the SQL standard. To restore the old behavior (return an empty result), set empty_result_for_aggregation_by_empty_set to 1.
  • Added type conversion for UNION ALL. Different alias names are allowed in SELECT positions in UNION ALL, in compliance with the SQL standard.
  • Arbitrary expressions are supported in LIMIT BY clauses. Previously, it was only possible to use columns resulting from SELECT.
  • An index of MergeTree tables is used when IN is applied to a tuple of expressions from the columns of the primary key. Example: WHERE (UserID, EventDate) IN ((123, '2000-01-01'), ...) (Anastasiya Tsarkova).
  • Added the clickhouse-copier tool for copying between clusters and resharding data (beta).
  • Added consistent hashing functions: yandexConsistentHash, jumpConsistentHash, sumburConsistentHash. They can be used as a sharding key in order to reduce the amount of network traffic during subsequent reshardings.
  • Added functions: arrayAny, arrayAll, hasAny, hasAll, arrayIntersect, arrayResize.
  • Added the arrayCumSum function (Javi Santana).
  • Added the parseDateTimeBestEffort, parseDateTimeBestEffortOrZero, and parseDateTimeBestEffortOrNull functions to read the DateTime from a string containing text in a wide variety of possible formats.
  • Data can be partially reloaded from external dictionaries during updating (load just the records in which the value of the specified field is greater than in the previous download) (Arsen Hakobyan).
  • Added the cluster table function. Example: cluster(cluster_name, db, table). The remote table function can accept the cluster name as the first argument, if it is specified as an identifier.
  • The remote and cluster table functions can be used in INSERT queries.
  • Added the create_table_query and engine_full virtual columns to the system.tables table. The metadata_modification_time column is virtual.
  • Added the data_path and metadata_path columns to the system.tables and system.databases tables, and added the path column to the system.parts and system.parts_columns tables.
  • Added additional information about merges in the system.part_log table.
  • An arbitrary partitioning key can be used for the system.query_log table (Kirill Shvakov).
  • The SHOW TABLES query now also shows temporary tables. Added temporary tables and the is_temporary column to system.tables (zhang2014).
  • Added DROP TEMPORARY TABLE and EXISTS TEMPORARY TABLE queries (zhang2014).
  • Support for SHOW CREATE TABLE for temporary tables (zhang2014).
  • Added the system_profile configuration parameter for the settings used by internal processes.
  • Support for loading object_id as an attribute in MongoDB dictionaries (Pavel Litvinenko).
  • Reading null as the default value when loading data for an external dictionary with the MongoDB source (Pavel Litvinenko).
  • Reading DateTime values in the Values format from a Unix timestamp without single quotes.
  • Failover is supported in remote table functions for cases when some of the replicas are missing the requested table.
  • Configuration settings can be overridden in the command line when you run clickhouse-server. Example: clickhouse-server -- --logger.level=information.
  • Implemented the empty function for a FixedString argument: the function returns 1 if the string consists entirely of null bytes (zhang2014).
  • Added the listen_try configuration parameter for listening to at least one of the listen addresses without quitting, if some of the addresses can't be listened to (useful for systems with disabled support for IPv4 or IPv6).
  • Added the VersionedCollapsingMergeTree table engine.
  • Support for rows and arbitrary numeric types for the library dictionary source.
  • MergeTree tables can be used without a primary key (you need to specify ORDER BY tuple()).
  • A Nullable type can be CAST to a non-Nullable type if the argument is not NULL.
  • RENAME TABLE can be performed for VIEW.
  • Added the throwIf function.
  • Added the odbc_default_field_size option, which allows you to extend the maximum size of the value loaded from an ODBC source (by default, it is 1024).
  • The system.processes table and SHOW PROCESSLIST now have the is_cancelled and peak_memory_usage columns.
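
To show why a consistent hashing function reduces data movement during resharding, here is a Python sketch of the published jump consistent hash algorithm by Lamping and Veach. It is an illustration only: whether jumpConsistentHash in ClickHouse matches it bit for bit is not checked here, and yandexConsistentHash and sumburConsistentHash use other algorithms that are not shown.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key, num_buckets):
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, truncated the way uint64 would be.
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


# Growing from 10 to 11 shards: a key either stays put or moves to the new shard.
keys = range(10000)
moved = sum(
    1 for k in keys if jump_consistent_hash(k, 10) != jump_consistent_hash(k, 11)
)
print(moved, "of", len(keys), "keys change shard")
```

With a plain modulo sharding key, most keys would change shard when the shard count changes. With this scheme only the keys assigned to the new shard move.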

Improvements:

  • Limits and quotas on the result are no longer applied to intermediate data for INSERT SELECT queries or for SELECT subqueries.
  • Fewer false triggers of force_restore_data when checking the status of Replicated tables when the server starts.
  • Added the allow_distributed_ddl option.
  • Nondeterministic functions are not allowed in expressions for MergeTree table keys.
  • Files with substitutions from config.d directories are loaded in alphabetical order.
  • Improved performance of the arrayElement function in the case of a constant multidimensional array with an empty array as one of the elements. Example: [[1], []][x].
  • The server starts faster now when using configuration files with very large substitutions (for instance, very large lists of IP networks).
  • When running a query, table-valued functions run once. Previously, the remote and mysql table-valued functions performed the same query twice to retrieve the table structure from a remote server.
  • The MkDocs documentation generator is used.
  • When you try to delete a table column that DEFAULT/MATERIALIZED expressions of other columns depend on, an exception is thrown (zhang2014).
  • Added the ability to parse an empty line in text formats as the number 0 for Float data types. This feature was previously available but was lost in release 1.1.54342.
  • Enum values can be used in min, max, sum and some other functions. In these cases, it uses the corresponding numeric values. This feature was previously available but was lost in the release 1.1.54337.
  • Added max_expanded_ast_elements to restrict the size of the AST after recursively expanding aliases.

Bug fixes:

  • Fixed cases when unnecessary columns were removed from subqueries in error, or not removed from subqueries containing UNION ALL.
  • Fixed a bug in merges for ReplacingMergeTree tables.
  • Fixed synchronous insertions in Distributed tables (insert_distributed_sync = 1).
  • Fixed segfault for certain uses of FULL and RIGHT JOIN with duplicate columns in subqueries.
  • Fixed segfault for certain uses of replace_running_query and KILL QUERY.
  • Fixed the order of the source and last_exception columns in the system.dictionaries table.
  • Fixed a bug when the DROP DATABASE query did not delete the file with metadata.
  • Fixed the DROP DATABASE query for Dictionary databases.
  • Fixed the low precision of uniqHLL12 and uniqCombined functions for cardinalities greater than 100 million items (Alex Bocharov).
  • Fixed the calculation of implicit default values in INSERT queries when explicit default expressions must be calculated at the same time (zhang2014).
  • Fixed a rare case when a query to a MergeTree table couldn't finish (chenxing-xc).
  • Fixed a crash that occurred when running a CHECK query for Distributed tables if all shards are local (chenxing.xc).
  • Fixed a slight performance regression with functions that use regular expressions.
  • Fixed a performance regression when creating multidimensional arrays from complex expressions.
  • Fixed a bug that could cause an extra FORMAT section to appear in an .sql file with metadata.
  • Fixed a bug that caused the max_table_size_to_drop limit to apply when trying to delete a MATERIALIZED VIEW looking at an explicitly specified table.
  • Fixed incompatibility with old clients (old clients were sometimes sent data with the DateTime('timezone') type, which they do not understand).
  • Fixed a bug when reading Nested column elements of structures that were added using ALTER but that are empty for the old partitions, when the conditions for these columns moved to PREWHERE.
  • Fixed a bug when filtering tables by virtual _table columns in queries to Merge tables.
  • Fixed a bug when using ALIAS columns in Distributed tables.
  • Fixed a bug that made dynamic compilation impossible for queries with aggregate functions from the quantile family.
  • Fixed a race condition in the query execution pipeline that occurred in very rare cases when using Merge tables with a large number of tables, and when using GLOBAL subqueries.
  • Fixed a crash when passing arrays of different sizes to an arrayReduce function when using aggregate functions from multiple arguments.
  • Prohibited the use of queries with UNION ALL in a MATERIALIZED VIEW.
  • Fixed an error during initialization of the part_log system table when the server starts (by default, part_log is disabled).

Backward incompatible changes:

  • Removed the distributed_ddl_allow_replicated_alter option. This behavior is enabled by default.
  • Removed the strict_insert_defaults setting. If you were using this functionality, write to clickhouse-feedback@yandex-team.com.
  • Removed the UnsortedMergeTree engine.

ClickHouse release 1.1.54343, 2018-02-05

  • Added macros support for defining cluster names in distributed DDL queries and constructors of Distributed tables: CREATE TABLE distr ON CLUSTER '{cluster}' (...) ENGINE = Distributed('{cluster}', 'db', 'table').
  • Now queries like SELECT ... FROM table WHERE expr IN (subquery) are processed using the table index.
  • Improved processing of duplicates when inserting to Replicated tables, so they no longer slow down execution of the replication queue.

ClickHouse release 1.1.54342, 2018-01-22

This release contains bug fixes for the previous release 1.1.54337:

  • Fixed a regression in 1.1.54337: if the default user has readonly access, then the server refuses to start up with the message Cannot create database in readonly mode.
  • Fixed a regression in 1.1.54337: on systems with systemd, logs are always written to syslog regardless of the configuration; the watchdog script still uses init.d.
  • Fixed a regression in 1.1.54337: wrong default configuration in the Docker image.
  • Fixed nondeterministic behavior of GraphiteMergeTree (you can see it in log messages Data after merge is not byte-identical to the data on another replicas).
  • Fixed a bug that may lead to inconsistent merges after OPTIMIZE query to Replicated tables (you may see it in log messages Part ... intersects the previous part).
  • Buffer tables now work correctly when MATERIALIZED columns are present in the destination table (by zhang2014).
  • Fixed a bug in implementation of NULL.

ClickHouse release 1.1.54337, 2018-01-18

New features:

  • Added support for storage of multi-dimensional arrays and tuples (Tuple data type) in tables.
  • Support for table functions for DESCRIBE and INSERT queries. Added support for subqueries in DESCRIBE. Examples: DESC TABLE remote('host', default.hits); DESC TABLE (SELECT 1); INSERT INTO TABLE FUNCTION remote('host', default.hits). Support for INSERT INTO TABLE in addition to INSERT INTO.
  • Improved support for time zones. The DateTime data type can be annotated with the timezone that is used for parsing and formatting in text formats. Example: DateTime('Europe/Moscow'). When timezones are specified in functions for DateTime arguments, the return type will track the timezone, and the value will be displayed as expected.
  • Added the functions toTimeZone, timeDiff, toQuarter, toRelativeQuarterNum. The toRelativeHour/Minute/Second functions can take a value of type Date as an argument. The now function name is case-sensitive.
  • Added the toStartOfFifteenMinutes function (Kirill Shvakov).
  • Added the clickhouse format tool for formatting queries.
  • Added the format_schema_path configuration parameter (Marek Vavruša). It is used for specifying a schema in Cap'n Proto format. Schema files can be located only in the specified directory.
  • Added support for config substitutions (incl and conf.d) for configuration of external dictionaries and models (Pavel Yakunin).
  • Added a column with documentation for the system.settings table (Kirill Shvakov).
  • Added the system.parts_columns table with information about column sizes in each data part of MergeTree tables.
  • Added the system.models table with information about loaded CatBoost machine learning models.
  • Added the mysql and odbc table functions and the corresponding MySQL and ODBC table engines for accessing remote databases. This functionality is in the beta stage.
  • Added the possibility to pass an argument of type AggregateFunction for the groupArray aggregate function (so you can create an array of states of some aggregate function).
  • Removed restrictions on various combinations of aggregate function combinators. For example, you can use avgForEachIf as well as avgIfForEach aggregate functions, which have different behaviors.
  • The -ForEach aggregate function combinator is extended for the case of aggregate functions of multiple arguments.
  • Added support for aggregate functions of Nullable arguments even for cases when the function returns a non-Nullable result (added with the contribution of Silviu Caragea). Example: groupArray, groupUniqArray, topK.
  • Added the max_client_network_bandwidth setting for clickhouse-client (Kirill Shvakov).
  • Users with the readonly = 2 setting are allowed to work with TEMPORARY tables (CREATE, DROP, INSERT...) (Kirill Shvakov).
  • Added support for using multiple consumers with the Kafka engine. Extended configuration options for Kafka (Marek Vavruša).
  • Added the intExp3 and intExp4 functions.
  • Added the sumKahan aggregate function.
  • Added the toNumberOrNull functions, where Number is the name of a numeric type (for example, toInt32OrNull).
  • Added support for WITH clauses for an INSERT SELECT query (author: zhang2014).
  • Added settings: http_connection_timeout, http_send_timeout, http_receive_timeout. In particular, these settings are used for downloading data parts for replication. Changing these settings allows for faster failover if the network is overloaded.
  • Added support for ALTER for tables of type Null (Anastasiya Tsarkova).
  • The reinterpretAsString function is extended for all data types that are stored contiguously in memory.
  • Added the --silent option for the clickhouse-local tool. It suppresses printing query execution info in stderr.
  • Added support for reading values of type Date from text in a format where the month and/or day of the month is specified using a single digit instead of two digits (Amos Bird).
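
The sumKahan aggregate function listed above is named after Kahan (compensated) summation. The following Python sketch shows that technique next to a plain running sum. It illustrates the general algorithm and makes no claim about how ClickHouse implements it; the function names are hypothetical.

```python
def naive_sum(values):
    """Plain left-to-right floating point sum."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan summation: carry the rounding error of each addition forward."""
    total = 0.0
    compensation = 0.0  # low-order bits lost by earlier additions
    for x in values:
        y = x - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


values = [0.1] * 10**6
print(naive_sum(values))  # drifts slightly away from 100000.0
print(kahan_sum(values))  # stays very close to 100000.0
```

The compensated version costs a few extra floating point operations per value, which is the usual price for the reduced rounding error.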

Performance optimizations:

  • Improved performance of aggregate functions min, max, any, anyLast, anyHeavy, argMin, argMax from string arguments.
  • Improved performance of the functions isInfinite, isFinite, isNaN, roundToExp2.
  • Improved performance of parsing and formatting Date and DateTime type values in text format.
  • Improved performance and precision of parsing floating point numbers.
  • Lowered memory usage for JOIN in the case when the left and right parts have columns with identical names that are not contained in USING.
  • Improved performance of aggregate functions varSamp, varPop, stddevSamp, stddevPop, covarSamp, covarPop, corr at the cost of reduced numerical stability. The old, more stable implementations are available under the names varSampStable, varPopStable, stddevSampStable, stddevPopStable, covarSampStable, covarPopStable, corrStable.
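
The trade-off between speed and stability in variance calculations can be seen in a small Python sketch. The changelog does not say which formulas ClickHouse uses, so both functions below are assumptions chosen for illustration: a fast version based on running sums of x and x*x, and Welford's algorithm as a typical stable alternative.

```python
def var_pop_moments(values):
    """Population variance from running sums of x and x*x (fast, less stable)."""
    n = 0
    s = 0.0
    s2 = 0.0
    for x in values:
        n += 1
        s += x
        s2 += x * x
    mean = s / n
    return s2 / n - mean * mean


def var_pop_welford(values):
    """Population variance with Welford's update (more work per row, stable)."""
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / n


small = [4.0, 7.0, 13.0, 16.0]
shifted = [1e9 + x for x in small]  # same spread, large offset

print(var_pop_moments(small), var_pop_welford(small))  # 22.5 22.5
print(var_pop_welford(shifted))                        # 22.5
print(var_pop_moments(shifted))                        # far from 22.5
```

Values with a large common offset and a small spread are where the sum-of-squares form loses precision, because the two terms being subtracted are nearly equal.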

Bug fixes:

  • Fixed data deduplication after running a DROP or DETACH PARTITION query. In the previous version, dropping a partition and inserting the same data again was not working because inserted blocks were considered duplicates.
  • Fixed a bug that could lead to incorrect interpretation of the WHERE clause for CREATE MATERIALIZED VIEW queries with POPULATE.
  • Fixed a bug in using the root_path parameter in the zookeeper_servers configuration.
  • Fixed unexpected results of passing the Date argument to toStartOfDay.
  • Fixed the addMonths and subtractMonths functions and the arithmetic for INTERVAL n MONTH in cases when the result falls in the previous year.
  • Added missing support for the UUID data type for DISTINCT, JOIN, and uniq aggregate functions and external dictionaries (Evgeniy Ivanov). Support for UUID is still incomplete.
  • Fixed SummingMergeTree behavior in cases when the rows summed to zero.
  • Various fixes for the Kafka engine (Marek Vavruša).
  • Fixed incorrect behavior of the Join table engine (Amos Bird).
  • Fixed incorrect allocator behavior under FreeBSD and OS X.
  • The extractAll function now supports empty matches.
  • Fixed an error that blocked usage of libressl instead of openssl.
  • Fixed the CREATE TABLE AS SELECT query from temporary tables.
  • Fixed non-atomicity of updating the replication queue. This could lead to replicas being out of sync until the server restarts.
  • Fixed possible overflow in gcd, lcm and modulo (% operator) (Maks Skorokhod).
  • -preprocessed files are now created after changing umask (umask can be changed in the config).
  • Fixed a bug in the background check of parts (MergeTreePartChecker) when using a custom partition key.
  • Fixed parsing of tuples (values of the Tuple data type) in text formats.
  • Improved error messages about incompatible types passed to multiIf, array and some other functions.
  • Redesigned support for Nullable types. Fixed bugs that may lead to a server crash. Fixed almost all other bugs related to NULL support: incorrect type conversions in INSERT SELECT, insufficient support for Nullable in HAVING and PREWHERE, join_use_nulls mode, Nullable types as arguments of OR operator, etc.
  • Fixed various bugs related to internal semantics of data types. Examples: unnecessary summing of Enum type fields in SummingMergeTree; alignment of Enum types in Pretty formats, etc.
  • Stricter checks for allowed combinations of composite columns.
  • Fixed the overflow when specifying a very large parameter for the FixedString data type.
  • Fixed a bug in the topK aggregate function in a generic case.
  • Added the missing check for equality of array sizes in arguments of n-ary variants of aggregate functions with an -Array combinator.
  • Fixed a bug in --pager for clickhouse-client (author: ks1322).
  • Fixed the precision of the exp10 function.
  • Fixed the behavior of the visitParamExtract function for better compliance with documentation.
  • Fixed the crash when incorrect data types are specified.
  • Fixed the behavior of DISTINCT in the case when all columns are constants.
  • Fixed query formatting in the case of using the tupleElement function with a complex constant expression as the tuple element index.
  • Fixed a bug in Dictionary tables for range_hashed dictionaries.
  • Fixed a bug that leads to excessive rows in the result of FULL and RIGHT JOIN (Amos Bird).
  • Fixed a server crash when creating and removing temporary files in config.d directories during config reload.
  • Fixed the SYSTEM DROP DNS CACHE query: the cache was flushed but addresses of cluster nodes were not updated.
  • Fixed the behavior of MATERIALIZED VIEW after executing DETACH TABLE for the table under the view (Marek Vavruša).

Build improvements:

  • The pbuilder tool is used for builds. The build process is almost completely independent of the build host environment.
  • A single build is used for different OS versions. Packages and binaries have been made compatible with a wide range of Linux systems.
  • Added the clickhouse-test package. It can be used to run functional tests.
  • The source tarball can now be published to the repository. It can be used to reproduce the build without using GitHub.
  • Added limited integration with Travis CI. Due to limits on build time in Travis, only the debug build is tested and a limited subset of tests are run.
  • Added support for Cap'n Proto in the default build.
  • Changed the format of documentation sources from reStructuredText to Markdown.
  • Added support for systemd (Vladimir Smirnov). It is disabled by default due to incompatibility with some OS images and can be enabled manually.
  • For dynamic code generation, clang and lld are embedded into the clickhouse binary. They can also be invoked as clickhouse clang and clickhouse lld.
  • Removed usage of GNU extensions from the code. Enabled the -Wextra option. When building with clang the default is libc++ instead of libstdc++.
  • Extracted clickhouse_parsers and clickhouse_common_io libraries to speed up builds of various tools.

Backward incompatible changes:

  • The format for marks in Log type tables that contain Nullable columns was changed in a backward incompatible way. If you have these tables, you should convert them to the TinyLog type before starting up the new server version. To do this, replace ENGINE = Log with ENGINE = TinyLog in the corresponding .sql file in the metadata directory. If your table doesn't have Nullable columns or if the type of your table is not Log, then you don't need to do anything.
  • Removed the experimental_allow_extended_storage_definition_syntax setting. Now this feature is enabled by default.
  • The runningIncome function was renamed to runningDifferenceStartingWithFirstValue to avoid confusion.
  • Removed the FROM ARRAY JOIN arr syntax when ARRAY JOIN is specified directly after FROM with no table (Amos Bird).
  • Removed the BlockTabSeparated format that was used solely for demonstration purposes.
  • Changed the state format for aggregate functions varSamp, varPop, stddevSamp, stddevPop, covarSamp, covarPop, corr. If you have stored states of these aggregate functions in tables (using the AggregateFunction data type or materialized views with corresponding states), please write to clickhouse-feedback@yandex-team.com.
  • In previous server versions there was an undocumented feature: if an aggregate function depends on parameters, you can still specify it without parameters in the AggregateFunction data type. Example: AggregateFunction(quantiles, UInt64) instead of AggregateFunction(quantiles(0.5, 0.9), UInt64). This feature was lost. Although it was undocumented, we plan to support it again in future releases.
  • Enum data types cannot be used in min/max aggregate functions. This ability will be restored in the next release.

Please note when upgrading:

  • When doing a rolling update on a cluster, at the point when some of the replicas are running the old version of ClickHouse and some are running the new version, replication is temporarily stopped and the message unknown parameter 'shard' appears in the log. Replication will continue after all replicas of the cluster are updated.
  • If different versions of ClickHouse are running on the cluster servers, it is possible that distributed queries using the following functions will have incorrect results: varSamp, varPop, stddevSamp, stddevPop, covarSamp, covarPop, corr. You should update all cluster nodes.

ClickHouse release 1.1.54327, 2017-12-21

This release contains bug fixes for the previous release 1.1.54318:

  • Fixed a bug with a possible race condition in replication that could lead to data loss. This issue affects versions 1.1.54310 and 1.1.54318. If you use one of these versions with Replicated tables, the update is strongly recommended. This issue shows in logs in Warning messages like Part ... from own log doesn't exist. The issue is relevant even if you don't see these messages in logs.

ClickHouse release 1.1.54318, 2017-11-30

This release contains bug fixes for the previous release 1.1.54310:

  • Fixed incorrect row deletions during merges in the SummingMergeTree engine.
  • Fixed a memory leak in unreplicated MergeTree engines.
  • Fixed performance degradation with frequent inserts in MergeTree engines.
  • Fixed an issue that was causing the replication queue to stop running.
  • Fixed rotation and archiving of server logs.

ClickHouse release 1.1.54310, 2017-11-01

New features:

  • Custom partitioning key for the MergeTree family of table engines.
  • Kafka table engine.
  • Added support for loading CatBoost models and applying them to data stored in ClickHouse.
  • Added support for time zones with non-integer offsets from UTC.
  • Added support for arithmetic operations with time intervals.
  • The range of values for the Date and DateTime types is extended to the year 2105.
  • Added the CREATE MATERIALIZED VIEW x TO y query (specifies an existing table for storing the data of a materialized view).
  • Added the ATTACH TABLE query without arguments.
  • The processing logic for Nested columns with names ending in -Map in a SummingMergeTree table was extracted to the sumMap aggregate function. You can now specify such columns explicitly.
  • Max size of the IP trie dictionary is increased to 128M entries.
  • Added the getSizeOfEnumType function.
  • Added the sumWithOverflow aggregate function.
  • Added support for the Cap'n Proto input format.
  • You can now customize compression level when using the zstd algorithm.

Backward incompatible changes:

  • Creation of temporary tables with an engine other than Memory is not allowed.
  • Explicit creation of tables with the View or MaterializedView engine is not allowed.
  • During table creation, a new check verifies that the sampling key expression is included in the primary key.

Bug fixes:

  • Fixed hangups when synchronously inserting into a Distributed table.
  • Fixed nonatomic adding and removing of parts in Replicated tables.
  • Data inserted into a materialized view is not subjected to unnecessary deduplication.
  • Executing a query to a Distributed table for which the local replica is lagging and remote replicas are unavailable does not result in an error anymore.
  • Users don't need access permissions to the default database to create temporary tables anymore.
  • Fixed crashing when specifying the Array type without arguments.
  • Fixed hangups when the disk volume containing server logs is full.
  • Fixed an overflow in the toRelativeWeekNum function for the first week of the Unix epoch.

Build improvements:

  • Several third-party libraries (notably Poco) were updated and converted to git submodules.

ClickHouse release 1.1.54304, 2017-10-19

New features:

  • TLS support in the native protocol (to enable, set tcp_ssl_port in config.xml).

Bug fixes:

  • ALTER for replicated tables now tries to start running as soon as possible.
  • Fixed crashing when reading data with the setting preferred_block_size_bytes=0.
  • Fixed crashes of clickhouse-client when pressing Page Down.
  • Correct interpretation of certain complex queries with GLOBAL IN and UNION ALL.
  • FREEZE PARTITION always works atomically now.
  • Empty POST requests now return a response with code 411.
  • Fixed interpretation errors for expressions like CAST(1 AS Nullable(UInt8)).
  • Fixed an error when reading Array(Nullable(String)) columns from MergeTree tables.
  • Fixed crashing when parsing queries like SELECT dummy AS dummy, dummy AS b
  • Users are updated correctly with invalid users.xml
  • Correct handling when an executable dictionary returns a non-zero response code.

ClickHouse release 1.1.54292, 2017-09-20

New features:

  • Added the pointInPolygon function for working with coordinates on a coordinate plane.
  • Added the sumMap aggregate function for calculating the sum of arrays, similar to SummingMergeTree.
  • Added the trunc function. Improved performance of the rounding functions (round, floor, ceil, roundToExp2) and corrected the logic of how they work. Changed the logic of the roundToExp2 function for fractions and negative numbers.
  • The ClickHouse executable file is now less dependent on the libc version. The same ClickHouse executable file can run on a wide variety of Linux systems. There is still a dependency when using compiled queries (with the setting compile = 1, which is not used by default).
  • Reduced the time needed for dynamic compilation of queries.
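
The per-key summation that sumMap performs can be sketched in Python as follows. This models the idea only: each row carries an array of keys and a parallel array of values, and values are summed per key across rows. The function name sum_map, the sorted key order in the result, and the error raised for mismatched lengths are assumptions of this sketch.

```python
from collections import defaultdict


def sum_map(rows):
    """Sum values per key over rows of (keys, values) array pairs."""
    totals = defaultdict(int)
    for keys, values in rows:
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
        for k, v in zip(keys, values):
            totals[k] += v
    ordered = sorted(totals)
    return ordered, [totals[k] for k in ordered]


rows = [
    ([1, 2, 3], [10, 10, 10]),
    ([3, 4, 5], [10, 10, 10]),
    ([4, 5, 6], [10, 10, 10]),
]
print(sum_map(rows))  # ([1, 2, 3, 4, 5, 6], [10, 10, 20, 20, 20, 10])
```

Keys 3, 4 and 5 each appear in two rows, so their totals are 20, while the remaining keys keep the single value 10.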

Bug fixes:

  • Fixed an error that sometimes produced part ... intersects previous part messages and weakened replica consistency.
  • Fixed an error that caused the server to lock up if ZooKeeper was unavailable during shutdown.
  • Removed excessive logging when restoring replicas.
  • Fixed an error in the UNION ALL implementation.
  • Fixed an error in the concat function that occurred if the first column in a block has the Array type.
  • Progress is now displayed correctly in the system.merges table.

ClickHouse release 1.1.54289, 2017-09-13

New features:

  • SYSTEM queries for server administration: SYSTEM RELOAD DICTIONARY, SYSTEM RELOAD DICTIONARIES, SYSTEM DROP DNS CACHE, SYSTEM SHUTDOWN, SYSTEM KILL.
  • Added functions for working with arrays: concat, arraySlice, arrayPushBack, arrayPushFront, arrayPopBack, arrayPopFront.
  • Added root and identity parameters for the ZooKeeper configuration. This allows you to isolate individual users on the same ZooKeeper cluster.
  • Added aggregate functions groupBitAnd, groupBitOr, and groupBitXor (for compatibility, they are also available under the names BIT_AND, BIT_OR, and BIT_XOR).
  • External dictionaries can be loaded from MySQL by specifying a socket in the filesystem.
  • External dictionaries can be loaded from MySQL over SSL (ssl_cert, ssl_key, ssl_ca parameters).
  • Added the max_network_bandwidth_for_user setting to restrict the overall bandwidth use for queries per user.
  • Support for DROP TABLE for temporary tables.
  • Support for reading DateTime values in Unix timestamp format from the CSV and JSONEachRow formats.
  • Lagging replicas in distributed queries are now excluded by default (the default threshold is 5 minutes).
  • FIFO locking is used during ALTER: an ALTER query isn't blocked indefinitely for continuously running queries.
  • Option to set umask in the config file.
  • Improved performance for queries with DISTINCT.
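
As a small illustration of the groupBitAnd, groupBitOr and groupBitXor aggregates listed above, the Python sketch below folds a bitwise operator over a column of integers. The helper name group_bit is hypothetical and the sketch does not model how ClickHouse treats an empty group.

```python
from functools import reduce
from operator import and_, or_, xor


def group_bit(op, values):
    """Fold a bitwise operator over all values in the group."""
    return reduce(op, values)


values = [0b00101100, 0b00011100, 0b00001101, 0b01010101]  # 44, 28, 13, 85

print(group_bit(and_, values))  # 4   (0b00000100)
print(group_bit(or_, values))   # 125 (0b01111101)
print(group_bit(xor, values))   # 104 (0b01101000)
```

Each result bit depends only on the same bit position of the inputs, so the order in which rows are combined does not matter.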

Bug fixes:

  • Improved the process for deleting old nodes in ZooKeeper. Previously, old nodes sometimes didn't get deleted if there were very frequent inserts, which caused the server to be slow to shut down, among other things.
  • Fixed randomization when choosing hosts for the connection to ZooKeeper.
  • Fixed the exclusion of lagging replicas in distributed queries if the replica is localhost.
  • Fixed an error where a data part in a ReplicatedMergeTree table could be broken after running ALTER MODIFY on an element in a Nested structure.
  • Fixed an error that could cause SELECT queries to "hang".
  • Improvements to distributed DDL queries.
  • Fixed the query CREATE TABLE ... AS <materialized view>.
  • Resolved the deadlock in the ALTER ... CLEAR COLUMN IN PARTITION query for Buffer tables.
  • Fixed the invalid default value for Enums (0 instead of the minimum) when using the JSONEachRow and TSKV formats.
  • Resolved the appearance of zombie processes when using a dictionary with an executable source.
  • Fixed segfault for the HEAD query.

Improved workflow for developing and assembling ClickHouse:

  • You can use pbuilder to build ClickHouse.
  • You can use libc++ instead of libstdc++ for builds on Linux.
  • Added instructions for using static code analysis tools: Coverity, clang-tidy, cppcheck.

Please note when upgrading:

  • There is now a higher default value for the MergeTree setting max_bytes_to_merge_at_max_space_in_pool (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT queries will fail with the message "Merges are processing significantly slower than inserts." Use the SELECT * FROM system.merges query to monitor the situation. You can also check the DiskSpaceReservedForMerge metric in the system.metrics table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the max_bytes_to_merge_at_max_space_in_pool setting. To do this, go to the <merge_tree> section in config.xml, set <max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>, and restart the server.
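
The byte values in this note can be checked with a little arithmetic. The Python sketch below converts the old and new limits from GiB to bytes and encodes the "free space is less than twice the running merges" condition described above. The function small_merges_blocked is a hypothetical helper written for illustration; it is not how the server implements the check.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

OLD_LIMIT_BYTES = 100 * GIB  # previous default, the value to put in config.xml to roll back
NEW_LIMIT_BYTES = 150 * GIB  # new default


def small_merges_blocked(free_space_bytes, running_merge_bytes):
    """Return True if free space is less than twice the total size of running merges.

    Hypothetical helper: it restates the condition from the upgrade note,
    not the server's actual space-reservation logic.
    """
    return free_space_bytes < 2 * sum(running_merge_bytes)


print(OLD_LIMIT_BYTES)  # 107374182400
print(NEW_LIMIT_BYTES)  # 161061273600
print(small_merges_blocked(250 * GIB, [150 * GIB]))  # True
print(small_merges_blocked(400 * GIB, [150 * GIB]))  # False
```

With a single 150 GiB merge in flight, this reading of the rule implies that a server with less than 300 GiB free could see other merges stall until the large merge finishes.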

ClickHouse release 1.1.54284, 2017-08-29

  • This is a bugfix release for the previous 1.1.54282 release. It fixes leaks in the parts directory in ZooKeeper.

ClickHouse release 1.1.54282, 2017-08-23

This release contains bug fixes for the previous release 1.1.54276:

  • Fixed DB::Exception: Assertion violation: !_path.empty() when inserting into a Distributed table.
  • Fixed parsing when inserting in RowBinary format if input data starts with ';'.
  • Fixed errors during runtime compilation of certain aggregate functions (e.g. groupArray()).

ClickHouse release 1.1.54276, 2017-08-16

New features:

  • Added an optional WITH section for a SELECT query. Example query: WITH 1+1 AS a SELECT a, a*a
  • INSERT can be performed synchronously in a Distributed table: OK is returned only after all the data is saved on all the shards. This is activated by the setting insert_distributed_sync=1.
  • Added the UUID data type for working with 16-byte identifiers.
  • Added aliases such as CHAR and FLOAT for existing types, for compatibility with Tableau.
  • Added the functions toYYYYMM, toYYYYMMDD, and toYYYYMMDDhhmmss for converting time into numbers.
  • You can use IP addresses (together with the hostname) to identify servers for clustered DDL queries.
  • Added support for non-constant arguments and negative offsets in the function substring(str, pos, len).
  • Added the max_size parameter for the groupArray(max_size)(column) aggregate function, and optimized its performance.
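
To make the number layout of toYYYYMM, toYYYYMMDD, and toYYYYMMDDhhmmss concrete, here is a minimal Python sketch that packs the components of a timestamp into decimal digits in the order the function names suggest. The Python function names are hypothetical stand-ins, and the sketch ignores time zones and ClickHouse's integer types.

```python
from datetime import datetime


def to_yyyymm(dt):
    """Pack year and month into one integer, e.g. 201708."""
    return dt.year * 100 + dt.month


def to_yyyymmdd(dt):
    """Pack year, month, and day into one integer, e.g. 20170816."""
    return to_yyyymm(dt) * 100 + dt.day


def to_yyyymmddhhmmss(dt):
    """Pack the full timestamp into one integer, e.g. 20170816130509."""
    return ((to_yyyymmdd(dt) * 100 + dt.hour) * 100 + dt.minute) * 100 + dt.second


moment = datetime(2017, 8, 16, 13, 5, 9)
print(to_yyyymm(moment))          # 201708
print(to_yyyymmdd(moment))        # 20170816
print(to_yyyymmddhhmmss(moment))  # 20170816130509
```

Because the digits run from the most significant component to the least, numbers produced this way sort in the same order as the timestamps they encode.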

Main changes:

  • Security improvements: all server files are created with 0640 permissions (can be changed via config parameter).
  • Improved error messages for queries with invalid syntax.
  • Significantly reduced memory consumption and improved performance when merging large sections of MergeTree data.
  • Significantly increased the performance of data merges for the ReplacingMergeTree engine.
  • Improved performance for asynchronous inserts from a Distributed table by combining multiple source inserts. To enable this functionality, use the setting distributed_directory_monitor_batch_inserts=1.

Backward incompatible changes:

  • Changed the binary format of aggregate states of groupArray(array_column) functions for arrays.

Complete list of changes:

  • Added the output_format_json_quote_denormals setting, which enables outputting nan and inf values in JSON format.
  • Optimized stream allocation when reading from a Distributed table.
  • Settings can be configured in readonly mode if the value doesn't change.
  • Added the ability to retrieve non-integer granules of the MergeTree engine in order to meet restrictions on the block size specified in the preferred_block_size_bytes setting. The purpose is to reduce the consumption of RAM and increase cache locality when processing queries from tables with large columns.
  • Efficient use of indexes that contain expressions like toStartOfHour(x) for conditions like toStartOfHour(x) op constexpr.
  • Added new settings for MergeTree engines (the merge_tree section in config.xml):
    • replicated_deduplication_window_seconds sets the number of seconds allowed for deduplicating inserts in Replicated tables.
    • cleanup_delay_period sets how often to start cleanup to remove outdated data.
    • replicated_can_become_leader can prevent a replica from becoming the leader (and assigning merges).
  • Accelerated cleanup to remove outdated data from ZooKeeper.
  • Multiple improvements and fixes for clustered DDL queries. Of particular interest is the new setting distributed_ddl_task_timeout, which limits the time to wait for a response from the servers in the cluster. If a DDL request has not been performed on all hosts within this time, the response contains a timeout error and the request continues to execute asynchronously.
  • Improved display of stack traces in the server logs.
  • Added the "none" value for the compression method.
  • You can use multiple dictionaries_config sections in config.xml.
  • It is possible to connect to MySQL through a socket in the file system.
  • The system.parts table has a new column with information about the size of marks, in bytes.

Bug fixes:

  • Distributed tables using a Merge table now work correctly for a SELECT query with a condition on the _table field.
  • Fixed a rare race condition in ReplicatedMergeTree when checking data parts.
  • Fixed possible freezing on "leader election" when starting a server.
  • The max_replica_delay_for_distributed_queries setting was ignored when using a local replica of the data source. This has been fixed.
  • Fixed incorrect behavior of ALTER TABLE CLEAR COLUMN IN PARTITION when attempting to clean a non-existing column.
  • Fixed an exception in the multiIf function when using empty arrays or strings.
  • Fixed excessive memory allocations when deserializing Native format.
  • Fixed incorrect auto-update of Trie dictionaries.
  • Fixed an exception when running queries with a GROUP BY clause from a Merge table when using SAMPLE.
  • Fixed a crash of GROUP BY when using distributed_aggregation_memory_efficient=1.
  • Now you can specify database.table on the right side of IN and JOIN.
  • Too many threads were used for parallel aggregation. This has been fixed.
  • Fixed how the "if" function works with FixedString arguments.
  • SELECT worked incorrectly from a Distributed table for shards with a weight of 0. This has been fixed.
  • Running CREATE VIEW IF EXISTS no longer causes crashes.
  • Fixed incorrect behavior when input_format_skip_unknown_fields=1 is set and there are negative numbers.
  • Fixed an infinite loop in the dictGetHierarchy() function if there is some invalid data in the dictionary.
  • Fixed Syntax error: unexpected (...) errors when running distributed queries with subqueries in an IN or JOIN clause and Merge tables.
  • Fixed an incorrect interpretation of a SELECT query from Dictionary tables.
  • Fixed the "Cannot mremap" error when using arrays in IN and JOIN clauses with more than 2 billion elements.
  • Fixed the failover for dictionaries with MySQL as the source.

Improved workflow for developing and assembling ClickHouse:

  • Builds can be assembled in Arcadia.
  • You can use gcc 7 to compile ClickHouse.
  • Parallel builds using ccache+distcc are faster now.

ClickHouse release 1.1.54245, 2017-07-04

New features:

  • Distributed DDL (for example, CREATE TABLE ON CLUSTER).
  • The replicated query ALTER TABLE CLEAR COLUMN IN PARTITION.
  • The engine for Dictionary tables (access to dictionary data in the form of a table).
  • Dictionary database engine (this type of database automatically has Dictionary tables available for all the connected external dictionaries).
  • You can check for updates to the dictionary by sending a request to the source.
  • Qualified column names.
  • Quoting identifiers using double quotation marks.
  • Sessions in the HTTP interface.
  • The OPTIMIZE query for a Replicated table can now run on any replica, not only on the leader.

Backward incompatible changes:

  • Removed SET GLOBAL.

Minor changes:

  • Now after an alert is triggered, the log prints the full stack trace.
  • Relaxed the verification of the number of damaged/extra data parts at startup (there were too many false positives).

Bug fixes:

  • Fixed a bad connection "sticking" when inserting into a Distributed table.
  • GLOBAL IN now works for a query from a Merge table that looks at a Distributed table.
  • The incorrect number of cores was detected on a Google Compute Engine virtual machine. This has been fixed.
  • Changes in how an executable source of cached external dictionaries works.
  • Fixed the comparison of strings containing null characters.
  • Fixed the comparison of Float32 primary key fields with constants.
  • Fixed an incorrect estimate of the size of a field that could lead to overly large allocations.
  • Fixed a crash when querying a Nullable column added to a table using ALTER.
  • Fixed a crash when sorting by a Nullable column, if the number of rows is less than LIMIT.
  • Fixed an ORDER BY subquery consisting of only constant values.
  • Fixed an issue where a Replicated table could remain in an invalid state after a failed DROP TABLE.
  • Aliases for scalar subqueries with empty results are no longer lost.
  • Now a query that used compilation does not fail with an error if the .so file gets damaged.