ClickHouse release 21.4
ClickHouse release 21.4.1 2021-04-12
Backward Incompatible Change
- The `toStartOfInterval` function will align hour intervals to midnight (in previous versions they were aligned to the start of the Unix epoch). For example, `toStartOfInterval(x, INTERVAL 11 HOUR)` will split every day into three intervals: `00:00:00..10:59:59`, `11:00:00..21:59:59` and `22:00:00..23:59:59`. This behaviour is more suited for practical needs. This closes #9510. #22060 (alexey-milovidov).
- `Age` and `Precision` in graphite rollup configs should increase from retention to retention. Now it's checked and the wrong config raises an exception. #21496 (Mikhail f. Shiryaev).
- Fix `cutToFirstSignificantSubdomainCustom()`/`firstSignificantSubdomainCustom()` returning wrong result for 3+ level domains present in custom top-level domain list. For input domains matching these custom top-level domains, the third-level domain was considered to be the first significant one. This is now fixed. This change may introduce incompatibility if the function is used in e.g. the sharding key. #21946 (Azat Khuzhin).
- Column `keys` in table `system.dictionaries` was replaced with columns `key.names` and `key.types`. Columns `key.names`, `key.types`, `attribute.names`, `attribute.types` from the `system.dictionaries` table do not require the dictionary to be loaded. #21884 (Maksim Kita).
- Now replicas that are processing the `ALTER TABLE ATTACH PART[ITION]` command search in their `detached/` folders before fetching the data from other replicas. As an implementation detail, a new command `ATTACH_PART` is introduced in the replicated log. Parts are searched and compared by their checksums. #18978 (Mike Kot). Note:
  - `ATTACH PART[ITION]` queries may not work during cluster upgrade.
  - It's not possible to roll back to an older ClickHouse version after executing an `ALTER ... ATTACH` query in the new version, as the old servers would fail to parse the `ATTACH_PART` entry in the replicated log.
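
The change in hour alignment for `toStartOfInterval` can be illustrated with a small Python sketch. This is not ClickHouse code: the two helper functions below are hypothetical names introduced only to model the arithmetic described above, and they assume timezone-aware UTC timestamps.

```python
from datetime import datetime, timedelta, timezone


def start_of_hour_interval_midnight(ts: datetime, hours: int) -> datetime:
    """Model of the new behaviour: N-hour intervals are counted from midnight of the same day."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    step = hours * 3600
    elapsed = int((ts - midnight).total_seconds())
    return midnight + timedelta(seconds=(elapsed // step) * step)


def start_of_hour_interval_epoch(ts: datetime, hours: int) -> datetime:
    """Model of the previous behaviour: N-hour intervals are counted from the Unix epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    step = hours * 3600
    elapsed = int((ts - epoch).total_seconds())
    return epoch + timedelta(seconds=(elapsed // step) * step)
```

With an 11-hour interval, the midnight-based model maps any time of day to `00:00`, `11:00` or `22:00`, which matches the three intervals listed in the entry; the epoch-based model produces boundaries that drift from day to day.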
New Feature
- Extended range of `DateTime64` to support dates from year 1925 to 2283. Improved support of `DateTime` around zero date (`1970-01-01`). #9404 (alexey-milovidov, Vasily Nemkov).
- Add `prefer_column_name_to_alias` setting to use original column names instead of aliases. It is needed to be more compatible with common databases' aliasing rules. This is for #9715 and #9887. #22044 (Amos Bird).
- Added functions `dictGetChildren(dictionary, key)`, `dictGetDescendants(dictionary, key, level)`. Function `dictGetChildren` returns all children as an array of indexes. It is an inverse transformation for `dictGetHierarchy`. Function `dictGetDescendants` returns all descendants as if `dictGetChildren` was applied `level` times recursively. Zero `level` value is equivalent to infinity. Closes #14656. #22096 (Maksim Kita).
- Added `executable_pool` dictionary source. Closes #14528. #21321 (Maksim Kita).
- Added table function `dictionary`. It works the same way as `Dictionary` engine. Closes #21560. #21910 (Maksim Kita).
- Support `Nullable` type for `PolygonDictionary` attribute. #21890 (Maksim Kita).
- Functions `dictGet`, `dictHas` use current database name if it is not specified for dictionaries created with DDL. Closes #21632. #21859 (Maksim Kita).
- Added function `dictGetOrNull`. It works like `dictGet`, but returns `Null` if the key was not found in the dictionary. Closes #22375. #22413 (Maksim Kita).
- Added async update in `ComplexKeyCache`, `SSDCache`, `SSDComplexKeyCache` dictionaries. Added support for `Nullable` type in `Cache`, `ComplexKeyCache`, `SSDCache`, `SSDComplexKeyCache` dictionaries. Added support for multiple attributes fetch with `dictGet`, `dictGetOrDefault` functions. Fixes #21517. #20595 (Maksim Kita).
- Support `dictHas` function for `RangeHashedDictionary`. Fixes #6680. #19816 (Maksim Kita).
- Add function `timezoneOf` that returns the timezone name of `DateTime` or `DateTime64` data types. This does not close #9959. Fix inconsistencies in function names: add aliases `timezone` and `timeZone` as well as `toTimezone` and `toTimeZone` and `timezoneOf` and `timeZoneOf`. #22001 (alexey-milovidov).
- Add new optional clause `GRANTEES` for `CREATE/ALTER USER` commands. It specifies users or roles which are allowed to receive grants from this user, on condition that this user also has all required access granted with grant option. By default `GRANTEES ANY` is used, which means a user with grant option can grant to anyone. Syntax: `CREATE USER ... GRANTEES {user | role | ANY | NONE} [,...] [EXCEPT {user | role} [,...]]`. #21641 (Vitaly Baranov).
- Add new column `slowdowns_count` to `system.clusters`. When using hedged requests, it shows how many times we switched to another replica because this replica was responding slowly. Also show actual value of `errors_count` in `system.clusters`. #21480 (Kruglov Pavel).
- Add `_partition_id` virtual column for `MergeTree*` engines. Allow to prune partitions by `_partition_id`. Add `partitionID()` function to calculate partition id string. #21401 (Amos Bird).
- Add function `isIPAddressInRange` to test if an IPv4 or IPv6 address is contained in a given CIDR network prefix. #21329 (PHO).
- Added new SQL command `ALTER TABLE 'table_name' UNFREEZE [PARTITION 'part_expr'] WITH NAME 'backup_name'`. This command is needed to properly remove 'freezed' partitions from all disks. #21142 (Pavel Kovalenko).
- Supports implicit key type conversion for JOIN. #19885 (Vladimir).
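
The relationship between `dictGetChildren` and `dictGetDescendants` described above can be modelled with a short Python sketch. It is an illustration only, not the ClickHouse implementation: the function names, the `parent_of` mapping and the assumption of an acyclic hierarchy are all hypothetical. A positive `level` is read here as "apply the children lookup `level` times", and `level = 0` as "no depth limit", following the wording of the entry.

```python
from collections import defaultdict
from typing import Dict, List


def build_children_index(parent_of: Dict[int, int]) -> Dict[int, List[int]]:
    """Invert a child -> parent mapping into parent -> list of children."""
    children = defaultdict(list)
    for key, parent in parent_of.items():
        children[parent].append(key)
    return children


def get_children(children: Dict[int, List[int]], key: int) -> List[int]:
    """Direct children of `key` (the inverse of walking up the hierarchy)."""
    return list(children.get(key, []))


def get_descendants(children: Dict[int, List[int]], key: int, level: int = 0) -> List[int]:
    """Keys reached by applying the children lookup `level` times; level 0 collects every depth.

    Assumes the hierarchy has no cycles.
    """
    collected: List[int] = []
    frontier = [key]
    depth = 0
    while frontier:
        frontier = [child for k in frontier for child in children.get(k, [])]
        depth += 1
        if level == 0:
            collected.extend(frontier)
        elif depth == level:
            return frontier
    return collected
```

For a hierarchy where 2 and 3 are children of 1, 4 is a child of 2 and 5 is a child of 4, `get_descendants(index, 1, 1)` gives the same keys as `get_children(index, 1)`, and `get_descendants(index, 1, 0)` walks the whole subtree.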
Experimental Feature
- Support `RANGE OFFSET` frame (for window functions) for floating point types. Implement `lagInFrame`/`leadInFrame` window functions, which are analogous to `lag`/`lead`, but respect the window frame. They are identical when the frame is `between unbounded preceding and unbounded following`. This closes #5485. #21895 (Alexander Kuzmenkov).
- Zero-copy replication for `ReplicatedMergeTree` over S3 storage. #16240 (ianton-ru).
- Added possibility to migrate existing S3 disk to the schema with backup-restore capabilities. #22070 (Pavel Kovalenko).
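
To make the difference between `lag`/`lead` and `lagInFrame`/`leadInFrame` concrete, here is a minimal Python model. It is a sketch under stated assumptions rather than ClickHouse behaviour verbatim: a partition is a plain list, the frame is given as inclusive row indices (`frame_start`, `frame_end`), and all function names are hypothetical.

```python
def lag(values, i, offset=1, default=None):
    """Value `offset` rows before row i, looking at the whole partition."""
    j = i - offset
    return values[j] if 0 <= j < len(values) else default


def lag_in_frame(values, i, frame_start, frame_end, offset=1, default=None):
    """Same lookup, but rows outside the inclusive frame [frame_start, frame_end] are not visible."""
    j = i - offset
    return values[j] if frame_start <= j <= frame_end else default


def lead_in_frame(values, i, frame_start, frame_end, offset=1, default=None):
    """Value `offset` rows after row i, restricted to the frame."""
    j = i + offset
    return values[j] if frame_start <= j <= frame_end else default
```

When the frame spans the whole partition (`frame_start = 0`, `frame_end = len(values) - 1`), `lag_in_frame` returns the same value as `lag`, which corresponds to the "identical when the frame is unbounded" remark in the entry.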
Performance Improvement
- Supported parallel formatting in `clickhouse-local` and everywhere else. #21630 (Nikita Mikhaylov).
- Support parallel parsing for `CSVWithNames` and `TSVWithNames` formats. This closes #21085. #21149 (Nikita Mikhaylov).
- Enable read with mmap IO for file ranges from 64 MiB (the setting `min_bytes_to_use_mmap_io`). It may lead to moderate performance improvement. #22326 (alexey-milovidov).
- Add cache for files read with `min_bytes_to_use_mmap_io` setting. It makes significant (2x and more) performance improvement when the value of the setting is small by avoiding frequent mmap/munmap calls and the consequent page faults. Note that mmap IO has major drawbacks that make it less reliable in production (e.g. hangs or SIGBUS on faulty disks; less controllable memory usage). Nevertheless it is good in benchmarks. #22206 (alexey-milovidov).
- Avoid unnecessary data copy when using codec `NONE`. Please note that codec `NONE` is mostly useless - it's recommended to always use compression (`LZ4` is by default). Despite the common belief, disabling compression may not improve performance (the opposite effect is possible). The `NONE` codec is useful in some cases: when data is incompressible; for synthetic benchmarks. #22145 (alexey-milovidov).
- Faster `GROUP BY` with small `max_rows_to_group_by` and `group_by_overflow_mode='any'`. #21856 (Nikolai Kochetov).
- Optimize performance of queries like `SELECT ... FINAL ... WHERE`. Now in queries with `FINAL` it's allowed to move to `PREWHERE` columns, which are in sorting key. #21830 (foolchi).
- Improved performance by replacing `memcpy` with another implementation. This closes #18583. #21520 (alexey-milovidov).
- Improve performance of aggregation in order of sorting key (with enabled setting `optimize_aggregation_in_order`). #19401 (Anton Popov).
Improvement
- Add connection pool for PostgreSQL table/database engine and dictionary source. Should fix #21444. #21839 (Kseniia Sumarokova).
- Support non-default table schema for postgres storage/table-function. Closes #21701. #21711 (Kseniia Sumarokova).
- Support replicas priority for postgres dictionary source. #21710 (Kseniia Sumarokova).
- Introduce a new merge tree setting `min_bytes_to_rebalance_partition_over_jbod` which allows assigning new parts to different disks of a JBOD volume in a balanced way. #16481 (Amos Bird).
- Added `Grant`, `Revoke` and `System` values of `query_kind` column for corresponding queries in `system.query_log`. #21102 (Vasily Nemkov).
- Allow customizing timeouts for HTTP connections used for replication independently from other HTTP timeouts. #20088 (nvartolomei).
- Better exception message in client in case of exception while server is writing blocks. In previous versions client may get misleading message like `Data compressed with different methods`. #22427 (alexey-milovidov).
- Fix error `Directory tmp_fetch_XXX already exists` which could happen after failed fetch part. Delete temporary fetch directory if it already exists. Fixes #14197. #22411 (nvartolomei).
- Fix MSan report for function `range` with `UInt256` argument (support for large integers is experimental). This closes #22157. #22387 (alexey-milovidov).
- Add `current_database` column to `system.processes` table. It contains the current database of the query. #22365 (Alexander Kuzmenkov).
- Add case-insensitive history search/navigation and subword movement features to `clickhouse-client`. #22105 (Amos Bird).
- If tuple of NULLs, e.g. `(NULL, NULL)` is on the left hand side of `IN` operator with tuples of non-NULLs on the right hand side, e.g. `SELECT (NULL, NULL) IN ((0, 0), (3, 1))` return 0 instead of throwing an exception about incompatible types. The expression may also appear due to optimization of something like `SELECT (NULL, NULL) = (8, 0) OR (NULL, NULL) = (3, 2) OR (NULL, NULL) = (0, 0) OR (NULL, NULL) = (3, 1)`. This closes #22017. #22063 (alexey-milovidov).
- Update used version of simdjson to 0.9.1. This fixes #21984. #22057 (Vitaly Baranov).
- Added case insensitive aliases for `CONNECTION_ID()` and `VERSION()` functions. This fixes #22028. #22042 (Eugene Klimov).
- Add option `strict_increase` to `windowFunnel` function to calculate each event once (resolve #21835). #22025 (Vladimir).
- If partition key of a `MergeTree` table does not include `Date` or `DateTime` columns but includes exactly one `DateTime64` column, expose its values in the `min_time` and `max_time` columns in `system.parts` and `system.parts_columns` tables. Add `min_time` and `max_time` columns to `system.parts_columns` table (this was an inconsistency with the `system.parts` table). This closes #18244. #22011 (alexey-milovidov).
- Supported `replication_alter_partitions_sync=1` setting in `clickhouse-copier` for moving partitions from helping table to destination. Decreased default timeouts. Fixes #21911. #21912 (turbo jason).
- Show path to data directory of `EmbeddedRocksDB` tables in system tables. #21903 (tavplubix).
- Add profile event `HedgedRequestsChangeReplica`, change read data timeout from sec to ms. #21886 (Kruglov Pavel).
- DiskS3 (experimental feature under development). Fixed bug with the impossibility to move directory if the destination is not empty and cache disk is used. #21837 (Pavel Kovalenko).
- Better formatting for `Array` and `Map` data types in Web UI. #21798 (alexey-milovidov).
- Update clusters only if their configurations were updated. #21685 (Kruglov Pavel).
- Propagate query and session settings for distributed DDL queries. Set `distributed_ddl_entry_format_version` to 2 to enable this. Added `distributed_ddl_output_mode` setting. Supported modes: `none`, `throw` (default), `null_status_on_timeout` and `never_throw`. Miscellaneous fixes and improvements for `Replicated` database engine. #21535 (tavplubix).
- If `PODArray` was instantiated with element size that is neither a fraction nor a multiple of 16, buffer overflow was possible. No bugs in current releases exist. #21533 (alexey-milovidov).
- Add `last_error_time`/`last_error_message`/`last_error_stacktrace`/`remote` columns for `system.errors`. #21529 (Azat Khuzhin).
- Add aliases `simpleJSONExtract/simpleJSONHas` to `visitParam/visitParamExtract{UInt, Int, Bool, Float, Raw, String}`. Fixes #21383. #21519 (fastio).
- Add setting `optimize_skip_unused_shards_limit` to limit the number of sharding key values for `optimize_skip_unused_shards`. #21512 (Azat Khuzhin).
- Improve `clickhouse-format` to not throw exception when there are extra spaces or comment after the last query, and throw exception early with readable message when formatting `ASTInsertQuery` with data. #21311 (flynn).
- Improve support of integer keys in data type `Map`. #21157 (Anton Popov).
- MaterializeMySQL: attempt to reconnect to MySQL if the connection is lost. #20961 (Håvard Kvålen).
- Support more cases to rewrite `CROSS JOIN` to `INNER JOIN`. #20392 (Vladimir).
- Do not create empty parts on INSERT when `optimize_on_insert` setting enabled. Fixes #20304. #20387 (Kruglov Pavel).
- `MaterializeMySQL`: add minmax skipping index for `_version` column. #20382 (Stig Bakken).
- Add option `--backslash` for `clickhouse-format`, which can add a backslash at the end of each line of the formatted query. #21494 (flynn).
- Now clickhouse will not throw `LOGICAL_ERROR` exception when we try to mutate the already covered part. Fixes #22013. #22291 (alesapin).
Bug Fix
- Remove socket from epoll before cancelling packet receiver in `HedgedConnections` to prevent possible race. Fixes #22161. #22443 (Kruglov Pavel).
- Add (missing) memory accounting in parallel parsing routines. In previous versions OOM was possible when the resultset contains very large blocks of data. This closes #22008. #22425 (alexey-milovidov).
- Fix exception which may happen when `SELECT` has constant `WHERE` condition and source table has columns whose names are digits. #22270 (LiuNeng).
- Fix query cancellation with `use_hedged_requests=0` and `async_socket_for_remote=1`. #22183 (Azat Khuzhin).
- Fix uncaught exception in `InterserverIOHTTPHandler`. #22146 (Azat Khuzhin).
- Fix docker entrypoint in case `http_port` is not in the config. #22132 (Ewout).
- Fix error `Invalid number of rows in Chunk` in `JOIN` with `TOTALS` and `arrayJoin`. Closes #19303. #22129 (Vladimir).
- Fix the name of the background thread pool which is used to poll messages from Kafka. The Kafka engine with the broken thread pool will not consume messages from the message queue. #22122 (fastio).
- Fix waiting for `OPTIMIZE` and `ALTER` queries for `ReplicatedMergeTree` table engines. Now the query will not hang when the table was detached or restarted. #22118 (alesapin).
- Disable `async_socket_for_remote`/`use_hedged_requests` for buggy Linux kernels. #22109 (Azat Khuzhin).
- Docker entrypoint: avoid chown of `.` in case when `LOG_PATH` is empty. Closes #22100. #22102 (filimonov).
- The function `decrypt` was lacking a check for the minimal size of data encrypted in `AEAD` mode. This closes #21897. #22064 (alexey-milovidov).
- In rare case, merge for `CollapsingMergeTree` may create granule with `index_granularity + 1` rows. Because of this, internal check, added in #18928 (affects 21.2 and 21.3), may fail with error `Incomplete granules are not allowed while blocks are granules size`. This error did not allow parts to merge. #21976 (Nikolai Kochetov).
- Reverted #15454 that may cause significant increase in memory usage while loading external dictionaries of hashed type. This closes #21935. #21948 (Maksim Kita).
- Prevent hedged connections overlaps (`Unknown packet 9 from server` error). #21941 (Azat Khuzhin).
- Fix reading the HTTP POST request with "multipart/form-data" content type in some cases. #21936 (Ivan).
- Fix wrong `ORDER BY` results when a query contains window functions, and optimization for reading in primary key order is applied. Fixes #21828. #21915 (Alexander Kuzmenkov).
- Fix deadlock in first catboost model execution. Closes #13832. #21844 (Kruglov Pavel).
- Fix incorrect query result (and possible crash) which could happen when `WHERE` or `HAVING` condition is pushed before `GROUP BY`. Fixes #21773. #21841 (Nikolai Kochetov).
- Better error handling and logging in `WriteBufferFromS3`. #21836 (Pavel Kovalenko).
- Fix possible crashes in aggregate functions with combinator `Distinct`, while using two-level aggregation. This is a follow-up fix of #18365. Can only be reproduced in production env. #21818 (Amos Bird).
- Fix scalar subquery index analysis. This fixes #21717, which was introduced in #18896. #21766 (Amos Bird).
- Fix bug for `ReplicatedMerge` table engines when `ALTER MODIFY COLUMN` query doesn't change the type of `Decimal` column if its size (32 bit or 64 bit) doesn't change. #21728 (alesapin).
- Fix possible infinite waiting when concurrent `OPTIMIZE` and `DROP` are run for `ReplicatedMergeTree`. #21716 (Azat Khuzhin).
- Fix function `arrayElement` with type `Map` for constant integer arguments. #21699 (Anton Popov).
- Fix SIGSEGV on not existing attributes from `ip_trie` with `access_to_key_from_attributes`. #21692 (Azat Khuzhin).
- Server now starts accepting connections only after `DDLWorker` and dictionaries initialization. #21676 (Azat Khuzhin).
- Add type conversion for keys of tables of type `Join` (previously led to SIGSEGV). #21646 (Azat Khuzhin).
- Fix distributed requests cancellation (for example simple select from multiple shards with limit, i.e. `select * from remote('127.{2,3}', system.numbers) limit 100`) with `async_socket_for_remote=1`. #21643 (Azat Khuzhin).
- Fix `fsync_part_directory` for horizontal merge. #21642 (Azat Khuzhin).
- Remove unknown columns from joined table in `WHERE` for queries to external database engines (MySQL, PostgreSQL). close #14614, close #19288 (dup), close #19645 (dup). #21640 (Vladimir).
- `std::terminate` was called if there is an error writing data into s3. #21624 (Vladimir).
- Fix possible error `Cannot find column` when `optimize_skip_unused_shards` is enabled and zero shards are used. #21579 (Azat Khuzhin).
- If a query has a constant `WHERE` condition and the setting `optimize_skip_unused_shards` is enabled, all shards may be skipped and the query could return an incorrect empty result. #21550 (Amos Bird).
- Fix table function `clusterAllReplicas` returning wrong `_shard_num`. close #21481. #21498 (flynn).
- Fix that S3 table holds old credentials after config update. #21457 (Grigory Pervakov).
- Fixed race on SSL object inside `SecureSocket` in Poco. #21456 (Nikita Mikhaylov).
- Fix `Avro` format parsing for `Kafka`. Fixes #21437. #21438 (Ilya Golshtein).
- Fix receive and send timeouts and non-blocking read in secure socket. #21429 (Kruglov Pavel).
- `force_drop_table` flag didn't work for `MATERIALIZED VIEW`, it's fixed. Fixes #18943. #20626 (tavplubix).
- Fix name clashes in `PredicateRewriteVisitor`. It caused incorrect `WHERE` filtration after full join. Close #20497. #20622 (Vladimir).
- Fixed open behavior of remote host filter in case when there is `remote_url_allow_hosts` section in configuration but no entries there. #20058 (Vladimir Chebotarev).
Build/Testing/Packaging Improvement
- Add Jepsen tests for ClickHouse Keeper. #21677 (alesapin).
- Run stateless tests in parallel in CI. Depends on #22181. #22300 (alesapin).
- Enable status check for SQLancer CI run. #22015 (Ilya Yatsishin).
- Multiple preparations for PowerPC builds: Enable the bundled openldap on `ppc64le`. #22487 (Kfir Itzhak). Enable compiling on `ppc64le` with Clang. #22476 (Kfir Itzhak). Fix compiling boost on `ppc64le`. #22474 (Kfir Itzhak). Fix CMake error about internal CMake variable `CMAKE_ASM_COMPILE_OBJECT` not set on `ppc64le`. #22469 (Kfir Itzhak). Fix Fedora/RHEL/CentOS not finding `libclang_rt.builtins` on `ppc64le`. #22458 (Kfir Itzhak). Enable building with `jemalloc` on `ppc64le`. #22447 (Kfir Itzhak). Fix ClickHouse's config embedding and cctz's timezone embedding on `ppc64le`. #22445 (Kfir Itzhak). Fixed compiling on `ppc64le` and use the correct instruction pointer register on `ppc64le`. #22430 (Kfir Itzhak).
- Re-enable the S3 (AWS) library on `aarch64`. #22484 (Kfir Itzhak).
- Add `tzdata` to Docker containers because reading `ORC` formats requires it. This closes #14156. #22000 (alexey-milovidov).
- Introduce 2 arguments for `clickhouse-server` image Dockerfile: `deb_location` & `single_binary_location`. #21977 (filimonov).
- Allow to use clang-tidy with release builds by enabling assertions if it is used. #21914 (alexey-milovidov).
- Add llvm-12 binaries name to search in cmake scripts. Implicit constants conversions to mute clang warnings. Updated submodules to build with CMake 3.19. Mute recursion in macro expansion in `readpassphrase` library. Deprecated `-fuse-ld` changed to `--ld-path` for clang. #21597 (Ilya Yatsishin).
- Updating `docker/test/testflows/runner/dockerd-entrypoint.sh` to use Yandex dockerhub-proxy, because Docker Hub has enabled very restrictive rate limits. #21551 (vzakaznikov).
- Fix macOS shared lib build. #20184 (nvartolomei).
- Add `ctime` option to `zookeeper-dump-tree`. It allows to dump node creation time. #21842 (Ilya).
ClickHouse release 21.3 (LTS)
ClickHouse release v21.3, 2021-03-12
Backward Incompatible Change
- Now it's not allowed to create MergeTree tables in old syntax with table TTL because it's just ignored. Attach of old tables is still possible. #20282 (alesapin).
- Now all case-insensitive function names will be rewritten to their canonical representations. This is needed for projection query routing (the upcoming feature). #20174 (Amos Bird).
- Fix creation of `TTL` in cases when its expression is a function and it is the same as the `ORDER BY` key. Now it's allowed to set custom aggregation to primary key columns in `TTL` with `GROUP BY`. Backward incompatible: for primary key columns that are not in `GROUP BY` and are not set explicitly, the function `any` is now applied instead of `max` when TTL is expired. Also, if you use TTL with `WHERE` or `GROUP BY`, you can see exceptions at merges while making a rolling update. #15450 (Anton Popov).
New Feature
- Add file engine settings: `engine_file_empty_if_not_exists` and `engine_file_truncate_on_insert`. #20620 (M0r64n).
- Add aggregate function `deltaSum` for summing the differences between consecutive rows. #20057 (Russ Frank).
- New `event_time_microseconds` column in `system.part_log` table. #20027 (Bharat Nallan).
- Added `timezoneOffset(datetime)` function which will give the offset from UTC in seconds. This closes #19850. #19962 (keenwolf).
- Add setting `insert_shard_id` to support insert data into specific shard from distributed table. #19961 (flynn).
- Function `reinterpretAs` updated to support big integers. Fixes #19691. #19858 (Maksim Kita).
- Added Server Side Encryption Customer Keys (the `x-amz-server-side-encryption-customer-(key/md5)` header) support in S3 client. See the link. Closes #19428. #19748 (Vladimir Chebotarev).
- Added `implicit_key` option for `executable` dictionary source. It allows to avoid printing key for every record if records come in the same order as the input keys. Implements #14527. #19677 (Maksim Kita).
- Add quota type `query_selects` and `query_inserts`. #19603 (JackyWoo).
- Add function `extractTextFromHTML` #19600 (zlx19950903), (alexey-milovidov).
- Tables with `MergeTree*` engine now have two new table-level settings for query concurrency control. Setting `max_concurrent_queries` limits the number of concurrently executed queries which are related to this table. Setting `min_marks_to_honor_max_concurrent_queries` tells to apply previous setting only if query reads at least this number of marks. #19544 (Amos Bird).
- Added `file` function to read file from user_files directory as a String. This is different from the `file` table function. This implements #18851. #19204 (keenwolf).
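
As an illustration of what `deltaSum` computes, here is a small Python sketch over a plain list. The helper name `delta_sum` is hypothetical. Note one assumption that is not stated in the entry above: negative differences are skipped, which is how the function's documentation describes it as far as I am aware; treat that detail as unverified here.

```python
def delta_sum(values):
    """Sum the differences between consecutive values.

    Assumption (not stated in the changelog entry): a negative difference is ignored
    rather than subtracted, so counter resets do not reduce the total.
    """
    total = 0
    previous = None
    for value in values:
        if previous is not None and value > previous:
            total += value - previous
        previous = value
    return total
```

For a monotonically increasing sequence the result is simply the last value minus the first one.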
Experimental feature
- Add experimental `Replicated` database engine. It replicates DDL queries across multiple hosts. #16193 (tavplubix).
- Introduce experimental support for window functions, enabled with `allow_experimental_window_functions = 1`. This is a preliminary, alpha-quality implementation that is not suitable for production use and will change in backward-incompatible ways in future releases. Please see the documentation for the list of supported features. #20337 (Alexander Kuzmenkov).
- Add the ability to backup/restore metadata files for DiskS3. #18377 (Pavel Kovalenko).
Performance Improvement
- Hedged requests for remote queries. When the setting `use_hedged_requests` is enabled (off by default), a query is allowed to establish connections with several different replicas. A new connection is opened if the existing connection(s) with replica(s) were not established within `hedged_connection_timeout` or no data was received within `receive_data_timeout`. The query uses the first connection which sends a non-empty progress packet (or data packet, if `allow_changing_replica_until_first_data_packet` is set); other connections are cancelled. Queries with `max_parallel_replicas > 1` are supported. #19291 (Kruglov Pavel). This allows to significantly reduce tail latencies on very large clusters.
- Added support for `PREWHERE` (and enable the corresponding optimization) when tables have row-level security expressions specified. #19576 (Denis Glazachev).
- The setting `distributed_aggregation_memory_efficient` is enabled by default. It will lower memory usage and improve performance of distributed queries. #20599 (alexey-milovidov).
- Improve performance of GROUP BY multiple fixed size keys. #20472 (alexey-milovidov).
- Improve performance of aggregate functions by more strict aliasing. #19946 (alexey-milovidov).
- Speed up reading from `Memory` tables in extreme cases (when reading speed is in order of 50 GB/sec) by simplification of pipeline and (consequently) less lock contention in pipeline scheduling. #20468 (alexey-milovidov).
- Partially reimplement HTTP server so that it makes fewer copies of incoming and outgoing data. It gives up to 1.5x performance improvement on inserting long records over HTTP. #19516 (Ivan).
- Add `compress` setting for `Memory` tables. If it's enabled the table will use less RAM. On some machines and datasets it can also work faster on SELECT, but it is not always the case. This closes #20093. Note: there are reasons why Memory tables can work slower than MergeTree: (1) lack of compression (2) static size of blocks (3) lack of indices and prewhere... #20168 (alexey-milovidov).
- Slightly better code in aggregation. #20978 (alexey-milovidov).
- Add back `intDiv`/`modulo` specializations for better performance. This fixes #21293. The regression was introduced in https://github.com/ClickHouse/ClickHouse/pull/18145. #21307 (Amos Bird).
- Do not squash blocks too much on INSERT SELECT if inserting into Memory table. In previous versions inefficient data representation was created in Memory table after INSERT SELECT. This closes #13052. #20169 (alexey-milovidov).
- Fix at least one case when DataType parser may have exponential complexity (found by fuzzer). This closes #20096. #20132 (alexey-milovidov).
- Parallelize SELECT with FINAL for single part with level > 0 when `do_not_merge_across_partitions_select_final` setting is 1. #19375 (Kruglov Pavel).
- Fill only requested columns when querying `system.parts` and `system.parts_columns`. Closes #19570. #21035 (Anmol Arora).
- Perform algebraic optimizations of arithmetic expressions inside `avg` aggregate function. close #20092. #20183 (flynn).
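
The idea behind hedged requests can be sketched in a few lines of Python with `asyncio`. This is a deliberately simplified model, not the ClickHouse implementation: it uses a single hypothetical `hedge_timeout` in place of the separate `hedged_connection_timeout` and `receive_data_timeout` settings, treats "first result" as the equivalent of the first non-empty packet, and the names `hedged_request` and `query_replica` are made up for the example.

```python
import asyncio


async def hedged_request(replicas, query_replica, hedge_timeout):
    """Query replicas one by one, adding a new one whenever nobody answered within hedge_timeout.

    Returns the first result that arrives and cancels the requests still in flight.
    `query_replica` is a hypothetical coroutine function taking a replica identifier.
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    remaining = list(replicas)
    pending = set()
    try:
        while True:
            if remaining:
                # Start a request to the next replica in the list.
                pending.add(asyncio.create_task(query_replica(remaining.pop(0))))
            # Only keep a deadline while there is still another replica to fall back to.
            timeout = hedge_timeout if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return next(iter(done)).result()
    finally:
        # Whatever is still running lost the race (or an error occurred): cancel it.
        for task in pending:
            task.cancel()
```

A slow first replica therefore costs at most `hedge_timeout` of extra latency before a second request is started, which is the mechanism behind the tail-latency reduction mentioned in the entry.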
Improvement
- Case-insensitive compression methods for table functions. Also fixed LZMA compression method which was checked in upper case. #21416 (Vladimir Chebotarev).
- Add two settings to delay or throw error during insertion when there are too many inactive parts. This is useful when server fails to clean up parts quickly enough. #20178 (Amos Bird).
- Provide better compatibility for mysql clients. 1. mysql jdbc 2. mycli. #21367 (Amos Bird).
- Forbid to drop a column if it's referenced by materialized view. Closes #21164. #21303 (flynn).
- MySQL dictionary source will now retry unexpected connection failures (Lost connection to MySQL server during query) which sometimes happen on SSL/TLS connections. #21237 (Alexander Kazakov).
- Usability improvement: more consistent `DateTime64` parsing: recognize the case when unix timestamp with subsecond resolution is specified as scaled integer (like `1111111111222` instead of `1111111111.222`). This closes #13194. #21053 (alexey-milovidov).
- Do only merging of sorted blocks on initiator with distributed_group_by_no_merge. #20882 (Azat Khuzhin).
- When loading config for mysql source ClickHouse will now randomize the list of replicas with the same priority to ensure the round-robin logics of picking mysql endpoint. This closes #20629. #20632 (Alexander Kazakov).
- Function 'reinterpretAs(x, Type)' renamed into 'reinterpret(x, Type)'. #20611 (Maksim Kita).
- Support vhost for RabbitMQ engine #20576. #20596 (Kseniia Sumarokova).
- Improved serialization for data types combined of Arrays and Tuples. Improved matching enum data types to protobuf enum type. Fixed serialization of the `Map` data type. Omitted values are now set by default. #20506 (Vitaly Baranov).
- Fixed race between execution of distributed DDL tasks and cleanup of DDL queue. Now DDL task cannot be removed from ZooKeeper if there are active workers. Fixes #20016. #20448 (tavplubix).
- Make FQDN and other DNS related functions work correctly in alpine images. #20336 (filimonov).
- Do not allow early constant folding of explicitly forbidden functions. #20303 (Azat Khuzhin).
- Implicit conversion from integer to Decimal type might succeed even if the integer value does not fit into the Decimal type. Now it throws `ARGUMENT_OUT_OF_BOUND`. #20232 (tavplubix).
- Lockless `SYSTEM FLUSH DISTRIBUTED`. #20215 (Azat Khuzhin).
- Normalize count(constant), sum(1) to count(). This is needed for projection query routing. #20175 (Amos Bird).
- Support all native integer types in bitmap functions. #20171 (Amos Bird).
- Updated `CacheDictionary`, `ComplexCacheDictionary`, `SSDCacheDictionary`, `SSDComplexKeyDictionary` to use LRUHashMap as underlying index. #20164 (Maksim Kita).
- The setting `access_management` is now configurable on startup by providing `CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT`, defaults to disabled (`0`) which was the prior value. #20139 (Marquitos).
- Fix toDateTime64(toDate()/toDateTime()) for DateTime64 - Implement DateTime64 clamping to match DateTime behaviour. #20131 (Azat Khuzhin).
- Quota improvements: SHOW TABLES is now considered as one query in the quota calculations, not two queries. SYSTEM queries now consume quota. Fix calculation of interval's end in quota consumption. #20106 (Vitaly Baranov).
- Supports `path IN (set)` expressions for `system.zookeeper` table. #20105 (小路).
- Show full details of `MaterializeMySQL` tables in `system.tables`. #20051 (Stig Bakken).
- Fix data race in executable dictionary that was possible only on misuse (when the script returns data ignoring its input). #20045 (alexey-milovidov).
- The value of MYSQL_OPT_RECONNECT option can now be controlled by "opt_reconnect" parameter in the config section of mysql replica. #19998 (Alexander Kazakov).
- If user calls `JSONExtract` function with `Float32` type requested, allow inaccurate conversion to the result type. For example the number `0.1` in JSON is double precision and is not representable in Float32, but the user still wants to get it. Previous versions return 0 for non-Nullable type and NULL for Nullable type to indicate that conversion is imprecise. The logic was 100% correct but it was surprising to users and leading to questions. This closes #13962. #19960 (alexey-milovidov).
- Add conversion of block structure for INSERT into Distributed tables if it does not match. #19947 (Azat Khuzhin).
- Improvement for the `system.distributed_ddl_queue` table. Initialize MaxDDLEntryID to the last value after restarting. Before this PR, MaxDDLEntryID will remain zero until a new DDLTask is processed. #19924 (Amos Bird).
- Show `MaterializeMySQL` tables in `system.parts`. #19770 (Stig Bakken).
- Add separate config directive for `Buffer` profile. #19721 (Azat Khuzhin).
- Move conditions that are not related to JOIN to WHERE clause. #18720. #19685 (hexiaoting).
- Add ability to throttle INSERT into Distributed based on amount of pending bytes for async send (`bytes_to_delay_insert`/`max_delay_to_insert` and `bytes_to_throw_insert` settings for `Distributed` engine have been added). #19673 (Azat Khuzhin).
- Fix some rare cases when write errors can be ignored in destructors. #19451 (Azat Khuzhin).
- Print inline frames in stack traces for fatal errors. #19317 (Ivan).
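The `JSONExtract` item above rests on the fact that `0.1` has no exact Float32 representation. A minimal Python sketch of that rounding, using only the standard library; the helper name `to_float32` is hypothetical and this is an illustration, not ClickHouse code:

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest Float32 value."""
    # Packing as a 4-byte float and unpacking again drops the extra precision.
    return struct.unpack("f", struct.pack("f", x))[0]


# 0.1 changes when squeezed into Float32, so an "accurate" conversion fails.
print(to_float32(0.1) == 0.1)  # False
# 0.5 is a power of two and survives the conversion unchanged.
print(to_float32(0.5) == 0.5)  # True
```

The difference for `0.1` is tiny, which is why accepting the inexact value is usually what the caller wants.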
Bug Fix
- Fix redundant reconnects to ZooKeeper and the possibility of two active sessions for a single clickhouse server. Both problems were introduced in #14678. #21264 (alesapin).
- Fix error `Bad cast from type ... to DB::ColumnLowCardinality` while inserting into table with `LowCardinality` column from `Values` format. Fixes #21140. #21357 (Nikolai Kochetov).
- Fix a deadlock in `ALTER DELETE` mutations for non replicated MergeTree table engines when the predicate contains the table itself. Fixes #20558. #21477 (alesapin).
- Fix SIGSEGV for distributed queries on failures. #21434 (Azat Khuzhin).
- Now `ALTER MODIFY COLUMN` queries will correctly affect changes in partition key, skip indices, TTLs, and so on. Fixes #13675. #21334 (alesapin).
- Fix bug with `join_use_nulls` and joining `TOTALS` from subqueries. This closes #19362 and #21137. #21248 (vdimir).
- Fix crash in `EXPLAIN` for query with `UNION`. Fixes #20876, #21170. #21246 (flynn).
- Mutations are now allowed only for table engines that support them (MergeTree family, Memory, MaterializedView). Other engines will report a clearer error. Fixes #21168. #21183 (alesapin).
- Fixes #21112. Fixed bug that could cause duplicates with insert query (if one of the callbacks came a little too late). #21138 (Kseniia Sumarokova).
- Fix `input_format_null_as_default` to take effect when types are nullable. This fixes #21116. #21121 (Amos Bird).
- Fix bug related to cast of Tuple to Map. Closes #21029. #21120 (hexiaoting).
- Fix the metadata leak when the Replicated*MergeTree with custom (non default) ZooKeeper cluster is dropped. #21119 (fastio).
- Fix type mismatch issue when using LowCardinality keys in joinGet. This fixes #21114. #21117 (Amos Bird).
- Fix `default_replica_path` and `default_replica_name` values having no effect on Replicated(*)MergeTree engines when the engine needs to specify other parameters. #21060 (mxzlxy).
- Out of bound memory access was possible when formatting specifically crafted out of range value of type `DateTime64`. This closes #20494. This closes #20543. #21023 (alexey-milovidov).
- Block parallel insertions into storage join. #21009 (vdimir).
- Fixed behaviour when `ALTER MODIFY COLUMN` created a mutation that would knowingly fail. #21007 (Anton Popov).
- Closes #9969. Fixed Brotli http compression error, which reproduced for large data sizes, slightly complicated structure and with json output format. Update Brotli to the latest version to include the "fix rare access to uninitialized data in ring-buffer". #20991 (Kseniia Sumarokova).
- Fix 'Empty task was returned from async task queue' on query cancellation. #20881 (Azat Khuzhin).
- `USE database;` query did not work when using MySQL 5.7 client to connect to ClickHouse server, it's fixed. Fixes #18926. #20878 (tavplubix).
- Fix usage of `-Distinct` combinator with `-State` combinator in aggregate functions. #20866 (Anton Popov).
- Fix subquery with union distinct and limit clause. Closes #20597. #20610 (flynn).
- Fixed inconsistent behavior of dictionary in case of queries where we look for absent keys in dictionary. #20578 (Nikita Mikhaylov).
- Fix the number of threads for scalar subqueries and subqueries for index (after #19007 single thread was always used). Fixes #20457, #20512. #20550 (Nikolai Kochetov).
- Fix crash which could happen if unknown packet was received from remote query (was introduced in #17868). #20547 (Azat Khuzhin).
- Add proper checks while parsing directory names for async INSERT (fixes SIGSEGV). #20498 (Azat Khuzhin).
- Fix function `transform` not working properly for floating point keys. Closes #20460. #20479 (flynn).
- Fix infinite loop when propagating WITH aliases to subqueries. This fixes #20388. #20476 (Amos Bird).
- Fix abnormal server termination when http client goes away. #20464 (Azat Khuzhin).
- Fix `LOGICAL_ERROR` for `join_use_nulls=1` when JOIN contains const from SELECT. #20461 (Azat Khuzhin).
- Check if table function `view` is used in expression list and throw an error. This fixes #20342. #20350 (Amos Bird).
- Avoid invalid dereference in RANGE_HASHED() dictionary. #20345 (Azat Khuzhin).
- Fix null dereference with `join_use_nulls=1`. #20344 (Azat Khuzhin).
- Fix incorrect result of binary operations between two constant decimals of different scale. Fixes #20283. #20339 (Maksim Kita).
- Fix too frequent retries of failed background tasks for `ReplicatedMergeTree` table engines family. This could lead to too verbose logging and increased CPU load. Fixes #20203. #20335 (alesapin).
- Forbid `DROP` or `RENAME` of the version column of `*CollapsingMergeTree` and `ReplacingMergeTree` table engines. #20300 (alesapin).
- Fixed the behavior when, in case of broken JSON, we tried to read the whole file into memory, which led to an exception from the allocator. Fixes #19719. #20286 (Nikita Mikhaylov).
- Fix exception during vertical merge for `MergeTree` table engines family which don't allow to perform vertical merges. Fixes #20259. #20279 (alesapin).
- Fix rare server crash on config reload during the shutdown. Fixes #19689. #20224 (alesapin).
- Fix CTE when using in INSERT SELECT. This fixes #20187, fixes #20195. #20211 (Amos Bird).
- Fixes #19314. #20156 (Ivan).
- Fix `toMinute` function to handle special timezones correctly. #20149 (keenwolf).
- Fix server crash after query with `if` function with `Tuple` type of then/else branches result. `Tuple` type must contain `Array` or another complex type. Fixes #18356. #20133 (alesapin).
- The `MongoDB` table engine now establishes connection only when it's going to read data. `ATTACH TABLE` won't try to connect anymore. #20110 (Vitaly Baranov).
- Bugfix in StorageJoin. #20079 (vdimir).
- Fix the case where, when calculating the modulo of a negative number by a small divisor, the resulting data type was not large enough to accommodate the negative result. This closes #20052. #20067 (alexey-milovidov).
- MaterializeMySQL: Fix replication for statements that update several tables. #20066 (Håvard Kvålen).
- Prevent "Connection refused" in docker during initialization script execution. #20012 (filimonov).
- `EmbeddedRocksDB` is an experimental storage. Fix the issue with lack of proper type checking. Simplified code. This closes #19967. #19972 (alexey-milovidov).
- Fix a segfault in function `fromModifiedJulianDay` when the argument type is `Nullable(T)` for any integral types other than Int32. #19959 (PHO).
- BloomFilter index crash fix. Fixes #19757. #19884 (Maksim Kita).
- Deadlock was possible if system.text_log is enabled. This fixes #19874. #19875 (alexey-milovidov).
- Fix starting the server with tables having default expressions containing dictGet(). Allow getting return type of dictGet() without loading dictionary. #19805 (Vitaly Baranov).
- Fix clickhouse-client abort exception while executing only `select`. #19790 (taiyang-li).
- Fix a bug where moving pieces to the destination table may fail when multiple clickhouse-copiers are launched. #19743 (madianjun).
- Background thread which executes `ON CLUSTER` queries might hang waiting for dropped replicated table to do something. It's fixed. #19684 (yiguolei).
Build/Testing/Packaging Improvement
- Allow to build ClickHouse with AVX-2 enabled globally. It gives slight performance benefits on modern CPUs. Not recommended for production and will not be supported as official build for now. #20180 (alexey-milovidov).
- Fix some of the issues found by Coverity. See #19964. #20010 (alexey-milovidov).
- Allow to start up with modified binary under gdb. In previous versions, if you set up a breakpoint in gdb before start, the server refused to start up due to a failed integrity check. #21258 (alexey-milovidov).
- Add a test for different compression methods in Kafka. #21111 (filimonov).
- Fixed port clash from test_storage_kerberized_hdfs test. #19974 (Ilya Yatsishin).
- Print `stdout` and `stderr` to log when failed to start docker in integration tests. Before this PR there was a very short error message in this case which didn't help to investigate the problems. #20631 (Vitaly Baranov).
ClickHouse release 21.2
ClickHouse release v21.2.2.8-stable, 2021-02-07
Backward Incompatible Change
- Bitwise functions (`bitAnd`, `bitOr`, etc) are forbidden for floating point arguments. Now you have to do explicit cast to integer. #19853 (Azat Khuzhin).
- Forbid `lcm`/`gcd` for floats. #19532 (Azat Khuzhin).
- Fix memory tracking for `OPTIMIZE TABLE`/merges; account query memory limits and sampling for `OPTIMIZE TABLE`/merges. #18772 (Azat Khuzhin).
- Disallow floating point column as partition key, see #18421. #18464 (hexiaoting).
- Excessive parentheses in type definitions are no longer supported, example: `Array((UInt8))`.
New Feature
- Added `PostgreSQL` table engine (both select/insert, with support for multidimensional arrays), also as table function. Added `PostgreSQL` dictionary source. Added `PostgreSQL` database engine. #18554 (Kseniia Sumarokova).
- Data type `Nested` now supports arbitrary levels of nesting. Introduced subcolumns of complex types, such as `size0` in `Array`, `null` in `Nullable`, names of `Tuple` elements, which can be read without reading of whole column. #17310 (Anton Popov).
- Added `Nullable` support for `FlatDictionary`, `HashedDictionary`, `ComplexKeyHashedDictionary`, `DirectDictionary`, `ComplexKeyDirectDictionary`, `RangeHashedDictionary`. #18236 (Maksim Kita).
- Adds a new table called `system.distributed_ddl_queue` that displays the queries in the DDL worker queue. #17656 (Bharat Nallan).
- Added support of mapping LDAP group names, and attribute values in general, to local roles for users from ldap user directories. #17211 (Denis Glazachev).
- Support insert into table function `cluster`, and for both table functions `remote` and `cluster`, support distributing data across nodes by specifying a sharding key. Closes #16752. #18264 (flynn).
- Add function `decodeXMLComponent` to decode characters for XML. Example: `SELECT decodeXMLComponent('Hello,"world"!')` #17659. #18542 (nauta).
- Added functions `parseDateTimeBestEffortUSOrZero`, `parseDateTimeBestEffortUSOrNull`. #19712 (Maksim Kita).
- Add `sign` math function. #19527 (flynn).
- Add information about used features (functions, table engines, etc) into system.query_log. #18495. #19371 (Kseniia Sumarokova).
- Function `formatDateTime` supports the `%Q` modifier to format date to quarter. #19224 (Jianmei Zhang).
- Support MetaKey+Enter hotkey binding in play UI. #19012 (sundyli).
- Add three functions for map data type: 1. `mapContains(map, key)` to check whether map.keys include the second parameter key. 2. `mapKeys(map)` returns all the keys in Array format. 3. `mapValues(map)` returns all the values in Array format. #18788 (hexiaoting).
- Add `log_comment` setting related to #18494. #18549 (Zijie Lu).
- Add support of tuple argument to `argMin` and `argMax` functions. #17359 (Ildus Kurbangaliev).
- Support `EXISTS VIEW` syntax. #18552 (Du Chuan).
- Add `SELECT ALL` syntax. Closes #18706. #18723 (flynn).
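The three map functions listed above behave much like ordinary dictionary operations. A rough Python analogy, not ClickHouse code; the helper names `map_contains`, `map_keys` and `map_values` are hypothetical stand-ins for `mapContains`, `mapKeys` and `mapValues`:

```python
def map_contains(m: dict, key) -> bool:
    """Analogue of mapContains(map, key): is the key present?"""
    return key in m


def map_keys(m: dict) -> list:
    """Analogue of mapKeys(map): all keys as an array."""
    return list(m.keys())


def map_values(m: dict) -> list:
    """Analogue of mapValues(map): all values as an array."""
    return list(m.values())


m = {"name": "eleven", "age": "11"}
print(map_contains(m, "name"))  # True
print(map_keys(m))              # ['name', 'age']
print(map_values(m))            # ['eleven', '11']
```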
Performance Improvement
- Faster parts removal by lowering the number of `stat` syscalls. This returns the optimization that existed a while ago. Safer interface of `IDisk`. This closes #19065. #19086 (alexey-milovidov).
- Aliases declared in `WITH` statement are properly used in index analysis. Queries like `WITH column AS alias SELECT ... WHERE alias = ...` may use index now. #18896 (Amos Bird).
- Add `optimize_alias_column_prediction` (on by default), that will:
  - Respect aliased columns in WHERE during partition pruning and skipping data using secondary indexes;
  - Respect aliased columns in WHERE for trivial count queries for optimize_trivial_count;
  - Respect aliased columns in GROUP BY/ORDER BY for optimize_aggregation_in_order/optimize_read_in_order. #16995 (sundyli).
- Speed up aggregate function `sum`. Improvement only visible on synthetic benchmarks and not very practical. #19216 (alexey-milovidov).
- Update libc++ and use another ABI to provide better performance. #18914 (Danila Kutenin).
- Rewrite `sumIf()` and `sum(if())` function to `countIf()` function when logically equivalent. #17041 (flynn).
- Use a connection pool for S3 connections, controlled by the `s3_max_connections` setting. #13405 (Vladimir Chebotarev).
- Add support for zstd long option for better compression of string columns to save space. #17184 (ygrek).
- Slightly improve server latency by removing access to configuration on every connection. #19863 (alexey-milovidov).
- Reduce lock contention for multiple layers of the `Buffer` engine. #19379 (Azat Khuzhin).
- Support splitting `Filter` step of query plan into `Expression + Filter` pair. Together with `Expression + Expression` merging optimization (#17458) it may delay execution for some expressions after `Filter` step. #19253 (Nikolai Kochetov).
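The `sumIf()`/`sum(if())` to `countIf()` rewrite listed above relies on a simple identity: summing a constant 1 over the rows that match a condition is the same as counting those rows. A small Python sketch of that identity, assuming the equivalent forms are `sumIf(1, cond)` and `sum(if(cond, 1, 0))` (the exact patterns the optimizer recognises are not spelled out in the changelog); `sum_if` and `count_if` are hypothetical helpers:

```python
def sum_if(values, conds):
    """Analogue of sumIf(value, cond): sum values on rows where cond holds."""
    return sum(v for v, c in zip(values, conds) if c)


def count_if(conds):
    """Analogue of countIf(cond): number of rows where cond holds."""
    return sum(1 for c in conds if c)


xs = [3, 8, 15, 4, 23]
conds = [x > 5 for x in xs]

a = sum_if([1] * len(xs), conds)        # sumIf(1, x > 5)
b = sum(1 if c else 0 for c in conds)   # sum(if(x > 5, 1, 0))
c = count_if(conds)                     # countIf(x > 5)
print(a, b, c)  # 3 3 3
```

Counting avoids reading and adding a value column, which is presumably where the gain comes from.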
Improvement
- `SELECT count() FROM table` can now be executed if at least one column (any column) can be selected from the `table`. This PR fixes #10639. #18233 (Vitaly Baranov).
- Set charset to `utf8mb4` when interacting with remote MySQL servers. Fixes #19795. #19800 (alexey-milovidov).
- `S3` table function now supports `auto` compression mode (autodetect). This closes #18754. #19793 (Vladimir Chebotarev).
- Correctly output infinite arguments for `formatReadableTimeDelta` function. In previous versions, there was implicit conversion to implementation specific integer value. #19791 (alexey-milovidov).
- Table function `S3` will use global region if the region can't be determined exactly. This closes #10998. #19750 (Vladimir Chebotarev).
- In distributed queries if the setting `async_socket_for_remote` is enabled, it was possible to get stack overflow at least in debug build configuration if very deeply nested data type is used in table (e.g. `Array(Array(Array(...more...)))`). This fixes #19108. This change introduces minor backward incompatibility: excessive parentheses in type definitions are no longer supported, example: `Array((UInt8))`. #19736 (alexey-milovidov).
- Add separate pool for message brokers (RabbitMQ and Kafka). #19722 (Azat Khuzhin).
- Fix rare `max_number_of_merges_with_ttl_in_pool` limit overrun (more merges with TTL can be assigned) for non-replicated MergeTree. #19708 (alesapin).
- Dictionary: better error message during attribute parsing. #19678 (Maksim Kita).
- Add an option to disable validation of checksums on reading. Should never be used in production. Please do not expect any benefits in disabling it. It may only be used for experiments and benchmarks. The setting only applicable for tables of MergeTree family. Checksums are always validated for other table engines and when receiving data over network. In my observations there is no performance difference or it is less than 0.5%. #19588 (alexey-milovidov).
- Support constant result in function `multiIf`. #19533 (Maksim Kita).
- Enable functions length/empty/notEmpty for datatype Map, which return the number of keys in the Map. #19530 (taiyang-li).
- Add `--reconnect` option to `clickhouse-benchmark`. When this option is specified, it will reconnect before every request. This is needed for testing. #19872 (alexey-milovidov).
- Support using the new location of `.debug` file. This fixes #19348. #19520 (Amos Bird).
- `toIPv6` function parses `IPv4` addresses. #19518 (Bharat Nallan).
- Add `http_referer` field to `system.query_log`, `system.processes`, etc. This closes #19389. #19390 (alexey-milovidov).
- Improve MySQL compatibility by making more functions case insensitive and adding aliases. #19387 (Daniil Kondratyev).
- Add metrics for MergeTree parts (Wide/Compact/InMemory) types. #19381 (Azat Khuzhin).
- Allow docker to be executed with arbitrary uid. #19374 (filimonov).
- Fix wrong alignment of values of `IPv4` data type in Pretty formats. They were aligned to the right, not to the left. This closes #19184. #19339 (alexey-milovidov).
- Allow changing `max_server_memory_usage` without restart. This closes #18154. #19186 (alexey-milovidov).
- The exception when function `bar` is called with certain NaN argument could be slightly misleading in previous versions. This fixes #19088. #19107 (alexey-milovidov).
- Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server images. #19096 (filimonov).
- Fixed `PeekableReadBuffer: Memory limit exceed` error when inserting data with huge strings. Fixes #18690. #18979 (tavplubix).
- Docker image: several improvements for clickhouse-server entrypoint. #18954 (filimonov).
- Add `normalizeQueryKeepNames` and `normalizedQueryHashKeepNames` to normalize queries without masking long names with `?`. This helps better analyze complex query logs. #18910 (Amos Bird).
- Check per-block checksum of the distributed batch on the sender before sending (without reading the file twice, the checksums will be verified while reading). This avoids the INSERT getting stuck on the receiver (on a truncated .bin file on the sender). Avoid reading .bin files twice for batched INSERT (it was required to calculate rows/bytes to take squashing into account; now this information is included in the header, and backward compatibility is preserved). #18853 (Azat Khuzhin).
- Fix issues with RIGHT and FULL JOIN of tables with aggregate function states. In previous versions exception about `cloneResized` method was thrown. #18818 (templarzq).
- Added prefix-based S3 endpoint settings. #18812 (Vladimir Chebotarev).
- Add [UInt8, UInt16, UInt32, UInt64] arguments types support for bitmapTransform, bitmapSubsetInRange, bitmapSubsetLimit, bitmapContains functions. This closes #18713. #18791 (sundyli).
- Allow CTE (Common Table Expressions) to be further aliased. Propagate CSE (Common Subexpressions Elimination) to subqueries in the same level when `enable_global_with_statement = 1`. This fixes #17378. This fixes https://github.com/ClickHouse/ClickHouse/pull/16575#issuecomment-753416235. #18684 (Amos Bird).
- Update librdkafka to v1.6.0-RC2. Fixes #18668. #18671 (filimonov).
- In case of unexpected exceptions automatically restart background thread which is responsible for execution of distributed DDL queries. Fixes #17991. #18285 (徐炘).
- Updated AWS C++ SDK in order to utilize global regions in S3. #17870 (Vladimir Chebotarev).
- Added support for `WITH ... [AND] [PERIODIC] REFRESH [interval_in_sec]` clause when creating `LIVE VIEW` tables. #14822 (vzakaznikov).
- Restrict `MODIFY TTL` queries for `MergeTree` tables created in old syntax. Previously the query succeeded, but actually it had no effect. #19064 (Anton Popov).
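For the `toIPv6` item above, one common way to express an IPv4 address as IPv6 is the IPv4-mapped form `::ffff:a.b.c.d`. The sketch below uses Python's standard `ipaddress` module to show that mapping; that ClickHouse produces exactly this form is an assumption here, and `to_ipv6` is a hypothetical helper, not the server's implementation:

```python
import ipaddress


def to_ipv6(text: str) -> ipaddress.IPv6Address:
    """Parse an IPv4 or IPv6 string and always return an IPv6 address."""
    addr = ipaddress.ip_address(text)
    if isinstance(addr, ipaddress.IPv4Address):
        # Place the 32-bit IPv4 value into the ::ffff:0:0/96 (IPv4-mapped) range.
        return ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr


mapped = to_ipv6("127.0.0.1")
print(mapped.ipv4_mapped)  # 127.0.0.1
print(to_ipv6("::1"))      # ::1
```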
Bug Fix
- Fix index analysis of binary functions with constant argument which leads to wrong query results. This fixes #18364. #18373 (Amos Bird).
- Fix starting the server with tables having default expressions containing dictGet(). Allow getting return type of dictGet() without loading dictionary. #19805 (Vitaly Baranov).
- Fix server crash after query with `if` function with `Tuple` type of then/else branches result. `Tuple` type must contain `Array` or another complex type. Fixes #18356. #20133 (alesapin).
- `MaterializeMySQL` (experimental feature): Fix replication for statements that update several tables. #20066 (Håvard Kvålen).
- Prevent "Connection refused" in docker during initialization script execution. #20012 (filimonov).
- `EmbeddedRocksDB` is an experimental storage. Fix the issue with lack of proper type checking. Simplified code. This closes #19967. #19972 (alexey-milovidov).
- Fix a segfault in function `fromModifiedJulianDay` when the argument type is `Nullable(T)` for any integral types other than Int32. #19959 (PHO).
- The function `greatCircleAngle` returned inaccurate results in previous versions. This closes #19769. #19789 (alexey-milovidov).
- Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption. Fixes #19593. #19702 (alesapin).
- Background thread which executes `ON CLUSTER` queries might hang waiting for dropped replicated table to do something. It's fixed. #19684 (yiguolei).
- Fix wrong deserialization of columns description. It made INSERT into a table with a column named `\` impossible. #19479 (alexey-milovidov).
- Mark distributed batch as broken in case of empty data block in one of files. #19449 (Azat Khuzhin).
- Fixed very rare bug that might cause mutation to hang after `DROP/DETACH/REPLACE/MOVE PARTITION`. It was partially fixed by #15537 for most cases. #19443 (tavplubix).
- Fix possible error `Extremes transform was already added to pipeline`. Fixes #14100. #19430 (Nikolai Kochetov).
- Fix default value in join types with non-zero default (e.g. some Enums). Closes #18197. #19360 (vdimir).
- Do not mark file for distributed send as broken on EOF. #19290 (Azat Khuzhin).
- Fix leaking of pipe fd for `async_socket_for_remote`. #19153 (Azat Khuzhin).
- Fix infinite reading from file in `ORC` format (was introduced in #10580). Fixes #19095. #19134 (Nikolai Kochetov).
- Fix issue in merge tree data writer which can lead to marks with bigger size than fixed granularity size. Fixes #18913. #19123 (alesapin).
- Fix startup bug when clickhouse was not able to read compression codec from `LowCardinality(Nullable(...))` and threw exception `Attempt to read after EOF`. Fixes #18340. #19101 (alesapin).
- Simplify the implementation of `tupleHammingDistance`. Support for tuples of any equal length. Fixes #19029. #19084 (Nikolai Kochetov).
- Make sure `groupUniqArray` returns correct type for argument of Enum type. This closes #17875. #19019 (alexey-milovidov).
- Fix possible error `Expected single dictionary argument for function` when using function `ignore` with `LowCardinality` argument. Fixes #14275. #19016 (Nikolai Kochetov).
- Fix inserting of `LowCardinality` column to table with `TinyLog` engine. Fixes #18629. #19010 (Nikolai Kochetov).
- Fix minor issue in JOIN: Join tries to materialize const columns, but our code waits for them in other places. #18982 (Nikita Mikhaylov).
- Disable `optimize_move_functions_out_of_any` because optimization is not always correct. This closes #18051. This closes #18973. #18981 (alexey-milovidov).
- Fix possible exception `QueryPipeline stream: different number of columns` caused by merging of query plan's `Expression` steps. Fixes #18190. #18980 (Nikolai Kochetov).
- Fixed very rare deadlock at shutdown. #18977 (tavplubix).
- Fixed rare crashes when server ran out of memory. #18976 (tavplubix).
- Fix incorrect behavior when `ALTER TABLE ... DROP PART 'part_name'` query removes all deduplication blocks for the whole partition. Fixes #18874. #18969 (alesapin).
- Fixed issue #18894. Add a check to avoid exception when long column alias ('table.column' style, usually auto-generated by BI tools like Looker) equals long table name. #18968 (Daniel Qin).
- Fix error `Task was not found in task queue` (possible only for remote queries, with `async_socket_for_remote = 1`). #18964 (Nikolai Kochetov).
- Fix bug when mutation with some escaped text (like `ALTER ... UPDATE e = CAST('foo', 'Enum8(\'foo\' = 1')`) was serialized incorrectly. Fixes #18878. #18944 (alesapin).
- ATTACH PARTITION will reset mutations. #18804. #18935 (fastio).
- Fix issue with `bitmapOrCardinality` that may lead to nullptr dereference. This closes #18911. #18912 (sundyli).
- Fixed `Attempt to read after eof` error when trying to `CAST` `NULL` from `Nullable(String)` to `Nullable(Decimal(P, S))`. Now function `CAST` returns `NULL` when it cannot parse decimal from nullable string. Fixes #7690. #18718 (Winter Zhang).
- Fix data type convert issue for MySQL engine. #18124 (bo zeng).
- Fix clickhouse-client abort exception while executing only `select`. #19790 (taiyang-li).
Build/Testing/Packaging Improvement
- Run SQLancer (logical SQL fuzzer) in CI. #19006 (Ilya Yatsishin).
- Query Fuzzer will fuzz newly added tests more extensively. This closes #18916. #19185 (alexey-milovidov).
- Integrate with Big List of Naughty Strings for better fuzzing. #19480 (alexey-milovidov).
- Add integration tests run with MSan. #18974 (alesapin).
- Fixed MemorySanitizer errors in cyrus-sasl and musl. #19821 (Ilya Yatsishin).
- Insufficient arguments check in `positionCaseInsensitiveUTF8` function triggered address sanitizer. #19720 (alexey-milovidov).
- Remove --project-directory for docker-compose in integration test. Fix logs formatting from docker container. #19706 (Ilya Yatsishin).
- Made generation of macros.xml easier for integration tests. No more excessive logging from dicttoxml. dicttoxml project is not active for 5+ years. #19697 (Ilya Yatsishin).
- Allow to explicitly enable or disable watchdog via environment variable `CLICKHOUSE_WATCHDOG_ENABLE`. By default it is enabled if server is not attached to terminal. #19522 (alexey-milovidov).
- Allow building ClickHouse with Kafka support on arm64. #19369 (filimonov).
- Allow building librdkafka without ssl. #19337 (filimonov).
- Restore Kafka input in FreeBSD builds. #18924 (Alexandre Snarskii).
- Fix potential nullptr dereference in table function `VALUES`. #19357 (alexey-milovidov).
- Avoid UBSan reports in `arrayElement` function, `substring` and `arraySum`. Fixes #19305. Fixes #19287. This closes #19336. #19347 (alexey-milovidov).
ClickHouse release 21.1
ClickHouse release v21.1.3.32-stable, 2021-02-03
Bug Fix
- BloomFilter index crash fix. Fixes #19757. #19884 (Maksim Kita).
- Fix crash when pushing down predicates to union distinct subquery. This fixes #19855. #19861 (Amos Bird).
- Fix filtering by UInt8 greater than 127. #19799 (Anton Popov).
- In previous versions, unusual arguments for function arrayEnumerateUniq may cause crash or infinite loop. This closes #19787. #19788 (alexey-milovidov).
- Fixed stack overflow when using accurate comparison of arithmetic type with string type. #19773 (tavplubix).
- Fix crash when nested column name was used in `WHERE` or `PREWHERE`. Fixes #19755. #19763 (Nikolai Kochetov).
- Fix a segmentation fault in `bitmapAndnot` function. Fixes #19668. #19713 (Maksim Kita).
- Some functions with big integers may cause segfault. Big integers are an experimental feature. This closes #19667. #19672 (alexey-milovidov).
- Fix wrong result of function `neighbor` for `LowCardinality` argument. Fixes #10333. #19617 (Nikolai Kochetov).
- Fix use-after-free of the CompressedWriteBuffer in Connection after disconnect. #19599 (Azat Khuzhin).
- `DROP/DETACH TABLE table ON CLUSTER cluster SYNC` query might hang, it's fixed. Fixes #19568. #19572 (tavplubix).
- Query CREATE DICTIONARY id expression fix. #19571 (Maksim Kita).
- Fix SIGSEGV with merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read=0/UINT64_MAX. #19528 (Azat Khuzhin).
- Buffer overflow (on memory read) was possible if `addMonth` function was called with specifically crafted arguments. This fixes #19441. This fixes #19413. #19472 (alexey-milovidov).
- Uninitialized memory read was possible in encrypt/decrypt functions if empty string was passed as IV. This closes #19391. #19397 (alexey-milovidov).
- Fix possible buffer overflow in Uber H3 library. See https://github.com/uber/h3/issues/392. This closes #19219. #19383 (alexey-milovidov).
- Fix system.parts _state column (LOGICAL_ERROR when querying this column, due to incorrect order). #19346 (Azat Khuzhin).
- Fixed possible wrong result or segfault on aggregation when Materialized View and its target table have different structure. Fixes #18063. #19322 (tavplubix).
- Fix error `Cannot convert column now64() because it is constant but values of constants are different in source and result`. Continuation of #7156. #19316 (Nikolai Kochetov).
- Fix bug when concurrent `ALTER` and `DROP` queries may hang while processing ReplicatedMergeTree table. #19237 (alesapin).
- Fixed `There is no checkpoint` error when inserting data through http interface using `Template` or `CustomSeparated` format. Fixes #19021. #19072 (tavplubix).
- Disable constant folding for subqueries on the analysis stage, when the result cannot be calculated. #18446 (Azat Khuzhin).
- Mutation might hang waiting for some non-existent part after `MOVE` or `REPLACE PARTITION` or, in rare cases, after `DETACH` or `DROP PARTITION`. It's fixed. #15537 (tavplubix).
ClickHouse release v21.1.2.15-stable 2021-01-18
Backward Incompatible Change
- The setting `input_format_null_as_default` is enabled by default. #17525 (alexey-milovidov).
- Check settings constraints for profile settings from config. Server will fail to start if users.xml contains settings that do not meet corresponding constraints. #18486 (tavplubix).
- Restrict `ALTER MODIFY SETTING` from changing storage settings that affect data parts (`write_final_mark` and `enable_mixed_granularity_parts`). #18306 (Amos Bird).
- Set `insert_quorum_parallel` to 1 by default. It is significantly more convenient to use than "sequential" quorum inserts. But if you rely on sequential consistency, you should set the setting back to zero. #17567 (alexey-milovidov).
- Remove `sumburConsistentHash` function. This closes #18120. #18656 (alexey-milovidov).
- Removed aggregate functions `timeSeriesGroupSum`, `timeSeriesGroupRateSum` because a friend of mine said they never worked. This fixes #16869. If you have luck using these functions, write an email to clickhouse-feedback@yandex-team.com. #17423 (alexey-milovidov).
- Prohibit toUnixTimestamp(Date()) (before it just returned the UInt16 representation of Date). #17376 (Azat Khuzhin).
- Allow using extended integer types (`Int128`, `Int256`, `UInt256`) in `avg` and `avgWeighted` functions. Also allow using different types (integer, decimal, floating point) for value and for weight in `avgWeighted` function. This is a backward-incompatible change: now the `avg` and `avgWeighted` functions always return `Float64` (as documented). Before this change the return type for `Decimal` arguments was also `Decimal`. #15419 (Mike).
- Expression `toUUID(N)` no longer works. Replace with `toUUID('00000000-0000-0000-0000-000000000000')`. This change is motivated by non-obvious results of `toUUID(N)` where N is non zero.
- SSL Certificates with incorrect "key usage" are rejected. In previous versions they used to work. See #19262.
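To illustrate the `avg`/`avgWeighted` item above: values and weights may be of different numeric kinds, and the result is always a floating point number. A minimal Python sketch under those assumptions; `avg_weighted` is a hypothetical helper, and returning NaN for a zero total weight is an assumption of this sketch rather than documented behaviour:

```python
from decimal import Decimal


def avg_weighted(values, weights) -> float:
    """Weighted mean that accepts int, float or Decimal and returns a float."""
    total_weight = sum(float(w) for w in weights)
    if total_weight == 0:
        return float("nan")  # assumption: undefined average is reported as NaN
    weighted_sum = sum(float(v) * float(w) for v, w in zip(values, weights))
    return weighted_sum / total_weight


# Mixed Decimal / int / float inputs: (1.5*2 + 4*1 + 2.5*1) / 4
result = avg_weighted([Decimal("1.5"), 4, 2.5], [2, 1, Decimal("1")])
print(result, type(result).__name__)  # 2.375 float
```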
New Feature
- Implement gRPC protocol in ClickHouse. #15111 (Vitaly Baranov).
- Allow to use multiple zookeeper clusters. #17070 (fastio).
- Implemented `REPLACE TABLE` and `CREATE OR REPLACE TABLE` queries. #18521 (tavplubix).
- Implement `UNION DISTINCT` and treat the plain `UNION` clause as `UNION DISTINCT` by default. Add a setting `union_default_mode` that allows to treat it as `UNION ALL` or require explicit mode specification. #16338 (flynn).
- Added function `accurateCastOrNull`. This closes #10290. Add type conversions in `x IN (subquery)` expressions. This closes #10266. #16724 (Maksim Kita).
- IP Dictionary supports `IPv4`/`IPv6` types directly. #17571 (vdimir).
- IP Dictionary supports key fetching. Resolves #18241. #18480 (vdimir).
- Add `*.zst` compression/decompression support for data import and export. It enables using `*.zst` in `file()` function and `Content-encoding: zstd` in HTTP client. This closes #16791. #17144 (Abi Palagashvili).
- Added `mannWhitneyUTest`, `studentTTest` and `welchTTest` aggregate functions. Refactored `rankCorr` a bit. #16883 (Nikita Mikhaylov).
- Add functions `countMatches`/`countMatchesCaseInsensitive`. #17459 (Azat Khuzhin).
- Implement `countSubstrings()`/`countSubstringsCaseInsensitive()`/`countSubstringsCaseInsensitiveUTF8()` (Count the number of substring occurrences). #17347 (Azat Khuzhin).
- Add information about used databases, tables and columns in system.query_log. Add `query_kind` and `normalized_query_hash` fields. #17726 (Amos Bird).
- Add a setting `optimize_on_insert`. When enabled, do the same transformation for INSERTed block of data as if merge was done on this block (e.g. Replacing, Collapsing, Aggregating...). This setting is enabled by default. This can influence Materialized View and MaterializeMySQL behaviour (see detailed description). This closes #10683. #16954 (Kruglov Pavel).
- Kerberos Authentication for HDFS. #16621 (Ilya Golshtein).
- Support
SHOW SETTINGS
statement to show parameters in system.settings.SHOW CHANGED SETTINGS
andLIKE/ILIKE
clause are also supported. #18056 (Jianmei Zhang). - Function
position
now supportsPOSITION(needle IN haystack)
synax for SQL compatibility. This closes #18701. ... #18779 (Jianmei Zhang). - Now we have a new storage setting
max_partitions_to_read
for tables in the MergeTree family. It limits the max number of partitions that can be accessed in one query. A user settingforce_max_partition_limit
is also added to enforce this constraint. #18712 (Amos Bird). - Add
query_id
column tosystem.part_log
for inserted parts. Closes #10097. #18644 (flynn). - Allow CREATE TABLE AS SELECT with column specification. Example:
CREATE TABLE t1 (x String) ENGINE = Memory AS SELECT 1;
. #18060 (Maksim Kita). - Added
arrayMin
,arrayMax
,arrayAvg
aggregation functions. #18032 (Maksim Kita). - Implemented
ATTACH TABLE name FROM 'path/to/data/' (col1 Type1, ...)
query. It creates a new table with the provided structure and attaches table data from the provided directory in user_files
. #17903 (tavplubix). - Add mutation support for StorageMemory. This closes #9117. #15127 (flynn).
- Support syntax
EXISTS DATABASE name
. #18458 (Du Chuan). - Support built-in functions
isIPv4String
and isIPv6String
like MySQL. #18349 (Du Chuan). - Add a new setting
insert_distributed_one_random_shard = 1
to allow insertion into multi-sharded distributed table without any distributed key. #18294 (Amos Bird). - Add settings
min_compress_block_size
andmax_compress_block_size
to MergeTreeSettings, which have higher priority than the global settings and take effect when they are set. Closes #13890. #17867 (flynn). - Add support for 64-bit roaring bitmaps. #17858 (Andy Yang).
- Extended
OPTIMIZE ... DEDUPLICATE
syntax to allow explicit (or implicit with asterisk/column transformers) list of columns to check for duplicates on. ... #17846 (Vasily Nemkov). - Added functions
toModifiedJulianDay
,fromModifiedJulianDay
,toModifiedJulianDayOrNull
, andfromModifiedJulianDayOrNull
. These functions convert between Proleptic Gregorian calendar date and Modified Julian Day number. #17750 (PHO). - Add ability to use custom TLD list: added functions
firstSignificantSubdomainCustom
,cutToFirstSignificantSubdomainCustom
. #17748 (Azat Khuzhin). - Add support for
PROXYv1
protocol to wrap native TCP interface. Allow quotas to be keyed by proxy-forwarded IP address (applied for PROXYv1
address and for X-Forwarded-For
from HTTP interface). This is useful when you provide access to ClickHouse only via trusted proxy (e.g. CloudFlare) but want to account user resources by their original IP addresses. This fixes #17268. #17707 (alexey-milovidov). - Now clickhouse-client supports opening
EDITOR
to edit commands. Alt-Shift-E
. #17665 (Amos Bird). - Add function
encodeXMLComponent
to escape characters to place string into XML text node or attribute. #17659 (nauta). - Introduce
DETACH TABLE/VIEW ... PERMANENTLY
syntax, so that the table does not reappear automatically after a restart (only by explicit request). The table can still be attached back using the short syntax ATTACH TABLE. Implements #5555. Fixes #13850. #17642 (filimonov). - Add asynchronous metrics on the total amount of rows, bytes and parts in MergeTree tables. This fixes #11714. #17639 (flynn).
- Add settings
limit
and offset
for out-of-SQL pagination (#16176). They are useful for building APIs. These two settings affect a SELECT query as if it were wrapped like select * from (your_original_select_query) t limit xxx offset xxx;
. #17633 (hexiaoting). - Provide a new aggregate function combinator:
-SimpleState
to build SimpleAggregateFunction
types via query. It's useful for defining MaterializedView of AggregatingMergeTree engine, and will benefit projections too. #16853 (Amos Bird). - Added
queries-file
parameter for clickhouse-client
and clickhouse-local
. #15930 (Maksim Kita). - Added
query
parameter for clickhouse-benchmark
. #17832 (Maksim Kita). - EXPLAIN AST
now supports queries other than SELECT
. #18136 (taiyang-li).
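The Modified Julian Day functions listed above convert between a proleptic Gregorian calendar date and a day number. The following is a minimal Python sketch of that arithmetic, assuming the standard definition in which MJD 0 falls on 1858-11-17. The helper names are hypothetical illustrations, not ClickHouse APIs, and the sketch is limited to the date range Python's `datetime.date` supports.

```python
from datetime import date, timedelta

# Standard epoch of the Modified Julian Day scale (MJD 0).
# Python's date type uses the proleptic Gregorian calendar.
MJD_EPOCH = date(1858, 11, 17)


def to_modified_julian_day(d: date) -> int:
    """Number of whole days between the MJD epoch and the given date."""
    return (d - MJD_EPOCH).days


def from_modified_julian_day(mjd: int) -> date:
    """Inverse conversion: the calendar date for a given MJD number."""
    return MJD_EPOCH + timedelta(days=mjd)
```

For example, `to_modified_julian_day(date(2020, 1, 1))` gives 58849, and `from_modified_julian_day(58849)` returns the same date back.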
Experimental Feature
- Added functions for calculation of minHash and simHash of text n-grams and shingles. They are intended for semi-duplicate search. Also functions
bitHammingDistance
and tupleHammingDistance
are added. #7649 (flynn). - Add new data type
Map
. See #1841. The first version of Map only supports String
type of key and value. #15806 (hexiaoting). - Implement alternative SQL parser based on ANTLR4 runtime and generated from EBNF grammar. #11298 (Ivan).
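As a rough illustration of the two Hamming distance functions mentioned above, the sketch below shows the usual definitions in Python: the number of differing bits between two integers, and the number of differing positions between two equally sized tuples. It assumes non-negative integers and is not ClickHouse's implementation; the function names are hypothetical.

```python
def bit_hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two non-negative integers differ."""
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # XOR leaves a 1 bit exactly where the inputs differ.
    return bin(a ^ b).count("1")


def tuple_hamming_distance(t1: tuple, t2: tuple) -> int:
    """Count the positions at which two same-length tuples differ."""
    if len(t1) != len(t2):
        raise ValueError("tuples must have the same length")
    return sum(x != y for x, y in zip(t1, t2))
```

For instance, `bit_hamming_distance(111, 121)` is 3 and `tuple_hamming_distance((1, 2, 3), (3, 2, 1))` is 2.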
Performance Improvement
- New IP Dictionary implementation with lower memory consumption, improved performance for some cases, and fixed bugs. #16804 (vdimir).
- Parallel formatting for data export. #11617 (Nikita Mikhaylov).
- LDAP integration: Added
verification_cooldown
parameter in LDAP server connection configuration to allow caching of successful "bind" attempts for configurable period of time. #15988 (Denis Glazachev). - Add
--no-system-table
option for clickhouse-local
to run without system tables. This avoids initialization of DateLUT
that may take a noticeable amount of time (tens of milliseconds) at startup. #18899 (alexey-milovidov). - Replace
PODArray
with PODArrayWithStackMemory
in AggregateFunctionWindowFunnelData
to improve windowFunnel
function performance. #18817 (flynn). - Don't send empty blocks to shards on synchronous INSERT into Distributed table. This closes #14571. #18775 (alexey-milovidov).
- Optimized read for StorageMemory. #18052 (Maksim Kita).
- Using Dragonbox algorithm for float to string conversion instead of ryu. This improves performance of float to string conversion significantly. #17831 (Maksim Kita).
- Speedup
IPv6CIDRToRange
implementation. #17569 (vdimir). - Add
remerge_sort_lowered_memory_bytes_ratio
setting (if memory usage after remerge is not reduced by this ratio, remerge will be disabled). #17539 (Azat Khuzhin). - Improve performance of AggregatingMergeTree with SimpleAggregateFunction(String) in PK. #17109 (Azat Khuzhin).
- Now the
-If
combinator is devirtualized, and count
is properly vectorized. It is for this PR. #17043 (Amos Bird). - Fix performance of reading from
Merge
tables over huge number ofMergeTree
tables. Fixes #7748. #16988 (Anton Popov). - Improved performance of function
repeat
. #16937 (satanson). - Slightly improved performance of float parsing. #16809 (Maksim Kita).
- Add possibility to skip merged partitions for
OPTIMIZE TABLE ... FINAL
. #15939 (Kruglov Pavel). - Integrate with fast_float from Daniel Lemire to parse floating point numbers. #16787 (Maksim Kita). It is not enabled, because its performance is still lower than the rough float parser in ClickHouse.
- Fix max_distributed_connections (affects
prefer_localhost_replica = 1
andmax_threads != max_distributed_connections
). #17848 (Azat Khuzhin). - Adaptive choice of single/multi part upload when sending data to S3. Single part upload is controlled by a new setting
max_single_part_upload_size
. #17934 (Pavel Kovalenko). - Support for async tasks in
PipelineExecutor
. Initial support of async sockets for remote queries. #17868 (Nikolai Kochetov). - Allow to use
optimize_move_to_prewhere
optimization with compact parts, when sizes of columns are unknown. #17330 (Anton Popov).
Improvement
- Avoid deadlock when executing INSERT SELECT into itself from a table with
TinyLog
orLog
table engines. This closes #6802. This closes #18691. This closes #16812. This closes #14570. #15260 (alexey-milovidov). - Support
SHOW CREATE VIEW name
syntax like MySQL. #18095 (Du Chuan). - All queries of type
Decimal * Float
or vice versa are allowed, including aggregate ones (e.g. SELECT sum(decimal_field * 1.1)
or SELECT dec_col * float_col
), the result type is Float32 or Float64. #18145 (Mike). - Improved minimal Web UI: add history; add sharing support; avoid race condition of different requests; add request in-flight and ready indicators; add favicon; detect Ctrl+Enter if textarea is not in focus. #17293 #17770 (alexey-milovidov).
- clickhouse-server didn't send
close
request to ZooKeeper server. #16837 (alesapin). - Avoid server abnormal termination in case of too low memory limits (
max_memory_usage = 1
/max_untracked_memory = 1
). #17453 (Azat Khuzhin). - Fix non-deterministic result of
windowFunnel
function in case of same timestamp for different events. #18884 (Fuwang Hu). - Docker: Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server Docker images. #19096 (filimonov).
- Asynchronous INSERTs to
Distributed
tables: two new settings (by analogy with the MergeTree family) have been added: - fsync_after_insert
- Do fsync for every insert. This will decrease the performance of inserts. - fsync_directories
- Do fsync for temporary directory (that is used for async INSERT only) after all operations (writes, renames, etc.). #18864 (Azat Khuzhin). - SYSTEM KILL
command started to work in Docker. This closes #18847. #18848 (alexey-milovidov). - Expand macros in the zk path when executing
FETCH PARTITION
. #18839 (fastio). - Apply
ALTER TABLE <replicated_table> ON CLUSTER MODIFY SETTING ...
to all replicas, because such ALTER commands are not replicated. #18789 (Amos Bird). - Allow column transformer
EXCEPT
to accept a string as a regular expression matcher. This resolves #18685. #18699 (Amos Bird). - Fix SimpleAggregateFunction in SummingMergeTree. Now it works like AggregateFunction. In previous versions, values were summed together regardless of the aggregate function. This fixes #18564. #8052. #18637 (Amos Bird). Another fix of using
SimpleAggregateFunction
inSummingMergeTree
. This fixes #18676. #18677 (Amos Bird). - Fixed assertion error inside allocator when the last argument of function bar is NaN. Now a regular ClickHouse exception is thrown. This fixes #17876. #18520 (Nikita Mikhaylov).
- Fix usability issue: no newline after exception message in some tools. #18444 (alexey-milovidov).
- Add ability to modify primary and partition key column type from
LowCardinality(Type)
to Type
and vice versa. Also add the ability to modify primary key column type from EnumX
to IntX
type. Fixes #5604. #18362 (alesapin). - Implement
untuple
field access. #18133. #18309 (hexiaoting). - Allow to parse Array fields from CSV if it is represented as a string containing array that was serialized as nested CSV. Example:
"[""Hello"", ""world"", ""42"""" TV""]"
will parse as['Hello', 'world', '42" TV']
. Also allow parsing an array in CSV from a string without enclosing brackets. Example: "'Hello', 'world', '42"" TV'"
will parse as['Hello', 'world', '42" TV']
. #18271 (alexey-milovidov). - Make better adaptive granularity calculation for merge tree wide parts. #18223 (alesapin).
- Now
clickhouse install
could work on Mac. The problem was that there is no procfs on this platform. #18201 (Nikita Mikhaylov). - Better hints for
SHOW ...
query syntax. #18183 (Du Chuan). - Array aggregation
arrayMin
,arrayMax
,arraySum
,arrayAvg
support forInt128
,Int256
,UInt256
. #18147 (Maksim Kita). - Add
disk
to Set and Join storage settings. #18112 (Grigory Pervakov). - Access control: Now table function
merge()
requires the current user to have SELECT
privilege on each table it receives data from. This PR fixes #16964. #18104 #17983 (Vitaly Baranov). - Temporary tables are visible in the system tables
system.tables
andsystem.columns
now only in the sessions where they have been created. The internal database _temporary_and_external_tables
is now hidden in those system tables; temporary tables are shown as tables with an empty database and with the is_temporary
flag set instead. #18014 (Vitaly Baranov). - Fix clickhouse-client rendering issue when the size of terminal window changes. #18009 (Amos Bird).
- Decrease log verbosity of the events when the client drops the connection from Warning to Information. #18005 (filimonov).
- Forcibly removing empty or bad metadata files from filesystem for DiskS3. S3 is an experimental feature. #17935 (Pavel Kovalenko).
- Access control:
allow_introspection_functions=0
prohibits usage of introspection functions but doesn't prohibit giving grants for them anymore (the grantee will need to set allow_introspection_functions=1
for themselves to be able to use that grant). Similarly, allow_ddl=0
prohibits usage of DDL commands but doesn't prohibit giving grants for them anymore. #17908 (Vitaly Baranov). - Usability improvement: hints for column names. #17112. #17857 (fastio).
- Add diagnostic information when two merge tables try to read each other's data. #17854 (徐炘).
- Allow overriding the timeout value for running scripts when using the ClickHouse Docker image. #17818 (Guillaume Tassery).
- Check system log tables' engine definition grammar to prevent some configuration errors. Note that this grammar check is not semantic, which means that mistakes such as non-existent columns or expression functions will not be detected until the table is created. #17739 (Du Chuan).
- Removed exception throwing at
RabbitMQ
table initialization if there was no connection (it will be reconnecting in the background). #17709 (Kseniia Sumarokova). - Do not ignore server memory limits during Buffer flush. #17646 (Azat Khuzhin).
- Switch to patched version of RocksDB (from ClickHouse-Extras) to fix use-after-free error. #17643 (Nikita Mikhaylov).
- Added an offset to exception message for parallel parsing. This fixes #17457. #17641 (Nikita Mikhaylov).
- Don't throw "Too many parts" error in the middle of INSERT query. #17566 (alexey-milovidov).
- Allow query parameters in UPDATE statement of ALTER query. Fixes #10976. #17563 (alexey-milovidov).
- Query obfuscator: avoid usage of some SQL keywords for identifier names. #17526 (alexey-milovidov).
- Export current max ddl entry executed by DDLWorker via server metric. It's useful to check if DDLWorker hangs somewhere. #17464 (Amos Bird).
- Export asynchronous metrics of all the server's current threads. It's useful to track down issues like this. #17463 (Amos Bird).
- Include dynamic columns like MATERIALIZED / ALIAS for wildcard query when settings
asterisk_include_materialized_columns
andasterisk_include_alias_columns
are turned on. #17462 (Ken Chen). - Allow specifying TTL to remove old entries from system log tables, using the
<ttl>
attribute inconfig.xml
. #17438 (Du Chuan). - Now queries coming to the server via MySQL and PostgreSQL protocols have distinctive interface types (which can be seen in the
interface
column of the tablesystem.query_log
):4
for MySQL, and5
for PostgreSQL, instead of the formerly used 1
which is now used for the native protocol only. #17437 (Vitaly Baranov). - Fix parsing of SETTINGS clause of the
INSERT ... SELECT ... SETTINGS
query. #17414 (Azat Khuzhin). - Correctly account memory in RadixSort. #17412 (Nikita Mikhaylov).
- Add eof check in
receiveHello
in server to prevent gettingAttempt to read after eof
exception. #17365 (Kruglov Pavel). - Avoid possible stack overflow in bigint conversion. Big integers are experimental. #17269 (flynn).
- Now
set
indices will work with GLOBAL IN
. This fixes #17232, #5576. #17253 (Amos Bird). - Add limit for HTTP redirects in requests to S3 storage (
s3_max_redirects
). #17220 (ianton-ru). - When
-OrNull
combinator is combined with -If
,-Merge
,-MergeState
,-State
combinators, we should put -OrNull
in front. #16935 (flynn). - Support HTTP proxy and HTTPS S3 endpoint configuration. #16861 (Pavel Kovalenko).
- Added proper authentication using environment,
~/.aws
andAssumeRole
for S3 client. #16856 (Vladimir Chebotarev). - Add more OpenTelemetry spans. Add an example of how to export the span data to Zipkin. #16535 (Alexander Kuzmenkov).
- Cache dictionaries: Completely eliminate callbacks and locks for acquiring them. Keys are not divided into "not found" and "expired", but stored in the same map during query. #14958 (Nikita Mikhaylov).
- Fix the previously non-working
fsync_part_directory
/fsync_after_insert
/in_memory_parts_insert_sync
(experimental feature). #18845 (Azat Khuzhin). - Allow using
Atomic
engine for nested database of MaterializeMySQL
engine. #14849 (tavplubix).
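The CSV array parsing improvement listed above (arrays serialized as nested CSV inside a quoted field, with or without enclosing brackets) can be approximated with Python's standard `csv` module. This is only a sketch of the idea under simplifying assumptions: flat arrays of strings only, and the quote character is guessed from the first element. It is not how ClickHouse implements the feature, and the helper names are hypothetical.

```python
import csv
from typing import List


def decode_csv_field(line: str) -> str:
    """Decode the outer layer: a single double-quoted CSV field."""
    return next(csv.reader([line]))[0]


def parse_array_field(field: str) -> List[str]:
    """Parse the inner layer: an array serialized as nested CSV."""
    inner = field.strip()
    # Enclosing square brackets are optional in this sketch.
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    inner = inner.strip()
    if not inner:
        return []
    # Assumption: elements are quoted consistently with ' or ".
    quote = "'" if inner.startswith("'") else '"'
    return next(csv.reader([inner], quotechar=quote, skipinitialspace=True))
```

Applied to the two example fields from the changelog entry, both calls to `parse_array_field(decode_csv_field(...))` yield the three elements `Hello`, `world` and `42" TV`.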
Bug Fix
- Fix the issue when server can stop accepting connections in very rare cases. #17542 (Amos Bird, alexey-milovidov).
- Fix index analysis of binary functions with constant argument which leads to wrong query results. This fixes #18364. #18373 (Amos Bird).
- Fix possible wrong index analysis when the types of the index comparison are different. This fixes #17122. #17145 (Amos Bird).
- Disable write with AIO during merges because it can lead to extremely rare data corruption of primary key columns during merge. #18481 (alesapin).
- Restrict merges from wide to compact parts. In case of vertical merge it led to broken result part. #18381 (Anton Popov).
- Fix possible incomplete query result while reading from
MergeTree*
in case of read backoff (message <Debug> MergeTreeReadPool: Will lower number of threads
in logs). Was introduced in #16423. Fixes #18137. #18216 (Nikolai Kochetov). - Fix use after free bug in
rocksdb
library. #18862 (sundyli). - Fix infinite reading from file in
ORC
format (was introduced in #10580). Fixes #19095. #19134 (Nikolai Kochetov). - Fix bug in merge tree data writer which can lead to marks with bigger size than fixed granularity size. Fixes #18913. #19123 (alesapin).
- Fix startup bug when clickhouse was not able to read compression codec from
LowCardinality(Nullable(...))
and threw the exception Attempt to read after EOF
. Fixes #18340. #19101 (alesapin). - Restrict
MODIFY TTL
queries forMergeTree
tables created in old syntax. Previously the query succeeded, but actually it had no effect. #19064 (Anton Popov). - Make sure
groupUniqArray
returns correct type for argument of Enum type. This closes #17875. #19019 (alexey-milovidov). - Fix possible error
Expected single dictionary argument for function
when using function ignore
withLowCardinality
argument. Fixes #14275. #19016 (Nikolai Kochetov). - Fix inserting of
LowCardinality
column to table withTinyLog
engine. Fixes #18629. #19010 (Nikolai Kochetov). - Join tries to materialize const columns, but our code wants them in other places. #18982 (Nikita Mikhaylov).
- Disable
optimize_move_functions_out_of_any
because optimization is not always correct. This closes #18051. This closes #18973. #18981 (alexey-milovidov). - Fix possible exception
QueryPipeline stream: different number of columns
caused by merging of query plan'sExpression
steps. Fixes #18190. #18980 (Nikolai Kochetov). - Fixed very rare deadlock at shutdown. #18977 (tavplubix).
- Fix incorrect behavior when
ALTER TABLE ... DROP PART 'part_name'
query removes all deduplication blocks for the whole partition. Fixes #18874. #18969 (alesapin). - Attach partition should reset the mutation. #18804. #18935 (fastio).
- Fix issue with
bitmapOrCardinality
that may lead to nullptr dereference. This closes #18911. #18912 (sundyli). - Fix possible hang at shutdown in
clickhouse-local
. This fixes #18891. #18893 (alexey-milovidov). - Queries for external databases (MySQL, ODBC, JDBC) were incorrectly rewritten if there was an expression in form of
x IN table
. This fixes #9756. #18876 (alexey-milovidov). - Fix *If combinator with unary function and Nullable types. #18806 (Azat Khuzhin).
- Fix the issue that asynchronous distributed INSERTs can be rejected by the server if the setting
network_compression_method
is globally set to non-default value. This fixes #18741. #18776 (alexey-milovidov). - Fixed
Attempt to read after eof
error when trying toCAST
NULL
fromNullable(String)
toNullable(Decimal(P, S))
. Now functionCAST
returnsNULL
when it cannot parse decimal from nullable string. Fixes #7690. #18718 (Winter Zhang). - Fix minor issue with logging. #18717 (sundyli).
- Fix removing of empty parts in
ReplicatedMergeTree
tables, created with old syntax. Fixes #18582. #18614 (Anton Popov). - Fix previous bug when date overflow with different values. Strict Date value limit to "2106-02-07", cast date > "2106-02-07" to value 0. #18565 (hexiaoting).
- Add FixedString data type support for replication from MySQL. Replication from MySQL is an experimental feature. This patch fixes #18450 Also fixes #6556. #18553 (awesomeleo).
- Fix possible
Pipeline stuck
error while usingORDER BY
after subquery withRIGHT
orFULL
join. #18550 (Nikolai Kochetov). - Fix bug which may lead to
ALTER
queries hanging after the corresponding mutation is killed. Found by thread fuzzer. #18518 (alesapin). - Proper support for 12AM in
parseDateTimeBestEffort
function. This fixes #18402. #18449 (vladimir-golovchenko). - Fixed
value is too short
error when executingtoType(...)
functions (toDate
,toUInt32
, etc) with argument of typeNullable(String)
. Now such functions returnNULL
on parsing errors instead of throwing exception. Fixes #7673. #18445 (tavplubix). - Fix the unexpected behaviour of
SHOW TABLES
. #18431 (fastio). - Fix -SimpleState combinator generating incompatible argument type and return type. #18404 (Amos Bird).
- Fix possible race condition in concurrent usage of
Set
orJoin
tables and selects fromsystem.tables
. #18385 (alexey-milovidov). - Fix filling table
system.settings_profile_elements
. This PR fixes #18231. #18379 (Vitaly Baranov). - Fix possible crashes in aggregate functions with combinator
Distinct
, while using two-level aggregation. Fixes #17682. #18365 (Anton Popov). - Fixed issue when
clickhouse-odbc-bridge
process is unreachable by server on machines with dual IPv4/IPv6 stack; Fixed issue when ODBC dictionary updates are performed using malformed queries and/or cause crashes of the odbc-bridge process; Possibly closes #14489. #18278 (Denis Glazachev). - Access control:
SELECT count() FROM table
now can be executed if the user has access to at least a single column from the table. This PR fixes #10639. #18233 (Vitaly Baranov). - Access control:
SELECT JOIN
now requires theSELECT
privilege on each of the joined tables. This PR fixes #17654. #18232 (Vitaly Baranov). - Fix key comparison between Enum and Int types. This fixes #17989. #18214 (Amos Bird).
- Replication from MySQL (experimental feature): fix unique key conversion issue in MaterializeMySQL database engine. Fixes #18186. Fixes #16372. #18211 (Winter Zhang).
- Fix inconsistency for queries with both
WITH FILL
andWITH TIES
#17466. #18188 (hexiaoting). - Fix inserting a row with default value in case of parsing error in the last column. Fixes #17712. #18182 (Jianmei Zhang).
- Fix
Unknown setting profile
error on attempt to set settings profile. #18167 (tavplubix). - Fix error when query
MODIFY COLUMN ... REMOVE TTL
doesn't actually remove column TTL. #18130 (alesapin). - Fixed
std::out_of_range: basic_string
in S3 URL parsing. #18059 (Vladimir Chebotarev). - Fix comparison of
DateTime64
andDate
. Fixes #13804 and #11222. ... #18050 (Vasily Nemkov). - Replication from MySQL (experimental feature): support conversion of MySQL prefix indexes for MaterializeMySQL. Fixes #15187. Fixes #17912. #17944 (Winter Zhang).
- When server log rotation was configured using
logger.size
parameter with numeric value larger than 2^32, the logs were not rotated properly. This is fixed. #17905 (Alexander Kuzmenkov). - Trivial query optimization was producing wrong result if query contains ARRAY JOIN (so query is actually non trivial). #17887 (sundyli).
- Fix possible segfault in
topK
aggregate function. This closes #17404. #17845 (Maksim Kita). - WAL (experimental feature): Do not restore parts from WAL if
in_memory_parts_enable_wal
is disabled. #17802 (detailyang). - Exception message about max table size to drop was displayed incorrectly. #17764 (alexey-milovidov).
- Fixed possible segfault when there is not enough space when inserting into
Distributed
table. #17737 (tavplubix). - Fixed problem when ClickHouse fails to resume connection to MySQL servers. #17681 (Alexander Kazakov).
- Windows: Fixed
Function not implemented
error when executingRENAME
query inAtomic
database with ClickHouse running on Windows Subsystem for Linux. Fixes #17661. #17664 (tavplubix). - It might be determined incorrectly whether a cluster is circular- (cross-) replicated or not when executing
ON CLUSTER
query due to a race condition when pool_size
> 1. It's fixed. #17640 (tavplubix). - Fix empty
system.stack_trace
table when server is running in daemon mode. #17630 (Amos Bird). - Exception
fmt::v7::format_error
can be logged in background for MergeTree tables. This fixes #17613. #17615 (alexey-milovidov). - When clickhouse-client is used in interactive mode with multiline queries, a single-line comment was erroneously extended till the end of the query. This fixes #13654. #17565 (alexey-milovidov).
- Fix alter query hang when the corresponding mutation was killed on the different replica. Fixes #16953. #17499 (alesapin).
- Fix issue with memory accounting when mark cache size was underestimated by clickhouse. It may happen when there are a lot of tiny files with marks. #17496 (alesapin).
- Fix
ORDER BY
with enabled settingoptimize_redundant_functions_in_order_by
. #17471 (Anton Popov). - Fix duplicates after
DISTINCT
which were possible because of incorrect optimization. Fixes #17294. #17296 (li chengxiang). #17439 (Nikolai Kochetov). - Fixed high CPU usage in background tasks of *MergeTree tables. #17416 (tavplubix).
- Fix possible crash while reading from
JOIN
table withLowCardinality
types. Fixes #17228. #17397 (Nikolai Kochetov). - Replication from MySQL (experimental feature): try to fix header mismatch with MySQL SHOW statement. Fixes #16835. #17366 (Winter Zhang).
- Fix nondeterministic functions with predicate optimizer. This fixes #17244. #17273 (Winter Zhang).
- Fix possible
Unexpected packet Data received from client
error for Distributed queries withLIMIT
. #17254 (Azat Khuzhin). - Fix set index invalidation when there are const columns in the subquery. This fixes #17246. #17249 (Amos Bird).
- clickhouse-copier: Fix for non-partitioned tables #15235. #17248 (Qi Chen).
- Fixed possible not-working mutations for parts stored on S3 disk (experimental feature). #17227 (Pavel Kovalenko).
- Bug fix for function
fuzzBits
, related issue: #16980. #17051 (hexiaoting). - Fix
optimize_distributed_group_by_sharding_key
for query with OFFSET only. #16996 (Azat Khuzhin). - Fix queries from
Merge
tables overDistributed
tables with JOINs. #16993 (Azat Khuzhin). - Fix order by optimization with monotonic functions. Fixes #16107. #16956 (Anton Popov).
- Fix incorrect comparison of types
DateTime64
with different scales. Fixes #16655 ... #16952 (Vasily Nemkov). - Fix optimization of group by with enabled setting
optimize_aggregators_of_group_by_keys
and joins. Fixes #12604. #16951 (Anton Popov). - Minor fix in SHOW ACCESS query. #16866 (tavplubix).
- Fix the behaviour with enabled
optimize_trivial_count_query
setting with partition predicate. #16767 (Azat Khuzhin). - Return number of affected rows for INSERT queries via MySQL wire protocol. Previously ClickHouse used to always return 0, it's fixed. Fixes #16605. #16715 (Winter Zhang).
- Fix inconsistent behavior caused by
select_sequential_consistency
for optimized trivial count query and system tables. #16309 (Hao Chen). - Throw error when
REPLACE
column transformer operates on a non-existing column. #16183 (hexiaoting). - Throw exception in case of non-equi-join ON expression in RIGHT|FULL JOIN. #15162 (Artem Zuikov).
Build/Testing/Packaging Improvement
- Add simple integrity check for ClickHouse binary. It allows to detect corruption due to faulty hardware (bit rot on storage media or bit flips in RAM). #18811 (alexey-milovidov).
- Change
OpenSSL
toBoringSSL
. It allows to avoid issues with sanitizers. This fixes #12490. This fixes #17502. This fixes #12952. #18129 (alexey-milovidov). - Simplify
Sys/V
init script. It was not working on Ubuntu 12.04 or older. #17428 (alexey-milovidov). - Multiple improvements in
./clickhouse install
script. #17421 (alexey-milovidov). - Now ClickHouse can pretend to be a fake ZooKeeper. Currently, the storage implementation is just an in-memory hash table, and the server partially supports the ZooKeeper protocol. #16877 (alesapin).
- Fix dead list watches removal for TestKeeperStorage (a mock for ZooKeeper). #18065 (alesapin).
- Add
SYSTEM SUSPEND
command for fault injection. It can be used to facilitate failover tests. This closes #15979. #18850 (alexey-milovidov). - Generate build id when ClickHouse is linked with
lld
. It appeared that lld
does not generate it by default on my machine. Build id is used for crash reports and introspection. #18808 (alexey-milovidov). - Fix shellcheck errors in style check. #18566 (Ilya Yatsishin).
- Update timezones info to 2020e. #18531 (alesapin).
- Fix codespell warnings. Split style checks into separate parts. Update style checks docker image. #18463 (Ilya Yatsishin).
- Automated check for leftovers of conflict markers in docs. #18332 (alexey-milovidov).
- Enable Thread Fuzzer for stateless tests flaky check. #18299 (alesapin).
- Do not use non thread-safe function
strerror
. #18204 (alexey-milovidov). - Update
anchore/scan-action@main
workflow action (was moved frommaster
tomain
). #18192 (Stig Bakken). - Now
clickhouse-test
does DROP/CREATE databases with a timeout. #18098 (alesapin). - Enable experimental support for Pytest framework for stateless tests. #17902 (Ivan).
- Now we use the fresh docker daemon version in integration tests. #17671 (alesapin).
- Send info about official build, memory, cpu and free disk space to Sentry if it is enabled. Sentry is opt-in feature to help ClickHouse developers. This closes #17279. #17543 (alexey-milovidov).
- There was an uninitialized variable in the code of clickhouse-copier. #17363 (Nikita Mikhaylov).
- Fix one MSan report from #17309. #17344 (Nikita Mikhaylov).
- Fix for the issue with IPv6 in Arrow Flight library. See the comments for details. #16664 (Zhanna).
- Add a library that replaces some
libc
functions to traps that will terminate the process. #16366 (alexey-milovidov). - Provide diagnostics in server logs in case of stack overflow, send error message to clickhouse-client. This closes #14840. #16346 (alexey-milovidov).
- Now we can run almost all stateless functional tests in parallel. #15236 (alesapin).
- Fix corruption in
librdkafka
snappy decompression (was a problem only for gcc10 builds, but official builds use clang already, so at least recent official releases are not affected). #18053 (Azat Khuzhin). - If server was terminated by OOM killer, print message in log. #13516 (alexey-milovidov).
- PODArray: Avoid call to memcpy with (nullptr, 0) arguments (Fix UBSan report). This fixes #18525. #18526 (alexey-milovidov).
- Minor improvement for path concatenation of zookeeper paths inside DDLWorker. #17767 (Bharat Nallan).
- Allow to reload symbols from debug file. This PR also fixes a build-id issue. #17637 (Amos Bird).
- TestFlows: fixes to LDAP tests that fail due to slow test execution. #18790 (vzakaznikov).
- TestFlows: Merging requirements for AES encryption functions. Updating aes_encryption tests to use new requirements. Updating TestFlows version to 1.6.72. #18221 (vzakaznikov).
- TestFlows: Updating TestFlows version to the latest 1.6.72. Re-generating requirements.py. #18208 (vzakaznikov).
- TestFlows: Updating TestFlows README.md to include "How To Debug Why Test Failed" section. #17808 (vzakaznikov).
- TestFlows: tests for RBAC ACCESS MANAGEMENT privileges. #17804 (MyroTk).
- TestFlows: RBAC tests for SHOW, TRUNCATE, KILL, and OPTIMIZE. Updates to old tests. Resolved comments from https://github.com/ClickHouse/ClickHouse/pull/16977. #17657 (MyroTk).
- TestFlows: Added RBAC tests for
ATTACH
,CREATE
,DROP
, andDETACH
. #16977 (MyroTk).