ClickHouse release v21.2.2.8-stable FIXME as compared to v21.1.1.5646-prestable
Backward Incompatible Change
- Fix memory tracking for `OPTIMIZE TABLE`/merges. Account query memory limits and sampling for `OPTIMIZE TABLE`/merges. #18772 (Azat Khuzhin).
- Forbid `lcm`/`gcd` for floats. #19532 (Azat Khuzhin).
- Bitwise functions (`bitAnd`, `bitOr`, etc.) are forbidden for floating point arguments. An explicit cast to integer is now required. #19853 (Azat Khuzhin).
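The bitwise-function entry above requires an explicit cast before a floating point value can be used. Python applies the same rule to its `&` operator, so the sketch below uses it as an analogy. It is not ClickHouse code, and the helper name `bit_and` is hypothetical.

```python
def bit_and(a, b):
    """Integer-only bitwise AND; a hypothetical stand-in for a function like bitAnd."""
    for arg in (a, b):
        if isinstance(arg, float):
            raise TypeError("bitwise AND is not defined for floating point arguments")
    return a & b


value = 6.0

try:
    bit_and(value, 3)  # a float argument is rejected
except TypeError as exc:
    print("rejected:", exc)

# The explicit cast makes the conversion visible at the call site.
print(bit_and(int(value), 3))  # computes 6 & 3
```

Requiring the cast at the call site means any truncation of the fractional part is a decision made by the caller rather than a silent side effect.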
New Feature
- Add support for the zstd long option for better compression of string columns, to save space. #17184 (ygrek).
- Added support for mapping LDAP group names, and attribute values in general, to local roles for users from LDAP user directories. #17211 (Denis Glazachev).
- Data type `Nested` now supports arbitrary levels of nesting. Introduced subcolumns of complex types, such as `size0` in `Array`, `null` in `Nullable`, names of `Tuple` elements, which can be read without reading the whole column. #17310 (Anton Popov).
- Add support of tuple argument to `argMin` and `argMax` functions. #17359 (Ildus Kurbangaliev).
- Added `Nullable` support for `FlatDictionary`, `HashedDictionary`, `ComplexKeyHashedDictionary`, `DirectDictionary`, `ComplexKeyDirectDictionary`, `RangeHashedDictionary`. #18236 (Maksim Kita).
- Disallow floating point columns as partition key, related to #18421#event-4147046255. #18464 (hexiaoting).
- Add function decodeXMLComponent to decode characters for XML. Example: `SELECT decodeXMLComponent('Hello,"world"!');` #17659. #18542 (nauta).
- Added PostgreSQL table engine (both select/insert, with support for multidimensional arrays), also as table function. Added PostgreSQL dictionary source. Added PostgreSQL database engine. #18554 (Kseniia Sumarokova).
- Add `SELECT ALL` syntax. Closes #18706. #18723 (flynn).
- Add three functions for the map data type: 1. mapContains(map, key) checks whether map.keys includes the second parameter key. 2. mapKeys(map) returns all the keys in Array format. 3. mapValues(map) returns all the values in Array format. #18788 (hexiaoting).
- Support MetaKey+Enter hotkey binding in Play UI. #19012 (sundyli).
- Function formatDateTime supports the %Q modifier to format a date as its quarter. ... #19224 (Jianmei Zhang).
- ... #19261 (RegulusZ).
- Add factories' objects names, created during query, into system.query_log. Closes #18495. #19371 (Kseniia Sumarokova).
- Add `sign` math function. #19527 (flynn).
- Added functions `parseDateTimeBestEffortUSOrZero`, `parseDateTimeBestEffortUSOrNull`. #19712 (Maksim Kita).
- ... #19764 (emhlbmc).
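The three map functions listed above (mapContains, mapKeys, mapValues) can be modelled with a plain Python dictionary. This is only a sketch of the described behaviour, not ClickHouse code. The helper names are hypothetical, and ordering and type details in ClickHouse may differ.

```python
def map_contains(m, key):
    """Model of mapContains(map, key): is key among the map's keys?"""
    return key in m


def map_keys(m):
    """Model of mapKeys(map): all keys as an array (a list here)."""
    return list(m.keys())


def map_values(m):
    """Model of mapValues(map): all values as an array (a list here)."""
    return list(m.values())


visits = {"home": 10, "docs": 4}

print(map_contains(visits, "docs"))     # key present
print(map_contains(visits, "pricing"))  # key absent
print(map_keys(visits))
print(map_values(visits))
```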
Performance Improvement
- Use a connection pool for S3 connections, controlled by the `s3_max_connections` setting. #13405 (Vladimir Chebotarev).
- Rewrite `sumIf()` and `sum(if())` functions to the `countIf()` function when logically equivalent. #17041 (flynn).
- Update libcxx and use unstable ABI to provide better performance. #18914 (Daniel Kutenin).
- Faster parts removal by lowering the number of `stat` syscalls. This restores an optimization that existed a while ago. Safer interface of `IDisk`. This closes #19065. #19086 (Alexey Milovidov).
- Speed up aggregate function `sum`. Improvement only visible on synthetic benchmarks and not very practical. #19216 (Alexey Milovidov).
- Support splitting `Filter` step of query plan into `Expression + Filter` pair. Together with `Expression + Expression` merging optimization (#17458) it may delay execution for some expressions after `Filter` step. #19253 (Nikolai Kochetov).
- Reduce lock contention for multiple layers of the Buffer engine. #19379 (Azat Khuzhin).
- Slightly improve server latency by removing access to configuration on every connection. #19863 (Alexey Milovidov).
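The sumIf/countIf rewrite listed above relies on the two forms being logically equivalent. A small Python model shows the idea, under the assumption that the summed value is the constant 1. The helper names are hypothetical, and the exact patterns ClickHouse rewrites are not stated in this changelog.

```python
def sum_if_one(rows, cond):
    """Model of sumIf(1, cond): add 1 for every row matching cond."""
    return sum(1 for row in rows if cond(row))


def sum_of_if(rows, cond):
    """Model of sum(if(cond, 1, 0)): add 1 or 0 for every row."""
    return sum(1 if cond(row) else 0 for row in rows)


def count_if(rows, cond):
    """Model of countIf(cond): count the rows matching cond."""
    return len([row for row in rows if cond(row)])


rows = [3, 8, 1, 12, 7]
is_large = lambda x: x > 5

# All three forms count the same rows, so the cheaper count can replace the sums.
print(sum_if_one(rows, is_large), sum_of_if(rows, is_large), count_if(rows, is_large))

# With a summed value other than 1 the forms differ, so no rewrite would apply.
print(sum(2 for row in rows if is_large(row)) == count_if(rows, is_large))
```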
Improvement
- Added support for `WITH ... [AND] [PERIODIC] REFRESH [interval_in_sec]` clause when creating `LIVE VIEW` tables. #14822 (vzakaznikov).
- Add optimize_alias_column_prediction (on by default), that will: respect aliased columns in WHERE during partition pruning and when skipping data using secondary indexes; respect aliased columns in WHERE for trivial count queries for optimize_trivial_count; respect aliased columns in GROUP BY/ORDER BY for optimize_aggregation_in_order/optimize_read_in_order. #16995 (sundyli).
- Updated AWS C++ SDK in order to utilize global regions in S3. #17870 (Vladimir Chebotarev).
- Support insert into table function `cluster`, and for both table functions `remote` and `cluster`, support distributing data across nodes by specifying a sharding key. Closes #16752. #18264 (flynn).
- Support `EXISTS VIEW` syntax. #18552 (Du Chuan).
- Update librdkafka to v1.6.0-RC2. Fixes #18668. #18671 (filimonov).
- Allow CTE to be further aliased. Propagate CSE to subqueries in the same level when `enable_global_with_statement = 1`. This fixes #17378. This fixes https://github.com/ClickHouse/ClickHouse/pull/16575#issuecomment-753416235. #18684 (Amos Bird).
- Add [UInt8, UInt16, UInt32, UInt64] argument types support for bitmapTransform, bitmapSubsetInRange, bitmapSubsetLimit, bitmapContains functions. This closes #18713. #18791 (sundyli).
- Added prefix-based S3 endpoint settings. #18812 (Vladimir Chebotarev).
- Fix issues with RIGHT and FULL JOIN of tables with aggregate function states. In previous versions an exception about the `cloneResized` method was thrown. #18818 (templarzq).
- Check per-block checksum of the distributed batch on the sender before sending (without reading the file twice, the checksums will be verified while reading); this avoids the INSERT getting stuck on the receiver (on a truncated .bin file on the sender). Avoid reading .bin files twice for batched INSERT (it was required to calculate rows/bytes to take squashing into account; this information is now included in the header, and backward compatibility is preserved). #18853 (Azat Khuzhin).
- Add `normalizeQueryKeepNames` and `normalizedQueryHashKeepNames` to normalize queries without masking long names with `?`. This helps better analyze complex query logs. #18910 (Amos Bird).
- Docker image: several improvements for clickhouse-server entrypoint. #18954 (filimonov).
- Fixed `PeekableReadBuffer: Memory limit exceed` error when inserting data with huge strings. Fixes #18690. #18979 (Alexander Tokmakov).
- Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server images. #19096 (filimonov).
- The exception thrown when function `bar` is called with a certain NaN argument could be slightly misleading in previous versions. This fixes #19088. #19107 (Alexey Milovidov).
- Allow changing `max_server_memory_usage` without restart. This closes #18154. #19186 (Alexey Milovidov).
- Fix wrong alignment of values of `IPv4` data type in Pretty formats. They were aligned to the right, not to the left. This closes #19184. #19339 (Alexey Milovidov).
- Allow docker to be executed with arbitrary uid. #19374 (filimonov).
- Add metrics for MergeTree parts (Wide/Compact/InMemory) types. #19381 (Azat Khuzhin).
- Improve MySQL compatibility. #19387 (Daniil Kondratyev).
- Add `http_referer` field to `system.query_log`, `system.processes`, etc. This closes #19389. #19390 (Alexey Milovidov).
- `toIPv6` function parses `IPv4` addresses. #19518 (Bharat Nallan).
- Support using the new location of `.debug` file. This fixes #19348. #19520 (Amos Bird).
- Enable functions length/empty/notEmpty for the map data type, based on the number of keys in the map. #19530 (李扬).
- Support constant result in function `multiIf`. #19533 (Maksim Kita).
- Add an option to disable validation of checksums on reading. Should never be used in production. Please do not expect any benefits in disabling it. It may only be used for experiments and benchmarks. The setting is only applicable to tables of the MergeTree family. Checksums are always validated for other table engines and when receiving data over network. In my observations there is no performance difference or it is less than 0.5%. #19588 (Alexey Milovidov).
- Better error message for dictionaries during attribute parsing. #19678 (Maksim Kita).
- Fix rare `max_number_of_merges_with_ttl_in_pool` limit overrun (more merges with TTL can be assigned) for non-replicated MergeTree. #19708 (alesapin).
- Insufficient arguments check in `positionCaseInsensitiveUTF8` function triggered address sanitizer. #19720 (Alexey Milovidov).
- Add separate pool for message brokers (RabbitMQ and Kafka). #19722 (Azat Khuzhin).
- In distributed queries, if the setting `async_socket_for_remote` is enabled, it was possible to get a stack overflow, at least in debug build configuration, if a very deeply nested data type is used in a table (e.g. `Array(Array(Array(...more...)))`). This fixes #19108. This change introduces a minor backward incompatibility: excessive parentheses in type definitions are no longer supported, for example `Array((UInt8))`. #19736 (Alexey Milovidov).
- Table function `S3` will use global region if the region can't be determined exactly. This closes #10998. #19750 (Vladimir Chebotarev).
- ClickHouse client query param CTE added test. #19762 (Maksim Kita).
- Correctly output infinite arguments for `formatReadableTimeDelta` function. In previous versions, there was an implicit conversion to an implementation-specific integer value. #19791 (Alexey Milovidov).
- `S3` table function now supports `auto` compression mode (autodetect). This closes #18754. #19793 (Vladimir Chebotarev).
- Set charset to utf8mb4 when interacting with remote MySQL servers. Fixes #19795. #19800 (Alexey Milovidov).
- Add `--reconnect` option to `clickhouse-benchmark`. When this option is specified, it will reconnect before every request. This is needed for testing. #19872 (Alexey Milovidov).
Bug Fix
- Fix data type conversion issue for MySQL engine ... #18124 (bo zeng).
- `SELECT count() FROM table` can now be executed if only a single column (any column) can be selected from the `table`. This PR fixes #10639. #18233 (Vitaly Baranov).
- Fix index analysis of binary functions with constant argument which leads to wrong query results. This fixes #18364. #18373 (Amos Bird).
- Disable constant folding for subqueries on the analysis stage, when the result cannot be calculated. #18446 (Azat Khuzhin).
- Attach partition should reset the mutation. #18804. #18935 (fastio).
- Fix bug when mutation with some escaped text (like `ALTER ... UPDATE e = CAST('foo', 'Enum8(\'foo\' = 1')`) was serialized incorrectly. Fixes #18878. #18944 (alesapin).
- Fix error `Task was not found in task queue` (possible only for remote queries, with `async_socket_for_remote = 1`). #18964 (Nikolai Kochetov).
- Fix #18894: add a check to avoid an exception when a long column alias ('table.column' style, usually auto-generated by BI tools like Looker) equals a long table name. #18968 (Daniel Qin).
- Fix incorrect behavior when `ALTER TABLE ... DROP PART 'part_name'` query removes all deduplication blocks for the whole partition. Fixes #18874. #18969 (alesapin).
- Fixed rare crashes when the server ran out of memory. #18976 (Alexander Tokmakov).
- Fixed very rare deadlock at shutdown. #18977 (Alexander Tokmakov).
- Fix possible exception `QueryPipeline stream: different number of columns` caused by merging of query plan's `Expression` steps. Fixes #18190. #18980 (Nikolai Kochetov).
- Disable `optimize_move_functions_out_of_any` because optimization is not always correct. This closes #18051. This closes #18973. #18981 (Alexey Milovidov).
- Join tries to materialize const columns, but our code waits for them in other places. #18982 (Nikita Mikhaylov).
- Fix inserting of `LowCardinality` column to table with `TinyLog` engine. Fixes #18629. #19010 (Nikolai Kochetov).
- Fix possible error `Expected single dictionary argument for function` if use function `ignore` with `LowCardinality` argument. Fixes #14275. #19016 (Nikolai Kochetov).
- Make sure `groupUniqArray` returns correct type for argument of Enum type. This closes #17875. #19019 (Alexey Milovidov).
- Restrict `MODIFY TTL` queries for `MergeTree` tables created in old syntax. Previously the query succeeded, but actually it had no effect. #19064 (Anton Popov).
- Fixed `There is no checkpoint` error when inserting data through http interface using `Template` or `CustomSeparated` format. Fixes #19021. #19072 (Alexander Tokmakov).
- Simplify the implementation of `tupleHammingDistance`. Support for tuples of any equal length. Fixes #19029. #19084 (Nikolai Kochetov).
- Fix startup bug when clickhouse was not able to read compression codec from `LowCardinality(Nullable(...))` and throws exception `Attempt to read after EOF`. Fixes #18340. #19101 (alesapin).
- Fix bug in merge tree data writer which can lead to marks with bigger size than fixed granularity size. Fixes #18913. #19123 (alesapin).
- Fix infinite reading from file in `ORC` format (was introduced in #10580). Fixes #19095. #19134 (Nikolai Kochetov).
- Split RemoteQueryExecutorReadContext into module part. Fix leaking of pipe fd for `async_socket_for_remote`. #19153 (Azat Khuzhin).
- Fix bug when concurrent `ALTER` and `DROP` queries may hang while processing ReplicatedMergeTree table. #19237 (alesapin).
- Do not mark file for distributed send as broken on EOF. #19290 (Azat Khuzhin).
- Fix error `Cannot convert column now64() because it is constant but values of constants are different in source and result`. Continuation of #7156. #19316 (Nikolai Kochetov).
- Fixed possible wrong result or segfault on aggregation when Materialized View and its target table have different structure. Fixes #18063. #19322 (Alexander Tokmakov).
- Fix system.parts _state column (LOGICAL_ERROR when querying this column, due to incorrect order). #19346 (Azat Khuzhin).
- Added `cast`, `accurateCast`, `accurateCastOrNull` performance tests. #19354 (Maksim Kita).
- Fix default value in join types with non-zero default (e.g. some Enums). Closes #18197. #19360 (Vladimir C).
- Fix possible buffer overflow in Uber H3 library. See https://github.com/uber/h3/issues/392. This closes #19219. #19383 (Alexey Milovidov).
- Uninitialized memory read was possible in encrypt/decrypt functions if empty string was passed as IV. This closes #19391. #19397 (Alexey Milovidov).
- Fix possible error `Extremes transform was already added to pipeline`. Fixes #14100. #19430 (Nikolai Kochetov).
- Fixed very rare bug that might cause mutation to hang after `DROP/DETACH/REPLACE/MOVE PARTITION`. It was partially fixed by #15537 for most cases. #19443 (Alexander Tokmakov).
- Mark distributed batch as broken in case of empty data block in one of files. #19449 (Azat Khuzhin).
- Buffer overflow (on memory read) was possible if `addMonth` function was called with specifically crafted arguments. This fixes #19441. This fixes #19413. #19472 (Alexey Milovidov).
- Fix wrong deserialization of columns description. It makes INSERT into a table with a column named `\` impossible. #19479 (Alexey Milovidov).
impossible. #19479 (Alexey Milovidov). - Fix SIGSEGV with merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read=0/UINT64_MAX. #19528 (Azat Khuzhin).
- Query CREATE DICTIONARY id expression fix. #19571 (Maksim Kita).
- `DROP/DETACH TABLE table ON CLUSTER cluster SYNC` query might hang, it's fixed. Fixes #19568. #19572 (Alexander Tokmakov).
- Fix use-after-free of the CompressedWriteBuffer in Connection after disconnect. #19599 (Azat Khuzhin).
- Fix wrong result of function `neighbor` for `LowCardinality` argument. Fixes #10333. #19617 (Nikolai Kochetov).
- Some functions with big integers may cause segfault. Big integers is an experimental feature. This closes #19667. #19672 (Alexey Milovidov).
- Backported in #19986: Background thread which executes `ON CLUSTER` queries might hang waiting for dropped replicated table to do something. It's fixed. #19684 (yiguolei).
- Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption. Fixes #19593. #19702 (alesapin).
- Fix a segmentation fault in `bitmapAndnot` function. Fixes #19668. #19713 (Maksim Kita).
- Fix crash when nested column name was used in `WHERE` or `PREWHERE`. Fixes #19755. #19763 (Nikolai Kochetov).
- Fixed stack overflow when using accurate comparison of arithmetic type with string type. #19773 (Alexander Tokmakov).
- In previous versions, unusual arguments for function arrayEnumerateUniq may cause crash or infinite loop. This closes #19787. #19788 (Alexey Milovidov).
- The function `greatCircleAngle` returned inaccurate results in previous versions. This closes #19769. #19789 (Alexey Milovidov).
- Backported in #19922: Fix clickhouse-client abort exception while executing only `select`. #19790 (李扬).
- Fix filtering by UInt8 greater than 127. #19799 (Anton Popov).
- Backported in #20007: Fix starting the server with tables having default expressions containing dictGet(). Allow getting return type of dictGet() without loading dictionary. #19805 (Vitaly Baranov).
- Fix crash when pushing down predicates to union distinct subquery. This fixes #19855. #19861 (Amos Bird).
- Fix argMin/argMax crash when combining with -If. This fixes https://clickhouse-test-reports.s3.yandex.net/19800/7b8589dbde5bc621d1bcfd68a713e4684183f593/fuzzer_ubsan/report.html#fail1. #19868 (Amos Bird).
- Backported in #19939: Deadlock was possible if system.text_log is enabled. This fixes #19874. #19875 (Alexey Milovidov).
- Backported in #19935: BloomFilter index crash fix. Fixes #19757. #19884 (Maksim Kita).
- Backported in #19996: Fix a segfault in function `fromModifiedJulianDay` when the argument type is `Nullable(T)` for any integral types other than Int32. #19959 (PHO).
- Backported in #20112: `EmbeddedRocksDB` is an experimental storage. Fix the issue with lack of proper type checking. Simplified code. This closes #19967. #19972 (Alexey Milovidov).
- Backported in #20028: Prevent "Connection refused" in docker during initialization script execution. #20012 (filimonov).
- Backported in #20123: MaterializeMySQL: Fix replication for statements that update several tables. #20066 (Håvard Kvålen).
- Backported in #20146: Fix server crash after query with `if` function with `Tuple` type of then/else branches result. `Tuple` type must contain `Array` or another complex type. Fixes #18356. #20133 (alesapin).
Build/Testing/Packaging Improvement
- Restore Kafka input in FreeBSD builds. #18924 (Alexandre Snarskii).
- Add integration tests run with memory sanitizer. #18974 (alesapin).
- Add SQLancer test docker image to run check in CI. #19006 (Ilya Yatsishin).
- Added tests for defaults in URL and File engine. This closes #5666. #19015 (Nikita Mikhaylov).
- Query Fuzzer will fuzz newly added tests more extensively. This closes #18916. #19185 (Alexey Milovidov).
- Allow building librdkafka without ssl. #19337 (filimonov).
- Avoid UBSan reports in `arrayElement` function, `substring` and `arraySum`. Fixes #19305. Fixes #19287. This closes #19336. #19347 (Alexey Milovidov).
- Fix potential nullptr dereference in table function `VALUES`. #19357 (Alexey Milovidov).
- Allow building ClickHouse with Kafka support on arm64. #19369 (filimonov).
- Integrate with Big List of Naughty Strings for better fuzzing. #19480 (Alexey Milovidov).
- Allow to explicitly enable or disable watchdog via environment variable `CLICKHOUSE_WATCHDOG_ENABLE`. By default it is enabled if server is not attached to terminal. #19522 (Alexey Milovidov).
- Updating TestFlows AES encryption tests to support changes to the encrypt plaintext parameter. #19674 (vzakaznikov).
- Made generation of macros.xml easier for integration tests. No more excessive logging from dicttoxml. dicttoxml project is not active for 5+ years. #19697 (Ilya Yatsishin).
- Remove --project-directory for docker-compose in integration test. Fix logs formatting from docker container. #19706 (Ilya Yatsishin).
- Fixed MemorySanitizer errors in cyrus-sasl and musl. #19821 (Ilya Yatsishin).
- Add test for throwing an exception on inserting incorrect data in CollapsingMergeTree. #19851 (Kruglov Pavel).
- Adding retries for docker-compose start, stop and restart in TestFlows tests. #19852 (vzakaznikov).
NO CL ENTRY
- NO CL ENTRY: 'Remove useless codes'. #19293 (sundyli).
- NO CL ENTRY: 'Merging #19387'. #19683 (Alexey Milovidov).