sidebar_position | sidebar_label |
---|---|
1 | 2023 |
2023 Changelog
ClickHouse release v23.2.1.2537-stable (52bf836e03) FIXME as compared to v23.1.1.3077-stable (dcaac47702)
Backward Incompatible Change
- Extend function "toDayOfWeek()" (alias: "DAYOFWEEK") with a mode argument that encodes whether the week starts on Monday or Sunday and whether counting starts at 0 or 1. For consistency with other date time functions, the mode argument was inserted between the time and the time zone arguments. This breaks existing usage of the (previously undocumented) 2-argument syntax "toDayOfWeek(time, time_zone)". A fix is to rewrite the function into "toDayOfWeek(time, 0, time_zone)". #45233 (Robert Schulze).
- Rename setting `max_query_cache_size` to `filesystem_cache_max_download_size`. #45614 (Kseniia Sumarokova).
- Fix applying settings for FORMAT on the client. #46003 (Azat Khuzhin).
- Default user will not have permissions for access type `SHOW NAMED COLLECTION` by default (e.g. by default, default user will no longer be able to do grant ALL to other users as it was before, therefore this PR is backward incompatible). #46010 (Kseniia Sumarokova).
- Remove support for setting `materialized_postgresql_allow_automatic_update` (which was by default turned off). Fix integration tests. #46106 (Kseniia Sumarokova).
- Slightly improve performance of `countDigits` on realistic datasets. This closed #44518. In previous versions, `countDigits(0)` returned `0`; now it returns `1`, which is more correct, and follows the existing documentation. #46187 (Alexey Milovidov).
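The `countDigits(0)` change above can be illustrated with a small Python model. This is only a sketch of the described behavior for plain integers, not ClickHouse's implementation. The helper names `count_digits` and `count_digits_old` are hypothetical, and ignoring the sign is an assumption.

```python
def count_digits(value: int) -> int:
    """Number of decimal digits needed to write `value` (sign ignored, an assumption)."""
    # New behavior described above: zero is written with one digit, "0".
    return len(str(abs(value)))


def count_digits_old(value: int) -> int:
    """Model of the previous behavior, where zero was reported as having 0 digits."""
    return 0 if value == 0 else len(str(abs(value)))


# Only the result for zero differs between the two models.
print(count_digits_old(0), count_digits(0))
print(count_digits_old(12345), count_digits(12345))
```

Queries that relied on the old result for zero, for example to detect zero values, would need to be adjusted after upgrading.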
New Feature
- Expose ProfileEvents counters in system.part_log. #38614 (Bharat Nallan).
- Extend the existing ReplacingMergeTree engine to allow insertion of duplicates. It combines the capabilities of ReplacingMergeTree and CollapsingMergeTree in one MergeTree engine. Deleted data is not returned by queries, but it is not removed from disk either. #41005 (youennL-cs).
- Add `generateULID()` function. Closes #36536. #44662 (Nikolay Degterinsky).
- Add `corrMatrix` aggregate function, which calculates the correlation for each pair of columns. In addition, since the aggregate functions `covarSamp` and `covarPop` are similar to `corr`, `covarSampMatrix` and `covarPopMatrix` were added as well. @alexey-milovidov closes #44587. #44680 (FFFFFFFHHHHHHH).
- Rewrite aggregate functions with an if expression as argument when logically equivalent. For example, avg(if(cond, col, null)) can be rewritten to avgIf(col, cond). This improves performance. #44730 (李扬).
- Introduce arrayShuffle function for random array permutations. #45271 (Joanna Hulboj).
- Support the FIXED_SIZE_BINARY type in Arrow and FIXED_LENGTH_BYTE_ARRAY in Parquet, and map them to FixedString. Add settings `output_format_parquet_fixed_string_as_fixed_byte_array`/`output_format_arrow_fixed_string_as_fixed_byte_array` to control default output type for FixedString. Closes #45326. #45340 (Kruglov Pavel).
- Add `StorageIceberg` and table function `iceberg` to access iceberg table store on S3. #45384 (flynn).
- Add a new column `last_exception_time` to system.replication_queue. #45457 (Frank Chen).
- Add two new functions which allow for user-defined keys/seeds with SipHash{64,128}. #45513 (Salvatore Mesoraca).
- Allow a three-argument version for table function `format`. Closes #45808. #45873 (FFFFFFFHHHHHHH).
- Add Joda format support for 'x', 'w', 'S'. Refer to https://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html. #46073 (zk_kiger).
- Support window function `ntile`. #46256 (lgbo).

  ```
  insert into test_data values(1,2), (1,3), (1,4), (2,5),(2,6);
  select a, b, ntile(2) over (partition by a order by b rows between unbounded preceding and unbounded following) from test_data;
  ```
- Added arrayPartialSort and arrayPartialReverseSort functions. #46296 (Joanna Hulboj).
- The new HTTP parameter `client_protocol_version` allows setting a client protocol version for HTTP responses using the Native format. #40397. #46360 (Geoff Genz).
- Add new function regexpExtract, like spark function REGEXP_EXTRACT. #46469 (李扬).
- Author: taiyang-li Add new function regexpExtract, like spark function REGEXP_EXTRACT. #46529 (Alexander Gololobov).
- Add new function JSONArrayLength, which returns the number of elements in the outermost JSON array. The function returns NULL if input JSON string is invalid. #46631 (李扬).
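The `ntile` entry above can be sketched in Python using the same sample rows. This models the usual SQL meaning of `ntile(n)`: each partition is ordered and split into `n` buckets whose sizes differ by at most one, with the larger buckets first. It is an illustration only; the function and key names are hypothetical, and it is an assumption that ClickHouse distributes rows in exactly this way.

```python
from itertools import groupby
from operator import itemgetter


def ntile(rows, buckets, partition_key, order_key):
    """Assign a 1-based bucket number to every row, partition by partition."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[partition_key], r[order_key]))
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        part = list(group)
        # The first `extra` buckets receive one additional row.
        base, extra = divmod(len(part), buckets)
        index = 0
        for bucket in range(1, buckets + 1):
            size = base + (1 if bucket <= extra else 0)
            for row in part[index:index + size]:
                result.append({**row, "ntile": bucket})
            index += size
    return result


# Same values as in the changelog example: (a, b) pairs.
test_data = [{"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 1, "b": 4},
             {"a": 2, "b": 5}, {"a": 2, "b": 6}]
for row in ntile(test_data, 2, "a", "b"):
    print(row["a"], row["b"], row["ntile"])
```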
Performance Improvement
- Improve lower/upper function performance with avx512 instructions. #37894 (yaqi-zhao).
- Add new `local_filesystem_read_method` method `io_uring` based on the asynchronous Linux io_uring subsystem, improving read performance almost universally compared to the default `pread` method. #38456 (Saulius Valatka).
- Remove the limitation that on systems with >=32 cores and SMT disabled ClickHouse uses only half of the cores. #44973 (Robert Schulze).
- Improve performance of function multiIf by columnar executing, speed up by 2.3x. #45296 (李扬).
- An option added to aggregate partitions independently if table partition key and group by key are compatible. Controlled by the setting `allow_aggregate_partitions_independently`. Disabled by default because of limited applicability (please refer to the docs). #45364 (Nikita Taranov).
- Add fastpath for function position when needle is empty. #45382 (李扬).
- Enable `query_plan_remove_redundant_sorting` optimization by default. Optimization implemented in #45420. #45567 (Igor Nikonov).
- Increased HTTP Transfer Encoding chunk size to improve performance of large queries using the HTTP interface. #45593 (Geoff Genz).
- Fixed performance of short `SELECT` queries that read from tables with large number of `Array`/`Map`/`Nested` columns. #45630 (Anton Popov).
- Allow using Vertical merge algorithm with parts in Compact format. This will allow ClickHouse server to use much less memory for background operations. This closes #46084. #45681 (Anton Popov).
- Optimize Parquet reader by using batch reader. #45878 (LiuNeng).
- Improve performance of ColumnArray::filter for big int and decimal. #45949 (李扬).
- Reduce the overhead of obtaining the filter from ColumnNullable(UInt8), improving overall query performance. To evaluate the impact of this change, the TPC-H benchmark was run with the column types changed from non-nullable to nullable, using the QPS of its queries as the performance indicator. #45962 (Zhiguo Zhou).
- Make the `_part` and `_partition_id` virtual columns be of type `LowCardinality(String)`. Closes #45964. #45975 (flynn).
- Improve the performance of Decimal conversion when the scale does not change. #46095 (Alexey Milovidov).
- The new logic applies if the PREWHERE condition is a conjunction of multiple conditions (cond1 AND cond2 AND ... ). It groups the conditions that require reading the same columns into steps. After each step, the corresponding part of the full condition is computed and the result rows might be filtered. This allows reading fewer rows in the next steps, thus saving IO bandwidth and doing less computation. This logic is disabled by default for now. It will be enabled by default in one of the future releases once it is known to not have any regressions, so testing it is highly encouraged. It can be controlled by 2 settings: "enable_multiple_prewhere_read_steps" and "move_all_conditions_to_prewhere". #46140 (Alexander Gololobov).
- Allow to increase prefetching for read data. #46168 (Kseniia Sumarokova).
- Rewrite arrayExists(x -> x = 1, arr) -> has(arr, 1), which improves performance by 1.34x. #46188 (李扬).
- Fix too big memory usage for vertical merges on non-remote disk. Respect `max_insert_delayed_streams_for_parallel_write` for the remote disk. #46275 (Nikolai Kochetov).
- Update zstd to v1.5.4. It has some minor improvements in performance and compression ratio. If you run replicas with different versions of ClickHouse you may see reasonable error messages `Data after merge/mutation is not byte-identical to data on another replicas.` with explanation. These messages are Ok and you should not worry. #46280 (Raúl Marín).
- Allow using Vertical merge algorithm with parts in Compact format. This will allow ClickHouse server to use much less memory for background operations. This closes #46084. #46282 (Anton Popov).
- Fix performance degradation caused by #39737. #46309 (Alexey Milovidov).
- The `replicas_status` handle will answer quickly even in case of a large replication queue. #46310 (Alexey Milovidov).
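The multi-step PREWHERE entry in this section (#46140) describes grouping the conjuncts by the columns they read and filtering after each step. The following Python sketch illustrates that idea on in-memory rows. It is a toy model under stated assumptions, not ClickHouse code: all names are hypothetical, and each condition is given as a pair of (columns it needs, predicate).

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]
Condition = Tuple[FrozenSet[str], Predicate]


def group_into_steps(conditions: List[Condition]) -> List[Tuple[FrozenSet[str], List[Predicate]]]:
    """Group the conjuncts that need the same set of columns into one step."""
    steps: Dict[FrozenSet[str], List[Predicate]] = {}
    for columns, predicate in conditions:
        steps.setdefault(columns, []).append(predicate)
    return list(steps.items())  # dicts preserve insertion order


def multi_step_filter(rows: List[Row], conditions: List[Condition]):
    """Apply the steps one by one; also report how many rows each step had to look at."""
    surviving = list(rows)
    rows_seen = []
    for _columns, predicates in group_into_steps(conditions):
        rows_seen.append(len(surviving))
        surviving = [r for r in surviving if all(p(r) for p in predicates)]
    return surviving, rows_seen


rows = [{"a": i, "b": i % 7} for i in range(100)]
conditions = [
    (frozenset({"a"}), lambda r: r["a"] >= 90),
    (frozenset({"b"}), lambda r: r["b"] != 0),
    (frozenset({"a"}), lambda r: r["a"] % 2 == 0),
]
result, rows_seen = multi_step_filter(rows, conditions)
print(rows_seen, [r["a"] for r in result])
```

In this model the second step only looks at the rows that survived the first one, which is the source of the IO and computation savings described in the entry.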
Improvement
- Add avx512 support for Aggregate Sum, function unary arithmetic, function comparison. #37870 (zhao zhou).
- close issue: #38893. #38950 (hexiaoting).
- Migration from other databases and updates/deletes are mimicked by Collapsing/Replacing. Want to use the same SELECT queries without adding FINAL to all the existing queries. #40945 (Arthur Passos).
- Allow configuring storage as `SETTINGS disk='<disk_name>'` (instead of `storage_policy`) and with explicit disk creation `SETTINGS disk=disk(type=s3, ...)`. #41976 (Kseniia Sumarokova).
- Add new metrics for backups: num_processed_files and processed_files_size, which describe the actual number of processed files. #42244 (Aleksandr).
- Added retries on interserver DNS errors. #43179 (Anton Kozlov).
- Rewrote the code around marks distribution and the overall coordination of the reading in order to achieve the maximum performance improvement. This closes #34527. #43772 (Nikita Mikhaylov).
- Remove redundant DISTINCT clauses in query (subqueries). Implemented on top of query plan. It does similar optimization as `optimize_duplicate_order_by_and_distinct` regarding DISTINCT clauses. Can be enabled via `query_plan_remove_redundant_distinct` setting. Related to #42648. #44176 (Igor Nikonov).
- Keeper improvement: try preallocating space on the disk to avoid undefined out-of-space issues. Introduce setting `max_log_file_size` for the maximum size of Keeper's Raft log files. #44370 (Antonio Andelic).
- Rewrites: `sumIf(123, cond) -> 123 * countIf(1, cond)`; `sum(if(cond, 123, 0)) -> 123 * countIf(cond)`; `sum(if(cond, 0, 123)) -> 123 * countIf(not(cond))`. #44728 (李扬).
- Optimize behavior for a replica delay api logic in case the replica is read-only. #45148 (mateng915).
- Introduce gwp-asan implemented by llvm runtime. This closes #27039. #45226 (Han Fei).
- ... in the case key casted from uint64 to uint32, small impact for little endian platform but key value becomes zero in big endian case. #45375 (Suzy Wang).
- Mark Gorilla compression on columns of non-Float* type as suspicious. #45376 (Robert Schulze).
- Allow removing redundant aggregation keys with constants (e.g., simplify `GROUP BY a, a + 1` to `GROUP BY a`). #45415 (Dmitry Novik).
- Show replica name that is executing a merge in the postpone_reason. #45458 (Frank Chen).
- Save exception stack trace in part_log. #45459 (Frank Chen).
- Make RegExpTreeDictionary a ua parser which is compatible with https://github.com/ua-parser/uap-core. #45631 (Han Fei).
- Enable ICU data support on s390x platform. #45632 (Suzy Wang).
- Updated checking of SYSTEM SYNC REPLICA; resolves #45508. Implementation: (1) wait for the current last entry to be processed (after pulling shared log) instead of waiting for the queue size to become 0; (2) the Subscriber now notifies both the queue size and the removed log_entry_id. #45648 (SmitaRKulkarni).
- Disallow creation of new columns compressed by a combination of codecs "Delta" or "DoubleDelta" followed by codecs "Gorilla" or "FPC". This can be bypassed using setting "allow_suspicious_codecs = true". #45652 (Robert Schulze).
- Rename setting `replication_alter_partitions_sync` to `alter_sync`. #45659 (Antonio Andelic).
- The `generateRandom` table function and the engine now support `LowCardinality` data types. This is useful for testing, for example you can write `INSERT INTO table SELECT * FROM generateRandom() LIMIT 1000`. This is needed to debug #45590. #45661 (Alexey Milovidov).
- Add ability to ignore unknown keys in JSON object for named tuples (`input_format_json_ignore_unknown_keys_in_named_tuple`). #45678 (Azat Khuzhin).
- The experimental query result cache now provides more modular configuration settings. #45679 (Robert Schulze).
- Renamed "query result cache" to "query cache". #45682 (Robert Schulze).
- Add SYSTEM SYNC FILE CACHE command. It calls the sync syscall. This implements #8921. #45685 (DR).
- Add new S3 setting `allow_head_object_request`. This PR makes usage of `GetObjectAttributes` request instead of `HeadObject` introduced in https://github.com/ClickHouse/ClickHouse/pull/45288 optional (and disabled by default). #45701 (Vitaly Baranov).
- Add ability to override connection settings based on connection names (this means you no longer need to store a password for each connection; you can simply put everything into `~/.clickhouse-client/config.xml` and even use different history files for them, which can also be useful). #45715 (Azat Khuzhin).
- Arrow format: support the duration type. Closes #45669. #45750 (flynn).
- Extend the logging in the Query Cache to improve investigations of the caching behavior. #45751 (Robert Schulze).
- The query cache's server-level settings are now reconfigurable at runtime. #45758 (Robert Schulze).
- Hide password in logs when a table function's arguments are specified with a named collection. #45774 (Vitaly Baranov).
- Improve internal S3 client to correctly deduce regions and redirections for different types of URLs. #45783 (Antonio Andelic).
- Add support for Map, IPv4 and IPv6 types in generateRandom. Mostly useful for testing. #45785 (Raúl Marín).
- Support empty/notEmpty for IP types. #45799 (Yakov Olkhovskiy).
- The column `num_processed_files` was split into two columns: `num_files` (for BACKUP) and `files_read` (for RESTORE). The column `processed_files_size` was split into two columns: `total_size` (for BACKUP) and `bytes_read` (for RESTORE). #45800 (Vitaly Baranov).
- Add support for `SHOW ENGINES` query. #45859 (Filatenkov Artur).
- Improved how the obfuscator deals with queries. #45867 (Raúl Marín).
- Improved how memory bound merging and aggregation in order on top query plan interact. Previously we fell back to explicit sorting for AIO in some cases when it wasn't actually needed. So it is a perf issue, not a correctness one. #45892 (Nikita Taranov).
- Improve behaviour of conversion into Date for boundary value 65535 (2149-06-06). #45914 (Joanna Hulboj).
- Add setting `check_referential_table_dependencies` to check referential dependencies on `DROP TABLE`. This PR solves #38326. #45936 (Vitaly Baranov).
- Fix `tupleElement` to return `Null` when it has a `Null` argument. Closes #45894. #45952 (flynn).
- Throw an error on no files satisfying S3 wildcard. Closes #45587. #45957 (chen).
- Use cluster state data to check concurrent backup/restore. #45982 (SmitaRKulkarni).
- Use "exact" matching for fuzzy search, which handles case insensitivity correctly and uses a more appropriate algorithm for matching SQL queries. #46000 (Azat Khuzhin).
- Improve behaviour of conversion into Date for boundary value 65535 (2149-06-06). #46042 (Joanna Hulboj).
- Forbid wrong create View syntax `CREATE View X TO Y AS SELECT`. Closes #4331. #46043 (flynn).
- The Log family of storage engines now supports the setting `storage_policy`. Closes #43421. #46044 (flynn).
- Improve format `JSONColumns` when result is empty. Closes #46024. #46053 (flynn).
- MultiVersion: replace lock_guard with an atomic op. #46057 (Konstantin Morozov).
- Add reference implementation for SipHash128. #46065 (Salvatore Mesoraca).
- Add new metric to record allocations times and bytes using mmap. #46068 (李扬).
- Currently for functions like `leftPad`, `rightPad`, `leftPadUTF8`, `rightPadUTF8`, the second argument `length` must be UInt8|16|32|64|128|256. This is too strict for ClickHouse users and is inconsistent with other similar functions like `arrayResize`, `substring` and so on. #46103 (李扬).
- Update CapnProto to v0.10.3 to avoid CVE-2022-46149. #46139 (Mallik Hassan).
- Fix assertion in the `welchTTest` function in debug build when the resulting statistics is NaN. Unified the behavior with other similar functions. Change the behavior of `studentTTest` to return NaN instead of throwing an exception because the previous behavior was inconvenient. This closes #41176. This closes #42162. #46141 (Alexey Milovidov).
- More convenient usage of big integers and ORDER BY WITH FILL. Allow using plain integers for start and end points in WITH FILL when ORDER BY big (128-bit and 256-bit) integers. Fix the wrong result for big integers with negative start or end points. This closes #16733. #46152 (Alexey Milovidov).
- Add parts, active_parts and total_marks columns to system.tables on issue. #46161 (attack204).
- Functions "multi[Fuzzy]Match(Any|AnyIndex|AllIndices)" now reject regexes which will likely evaluate very slowly in vectorscan. #46167 (Robert Schulze).
- When `insert_null_as_default` is enabled and column doesn't have defined default value, the default of column type will be used. Also this PR fixes using default values on nulls in case of LowCardinality columns. #46171 (Kruglov Pavel).
- Prefer explicitly defined access keys for S3 clients. If `use_environment_credentials` is set to `true`, and the user has provided the access key through query or config, they will be used instead of the ones from the environment variable. #46191 (Antonio Andelic).
- Concurrent merges are scheduled using round-robin by default to ensure fair and starvation-free operation. Previously in heavily overloaded shards, big merges could possibly be starved by smaller merges due to the use of strict priority scheduling. Added `background_merges_mutations_scheduling_policy` server config option to select scheduling algorithm (`round_robin` or `shortest_task_first`). #46247 (Sergei Trifonov).
- Extend setting `input_format_null_as_default` for more formats. Fix setting `input_format_defaults_for_omitted_fields` with Native and TSKV formats. #46284 (Kruglov Pavel).
- Add an alias "DATE_FORMAT()" for function "formatDateTime()" to improve compatibility with MySQL's SQL dialect, extend function `formatDateTime()` with substitutions "a", "b", "c", "h", "i", "k", "l", "r", "s", "W". `DATE_FORMAT` is an alias of `formatDateTime`. Formats a Time according to the given Format string. Format is a constant expression, so you cannot have multiple formats for a single result column. #46302 (Jake Bamrah).
- not for changelog - part of #42648. #46306 (Yakov Olkhovskiy).
- Enable retries for INSERT by default in case of ZooKeeper session loss. We already use it in production. #46308 (Alexey Milovidov).
- Add `ProfileEvents` and `CurrentMetrics` about the callback tasks for parallel replicas (`s3Cluster` and `MergeTree` tables). #46313 (Alexey Milovidov).
- Add support for `DELETE` and `UPDATE` for tables using `KeeperMap` storage engine. #46330 (Antonio Andelic).
- Update unixodbc to v2.3.11 to mitigate CVE-2011-1145. #46363 (Mallik Hassan).
- Allow writing RENAME queries with query parameters. Resolves #45778. #46407 (Nikolay Degterinsky).
- Fix parameterized SELECT queries with REPLACE transformer. Resolves #33002. #46420 (Nikolay Degterinsky).
- Exclude the internal database used for temporary/external tables from the calculation of asynchronous metric "NumberOfDatabases". This makes the behavior consistent with system table "system.databases". #46435 (Robert Schulze).
- Added `last_exception_time` column into distribution_queue table. #46564 (Aleksandr).
- Support for IN clause in parameterized views. Implementation: in case of parameterized views, the IN clause cannot be evaluated as a constant expression during CREATE VIEW, so a check was added to skip this step for parameterized views. If parameters are not in the IN clause, it is still evaluated as a constant expression. #46583 (SmitaRKulkarni).
- Do not load named collections on server startup (load them on first access instead). #46607 (Kseniia Sumarokova).
- Add separate access type `SHOW_NAMED_COLLECTIONS_SECRETS` to allow to see named collections and their keys, but making values hidden. Nevertheless, access type `SHOW NAMED COLLECTIONS` is still required. #46667 (Kseniia Sumarokova).
- Hide arguments of custom disk merge tree setting. #46670 (Kseniia Sumarokova).
- Ask for the password in clickhouse-client interactively in a case when the empty password is wrong. Closes #46702. #46730 (Nikolay Degterinsky).
- Backward compatibility for T64 codec support for IPv4. #46747 (Yakov Olkhovskiy).
- Allow to fallback from asynchronous insert to synchronous in case of large amount of data (more than `async_insert_max_data_size` bytes in single insert). #46753 (Anton Popov).
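The `sum(if(...))` rewrites listed in this section (#44728) rely on a simple identity: summing a constant over the rows that match a condition equals the constant times the number of matching rows. A minimal Python check of that identity follows. The helper names are hypothetical, and the list of booleans stands in for the per-row value of `cond`.

```python
def sum_if_expr(then_value, else_value, flags):
    """Model of sum(if(cond, then_value, else_value)), one flag per row."""
    return sum(then_value if flag else else_value for flag in flags)


def count_if(flags):
    """Model of countIf(cond): the number of rows where the condition holds."""
    return sum(1 for flag in flags if flag)


flags = [True, False, True, True, False]
negated = [not flag for flag in flags]

# sum(if(cond, 123, 0)) -> 123 * countIf(cond)
print(sum_if_expr(123, 0, flags), 123 * count_if(flags))
# sum(if(cond, 0, 123)) -> 123 * countIf(not(cond))
print(sum_if_expr(0, 123, flags), 123 * count_if(negated))
```

The rewritten form only needs to count rows, which avoids materializing and adding the constant for every row.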
Bug Fix
- Fix wiping sensitive info in logs. #45603 (Vitaly Baranov).
- Fix a check of the form "time_check() || ptr ? ptr->finished() : data->is_finished()". Operator "||" is evaluated before operator "?", whereas the time check and the ptr check were expected to be separate. With that expression it was also possible to call "ptr->finished()" on a nullptr. #46054 (Alexey Perevyshin).
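The precedence problem in the entry above (#46054) can be reproduced in a small Python model, where `None` plays the role of `nullptr`. The two functions spell out the grouping explicitly. `separated_check` is an assumed reading of what "separated time and ptr checks" means, not the code from the PR, and all names are hypothetical.

```python
class Task:
    """Minimal stand-in for the object behind `ptr`."""

    def __init__(self, done):
        self.done = done

    def finished(self):
        return self.done


def buggy_check(time_check, ptr, data_finished):
    # How C++ groups the original expression:
    # (time_check() || ptr) ? ptr->finished() : data->is_finished()
    return ptr.finished() if (time_check() or ptr is not None) else data_finished()


def separated_check(time_check, ptr, data_finished):
    # Assumed intent: time_check() || (ptr ? ptr->finished() : data->is_finished())
    if time_check():
        return True
    return ptr.finished() if ptr is not None else data_finished()


# With a true time check and no ptr, the buggy grouping dereferences "nullptr".
try:
    buggy_check(lambda: True, None, lambda: False)
except AttributeError as error:
    print("buggy grouping failed:", error)

print(separated_check(lambda: True, None, lambda: False))
```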
Build/Testing/Packaging Improvement
- Allow to randomize merge tree settings in tests. #38983 (Anton Popov).
- Enable HDFS support on PowerPC, which helps to fix the following functional tests: 02113_hdfs_assert.sh, 02244_hdfs_cluster.sql and 02368_cancel_write_into_hdfs.sh. #44949 (MeenaRenganathan22).
- Add systemd.service file for clickhouse-keeper. Fixes #44293. #45568 (Mikhail f. Shiryaev).
- ClickHouse's fork of poco was moved from "contrib/" to "base/poco/". #46075 (Robert Schulze).
- Remove excessive license notices from preciseExp10.cpp. #46163 (DimasKovas).
- Add an option for `clickhouse-watchdog` to restart the child process. This is of limited use. #46312 (Alexey Milovidov).
- Get rid of unnecessary build for standalone clickhouse-keeper. #46367 (Mikhail f. Shiryaev).
- If the environment variable `CLICKHOUSE_DOCKER_RESTART_ON_EXIT` is set to 1, the Docker container will run `clickhouse-server` as a child instead of the first process, and restart it when it exits. #46391 (Alexey Milovidov).
- Some time ago the ccache compression was changed to `zst`, but `gz` archives are downloaded by default. This is fixed by prioritizing the zst archive. #46490 (Mikhail f. Shiryaev).
- Raised the minimum Clang version needed to build ClickHouse from 12 to 15. #46710 (Robert Schulze).
Bug Fix (user-visible misbehavior in official stable or prestable release)
- Flush data exactly by `rabbitmq_flush_interval_ms` or by `rabbitmq_max_block_size` in `StorageRabbitMQ`. Closes #42389. Closes #45160. #44404 (Kseniia Sumarokova).
- Use PODArray to render in sparkBar function, so we can control the memory usage. Close #44467. #44489 (Duc Canh Le).
- Fix functions (quantilesExactExclusive, quantilesExactInclusive) returning unsorted array elements. #45379 (wujunfu).
- Fix uncaught exception in HTTPHandler when open telemetry is enabled. #45456 (Frank Chen).
- Don't infer Dates from 8 digit numbers. It could lead to wrong data being read. #45581 (Kruglov Pavel).
- Fixes to correctly use `odbc_bridge_use_connection_pooling` setting. #45591 (Bharat Nallan).
- When the callback in the cache is called, it is possible that this cache is destructed. To keep it safe, we capture members by value. It's also safe for task schedule because it will be deactivated before storage is destroyed. Resolve #45548. #45601 (Han Fei).
- Fix data corruption when codecs Delta or DoubleDelta are combined with codec Gorilla. #45615 (Robert Schulze).
- Correctly check types when using N-gram bloom filter index to avoid invalid reads. #45617 (Antonio Andelic).
- A couple of seg faults have been reported around `c-ares`. All of the recent stack traces observed fail on inserting into `std::unordered_set<>`. I believe I have found the root cause of this, it seems to be unprocessed queries. Prior to this PR, CH calls `poll` to wait on the file descriptors in the `c-ares` channel. According to the poll docs, a negative return value means an error has occurred. Because of this, we would abort the execution and return failure. The problem is that `poll` will also return a negative value if a system interrupt occurs. A system interrupt does not mean the processing has failed or ended, but we would abort it anyway because we were checking for negative values. Once the execution is aborted, the whole stack is destroyed, which includes the `std::unordered_set<std::string>` passed to the `void *` parameter of the c-ares callback. Once c-ares completed the request, the callback would be invoked and would access an invalid memory address, causing a segfault. #45629 (Arthur Passos).
- Fix key description when encountering duplicate primary keys. This can happen in projections. See #45590 for details. #45686 (Amos Bird).
- Set compression method and level for backup. Closes #45690. #45737 (Pradeep Chhetri).
- Should use `select_query_typed.limitByOffset()` instead of `select_query_typed.limitOffset()`. #45817 (刘陶峰).
- When using the experimental analyzer, queries like `SELECT number FROM numbers(100) LIMIT 10 OFFSET 10;` get wrong results (empty result for this SQL). That is caused by an unnecessary offset step added by the planner. #45822 (刘陶峰).
- Backward compatibility - allow implicit narrowing conversion from UInt64 to IPv4 - required for "INSERT ... VALUES ..." expression. #45865 (Yakov Olkhovskiy).
- Bugfix IPv6 parser for mixed ip4 address with missed first octet (like `::.1.2.3`). #45871 (Yakov Olkhovskiy).
- Add the `query_kind` column to the `system.processes` table and the `SHOW PROCESSLIST` query. Remove duplicate code. It fixes a bug: the global configuration parameter `max_concurrent_select_queries` was not respected to queries with `INTERSECT` or `EXCEPT` chains. #45872 (Alexey Milovidov).
- Fix crash in a function `stochasticLinearRegression`. Found by WingFuzz. #45985 (Nikolai Kochetov).
- Fix crash in `SELECT` queries with `INTERSECT` and `EXCEPT` modifiers that read data from tables with enabled sparse columns (controlled by setting `ratio_of_defaults_for_sparse_serialization`). #45987 (Anton Popov).
- Fix read in order optimization for DESC sorting with FINAL, close #45815. #46009 (Vladimir C).
- Fix reading of non existing nested columns with multiple level in compact parts. #46045 (Azat Khuzhin).
- Fix elapsed column in system.processes (10x error). #46047 (Azat Khuzhin).
- Follow-up fix for Replace domain IP types (IPv4, IPv6) with native https://github.com/ClickHouse/ClickHouse/pull/43221. #46087 (Yakov Olkhovskiy).
- Fix environment variable substitution in the configuration when a parameter already has a value. This closes #46131. This closes #9547. #46144 (pufit).
- Fix incorrect predicate push down with grouping sets. Closes #45947. #46151 (flynn).
- Fix possible pipeline stuck error on `fulls_sorting_join` with constant keys. #46175 (Vladimir C).
- Never rewrite tuple functions as literals during formatting to avoid incorrect results. #46232 (Salvatore Mesoraca).
- Fix possible out of bounds error while reading LowCardinality(Nullable) in Arrow format. #46270 (Kruglov Pavel).
- Fix `SYSTEM UNFREEZE` queries failing with the exception `CANNOT_PARSE_INPUT_ASSERTION_FAILED`. #46325 (Aleksei Filatov).
- Fix possible crash which can be caused by an integer overflow while deserializing aggregating state of a function that stores HashTable. #46349 (Nikolai Kochetov).
- Fix possible `LOGICAL_ERROR` in asynchronous inserts with invalid data sent in format `VALUES`. #46350 (Anton Popov).
- Fixed a LOGICAL_ERROR on an attempt to execute `ALTER ... MOVE PART ... TO TABLE`. This type of query was never actually supported. #46359 (Alexander Tokmakov).
- Fix s3Cluster schema inference in parallel distributed insert select when `parallel_distributed_insert_select` is enabled. #46381 (Kruglov Pavel).
- Fix queries like `ALTER TABLE ... UPDATE nested.arr1 = nested.arr2 ...`, where `arr1` and `arr2` are fields of the same `Nested` column. #46387 (Anton Popov).
- Scheduler may fail to schedule a task. If it happens, the whole MultiPartUpload should be aborted and `UploadHelper` must wait for already scheduled tasks. #46451 (Dmitry Novik).
- Fix PREWHERE for Merge with different default types (fixes some `NOT_FOUND_COLUMN_IN_BLOCK` when the default type for the column differs, also allow `PREWHERE` when the type of column is the same across tables, and prohibit it, only if it differs). #46454 (Azat Khuzhin).
- Fix a crash that could happen when constant values are used in `ORDER BY`. Fixes #46466. #46493 (Nikolai Kochetov).
- Do not throw exception if `disk` setting was specified on query level, but `storage_policy` was specified in config merge tree settings section. `disk` will override setting from config. #46533 (Kseniia Sumarokova).
- Fix an invalid processing of constant `LowCardinality` argument in function `arrayMap`. This bug could lead to a segfault in release, and logical error `Bad cast` in debug build. #46569 (Alexey Milovidov).
- fixes #46557. #46611 (Alexander Gololobov).
- Fix endless restarts of clickhouse-server systemd unit if server cannot start within 1m30sec (Disable timeout logic for starting clickhouse-server from systemd service). #46613 (Azat Khuzhin).
- Memory buffers allocated during asynchronous inserts were deallocated in the global context, and MemoryTracker counters for the corresponding user and query were not updated correctly. That led to false positive OOM exceptions. #46622 (Dmitry Novik).
- Fix totals and extremes with constants in clickhouse-local. Closes #43831. #46669 (Kruglov Pavel).
- Handle `input_format_null_as_default` for nested types. #46725 (Azat Khuzhin).
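The `c-ares` entry in this section (#45629) turns on one detail: a negative return value from `poll` can mean "interrupted by a signal" rather than a real failure. A common way to handle this is to poll again when the error is `EINTR`. The Python sketch below models that pattern with a fake poll function. Whether the PR fixes the issue in exactly this way is an assumption, and `wait_for_events` and `poll_once` are hypothetical names.

```python
import errno


def wait_for_events(poll_once, max_retries=100):
    """Call `poll_once` until it returns something other than an EINTR failure.

    `poll_once` is a hypothetical stand-in for poll(2): it returns a pair
    (return_code, errno_value), where return_code < 0 signals a failure.
    """
    for _ in range(max_retries):
        return_code, error = poll_once()
        if return_code < 0 and error == errno.EINTR:
            continue  # interrupted by a signal: not a real failure, poll again
        return return_code
    raise TimeoutError("poll kept being interrupted")


# Fake poll: interrupted twice, then reports one ready descriptor.
outcomes = iter([(-1, errno.EINTR), (-1, errno.EINTR), (1, 0)])
print(wait_for_events(lambda: next(outcomes)))
```

Treating the interrupted call as a retry keeps the caller's stack, and everything the pending callbacks point to, alive until the request actually completes.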
Bug-fix
- Updated to not clear on_expression from table_join as it is used by future analyze runs. Resolves #45185. #46487 (SmitaRKulkarni).
Build Improvement
- Fixed endian issue in snappy library for s390x. #45670 (Harry Lee).
- Fixed endian issue in CityHash for s390x. #46096 (Harry Lee).
- Fixed Functional Test 00900_long_parquet for S390x. #46181 (Sanjam Panda).
- Fixed endian issues in SQL hash functions on s390x architectures. #46495 (Harry Lee).
NO CL ENTRY
- NO CL ENTRY: 'Revert "Add check for running workflows to merge_pr.py"'. #45802 (Mikhail f. Shiryaev).
- NO CL ENTRY: 'Revert "Improve behaviour of conversion into Date for boundary value 65535"'. #46007 (Antonio Andelic).
- NO CL ENTRY: 'Revert "Allow vertical merges from compact to wide parts"'. #46236 (Anton Popov).
- NO CL ENTRY: 'Revert "Beter diagnostics from http in clickhouse-test"'. #46301 (Alexander Tokmakov).
NOT FOR CHANGELOG / INSIGNIFICANT
- Revert "Merge pull request #38212 from azat/no-stress" #38750 (Azat Khuzhin).
- More interesting settings for Stress Tests #41534 (Alexander Tokmakov).
- Attempt to fix 'Local: No offset stored message' from Kafka #42391 (filimonov).
- Analyzer SETTINGS push down #42976 (Maksim Kita).
- Simply filesystem helpers to check is-readable/writable/executable #43405 (Azat Khuzhin).
- Add CPU flamegraphs for perf tests #43529 (Azat Khuzhin).
- More robust CI parsers #44226 (Azat Khuzhin).
- Fix error message for a broken distributed batches ("While sending batch") #44907 (Azat Khuzhin).
- Catch exceptions in BackgroundSchedulePool #44923 (Azat Khuzhin).
- Add encryption support to OpenSSL #45258 (Boris Kuschel).
- Revert code in TreeRewriter for proper column order for UNION #45282 (Azat Khuzhin).
- Fix no shared id during drop for the fourth time #45363 (alesapin).
- HashedDictionary sharded fix nullable values #45396 (Maksim Kita).
- Another attempt to fix automerge, or at least to have debug footprint #45476 (Mikhail f. Shiryaev).
- Simplify binary locating in clickhouse-test #45484 (Azat Khuzhin).
- Fix race in NuRaft's asio listener #45511 (Antonio Andelic).
- Make ColumnNode::isEqualImpl more strict #45518 (Dmitry Novik).
- Fix krb5 for OpenSSL #45519 (Boris Kuschel).
- s390x build support #45520 (Suzy Wang).
- Better formatting for exception messages 2 #45527 (Alexander Tokmakov).
- Try to fix test `test_storage_s3/test.py::test_wrong_s3_syntax` (race in `StorageS3`) #45529 (Anton Popov).
- Analyzer add test for CREATE TABLE AS SELECT #45533 (Maksim Kita).
- LowCardinality insert fix #45585 (Maksim Kita).
- Update 02482_load_parts_refcounts.sh #45604 (Alexander Tokmakov).
- Extend assertion in buildPushingToViewsChain() to respect is_detached #45610 (Azat Khuzhin).
- Remove useless code #45612 (Anton Popov).
- Improve "at least part X is missing" error message #45613 (Azat Khuzhin).
- Refactoring of code near merge tree parts #45619 (Anton Popov).
- Update version after release #45634 (Mikhail f. Shiryaev).
- Update version_date.tsv and changelogs after v23.1.1.3077-stable #45635 (robot-clickhouse).
- Trim refs/tags/ from GITHUB_TAG in release workflow #45636 (Mikhail f. Shiryaev).
- Update version_date.tsv and changelogs after v22.10.7.13-stable #45637 (robot-clickhouse).
- Improve release script #45657 (Mikhail f. Shiryaev).
- Suppress TOO_MANY_PARTS in BC check #45691 (Alexander Tokmakov).
- Fix build #45692 (Alexander Tokmakov).
- Add recordings for 23.1 and Tel Aviv #45695 (Tyler Hannan).
- Integrate IO scheduler with buffers for remote reads and writes #45711 (Sergei Trifonov).
- Add missing SYSTEM FLUSH LOGS for clickhouse-test #45713 (Azat Khuzhin).
- tests: add missing allow_suspicious_codecs in 02536_delta_gorilla_corruption (fixes fasttest) #45735 (Azat Khuzhin).
- Improve MEMORY_LIMIT_EXCEEDED exception message #45743 (Dmitry Novik).
- Fix style and typo #45744 (Alexey Milovidov).
- Update version_date.tsv and changelogs after v22.8.13.20-lts #45749 (robot-clickhouse).
- Update version_date.tsv and changelogs after v22.11.5.15-stable #45754 (robot-clickhouse).
- Update version_date.tsv and changelogs after v23.1.2.9-stable #45755 (robot-clickhouse).
- Docs: Fix formatting #45756 (Robert Schulze).
- Fix typo + add boringssl comment #45757 (Robert Schulze).
- Fix flaky test, @alesapin please help! #45759 (Alexey Milovidov).
- Remove trash #45760 (Alexey Milovidov).
- Fix Flaky Check #45765 (Alexey Milovidov).
- Update dictionary.md #45775 (Derek Chia).
- Added a test for multiple ignore subqueries with nested select #45784 (SmitaRKulkarni).
- Support DELETE ON CLUSTER #45786 (Alexander Gololobov).
- outdated parts are loading async, need to wait for them after attach #45787 (Sema Checherinda).
- Fix bug in tables drop which can lead to potential query hung #45791 (alesapin).
- Fix race condition on a part check cancellation #45793 (Alexander Tokmakov).
- Do not restrict count() query to 1 thread in isStorageTouchedByMutations() #45794 (Alexander Gololobov).
- Fix test `test_azure_blob_storage_zero_copy_replication` (memory leak in azure sdk) #45796 (Anton Popov).
- Add check for running workflows to merge_pr.py #45801 (Mikhail f. Shiryaev).
- Add check for running workflows to merge_pr.py #45803 (Mikhail f. Shiryaev).
- Fix flaky test `02531_two_level_aggregation_bug.sh` #45806 (Alexey Milovidov).
- Minor doc follow-up to #45382 #45816 (Robert Schulze).
- Get rid of progress timestamps in release publishing #45818 (Mikhail f. Shiryaev).
- Make separate DROP_PART log entry type #45821 (Alexander Tokmakov).
- Do not cancel created prs #45823 (Mikhail f. Shiryaev).
- Fix ASTQualifiedAsterisk cloning #45829 (Raúl Marín).
- Update 02540_duplicate_primary_key.sql #45846 (Alexander Tokmakov).
- Proper includes for ConnectionTimeoutsContext.h #45848 (Raúl Marín).
- Fix minor mistake after refactoring #45857 (Anton Popov).
- Fix flaky ttl_replicated test (remove sleep) #45858 (alesapin).
- Add some context to stress test failures #45869 (Alexander Tokmakov).
- Fix clang-tidy error in Keeper `Changelog` #45888 (Antonio Andelic).
- do not block merges when old parts are dropping in drop queries #45889 (Sema Checherinda).
- do not run wal on remote disks #45907 (Sema Checherinda).
- Dashboard improvements #45935 (Kevin Zhang).
- Better context for stress tests failures #45937 (Alexander Tokmakov).
- Fix IO URing #45940 (Alexey Milovidov).
- Docs: Remove obsolete query result cache page #45958 (Robert Schulze).
- Add necessary dependency for sanitizers #45959 (Mikhail f. Shiryaev).
- Update AggregateFunctionSparkbar #45961 (Vladimir C).
- Update cherrypick_pr to get mergeable state #45972 (Mikhail f. Shiryaev).
- Add "final" specifier to some classes #45973 (Robert Schulze).
- Improve local running of cherry_pick.py #45980 (Mikhail f. Shiryaev).
- Properly detect changes in Rust code and recompile Rust libraries #45981 (Azat Khuzhin).
- Avoid leaving symbols leftovers for query fuzzy search #45983 (Azat Khuzhin).
- Fix basic functionality with type `Object` and new analyzer #45992 (Anton Popov).
- Check dynamic columns of part before its commit #45995 (Anton Popov).
- Minor doc fixes for inverted index #46004 (Robert Schulze).
- Fix terribly broken, fragile and potentially cyclic linking #46006 (Robert Schulze).
- Docs: Mention time zone randomization #46008 (Robert Schulze).
- Analyzer limit offset test rename #46011 (Maksim Kita).
- Update version_date.tsv and changelogs after v23.1.3.5-stable #46012 (robot-clickhouse).
- Update sorting properties after reading in order applied #46014 (Igor Nikonov).
- Fix disabled by mistake hung check #46020 (Alexander Tokmakov).
- Fix memory leak at creation of curl connection in azure sdk #46025 (Anton Popov).
- Add checks for installable packages to workflows #46036 (Mikhail f. Shiryaev).
- Fix data race in BACKUP #46040 (Azat Khuzhin).
- Dump sanitizer errors in the integration tests logs #46041 (Azat Khuzhin).
- Temporarily disable one rabbitmq flaky test #46052 (Kseniia Sumarokova).
- Remove unnecessary execute() while evaluating a constant expression. #46058 (Vitaly Baranov).
- Polish S3 client #46070 (Antonio Andelic).
- Smallish follow-up to #46057 #46072 (Robert Schulze).
- Fix 00002_log_and_exception_messages_formatting #46077 (Alexander Tokmakov).
- Disable temporarily rabbitmq tests which use channel.startConsuming() #46078 (Kseniia Sumarokova).
- Update yarn packages for dev branches #46079 (Mikhail f. Shiryaev).
- Add helping logging to auto-merge script #46080 (Mikhail f. Shiryaev).
- Simplify code around storages s3/hudi/delta-lake #46083 (Kseniia Sumarokova).
- Fix build with `-DENABLE_LIBURING=0` (or `-DENABLE_LIBRARIES=0`) #46088 (Robert Schulze).
- Add also last messages from stdout/stderr/debuglog in clickhouse-test #46090 (Azat Khuzhin).
- Sanity assertions for closing file descriptors #46091 (Azat Khuzhin).
- Fix flaky rabbitmq test #46107 (Kseniia Sumarokova).
- Fix test_merge_tree_azure_blob_storage::test_zero_copy_replication test #46108 (Azat Khuzhin).
- allow_drop_detached requires an argument #46110 (Sema Checherinda).
- Fix fault injection in copier and test_cluster_copier flakiness #46120 (Azat Khuzhin).
- Update liburing CMakeLists.txt #46127 (Nikolay Degterinsky).
- Use BAD_ARGUMENTS over LOGICAL_ERROR for schema inference error file() over fd #46132 (Azat Khuzhin).
- Stricter warnings + fix whitespaces in poco #46133 (Robert Schulze).
- Fix dependency checks #46138 (Vitaly Baranov).
- Interpret `cluster_name` identifier in `s3Cluster` function as literal #46143 (Nikolay Degterinsky).
- Remove flaky test #46149 (Alexey Milovidov).
- Fix spelling + duplicate includes in poco #46155 (Robert Schulze).
- Add 00002_log_and_exception_messages_formatting back #46156 (Alexander Tokmakov).
- Fix clickhouse/clickhouse-server description to make it in sync #46159 (Mikhail f. Shiryaev).
- Complain about missing Yasm at configure time instead of build time #46162 (Robert Schulze).
- Update Dockerfile.ubuntu #46173 (Alexander Tokmakov).
- Cleanup disk unittest #46179 (Sergei Trifonov).
- Update 01513_optimize_aggregation_in_order_memory_long.sql #46180 (Alexander Tokmakov).
- Make a bug in HTTP interface less annoying #46183 (Alexander Tokmakov).
- Fix write buffer destruction order for vertical merge. #46205 (Nikolai Kochetov).
- fix typo #46207 (Sergei Trifonov).
- increase a time gap between insert and ttl move #46233 (Sema Checherinda).
- Make `test_replicated_merge_tree_s3_restore` less flaky #46242 (Alexander Tokmakov).
- Fix test_distributed_ddl_parallel #46243 (Alexander Tokmakov).
- Update 00564_versioned_collapsing_merge_tree.sql #46245 (Alexander Tokmakov).
- Optimize docker binary-builder #46246 (Mikhail f. Shiryaev).
- Update Curl to 7.87.0 #46248 (Boris Kuschel).
- Upgrade libxml2 to address CVE-2022-40303 CVE-2022-40304 #46249 (larryluogit).
- Run clang-format over poco #46259 (Robert Schulze).
- Suppress "Container already exists" in BC check #46260 (Alexander Tokmakov).
- Fix failure description for hung check #46267 (Alexander Tokmakov).
- Add upcoming Events #46271 (Tyler Hannan).
- coordination: do not allow election_timeout_lower_bound_ms > election_timeout_upper_bound_ms #46274 (Salvatore Mesoraca).
- fix data race between check table request and background checker #46278 (Sema Checherinda).
- Try to make 02346_full_text_search less flaky #46279 (Robert Schulze).
- Better diagnostics from http in clickhouse-test #46281 (Alexander Tokmakov).
- Add more logging to RabbitMQ (to help debug tests) #46283 (Kseniia Sumarokova).
- Fix window view test #46285 (Kseniia Sumarokova).
- suppressing test inaccuracy 00738_lock_for_inner_table #46287 (Sema Checherinda).
- Simplify ATTACH MergeTree table FROM S3 in tests #46288 (Azat Khuzhin).
- Update RabbitMQProducer.cpp #46295 (Kseniia Sumarokova).
- Fix macOS compilation due to sprintf #46298 (Jordi Villar).
- Slightly improve error message for required Yasm assembler #46328 (Robert Schulze).
- Unifdef unused parts of poco #46329 (Robert Schulze).
- Trigger automerge on approved PRs #46332 (Mikhail f. Shiryaev).
- Wait for background tasks in ~UploadHelper #46334 (Nikolai Kochetov).
- Fix flaky test_storage_rabbitmq::test_rabbitmq_address #46337 (Kseniia Sumarokova).
- Extract common logic for S3 #46339 (Antonio Andelic).
- Update cluster.py #46340 (Kseniia Sumarokova).
- Try to stabilize test 02346_full_text_search.sql #46344 (Robert Schulze).
- Remove an unused argument #46346 (Alexander Tokmakov).
- fix candidate selection #46347 (Sema Checherinda).
- Do not pollute logs in clickhouse-test #46361 (Azat Khuzhin).
- Do not continue perf tests in case of exception in create_query/fill_query #46362 (Azat Khuzhin).
- Minor fix in files locating for Bugfix validate check #46368 (Vladimir C).
- Temporarily disable test_rabbitmq_overloaded_insert #46403 (Kseniia Sumarokova).
- Fix test test_rabbitmq_overloaded_insert #46404 (Kseniia Sumarokova).
- Fix stress test #46405 (Kseniia Sumarokova).
- Fix stress tests statuses #46406 (Alexander Tokmakov).
- Follow-up to #46168 #46409 (Kseniia Sumarokova).
- Fix noisy log messages #46410 (Alexander Tokmakov).
- Docs: Clarify query parameters #46419 (Robert Schulze).
- Make tests with window view less bad #46421 (Alexander Tokmakov).
- Move MongoDB and PostgreSQL sources to Sources folder #46422 (Nikolay Degterinsky).
- Another fix for cluster copier #46433 (Antonio Andelic).
- Update version_date.tsv and changelogs after v22.3.18.37-lts #46436 (robot-clickhouse).
- Fix a backup test #46449 (Vitaly Baranov).
- Do not fetch submodules in release.py #46450 (Mikhail f. Shiryaev).
- resolve race in getCSNAndAssert #46452 (Sema Checherinda).
- move database credential inputs to the center on initial load #46455 (Kevin Zhang).
- Improve install_check.py #46458 (Mikhail f. Shiryaev).
- Change logging level of a verbose message to Trace #46459 (Alexander Tokmakov).
- Analyzer planner fixes before enable by default #46471 (Maksim Kita).
- Fix some flaky integration tests #46478 (Alexander Tokmakov).
- Allow to override host for client connection credentials #46480 (Azat Khuzhin).
- Try fix flaky test test_parallel_distributed_insert_select_with_schema_inference #46488 (Kruglov Pavel).
- Planner filter push down optimization fix #46494 (Maksim Kita).
- Fix 01161_all_system_tables test flakiness #46499 (Azat Khuzhin).
- Compress tar archives with zstd in integration tests #46516 (Azat Khuzhin).
- chore: bump testcontainers-go to 0.18.0 #46518 (Manuel de la Peña).
- Rollback unnecessary sync because of checking exit code #46520 (Mikhail f. Shiryaev).
- Fix stress test #46521 (Kseniia Sumarokova).
- Add myrrc to trusted contributors #46526 (Anton Popov).
- fix style #46530 (flynn).
- Autoupdate keeper dockerfile #46535 (Mikhail f. Shiryaev).
- Fixes for OpenSSL and s390x #46546 (Boris Kuschel).
- enable async-insert-max-query-number only if async_insert_deduplicate #46549 (Han Fei).
- Remove extra try/catch for QueryState/LocalQueryState reset #46552 (Azat Khuzhin).
- Whitespaces #46553 (Alexey Milovidov).
- fix build without avro #46554 (flynn).
- Inhibit randomization in test `01551_mergetree_read_in_order_spread.sql` #46562 (Alexey Milovidov).
- Remove PVS-Studio #46565 (Alexey Milovidov).
- Inhibit settings randomization in `01304_direct_io_long.sh` #46566 (Alexey Milovidov).
- Fix double whitespace in comment in test #46567 (Alexey Milovidov).
- Rename test #46568 (Alexey Milovidov).
- Fix ASTAsterisk::clone() #46570 (Nikolay Degterinsky).
- Small update of sparkbar docs #46579 (Robert Schulze).
- Fix flakiness of expect tests for clickhouse-client by avoiding history overlap #46582 (Azat Khuzhin).
- Always log rollback for release.py #46586 (Mikhail f. Shiryaev).
- Increase table retries in `test_cluster_copier` #46590 (Antonio Andelic).
- Update 00170_s3_cache.sql #46593 (Kseniia Sumarokova).
- Fix rabbitmq test #46595 (Kseniia Sumarokova).
- Fix meilisearch test flakiness #46596 (Kseniia Sumarokova).
- Fix dependencies for InstallPackagesTestAarch64 #46597 (Mikhail f. Shiryaev).
- Update compare.sh #46599 (Alexander Tokmakov).
- update llvm-project to fix gwp-asan #46600 (Han Fei).
- Temporarily disable test_rabbitmq_overloaded_insert #46608 (Kseniia Sumarokova).
- Update 01565_reconnect_after_client_error to not expect explicit reconnect #46619 (Azat Khuzhin).
- Inhibit `index_granularity_bytes` randomization in some tests #46626 (Alexey Milovidov).
- Fix coverity #46629 (Alexey Milovidov).
- Fix 01179_insert_values_semicolon test #46636 (Azat Khuzhin).
- Fix typo in read prefetch #46640 (Nikita Taranov).
- Avoid OOM in perf tests #46641 (Azat Khuzhin).
- Fix: remove redundant sorting optimization #46642 (Igor Nikonov).
- Fix flaky test 01710_normal_projections #46645 (Kruglov Pavel).
- Update postgres_utility.py #46656 (Kseniia Sumarokova).
- Fix integration test: terminate old version without wait #46660 (alesapin).
- Break Stress tests #46663 (Alexander Tokmakov).
- Get rid of legacy DocsReleaseChecks #46665 (Mikhail f. Shiryaev).
- fix layout issues in dashboard.html #46671 (Kevin Zhang).
- Fix Stress tests #46683 (Alexander Tokmakov).
- Disable flaky test_ttl_move_memory_usage.py #46687 (Alexander Tokmakov).
- BackgroundSchedulePool should not have any query context #46709 (Azat Khuzhin).
- Better exception message during Tuple JSON deserialization #46727 (Kruglov Pavel).
- Poco: POCO_HAVE_INT64 is always defined #46728 (Robert Schulze).
- Fix SonarCloud Job #46732 (Julio Jimenez).
- Remove unused MergeTreeReadTask::remove_prewhere_column #46744 (Alexander Gololobov).
- On out-of-space `at` returns error, we must terminate still #46754 (Mikhail f. Shiryaev).
- CI: don't run builds/tests when CHANGELOG.md or README.md were modified #46773 (Robert Schulze).
- Cosmetics in hashing code #46780 (Robert Schulze).