diff --git a/CHANGELOG.md b/CHANGELOG.md index 34d11c6a2cd..c7c054a53a8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,114 @@ +### ClickHouse release v21.8, 2021-08-11 + +#### New Features + +* Collect common system metrics (in `system.asynchronous_metrics` and `system.asynchronous_metric_log`) on CPU usage, disk usage, memory usage, IO, network, files, load average, CPU frequencies, thermal sensors, EDAC counters, system uptime; also added metrics about the scheduling jitter and the time spent collecting the metrics. It works similar to `atop` in ClickHouse and allows access to monitoring data even if you have no additional tools installed. Close [#9430](https://github.com/ClickHouse/ClickHouse/issues/9430). [#24416](https://github.com/ClickHouse/ClickHouse/pull/24416) ([Yegor Levankov](https://github.com/elevankoff)). +* Add new functions `leftPad()`, `rightPad()`, `leftPadUTF8()`, `rightPadUTF8()`. [#26075](https://github.com/ClickHouse/ClickHouse/pull/26075) ([Vitaly Baranov](https://github.com/vitlibar)). +* Add the `FIRST` keyword to the `ADD INDEX` command to be able to add the index at the beginning of the indices list. [#25904](https://github.com/ClickHouse/ClickHouse/pull/25904) ([xjewer](https://github.com/xjewer)). +* Introduce `system.data_skipping_indices` table containing information about existing data skipping indices. Close [#7659](https://github.com/ClickHouse/ClickHouse/issues/7659). [#25693](https://github.com/ClickHouse/ClickHouse/pull/25693) ([Dmitry Novik](https://github.com/novikd)). +* Add `bin`/`unbin` functions. [#25609](https://github.com/ClickHouse/ClickHouse/pull/25609) ([zhaoyu](https://github.com/zxc111)). +* Support `Map` and `(U)Int128`, `(U)Int256` types in `mapAdd` and `mapSubtract` functions. [#25596](https://github.com/ClickHouse/ClickHouse/pull/25596) ([Ildus Kurbangaliev](https://github.com/ildus)). +* Support `DISTINCT ON (columns)` expression, close [#25404](https://github.com/ClickHouse/ClickHouse/issues/25404).
[#25589](https://github.com/ClickHouse/ClickHouse/pull/25589) ([Zijie Lu](https://github.com/TszKitLo40)). +* Add support for a part of SQLJSON standard. [#24148](https://github.com/ClickHouse/ClickHouse/pull/24148) ([l1tsolaiki](https://github.com/l1tsolaiki)). +* Add MaterializedPostgreSQL table engine and database engine. This database engine allows replicating a whole database or any subset of database tables. [#20470](https://github.com/ClickHouse/ClickHouse/pull/20470) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Add an ability to reset a custom setting to default and remove it from the table's metadata. It allows rolling back the change without knowing the system/config's default. Closes [#14449](https://github.com/ClickHouse/ClickHouse/issues/14449). [#17769](https://github.com/ClickHouse/ClickHouse/pull/17769) ([xjewer](https://github.com/xjewer)). +* Render pipelines as graphs in Web UI if `EXPLAIN PIPELINE graph = 1` query is submitted. [#26067](https://github.com/ClickHouse/ClickHouse/pull/26067) ([alexey-milovidov](https://github.com/alexey-milovidov)). + +#### Performance Improvements + +* Compile aggregate functions. Use option `compile_aggregate_expressions` to enable it. [#24789](https://github.com/ClickHouse/ClickHouse/pull/24789) ([Maksim Kita](https://github.com/kitaisreal)). +* Improve latency of short queries that require reading from tables with many columns. [#26371](https://github.com/ClickHouse/ClickHouse/pull/26371) ([Anton Popov](https://github.com/CurtizJ)). + +#### Improvements + +* Use `Map` data type for system logs tables (`system.query_log`, `system.query_thread_log`, `system.processes`, `system.opentelemetry_span_log`). These tables will be auto-created with new data types. Virtual columns are created to support old queries. Closes [#18698](https://github.com/ClickHouse/ClickHouse/issues/18698). 
[#23934](https://github.com/ClickHouse/ClickHouse/pull/23934), [#25773](https://github.com/ClickHouse/ClickHouse/pull/25773) ([hexiaoting](https://github.com/hexiaoting), [sundy-li](https://github.com/sundy-li)). +* For a dictionary with a complex key containing only one attribute, allow not wrapping the key expression in tuple for functions `dictGet`, `dictHas`. [#26130](https://github.com/ClickHouse/ClickHouse/pull/26130) ([Maksim Kita](https://github.com/kitaisreal)). +* Implement function `bin`/`hex` from `AggregateFunction` states. [#26094](https://github.com/ClickHouse/ClickHouse/pull/26094) ([zhaoyu](https://github.com/zxc111)). +* Support arguments of `UUID` type for `empty` and `notEmpty` functions. `UUID` is empty if it is all zeros (nil UUID). Closes [#3446](https://github.com/ClickHouse/ClickHouse/issues/3446). [#25974](https://github.com/ClickHouse/ClickHouse/pull/25974) ([zhaoyu](https://github.com/zxc111)). +* Fix error with query `SET SQL_SELECT_LIMIT` in MySQL protocol. Closes [#17115](https://github.com/ClickHouse/ClickHouse/issues/17115). [#25972](https://github.com/ClickHouse/ClickHouse/pull/25972) ([Kseniia Sumarokova](https://github.com/kssenii)). +* More instrumentation for network interaction: add counters for recv/send bytes; add gauges for recvs/sends. Added missing documentation. Close [#5897](https://github.com/ClickHouse/ClickHouse/issues/5897). [#25962](https://github.com/ClickHouse/ClickHouse/pull/25962) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Add setting `optimize_move_to_prewhere_if_final`. If query has `FINAL`, the optimization `move_to_prewhere` will be enabled only if both `optimize_move_to_prewhere` and `optimize_move_to_prewhere_if_final` are enabled. Closes [#8684](https://github.com/ClickHouse/ClickHouse/issues/8684). [#25940](https://github.com/ClickHouse/ClickHouse/pull/25940) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Allow complex quoted identifiers of JOINed tables. 
Close [#17861](https://github.com/ClickHouse/ClickHouse/issues/17861). [#25924](https://github.com/ClickHouse/ClickHouse/pull/25924) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Add support for Unicode (e.g. Chinese, Cyrillic) components in `Nested` data types. Close [#25594](https://github.com/ClickHouse/ClickHouse/issues/25594). [#25923](https://github.com/ClickHouse/ClickHouse/pull/25923) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Allow `quantiles*` functions to work with `aggregate_functions_null_for_empty`. Close [#25892](https://github.com/ClickHouse/ClickHouse/issues/25892). [#25919](https://github.com/ClickHouse/ClickHouse/pull/25919) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Allow parameters for parametric aggregate functions to be arbitrary constant expressions (e.g., `1 + 2`), not just literals. It also allows using the query parameters (in parameterized queries like `{param:UInt8}`) inside parametric aggregate functions. Closes [#11607](https://github.com/ClickHouse/ClickHouse/issues/11607). [#25910](https://github.com/ClickHouse/ClickHouse/pull/25910) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Correctly throw the exception on the attempt to parse an invalid `Date`. Closes [#6481](https://github.com/ClickHouse/ClickHouse/issues/6481). [#25909](https://github.com/ClickHouse/ClickHouse/pull/25909) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Support for multiple includes in configuration. It is possible to include users configuration, remote server configuration from multiple sources. Simply place `<include />` element with `from_zk`, `from_env` or `incl` attribute, and it will be replaced with the substitution. [#24404](https://github.com/ClickHouse/ClickHouse/pull/24404) ([nvartolomei](https://github.com/nvartolomei)). +* Support for queries with a column named `"null"` (it must be specified in back-ticks or double quotes) and `ON CLUSTER`.
Closes [#24035](https://github.com/ClickHouse/ClickHouse/issues/24035). [#25907](https://github.com/ClickHouse/ClickHouse/pull/25907) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Support `LowCardinality`, `Decimal`, and `UUID` for `JSONExtract`. Closes [#24606](https://github.com/ClickHouse/ClickHouse/issues/24606). [#25900](https://github.com/ClickHouse/ClickHouse/pull/25900) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Convert history file from `readline` format to `replxx` format. [#25888](https://github.com/ClickHouse/ClickHouse/pull/25888) ([Azat Khuzhin](https://github.com/azat)). +* Fix bug which can lead to intersecting parts after `DROP PART` or background deletion of an empty part. [#25884](https://github.com/ClickHouse/ClickHouse/pull/25884) ([alesapin](https://github.com/alesapin)). +* Better handling of lost parts for `ReplicatedMergeTree` tables. Fixes rare inconsistencies in `ReplicationQueue`. Fixes [#10368](https://github.com/ClickHouse/ClickHouse/issues/10368). [#25820](https://github.com/ClickHouse/ClickHouse/pull/25820) ([alesapin](https://github.com/alesapin)). +* Allow starting clickhouse-client with unreadable working directory. [#25817](https://github.com/ClickHouse/ClickHouse/pull/25817) ([ianton-ru](https://github.com/ianton-ru)). +* Fix "No available columns" error for `Merge` storage. [#25801](https://github.com/ClickHouse/ClickHouse/pull/25801) ([Azat Khuzhin](https://github.com/azat)). +* MySQL Engine now supports the exchange of column comments between MySQL and ClickHouse. [#25795](https://github.com/ClickHouse/ClickHouse/pull/25795) ([Storozhuk Kostiantyn](https://github.com/sand6255)). +* Fix inconsistent behaviour of `GROUP BY` constant on empty set. Closes [#6842](https://github.com/ClickHouse/ClickHouse/issues/6842). [#25786](https://github.com/ClickHouse/ClickHouse/pull/25786) ([Kseniia Sumarokova](https://github.com/kssenii)). 
+* Cancel already running merges in partition on `DROP PARTITION` and `TRUNCATE` for `ReplicatedMergeTree`. Resolves [#17151](https://github.com/ClickHouse/ClickHouse/issues/17151). [#25684](https://github.com/ClickHouse/ClickHouse/pull/25684) ([tavplubix](https://github.com/tavplubix)). +* Support `ENUM` data type for MaterializeMySQL. [#25676](https://github.com/ClickHouse/ClickHouse/pull/25676) ([Storozhuk Kostiantyn](https://github.com/sand6255)). +* Support materialized and aliased columns in JOIN, close [#13274](https://github.com/ClickHouse/ClickHouse/issues/13274). [#25634](https://github.com/ClickHouse/ClickHouse/pull/25634) ([Vladimir C](https://github.com/vdimir)). +* Fix possible logical race condition between `ALTER TABLE ... DETACH` and background merges. [#25605](https://github.com/ClickHouse/ClickHouse/pull/25605) ([Azat Khuzhin](https://github.com/azat)). +* Make `NetworkReceiveElapsedMicroseconds` metric to correctly include the time spent waiting for data from the client to `INSERT`. Close [#9958](https://github.com/ClickHouse/ClickHouse/issues/9958). [#25602](https://github.com/ClickHouse/ClickHouse/pull/25602) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Support `TRUNCATE TABLE` for StorageS3 and StorageHDFS. Close [#25530](https://github.com/ClickHouse/ClickHouse/issues/25530). [#25550](https://github.com/ClickHouse/ClickHouse/pull/25550) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Support for dynamic reloading of config to change number of threads in pool for background jobs execution (merges, mutations, fetches). [#25548](https://github.com/ClickHouse/ClickHouse/pull/25548) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). +* Allow extracting of non-string element as string using `JSONExtract`. This is for [#25414](https://github.com/ClickHouse/ClickHouse/issues/25414). [#25452](https://github.com/ClickHouse/ClickHouse/pull/25452) ([Amos Bird](https://github.com/amosbird)).
+* Support regular expression in `Database` argument for `StorageMerge`. Close [#776](https://github.com/ClickHouse/ClickHouse/issues/776). [#25064](https://github.com/ClickHouse/ClickHouse/pull/25064) ([flynn](https://github.com/ucasfl)). +* Web UI: if the value looks like a URL, automatically generate a link. [#25965](https://github.com/ClickHouse/ClickHouse/pull/25965) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Make `sudo service clickhouse-server start` to work on systems with `systemd` like Centos 8. Close [#14298](https://github.com/ClickHouse/ClickHouse/issues/14298). Close [#17799](https://github.com/ClickHouse/ClickHouse/issues/17799). [#25921](https://github.com/ClickHouse/ClickHouse/pull/25921) ([alexey-milovidov](https://github.com/alexey-milovidov)). + +#### Bug Fixes + +* Fix incorrect `SET ROLE` in some cases. [#26707](https://github.com/ClickHouse/ClickHouse/pull/26707) ([Vitaly Baranov](https://github.com/vitlibar)). +* Fix potential `nullptr` dereference in window functions. Fix [#25276](https://github.com/ClickHouse/ClickHouse/issues/25276). [#26668](https://github.com/ClickHouse/ClickHouse/pull/26668) ([Alexander Kuzmenkov](https://github.com/akuzm)). +* Fix incorrect function names of `groupBitmapAnd/Or/Xor`. Fix [#26557](https://github.com/ClickHouse/ClickHouse/pull/26557) ([Amos Bird](https://github.com/amosbird)). +* Fix crash in rabbitmq shutdown in case rabbitmq setup was not started. Closes [#26504](https://github.com/ClickHouse/ClickHouse/issues/26504). [#26529](https://github.com/ClickHouse/ClickHouse/pull/26529) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Fix issues with `CREATE DICTIONARY` query if dictionary name or database name was quoted. Closes [#26491](https://github.com/ClickHouse/ClickHouse/issues/26491). [#26508](https://github.com/ClickHouse/ClickHouse/pull/26508) ([Maksim Kita](https://github.com/kitaisreal)). +* Fix broken name resolution after rewriting column aliases. 
Fix [#26432](https://github.com/ClickHouse/ClickHouse/issues/26432). [#26475](https://github.com/ClickHouse/ClickHouse/pull/26475) ([Amos Bird](https://github.com/amosbird)). +* Fix infinite non-joined block stream in `partial_merge_join` close [#26325](https://github.com/ClickHouse/ClickHouse/issues/26325). [#26374](https://github.com/ClickHouse/ClickHouse/pull/26374) ([Vladimir C](https://github.com/vdimir)). +* Fix possible crash when login as dropped user. Fix [#26073](https://github.com/ClickHouse/ClickHouse/issues/26073). [#26363](https://github.com/ClickHouse/ClickHouse/pull/26363) ([Vitaly Baranov](https://github.com/vitlibar)). +* Fix `optimize_distributed_group_by_sharding_key` for multiple columns (leads to incorrect result w/ `optimize_skip_unused_shards=1`/`allow_nondeterministic_optimize_skip_unused_shards=1` and multiple columns in sharding key expression). [#26353](https://github.com/ClickHouse/ClickHouse/pull/26353) ([Azat Khuzhin](https://github.com/azat)). + * `CAST` from `Date` to `DateTime` (or `DateTime64`) was not using the timezone of the `DateTime` type. It can also affect the comparison between `Date` and `DateTime`. Inference of the common type for `Date` and `DateTime` also was not using the corresponding timezone. It affected the results of function `if` and array construction. Closes [#24128](https://github.com/ClickHouse/ClickHouse/issues/24128). [#24129](https://github.com/ClickHouse/ClickHouse/pull/24129) ([Maksim Kita](https://github.com/kitaisreal)). +* Fixed rare bug in lost replica recovery that may cause replicas to diverge. [#26321](https://github.com/ClickHouse/ClickHouse/pull/26321) ([tavplubix](https://github.com/tavplubix)). +* Fix zstd decompression in case there are escape sequences at the end of internal buffer. Closes [#26013](https://github.com/ClickHouse/ClickHouse/issues/26013). [#26314](https://github.com/ClickHouse/ClickHouse/pull/26314) ([Kseniia Sumarokova](https://github.com/kssenii)). 
+* Fix logical error on join with totals, close [#26017](https://github.com/ClickHouse/ClickHouse/issues/26017). [#26250](https://github.com/ClickHouse/ClickHouse/pull/26250) ([Vladimir C](https://github.com/vdimir)). +* Remove excessive newline in `thread_name` column in `system.stack_trace` table. Fix [#24124](https://github.com/ClickHouse/ClickHouse/issues/24124). [#26210](https://github.com/ClickHouse/ClickHouse/pull/26210) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Fix `joinGet` with `LowCardinality` columns, close [#25993](https://github.com/ClickHouse/ClickHouse/issues/25993). [#26118](https://github.com/ClickHouse/ClickHouse/pull/26118) ([Vladimir C](https://github.com/vdimir)). +* Fix possible crash in `pointInPolygon` if the setting `validate_polygons` is turned off. [#26113](https://github.com/ClickHouse/ClickHouse/pull/26113) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Fix throwing exception when iterate over non-existing remote directory. [#26087](https://github.com/ClickHouse/ClickHouse/pull/26087) ([ianton-ru](https://github.com/ianton-ru)). +* Fix rare server crash because of `abort` in ZooKeeper client. Fixes [#25813](https://github.com/ClickHouse/ClickHouse/issues/25813). [#26079](https://github.com/ClickHouse/ClickHouse/pull/26079) ([alesapin](https://github.com/alesapin)). +* Fix wrong thread estimation for right subquery join in some cases. Close [#24075](https://github.com/ClickHouse/ClickHouse/issues/24075). [#26052](https://github.com/ClickHouse/ClickHouse/pull/26052) ([Vladimir C](https://github.com/vdimir)). +* Fixed incorrect `sequence_id` in MySQL protocol packets that ClickHouse sends on exception during query execution. It might cause MySQL client to reset connection to ClickHouse server. Fixes [#21184](https://github.com/ClickHouse/ClickHouse/issues/21184). [#26051](https://github.com/ClickHouse/ClickHouse/pull/26051) ([tavplubix](https://github.com/tavplubix)).
+* Fix possible mismatched header when using normal projection with `PREWHERE`. Fix [#26020](https://github.com/ClickHouse/ClickHouse/issues/26020). [#26038](https://github.com/ClickHouse/ClickHouse/pull/26038) ([Amos Bird](https://github.com/amosbird)). +* Fix formatting of type `Map` with integer keys to `JSON`. [#25982](https://github.com/ClickHouse/ClickHouse/pull/25982) ([Anton Popov](https://github.com/CurtizJ)). +* Fix possible deadlock during query profiler stack unwinding. Fix [#25968](https://github.com/ClickHouse/ClickHouse/issues/25968). [#25970](https://github.com/ClickHouse/ClickHouse/pull/25970) ([Maksim Kita](https://github.com/kitaisreal)). +* Fix crash on call `dictGet()` with bad arguments. [#25913](https://github.com/ClickHouse/ClickHouse/pull/25913) ([Vitaly Baranov](https://github.com/vitlibar)). +* Fixed `scram-sha-256` authentication for PostgreSQL engines. Closes [#24516](https://github.com/ClickHouse/ClickHouse/issues/24516). [#25906](https://github.com/ClickHouse/ClickHouse/pull/25906) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Fix extremely long backoff for background tasks when the background pool is full. Fixes [#25836](https://github.com/ClickHouse/ClickHouse/issues/25836). [#25893](https://github.com/ClickHouse/ClickHouse/pull/25893) ([alesapin](https://github.com/alesapin)). +* Fix ARM exception handling with non default page size. 
Fixes [#25512](https://github.com/ClickHouse/ClickHouse/issues/25512), [#25044](https://github.com/ClickHouse/ClickHouse/issues/25044), [#24901](https://github.com/ClickHouse/ClickHouse/issues/24901), [#23183](https://github.com/ClickHouse/ClickHouse/issues/23183), [#20221](https://github.com/ClickHouse/ClickHouse/issues/20221), [#19703](https://github.com/ClickHouse/ClickHouse/issues/19703), [#19028](https://github.com/ClickHouse/ClickHouse/issues/19028), [#18391](https://github.com/ClickHouse/ClickHouse/issues/18391), [#18121](https://github.com/ClickHouse/ClickHouse/issues/18121), [#17994](https://github.com/ClickHouse/ClickHouse/issues/17994), [#12483](https://github.com/ClickHouse/ClickHouse/issues/12483). [#25854](https://github.com/ClickHouse/ClickHouse/pull/25854) ([Maksim Kita](https://github.com/kitaisreal)). +* Fix sharding_key from column w/o function for `remote()` (before `select * from remote('127.1', system.one, dummy)` leads to `Unknown column: dummy, there are only columns .` error). [#25824](https://github.com/ClickHouse/ClickHouse/pull/25824) ([Azat Khuzhin](https://github.com/azat)). +* Fixed `Not found column ...` and `Missing column ...` errors when selecting from `MaterializeMySQL`. Fixes [#23708](https://github.com/ClickHouse/ClickHouse/issues/23708), [#24830](https://github.com/ClickHouse/ClickHouse/issues/24830), [#25794](https://github.com/ClickHouse/ClickHouse/issues/25794). [#25822](https://github.com/ClickHouse/ClickHouse/pull/25822) ([tavplubix](https://github.com/tavplubix)). +* Fix `optimize_skip_unused_shards_rewrite_in` for non-UInt64 types (may select incorrect shards eventually or throw `Cannot infer type of an empty tuple` or `Function tuple requires at least one argument`). [#25798](https://github.com/ClickHouse/ClickHouse/pull/25798) ([Azat Khuzhin](https://github.com/azat)). 
+* Fix rare bug with `DROP PART` query for `ReplicatedMergeTree` tables which can lead to error message `Unexpected merged part intersecting drop range`. [#25783](https://github.com/ClickHouse/ClickHouse/pull/25783) ([alesapin](https://github.com/alesapin)). +* Fix bug in `TTL` with `GROUP BY` expression which refuses to execute `TTL` after first execution in part. [#25743](https://github.com/ClickHouse/ClickHouse/pull/25743) ([alesapin](https://github.com/alesapin)). +* Allow StorageMerge to access tables with aliases. Closes [#6051](https://github.com/ClickHouse/ClickHouse/issues/6051). [#25694](https://github.com/ClickHouse/ClickHouse/pull/25694) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Fix slow dict join in some cases, close [#24209](https://github.com/ClickHouse/ClickHouse/issues/24209). [#25618](https://github.com/ClickHouse/ClickHouse/pull/25618) ([Vladimir C](https://github.com/vdimir)). +* Fix `ALTER MODIFY COLUMN` of columns, which participates in TTL expressions. [#25554](https://github.com/ClickHouse/ClickHouse/pull/25554) ([Anton Popov](https://github.com/CurtizJ)). +* Fix assertion in `PREWHERE` with non-UInt8 type, close [#19589](https://github.com/ClickHouse/ClickHouse/issues/19589). [#25484](https://github.com/ClickHouse/ClickHouse/pull/25484) ([Vladimir C](https://github.com/vdimir)). +* Fix some fuzzed msan crash. Fixes [#22517](https://github.com/ClickHouse/ClickHouse/issues/22517). [#26428](https://github.com/ClickHouse/ClickHouse/pull/26428) ([Nikolai Kochetov](https://github.com/KochetovNicolai)). +* Fix empty history file conversion. [#26589](https://github.com/ClickHouse/ClickHouse/pull/26589) ([Azat Khuzhin](https://github.com/azat)). +* Update `chown` cmd check in `clickhouse-server` docker entrypoint. It fixes error 'cluster pod restart failed (or timeout)' on kubernetes. [#26545](https://github.com/ClickHouse/ClickHouse/pull/26545) ([Ky Li](https://github.com/Kylinrix)). 
+ +#### Build/Testing/Packaging Improvements + +* Disabling TestFlows LDAP module due to test fails. [#26065](https://github.com/ClickHouse/ClickHouse/pull/26065) ([vzakaznikov](https://github.com/vzakaznikov)). +* Enabling all TestFlows modules and fixing some tests. [#26011](https://github.com/ClickHouse/ClickHouse/pull/26011) ([vzakaznikov](https://github.com/vzakaznikov)). +* Add new tests for checking access rights for columns used in filters (`WHERE` / `PREWHERE` / row policy) of the `SELECT` statement after changes in [#24405](https://github.com/ClickHouse/ClickHouse/pull/24405). [#25619](https://github.com/ClickHouse/ClickHouse/pull/25619) ([Vitaly Baranov](https://github.com/vitlibar)). + +#### Other + +* Add `clickhouse-keeper-converter` tool which allows converting zookeeper logs and snapshots into `clickhouse-keeper` snapshot format. [#25428](https://github.com/ClickHouse/ClickHouse/pull/25428) ([alesapin](https://github.com/alesapin)). + + + ### ClickHouse release v21.7, 2021-07-09 #### Backward Incompatible Change diff --git a/CMakeLists.txt b/CMakeLists.txt index d3cb5f70c83..1727caea766 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -271,12 +271,6 @@ endif() include(cmake/cpu_features.cmake) -option(ARCH_NATIVE "Add -march=native compiler flag. This makes your binaries non-portable but more performant code may be generated.") - -if (ARCH_NATIVE) - set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native") -endif () - # Asynchronous unwind tables are needed for Query Profiler. # They are already by default on some platforms but possibly not on all platforms. # Enable it explicitly. 
diff --git a/cmake/cpu_features.cmake b/cmake/cpu_features.cmake index d12eac2e3c4..46e42329958 100644 --- a/cmake/cpu_features.cmake +++ b/cmake/cpu_features.cmake @@ -5,109 +5,128 @@ include (CMakePushCheckState) cmake_push_check_state () -# gcc -dM -E -mno-sse2 - < /dev/null | sort > gcc-dump-nosse2 -# gcc -dM -E -msse2 - < /dev/null | sort > gcc-dump-sse2 -#define __SSE2__ 1 -#define __SSE2_MATH__ 1 +# The variables HAVE_* determine if compiler has support for the flag to use the corresponding instruction set. +# The options ENABLE_* determine if we will tell compiler to actually use the corresponding instruction set if compiler can do it. -# gcc -dM -E -msse4.1 - < /dev/null | sort > gcc-dump-sse41 -#define __SSE4_1__ 1 +# All of them are unrelated to the instruction set at the host machine +# (you can compile for newer instruction set on old machines and vice versa). -set (TEST_FLAG "-msse4.1") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - #include <smmintrin.h> - int main() { - auto a = _mm_insert_epi8(__m128i(), 0, 0); - (void)a; - return 0; - } -" HAVE_SSE41) -if (HAVE_SSE41) - set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") -endif () +option (ENABLE_SSSE3 "Use SSSE3 instructions on x86_64" 1) +option (ENABLE_SSE41 "Use SSE4.1 instructions on x86_64" 1) +option (ENABLE_SSE42 "Use SSE4.2 instructions on x86_64" 1) +option (ENABLE_PCLMULQDQ "Use pclmulqdq instructions on x86_64" 1) +option (ENABLE_POPCNT "Use popcnt instructions on x86_64" 1) +option (ENABLE_AVX "Use AVX instructions on x86_64" 0) +option (ENABLE_AVX2 "Use AVX2 instructions on x86_64" 0) -if (ARCH_PPC64LE) - set (COMPILER_FLAGS "${COMPILER_FLAGS} -maltivec -D__SSE2__=1 -DNO_WARN_X86_INTRINSICS") -endif () +option (ARCH_NATIVE "Add -march=native compiler flag. This makes your binaries non-portable but more performant code may be generated. This option overrides ENABLE_* options for specific instruction set. Highly not recommended to use."
0) -# gcc -dM -E -msse4.2 - < /dev/null | sort > gcc-dump-sse42 -#define __SSE4_2__ 1 +if (ARCH_NATIVE) + set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native") -set (TEST_FLAG "-msse4.2") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - #include <nmmintrin.h> - int main() { - auto a = _mm_crc32_u64(0, 0); - (void)a; - return 0; - } -" HAVE_SSE42) -if (HAVE_SSE42) - set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") -endif () +else () + set (TEST_FLAG "-mssse3") + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + #include <tmmintrin.h> + int main() { + __m64 a = _mm_abs_pi8(__m64()); + (void)a; + return 0; + } + " HAVE_SSSE3) + if (HAVE_SSSE3 AND ENABLE_SSSE3) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () -set (TEST_FLAG "-mssse3") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - #include <tmmintrin.h> - int main() { - __m64 a = _mm_abs_pi8(__m64()); - (void)a; - return 0; - } -" HAVE_SSSE3) -set (TEST_FLAG "-mavx") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - #include <immintrin.h> - int main() { - auto a = _mm256_insert_epi8(__m256i(), 0, 0); - (void)a; - return 0; - } -" HAVE_AVX) + set (TEST_FLAG "-msse4.1") + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + #include <smmintrin.h> + int main() { + auto a = _mm_insert_epi8(__m128i(), 0, 0); + (void)a; + return 0; + } + " HAVE_SSE41) + if (HAVE_SSE41 AND ENABLE_SSE41) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () -set (TEST_FLAG "-mavx2") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - #include <immintrin.h> - int main() { - auto a = _mm256_add_epi16(__m256i(), __m256i()); - (void)a; - return 0; - } -" HAVE_AVX2) + if (ARCH_PPC64LE) + set (COMPILER_FLAGS "${COMPILER_FLAGS} -maltivec -D__SSE2__=1 -DNO_WARN_X86_INTRINSICS") + endif () -set (TEST_FLAG "-mpclmul") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - #include <wmmintrin.h> - int main() { - auto a =
_mm_clmulepi64_si128(__m128i(), __m128i(), 0); - (void)a; - return 0; - } -" HAVE_PCLMULQDQ) + set (TEST_FLAG "-msse4.2") + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + #include <nmmintrin.h> + int main() { + auto a = _mm_crc32_u64(0, 0); + (void)a; + return 0; + } + " HAVE_SSE42) + if (HAVE_SSE42 AND ENABLE_SSE42) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () -# gcc -dM -E -mpopcnt - < /dev/null | sort > gcc-dump-popcnt -#define __POPCNT__ 1 + set (TEST_FLAG "-mpclmul") + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + #include <wmmintrin.h> + int main() { + auto a = _mm_clmulepi64_si128(__m128i(), __m128i(), 0); + (void)a; + return 0; + } + " HAVE_PCLMULQDQ) + if (HAVE_PCLMULQDQ AND ENABLE_PCLMULQDQ) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () -set (TEST_FLAG "-mpopcnt") + set (TEST_FLAG "-mpopcnt") -set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") -check_cxx_source_compiles(" - int main() { - auto a = __builtin_popcountll(0); - (void)a; - return 0; - } -" HAVE_POPCNT) + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + int main() { + auto a = __builtin_popcountll(0); + (void)a; + return 0; + } + " HAVE_POPCNT) + if (HAVE_POPCNT AND ENABLE_POPCNT) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () -if (HAVE_POPCNT AND NOT ARCH_AARCH64) - set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + set (TEST_FLAG "-mavx") + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + #include <immintrin.h> + int main() { + auto a = _mm256_insert_epi8(__m256i(), 0, 0); + (void)a; + return 0; + } + " HAVE_AVX) + if (HAVE_AVX AND ENABLE_AVX) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () + + set (TEST_FLAG "-mavx2") + set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0") + check_cxx_source_compiles(" + #include <immintrin.h> + int main() { + auto a = _mm256_add_epi16(__m256i(), __m256i()); + (void)a; + return 0; + } + " HAVE_AVX2) + if (HAVE_AVX2 AND
ENABLE_AVX2) + set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}") + endif () endif () cmake_pop_check_state () diff --git a/contrib/simdjson-cmake/CMakeLists.txt b/contrib/simdjson-cmake/CMakeLists.txt index d3bcf6c046c..862d8dc50f8 100644 --- a/contrib/simdjson-cmake/CMakeLists.txt +++ b/contrib/simdjson-cmake/CMakeLists.txt @@ -4,3 +4,6 @@ set(SIMDJSON_SRC "${SIMDJSON_SRC_DIR}/simdjson.cpp") add_library(simdjson ${SIMDJSON_SRC}) target_include_directories(simdjson SYSTEM PUBLIC "${SIMDJSON_INCLUDE_DIR}" PRIVATE "${SIMDJSON_SRC_DIR}") + +# simdjson is using its own CPU dispatching and get confused if we enable AVX/AVX2 flags. +target_compile_options(simdjson PRIVATE -mno-avx -mno-avx2) diff --git a/programs/client/Client.cpp b/programs/client/Client.cpp index b28ef8f7c7f..14442167042 100644 --- a/programs/client/Client.cpp +++ b/programs/client/Client.cpp @@ -1,3 +1,6 @@ +#include +#include "Common/MemoryTracker.h" +#include "Columns/ColumnsNumber.h" #include "ConnectionParameters.h" #include "QueryFuzzer.h" #include "Suggest.h" @@ -100,6 +103,14 @@ #pragma GCC optimize("-fno-var-tracking-assignments") #endif +namespace CurrentMetrics +{ + extern const Metric Revision; + extern const Metric VersionInteger; + extern const Metric MemoryTracking; + extern const Metric MaxDDLEntryID; +} + namespace fs = std::filesystem; namespace DB @@ -524,6 +535,18 @@ private: { UseSSL use_ssl; + MainThreadStatus::getInstance(); + + /// Limit on total memory usage + size_t max_client_memory_usage = config().getInt64("max_memory_usage_in_client", 0 /*default value*/); + + if (max_client_memory_usage != 0) + { + total_memory_tracker.setHardLimit(max_client_memory_usage); + total_memory_tracker.setDescription("(total)"); + total_memory_tracker.setMetric(CurrentMetrics::MemoryTracking); + } + registerFormats(); registerFunctions(); registerAggregateFunctions(); @@ -2581,6 +2604,7 @@ public: ("opentelemetry-tracestate", po::value(), "OpenTelemetry tracestate header as described by W3C 
Trace Context recommendation") ("history_file", po::value(), "path to history file") ("no-warnings", "disable warnings when client connects to server") + ("max_memory_usage_in_client", po::value(), "sets memory limit in client") ; Settings cmd_settings; diff --git a/src/Access/IAccessStorage.cpp b/src/Access/IAccessStorage.cpp index 348987899cb..f0fbb95ff4e 100644 --- a/src/Access/IAccessStorage.cpp +++ b/src/Access/IAccessStorage.cpp @@ -455,7 +455,7 @@ UUID IAccessStorage::login( if (!replace_exception_with_cannot_authenticate) throw; - tryLogCurrentException(getLogger(), credentials.getUserName() + ": Authentication failed"); + tryLogCurrentException(getLogger(), "from: " + address.toString() + ", user: " + credentials.getUserName() + ": Authentication failed"); throwCannotAuthenticate(credentials.getUserName()); } } diff --git a/src/Columns/ColumnLowCardinality.h b/src/Columns/ColumnLowCardinality.h index 698f65b1281..faf5bb9e712 100644 --- a/src/Columns/ColumnLowCardinality.h +++ b/src/Columns/ColumnLowCardinality.h @@ -187,6 +187,7 @@ public: * So LC(Nullable(T)) would return true, LC(U) -- false. 
*/ bool nestedIsNullable() const { return isColumnNullable(*dictionary.getColumnUnique().getNestedColumn()); } + bool nestedCanBeInsideNullable() const { return dictionary.getColumnUnique().getNestedColumn()->canBeInsideNullable(); } void nestedToNullable() { dictionary.getColumnUnique().nestedToNullable(); } void nestedRemoveNullable() { dictionary.getColumnUnique().nestedRemoveNullable(); } diff --git a/src/Core/Block.cpp b/src/Core/Block.cpp index efd8de43a3c..96667862e41 100644 --- a/src/Core/Block.cpp +++ b/src/Core/Block.cpp @@ -44,6 +44,13 @@ void Block::initializeIndexByName() } +void Block::reserve(size_t count) +{ + index_by_name.reserve(count); + data.reserve(count); +} + + void Block::insert(size_t position, ColumnWithTypeAndName elem) { if (position > data.size()) @@ -287,6 +294,7 @@ std::string Block::dumpIndex() const Block Block::cloneEmpty() const { Block res; + res.reserve(data.size()); for (const auto & elem : data) res.insert(elem.cloneEmpty()); @@ -364,6 +372,8 @@ Block Block::cloneWithColumns(MutableColumns && columns) const Block res; size_t num_columns = data.size(); + res.reserve(num_columns); + for (size_t i = 0; i < num_columns; ++i) res.insert({ std::move(columns[i]), data[i].type, data[i].name }); @@ -381,6 +391,8 @@ Block Block::cloneWithColumns(const Columns & columns) const throw Exception("Cannot clone block with columns because block has " + toString(num_columns) + " columns, " "but " + toString(columns.size()) + " columns given.", ErrorCodes::LOGICAL_ERROR); + res.reserve(num_columns); + for (size_t i = 0; i < num_columns; ++i) res.insert({ columns[i], data[i].type, data[i].name }); @@ -393,6 +405,8 @@ Block Block::cloneWithoutColumns() const Block res; size_t num_columns = data.size(); + res.reserve(num_columns); + for (size_t i = 0; i < num_columns; ++i) res.insert({ nullptr, data[i].type, data[i].name }); diff --git a/src/Core/Block.h b/src/Core/Block.h index a2d91190795..14f82cecd8d 100644 --- a/src/Core/Block.h +++ 
b/src/Core/Block.h @@ -152,6 +152,7 @@ public: private: void eraseImpl(size_t position); void initializeIndexByName(); + void reserve(size_t count); /// This is needed to allow function execution over data. /// It is safe because functions does not change column names, so index is unaffected. diff --git a/src/Core/ya.make b/src/Core/ya.make index d1e352ee846..6946d7a47bb 100644 --- a/src/Core/ya.make +++ b/src/Core/ya.make @@ -31,6 +31,10 @@ SRCS( MySQL/PacketsProtocolText.cpp MySQL/PacketsReplication.cpp NamesAndTypes.cpp + PostgreSQL/Connection.cpp + PostgreSQL/PoolWithFailover.cpp + PostgreSQL/Utils.cpp + PostgreSQL/insertPostgreSQLValue.cpp PostgreSQLProtocol.cpp QueryProcessingStage.cpp Settings.cpp diff --git a/src/DataStreams/ya.make b/src/DataStreams/ya.make index 2012af76697..b1205828a7e 100644 --- a/src/DataStreams/ya.make +++ b/src/DataStreams/ya.make @@ -49,6 +49,7 @@ SRCS( TTLUpdateInfoAlgorithm.cpp copyData.cpp finalizeBlock.cpp + formatBlock.cpp materializeBlock.cpp narrowBlockInputStreams.cpp diff --git a/src/Interpreters/AsynchronousMetrics.cpp b/src/Interpreters/AsynchronousMetrics.cpp index 1c79dedd978..8efe959a623 100644 --- a/src/Interpreters/AsynchronousMetrics.cpp +++ b/src/Interpreters/AsynchronousMetrics.cpp @@ -1091,7 +1091,14 @@ void AsynchronousMetrics::update(std::chrono::system_clock::time_point update_ti { sensor_file->rewind(); Int64 temperature = 0; - readText(temperature, *sensor_file); + try + { + readText(temperature, *sensor_file); + } + catch (const ErrnoException & e) + { + LOG_DEBUG(&Poco::Logger::get("AsynchronousMetrics"), "Hardware monitor '{}', sensor '{}' exists but could not be read, error {}.", hwmon_name, sensor_name, e.getErrno()); + } if (sensor_name.empty()) new_values[fmt::format("Temperature_{}", hwmon_name)] = temperature * 0.001; diff --git a/src/Interpreters/join_common.cpp b/src/Interpreters/join_common.cpp index 76bfd7f2899..e9f3e4f3fdd 100644 --- a/src/Interpreters/join_common.cpp +++ 
b/src/Interpreters/join_common.cpp @@ -2,6 +2,7 @@ #include #include +#include #include @@ -105,25 +106,57 @@ DataTypePtr convertTypeToNullable(const DataTypePtr & type) return type; } +/// Convert column to nullable. If column LowCardinality or Const, convert nested column. +/// Returns nullptr if conversion cannot be performed. +static ColumnPtr tryConvertColumnToNullable(const ColumnPtr & col) +{ + if (isColumnNullable(*col) || col->canBeInsideNullable()) + return makeNullable(col); + + if (col->lowCardinality()) + { + auto mut_col = IColumn::mutate(std::move(col)); + ColumnLowCardinality * col_lc = assert_cast(mut_col.get()); + if (col_lc->nestedIsNullable()) + { + return mut_col; + } + else if (col_lc->nestedCanBeInsideNullable()) + { + col_lc->nestedToNullable(); + return mut_col; + } + } + else if (const ColumnConst * col_const = checkAndGetColumn(*col)) + { + const auto & nested = col_const->getDataColumnPtr(); + if (nested->isNullable() || nested->canBeInsideNullable()) + { + return makeNullable(col); + } + else if (nested->lowCardinality()) + { + ColumnPtr nested_nullable = tryConvertColumnToNullable(nested); + if (nested_nullable) + return ColumnConst::create(nested_nullable, col_const->size()); + } + } + return nullptr; +} + void convertColumnToNullable(ColumnWithTypeAndName & column) { - column.type = convertTypeToNullable(column.type); - if (!column.column) + { + column.type = convertTypeToNullable(column.type); return; - - if (column.column->lowCardinality()) - { - /// Convert nested to nullable, not LowCardinality itself - auto mut_col = IColumn::mutate(std::move(column.column)); - ColumnLowCardinality * col_as_lc = assert_cast(mut_col.get()); - if (!col_as_lc->nestedIsNullable()) - col_as_lc->nestedToNullable(); - column.column = std::move(mut_col); } - else if (column.column->canBeInsideNullable()) + + ColumnPtr nullable_column = tryConvertColumnToNullable(column.column); + if (nullable_column) { - column.column = makeNullable(column.column); + 
column.type = convertTypeToNullable(column.type); + column.column = std::move(nullable_column); } } diff --git a/src/Parsers/ya.make b/src/Parsers/ya.make index 62e0c2b3225..3b8a9a19bce 100644 --- a/src/Parsers/ya.make +++ b/src/Parsers/ya.make @@ -21,6 +21,7 @@ SRCS( ASTCreateRowPolicyQuery.cpp ASTCreateSettingsProfileQuery.cpp ASTCreateUserQuery.cpp + ASTDatabaseOrNone.cpp ASTDictionary.cpp ASTDictionaryAttributeDeclaration.cpp ASTDropAccessEntityQuery.cpp @@ -95,6 +96,7 @@ SRCS( ParserCreateSettingsProfileQuery.cpp ParserCreateUserQuery.cpp ParserDataType.cpp + ParserDatabaseOrNone.cpp ParserDescribeTableQuery.cpp ParserDictionary.cpp ParserDictionaryAttributeDeclaration.cpp diff --git a/src/Processors/Transforms/WindowTransform.cpp b/src/Processors/Transforms/WindowTransform.cpp index 3ab1a23537b..1b8406682ea 100644 --- a/src/Processors/Transforms/WindowTransform.cpp +++ b/src/Processors/Transforms/WindowTransform.cpp @@ -1166,6 +1166,23 @@ void WindowTransform::appendChunk(Chunk & chunk) // Write out the aggregation results. writeOutCurrentRow(); + if (isCancelled()) + { + // Good time to check if the query is cancelled. Checking once + // per block might not be enough in severe quadratic cases. + // Just leave the work halfway through and return, the 'prepare' + // method will figure out what to do. Note that this doesn't + // handle 'max_execution_time' and other limits, because these + // limits are only updated between blocks. Eventually we should + // start updating them in background and canceling the processor, + // like we do for Ctrl+C handling. + // + // This class is final, so the check should hopefully be + // devirtualized and become a single never-taken branch that is + // basically free. + return; + } + // Move to the next row. The frame will have to be recalculated. // The peer group start is updated at the beginning of the loop, // because current_row might now be past-the-end. 
@@ -1255,10 +1272,12 @@ IProcessor::Status WindowTransform::prepare() // next_output_block_number, first_not_ready_row, first_block_number, // blocks.size()); - if (output.isFinished()) + if (output.isFinished() || isCancelled()) { // The consumer asked us not to continue (or we decided it ourselves), - // so we abort. + // so we abort. Not sure what the difference between the two conditions + // is, but it seemed that output.isFinished() is not enough to cancel on + // Ctrl+C. Test manually if you change it. input.close(); return Status::Finished; } diff --git a/src/Processors/Transforms/WindowTransform.h b/src/Processors/Transforms/WindowTransform.h index d7211f9edd7..5dc78a34f78 100644 --- a/src/Processors/Transforms/WindowTransform.h +++ b/src/Processors/Transforms/WindowTransform.h @@ -80,8 +80,10 @@ struct RowNumber * the order of input data. This property also trivially holds for the ROWS and * GROUPS frames. For the RANGE frame, the proof requires the additional fact * that the ranges are specified in terms of (the single) ORDER BY column. + * + * `final` is so that the isCancelled() is devirtualized, we call it every row. 
*/ -class WindowTransform : public IProcessor /* public ISimpleTransform */ +class WindowTransform final : public IProcessor { public: WindowTransform( diff --git a/src/Processors/ya.make b/src/Processors/ya.make index 4b95484a828..543a08caca5 100644 --- a/src/Processors/ya.make +++ b/src/Processors/ya.make @@ -7,14 +7,8 @@ PEERDIR( clickhouse/src/Common contrib/libs/msgpack contrib/libs/protobuf - contrib/libs/arrow ) -ADDINCL( - contrib/libs/arrow/src -) - -CFLAGS(-DUSE_ARROW=1) SRCS( Chunk.cpp @@ -31,11 +25,6 @@ SRCS( Formats/IOutputFormat.cpp Formats/IRowInputFormat.cpp Formats/IRowOutputFormat.cpp - Formats/Impl/ArrowBlockInputFormat.cpp - Formats/Impl/ArrowBlockOutputFormat.cpp - Formats/Impl/ArrowBufferedStreams.cpp - Formats/Impl/ArrowColumnToCHColumn.cpp - Formats/Impl/CHColumnToArrowColumn.cpp Formats/Impl/BinaryRowInputFormat.cpp Formats/Impl/BinaryRowOutputFormat.cpp Formats/Impl/CSVRowInputFormat.cpp diff --git a/src/Storages/MergeTree/BackgroundJobsExecutor.cpp b/src/Storages/MergeTree/BackgroundJobsExecutor.cpp index 36803ba5197..f3d957117e8 100644 --- a/src/Storages/MergeTree/BackgroundJobsExecutor.cpp +++ b/src/Storages/MergeTree/BackgroundJobsExecutor.cpp @@ -146,6 +146,9 @@ try catch (...) /// Exception while we looking for a task, reschedule { tryLogCurrentException(__PRETTY_FUNCTION__); + + /// Why do we scheduleTask again? + /// To retry on exception, since it may be some temporary exception. scheduleTask(/* with_backoff = */ true); } @@ -180,10 +183,16 @@ void IBackgroundJobExecutor::triggerTask() } void IBackgroundJobExecutor::backgroundTaskFunction() +try { if (!scheduleJob()) scheduleTask(/* with_backoff = */ true); } +catch (...) /// Catch any exception to avoid thread termination. 
+{ + tryLogCurrentException(__PRETTY_FUNCTION__); + scheduleTask(/* with_backoff = */ true); +} IBackgroundJobExecutor::~IBackgroundJobExecutor() { diff --git a/src/Storages/StorageMergeTree.cpp b/src/Storages/StorageMergeTree.cpp index 0763e2a25c4..32c2c76dd10 100644 --- a/src/Storages/StorageMergeTree.cpp +++ b/src/Storages/StorageMergeTree.cpp @@ -959,9 +959,19 @@ std::shared_ptr StorageMergeTree::se if (!commands_for_size_validation.empty()) { - MutationsInterpreter interpreter( - shared_from_this(), metadata_snapshot, commands_for_size_validation, getContext(), false); - commands_size += interpreter.evaluateCommandsSize(); + try + { + MutationsInterpreter interpreter( + shared_from_this(), metadata_snapshot, commands_for_size_validation, getContext(), false); + commands_size += interpreter.evaluateCommandsSize(); + } + catch (...) + { + MergeTreeMutationEntry & entry = it->second; + entry.latest_fail_time = time(nullptr); + entry.latest_fail_reason = getCurrentExceptionMessage(false); + continue; + } } if (current_ast_elements + commands_size >= max_ast_elements) @@ -971,17 +981,21 @@ std::shared_ptr StorageMergeTree::se commands.insert(commands.end(), it->second.commands.begin(), it->second.commands.end()); } - auto new_part_info = part->info; - new_part_info.mutation = current_mutations_by_version.rbegin()->first; + if (!commands.empty()) + { + auto new_part_info = part->info; + new_part_info.mutation = current_mutations_by_version.rbegin()->first; - future_part.parts.push_back(part); - future_part.part_info = new_part_info; - future_part.name = part->getNewName(new_part_info); - future_part.type = part->getType(); + future_part.parts.push_back(part); + future_part.part_info = new_part_info; + future_part.name = part->getNewName(new_part_info); + future_part.type = part->getType(); - tagger = std::make_unique(future_part, MergeTreeDataMergerMutator::estimateNeededDiskSpace({part}), *this, metadata_snapshot, true); - return std::make_shared(future_part, 
std::move(tagger), commands); + tagger = std::make_unique(future_part, MergeTreeDataMergerMutator::estimateNeededDiskSpace({part}), *this, metadata_snapshot, true); + return std::make_shared(future_part, std::move(tagger), commands); + } } + return {}; } @@ -1036,6 +1050,7 @@ bool StorageMergeTree::scheduleDataProcessingJob(IBackgroundJobExecutor & execut auto share_lock = lockForShare(RWLockImpl::NO_QUERY, getSettings()->lock_acquire_timeout_for_background_operations); + bool has_mutations; { std::unique_lock lock(currently_processing_in_background_mutex); if (merger_mutator.merges_blocker.isCancelled()) @@ -1044,6 +1059,15 @@ bool StorageMergeTree::scheduleDataProcessingJob(IBackgroundJobExecutor & execut merge_entry = selectPartsToMerge(metadata_snapshot, false, {}, false, nullptr, share_lock, lock); if (!merge_entry) mutate_entry = selectPartsToMutate(metadata_snapshot, nullptr, share_lock); + + has_mutations = !current_mutations_by_version.empty(); + } + + if (!mutate_entry && has_mutations) + { + /// Notify in case of errors + std::lock_guard lock(mutation_wait_mutex); + mutation_wait_event.notify_all(); } if (merge_entry) diff --git a/src/Storages/ya.make b/src/Storages/ya.make index 476449e8e6c..b3494849441 100644 --- a/src/Storages/ya.make +++ b/src/Storages/ya.make @@ -141,6 +141,7 @@ SRCS( StorageMerge.cpp StorageMergeTree.cpp StorageMongoDB.cpp + StorageMongoDBSocketFactory.cpp StorageMySQL.cpp StorageNull.cpp StorageReplicatedMergeTree.cpp diff --git a/tests/clickhouse-test b/tests/clickhouse-test index b734af0bdea..f6833cfbd09 100755 --- a/tests/clickhouse-test +++ b/tests/clickhouse-test @@ -647,7 +647,7 @@ def run_tests_array(all_tests_with_params): failures_chain += 1 status += MSG_FAIL status += print_test_time(total_time) - status += " - having exception:\n{}\n".format( + status += " - having exception in stdout:\n{}\n".format( '\n'.join(stdout.split('\n')[:100])) status += 'Database: ' + testcase_args.testcase_database elif reference_file is 
None: diff --git a/tests/queries/0_stateless/01572_kill_window_function.reference b/tests/queries/0_stateless/01572_kill_window_function.reference new file mode 100644 index 00000000000..f1218bf5bdf --- /dev/null +++ b/tests/queries/0_stateless/01572_kill_window_function.reference @@ -0,0 +1,3 @@ +Started +Sent kill request +Exit 138 diff --git a/tests/queries/0_stateless/01572_kill_window_function.sh b/tests/queries/0_stateless/01572_kill_window_function.sh new file mode 100755 index 00000000000..7103b7f7210 --- /dev/null +++ b/tests/queries/0_stateless/01572_kill_window_function.sh @@ -0,0 +1,36 @@ +#!/usr/bin/env bash + +CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +# shellcheck source=../shell_config.sh +. "$CURDIR"/../shell_config.sh + +set -e -o pipefail + +# Run a test query that takes very long to run. +query_id="01572_kill_window_function-$CLICKHOUSE_DATABASE" +$CLICKHOUSE_CLIENT --query_id="$query_id" --query "SELECT count(1048575) OVER (PARTITION BY intDiv(NULL, number) ORDER BY number DESC NULLS FIRST ROWS BETWEEN CURRENT ROW AND 1048575 FOLLOWING) FROM numbers(255, 1048575)" >/dev/null 2>&1 & +client_pid=$! +echo Started + +# Use one query to both kill the test query and verify that it has started, +# because if we try to kill it before it starts, the test will fail. +while [ -z "$($CLICKHOUSE_CLIENT --query "kill query where query_id = '$query_id' and current_database = currentDatabase()")" ] +do + # If we don't yet see the query in the process list, the client should still + # be running. The query is very long. + kill -0 -- $client_pid + sleep 1 +done +echo Sent kill request + +# Wait for the client to terminate. +client_exit_code=0 +wait $client_pid || client_exit_code=$? + +echo "Exit $client_exit_code" + +# We have tested for Ctrl+C. +# The following client flags don't cancel, but should: --max_execution_time, +# --receive_timeout. 
Probably needs asynchronous calculation of query limits, as +# discussed with Nikolay on TG: https://t.me/c/1214350934/21492 + diff --git a/tests/queries/0_stateless/02003_memory_limit_in_client.expect b/tests/queries/0_stateless/02003_memory_limit_in_client.expect new file mode 100755 index 00000000000..49b81240829 --- /dev/null +++ b/tests/queries/0_stateless/02003_memory_limit_in_client.expect @@ -0,0 +1,40 @@ +#!/usr/bin/expect -f + +# This is a test for system.warnings. Testing in interactive mode is necessary, +# as we want to see certain warnings from client + +log_user 0 +set timeout 60 +match_max 100000 + +# A default timeout action is to do nothing, change it to fail +expect_after { + timeout { + exit 1 + } +} + +set basedir [file dirname $argv0] +spawn bash -c "source $basedir/../shell_config.sh ; \$CLICKHOUSE_CLIENT_BINARY \$CLICKHOUSE_CLIENT_OPT --disable_suggestion --max_memory_usage_in_client=1" +expect ":) " + +send -- "SELECT arrayMap(x -> range(x), range(number)) FROM numbers(1000)\r" +expect "Code: 241" + +expect ":) " + +# Exit. +send -- "\4" +expect eof + +set basedir [file dirname $argv0] +spawn bash -c "source $basedir/../shell_config.sh ; \$CLICKHOUSE_CLIENT_BINARY \$CLICKHOUSE_CLIENT_OPT --disable_suggestion --max_memory_usage_in_client=1" +expect ":) " + +send -- "SELECT * FROM (SELECT * FROM system.numbers LIMIT 600000) as num WHERE num.number=60000\r" +expect "60000" +expect ":) " + +# Exit.
+send -- "\4" +expect eof diff --git a/tests/queries/0_stateless/02003_memory_limit_in_client.reference b/tests/queries/0_stateless/02003_memory_limit_in_client.reference new file mode 100644 index 00000000000..e69de29bb2d diff --git a/tests/queries/0_stateless/02004_invalid_partition_mutation_stuck.reference b/tests/queries/0_stateless/02004_invalid_partition_mutation_stuck.reference new file mode 100644 index 00000000000..e69de29bb2d diff --git a/tests/queries/0_stateless/02004_invalid_partition_mutation_stuck.sql b/tests/queries/0_stateless/02004_invalid_partition_mutation_stuck.sql new file mode 100644 index 00000000000..481a5565095 --- /dev/null +++ b/tests/queries/0_stateless/02004_invalid_partition_mutation_stuck.sql @@ -0,0 +1,33 @@ +SET mutations_sync=2; + +DROP TABLE IF EXISTS rep_data; +CREATE TABLE rep_data +( + p Int, + t DateTime, + INDEX idx t TYPE minmax GRANULARITY 1 +) +ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/rep_data', '1') +PARTITION BY p +ORDER BY t +SETTINGS number_of_free_entries_in_pool_to_execute_mutation=0; +INSERT INTO rep_data VALUES (1, now()); +ALTER TABLE rep_data MATERIALIZE INDEX idx IN PARTITION ID 'NO_SUCH_PART'; -- { serverError 248 } +ALTER TABLE rep_data MATERIALIZE INDEX idx IN PARTITION ID '1'; +ALTER TABLE rep_data MATERIALIZE INDEX idx IN PARTITION ID '2'; + +DROP TABLE IF EXISTS data; +CREATE TABLE data +( + p Int, + t DateTime, + INDEX idx t TYPE minmax GRANULARITY 1 +) +ENGINE = MergeTree +PARTITION BY p +ORDER BY t +SETTINGS number_of_free_entries_in_pool_to_execute_mutation=0; +INSERT INTO data VALUES (1, now()); +ALTER TABLE data MATERIALIZE INDEX idx IN PARTITION ID 'NO_SUCH_PART'; -- { serverError 341 } +ALTER TABLE data MATERIALIZE INDEX idx IN PARTITION ID '1'; +ALTER TABLE data MATERIALIZE INDEX idx IN PARTITION ID '2'; diff --git a/tests/queries/0_stateless/02007_join_use_nulls.reference b/tests/queries/0_stateless/02007_join_use_nulls.reference new file mode 100644 index 
00000000000..30ee87bf91d --- /dev/null +++ b/tests/queries/0_stateless/02007_join_use_nulls.reference @@ -0,0 +1,8 @@ +1 2 3 1 3 +1 UInt8 2 UInt8 3 Nullable(UInt8) +1 LowCardinality(UInt8) 2 LowCardinality(UInt8) 3 LowCardinality(Nullable(UInt8)) +1 LowCardinality(UInt8) 2 LowCardinality(UInt8) 1 LowCardinality(Nullable(UInt8)) +1 UInt8 2 UInt8 3 Nullable(UInt8) +1 UInt8 2 UInt8 1 Nullable(UInt8) 3 Nullable(UInt8) +1 LowCardinality(UInt8) 2 LowCardinality(UInt8) 3 LowCardinality(Nullable(UInt8)) +1 LowCardinality(UInt8) 2 LowCardinality(UInt8) 1 LowCardinality(Nullable(UInt8)) 3 LowCardinality(Nullable(UInt8)) diff --git a/tests/queries/0_stateless/02007_join_use_nulls.sql b/tests/queries/0_stateless/02007_join_use_nulls.sql new file mode 100644 index 00000000000..e08fffce3b7 --- /dev/null +++ b/tests/queries/0_stateless/02007_join_use_nulls.sql @@ -0,0 +1,11 @@ +SET join_use_nulls = 1; + +SELECT *, d.* FROM ( SELECT 1 AS id, 2 AS value ) a SEMI LEFT JOIN ( SELECT 1 AS id, 3 AS values ) AS d USING id; + +SELECT id, toTypeName(id), value, toTypeName(value), d.values, toTypeName(d.values) FROM ( SELECT 1 AS id, 2 AS value ) a SEMI LEFT JOIN ( SELECT 1 AS id, 3 AS values ) AS d USING id; +SELECT id, toTypeName(id), value, toTypeName(value), d.values, toTypeName(d.values) FROM ( SELECT toLowCardinality(1) AS id, toLowCardinality(2) AS value ) a SEMI LEFT JOIN ( SELECT toLowCardinality(1) AS id, toLowCardinality(3) AS values ) AS d USING id; +SELECT id, toTypeName(id), value, toTypeName(value), d.id, toTypeName(d.id) FROM ( SELECT toLowCardinality(1) AS id, toLowCardinality(2) AS value ) a SEMI LEFT JOIN ( SELECT toLowCardinality(1) AS id, toLowCardinality(3) AS values ) AS d USING id; +SELECT id, toTypeName(id), value, toTypeName(value), d.values, toTypeName(d.values) FROM ( SELECT 1 AS id, 2 AS value ) a SEMI LEFT JOIN ( SELECT 1 AS id, 3 AS values ) AS d USING id; +SELECT id, toTypeName(id), value, toTypeName(value), d.id, toTypeName(d.id) , d.values, 
toTypeName(d.values) FROM ( SELECT 1 AS id, 2 AS value ) a SEMI LEFT JOIN ( SELECT 1 AS id, 3 AS values ) AS d USING id; +SELECT id, toTypeName(id), value, toTypeName(value), d.values, toTypeName(d.values) FROM ( SELECT toLowCardinality(1) AS id, toLowCardinality(2) AS value ) a SEMI LEFT JOIN ( SELECT toLowCardinality(1) AS id, toLowCardinality(3) AS values ) AS d USING id; +SELECT id, toTypeName(id), value, toTypeName(value), d.id, toTypeName(d.id) , d.values, toTypeName(d.values) FROM ( SELECT toLowCardinality(1) AS id, toLowCardinality(2) AS value ) a SEMI LEFT JOIN ( SELECT toLowCardinality(1) AS id, toLowCardinality(3) AS values ) AS d USING id; diff --git a/website/templates/index/success.html b/website/templates/index/success.html index 83b5c1427c9..a93efa8bdc5 100644 --- a/website/templates/index/success.html +++ b/website/templates/index/success.html @@ -2,19 +2,9 @@