diff --git a/.github/ISSUE_TEMPLATE/40_bug-report.md b/.github/ISSUE_TEMPLATE/85_bug-report.md
similarity index 93%
rename from .github/ISSUE_TEMPLATE/40_bug-report.md
rename to .github/ISSUE_TEMPLATE/85_bug-report.md
index d62ec578f8d..d78474670ff 100644
--- a/.github/ISSUE_TEMPLATE/40_bug-report.md
+++ b/.github/ISSUE_TEMPLATE/85_bug-report.md
@@ -1,8 +1,8 @@
---
name: Bug report
-about: Create a report to help us improve ClickHouse
+about: Wrong behaviour (visible to users) in official ClickHouse release.
title: ''
-labels: bug
+labels: 'potential bug'
assignees: ''
---
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 34d11c6a2cd..103d8e40fd9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,102 @@
+### ClickHouse release v21.8, 2021-08-12
+
+#### New Features
+
+* Add support for a part of the SQL/JSON standard (see the combined example after this list). [#24148](https://github.com/ClickHouse/ClickHouse/pull/24148) ([l1tsolaiki](https://github.com/l1tsolaiki), [Kseniia Sumarokova](https://github.com/kssenii)).
+* Collect common system metrics (in `system.asynchronous_metrics` and `system.asynchronous_metric_log`) on CPU usage, disk usage, memory usage, IO, network, files, load average, CPU frequencies, thermal sensors, EDAC counters, system uptime; also added metrics about the scheduling jitter and the time spent collecting the metrics. It works like a built-in `atop` in ClickHouse and allows access to monitoring data even if you have no additional tools installed. Close [#9430](https://github.com/ClickHouse/ClickHouse/issues/9430). [#24416](https://github.com/ClickHouse/ClickHouse/pull/24416) ([alexey-milovidov](https://github.com/alexey-milovidov), [Yegor Levankov](https://github.com/elevankoff)).
+* Add MaterializedPostgreSQL table engine and database engine. This database engine allows replicating a whole database or any subset of database tables. [#20470](https://github.com/ClickHouse/ClickHouse/pull/20470) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Add new functions `leftPad()`, `rightPad()`, `leftPadUTF8()`, `rightPadUTF8()`. [#26075](https://github.com/ClickHouse/ClickHouse/pull/26075) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Add the `FIRST` keyword to the `ADD INDEX` command to be able to add the index at the beginning of the indices list. [#25904](https://github.com/ClickHouse/ClickHouse/pull/25904) ([xjewer](https://github.com/xjewer)).
+* Introduce `system.data_skipping_indices` table containing information about existing data skipping indices. Close [#7659](https://github.com/ClickHouse/ClickHouse/issues/7659). [#25693](https://github.com/ClickHouse/ClickHouse/pull/25693) ([Dmitry Novik](https://github.com/novikd)).
+* Add `bin`/`unbin` functions. [#25609](https://github.com/ClickHouse/ClickHouse/pull/25609) ([zhaoyu](https://github.com/zxc111)).
+* Support `Map` and `UInt128`, `Int128`, `UInt256`, `Int256` types in `mapAdd` and `mapSubtract` functions. [#25596](https://github.com/ClickHouse/ClickHouse/pull/25596) ([Ildus Kurbangaliev](https://github.com/ildus)).
+* Support `DISTINCT ON (columns)` expression, close [#25404](https://github.com/ClickHouse/ClickHouse/issues/25404). [#25589](https://github.com/ClickHouse/ClickHouse/pull/25589) ([Zijie Lu](https://github.com/TszKitLo40)).
+* Add an ability to reset a custom setting to default and remove it from the table's metadata. It allows rolling back the change without knowing the system/config's default. Closes [#14449](https://github.com/ClickHouse/ClickHouse/issues/14449). [#17769](https://github.com/ClickHouse/ClickHouse/pull/17769) ([xjewer](https://github.com/xjewer)).
+* Render pipelines as graphs in Web UI if `EXPLAIN PIPELINE graph = 1` query is submitted. [#26067](https://github.com/ClickHouse/ClickHouse/pull/26067) ([alexey-milovidov](https://github.com/alexey-milovidov)).
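+
+A combined, illustrative example of several of the features above (the SQL/JSON functions, `leftPad`, `bin`, and `DISTINCT ON`); a sketch where values are made up and output formatting may differ:
+
+```sql
+SELECT JSON_VALUE('{"a": {"b": 1}}', '$.a.b');  -- '1'
+SELECT leftPad('abc', 7, '*');                  -- '****abc'
+SELECT bin(14);                                 -- '00001110' for a UInt8 value
+-- DISTINCT ON keeps the first row within each group of the listed columns:
+SELECT DISTINCT ON (k) k, v
+FROM values('k Int32, v Int32', (1, 10), (1, 20), (2, 30));
+```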
+
+#### Performance Improvements
+
+* Compile aggregate functions. Use option `compile_aggregate_expressions` to enable it (see the example after this list). [#24789](https://github.com/ClickHouse/ClickHouse/pull/24789) ([Maksim Kita](https://github.com/kitaisreal)).
+* Improve latency of short queries that require reading from tables with many columns. [#26371](https://github.com/ClickHouse/ClickHouse/pull/26371) ([Anton Popov](https://github.com/CurtizJ)).
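+
+For example, JIT compilation of aggregates can be tried as follows (a sketch; `min_count_to_compile_aggregate_expression` is assumed to control how many times an aggregate combination must be seen before it is compiled):
+
+```sql
+SET compile_aggregate_expressions = 1;
+SET min_count_to_compile_aggregate_expression = 0;  -- assumed setting: compile immediately rather than after repeated use
+SELECT sum(number), avg(number) FROM numbers(10000000);
+```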
+
+#### Improvements
+
+* Use `Map` data type for system logs tables (`system.query_log`, `system.query_thread_log`, `system.processes`, `system.opentelemetry_span_log`). These tables will be auto-created with new data types. Virtual columns are created to support old queries. Closes [#18698](https://github.com/ClickHouse/ClickHouse/issues/18698). [#23934](https://github.com/ClickHouse/ClickHouse/pull/23934), [#25773](https://github.com/ClickHouse/ClickHouse/pull/25773) ([hexiaoting](https://github.com/hexiaoting), [sundy-li](https://github.com/sundy-li), [Maksim Kita](https://github.com/kitaisreal)).
+* For a dictionary with a complex key containing only one attribute, allow not wrapping the key expression in tuple for functions `dictGet`, `dictHas`. [#26130](https://github.com/ClickHouse/ClickHouse/pull/26130) ([Maksim Kita](https://github.com/kitaisreal)).
+* Implement `bin`/`hex` functions for `AggregateFunction` states. [#26094](https://github.com/ClickHouse/ClickHouse/pull/26094) ([zhaoyu](https://github.com/zxc111)).
+* Support arguments of `UUID` type for `empty` and `notEmpty` functions. `UUID` is empty if it is all zeros (nil UUID); see the example after this list. Closes [#3446](https://github.com/ClickHouse/ClickHouse/issues/3446). [#25974](https://github.com/ClickHouse/ClickHouse/pull/25974) ([zhaoyu](https://github.com/zxc111)).
+* Add support for `SET SQL_SELECT_LIMIT` in MySQL protocol. Closes [#17115](https://github.com/ClickHouse/ClickHouse/issues/17115). [#25972](https://github.com/ClickHouse/ClickHouse/pull/25972) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* More instrumentation for network interaction: add counters for recv/send bytes; add gauges for recvs/sends. Added missing documentation. Close [#5897](https://github.com/ClickHouse/ClickHouse/issues/5897). [#25962](https://github.com/ClickHouse/ClickHouse/pull/25962) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add setting `optimize_move_to_prewhere_if_final`. If query has `FINAL`, the optimization `move_to_prewhere` will be enabled only if both `optimize_move_to_prewhere` and `optimize_move_to_prewhere_if_final` are enabled. Closes [#8684](https://github.com/ClickHouse/ClickHouse/issues/8684). [#25940](https://github.com/ClickHouse/ClickHouse/pull/25940) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Allow complex quoted identifiers of JOINed tables. Close [#17861](https://github.com/ClickHouse/ClickHouse/issues/17861). [#25924](https://github.com/ClickHouse/ClickHouse/pull/25924) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add support for Unicode (e.g. Chinese, Cyrillic) components in `Nested` data types. Close [#25594](https://github.com/ClickHouse/ClickHouse/issues/25594). [#25923](https://github.com/ClickHouse/ClickHouse/pull/25923) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Allow `quantiles*` functions to work with `aggregate_functions_null_for_empty`. Close [#25892](https://github.com/ClickHouse/ClickHouse/issues/25892). [#25919](https://github.com/ClickHouse/ClickHouse/pull/25919) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Allow parameters for parametric aggregate functions to be arbitrary constant expressions (e.g., `1 + 2`), not just literals. It also allows using the query parameters (in parameterized queries like `{param:UInt8}`) inside parametric aggregate functions. Closes [#11607](https://github.com/ClickHouse/ClickHouse/issues/11607). [#25910](https://github.com/ClickHouse/ClickHouse/pull/25910) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Correctly throw the exception on the attempt to parse an invalid `Date`. Closes [#6481](https://github.com/ClickHouse/ClickHouse/issues/6481). [#25909](https://github.com/ClickHouse/ClickHouse/pull/25909) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Support for multiple includes in configuration. It is possible to include users configuration and remote servers configuration from multiple sources. Simply place an `<include>` element with a `from_zk`, `from_env` or `incl` attribute, and it will be replaced with the substitution. [#24404](https://github.com/ClickHouse/ClickHouse/pull/24404) ([nvartolomei](https://github.com/nvartolomei)).
+* Support for queries with a column named `"null"` (it must be specified in back-ticks or double quotes) and `ON CLUSTER`. Closes [#24035](https://github.com/ClickHouse/ClickHouse/issues/24035). [#25907](https://github.com/ClickHouse/ClickHouse/pull/25907) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Support `LowCardinality`, `Decimal`, and `UUID` for `JSONExtract`. Closes [#24606](https://github.com/ClickHouse/ClickHouse/issues/24606). [#25900](https://github.com/ClickHouse/ClickHouse/pull/25900) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Convert history file from `readline` format to `replxx` format. [#25888](https://github.com/ClickHouse/ClickHouse/pull/25888) ([Azat Khuzhin](https://github.com/azat)).
+* Fix an issue which can lead to intersecting parts after `DROP PART` or background deletion of an empty part. [#25884](https://github.com/ClickHouse/ClickHouse/pull/25884) ([alesapin](https://github.com/alesapin)).
+* Better handling of lost parts for `ReplicatedMergeTree` tables. Fixes rare inconsistencies in `ReplicationQueue`. Fixes [#10368](https://github.com/ClickHouse/ClickHouse/issues/10368). [#25820](https://github.com/ClickHouse/ClickHouse/pull/25820) ([alesapin](https://github.com/alesapin)).
+* Allow starting clickhouse-client with unreadable working directory. [#25817](https://github.com/ClickHouse/ClickHouse/pull/25817) ([ianton-ru](https://github.com/ianton-ru)).
+* Fix "No available columns" error for `Merge` storage. [#25801](https://github.com/ClickHouse/ClickHouse/pull/25801) ([Azat Khuzhin](https://github.com/azat)).
+* MySQL Engine now supports the exchange of column comments between MySQL and ClickHouse. [#25795](https://github.com/ClickHouse/ClickHouse/pull/25795) ([Storozhuk Kostiantyn](https://github.com/sand6255)).
+* Fix inconsistent behaviour of `GROUP BY` constant on empty set. Closes [#6842](https://github.com/ClickHouse/ClickHouse/issues/6842). [#25786](https://github.com/ClickHouse/ClickHouse/pull/25786) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Cancel already running merges in partition on `DROP PARTITION` and `TRUNCATE` for `ReplicatedMergeTree`. Resolves [#17151](https://github.com/ClickHouse/ClickHouse/issues/17151). [#25684](https://github.com/ClickHouse/ClickHouse/pull/25684) ([tavplubix](https://github.com/tavplubix)).
+* Support `ENUM` data type for `MaterializeMySQL`. [#25676](https://github.com/ClickHouse/ClickHouse/pull/25676) ([Storozhuk Kostiantyn](https://github.com/sand6255)).
+* Support materialized and aliased columns in JOIN, close [#13274](https://github.com/ClickHouse/ClickHouse/issues/13274). [#25634](https://github.com/ClickHouse/ClickHouse/pull/25634) ([Vladimir C](https://github.com/vdimir)).
+* Fix possible logical race condition between `ALTER TABLE ... DETACH` and background merges. [#25605](https://github.com/ClickHouse/ClickHouse/pull/25605) ([Azat Khuzhin](https://github.com/azat)).
+* Make the `NetworkReceiveElapsedMicroseconds` metric correctly include the time spent waiting for data from the client for `INSERT`. Close [#9958](https://github.com/ClickHouse/ClickHouse/issues/9958). [#25602](https://github.com/ClickHouse/ClickHouse/pull/25602) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Support `TRUNCATE TABLE` for S3 and HDFS. Close [#25530](https://github.com/ClickHouse/ClickHouse/issues/25530). [#25550](https://github.com/ClickHouse/ClickHouse/pull/25550) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Support for dynamic reloading of config to change number of threads in pool for background jobs execution (merges, mutations, fetches). [#25548](https://github.com/ClickHouse/ClickHouse/pull/25548) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Allow extracting a non-string element as a string using `JSONExtract`. This is for [#25414](https://github.com/ClickHouse/ClickHouse/issues/25414). [#25452](https://github.com/ClickHouse/ClickHouse/pull/25452) ([Amos Bird](https://github.com/amosbird)).
+* Support regular expression in `Database` argument for `StorageMerge`. Close [#776](https://github.com/ClickHouse/ClickHouse/issues/776). [#25064](https://github.com/ClickHouse/ClickHouse/pull/25064) ([flynn](https://github.com/ucasfl)).
+* Web UI: if the value looks like a URL, automatically generate a link. [#25965](https://github.com/ClickHouse/ClickHouse/pull/25965) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Make `sudo service clickhouse-server start` work on systems with `systemd` like CentOS 8. Close [#14298](https://github.com/ClickHouse/ClickHouse/issues/14298). Close [#17799](https://github.com/ClickHouse/ClickHouse/issues/17799). [#25921](https://github.com/ClickHouse/ClickHouse/pull/25921) ([alexey-milovidov](https://github.com/alexey-milovidov)).
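+
+Illustrative queries for two of the items above, `empty`/`notEmpty` with `UUID` and `JSONExtract` returning `UUID` (the JSON document is made up):
+
+```sql
+SELECT empty(toUUID('00000000-0000-0000-0000-000000000000'));  -- 1: the nil UUID is empty
+SELECT notEmpty(generateUUIDv4());                             -- 1
+SELECT JSONExtract('{"id": "61f0c404-5cb3-11e7-907b-a6006ad3dba0"}', 'id', 'UUID');
+```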
+
+#### Bug Fixes
+
+* Fix incorrect `SET ROLE` in some cases. [#26707](https://github.com/ClickHouse/ClickHouse/pull/26707) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Fix potential `nullptr` dereference in window functions. Fix [#25276](https://github.com/ClickHouse/ClickHouse/issues/25276). [#26668](https://github.com/ClickHouse/ClickHouse/pull/26668) ([Alexander Kuzmenkov](https://github.com/akuzm)).
+* Fix incorrect function names of `groupBitmapAnd/Or/Xor`. Fix [#26557](https://github.com/ClickHouse/ClickHouse/pull/26557) ([Amos Bird](https://github.com/amosbird)).
+* Fix crash in RabbitMQ shutdown in case RabbitMQ setup was not started. Closes [#26504](https://github.com/ClickHouse/ClickHouse/issues/26504). [#26529](https://github.com/ClickHouse/ClickHouse/pull/26529) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Fix issues with `CREATE DICTIONARY` query if dictionary name or database name was quoted. Closes [#26491](https://github.com/ClickHouse/ClickHouse/issues/26491). [#26508](https://github.com/ClickHouse/ClickHouse/pull/26508) ([Maksim Kita](https://github.com/kitaisreal)).
+* Fix broken name resolution after rewriting column aliases. Fix [#26432](https://github.com/ClickHouse/ClickHouse/issues/26432). [#26475](https://github.com/ClickHouse/ClickHouse/pull/26475) ([Amos Bird](https://github.com/amosbird)).
+* Fix infinite non-joined block stream in `partial_merge_join` close [#26325](https://github.com/ClickHouse/ClickHouse/issues/26325). [#26374](https://github.com/ClickHouse/ClickHouse/pull/26374) ([Vladimir C](https://github.com/vdimir)).
+* Fix possible crash when logging in as a dropped user. Fix [#26073](https://github.com/ClickHouse/ClickHouse/issues/26073). [#26363](https://github.com/ClickHouse/ClickHouse/pull/26363) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Fix `optimize_distributed_group_by_sharding_key` for multiple columns (leads to incorrect result w/ `optimize_skip_unused_shards=1`/`allow_nondeterministic_optimize_skip_unused_shards=1` and multiple columns in sharding key expression). [#26353](https://github.com/ClickHouse/ClickHouse/pull/26353) ([Azat Khuzhin](https://github.com/azat)).
+* `CAST` from `Date` to `DateTime` (or `DateTime64`) was not using the timezone of the `DateTime` type. It can also affect the comparison between `Date` and `DateTime`. Inference of the common type for `Date` and `DateTime` also was not using the corresponding timezone. It affected the results of function `if` and array construction. Closes [#24128](https://github.com/ClickHouse/ClickHouse/issues/24128). [#24129](https://github.com/ClickHouse/ClickHouse/pull/24129) ([Maksim Kita](https://github.com/kitaisreal)).
+* Fixed rare bug in lost replica recovery that may cause replicas to diverge. [#26321](https://github.com/ClickHouse/ClickHouse/pull/26321) ([tavplubix](https://github.com/tavplubix)).
+* Fix zstd decompression in case there are escape sequences at the end of internal buffer. Closes [#26013](https://github.com/ClickHouse/ClickHouse/issues/26013). [#26314](https://github.com/ClickHouse/ClickHouse/pull/26314) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Fix logical error on join with totals, close [#26017](https://github.com/ClickHouse/ClickHouse/issues/26017). [#26250](https://github.com/ClickHouse/ClickHouse/pull/26250) ([Vladimir C](https://github.com/vdimir)).
+* Remove excessive newline in `thread_name` column in `system.stack_trace` table. Fix [#24124](https://github.com/ClickHouse/ClickHouse/issues/24124). [#26210](https://github.com/ClickHouse/ClickHouse/pull/26210) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix `joinGet` with `LowCardinality` columns, close [#25993](https://github.com/ClickHouse/ClickHouse/issues/25993). [#26118](https://github.com/ClickHouse/ClickHouse/pull/26118) ([Vladimir C](https://github.com/vdimir)).
+* Fix possible crash in `pointInPolygon` if the setting `validate_polygons` is turned off. [#26113](https://github.com/ClickHouse/ClickHouse/pull/26113) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix throwing an exception when iterating over a non-existing remote directory. [#26087](https://github.com/ClickHouse/ClickHouse/pull/26087) ([ianton-ru](https://github.com/ianton-ru)).
+* Fix rare server crash because of `abort` in ZooKeeper client. Fixes [#25813](https://github.com/ClickHouse/ClickHouse/issues/25813). [#26079](https://github.com/ClickHouse/ClickHouse/pull/26079) ([alesapin](https://github.com/alesapin)).
+* Fix wrong thread count estimation for right subquery join in some cases. Close [#24075](https://github.com/ClickHouse/ClickHouse/issues/24075). [#26052](https://github.com/ClickHouse/ClickHouse/pull/26052) ([Vladimir C](https://github.com/vdimir)).
+* Fixed incorrect `sequence_id` in MySQL protocol packets that ClickHouse sends on exception during query execution. It might cause MySQL client to reset connection to ClickHouse server. Fixes [#21184](https://github.com/ClickHouse/ClickHouse/issues/21184). [#26051](https://github.com/ClickHouse/ClickHouse/pull/26051) ([tavplubix](https://github.com/tavplubix)).
+* Fix possible mismatched header when using normal projection with `PREWHERE`. Fix [#26020](https://github.com/ClickHouse/ClickHouse/issues/26020). [#26038](https://github.com/ClickHouse/ClickHouse/pull/26038) ([Amos Bird](https://github.com/amosbird)).
+* Fix formatting of type `Map` with integer keys to `JSON`. [#25982](https://github.com/ClickHouse/ClickHouse/pull/25982) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix possible deadlock during query profiler stack unwinding. Fix [#25968](https://github.com/ClickHouse/ClickHouse/issues/25968). [#25970](https://github.com/ClickHouse/ClickHouse/pull/25970) ([Maksim Kita](https://github.com/kitaisreal)).
+* Fix crash when calling `dictGet()` with bad arguments. [#25913](https://github.com/ClickHouse/ClickHouse/pull/25913) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Fixed `scram-sha-256` authentication for PostgreSQL engines. Closes [#24516](https://github.com/ClickHouse/ClickHouse/issues/24516). [#25906](https://github.com/ClickHouse/ClickHouse/pull/25906) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Fix extremely long backoff for background tasks when the background pool is full. Fixes [#25836](https://github.com/ClickHouse/ClickHouse/issues/25836). [#25893](https://github.com/ClickHouse/ClickHouse/pull/25893) ([alesapin](https://github.com/alesapin)).
+* Fix ARM exception handling with non default page size. Fixes [#25512](https://github.com/ClickHouse/ClickHouse/issues/25512), [#25044](https://github.com/ClickHouse/ClickHouse/issues/25044), [#24901](https://github.com/ClickHouse/ClickHouse/issues/24901), [#23183](https://github.com/ClickHouse/ClickHouse/issues/23183), [#20221](https://github.com/ClickHouse/ClickHouse/issues/20221), [#19703](https://github.com/ClickHouse/ClickHouse/issues/19703), [#19028](https://github.com/ClickHouse/ClickHouse/issues/19028), [#18391](https://github.com/ClickHouse/ClickHouse/issues/18391), [#18121](https://github.com/ClickHouse/ClickHouse/issues/18121), [#17994](https://github.com/ClickHouse/ClickHouse/issues/17994), [#12483](https://github.com/ClickHouse/ClickHouse/issues/12483). [#25854](https://github.com/ClickHouse/ClickHouse/pull/25854) ([Maksim Kita](https://github.com/kitaisreal)).
+* Fix sharding_key from column w/o function for `remote()` (before `select * from remote('127.1', system.one, dummy)` leads to `Unknown column: dummy, there are only columns .` error). [#25824](https://github.com/ClickHouse/ClickHouse/pull/25824) ([Azat Khuzhin](https://github.com/azat)).
+* Fixed `Not found column ...` and `Missing column ...` errors when selecting from `MaterializeMySQL`. Fixes [#23708](https://github.com/ClickHouse/ClickHouse/issues/23708), [#24830](https://github.com/ClickHouse/ClickHouse/issues/24830), [#25794](https://github.com/ClickHouse/ClickHouse/issues/25794). [#25822](https://github.com/ClickHouse/ClickHouse/pull/25822) ([tavplubix](https://github.com/tavplubix)).
+* Fix `optimize_skip_unused_shards_rewrite_in` for non-UInt64 types (may select incorrect shards eventually or throw `Cannot infer type of an empty tuple` or `Function tuple requires at least one argument`). [#25798](https://github.com/ClickHouse/ClickHouse/pull/25798) ([Azat Khuzhin](https://github.com/azat)).
+* Fix rare bug with `DROP PART` query for `ReplicatedMergeTree` tables which can lead to error message `Unexpected merged part intersecting drop range`. [#25783](https://github.com/ClickHouse/ClickHouse/pull/25783) ([alesapin](https://github.com/alesapin)).
+* Fix bug in `TTL` with `GROUP BY` expression which prevented `TTL` from being executed again after its first execution for a part. [#25743](https://github.com/ClickHouse/ClickHouse/pull/25743) ([alesapin](https://github.com/alesapin)).
+* Allow StorageMerge to access tables with aliases. Closes [#6051](https://github.com/ClickHouse/ClickHouse/issues/6051). [#25694](https://github.com/ClickHouse/ClickHouse/pull/25694) ([Kseniia Sumarokova](https://github.com/kssenii)).
+* Fix slow dict join in some cases, close [#24209](https://github.com/ClickHouse/ClickHouse/issues/24209). [#25618](https://github.com/ClickHouse/ClickHouse/pull/25618) ([Vladimir C](https://github.com/vdimir)).
+* Fix `ALTER MODIFY COLUMN` for columns which participate in TTL expressions. [#25554](https://github.com/ClickHouse/ClickHouse/pull/25554) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix assertion in `PREWHERE` with non-UInt8 type, close [#19589](https://github.com/ClickHouse/ClickHouse/issues/19589). [#25484](https://github.com/ClickHouse/ClickHouse/pull/25484) ([Vladimir C](https://github.com/vdimir)).
+* Fix an MSan crash found by fuzzing. Fixes [#22517](https://github.com/ClickHouse/ClickHouse/issues/22517). [#26428](https://github.com/ClickHouse/ClickHouse/pull/26428) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Update `chown` cmd check in `clickhouse-server` docker entrypoint. It fixes error 'cluster pod restart failed (or timeout)' on kubernetes. [#26545](https://github.com/ClickHouse/ClickHouse/pull/26545) ([Ky Li](https://github.com/Kylinrix)).
+
+
### ClickHouse release v21.7, 2021-07-09
#### Backward Incompatible Change
@@ -1183,13 +1282,6 @@
* PODArray: Avoid call to memcpy with (nullptr, 0) arguments (Fix UBSan report). This fixes [#18525](https://github.com/ClickHouse/ClickHouse/issues/18525). [#18526](https://github.com/ClickHouse/ClickHouse/pull/18526) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Minor improvement for path concatenation of zookeeper paths inside DDLWorker. [#17767](https://github.com/ClickHouse/ClickHouse/pull/17767) ([Bharat Nallan](https://github.com/bharatnc)).
* Allow to reload symbols from debug file. This PR also fixes a build-id issue. [#17637](https://github.com/ClickHouse/ClickHouse/pull/17637) ([Amos Bird](https://github.com/amosbird)).
-* TestFlows: fixes to LDAP tests that fail due to slow test execution. [#18790](https://github.com/ClickHouse/ClickHouse/pull/18790) ([vzakaznikov](https://github.com/vzakaznikov)).
-* TestFlows: Merging requirements for AES encryption functions. Updating aes_encryption tests to use new requirements. Updating TestFlows version to 1.6.72. [#18221](https://github.com/ClickHouse/ClickHouse/pull/18221) ([vzakaznikov](https://github.com/vzakaznikov)).
-* TestFlows: Updating TestFlows version to the latest 1.6.72. Re-generating requirements.py. [#18208](https://github.com/ClickHouse/ClickHouse/pull/18208) ([vzakaznikov](https://github.com/vzakaznikov)).
-* TestFlows: Updating TestFlows README.md to include "How To Debug Why Test Failed" section. [#17808](https://github.com/ClickHouse/ClickHouse/pull/17808) ([vzakaznikov](https://github.com/vzakaznikov)).
-* TestFlows: tests for RBAC [ACCESS MANAGEMENT](https://clickhouse.tech/docs/en/sql-reference/statements/grant/#grant-access-management) privileges. [#17804](https://github.com/ClickHouse/ClickHouse/pull/17804) ([MyroTk](https://github.com/MyroTk)).
-* TestFlows: RBAC tests for SHOW, TRUNCATE, KILL, and OPTIMIZE. - Updates to old tests. - Resolved comments from #https://github.com/ClickHouse/ClickHouse/pull/16977. [#17657](https://github.com/ClickHouse/ClickHouse/pull/17657) ([MyroTk](https://github.com/MyroTk)).
-* TestFlows: Added RBAC tests for `ATTACH`, `CREATE`, `DROP`, and `DETACH`. [#16977](https://github.com/ClickHouse/ClickHouse/pull/16977) ([MyroTk](https://github.com/MyroTk)).
## [Changelog for 2020](https://github.com/ClickHouse/ClickHouse/blob/master/docs/en/whats-new/changelog/2020.md)
diff --git a/README.md b/README.md
index 496a6357f44..178547ea523 100644
--- a/README.md
+++ b/README.md
@@ -13,3 +13,6 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Code Browser](https://clickhouse.tech/codebrowser/html_report/ClickHouse/index.html) with syntax highlight and navigation.
* [Contacts](https://clickhouse.tech/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://clickhouse.tech/#meet) to meet Yandex ClickHouse team in person.
+
+## Upcoming Events
+* [SF Bay Area ClickHouse August Community Meetup (online)](https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup/events/279109379/) on 25 August 2021.
diff --git a/base/common/DateLUTImpl.cpp b/base/common/DateLUTImpl.cpp
index e7faeb63760..472f24f3805 100644
--- a/base/common/DateLUTImpl.cpp
+++ b/base/common/DateLUTImpl.cpp
@@ -60,6 +60,7 @@ DateLUTImpl::DateLUTImpl(const std::string & time_zone_)
offset_at_start_of_epoch = cctz_time_zone.lookup(cctz_time_zone.lookup(epoch).pre).offset;
offset_at_start_of_lut = cctz_time_zone.lookup(cctz_time_zone.lookup(lut_start).pre).offset;
offset_is_whole_number_of_hours_during_epoch = true;
+ offset_is_whole_number_of_minutes_during_epoch = true;
cctz::civil_day date = lut_start;
@@ -108,6 +109,9 @@ DateLUTImpl::DateLUTImpl(const std::string & time_zone_)
if (offset_is_whole_number_of_hours_during_epoch && start_of_day > 0 && start_of_day % 3600)
offset_is_whole_number_of_hours_during_epoch = false;
+ if (offset_is_whole_number_of_minutes_during_epoch && start_of_day > 0 && start_of_day % 60)
+ offset_is_whole_number_of_minutes_during_epoch = false;
+
/// If UTC offset was changed this day.
/// Change in time zone without transition is possible, e.g. Moscow 1991 Sun, 31 Mar, 02:00 MSK to EEST
cctz::time_zone::civil_transition transition{};
diff --git a/base/common/DateLUTImpl.h b/base/common/DateLUTImpl.h
index 202eb88a361..012d2cefe84 100644
--- a/base/common/DateLUTImpl.h
+++ b/base/common/DateLUTImpl.h
@@ -193,6 +193,7 @@ private:
/// UTC offset at the beginning of the first supported year.
Time offset_at_start_of_lut;
bool offset_is_whole_number_of_hours_during_epoch;
+ bool offset_is_whole_number_of_minutes_during_epoch;
/// Time zone name.
std::string time_zone;
@@ -251,18 +252,23 @@ private:
}
template <typename T, typename Divisor>
- static inline T roundDown(T x, Divisor divisor)
+ inline T roundDown(T x, Divisor divisor) const
{
static_assert(std::is_integral_v<T> && std::is_integral_v<Divisor>);
assert(divisor > 0);
- if (likely(x >= 0))
- return x / divisor * divisor;
+ if (likely(offset_is_whole_number_of_hours_during_epoch))
+ {
+ if (likely(x >= 0))
+ return x / divisor * divisor;
- /// Integer division for negative numbers rounds them towards zero (up).
- /// We will shift the number so it will be rounded towards -inf (down).
+ /// Integer division for negative numbers rounds them towards zero (up).
+ /// We will shift the number so it will be rounded towards -inf (down).
+ return (x + 1 - divisor) / divisor * divisor;
+ }
- return (x + 1 - divisor) / divisor * divisor;
+ Time date = find(x).date;
+ return date + (x - date) / divisor * divisor;
}
public:
@@ -459,10 +465,21 @@ public:
inline unsigned toSecond(Time t) const
{
- auto res = t % 60;
- if (likely(res >= 0))
- return res;
- return res + 60;
+ if (likely(offset_is_whole_number_of_minutes_during_epoch))
+ {
+ Time res = t % 60;
+ if (likely(res >= 0))
+ return res;
+ return res + 60;
+ }
+
+ LUTIndex index = findIndex(t);
+ Time time = t - lut[index].date;
+
+ if (time >= lut[index].time_at_offset_change())
+ time += lut[index].amount_of_offset_change();
+
+ return time % 60;
}
inline unsigned toMinute(Time t) const
@@ -483,29 +500,11 @@ public:
}
/// NOTE: Assuming timezone offset is a multiple of 15 minutes.
- inline Time toStartOfMinute(Time t) const { return roundDown(t, 60); }
- inline Time toStartOfFiveMinute(Time t) const { return roundDown(t, 300); }
- inline Time toStartOfFifteenMinutes(Time t) const { return roundDown(t, 900); }
-
- inline Time toStartOfTenMinutes(Time t) const
- {
- if (t >= 0 && offset_is_whole_number_of_hours_during_epoch)
- return t / 600 * 600;
-
- /// More complex logic is for Nepal - it has offset 05:45. Australia/Eucla is also unfortunate.
- Time date = find(t).date;
- return date + (t - date) / 600 * 600;
- }
-
- /// NOTE: Assuming timezone transitions are multiple of hours. Lord Howe Island in Australia is a notable exception.
- inline Time toStartOfHour(Time t) const
- {
- if (t >= 0 && offset_is_whole_number_of_hours_during_epoch)
- return t / 3600 * 3600;
-
- Time date = find(t).date;
- return date + (t - date) / 3600 * 3600;
- }
+ inline Time toStartOfMinute(Time t) const { return toStartOfMinuteInterval(t, 1); }
+ inline Time toStartOfFiveMinute(Time t) const { return toStartOfMinuteInterval(t, 5); }
+ inline Time toStartOfFifteenMinutes(Time t) const { return toStartOfMinuteInterval(t, 15); }
+ inline Time toStartOfTenMinutes(Time t) const { return toStartOfMinuteInterval(t, 10); }
+ inline Time toStartOfHour(Time t) const { return roundDown(t, 3600); }
/** Number of calendar day since the beginning of UNIX epoch (1970-01-01 is zero)
* We use just two bytes for it. It covers the range up to 2105 and slightly more.
@@ -903,25 +902,24 @@ public:
inline Time toStartOfMinuteInterval(Time t, UInt64 minutes) const
{
- if (minutes == 1)
- return toStartOfMinute(t);
+ UInt64 divisor = 60 * minutes;
+ if (likely(offset_is_whole_number_of_minutes_during_epoch))
+ {
+ if (likely(t >= 0))
+ return t / divisor * divisor;
+ return (t + 1 - divisor) / divisor * divisor;
+ }
- /** In contrast to "toStartOfHourInterval" function above,
- * the minute intervals are not aligned to the midnight.
- * You will get unexpected results if for example, you round down to 60 minute interval
- * and there was a time shift to 30 minutes.
- *
- * But this is not specified in docs and can be changed in future.
- */
-
- UInt64 seconds = 60 * minutes;
- return roundDown(t, seconds);
+ Time date = find(t).date;
+ return date + (t - date) / divisor * divisor;
}
inline Time toStartOfSecondInterval(Time t, UInt64 seconds) const
{
if (seconds == 1)
return t;
+ if (seconds % 60 == 0)
+ return toStartOfMinuteInterval(t, seconds / 60);
return roundDown(t, seconds);
}
@@ -955,7 +953,7 @@ public:
inline Time makeDateTime(Int16 year, UInt8 month, UInt8 day_of_month, UInt8 hour, UInt8 minute, UInt8 second) const
{
size_t index = makeLUTIndex(year, month, day_of_month);
- UInt32 time_offset = hour * 3600 + minute * 60 + second;
+ Time time_offset = hour * 3600 + minute * 60 + second;
if (time_offset >= lut[index].time_at_offset_change())
time_offset -= lut[index].amount_of_offset_change();
diff --git a/base/glibc-compatibility/musl/getauxval.c b/base/glibc-compatibility/musl/getauxval.c
index a429273fa1a..dad7aa938d7 100644
--- a/base/glibc-compatibility/musl/getauxval.c
+++ b/base/glibc-compatibility/musl/getauxval.c
@@ -1,4 +1,5 @@
#include <sys/auxv.h>
+#include "atomic.h"
#include <unistd.h> // __environ
#include <errno.h>
@@ -17,18 +18,7 @@ static size_t __find_auxv(unsigned long type)
return (size_t) -1;
}
-__attribute__((constructor)) static void __auxv_init()
-{
- size_t i;
- for (i = 0; __environ[i]; i++);
- __auxv = (unsigned long *) (__environ + i + 1);
-
- size_t secure_idx = __find_auxv(AT_SECURE);
- if (secure_idx != ((size_t) -1))
- __auxv_secure = __auxv[secure_idx];
-}
-
-unsigned long getauxval(unsigned long type)
+unsigned long __getauxval(unsigned long type)
{
if (type == AT_SECURE)
return __auxv_secure;
@@ -43,3 +33,38 @@ unsigned long getauxval(unsigned long type)
errno = ENOENT;
return 0;
}
+
+static void * volatile getauxval_func;
+
+static unsigned long __auxv_init(unsigned long type)
+{
+ if (!__environ)
+ {
+ // __environ is not initialized yet so we can't initialize __auxv right now.
+ // That's normally occurred only when getauxval() is called from some sanitizer's internal code.
+ errno = ENOENT;
+ return 0;
+ }
+
+ // Initialize __auxv and __auxv_secure.
+ size_t i;
+ for (i = 0; __environ[i]; i++);
+ __auxv = (unsigned long *) (__environ + i + 1);
+
+ size_t secure_idx = __find_auxv(AT_SECURE);
+ if (secure_idx != ((size_t) -1))
+ __auxv_secure = __auxv[secure_idx];
+
+ // Now we've initialized __auxv, next time getauxval() will only call __getauxval().
+ a_cas_p(&getauxval_func, (void *)__auxv_init, (void *)__getauxval);
+
+ return __getauxval(type);
+}
+
+// First time getauxval() will call __auxv_init().
+static void * volatile getauxval_func = (void *)__auxv_init;
+
+unsigned long getauxval(unsigned long type)
+{
+ return ((unsigned long (*)(unsigned long))getauxval_func)(type);
+}
diff --git a/base/mysqlxx/Pool.cpp b/base/mysqlxx/Pool.cpp
index 386b4544b78..2f47aa67356 100644
--- a/base/mysqlxx/Pool.cpp
+++ b/base/mysqlxx/Pool.cpp
@@ -296,7 +296,7 @@ void Pool::initialize()
Pool::Connection * Pool::allocConnection(bool dont_throw_if_failed_first_time)
{
- std::unique_ptr<Connection> conn_ptr{new Connection};
+ std::unique_ptr<Connection> conn_ptr = std::make_unique<Connection>();
try
{
diff --git a/contrib/croaring-cmake/CMakeLists.txt b/contrib/croaring-cmake/CMakeLists.txt
index f0cb378864b..84cdccedbd3 100644
--- a/contrib/croaring-cmake/CMakeLists.txt
+++ b/contrib/croaring-cmake/CMakeLists.txt
@@ -26,17 +26,14 @@ target_include_directories(roaring SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}/include"
target_include_directories(roaring SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}/cpp")
# We redirect malloc/free family of functions to different functions that will track memory in ClickHouse.
-# It will make this library depend on linking to 'clickhouse_common_io' library that is not done explicitly via 'target_link_libraries'.
-# And we check that all libraries dependencies are satisfied and all symbols are resolved if we do build with shared libraries.
-# That's why we enable it only in static build.
# Also note that we exploit implicit function declarations.
-if (USE_STATIC_LIBRARIES)
- target_compile_definitions(roaring PRIVATE
+target_compile_definitions(roaring PRIVATE
-Dmalloc=clickhouse_malloc
-Dcalloc=clickhouse_calloc
-Drealloc=clickhouse_realloc
-Dreallocarray=clickhouse_reallocarray
-Dfree=clickhouse_free
-Dposix_memalign=clickhouse_posix_memalign)
-endif ()
+
+target_link_libraries(roaring PUBLIC clickhouse_common_io)
diff --git a/docs/en/development/build.md b/docs/en/development/build.md
index 97b477d55a5..be45c1ed5f7 100644
--- a/docs/en/development/build.md
+++ b/docs/en/development/build.md
@@ -155,6 +155,10 @@ Normally ClickHouse is statically linked into a single static `clickhouse` binar
-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1
```
-Note that in this configuration there is no single `clickhouse` binary, and you have to run `clickhouse-server`, `clickhouse-client` etc.
+Note that the split build has several drawbacks:
+* There is no single `clickhouse` binary, and you have to run `clickhouse-server`, `clickhouse-client`, etc.
+* Risk of segfault if you run any of the programs while rebuilding the project.
+* You cannot run the integration tests since they only work with a single complete binary.
+* You can't easily copy the binaries elsewhere. Instead of moving a single binary you'll need to copy all binaries and libraries.
[Original article](https://clickhouse.tech/docs/en/development/build/)
diff --git a/docs/en/engines/database-engines/materialized-mysql.md b/docs/en/engines/database-engines/materialized-mysql.md
index ca550776d53..d329dff32c5 100644
--- a/docs/en/engines/database-engines/materialized-mysql.md
+++ b/docs/en/engines/database-engines/materialized-mysql.md
@@ -1,6 +1,6 @@
---
toc_priority: 29
-toc_title: MaterializedMySQL
+toc_title: "[experimental] MaterializedMySQL"
---
# [experimental] MaterializedMySQL {#materialized-mysql}
@@ -27,28 +27,33 @@ ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'passwo
- `password` — User password.
**Engine Settings**
-- `max_rows_in_buffer` — Max rows that data is allowed to cache in memory(for single table and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `65505`.
-- `max_bytes_in_buffer` — Max bytes that data is allowed to cache in memory(for single table and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `1048576`.
-- `max_rows_in_buffers` — Max rows that data is allowed to cache in memory(for database and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `65505`.
-- `max_bytes_in_buffers` — Max bytes that data is allowed to cache in memory(for database and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `1048576`.
-- `max_flush_data_time` — Max milliseconds that data is allowed to cache in memory(for database and the cache data unable to query). when this time is exceeded, the data will be materialized. Default: `1000`.
-- `max_wait_time_when_mysql_unavailable` — Retry interval when MySQL is not available (milliseconds). Negative value disable retry. Default: `1000`.
-- `allows_query_when_mysql_lost` — Allow query materialized table when mysql is lost. Default: `0` (`false`).
-```
+
+- `max_rows_in_buffer` — Maximum number of rows that data is allowed to cache in memory (for single table and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `65 505`.
+- `max_bytes_in_buffer` — Maximum number of bytes that data is allowed to cache in memory (for single table and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `1 048 576`.
+- `max_rows_in_buffers` — Maximum number of rows that data is allowed to cache in memory (for database and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `65 505`.
+- `max_bytes_in_buffers` — Maximum number of bytes that data is allowed to cache in memory (for database and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `1 048 576`.
+- `max_flush_data_time` — Maximum number of milliseconds that data is allowed to cache in memory (for database and the cache data unable to query). When this time is exceeded, the data will be materialized. Default: `1000`.
+- `max_wait_time_when_mysql_unavailable` — Retry interval when MySQL is not available (milliseconds). Negative value disables retry. Default: `1000`.
+- `allows_query_when_mysql_lost` — Allows querying a materialized table when MySQL is lost. Default: `0` (`false`).
+
+```sql
CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user', '***')
SETTINGS
allows_query_when_mysql_lost=true,
max_wait_time_when_mysql_unavailable=10000;
```
-**Settings on MySQL-server side**
+**Settings on MySQL-server Side**
-For the correct work of `MaterializedMySQL`, there are few mandatory `MySQL`-side configuration settings that should be set:
+For the correct work of `MaterializedMySQL`, there are a few mandatory `MySQL`-side configuration settings that must be set:
- `default_authentication_plugin = mysql_native_password` since `MaterializedMySQL` can only authorize with this method.
-- `gtid_mode = on` since GTID based logging is a mandatory for providing correct `MaterializedMySQL` replication. Pay attention that while turning this mode `On` you should also specify `enforce_gtid_consistency = on`.
+- `gtid_mode = on` since GTID based logging is mandatory for providing correct `MaterializedMySQL` replication.
-## Virtual columns {#virtual-columns}
+!!! attention "Attention"
+ While turning on `gtid_mode` you should also specify `enforce_gtid_consistency = on`.
+
+## Virtual Columns {#virtual-columns}
When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) tables are used with virtual `_sign` and `_version` columns.
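
For instance, the virtual columns can be queried explicitly (a sketch; `mysql_db.t` and `key` stand for a hypothetical table and column replicated by this engine):

``` sql
SELECT key, _sign, _version FROM mysql_db.t ORDER BY _version DESC;
```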
@@ -78,13 +83,13 @@ When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](
| BLOB | [String](../../sql-reference/data-types/string.md) |
| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) |
-Other types are not supported. If MySQL table contains a column of such type, ClickHouse throws exception "Unhandled data type" and stops replication.
-
[Nullable](../../sql-reference/data-types/nullable.md) is supported.
+Other types are not supported. If a MySQL table contains a column of such a type, ClickHouse throws the exception "Unhandled data type" and stops replication.
+
## Specifics and Recommendations {#specifics-and-recommendations}
-### Compatibility restrictions
+### Compatibility Restrictions {#compatibility-restrictions}
Apart of the data types limitations there are few restrictions comparing to `MySQL` databases, that should be resolved before replication will be possible:
diff --git a/docs/en/engines/table-engines/mergetree-family/mergetree.md b/docs/en/engines/table-engines/mergetree-family/mergetree.md
index 561b0ad8023..0c900454cd0 100644
--- a/docs/en/engines/table-engines/mergetree-family/mergetree.md
+++ b/docs/en/engines/table-engines/mergetree-family/mergetree.md
@@ -39,7 +39,10 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
...
INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
- INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
+ INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2,
+ ...
+ PROJECTION projection_name_1 (SELECT <COLUMN LIST EXPR> [GROUP BY] [ORDER BY]),
+ PROJECTION projection_name_2 (SELECT <COLUMN LIST EXPR> [GROUP BY] [ORDER BY])
) ENGINE = MergeTree()
ORDER BY expr
[PARTITION BY expr]
@@ -385,6 +388,24 @@ Functions with a constant argument that is less than ngram size can’t be used
- `s != 1`
- `NOT startsWith(s, 'test')`
+### Projections {#projections}
+Projections are like materialized views but defined at the part level. They provide consistency guarantees along with automatic usage in queries; see the sketch after the Query Analysis steps below.
+
+#### Query {#projection-query}
+A projection query is what defines a projection. It has the following grammar:
+
+`SELECT <column list expr> [GROUP BY] <group keys expr> [ORDER BY] <expr>`
+
+It implicitly selects data from the parent table.
+
+#### Storage {#projection-storage}
+Projections are stored inside the part directory. A projection is similar to an index, but is a subdirectory that stores an anonymous MergeTree table's part. The table is induced by the definition query of the projection. If there is a GROUP BY clause, the underlying storage engine becomes AggregatingMergeTree, and all aggregate functions are converted to AggregateFunction. If there is an ORDER BY clause, the MergeTree table uses it as its primary key expression. During the merge process, the projection part is merged via its storage's merge routine. The checksum of the parent table's part includes the projection's part. Other maintenance jobs are similar to those for skip indices.
+
+#### Query Analysis {#projection-query-analysis}
+1. Check if the projection can be used to answer the given query, that is, it generates the same answer as querying the base table.
+2. Select the best feasible match, which contains the least granules to read.
+3. The query pipeline which uses projections will be different from the one that uses the original parts. If the projection is absent in some parts, we can add the pipeline to "project" it on the fly.
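+
+A minimal sketch of defining and using a projection (table and column names are illustrative; in this release the query-time optimization is assumed to be gated behind the experimental setting shown below):
+
+``` sql
+CREATE TABLE visits
+(
+    user_id UInt64,
+    url String,
+    duration UInt32,
+    PROJECTION by_url
+    (
+        SELECT url, sum(duration) GROUP BY url
+    )
+)
+ENGINE = MergeTree
+ORDER BY user_id;
+
+INSERT INTO visits SELECT number % 10, concat('/page/', toString(number % 3)), number FROM numbers(1000);
+
+SET allow_experimental_projection_optimization = 1;
+SELECT url, sum(duration) FROM visits GROUP BY url;  -- may be answered from the projection parts
+```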
+
## Concurrent Data Access {#concurrent-data-access}
For concurrent table access, we use multi-versioning. In other words, when a table is simultaneously read and updated, data is read from a set of parts that is current at the time of the query. There are no lengthy locks. Inserts do not get in the way of read operations.
diff --git a/docs/en/operations/server-configuration-parameters/settings.md b/docs/en/operations/server-configuration-parameters/settings.md
index d7ffcff35fb..a620565b71a 100644
--- a/docs/en/operations/server-configuration-parameters/settings.md
+++ b/docs/en/operations/server-configuration-parameters/settings.md
@@ -892,6 +892,33 @@ If the table does not exist, ClickHouse will create it. If the structure of the
```
+## query_views_log {#server_configuration_parameters-query_views_log}
+
+Setting for logging the views (materialized or live) that are dependent on queries received with the [log_query_views=1](../../operations/settings/settings.md#settings-log-query-views) setting.
+
+Queries are logged in the [system.query_views_log](../../operations/system-tables/query_views_log.md#system_tables-query_views_log) table, not in a separate file. You can change the name of the table in the `table` parameter (see below).
+
+Use the following parameters to configure logging:
+
+- `database` – Name of the database.
+- `table` – Name of the system table the queries will be logged in.
+- `partition_by` — [Custom partitioning key](../../engines/table-engines/mergetree-family/custom-partitioning-key.md) for a system table. Can't be used if `engine` is defined.
+- `engine` — [MergeTree Engine Definition](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-creating-a-table) for a system table. Can't be used if `partition_by` is defined.
+- `flush_interval_milliseconds` – Interval for flushing data from the buffer in memory to the table.
+
+If the table does not exist, ClickHouse will create it. If the structure of the query views log changed when the ClickHouse server was updated, the table with the old structure is renamed, and a new table is created automatically.
+
+**Example**
+
+``` xml
+<query_views_log>
+    <database>system</database>
+    <table>query_views_log</table>
+    <partition_by>toYYYYMM(event_date)</partition_by>
+    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
+</query_views_log>
+```
+
## text_log {#server_configuration_parameters-text_log}
Settings for the [text_log](../../operations/system-tables/text_log.md#system_tables-text_log) system table for logging text messages.
diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md
index 4936c782299..07bfe158a0a 100644
--- a/docs/en/operations/settings/settings.md
+++ b/docs/en/operations/settings/settings.md
@@ -890,7 +890,7 @@ log_queries_min_type='EXCEPTION_WHILE_PROCESSING'
Setting up query threads logging.
-Queries’ threads runned by ClickHouse with this setup are logged according to the rules in the [query_thread_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_thread_log) server configuration parameter.
+Queries’ threads run by ClickHouse with this setup are logged according to the rules in the [query_thread_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_thread_log) server configuration parameter.
Example:
@@ -898,6 +898,19 @@ Example:
log_query_threads=1
```
+## log_query_views {#settings-log-query-views}
+
+Setting up query views logging.
+
+When a query run by ClickHouse with this setting enabled has associated views (materialized or live views), they are logged in the table specified by the [query_views_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_views_log) server configuration parameter.
+
+Example:
+
+``` text
+log_query_views=1
+```
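+
+Once enabled, the logged views can be inspected (a sketch; `SYSTEM FLUSH LOGS` forces the in-memory buffer to be written out):
+
+``` sql
+SYSTEM FLUSH LOGS;
+SELECT view_name, view_duration_ms, status
+FROM system.query_views_log
+ORDER BY event_time DESC
+LIMIT 5;
+```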
+
+
## log_comment {#settings-log-comment}
Specifies the value for the `log_comment` field of the [system.query_log](../system-tables/query_log.md) table and comment text for the server log.
diff --git a/docs/en/operations/system-tables/query_log.md b/docs/en/operations/system-tables/query_log.md
index 987f1968356..548e454cf58 100644
--- a/docs/en/operations/system-tables/query_log.md
+++ b/docs/en/operations/system-tables/query_log.md
@@ -50,6 +50,7 @@ Columns:
- `query_kind` ([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md)) — Type of the query.
- `databases` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the databases present in the query.
- `tables` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the tables present in the query.
+- `views` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the (materialized or live) views present in the query.
- `columns` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the columns present in the query.
- `projections` ([String](../../sql-reference/data-types/string.md)) — Names of the projections used during the query execution.
- `exception_code` ([Int32](../../sql-reference/data-types/int-uint.md)) — Code of an exception.
@@ -180,5 +181,6 @@ used_table_functions: []
**See Also**
- [system.query_thread_log](../../operations/system-tables/query_thread_log.md#system_tables-query_thread_log) — This table contains information about each query execution thread.
+- [system.query_views_log](../../operations/system-tables/query_views_log.md#system_tables-query_views_log) — This table contains information about each view executed during a query.
[Original article](https://clickhouse.tech/docs/en/operations/system-tables/query_log)
diff --git a/docs/en/operations/system-tables/query_thread_log.md b/docs/en/operations/system-tables/query_thread_log.md
index 7ecea2971b4..152a10504bb 100644
--- a/docs/en/operations/system-tables/query_thread_log.md
+++ b/docs/en/operations/system-tables/query_thread_log.md
@@ -112,5 +112,6 @@ ProfileEvents: {'Query':1,'SelectQuery':1,'ReadCompressedBytes':36,'Compr
**See Also**
- [system.query_log](../../operations/system-tables/query_log.md#system_tables-query_log) — Description of the `query_log` system table which contains common information about queries execution.
+- [system.query_views_log](../../operations/system-tables/query_views_log.md#system_tables-query_views_log) — This table contains information about each view executed during a query.
[Original article](https://clickhouse.tech/docs/en/operations/system-tables/query_thread_log)
diff --git a/docs/en/operations/system-tables/query_views_log.md b/docs/en/operations/system-tables/query_views_log.md
new file mode 100644
index 00000000000..48d36a6a118
--- /dev/null
+++ b/docs/en/operations/system-tables/query_views_log.md
@@ -0,0 +1,81 @@
+# system.query_views_log {#system_tables-query_views_log}
+
+Contains information about the dependent views executed when running a query, for example, the view type or the execution time.
+
+To start logging:
+
+1. Configure parameters in the [query_views_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_views_log) section.
+2. Set [log_query_views](../../operations/settings/settings.md#settings-log-query-views) to 1.
+
+The flushing period of data is set in `flush_interval_milliseconds` parameter of the [query_views_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_views_log) server settings section. To force flushing, use the [SYSTEM FLUSH LOGS](../../sql-reference/statements/system.md#query_language-system-flush_logs) query.
+
+ClickHouse does not delete data from the table automatically. See [Introduction](../../operations/system-tables/index.md#system-tables-introduction) for more details.
+
+Columns:
+
+- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the last event of the view happened.
+- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the view finished execution.
+- `event_time_microseconds` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the view finished execution with microseconds precision.
+- `view_duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Duration of view execution (sum of its stages) in milliseconds.
+- `initial_query_id` ([String](../../sql-reference/data-types/string.md)) — ID of the initial query (for distributed query execution).
+- `view_name` ([String](../../sql-reference/data-types/string.md)) — Name of the view.
+- `view_uuid` ([UUID](../../sql-reference/data-types/uuid.md)) — UUID of the view.
+- `view_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Type of the view. Values:
+ - `'Default' = 1` — [Default views](../../sql-reference/statements/create/view.md#normal). Should not appear in this log.
+ - `'Materialized' = 2` — [Materialized views](../../sql-reference/statements/create/view.md#materialized).
+ - `'Live' = 3` — [Live views](../../sql-reference/statements/create/view.md#live-view).
+- `view_query` ([String](../../sql-reference/data-types/string.md)) — The query executed by the view.
+- `view_target` ([String](../../sql-reference/data-types/string.md)) — The name of the view target table.
+- `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Number of read rows.
+- `read_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Number of read bytes.
+- `written_rows` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Number of written rows.
+- `written_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Number of written bytes.
+- `peak_memory_usage` ([Int64](../../sql-reference/data-types/int-uint.md)) — The maximum difference between the amount of allocated and freed memory in the context of this view.
+- `ProfileEvents` ([Map(String, UInt64)](../../sql-reference/data-types/map.md)) — ProfileEvents that measure different metrics. Their description can be found in the table [system.events](../../operations/system-tables/events.md#system_tables-events).
+- `status` ([Enum8](../../sql-reference/data-types/enum.md)) — Status of the view. Values:
+ - `'QueryStart' = 1` — Successful start of the view execution. Should not appear.
+ - `'QueryFinish' = 2` — Successful end of the view execution.
+ - `'ExceptionBeforeStart' = 3` — Exception before the start of the view execution.
+ - `'ExceptionWhileProcessing' = 4` — Exception during the view execution.
+- `exception_code` ([Int32](../../sql-reference/data-types/int-uint.md)) — Code of an exception.
+- `exception` ([String](../../sql-reference/data-types/string.md)) — Exception message.
+- `stack_trace` ([String](../../sql-reference/data-types/string.md)) — [Stack trace](https://en.wikipedia.org/wiki/Stack_trace). An empty string, if the query was completed successfully.
+
+**Example**
+
+``` sql
+ SELECT * FROM system.query_views_log LIMIT 1 \G
+```
+
+``` text
+Row 1:
+──────
+event_date: 2021-06-22
+event_time: 2021-06-22 13:23:07
+event_time_microseconds: 2021-06-22 13:23:07.738221
+view_duration_ms: 0
+initial_query_id: c3a1ac02-9cad-479b-af54-9e9c0a7afd70
+view_name: default.matview_inner
+view_uuid: 00000000-0000-0000-0000-000000000000
+view_type: Materialized
+view_query: SELECT * FROM default.table_b
+view_target: default.`.inner.matview_inner`
+read_rows: 4
+read_bytes: 64
+written_rows: 2
+written_bytes: 32
+peak_memory_usage: 4196188
+ProfileEvents: {'FileOpen':2,'WriteBufferFromFileDescriptorWrite':2,'WriteBufferFromFileDescriptorWriteBytes':187,'IOBufferAllocs':3,'IOBufferAllocBytes':3145773,'FunctionExecute':3,'DiskWriteElapsedMicroseconds':13,'InsertedRows':2,'InsertedBytes':16,'SelectedRows':4,'SelectedBytes':48,'ContextLock':16,'RWLockAcquiredReadLocks':1,'RealTimeMicroseconds':698,'SoftPageFaults':4,'OSReadChars':463}
+status: QueryFinish
+exception_code: 0
+exception:
+stack_trace:
+```
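+
+The table is populated only for queries executed with the setting `log_query_views = 1` and with the `query_views_log` section enabled in the server configuration. A minimal sketch:
+
+``` sql
+SET log_query_views = 1;
+
+-- Run a query that triggers dependent views, then inspect their statistics.
+SELECT view_name, view_duration_ms, status
+FROM system.query_views_log
+ORDER BY event_time DESC
+LIMIT 10;
+```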
+
+**See Also**
+
+- [system.query_log](../../operations/system-tables/query_log.md#system_tables-query_log) — Description of the `query_log` system table which contains common information about query execution.
+- [system.query_thread_log](../../operations/system-tables/query_thread_log.md#system_tables-query_thread_log) — This table contains information about each query execution thread.
+
+
+[Original article](https://clickhouse.tech/docs/en/operations/system_tables/query_views_log)
diff --git a/docs/en/sql-reference/data-types/string.md b/docs/en/sql-reference/data-types/string.md
index cb3a70ec7f8..2cf11ac85a3 100644
--- a/docs/en/sql-reference/data-types/string.md
+++ b/docs/en/sql-reference/data-types/string.md
@@ -15,6 +15,6 @@ When creating tables, numeric parameters for string fields can be set (e.g. `VAR
ClickHouse does not have the concept of encodings. Strings can contain an arbitrary set of bytes, which are stored and output as-is.
If you need to store texts, we recommend using UTF-8 encoding. At the very least, if your terminal uses UTF-8 (as recommended), you can read and write your values without making conversions.
Similarly, certain functions for working with strings have separate variations that work under the assumption that the string contains a set of bytes representing a UTF-8 encoded text.
-For example, the ‘length’ function calculates the string length in bytes, while the ‘lengthUTF8’ function calculates the string length in Unicode code points, assuming that the value is UTF-8 encoded.
+For example, the [length](../functions/string-functions.md#length) function calculates the string length in bytes, while the [lengthUTF8](../functions/string-functions.md#lengthutf8) function calculates the string length in Unicode code points, assuming that the value is UTF-8 encoded.
[Original article](https://clickhouse.tech/docs/en/data_types/string/)
diff --git a/docs/en/sql-reference/functions/array-functions.md b/docs/en/sql-reference/functions/array-functions.md
index 422bbe4b4ea..e7918c018db 100644
--- a/docs/en/sql-reference/functions/array-functions.md
+++ b/docs/en/sql-reference/functions/array-functions.md
@@ -7,19 +7,89 @@ toc_title: Arrays
## empty {#function-empty}
-Returns 1 for an empty array, or 0 for a non-empty array.
-The result type is UInt8.
-The function also works for strings.
+Checks whether the input array is empty.
-Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT empty(arr) FROM table` transforms to `SELECT arr.size0 = 0 FROM TABLE`.
+**Syntax**
+
+``` sql
+empty([x])
+```
+
+An array is considered empty if it does not contain any elements.
+
+!!! note "Note"
+ Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT empty(arr) FROM TABLE;` transforms to `SELECT arr.size0 = 0 FROM TABLE;`.
+
+The function also works for [strings](string-functions.md#empty) or [UUID](uuid-functions.md#empty).
+
+**Arguments**
+
+- `[x]` — Input array. [Array](../data-types/array.md).
+
+**Returned value**
+
+- Returns `1` for an empty array or `0` for a non-empty array.
+
+Type: [UInt8](../data-types/int-uint.md).
+
+**Example**
+
+Query:
+
+```sql
+SELECT empty([]);
+```
+
+Result:
+
+```text
+┌─empty(array())─┐
+│ 1 │
+└────────────────┘
+```
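+
+A sketch of the optimization described in the note above (the table and column names are hypothetical):
+
+``` sql
+SET optimize_functions_to_subcolumns = 1;
+
+-- Reads only the arr.size0 subcolumn instead of the whole array column.
+SELECT empty(arr) FROM t;
+```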
## notEmpty {#function-notempty}
-Returns 0 for an empty array, or 1 for a non-empty array.
-The result type is UInt8.
-The function also works for strings.
+Checks whether the input array is non-empty.
-Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT notEmpty(arr) FROM table` transforms to `SELECT arr.size0 != 0 FROM TABLE`.
+**Syntax**
+
+``` sql
+notEmpty([x])
+```
+
+An array is considered non-empty if it contains at least one element.
+
+!!! note "Note"
+ Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT notEmpty(arr) FROM table` transforms to `SELECT arr.size0 != 0 FROM TABLE`.
+
+The function also works for [strings](string-functions.md#notempty) or [UUID](uuid-functions.md#notempty).
+
+**Arguments**
+
+- `[x]` — Input array. [Array](../data-types/array.md).
+
+**Returned value**
+
+- Returns `1` for a non-empty array or `0` for an empty array.
+
+Type: [UInt8](../data-types/int-uint.md).
+
+**Example**
+
+Query:
+
+```sql
+SELECT notEmpty([1,2]);
+```
+
+Result:
+
+```text
+┌─notEmpty([1, 2])─┐
+│ 1 │
+└──────────────────┘
+```
## length {#array_functions-length}
diff --git a/docs/en/sql-reference/functions/string-functions.md b/docs/en/sql-reference/functions/string-functions.md
index 8ec8aa7339d..c7c84c5aca6 100644
--- a/docs/en/sql-reference/functions/string-functions.md
+++ b/docs/en/sql-reference/functions/string-functions.md
@@ -10,17 +10,83 @@ toc_title: Strings
## empty {#empty}
-Returns 1 for an empty string or 0 for a non-empty string.
-The result type is UInt8.
+Checks whether the input string is empty.
+
+**Syntax**
+
+``` sql
+empty(x)
+```
+
A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.
-The function also works for arrays or UUID.
-UUID is empty if it is all zeros (nil UUID).
+
+The function also works for [arrays](array-functions.md#function-empty) or [UUID](uuid-functions.md#empty).
+
+**Arguments**
+
+- `x` — Input value. [String](../data-types/string.md).
+
+**Returned value**
+
+- Returns `1` for an empty string or `0` for a non-empty string.
+
+Type: [UInt8](../data-types/int-uint.md).
+
+**Example**
+
+Query:
+
+```sql
+SELECT empty('');
+```
+
+Result:
+
+```text
+┌─empty('')─┐
+│ 1 │
+└───────────┘
+```
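+
+Note that a string consisting of a single space or a single null byte is not empty, because it contains one byte:
+
+``` sql
+SELECT empty(' ');
+```
+
+This query returns `0`.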
## notEmpty {#notempty}
-Returns 0 for an empty string or 1 for a non-empty string.
-The result type is UInt8.
-The function also works for arrays or UUID.
+Checks whether the input string is non-empty.
+
+**Syntax**
+
+``` sql
+notEmpty(x)
+```
+
+A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.
+
+The function also works for [arrays](array-functions.md#function-notempty) or [UUID](uuid-functions.md#notempty).
+
+**Arguments**
+
+- `x` — Input value. [String](../data-types/string.md).
+
+**Returned value**
+
+- Returns `1` for a non-empty string or `0` for an empty string.
+
+Type: [UInt8](../data-types/int-uint.md).
+
+**Example**
+
+Query:
+
+```sql
+SELECT notEmpty('text');
+```
+
+Result:
+
+```text
+┌─notEmpty('text')─┐
+│ 1 │
+└──────────────────┘
+```
## length {#length}
@@ -43,6 +109,158 @@ The result type is UInt64.
Returns the length of a string in Unicode code points (not in characters), assuming that the string contains a set of bytes that make up UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception).
The result type is UInt64.
+## leftPad {#leftpad}
+
+Pads the current string from the left with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length, similar to the MySQL `LPAD` function.
+
+**Syntax**
+
+``` sql
+leftPad('string', 'length'[, 'pad_string'])
+```
+
+**Arguments**
+
+- `string` — Input string that needs to be padded. [String](../data-types/string.md).
+- `length` — The length of the resulting string. [UInt](../data-types/int-uint.md). If the value is less than the input string length, then the input string is returned as-is.
+- `pad_string` — The string to pad the input string with. [String](../data-types/string.md). Optional. If not specified, then the input string is padded with spaces.
+
+**Returned value**
+
+- The resulting string of the given length.
+
+Type: [String](../data-types/string.md).
+
+**Example**
+
+Query:
+
+``` sql
+SELECT leftPad('abc', 7, '*'), leftPad('def', 7);
+```
+
+Result:
+
+``` text
+┌─leftPad('abc', 7, '*')─┬─leftPad('def', 7)─┐
+│ ****abc │ def │
+└────────────────────────┴───────────────────┘
+```
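+
+A common use is zero-padding numbers after converting them to strings (a sketch; `toString` does the conversion):
+
+``` sql
+SELECT leftPad(toString(42), 5, '0');
+```
+
+This query returns the string `00042`.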
+
+## leftPadUTF8 {#leftpadutf8}
+
+Pads the current string from the left with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length, similar to the MySQL `LPAD` function. While the [leftPad](#leftpad) function measures the length in bytes, `leftPadUTF8` measures it in code points.
+
+**Syntax**
+
+``` sql
+leftPadUTF8('string', 'length'[, 'pad_string'])
+```
+
+**Arguments**
+
+- `string` — Input string that needs to be padded. [String](../data-types/string.md).
+- `length` — The length of the resulting string. [UInt](../data-types/int-uint.md). If the value is less than the input string length, then the input string is returned as-is.
+- `pad_string` — The string to pad the input string with. [String](../data-types/string.md). Optional. If not specified, then the input string is padded with spaces.
+
+**Returned value**
+
+- The resulting string of the given length.
+
+Type: [String](../data-types/string.md).
+
+**Example**
+
+Query:
+
+``` sql
+SELECT leftPadUTF8('абвг', 7, '*'), leftPadUTF8('дежз', 7);
+```
+
+Result:
+
+``` text
+┌─leftPadUTF8('абвг', 7, '*')─┬─leftPadUTF8('дежз', 7)─┐
+│ ***абвг │ дежз │
+└─────────────────────────────┴────────────────────────┘
+```
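+
+Because non-ASCII characters occupy several bytes in UTF-8, the byte-based [leftPad](#leftpad) and the code-point-based `leftPadUTF8` pad the same arguments differently. For example, `абвг` is 4 code points but 8 bytes:
+
+``` sql
+SELECT leftPad('абвг', 9, '*'), leftPadUTF8('абвг', 9, '*');
+```
+
+Here `leftPad` adds a single `*` to reach 9 bytes, while `leftPadUTF8` adds five to reach 9 code points.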
+
+## rightPad {#rightpad}
+
+Pads the current string from the right with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length, similar to the MySQL `RPAD` function.
+
+**Syntax**
+
+``` sql
+rightPad('string', 'length'[, 'pad_string'])
+```
+
+**Arguments**
+
+- `string` — Input string that needs to be padded. [String](../data-types/string.md).
+- `length` — The length of the resulting string. [UInt](../data-types/int-uint.md). If the value is less than the input string length, then the input string is returned as-is.
+- `pad_string` — The string to pad the input string with. [String](../data-types/string.md). Optional. If not specified, then the input string is padded with spaces.
+
+**Returned value**
+
+- The resulting string of the given length.
+
+Type: [String](../data-types/string.md).
+
+**Example**
+
+Query:
+
+``` sql
+SELECT rightPad('abc', 7, '*'), rightPad('abc', 7);
+```
+
+Result:
+
+``` text
+┌─rightPad('abc', 7, '*')─┬─rightPad('abc', 7)─┐
+│ abc**** │ abc │
+└─────────────────────────┴────────────────────┘
+```
+
+## rightPadUTF8 {#rightpadutf8}
+
+Pads the current string from the right with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length, similar to the MySQL `RPAD` function. While the [rightPad](#rightpad) function measures the length in bytes, `rightPadUTF8` measures it in code points.
+
+**Syntax**
+
+``` sql
+rightPadUTF8('string', 'length'[, 'pad_string'])
+```
+
+**Arguments**
+
+- `string` — Input string that needs to be padded. [String](../data-types/string.md).
+- `length` — The length of the resulting string. [UInt](../data-types/int-uint.md). If the value is less than the input string length, then the input string is returned as-is.
+- `pad_string` — The string to pad the input string with. [String](../data-types/string.md). Optional. If not specified, then the input string is padded with spaces.
+
+**Returned value**
+
+- The resulting string of the given length.
+
+Type: [String](../data-types/string.md).
+
+**Example**
+
+Query:
+
+``` sql
+SELECT rightPadUTF8('абвг', 7, '*'), rightPadUTF8('абвг', 7);
+```
+
+Result:
+
+``` text
+┌─rightPadUTF8('абвг', 7, '*')─┬─rightPadUTF8('абвг', 7)─┐
+│ абвг*** │ абвг │
+└──────────────────────────────┴─────────────────────────┘
+```
+
## lower, lcase {#lower}
Converts ASCII Latin symbols in a string to lowercase.
diff --git a/docs/en/sql-reference/functions/uuid-functions.md b/docs/en/sql-reference/functions/uuid-functions.md
index e7e55c699cd..e5ab45bda40 100644
--- a/docs/en/sql-reference/functions/uuid-functions.md
+++ b/docs/en/sql-reference/functions/uuid-functions.md
@@ -9,7 +9,7 @@ The functions for working with UUID are listed below.
## generateUUIDv4 {#uuid-function-generate}
-Generates the [UUID](../../sql-reference/data-types/uuid.md) of [version 4](https://tools.ietf.org/html/rfc4122#section-4.4).
+Generates the [UUID](../data-types/uuid.md) of [version 4](https://tools.ietf.org/html/rfc4122#section-4.4).
``` sql
generateUUIDv4()
@@ -37,6 +37,90 @@ SELECT * FROM t_uuid
└──────────────────────────────────────┘
```
+## empty {#empty}
+
+Checks whether the input UUID is empty.
+
+**Syntax**
+
+```sql
+empty(UUID)
+```
+
+The UUID is considered empty if it contains all zeros (zero UUID).
+
+The function also works for [arrays](array-functions.md#function-empty) or [strings](string-functions.md#empty).
+
+**Arguments**
+
+- `x` — Input UUID. [UUID](../data-types/uuid.md).
+
+**Returned value**
+
+- Returns `1` for an empty UUID or `0` for a non-empty UUID.
+
+Type: [UInt8](../data-types/int-uint.md).
+
+**Example**
+
+To generate the UUID value, ClickHouse provides the [generateUUIDv4](#uuid-function-generate) function.
+
+Query:
+
+```sql
+SELECT empty(generateUUIDv4());
+```
+
+Result:
+
+```text
+┌─empty(generateUUIDv4())─┐
+│ 0 │
+└─────────────────────────┘
+```
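+
+An explicitly zero UUID is empty as well. For example, using the [toUUID](#touuid-x) function described below:
+
+``` sql
+SELECT empty(toUUID('00000000-0000-0000-0000-000000000000'));
+```
+
+This query returns `1`.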
+
+## notEmpty {#notempty}
+
+Checks whether the input UUID is non-empty.
+
+**Syntax**
+
+```sql
+notEmpty(UUID)
+```
+
+The UUID is considered empty if it contains all zeros (zero UUID).
+
+The function also works for [arrays](array-functions.md#function-notempty) or [strings](string-functions.md#notempty).
+
+**Arguments**
+
+- `x` — Input UUID. [UUID](../data-types/uuid.md).
+
+**Returned value**
+
+- Returns `1` for a non-empty UUID or `0` for an empty UUID.
+
+Type: [UInt8](../data-types/int-uint.md).
+
+**Example**
+
+To generate the UUID value, ClickHouse provides the [generateUUIDv4](#uuid-function-generate) function.
+
+Query:
+
+```sql
+SELECT notEmpty(generateUUIDv4());
+```
+
+Result:
+
+```text
+┌─notEmpty(generateUUIDv4())─┐
+│ 1 │
+└────────────────────────────┘
+```
+
## toUUID (x) {#touuid-x}
Converts String type value to UUID type.
diff --git a/docs/en/sql-reference/statements/select/distinct.md b/docs/en/sql-reference/statements/select/distinct.md
index 87154cba05a..390afa46248 100644
--- a/docs/en/sql-reference/statements/select/distinct.md
+++ b/docs/en/sql-reference/statements/select/distinct.md
@@ -6,23 +6,55 @@ toc_title: DISTINCT
If `SELECT DISTINCT` is specified, only unique rows will remain in a query result. Thus only a single row will remain out of all the sets of fully matching rows in the result.
-## Null Processing {#null-processing}
+You can specify the list of columns that must have unique values: `SELECT DISTINCT ON (column1, column2,...)`. If the columns are not specified, all of them are taken into consideration.
-`DISTINCT` works with [NULL](../../../sql-reference/syntax.md#null-literal) as if `NULL` were a specific value, and `NULL==NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` occur only once. It differs from `NULL` processing in most other contexts.
+Consider the table:
-## Alternatives {#alternatives}
+```text
+┌─a─┬─b─┬─c─┐
+│ 1 │ 1 │ 1 │
+│ 1 │ 1 │ 1 │
+│ 2 │ 2 │ 2 │
+│ 2 │ 2 │ 2 │
+│ 1 │ 1 │ 2 │
+│ 1 │ 2 │ 2 │
+└───┴───┴───┘
+```
-It is possible to obtain the same result by applying [GROUP BY](../../../sql-reference/statements/select/group-by.md) across the same set of values as specified as `SELECT` clause, without using any aggregate functions. But there are few differences from `GROUP BY` approach:
+Using `DISTINCT` without specifying columns:
-- `DISTINCT` can be applied together with `GROUP BY`.
-- When [ORDER BY](../../../sql-reference/statements/select/order-by.md) is omitted and [LIMIT](../../../sql-reference/statements/select/limit.md) is defined, the query stops running immediately after the required number of different rows has been read.
-- Data blocks are output as they are processed, without waiting for the entire query to finish running.
+```sql
+SELECT DISTINCT * FROM t1;
+```
-## Examples {#examples}
+```text
+┌─a─┬─b─┬─c─┐
+│ 1 │ 1 │ 1 │
+│ 2 │ 2 │ 2 │
+│ 1 │ 1 │ 2 │
+│ 1 │ 2 │ 2 │
+└───┴───┴───┘
+```
+
+Using `DISTINCT` with specified columns:
+
+```sql
+SELECT DISTINCT ON (a,b) * FROM t1;
+```
+
+```text
+┌─a─┬─b─┬─c─┐
+│ 1 │ 1 │ 1 │
+│ 2 │ 2 │ 2 │
+│ 1 │ 2 │ 2 │
+└───┴───┴───┘
+```
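+
+A similar per-group result can be obtained with [LIMIT BY](../../../sql-reference/statements/select/limit-by.md) (a sketch, not an exact equivalent: `LIMIT BY` keeps the first rows for each distinct combination of values rather than deduplicating):
+
+``` sql
+SELECT * FROM t1 LIMIT 1 BY a, b;
+```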
+
+## DISTINCT and ORDER BY {#distinct-orderby}
ClickHouse supports using the `DISTINCT` and `ORDER BY` clauses for different columns in one query. The `DISTINCT` clause is executed before the `ORDER BY` clause.
-Example table:
+Consider the table:
``` text
┌─a─┬─b─┐
@@ -33,7 +65,11 @@ Example table:
└───┴───┘
```
-When selecting data with the `SELECT DISTINCT a FROM t1 ORDER BY b ASC` query, we get the following result:
+Selecting data:
+
+```sql
+SELECT DISTINCT a FROM t1 ORDER BY b ASC;
+```
``` text
┌─a─┐
@@ -42,8 +78,11 @@ When selecting data with the `SELECT DISTINCT a FROM t1 ORDER BY b ASC` query, w
│ 3 │
└───┘
```
+Selecting data with a different sorting direction:
-If we change the sorting direction `SELECT DISTINCT a FROM t1 ORDER BY b DESC`, we get the following result:
+```sql
+SELECT DISTINCT a FROM t1 ORDER BY b DESC;
+```
``` text
┌─a─┐
@@ -56,3 +95,15 @@ If we change the sorting direction `SELECT DISTINCT a FROM t1 ORDER BY b DESC`,
Row `2, 4` was cut before sorting.
Take this implementation specificity into account when programming queries.
+
+## Null Processing {#null-processing}
+
+`DISTINCT` works with [NULL](../../../sql-reference/syntax.md#null-literal) as if `NULL` were a specific value, and `NULL==NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` occur only once. It differs from `NULL` processing in most other contexts.
+
+## Alternatives {#alternatives}
+
+It is possible to obtain the same result by applying [GROUP BY](../../../sql-reference/statements/select/group-by.md) across the same set of values as specified in the `SELECT` clause, without using any aggregate functions. But there are a few differences from the `GROUP BY` approach (see the sketch after this list):
+
+- `DISTINCT` can be applied together with `GROUP BY`.
+- When [ORDER BY](../../../sql-reference/statements/select/order-by.md) is omitted and [LIMIT](../../../sql-reference/statements/select/limit.md) is defined, the query stops running immediately after the required number of different rows has been read.
+- Data blocks are output as they are processed, without waiting for the entire query to finish running.
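+
+For example, the following two queries return the same set of rows:
+
+``` sql
+SELECT DISTINCT a FROM t1;
+SELECT a FROM t1 GROUP BY a;
+```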
diff --git a/docs/en/sql-reference/statements/select/index.md b/docs/en/sql-reference/statements/select/index.md
index 04273ca1d4d..4e96bae8493 100644
--- a/docs/en/sql-reference/statements/select/index.md
+++ b/docs/en/sql-reference/statements/select/index.md
@@ -13,7 +13,7 @@ toc_title: Overview
``` sql
[WITH expr_list|(subquery)]
-SELECT [DISTINCT] expr_list
+SELECT [DISTINCT [ON (column1, column2, ...)]] expr_list
[FROM [db.]table | (subquery) | table_function] [FINAL]
[SAMPLE sample_coeff]
[ARRAY JOIN ...]
@@ -36,6 +36,8 @@ All clauses are optional, except for the required list of expressions immediatel
Specifics of each optional clause are covered in separate sections, which are listed in the same order as they are executed:
- [WITH clause](../../../sql-reference/statements/select/with.md)
+- [SELECT clause](#select-clause)
+- [DISTINCT clause](../../../sql-reference/statements/select/distinct.md)
- [FROM clause](../../../sql-reference/statements/select/from.md)
- [SAMPLE clause](../../../sql-reference/statements/select/sample.md)
- [JOIN clause](../../../sql-reference/statements/select/join.md)
@@ -44,8 +46,6 @@ Specifics of each optional clause are covered in separate sections, which are li
- [GROUP BY clause](../../../sql-reference/statements/select/group-by.md)
- [LIMIT BY clause](../../../sql-reference/statements/select/limit-by.md)
- [HAVING clause](../../../sql-reference/statements/select/having.md)
-- [SELECT clause](#select-clause)
-- [DISTINCT clause](../../../sql-reference/statements/select/distinct.md)
- [LIMIT clause](../../../sql-reference/statements/select/limit.md)
- [OFFSET clause](../../../sql-reference/statements/select/offset.md)
- [UNION clause](../../../sql-reference/statements/select/union.md)
diff --git a/docs/ru/development/developer-instruction.md b/docs/ru/development/developer-instruction.md
index d23c0bbbdca..c568db4731f 100644
--- a/docs/ru/development/developer-instruction.md
+++ b/docs/ru/development/developer-instruction.md
@@ -168,7 +168,13 @@ sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
cmake -D CMAKE_BUILD_TYPE=Debug ..
-Вы можете изменить вариант сборки, выполнив эту команду в директории build.
+В случае использования на разработческой машине старого HDD или SSD, а также при желании использовать меньше места для артефактов сборки можно использовать следующую команду:
+```bash
+cmake -DUSE_DEBUG_HELPERS=1 -DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1 ..
+```
+При этом надо учесть, что получаемые в результате сборки исполнимые файлы будут динамически слинкованы с библиотеками, и поэтому фактически станут непереносимыми на другие компьютеры (либо для этого нужно будет предпринять значительно больше усилий по сравнению со статической сборкой). Плюсом же в данном случае является значительно меньшее время сборки (это проявляется не на первой сборке, а на последующих, после внесения изменений в исходный код - тратится меньшее время на линковку по сравнению со статической сборкой) и значительно меньшее использование места на жёстком диске (экономия более, чем в 3 раза по сравнению со статической сборкой). Для целей разработки, когда планируются только отладочные запуски на том же компьютере, где осуществлялась сборка, это может быть наиболее удобным вариантом.
+
+Вы можете изменить вариант сборки, выполнив новую команду в директории build.
Запустите ninja для сборки:
diff --git a/docs/ru/engines/database-engines/materialized-mysql.md b/docs/ru/engines/database-engines/materialized-mysql.md
index 0175e794cd5..1cd864c01e9 100644
--- a/docs/ru/engines/database-engines/materialized-mysql.md
+++ b/docs/ru/engines/database-engines/materialized-mysql.md
@@ -1,10 +1,12 @@
---
toc_priority: 29
-toc_title: MaterializedMySQL
+toc_title: "[experimental] MaterializedMySQL"
---
# [экспериментальный] MaterializedMySQL {#materialized-mysql}
+**Это экспериментальный движок, который не следует использовать в продакшене.**
+
Создает базу данных ClickHouse со всеми таблицами, существующими в MySQL, и всеми данными в этих таблицах.
Сервер ClickHouse работает как реплика MySQL. Он читает файл binlog и выполняет DDL and DML-запросы.
@@ -23,6 +25,32 @@ ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'passwo
- `user` — пользователь MySQL.
- `password` — пароль пользователя.
+**Настройки движка**
+
+- `max_rows_in_buffer` — максимальное количество строк, содержимое которых может кешироваться в памяти (для одной таблицы и данных кеша, которые невозможно запросить). При превышении этого количества данные будут материализованы. Значение по умолчанию: `65 505`.
+- `max_bytes_in_buffer` — максимальное количество байтов, которое разрешено кешировать в памяти (для одной таблицы и данных кеша, которые невозможно запросить). При превышении этого количества данные будут материализованы. Значение по умолчанию: `1 048 576`.
+- `max_rows_in_buffers` — максимальное количество строк, содержимое которых может кешироваться в памяти (для базы данных и данных кеша, которые невозможно запросить). При превышении этого количества данные будут материализованы. Значение по умолчанию: `65 505`.
+- `max_bytes_in_buffers` — максимальное количество байтов, которое разрешено кешировать в памяти (для базы данных и данных кеша, которые невозможно запросить). При превышении этого количества данные будут материализованы. Значение по умолчанию: `1 048 576`.
+- `max_flush_data_time` — максимальное время в миллисекундах, в течение которого разрешено кешировать данные в памяти (для базы данных и данных кеша, которые невозможно запросить). При превышении указанного периода данные будут материализованы. Значение по умолчанию: `1000`.
+- `max_wait_time_when_mysql_unavailable` — интервал между повторными попытками, если MySQL недоступен. Указывается в миллисекундах. Отрицательное значение отключает повторные попытки. Значение по умолчанию: `1000`.
+- `allows_query_when_mysql_lost` — признак, разрешен ли запрос к материализованной таблице при потере соединения с MySQL. Значение по умолчанию: `0` (`false`).
+
+```sql
+CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user', '***')
+ SETTINGS
+ allows_query_when_mysql_lost=true,
+ max_wait_time_when_mysql_unavailable=10000;
+```
+
+**Настройки на стороне MySQL-сервера**
+
+Для правильной работы `MaterializedMySQL` следует обязательно указать на сервере MySQL следующие параметры конфигурации:
+- `default_authentication_plugin = mysql_native_password` — `MaterializedMySQL` может авторизоваться только с помощью этого метода.
+- `gtid_mode = on` — ведение журнала на основе GTID является обязательным для обеспечения правильной репликации.
+
+!!! attention "Внимание"
+ При включении `gtid_mode` вы также должны указать `enforce_gtid_consistency = on`.
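+
+Проверить текущие значения этих параметров на стороне MySQL можно, например, так:
+
+``` sql
+SHOW VARIABLES LIKE 'default_authentication_plugin';
+SHOW VARIABLES LIKE 'gtid_mode';
+SHOW VARIABLES LIKE 'enforce_gtid_consistency';
+```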
+
## Виртуальные столбцы {#virtual-columns}
При работе с движком баз данных `MaterializedMySQL` используются таблицы семейства [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) с виртуальными столбцами `_sign` и `_version`.
@@ -51,13 +79,21 @@ ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'passwo
| STRING | [String](../../sql-reference/data-types/string.md) |
| VARCHAR, VAR_STRING | [String](../../sql-reference/data-types/string.md) |
| BLOB | [String](../../sql-reference/data-types/string.md) |
-
-Другие типы не поддерживаются. Если таблица MySQL содержит столбец другого типа, ClickHouse выдаст исключение "Неподдерживаемый тип данных" ("Unhandled data type") и остановит репликацию.
+| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) |
Тип [Nullable](../../sql-reference/data-types/nullable.md) поддерживается.
+Другие типы не поддерживаются. Если таблица MySQL содержит столбец другого типа, ClickHouse выдаст исключение "Неподдерживаемый тип данных" ("Unhandled data type") и остановит репликацию.
+
## Особенности и рекомендации {#specifics-and-recommendations}
+### Ограничения совместимости {#compatibility-restrictions}
+
+Кроме ограничений на типы данных, существует несколько ограничений по сравнению с базами данных MySQL, которые следует решить до того, как станет возможной репликация:
+
+- Каждая таблица в MySQL должна содержать `PRIMARY KEY`.
+- Репликация для таблиц, содержащих строки со значениями полей `ENUM` вне диапазона значений (определяется размерностью `ENUM`), не будет работать.
+
### DDL-запросы {#ddl-queries}
DDL-запросы в MySQL конвертируются в соответствующие DDL-запросы в ClickHouse ([ALTER](../../sql-reference/statements/alter/index.md), [CREATE](../../sql-reference/statements/create/index.md), [DROP](../../sql-reference/statements/drop.md), [RENAME](../../sql-reference/statements/rename.md)). Если ClickHouse не может конвертировать какой-либо DDL-запрос, он его игнорирует.
@@ -158,3 +194,4 @@ SELECT * FROM mysql.test;
└───┴─────┴──────┘
```
+[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/database-engines/materialized-mysql/)
diff --git a/docs/ru/engines/table-engines/mergetree-family/mergetree.md b/docs/ru/engines/table-engines/mergetree-family/mergetree.md
index 4bced6254d1..db6eb8154ba 100644
--- a/docs/ru/engines/table-engines/mergetree-family/mergetree.md
+++ b/docs/ru/engines/table-engines/mergetree-family/mergetree.md
@@ -375,6 +375,24 @@ INDEX b (u64 * length(str), i32 + f64 * 100, date, str) TYPE set(100) GRANULARIT
- `s != 1`
- `NOT startsWith(s, 'test')`
+### Проекции {#projections}
+Проекции похожи на материализованные представления, но определяются на уровне партов. Это обеспечивает гарантии согласованности наряду с автоматическим использованием в запросах.
+
+#### Запрос {#projection-query}
+Запрос проекции — это то, что определяет проекцию. Он имеет следующую грамматику:
+
+`SELECT [GROUP BY] [ORDER BY]`
+
+Он неявно выбирает данные из родительской таблицы.
+
+#### Хранение {#projection-storage}
+Проекции хранятся в каталоге парта. Это похоже на хранение индексов, но используется подкаталог, в котором хранится анонимный парт таблицы MergeTree. Таблица создается запросом определения проекции. Если есть конструкция GROUP BY, то базовый механизм хранения становится AggregatedMergeTree, а все агрегатные функции преобразуются в AggregateFunction. Если есть конструкция ORDER BY, таблица MergeTree будет использовать его в качестве выражения первичного ключа. Во время процесса слияния парт проекции будет слит с помощью процедуры слияния ее хранилища. Контрольная сумма парта родительской таблицы будет включать парт проекции. Другие процедуры аналогичны индексам пропуска данных.
+
+#### Анализ запросов {#projection-query-analysis}
+1. Проверить, можно ли использовать проекцию в данном запросе, то есть, что с ней выходит тот же результат, что и с запросом к базовой таблице.
+2. Выбрать наиболее подходящее совпадение, содержащее наименьшее количество гранул для чтения.
+3. План запроса, который использует проекции, будет отличаться от того, который использует исходные парты. При отсутствии проекции в некоторых партах можно расширить план, чтобы «проецировать» на лету.
+
## Конкурентный доступ к данным {#concurrent-data-access}
Для конкурентного доступа к таблице используется мультиверсионность. То есть, при одновременном чтении и обновлении таблицы, данные будут читаться из набора кусочков, актуального на момент запроса. Длинных блокировок нет. Вставки никак не мешают чтениям.
diff --git a/docs/ru/sql-reference/functions/array-functions.md b/docs/ru/sql-reference/functions/array-functions.md
index 52fd63864ce..b7a301d30a9 100644
--- a/docs/ru/sql-reference/functions/array-functions.md
+++ b/docs/ru/sql-reference/functions/array-functions.md
@@ -7,19 +7,89 @@ toc_title: "Массивы"
## empty {#function-empty}
-Возвращает 1 для пустого массива, и 0 для непустого массива.
-Тип результата - UInt8.
-Функция также работает для строк.
+Проверяет, является ли входной массив пустым.
-Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT empty(arr) FROM table` преобразуется к запросу `SELECT arr.size0 = 0 FROM TABLE`.
+**Синтаксис**
+
+``` sql
+empty([x])
+```
+
+Массив считается пустым, если он не содержит ни одного элемента.
+
+!!! note "Примечание"
+ Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT empty(arr) FROM TABLE` преобразуется к запросу `SELECT arr.size0 = 0 FROM TABLE`.
+
+Функция также поддерживает работу с типами [String](string-functions.md#empty) и [UUID](uuid-functions.md#empty).
+
+**Параметры**
+
+- `[x]` — массив на входе функции. [Array](../data-types/array.md).
+
+**Возвращаемое значение**
+
+- Возвращает `1` для пустого массива или `0` — для непустого массива.
+
+Тип: [UInt8](../data-types/int-uint.md).
+
+**Пример**
+
+Запрос:
+
+```sql
+SELECT empty([]);
+```
+
+Ответ:
+
+```text
+┌─empty(array())─┐
+│ 1 │
+└────────────────┘
+```
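+
+Набросок использования оптимизации из примечания выше (имена таблицы и столбца условные):
+
+``` sql
+SET optimize_functions_to_subcolumns = 1;
+
+-- Читается только подстолбец arr.size0 вместо всего столбца массива.
+SELECT empty(arr) FROM t;
+```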
## notEmpty {#function-notempty}
-Возвращает 0 для пустого массива, и 1 для непустого массива.
-Тип результата - UInt8.
-Функция также работает для строк.
+Проверяет, является ли входной массив непустым.
-Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT notEmpty(arr) FROM table` преобразуется к запросу `SELECT arr.size0 != 0 FROM TABLE`.
+**Синтаксис**
+
+``` sql
+notEmpty([x])
+```
+
+Массив считается непустым, если он содержит хотя бы один элемент.
+
+!!! note "Примечание"
+ Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT notEmpty(arr) FROM table` преобразуется к запросу `SELECT arr.size0 != 0 FROM TABLE`.
+
+Функция также поддерживает работу с типами [String](string-functions.md#notempty) и [UUID](uuid-functions.md#notempty).
+
+**Параметры**
+
+- `[x]` — массив на входе функции. [Array](../data-types/array.md).
+
+**Возвращаемое значение**
+
+- Возвращает `1` для непустого массива или `0` — для пустого массива.
+
+Тип: [UInt8](../data-types/int-uint.md).
+
+**Пример**
+
+Запрос:
+
+```sql
+SELECT notEmpty([1,2]);
+```
+
+Результат:
+
+```text
+┌─notEmpty([1, 2])─┐
+│ 1 │
+└──────────────────┘
+```
## length {#array_functions-length}
diff --git a/docs/ru/sql-reference/functions/string-functions.md b/docs/ru/sql-reference/functions/string-functions.md
index b587a991db1..cbda5188881 100644
--- a/docs/ru/sql-reference/functions/string-functions.md
+++ b/docs/ru/sql-reference/functions/string-functions.md
@@ -7,16 +7,83 @@ toc_title: "Функции для работы со строками"
## empty {#empty}
-Возвращает 1 для пустой строки, и 0 для непустой строки.
-Тип результата — UInt8.
-Строка считается непустой, если содержит хотя бы один байт, пусть даже это пробел или нулевой байт.
-Функция также работает для массивов.
+Проверяет, является ли входная строка пустой.
+
+**Синтаксис**
+
+``` sql
+empty(x)
+```
+
+Строка считается непустой, если содержит хотя бы один байт, пусть даже это пробел или нулевой байт.
+
+Функция также поддерживает работу с типами [Array](array-functions.md#function-empty) и [UUID](uuid-functions.md#empty).
+
+**Параметры**
+
+- `x` — Входная строка. [String](../data-types/string.md).
+
+**Возвращаемое значение**
+
+- Возвращает `1` для пустой строки и `0` — для непустой строки.
+
+Тип: [UInt8](../data-types/int-uint.md).
+
+**Пример**
+
+Запрос:
+
+```sql
+SELECT empty('');
+```
+
+Результат:
+
+```text
+┌─empty('')─┐
+│ 1 │
+└───────────┘
+```
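+
+Обратите внимание: строка из одного пробела или одного нулевого байта не считается пустой, так как содержит один байт:
+
+``` sql
+SELECT empty(' ');
+```
+
+Этот запрос вернёт `0`.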
## notEmpty {#notempty}
-Возвращает 0 для пустой строки, и 1 для непустой строки.
-Тип результата — UInt8.
-Функция также работает для массивов.
+Проверяет, является ли входная строка непустой.
+
+**Синтаксис**
+
+``` sql
+notEmpty(x)
+```
+
+Строка считается непустой, если содержит хотя бы один байт, пусть даже это пробел или нулевой байт.
+
+Функция также поддерживает работу с типами [Array](array-functions.md#function-notempty) и [UUID](uuid-functions.md#notempty).
+
+**Параметры**
+
+- `x` — Входная строка. [String](../data-types/string.md).
+
+**Возвращаемое значение**
+
+- Возвращает `1` для непустой строки и `0` — для пустой строки.
+
+Тип: [UInt8](../data-types/int-uint.md).
+
+**Пример**
+
+Запрос:
+
+```sql
+SELECT notEmpty('text');
+```
+
+Результат:
+
+```text
+┌─notEmpty('text')─┐
+│ 1 │
+└──────────────────┘
+```
## length {#length}
@@ -39,6 +106,158 @@ toc_title: "Функции для работы со строками"
Возвращает длину строки в кодовых точках Unicode (не символах), при допущении, что строка содержит набор байтов, являющийся текстом в кодировке UTF-8. Если допущение не выполнено, возвращает какой-нибудь результат (не кидает исключение).
Тип результата — UInt64.
+## leftPad {#leftpad}
+
+Дополняет текущую строку слева пробелами или указанной строкой (несколько раз, если необходимо), пока результирующая строка не достигнет заданной длины. Соответствует MySQL функции `LPAD`.
+
+**Синтаксис**
+
+``` sql
+leftPad('string', 'length'[, 'pad_string'])
+```
+
+**Параметры**
+
+- `string` — входная строка, которую необходимо дополнить. [String](../data-types/string.md).
+- `length` — длина результирующей строки. [UInt](../data-types/int-uint.md). Если указанное значение меньше, чем длина входной строки, то входная строка возвращается как есть.
+- `pad_string` — строка, используемая для дополнения входной строки. [String](../data-types/string.md). Необязательный параметр. Если не указано, то входная строка дополняется пробелами.
+
+**Возвращаемое значение**
+
+- Результирующая строка заданной длины.
+
+Тип: [String](../data-types/string.md).
+
+**Пример**
+
+Запрос:
+
+``` sql
+SELECT leftPad('abc', 7, '*'), leftPad('def', 7);
+```
+
+Результат:
+
+``` text
+┌─leftPad('abc', 7, '*')─┬─leftPad('def', 7)─┐
+│ ****abc │ def │
+└────────────────────────┴───────────────────┘
+```
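+
+Типичный пример использования: дополнение чисел нулями после преобразования в строку (набросок; преобразование выполняет функция `toString`):
+
+``` sql
+SELECT leftPad(toString(42), 5, '0');
+```
+
+Этот запрос вернёт строку `00042`.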
+
+## leftPadUTF8 {#leftpadutf8}
+
+Дополняет текущую строку слева пробелами или указанной строкой (несколько раз, если необходимо), пока результирующая строка не достигнет заданной длины. Соответствует MySQL функции `LPAD`. В отличие от функции [leftPad](#leftpad), измеряет длину строки не в байтах, а в кодовых точках Unicode.
+
+**Синтаксис**
+
+``` sql
+leftPadUTF8('string','length'[, 'pad_string'])
+```
+
+**Параметры**
+
+- `string` — входная строка, которую необходимо дополнить. [String](../data-types/string.md).
+- `length` — длина результирующей строки. [UInt](../data-types/int-uint.md). Если указанное значение меньше, чем длина входной строки, то входная строка возвращается как есть.
+- `pad_string` — строка, используемая для дополнения входной строки. [String](../data-types/string.md). Необязательный параметр. Если не указано, то входная строка дополняется пробелами.
+
+**Возвращаемое значение**
+
+- Результирующая строка заданной длины.
+
+Тип: [String](../data-types/string.md).
+
+**Пример**
+
+Запрос:
+
+``` sql
+SELECT leftPadUTF8('абвг', 7, '*'), leftPadUTF8('дежз', 7);
+```
+
+Результат:
+
+``` text
+┌─leftPadUTF8('абвг', 7, '*')─┬─leftPadUTF8('дежз', 7)─┐
+│ ***абвг │ дежз │
+└─────────────────────────────┴────────────────────────┘
+```
+
+## rightPad {#rightpad}
+
+Дополняет текущую строку справа пробелами или указанной строкой (несколько раз, если необходимо), пока результирующая строка не достигнет заданной длины. Соответствует MySQL функции `RPAD`.
+
+**Синтаксис**
+
+``` sql
+rightPad('string', 'length'[, 'pad_string'])
+```
+
+**Параметры**
+
+- `string` — входная строка, которую необходимо дополнить. [String](../data-types/string.md).
+- `length` — длина результирующей строки. [UInt](../data-types/int-uint.md). Если указанное значение меньше, чем длина входной строки, то входная строка возвращается как есть.
+- `pad_string` — строка, используемая для дополнения входной строки. [String](../data-types/string.md). Необязательный параметр. Если не указано, то входная строка дополняется пробелами.
+
+**Возвращаемое значение**
+
+- Результирующая строка заданной длины.
+
+Тип: [String](../data-types/string.md).
+
+**Пример**
+
+Запрос:
+
+``` sql
+SELECT rightPad('abc', 7, '*'), rightPad('abc', 7);
+```
+
+Результат:
+
+``` text
+┌─rightPad('abc', 7, '*')─┬─rightPad('abc', 7)─┐
+│ abc**** │ abc │
+└─────────────────────────┴────────────────────┘
+```
+
+## rightPadUTF8 {#rightpadutf8}
+
+Дополняет текущую строку справа пробелами или указанной строкой (несколько раз, если необходимо), пока результирующая строка не достигнет заданной длины. Соответствует MySQL функции `RPAD`. В отличие от функции [rightPad](#rightpad), измеряет длину строки не в байтах, а в кодовых точках Unicode.
+
+**Синтаксис**
+
+``` sql
+rightPadUTF8('string','length'[, 'pad_string'])
+```
+
+**Параметры**
+
+- `string` — входная строка, которую необходимо дополнить. [String](../data-types/string.md).
+- `length` — длина результирующей строки. [UInt](../data-types/int-uint.md). Если указанное значение меньше, чем длина входной строки, то входная строка возвращается как есть.
+- `pad_string` — строка, используемая для дополнения входной строки. [String](../data-types/string.md). Необязательный параметр. Если не указано, то входная строка дополняется пробелами.
+
+**Возвращаемое значение**
+
+- Результирующая строка заданной длины.
+
+Тип: [String](../data-types/string.md).
+
+**Пример**
+
+Запрос:
+
+``` sql
+SELECT rightPadUTF8('абвг', 7, '*'), rightPadUTF8('абвг', 7);
+```
+
+Результат:
+
+``` text
+┌─rightPadUTF8('абвг', 7, '*')─┬─rightPadUTF8('абвг', 7)─┐
+│ абвг*** │ абвг │
+└──────────────────────────────┴─────────────────────────┘
+```
+
## lower, lcase {#lower}
Переводит ASCII-символы латиницы в строке в нижний регистр.
diff --git a/docs/ru/sql-reference/functions/uuid-functions.md b/docs/ru/sql-reference/functions/uuid-functions.md
index f0017adbc8b..0d534a2d7ce 100644
--- a/docs/ru/sql-reference/functions/uuid-functions.md
+++ b/docs/ru/sql-reference/functions/uuid-functions.md
@@ -35,6 +35,90 @@ SELECT * FROM t_uuid
└──────────────────────────────────────┘
```
+## empty {#empty}
+
+Проверяет, является ли входной UUID пустым.
+
+**Синтаксис**
+
+```sql
+empty(UUID)
+```
+
+UUID считается пустым, если он содержит все нули (нулевой UUID).
+
+Функция также поддерживает работу с типами [Array](array-functions.md#function-empty) и [String](string-functions.md#empty).
+
+**Параметры**
+
+- `x` — UUID на входе функции. [UUID](../data-types/uuid.md).
+
+**Возвращаемое значение**
+
+- Возвращает `1` для пустого UUID или `0` — для непустого UUID.
+
+Тип: [UInt8](../data-types/int-uint.md).
+
+**Пример**
+
+Для генерации UUID-значений предназначена функция [generateUUIDv4](#uuid-function-generate).
+
+Запрос:
+
+```sql
+SELECT empty(generateUUIDv4());
+```
+
+Ответ:
+
+```text
+┌─empty(generateUUIDv4())─┐
+│ 0 │
+└─────────────────────────┘
+```
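+
+Явно заданный нулевой UUID также считается пустым. Например, с использованием функции [toUUID](#touuid-x):
+
+``` sql
+SELECT empty(toUUID('00000000-0000-0000-0000-000000000000'));
+```
+
+Этот запрос вернёт `1`.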
+
+## notEmpty {#notempty}
+
+Проверяет, является ли входной UUID непустым.
+
+**Синтаксис**
+
+```sql
+notEmpty(UUID)
+```
+
+UUID считается пустым, если он содержит все нули (нулевой UUID).
+
+Функция также поддерживает работу с типами [Array](array-functions.md#function-notempty) и [String](string-functions.md#notempty).
+
+**Параметры**
+
+- `x` — UUID на входе функции. [UUID](../data-types/uuid.md).
+
+**Возвращаемое значение**
+
+- Возвращает `1` для непустого UUID или `0` — для пустого UUID.
+
+Тип: [UInt8](../data-types/int-uint.md).
+
+**Пример**
+
+Для генерации UUID-значений предназначена функция [generateUUIDv4](#uuid-function-generate).
+
+Запрос:
+
+```sql
+SELECT notEmpty(generateUUIDv4());
+```
+
+Результат:
+
+```text
+┌─notEmpty(generateUUIDv4())─┐
+│ 1 │
+└────────────────────────────┘
+```
+
## toUUID (x) {#touuid-x}
Преобразует значение типа String в тип UUID.
diff --git a/docs/ru/sql-reference/statements/alter/projection.md b/docs/ru/sql-reference/statements/alter/projection.md
new file mode 100644
index 00000000000..db116963aa6
--- /dev/null
+++ b/docs/ru/sql-reference/statements/alter/projection.md
@@ -0,0 +1,23 @@
+---
+toc_priority: 49
+toc_title: PROJECTION
+---
+
+# Манипуляции с проекциями {#manipulations-with-projections}
+
+Доступны следующие операции:
+
+- `ALTER TABLE [db].name ADD PROJECTION name AS SELECT [GROUP BY] [ORDER BY]` — добавляет описание проекции в метаданные.
+
+- `ALTER TABLE [db].name DROP PROJECTION name` — удаляет описание проекции из метаданных и удаляет файлы проекции с диска.
+
+- `ALTER TABLE [db.]table MATERIALIZE PROJECTION name IN PARTITION partition_name` — перестраивает проекцию в указанной партиции. Реализовано как [мутация](../../../sql-reference/statements/alter/index.md#mutations).
+
+- `ALTER TABLE [db.]table CLEAR PROJECTION name IN PARTITION partition_name` — удаляет файлы проекции с диска без удаления описания.
+
+Команды ADD, DROP и CLEAR легковесны, поскольку они только меняют метаданные или удаляют файлы.
+
+Также команды реплицируются, синхронизируя описания проекций в метаданных с помощью ZooKeeper.
+
+!!! note "Примечание"
+ Манипуляции с проекциями поддерживаются только для таблиц с движком [`*MergeTree`](../../../engines/table-engines/mergetree-family/mergetree.md) (включая [replicated](../../../engines/table-engines/mergetree-family/replication.md) варианты).
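+
+Набросок использования перечисленных выше команд (имена таблицы, проекции и столбцов условные):
+
+``` sql
+ALTER TABLE visits ADD PROJECTION p_agg AS SELECT toStartOfHour(ts) AS hour, sum(hits) GROUP BY hour;
+ALTER TABLE visits MATERIALIZE PROJECTION p_agg IN PARTITION 202108;
+```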
diff --git a/docs/ru/sql-reference/statements/select/distinct.md b/docs/ru/sql-reference/statements/select/distinct.md
index f57c2a42593..42c1df64540 100644
--- a/docs/ru/sql-reference/statements/select/distinct.md
+++ b/docs/ru/sql-reference/statements/select/distinct.md
@@ -6,19 +6,51 @@ toc_title: DISTINCT
Если указан `SELECT DISTINCT`, то в результате запроса останутся только уникальные строки. Таким образом, из всех наборов полностью совпадающих строк в результате останется только одна строка.
-## Обработка NULL {#null-processing}
+Вы можете указать столбцы, по которым хотите отбирать уникальные значения: `SELECT DISTINCT ON (column1, column2,...)`. Если столбцы не указаны, то отбираются строки, в которых значения уникальны во всех столбцах.
-`DISTINCT` работает с [NULL](../../syntax.md#null-literal) как-будто `NULL` — обычное значение и `NULL==NULL`. Другими словами, в результате `DISTINCT`, различные комбинации с `NULL` встретятся только один раз. Это отличается от обработки `NULL` в большинстве других контекстов.
+Рассмотрим таблицу:
-## Альтернативы {#alternatives}
+```text
+┌─a─┬─b─┬─c─┐
+│ 1 │ 1 │ 1 │
+│ 1 │ 1 │ 1 │
+│ 2 │ 2 │ 2 │
+│ 2 │ 2 │ 2 │
+│ 1 │ 1 │ 2 │
+│ 1 │ 2 │ 2 │
+└───┴───┴───┘
+```
-Такой же результат можно получить, применив секцию [GROUP BY](group-by.md) для того же набора значений, которые указан в секции `SELECT`, без использования каких-либо агрегатных функций. Но есть от `GROUP BY` несколько отличий:
+Использование `DISTINCT` без указания столбцов:
-- `DISTINCT` может применяться вместе с `GROUP BY`.
-- Когда секция [ORDER BY](order-by.md) опущена, а секция [LIMIT](limit.md) присутствует, запрос прекращает выполнение сразу после считывания необходимого количества различных строк.
-- Блоки данных выводятся по мере их обработки, не дожидаясь завершения выполнения всего запроса.
+```sql
+SELECT DISTINCT * FROM t1;
+```
-## Примеры {#examples}
+```text
+┌─a─┬─b─┬─c─┐
+│ 1 │ 1 │ 1 │
+│ 2 │ 2 │ 2 │
+│ 1 │ 1 │ 2 │
+│ 1 │ 2 │ 2 │
+└───┴───┴───┘
+```
+
+Использование `DISTINCT` с указанием столбцов:
+
+```sql
+SELECT DISTINCT ON (a,b) * FROM t1;
+```
+
+```text
+┌─a─┬─b─┬─c─┐
+│ 1 │ 1 │ 1 │
+│ 2 │ 2 │ 2 │
+│ 1 │ 2 │ 2 │
+└───┴───┴───┘
+```
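+
+Похожий результат для каждой группы значений можно получить с помощью [LIMIT BY](limit-by.md) (набросок, не точный эквивалент: `LIMIT BY` оставляет первые строки для каждого набора значений, а не убирает дубликаты по всем столбцам):
+
+``` sql
+SELECT * FROM t1 LIMIT 1 BY a, b;
+```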
+
+## DISTINCT и ORDER BY {#distinct-orderby}
ClickHouse поддерживает использование секций `DISTINCT` и `ORDER BY` для разных столбцов в одном запросе. Секция `DISTINCT` выполняется до секции `ORDER BY`.
@@ -56,3 +88,16 @@ ClickHouse поддерживает использование секций `DIS
Ряд `2, 4` был разрезан перед сортировкой.
Учитывайте эту специфику при разработке запросов.
+
+## Обработка NULL {#null-processing}
+
+`DISTINCT` работает с [NULL](../../syntax.md#null-literal) как будто `NULL` — обычное значение и `NULL==NULL`. Другими словами, в результате `DISTINCT` различные комбинации с `NULL` встретятся только один раз. Это отличается от обработки `NULL` в большинстве других контекстов.
+
+## Альтернативы {#alternatives}
+
+Можно получить такой же результат, применив [GROUP BY](group-by.md) для того же набора значений, который указан в секции `SELECT`, без использования каких-либо агрегатных функций. Но есть несколько отличий от `GROUP BY`:
+
+- `DISTINCT` может применяться вместе с `GROUP BY`.
+- Когда секция [ORDER BY](order-by.md) опущена, а секция [LIMIT](limit.md) присутствует, запрос прекращает выполнение сразу после считывания необходимого количества различных строк.
+- Блоки данных выводятся по мере их обработки, не дожидаясь завершения выполнения всего запроса.
+
diff --git a/docs/ru/sql-reference/statements/select/index.md b/docs/ru/sql-reference/statements/select/index.md
index a0a862cbf55..c2820bc7be4 100644
--- a/docs/ru/sql-reference/statements/select/index.md
+++ b/docs/ru/sql-reference/statements/select/index.md
@@ -11,7 +11,7 @@ toc_title: "Обзор"
``` sql
[WITH expr_list|(subquery)]
-SELECT [DISTINCT] expr_list
+SELECT [DISTINCT [ON (column1, column2, ...)]] expr_list
[FROM [db.]table | (subquery) | table_function] [FINAL]
[SAMPLE sample_coeff]
[ARRAY JOIN ...]
@@ -34,6 +34,8 @@ SELECT [DISTINCT] expr_list
Особенности каждой необязательной секции рассматриваются в отдельных разделах, которые перечислены в том же порядке, в каком они выполняются:
- [Секция WITH](with.md)
+- [Секция SELECT](#select-clause)
+- [Секция DISTINCT](distinct.md)
- [Секция FROM](from.md)
- [Секция SAMPLE](sample.md)
- [Секция JOIN](join.md)
@@ -42,8 +44,6 @@ SELECT [DISTINCT] expr_list
- [Секция GROUP BY](group-by.md)
- [Секция LIMIT BY](limit-by.md)
- [Секция HAVING](having.md)
-- [Секция SELECT](#select-clause)
-- [Секция DISTINCT](distinct.md)
- [Секция LIMIT](limit.md)
- [Секция OFFSET](offset.md)
- [Секция UNION ALL](union.md)
diff --git a/programs/client/QueryFuzzer.h b/programs/client/QueryFuzzer.h
index 19f089c6c4e..09d57f4161f 100644
--- a/programs/client/QueryFuzzer.h
+++ b/programs/client/QueryFuzzer.h
@@ -7,7 +7,6 @@
#include
#include
-#include
#include
#include
diff --git a/programs/server/Server.cpp b/programs/server/Server.cpp
index 86bb04351b1..5520f920823 100644
--- a/programs/server/Server.cpp
+++ b/programs/server/Server.cpp
@@ -97,7 +97,7 @@
#endif
#if USE_SSL
-# if USE_INTERNAL_SSL_LIBRARY
+# if USE_INTERNAL_SSL_LIBRARY && !defined(ARCADIA_BUILD)
# include
# endif
# include
@@ -126,6 +126,7 @@ namespace CurrentMetrics
extern const Metric VersionInteger;
extern const Metric MemoryTracking;
extern const Metric MaxDDLEntryID;
+ extern const Metric MaxPushedDDLEntryID;
}
namespace fs = std::filesystem;
@@ -1468,7 +1469,8 @@ if (ThreadFuzzer::instance().isEffective())
if (pool_size < 1)
throw Exception("distributed_ddl.pool_size should be greater then 0", ErrorCodes::ARGUMENT_OUT_OF_BOUND);
global_context->setDDLWorker(std::make_unique(pool_size, ddl_zookeeper_path, global_context, &config(),
- "distributed_ddl", "DDLWorker", &CurrentMetrics::MaxDDLEntryID));
+ "distributed_ddl", "DDLWorker",
+ &CurrentMetrics::MaxDDLEntryID, &CurrentMetrics::MaxPushedDDLEntryID));
}
for (auto & server : *servers)
diff --git a/programs/server/config.xml b/programs/server/config.xml
index 78182482c1c..510a5e230f8 100644
--- a/programs/server/config.xml
+++ b/programs/server/config.xml
@@ -320,7 +320,7 @@
The amount of data in mapped files can be monitored
in system.metrics, system.metric_log by the MMappedFiles, MMappedFileBytes metrics
and in system.asynchronous_metrics, system.asynchronous_metrics_log by the MMapCacheCells metric,
- and also in system.events, system.processes, system.query_log, system.query_thread_log by the
+ and also in system.events, system.processes, system.query_log, system.query_thread_log, system.query_views_log by the
CreatedReadBufferMMap, CreatedReadBufferMMapFailed, MMappedFileCacheHits, MMappedFileCacheMisses events.
Note that the amount of data in mapped files does not consume memory directly and is not accounted
in query or server memory usage - because this memory can be discarded similar to OS page cache.
@@ -878,14 +878,23 @@
            <flush_interval_milliseconds>7500</flush_interval_milliseconds>
+
+        <!-- Query views log. Has information about all dependent views associated with a query.
+             Used only for queries with setting log_query_views = 1. -->
+        <query_views_log>
+            <database>system</database>
+            <table>query_views_log</table>
+            <partition_by>toYYYYMM(event_date)</partition_by>
+            <flush_interval_milliseconds>7500</flush_interval_milliseconds>
+        </query_views_log>
+
        <part_log>
            <database>system</database>
            <table>part_log</table>
+            <partition_by>toYYYYMM(event_date)</partition_by>
            <flush_interval_milliseconds>7500</flush_interval_milliseconds>
        </part_log>
-        -->
diff --git a/programs/server/config.yaml.example b/programs/server/config.yaml.example
index bebfd74ff58..5b2da1d3128 100644
--- a/programs/server/config.yaml.example
+++ b/programs/server/config.yaml.example
@@ -271,7 +271,7 @@ mark_cache_size: 5368709120
# The amount of data in mapped files can be monitored
# in system.metrics, system.metric_log by the MMappedFiles, MMappedFileBytes metrics
# and in system.asynchronous_metrics, system.asynchronous_metrics_log by the MMapCacheCells metric,
-# and also in system.events, system.processes, system.query_log, system.query_thread_log by the
+# and also in system.events, system.processes, system.query_log, system.query_thread_log, system.query_views_log by the
# CreatedReadBufferMMap, CreatedReadBufferMMapFailed, MMappedFileCacheHits, MMappedFileCacheMisses events.
# Note that the amount of data in mapped files does not consume memory directly and is not accounted
# in query or server memory usage - because this memory can be discarded similar to OS page cache.
@@ -731,12 +731,21 @@ query_thread_log:
partition_by: toYYYYMM(event_date)
flush_interval_milliseconds: 7500
+# Query views log. Has information about all dependent views associated with a query.
+# Used only for queries with setting log_query_views = 1.
+query_views_log:
+ database: system
+ table: query_views_log
+ partition_by: toYYYYMM(event_date)
+ flush_interval_milliseconds: 7500
+
# Uncomment if use part log.
# Part log contains information about all actions with parts in MergeTree tables (creation, deletion, merges, downloads).
-# part_log:
-# database: system
-# table: part_log
-# flush_interval_milliseconds: 7500
+part_log:
+ database: system
+ table: part_log
+ partition_by: toYYYYMM(event_date)
+ flush_interval_milliseconds: 7500
# Uncomment to write text log into table.
# Text log contains all information from usual server log but stores it in structured and efficient way.
diff --git a/src/Access/ya.make b/src/Access/ya.make
index 5f2f410cabd..38c1b007ff8 100644
--- a/src/Access/ya.make
+++ b/src/Access/ya.make
@@ -46,7 +46,6 @@ SRCS(
SettingsProfilesInfo.cpp
User.cpp
UsersConfigAccessStorage.cpp
- tests/gtest_access_rights_ops.cpp
)
diff --git a/src/Access/ya.make.in b/src/Access/ya.make.in
index 1f11c7d7d2a..5fa69cec4bb 100644
--- a/src/Access/ya.make.in
+++ b/src/Access/ya.make.in
@@ -8,7 +8,7 @@ PEERDIR(
SRCS(
-    <? find . -name '*.cpp' | grep -v -F examples | sed 's/^\.\// /' | sort ?>
+    <? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | sed 's/^\.\// /' | sort ?>
)
END()
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 6c10d3e2f2b..796c9eb4d2c 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -299,10 +299,11 @@ target_link_libraries(clickhouse_common_io
${ZLIB_LIBRARIES}
pcg_random
Poco::Foundation
- roaring
)
-
+# Make dbms depend on roaring instead of clickhouse_common_io so that roaring itself can depend on clickhouse_common_io.
+# That way we can redirect malloc/free functions, avoiding circular dependencies.
+dbms_target_link_libraries(PUBLIC roaring)
if (USE_RDKAFKA)
dbms_target_link_libraries(PRIVATE ${CPPKAFKA_LIBRARY} ${RDKAFKA_LIBRARY})
diff --git a/src/Columns/ColumnLowCardinality.h b/src/Columns/ColumnLowCardinality.h
index faf5bb9e712..a78c7d88a11 100644
--- a/src/Columns/ColumnLowCardinality.h
+++ b/src/Columns/ColumnLowCardinality.h
@@ -194,6 +194,7 @@ public:
const IColumnUnique & getDictionary() const { return dictionary.getColumnUnique(); }
IColumnUnique & getDictionary() { return dictionary.getColumnUnique(); }
const ColumnPtr & getDictionaryPtr() const { return dictionary.getColumnUniquePtr(); }
+ ColumnPtr & getDictionaryPtr() { return dictionary.getColumnUniquePtr(); }
/// IColumnUnique & getUnique() { return static_cast(*column_unique); }
/// ColumnPtr getUniquePtr() const { return column_unique; }
diff --git a/src/Common/CurrentMetrics.cpp b/src/Common/CurrentMetrics.cpp
index f94c3421107..9acefe8a2d8 100644
--- a/src/Common/CurrentMetrics.cpp
+++ b/src/Common/CurrentMetrics.cpp
@@ -60,6 +60,7 @@
M(BrokenDistributedFilesToInsert, "Number of files for asynchronous insertion into Distributed tables that have been marked as broken. This metric starts from 0 on server start. The number of files for every shard is summed.") \
M(TablesToDropQueueSize, "Number of dropped tables, that are waiting for background data removal.") \
M(MaxDDLEntryID, "Max processed DDL entry of DDLWorker.") \
+    M(MaxPushedDDLEntryID, "Max DDL entry of DDLWorker that was pushed to ZooKeeper.") \
M(PartsTemporary, "The part is generating now, it is not in data_parts list.") \
M(PartsPreCommitted, "The part is in data_parts, but not used for SELECTs.") \
M(PartsCommitted, "Active data part, used by current and upcoming SELECTs.") \
diff --git a/src/Common/DenseHashMap.h b/src/Common/DenseHashMap.h
new file mode 100644
index 00000000000..9ac21c82676
--- /dev/null
+++ b/src/Common/DenseHashMap.h
@@ -0,0 +1,29 @@
+#pragma once
+#include <unordered_map>
+
+/// DenseHashMap is a wrapper for google::dense_hash_map.
+/// Some hacks are needed to make it work in "Arcadia".
+/// "Arcadia" is a proprietary monorepository in Yandex.
+/// It uses slightly changed version of sparsehash with a different set of hash functions (which we don't need).
+/// Those defines are needed to make it compile.
+#if defined(ARCADIA_BUILD)
+#define HASH_FUN_H <unordered_map>
+template <typename T>
+struct THash;
+#endif
+
+#include <sparsehash/dense_hash_map>
+
+#if !defined(ARCADIA_BUILD)
+    template <class Key, class T, class HashFcn = std::hash<Key>,
+        class EqualKey = std::equal_to<Key>,
+        class Alloc = google::libc_allocator_with_realloc<std::pair<const Key, T>>>
+    using DenseHashMap = google::dense_hash_map<Key, T, HashFcn, EqualKey, Alloc>;
+#else
+    template <class Key, class T, class HashFcn = std::hash<Key>,
+        class EqualKey = std::equal_to<Key>,
+        class Alloc = google::sparsehash::libc_allocator_with_realloc<std::pair<const Key, T>>>
+    using DenseHashMap = google::sparsehash::dense_hash_map<Key, T, HashFcn, EqualKey, Alloc>;
+
+    #undef THash
+#endif
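One property the wrapper inherits from google::dense_hash_map: an "empty key" must be registered before the first insertion, which is why the NamesAndTypes.cpp hunk below calls `types.set_empty_key(StringRef())`. A minimal sketch, assuming the sparsehash headers are installed:

```cpp
#include <sparsehash/dense_hash_map>
#include <iostream>
#include <string>

int main()
{
    google::dense_hash_map<std::string, int> counts;
    counts.set_empty_key(std::string()); // mandatory before any insert or lookup
    counts["mmap"] += 1;
    std::cout << counts["mmap"] << '\n'; // prints 1
}
```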
diff --git a/src/Common/DenseHashSet.h b/src/Common/DenseHashSet.h
new file mode 100644
index 00000000000..e8c06f36aa3
--- /dev/null
+++ b/src/Common/DenseHashSet.h
@@ -0,0 +1,25 @@
+#pragma once
+
+/// DenseHashSet is a wrapper for google::dense_hash_set.
+/// See comment in DenseHashMap.h
+#if defined(ARCADIA_BUILD)
+#define HASH_FUN_H <unordered_map>
+template <typename T>
+struct THash;
+#endif
+
+#include <sparsehash/dense_hash_set>
+
+#if !defined(ARCADIA_BUILD)
+    template <class Key, class HashFcn = std::hash<Key>,
+        class EqualKey = std::equal_to<Key>,
+        class Alloc = google::libc_allocator_with_realloc<Key>>
+    using DenseHashSet = google::dense_hash_set<Key, HashFcn, EqualKey, Alloc>;
+#else
+    template <class Key, class HashFcn = std::hash<Key>,
+        class EqualKey = std::equal_to<Key>,
+        class Alloc = google::sparsehash::libc_allocator_with_realloc<Key>>
+    using DenseHashSet = google::sparsehash::dense_hash_set<Key, HashFcn, EqualKey, Alloc>;
+
+    #undef THash
+#endif
diff --git a/src/Common/Exception.cpp b/src/Common/Exception.cpp
index 641f8bbe0f0..09629b436b2 100644
--- a/src/Common/Exception.cpp
+++ b/src/Common/Exception.cpp
@@ -94,6 +94,22 @@ std::string getExceptionStackTraceString(const std::exception & e)
#endif
}
+std::string getExceptionStackTraceString(std::exception_ptr e)
+{
+ try
+ {
+ std::rethrow_exception(e);
+ }
+ catch (const std::exception & exception)
+ {
+ return getExceptionStackTraceString(exception);
+ }
+ catch (...)
+ {
+ return {};
+ }
+}
+
std::string Exception::getStackTraceString() const
{
@@ -380,6 +396,30 @@ int getCurrentExceptionCode()
}
}
+int getExceptionErrorCode(std::exception_ptr e)
+{
+ try
+ {
+ std::rethrow_exception(e);
+ }
+ catch (const Exception & exception)
+ {
+ return exception.code();
+ }
+ catch (const Poco::Exception &)
+ {
+ return ErrorCodes::POCO_EXCEPTION;
+ }
+ catch (const std::exception &)
+ {
+ return ErrorCodes::STD_EXCEPTION;
+ }
+ catch (...)
+ {
+ return ErrorCodes::UNKNOWN_EXCEPTION;
+ }
+}
+
void rethrowFirstException(const Exceptions & exceptions)
{
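Both new overloads use the same idiom: std::exception_ptr is opaque, so the only portable way to inspect it is to rethrow it and catch by type, from most to least specific. A standalone sketch of the idiom with a hypothetical describe() helper:

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>
#include <string>

std::string describe(std::exception_ptr e)
{
    try
    {
        std::rethrow_exception(e); // rethrow-and-catch is the only way to look inside
    }
    catch (const std::exception & ex)
    {
        return ex.what();
    }
    catch (...)
    {
        return "unknown exception";
    }
}

int main()
{
    std::cout << describe(std::make_exception_ptr(std::runtime_error("boom"))) << '\n';
}
```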
diff --git a/src/Common/Exception.h b/src/Common/Exception.h
index 79b4394948a..d04b0f71b9e 100644
--- a/src/Common/Exception.h
+++ b/src/Common/Exception.h
@@ -82,6 +82,7 @@ private:
std::string getExceptionStackTraceString(const std::exception & e);
+std::string getExceptionStackTraceString(std::exception_ptr e);
/// Contains an additional member `saved_errno`. See the throwFromErrno function.
@@ -167,6 +168,7 @@ std::string getCurrentExceptionMessage(bool with_stacktrace, bool check_embedded
/// Returns error code from ErrorCodes
int getCurrentExceptionCode();
+int getExceptionErrorCode(std::exception_ptr e);
/// An execution status of any piece of code, contains return code and optional error
diff --git a/src/Common/MemoryTracker.cpp b/src/Common/MemoryTracker.cpp
index a05fa3b5ad5..50ddcb5a9eb 100644
--- a/src/Common/MemoryTracker.cpp
+++ b/src/Common/MemoryTracker.cpp
@@ -183,9 +183,6 @@ void MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded)
std::bernoulli_distribution fault(fault_probability);
if (unlikely(fault_probability && fault(thread_local_rng)) && memoryTrackerCanThrow(level, true) && throw_if_memory_exceeded)
{
- ProfileEvents::increment(ProfileEvents::QueryMemoryLimitExceeded);
- amount.fetch_sub(size, std::memory_order_relaxed);
-
/// Prevent recursion. Exception::ctor -> std::string -> new[] -> MemoryTracker::alloc
BlockerInThread untrack_lock(VariableContext::Global);
@@ -363,7 +360,7 @@ void MemoryTracker::setOrRaiseHardLimit(Int64 value)
{
/// This is just atomic set to maximum.
Int64 old_value = hard_limit.load(std::memory_order_relaxed);
- while (old_value < value && !hard_limit.compare_exchange_weak(old_value, value))
+ while ((value == 0 || old_value < value) && !hard_limit.compare_exchange_weak(old_value, value))
;
}
@@ -371,6 +368,6 @@ void MemoryTracker::setOrRaiseHardLimit(Int64 value)
void MemoryTracker::setOrRaiseProfilerLimit(Int64 value)
{
Int64 old_value = profiler_limit.load(std::memory_order_relaxed);
- while (old_value < value && !profiler_limit.compare_exchange_weak(old_value, value))
+ while ((value == 0 || old_value < value) && !profiler_limit.compare_exchange_weak(old_value, value))
;
}
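The two setters keep their lock-free "atomic set to maximum" loop, but a value of 0 now overwrites unconditionally so a limit can be dropped. A sketch of just that loop (hypothetical setOrRaise() over a plain std::atomic):

```cpp
#include <atomic>
#include <cstdint>

void setOrRaise(std::atomic<std::int64_t> & limit, std::int64_t value)
{
    std::int64_t old_value = limit.load(std::memory_order_relaxed);
    // 0 means "reset the limit"; any other value only ever raises it.
    // compare_exchange_weak reloads old_value on failure, so the condition
    // is re-checked against the freshest value on every iteration.
    while ((value == 0 || old_value < value) && !limit.compare_exchange_weak(old_value, value))
        ;
}
```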
diff --git a/src/Common/SparseHashMap.h b/src/Common/SparseHashMap.h
new file mode 100644
index 00000000000..f01fc633d84
--- /dev/null
+++ b/src/Common/SparseHashMap.h
@@ -0,0 +1,25 @@
+#pragma once
+
+/// SparseHashMap is a wrapper for google::sparse_hash_map.
+/// See comment in DenseHashMap.h
+#if defined(ARCADIA_BUILD)
+#define HASH_FUN_H <unordered_map>
+template <typename T>
+struct THash;
+#endif
+
+#include <sparsehash/sparse_hash_map>
+
+#if !defined(ARCADIA_BUILD)
+    template <class Key, class T, class HashFcn = std::hash<Key>,
+        class EqualKey = std::equal_to<Key>,
+        class Alloc = google::libc_allocator_with_realloc<std::pair<const Key, T>>>
+    using SparseHashMap = google::sparse_hash_map<Key, T, HashFcn, EqualKey, Alloc>;
+#else
+    template <class Key, class T, class HashFcn = std::hash<Key>,
+        class EqualKey = std::equal_to<Key>,
+        class Alloc = google::sparsehash::libc_allocator_with_realloc<std::pair<const Key, T>>>
+    using SparseHashMap = google::sparsehash::sparse_hash_map<Key, T, HashFcn, EqualKey, Alloc>;
+
+    #undef THash
+#endif
diff --git a/src/Common/ThreadStatus.cpp b/src/Common/ThreadStatus.cpp
index 0e12830e49d..81c6b8eb1c3 100644
--- a/src/Common/ThreadStatus.cpp
+++ b/src/Common/ThreadStatus.cpp
@@ -149,7 +149,11 @@ ThreadStatus::~ThreadStatus()
if (deleter)
deleter();
- current_thread = nullptr;
+
+ /// Only change current_thread if it's currently being used by this ThreadStatus
+ /// For example, PushingToViewsBlockOutputStream creates and deletes ThreadStatus instances while running in the main query thread
+ if (current_thread == this)
+ current_thread = nullptr;
}
void ThreadStatus::updatePerformanceCounters()
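This guard matters because, later in this diff, PushingToViewsBlockOutputStream creates short-lived ThreadStatus instances on the main query thread. A standalone sketch of the pattern with a hypothetical Status type standing in for ThreadStatus:

```cpp
#include <cassert>

struct Status
{
    static thread_local Status * current;

    Status() { current = this; }
    ~Status()
    {
        if (current == this) // another instance may own the slot by now
            current = nullptr;
    }
};

thread_local Status * Status::current = nullptr;

int main()
{
    Status main_status; // current == &main_status
    {
        Status * saved = Status::current;
        Status temporary;          // retargets current, like a per-view ThreadStatus
        Status::current = saved;   // the caller restores the main status (cf. SCOPE_EXIT below)
    } // temporary's destructor sees current != this and leaves it alone
    assert(Status::current == &main_status);
}
```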
diff --git a/src/Common/ThreadStatus.h b/src/Common/ThreadStatus.h
index 6fc43114621..dbfb33a320c 100644
--- a/src/Common/ThreadStatus.h
+++ b/src/Common/ThreadStatus.h
@@ -37,6 +37,8 @@ struct RUsageCounters;
struct PerfEventsCounters;
class TaskStatsInfoGetter;
class InternalTextLogsQueue;
+struct ViewRuntimeData;
+class QueryViewsLog;
using InternalTextLogsQueuePtr = std::shared_ptr<InternalTextLogsQueue>;
using InternalTextLogsQueueWeakPtr = std::weak_ptr<InternalTextLogsQueue>;
@@ -143,6 +145,7 @@ protected:
Poco::Logger * log = nullptr;
friend class CurrentThread;
+ friend class PushingToViewsBlockOutputStream;
/// Use ptr not to add extra dependencies in the header
std::unique_ptr<RUsageCounters> last_rusage;
@@ -151,6 +154,9 @@ protected:
/// Is used to send logs from logs_queue to client in case of fatal errors.
std::function<void()> fatal_error_callback;
+    /// It is used to avoid enabling the query profiler when there are multiple ThreadStatus instances in the same thread
+ bool query_profiled_enabled = true;
+
public:
ThreadStatus();
~ThreadStatus();
@@ -210,9 +216,13 @@ public:
/// Update ProfileEvents and dumps info to system.query_thread_log
void finalizePerformanceCounters();
+ /// Set the counters last usage to now
+ void resetPerformanceCountersLastUsage();
+
/// Detaches thread from the thread group and the query, dumps performance counters if they have not been dumped
void detachQuery(bool exit_if_already_detached = false, bool thread_exits = false);
+
protected:
void applyQuerySettings();
@@ -224,6 +234,8 @@ protected:
void logToQueryThreadLog(QueryThreadLog & thread_log, const String & current_database, std::chrono::time_point<std::chrono::system_clock> now);
+ void logToQueryViewsLog(const ViewRuntimeData & vinfo);
+
void assertState(const std::initializer_list<int> & permitted_states, const char * description = nullptr) const;
diff --git a/src/Common/ya.make b/src/Common/ya.make
index 60dfd5f6bee..82962123e56 100644
--- a/src/Common/ya.make
+++ b/src/Common/ya.make
@@ -102,6 +102,7 @@ SRCS(
ZooKeeper/ZooKeeperNodeCache.cpp
checkStackSize.cpp
clearPasswordFromCommandLine.cpp
+ clickhouse_malloc.cpp
createHardLink.cpp
escapeForFileName.cpp
filesystemHelpers.cpp
@@ -116,6 +117,7 @@ SRCS(
hex.cpp
isLocalAddress.cpp
malloc.cpp
+ memory.cpp
new_delete.cpp
parseAddress.cpp
parseGlobs.cpp
diff --git a/src/Compression/CompressionCodecEncrypted.cpp b/src/Compression/CompressionCodecEncrypted.cpp
index d0904b4bf24..6b921fb9c0a 100644
--- a/src/Compression/CompressionCodecEncrypted.cpp
+++ b/src/Compression/CompressionCodecEncrypted.cpp
@@ -1,13 +1,15 @@
-#include
+#if !defined(ARCADIA_BUILD)
+# include
+#endif
#include
#if USE_SSL && USE_INTERNAL_SSL_LIBRARY
#include
#include
#include
-#include
+#include // Y_IGNORE
#include
-#include
+#include // Y_IGNORE
#include
namespace DB
diff --git a/src/Compression/CompressionCodecEncrypted.h b/src/Compression/CompressionCodecEncrypted.h
index e58fd4ab173..bacd58bcd2f 100644
--- a/src/Compression/CompressionCodecEncrypted.h
+++ b/src/Compression/CompressionCodecEncrypted.h
@@ -2,11 +2,11 @@
// This depends on BoringSSL-specific API, notably <openssl/aead.h>.
#include
-#if USE_SSL && USE_INTERNAL_SSL_LIBRARY
+#if USE_SSL && USE_INTERNAL_SSL_LIBRARY && !defined(ARCADIA_BUILD)
#include
#include
-#include <openssl/aead.h>
+#include <openssl/aead.h> // Y_IGNORE
#include
namespace DB
diff --git a/src/Coordination/KeeperStorageDispatcher.cpp b/src/Coordination/KeeperStorageDispatcher.cpp
index e95a6940baa..7c416b38d8b 100644
--- a/src/Coordination/KeeperStorageDispatcher.cpp
+++ b/src/Coordination/KeeperStorageDispatcher.cpp
@@ -1,6 +1,5 @@
#include
#include
-#include
#include
#include
#include
diff --git a/src/Core/ExternalResultDescription.h b/src/Core/ExternalResultDescription.h
index 78c054e805f..a9ffe8b2ed2 100644
--- a/src/Core/ExternalResultDescription.h
+++ b/src/Core/ExternalResultDescription.h
@@ -6,7 +6,7 @@
namespace DB
{
-/** Common part for implementation of MySQLBlockInputStream, MongoDBBlockInputStream and others.
+/** Common part for implementation of MySQLSource, MongoDBSource and others.
*/
struct ExternalResultDescription
{
diff --git a/src/Core/NamesAndTypes.cpp b/src/Core/NamesAndTypes.cpp
index 91191c73fd0..54f83fc13fc 100644
--- a/src/Core/NamesAndTypes.cpp
+++ b/src/Core/NamesAndTypes.cpp
@@ -6,7 +6,7 @@
#include
#include
#include
-#include
+#include <Common/DenseHashMap.h>
namespace DB
@@ -163,11 +163,7 @@ NamesAndTypesList NamesAndTypesList::filter(const Names & names) const
NamesAndTypesList NamesAndTypesList::addTypes(const Names & names) const
{
/// NOTE: It's better to make a map in `IStorage` than to create it here every time again.
-#if !defined(ARCADIA_BUILD)
-    google::dense_hash_map<StringRef, const DataTypePtr *, StringRefHash> types;
-#else
-    google::sparsehash::dense_hash_map<StringRef, const DataTypePtr *, StringRefHash> types;
-#endif
+    DenseHashMap<StringRef, const DataTypePtr *, StringRefHash> types;
types.set_empty_key(StringRef());
for (const auto & column : *this)
diff --git a/src/Core/Settings.h b/src/Core/Settings.h
index e1bd1d29153..19f9f2a94c8 100644
--- a/src/Core/Settings.h
+++ b/src/Core/Settings.h
@@ -173,7 +173,7 @@ class IColumn;
M(Bool, log_queries, 1, "Log requests and write the log to the system table.", 0) \
M(Bool, log_formatted_queries, 0, "Log formatted queries and write the log to the system table.", 0) \
M(LogQueriesType, log_queries_min_type, QueryLogElementType::QUERY_START, "Minimal type in query_log to log, possible values (from low to high): QUERY_START, QUERY_FINISH, EXCEPTION_BEFORE_START, EXCEPTION_WHILE_PROCESSING.", 0) \
- M(Milliseconds, log_queries_min_query_duration_ms, 0, "Minimal time for the query to run, to get to the query_log/query_thread_log.", 0) \
+    M(Milliseconds, log_queries_min_query_duration_ms, 0, "Minimal time for the query to run to get into the query_log/query_thread_log/query_views_log.", 0) \
M(UInt64, log_queries_cut_to_length, 100000, "If query length is greater than specified threshold (in bytes), then cut query when writing to query log. Also limit length of printed query in ordinary text log.", 0) \
\
M(DistributedProductMode, distributed_product_mode, DistributedProductMode::DENY, "How are distributed subqueries performed inside IN or JOIN sections?", IMPORTANT) \
@@ -352,9 +352,10 @@ class IColumn;
M(UInt64, max_network_bandwidth_for_user, 0, "The maximum speed of data exchange over the network in bytes per second for all concurrently running user queries. Zero means unlimited.", 0)\
M(UInt64, max_network_bandwidth_for_all_users, 0, "The maximum speed of data exchange over the network in bytes per second for all concurrently running queries. Zero means unlimited.", 0) \
\
- M(Bool, log_profile_events, true, "Log query performance statistics into the query_log and query_thread_log.", 0) \
+ M(Bool, log_profile_events, true, "Log query performance statistics into the query_log, query_thread_log and query_views_log.", 0) \
M(Bool, log_query_settings, true, "Log query settings into the query_log.", 0) \
M(Bool, log_query_threads, true, "Log query threads into system.query_thread_log table. This setting has effect only when 'log_queries' is true.", 0) \
+    M(Bool, log_query_views, true, "Log query dependent views into system.query_views_log table. This setting has effect only when 'log_queries' is true.", 0) \
M(String, log_comment, "", "Log comment into system.query_log table and server log. It can be set to arbitrary string no longer than max_query_size.", 0) \
M(LogsLevel, send_logs_level, LogsLevel::fatal, "Send server text logs with specified minimum level to client. Valid values: 'trace', 'debug', 'information', 'warning', 'error', 'fatal', 'none'", 0) \
M(Bool, enable_optimize_predicate_expression, 1, "If it is set to true, optimize predicates to subqueries.", 0) \
@@ -527,6 +528,9 @@ class IColumn;
M(Bool, input_format_tsv_empty_as_default, false, "Treat empty fields in TSV input as default values.", 0) \
M(Bool, input_format_tsv_enum_as_number, false, "Treat inserted enum values in TSV formats as enum indices \\N", 0) \
M(Bool, input_format_null_as_default, true, "For text input formats initialize null fields with default values if data type of this field is not nullable", 0) \
+    M(Bool, input_format_arrow_import_nested, false, "Allow inserting an array of structs into a Nested table in Arrow input format.", 0) \
+    M(Bool, input_format_orc_import_nested, false, "Allow inserting an array of structs into a Nested table in ORC input format.", 0) \
+    M(Bool, input_format_parquet_import_nested, false, "Allow inserting an array of structs into a Nested table in Parquet input format.", 0) \
\
M(DateTimeInputFormat, date_time_input_format, FormatSettings::DateTimeInputFormat::Basic, "Method to read DateTime from text input formats. Possible values: 'basic' and 'best_effort'.", 0) \
M(DateTimeOutputFormat, date_time_output_format, FormatSettings::DateTimeOutputFormat::Simple, "Method to write DateTime to text output. Possible values: 'simple', 'iso', 'unix_timestamp'.", 0) \
diff --git a/src/DataStreams/ExecutionSpeedLimits.h b/src/DataStreams/ExecutionSpeedLimits.h
index d52dc713c1a..9c86ba2faf4 100644
--- a/src/DataStreams/ExecutionSpeedLimits.h
+++ b/src/DataStreams/ExecutionSpeedLimits.h
@@ -3,7 +3,8 @@
#include
#include
#include
-#include <Common/Stopwatch.h>
+
+class Stopwatch;
namespace DB
{
diff --git a/src/DataStreams/MongoDBBlockInputStream.cpp b/src/DataStreams/MongoDBSource.cpp
similarity index 99%
rename from src/DataStreams/MongoDBBlockInputStream.cpp
rename to src/DataStreams/MongoDBSource.cpp
index a0a8e3e40a5..c00d214249a 100644
--- a/src/DataStreams/MongoDBBlockInputStream.cpp
+++ b/src/DataStreams/MongoDBSource.cpp
@@ -1,3 +1,5 @@
+#include "MongoDBSource.h"
+
#include
#include
@@ -15,7 +17,6 @@
#include
#include
#include
-#include
#include
#include
#include
diff --git a/src/DataStreams/MongoDBBlockInputStream.h b/src/DataStreams/MongoDBSource.h
similarity index 100%
rename from src/DataStreams/MongoDBBlockInputStream.h
rename to src/DataStreams/MongoDBSource.h
diff --git a/src/DataStreams/PostgreSQLBlockInputStream.cpp b/src/DataStreams/PostgreSQLSource.cpp
similarity index 98%
rename from src/DataStreams/PostgreSQLBlockInputStream.cpp
rename to src/DataStreams/PostgreSQLSource.cpp
index 7f8949740df..c3bde8c84ad 100644
--- a/src/DataStreams/PostgreSQLBlockInputStream.cpp
+++ b/src/DataStreams/PostgreSQLSource.cpp
@@ -1,4 +1,4 @@
-#include "PostgreSQLBlockInputStream.h"
+#include "PostgreSQLSource.h"
#if USE_LIBPQXX
#include
@@ -73,7 +73,7 @@ void PostgreSQLSource<T>::init(const Block & sample_block)
template <typename T>
void PostgreSQLSource<T>::onStart()
{
-    if (connection_holder)
+    if (!tx)
tx = std::make_shared<T>(connection_holder->get());
stream = std::make_unique<pqxx::stream_from>(*tx, pqxx::from_query, std::string_view(query_str));
diff --git a/src/DataStreams/PostgreSQLBlockInputStream.h b/src/DataStreams/PostgreSQLSource.h
similarity index 86%
rename from src/DataStreams/PostgreSQLBlockInputStream.h
rename to src/DataStreams/PostgreSQLSource.h
index 008da976619..2736afec7a9 100644
--- a/src/DataStreams/PostgreSQLBlockInputStream.h
+++ b/src/DataStreams/PostgreSQLSource.h
@@ -76,19 +76,6 @@ public:
const Block & sample_block_,
const UInt64 max_block_size_)
: PostgreSQLSource(tx_, query_str_, sample_block_, max_block_size_, false) {}
-
- Chunk generate() override
- {
- if (!is_initialized)
- {
-            Base::stream = std::make_unique<pqxx::stream_from>(*Base::tx, pqxx::from_query, std::string_view(Base::query_str));
- is_initialized = true;
- }
-
- return Base::generate();
- }
-
- bool is_initialized = false;
};
}
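The onStart() fix in the .cpp above replaces a check on connection_holder with `if (!tx)`: create a transaction only when the caller has not already supplied one. A sketch of that lazy-initialization shape with stand-in types (not the real pqxx classes):

```cpp
#include <memory>

struct Transaction { }; // stand-in for a pqxx transaction type

struct Source
{
    std::shared_ptr<Transaction> tx; // may already be injected by the caller

    void onStart()
    {
        if (!tx) // only create our own transaction if none was supplied
            tx = std::make_shared<Transaction>();
        // ... open the stream on *tx ...
    }
};
```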
diff --git a/src/DataStreams/PushingToViewsBlockOutputStream.cpp b/src/DataStreams/PushingToViewsBlockOutputStream.cpp
index 7729eb5fb44..dec5b710f75 100644
--- a/src/DataStreams/PushingToViewsBlockOutputStream.cpp
+++ b/src/DataStreams/PushingToViewsBlockOutputStream.cpp
@@ -1,24 +1,31 @@
#include
+#include
+#include
+#include
#include
#include
-#include
-#include
#include
#include
-#include
-#include
#include
+#include
+#include
#include
-#include
-#include
-#include
-#include
-#include
-#include
#include
+#include
#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
#include
-#include
+#include
+
+#include
+#include
namespace DB
{
@@ -79,9 +86,12 @@ PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(
ASTPtr query;
BlockOutputStreamPtr out;
+ QueryViewsLogElement::ViewType type = QueryViewsLogElement::ViewType::DEFAULT;
+ String target_name = database_table.getFullTableName();
if (auto * materialized_view = dynamic_cast<StorageMaterializedView *>(dependent_table.get()))
{
+ type = QueryViewsLogElement::ViewType::MATERIALIZED;
addTableLock(
materialized_view->lockForShare(getContext()->getInitialQueryId(), getContext()->getSettingsRef().lock_acquire_timeout));
@@ -89,6 +99,7 @@ PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(
auto inner_table_id = inner_table->getStorageID();
auto inner_metadata_snapshot = inner_table->getInMemoryMetadataPtr();
query = dependent_metadata_snapshot->getSelectQuery().inner_query;
+ target_name = inner_table_id.getFullTableName();
std::unique_ptr<ASTInsertQuery> insert = std::make_unique<ASTInsertQuery>();
insert->table_id = inner_table_id;
@@ -114,14 +125,57 @@ PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(
BlockIO io = interpreter.execute();
out = io.out;
}
- else if (dynamic_cast(dependent_table.get()))
+    else if (const auto * live_view = dynamic_cast<const StorageLiveView *>(dependent_table.get()))
+ {
+ type = QueryViewsLogElement::ViewType::LIVE;
+ query = live_view->getInnerQuery(); // Used only to log in system.query_views_log
out = std::make_shared<PushingToViewsBlockOutputStream>(
dependent_table, dependent_metadata_snapshot, insert_context, ASTPtr(), true);
+ }
else
out = std::make_shared<PushingToViewsBlockOutputStream>(
dependent_table, dependent_metadata_snapshot, insert_context, ASTPtr());
- views.emplace_back(ViewInfo{std::move(query), database_table, std::move(out), nullptr, 0 /* elapsed_ms */});
+    /// If the materialized view is executed outside of a query, for example as a result of SYSTEM FLUSH LOGS or
+    /// SYSTEM FLUSH DISTRIBUTED ..., we can't attach to any thread group and we won't log, so there is no point in collecting metrics
+    std::unique_ptr<ThreadStatus> thread_status = nullptr;
+
+ ThreadGroupStatusPtr running_group = current_thread && current_thread->getThreadGroup()
+ ? current_thread->getThreadGroup()
+ : MainThreadStatus::getInstance().thread_group;
+ if (running_group)
+ {
+ /// We are creating a ThreadStatus per view to store its metrics individually
+        /// Since calling ThreadStatus() changes current_thread, we save it and restore it after the calls
+ /// Later on, before doing any task related to a view, we'll switch to its ThreadStatus, do the work,
+ /// and switch back to the original thread_status.
+ auto * original_thread = current_thread;
+ SCOPE_EXIT({ current_thread = original_thread; });
+
+        thread_status = std::make_unique<ThreadStatus>();
+ /// Disable query profiler for this ThreadStatus since the running (main query) thread should already have one
+ /// If we didn't disable it, then we could end up with N + 1 (N = number of dependencies) profilers which means
+ /// N times more interruptions
+ thread_status->query_profiled_enabled = false;
+ thread_status->setupState(running_group);
+ }
+
+ QueryViewsLogElement::ViewRuntimeStats runtime_stats{
+ target_name,
+ type,
+ std::move(thread_status),
+ 0,
+ std::chrono::system_clock::now(),
+ QueryViewsLogElement::ViewStatus::EXCEPTION_BEFORE_START};
+ views.emplace_back(ViewRuntimeData{std::move(query), database_table, std::move(out), nullptr, std::move(runtime_stats)});
+
+
+ /// Add the view to the query access info so it can appear in system.query_log
+ if (!no_destination)
+ {
+ getContext()->getQueryContext()->addQueryAccessInfo(
+ backQuoteIfNeed(database_table.getDatabaseName()), target_name, {}, "", database_table.getFullTableName());
+ }
}
/// Do not push to destination table if the flag is set
@@ -136,7 +190,6 @@ PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(
}
}
-
Block PushingToViewsBlockOutputStream::getHeader() const
{
/// If we don't write directly to the destination
@@ -147,6 +200,39 @@ Block PushingToViewsBlockOutputStream::getHeader() const
return metadata_snapshot->getSampleBlockWithVirtuals(storage->getVirtuals());
}
+/// Auxiliary function to do the setup and teardown to run a view individually and collect its metrics inside the view ThreadStatus
+void inline runViewStage(ViewRuntimeData & view, const std::string & action, std::function<void()> stage)
+{
+ Stopwatch watch;
+
+ auto * original_thread = current_thread;
+ SCOPE_EXIT({ current_thread = original_thread; });
+
+ if (view.runtime_stats.thread_status)
+ {
+        /// Change thread context to store individual metrics per view. Once the work is done, go back to the original thread
+ view.runtime_stats.thread_status->resetPerformanceCountersLastUsage();
+ current_thread = view.runtime_stats.thread_status.get();
+ }
+
+ try
+ {
+ stage();
+ }
+ catch (Exception & ex)
+ {
+ ex.addMessage(action + " " + view.table_id.getNameForLogs());
+ view.setException(std::current_exception());
+ }
+ catch (...)
+ {
+ view.setException(std::current_exception());
+ }
+
+ if (view.runtime_stats.thread_status)
+ view.runtime_stats.thread_status->updatePerformanceCounters();
+ view.runtime_stats.elapsed_ms += watch.elapsedMilliseconds();
+}
void PushingToViewsBlockOutputStream::write(const Block & block)
{
@@ -169,39 +255,34 @@ void PushingToViewsBlockOutputStream::write(const Block & block)
output->write(block);
}
- /// Don't process materialized views if this block is duplicate
- if (!getContext()->getSettingsRef().deduplicate_blocks_in_dependent_materialized_views && replicated_output && replicated_output->lastBlockIsDuplicate())
+ if (views.empty())
return;
- // Insert data into materialized views only after successful insert into main table
+ /// Don't process materialized views if this block is duplicate
const Settings & settings = getContext()->getSettingsRef();
- if (settings.parallel_view_processing && views.size() > 1)
+ if (!settings.deduplicate_blocks_in_dependent_materialized_views && replicated_output && replicated_output->lastBlockIsDuplicate())
+ return;
+
+ size_t max_threads = 1;
+ if (settings.parallel_view_processing)
+        max_threads = settings.max_threads ? std::min(static_cast<size_t>(settings.max_threads), views.size()) : views.size();
+ if (max_threads > 1)
{
- // Push to views concurrently if enabled and more than one view is attached
- ThreadPool pool(std::min(size_t(settings.max_threads), views.size()));
+ ThreadPool pool(max_threads);
for (auto & view : views)
{
- auto thread_group = CurrentThread::getGroup();
- pool.scheduleOrThrowOnError([=, &view, this]
- {
+ pool.scheduleOrThrowOnError([&] {
setThreadName("PushingToViews");
- if (thread_group)
- CurrentThread::attachToIfDetached(thread_group);
- process(block, view);
+ runViewStage(view, "while pushing to view", [&]() { process(block, view); });
});
}
- // Wait for concurrent view processing
pool.wait();
}
else
{
- // Process sequentially
for (auto & view : views)
{
- process(block, view);
-
- if (view.exception)
- std::rethrow_exception(view.exception);
+ runViewStage(view, "while pushing to view", [&]() { process(block, view); });
}
}
}
@@ -213,14 +294,11 @@ void PushingToViewsBlockOutputStream::writePrefix()
for (auto & view : views)
{
- try
+ runViewStage(view, "while writing prefix to view", [&] { view.out->writePrefix(); });
+ if (view.exception)
{
- view.out->writePrefix();
- }
- catch (Exception & ex)
- {
- ex.addMessage("while write prefix to view " + view.table_id.getNameForLogs());
- throw;
+ logQueryViews();
+ std::rethrow_exception(view.exception);
}
}
}
@@ -230,95 +308,82 @@ void PushingToViewsBlockOutputStream::writeSuffix()
if (output)
output->writeSuffix();
- std::exception_ptr first_exception;
+ if (views.empty())
+ return;
- const Settings & settings = getContext()->getSettingsRef();
- bool parallel_processing = false;
+ auto process_suffix = [](ViewRuntimeData & view)
+ {
+ view.out->writeSuffix();
+ view.runtime_stats.setStatus(QueryViewsLogElement::ViewStatus::QUERY_FINISH);
+ };
+ static std::string stage_step = "while writing suffix to view";
/// Run writeSuffix() for views in separate thread pool.
/// It could have been done in PushingToViewsBlockOutputStream::process, however
/// it is not good if the insert into the main table fails but the insert into a view succeeds.
- if (settings.parallel_view_processing && views.size() > 1)
+ const Settings & settings = getContext()->getSettingsRef();
+ size_t max_threads = 1;
+ if (settings.parallel_view_processing)
+        max_threads = settings.max_threads ? std::min(static_cast<size_t>(settings.max_threads), views.size()) : views.size();
+ bool exception_happened = false;
+ if (max_threads > 1)
{
- parallel_processing = true;
-
- // Push to views concurrently if enabled and more than one view is attached
- ThreadPool pool(std::min(size_t(settings.max_threads), views.size()));
- auto thread_group = CurrentThread::getGroup();
-
+ ThreadPool pool(max_threads);
+ std::atomic_uint8_t exception_count = 0;
for (auto & view : views)
{
if (view.exception)
- continue;
-
- pool.scheduleOrThrowOnError([thread_group, &view, this]
{
+ exception_happened = true;
+ continue;
+ }
+ pool.scheduleOrThrowOnError([&] {
setThreadName("PushingToViews");
- if (thread_group)
- CurrentThread::attachToIfDetached(thread_group);
- Stopwatch watch;
- try
- {
- view.out->writeSuffix();
- }
- catch (...)
- {
- view.exception = std::current_exception();
- }
- view.elapsed_ms += watch.elapsedMilliseconds();
-
- LOG_TRACE(log, "Pushing from {} to {} took {} ms.",
- storage->getStorageID().getNameForLogs(),
- view.table_id.getNameForLogs(),
- view.elapsed_ms);
+ runViewStage(view, stage_step, [&] { process_suffix(view); });
+ if (view.exception)
+ exception_count.fetch_add(1, std::memory_order_relaxed);
});
}
- // Wait for concurrent view processing
pool.wait();
+ exception_happened |= exception_count.load(std::memory_order_relaxed) != 0;
+ }
+ else
+ {
+ for (auto & view : views)
+ {
+ if (view.exception)
+ {
+ exception_happened = true;
+ continue;
+ }
+ runViewStage(view, stage_step, [&] { process_suffix(view); });
+ if (view.exception)
+ exception_happened = true;
+ }
}
for (auto & view : views)
{
- if (view.exception)
- {
- if (!first_exception)
- first_exception = view.exception;
-
- continue;
- }
-
- if (parallel_processing)
- continue;
-
- Stopwatch watch;
- try
- {
- view.out->writeSuffix();
- }
- catch (Exception & ex)
- {
- ex.addMessage("while write prefix to view " + view.table_id.getNameForLogs());
- throw;
- }
- view.elapsed_ms += watch.elapsedMilliseconds();
-
- LOG_TRACE(log, "Pushing from {} to {} took {} ms.",
- storage->getStorageID().getNameForLogs(),
- view.table_id.getNameForLogs(),
- view.elapsed_ms);
+ if (!view.exception)
+ LOG_TRACE(
+ log,
+ "Pushing ({}) from {} to {} took {} ms.",
+ max_threads <= 1 ? "sequentially" : ("parallel " + std::to_string(max_threads)),
+ storage->getStorageID().getNameForLogs(),
+ view.table_id.getNameForLogs(),
+ view.runtime_stats.elapsed_ms);
}
- if (first_exception)
- std::rethrow_exception(first_exception);
+ if (exception_happened)
+ checkExceptionsInViews();
- UInt64 milliseconds = main_watch.elapsedMilliseconds();
if (views.size() > 1)
{
- LOG_DEBUG(log, "Pushing from {} to {} views took {} ms.",
- storage->getStorageID().getNameForLogs(), views.size(),
- milliseconds);
+ UInt64 milliseconds = main_watch.elapsedMilliseconds();
+ LOG_DEBUG(log, "Pushing from {} to {} views took {} ms.", storage->getStorageID().getNameForLogs(), views.size(), milliseconds);
}
+ logQueryViews();
}
void PushingToViewsBlockOutputStream::flush()
@@ -330,70 +395,103 @@ void PushingToViewsBlockOutputStream::flush()
view.out->flush();
}
-void PushingToViewsBlockOutputStream::process(const Block & block, ViewInfo & view)
+void PushingToViewsBlockOutputStream::process(const Block & block, ViewRuntimeData & view)
{
- Stopwatch watch;
+ BlockInputStreamPtr in;
- try
+    /// We need to keep the InterpreterSelectQuery alive until the processing is finished, since:
+    ///
+    /// - We copy the Context inside InterpreterSelectQuery to support
+    ///   modification of the context (Settings) for subqueries
+    /// - InterpreterSelectQuery lives shorter than the query pipeline.
+    ///   It's used just to build the query pipeline and is no longer needed after that
+    /// - ExpressionAnalyzer and the Functions created in InterpreterSelectQuery
+    ///   **can** take a reference to the Context from InterpreterSelectQuery
+    ///   (the problem arises only when a function uses the context from the
+    ///   execute*() method, as FunctionDictGet does)
+    /// - These objects live inside the query pipeline (DataStreams), so the reference would become dangling.
+    std::optional<InterpreterSelectQuery> select;
+
+ if (view.runtime_stats.type == QueryViewsLogElement::ViewType::MATERIALIZED)
{
- BlockInputStreamPtr in;
+ /// We create a table with the same name as original table and the same alias columns,
+ /// but it will contain single block (that is INSERT-ed into main table).
+ /// InterpreterSelectQuery will do processing of alias columns.
- /// We need keep InterpreterSelectQuery, until the processing will be finished, since:
- ///
- /// - We copy Context inside InterpreterSelectQuery to support
- /// modification of context (Settings) for subqueries
- /// - InterpreterSelectQuery lives shorter than query pipeline.
- /// It's used just to build the query pipeline and no longer needed
- /// - ExpressionAnalyzer and then, Functions, that created in InterpreterSelectQuery,
- /// **can** take a reference to Context from InterpreterSelectQuery
- /// (the problem raises only when function uses context from the
- /// execute*() method, like FunctionDictGet do)
- /// - These objects live inside query pipeline (DataStreams) and the reference become dangling.
-        std::optional<InterpreterSelectQuery> select;
+ auto local_context = Context::createCopy(select_context);
+ local_context->addViewSource(
+ StorageValues::create(storage->getStorageID(), metadata_snapshot->getColumns(), block, storage->getVirtuals()));
+ select.emplace(view.query, local_context, SelectQueryOptions());
+    in = std::make_shared<MaterializingBlockInputStream>(select->execute().getInputStream());
- if (view.query)
- {
- /// We create a table with the same name as original table and the same alias columns,
- /// but it will contain single block (that is INSERT-ed into main table).
- /// InterpreterSelectQuery will do processing of alias columns.
-
- auto local_context = Context::createCopy(select_context);
- local_context->addViewSource(
- StorageValues::create(storage->getStorageID(), metadata_snapshot->getColumns(), block, storage->getVirtuals()));
- select.emplace(view.query, local_context, SelectQueryOptions());
-        in = std::make_shared<MaterializingBlockInputStream>(select->execute().getInputStream());
-
- /// Squashing is needed here because the materialized view query can generate a lot of blocks
- /// even when only one block is inserted into the parent table (e.g. if the query is a GROUP BY
- /// and two-level aggregation is triggered).
-        in = std::make_shared<SquashingBlockInputStream>(
-            in, getContext()->getSettingsRef().min_insert_block_size_rows, getContext()->getSettingsRef().min_insert_block_size_bytes);
-        in = std::make_shared<ConvertingBlockInputStream>(in, view.out->getHeader(), ConvertingBlockInputStream::MatchColumnsMode::Name);
- }
- else
-        in = std::make_shared<OneBlockInputStream>(block);
-
- in->readPrefix();
-
- while (Block result_block = in->read())
- {
- Nested::validateArraySizes(result_block);
- view.out->write(result_block);
- }
-
- in->readSuffix();
+ /// Squashing is needed here because the materialized view query can generate a lot of blocks
+ /// even when only one block is inserted into the parent table (e.g. if the query is a GROUP BY
+ /// and two-level aggregation is triggered).
+    in = std::make_shared<SquashingBlockInputStream>(
+        in, getContext()->getSettingsRef().min_insert_block_size_rows, getContext()->getSettingsRef().min_insert_block_size_bytes);
+    in = std::make_shared<ConvertingBlockInputStream>(in, view.out->getHeader(), ConvertingBlockInputStream::MatchColumnsMode::Name);
}
- catch (Exception & ex)
+ else
+        in = std::make_shared<OneBlockInputStream>(block);
+
+ in->setProgressCallback([this](const Progress & progress)
{
- ex.addMessage("while pushing to view " + view.table_id.getNameForLogs());
- view.exception = std::current_exception();
- }
- catch (...)
+ CurrentThread::updateProgressIn(progress);
+ this->onProgress(progress);
+ });
+
+ in->readPrefix();
+
+ while (Block result_block = in->read())
{
- view.exception = std::current_exception();
+ Nested::validateArraySizes(result_block);
+ view.out->write(result_block);
}
- view.elapsed_ms += watch.elapsedMilliseconds();
+ in->readSuffix();
}
+void PushingToViewsBlockOutputStream::checkExceptionsInViews()
+{
+ for (auto & view : views)
+ {
+ if (view.exception)
+ {
+ logQueryViews();
+ std::rethrow_exception(view.exception);
+ }
+ }
+}
+
+void PushingToViewsBlockOutputStream::logQueryViews()
+{
+ const auto & settings = getContext()->getSettingsRef();
+ const UInt64 min_query_duration = settings.log_queries_min_query_duration_ms.totalMilliseconds();
+ const QueryViewsLogElement::ViewStatus min_status = settings.log_queries_min_type;
+ if (views.empty() || !settings.log_queries || !settings.log_query_views)
+ return;
+
+ for (auto & view : views)
+ {
+ if ((min_query_duration && view.runtime_stats.elapsed_ms <= min_query_duration) || (view.runtime_stats.event_status < min_status))
+ continue;
+
+ try
+ {
+ if (view.runtime_stats.thread_status)
+ view.runtime_stats.thread_status->logToQueryViewsLog(view);
+ }
+ catch (...)
+ {
+ tryLogCurrentException(__PRETTY_FUNCTION__);
+ }
+ }
+}
+
+
+void PushingToViewsBlockOutputStream::onProgress(const Progress & progress)
+{
+ if (getContext()->getProgressCallback())
+ getContext()->getProgressCallback()(progress);
+}
}
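Most of this rewrite funnels through runViewStage(): each per-view stage is timed, any exception is captured into the view's state instead of escaping immediately, and the caller (checkExceptionsInViews) decides when to rethrow. A simplified standalone sketch of that shape, with a hypothetical ViewState standing in for ViewRuntimeData:

```cpp
#include <chrono>
#include <cstdint>
#include <exception>
#include <functional>
#include <string>

struct ViewState
{
    std::exception_ptr exception;
    std::uint64_t elapsed_ms = 0;
};

void runStage(ViewState & view, const std::string & action, std::function<void()> stage)
{
    const auto start = std::chrono::steady_clock::now();
    try
    {
        stage();
    }
    catch (...)
    {
        (void)action; // the real code appends the action to the exception message
        view.exception = std::current_exception(); // recorded, rethrown later by the caller
    }
    view.elapsed_ms += std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start).count();
}
```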
diff --git a/src/DataStreams/PushingToViewsBlockOutputStream.h b/src/DataStreams/PushingToViewsBlockOutputStream.h
index db6b671ce2c..ba125e28829 100644
--- a/src/DataStreams/PushingToViewsBlockOutputStream.h
+++ b/src/DataStreams/PushingToViewsBlockOutputStream.h
@@ -1,6 +1,7 @@
#pragma once
#include
+#include <Interpreters/QueryViewsLog.h>
#include
#include
#include
@@ -8,13 +9,28 @@
namespace Poco
{
class Logger;
-};
+}
namespace DB
{
class ReplicatedMergeTreeSink;
+struct ViewRuntimeData
+{
+ const ASTPtr query;
+ StorageID table_id;
+ BlockOutputStreamPtr out;
+ std::exception_ptr exception;
+ QueryViewsLogElement::ViewRuntimeStats runtime_stats;
+
+ void setException(std::exception_ptr e)
+ {
+ exception = e;
+ runtime_stats.setStatus(QueryViewsLogElement::ViewStatus::EXCEPTION_WHILE_PROCESSING);
+ }
+};
+
/** Writes data to the specified table and to all dependent materialized views.
*/
class PushingToViewsBlockOutputStream : public IBlockOutputStream, WithContext
@@ -33,6 +49,7 @@ public:
void flush() override;
void writePrefix() override;
void writeSuffix() override;
+ void onProgress(const Progress & progress) override;
private:
StoragePtr storage;
@@ -44,20 +61,13 @@ private:
ASTPtr query_ptr;
Stopwatch main_watch;
- struct ViewInfo
- {
- ASTPtr query;
- StorageID table_id;
- BlockOutputStreamPtr out;
- std::exception_ptr exception;
- UInt64 elapsed_ms = 0;
- };
-
-    std::vector<ViewInfo> views;
+    std::vector<ViewRuntimeData> views;
ContextMutablePtr select_context;
ContextMutablePtr insert_context;
- void process(const Block & block, ViewInfo & view);
+ void process(const Block & block, ViewRuntimeData & view);
+ void checkExceptionsInViews();
+ void logQueryViews();
};
diff --git a/src/DataStreams/SQLiteBlockInputStream.cpp b/src/DataStreams/SQLiteSource.cpp
similarity index 74%
rename from src/DataStreams/SQLiteBlockInputStream.cpp
rename to src/DataStreams/SQLiteSource.cpp
index da7645d968d..d0d8724c2dd 100644
--- a/src/DataStreams/SQLiteBlockInputStream.cpp
+++ b/src/DataStreams/SQLiteSource.cpp
@@ -1,4 +1,4 @@
-#include "SQLiteBlockInputStream.h"
+#include "SQLiteSource.h"
#if USE_SQLITE
#include
@@ -22,21 +22,18 @@ namespace ErrorCodes
extern const int SQLITE_ENGINE_ERROR;
}
-SQLiteBlockInputStream::SQLiteBlockInputStream(
- SQLitePtr sqlite_db_,
- const String & query_str_,
- const Block & sample_block,
- const UInt64 max_block_size_)
- : query_str(query_str_)
+SQLiteSource::SQLiteSource(
+ SQLitePtr sqlite_db_,
+ const String & query_str_,
+ const Block & sample_block,
+ const UInt64 max_block_size_)
+ : SourceWithProgress(sample_block.cloneEmpty())
+ , query_str(query_str_)
, max_block_size(max_block_size_)
, sqlite_db(std::move(sqlite_db_))
{
description.init(sample_block);
-}
-
-void SQLiteBlockInputStream::readPrefix()
-{
sqlite3_stmt * compiled_stmt = nullptr;
int status = sqlite3_prepare_v2(sqlite_db.get(), query_str.c_str(), query_str.size() + 1, &compiled_stmt, nullptr);
@@ -48,11 +45,10 @@ void SQLiteBlockInputStream::readPrefix()
compiled_statement = std::unique_ptr<sqlite3_stmt, StatementDeleter>(compiled_stmt, StatementDeleter());
}
-
-Block SQLiteBlockInputStream::readImpl()
+Chunk SQLiteSource::generate()
{
if (!compiled_statement)
- return Block();
+ return {};
MutableColumns columns = description.sample_block.cloneEmptyColumns();
size_t num_rows = 0;
@@ -73,30 +69,30 @@ Block SQLiteBlockInputStream::readImpl()
else if (status != SQLITE_ROW)
{
throw Exception(ErrorCodes::SQLITE_ENGINE_ERROR,
- "Expected SQLITE_ROW status, but got status {}. Error: {}, Message: {}",
- status, sqlite3_errstr(status), sqlite3_errmsg(sqlite_db.get()));
+ "Expected SQLITE_ROW status, but got status {}. Error: {}, Message: {}",
+ status, sqlite3_errstr(status), sqlite3_errmsg(sqlite_db.get()));
}
int column_count = sqlite3_column_count(compiled_statement.get());
- for (const auto idx : collections::range(0, column_count))
- {
- const auto & sample = description.sample_block.getByPosition(idx);
- if (sqlite3_column_type(compiled_statement.get(), idx) == SQLITE_NULL)
+ for (int column_index = 0; column_index < column_count; ++column_index)
+ {
+ if (sqlite3_column_type(compiled_statement.get(), column_index) == SQLITE_NULL)
{
- insertDefaultSQLiteValue(*columns[idx], *sample.column);
+ columns[column_index]->insertDefault();
continue;
}
- if (description.types[idx].second)
+ auto & [type, is_nullable] = description.types[column_index];
+ if (is_nullable)
{
-            ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*columns[idx]);
- insertValue(column_nullable.getNestedColumn(), description.types[idx].first, idx);
+            ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*columns[column_index]);
+ insertValue(column_nullable.getNestedColumn(), type, column_index);
column_nullable.getNullMapData().emplace_back(0);
}
else
{
- insertValue(*columns[idx], description.types[idx].first, idx);
+ insertValue(*columns[column_index], type, column_index);
}
}
@@ -104,18 +100,16 @@ Block SQLiteBlockInputStream::readImpl()
break;
}
- return description.sample_block.cloneWithColumns(std::move(columns));
-}
-
-
-void SQLiteBlockInputStream::readSuffix()
-{
- if (compiled_statement)
+ if (num_rows == 0)
+ {
compiled_statement.reset();
+ return {};
+ }
+
+ return Chunk(std::move(columns), num_rows);
}
-
-void SQLiteBlockInputStream::insertValue(IColumn & column, const ExternalResultDescription::ValueType type, size_t idx)
+void SQLiteSource::insertValue(IColumn & column, ExternalResultDescription::ValueType type, size_t idx)
{
switch (type)
{
diff --git a/src/DataStreams/SQLiteBlockInputStream.h b/src/DataStreams/SQLiteSource.h
similarity index 59%
rename from src/DataStreams/SQLiteBlockInputStream.h
rename to src/DataStreams/SQLiteSource.h
index 35fc4801b4b..0f8b42c536b 100644
--- a/src/DataStreams/SQLiteBlockInputStream.h
+++ b/src/DataStreams/SQLiteSource.h
@@ -6,32 +6,28 @@
#if USE_SQLITE
#include
-#include <DataStreams/IBlockInputStream.h>
+#include <Processors/Sources/SourceWithProgress.h>
#include <sqlite3.h> // Y_IGNORE
namespace DB
{
-class SQLiteBlockInputStream : public IBlockInputStream
+
+class SQLiteSource : public SourceWithProgress
{
+
using SQLitePtr = std::shared_ptr<sqlite3>;
public:
- SQLiteBlockInputStream(SQLitePtr sqlite_db_,
+ SQLiteSource(SQLitePtr sqlite_db_,
const String & query_str_,
const Block & sample_block,
UInt64 max_block_size_);
String getName() const override { return "SQLite"; }
- Block getHeader() const override { return description.sample_block.cloneEmpty(); }
-
private:
- void insertDefaultSQLiteValue(IColumn & column, const IColumn & sample_column)
- {
- column.insertFrom(sample_column, 0);
- }
using ValueType = ExternalResultDescription::ValueType;
@@ -40,19 +36,14 @@ private:
void operator()(sqlite3_stmt * stmt) { sqlite3_finalize(stmt); }
};
- void readPrefix() override;
+ Chunk generate() override;
- Block readImpl() override;
-
- void readSuffix() override;
-
- void insertValue(IColumn & column, const ExternalResultDescription::ValueType type, size_t idx);
+ void insertValue(IColumn & column, ExternalResultDescription::ValueType type, size_t idx);
String query_str;
UInt64 max_block_size;
ExternalResultDescription description;
-
SQLitePtr sqlite_db;
std::unique_ptr<sqlite3_stmt, StatementDeleter> compiled_statement;
};
diff --git a/src/DataStreams/ya.make b/src/DataStreams/ya.make
index b1205828a7e..c16db808a5b 100644
--- a/src/DataStreams/ya.make
+++ b/src/DataStreams/ya.make
@@ -29,7 +29,7 @@ SRCS(
ITTLAlgorithm.cpp
InternalTextLogsRowOutputStream.cpp
MaterializingBlockInputStream.cpp
- MongoDBBlockInputStream.cpp
+ MongoDBSource.cpp
NativeBlockInputStream.cpp
NativeBlockOutputStream.cpp
PushingToViewsBlockOutputStream.cpp
@@ -37,7 +37,7 @@ SRCS(
RemoteBlockOutputStream.cpp
RemoteQueryExecutor.cpp
RemoteQueryExecutorReadContext.cpp
- SQLiteBlockInputStream.cpp
+ SQLiteSource.cpp
SizeLimits.cpp
SquashingBlockInputStream.cpp
SquashingBlockOutputStream.cpp
diff --git a/src/DataTypes/NestedUtils.cpp b/src/DataTypes/NestedUtils.cpp
index ed9ea3e1b5c..94b3b2f3cf7 100644
--- a/src/DataTypes/NestedUtils.cpp
+++ b/src/DataTypes/NestedUtils.cpp
@@ -208,6 +208,18 @@ void validateArraySizes(const Block & block)
}
}
+std::unordered_set<String> getAllTableNames(const Block & block)
+{
+    std::unordered_set<String> nested_table_names;
+ for (auto & name : block.getNames())
+ {
+ auto nested_table_name = Nested::extractTableName(name);
+ if (!nested_table_name.empty())
+ nested_table_names.insert(nested_table_name);
+ }
+ return nested_table_names;
+}
+
}
}
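getAllTableNames() collects the distinct Nested-table prefixes of a block's column names (Nested::extractTableName yields the part before the first dot for compound names). An illustrative re-implementation of just that idea using plain standard types, not the real helper:

```cpp
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>

std::unordered_set<std::string> allTableNames(const std::vector<std::string> & names)
{
    std::unordered_set<std::string> result;
    for (const auto & name : names)
    {
        const auto pos = name.find('.');
        if (pos != std::string::npos)      // only compound names like "n.id" qualify
            result.insert(name.substr(0, pos));
    }
    return result;
}

int main()
{
    for (const auto & table : allTableNames({"n.id", "n.value", "plain"}))
        std::cout << table << '\n'; // prints "n"
}
```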
diff --git a/src/DataTypes/NestedUtils.h b/src/DataTypes/NestedUtils.h
index b8428b96d3e..d16e309fc81 100644
--- a/src/DataTypes/NestedUtils.h
+++ b/src/DataTypes/NestedUtils.h
@@ -28,6 +28,9 @@ namespace Nested
/// Check that sizes of arrays - elements of nested data structures - are equal.
void validateArraySizes(const Block & block);
+
+ /// Get all nested tables names from a block.
+    std::unordered_set<String> getAllTableNames(const Block & block);
}
}
diff --git a/src/Databases/DatabaseReplicated.cpp b/src/Databases/DatabaseReplicated.cpp
index 26dd8763c40..8e8fb4e2d6d 100644
--- a/src/Databases/DatabaseReplicated.cpp
+++ b/src/Databases/DatabaseReplicated.cpp
@@ -8,7 +8,6 @@
#include
#include
#include
-#include
#include
#include
#include
diff --git a/src/Databases/MySQL/DatabaseMySQL.cpp b/src/Databases/MySQL/DatabaseMySQL.cpp
index d4acd2af85e..858255e730a 100644
--- a/src/Databases/MySQL/DatabaseMySQL.cpp
+++ b/src/Databases/MySQL/DatabaseMySQL.cpp
@@ -11,7 +11,7 @@
# include
# include
# include
-# include <Formats/MySQLBlockInputStream.h>
+# include <Formats/MySQLSource.h>
# include
# include
# include
diff --git a/src/Databases/MySQL/FetchTablesColumnsList.cpp b/src/Databases/MySQL/FetchTablesColumnsList.cpp
index 353bcd877ee..c67dcefb433 100644
--- a/src/Databases/MySQL/FetchTablesColumnsList.cpp
+++ b/src/Databases/MySQL/FetchTablesColumnsList.cpp
@@ -10,7 +10,7 @@
#include
#include
#include
-#include <Formats/MySQLBlockInputStream.h>
+#include <Formats/MySQLSource.h>
#include
#include
#include
diff --git a/src/Databases/MySQL/MaterializeMetadata.cpp b/src/Databases/MySQL/MaterializeMetadata.cpp
index 9f5100991aa..f684797c675 100644
--- a/src/Databases/MySQL/MaterializeMetadata.cpp
+++ b/src/Databases/MySQL/MaterializeMetadata.cpp
@@ -5,7 +5,7 @@
#include
#include
#include
-#include <Formats/MySQLBlockInputStream.h>
+#include <Formats/MySQLSource.h>
#include
#include
#include
diff --git a/src/Databases/MySQL/MaterializedMySQLSyncThread.cpp b/src/Databases/MySQL/MaterializedMySQLSyncThread.cpp
index 5175e9d0467..53495aa3cb1 100644
--- a/src/Databases/MySQL/MaterializedMySQLSyncThread.cpp
+++ b/src/Databases/MySQL/MaterializedMySQLSyncThread.cpp
@@ -16,7 +16,7 @@
# include
# include
# include
-# include <Formats/MySQLBlockInputStream.h>
+# include <Formats/MySQLSource.h>
# include
# include
# include
diff --git a/src/Databases/PostgreSQL/DatabasePostgreSQL.cpp b/src/Databases/PostgreSQL/DatabasePostgreSQL.cpp
index c848c784712..259648f4399 100644
--- a/src/Databases/PostgreSQL/DatabasePostgreSQL.cpp
+++ b/src/Databases/PostgreSQL/DatabasePostgreSQL.cpp
@@ -164,7 +164,7 @@ StoragePtr DatabasePostgreSQL::tryGetTable(const String & table_name, ContextPtr
}
-StoragePtr DatabasePostgreSQL::fetchTable(const String & table_name, ContextPtr local_context, const bool table_checked) const
+StoragePtr DatabasePostgreSQL::fetchTable(const String & table_name, ContextPtr, const bool table_checked) const
{
if (!cache_tables || !cached_tables.count(table_name))
{
@@ -179,7 +179,7 @@ StoragePtr DatabasePostgreSQL::fetchTable(const String & table_name, ContextPtr
auto storage = StoragePostgreSQL::create(
StorageID(database_name, table_name), pool, table_name,
- ColumnsDescription{*columns}, ConstraintsDescription{}, String{}, local_context, postgres_schema);
+ ColumnsDescription{*columns}, ConstraintsDescription{}, String{}, postgres_schema);
if (cache_tables)
cached_tables[table_name] = storage;
diff --git a/src/Dictionaries/CacheDictionary.cpp b/src/Dictionaries/CacheDictionary.cpp
index 4dfe802dd2b..a5f953ccc15 100644
--- a/src/Dictionaries/CacheDictionary.cpp
+++ b/src/Dictionaries/CacheDictionary.cpp
@@ -10,7 +10,7 @@
#include
#include
-#include
+#include
#include
#include
@@ -18,21 +18,21 @@
namespace ProfileEvents
{
-extern const Event DictCacheKeysRequested;
-extern const Event DictCacheKeysRequestedMiss;
-extern const Event DictCacheKeysRequestedFound;
-extern const Event DictCacheKeysExpired;
-extern const Event DictCacheKeysNotFound;
-extern const Event DictCacheKeysHit;
-extern const Event DictCacheRequestTimeNs;
-extern const Event DictCacheRequests;
-extern const Event DictCacheLockWriteNs;
-extern const Event DictCacheLockReadNs;
+ extern const Event DictCacheKeysRequested;
+ extern const Event DictCacheKeysRequestedMiss;
+ extern const Event DictCacheKeysRequestedFound;
+ extern const Event DictCacheKeysExpired;
+ extern const Event DictCacheKeysNotFound;
+ extern const Event DictCacheKeysHit;
+ extern const Event DictCacheRequestTimeNs;
+ extern const Event DictCacheRequests;
+ extern const Event DictCacheLockWriteNs;
+ extern const Event DictCacheLockReadNs;
}
namespace CurrentMetrics
{
-extern const Metric DictCacheRequests;
+ extern const Metric DictCacheRequests;
}
namespace DB
diff --git a/src/Dictionaries/CassandraDictionarySource.cpp b/src/Dictionaries/CassandraDictionarySource.cpp
index 8b31b4d6fa2..aa8d6107508 100644
--- a/src/Dictionaries/CassandraDictionarySource.cpp
+++ b/src/Dictionaries/CassandraDictionarySource.cpp
@@ -36,10 +36,10 @@ void registerDictionarySourceCassandra(DictionarySourceFactory & factory)
#if USE_CASSANDRA
-#include
-#include
-#include "CassandraBlockInputStream.h"
#include
+#include
+#include
+#include
namespace DB
{
@@ -49,7 +49,7 @@ namespace ErrorCodes
extern const int INVALID_CONFIG_PARAMETER;
}
-CassandraSettings::CassandraSettings(
+CassandraDictionarySource::Configuration::Configuration(
const Poco::Util::AbstractConfiguration & config,
const String & config_prefix)
: host(config.getString(config_prefix + ".host"))
@@ -66,7 +66,7 @@ CassandraSettings::CassandraSettings(
setConsistency(config.getString(config_prefix + ".consistency", "One"));
}
-void CassandraSettings::setConsistency(const String & config_str)
+void CassandraDictionarySource::Configuration::setConsistency(const String & config_str)
{
if (config_str == "One")
consistency = CASS_CONSISTENCY_ONE;
@@ -96,19 +96,19 @@ static const size_t max_block_size = 8192;
CassandraDictionarySource::CassandraDictionarySource(
const DictionaryStructure & dict_struct_,
- const CassandraSettings & settings_,
+ const Configuration & configuration_,
const Block & sample_block_)
: log(&Poco::Logger::get("CassandraDictionarySource"))
, dict_struct(dict_struct_)
- , settings(settings_)
+ , configuration(configuration_)
, sample_block(sample_block_)
- , query_builder(dict_struct, settings.db, "", settings.table, settings.where, IdentifierQuotingStyle::DoubleQuotes)
+ , query_builder(dict_struct, configuration.db, "", configuration.table, configuration.query, configuration.where, IdentifierQuotingStyle::DoubleQuotes)
{
- cassandraCheck(cass_cluster_set_contact_points(cluster, settings.host.c_str()));
- if (settings.port)
- cassandraCheck(cass_cluster_set_port(cluster, settings.port));
- cass_cluster_set_credentials(cluster, settings.user.c_str(), settings.password.c_str());
- cassandraCheck(cass_cluster_set_consistency(cluster, settings.consistency));
+ cassandraCheck(cass_cluster_set_contact_points(cluster, configuration.host.c_str()));
+ if (configuration.port)
+ cassandraCheck(cass_cluster_set_port(cluster, configuration.port));
+ cass_cluster_set_credentials(cluster, configuration.user.c_str(), configuration.password.c_str());
+ cassandraCheck(cass_cluster_set_consistency(cluster, configuration.consistency));
}
CassandraDictionarySource::CassandraDictionarySource(
@@ -118,14 +118,14 @@ CassandraDictionarySource::CassandraDictionarySource(
Block & sample_block_)
: CassandraDictionarySource(
dict_struct_,
- CassandraSettings(config, config_prefix),
+ Configuration(config, config_prefix),
sample_block_)
{
}
void CassandraDictionarySource::maybeAllowFiltering(String & query) const
{
- if (!settings.allow_filtering)
+ if (!configuration.allow_filtering)
return;
query.pop_back(); /// remove semicolon
query += " ALLOW FILTERING;";
@@ -141,7 +141,7 @@ Pipe CassandraDictionarySource::loadAll()
std::string CassandraDictionarySource::toString() const
{
- return "Cassandra: " + settings.db + '.' + settings.table;
+ return "Cassandra: " + configuration.db + '.' + configuration.table;
}
Pipe CassandraDictionarySource::loadIds(const std::vector<UInt64> & ids)
@@ -162,7 +162,7 @@ Pipe CassandraDictionarySource::loadKeys(const Columns & key_columns, const std::vector<size_t> & requested_rows)
for (const auto & row : requested_rows)
{
SipHash partition_key;
- for (size_t i = 0; i < settings.partition_key_prefix; ++i)
+ for (size_t i = 0; i < configuration.partition_key_prefix; ++i)
key_columns[i]->updateHashWithValue(row, partition_key);
partitions[partition_key.get64()].push_back(row);
}
@@ -170,7 +170,7 @@ Pipe CassandraDictionarySource::loadKeys(const Columns & key_columns, const std:
Pipes pipes;
for (const auto & partition : partitions)
{
- String query = query_builder.composeLoadKeysQuery(key_columns, partition.second, ExternalQueryBuilder::CASSANDRA_SEPARATE_PARTITION_KEY, settings.partition_key_prefix);
+ String query = query_builder.composeLoadKeysQuery(key_columns, partition.second, ExternalQueryBuilder::CASSANDRA_SEPARATE_PARTITION_KEY, configuration.partition_key_prefix);
maybeAllowFiltering(query);
LOG_INFO(log, "Loading keys for partition hash {} using query: {}", partition.first, query);
pipes.push_back(Pipe(std::make_shared<CassandraSource>(getSession(), query, sample_block, max_block_size)));
diff --git a/src/Dictionaries/CassandraDictionarySource.h b/src/Dictionaries/CassandraDictionarySource.h
index 871e3dc4857..35419d3ea7d 100644
--- a/src/Dictionaries/CassandraDictionarySource.h
+++ b/src/Dictionaries/CassandraDictionarySource.h
@@ -14,33 +14,35 @@
namespace DB
{
-struct CassandraSettings
-{
- String host;
- UInt16 port;
- String user;
- String password;
- String db;
- String table;
-
- CassConsistency consistency;
- bool allow_filtering;
- /// TODO get information about key from the driver
- size_t partition_key_prefix;
- size_t max_threads;
- String where;
-
- CassandraSettings(const Poco::Util::AbstractConfiguration & config, const String & config_prefix);
-
- void setConsistency(const String & config_str);
-};
-
class CassandraDictionarySource final : public IDictionarySource
{
public:
+
+ struct Configuration
+ {
+ String host;
+ UInt16 port;
+ String user;
+ String password;
+ String db;
+ String table;
+ String query;
+
+ CassConsistency consistency;
+ bool allow_filtering;
+ /// TODO get information about key from the driver
+ size_t partition_key_prefix;
+ size_t max_threads;
+ String where;
+
+ Configuration(const Poco::Util::AbstractConfiguration & config, const String & config_prefix);
+
+ void setConsistency(const String & config_str);
+ };
+
CassandraDictionarySource(
const DictionaryStructure & dict_struct,
- const CassandraSettings & settings_,
+ const Configuration & configuration,
const Block & sample_block);
CassandraDictionarySource(
@@ -59,7 +61,7 @@ public:
DictionarySourcePtr clone() const override
{
-        return std::make_unique<CassandraDictionarySource>(dict_struct, settings, sample_block);
+        return std::make_unique<CassandraDictionarySource>(dict_struct, configuration, sample_block);
}
Pipe loadIds(const std::vector<UInt64> & ids) override;
@@ -76,7 +78,7 @@ private:
Poco::Logger * log;
const DictionaryStructure dict_struct;
- const CassandraSettings settings;
+ const Configuration configuration;
Block sample_block;
ExternalQueryBuilder query_builder;
diff --git a/src/Dictionaries/CassandraBlockInputStream.cpp b/src/Dictionaries/CassandraSource.cpp
similarity index 99%
rename from src/Dictionaries/CassandraBlockInputStream.cpp
rename to src/Dictionaries/CassandraSource.cpp
index 384717e2ba2..1ebacdb2c2f 100644
--- a/src/Dictionaries/CassandraBlockInputStream.cpp
+++ b/src/Dictionaries/CassandraSource.cpp
@@ -10,7 +10,7 @@
#include
#include
#include