Merge branch 'master' into short-circuit
commit 80eaea1c51
@ -1,6 +1,6 @@
---
name: Bug report
about: Create a report to help us improve ClickHouse
about: Wrong behaviour (visible to users) in official ClickHouse release.
title: ''
labels: bug
assignees: ''
CHANGELOG.md
@ -1,3 +1,114 @@
|
||||
### ClickHouse release v21.8, 2021-08-12
|
||||
|
||||
#### New Features
|
||||
|
||||
* Collect common system metrics (in `system.asynchronous_metrics` and `system.asynchronous_metric_log`) on CPU usage, disk usage, memory usage, IO, network, files, load average, CPU frequencies, thermal sensors, EDAC counters, system uptime; also added metrics about the scheduling jitter and the time spent collecting the metrics. It works similar to `atop` in ClickHouse and allows access to monitoring data even if you have no additional tools installed. Close [#9430](https://github.com/ClickHouse/ClickHouse/issues/9430). [#24416](https://github.com/ClickHouse/ClickHouse/pull/24416) ([Yegor Levankov](https://github.com/elevankoff)).
|
||||
* Add new functions `leftPad()`, `rightPad()`, `leftPadUTF8()`, `rightPadUTF8()`. [#26075](https://github.com/ClickHouse/ClickHouse/pull/26075) ([Vitaly Baranov](https://github.com/vitlibar)).
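  For instance (a minimal sketch; the pad character and lengths are arbitrary):

  ```
  SELECT
      leftPad('abc', 7, '*')  AS padded_left,   -- '****abc'
      rightPad('abc', 7, '*') AS padded_right;  -- 'abc****'
  ```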
|
||||
* Add the `FIRST` keyword to the `ADD INDEX` command to be able to add the index at the beginning of the indices list. [#25904](https://github.com/ClickHouse/ClickHouse/pull/25904) ([xjewer](https://github.com/xjewer)).
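  A hedged sketch of the syntax (table, column, and index names are illustrative; the placement of `FIRST` is assumed to follow the `ADD COLUMN ... FIRST` convention):

  ```
  ALTER TABLE my_table
      ADD INDEX idx_value value TYPE minmax GRANULARITY 4 FIRST;
  ```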
|
||||
* Introduce `system.data_skipping_indices` table containing information about existing data skipping indices. Close [#7659](https://github.com/ClickHouse/ClickHouse/issues/7659). [#25693](https://github.com/ClickHouse/ClickHouse/pull/25693) ([Dmitry Novik](https://github.com/novikd)).
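  For example, the new table can be inspected directly:

  ```
  SELECT * FROM system.data_skipping_indices;
  ```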
|
||||
* Add `bin`/`unbin` functions. [#25609](https://github.com/ClickHouse/ClickHouse/pull/25609) ([zhaoyu](https://github.com/zxc111)).
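  A quick sketch, assuming the functions mirror `hex`/`unhex` but in base 2:

  ```
  SELECT bin('A') AS bits, unbin('01000001') AS back;  -- '01000001', 'A'
  ```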
|
||||
* Support `Map` and `(U)Int128`, `(U)Int256` types in `mapAdd` and `mapSubtract` functions. [#25596](https://github.com/ClickHouse/ClickHouse/pull/25596) ([Ildus Kurbangaliev](https://github.com/ildus)).
|
||||
* Support `DISTINCT ON (columns)` expression, close [#25404](https://github.com/ClickHouse/ClickHouse/issues/25404). [#25589](https://github.com/ClickHouse/ClickHouse/pull/25589) ([Zijie Lu](https://github.com/TszKitLo40)).
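  A minimal sketch of the syntax (the inline `values` data is illustrative); one row is kept per distinct value of the listed columns:

  ```
  SELECT DISTINCT ON (k) k, v
  FROM values('k UInt8, v UInt8', (1, 10), (1, 20), (2, 30));
  ```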
|
||||
* Add support for a part of the SQL/JSON standard. [#24148](https://github.com/ClickHouse/ClickHouse/pull/24148) ([l1tsolaiki](https://github.com/l1tsolaiki)).
|
||||
* Add MaterializedPostgreSQL table engine and database engine. This database engine allows replicating a whole database or any subset of database tables. [#20470](https://github.com/ClickHouse/ClickHouse/pull/20470) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Add an ability to reset a custom setting to default and remove it from the table's metadata. It allows rolling back the change without knowing the system/config's default. Closes [#14449](https://github.com/ClickHouse/ClickHouse/issues/14449). [#17769](https://github.com/ClickHouse/ClickHouse/pull/17769) ([xjewer](https://github.com/xjewer)).
|
||||
* Render pipelines as graphs in Web UI if `EXPLAIN PIPELINE graph = 1` query is submitted. [#26067](https://github.com/ClickHouse/ClickHouse/pull/26067) ([alexey-milovidov](https://github.com/alexey-milovidov)).
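  The graph rendering is triggered by a query of roughly this shape, submitted through the Web UI:

  ```
  EXPLAIN PIPELINE graph = 1
  SELECT number % 10 AS k, sum(number)
  FROM numbers(1000000)
  GROUP BY k;
  ```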
|
||||
|
||||
#### Performance Improvements
|
||||
|
||||
* Compile aggregate functions. Use option `compile_aggregate_expressions` to enable it. [#24789](https://github.com/ClickHouse/ClickHouse/pull/24789) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Improve latency of short queries that require reading from tables with many columns. [#26371](https://github.com/ClickHouse/ClickHouse/pull/26371) ([Anton Popov](https://github.com/CurtizJ)).
|
||||
|
||||
#### Improvements
|
||||
|
||||
* Use `Map` data type for system logs tables (`system.query_log`, `system.query_thread_log`, `system.processes`, `system.opentelemetry_span_log`). These tables will be auto-created with new data types. Virtual columns are created to support old queries. Closes [#18698](https://github.com/ClickHouse/ClickHouse/issues/18698). [#23934](https://github.com/ClickHouse/ClickHouse/pull/23934), [#25773](https://github.com/ClickHouse/ClickHouse/pull/25773) ([hexiaoting](https://github.com/hexiaoting), [sundy-li](https://github.com/sundy-li)).
|
||||
* For a dictionary with a complex key containing only one attribute, allow not wrapping the key expression in tuple for functions `dictGet`, `dictHas`. [#26130](https://github.com/ClickHouse/ClickHouse/pull/26130) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Implement function `bin`/`hex` from `AggregateFunction` states. [#26094](https://github.com/ClickHouse/ClickHouse/pull/26094) ([zhaoyu](https://github.com/zxc111)).
|
||||
* Support arguments of `UUID` type for `empty` and `notEmpty` functions. `UUID` is empty if it is all zeros (nil UUID). Closes [#3446](https://github.com/ClickHouse/ClickHouse/issues/3446). [#25974](https://github.com/ClickHouse/ClickHouse/pull/25974) ([zhaoyu](https://github.com/zxc111)).
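  For example:

  ```
  SELECT
      empty(toUUID('00000000-0000-0000-0000-000000000000')) AS is_nil,  -- 1
      notEmpty(generateUUIDv4()) AS is_not_nil;                         -- 1
  ```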
|
||||
* Fix error with query `SET SQL_SELECT_LIMIT` in MySQL protocol. Closes [#17115](https://github.com/ClickHouse/ClickHouse/issues/17115). [#25972](https://github.com/ClickHouse/ClickHouse/pull/25972) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* More instrumentation for network interaction: add counters for recv/send bytes; add gauges for recvs/sends. Added missing documentation. Close [#5897](https://github.com/ClickHouse/ClickHouse/issues/5897). [#25962](https://github.com/ClickHouse/ClickHouse/pull/25962) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Add setting `optimize_move_to_prewhere_if_final`. If query has `FINAL`, the optimization `move_to_prewhere` will be enabled only if both `optimize_move_to_prewhere` and `optimize_move_to_prewhere_if_final` are enabled. Closes [#8684](https://github.com/ClickHouse/ClickHouse/issues/8684). [#25940](https://github.com/ClickHouse/ClickHouse/pull/25940) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Allow complex quoted identifiers of JOINed tables. Close [#17861](https://github.com/ClickHouse/ClickHouse/issues/17861). [#25924](https://github.com/ClickHouse/ClickHouse/pull/25924) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Add support for Unicode (e.g. Chinese, Cyrillic) components in `Nested` data types. Close [#25594](https://github.com/ClickHouse/ClickHouse/issues/25594). [#25923](https://github.com/ClickHouse/ClickHouse/pull/25923) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Allow `quantiles*` functions to work with `aggregate_functions_null_for_empty`. Close [#25892](https://github.com/ClickHouse/ClickHouse/issues/25892). [#25919](https://github.com/ClickHouse/ClickHouse/pull/25919) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Allow parameters for parametric aggregate functions to be arbitrary constant expressions (e.g., `1 + 2`), not just literals. It also allows using the query parameters (in parameterized queries like `{param:UInt8}`) inside parametric aggregate functions. Closes [#11607](https://github.com/ClickHouse/ClickHouse/issues/11607). [#25910](https://github.com/ClickHouse/ClickHouse/pull/25910) ([alexey-milovidov](https://github.com/alexey-milovidov)).
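  For instance, a constant expression can now appear in the parameter list (a minimal sketch):

  ```
  SELECT quantiles(0.5, 0.9 + 0.05)(number) FROM numbers(1000);
  ```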
|
||||
* Correctly throw the exception on the attempt to parse an invalid `Date`. Closes [#6481](https://github.com/ClickHouse/ClickHouse/issues/6481). [#25909](https://github.com/ClickHouse/ClickHouse/pull/25909) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Support for multiple includes in configuration. It is possible to include users configuration, remote server configuration from multiple sources. Simply place `<include />` element with `from_zk`, `from_env` or `incl` attribute, and it will be replaced with the substitution. [#24404](https://github.com/ClickHouse/ClickHouse/pull/24404) ([nvartolomei](https://github.com/nvartolomei)).
|
||||
* Support for queries with a column named `"null"` (it must be specified in back-ticks or double quotes) and `ON CLUSTER`. Closes [#24035](https://github.com/ClickHouse/ClickHouse/issues/24035). [#25907](https://github.com/ClickHouse/ClickHouse/pull/25907) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Support `LowCardinality`, `Decimal`, and `UUID` for `JSONExtract`. Closes [#24606](https://github.com/ClickHouse/ClickHouse/issues/24606). [#25900](https://github.com/ClickHouse/ClickHouse/pull/25900) ([Kseniia Sumarokova](https://github.com/kssenii)).
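  A hedged example of extracting a `UUID` (the JSON value is illustrative):

  ```
  SELECT JSONExtract('{"id": "61f0c404-5cb3-11e7-907b-a6006ad3dba0"}', 'id', 'UUID') AS id;
  ```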
|
||||
* Convert history file from `readline` format to `replxx` format. [#25888](https://github.com/ClickHouse/ClickHouse/pull/25888) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* Fix bug which can lead to intersecting parts after `DROP PART` or background deletion of an empty part. [#25884](https://github.com/ClickHouse/ClickHouse/pull/25884) ([alesapin](https://github.com/alesapin)).
|
||||
* Better handling of lost parts for `ReplicatedMergeTree` tables. Fixes rare inconsistencies in `ReplicationQueue`. Fixes [#10368](https://github.com/ClickHouse/ClickHouse/issues/10368). [#25820](https://github.com/ClickHouse/ClickHouse/pull/25820) ([alesapin](https://github.com/alesapin)).
|
||||
* Allow starting clickhouse-client with unreadable working directory. [#25817](https://github.com/ClickHouse/ClickHouse/pull/25817) ([ianton-ru](https://github.com/ianton-ru)).
|
||||
* Fix "No available columns" error for `Merge` storage. [#25801](https://github.com/ClickHouse/ClickHouse/pull/25801) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* MySQL Engine now supports the exchange of column comments between MySQL and ClickHouse. [#25795](https://github.com/ClickHouse/ClickHouse/pull/25795) ([Storozhuk Kostiantyn](https://github.com/sand6255)).
|
||||
* Fix inconsistent behaviour of `GROUP BY` constant on empty set. Closes [#6842](https://github.com/ClickHouse/ClickHouse/issues/6842). [#25786](https://github.com/ClickHouse/ClickHouse/pull/25786) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Cancel already running merges in partition on `DROP PARTITION` and `TRUNCATE` for `ReplicatedMergeTree`. Resolves [#17151](https://github.com/ClickHouse/ClickHouse/issues/17151). [#25684](https://github.com/ClickHouse/ClickHouse/pull/25684) ([tavplubix](https://github.com/tavplubix)).
|
||||
* Support the `ENUM` data type for MaterializeMySQL. [#25676](https://github.com/ClickHouse/ClickHouse/pull/25676) ([Storozhuk Kostiantyn](https://github.com/sand6255)).
|
||||
* Support materialized and aliased columns in JOIN, close [#13274](https://github.com/ClickHouse/ClickHouse/issues/13274). [#25634](https://github.com/ClickHouse/ClickHouse/pull/25634) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Fix possible logical race condition between `ALTER TABLE ... DETACH` and background merges. [#25605](https://github.com/ClickHouse/ClickHouse/pull/25605) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* Make `NetworkReceiveElapsedMicroseconds` metric to correctly include the time spent waiting for data from the client to `INSERT`. Close [#9958](https://github.com/ClickHouse/ClickHouse/issues/9958). [#25602](https://github.com/ClickHouse/ClickHouse/pull/25602) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Support `TRUNCATE TABLE` for StorageS3 and StorageHDFS. Close [#25530](https://github.com/ClickHouse/ClickHouse/issues/25530). [#25550](https://github.com/ClickHouse/ClickHouse/pull/25550) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Support for dynamic reloading of config to change number of threads in pool for background jobs execution (merges, mutations, fetches). [#25548](https://github.com/ClickHouse/ClickHouse/pull/25548) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
|
||||
* Allow extracting of non-string element as string using `JSONExtract`. This is for [#25414](https://github.com/ClickHouse/ClickHouse/issues/25414). [#25452](https://github.com/ClickHouse/ClickHouse/pull/25452) ([Amos Bird](https://github.com/amosbird)).
|
||||
* Support regular expression in `Database` argument for `StorageMerge`. Close [#776](https://github.com/ClickHouse/ClickHouse/issues/776). [#25064](https://github.com/ClickHouse/ClickHouse/pull/25064) ([flynn](https://github.com/ucasfl)).
|
||||
* Web UI: if the value looks like a URL, automatically generate a link. [#25965](https://github.com/ClickHouse/ClickHouse/pull/25965) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Make `sudo service clickhouse-server start` to work on systems with `systemd` like Centos 8. Close [#14298](https://github.com/ClickHouse/ClickHouse/issues/14298). Close [#17799](https://github.com/ClickHouse/ClickHouse/issues/17799). [#25921](https://github.com/ClickHouse/ClickHouse/pull/25921) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
|
||||
#### Bug Fixes
|
||||
|
||||
* Fix incorrect `SET ROLE` in some cases. [#26707](https://github.com/ClickHouse/ClickHouse/pull/26707) ([Vitaly Baranov](https://github.com/vitlibar)).
|
||||
* Fix potential `nullptr` dereference in window functions. Fix [#25276](https://github.com/ClickHouse/ClickHouse/issues/25276). [#26668](https://github.com/ClickHouse/ClickHouse/pull/26668) ([Alexander Kuzmenkov](https://github.com/akuzm)).
|
||||
* Fix incorrect function names of `groupBitmapAnd/Or/Xor`. Fix [#26557](https://github.com/ClickHouse/ClickHouse/pull/26557) ([Amos Bird](https://github.com/amosbird)).
|
||||
* Fix crash in rabbitmq shutdown in case rabbitmq setup was not started. Closes [#26504](https://github.com/ClickHouse/ClickHouse/issues/26504). [#26529](https://github.com/ClickHouse/ClickHouse/pull/26529) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Fix issues with `CREATE DICTIONARY` query if dictionary name or database name was quoted. Closes [#26491](https://github.com/ClickHouse/ClickHouse/issues/26491). [#26508](https://github.com/ClickHouse/ClickHouse/pull/26508) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Fix broken name resolution after rewriting column aliases. Fix [#26432](https://github.com/ClickHouse/ClickHouse/issues/26432). [#26475](https://github.com/ClickHouse/ClickHouse/pull/26475) ([Amos Bird](https://github.com/amosbird)).
|
||||
* Fix infinite non-joined block stream in `partial_merge_join` close [#26325](https://github.com/ClickHouse/ClickHouse/issues/26325). [#26374](https://github.com/ClickHouse/ClickHouse/pull/26374) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Fix possible crash when login as dropped user. Fix [#26073](https://github.com/ClickHouse/ClickHouse/issues/26073). [#26363](https://github.com/ClickHouse/ClickHouse/pull/26363) ([Vitaly Baranov](https://github.com/vitlibar)).
|
||||
* Fix `optimize_distributed_group_by_sharding_key` for multiple columns (leads to incorrect result w/ `optimize_skip_unused_shards=1`/`allow_nondeterministic_optimize_skip_unused_shards=1` and multiple columns in sharding key expression). [#26353](https://github.com/ClickHouse/ClickHouse/pull/26353) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* `CAST` from `Date` to `DateTime` (or `DateTime64`) was not using the timezone of the `DateTime` type. It can also affect the comparison between `Date` and `DateTime`. Inference of the common type for `Date` and `DateTime` also was not using the corresponding timezone. It affected the results of function `if` and array construction. Closes [#24128](https://github.com/ClickHouse/ClickHouse/issues/24128). [#24129](https://github.com/ClickHouse/ClickHouse/pull/24129) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Fixed rare bug in lost replica recovery that may cause replicas to diverge. [#26321](https://github.com/ClickHouse/ClickHouse/pull/26321) ([tavplubix](https://github.com/tavplubix)).
|
||||
* Fix zstd decompression in case there are escape sequences at the end of internal buffer. Closes [#26013](https://github.com/ClickHouse/ClickHouse/issues/26013). [#26314](https://github.com/ClickHouse/ClickHouse/pull/26314) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Fix logical error on join with totals, close [#26017](https://github.com/ClickHouse/ClickHouse/issues/26017). [#26250](https://github.com/ClickHouse/ClickHouse/pull/26250) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Remove excessive newline in `thread_name` column in `system.stack_trace` table. Fix [#24124](https://github.com/ClickHouse/ClickHouse/issues/24124). [#26210](https://github.com/ClickHouse/ClickHouse/pull/26210) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Fix `joinGet` with `LowCardinality` columns, close [#25993](https://github.com/ClickHouse/ClickHouse/issues/25993). [#26118](https://github.com/ClickHouse/ClickHouse/pull/26118) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Fix possible crash in `pointInPolygon` if the setting `validate_polygons` is turned off. [#26113](https://github.com/ClickHouse/ClickHouse/pull/26113) ([alexey-milovidov](https://github.com/alexey-milovidov)).
|
||||
* Fix throwing an exception when iterating over a non-existing remote directory. [#26087](https://github.com/ClickHouse/ClickHouse/pull/26087) ([ianton-ru](https://github.com/ianton-ru)).
|
||||
* Fix rare server crash because of `abort` in ZooKeeper client. Fixes [#25813](https://github.com/ClickHouse/ClickHouse/issues/25813). [#26079](https://github.com/ClickHouse/ClickHouse/pull/26079) ([alesapin](https://github.com/alesapin)).
|
||||
* Fix wrong thread estimation for right subquery join in some cases. Close [#24075](https://github.com/ClickHouse/ClickHouse/issues/24075). [#26052](https://github.com/ClickHouse/ClickHouse/pull/26052) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Fixed incorrect `sequence_id` in MySQL protocol packets that ClickHouse sends on exception during query execution. It might cause MySQL client to reset connection to ClickHouse server. Fixes [#21184](https://github.com/ClickHouse/ClickHouse/issues/21184). [#26051](https://github.com/ClickHouse/ClickHouse/pull/26051) ([tavplubix](https://github.com/tavplubix)).
|
||||
* Fix possible mismatched header when using normal projection with `PREWHERE`. Fix [#26020](https://github.com/ClickHouse/ClickHouse/issues/26020). [#26038](https://github.com/ClickHouse/ClickHouse/pull/26038) ([Amos Bird](https://github.com/amosbird)).
|
||||
* Fix formatting of type `Map` with integer keys to `JSON`. [#25982](https://github.com/ClickHouse/ClickHouse/pull/25982) ([Anton Popov](https://github.com/CurtizJ)).
|
||||
* Fix possible deadlock during query profiler stack unwinding. Fix [#25968](https://github.com/ClickHouse/ClickHouse/issues/25968). [#25970](https://github.com/ClickHouse/ClickHouse/pull/25970) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Fix crash on call `dictGet()` with bad arguments. [#25913](https://github.com/ClickHouse/ClickHouse/pull/25913) ([Vitaly Baranov](https://github.com/vitlibar)).
|
||||
* Fixed `scram-sha-256` authentication for PostgreSQL engines. Closes [#24516](https://github.com/ClickHouse/ClickHouse/issues/24516). [#25906](https://github.com/ClickHouse/ClickHouse/pull/25906) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Fix extremely long backoff for background tasks when the background pool is full. Fixes [#25836](https://github.com/ClickHouse/ClickHouse/issues/25836). [#25893](https://github.com/ClickHouse/ClickHouse/pull/25893) ([alesapin](https://github.com/alesapin)).
|
||||
* Fix ARM exception handling with non default page size. Fixes [#25512](https://github.com/ClickHouse/ClickHouse/issues/25512), [#25044](https://github.com/ClickHouse/ClickHouse/issues/25044), [#24901](https://github.com/ClickHouse/ClickHouse/issues/24901), [#23183](https://github.com/ClickHouse/ClickHouse/issues/23183), [#20221](https://github.com/ClickHouse/ClickHouse/issues/20221), [#19703](https://github.com/ClickHouse/ClickHouse/issues/19703), [#19028](https://github.com/ClickHouse/ClickHouse/issues/19028), [#18391](https://github.com/ClickHouse/ClickHouse/issues/18391), [#18121](https://github.com/ClickHouse/ClickHouse/issues/18121), [#17994](https://github.com/ClickHouse/ClickHouse/issues/17994), [#12483](https://github.com/ClickHouse/ClickHouse/issues/12483). [#25854](https://github.com/ClickHouse/ClickHouse/pull/25854) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Fix sharding_key from column w/o function for `remote()` (before `select * from remote('127.1', system.one, dummy)` leads to `Unknown column: dummy, there are only columns .` error). [#25824](https://github.com/ClickHouse/ClickHouse/pull/25824) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* Fixed `Not found column ...` and `Missing column ...` errors when selecting from `MaterializeMySQL`. Fixes [#23708](https://github.com/ClickHouse/ClickHouse/issues/23708), [#24830](https://github.com/ClickHouse/ClickHouse/issues/24830), [#25794](https://github.com/ClickHouse/ClickHouse/issues/25794). [#25822](https://github.com/ClickHouse/ClickHouse/pull/25822) ([tavplubix](https://github.com/tavplubix)).
|
||||
* Fix `optimize_skip_unused_shards_rewrite_in` for non-UInt64 types (may select incorrect shards eventually or throw `Cannot infer type of an empty tuple` or `Function tuple requires at least one argument`). [#25798](https://github.com/ClickHouse/ClickHouse/pull/25798) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* Fix rare bug with `DROP PART` query for `ReplicatedMergeTree` tables which can lead to error message `Unexpected merged part intersecting drop range`. [#25783](https://github.com/ClickHouse/ClickHouse/pull/25783) ([alesapin](https://github.com/alesapin)).
|
||||
* Fix bug in `TTL` with `GROUP BY` expression which refuses to execute `TTL` after first execution in part. [#25743](https://github.com/ClickHouse/ClickHouse/pull/25743) ([alesapin](https://github.com/alesapin)).
|
||||
* Allow StorageMerge to access tables with aliases. Closes [#6051](https://github.com/ClickHouse/ClickHouse/issues/6051). [#25694](https://github.com/ClickHouse/ClickHouse/pull/25694) ([Kseniia Sumarokova](https://github.com/kssenii)).
|
||||
* Fix slow dict join in some cases, close [#24209](https://github.com/ClickHouse/ClickHouse/issues/24209). [#25618](https://github.com/ClickHouse/ClickHouse/pull/25618) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Fix `ALTER MODIFY COLUMN` of columns, which participates in TTL expressions. [#25554](https://github.com/ClickHouse/ClickHouse/pull/25554) ([Anton Popov](https://github.com/CurtizJ)).
|
||||
* Fix assertion in `PREWHERE` with non-UInt8 type, close [#19589](https://github.com/ClickHouse/ClickHouse/issues/19589). [#25484](https://github.com/ClickHouse/ClickHouse/pull/25484) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Fix some fuzzed msan crash. Fixes [#22517](https://github.com/ClickHouse/ClickHouse/issues/22517). [#26428](https://github.com/ClickHouse/ClickHouse/pull/26428) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
|
||||
* Fix empty history file conversion. [#26589](https://github.com/ClickHouse/ClickHouse/pull/26589) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* Update `chown` cmd check in `clickhouse-server` docker entrypoint. It fixes error 'cluster pod restart failed (or timeout)' on kubernetes. [#26545](https://github.com/ClickHouse/ClickHouse/pull/26545) ([Ky Li](https://github.com/Kylinrix)).
|
||||
|
||||
#### Build/Testing/Packaging Improvements
|
||||
|
||||
* Disabling TestFlows LDAP module due to test failures. [#26065](https://github.com/ClickHouse/ClickHouse/pull/26065) ([vzakaznikov](https://github.com/vzakaznikov)).
|
||||
* Enabling all TestFlows modules and fixing some tests. [#26011](https://github.com/ClickHouse/ClickHouse/pull/26011) ([vzakaznikov](https://github.com/vzakaznikov)).
|
||||
* Add new tests for checking access rights for columns used in filters (`WHERE` / `PREWHERE` / row policy) of the `SELECT` statement after changes in [#24405](https://github.com/ClickHouse/ClickHouse/pull/24405). [#25619](https://github.com/ClickHouse/ClickHouse/pull/25619) ([Vitaly Baranov](https://github.com/vitlibar)).
|
||||
|
||||
#### Other
|
||||
|
||||
* Add `clickhouse-keeper-converter` tool which allows converting zookeeper logs and snapshots into `clickhouse-keeper` snapshot format. [#25428](https://github.com/ClickHouse/ClickHouse/pull/25428) ([alesapin](https://github.com/alesapin)).
|
||||
|
||||
|
||||
|
||||
### ClickHouse release v21.7, 2021-07-09
|
||||
|
||||
#### Backward Incompatible Change
|
||||
|
@ -271,12 +271,6 @@ endif()
|
||||
|
||||
include(cmake/cpu_features.cmake)
|
||||
|
||||
option(ARCH_NATIVE "Add -march=native compiler flag. This makes your binaries non-portable but more performant code may be generated.")
|
||||
|
||||
if (ARCH_NATIVE)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native")
|
||||
endif ()
|
||||
|
||||
# Asynchronous unwind tables are needed for Query Profiler.
|
||||
# They are already by default on some platforms but possibly not on all platforms.
|
||||
# Enable it explicitly.
|
||||
|
@ -60,6 +60,7 @@ DateLUTImpl::DateLUTImpl(const std::string & time_zone_)
|
||||
offset_at_start_of_epoch = cctz_time_zone.lookup(cctz_time_zone.lookup(epoch).pre).offset;
|
||||
offset_at_start_of_lut = cctz_time_zone.lookup(cctz_time_zone.lookup(lut_start).pre).offset;
|
||||
offset_is_whole_number_of_hours_during_epoch = true;
|
||||
offset_is_whole_number_of_minutes_during_epoch = true;
|
||||
|
||||
cctz::civil_day date = lut_start;
|
||||
|
||||
@ -108,6 +109,9 @@ DateLUTImpl::DateLUTImpl(const std::string & time_zone_)
|
||||
if (offset_is_whole_number_of_hours_during_epoch && start_of_day > 0 && start_of_day % 3600)
|
||||
offset_is_whole_number_of_hours_during_epoch = false;
|
||||
|
||||
if (offset_is_whole_number_of_minutes_during_epoch && start_of_day > 0 && start_of_day % 60)
|
||||
offset_is_whole_number_of_minutes_during_epoch = false;
|
||||
|
||||
/// If UTC offset was changed this day.
|
||||
/// Change in time zone without transition is possible, e.g. Moscow 1991 Sun, 31 Mar, 02:00 MSK to EEST
|
||||
cctz::time_zone::civil_transition transition{};
|
||||
|
@ -193,6 +193,7 @@ private:
|
||||
/// UTC offset at the beginning of the first supported year.
|
||||
Time offset_at_start_of_lut;
|
||||
bool offset_is_whole_number_of_hours_during_epoch;
|
||||
bool offset_is_whole_number_of_minutes_during_epoch;
|
||||
|
||||
/// Time zone name.
|
||||
std::string time_zone;
|
||||
@ -251,18 +252,23 @@ private:
|
||||
}
|
||||
|
||||
template <typename T, typename Divisor>
|
||||
static inline T roundDown(T x, Divisor divisor)
|
||||
inline T roundDown(T x, Divisor divisor) const
|
||||
{
|
||||
static_assert(std::is_integral_v<T> && std::is_integral_v<Divisor>);
|
||||
assert(divisor > 0);
|
||||
|
||||
if (likely(x >= 0))
|
||||
return x / divisor * divisor;
|
||||
if (likely(offset_is_whole_number_of_hours_during_epoch))
|
||||
{
|
||||
if (likely(x >= 0))
|
||||
return x / divisor * divisor;
|
||||
|
||||
/// Integer division for negative numbers rounds them towards zero (up).
|
||||
/// We will shift the number so it will be rounded towards -inf (down).
|
||||
/// Integer division for negative numbers rounds them towards zero (up).
|
||||
/// We will shift the number so it will be rounded towards -inf (down).
|
||||
return (x + 1 - divisor) / divisor * divisor;
|
||||
}
|
||||
|
||||
return (x + 1 - divisor) / divisor * divisor;
|
||||
Time date = find(x).date;
|
||||
return date + (x - date) / divisor * divisor;
|
||||
}
|
||||
|
||||
public:
|
||||
@ -459,10 +465,21 @@ public:
|
||||
|
||||
inline unsigned toSecond(Time t) const
|
||||
{
|
||||
auto res = t % 60;
|
||||
if (likely(res >= 0))
|
||||
return res;
|
||||
return res + 60;
|
||||
if (likely(offset_is_whole_number_of_minutes_during_epoch))
|
||||
{
|
||||
Time res = t % 60;
|
||||
if (likely(res >= 0))
|
||||
return res;
|
||||
return res + 60;
|
||||
}
|
||||
|
||||
LUTIndex index = findIndex(t);
|
||||
Time time = t - lut[index].date;
|
||||
|
||||
if (time >= lut[index].time_at_offset_change())
|
||||
time += lut[index].amount_of_offset_change();
|
||||
|
||||
return time % 60;
|
||||
}
|
||||
|
||||
inline unsigned toMinute(Time t) const
|
||||
@ -483,29 +500,11 @@ public:
|
||||
}
|
||||
|
||||
/// NOTE: Assuming timezone offset is a multiple of 15 minutes.
|
||||
inline Time toStartOfMinute(Time t) const { return roundDown(t, 60); }
|
||||
inline Time toStartOfFiveMinute(Time t) const { return roundDown(t, 300); }
|
||||
inline Time toStartOfFifteenMinutes(Time t) const { return roundDown(t, 900); }
|
||||
|
||||
inline Time toStartOfTenMinutes(Time t) const
|
||||
{
|
||||
if (t >= 0 && offset_is_whole_number_of_hours_during_epoch)
|
||||
return t / 600 * 600;
|
||||
|
||||
/// More complex logic is for Nepal - it has offset 05:45. Australia/Eucla is also unfortunate.
|
||||
Time date = find(t).date;
|
||||
return date + (t - date) / 600 * 600;
|
||||
}
|
||||
|
||||
/// NOTE: Assuming timezone transitions are multiple of hours. Lord Howe Island in Australia is a notable exception.
|
||||
inline Time toStartOfHour(Time t) const
|
||||
{
|
||||
if (t >= 0 && offset_is_whole_number_of_hours_during_epoch)
|
||||
return t / 3600 * 3600;
|
||||
|
||||
Time date = find(t).date;
|
||||
return date + (t - date) / 3600 * 3600;
|
||||
}
|
||||
inline Time toStartOfMinute(Time t) const { return toStartOfMinuteInterval(t, 1); }
|
||||
inline Time toStartOfFiveMinute(Time t) const { return toStartOfMinuteInterval(t, 5); }
|
||||
inline Time toStartOfFifteenMinutes(Time t) const { return toStartOfMinuteInterval(t, 15); }
|
||||
inline Time toStartOfTenMinutes(Time t) const { return toStartOfMinuteInterval(t, 10); }
|
||||
inline Time toStartOfHour(Time t) const { return roundDown(t, 3600); }
|
||||
|
||||
/** Number of calendar day since the beginning of UNIX epoch (1970-01-01 is zero)
|
||||
* We use just two bytes for it. It covers the range up to 2105 and slightly more.
|
||||
@ -903,25 +902,24 @@ public:
|
||||
|
||||
inline Time toStartOfMinuteInterval(Time t, UInt64 minutes) const
|
||||
{
|
||||
if (minutes == 1)
|
||||
return toStartOfMinute(t);
|
||||
UInt64 divisor = 60 * minutes;
|
||||
if (likely(offset_is_whole_number_of_minutes_during_epoch))
|
||||
{
|
||||
if (likely(t >= 0))
|
||||
return t / divisor * divisor;
|
||||
return (t + 1 - divisor) / divisor * divisor;
|
||||
}
|
||||
|
||||
/** In contrast to "toStartOfHourInterval" function above,
|
||||
* the minute intervals are not aligned to the midnight.
|
||||
* You will get unexpected results if for example, you round down to 60 minute interval
|
||||
* and there was a time shift to 30 minutes.
|
||||
*
|
||||
* But this is not specified in docs and can be changed in future.
|
||||
*/
|
||||
|
||||
UInt64 seconds = 60 * minutes;
|
||||
return roundDown(t, seconds);
|
||||
Time date = find(t).date;
|
||||
return date + (t - date) / divisor * divisor;
|
||||
}
|
||||
|
||||
inline Time toStartOfSecondInterval(Time t, UInt64 seconds) const
|
||||
{
|
||||
if (seconds == 1)
|
||||
return t;
|
||||
if (seconds % 60 == 0)
|
||||
return toStartOfMinuteInterval(t, seconds / 60);
|
||||
|
||||
return roundDown(t, seconds);
|
||||
}
|
||||
|
@ -5,109 +5,128 @@ include (CMakePushCheckState)
|
||||
|
||||
cmake_push_check_state ()
|
||||
|
||||
# gcc -dM -E -mno-sse2 - < /dev/null | sort > gcc-dump-nosse2
|
||||
# gcc -dM -E -msse2 - < /dev/null | sort > gcc-dump-sse2
|
||||
#define __SSE2__ 1
|
||||
#define __SSE2_MATH__ 1
|
||||
# The variables HAVE_* determine if compiler has support for the flag to use the corresponding instruction set.
|
||||
# The options ENABLE_* determine if we will tell compiler to actually use the corresponding instruction set if compiler can do it.
|
||||
|
||||
# gcc -dM -E -msse4.1 - < /dev/null | sort > gcc-dump-sse41
|
||||
#define __SSE4_1__ 1
|
||||
# All of them are unrelated to the instruction set at the host machine
|
||||
# (you can compile for newer instruction set on old machines and vice versa).
|
||||
|
||||
set (TEST_FLAG "-msse4.1")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <smmintrin.h>
|
||||
int main() {
|
||||
auto a = _mm_insert_epi8(__m128i(), 0, 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_SSE41)
|
||||
if (HAVE_SSE41)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
option (ENABLE_SSSE3 "Use SSSE3 instructions on x86_64" 1)
|
||||
option (ENABLE_SSE41 "Use SSE4.1 instructions on x86_64" 1)
|
||||
option (ENABLE_SSE42 "Use SSE4.2 instructions on x86_64" 1)
|
||||
option (ENABLE_PCLMULQDQ "Use pclmulqdq instructions on x86_64" 1)
|
||||
option (ENABLE_POPCNT "Use popcnt instructions on x86_64" 1)
|
||||
option (ENABLE_AVX "Use AVX instructions on x86_64" 0)
|
||||
option (ENABLE_AVX2 "Use AVX2 instructions on x86_64" 0)
|
||||
|
||||
if (ARCH_PPC64LE)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} -maltivec -D__SSE2__=1 -DNO_WARN_X86_INTRINSICS")
|
||||
endif ()
|
||||
option (ARCH_NATIVE "Add -march=native compiler flag. This makes your binaries non-portable but more performant code may be generated. This option overrides ENABLE_* options for specific instruction set. Highly not recommended to use." 0)
|
||||
|
||||
# gcc -dM -E -msse4.2 - < /dev/null | sort > gcc-dump-sse42
|
||||
#define __SSE4_2__ 1
|
||||
if (ARCH_NATIVE)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native")
|
||||
|
||||
set (TEST_FLAG "-msse4.2")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <nmmintrin.h>
|
||||
int main() {
|
||||
auto a = _mm_crc32_u64(0, 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_SSE42)
|
||||
if (HAVE_SSE42)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
else ()
|
||||
set (TEST_FLAG "-mssse3")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <tmmintrin.h>
|
||||
int main() {
|
||||
__m64 a = _mm_abs_pi8(__m64());
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_SSSE3)
|
||||
if (HAVE_SSSE3 AND ENABLE_SSSE3)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
|
||||
set (TEST_FLAG "-mssse3")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <tmmintrin.h>
|
||||
int main() {
|
||||
__m64 a = _mm_abs_pi8(__m64());
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_SSSE3)
|
||||
|
||||
set (TEST_FLAG "-mavx")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <immintrin.h>
|
||||
int main() {
|
||||
auto a = _mm256_insert_epi8(__m256i(), 0, 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_AVX)
|
||||
set (TEST_FLAG "-msse4.1")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <smmintrin.h>
|
||||
int main() {
|
||||
auto a = _mm_insert_epi8(__m128i(), 0, 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_SSE41)
|
||||
if (HAVE_SSE41 AND ENABLE_SSE41)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
|
||||
set (TEST_FLAG "-mavx2")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <immintrin.h>
|
||||
int main() {
|
||||
auto a = _mm256_add_epi16(__m256i(), __m256i());
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_AVX2)
|
||||
if (ARCH_PPC64LE)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} -maltivec -D__SSE2__=1 -DNO_WARN_X86_INTRINSICS")
|
||||
endif ()
|
||||
|
||||
set (TEST_FLAG "-mpclmul")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <wmmintrin.h>
|
||||
int main() {
|
||||
auto a = _mm_clmulepi64_si128(__m128i(), __m128i(), 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_PCLMULQDQ)
|
||||
set (TEST_FLAG "-msse4.2")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <nmmintrin.h>
|
||||
int main() {
|
||||
auto a = _mm_crc32_u64(0, 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_SSE42)
|
||||
if (HAVE_SSE42 AND ENABLE_SSE42)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
|
||||
# gcc -dM -E -mpopcnt - < /dev/null | sort > gcc-dump-popcnt
|
||||
#define __POPCNT__ 1
|
||||
set (TEST_FLAG "-mpclmul")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <wmmintrin.h>
|
||||
int main() {
|
||||
auto a = _mm_clmulepi64_si128(__m128i(), __m128i(), 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_PCLMULQDQ)
|
||||
if (HAVE_PCLMULQDQ AND ENABLE_PCLMULQDQ)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
|
||||
set (TEST_FLAG "-mpopcnt")
|
||||
set (TEST_FLAG "-mpopcnt")
|
||||
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
int main() {
|
||||
auto a = __builtin_popcountll(0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_POPCNT)
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
int main() {
|
||||
auto a = __builtin_popcountll(0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_POPCNT)
|
||||
if (HAVE_POPCNT AND ENABLE_POPCNT)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
|
||||
if (HAVE_POPCNT AND NOT ARCH_AARCH64)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
set (TEST_FLAG "-mavx")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <immintrin.h>
|
||||
int main() {
|
||||
auto a = _mm256_insert_epi8(__m256i(), 0, 0);
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_AVX)
|
||||
if (HAVE_AVX AND ENABLE_AVX)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
|
||||
set (TEST_FLAG "-mavx2")
|
||||
set (CMAKE_REQUIRED_FLAGS "${TEST_FLAG} -O0")
|
||||
check_cxx_source_compiles("
|
||||
#include <immintrin.h>
|
||||
int main() {
|
||||
auto a = _mm256_add_epi16(__m256i(), __m256i());
|
||||
(void)a;
|
||||
return 0;
|
||||
}
|
||||
" HAVE_AVX2)
|
||||
if (HAVE_AVX2 AND ENABLE_AVX2)
|
||||
set (COMPILER_FLAGS "${COMPILER_FLAGS} ${TEST_FLAG}")
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
cmake_pop_check_state ()
|
||||
|
@ -26,17 +26,14 @@ target_include_directories(roaring SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}/include"
|
||||
target_include_directories(roaring SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}/cpp")
|
||||
|
||||
# We redirect malloc/free family of functions to different functions that will track memory in ClickHouse.
|
||||
# It will make this library depend on linking to 'clickhouse_common_io' library that is not done explicitly via 'target_link_libraries'.
|
||||
# And we check that all libraries dependencies are satisfied and all symbols are resolved if we do build with shared libraries.
|
||||
# That's why we enable it only in static build.
|
||||
# Also note that we exploit implicit function declarations.
|
||||
|
||||
if (USE_STATIC_LIBRARIES)
|
||||
target_compile_definitions(roaring PRIVATE
|
||||
target_compile_definitions(roaring PRIVATE
|
||||
-Dmalloc=clickhouse_malloc
|
||||
-Dcalloc=clickhouse_calloc
|
||||
-Drealloc=clickhouse_realloc
|
||||
-Dreallocarray=clickhouse_reallocarray
|
||||
-Dfree=clickhouse_free
|
||||
-Dposix_memalign=clickhouse_posix_memalign)
|
||||
endif ()
|
||||
|
||||
target_link_libraries(roaring PUBLIC clickhouse_common_io)
|
||||
|
@ -4,3 +4,6 @@ set(SIMDJSON_SRC "${SIMDJSON_SRC_DIR}/simdjson.cpp")
|
||||
|
||||
add_library(simdjson ${SIMDJSON_SRC})
|
||||
target_include_directories(simdjson SYSTEM PUBLIC "${SIMDJSON_INCLUDE_DIR}" PRIVATE "${SIMDJSON_SRC_DIR}")
|
||||
|
||||
# simdjson is using its own CPU dispatching and get confused if we enable AVX/AVX2 flags.
|
||||
target_compile_options(simdjson PRIVATE -mno-avx -mno-avx2)
|
||||
|
@ -155,6 +155,10 @@ Normally ClickHouse is statically linked into a single static `clickhouse` binar
|
||||
-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1
|
||||
```
|
||||
|
||||
Note that in this configuration there is no single `clickhouse` binary, and you have to run `clickhouse-server`, `clickhouse-client` etc.
|
||||
Note that the split build has several drawbacks:
|
||||
* There is no single `clickhouse` binary, and you have to run `clickhouse-server`, `clickhouse-client`, etc.
|
||||
* Risk of segfault if you run any of the programs while rebuilding the project.
|
||||
* You cannot run the integration tests since they only work with a single complete binary.
|
||||
* You can't easily copy the binaries elsewhere. Instead of moving a single binary you'll need to copy all binaries and libraries.
|
||||
|
||||
[Original article](https://clickhouse.tech/docs/en/development/build/) <!--hide-->
|
||||
|
@ -43,10 +43,10 @@ CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user',
|
||||
|
||||
**Settings on MySQL-server side**
|
||||
|
||||
For the correct work of `MaterializeMySQL`, there are few mandatory `MySQL`-side configuration settings that should be set:
|
||||
For the correct work of `MaterializedMySQL`, there are a few mandatory `MySQL`-side configuration settings that must be set:
|
||||
|
||||
- `default_authentication_plugin = mysql_native_password` since `MaterializeMySQL` can only authorize with this method.
|
||||
- `gtid_mode = on` since GTID based logging is a mandatory for providing correct `MaterializeMySQL` replication. Pay attention that while turning this mode `On` you should also specify `enforce_gtid_consistency = on`.
|
||||
- `default_authentication_plugin = mysql_native_password` since `MaterializedMySQL` can only authorize with this method.
|
||||
- `gtid_mode = on` since GTID-based logging is mandatory for providing correct `MaterializedMySQL` replication. Pay attention that while turning this mode `On` you should also specify `enforce_gtid_consistency = on`.
|
||||
|
||||
## Virtual columns {#virtual-columns}
|
||||
|
||||
|
@ -50,11 +50,11 @@ SELECT * FROM hdfs_engine_table LIMIT 2
|
||||
|
||||
## Implementation Details {#implementation-details}
|
||||
|
||||
- Reads and writes can be parallel
|
||||
- Reads and writes can be parallel.
|
||||
- [Zero-copy](../../../operations/storing-data.md#zero-copy) replication is supported.
|
||||
- Not supported:
|
||||
- `ALTER` and `SELECT...SAMPLE` operations.
|
||||
- Indexes.
|
||||
- Replication.
|
||||
|
||||
**Globs in path**
|
||||
|
||||
@ -71,12 +71,12 @@ Constructions with `{}` are similar to the [remote](../../../sql-reference/table
|
||||
|
||||
1. Suppose we have several files in TSV format with the following URIs on HDFS:
|
||||
|
||||
- ‘hdfs://hdfs1:9000/some_dir/some_file_1’
|
||||
- ‘hdfs://hdfs1:9000/some_dir/some_file_2’
|
||||
- ‘hdfs://hdfs1:9000/some_dir/some_file_3’
|
||||
- ‘hdfs://hdfs1:9000/another_dir/some_file_1’
|
||||
- ‘hdfs://hdfs1:9000/another_dir/some_file_2’
|
||||
- ‘hdfs://hdfs1:9000/another_dir/some_file_3’
|
||||
- 'hdfs://hdfs1:9000/some_dir/some_file_1'
|
||||
- 'hdfs://hdfs1:9000/some_dir/some_file_2'
|
||||
- 'hdfs://hdfs1:9000/some_dir/some_file_3'
|
||||
- 'hdfs://hdfs1:9000/another_dir/some_file_1'
|
||||
- 'hdfs://hdfs1:9000/another_dir/some_file_2'
|
||||
- 'hdfs://hdfs1:9000/another_dir/some_file_3'
|
||||
|
||||
1. There are several ways to make a table consisting of all six files:
|
||||
|
||||
@ -126,8 +126,9 @@ Similar to GraphiteMergeTree, the HDFS engine supports extended configuration us
|
||||
</hdfs_root>
|
||||
```
|
||||
|
||||
### List of possible configuration options with default values
|
||||
#### Supported by libhdfs3
|
||||
### Configuration Options {#configuration-options}
|
||||
|
||||
#### Supported by libhdfs3 {#supported-by-libhdfs3}
|
||||
|
||||
|
||||
| **parameter** | **default value** |
|
||||
@ -184,7 +185,7 @@ Similar to GraphiteMergeTree, the HDFS engine supports extended configuration us
|
||||
|hadoop\_kerberos\_principal | "" |
|
||||
|hadoop\_kerberos\_kinit\_command | kinit |
|
||||
|
||||
#### Limitations {#limitations}
|
||||
### Limitations {#limitations}
|
||||
* hadoop\_security\_kerberos\_ticket\_cache\_path can be global only, not user specific
|
||||
|
||||
## Kerberos support {#kerberos-support}
|
||||
|
@ -57,10 +57,10 @@ For more information about virtual columns see [here](../../../engines/table-eng
|
||||
## Implementation Details {#implementation-details}
|
||||
|
||||
- Reads and writes can be parallel
|
||||
- [Zero-copy](../../../operations/storing-data.md#zero-copy) replication is supported.
|
||||
- Not supported:
|
||||
- `ALTER` and `SELECT...SAMPLE` operations.
|
||||
- Indexes.
|
||||
- Replication.
|
||||
|
||||
## Wildcards In Path {#wildcards-in-path}
|
||||
|
||||
@ -77,12 +77,12 @@ Constructions with `{}` are similar to the [remote](../../../sql-reference/table
|
||||
|
||||
1. Suppose we have several files in CSV format with the following URIs on S3:
|
||||
|
||||
- ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_1.csv’
|
||||
- ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_2.csv’
|
||||
- ‘https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_3.csv’
|
||||
- ‘https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_1.csv’
|
||||
- ‘https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_2.csv’
|
||||
- ‘https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_3.csv’
|
||||
- 'https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_1.csv'
|
||||
- 'https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_2.csv'
|
||||
- 'https://storage.yandexcloud.net/my-test-bucket-768/some_prefix/some_file_3.csv'
|
||||
- 'https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_1.csv'
|
||||
- 'https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_2.csv'
|
||||
- 'https://storage.yandexcloud.net/my-test-bucket-768/another_prefix/some_file_3.csv'
|
||||
|
||||
There are several ways to make a table consisting of all six files:
|
||||
|
||||
|
@ -17,6 +17,7 @@ The list of documented datasets:
|
||||
- [OpenSky](../../getting-started/example-datasets/opensky.md)
|
||||
- [New York Taxi Data](../../getting-started/example-datasets/nyc-taxi.md)
|
||||
- [UK Property Price Paid](../../getting-started/example-datasets/uk-price-paid.md)
|
||||
- [What's on the Menu?](../../getting-started/example-datasets/menus.md)
|
||||
- [Star Schema Benchmark](../../getting-started/example-datasets/star-schema.md)
|
||||
- [WikiStat](../../getting-started/example-datasets/wikistat.md)
|
||||
- [Terabyte of Click Logs from Criteo](../../getting-started/example-datasets/criteo.md)
|
||||
|
docs/en/getting-started/example-datasets/menus.md (new file, 324 lines)
@ -0,0 +1,324 @@
|
||||
---
|
||||
toc_priority: 21
|
||||
toc_title: Menus
|
||||
---
|
||||
|
||||
# New York Public Library "What's on the Menu?" Dataset
|
||||
|
||||
The dataset was created by the New York Public Library. It contains historical data on the menus of hotels, restaurants, and cafes, with the dishes and their prices.

Source: http://menus.nypl.org/data
The data is in the public domain.

The data comes from the library's archive and may be incomplete and difficult for statistical analysis. Nevertheless, it is also very yummy.
The size is just 1.3 million records about dishes in the menus (a very small data volume for ClickHouse, but it's still a good example).
|
||||
|
||||
## Download the Dataset
|
||||
|
||||
```
|
||||
wget https://s3.amazonaws.com/menusdata.nypl.org/gzips/2021_08_01_07_01_17_data.tgz
|
||||
```
|
||||
|
||||
Replace the link with an up-to-date one from http://menus.nypl.org/data if needed.
The download size is about 35 MB.
|
||||
|
||||
## Unpack the Dataset
|
||||
|
||||
```
|
||||
tar xvf 2021_08_01_07_01_17_data.tgz
|
||||
```
|
||||
|
||||
Uncompressed size is about 150 MB.
|
||||
|
||||
The data is normalized and consists of four tables:
- Menu: information about menus: the name of the restaurant, the date when the menu was seen, etc.
- Dish: information about dishes: the name of the dish along with some characteristics.
- MenuPage: information about the pages in the menus; every page belongs to some menu.
- MenuItem: an item of the menu - a dish along with its price on some menu page; it links to a dish and a menu page.
|
||||
|
||||
## Create the Tables
|
||||
|
||||
```
|
||||
CREATE TABLE dish
|
||||
(
|
||||
id UInt32,
|
||||
name String,
|
||||
description String,
|
||||
menus_appeared UInt32,
|
||||
times_appeared Int32,
|
||||
first_appeared UInt16,
|
||||
last_appeared UInt16,
|
||||
lowest_price Decimal64(3),
|
||||
highest_price Decimal64(3)
|
||||
) ENGINE = MergeTree ORDER BY id;
|
||||
|
||||
CREATE TABLE menu
|
||||
(
|
||||
id UInt32,
|
||||
name String,
|
||||
sponsor String,
|
||||
event String,
|
||||
venue String,
|
||||
place String,
|
||||
physical_description String,
|
||||
occasion String,
|
||||
notes String,
|
||||
call_number String,
|
||||
keywords String,
|
||||
language String,
|
||||
date String,
|
||||
location String,
|
||||
location_type String,
|
||||
currency String,
|
||||
currency_symbol String,
|
||||
status String,
|
||||
page_count UInt16,
|
||||
dish_count UInt16
|
||||
) ENGINE = MergeTree ORDER BY id;
|
||||
|
||||
CREATE TABLE menu_page
|
||||
(
|
||||
id UInt32,
|
||||
menu_id UInt32,
|
||||
page_number UInt16,
|
||||
image_id String,
|
||||
full_height UInt16,
|
||||
full_width UInt16,
|
||||
uuid UUID
|
||||
) ENGINE = MergeTree ORDER BY id;
|
||||
|
||||
CREATE TABLE menu_item
|
||||
(
|
||||
id UInt32,
|
||||
menu_page_id UInt32,
|
||||
price Decimal64(3),
|
||||
high_price Decimal64(3),
|
||||
dish_id UInt32,
|
||||
created_at DateTime,
|
||||
updated_at DateTime,
|
||||
xpos Float64,
|
||||
ypos Float64
|
||||
) ENGINE = MergeTree ORDER BY id;
|
||||
```
|
||||
|
||||
We use the `Decimal` data type to store prices. Everything else is quite straightforward.
|
||||
|
||||
## Import Data
|
||||
|
||||
Upload data into ClickHouse in parallel:
|
||||
|
||||
```
|
||||
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO dish FORMAT CSVWithNames" < Dish.csv
|
||||
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO menu FORMAT CSVWithNames" < Menu.csv
|
||||
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO menu_page FORMAT CSVWithNames" < MenuPage.csv
|
||||
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --date_time_input_format best_effort --query "INSERT INTO menu_item FORMAT CSVWithNames" < MenuItem.csv
|
||||
```
|
||||
|
||||
We use the `CSVWithNames` format because the data is CSV with a header row.

We disable `format_csv_allow_single_quotes` because only double quotes are used for data fields, while single quotes can appear inside the values and should not confuse the CSV parser.

We disable `input_format_null_as_default` because our data has no NULLs. Otherwise ClickHouse would try to parse `\N` sequences and could be confused by `\` characters in the data.

The setting `--date_time_input_format best_effort` allows parsing `DateTime` fields in a wide variety of formats. For example, ISO-8601 without seconds, like '2000-01-01 01:02', will be recognized. Without this setting, only the fixed DateTime format is allowed.
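As a side note, the standalone `parseDateTimeBestEffort` function applies roughly the same relaxed rules, so it is a quick way to check whether a particular format will be accepted (a sketch, assuming a running server):

```
SELECT parseDateTimeBestEffort('2000-01-01 01:02') AS parsed;
```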
|
||||
|
||||
## Denormalize the Data
|
||||
|
||||
The data is presented in multiple tables in normalized form, which means you have to perform JOINs if you want to query, for example, dish names from menu items.
For typical analytical tasks it is far more efficient to work with pre-JOINed ("denormalized") data, to avoid doing the JOIN every time.

We will create a table that contains all the data JOINed together:
|
||||
|
||||
```
|
||||
CREATE TABLE menu_item_denorm
|
||||
ENGINE = MergeTree ORDER BY (dish_name, created_at)
|
||||
AS SELECT
|
||||
price,
|
||||
high_price,
|
||||
created_at,
|
||||
updated_at,
|
||||
xpos,
|
||||
ypos,
|
||||
dish.id AS dish_id,
|
||||
dish.name AS dish_name,
|
||||
dish.description AS dish_description,
|
||||
dish.menus_appeared AS dish_menus_appeared,
|
||||
dish.times_appeared AS dish_times_appeared,
|
||||
dish.first_appeared AS dish_first_appeared,
|
||||
dish.last_appeared AS dish_last_appeared,
|
||||
dish.lowest_price AS dish_lowest_price,
|
||||
dish.highest_price AS dish_highest_price,
|
||||
menu.id AS menu_id,
|
||||
menu.name AS menu_name,
|
||||
menu.sponsor AS menu_sponsor,
|
||||
menu.event AS menu_event,
|
||||
menu.venue AS menu_venue,
|
||||
menu.place AS menu_place,
|
||||
menu.physical_description AS menu_physical_description,
|
||||
menu.occasion AS menu_occasion,
|
||||
menu.notes AS menu_notes,
|
||||
menu.call_number AS menu_call_number,
|
||||
menu.keywords AS menu_keywords,
|
||||
menu.language AS menu_language,
|
||||
menu.date AS menu_date,
|
||||
menu.location AS menu_location,
|
||||
menu.location_type AS menu_location_type,
|
||||
menu.currency AS menu_currency,
|
||||
menu.currency_symbol AS menu_currency_symbol,
|
||||
menu.status AS menu_status,
|
||||
menu.page_count AS menu_page_count,
|
||||
menu.dish_count AS menu_dish_count
|
||||
FROM menu_item
|
||||
JOIN dish ON menu_item.dish_id = dish.id
|
||||
JOIN menu_page ON menu_item.menu_page_id = menu_page.id
|
||||
JOIN menu ON menu_page.menu_id = menu.id
|
||||
```
|
||||
|
||||
## Validate the Data
|
||||
|
||||
```
|
||||
SELECT count() FROM menu_item_denorm
|
||||
1329175
|
||||
```
|
||||
|
||||
## Run Some Queries
|
||||
|
||||
Averaged historical prices of dishes:
|
||||
|
||||
```
|
||||
SELECT
|
||||
round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
|
||||
count(),
|
||||
round(avg(price), 2),
|
||||
bar(avg(price), 0, 100, 100)
|
||||
FROM menu_item_denorm
|
||||
WHERE (menu_currency = 'Dollars') AND (d > 0) AND (d < 2022)
|
||||
GROUP BY d
|
||||
ORDER BY d ASC
|
||||
|
||||
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 100, 100)─┐
|
||||
│ 1850 │ 618 │ 1.5 │ █▍ │
|
||||
│ 1860 │ 1634 │ 1.29 │ █▎ │
|
||||
│ 1870 │ 2215 │ 1.36 │ █▎ │
|
||||
│ 1880 │ 3909 │ 1.01 │ █ │
|
||||
│ 1890 │ 8837 │ 1.4 │ █▍ │
|
||||
│ 1900 │ 176292 │ 0.68 │ ▋ │
|
||||
│ 1910 │ 212196 │ 0.88 │ ▊ │
|
||||
│ 1920 │ 179590 │ 0.74 │ ▋ │
|
||||
│ 1930 │ 73707 │ 0.6 │ ▌ │
|
||||
│ 1940 │ 58795 │ 0.57 │ ▌ │
|
||||
│ 1950 │ 41407 │ 0.95 │ ▊ │
|
||||
│ 1960 │ 51179 │ 1.32 │ █▎ │
|
||||
│ 1970 │ 12914 │ 1.86 │ █▋ │
|
||||
│ 1980 │ 7268 │ 4.35 │ ████▎ │
|
||||
│ 1990 │ 11055 │ 6.03 │ ██████ │
|
||||
│ 2000 │ 2467 │ 11.85 │ ███████████▋ │
|
||||
│ 2010 │ 597 │ 25.66 │ █████████████████████████▋ │
|
||||
└──────┴─────────┴──────────────────────┴──────────────────────────────┘
|
||||
|
||||
17 rows in set. Elapsed: 0.044 sec. Processed 1.33 million rows, 54.62 MB (30.00 million rows/s., 1.23 GB/s.)
|
||||
```
|
||||
|
||||
Take it with a grain of salt.
|
||||
|
||||
### Burger Prices:
|
||||
|
||||
```
|
||||
SELECT
|
||||
round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
|
||||
count(),
|
||||
round(avg(price), 2),
|
||||
bar(avg(price), 0, 50, 100)
|
||||
FROM menu_item_denorm
|
||||
WHERE (menu_currency = 'Dollars') AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%burger%')
|
||||
GROUP BY d
|
||||
ORDER BY d ASC
|
||||
|
||||
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)───────────┐
|
||||
│ 1880 │ 2 │ 0.42 │ ▋ │
|
||||
│ 1890 │ 7 │ 0.85 │ █▋ │
|
||||
│ 1900 │ 399 │ 0.49 │ ▊ │
|
||||
│ 1910 │ 589 │ 0.68 │ █▎ │
|
||||
│ 1920 │ 280 │ 0.56 │ █ │
|
||||
│ 1930 │ 74 │ 0.42 │ ▋ │
|
||||
│ 1940 │ 119 │ 0.59 │ █▏ │
|
||||
│ 1950 │ 134 │ 1.09 │ ██▏ │
|
||||
│ 1960 │ 272 │ 0.92 │ █▋ │
|
||||
│ 1970 │ 108 │ 1.18 │ ██▎ │
|
||||
│ 1980 │ 88 │ 2.82 │ █████▋ │
|
||||
│ 1990 │ 184 │ 3.68 │ ███████▎ │
|
||||
│ 2000 │ 21 │ 7.14 │ ██████████████▎ │
|
||||
│ 2010 │ 6 │ 18.42 │ ████████████████████████████████████▋ │
|
||||
└──────┴─────────┴──────────────────────┴───────────────────────────────────────┘
|
||||
|
||||
14 rows in set. Elapsed: 0.052 sec. Processed 1.33 million rows, 94.15 MB (25.48 million rows/s., 1.80 GB/s.)
|
||||
```
|
||||
|
||||
### Vodka:
|
||||
|
||||
```
|
||||
SELECT
|
||||
round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
|
||||
count(),
|
||||
round(avg(price), 2),
|
||||
bar(avg(price), 0, 50, 100)
|
||||
FROM menu_item_denorm
|
||||
WHERE (menu_currency IN ('Dollars', '')) AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%vodka%')
|
||||
GROUP BY d
|
||||
ORDER BY d ASC
|
||||
|
||||
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)─┐
|
||||
│ 1910 │ 2 │ 0 │ │
|
||||
│ 1920 │ 1 │ 0.3 │ ▌ │
|
||||
│ 1940 │ 21 │ 0.42 │ ▋ │
|
||||
│ 1950 │ 14 │ 0.59 │ █▏ │
|
||||
│ 1960 │ 113 │ 2.17 │ ████▎ │
|
||||
│ 1970 │ 37 │ 0.68 │ █▎ │
|
||||
│ 1980 │ 19 │ 2.55 │ █████ │
|
||||
│ 1990 │ 86 │ 3.6 │ ███████▏ │
|
||||
│ 2000 │ 2 │ 3.98 │ ███████▊ │
|
||||
└──────┴─────────┴──────────────────────┴─────────────────────────────┘
|
||||
```
|
||||
|
||||
To get vodka we have to write `ILIKE '%vodka%'` and this definitely makes a statement.
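`ILIKE` is simply a case-insensitive `LIKE`; an equivalent filter (a sketch, not needed for the result above) can be written with `positionCaseInsensitive`:

```
SELECT count()
FROM menu_item_denorm
WHERE positionCaseInsensitive(dish_name, 'vodka') > 0
```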
|
||||
|
||||
### Caviar:
|
||||
|
||||
Let's print caviar prices. Let's also print the name of a dish with caviar from each decade.
|
||||
|
||||
```
|
||||
SELECT
|
||||
round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
|
||||
count(),
|
||||
round(avg(price), 2),
|
||||
bar(avg(price), 0, 50, 100),
|
||||
any(dish_name)
|
||||
FROM menu_item_denorm
|
||||
WHERE (menu_currency IN ('Dollars', '')) AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%caviar%')
|
||||
GROUP BY d
|
||||
ORDER BY d ASC
|
||||
|
||||
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)──────┬─any(dish_name)──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ 1090 │ 1 │ 0 │ │ Caviar │
|
||||
│ 1880 │ 3 │ 0 │ │ Caviar │
|
||||
│ 1890 │ 39 │ 0.59 │ █▏ │ Butter and caviar │
|
||||
│ 1900 │ 1014 │ 0.34 │ ▋ │ Anchovy Caviar on Toast │
|
||||
│ 1910 │ 1588 │ 1.35 │ ██▋ │ 1/1 Brötchen Caviar │
|
||||
│ 1920 │ 927 │ 1.37 │ ██▋ │ ASTRAKAN CAVIAR │
|
||||
│ 1930 │ 289 │ 1.91 │ ███▋ │ Astrachan caviar │
|
||||
│ 1940 │ 201 │ 0.83 │ █▋ │ (SPECIAL) Domestic Caviar Sandwich │
|
||||
│ 1950 │ 81 │ 2.27 │ ████▌ │ Beluga Caviar │
|
||||
│ 1960 │ 126 │ 2.21 │ ████▍ │ Beluga Caviar │
|
||||
│ 1970 │ 105 │ 0.95 │ █▊ │ BELUGA MALOSSOL CAVIAR AMERICAN DRESSING │
|
||||
│ 1980 │ 12 │ 7.22 │ ██████████████▍ │ Authentic Iranian Beluga Caviar the world's finest black caviar presented in ice garni and a sampling of chilled 100° Russian vodka │
|
||||
│ 1990 │ 74 │ 14.42 │ ████████████████████████████▋ │ Avocado Salad, Fresh cut avocado with caviare │
|
||||
│ 2000 │ 3 │ 7.82 │ ███████████████▋ │ Aufgeschlagenes Kartoffelsueppchen mit Forellencaviar │
|
||||
│ 2010 │ 6 │ 15.58 │ ███████████████████████████████▏ │ "OYSTERS AND PEARLS" "Sabayon" of Pearl Tapioca with Island Creek Oysters and Russian Sevruga Caviar │
|
||||
└──────┴─────────┴──────────────────────┴──────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
At least they have caviar with vodka. Very nice.
|
||||
|
||||
### Test it in Playground
|
||||
|
||||
The data is uploaded to ClickHouse Playground, [example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUCiAgICByb3VuZCh0b1VJbnQzMk9yWmVybyhleHRyYWN0KG1lbnVfZGF0ZSwgJ15cXGR7NH0nKSksIC0xKSBBUyBkLAogICAgY291bnQoKSwKICAgIHJvdW5kKGF2ZyhwcmljZSksIDIpLAogICAgYmFyKGF2ZyhwcmljZSksIDAsIDUwLCAxMDApLAogICAgYW55KGRpc2hfbmFtZSkKRlJPTSBtZW51X2l0ZW1fZGVub3JtCldIRVJFIChtZW51X2N1cnJlbmN5IElOICgnRG9sbGFycycsICcnKSkgQU5EIChkID4gMCkgQU5EIChkIDwgMjAyMikgQU5EIChkaXNoX25hbWUgSUxJS0UgJyVjYXZpYXIlJykKR1JPVVAgQlkgZApPUkRFUiBCWSBkIEFTQw==).
|
@ -212,8 +212,6 @@ HAVING c >= 100
|
||||
ORDER BY price DESC
|
||||
LIMIT 100
|
||||
|
||||
Query id: df8c0a98-4713-4f0e-9690-5f73b52f7206
|
||||
|
||||
┌─town─────────────────┬─district───────────────┬────c─┬───price─┬─bar(round(avg(price)), 0, 5000000, 100)────────────────────────────┐
|
||||
│ LONDON │ CITY OF WESTMINSTER │ 3372 │ 3305225 │ ██████████████████████████████████████████████████████████████████ │
|
||||
│ LONDON │ CITY OF LONDON │ 257 │ 3294478 │ █████████████████████████████████████████████████████████████████▊ │
|
||||
@ -323,3 +321,261 @@ Query id: df8c0a98-4713-4f0e-9690-5f73b52f7206
|
||||
### Test it in Playground
|
||||
|
||||
The data is uploaded to ClickHouse Playground, [example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIHRvd24sIGRpc3RyaWN0LCBjb3VudCgpIEFTIGMsIHJvdW5kKGF2ZyhwcmljZSkpIEFTIHByaWNlLCBiYXIocHJpY2UsIDAsIDUwMDAwMDAsIDEwMCkgRlJPTSB1a19wcmljZV9wYWlkIFdIRVJFIGRhdGUgPj0gJzIwMjAtMDEtMDEnIEdST1VQIEJZIHRvd24sIGRpc3RyaWN0IEhBVklORyBjID49IDEwMCBPUkRFUiBCWSBwcmljZSBERVNDIExJTUlUIDEwMA==).
|
||||
|
||||
## Let's speed up queries using projections
|
||||
|
||||
[Projections](../../sql-reference/statements/alter/projection.md) allow improving query speed by storing pre-aggregated data.
|
||||
|
||||
### Build a projection
|
||||
|
||||
```
|
||||
-- create an aggregate projection by dimensions (toYear(date), district, town)
|
||||
|
||||
ALTER TABLE uk_price_paid
|
||||
ADD PROJECTION projection_by_year_district_town
|
||||
(
|
||||
SELECT
|
||||
toYear(date),
|
||||
district,
|
||||
town,
|
||||
avg(price),
|
||||
sum(price),
|
||||
count()
|
||||
GROUP BY
|
||||
toYear(date),
|
||||
district,
|
||||
town
|
||||
);
|
||||
|
||||
-- populate the projection for existing data (without this, the projection will be
-- created only for newly inserted data)
|
||||
|
||||
ALTER TABLE uk_price_paid
|
||||
MATERIALIZE PROJECTION projection_by_year_district_town
|
||||
SETTINGS mutations_sync = 1;
|
||||
```
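`MATERIALIZE PROJECTION` runs as a mutation, and `mutations_sync = 1` makes the statement wait for it. If you want to double-check afterwards, a sketch of a query against the standard `system.mutations` table (column list trimmed):

```
SELECT mutation_id, command, is_done
FROM system.mutations
WHERE table = 'uk_price_paid'
ORDER BY create_time DESC
LIMIT 5
```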
|
||||
|
||||
## Test performance
|
||||
|
||||
Let's run the same 3 queries.
|
||||
|
||||
```
|
||||
-- enable projections for selects
|
||||
set allow_experimental_projection_optimization=1;
|
||||
|
||||
-- Q1) Average price per year:
|
||||
|
||||
SELECT
|
||||
toYear(date) AS year,
|
||||
round(avg(price)) AS price,
|
||||
bar(price, 0, 1000000, 80)
|
||||
FROM uk_price_paid
|
||||
GROUP BY year
|
||||
ORDER BY year ASC;
|
||||
|
||||
┌─year─┬──price─┬─bar(round(avg(price)), 0, 1000000, 80)─┐
|
||||
│ 1995 │ 67932 │ █████▍ │
|
||||
│ 1996 │ 71505 │ █████▋ │
|
||||
│ 1997 │ 78532 │ ██████▎ │
|
||||
│ 1998 │ 85435 │ ██████▋ │
|
||||
│ 1999 │ 96036 │ ███████▋ │
|
||||
│ 2000 │ 107478 │ ████████▌ │
|
||||
│ 2001 │ 118886 │ █████████▌ │
|
||||
│ 2002 │ 137940 │ ███████████ │
|
||||
│ 2003 │ 155888 │ ████████████▍ │
|
||||
│ 2004 │ 178885 │ ██████████████▎ │
|
||||
│ 2005 │ 189350 │ ███████████████▏ │
|
||||
│ 2006 │ 203528 │ ████████████████▎ │
|
||||
│ 2007 │ 219377 │ █████████████████▌ │
|
||||
│ 2008 │ 217056 │ █████████████████▎ │
|
||||
│ 2009 │ 213419 │ █████████████████ │
|
||||
│ 2010 │ 236110 │ ██████████████████▊ │
|
||||
│ 2011 │ 232804 │ ██████████████████▌ │
|
||||
│ 2012 │ 238366 │ ███████████████████ │
|
||||
│ 2013 │ 256931 │ ████████████████████▌ │
|
||||
│ 2014 │ 279917 │ ██████████████████████▍ │
|
||||
│ 2015 │ 297264 │ ███████████████████████▋ │
|
||||
│ 2016 │ 313197 │ █████████████████████████ │
|
||||
│ 2017 │ 346070 │ ███████████████████████████▋ │
|
||||
│ 2018 │ 350117 │ ████████████████████████████ │
|
||||
│ 2019 │ 351010 │ ████████████████████████████ │
|
||||
│ 2020 │ 368974 │ █████████████████████████████▌ │
|
||||
│ 2021 │ 384351 │ ██████████████████████████████▋ │
|
||||
└──────┴────────┴────────────────────────────────────────┘
|
||||
|
||||
27 rows in set. Elapsed: 0.003 sec. Processed 106.87 thousand rows, 3.21 MB (31.92 million rows/s., 959.03 MB/s.)
|
||||
|
||||
-- Q2) Average price per year in London:
|
||||
|
||||
SELECT
|
||||
toYear(date) AS year,
|
||||
round(avg(price)) AS price,
|
||||
bar(price, 0, 2000000, 100)
|
||||
FROM uk_price_paid
|
||||
WHERE town = 'LONDON'
|
||||
GROUP BY year
|
||||
ORDER BY year ASC;
|
||||
|
||||
┌─year─┬───price─┬─bar(round(avg(price)), 0, 2000000, 100)───────────────┐
|
||||
│ 1995 │ 109112 │ █████▍ │
|
||||
│ 1996 │ 118667 │ █████▊ │
|
||||
│ 1997 │ 136518 │ ██████▋ │
|
||||
│ 1998 │ 152983 │ ███████▋ │
|
||||
│ 1999 │ 180633 │ █████████ │
|
||||
│ 2000 │ 215830 │ ██████████▋ │
|
||||
│ 2001 │ 232996 │ ███████████▋ │
|
||||
│ 2002 │ 263672 │ █████████████▏ │
|
||||
│ 2003 │ 278394 │ █████████████▊ │
|
||||
│ 2004 │ 304665 │ ███████████████▏ │
|
||||
│ 2005 │ 322875 │ ████████████████▏ │
|
||||
│ 2006 │ 356192 │ █████████████████▋ │
|
||||
│ 2007 │ 404055 │ ████████████████████▏ │
|
||||
│ 2008 │ 420741 │ █████████████████████ │
|
||||
│ 2009 │ 427754 │ █████████████████████▍ │
|
||||
│ 2010 │ 480306 │ ████████████████████████ │
|
||||
│ 2011 │ 496274 │ ████████████████████████▋ │
|
||||
│ 2012 │ 519441 │ █████████████████████████▊ │
|
||||
│ 2013 │ 616209 │ ██████████████████████████████▋ │
|
||||
│ 2014 │ 724144 │ ████████████████████████████████████▏ │
|
||||
│ 2015 │ 792112 │ ███████████████████████████████████████▌ │
|
||||
│ 2016 │ 843568 │ ██████████████████████████████████████████▏ │
|
||||
│ 2017 │ 982566 │ █████████████████████████████████████████████████▏ │
|
||||
│ 2018 │ 1016845 │ ██████████████████████████████████████████████████▋ │
|
||||
│ 2019 │ 1043277 │ ████████████████████████████████████████████████████▏ │
|
||||
│ 2020 │ 1003963 │ ██████████████████████████████████████████████████▏ │
|
||||
│ 2021 │ 940794 │ ███████████████████████████████████████████████ │
|
||||
└──────┴─────────┴───────────────────────────────────────────────────────┘
|
||||
|
||||
27 rows in set. Elapsed: 0.005 sec. Processed 106.87 thousand rows, 3.53 MB (23.49 million rows/s., 775.95 MB/s.)
|
||||
|
||||
-- Q3) The most expensive neighborhoods:
|
||||
-- the condition (date >= '2020-01-01') needs to be modified to match the projection dimension (toYear(date) >= 2020)
|
||||
|
||||
SELECT
|
||||
town,
|
||||
district,
|
||||
count() AS c,
|
||||
round(avg(price)) AS price,
|
||||
bar(price, 0, 5000000, 100)
|
||||
FROM uk_price_paid
|
||||
WHERE toYear(date) >= 2020
|
||||
GROUP BY
|
||||
town,
|
||||
district
|
||||
HAVING c >= 100
|
||||
ORDER BY price DESC
|
||||
LIMIT 100
|
||||
|
||||
┌─town─────────────────┬─district───────────────┬────c─┬───price─┬─bar(round(avg(price)), 0, 5000000, 100)────────────────────────────┐
|
||||
│ LONDON │ CITY OF WESTMINSTER │ 3372 │ 3305225 │ ██████████████████████████████████████████████████████████████████ │
|
||||
│ LONDON │ CITY OF LONDON │ 257 │ 3294478 │ █████████████████████████████████████████████████████████████████▊ │
|
||||
│ LONDON │ KENSINGTON AND CHELSEA │ 2367 │ 2342422 │ ██████████████████████████████████████████████▋ │
|
||||
│ LEATHERHEAD │ ELMBRIDGE │ 108 │ 1927143 │ ██████████████████████████████████████▌ │
|
||||
│ VIRGINIA WATER │ RUNNYMEDE │ 142 │ 1868819 │ █████████████████████████████████████▍ │
|
||||
│ LONDON │ CAMDEN │ 2815 │ 1736788 │ ██████████████████████████████████▋ │
|
||||
│ THORNTON HEATH │ CROYDON │ 521 │ 1733051 │ ██████████████████████████████████▋ │
|
||||
│ WINDLESHAM │ SURREY HEATH │ 103 │ 1717255 │ ██████████████████████████████████▎ │
|
||||
│ BARNET │ ENFIELD │ 115 │ 1503458 │ ██████████████████████████████ │
|
||||
│ OXFORD │ SOUTH OXFORDSHIRE │ 298 │ 1275200 │ █████████████████████████▌ │
|
||||
│ LONDON │ ISLINGTON │ 2458 │ 1274308 │ █████████████████████████▍ │
|
||||
│ COBHAM │ ELMBRIDGE │ 364 │ 1260005 │ █████████████████████████▏ │
|
||||
│ LONDON │ HOUNSLOW │ 618 │ 1215682 │ ████████████████████████▎ │
|
||||
│ ASCOT │ WINDSOR AND MAIDENHEAD │ 379 │ 1215146 │ ████████████████████████▎ │
|
||||
│ LONDON │ RICHMOND UPON THAMES │ 654 │ 1207551 │ ████████████████████████▏ │
|
||||
│ BEACONSFIELD │ BUCKINGHAMSHIRE │ 307 │ 1186220 │ ███████████████████████▋ │
|
||||
│ RICHMOND │ RICHMOND UPON THAMES │ 805 │ 1100420 │ ██████████████████████ │
|
||||
│ LONDON │ HAMMERSMITH AND FULHAM │ 2888 │ 1062959 │ █████████████████████▎ │
|
||||
│ WEYBRIDGE │ ELMBRIDGE │ 607 │ 1027161 │ ████████████████████▌ │
|
||||
│ RADLETT │ HERTSMERE │ 265 │ 1015896 │ ████████████████████▎ │
|
||||
│ SALCOMBE │ SOUTH HAMS │ 124 │ 1014393 │ ████████████████████▎ │
|
||||
│ BURFORD │ WEST OXFORDSHIRE │ 102 │ 993100 │ ███████████████████▋ │
|
||||
│ ESHER │ ELMBRIDGE │ 454 │ 969770 │ ███████████████████▍ │
|
||||
│ HINDHEAD │ WAVERLEY │ 128 │ 967786 │ ███████████████████▎ │
|
||||
│ BROCKENHURST │ NEW FOREST │ 121 │ 967046 │ ███████████████████▎ │
|
||||
│ LEATHERHEAD │ GUILDFORD │ 191 │ 964489 │ ███████████████████▎ │
|
||||
│ GERRARDS CROSS │ BUCKINGHAMSHIRE │ 376 │ 958555 │ ███████████████████▏ │
|
||||
│ EAST MOLESEY │ ELMBRIDGE │ 181 │ 943457 │ ██████████████████▋ │
|
||||
│ OLNEY │ MILTON KEYNES │ 220 │ 942892 │ ██████████████████▋ │
|
||||
│ CHALFONT ST GILES │ BUCKINGHAMSHIRE │ 135 │ 926950 │ ██████████████████▌ │
|
||||
│ HENLEY-ON-THAMES │ SOUTH OXFORDSHIRE │ 509 │ 905732 │ ██████████████████ │
|
||||
│ KINGSTON UPON THAMES │ KINGSTON UPON THAMES │ 889 │ 899689 │ █████████████████▊ │
|
||||
│ BELVEDERE │ BEXLEY │ 313 │ 895336 │ █████████████████▊ │
|
||||
│ CRANBROOK │ TUNBRIDGE WELLS │ 404 │ 888190 │ █████████████████▋ │
|
||||
│ LONDON │ EALING │ 2460 │ 865893 │ █████████████████▎ │
|
||||
│ MAIDENHEAD │ BUCKINGHAMSHIRE │ 114 │ 863814 │ █████████████████▎ │
|
||||
│ LONDON │ MERTON │ 1958 │ 857192 │ █████████████████▏ │
|
||||
│ GUILDFORD │ WAVERLEY │ 131 │ 854447 │ █████████████████ │
|
||||
│ LONDON │ HACKNEY │ 3088 │ 846571 │ ████████████████▊ │
|
||||
│ LYMM │ WARRINGTON │ 285 │ 839920 │ ████████████████▋ │
|
||||
│ HARPENDEN │ ST ALBANS │ 606 │ 836994 │ ████████████████▋ │
|
||||
│ LONDON │ WANDSWORTH │ 6113 │ 832292 │ ████████████████▋ │
|
||||
│ LONDON │ SOUTHWARK │ 3612 │ 831319 │ ████████████████▋ │
|
||||
│ BERKHAMSTED │ DACORUM │ 502 │ 830356 │ ████████████████▌ │
|
||||
│ KINGS LANGLEY │ DACORUM │ 137 │ 821358 │ ████████████████▍ │
|
||||
│ TONBRIDGE │ TUNBRIDGE WELLS │ 339 │ 806736 │ ████████████████▏ │
|
||||
│ EPSOM │ REIGATE AND BANSTEAD │ 157 │ 805903 │ ████████████████ │
|
||||
│ WOKING │ GUILDFORD │ 161 │ 803283 │ ████████████████ │
|
||||
│ STOCKBRIDGE │ TEST VALLEY │ 168 │ 801973 │ ████████████████ │
|
||||
│ TEDDINGTON │ RICHMOND UPON THAMES │ 539 │ 798591 │ ███████████████▊ │
|
||||
│ OXFORD │ VALE OF WHITE HORSE │ 329 │ 792907 │ ███████████████▋ │
|
||||
│ LONDON │ BARNET │ 3624 │ 789583 │ ███████████████▋ │
|
||||
│ TWICKENHAM │ RICHMOND UPON THAMES │ 1090 │ 787760 │ ███████████████▋ │
|
||||
│ LUTON │ CENTRAL BEDFORDSHIRE │ 196 │ 786051 │ ███████████████▋ │
|
||||
│ TONBRIDGE │ MAIDSTONE │ 277 │ 785746 │ ███████████████▋ │
|
||||
│ TOWCESTER │ WEST NORTHAMPTONSHIRE │ 186 │ 783532 │ ███████████████▋ │
|
||||
│ LONDON │ LAMBETH │ 4832 │ 783422 │ ███████████████▋ │
|
||||
│ LUTTERWORTH │ HARBOROUGH │ 515 │ 781775 │ ███████████████▋ │
|
||||
│ WOODSTOCK │ WEST OXFORDSHIRE │ 135 │ 777499 │ ███████████████▌ │
|
||||
│ ALRESFORD │ WINCHESTER │ 196 │ 775577 │ ███████████████▌ │
|
||||
│ LONDON │ NEWHAM │ 2942 │ 768551 │ ███████████████▎ │
|
||||
│ ALDERLEY EDGE │ CHESHIRE EAST │ 168 │ 768280 │ ███████████████▎ │
|
||||
│ MARLOW │ BUCKINGHAMSHIRE │ 301 │ 762784 │ ███████████████▎ │
|
||||
│ BILLINGSHURST │ CHICHESTER │ 134 │ 760920 │ ███████████████▏ │
|
||||
│ LONDON │ TOWER HAMLETS │ 4183 │ 759635 │ ███████████████▏ │
|
||||
│ MIDHURST │ CHICHESTER │ 245 │ 759101 │ ███████████████▏ │
|
||||
│ THAMES DITTON │ ELMBRIDGE │ 227 │ 753347 │ ███████████████ │
|
||||
│ POTTERS BAR │ WELWYN HATFIELD │ 163 │ 752926 │ ███████████████ │
|
||||
│ REIGATE │ REIGATE AND BANSTEAD │ 555 │ 740961 │ ██████████████▋ │
|
||||
│ TADWORTH │ REIGATE AND BANSTEAD │ 477 │ 738997 │ ██████████████▋ │
|
||||
│ SEVENOAKS │ SEVENOAKS │ 1074 │ 734658 │ ██████████████▋ │
|
||||
│ PETWORTH │ CHICHESTER │ 138 │ 732432 │ ██████████████▋ │
|
||||
│ BOURNE END │ BUCKINGHAMSHIRE │ 127 │ 730742 │ ██████████████▌ │
|
||||
│ PURLEY │ CROYDON │ 540 │ 727721 │ ██████████████▌ │
|
||||
│ OXTED │ TANDRIDGE │ 320 │ 726078 │ ██████████████▌ │
|
||||
│ LONDON │ HARINGEY │ 2988 │ 724573 │ ██████████████▍ │
|
||||
│ BANSTEAD │ REIGATE AND BANSTEAD │ 373 │ 713834 │ ██████████████▎ │
|
||||
│ PINNER │ HARROW │ 480 │ 712166 │ ██████████████▏ │
|
||||
│ MALMESBURY │ WILTSHIRE │ 293 │ 707747 │ ██████████████▏ │
|
||||
│ RICKMANSWORTH │ THREE RIVERS │ 732 │ 705400 │ ██████████████ │
|
||||
│ SLOUGH │ BUCKINGHAMSHIRE │ 359 │ 705002 │ ██████████████ │
|
||||
│ GREAT MISSENDEN │ BUCKINGHAMSHIRE │ 214 │ 704904 │ ██████████████ │
|
||||
│ READING │ SOUTH OXFORDSHIRE │ 295 │ 701697 │ ██████████████ │
|
||||
│ HYTHE │ FOLKESTONE AND HYTHE │ 457 │ 700334 │ ██████████████ │
|
||||
│ WELWYN │ WELWYN HATFIELD │ 217 │ 699649 │ █████████████▊ │
|
||||
│ CHIGWELL │ EPPING FOREST │ 242 │ 697869 │ █████████████▊ │
|
||||
│ BARNET │ BARNET │ 906 │ 695680 │ █████████████▊ │
|
||||
│ HASLEMERE │ CHICHESTER │ 120 │ 694028 │ █████████████▊ │
|
||||
│ LEATHERHEAD │ MOLE VALLEY │ 748 │ 692026 │ █████████████▋ │
|
||||
│ LONDON │ BRENT │ 1945 │ 690799 │ █████████████▋ │
|
||||
│ HASLEMERE │ WAVERLEY │ 258 │ 690765 │ █████████████▋ │
|
||||
│ NORTHWOOD │ HILLINGDON │ 252 │ 690753 │ █████████████▋ │
|
||||
│ WALTON-ON-THAMES │ ELMBRIDGE │ 871 │ 689431 │ █████████████▋ │
|
||||
│ INGATESTONE │ BRENTWOOD │ 150 │ 688345 │ █████████████▋ │
|
||||
│ OXFORD │ OXFORD │ 1761 │ 686114 │ █████████████▋ │
|
||||
│ CHISLEHURST │ BROMLEY │ 410 │ 682892 │ █████████████▋ │
|
||||
│ KINGS LANGLEY │ THREE RIVERS │ 109 │ 682320 │ █████████████▋ │
|
||||
│ ASHTEAD │ MOLE VALLEY │ 280 │ 680483 │ █████████████▌ │
|
||||
│ WOKING │ SURREY HEATH │ 269 │ 679035 │ █████████████▌ │
|
||||
│ ASCOT │ BRACKNELL FOREST │ 160 │ 678632 │ █████████████▌ │
|
||||
└──────────────────────┴────────────────────────┴──────┴─────────┴────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
100 rows in set. Elapsed: 0.005 sec. Processed 12.85 thousand rows, 813.40 KB (2.73 million rows/s., 172.95 MB/s.)
|
||||
```
|
||||
|
||||
All 3 queries work much faster and read fewer rows.
|
||||
|
||||
```
|
||||
Q1)
|
||||
no projection: 27 rows in set. Elapsed: 0.027 sec. Processed 26.25 million rows, 157.49 MB (955.96 million rows/s., 5.74 GB/s.)
|
||||
projection: 27 rows in set. Elapsed: 0.003 sec. Processed 106.87 thousand rows, 3.21 MB (31.92 million rows/s., 959.03 MB/s.)
|
||||
```
|
||||
|
@ -69,6 +69,7 @@ If no conditions met for a data part, ClickHouse uses the `lz4` compression.
|
||||
</compression>
|
||||
```
|
||||
|
||||
<!--
|
||||
## encryption {#server-settings-encryption}
|
||||
|
||||
Configures a command to obtain a key to be used by [encryption codecs](../../sql-reference/statements/create/table.md#create-query-encryption-codecs). The command, or a shell script, is expected to write a Base64-encoded key of any length to stdout.
|
||||
@ -90,7 +91,7 @@ For other systems:
|
||||
<key_command><![CDATA[IFS=; echo -n >/dev/tty "Enter the ClickHouse encryption passphrase: "; stty=`stty -F /dev/tty -g`; stty -F /dev/tty -echo; read k </dev/tty; stty -F /dev/tty "$stty"; echo -n $k | base64]]></key_command>
|
||||
</encryption>
|
||||
```
|
||||
|
||||
-->
|
||||
## custom_settings_prefixes {#custom_settings_prefixes}
|
||||
|
||||
List of prefixes for [custom settings](../../operations/settings/index.md#custom_settings). The prefixes must be separated with commas.
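For example, assuming the prefix `custom_` is listed here, a custom setting can be set for the session and read back with `getSetting` (a sketch):

``` sql
SET custom_a = 123;
SELECT getSetting('custom_a');
```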
|
||||
|
docs/en/operations/storing-data.md (new file, 14 lines)
@ -0,0 +1,14 @@
|
||||
---
|
||||
toc_priority: 68
|
||||
toc_title: External Disks for Storing Data
|
||||
---
|
||||
|
||||
# External Disks for Storing Data {#external-disks}
|
||||
|
||||
Data processed in ClickHouse is usually stored in the local file system of the machine that runs the ClickHouse server. That requires large-capacity disks, which can be quite expensive. To avoid that, you can store the data remotely: on [Amazon s3](https://aws.amazon.com/s3/) disks or in the Hadoop Distributed File System ([HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)).
|
||||
|
||||
To work with data stored on `Amazon s3` disks, use the [s3](../engines/table-engines/integrations/s3.md) table engine; to work with data in the Hadoop Distributed File System, use the [HDFS](../engines/table-engines/integrations/hdfs.md) table engine.
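For example, a table reading a CSV file from S3 through the table engine might look like this (a sketch; the bucket, path, and file format are placeholders):

``` sql
CREATE TABLE s3_example (name String, value UInt32)
    ENGINE = S3('https://my-bucket.s3.amazonaws.com/data/example.csv', 'CSV');

SELECT * FROM s3_example LIMIT 10;
```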
|
||||
|
||||
## Zero-copy Replication {#zero-copy}
|
||||
|
||||
ClickHouse supports zero-copy replication for `s3` and `HDFS` disks, which means that if the data is stored remotely on several machines and needs to be synchronized, then only the metadata is replicated (paths to the data parts), but not the data itself.
|
@ -125,6 +125,44 @@ Result:
|
||||
└───────────────────────────┘
|
||||
```
|
||||
|
||||
## subBitmap {#subBitmap}
|
||||
|
||||
Returns a subset of a bitmap, skipping the first `offset` elements and limiting the result to at most `cardinality_limit` elements.
|
||||
|
||||
**Syntax**
|
||||
|
||||
``` sql
|
||||
subBitmap(bitmap, offset, cardinality_limit)
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `bitmap` – [Bitmap object](#bitmap_functions-bitmapbuild).
|
||||
- `offset` – the number of leading elements to skip. Type: [UInt32](../../sql-reference/data-types/int-uint.md).
- `cardinality_limit` – the upper limit on the subset cardinality. Type: [UInt32](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Returned value**
|
||||
|
||||
The subset.
|
||||
|
||||
Type: `Bitmap object`.
|
||||
|
||||
**Example**
|
||||
|
||||
Query:
|
||||
|
||||
``` sql
|
||||
SELECT bitmapToArray(subBitmap(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(10), toUInt32(10))) AS res;
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
``` text
|
||||
┌─res─────────────────────────────┐
|
||||
│ [10,11,12,13,14,15,16,17,18,19] │
|
||||
└─────────────────────────────────┘
|
||||
```
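If `offset` is greater than or equal to the bitmap cardinality, the result is an empty bitmap, so `bitmapToArray` returns `[]` (a sketch; behaviour inferred from the description above):

``` sql
SELECT bitmapToArray(subBitmap(bitmapBuild([1, 2, 3]), toUInt32(5), toUInt32(10))) AS res;
```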
|
||||
|
||||
## bitmapContains {#bitmap_functions-bitmapcontains}
|
||||
|
||||
Checks whether the bitmap contains an element.
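A minimal usage sketch (the values are arbitrary; `bitmapContains` returns 1 when the element is present, 0 otherwise):

``` sql
SELECT bitmapContains(bitmapBuild([1, 5, 7, 9]), toUInt32(9)) AS res;
```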
|
||||
|
@ -254,6 +254,7 @@ CREATE TABLE codec_example
|
||||
ENGINE = MergeTree()
|
||||
```
|
||||
|
||||
<!--
|
||||
### Encryption Codecs {#create-query-encryption-codecs}
|
||||
|
||||
These codecs don't actually compress data, but instead encrypt data on disk. These are only available when an encryption key is specified by [encryption](../../../operations/server-configuration-parameters/settings.md#server-settings-encryption) settings. Note that encryption only makes sense at the end of codec pipelines, because encrypted data usually can't be compressed in any meaningful way.
|
||||
@ -267,7 +268,7 @@ Encryption codecs:
|
||||
|
||||
!!! attention "Attention"
|
||||
If you perform a SELECT query mentioning a specific value in an encrypted column (such as in its WHERE clause), the value may appear in [system.query_log](../../../operations/system-tables/query_log.md). You may want to disable the logging.
|
||||
|
||||
-->
|
||||
## Temporary Tables {#temporary-tables}
|
||||
|
||||
ClickHouse supports temporary tables which have the following characteristics:
|
||||
|
@ -241,7 +241,7 @@ sudo ./llvm.sh 12
|
||||
|
||||
В качестве простых редакторов кода можно использовать Sublime Text или Visual Studio Code или Kate (все варианты доступны под Linux).
|
||||
|
||||
На всякий случай заметим, что CLion самостоятельно создаёт свою build директорию, самостоятельно выбирает тип сборки debug по-умолчанию, для конфигурации использует встроенную в CLion версию CMake вместо установленного вами, а для запуска задач использует make вместо ninja. Это нормально, просто имейте это ввиду, чтобы не возникало путаницы.
|
||||
На всякий случай заметим, что CLion самостоятельно создаёт свою build директорию, самостоятельно выбирает тип сборки debug по-умолчанию, для конфигурации использует встроенную в CLion версию CMake вместо установленного вами, а для запуска задач использует make вместо ninja (но при желании начиная с версии CLion 2019.3 EAP можно настроить использование ninja, см. подробнее [тут](https://blog.jetbrains.com/clion/2019/10/clion-2019-3-eap-ninja-cmake-generators/)). Это нормально, просто имейте это ввиду, чтобы не возникало путаницы.
|
||||
|
||||
## Написание кода {#napisanie-koda}
|
||||
|
||||
|
@ -7,7 +7,7 @@ toc_title: HDFS
|
||||
|
||||
Управляет данными в HDFS. Данный движок похож на движки [File](../special/file.md#table_engines-file) и [URL](../special/url.md#table_engines-url).
|
||||
|
||||
## Использование движка {#ispolzovanie-dvizhka}
|
||||
## Использование движка {#usage}
|
||||
|
||||
``` sql
|
||||
ENGINE = HDFS(URI, format)
|
||||
@ -44,13 +44,13 @@ SELECT * FROM hdfs_engine_table LIMIT 2
|
||||
└──────┴───────┘
|
||||
```
|
||||
|
||||
## Детали реализации {#detali-realizatsii}
|
||||
## Детали реализации {#implementation-details}
|
||||
|
||||
- Поддерживается многопоточное чтение и запись.
|
||||
- Поддерживается репликация без копирования данных ([zero-copy](../../../operations/storing-data.md#zero-copy)).
|
||||
- Не поддерживается:
|
||||
- использование операций `ALTER` и `SELECT...SAMPLE`;
|
||||
- индексы;
|
||||
- репликация.
|
||||
- индексы.
|
||||
|
||||
**Шаблоны в пути**
|
||||
|
||||
@ -67,12 +67,12 @@ SELECT * FROM hdfs_engine_table LIMIT 2
|
||||
|
||||
1. Предположим, у нас есть несколько файлов со следующими URI в HDFS:
|
||||
|
||||
- ‘hdfs://hdfs1:9000/some_dir/some_file_1’
|
||||
- ‘hdfs://hdfs1:9000/some_dir/some_file_2’
|
||||
- ‘hdfs://hdfs1:9000/some_dir/some_file_3’
|
||||
- ‘hdfs://hdfs1:9000/another_dir/some_file_1’
|
||||
- ‘hdfs://hdfs1:9000/another_dir/some_file_2’
|
||||
- ‘hdfs://hdfs1:9000/another_dir/some_file_3’
|
||||
- 'hdfs://hdfs1:9000/some_dir/some_file_1'
|
||||
- 'hdfs://hdfs1:9000/some_dir/some_file_2'
|
||||
- 'hdfs://hdfs1:9000/some_dir/some_file_3'
|
||||
- 'hdfs://hdfs1:9000/another_dir/some_file_1'
|
||||
- 'hdfs://hdfs1:9000/another_dir/some_file_2'
|
||||
- 'hdfs://hdfs1:9000/another_dir/some_file_3'
|
||||
|
||||
1. Есть несколько возможностей создать таблицу, состояющую из этих шести файлов:
|
||||
|
||||
@ -122,8 +122,9 @@ CREATE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9
|
||||
</hdfs_root>
|
||||
```
|
||||
|
||||
### Список возможных опций конфигурации со значениями по умолчанию
|
||||
#### Поддерживаемые из libhdfs3
|
||||
### Параметры конфигурации {#configuration-options}
|
||||
|
||||
#### Поддерживаемые из libhdfs3 {#supported-by-libhdfs3}
|
||||
|
||||
|
||||
| **параметр** | **по умолчанию** |
|
||||
@ -180,7 +181,7 @@ CREATE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9
|
||||
|hadoop\_kerberos\_principal | "" |
|
||||
|hadoop\_kerberos\_kinit\_command | kinit |
|
||||
|
||||
#### Ограничения {#limitations}
|
||||
### Ограничения {#limitations}
|
||||
* hadoop\_security\_kerberos\_ticket\_cache\_path могут быть определены только на глобальном уровне
|
||||
|
||||
## Поддержка Kerberos {#kerberos-support}
|
||||
@ -193,7 +194,7 @@ CREATE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9
|
||||
|
||||
Если hadoop\_kerberos\_keytab, hadoop\_kerberos\_principal или hadoop\_kerberos\_kinit\_command указаны в настройках, kinit будет вызван. hadoop\_kerberos\_keytab и hadoop\_kerberos\_principal обязательны в этом случае. Необходимо также будет установить kinit и файлы конфигурации krb5.
|
||||
|
||||
## Виртуальные столбцы {#virtualnye-stolbtsy}
|
||||
## Виртуальные столбцы {#virtual-columns}
|
||||
|
||||
- `_path` — Путь к файлу.
|
||||
- `_file` — Имя файла.
|
||||
@ -201,4 +202,3 @@ CREATE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9
|
||||
**См. также**
|
||||
|
||||
- [Виртуальные колонки](../../../engines/table-engines/index.md#table_engines-virtual_columns)
|
||||
|
||||
|
@ -47,10 +47,10 @@ SELECT * FROM s3_engine_table LIMIT 2;
|
||||
## Детали реализации {#implementation-details}
|
||||
|
||||
- Чтение и запись могут быть параллельными.
|
||||
- Поддерживается репликация без копирования данных ([zero-copy](../../../operations/storing-data.md#zero-copy)).
|
||||
- Не поддерживаются:
|
||||
- запросы `ALTER` и `SELECT...SAMPLE`,
|
||||
- индексы,
|
||||
- репликация.
|
||||
- индексы.
|
||||
|
||||
## Символы подстановки {#wildcards-in-path}
|
||||
|
||||
@ -72,7 +72,7 @@ SELECT * FROM s3_engine_table LIMIT 2;
|
||||
- `s3_max_redirects` — максимальное количество разрешенных переадресаций S3. Значение по умолчанию — `10`.
|
||||
- `s3_single_read_retries` — максимальное количество попыток запроса при единичном чтении. Значение по умолчанию — `4`.
|
||||
|
||||
Соображение безопасности: если злонамеренный пользователь попробует указать произвольные URL-адреса S3, параметр `s3_max_redirects` должен быть установлен в ноль, чтобы избежать атак [SSRF] (https://en.wikipedia.org/wiki/Server-side_request_forgery). Как альтернатива, в конфигурации сервера должен быть указан `remote_host_filter`.
|
||||
Соображение безопасности: если злонамеренный пользователь попробует указать произвольные URL-адреса S3, параметр `s3_max_redirects` должен быть установлен в ноль, чтобы избежать атак [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery). Как альтернатива, в конфигурации сервера должен быть указан `remote_host_filter`.
|
||||
|
||||
## Настройки точки приема запроса {#endpoint-settings}
|
||||
|
||||
|
docs/ru/operations/storing-data.md (new file, 14 lines)
@ -0,0 +1,14 @@
|
||||
---
|
||||
toc_priority: 68
|
||||
toc_title: "Хранение данных на внешних дисках"
|
||||
---
|
||||
|
||||
# Хранение данных на внешних дисках {#external-disks}
|
||||
|
||||
Данные, которые обрабатываются в ClickHouse, обычно хранятся в файловой системе локально, где развернут сервер ClickHouse. При этом для хранения данных требуются диски большого объема, которые могут быть довольно дорогостоящими. Решением проблемы может стать хранение данных отдельно от сервера — в распределенных файловых системах — [Amazon s3](https://aws.amazon.com/s3/) или Hadoop ([HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)).
|
||||
|
||||
Для работы с данными, хранящимися в файловой системе `Amazon s3`, используйте движок [s3](../engines/table-engines/integrations/s3.md), а для работы с данными в файловой системе Hadoop — движок [HDFS](../engines/table-engines/integrations/hdfs.md).
|
||||
|
||||
## Репликация без копирования данных {#zero-copy}
|
||||
|
||||
Для дисков `s3` и `HDFS` в ClickHouse поддерживается репликация без копирования данных (zero-copy): если данные хранятся на нескольких репликах, то при синхронизации пересылаются только метаданные (пути к кускам данных), а сами данные не копируются.
|
@ -7,7 +7,7 @@ toc_title: "Условные функции"
|
||||
|
||||
## if {#if}
|
||||
|
||||
Условное выражение. В отличии от большинства систем, ClickHouse всегда считает оба выражения `then` и `else`.
|
||||
Условное выражение. В отличие от большинства систем, ClickHouse всегда считает оба выражения `then` и `else`.
|
||||
|
||||
**Синтаксис**
|
||||
|
||||
|
@ -88,6 +88,30 @@ SELECT bitmapToArray(bitmapSubsetLimit(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12
|
||||
│ [30,31,32,33,100,200,500] │
|
||||
└───────────────────────────┘
|
||||
|
||||
## subBitmap {#subBitmap}
|
||||
|
||||
将位图跳过`offset`个元素,限制大小为`limit`个的结果转换为另一个位图。
|
||||
|
||||
subBitmap(bitmap, offset, limit)
|
||||
|
||||
**参数**
|
||||
|
||||
- `bitmap` – 位图对象.
|
||||
- `offset` – 跳过多少个元素.
|
||||
- `limit` – 子位图基数上限.
|
||||
|
||||
**示例**
|
||||
|
||||
``` sql
|
||||
SELECT bitmapToArray(subBitmap(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(10), toUInt32(10))) AS res
|
||||
```
|
||||
|
||||
```text
|
||||
┌─res─────────────────────────────┐
|
||||
│ [10,11,12,13,14,15,16,17,18,19] │
|
||||
└─────────────────────────────────┘
|
||||
```
|
||||
|
||||
## bitmapContains {#bitmapcontains}
|
||||
|
||||
检查位图是否包含指定元素。
|
||||
|
@ -1,3 +1,6 @@
|
||||
#include <string>
|
||||
#include "Common/MemoryTracker.h"
|
||||
#include "Columns/ColumnsNumber.h"
|
||||
#include "ConnectionParameters.h"
|
||||
#include "QueryFuzzer.h"
|
||||
#include "Suggest.h"
|
||||
@ -100,6 +103,14 @@
|
||||
#pragma GCC optimize("-fno-var-tracking-assignments")
|
||||
#endif
|
||||
|
||||
namespace CurrentMetrics
|
||||
{
|
||||
extern const Metric Revision;
|
||||
extern const Metric VersionInteger;
|
||||
extern const Metric MemoryTracking;
|
||||
extern const Metric MaxDDLEntryID;
|
||||
}
|
||||
|
||||
namespace fs = std::filesystem;
|
||||
|
||||
namespace DB
|
||||
@ -524,6 +535,18 @@ private:
|
||||
{
|
||||
UseSSL use_ssl;
|
||||
|
||||
MainThreadStatus::getInstance();
|
||||
|
||||
/// Limit on total memory usage
|
||||
size_t max_client_memory_usage = config().getInt64("max_memory_usage_in_client", 0 /*default value*/);
|
||||
|
||||
if (max_client_memory_usage != 0)
|
||||
{
|
||||
total_memory_tracker.setHardLimit(max_client_memory_usage);
|
||||
total_memory_tracker.setDescription("(total)");
|
||||
total_memory_tracker.setMetric(CurrentMetrics::MemoryTracking);
|
||||
}
|
||||
|
||||
registerFormats();
|
||||
registerFunctions();
|
||||
registerAggregateFunctions();
|
||||
@ -2581,6 +2604,7 @@ public:
|
||||
("opentelemetry-tracestate", po::value<std::string>(), "OpenTelemetry tracestate header as described by W3C Trace Context recommendation")
|
||||
("history_file", po::value<std::string>(), "path to history file")
|
||||
("no-warnings", "disable warnings when client connects to server")
|
||||
("max_memory_usage_in_client", po::value<int>(), "sets memory limit in client")
|
||||
;
|
||||
|
||||
Settings cmd_settings;
|
||||
|
@ -7,7 +7,6 @@
|
||||
#include <pcg-random/pcg_random.hpp>
|
||||
|
||||
#include <Common/randomSeed.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <Core/Field.h>
|
||||
#include <Parsers/IAST.h>
|
||||
|
||||
|
@ -126,6 +126,7 @@ namespace CurrentMetrics
|
||||
extern const Metric VersionInteger;
|
||||
extern const Metric MemoryTracking;
|
||||
extern const Metric MaxDDLEntryID;
|
||||
extern const Metric MaxPushedDDLEntryID;
|
||||
}
|
||||
|
||||
namespace fs = std::filesystem;
|
||||
@ -1468,7 +1469,8 @@ if (ThreadFuzzer::instance().isEffective())
|
||||
if (pool_size < 1)
|
||||
throw Exception("distributed_ddl.pool_size should be greater then 0", ErrorCodes::ARGUMENT_OUT_OF_BOUND);
|
||||
global_context->setDDLWorker(std::make_unique<DDLWorker>(pool_size, ddl_zookeeper_path, global_context, &config(),
|
||||
"distributed_ddl", "DDLWorker", &CurrentMetrics::MaxDDLEntryID));
|
||||
"distributed_ddl", "DDLWorker",
|
||||
&CurrentMetrics::MaxDDLEntryID, &CurrentMetrics::MaxPushedDDLEntryID));
|
||||
}
|
||||
|
||||
for (auto & server : *servers)
|
||||
|
@ -68,7 +68,7 @@
|
||||
html, body
|
||||
{
|
||||
/* Personal choice. */
|
||||
font-family: Sans-Serif;
|
||||
font-family: Liberation Sans, DejaVu Sans, sans-serif, Noto Color Emoji, Apple Color Emoji, Segoe UI Emoji;
|
||||
background: var(--background-color);
|
||||
color: var(--text-color);
|
||||
}
|
||||
@ -96,11 +96,16 @@
|
||||
.monospace
|
||||
{
|
||||
/* Prefer fonts that have full hinting info. This is important for non-retina displays.
|
||||
Also I personally dislike "Ubuntu" font due to the similarity of 'r' and 'г' (it looks very ignorant).
|
||||
*/
|
||||
Also I personally dislike "Ubuntu" font due to the similarity of 'r' and 'г' (it looks very ignorant). */
|
||||
font-family: Liberation Mono, DejaVu Sans Mono, MonoLisa, Consolas, Monospace;
|
||||
}
|
||||
|
||||
.monospace-table
|
||||
{
|
||||
/* Liberation is worse than DejaVu for block drawing characters. */
|
||||
font-family: DejaVu Sans Mono, Liberation Mono, MonoLisa, Consolas, Monospace;
|
||||
}
|
||||
|
||||
.shadow
|
||||
{
|
||||
box-shadow: 0 0 1rem var(--shadow-color);
|
||||
@ -325,8 +330,8 @@
|
||||
<span id="toggle-dark">🌑</span><span id="toggle-light">🌞</span>
|
||||
</div>
|
||||
<div id="data_div">
|
||||
<table class="monospace shadow" id="data-table"></table>
|
||||
<pre class="monospace shadow" id="data-unparsed"></pre>
|
||||
<table class="monospace-table shadow" id="data-table"></table>
|
||||
<pre class="monospace-table shadow" id="data-unparsed"></pre>
|
||||
</div>
|
||||
<svg id="graph" fill="none"></svg>
|
||||
<p id="error" class="monospace shadow">
|
||||
@ -367,7 +372,7 @@
|
||||
const server_address = document.getElementById('url').value;
|
||||
|
||||
const url = server_address +
|
||||
(server_address.indexOf('?') >= 0 ? '&' : '?') +
|
||||
(server_address.indexOf('?') >= 0 ? '&' : '?') +
|
||||
/// Ask server to allow cross-domain requests.
|
||||
'add_http_cors_header=1' +
|
||||
'&user=' + encodeURIComponent(user) +
|
||||
|
@ -455,7 +455,7 @@ UUID IAccessStorage::login(
|
||||
if (!replace_exception_with_cannot_authenticate)
|
||||
throw;
|
||||
|
||||
tryLogCurrentException(getLogger(), credentials.getUserName() + ": Authentication failed");
|
||||
tryLogCurrentException(getLogger(), "from: " + address.toString() + ", user: " + credentials.getUserName() + ": Authentication failed");
|
||||
throwCannotAuthenticate(credentials.getUserName());
|
||||
}
|
||||
}
|
||||
|
@ -579,6 +579,37 @@ public:
|
||||
}
|
||||
}
|
||||
|
||||
UInt64 rb_offset_limit(UInt64 offset, UInt64 limit, RoaringBitmapWithSmallSet & r1) const
{
    /// Nothing to copy if the limit is zero or the offset is past the end.
    if (limit == 0 || offset >= size())
        return 0;

    if (isSmall())
    {
        UInt64 count = 0;
        UInt64 offset_count = 0;
        auto it = small.begin();

        /// Skip the first `offset` elements.
        for (; it != small.end() && offset_count < offset; ++it)
            ++offset_count;

        /// Copy up to `limit` of the remaining elements into the result bitmap.
        for (; it != small.end() && count < limit; ++it, ++count)
            r1.add(it->getValue());
        return count;
    }
    else
    {
        UInt64 count = 0;
        UInt64 offset_count = 0;
        auto it = rb->begin();

        /// Skip the first `offset` elements.
        for (; it != rb->end() && offset_count < offset; ++it)
            ++offset_count;

        /// Copy up to `limit` of the remaining elements into the result bitmap.
        for (; it != rb->end() && count < limit; ++it, ++count)
            r1.add(*it);
        return count;
    }
}
|
||||
|
||||
UInt64 rb_min() const
|
||||
{
|
||||
if (isSmall())
|
||||
|
@ -299,10 +299,11 @@ target_link_libraries(clickhouse_common_io
|
||||
${ZLIB_LIBRARIES}
|
||||
pcg_random
|
||||
Poco::Foundation
|
||||
roaring
|
||||
)
|
||||
|
||||
|
||||
# Make dbms depend on roaring instead of clickhouse_common_io so that roaring itself can depend on clickhouse_common_io
|
||||
# That way we we can redirect malloc/free functions avoiding circular dependencies
|
||||
dbms_target_link_libraries(PUBLIC roaring)
|
||||
|
||||
if (USE_RDKAFKA)
|
||||
dbms_target_link_libraries(PRIVATE ${CPPKAFKA_LIBRARY} ${RDKAFKA_LIBRARY})
|
||||
|
@ -192,6 +192,7 @@ public:
|
||||
* So LC(Nullable(T)) would return true, LC(U) -- false.
|
||||
*/
|
||||
bool nestedIsNullable() const { return isColumnNullable(*dictionary.getColumnUnique().getNestedColumn()); }
|
||||
bool nestedCanBeInsideNullable() const { return dictionary.getColumnUnique().getNestedColumn()->canBeInsideNullable(); }
|
||||
void nestedToNullable() { dictionary.getColumnUnique().nestedToNullable(); }
|
||||
void nestedRemoveNullable() { dictionary.getColumnUnique().nestedRemoveNullable(); }
|
||||
|
||||
|
@ -60,6 +60,7 @@
|
||||
M(BrokenDistributedFilesToInsert, "Number of files for asynchronous insertion into Distributed tables that have been marked as broken. This metric starts from 0 on server start. The number of files for every shard is summed.") \
|
||||
M(TablesToDropQueueSize, "Number of dropped tables, that are waiting for background data removal.") \
|
||||
M(MaxDDLEntryID, "Max processed DDL entry of DDLWorker.") \
|
||||
M(MaxPushedDDLEntryID, "Max DDL entry of DDLWorker that pushed to zookeeper.") \
|
||||
M(PartsTemporary, "The part is generating now, it is not in data_parts list.") \
|
||||
M(PartsPreCommitted, "The part is in data_parts, but not used for SELECTs.") \
|
||||
M(PartsCommitted, "Active data part, used by current and upcoming SELECTs.") \
|
||||
|
@ -183,9 +183,6 @@ void MemoryTracker::allocImpl(Int64 size, bool throw_if_memory_exceeded)
|
||||
std::bernoulli_distribution fault(fault_probability);
|
||||
if (unlikely(fault_probability && fault(thread_local_rng)) && memoryTrackerCanThrow(level, true) && throw_if_memory_exceeded)
|
||||
{
|
||||
ProfileEvents::increment(ProfileEvents::QueryMemoryLimitExceeded);
|
||||
amount.fetch_sub(size, std::memory_order_relaxed);
|
||||
|
||||
/// Prevent recursion. Exception::ctor -> std::string -> new[] -> MemoryTracker::alloc
|
||||
BlockerInThread untrack_lock(VariableContext::Global);
|
||||
|
||||
|
@ -1,6 +1,5 @@
|
||||
#include <Coordination/KeeperStorageDispatcher.h>
|
||||
#include <Common/setThreadName.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <Common/ZooKeeper/KeeperException.h>
|
||||
#include <future>
|
||||
#include <chrono>
|
||||
|
@ -44,6 +44,13 @@ void Block::initializeIndexByName()
|
||||
}
|
||||
|
||||
|
||||
void Block::reserve(size_t count)
|
||||
{
|
||||
index_by_name.reserve(count);
|
||||
data.reserve(count);
|
||||
}
|
||||
|
||||
|
||||
void Block::insert(size_t position, ColumnWithTypeAndName elem)
|
||||
{
|
||||
if (position > data.size())
|
||||
@ -287,6 +294,7 @@ std::string Block::dumpIndex() const
|
||||
Block Block::cloneEmpty() const
|
||||
{
|
||||
Block res;
|
||||
res.reserve(data.size());
|
||||
|
||||
for (const auto & elem : data)
|
||||
res.insert(elem.cloneEmpty());
|
||||
@ -364,6 +372,8 @@ Block Block::cloneWithColumns(MutableColumns && columns) const
|
||||
Block res;
|
||||
|
||||
size_t num_columns = data.size();
|
||||
res.reserve(num_columns);
|
||||
|
||||
for (size_t i = 0; i < num_columns; ++i)
|
||||
res.insert({ std::move(columns[i]), data[i].type, data[i].name });
|
||||
|
||||
@ -381,6 +391,8 @@ Block Block::cloneWithColumns(const Columns & columns) const
|
||||
throw Exception("Cannot clone block with columns because block has " + toString(num_columns) + " columns, "
|
||||
"but " + toString(columns.size()) + " columns given.", ErrorCodes::LOGICAL_ERROR);
|
||||
|
||||
res.reserve(num_columns);
|
||||
|
||||
for (size_t i = 0; i < num_columns; ++i)
|
||||
res.insert({ columns[i], data[i].type, data[i].name });
|
||||
|
||||
@ -393,6 +405,8 @@ Block Block::cloneWithoutColumns() const
|
||||
Block res;
|
||||
|
||||
size_t num_columns = data.size();
|
||||
res.reserve(num_columns);
|
||||
|
||||
for (size_t i = 0; i < num_columns; ++i)
|
||||
res.insert({ nullptr, data[i].type, data[i].name });
|
||||
|
||||
|
@ -152,6 +152,7 @@ public:
|
||||
private:
|
||||
void eraseImpl(size_t position);
|
||||
void initializeIndexByName();
|
||||
void reserve(size_t count);
|
||||
|
||||
/// This is needed to allow function execution over data.
|
||||
/// It is safe because functions do not change column names, so the index is unaffected.
|
||||
|
@ -31,6 +31,10 @@ SRCS(
|
||||
MySQL/PacketsProtocolText.cpp
|
||||
MySQL/PacketsReplication.cpp
|
||||
NamesAndTypes.cpp
|
||||
PostgreSQL/Connection.cpp
|
||||
PostgreSQL/PoolWithFailover.cpp
|
||||
PostgreSQL/Utils.cpp
|
||||
PostgreSQL/insertPostgreSQLValue.cpp
|
||||
PostgreSQLProtocol.cpp
|
||||
QueryProcessingStage.cpp
|
||||
Settings.cpp
|
||||
|
@ -3,7 +3,8 @@
|
||||
#include <Poco/Timespan.h>
|
||||
#include <common/types.h>
|
||||
#include <DataStreams/SizeLimits.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
|
||||
class Stopwatch;
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
@ -73,7 +73,7 @@ void PostgreSQLSource<T>::init(const Block & sample_block)
|
||||
template<typename T>
|
||||
void PostgreSQLSource<T>::onStart()
|
||||
{
|
||||
if (connection_holder)
|
||||
if (!tx)
|
||||
tx = std::make_shared<T>(connection_holder->get());
|
||||
|
||||
stream = std::make_unique<pqxx::stream_from>(*tx, pqxx::from_query, std::string_view(query_str));
|
||||
|
@ -76,19 +76,6 @@ public:
|
||||
const Block & sample_block_,
|
||||
const UInt64 max_block_size_)
|
||||
: PostgreSQLSource<T>(tx_, query_str_, sample_block_, max_block_size_, false) {}
|
||||
|
||||
Chunk generate() override
|
||||
{
|
||||
if (!is_initialized)
|
||||
{
|
||||
Base::stream = std::make_unique<pqxx::stream_from>(*Base::tx, pqxx::from_query, std::string_view(Base::query_str));
|
||||
is_initialized = true;
|
||||
}
|
||||
|
||||
return Base::generate();
|
||||
}
|
||||
|
||||
bool is_initialized = false;
|
||||
};
|
||||
|
||||
}
|
||||
|
@ -49,6 +49,7 @@ SRCS(
|
||||
TTLUpdateInfoAlgorithm.cpp
|
||||
copyData.cpp
|
||||
finalizeBlock.cpp
|
||||
formatBlock.cpp
|
||||
materializeBlock.cpp
|
||||
narrowBlockInputStreams.cpp
|
||||
|
||||
|
@ -8,7 +8,6 @@
|
||||
#include <Interpreters/executeQuery.h>
|
||||
#include <Parsers/queryToString.h>
|
||||
#include <Common/Exception.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <Common/ZooKeeper/KeeperException.h>
|
||||
#include <Common/ZooKeeper/Types.h>
|
||||
#include <Common/ZooKeeper/ZooKeeper.h>
|
||||
|
@ -10,7 +10,7 @@
|
||||
#include <Common/ProfileEvents.h>
|
||||
#include <Common/ProfilingScopedRWLock.h>
|
||||
|
||||
#include <Dictionaries/DictionaryBlockInputStream.h>
|
||||
#include <Dictionaries//DictionarySource.h>
|
||||
#include <Dictionaries/HierarchyDictionariesUtils.h>
|
||||
|
||||
#include <Processors/Executors/PullingPipelineExecutor.h>
|
||||
@ -18,21 +18,21 @@
|
||||
|
||||
namespace ProfileEvents
|
||||
{
|
||||
extern const Event DictCacheKeysRequested;
|
||||
extern const Event DictCacheKeysRequestedMiss;
|
||||
extern const Event DictCacheKeysRequestedFound;
|
||||
extern const Event DictCacheKeysExpired;
|
||||
extern const Event DictCacheKeysNotFound;
|
||||
extern const Event DictCacheKeysHit;
|
||||
extern const Event DictCacheRequestTimeNs;
|
||||
extern const Event DictCacheRequests;
|
||||
extern const Event DictCacheLockWriteNs;
|
||||
extern const Event DictCacheLockReadNs;
|
||||
extern const Event DictCacheKeysRequested;
|
||||
extern const Event DictCacheKeysRequestedMiss;
|
||||
extern const Event DictCacheKeysRequestedFound;
|
||||
extern const Event DictCacheKeysExpired;
|
||||
extern const Event DictCacheKeysNotFound;
|
||||
extern const Event DictCacheKeysHit;
|
||||
extern const Event DictCacheRequestTimeNs;
|
||||
extern const Event DictCacheRequests;
|
||||
extern const Event DictCacheLockWriteNs;
|
||||
extern const Event DictCacheLockReadNs;
|
||||
}
|
||||
|
||||
namespace CurrentMetrics
|
||||
{
|
||||
extern const Metric DictCacheRequests;
|
||||
extern const Metric DictCacheRequests;
|
||||
}
|
||||
|
||||
namespace DB
|
||||
|
@ -648,6 +648,16 @@ static const PaddedPODArray<T> & getColumnVectorData(
|
||||
}
|
||||
}
|
||||
|
||||
template <typename T>
|
||||
static ColumnPtr getColumnFromPODArray(const PaddedPODArray<T> & array)
|
||||
{
|
||||
auto column_vector = ColumnVector<T>::create();
|
||||
column_vector->getData().reserve(array.size());
|
||||
column_vector->getData().insert(array.begin(), array.end());
|
||||
|
||||
return column_vector;
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
@ -1,4 +1,5 @@
|
||||
#include "DictionaryBlockInputStream.h"
|
||||
#include "DictionarySource.h"
|
||||
#include <Dictionaries/DictionaryHelpers.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
@ -12,7 +13,7 @@ DictionarySourceData::DictionarySourceData(
|
||||
std::shared_ptr<const IDictionary> dictionary_, PaddedPODArray<UInt64> && ids_, const Names & column_names_)
|
||||
: num_rows(ids_.size())
|
||||
, dictionary(dictionary_)
|
||||
, column_names(column_names_)
|
||||
, column_names(column_names_.begin(), column_names_.end())
|
||||
, ids(std::move(ids_))
|
||||
, key_type(DictionaryInputStreamKeyType::Id)
|
||||
{
|
||||
@ -24,7 +25,7 @@ DictionarySourceData::DictionarySourceData(
|
||||
const Names & column_names_)
|
||||
: num_rows(keys.size())
|
||||
, dictionary(dictionary_)
|
||||
, column_names(column_names_)
|
||||
, column_names(column_names_.begin(), column_names_.end())
|
||||
, key_type(DictionaryInputStreamKeyType::ComplexKey)
|
||||
{
|
||||
const DictionaryStructure & dictionary_structure = dictionary->getStructure();
|
||||
@ -39,7 +40,7 @@ DictionarySourceData::DictionarySourceData(
|
||||
GetColumnsFunction && get_view_columns_function_)
|
||||
: num_rows(data_columns_.front()->size())
|
||||
, dictionary(dictionary_)
|
||||
, column_names(column_names_)
|
||||
, column_names(column_names_.begin(), column_names_.end())
|
||||
, data_columns(data_columns_)
|
||||
, get_key_columns_function(std::move(get_key_columns_function_))
|
||||
, get_view_columns_function(std::move(get_view_columns_function_))
|
||||
@ -102,8 +103,6 @@ Block DictionarySourceData::fillBlock(
|
||||
const DataTypes & types,
|
||||
ColumnsWithTypeAndName && view) const
|
||||
{
|
||||
std::unordered_set<std::string> names(column_names.begin(), column_names.end());
|
||||
|
||||
DataTypes data_types = types;
|
||||
ColumnsWithTypeAndName block_columns;
|
||||
|
||||
@ -114,13 +113,13 @@ Block DictionarySourceData::fillBlock(
|
||||
data_types.push_back(key.type);
|
||||
|
||||
for (const auto & column : view)
|
||||
if (names.find(column.name) != names.end())
|
||||
if (column_names.find(column.name) != column_names.end())
|
||||
block_columns.push_back(column);
|
||||
|
||||
const DictionaryStructure & structure = dictionary->getStructure();
|
||||
ColumnPtr ids_column = getColumnFromIds(ids_to_fill);
|
||||
ColumnPtr ids_column = getColumnFromPODArray(ids_to_fill);
|
||||
|
||||
if (structure.id && names.find(structure.id->name) != names.end())
|
||||
if (structure.id && column_names.find(structure.id->name) != column_names.end())
|
||||
{
|
||||
block_columns.emplace_back(ids_column, std::make_shared<DataTypeUInt64>(), structure.id->name);
|
||||
}
|
||||
@ -129,7 +128,7 @@ Block DictionarySourceData::fillBlock(
|
||||
|
||||
for (const auto & attribute : structure.attributes)
|
||||
{
|
||||
if (names.find(attribute.name) != names.end())
|
||||
if (column_names.find(attribute.name) != column_names.end())
|
||||
{
|
||||
ColumnPtr column;
|
||||
|
||||
@ -159,13 +158,6 @@ Block DictionarySourceData::fillBlock(
|
||||
return Block(block_columns);
|
||||
}
|
||||
|
||||
ColumnPtr DictionarySourceData::getColumnFromIds(const PaddedPODArray<UInt64> & ids_to_fill)
|
||||
{
|
||||
auto column_vector = ColumnVector<UInt64>::create();
|
||||
column_vector->getData().assign(ids_to_fill);
|
||||
return column_vector;
|
||||
}
|
||||
|
||||
void DictionarySourceData::fillKeyColumns(
|
||||
const PaddedPODArray<StringRef> & keys,
|
||||
size_t start,
|
@ -7,19 +7,14 @@
|
||||
#include <Columns/IColumn.h>
|
||||
#include <Core/Names.h>
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <common/logger_useful.h>
|
||||
#include "DictionaryBlockInputStreamBase.h"
|
||||
#include "DictionaryStructure.h"
|
||||
#include "IDictionary.h"
|
||||
#include <Dictionaries/DictionaryStructure.h>
|
||||
#include <Dictionaries/IDictionary.h>
|
||||
#include <Dictionaries/DictionarySourceBase.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
/// TODO: Remove this class
|
||||
/* BlockInputStream implementation for external dictionaries
|
||||
* read() returns blocks consisting of the in-memory contents of the dictionaries
|
||||
*/
|
||||
class DictionarySourceData
|
||||
{
|
||||
public:
|
||||
@ -56,8 +51,6 @@ private:
|
||||
const DataTypes & types,
|
||||
ColumnsWithTypeAndName && view) const;
|
||||
|
||||
static ColumnPtr getColumnFromIds(const PaddedPODArray<UInt64> & ids_to_fill);
|
||||
|
||||
static void fillKeyColumns(
|
||||
const PaddedPODArray<StringRef> & keys,
|
||||
size_t start,
|
||||
@ -67,7 +60,7 @@ private:
|
||||
|
||||
const size_t num_rows;
|
||||
std::shared_ptr<const IDictionary> dictionary;
|
||||
Names column_names;
|
||||
std::unordered_set<std::string> column_names;
|
||||
PaddedPODArray<UInt64> ids;
|
||||
ColumnsWithTypeAndName key_columns;
|
||||
|
@ -1,4 +1,4 @@
|
||||
#include "DictionaryBlockInputStreamBase.h"
|
||||
#include "DictionarySourceBase.h"
|
||||
|
||||
namespace DB
|
||||
{
|
@ -13,7 +13,7 @@
|
||||
#include <Processors/QueryPipeline.h>
|
||||
#include <Processors/Executors/PullingPipelineExecutor.h>
|
||||
|
||||
#include <Dictionaries/DictionaryBlockInputStream.h>
|
||||
#include <Dictionaries//DictionarySource.h>
|
||||
#include <Dictionaries/DictionaryFactory.h>
|
||||
#include <Dictionaries/HierarchyDictionariesUtils.h>
|
||||
|
||||
|
@ -6,7 +6,7 @@
|
||||
#include <Columns/ColumnNullable.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
|
||||
#include <Dictionaries/DictionaryBlockInputStream.h>
|
||||
#include <Dictionaries//DictionarySource.h>
|
||||
#include <Dictionaries/DictionaryFactory.h>
|
||||
#include <Dictionaries/HierarchyDictionariesUtils.h>
|
||||
|
||||
|
@ -13,7 +13,7 @@
|
||||
#include <common/itoa.h>
|
||||
#include <common/map.h>
|
||||
#include <common/range.h>
|
||||
#include <Dictionaries/DictionaryBlockInputStream.h>
|
||||
#include <Dictionaries/DictionarySource.h>
|
||||
#include <Dictionaries/DictionaryFactory.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
|
||||
|
@ -3,14 +3,14 @@
|
||||
#include <numeric>
|
||||
#include <cmath>
|
||||
|
||||
#include "DictionaryBlockInputStream.h"
|
||||
#include "DictionaryFactory.h"
|
||||
|
||||
#include <Columns/ColumnArray.h>
|
||||
#include <Columns/ColumnTuple.h>
|
||||
#include <DataTypes/DataTypeArray.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
#include <DataTypes/DataTypesDecimal.h>
|
||||
#include <Dictionaries/DictionaryFactory.h>
|
||||
#include <Dictionaries/DictionarySource.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
@ -1,14 +1,14 @@
|
||||
#pragma once
|
||||
#include <DataTypes/DataTypeDate.h>
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <Columns/ColumnString.h>
|
||||
#include <Columns/ColumnVector.h>
|
||||
#include <Columns/IColumn.h>
|
||||
#include <DataTypes/DataTypeDate.h>
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <common/range.h>
|
||||
#include "DictionaryBlockInputStreamBase.h"
|
||||
#include "DictionaryStructure.h"
|
||||
#include "IDictionary.h"
|
||||
#include "RangeHashedDictionary.h"
|
||||
#include <Dictionaries/DictionaryStructure.h>
|
||||
#include <Dictionaries/IDictionary.h>
|
||||
#include <Dictionaries/DictionarySourceBase.h>
|
||||
#include <Dictionaries/DictionaryHelpers.h>
|
||||
#include <Dictionaries/RangeHashedDictionary.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
@ -31,8 +31,6 @@ public:
|
||||
size_t getNumRows() const { return ids.size(); }
|
||||
|
||||
private:
|
||||
template <typename T>
|
||||
ColumnPtr getColumnFromPODArray(const PaddedPODArray<T> & array) const;
|
||||
|
||||
Block fillBlock(
|
||||
const PaddedPODArray<Key> & ids_to_fill,
|
||||
@ -86,17 +84,6 @@ Block RangeDictionarySourceData<RangeType>::getBlock(size_t start, size_t length
|
||||
return fillBlock(block_ids, block_start_dates, block_end_dates);
|
||||
}
|
||||
|
||||
template <typename RangeType>
|
||||
template <typename T>
|
||||
ColumnPtr RangeDictionarySourceData<RangeType>::getColumnFromPODArray(const PaddedPODArray<T> & array) const
|
||||
{
|
||||
auto column_vector = ColumnVector<T>::create();
|
||||
column_vector->getData().reserve(array.size());
|
||||
column_vector->getData().insert(array.begin(), array.end());
|
||||
|
||||
return column_vector;
|
||||
}
|
||||
|
||||
template <typename RangeType>
|
||||
PaddedPODArray<Int64> RangeDictionarySourceData<RangeType>::makeDateKey(
|
||||
const PaddedPODArray<RangeType> & block_start_dates, const PaddedPODArray<RangeType> & block_end_dates) const
|
@ -2,11 +2,11 @@
|
||||
#include <Columns/ColumnNullable.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
#include <Common/TypeList.h>
|
||||
#include <common/range.h>
|
||||
#include "DictionaryFactory.h"
|
||||
#include "RangeDictionaryBlockInputStream.h"
|
||||
#include <Interpreters/castColumn.h>
|
||||
#include <DataTypes/DataTypesDecimal.h>
|
||||
#include <Dictionaries/DictionaryFactory.h>
|
||||
#include <Dictionaries/RangeDictionarySource.h>
|
||||
|
||||
|
||||
namespace
|
||||
{
|
||||
|
@ -9,10 +9,10 @@
|
||||
#include <Columns/ColumnString.h>
|
||||
#include <Common/HashTable/HashMap.h>
|
||||
#include <Common/HashTable/HashSet.h>
|
||||
#include "DictionaryStructure.h"
|
||||
#include "IDictionary.h"
|
||||
#include "IDictionarySource.h"
|
||||
#include "DictionaryHelpers.h"
|
||||
#include <Dictionaries/DictionaryStructure.h>
|
||||
#include <Dictionaries/IDictionary.h>
|
||||
#include <Dictionaries/IDictionarySource.h>
|
||||
#include <Dictionaries/DictionaryHelpers.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
@ -12,7 +12,6 @@
|
||||
#include <absl/container/flat_hash_set.h>
|
||||
|
||||
#include <common/unaligned.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <Common/randomSeed.h>
|
||||
#include <Common/Arena.h>
|
||||
#include <Common/ArenaWithFreeLists.h>
|
||||
|
@ -212,13 +212,11 @@ BlockOutputStreamPtr FormatFactory::getOutputStreamParallelIfPossible(
|
||||
|
||||
const Settings & settings = context->getSettingsRef();
|
||||
bool parallel_formatting = settings.output_format_parallel_formatting;
|
||||
auto format_settings = _format_settings ? *_format_settings : getFormatSettings(context);
|
||||
|
||||
if (output_getter && parallel_formatting && getCreators(name).supports_parallel_formatting
|
||||
&& !settings.output_format_json_array_of_rows)
|
||||
if (output_getter && parallel_formatting && getCreators(name).supports_parallel_formatting && !settings.output_format_json_array_of_rows
|
||||
&& !format_settings.mysql_wire.sequence_id)
|
||||
{
|
||||
auto format_settings = _format_settings
|
||||
? *_format_settings : getFormatSettings(context);
|
||||
|
||||
auto formatter_creator = [output_getter, sample, callback, format_settings]
|
||||
(WriteBuffer & output) -> OutputFormatPtr
|
||||
{ return output_getter(output, sample, {std::move(callback)}, format_settings);};
|
||||
@ -317,7 +315,7 @@ OutputFormatPtr FormatFactory::getOutputFormatParallelIfPossible(
|
||||
const Settings & settings = context->getSettingsRef();
|
||||
|
||||
if (settings.output_format_parallel_formatting && getCreators(name).supports_parallel_formatting
|
||||
&& !settings.output_format_json_array_of_rows)
|
||||
&& !settings.output_format_json_array_of_rows && !format_settings.mysql_wire.sequence_id)
|
||||
{
|
||||
auto formatter_creator = [output_getter, sample, callback, format_settings]
|
||||
(WriteBuffer & output) -> OutputFormatPtr
|
||||
|
@ -13,6 +13,7 @@ void registerFunctionsBitmap(FunctionFactory & factory)
|
||||
factory.registerFunction<FunctionBitmapToArray>();
|
||||
factory.registerFunction<FunctionBitmapSubsetInRange>();
|
||||
factory.registerFunction<FunctionBitmapSubsetLimit>();
|
||||
factory.registerFunction<FunctionBitmapSubsetOffsetLimit>();
|
||||
factory.registerFunction<FunctionBitmapTransform>();
|
||||
|
||||
factory.registerFunction<FunctionBitmapSelfCardinality>();
|
||||
|
@ -466,9 +466,24 @@ public:
|
||||
}
|
||||
};
|
||||
|
||||
struct BitmapSubsetOffsetLimitImpl
|
||||
{
|
||||
public:
|
||||
static constexpr auto name = "subBitmap";
|
||||
template <typename T>
|
||||
static void apply(
|
||||
const AggregateFunctionGroupBitmapData<T> & bitmap_data_0,
|
||||
UInt64 range_start,
|
||||
UInt64 range_end,
|
||||
AggregateFunctionGroupBitmapData<T> & bitmap_data_2)
|
||||
{
|
||||
bitmap_data_0.rbs.rb_offset_limit(range_start, range_end, bitmap_data_2.rbs);
|
||||
}
|
||||
};
|
||||
|
||||
using FunctionBitmapSubsetInRange = FunctionBitmapSubset<BitmapSubsetInRangeImpl>;
|
||||
using FunctionBitmapSubsetLimit = FunctionBitmapSubset<BitmapSubsetLimitImpl>;
|
||||
|
||||
using FunctionBitmapSubsetOffsetLimit = FunctionBitmapSubset<BitmapSubsetOffsetLimitImpl>;
|
||||
|
||||
class FunctionBitmapTransform : public IFunction
|
||||
{
|
||||
|
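The two hunks above register the new `subBitmap` function, implemented on top of the roaring bitmap's `rb_offset_limit`. A minimal usage sketch, assuming the signature is `subBitmap(bitmap, offset, limit)` as the `range_start`/`range_end` parameters suggest (the argument semantics are inferred here, not stated in the hunk):

    -- assumed signature: subBitmap(bitmap, offset, limit)
    SELECT bitmapToArray(subBitmap(bitmapBuild([1, 5, 7, 9]), toUInt32(1), toUInt32(3)));
    -- under that reading this would yield [5,7,9]; compare the test reference rows near the end of this diff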
@@ -6,8 +6,6 @@
#include <common/types.h>

#include <Core/Defines.h>
#include <Common/Stopwatch.h>

namespace DB
{
@@ -977,13 +977,14 @@ bool Aggregator::executeOnBlock(Columns columns, UInt64 num_rows, AggregatedData
/// For the case when there are no keys (all aggregate into one row).
if (result.type == AggregatedDataVariants::Type::without_key)
{
#if USE_EMBEDDED_COMPILER
if (compiled_aggregate_functions_holder)
{
executeWithoutKeyImpl<true>(result.without_key, num_rows, aggregate_functions_instructions.data(), result.aggregates_pool);
}
else
#endif
/// TODO: Enable compilation after investigation
// #if USE_EMBEDDED_COMPILER
// if (compiled_aggregate_functions_holder)
// {
// executeWithoutKeyImpl<true>(result.without_key, num_rows, aggregate_functions_instructions.data(), result.aggregates_pool);
// }
// else
// #endif
{
executeWithoutKeyImpl<false>(result.without_key, num_rows, aggregate_functions_instructions.data(), result.aggregates_pool);
}
@@ -1091,7 +1091,14 @@ void AsynchronousMetrics::update(std::chrono::system_clock::time_point update_ti
{
sensor_file->rewind();
Int64 temperature = 0;
readText(temperature, *sensor_file);
try
{
readText(temperature, *sensor_file);
}
catch (const ErrnoException & e)
{
LOG_DEBUG(&Poco::Logger::get("AsynchronousMetrics"), "Hardware monitor '{}', sensor '{}' exists but could not be read, error {}.", hwmon_name, sensor_name, e.getErrno());
}

if (sensor_name.empty())
new_values[fmt::format("Temperature_{}", hwmon_name)] = temperature * 0.001;
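The hunk above turns a failed thermal-sensor read into a debug log entry instead of an exception; values that are read successfully still land in the common system metrics. A quick way to inspect them, assuming the `Temperature_*` naming shown in the code:

    SELECT metric, value
    FROM system.asynchronous_metrics
    WHERE metric LIKE 'Temperature%'
    ORDER BY metric;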
@@ -6,6 +6,7 @@
#include <DataTypes/DataTypeDateTime.h>
#include <Common/ClickHouseRevision.h>
#include <Common/SymbolIndex.h>
#include <Common/Stopwatch.h>

#if !defined(ARCADIA_BUILD)
# include <Common/config_version.h>
@@ -158,15 +158,20 @@ DDLWorker::DDLWorker(
const Poco::Util::AbstractConfiguration * config,
const String & prefix,
const String & logger_name,
const CurrentMetrics::Metric * max_entry_metric_)
const CurrentMetrics::Metric * max_entry_metric_,
const CurrentMetrics::Metric * max_pushed_entry_metric_)
: context(Context::createCopy(context_))
, log(&Poco::Logger::get(logger_name))
, pool_size(pool_size_)
, max_entry_metric(max_entry_metric_)
, max_pushed_entry_metric(max_pushed_entry_metric_)
{
if (max_entry_metric)
CurrentMetrics::set(*max_entry_metric, 0);

if (max_pushed_entry_metric)
CurrentMetrics::set(*max_pushed_entry_metric, 0);

if (1 < pool_size)
{
LOG_WARNING(log, "DDLWorker is configured to use multiple threads. "
@@ -1046,6 +1051,15 @@ String DDLWorker::enqueueQuery(DDLLogEntry & entry)
zookeeper->createAncestors(query_path_prefix);

String node_path = zookeeper->create(query_path_prefix, entry.toString(), zkutil::CreateMode::PersistentSequential);
if (max_pushed_entry_metric)
{
String str_buf = node_path.substr(query_path_prefix.length());
DB::ReadBufferFromString in(str_buf);
CurrentMetrics::Metric id;
readText(id, in);
id = std::max(*max_pushed_entry_metric, id);
CurrentMetrics::set(*max_pushed_entry_metric, id);
}

/// We cannot create status dirs in a single transaction with previous request,
/// because we don't know node_path until previous request is executed.
@@ -44,7 +44,7 @@ class DDLWorker
{
public:
DDLWorker(int pool_size_, const std::string & zk_root_dir, ContextPtr context_, const Poco::Util::AbstractConfiguration * config, const String & prefix,
const String & logger_name = "DDLWorker", const CurrentMetrics::Metric * max_entry_metric_ = nullptr);
const String & logger_name = "DDLWorker", const CurrentMetrics::Metric * max_entry_metric_ = nullptr, const CurrentMetrics::Metric * max_pushed_entry_metric_ = nullptr);
virtual ~DDLWorker();

/// Pushes query into DDL queue, returns path to created node
@@ -148,6 +148,7 @@ protected:

std::atomic<UInt64> max_id = 0;
const CurrentMetrics::Metric * max_entry_metric;
const CurrentMetrics::Metric * max_pushed_entry_metric;
};
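The DDLWorker hunks above add an optional counter for the highest DDL entry ID ever pushed to the queue, parsed from the sequential ZooKeeper node name. A hedged way to observe such counters (the concrete metric name is not shown in these hunks, so the LIKE pattern below is an assumption):

    SELECT metric, value
    FROM system.metrics
    WHERE metric LIKE '%DDLEntry%';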
@@ -2,8 +2,6 @@

#include <Core/BackgroundSchedulePool.h>
#include <Interpreters/Context_fwd.h>
#include <Common/Stopwatch.h>

namespace DB
{
@@ -570,6 +570,13 @@ static void executeAction(const ExpressionActions::Action & action, ExecutionCon
res_column.type = action.node->result_type;
res_column.name = action.node->result_name;

if (action.node->column)
{
/// Do not execute function if it's result is already known.
res_column.column = action.node->column->cloneResized(num_rows);
break;
}

ColumnsWithTypeAndName arguments(action.arguments.size());
for (size_t i = 0; i < arguments.size(); ++i)
{
@@ -764,7 +764,7 @@ void InterpreterCreateQuery::assertOrSetUUID(ASTCreateQuery & create, const Data
const auto * kind = create.is_dictionary ? "Dictionary" : "Table";
const auto * kind_upper = create.is_dictionary ? "DICTIONARY" : "TABLE";

if (database->getEngineName() == "Replicated" && getContext()->getClientInfo().query_kind == ClientInfo::QueryKind::SECONDARY_QUERY
if (database->getEngineName() == "Replicated" && getContext()->getClientInfo().is_replicated_database_internal
&& !internal)
{
if (create.uuid == UUIDHelpers::Nil)
@@ -6,8 +6,6 @@
#include <memory>
#include <chrono>
#include <Common/CurrentMetrics.h>
#include <Common/Stopwatch.h>

namespace CurrentMetrics
{
@@ -12,7 +12,6 @@
#include <common/types.h>
#include <Core/Defines.h>
#include <Storages/IStorage.h>
#include <Common/Stopwatch.h>
#include <Parsers/ASTCreateQuery.h>
#include <Parsers/parseQuery.h>
#include <Parsers/ParserCreateQuery.h>
@@ -2,6 +2,7 @@

#include <Columns/ColumnLowCardinality.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnConst.h>

#include <DataStreams/materializeBlock.h>

@@ -105,25 +106,57 @@ DataTypePtr convertTypeToNullable(const DataTypePtr & type)
return type;
}

/// Convert column to nullable. If column LowCardinality or Const, convert nested column.
/// Returns nullptr if conversion cannot be performed.
static ColumnPtr tryConvertColumnToNullable(const ColumnPtr & col)
{
if (isColumnNullable(*col) || col->canBeInsideNullable())
return makeNullable(col);

if (col->lowCardinality())
{
auto mut_col = IColumn::mutate(std::move(col));
ColumnLowCardinality * col_lc = assert_cast<ColumnLowCardinality *>(mut_col.get());
if (col_lc->nestedIsNullable())
{
return mut_col;
}
else if (col_lc->nestedCanBeInsideNullable())
{
col_lc->nestedToNullable();
return mut_col;
}
}
else if (const ColumnConst * col_const = checkAndGetColumn<ColumnConst>(*col))
{
const auto & nested = col_const->getDataColumnPtr();
if (nested->isNullable() || nested->canBeInsideNullable())
{
return makeNullable(col);
}
else if (nested->lowCardinality())
{
ColumnPtr nested_nullable = tryConvertColumnToNullable(nested);
if (nested_nullable)
return ColumnConst::create(nested_nullable, col_const->size());
}
}
return nullptr;
}

void convertColumnToNullable(ColumnWithTypeAndName & column)
{
column.type = convertTypeToNullable(column.type);

if (!column.column)
{
column.type = convertTypeToNullable(column.type);
return;

if (column.column->lowCardinality())
{
/// Convert nested to nullable, not LowCardinality itself
auto mut_col = IColumn::mutate(std::move(column.column));
ColumnLowCardinality * col_as_lc = assert_cast<ColumnLowCardinality *>(mut_col.get());
if (!col_as_lc->nestedIsNullable())
col_as_lc->nestedToNullable();
column.column = std::move(mut_col);
}
else if (column.column->canBeInsideNullable())

ColumnPtr nullable_column = tryConvertColumnToNullable(column.column);
if (nullable_column)
{
column.column = makeNullable(column.column);
column.type = convertTypeToNullable(column.type);
column.column = std::move(nullable_column);
}
}

@@ -21,6 +21,7 @@ SRCS(
ASTCreateRowPolicyQuery.cpp
ASTCreateSettingsProfileQuery.cpp
ASTCreateUserQuery.cpp
ASTDatabaseOrNone.cpp
ASTDictionary.cpp
ASTDictionaryAttributeDeclaration.cpp
ASTDropAccessEntityQuery.cpp
@@ -95,6 +96,7 @@ SRCS(
ParserCreateSettingsProfileQuery.cpp
ParserCreateUserQuery.cpp
ParserDataType.cpp
ParserDatabaseOrNone.cpp
ParserDescribeTableQuery.cpp
ParserDictionary.cpp
ParserDictionaryAttributeDeclaration.cpp
@@ -5,8 +5,8 @@
#include <Processors/Formats/IInputFormat.h>
#include <DataStreams/SizeLimits.h>
#include <Poco/Timespan.h>
#include <Common/Stopwatch.h>

class Stopwatch;

namespace DB
{
@@ -1166,6 +1166,23 @@ void WindowTransform::appendChunk(Chunk & chunk)
// Write out the aggregation results.
writeOutCurrentRow();

if (isCancelled())
{
// Good time to check if the query is cancelled. Checking once
// per block might not be enough in severe quadratic cases.
// Just leave the work halfway through and return, the 'prepare'
// method will figure out what to do. Note that this doesn't
// handle 'max_execution_time' and other limits, because these
// limits are only updated between blocks. Eventually we should
// start updating them in background and canceling the processor,
// like we do for Ctrl+C handling.
//
// This class is final, so the check should hopefully be
// devirtualized and become a single never-taken branch that is
// basically free.
return;
}

// Move to the next row. The frame will have to be recalculated.
// The peer group start is updated at the beginning of the loop,
// because current_row might now be past-the-end.
@@ -1255,10 +1272,12 @@ IProcessor::Status WindowTransform::prepare()
// next_output_block_number, first_not_ready_row, first_block_number,
// blocks.size());

if (output.isFinished())
if (output.isFinished() || isCancelled())
{
// The consumer asked us not to continue (or we decided it ourselves),
// so we abort.
// so we abort. Not sure what the difference between the two conditions
// is, but it seemed that output.isFinished() is not enough to cancel on
// Ctrl+C. Test manually if you change it.
input.close();
return Status::Finished;
}
@@ -80,8 +80,10 @@ struct RowNumber
* the order of input data. This property also trivially holds for the ROWS and
* GROUPS frames. For the RANGE frame, the proof requires the additional fact
* that the ranges are specified in terms of (the single) ORDER BY column.
*
* `final` is so that the isCancelled() is devirtualized, we call it every row.
*/
class WindowTransform : public IProcessor /* public ISimpleTransform */
class WindowTransform final : public IProcessor
{
public:
WindowTransform(
@@ -7,14 +7,8 @@ PEERDIR(
clickhouse/src/Common
contrib/libs/msgpack
contrib/libs/protobuf
contrib/libs/arrow
)

ADDINCL(
contrib/libs/arrow/src
)

CFLAGS(-DUSE_ARROW=1)

SRCS(
Chunk.cpp
@@ -31,11 +25,6 @@ SRCS(
Formats/IOutputFormat.cpp
Formats/IRowInputFormat.cpp
Formats/IRowOutputFormat.cpp
Formats/Impl/ArrowBlockInputFormat.cpp
Formats/Impl/ArrowBlockOutputFormat.cpp
Formats/Impl/ArrowBufferedStreams.cpp
Formats/Impl/ArrowColumnToCHColumn.cpp
Formats/Impl/CHColumnToArrowColumn.cpp
Formats/Impl/BinaryRowInputFormat.cpp
Formats/Impl/BinaryRowOutputFormat.cpp
Formats/Impl/CSVRowInputFormat.cpp
@@ -146,6 +146,9 @@ try
catch (...) /// Exception while we looking for a task, reschedule
{
tryLogCurrentException(__PRETTY_FUNCTION__);

/// Why do we scheduleTask again?
/// To retry on exception, since it may be some temporary exception.
scheduleTask(/* with_backoff = */ true);
}

@@ -180,10 +183,16 @@ void IBackgroundJobExecutor::triggerTask()
}

void IBackgroundJobExecutor::backgroundTaskFunction()
try
{
if (!scheduleJob())
scheduleTask(/* with_backoff = */ true);
}
catch (...) /// Catch any exception to avoid thread termination.
{
tryLogCurrentException(__PRETTY_FUNCTION__);
scheduleTask(/* with_backoff = */ true);
}

IBackgroundJobExecutor::~IBackgroundJobExecutor()
{
@@ -1663,7 +1663,12 @@ NameToNameVector MergeTreeDataMergerMutator::collectFilesForRenames(
{
if (command.type == MutationCommand::Type::DROP_INDEX)
{
if (source_part->checksums.has(INDEX_FILE_PREFIX + command.column_name + ".idx"))
if (source_part->checksums.has(INDEX_FILE_PREFIX + command.column_name + ".idx2"))
{
rename_vector.emplace_back(INDEX_FILE_PREFIX + command.column_name + ".idx2", "");
rename_vector.emplace_back(INDEX_FILE_PREFIX + command.column_name + mrk_extension, "");
}
else if (source_part->checksums.has(INDEX_FILE_PREFIX + command.column_name + ".idx"))
{
rename_vector.emplace_back(INDEX_FILE_PREFIX + command.column_name + ".idx", "");
rename_vector.emplace_back(INDEX_FILE_PREFIX + command.column_name + mrk_extension, "");
@@ -1749,6 +1754,7 @@ NameSet MergeTreeDataMergerMutator::collectFilesToSkip(
for (const auto & index : indices_to_recalc)
{
files_to_skip.insert(index->getFileName() + ".idx");
files_to_skip.insert(index->getFileName() + ".idx2");
files_to_skip.insert(index->getFileName() + mrk_extension);
}
for (const auto & projection : projections_to_recalc)
@@ -1893,8 +1899,11 @@ std::set<MergeTreeIndexPtr> MergeTreeDataMergerMutator::getIndicesToRecalculate(
{
const auto & index = indices[i];

bool has_index =
source_part->checksums.has(INDEX_FILE_PREFIX + index.name + ".idx") ||
source_part->checksums.has(INDEX_FILE_PREFIX + index.name + ".idx2");
// If we ask to materialize and it already exists
if (!source_part->checksums.has(INDEX_FILE_PREFIX + index.name + ".idx") && materialized_indices.count(index.name))
if (!has_index && materialized_indices.count(index.name))
{
if (indices_to_recalc.insert(index_factory.get(index)).second)
{
@@ -9,11 +9,6 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR;
}

namespace
{
constexpr auto INDEX_FILE_EXTENSION = ".idx";
}

void MergeTreeDataPartWriterOnDisk::Stream::finalize()
{
compressed.next();
@@ -165,7 +160,7 @@ void MergeTreeDataPartWriterOnDisk::initSkipIndices()
std::make_unique<MergeTreeDataPartWriterOnDisk::Stream>(
stream_name,
data_part->volume->getDisk(),
part_path + stream_name, INDEX_FILE_EXTENSION,
part_path + stream_name, index_helper->getSerializedFileExtension(),
part_path + stream_name, marks_file_extension,
default_codec, settings.max_compress_block_size));
skip_indices_aggregators.push_back(index_helper->createIndexAggregator());
@@ -1457,9 +1457,10 @@ MarkRanges MergeTreeDataSelectExecutor::filterMarksUsingIndex(
size_t & granules_dropped,
Poco::Logger * log)
{
if (!part->volume->getDisk()->exists(part->getFullRelativePath() + index_helper->getFileName() + ".idx"))
const std::string & path_prefix = part->getFullRelativePath() + index_helper->getFileName();
if (!index_helper->getDeserializedFormat(part->volume->getDisk(), path_prefix))
{
LOG_DEBUG(log, "File for index {} does not exist. Skipping it.", backQuote(index_helper->index.name));
LOG_DEBUG(log, "File for index {} does not exist ({}.*). Skipping it.", backQuote(index_helper->index.name), path_prefix);
return ranges;
}

@@ -101,14 +101,17 @@ MergeTreeIndexGranuleFullText::MergeTreeIndexGranuleFullText(
void MergeTreeIndexGranuleFullText::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception("Attempt to write empty fulltext index " + backQuote(index_name), ErrorCodes::LOGICAL_ERROR);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty fulltext index {}.", backQuote(index_name));

for (const auto & bloom_filter : bloom_filters)
ostr.write(reinterpret_cast<const char *>(bloom_filter.getFilter().data()), params.filter_size);
}

void MergeTreeIndexGranuleFullText::deserializeBinary(ReadBuffer & istr)
void MergeTreeIndexGranuleFullText::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);

for (auto & bloom_filter : bloom_filters)
{
istr.read(reinterpret_cast<char *>(
@@ -45,7 +45,7 @@ struct MergeTreeIndexGranuleFullText final : public IMergeTreeIndexGranule
~MergeTreeIndexGranuleFullText() override = default;

void serializeBinary(WriteBuffer & ostr) const override;
void deserializeBinary(ReadBuffer & istr) override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;

bool empty() const override { return !has_elems; }

@@ -84,10 +84,12 @@ bool MergeTreeIndexGranuleBloomFilter::empty() const
return !total_rows;
}

void MergeTreeIndexGranuleBloomFilter::deserializeBinary(ReadBuffer & istr)
void MergeTreeIndexGranuleBloomFilter::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (!empty())
throw Exception("Cannot read data to a non-empty bloom filter index.", ErrorCodes::LOGICAL_ERROR);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot read data to a non-empty bloom filter index.");
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);

readVarUInt(total_rows, istr);
for (auto & filter : bloom_filters)
@@ -102,7 +104,7 @@ void MergeTreeIndexGranuleBloomFilter::deserializeBinary(ReadBuffer & istr)
void MergeTreeIndexGranuleBloomFilter::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception("Attempt to write empty bloom filter index.", ErrorCodes::LOGICAL_ERROR);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty bloom filter index.");

static size_t atom_size = 8;
writeVarUInt(total_rows, ostr);
@@ -16,8 +16,7 @@ public:
bool empty() const override;

void serializeBinary(WriteBuffer & ostr) const override;

void deserializeBinary(ReadBuffer & istr) override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;

const std::vector<BloomFilterPtr> & getFilters() const { return bloom_filters; }

@@ -40,28 +40,12 @@ void MergeTreeIndexGranuleMinMax::serializeBinary(WriteBuffer & ostr) const
const DataTypePtr & type = index_sample_block.getByPosition(i).type;
auto serialization = type->getDefaultSerialization();

if (!type->isNullable())
{
serialization->serializeBinary(hyperrectangle[i].left, ostr);
serialization->serializeBinary(hyperrectangle[i].right, ostr);
}
else
{
/// NOTE: that this serialization differs from
/// IMergeTreeDataPart::MinMaxIndex::store() due to preserve
/// backward compatibility.
bool is_null = hyperrectangle[i].left.isNull() || hyperrectangle[i].right.isNull(); // one is enough
writeBinary(is_null, ostr);
if (!is_null)
{
serialization->serializeBinary(hyperrectangle[i].left, ostr);
serialization->serializeBinary(hyperrectangle[i].right, ostr);
}
}
serialization->serializeBinary(hyperrectangle[i].left, ostr);
serialization->serializeBinary(hyperrectangle[i].right, ostr);
}
}

void MergeTreeIndexGranuleMinMax::deserializeBinary(ReadBuffer & istr)
void MergeTreeIndexGranuleMinMax::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
hyperrectangle.clear();
Field min_val;
@@ -72,29 +56,53 @@ void MergeTreeIndexGranuleMinMax::deserializeBinary(ReadBuffer & istr)
const DataTypePtr & type = index_sample_block.getByPosition(i).type;
auto serialization = type->getDefaultSerialization();

if (!type->isNullable())
switch (version)
{
serialization->deserializeBinary(min_val, istr);
serialization->deserializeBinary(max_val, istr);
}
else
{
/// NOTE: that this serialization differs from
/// IMergeTreeDataPart::MinMaxIndex::load() due to preserve
/// backward compatibility.
bool is_null;
readBinary(is_null, istr);
if (!is_null)
{
case 1:
if (!type->isNullable())
{
serialization->deserializeBinary(min_val, istr);
serialization->deserializeBinary(max_val, istr);
}
else
{
/// NOTE: that this serialization differs from
/// IMergeTreeDataPart::MinMaxIndex::load() to preserve
/// backward compatibility.
///
/// But this is deprecated format, so this is OK.

bool is_null;
readBinary(is_null, istr);
if (!is_null)
{
serialization->deserializeBinary(min_val, istr);
serialization->deserializeBinary(max_val, istr);
}
else
{
min_val = Null();
max_val = Null();
}
}
break;

/// New format with proper Nullable support for values that includes Null values
case 2:
serialization->deserializeBinary(min_val, istr);
serialization->deserializeBinary(max_val, istr);
}
else
{
min_val = Null();
max_val = Null();
}

// NULL_LAST
if (min_val.isNull())
min_val = PositiveInfinity();
if (max_val.isNull())
max_val = PositiveInfinity();

break;
default:
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);
}

hyperrectangle.emplace_back(min_val, true, max_val, true);
}
}
@@ -203,6 +211,15 @@ bool MergeTreeIndexMinMax::mayBenefitFromIndexForIn(const ASTPtr & node) const
return false;
}

MergeTreeIndexFormat MergeTreeIndexMinMax::getDeserializedFormat(const DiskPtr disk, const std::string & relative_path_prefix) const
{
if (disk->exists(relative_path_prefix + ".idx2"))
return {2, ".idx2"};
else if (disk->exists(relative_path_prefix + ".idx"))
return {1, ".idx"};
return {0 /* unknown */, ""};
}

MergeTreeIndexPtr minmaxIndexCreator(
const IndexDescription & index)
{
@@ -21,7 +21,7 @@ struct MergeTreeIndexGranuleMinMax final : public IMergeTreeIndexGranule
~MergeTreeIndexGranuleMinMax() override = default;

void serializeBinary(WriteBuffer & ostr) const override;
void deserializeBinary(ReadBuffer & istr) override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;

bool empty() const override { return hyperrectangle.empty(); }

@@ -81,6 +81,9 @@ public:
const SelectQueryInfo & query, ContextPtr context) const override;

bool mayBenefitFromIndexForIn(const ASTPtr & node) const override;

const char* getSerializedFileExtension() const override { return ".idx2"; }
MergeTreeIndexFormat getDeserializedFormat(const DiskPtr disk, const std::string & path_prefix) const override;
};

}
@@ -1,5 +1,29 @@
#include <Storages/MergeTree/MergeTreeIndexReader.h>

namespace
{

using namespace DB;

std::unique_ptr<MergeTreeReaderStream> makeIndexReader(
const std::string & extension,
MergeTreeIndexPtr index,
MergeTreeData::DataPartPtr part,
size_t marks_count,
const MarkRanges & all_mark_ranges,
MergeTreeReaderSettings settings)
{
return std::make_unique<MergeTreeReaderStream>(
part->volume->getDisk(),
part->getFullRelativePath() + index->getFileName(), extension, marks_count,
all_mark_ranges,
std::move(settings), nullptr, nullptr,
part->getFileSizeOrZero(index->getFileName() + extension),
&part->index_granularity_info,
ReadBufferFromFileBase::ProfileCallback{}, CLOCK_MONOTONIC_COARSE);
}

}

namespace DB
{
@@ -7,27 +31,28 @@ namespace DB
MergeTreeIndexReader::MergeTreeIndexReader(
MergeTreeIndexPtr index_, MergeTreeData::DataPartPtr part_, size_t marks_count_, const MarkRanges & all_mark_ranges_,
MergeTreeReaderSettings settings)
: index(index_), stream(
part_->volume->getDisk(),
part_->getFullRelativePath() + index->getFileName(), ".idx", marks_count_,
all_mark_ranges_,
std::move(settings), nullptr, nullptr,
part_->getFileSizeOrZero(index->getFileName() + ".idx"),
&part_->index_granularity_info,
ReadBufferFromFileBase::ProfileCallback{}, CLOCK_MONOTONIC_COARSE)
: index(index_)
{
stream.seekToStart();
const std::string & path_prefix = part_->getFullRelativePath() + index->getFileName();
auto index_format = index->getDeserializedFormat(part_->volume->getDisk(), path_prefix);

stream = makeIndexReader(index_format.extension, index_, part_, marks_count_, all_mark_ranges_, std::move(settings));
version = index_format.version;

stream->seekToStart();
}

MergeTreeIndexReader::~MergeTreeIndexReader() = default;

void MergeTreeIndexReader::seek(size_t mark)
{
stream.seekToMark(mark);
stream->seekToMark(mark);
}

MergeTreeIndexGranulePtr MergeTreeIndexReader::read()
{
auto granule = index->createIndexGranule();
granule->deserializeBinary(*stream.data_buffer);
granule->deserializeBinary(*stream->data_buffer, version);
return granule;
}

@@ -1,5 +1,6 @@
#pragma once

#include <memory>
#include <Storages/MergeTree/MergeTreeReaderStream.h>
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Storages/MergeTree/MergeTreeData.h>
@@ -16,6 +17,7 @@ public:
size_t marks_count_,
const MarkRanges & all_mark_ranges_,
MergeTreeReaderSettings settings);
~MergeTreeIndexReader();

void seek(size_t mark);

@@ -23,7 +25,8 @@ public:

private:
MergeTreeIndexPtr index;
MergeTreeReaderStream stream;
std::unique_ptr<MergeTreeReaderStream> stream;
uint8_t version = 0;
};

}

@@ -48,8 +48,7 @@ MergeTreeIndexGranuleSet::MergeTreeIndexGranuleSet(
void MergeTreeIndexGranuleSet::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception(
"Attempt to write empty set index " + backQuote(index_name), ErrorCodes::LOGICAL_ERROR);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty set index {}.", backQuote(index_name));

const auto & size_type = DataTypePtr(std::make_shared<DataTypeUInt64>());
auto size_serialization = size_type->getDefaultSerialization();
@@ -80,8 +79,11 @@ void MergeTreeIndexGranuleSet::serializeBinary(WriteBuffer & ostr) const
}
}

void MergeTreeIndexGranuleSet::deserializeBinary(ReadBuffer & istr)
void MergeTreeIndexGranuleSet::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);

block.clear();

Field field_rows;
@@ -28,7 +28,7 @@ struct MergeTreeIndexGranuleSet final : public IMergeTreeIndexGranule
MutableColumns && columns_);

void serializeBinary(WriteBuffer & ostr) const override;
void deserializeBinary(ReadBuffer & istr) override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;

size_t size() const { return block.rows(); }
bool empty() const override { return !size(); }
@@ -4,6 +4,7 @@
#include <unordered_map>
#include <vector>
#include <memory>
#include <utility>
#include <Core/Block.h>
#include <Storages/StorageInMemoryMetadata.h>
#include <Storages/MergeTree/MergeTreeDataPartChecksum.h>
@@ -17,13 +18,37 @@ constexpr auto INDEX_FILE_PREFIX = "skp_idx_";
namespace DB
{

using MergeTreeIndexVersion = uint8_t;
struct MergeTreeIndexFormat
{
MergeTreeIndexVersion version;
const char* extension;

operator bool() const { return version != 0; }
};

/// Stores some info about a single block of data.
struct IMergeTreeIndexGranule
{
virtual ~IMergeTreeIndexGranule() = default;

/// Serialize always last version.
virtual void serializeBinary(WriteBuffer & ostr) const = 0;
virtual void deserializeBinary(ReadBuffer & istr) = 0;

/// Version of the index to deserialize:
///
/// - 2 -- minmax index for proper Nullable support,
/// - 1 -- everything else.
///
/// Implementation is responsible for version check,
/// and throw LOGICAL_ERROR in case of unsupported version.
///
/// See also:
/// - IMergeTreeIndex::getSerializedFileExtension()
/// - IMergeTreeIndex::getDeserializedFormat()
/// - MergeTreeDataMergerMutator::collectFilesToSkip()
/// - MergeTreeDataMergerMutator::collectFilesForRenames()
virtual void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) = 0;

virtual bool empty() const = 0;
};
@@ -73,9 +98,26 @@ struct IMergeTreeIndex

virtual ~IMergeTreeIndex() = default;

/// gets filename without extension
/// Returns filename without extension.
String getFileName() const { return INDEX_FILE_PREFIX + index.name; }

/// Returns extension for serialization.
/// Reimplement if you want new index format.
///
/// NOTE: In case getSerializedFileExtension() is reimplemented,
/// getDeserializedFormat() should be reimplemented too,
/// and check all previous extensions too
/// (to avoid breaking backward compatibility).
virtual const char* getSerializedFileExtension() const { return ".idx"; }

/// Returns extension for deserialization.
///
/// Return pair<extension, version>.
virtual MergeTreeIndexFormat getDeserializedFormat(const DiskPtr, const std::string & /* relative_path_prefix */) const
{
return {1, ".idx"};
}

/// Checks whether the column is in data skipping index.
virtual bool mayBenefitFromIndexForIn(const ASTPtr & node) const = 0;
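The interface above introduces versioned skip-index formats: the minmax index now serializes as version 2 with the `.idx2` extension and full Nullable support, while older `.idx` files are still read as version 1. A minimal sketch of a table this applies to (hypothetical table and column names):

    CREATE TABLE t_minmax_nullable
    (
        key UInt64,
        value Nullable(UInt64),
        INDEX value_minmax value TYPE minmax GRANULARITY 1
    )
    ENGINE = MergeTree
    ORDER BY key;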
@@ -2,6 +2,7 @@
#include <Columns/FilterDescription.h>
#include <Columns/ColumnsCommon.h>
#include <common/range.h>
#include <Interpreters/castColumn.h>
#include <DataTypes/DataTypeNothing.h>

#ifdef __SSE2__
@@ -1043,9 +1044,9 @@ void MergeTreeRangeReader::executePrewhereActionsAndFilterColumns(ReadResult & r
/// Filter in WHERE instead
else
{
result.columns[prewhere_column_pos] = result.getFilterHolder()->convertToFullColumnIfConst();
if (getSampleBlock().getByName(prewhere_info->prewhere_column_name).type->isNullable())
result.columns[prewhere_column_pos] = makeNullable(std::move(result.columns[prewhere_column_pos]));
auto type = getSampleBlock().getByName(prewhere_info->prewhere_column_name).type;
ColumnWithTypeAndName col(result.getFilterHolder()->convertToFullColumnIfConst(), std::make_shared<DataTypeUInt8>(), "");
result.columns[prewhere_column_pos] = castColumn(col, type);
result.clearFilter(); // Acting as a flag to not filter in PREWHERE
}
}
@@ -959,9 +959,19 @@ std::shared_ptr<StorageMergeTree::MergeMutateSelectedEntry> StorageMergeTree::se

if (!commands_for_size_validation.empty())
{
MutationsInterpreter interpreter(
shared_from_this(), metadata_snapshot, commands_for_size_validation, getContext(), false);
commands_size += interpreter.evaluateCommandsSize();
try
{
MutationsInterpreter interpreter(
shared_from_this(), metadata_snapshot, commands_for_size_validation, getContext(), false);
commands_size += interpreter.evaluateCommandsSize();
}
catch (...)
{
MergeTreeMutationEntry & entry = it->second;
entry.latest_fail_time = time(nullptr);
entry.latest_fail_reason = getCurrentExceptionMessage(false);
continue;
}
}

if (current_ast_elements + commands_size >= max_ast_elements)
@@ -971,17 +981,21 @@ std::shared_ptr<StorageMergeTree::MergeMutateSelectedEntry> StorageMergeTree::se
commands.insert(commands.end(), it->second.commands.begin(), it->second.commands.end());
}

auto new_part_info = part->info;
new_part_info.mutation = current_mutations_by_version.rbegin()->first;
if (!commands.empty())
{
auto new_part_info = part->info;
new_part_info.mutation = current_mutations_by_version.rbegin()->first;

future_part.parts.push_back(part);
future_part.part_info = new_part_info;
future_part.name = part->getNewName(new_part_info);
future_part.type = part->getType();
future_part.parts.push_back(part);
future_part.part_info = new_part_info;
future_part.name = part->getNewName(new_part_info);
future_part.type = part->getType();

tagger = std::make_unique<CurrentlyMergingPartsTagger>(future_part, MergeTreeDataMergerMutator::estimateNeededDiskSpace({part}), *this, metadata_snapshot, true);
return std::make_shared<MergeMutateSelectedEntry>(future_part, std::move(tagger), commands);
tagger = std::make_unique<CurrentlyMergingPartsTagger>(future_part, MergeTreeDataMergerMutator::estimateNeededDiskSpace({part}), *this, metadata_snapshot, true);
return std::make_shared<MergeMutateSelectedEntry>(future_part, std::move(tagger), commands);
}
}

return {};
}

@@ -1036,6 +1050,7 @@ bool StorageMergeTree::scheduleDataProcessingJob(IBackgroundJobExecutor & execut

auto share_lock = lockForShare(RWLockImpl::NO_QUERY, getSettings()->lock_acquire_timeout_for_background_operations);

bool has_mutations;
{
std::unique_lock lock(currently_processing_in_background_mutex);
if (merger_mutator.merges_blocker.isCancelled())
@@ -1044,6 +1059,15 @@ bool StorageMergeTree::scheduleDataProcessingJob(IBackgroundJobExecutor & execut
merge_entry = selectPartsToMerge(metadata_snapshot, false, {}, false, nullptr, share_lock, lock);
if (!merge_entry)
mutate_entry = selectPartsToMutate(metadata_snapshot, nullptr, share_lock);

has_mutations = !current_mutations_by_version.empty();
}

if (!mutate_entry && has_mutations)
{
/// Notify in case of errors
std::lock_guard lock(mutation_wait_mutex);
mutation_wait_event.notify_all();
}

if (merge_entry)
@@ -141,6 +141,7 @@ SRCS(
StorageMerge.cpp
StorageMergeTree.cpp
StorageMongoDB.cpp
StorageMongoDBSocketFactory.cpp
StorageMySQL.cpp
StorageNull.cpp
StorageReplicatedMergeTree.cpp
@@ -647,9 +647,13 @@ def run_tests_array(all_tests_with_params):
failures_chain += 1
status += MSG_FAIL
status += print_test_time(total_time)
status += " - having exception:\n{}\n".format(
status += " - having exception in stdout:\n{}\n".format(
'\n'.join(stdout.split('\n')[:100]))
status += 'Database: ' + testcase_args.testcase_database
elif '@@SKIP@@' in stdout:
skipped_total += 1
skip_reason = stdout.replace('@@SKIP@@', '').rstrip("\n")
status += MSG_SKIPPED + f" - {skip_reason}\n"
elif reference_file is None:
status += MSG_UNKNOWN
status += print_test_time(total_time)
@@ -113,6 +113,7 @@ def assert_nested_table_is_created(table_name, materialized_database='test_datab
assert(table_name in database_tables)

@pytest.mark.timeout(320)
def check_tables_are_synchronized(table_name, order_by='key', postgres_database='postgres_database', materialized_database='test_database'):
assert_nested_table_is_created(table_name, materialized_database)
@@ -1,284 +0,0 @@
<test>
<preconditions>
<table_exists>hits_100m_single</table_exists>
</preconditions>

<settings>
<compile_aggregate_expressions>1</compile_aggregate_expressions>
<min_count_to_compile_aggregate_expression>0</min_count_to_compile_aggregate_expression>
</settings>

<create_query>
CREATE TABLE jit_test_memory (
key UInt64,
value_1 UInt64,
value_2 UInt64,
value_3 UInt64,
value_4 UInt64,
value_5 UInt64,
predicate UInt8
) Engine = Memory
</create_query>

<create_query>
CREATE TABLE jit_test_merge_tree (
key UInt64,
value_1 UInt64,
value_2 UInt64,
value_3 UInt64,
value_4 UInt64,
value_5 UInt64,
predicate UInt8
) Engine = MergeTree
ORDER BY key
</create_query>

<create_query>
CREATE TABLE jit_test_merge_tree_nullable (
key UInt64,
value_1 Nullable(UInt64),
value_2 Nullable(UInt64),
value_3 Nullable(UInt64),
value_4 Nullable(UInt64),
value_5 Nullable(UInt64),
predicate UInt8
) Engine = Memory
</create_query>

<create_query>
CREATE TABLE jit_test_memory_nullable (
key UInt64,
value_1 Nullable(UInt64),
value_2 Nullable(UInt64),
value_3 Nullable(UInt64),
value_4 Nullable(UInt64),
value_5 Nullable(UInt64),
predicate UInt8
) Engine = MergeTree
ORDER BY key
</create_query>

<substitutions>
<substitution>
<name>function</name>
<values>
<value>sum</value>
<value>min</value>
<value>max</value>
<value>avg</value>
<value>any</value>
<value>anyLast</value>
<value>count</value>
<value>groupBitOr</value>
<value>groupBitAnd</value>
<value>groupBitXor</value>
</values>
</substitution>

<substitution>
<name>table</name>
<values>
<value>jit_test_memory</value>
<value>jit_test_merge_tree</value>
<value>jit_test_memory_nullable</value>
<value>jit_test_merge_tree_nullable</value>
</values>
</substitution>

<substitution>
<name>group_scale</name>
<values>
<value>1000000</value>
</values>
</substitution>
</substitutions>

<fill_query>
INSERT INTO {table}
SELECT
number % 1000000,
number,
number,
number,
number,
number,
if (number % 2 == 0, 1, 0)
FROM
system.numbers_mt
LIMIT 10000000
</fill_query>

<query>
SELECT
{function}(value_1),
{function}(value_2),
{function}(value_3)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}(value_1),
{function}(value_2),
sum(toUInt256(value_3)),
{function}(value_3)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}If(value_1, predicate),
{function}If(value_2, predicate),
{function}If(value_3, predicate)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}If(value_1, predicate),
{function}If(value_2, predicate),
sumIf(toUInt256(value_3), predicate),
{function}If(value_3, predicate)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}(value_1),
{function}(value_2),
{function}(value_3),
{function}(value_4),
{function}(value_5)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}(value_1),
{function}(value_2),
sum(toUInt256(value_3)),
{function}(value_3),
{function}(value_4),
{function}(value_5)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}If(value_1, predicate),
{function}If(value_2, predicate),
{function}If(value_3, predicate),
{function}If(value_4, predicate),
{function}If(value_5, predicate)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}If(value_1, predicate),
{function}If(value_2, predicate),
sumIf(toUInt256(value_3), predicate),
{function}If(value_3, predicate),
{function}If(value_4, predicate),
{function}If(value_5, predicate)
FROM {table}
FORMAT Null
</query>

<query>
SELECT
{function}(WatchID),
{function}(CounterID),
{function}(ClientIP)
FROM hits_100m_single
FORMAT Null
</query>

<query>
SELECT
{function}(WatchID),
{function}(CounterID),
sum(toUInt256(ClientIP)),
{function}(ClientIP)
FROM hits_100m_single
FORMAT Null
</query>

<query>
SELECT
{function}(WatchID),
{function}(CounterID),
{function}(ClientIP),
{function}(IPNetworkID),
{function}(SearchEngineID)
FROM hits_100m_single
FORMAT Null
</query>

<query>
SELECT
{function}(WatchID),
{function}(CounterID),
sum(toUInt256(ClientIP)),
{function}(ClientIP),
{function}(IPNetworkID),
{function}(SearchEngineID)
FROM hits_100m_single
FORMAT Null
</query>

<query>
WITH (WatchID % 2 == 0) AS predicate
SELECT
{function}If(WatchID, predicate),
{function}If(CounterID, predicate),
{function}If(ClientIP, predicate)
FROM hits_100m_single
FORMAT Null
</query>

<query>
WITH (WatchID % 2 == 0) AS predicate
SELECT
{function}If(WatchID, predicate),
{function}If(CounterID, predicate),
sumIf(toUInt256(ClientIP), predicate),
{function}If(ClientIP, predicate)
FROM hits_100m_single
FORMAT Null
</query>

<query>
WITH (WatchID % 2 == 0) AS predicate
SELECT
{function}If(WatchID, predicate),
{function}If(CounterID, predicate),
{function}If(ClientIP, predicate),
{function}If(IPNetworkID, predicate),
{function}If(SearchEngineID, predicate)
FROM hits_100m_single
FORMAT Null
</query>

<query>
WITH (WatchID % 2 == 0) AS predicate
SELECT
{function}If(WatchID, predicate),
{function}If(CounterID, predicate),
sumIf(toUInt256(ClientIP), predicate),
{function}If(ClientIP, predicate),
{function}If(IPNetworkID, predicate),
{function}If(SearchEngineID, predicate)
FROM hits_100m_single
FORMAT Null
</query>

<drop_query>DROP TABLE IF EXISTS {table}</drop_query>
</test>
@@ -91,6 +91,14 @@ tag4 [0,1,2,3,4,5,6,7,8,9] [5,999,2] [2,888,20] [0,1,3,4,6,7,8,9,20]
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]
[30,31,32,33,100,200,500]
[100,200,500]
[]
[]
[1,5,7,9]
[5,7,9]
[5,7]
[0,1,2,3,4,5,6,7,8,9]
[30,31,32,33,100,200,500]
[100,200,500]
0
0
0
Some files were not shown because too many files have changed in this diff.