Merge branch 'master' into datatype-date32

# Conflicts:
#	src/DataTypes/IDataType.h
#	src/Functions/CustomWeekTransforms.h
Neng Liu 2021-06-22 10:00:18 +08:00
commit 7ed1728a37
2043 changed files with 56393 additions and 24463 deletions


@ -10,12 +10,26 @@ assignees: ''
You have to provide the following information whenever possible.
**Describe the bug**
A clear and concise description of what is not working as it is supposed to.
**Does it reproduce on recent release?**
[The list of releases](https://github.com/ClickHouse/ClickHouse/blob/master/utils/list-versions/version_date.tsv)
**Enable crash reporting**
If possible, change `<enabled>` to `true` in the `send_crash_reports` section of `config.xml`:
```
<send_crash_reports>
<!-- Changing <enabled> to true allows sending crash reports to -->
<!-- the ClickHouse core developers team via Sentry https://sentry.io -->
<enabled>false</enabled>
</send_crash_reports>
```
**How to reproduce**
* Which ClickHouse server version to use
* Which interface to use, if it matters
* Non-default settings, if any
@ -24,10 +38,13 @@ A clear and concise description of what is not working as it is supposed to.
* Queries to run that lead to unexpected result
**Expected behavior**
A clear and concise description of what you expected to happen.
**Error message and/or stacktrace**
If applicable, add screenshots to help explain your problem.
**Additional context**
Add any other context about the problem here.

.gitignore (vendored, 5 lines changed)

@ -14,6 +14,11 @@
/build-*
/tests/venv
# logs
*.log
*.stderr
*.stdout
/docs/build
/docs/publish
/docs/edit

.gitmodules (vendored, 9 lines changed)

@ -103,7 +103,7 @@
url = https://github.com/ClickHouse-Extras/fastops
[submodule "contrib/orc"]
path = contrib/orc
url = https://github.com/apache/orc
url = https://github.com/ClickHouse-Extras/orc
[submodule "contrib/sparsehash-c11"]
path = contrib/sparsehash-c11
url = https://github.com/sparsehash/sparsehash-c11.git
@ -210,9 +210,6 @@
[submodule "contrib/fast_float"]
path = contrib/fast_float
url = https://github.com/fastfloat/fast_float
[submodule "contrib/libpqxx"]
path = contrib/libpqxx
url = https://github.com/jtv/libpqxx
[submodule "contrib/libpq"]
path = contrib/libpq
url = https://github.com/ClickHouse-Extras/libpq
@ -228,7 +225,9 @@
[submodule "contrib/datasketches-cpp"]
path = contrib/datasketches-cpp
url = https://github.com/ClickHouse-Extras/datasketches-cpp.git
[submodule "contrib/yaml-cpp"]
path = contrib/yaml-cpp
url = https://github.com/ClickHouse-Extras/yaml-cpp.git
[submodule "contrib/libpqxx"]
path = contrib/libpqxx
url = https://github.com/ClickHouse-Extras/libpqxx.git


@ -1,3 +1,129 @@
### ClickHouse release 21.6, 2021-06-05
#### Upgrade Notes
* `zstd` compression library is updated to v1.5.0. You may get messages about "checksum does not match" in replication. These messages are expected due to the update of the compression algorithm and you can ignore them. These messages are informational and do not indicate any kind of undesired behaviour.
* The setting `compile_expressions` is enabled by default. Although it has been heavily tested on a variety of scenarios, if you find some undesired behaviour on your servers, you can try turning this setting off.
* Values of `UUID` type cannot be compared with integer. For example, instead of writing `uuid != 0` type `uuid != '00000000-0000-0000-0000-000000000000'`.
#### New Feature
* Add Postgres-like cast operator (`::`). E.g.: `[1, 2]::Array(UInt8)`, `0.1::Decimal(4, 4)`, `number::UInt16`. [#23871](https://github.com/ClickHouse/ClickHouse/pull/23871) ([Anton Popov](https://github.com/CurtizJ)).
* Make big integers production ready. Add support for `UInt128` data type. Fix known issues with the `Decimal256` data type. Support big integers in dictionaries. Support `gcd`/`lcm` functions for big integers. Support big integers in array search and conditional functions. Support `LowCardinality(UUID)`. Support big integers in `generateRandom` table function and `clickhouse-obfuscator`. Fix error with returning `UUID` from scalar subqueries. This fixes [#7834](https://github.com/ClickHouse/ClickHouse/issues/7834). This fixes [#23936](https://github.com/ClickHouse/ClickHouse/issues/23936). This fixes [#4176](https://github.com/ClickHouse/ClickHouse/issues/4176). This fixes [#24018](https://github.com/ClickHouse/ClickHouse/issues/24018). Backward incompatible change: values of `UUID` type cannot be compared with integer. For example, instead of writing `uuid != 0` type `uuid != '00000000-0000-0000-0000-000000000000'`. [#23631](https://github.com/ClickHouse/ClickHouse/pull/23631) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support `Array` data type for inserting and selecting data in `Arrow`, `Parquet` and `ORC` formats. [#21770](https://github.com/ClickHouse/ClickHouse/pull/21770) ([taylor12805](https://github.com/taylor12805)).
* Implement table comments. Closes [#23225](https://github.com/ClickHouse/ClickHouse/issues/23225). [#23548](https://github.com/ClickHouse/ClickHouse/pull/23548) ([flynn](https://github.com/ucasFL)).
* Support creating dictionaries with DDL queries in `clickhouse-local`. Closes [#22354](https://github.com/ClickHouse/ClickHouse/issues/22354). Added support for `DETACH DICTIONARY PERMANENTLY`. Added support for `EXCHANGE DICTIONARIES` for `Atomic` database engine. Added support for moving dictionaries between databases using `RENAME DICTIONARY`. [#23436](https://github.com/ClickHouse/ClickHouse/pull/23436) ([Maksim Kita](https://github.com/kitaisreal)).
* Add aggregate function `uniqTheta` to support [Theta Sketch](https://datasketches.apache.org/docs/Theta/ThetaSketchFramework.html) in ClickHouse. [#23894](https://github.com/ClickHouse/ClickHouse/pull/23894). [#22609](https://github.com/ClickHouse/ClickHouse/pull/22609) ([Ping Yu](https://github.com/pingyu)).
* Add function `splitByRegexp`. [#24077](https://github.com/ClickHouse/ClickHouse/pull/24077) ([abel-cheng](https://github.com/abel-cheng)).
* Add function `arrayProduct` which accepts an array as an argument and returns the product of all elements in the array. Closes [#21613](https://github.com/ClickHouse/ClickHouse/issues/21613). [#23782](https://github.com/ClickHouse/ClickHouse/pull/23782) ([Maksim Kita](https://github.com/kitaisreal)).
* Add `thread_name` column in `system.stack_trace`. This closes [#23256](https://github.com/ClickHouse/ClickHouse/issues/23256). [#24124](https://github.com/ClickHouse/ClickHouse/pull/24124) ([abel-cheng](https://github.com/abel-cheng)).
* If `insert_null_as_default` = 1, insert default values instead of NULL in `INSERT ... SELECT` and `INSERT ... SELECT ... UNION ALL ...` queries. Closes [#22832](https://github.com/ClickHouse/ClickHouse/issues/22832). [#23524](https://github.com/ClickHouse/ClickHouse/pull/23524) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add support for progress indication in `clickhouse-local` with `--progress` option. [#23196](https://github.com/ClickHouse/ClickHouse/pull/23196) ([Egor Savin](https://github.com/Amesaru)).
* Add support for HTTP compression (determined by `Content-Encoding` HTTP header) in `http` dictionary source. This fixes [#8912](https://github.com/ClickHouse/ClickHouse/issues/8912). [#23946](https://github.com/ClickHouse/ClickHouse/pull/23946) ([FArthur-cmd](https://github.com/FArthur-cmd)).
* Added `SYSTEM QUERY RELOAD MODEL`, `SYSTEM QUERY RELOAD MODELS`. Closes [#18722](https://github.com/ClickHouse/ClickHouse/issues/18722). [#23182](https://github.com/ClickHouse/ClickHouse/pull/23182) ([Maksim Kita](https://github.com/kitaisreal)).
* Add setting `json` (boolean, 0 by default) for `EXPLAIN PLAN` query. When enabled, query output will be a single `JSON` row. It is recommended to use `TSVRaw` format to avoid unnecessary escaping. [#23082](https://github.com/ClickHouse/ClickHouse/pull/23082) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add setting `indexes` (boolean, disabled by default) to `EXPLAIN PIPELINE` query. When enabled, shows used indexes, number of filtered parts and granules for every index applied. Supported for `MergeTree*` tables. [#22352](https://github.com/ClickHouse/ClickHouse/pull/22352) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* LDAP: implemented user DN detection functionality to use when mapping Active Directory groups to ClickHouse roles. [#22228](https://github.com/ClickHouse/ClickHouse/pull/22228) ([Denis Glazachev](https://github.com/traceon)).
* New aggregate function `deltaSumTimestamp` for summing the difference between consecutive rows while maintaining ordering during merge by storing timestamps. [#21888](https://github.com/ClickHouse/ClickHouse/pull/21888) ([Russ Frank](https://github.com/rf)).
* Added less secure IMDS credentials provider for S3 which works under docker correctly. [#21852](https://github.com/ClickHouse/ClickHouse/pull/21852) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Add back `indexHint` function. This is for [#21238](https://github.com/ClickHouse/ClickHouse/issues/21238). This reverts [#9542](https://github.com/ClickHouse/ClickHouse/pull/9542). This fixes [#9540](https://github.com/ClickHouse/ClickHouse/issues/9540). [#21304](https://github.com/ClickHouse/ClickHouse/pull/21304) ([Amos Bird](https://github.com/amosbird)).
#### Experimental Feature
* Add `PROJECTION` support for `MergeTree*` tables. [#20202](https://github.com/ClickHouse/ClickHouse/pull/20202) ([Amos Bird](https://github.com/amosbird)).
#### Performance Improvement
* Enable `compile_expressions` setting by default. When this setting is enabled, compositions of simple functions and operators are compiled to native code with LLVM at runtime. [#8482](https://github.com/ClickHouse/ClickHouse/pull/8482) ([Maksim Kita](https://github.com/kitaisreal), [alexey-milovidov](https://github.com/alexey-milovidov)). Note: if you run into trouble, turn this option off.
* Update `re2` library. Performance of regular expressions matching is improved. Also this PR adds compatibility with gcc-11. [#24196](https://github.com/ClickHouse/ClickHouse/pull/24196) ([Raúl Marín](https://github.com/Algunenano)).
* The ORC input format now reads stripe by stripe instead of loading the entire table into memory at once, which consumed a lot of memory for huge files. [#23102](https://github.com/ClickHouse/ClickHouse/pull/23102) ([Chao Ma](https://github.com/godliness)).
* Fusion of aggregate functions `sum`, `count` and `avg` in a query into a single aggregate function. The optimization is controlled with the `optimize_fuse_sum_count_avg` setting. This is implemented with a new aggregate function `sumCount`. This function returns a tuple of two fields: `sum` and `count`. [#21337](https://github.com/ClickHouse/ClickHouse/pull/21337) ([hexiaoting](https://github.com/hexiaoting)).
* Update `zstd` to v1.5.0. The performance of compression is improved by a single-digit percentage. [#24135](https://github.com/ClickHouse/ClickHouse/pull/24135) ([Raúl Marín](https://github.com/Algunenano)). Note: you may get messages about "checksum does not match" in replication. These messages are expected due to the update of the compression algorithm and you can ignore them.
* Improved performance of `Buffer` tables: do not acquire lock for total_bytes/total_rows for `Buffer` engine. [#24066](https://github.com/ClickHouse/ClickHouse/pull/24066) ([Azat Khuzhin](https://github.com/azat)).
* Preallocate support for hashed/sparse_hashed dictionaries is returned. [#23979](https://github.com/ClickHouse/ClickHouse/pull/23979) ([Azat Khuzhin](https://github.com/azat)).
* Enable `async_socket_for_remote` by default (lower amount of threads in querying Distributed tables with large fanout). [#23683](https://github.com/ClickHouse/ClickHouse/pull/23683) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### Improvement
* Add `_partition_value` virtual column to MergeTree table family. It can be used to prune partition in a deterministic way. It's needed to implement partition matcher for mutations. [#23673](https://github.com/ClickHouse/ClickHouse/pull/23673) ([Amos Bird](https://github.com/amosbird)).
* Added `region` parameter for S3 storage and disk. [#23846](https://github.com/ClickHouse/ClickHouse/pull/23846) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Allow configuring different log levels for different logging channels. Closes [#19569](https://github.com/ClickHouse/ClickHouse/issues/19569). [#23857](https://github.com/ClickHouse/ClickHouse/pull/23857) ([filimonov](https://github.com/filimonov)).
* Keep default timezone on `DateTime` operations if it was not provided explicitly. For example, if you add one second to a value of `DateTime` type without timezone it will remain `DateTime` without timezone. In previous versions the value of default timezone was placed to the returned data type explicitly so it becomes DateTime('something'). This closes [#4854](https://github.com/ClickHouse/ClickHouse/issues/4854). [#23392](https://github.com/ClickHouse/ClickHouse/pull/23392) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Allow user to specify an empty string instead of a database name for `MySQL` storage. The default database will be used for queries. In previous versions this worked for SELECT queries, and now support for INSERT was also added. This closes [#19281](https://github.com/ClickHouse/ClickHouse/issues/19281). This can be useful when working with `Sphinx` or other MySQL-compatible foreign databases. [#23319](https://github.com/ClickHouse/ClickHouse/pull/23319) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `quantile(s)TDigest`. Added special handling of singleton centroids according to tdunning/t-digest 3.2+. Also a bug with over-compression of centroids in implementation of earlier version of the algorithm was fixed. [#23314](https://github.com/ClickHouse/ClickHouse/pull/23314) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Function `now64` now supports optional timezone argument. [#24091](https://github.com/ClickHouse/ClickHouse/pull/24091) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix the case when a progress bar in interactive mode in clickhouse-client, appearing in the middle of the data, could overwrite some parts of visible data in the terminal. This closes [#19283](https://github.com/ClickHouse/ClickHouse/issues/19283). [#23050](https://github.com/ClickHouse/ClickHouse/pull/23050) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash when memory allocation fails in simdjson. https://github.com/simdjson/simdjson/pull/1567 . Mark as improvement because it's a very rare bug. [#24147](https://github.com/ClickHouse/ClickHouse/pull/24147) ([Amos Bird](https://github.com/amosbird)).
* Preserve dictionaries until storage shutdown (this will avoid possible `external dictionary 'DICT' not found` errors at server shutdown during final flush of the `Buffer` engine). [#24068](https://github.com/ClickHouse/ClickHouse/pull/24068) ([Azat Khuzhin](https://github.com/azat)).
* Flush `Buffer` tables before shutting down tables (within one database), to avoid discarding blocks because the underlying table had already been detached (and the `Destination table default.a_data_01870 doesn't exist. Block of data is discarded` error in the log). [#24067](https://github.com/ClickHouse/ClickHouse/pull/24067) ([Azat Khuzhin](https://github.com/azat)).
* Now `prefer_column_name_to_alias = 1` will also favor column names for `group by`, `having` and `order by`. This fixes [#23882](https://github.com/ClickHouse/ClickHouse/issues/23882). [#24022](https://github.com/ClickHouse/ClickHouse/pull/24022) ([Amos Bird](https://github.com/amosbird)).
* Add support for `ORDER BY WITH FILL` with `DateTime64`. [#24016](https://github.com/ClickHouse/ClickHouse/pull/24016) ([kevin wan](https://github.com/MaxWk)).
* Enable `DateTime64` to be a version column in `ReplacingMergeTree`. [#23992](https://github.com/ClickHouse/ClickHouse/pull/23992) ([kevin wan](https://github.com/MaxWk)).
* Log information about OS name, kernel version and CPU architecture on server startup. [#23988](https://github.com/ClickHouse/ClickHouse/pull/23988) ([Azat Khuzhin](https://github.com/azat)).
* Support specifying table schema for `postgresql` dictionary source. Closes [#23958](https://github.com/ClickHouse/ClickHouse/issues/23958). [#23980](https://github.com/ClickHouse/ClickHouse/pull/23980) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add hints for names of `Enum` elements (suggest names in case of typos). Closes [#17112](https://github.com/ClickHouse/ClickHouse/issues/17112). [#23919](https://github.com/ClickHouse/ClickHouse/pull/23919) ([flynn](https://github.com/ucasFL)).
* Measure found rate (the percentage for which the value was found) for dictionaries (see `found_rate` in `system.dictionaries`). [#23916](https://github.com/ClickHouse/ClickHouse/pull/23916) ([Azat Khuzhin](https://github.com/azat)).
* Allow adding specific queue settings via the table setting `rabbitmq_queue_settings_list`. (Closes [#23737](https://github.com/ClickHouse/ClickHouse/issues/23737) and [#23918](https://github.com/ClickHouse/ClickHouse/issues/23918)). Allow the user to control the whole RabbitMQ setup: if the table setting `rabbitmq_queue_consume` is set to `1`, the RabbitMQ table engine will only connect to the specified queue and will not perform any RabbitMQ consumer-side setup such as declaring exchanges, queues or bindings. (Closes [#21757](https://github.com/ClickHouse/ClickHouse/issues/21757)). Add proper cleanup when a RabbitMQ table is dropped - delete the queues which the table declared and all bound exchanges, if they were created by the table. [#23887](https://github.com/ClickHouse/ClickHouse/pull/23887) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add `broken_data_files`/`broken_data_compressed_bytes` into `system.distribution_queue`. Add a metric for the number of files for asynchronous insertion into Distributed tables that have been marked as broken (`BrokenDistributedFilesToInsert`). [#23885](https://github.com/ClickHouse/ClickHouse/pull/23885) ([Azat Khuzhin](https://github.com/azat)).
* Querying `system.tables` does not go to ZooKeeper anymore. [#23793](https://github.com/ClickHouse/ClickHouse/pull/23793) ([Fuwang Hu](https://github.com/fuwhu)).
* Respect `lock_acquire_timeout_for_background_operations` for `OPTIMIZE` queries. [#23623](https://github.com/ClickHouse/ClickHouse/pull/23623) ([Azat Khuzhin](https://github.com/azat)).
* Possibility to change `S3` disk settings in runtime via new `SYSTEM RESTART DISK` SQL command. [#23429](https://github.com/ClickHouse/ClickHouse/pull/23429) ([Pavel Kovalenko](https://github.com/Jokser)).
* If a user applied a misconfiguration by mistakenly setting `max_distributed_connections` to the value zero, every query to a `Distributed` table would throw an exception with a message containing "logical error". But it's really expected behaviour, not a logical error, so the exception message was slightly incorrect. It also triggered checks in our CI environment that ensure that no logical errors ever happen. Instead we will treat `max_distributed_connections` misconfigured to zero as the minimum possible value (one). [#23348](https://github.com/ClickHouse/ClickHouse/pull/23348) ([Azat Khuzhin](https://github.com/azat)).
* Disable `min_bytes_to_use_mmap_io` by default. [#23322](https://github.com/ClickHouse/ClickHouse/pull/23322) ([Azat Khuzhin](https://github.com/azat)).
* Support `LowCardinality` nullability with `join_use_nulls`, close [#15101](https://github.com/ClickHouse/ClickHouse/issues/15101). [#23237](https://github.com/ClickHouse/ClickHouse/pull/23237) ([vdimir](https://github.com/vdimir)).
* Added possibility to restore `MergeTree` parts to `detached` directory for `S3` disk. [#23112](https://github.com/ClickHouse/ClickHouse/pull/23112) ([Pavel Kovalenko](https://github.com/Jokser)).
* Retries on HTTP connection drops in S3. [#22988](https://github.com/ClickHouse/ClickHouse/pull/22988) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Add settings `external_storage_max_read_rows` and `external_storage_max_read_bytes` for MySQL table engine, dictionary source and MaterializeMySQL minor data fetches. [#22697](https://github.com/ClickHouse/ClickHouse/pull/22697) ([TCeason](https://github.com/TCeason)).
* `MaterializeMySQL` (experimental feature): Previously, MySQL 5.7.9 was not supported due to SQL incompatibility. Now MySQL parameter verification is left to MaterializeMySQL. [#23413](https://github.com/ClickHouse/ClickHouse/pull/23413) ([TCeason](https://github.com/TCeason)).
* Enable reading of subcolumns for distributed tables. [#24472](https://github.com/ClickHouse/ClickHouse/pull/24472) ([Anton Popov](https://github.com/CurtizJ)).
* Fix usage of tuples in `CREATE .. AS SELECT` queries. [#24464](https://github.com/ClickHouse/ClickHouse/pull/24464) ([Anton Popov](https://github.com/CurtizJ)).
* Support for `Parquet` format in `Kafka` tables. [#23412](https://github.com/ClickHouse/ClickHouse/pull/23412) ([Chao Ma](https://github.com/godliness)).
#### Bug Fix
* Use old modulo function version when used in partition key and primary key. Closes [#23508](https://github.com/ClickHouse/ClickHouse/issues/23508). [#24157](https://github.com/ClickHouse/ClickHouse/pull/24157) ([Kseniia Sumarokova](https://github.com/kssenii)). It was a source of backward incompatibility in previous releases.
* Fixed the behavior when the query `SYSTEM RESTART REPLICA` or `SYSTEM SYNC REPLICA` is being processed infinitely. This was detected on a server with an extremely small amount of RAM. [#24457](https://github.com/ClickHouse/ClickHouse/pull/24457) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix incorrect monotonicity of `toWeek` function. This fixes [#24422](https://github.com/ClickHouse/ClickHouse/issues/24422) . This bug was introduced in [#5212](https://github.com/ClickHouse/ClickHouse/pull/5212), and was exposed later by smarter partition pruner. [#24446](https://github.com/ClickHouse/ClickHouse/pull/24446) ([Amos Bird](https://github.com/amosbird)).
* Fix drop partition with intersecting fake parts. In rare cases there might be parts with a mutation version greater than the current block number. [#24321](https://github.com/ClickHouse/ClickHouse/pull/24321) ([Amos Bird](https://github.com/amosbird)).
* Fixed a bug in moving Materialized View from Ordinary to Atomic database (`RENAME TABLE` query). Now inner table is moved to new database together with Materialized View. Fixes [#23926](https://github.com/ClickHouse/ClickHouse/issues/23926). [#24309](https://github.com/ClickHouse/ClickHouse/pull/24309) ([tavplubix](https://github.com/tavplubix)).
* Allow empty HTTP headers in client requests. Fixes [#23901](https://github.com/ClickHouse/ClickHouse/issues/23901). [#24285](https://github.com/ClickHouse/ClickHouse/pull/24285) ([Ivan](https://github.com/abyss7)).
* Set `max_threads = 1` to fix mutation fail of `Memory` tables. Closes [#24274](https://github.com/ClickHouse/ClickHouse/issues/24274). [#24275](https://github.com/ClickHouse/ClickHouse/pull/24275) ([flynn](https://github.com/ucasFL)).
* Fix typo in implementation of `Memory` tables, this bug was introduced at [#15127](https://github.com/ClickHouse/ClickHouse/issues/15127). Closes [#24192](https://github.com/ClickHouse/ClickHouse/issues/24192). [#24193](https://github.com/ClickHouse/ClickHouse/pull/24193) ([张中南](https://github.com/plugine)).
* Fix abnormal server termination due to `HDFS` becoming not accessible during query execution. Closes [#24117](https://github.com/ClickHouse/ClickHouse/issues/24117). [#24191](https://github.com/ClickHouse/ClickHouse/pull/24191) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix crash on updating of `Nested` column with const condition. [#24183](https://github.com/ClickHouse/ClickHouse/pull/24183) ([hexiaoting](https://github.com/hexiaoting)).
* Fix race condition which could happen in RBAC under a heavy load. This PR fixes [#24090](https://github.com/ClickHouse/ClickHouse/issues/24090) and [#24134](https://github.com/ClickHouse/ClickHouse/issues/24134). [#24176](https://github.com/ClickHouse/ClickHouse/pull/24176) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix a rare bug that could lead to a partially initialized table that can serve write requests (insert/alter/and so on). Now such tables will be in readonly mode. [#24122](https://github.com/ClickHouse/ClickHouse/pull/24122) ([alesapin](https://github.com/alesapin)).
* Fix an issue: `EXPLAIN PIPELINE` with `SELECT xxx FINAL` showed a wrong pipeline. ([hexiaoting](https://github.com/hexiaoting)).
* Fixed using const `DateTime` value vs `DateTime64` column in `WHERE`. [#24100](https://github.com/ClickHouse/ClickHouse/pull/24100) ([Vasily Nemkov](https://github.com/Enmk)).
* Fix crash in merge JOIN, closes [#24010](https://github.com/ClickHouse/ClickHouse/issues/24010). [#24013](https://github.com/ClickHouse/ClickHouse/pull/24013) ([vdimir](https://github.com/vdimir)).
* Some `ALTER PARTITION` queries might cause `Part A intersects previous part B` and `Unexpected merged part C intersecting drop range D` errors in replication queue. It's fixed. Fixes [#23296](https://github.com/ClickHouse/ClickHouse/issues/23296). [#23997](https://github.com/ClickHouse/ClickHouse/pull/23997) ([tavplubix](https://github.com/tavplubix)).
* Fix SIGSEGV for external GROUP BY and overflow row (i.e. queries like `SELECT FROM GROUP BY WITH TOTALS SETTINGS max_bytes_before_external_group_by>0, max_rows_to_group_by>0, group_by_overflow_mode='any', totals_mode='before_having'`). [#23962](https://github.com/ClickHouse/ClickHouse/pull/23962) ([Azat Khuzhin](https://github.com/azat)).
* Fix keys metrics accounting for `CACHE` dictionary with duplicates in the source (leads to `DictCacheKeysRequestedMiss` overflows). [#23929](https://github.com/ClickHouse/ClickHouse/pull/23929) ([Azat Khuzhin](https://github.com/azat)).
* Fix implementation of connection pool of `PostgreSQL` engine. Closes [#23897](https://github.com/ClickHouse/ClickHouse/issues/23897). [#23909](https://github.com/ClickHouse/ClickHouse/pull/23909) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix `distributed_group_by_no_merge = 2` with `GROUP BY` and aggregate function wrapped into regular function (had been broken in [#23546](https://github.com/ClickHouse/ClickHouse/issues/23546)). Throw exception in case of someone trying to use `distributed_group_by_no_merge = 2` with window functions. Disable `optimize_distributed_group_by_sharding_key` for queries with window functions. [#23906](https://github.com/ClickHouse/ClickHouse/pull/23906) ([Azat Khuzhin](https://github.com/azat)).
* A fix for `s3` table function: better handling of HTTP errors. Response bodies of HTTP errors were being ignored earlier. [#23844](https://github.com/ClickHouse/ClickHouse/pull/23844) ([Vladimir Chebotarev](https://github.com/excitoon)).
* A fix for `s3` table function: better handling of URIs. Fixed an incompatibility with URLs containing the `+` symbol; data with such keys could not be read previously. [#23822](https://github.com/ClickHouse/ClickHouse/pull/23822) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Fix error `Can't initialize pipeline with empty pipe` for queries with `GLOBAL IN/JOIN` and `use_hedged_requests`. Fixes [#23431](https://github.com/ClickHouse/ClickHouse/issues/23431). [#23805](https://github.com/ClickHouse/ClickHouse/pull/23805) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix `CLEAR COLUMN` does not work when it is referenced by materialized view. Close [#23764](https://github.com/ClickHouse/ClickHouse/issues/23764). [#23781](https://github.com/ClickHouse/ClickHouse/pull/23781) ([flynn](https://github.com/ucasFL)).
* Fix heap use after free when reading from HDFS if `Values` format is used. [#23761](https://github.com/ClickHouse/ClickHouse/pull/23761) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Avoid possible "Cannot schedule a task" error (in case some exception occurred) on INSERT into Distributed. [#23744](https://github.com/ClickHouse/ClickHouse/pull/23744) ([Azat Khuzhin](https://github.com/azat)).
* Fixed a bug in recovery of a stale `ReplicatedMergeTree` replica. Some metadata updates could be ignored by the stale replica if an `ALTER` query was executed during downtime of the replica. [#23742](https://github.com/ClickHouse/ClickHouse/pull/23742) ([tavplubix](https://github.com/tavplubix)).
* Fix a bug with `Join` and `WITH TOTALS`, close [#17718](https://github.com/ClickHouse/ClickHouse/issues/17718). [#23549](https://github.com/ClickHouse/ClickHouse/pull/23549) ([vdimir](https://github.com/vdimir)).
* Fix possible `Block structure mismatch` error for queries with `UNION` which could possibly happen after filter-pushdown optimization. Fixes [#23029](https://github.com/ClickHouse/ClickHouse/issues/23029). [#23359](https://github.com/ClickHouse/ClickHouse/pull/23359) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add type conversion when the setting `optimize_skip_unused_shards_rewrite_in` is enabled. This fixes MSan report. [#23219](https://github.com/ClickHouse/ClickHouse/pull/23219) ([Azat Khuzhin](https://github.com/azat)).
* Add a missing check when updating nested subcolumns, close issue: [#22353](https://github.com/ClickHouse/ClickHouse/issues/22353). [#22503](https://github.com/ClickHouse/ClickHouse/pull/22503) ([hexiaoting](https://github.com/hexiaoting)).
#### Build/Testing/Packaging Improvement
* Support building on Illumos. [#24144](https://github.com/ClickHouse/ClickHouse/pull/24144). Adds support for building on Solaris-derived operating systems. [#23746](https://github.com/ClickHouse/ClickHouse/pull/23746) ([bnaecker](https://github.com/bnaecker)).
* Add more benchmarks for hash tables, including the Swiss Table from Google (that appeared to be slower than ClickHouse hash map in our specific usage scenario). [#24111](https://github.com/ClickHouse/ClickHouse/pull/24111) ([Maksim Kita](https://github.com/kitaisreal)).
* Update librdkafka 1.6.0-RC3 to 1.6.1. [#23874](https://github.com/ClickHouse/ClickHouse/pull/23874) ([filimonov](https://github.com/filimonov)).
* Always enable `asynchronous-unwind-tables` explicitly. It may fix query profiler on AArch64. [#23602](https://github.com/ClickHouse/ClickHouse/pull/23602) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid possible build dependency on locale and filesystem order. This allows reproducible builds. [#23600](https://github.com/ClickHouse/ClickHouse/pull/23600) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Remove a source of nondeterminism from the build. Now builds at different points in time will produce byte-identical binaries. Partially addressed [#22113](https://github.com/ClickHouse/ClickHouse/issues/22113). [#23559](https://github.com/ClickHouse/ClickHouse/pull/23559) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add simple tool for benchmarking (Zoo)Keeper. [#23038](https://github.com/ClickHouse/ClickHouse/pull/23038) ([alesapin](https://github.com/alesapin)).
## ClickHouse release 21.5, 2021-05-20
#### Backward Incompatible Change
@ -637,6 +763,7 @@
* Allow using extended integer types (`Int128`, `Int256`, `UInt256`) in `avg` and `avgWeighted` functions. Also allow using different types (integer, decimal, floating point) for value and for weight in `avgWeighted` function. This is a backward-incompatible change: now the `avg` and `avgWeighted` functions always return `Float64` (as documented). Before this change the return type for `Decimal` arguments was also `Decimal`. [#15419](https://github.com/ClickHouse/ClickHouse/pull/15419) ([Mike](https://github.com/myrrc)).
* Expression `toUUID(N)` no longer works. Replace with `toUUID('00000000-0000-0000-0000-000000000000')`. This change is motivated by non-obvious results of `toUUID(N)` where N is non-zero.
* SSL Certificates with incorrect "key usage" are rejected. In previous versions they used to work. See [#19262](https://github.com/ClickHouse/ClickHouse/issues/19262).
* `incl` references to substitutions file (`/etc/metrika.xml`) were removed from the default config (`<remote_servers>`, `<zookeeper>`, `<macros>`, `<compression>`, `<networks>`). If you were using substitutions file and were relying on those implicit references, you should put them back manually and explicitly by adding corresponding sections with `incl="..."` attributes before the update. See [#18740](https://github.com/ClickHouse/ClickHouse/pull/18740) ([alexey-milovidov](https://github.com/alexey-milovidov)).
#### New Feature


@ -183,24 +183,20 @@ endif ()
# Make sure the final executable has symbols exported
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -rdynamic")
if (OS_LINUX)
find_program (OBJCOPY_PATH NAMES "llvm-objcopy" "llvm-objcopy-12" "llvm-objcopy-11" "llvm-objcopy-10" "llvm-objcopy-9" "llvm-objcopy-8" "objcopy")
if (OBJCOPY_PATH)
message(STATUS "Using objcopy: ${OBJCOPY_PATH}.")
if (ARCH_AMD64)
set(OBJCOPY_ARCH_OPTIONS -O elf64-x86-64 -B i386)
elseif (ARCH_AARCH64)
set(OBJCOPY_ARCH_OPTIONS -O elf64-aarch64 -B aarch64)
endif ()
else ()
message(FATAL_ERROR "Cannot find objcopy.")
endif ()
find_program (OBJCOPY_PATH NAMES "llvm-objcopy" "llvm-objcopy-12" "llvm-objcopy-11" "llvm-objcopy-10" "llvm-objcopy-9" "llvm-objcopy-8" "objcopy")
if (OBJCOPY_PATH)
message(STATUS "Using objcopy: ${OBJCOPY_PATH}.")
else ()
message(FATAL_ERROR "Cannot find objcopy.")
endif ()
if (OS_DARWIN)
set(WHOLE_ARCHIVE -all_load)
set(NO_WHOLE_ARCHIVE -noall_load)
# The `-all_load` flag forces loading of all symbols from all libraries,
# and leads to multiply-defined symbols. This flag allows force loading
# from a _specific_ library, which is what we need.
set(WHOLE_ARCHIVE -force_load)
# The `-noall_load` flag is the default and now obsolete.
set(NO_WHOLE_ARCHIVE "")
else ()
set(WHOLE_ARCHIVE --whole-archive)
set(NO_WHOLE_ARCHIVE --no-whole-archive)
@ -528,7 +524,6 @@ include (cmake/find/libpqxx.cmake)
include (cmake/find/nuraft.cmake)
include (cmake/find/yaml-cpp.cmake)
if(NOT USE_INTERNAL_PARQUET_LIBRARY)
set (ENABLE_ORC OFF CACHE INTERNAL "")
endif()


@ -13,3 +13,6 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Code Browser](https://clickhouse.tech/codebrowser/html_report/ClickHouse/index.html) with syntax highlight and navigation.
* [Contacts](https://clickhouse.tech/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://clickhouse.tech/#meet) to meet Yandex ClickHouse team in person.
## Upcoming Events
* [China ClickHouse Community Meetup (online)](http://hdxu.cn/rhbfZ) on 26 June 2021.


@ -1,14 +1,22 @@
#include "IBridge.h"
#include <IO/ReadHelpers.h>
#include <boost/program_options.hpp>
#include <Poco/Net/NetException.h>
#include <Poco/Util/HelpFormatter.h>
#include <Common/StringUtils/StringUtils.h>
#include <Formats/registerFormats.h>
#include <common/logger_useful.h>
#include <common/range.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/SensitiveDataMasker.h>
#include <common/errnoToString.h>
#include <IO/ReadHelpers.h>
#include <Formats/registerFormats.h>
#include <Server/HTTP/HTTPServer.h>
#include <IO/WriteBufferFromFile.h>
#include <IO/WriteHelpers.h>
#include <sys/time.h>
#include <sys/resource.h>
#if USE_ODBC
# include <Poco/Data/ODBC/Connector.h>
@ -163,6 +171,31 @@ void IBridge::initialize(Application & self)
max_server_connections = config().getUInt("max-server-connections", 1024);
keep_alive_timeout = config().getUInt64("keep-alive-timeout", 10);
struct rlimit limit;
const UInt64 gb = 1024 * 1024 * 1024;
/// Set maximum RSS to 1 GiB.
limit.rlim_max = limit.rlim_cur = gb;
if (setrlimit(RLIMIT_RSS, &limit))
LOG_WARNING(log, "Unable to set maximum RSS to 1GB: {} (current rlim_cur={}, rlim_max={})",
errnoToString(errno), limit.rlim_cur, limit.rlim_max);
if (!getrlimit(RLIMIT_RSS, &limit))
LOG_INFO(log, "RSS limit: cur={}, max={}", limit.rlim_cur, limit.rlim_max);
try
{
const auto oom_score = toString(config().getUInt64("bridge_oom_score", 500));
WriteBufferFromFile buf("/proc/self/oom_score_adj");
buf.write(oom_score.data(), oom_score.size());
buf.close();
LOG_INFO(log, "OOM score is set to {}", oom_score);
}
catch (const Exception & e)
{
LOG_WARNING(log, "Failed to set OOM score, error: {}", e.what());
}
initializeTerminationAndSignalProcessing();
ServerApplication::initialize(self); // NOLINT
@ -214,7 +247,7 @@ int IBridge::main(const std::vector<std::string> & /*args*/)
server.stop();
for (size_t count : ext::range(1, 6))
for (size_t count : collections::range(1, 6))
{
if (server.currentConnections() == 0)
break;


@ -91,10 +91,12 @@ struct DecomposedFloat
/// Compare float with integer of arbitrary width (both signed and unsigned are supported). Assuming two's complement arithmetic.
/// This function is generic, big integers (128, 256 bit) are supported as well.
/// Infinities are compared correctly. NaNs are treated similarly to infinities, so they can be less than all numbers.
/// (note that we need total order)
/// Returns -1, 0 or 1.
template <typename Int>
int compare(Int rhs)
int compare(Int rhs) const
{
if (rhs == 0)
return sign();
@ -137,10 +139,11 @@ struct DecomposedFloat
if (normalized_exponent() >= static_cast<int16_t>(8 * sizeof(Int) - is_signed_v<Int>))
return is_negative() ? -1 : 1;
using UInt = make_unsigned_t<Int>;
using UInt = std::conditional_t<(sizeof(Int) > sizeof(typename Traits::UInt)), make_unsigned_t<Int>, typename Traits::UInt>;
UInt uint_rhs = rhs < 0 ? -rhs : rhs;
/// Smaller octave: abs(rhs) < abs(float)
/// FYI, TIL: octave is also called "binade", https://en.wikipedia.org/wiki/Binade
if (uint_rhs < (static_cast<UInt>(1) << normalized_exponent()))
return is_negative() ? -1 : 1;
@ -154,11 +157,11 @@ struct DecomposedFloat
bool large_and_always_integer = normalized_exponent() >= static_cast<int16_t>(Traits::mantissa_bits);
typename Traits::UInt a = large_and_always_integer
? mantissa() << (normalized_exponent() - Traits::mantissa_bits)
: mantissa() >> (Traits::mantissa_bits - normalized_exponent());
UInt a = large_and_always_integer
? static_cast<UInt>(mantissa()) << (normalized_exponent() - Traits::mantissa_bits)
: static_cast<UInt>(mantissa()) >> (Traits::mantissa_bits - normalized_exponent());
typename Traits::UInt b = uint_rhs - (static_cast<UInt>(1) << normalized_exponent());
UInt b = uint_rhs - (static_cast<UInt>(1) << normalized_exponent());
if (a < b)
return is_negative() ? 1 : -1;
@ -175,37 +178,37 @@ struct DecomposedFloat
template <typename Int>
bool equals(Int rhs)
bool equals(Int rhs) const
{
return compare(rhs) == 0;
}
template <typename Int>
bool notEquals(Int rhs)
bool notEquals(Int rhs) const
{
return compare(rhs) != 0;
}
template <typename Int>
bool less(Int rhs)
bool less(Int rhs) const
{
return compare(rhs) < 0;
}
template <typename Int>
bool greater(Int rhs)
bool greater(Int rhs) const
{
return compare(rhs) > 0;
}
template <typename Int>
bool lessOrEquals(Int rhs)
bool lessOrEquals(Int rhs) const
{
return compare(rhs) <= 0;
}
template <typename Int>
bool greaterOrEquals(Int rhs)
bool greaterOrEquals(Int rhs) const
{
return compare(rhs) >= 0;
}
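
For context on why these comparison helpers exist: casting a 64-bit (or wider) integer to a floating-point type before comparing loses precision, so a naive comparison through `double` can report equality for distinct values. A small standalone illustration in plain C++ (this sketch does not use `DecomposedFloat` itself):

```
#include <cstdint>
#include <cstdio>

int main()
{
    // 2^53 + 1 is not representable as a double, so it rounds down to 2^53.
    const int64_t x = (INT64_C(1) << 53) + 1;
    const double two_pow_53 = 9007199254740992.0;

    // Naive comparison through double reports "equal" even though x != 2^53.
    std::printf("naive: %d\n", static_cast<double>(x) == two_pow_53);  // prints 1
    // An exact comparison (what DecomposedFloat::compare provides for all widths)
    // must avoid converting the integer to the float type first.
    std::printf("exact: %d\n", x == INT64_C(9007199254740992));        // prints 0
    return 0;
}
```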


@ -1,6 +1,6 @@
#include <common/ReadlineLineReader.h>
#include <common/errnoToString.h>
#include <ext/scope_guard.h>
#include <common/scope_guard.h>
#include <errno.h>
#include <signal.h>


@ -3,7 +3,7 @@
#include <map>
#include <tuple>
#include <mutex>
#include <ext/function_traits.h>
#include <common/function_traits.h>
/** The simplest cache for a free function.
@ -32,10 +32,11 @@ public:
template <typename... Args>
Result operator() (Args &&... args)
{
Key key{std::forward<Args>(args)...};
{
std::lock_guard lock(mutex);
Key key{std::forward<Args>(args)...};
auto it = cache.find(key);
if (cache.end() != it)
@ -43,7 +44,7 @@ public:
}
/// The calculations themselves are not done under mutex.
Result res = f(std::forward<Args>(args)...);
Result res = std::apply(f, key);
{
std::lock_guard lock(mutex);
@ -57,11 +58,12 @@ public:
template <typename... Args>
void update(Args &&... args)
{
Result res = f(std::forward<Args>(args)...);
Key key{std::forward<Args>(args)...};
Result res = std::apply(f, key);
{
std::lock_guard lock(mutex);
Key key{std::forward<Args>(args)...};
cache[key] = std::move(res);
}
}
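
The change above builds the key tuple once, outside the lock, and reuses it both for the cache lookup and for invoking the function through `std::apply` (the previous code forwarded the arguments twice). A standalone sketch of that pattern, with illustrative names:

```
#include <cstdio>
#include <map>
#include <tuple>

static int slow_add(int a, int b) { return a + b; }  // stands in for the cached function

int main()
{
    std::map<std::tuple<int, int>, int> cache;

    auto cached_call = [&cache](auto... args)
    {
        std::tuple key{args...};                 // build the key once
        if (auto it = cache.find(key); it != cache.end())
            return it->second;
        int res = std::apply(slow_add, key);     // call the function with the key's arguments
        cache[key] = res;
        return res;
    };

    std::printf("%d\n", cached_call(2, 3));  // computed
    std::printf("%d\n", cached_call(2, 3));  // served from the cache
    return 0;
}
```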

base/common/arraySize.h (new file, 7 lines)

@ -0,0 +1,7 @@
#pragma once
#include <cstdlib>
/** \brief Returns number of elements in an automatic array. */
template <typename T, std::size_t N>
constexpr size_t arraySize(const T (&)[N]) noexcept { return N; }
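
This is a compile-time replacement for the `sizeof(a) / sizeof(a[0])` idiom; a minimal usage sketch (the include path `common/arraySize.h` is assumed):

```
#include <cstdio>
#include <common/arraySize.h>  // assumed include path of the header above

static constexpr int primes[] = {2, 3, 5, 7, 11};
static_assert(arraySize(primes) == 5, "N is deduced from the array type");

int main()
{
    std::printf("%zu\n", arraySize(primes));
    return 0;
}
```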

base/common/bit_cast.h (new file, 27 lines)

@ -0,0 +1,27 @@
#pragma once
#include <string.h>
#include <algorithm>
#include <type_traits>
/** \brief Returns value `from` converted to type `To` while retaining bit representation.
* `To` and `From` must satisfy `CopyConstructible`.
*/
template <typename To, typename From>
std::decay_t<To> bit_cast(const From & from)
{
To res {};
memcpy(static_cast<void*>(&res), &from, std::min(sizeof(res), sizeof(from)));
return res;
}
/** \brief Returns value `from` converted to type `To` while retaining bit representation.
* `To` and `From` must satisfy `CopyConstructible`.
*/
template <typename To, typename From>
std::decay_t<To> safe_bit_cast(const From & from)
{
static_assert(sizeof(To) == sizeof(From), "bit cast on types of different width");
return bit_cast<To, From>(from);
}
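
A minimal usage sketch for the new header (the include path `common/bit_cast.h` is assumed): reinterpreting the bits of a `float` as a `uint32_t`, which is the typical use case instead of a `reinterpret_cast` that would violate strict aliasing.

```
#include <cassert>
#include <cstdint>
#include <common/bit_cast.h>  // assumed include path of the header above

int main()
{
    const float f = 1.0f;
    // safe_bit_cast additionally checks at compile time that both types have the same size.
    const auto bits = safe_bit_cast<uint32_t>(f);
    assert(bits == 0x3f800000u);  // IEEE 754 single-precision representation of 1.0f
    return 0;
}
```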

base/common/chrono_io.h (new file, 46 lines)

@ -0,0 +1,46 @@
#pragma once
#include <chrono>
#include <string>
#include <sstream>
#include <cctz/time_zone.h>
inline std::string to_string(const std::time_t & time)
{
return cctz::format("%Y-%m-%d %H:%M:%S", std::chrono::system_clock::from_time_t(time), cctz::local_time_zone());
}
template <typename Clock, typename Duration = typename Clock::duration>
std::string to_string(const std::chrono::time_point<Clock, Duration> & tp)
{
// Don't use DateLUT because it shows weird characters for
// TimePoint::max(). I wish we could use C++20 format, but it's not
// there yet.
// return DateLUT::instance().timeToString(std::chrono::system_clock::to_time_t(tp));
auto in_time_t = std::chrono::system_clock::to_time_t(tp);
return to_string(in_time_t);
}
template <typename Rep, typename Period = std::ratio<1>>
std::string to_string(const std::chrono::duration<Rep, Period> & duration)
{
auto seconds_as_int = std::chrono::duration_cast<std::chrono::seconds>(duration);
if (seconds_as_int == duration)
return std::to_string(seconds_as_int.count()) + "s";
auto seconds_as_double = std::chrono::duration_cast<std::chrono::duration<double>>(duration);
return std::to_string(seconds_as_double.count()) + "s";
}
template <typename Clock, typename Duration = typename Clock::duration>
std::ostream & operator<<(std::ostream & o, const std::chrono::time_point<Clock, Duration> & tp)
{
return o << to_string(tp);
}
template <typename Rep, typename Period = std::ratio<1>>
std::ostream & operator<<(std::ostream & o, const std::chrono::duration<Rep, Period> & duration)
{
return o << to_string(duration);
}
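
A small usage sketch of these helpers (the include path `common/chrono_io.h` is assumed); whole-second durations print as an integer count, anything else falls back to a `double` count:

```
#include <chrono>
#include <iostream>
#include <common/chrono_io.h>  // assumed include path of the header above

int main()
{
    using namespace std::chrono_literals;

    std::cout << to_string(5s) << '\n';      // "5s"
    std::cout << to_string(1500ms) << '\n';  // "1.500000s"
    // operator<< defined above forwards to to_string for time points and durations.
    std::cout << std::chrono::system_clock::now() << '\n';
    return 0;
}
```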


@ -4,23 +4,42 @@
#include <string>
#include <boost/algorithm/string/replace.hpp>
std::string_view getResource(std::string_view name)
{
// Convert the resource file name into the form generated by `ld -r -b binary`.
std::string name_replaced(name);
std::replace(name_replaced.begin(), name_replaced.end(), '/', '_');
std::replace(name_replaced.begin(), name_replaced.end(), '-', '_');
std::replace(name_replaced.begin(), name_replaced.end(), '.', '_');
boost::replace_all(name_replaced, "+", "_PLUS_");
/// These are the names that are generated by "ld -r -b binary"
std::string symbol_name_data = "_binary_" + name_replaced + "_start";
std::string symbol_name_size = "_binary_" + name_replaced + "_size";
// In most `dlsym(3)` APIs, one passes the symbol name as it appears via
// something like `nm` or `objdump -t`. For example, a symbol `_foo` would be
// looked up with the string `"_foo"`.
//
// Apple's linker is confusingly different. The NOTES on the man page for
// `dlsym(3)` claim that one looks up the symbol with "the name used in C
// source code". In this example, that would mean using the string `"foo"`.
// This apparently applies even in the case where the symbol did not originate
// from C source, such as the embedded binary resource files used here. So
// the symbol name must not have a leading `_` on Apple platforms. It's not
// clear how this applies to other symbols, such as those which _have_ a leading
// underscore in them by design, many leading underscores, etc.
#if defined OS_DARWIN
std::string prefix = "binary_";
#else
std::string prefix = "_binary_";
#endif
std::string symbol_name_start = prefix + name_replaced + "_start";
std::string symbol_name_end = prefix + name_replaced + "_end";
const void * sym_data = dlsym(RTLD_DEFAULT, symbol_name_data.c_str());
const void * sym_size = dlsym(RTLD_DEFAULT, symbol_name_size.c_str());
const char* sym_start = reinterpret_cast<const char*>(dlsym(RTLD_DEFAULT, symbol_name_start.c_str()));
const char* sym_end = reinterpret_cast<const char*>(dlsym(RTLD_DEFAULT, symbol_name_end.c_str()));
if (sym_data && sym_size)
return { static_cast<const char *>(sym_data), unalignedLoad<size_t>(&sym_size) };
if (sym_start && sym_end)
{
auto resource_size = static_cast<size_t>(std::distance(sym_start, sym_end));
return { sym_start, resource_size };
}
return {};
}
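
Call sites are unaffected by this change; a minimal usage sketch (the declaration is assumed to live in `common/getResource.h`, and the resource name here is illustrative):

```
#include <cstdio>
#include <string_view>

// Assumed declaration from common/getResource.h; the definition is the function above.
std::string_view getResource(std::string_view name);

int main()
{
    // Resolves the _binary_config_xml_start/_end symbols (binary_... on Darwin)
    // that the build embeds for a resource named "config.xml".
    const std::string_view config = getResource("config.xml");
    if (config.empty())
        std::puts("resource is not embedded in this binary");
    else
        std::printf("resource size: %zu bytes\n", config.size());
    return 0;
}
```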

base/common/map.h (new file, 52 lines)

@ -0,0 +1,52 @@
#pragma once
#include <type_traits>
#include <boost/iterator/transform_iterator.hpp>
namespace collections
{
/// \brief Strip type off top level reference and cv-qualifiers thus allowing storage in containers
template <typename T>
using unqualified_t = std::remove_cv_t<std::remove_reference_t<T>>;
/** \brief Returns collection of the same container-type as the input collection,
* with each element transformed by the application of `mapper`.
*/
template <template <typename...> class Collection, typename... Params, typename Mapper>
auto map(const Collection<Params...> & collection, Mapper && mapper)
{
using value_type = unqualified_t<decltype(mapper(*std::begin(collection)))>;
return Collection<value_type>(
boost::make_transform_iterator(std::begin(collection), std::forward<Mapper>(mapper)),
boost::make_transform_iterator(std::end(collection), std::forward<Mapper>(mapper)));
}
/** \brief Returns collection of specified container-type,
* with each element transformed by the application of `mapper`.
* Allows conversion between different container-types, e.g. std::vector to std::list
*/
template <template <typename...> class ResultCollection, typename Collection, typename Mapper>
auto map(const Collection & collection, Mapper && mapper)
{
using value_type = unqualified_t<decltype(mapper(*std::begin(collection)))>;
return ResultCollection<value_type>(
boost::make_transform_iterator(std::begin(collection), std::forward<Mapper>(mapper)),
boost::make_transform_iterator(std::end(collection), std::forward<Mapper>(mapper)));
}
/** \brief Returns collection of specified type,
* with each element transformed by the application of `mapper`.
* Allows leveraging implicit conversion between the result of applying `mapper` and R::value_type.
*/
template <typename ResultCollection, typename Collection, typename Mapper>
auto map(const Collection & collection, Mapper && mapper)
{
return ResultCollection(
boost::make_transform_iterator(std::begin(collection), std::forward<Mapper>(mapper)),
boost::make_transform_iterator(std::end(collection), std::forward<Mapper>(mapper)));
}
}
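
A usage sketch covering the three overloads above: same container template, different container template, and an explicit result type (names and values are illustrative):

```
#include <list>
#include <string>
#include <vector>
#include <common/map.h>  // assumed include path of the header above

int main()
{
    const std::vector<int> v{1, 2, 3};

    // 1. Same container template: std::vector<int> -> std::vector<int>.
    auto doubled = collections::map(v, [](int x) { return x * 2; });

    // 2. Different container template: std::vector<int> -> std::list<double>.
    auto halves = collections::map<std::list>(v, [](int x) { return x / 2.0; });

    // 3. Explicit result type, relying on implicit conversion to its value_type.
    auto strings = collections::map<std::vector<std::string>>(v, [](int x) { return std::to_string(x); });

    return doubled.size() == 3 && halves.size() == 3 && strings.size() == 3 ? 0 : 1;
}
```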


@ -4,9 +4,9 @@
#include <boost/range/adaptor/transformed.hpp>
#include <type_traits>
namespace ext
namespace collections
{
namespace internal
{
template <typename ResultType, typename CountingType, typename BeginType, typename EndType>
@ -24,11 +24,11 @@ namespace internal
/// For loop adaptor which is used to iterate through a half-closed interval [begin, end).
/// The parameters `begin` and `end` can have any integral or enum types.
template <typename BeginType,
typename EndType,
typename = std::enable_if_t<
(std::is_integral_v<BeginType> || std::is_enum_v<BeginType>) &&
(std::is_integral_v<EndType> || std::is_enum_v<EndType>) &&
(!std::is_enum_v<BeginType> || !std::is_enum_v<EndType> || std::is_same_v<BeginType, EndType>), void>>
typename EndType,
typename = std::enable_if_t<
(std::is_integral_v<BeginType> || std::is_enum_v<BeginType>) &&
(std::is_integral_v<EndType> || std::is_enum_v<EndType>) &&
(!std::is_enum_v<BeginType> || !std::is_enum_v<EndType> || std::is_same_v<BeginType, EndType>), void>>
inline auto range(BeginType begin, EndType end)
{
if constexpr (std::is_integral_v<BeginType> && std::is_integral_v<EndType>)
@ -51,7 +51,7 @@ inline auto range(BeginType begin, EndType end)
/// The parameter `end` can have any integral or enum type.
/// The same as range(0, end).
template <typename Type,
typename = std::enable_if_t<std::is_integral_v<Type> || std::is_enum_v<Type>, void>>
typename = std::enable_if_t<std::is_integral_v<Type> || std::is_enum_v<Type>, void>>
inline auto range(Type end)
{
if constexpr (std::is_integral_v<Type>)
@ -59,4 +59,5 @@ inline auto range(Type end)
else
return internal::rangeImpl<Type, std::underlying_type_t<Type>>(0, end);
}
}
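
Call sites in this commit switch from `ext::range` to `collections::range`; a minimal sketch of the two forms (the include path `common/range.h` is assumed):

```
#include <cstdio>
#include <common/range.h>  // assumed include path of the header above

enum class Color : int { Red = 0, Green, Blue, Count };

int main()
{
    // Half-closed interval [1, 6), the same loop as in IBridge.cpp above.
    for (size_t count : collections::range(1, 6))
        std::printf("attempt %zu\n", count);

    // Single-argument form is equivalent to range(0, end); enum types are also allowed.
    for (auto color : collections::range(Color::Count))
        std::printf("color %d\n", static_cast<int>(color));

    return 0;
}
```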


@ -4,9 +4,6 @@
#include <memory>
#include <utility>
namespace ext
{
template <class F>
class [[nodiscard]] basic_scope_guard
{
@ -105,10 +102,9 @@ using scope_guard = basic_scope_guard<std::function<void(void)>>;
template <class F>
inline basic_scope_guard<F> make_scope_guard(F && function_) { return std::forward<F>(function_); }
}
#define SCOPE_EXIT_CONCAT(n, ...) \
const auto scope_exit##n = ext::make_scope_guard([&] { __VA_ARGS__; })
const auto scope_exit##n = make_scope_guard([&] { __VA_ARGS__; })
#define SCOPE_EXIT_FWD(n, ...) SCOPE_EXIT_CONCAT(n, __VA_ARGS__)
#define SCOPE_EXIT(...) SCOPE_EXIT_FWD(__LINE__, __VA_ARGS__)
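
`SCOPE_EXIT` usage is unchanged by the move out of `namespace ext`; the header is now included as `common/scope_guard.h`, as the other diffs in this commit show. A minimal sketch:

```
#include <cstdio>
#include <common/scope_guard.h>

int main()
{
    std::FILE * f = std::fopen("/tmp/scope_guard_example.txt", "w");
    if (!f)
        return 1;
    // The guard runs at scope exit no matter how the scope is left.
    SCOPE_EXIT({ std::fclose(f); });

    std::fputs("hello\n", f);
    return 0;
}
```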


@ -1,6 +1,6 @@
#pragma once
#include <ext/scope_guard.h>
#include <common/scope_guard.h>
#include <common/logger_useful.h>
#include <Common/MemoryTracker.h>


@ -2,8 +2,6 @@
#include <memory>
namespace ext
{
/** Allows to make std::shared_ptr from T with protected constructor.
*
@ -36,4 +34,3 @@ struct is_shared_ptr<std::shared_ptr<T>>
template <typename T>
inline constexpr bool is_shared_ptr_v = is_shared_ptr<T>::value;
}


@ -109,10 +109,7 @@ public:
constexpr explicit operator bool() const noexcept;
template <class T>
using _integral_not_wide_integer_class = typename std::enable_if<std::is_arithmetic<T>::value, T>::type;
template <class T, class = _integral_not_wide_integer_class<T>>
template <typename T, typename = std::enable_if_t<std::is_arithmetic_v<T>, T>>
constexpr operator T() const noexcept;
constexpr operator long double() const noexcept;


@ -255,13 +255,13 @@ struct integer<Bits, Signed>::_impl
set_multiplier<double>(self, alpha);
self *= max_int;
self += static_cast<uint64_t>(t - alpha * static_cast<T>(max_int)); // += b_i
self += static_cast<uint64_t>(t - floor(alpha) * static_cast<T>(max_int)); // += b_i
}
constexpr static void wide_integer_from_builtin(integer<Bits, Signed>& self, double rhs) noexcept
constexpr static void wide_integer_from_builtin(integer<Bits, Signed> & self, double rhs) noexcept
{
constexpr int64_t max_int = std::numeric_limits<int64_t>::max();
constexpr int64_t min_int = std::numeric_limits<int64_t>::min();
constexpr int64_t min_int = std::numeric_limits<int64_t>::lowest();
/// There are values in int64 that have more than 53 significant bits (in terms of double
/// representation). Such values, being promoted to double, are rounded up or down. If they are rounded up,
@ -271,14 +271,14 @@ struct integer<Bits, Signed>::_impl
/// The necessary check here is that long double has enough significant (mantissa) bits to store the
/// int64_t max value precisely.
//TODO Be compatible with Apple aarch64
// TODO Be compatible with Apple aarch64
#if not (defined(__APPLE__) && defined(__aarch64__))
static_assert(LDBL_MANT_DIG >= 64,
"On your system long double has less than 64 precision bits,"
"On your system long double has less than 64 precision bits, "
"which may result in UB when initializing double from int64_t");
#endif
if ((rhs > 0 && rhs < static_cast<long double>(max_int)) || (rhs < 0 && rhs > static_cast<long double>(min_int)))
if (rhs > static_cast<long double>(min_int) && rhs < static_cast<long double>(max_int))
{
self = static_cast<int64_t>(rhs);
return;


@ -21,13 +21,11 @@
#include <fstream>
#include <sstream>
#include <memory>
#include <ext/scope_guard.h>
#include <common/scope_guard.h>
#include <Poco/Observer.h>
#include <Poco/AutoPtr.h>
#include <Poco/PatternFormatter.h>
#include <Poco/File.h>
#include <Poco/Path.h>
#include <Poco/Message.h>
#include <Poco/Util/Application.h>
#include <Poco/Exception.h>
@ -59,6 +57,7 @@
#include <Common/getExecutablePath.h>
#include <Common/getHashOfLoadedBinary.h>
#include <Common/Elf.h>
#include <filesystem>
#if !defined(ARCADIA_BUILD)
# include <Common/config_version.h>
@ -70,6 +69,7 @@
#endif
#include <ucontext.h>
namespace fs = std::filesystem;
DB::PipeFDs signal_pipe;
@ -437,11 +437,11 @@ static void sanitizerDeathCallback()
static std::string createDirectory(const std::string & file)
{
auto path = Poco::Path(file).makeParent();
if (path.toString().empty())
fs::path path = fs::path(file).parent_path();
if (path.empty())
return "";
Poco::File(path).createDirectories();
return path.toString();
fs::create_directories(path);
return path;
};
@ -449,7 +449,7 @@ static bool tryCreateDirectories(Poco::Logger * logger, const std::string & path
{
try
{
Poco::File(path).createDirectories();
fs::create_directories(path);
return true;
}
catch (...)
@ -470,7 +470,7 @@ void BaseDaemon::reloadConfiguration()
*/
config_path = config().getString("config-file", getDefaultConfigFileName());
DB::ConfigProcessor config_processor(config_path, false, true);
config_processor.setConfigPath(Poco::Path(config_path).makeParent().toString());
config_processor.setConfigPath(fs::path(config_path).parent_path());
loaded_config = config_processor.loadConfig(/* allow_zk_includes = */ true);
if (last_configuration != nullptr)
@ -524,18 +524,20 @@ std::string BaseDaemon::getDefaultConfigFileName() const
void BaseDaemon::closeFDs()
{
#if defined(OS_FREEBSD) || defined(OS_DARWIN)
Poco::File proc_path{"/dev/fd"};
fs::path proc_path{"/dev/fd"};
#else
Poco::File proc_path{"/proc/self/fd"};
fs::path proc_path{"/proc/self/fd"};
#endif
if (proc_path.isDirectory()) /// Hooray, proc exists
if (fs::is_directory(proc_path)) /// Hooray, proc exists
{
std::vector<std::string> fds;
/// in /proc/self/fd directory filenames are numeric file descriptors
proc_path.list(fds);
for (const auto & fd_str : fds)
/// in /proc/self/fd directory filenames are numeric file descriptors.
/// Iterate directory separately from closing fds to avoid closing iterated directory fd.
std::vector<int> fds;
for (const auto & path : fs::directory_iterator(proc_path))
fds.push_back(DB::parse<int>(path.path().filename()));
for (const auto & fd : fds)
{
int fd = DB::parse<int>(fd_str);
if (fd > 2 && fd != signal_pipe.fds_rw[0] && fd != signal_pipe.fds_rw[1])
::close(fd);
}
@ -597,7 +599,7 @@ void BaseDaemon::initialize(Application & self)
{
/** When creating pid file and looking for config, will search for paths relative to the working path of the program when started.
*/
std::string path = Poco::Path(config().getString("application.path")).setFileName("").toString();
std::string path = fs::path(config().getString("application.path")).replace_filename("");
if (0 != chdir(path.c_str()))
throw Poco::Exception("Cannot change directory to " + path);
}
@ -645,7 +647,7 @@ void BaseDaemon::initialize(Application & self)
std::string log_path = config().getString("logger.log", "");
if (!log_path.empty())
log_path = Poco::Path(log_path).setFileName("").toString();
log_path = fs::path(log_path).replace_filename("");
/** Redirect stdout, stderr to separate files in the log directory (or in the specified file).
* Some libraries write to stderr in case of errors in debug mode,
@ -708,8 +710,7 @@ void BaseDaemon::initialize(Application & self)
tryCreateDirectories(&logger(), core_path);
Poco::File cores = core_path;
if (!(cores.exists() && cores.isDirectory()))
if (!(fs::exists(core_path) && fs::is_directory(core_path)))
{
core_path = !log_path.empty() ? log_path : "/opt/";
tryCreateDirectories(&logger(), core_path);


@ -1,6 +1,5 @@
#include <daemon/SentryWriter.h>
#include <Poco/File.h>
#include <Poco/Util/Application.h>
#include <Poco/Util/LayeredConfiguration.h>
@ -25,6 +24,7 @@
# include <stdio.h>
# include <filesystem>
namespace fs = std::filesystem;
namespace
{
@ -53,8 +53,7 @@ void setExtras()
sentry_set_extra("physical_cpu_cores", sentry_value_new_int32(getNumberOfPhysicalCPUCores()));
if (!server_data_path.empty())
sentry_set_extra("disk_free_space", sentry_value_new_string(formatReadableSizeWithBinarySuffix(
Poco::File(server_data_path).freeSpace()).c_str()));
sentry_set_extra("disk_free_space", sentry_value_new_string(formatReadableSizeWithBinarySuffix(fs::space(server_data_path).free).c_str()));
}
void sentry_logger(sentry_level_e level, const char * message, va_list args, void *)
@ -110,12 +109,12 @@ void SentryWriter::initialize(Poco::Util::LayeredConfiguration & config)
if (enabled)
{
server_data_path = config.getString("path", "");
const std::filesystem::path & default_tmp_path = std::filesystem::path(config.getString("tmp_path", Poco::Path::temp())) / "sentry";
const std::filesystem::path & default_tmp_path = fs::path(config.getString("tmp_path", fs::temp_directory_path())) / "sentry";
const std::string & endpoint
= config.getString("send_crash_reports.endpoint");
const std::string & temp_folder_path
= config.getString("send_crash_reports.tmp_path", default_tmp_path);
Poco::File(temp_folder_path).createDirectories();
fs::create_directories(temp_folder_path);
sentry_options_t * options = sentry_options_new(); /// will be freed by sentry_init or sentry_shutdown
sentry_options_set_release(options, VERSION_STRING_SHORT);

View File

@ -1,30 +0,0 @@
#pragma once
#include <string.h>
#include <algorithm>
#include <type_traits>
namespace ext
{
/** \brief Returns value `from` converted to type `To` while retaining bit representation.
* `To` and `From` must satisfy `CopyConstructible`.
*/
template <typename To, typename From>
std::decay_t<To> bit_cast(const From & from)
{
To res {};
memcpy(static_cast<void*>(&res), &from, std::min(sizeof(res), sizeof(from)));
return res;
}
/** \brief Returns value `from` converted to type `To` while retaining bit representation.
* `To` and `From` must satisfy `CopyConstructible`.
*/
template <typename To, typename From>
std::decay_t<To> safe_bit_cast(const From & from)
{
static_assert(sizeof(To) == sizeof(From), "bit cast on types of different width");
return bit_cast<To, From>(from);
}
}
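
The removed `bit_cast` helper above reinterprets an object's bytes as another type via `memcpy`, which avoids undefined behaviour from pointer punning. A minimal standalone sketch of the same idea (a simplified copy for illustration, not part of this patch):

```
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

/// Simplified version of the removed ext::bit_cast: copy the bytes of `from`
/// into a value-initialized `To`, copying at most the smaller of the two sizes.
template <typename To, typename From>
To bit_cast(const From & from)
{
    To res{};
    memcpy(static_cast<void *>(&res), &from, std::min(sizeof(res), sizeof(from)));
    return res;
}

int main()
{
    float f = 1.0f;
    auto bits = bit_cast<uint32_t>(f);
    assert(bits == 0x3F800000u); /// IEEE 754 representation of 1.0f
    return 0;
}
```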

View File

@ -1,49 +0,0 @@
#pragma once
#include <chrono>
#include <string>
#include <sstream>
#include <cctz/time_zone.h>
namespace ext
{
inline std::string to_string(const std::time_t & time)
{
return cctz::format("%Y-%m-%d %H:%M:%S", std::chrono::system_clock::from_time_t(time), cctz::local_time_zone());
}
template <typename Clock, typename Duration = typename Clock::duration>
std::string to_string(const std::chrono::time_point<Clock, Duration> & tp)
{
// Don't use DateLUT because it shows weird characters for
// TimePoint::max(). I wish we could use C++20 format, but it's not
// there yet.
// return DateLUT::instance().timeToString(std::chrono::system_clock::to_time_t(tp));
auto in_time_t = std::chrono::system_clock::to_time_t(tp);
return to_string(in_time_t);
}
template <typename Rep, typename Period = std::ratio<1>>
std::string to_string(const std::chrono::duration<Rep, Period> & duration)
{
auto seconds_as_int = std::chrono::duration_cast<std::chrono::seconds>(duration);
if (seconds_as_int == duration)
return std::to_string(seconds_as_int.count()) + "s";
auto seconds_as_double = std::chrono::duration_cast<std::chrono::duration<double>>(duration);
return std::to_string(seconds_as_double.count()) + "s";
}
template <typename Clock, typename Duration = typename Clock::duration>
std::ostream & operator<<(std::ostream & o, const std::chrono::time_point<Clock, Duration> & tp)
{
return o << to_string(tp);
}
template <typename Rep, typename Period = std::ratio<1>>
std::ostream & operator<<(std::ostream & o, const std::chrono::duration<Rep, Period> & duration)
{
return o << to_string(duration);
}
}
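
The removed `to_string` overloads above format time points and durations; a duration that is a whole number of seconds is printed as an integer, anything else as a fractional value, both with an `s` suffix. A standalone sketch of the duration case (a simplified copy, not part of this patch):

```
#include <chrono>
#include <iostream>
#include <string>

/// Simplified version of the removed ext::to_string(duration).
template <typename Rep, typename Period>
std::string to_string(const std::chrono::duration<Rep, Period> & duration)
{
    auto seconds_as_int = std::chrono::duration_cast<std::chrono::seconds>(duration);
    if (seconds_as_int == duration)
        return std::to_string(seconds_as_int.count()) + "s";
    auto seconds_as_double = std::chrono::duration_cast<std::chrono::duration<double>>(duration);
    return std::to_string(seconds_as_double.count()) + "s";
}

int main()
{
    std::cout << to_string(std::chrono::minutes(2)) << '\n';          /// 120s
    std::cout << to_string(std::chrono::milliseconds(1500)) << '\n';  /// 1.500000s
    return 0;
}
```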

View File

@ -1,24 +0,0 @@
#pragma once
#include <iterator>
namespace ext
{
/** \brief Returns collection of specified container-type.
* Retains stored value_type, constructs resulting collection using iterator range. */
template <template <typename...> class ResultCollection, typename Collection>
auto collection_cast(const Collection & collection)
{
using value_type = typename Collection::value_type;
return ResultCollection<value_type>(std::begin(collection), std::end(collection));
}
/** \brief Returns collection of specified type.
* Performs implicit conversion between source and result value_type, if available and required. */
template <typename ResultCollection, typename Collection>
auto collection_cast(const Collection & collection)
{
return ResultCollection(std::begin(collection), std::end(collection));
}
}
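
The removed `collection_cast` above rebuilds a collection of a different container type from the iterator range of the source. A hypothetical standalone usage sketch (not part of this patch):

```
#include <iostream>
#include <iterator>
#include <set>
#include <vector>

/// Simplified version of the removed ext::collection_cast<ResultCollection>(collection).
template <template <typename...> class ResultCollection, typename Collection>
auto collection_cast(const Collection & collection)
{
    using value_type = typename Collection::value_type;
    return ResultCollection<value_type>(std::begin(collection), std::end(collection));
}

int main()
{
    std::vector<int> values{3, 1, 2, 3};
    auto unique_sorted = collection_cast<std::set>(values); /// std::set<int>{1, 2, 3}
    std::cout << unique_sorted.size() << '\n';              /// 3
    return 0;
}
```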

View File

@ -1,60 +0,0 @@
#pragma once
#include <ext/size.h>
#include <type_traits>
#include <utility>
#include <iterator>
/** \brief Provides a wrapper view around a container, allowing iteration over its elements and indices.
* Allows writing code like shown below:
*
* std::vector<T> v = getVector();
* for (const std::pair<const std::size_t, T &> index_and_value : ext::enumerate(v))
* std::cout << "element " << index_and_value.first << " is " << index_and_value.second << std::endl;
*/
namespace ext
{
template <typename It> struct enumerate_iterator
{
using traits = typename std::iterator_traits<It>;
using iterator_category = typename traits::iterator_category;
using value_type = std::pair<const std::size_t, typename traits::value_type>;
using difference_type = typename traits::difference_type;
using reference = std::pair<const std::size_t, typename traits::reference>;
std::size_t idx;
It it;
enumerate_iterator(const std::size_t idx_, It it_) : idx{idx_}, it{it_} {}
auto operator*() const { return reference(idx, *it); }
bool operator!=(const enumerate_iterator & other) const { return it != other.it; }
enumerate_iterator & operator++() { return ++idx, ++it, *this; }
};
template <typename Collection> struct enumerate_wrapper
{
using underlying_iterator = decltype(std::begin(std::declval<Collection &>()));
using iterator = enumerate_iterator<underlying_iterator>;
Collection & collection;
enumerate_wrapper(Collection & collection_) : collection(collection_) {}
auto begin() { return iterator(0, std::begin(collection)); }
auto end() { return iterator(ext::size(collection), std::end(collection)); }
};
template <typename Collection> auto enumerate(Collection & collection)
{
return enumerate_wrapper<Collection>{collection};
}
template <typename Collection> auto enumerate(const Collection & collection)
{
return enumerate_wrapper<const Collection>{collection};
}
}

View File

@ -1,24 +0,0 @@
#pragma once
#include <utility>
namespace ext
{
/// \brief Identity function for use with other algorithms as a pass-through.
class identity
{
/** \brief Function pointer type template for converting identity to a function pointer.
* Presumably useless, provided for completeness. */
template <typename T> using function_ptr_t = T &&(*)(T &&);
/** \brief Implementation of identity as a non-instance member function for taking function pointer. */
template <typename T> static T && invoke(T && t) { return std::forward<T>(t); }
public:
/** \brief Returns the value passed as a sole argument using perfect forwarding. */
template <typename T> T && operator()(T && t) const { return std::forward<T>(t); }
/** \brief Allows conversion of identity instance to a function pointer. */
template <typename T> operator function_ptr_t<T>() const { return &invoke; };
};
}
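
The removed `identity` functor above simply forwards its argument, which is handy as a default projection or pass-through callable. A hypothetical standalone usage sketch (not part of this patch):

```
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

/// Simplified version of the removed ext::identity: perfectly forwards its argument.
struct identity
{
    template <typename T>
    T && operator()(T && t) const { return std::forward<T>(t); }
};

int main()
{
    std::vector<std::string> words{"b", "a", "c"};
    std::vector<std::string> copy;
    /// Used as a pass-through transformation: elements are copied unchanged.
    std::transform(words.begin(), words.end(), std::back_inserter(copy), identity{});
    std::cout << copy[0] << copy[1] << copy[2] << '\n'; /// bac
    return 0;
}
```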

View File

@ -1,43 +0,0 @@
#pragma once
#include <utility>
#include <type_traits>
#include <array>
/** \brief Produces std::array of specified size, containing copies of provided object.
* Copy is performed N-1 times, and the last element is moved.
* This helper allows initializing a std::array in place.
*/
namespace ext
{
namespace detail
{
template<std::size_t size, typename T, std::size_t... indexes>
constexpr auto make_array_n_impl(T && value, std::index_sequence<indexes...>)
{
/// Comma is used to make N-1 copies of value
return std::array<std::decay_t<T>, size>{ (static_cast<void>(indexes), value)..., std::forward<T>(value) };
}
}
template<typename T>
constexpr auto make_array_n(std::integral_constant<std::size_t, 0>, T &&)
{
return std::array<std::decay_t<T>, 0>{};
}
template<std::size_t size, typename T>
constexpr auto make_array_n(std::integral_constant<std::size_t, size>, T && value)
{
return detail::make_array_n_impl<size>(std::forward<T>(value), std::make_index_sequence<size - 1>{});
}
template<std::size_t size, typename T>
constexpr auto make_array_n(T && value)
{
return make_array_n(std::integral_constant<std::size_t, size>{}, std::forward<T>(value));
}
}
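
The removed `make_array_n` above builds a `std::array` filled with copies of a single value, copying it N-1 times and moving it into the last slot. A simplified standalone sketch (not part of this patch):

```
#include <array>
#include <iostream>
#include <string>
#include <type_traits>
#include <utility>

/// Simplified version of the removed ext::make_array_n<N>(value).
template <std::size_t size, typename T, std::size_t... indexes>
constexpr auto make_array_n_impl(T && value, std::index_sequence<indexes...>)
{
    /// The comma operator discards the index and yields `value`, producing N-1 copies;
    /// the last element reuses the original value by moving it.
    return std::array<std::decay_t<T>, size>{ (static_cast<void>(indexes), value)..., std::forward<T>(value) };
}

template <std::size_t size, typename T>
constexpr auto make_array_n(T && value)
{
    return make_array_n_impl<size>(std::forward<T>(value), std::make_index_sequence<size - 1>{});
}

int main()
{
    auto arr = make_array_n<3>(std::string("x"));    /// {"x", "x", "x"}
    std::cout << arr[0] << arr[1] << arr[2] << '\n'; /// xxx
    return 0;
}
```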

View File

@ -1,51 +0,0 @@
#pragma once
#include <type_traits>
#include <boost/iterator/transform_iterator.hpp>
namespace ext
{
/// \brief Strip type off top level reference and cv-qualifiers thus allowing storage in containers
template <typename T>
using unqualified_t = std::remove_cv_t<std::remove_reference_t<T>>;
/** \brief Returns collection of the same container-type as the input collection,
* with each element transformed by the application of `mapper`.
*/
template <template <typename...> class Collection, typename... Params, typename Mapper>
auto map(const Collection<Params...> & collection, const Mapper mapper)
{
using value_type = unqualified_t<decltype(mapper(*std::begin(collection)))>;
return Collection<value_type>(
boost::make_transform_iterator(std::begin(collection), mapper),
boost::make_transform_iterator(std::end(collection), mapper));
}
/** \brief Returns collection of specified container-type,
* with each element transformed by the application of `mapper`.
* Allows conversion between different container-types, e.g. std::vector to std::list
*/
template <template <typename...> class ResultCollection, typename Collection, typename Mapper>
auto map(const Collection & collection, const Mapper mapper)
{
using value_type = unqualified_t<decltype(mapper(*std::begin(collection)))>;
return ResultCollection<value_type>(
boost::make_transform_iterator(std::begin(collection), mapper),
boost::make_transform_iterator(std::end(collection), mapper));
}
/** \brief Returns collection of specified type,
* with each element transformed by the application of `mapper`.
* Allows leveraging implicit conversion between the result of applying `mapper` and R::value_type.
*/
template <typename ResultCollection, typename Collection, typename Mapper>
auto map(const Collection & collection, const Mapper mapper)
{
return ResultCollection(
boost::make_transform_iterator(std::begin(collection), mapper),
boost::make_transform_iterator(std::end(collection), mapper));
}
}
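
The removed `ext::map` overloads above build a new collection whose elements are `mapper(x)` for each element of the input, using `boost::transform_iterator`. A standalone sketch of the same transformation with the standard library only (a hypothetical example, not part of this patch):

```
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> words{"one", "two", "three"};
    std::vector<std::size_t> lengths;
    /// Equivalent of ext::map(words, mapper): apply the mapper to every element
    /// while constructing the result collection.
    std::transform(words.begin(), words.end(), std::back_inserter(lengths),
                   [](const std::string & s) { return s.size(); });
    for (auto n : lengths)
        std::cout << n << ' '; /// 3 3 5
    std::cout << '\n';
    return 0;
}
```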

View File

@ -1,25 +0,0 @@
#pragma once
#include <vector>
namespace ext
{
/// Moves all arguments starting from the second to the end of the vector.
/// For example, `push_back(vec, a1, a2, a3)` is a more compact way to write
/// `vec.push_back(a1); vec.push_back(a2); vec.push_back(a3);`
/// This function is like boost::range::push_back() but works for noncopyable types too.
template <typename T>
void push_back(std::vector<T> &)
{
}
template <typename T, typename FirstArg, typename... OtherArgs>
void push_back(std::vector<T> & vec, FirstArg && first, OtherArgs &&... other)
{
vec.reserve(vec.size() + sizeof...(other) + 1);
vec.emplace_back(std::move(first));
push_back(vec, std::move(other)...);
}
}

View File

@ -1,14 +0,0 @@
#pragma once
#include <cstdlib>
namespace ext
{
/** \brief Returns number of elements in an automatic array. */
template <typename T, std::size_t N>
constexpr std::size_t size(const T (&)[N]) noexcept { return N; }
/** \brief Returns number of elements in a container providing a size() member function. */
template <typename T> constexpr auto size(const T & t) { return t.size(); }
}

View File

@ -1,27 +0,0 @@
#pragma once
namespace ext
{
template <typename T>
class unlock_guard
{
public:
unlock_guard(T & mutex_) : mutex(mutex_)
{
mutex.unlock();
}
~unlock_guard()
{
mutex.lock();
}
unlock_guard(const unlock_guard &) = delete;
unlock_guard & operator=(const unlock_guard &) = delete;
private:
T & mutex;
};
}
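
The removed `unlock_guard` above is the inverse of `std::lock_guard`: it unlocks an already-held mutex on entry and re-locks it on scope exit, which is useful when calling code that must not run under the lock. A minimal standalone sketch (a simplified copy plus hypothetical usage, not part of this patch):

```
#include <iostream>
#include <mutex>

/// Simplified version of the removed ext::unlock_guard.
template <typename T>
class unlock_guard
{
public:
    explicit unlock_guard(T & mutex_) : mutex(mutex_) { mutex.unlock(); }
    ~unlock_guard() { mutex.lock(); }
    unlock_guard(const unlock_guard &) = delete;
    unlock_guard & operator=(const unlock_guard &) = delete;
private:
    T & mutex;
};

std::mutex m;

void callback_that_must_not_run_under_lock() { std::cout << "called without holding the lock\n"; }

void worker()
{
    std::lock_guard<std::mutex> lock(m);
    /// ... work under the lock ...
    {
        unlock_guard<std::mutex> unlock(m); /// temporarily release the lock
        callback_that_must_not_run_under_lock();
    }
    /// The lock is held again here.
}

int main()
{
    worker();
    return 0;
}
```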

View File

@ -8,13 +8,6 @@
extern "C" {
#endif
#include <pthread.h>
size_t __pthread_get_minstack(const pthread_attr_t * attr)
{
return 1048576; /// This is a guess. Not sure it is correct.
}
#include <signal.h>
#include <unistd.h>
#include <string.h>
@ -141,6 +134,8 @@ int __open_2(const char *path, int oflag)
}
#include <pthread.h>
/// No-ops.
int pthread_setname_np(pthread_t thread, const char *name) { return 0; }
int pthread_getname_np(pthread_t thread, char *name, size_t len) { name[0] = '\0'; return 0; };

View File

@ -6,10 +6,11 @@
#include "OwnFormattingChannel.h"
#include "OwnPatternFormatter.h"
#include <Poco/ConsoleChannel.h>
#include <Poco/File.h>
#include <Poco/Logger.h>
#include <Poco/Net/RemoteSyslogChannel.h>
#include <Poco/Path.h>
#include <filesystem>
namespace fs = std::filesystem;
namespace DB
{
@ -20,11 +21,11 @@ namespace DB
// TODO: move to libcommon
static std::string createDirectory(const std::string & file)
{
auto path = Poco::Path(file).makeParent();
if (path.toString().empty())
auto path = fs::path(file).parent_path();
if (path.empty())
return "";
Poco::File(path).createDirectories();
return path.toString();
fs::create_directories(path);
return path;
};
void Loggers::setTextLog(std::shared_ptr<DB::TextLog> log, int max_priority)
@ -70,7 +71,7 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
// Set up two channel chains.
log_file = new Poco::FileChannel;
log_file->setProperty(Poco::FileChannel::PROP_PATH, Poco::Path(log_path).absolute().toString());
log_file->setProperty(Poco::FileChannel::PROP_PATH, fs::weakly_canonical(log_path));
log_file->setProperty(Poco::FileChannel::PROP_ROTATION, config.getRawString("logger.size", "100M"));
log_file->setProperty(Poco::FileChannel::PROP_ARCHIVE, "number");
log_file->setProperty(Poco::FileChannel::PROP_COMPRESS, config.getRawString("logger.compress", "true"));
@ -102,7 +103,7 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
std::cerr << "Logging errors to " << errorlog_path << std::endl;
error_log_file = new Poco::FileChannel;
error_log_file->setProperty(Poco::FileChannel::PROP_PATH, Poco::Path(errorlog_path).absolute().toString());
error_log_file->setProperty(Poco::FileChannel::PROP_PATH, fs::weakly_canonical(errorlog_path));
error_log_file->setProperty(Poco::FileChannel::PROP_ROTATION, config.getRawString("logger.size", "100M"));
error_log_file->setProperty(Poco::FileChannel::PROP_ARCHIVE, "number");
error_log_file->setProperty(Poco::FileChannel::PROP_COMPRESS, config.getRawString("logger.compress", "true"));

View File

@ -4,12 +4,14 @@
#include <Core/Block.h>
#include <Interpreters/InternalTextLogsQueue.h>
#include <Interpreters/TextLog.h>
#include <IO/WriteBufferFromFileDescriptor.h>
#include <sys/time.h>
#include <Poco/Message.h>
#include <Common/CurrentThread.h>
#include <Common/DNSResolver.h>
#include <common/getThreadId.h>
#include <Common/SensitiveDataMasker.h>
#include <Common/IO.h>
namespace DB
{
@ -26,16 +28,48 @@ void OwnSplitChannel::log(const Poco::Message & msg)
auto matches = masker->wipeSensitiveData(message_text);
if (matches > 0)
{
logSplit({msg, message_text}); // we will continue with the copy of original message with text modified
tryLogSplit({msg, message_text}); // we will continue with the copy of original message with text modified
return;
}
}
logSplit(msg);
tryLogSplit(msg);
}
void OwnSplitChannel::tryLogSplit(const Poco::Message & msg)
{
try
{
logSplit(msg);
}
/// It is better to catch the errors here in order to avoid
/// breaking some functionality because of an unexpected "File not
/// found" (or similar) error.
///
/// For example, StorageDistributedDirectoryMonitor would mark the batch
/// as broken, and some MergeTree code can also be affected.
///
/// Also note that we cannot log the exception here using the regular
/// tryLogCurrentException(), since this would lead to recursion;
/// but let's at least log it to stderr.
catch (...)
{
MemoryTracker::LockExceptionInThread lock_memory_tracker(VariableContext::Global);
const std::string & exception_message = getCurrentExceptionMessage(true);
const std::string & message = msg.getText();
/// NOTE: errors are ignored, since nothing can be done.
writeRetry(STDERR_FILENO, "Cannot add message to the log: ");
writeRetry(STDERR_FILENO, message.data(), message.size());
writeRetry(STDERR_FILENO, "\n");
writeRetry(STDERR_FILENO, exception_message.data(), exception_message.size());
writeRetry(STDERR_FILENO, "\n");
}
}
void OwnSplitChannel::logSplit(const Poco::Message & msg)
{
ExtendedLogMessage msg_ext = ExtendedLogMessage::getFrom(msg);

View File

@ -24,6 +24,7 @@ public:
private:
void logSplit(const Poco::Message & msg);
void tryLogSplit(const Poco::Message & msg);
using ChannelPtr = Poco::AutoPtr<Poco::Channel>;
/// Handler and its pointer casted to extended interface

View File

@ -2,7 +2,7 @@
#include <errmsg.h>
#include <mysql.h>
#else
#include <mysql/errmsg.h>
#include <mysql/errmsg.h> //Y_IGNORE
#include <mysql/mysql.h>
#endif

39
base/mysqlxx/ya.make Normal file
View File

@ -0,0 +1,39 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
LIBRARY()
OWNER(g:clickhouse)
CFLAGS(-g0)
PEERDIR(
contrib/restricted/boost/libs
contrib/libs/libmysql_r
contrib/libs/poco/Foundation
contrib/libs/poco/Util
)
ADDINCL(
GLOBAL clickhouse/base
clickhouse/base
contrib/libs/libmysql_r
)
NO_COMPILER_WARNINGS()
NO_UTIL()
SRCS(
Connection.cpp
Exception.cpp
Pool.cpp
PoolFactory.cpp
PoolWithFailover.cpp
Query.cpp
ResultBase.cpp
Row.cpp
UseQueryResult.cpp
Value.cpp
)
END()

28
base/mysqlxx/ya.make.in Normal file
View File

@ -0,0 +1,28 @@
LIBRARY()
OWNER(g:clickhouse)
CFLAGS(-g0)
PEERDIR(
contrib/restricted/boost/libs
contrib/libs/libmysql_r
contrib/libs/poco/Foundation
contrib/libs/poco/Util
)
ADDINCL(
GLOBAL clickhouse/base
clickhouse/base
contrib/libs/libmysql_r
)
NO_COMPILER_WARNINGS()
NO_UTIL()
SRCS(
<? find . -name '*.cpp' | grep -v -F tests/ | grep -v -F examples | sed 's/^\.\// /' | sort ?>
)
END()

View File

@ -4,6 +4,7 @@ RECURSE(
common
daemon
loggers
mysqlxx
pcg-random
widechar_width
readpassphrase

76
cmake/embed_binary.cmake Normal file
View File

@ -0,0 +1,76 @@
# Embed a set of resource files into a resulting object file.
#
# Signature: `clickhouse_embed_binaries(TARGET <target> RESOURCE_DIR <dir> RESOURCES <resource> ...)`
#
# This will generate a static library target named `<target>`, which contains the contents of
# each `<resource>` file. The files should be located in `<dir>`. <dir> defaults to
# ${CMAKE_CURRENT_SOURCE_DIR}, and the resources may not be empty.
#
# Each resource will result in three symbols in the final archive, based on the name `<resource>`.
# These are:
# 1. `_binary_<name>_start`: Points to the start of the binary data from `<resource>`.
# 2. `_binary_<name>_end`: Points to the end of the binary data from `<resource>`.
# 3. `_binary_<name>_size`: Points to the size of the binary data from `<resource>`.
#
# `<name>` is a normalized name derived from `<resource>`, by replacing the characters "./-" with
# the character "_", and the character "+" with "_PLUS_". This scheme is similar to those generated
# by `ld -r -b binary`, and matches the expectations in `./base/common/getResource.cpp`.
macro(clickhouse_embed_binaries)
set(one_value_args TARGET RESOURCE_DIR)
set(resources RESOURCES)
cmake_parse_arguments(EMBED "" "${one_value_args}" ${resources} ${ARGN})
if (NOT DEFINED EMBED_TARGET)
message(FATAL_ERROR "A target name must be provided for embedding binary resources into")
endif()
if (NOT DEFINED EMBED_RESOURCE_DIR)
set(EMBED_RESOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}")
endif()
list(LENGTH EMBED_RESOURCES N_RESOURCES)
if (N_RESOURCES LESS 1)
message(FATAL_ERROR "The list of binary resources to embed may not be empty")
endif()
# If cross-compiling, ensure we use the toolchain file and target the
# actual target architecture
if (CMAKE_CROSSCOMPILING)
set(CROSS_COMPILE_FLAGS "--target=${CMAKE_C_COMPILER_TARGET} --gcc-toolchain=${TOOLCHAIN_FILE}")
else()
set(CROSS_COMPILE_FLAGS "")
endif()
set(EMBED_TEMPLATE_FILE "${PROJECT_SOURCE_DIR}/programs/embed_binary.S.in")
set(RESOURCE_OBJS)
foreach(RESOURCE_FILE ${EMBED_RESOURCES})
set(RESOURCE_OBJ "${RESOURCE_FILE}.o")
list(APPEND RESOURCE_OBJS "${RESOURCE_OBJ}")
# Normalize the name of the resource
set(BINARY_FILE_NAME "${RESOURCE_FILE}")
string(REGEX REPLACE "[\./-]" "_" SYMBOL_NAME "${RESOURCE_FILE}") # - must be last in regex
string(REPLACE "+" "_PLUS_" SYMBOL_NAME "${SYMBOL_NAME}")
set(ASSEMBLY_FILE_NAME "${RESOURCE_FILE}.S")
# Put the configured assembly file in the output directory.
# This is so we can clean it up as usual, and we CD to the
# source directory before compiling, so that the assembly
# `.incbin` directive can find the file.
configure_file("${EMBED_TEMPLATE_FILE}" "${CMAKE_CURRENT_BINARY_DIR}/${ASSEMBLY_FILE_NAME}" @ONLY)
# Generate the output object file by compiling the assembly, in the directory of
# the sources so that the resource file may also be found
add_custom_command(
OUTPUT ${RESOURCE_OBJ}
COMMAND cd "${EMBED_RESOURCE_DIR}" &&
${CMAKE_C_COMPILER} "${CROSS_COMPILE_FLAGS}" -c -o
"${CMAKE_CURRENT_BINARY_DIR}/${RESOURCE_OBJ}"
"${CMAKE_CURRENT_BINARY_DIR}/${ASSEMBLY_FILE_NAME}"
)
set_source_files_properties("${RESOURCE_OBJ}" PROPERTIES EXTERNAL_OBJECT true GENERATED true)
endforeach()
add_library("${EMBED_TARGET}" STATIC ${RESOURCE_OBJS})
set_target_properties("${EMBED_TARGET}" PROPERTIES LINKER_LANGUAGE C)
endmacro()
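
To show how the symbols described in the header comment are consumed, a hedged C++ sketch follows; the resource name `config.xml` and the accessor function here are hypothetical (the repository's own accessor lives in `./base/common/getResource.cpp`, as the comment above notes), and the snippet links only when an object produced by this macro is added to the target:

```
#include <cstddef>
#include <iostream>
#include <string_view>

/// Hypothetical resource "config.xml": per the normalization rule above,
/// '.' is replaced with '_', so the emitted symbols are _binary_config_xml_start/_end/_size.
extern "C" const char _binary_config_xml_start[];
extern "C" const char _binary_config_xml_end[];

std::string_view getEmbeddedConfig()
{
    /// The embedded bytes live between the two symbols; there is no trailing '\0'.
    return {_binary_config_xml_start,
            static_cast<std::size_t>(_binary_config_xml_end - _binary_config_xml_start)};
}

int main()
{
    std::cout << getEmbeddedConfig().size() << " bytes embedded\n";
    return 0;
}
```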

View File

@ -29,7 +29,6 @@ message(STATUS "LLVM C++ compiler flags: ${LLVM_CXXFLAGS}")
# This list was generated by listing all LLVM libraries, compiling the binary and removing all libraries while it still compiles.
set (REQUIRED_LLVM_LIBRARIES
LLVMOrcJIT
LLVMExecutionEngine
LLVMRuntimeDyld
LLVMX86CodeGen

View File

@ -1,7 +1,7 @@
if(NOT OS_FREEBSD AND NOT APPLE)
if(NOT OS_FREEBSD)
option(ENABLE_S3 "Enable S3" ${ENABLE_LIBRARIES})
elseif(ENABLE_S3 OR USE_INTERNAL_AWS_S3_LIBRARY)
message (${RECONFIGURE_MESSAGE_LEVEL} "Can't use S3 on Apple or FreeBSD")
message (${RECONFIGURE_MESSAGE_LEVEL} "Can't use S3 on FreeBSD")
endif()
if(NOT ENABLE_S3)

View File

@ -4,6 +4,6 @@ if (NOT USE_YAML_CPP)
return()
endif()
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/yaml-cpp")
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/yaml-cpp/README.md")
message (ERROR "submodule contrib/yaml-cpp is missing. to fix try run: \n git submodule update --init --recursive")
endif()

View File

@ -4,6 +4,7 @@ set (CMAKE_C_COMPILER_TARGET "aarch64-linux-gnu")
set (CMAKE_CXX_COMPILER_TARGET "aarch64-linux-gnu")
set (CMAKE_ASM_COMPILER_TARGET "aarch64-linux-gnu")
set (CMAKE_SYSROOT "${CMAKE_CURRENT_LIST_DIR}/../toolchain/linux-aarch64/aarch64-linux-gnu/libc")
get_filename_component (TOOLCHAIN_FILE "${CMAKE_TOOLCHAIN_FILE}" REALPATH)
# We don't use compiler from toolchain because it's gcc-8, and we provide support only for gcc-9.
set (CMAKE_AR "${CMAKE_CURRENT_LIST_DIR}/../toolchain/linux-aarch64/bin/aarch64-linux-gnu-ar" CACHE FILEPATH "" FORCE)

View File

@ -61,7 +61,6 @@ endif()
add_subdirectory (poco-cmake)
add_subdirectory (croaring-cmake)
# TODO: refactor the contrib libraries below this comment.
if (USE_INTERNAL_ZSTD_LIBRARY)

2
contrib/NuRaft vendored

@ -1 +1 @@
Subproject commit 95d6bbba579b3a4e4c2dede954f541ff6f3dba51
Subproject commit 2a1bf7d87b4a03561fc66fbb49cee8a288983c5d

2
contrib/arrow vendored

@ -1 +1 @@
Subproject commit 616b3dc76a0c8450b4027ded8a78e9619d7c845f
Subproject commit debf751a129bdda9ff4d1e895e08957ff77000a1

View File

@ -188,6 +188,7 @@ set(ARROW_SRCS
"${LIBRARY_DIR}/array/util.cc"
"${LIBRARY_DIR}/array/validate.cc"
"${LIBRARY_DIR}/compute/api_aggregate.cc"
"${LIBRARY_DIR}/compute/api_scalar.cc"
"${LIBRARY_DIR}/compute/api_vector.cc"
"${LIBRARY_DIR}/compute/cast.cc"
@ -198,8 +199,11 @@ set(ARROW_SRCS
"${LIBRARY_DIR}/compute/kernels/aggregate_basic.cc"
"${LIBRARY_DIR}/compute/kernels/aggregate_mode.cc"
"${LIBRARY_DIR}/compute/kernels/aggregate_quantile.cc"
"${LIBRARY_DIR}/compute/kernels/aggregate_tdigest.cc"
"${LIBRARY_DIR}/compute/kernels/aggregate_var_std.cc"
"${LIBRARY_DIR}/compute/kernels/codegen_internal.cc"
"${LIBRARY_DIR}/compute/kernels/hash_aggregate.cc"
"${LIBRARY_DIR}/compute/kernels/scalar_arithmetic.cc"
"${LIBRARY_DIR}/compute/kernels/scalar_boolean.cc"
"${LIBRARY_DIR}/compute/kernels/scalar_cast_boolean.cc"
@ -243,6 +247,7 @@ set(ARROW_SRCS
"${LIBRARY_DIR}/io/interfaces.cc"
"${LIBRARY_DIR}/io/memory.cc"
"${LIBRARY_DIR}/io/slow.cc"
"${LIBRARY_DIR}/io/transform.cc"
"${LIBRARY_DIR}/tensor/coo_converter.cc"
"${LIBRARY_DIR}/tensor/csf_converter.cc"
@ -256,11 +261,8 @@ set(ARROW_SRCS
"${LIBRARY_DIR}/util/bitmap_builders.cc"
"${LIBRARY_DIR}/util/bitmap_ops.cc"
"${LIBRARY_DIR}/util/bpacking.cc"
"${LIBRARY_DIR}/util/cancel.cc"
"${LIBRARY_DIR}/util/compression.cc"
"${LIBRARY_DIR}/util/compression_lz4.cc"
"${LIBRARY_DIR}/util/compression_snappy.cc"
"${LIBRARY_DIR}/util/compression_zlib.cc"
"${LIBRARY_DIR}/util/compression_zstd.cc"
"${LIBRARY_DIR}/util/cpu_info.cc"
"${LIBRARY_DIR}/util/decimal.cc"
"${LIBRARY_DIR}/util/delimiting.cc"
@ -268,13 +270,14 @@ set(ARROW_SRCS
"${LIBRARY_DIR}/util/future.cc"
"${LIBRARY_DIR}/util/int_util.cc"
"${LIBRARY_DIR}/util/io_util.cc"
"${LIBRARY_DIR}/util/iterator.cc"
"${LIBRARY_DIR}/util/key_value_metadata.cc"
"${LIBRARY_DIR}/util/logging.cc"
"${LIBRARY_DIR}/util/memory.cc"
"${LIBRARY_DIR}/util/mutex.cc"
"${LIBRARY_DIR}/util/string_builder.cc"
"${LIBRARY_DIR}/util/string.cc"
"${LIBRARY_DIR}/util/task_group.cc"
"${LIBRARY_DIR}/util/tdigest.cc"
"${LIBRARY_DIR}/util/thread_pool.cc"
"${LIBRARY_DIR}/util/time.cc"
"${LIBRARY_DIR}/util/trie.cc"
@ -368,14 +371,14 @@ set(PARQUET_SRCS
"${LIBRARY_DIR}/column_reader.cc"
"${LIBRARY_DIR}/column_scanner.cc"
"${LIBRARY_DIR}/column_writer.cc"
"${LIBRARY_DIR}/deprecated_io.cc"
"${LIBRARY_DIR}/encoding.cc"
"${LIBRARY_DIR}/encryption.cc"
"${LIBRARY_DIR}/encryption_internal.cc"
"${LIBRARY_DIR}/encryption/encryption.cc"
"${LIBRARY_DIR}/encryption/encryption_internal.cc"
"${LIBRARY_DIR}/encryption/internal_file_decryptor.cc"
"${LIBRARY_DIR}/encryption/internal_file_encryptor.cc"
"${LIBRARY_DIR}/exception.cc"
"${LIBRARY_DIR}/file_reader.cc"
"${LIBRARY_DIR}/file_writer.cc"
"${LIBRARY_DIR}/internal_file_decryptor.cc"
"${LIBRARY_DIR}/internal_file_encryptor.cc"
"${LIBRARY_DIR}/level_conversion.cc"
"${LIBRARY_DIR}/level_comparison.cc"
"${LIBRARY_DIR}/metadata.cc"
@ -385,6 +388,8 @@ set(PARQUET_SRCS
"${LIBRARY_DIR}/properties.cc"
"${LIBRARY_DIR}/schema.cc"
"${LIBRARY_DIR}/statistics.cc"
"${LIBRARY_DIR}/stream_reader.cc"
"${LIBRARY_DIR}/stream_writer.cc"
"${LIBRARY_DIR}/types.cc"
"${GEN_LIBRARY_DIR}/parquet_constants.cpp"

2
contrib/avro vendored

@ -1 +1 @@
Subproject commit 92caca2d42fc9a97e34e95f963593539d32ed331
Subproject commit e43c46e87fd32eafdc09471e95344555454c5ef8

2
contrib/cassandra vendored

@ -1 +1 @@
Subproject commit c097fb5c7e63cc430016d9a8b240d8e63fbefa52
Subproject commit eb9b68dadbb4417a2c132ad4a1c2fa76e65e6fc1

View File

@ -39,6 +39,7 @@ if (NOT USE_INTERNAL_CCTZ_LIBRARY)
endif()
if (NOT EXTERNAL_CCTZ_LIBRARY_FOUND OR NOT EXTERNAL_CCTZ_LIBRARY_WORKS)
include(${ClickHouse_SOURCE_DIR}/cmake/embed_binary.cmake)
set(USE_INTERNAL_CCTZ_LIBRARY 1)
set(LIBRARY_DIR "${ClickHouse_SOURCE_DIR}/contrib/cctz")
@ -70,63 +71,36 @@ if (NOT EXTERNAL_CCTZ_LIBRARY_FOUND OR NOT EXTERNAL_CCTZ_LIBRARY_WORKS)
set(SYSTEM_STORAGE_TZ_FILE "${CMAKE_BINARY_DIR}/src/Storages/System/StorageSystemTimeZones.generated.cpp")
# remove existing copies so that its generated fresh on each build.
file(REMOVE ${SYSTEM_STORAGE_TZ_FILE})
# Build a library with embedded tzdata
if (OS_LINUX)
# get the list of timezones from tzdata shipped with cctz
set(TZDIR "${LIBRARY_DIR}/testdata/zoneinfo")
file(STRINGS "${LIBRARY_DIR}/testdata/version" TZDATA_VERSION)
set_property(GLOBAL PROPERTY TZDATA_VERSION_PROP "${TZDATA_VERSION}")
message(STATUS "Packaging with tzdata version: ${TZDATA_VERSION}")
set(TZ_OBJS)
# get the list of timezones from tzdata shipped with cctz
set(TZDIR "${LIBRARY_DIR}/testdata/zoneinfo")
file(STRINGS "${LIBRARY_DIR}/testdata/version" TZDATA_VERSION)
set_property(GLOBAL PROPERTY TZDATA_VERSION_PROP "${TZDATA_VERSION}")
message(STATUS "Packaging with tzdata version: ${TZDATA_VERSION}")
# each file in that dir (except *.tab and localtime) stores the info about a timezone
execute_process(COMMAND
bash -c "cd ${TZDIR} && find * -type f -and ! -name '*.tab' -and ! -name 'localtime' | sort | paste -sd ';'"
OUTPUT_STRIP_TRAILING_WHITESPACE
OUTPUT_VARIABLE TIMEZONES)
set(TIMEZONE_RESOURCE_FILES)
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} "// autogenerated by ClickHouse/contrib/cctz-cmake/CMakeLists.txt\n")
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} "const char * auto_time_zones[] {\n" )
# each file in that dir (except *.tab and localtime) stores the info about a timezone
execute_process(COMMAND
bash -c "cd ${TZDIR} && find * -type f -and ! -name '*.tab' -and ! -name 'localtime' | sort | paste -sd ';' -"
OUTPUT_STRIP_TRAILING_WHITESPACE
OUTPUT_VARIABLE TIMEZONES)
foreach(TIMEZONE ${TIMEZONES})
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} " \"${TIMEZONE}\",\n")
string(REPLACE "/" "_" TIMEZONE_ID ${TIMEZONE})
string(REPLACE "+" "_PLUS_" TIMEZONE_ID ${TIMEZONE_ID})
set(TZ_OBJ ${TIMEZONE_ID}.o)
set(TZ_OBJS ${TZ_OBJS} ${TZ_OBJ})
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} "// autogenerated by ClickHouse/contrib/cctz-cmake/CMakeLists.txt\n")
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} "const char * auto_time_zones[] {\n" )
# https://stackoverflow.com/questions/14776463/compile-and-add-an-object-file-from-a-binary-with-cmake
# PPC64LE fails to do this with objcopy, use ld or lld instead
if (ARCH_PPC64LE)
add_custom_command(OUTPUT ${TZ_OBJ}
COMMAND cp "${TZDIR}/${TIMEZONE}" "${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}"
COMMAND cd ${CMAKE_CURRENT_BINARY_DIR} && ${CMAKE_LINKER} -m elf64lppc -r -b binary -o ${TZ_OBJ} ${TIMEZONE_ID}
COMMAND rm "${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}")
else()
add_custom_command(OUTPUT ${TZ_OBJ}
COMMAND cp "${TZDIR}/${TIMEZONE}" "${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}"
COMMAND cd ${CMAKE_CURRENT_BINARY_DIR} && ${OBJCOPY_PATH} -I binary ${OBJCOPY_ARCH_OPTIONS}
--rename-section .data=.rodata,alloc,load,readonly,data,contents ${TIMEZONE_ID} ${TZ_OBJ}
COMMAND rm "${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}")
endif()
set_source_files_properties(${TZ_OBJ} PROPERTIES EXTERNAL_OBJECT true GENERATED true)
endforeach(TIMEZONE)
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} " nullptr};\n")
add_library(tzdata STATIC ${TZ_OBJS})
set_target_properties(tzdata PROPERTIES LINKER_LANGUAGE C)
# whole-archive prevents symbols from being discarded, for an unknown reason.
# CMake can shuffle the target_link_libraries arguments with other
# libraries in the linker command. To avoid this we hardcode the whole-archive
# library into a single string.
add_dependencies(cctz tzdata)
target_link_libraries(cctz INTERFACE "-Wl,${WHOLE_ARCHIVE} $<TARGET_FILE:tzdata> -Wl,${NO_WHOLE_ARCHIVE}")
else ()
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} "// autogenerated by ClickHouse/contrib/cctz-cmake/CMakeLists.txt\n")
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} "const char * auto_time_zones[] {nullptr};\n" )
endif ()
foreach(TIMEZONE ${TIMEZONES})
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} " \"${TIMEZONE}\",\n")
list(APPEND TIMEZONE_RESOURCE_FILES "${TIMEZONE}")
endforeach(TIMEZONE)
file(APPEND ${SYSTEM_STORAGE_TZ_FILE} " nullptr};\n")
clickhouse_embed_binaries(
TARGET tzdata
RESOURCE_DIR "${TZDIR}"
RESOURCES ${TIMEZONE_RESOURCE_FILES}
)
add_dependencies(cctz tzdata)
target_link_libraries(cctz INTERFACE "-Wl,${WHOLE_ARCHIVE} $<TARGET_FILE:tzdata> -Wl,${NO_WHOLE_ARCHIVE}")
endif ()
message (STATUS "Using cctz")

2
contrib/cppkafka vendored

@ -1 +1 @@
Subproject commit 57a599d99c540e647bcd0eb9ea77c523cca011b3
Subproject commit 5a119f689f8a4d90d10a9635e7ee2bee5c127de1

2
contrib/croaring vendored

@ -1 +1 @@
Subproject commit d8402939b5c9fc134fd4fcf058fe0f7006d2b129
Subproject commit 2c867e9f9c9e2a3a7032791f94c4c7ae3013f6e0

2
contrib/flatbuffers vendored

@ -1 +1 @@
Subproject commit 22e3ffc66d2d7d72d1414390aa0f04ffd114a5a1
Subproject commit eb3f827948241ce0e701516f16cd67324802bce9

View File

@ -1,6 +1,6 @@
if (SANITIZE OR NOT (
((OS_LINUX OR OS_FREEBSD) AND (ARCH_AMD64 OR ARCH_ARM OR ARCH_PPC64LE)) OR
(OS_DARWIN AND CMAKE_BUILD_TYPE STREQUAL "RelWithDebInfo")
(OS_DARWIN AND (CMAKE_BUILD_TYPE STREQUAL "RelWithDebInfo" OR CMAKE_BUILD_TYPE STREQUAL "Debug"))
))
if (ENABLE_JEMALLOC)
message (${RECONFIGURE_MESSAGE_LEVEL}

2
contrib/libpqxx vendored

@ -1 +1 @@
Subproject commit 58d2a028d1600225ac3a478d6b3a06ba2f0c01f6
Subproject commit 357608d11b7a1961c3fb7db2ef9a5dbb2e87da77

View File

@ -64,7 +64,7 @@ set (HDRS
add_library(libpqxx ${SRCS} ${HDRS})
target_link_libraries(libpqxx PUBLIC ${LIBPQ_LIBRARY})
target_include_directories (libpqxx PRIVATE "${LIBRARY_DIR}/include")
target_include_directories (libpqxx SYSTEM PRIVATE "${LIBRARY_DIR}/include")
# crutch
set(CM_CONFIG_H_IN "${LIBRARY_DIR}/include/pqxx/config.h.in")

2
contrib/libunwind vendored

@ -1 +1 @@
Subproject commit 8fe25d7dc70f2a4ea38c3e5a33fa9d4199b67a5a
Subproject commit a491c27b33109a842d577c0f7ac5f5f218859181

2
contrib/llvm vendored

@ -1 +1 @@
Subproject commit a7198805de67374eb3fb4c6b89797fa2d1cd7e50
Subproject commit e5751459412bce1391fb7a2e9bbc01e131bf72f1

2
contrib/orc vendored

@ -1 +1 @@
Subproject commit 5981208e39447df84827f6a961d1da76bacb6078
Subproject commit 0a936f6bbdb9303308973073f8623b5a8d82eae1

2
contrib/replxx vendored

@ -1 +1 @@
Subproject commit 2b24f14594d7606792b92544bb112a6322ba34d7
Subproject commit c81be6c68b146f15f2096b7ef80e3f21fe27004c

View File

@ -1 +1 @@
#*/10 * * * * root (which service > /dev/null 2>&1 && (service clickhouse-server condstart ||:)) || /etc/init.d/clickhouse-server condstart > /dev/null 2>&1
#*/10 * * * * root ((which service > /dev/null 2>&1 && (service clickhouse-server condstart ||:)) || /etc/init.d/clickhouse-server condstart) > /dev/null 2>&1

View File

@ -229,6 +229,7 @@ status()
case "$1" in
status)
status
exit 0
;;
esac

View File

@ -154,6 +154,10 @@ def parse_env_variables(build_type, compiler, sanitizer, package_type, image_typ
if clang_tidy:
cmake_flags.append('-DENABLE_CLANG_TIDY=1')
cmake_flags.append('-DENABLE_UTILS=1')
cmake_flags.append('-DUSE_GTEST=1')
cmake_flags.append('-DENABLE_TESTS=1')
cmake_flags.append('-DENABLE_EXAMPLES=1')
# Don't stop on first error to find more clang-tidy errors in one run.
result.append('NINJA_FLAGS=-k0')

View File

@ -34,7 +34,7 @@ fi
CLICKHOUSE_CONFIG="${CLICKHOUSE_CONFIG:-/etc/clickhouse-server/config.xml}"
if ! $gosu test -f "$CLICKHOUSE_CONFIG" -a -r "$CLICKHOUSE_CONFIG"; then
echo "Configuration file '$dir' isn't readable by user with id '$USER'"
echo "Configuration file '$CLICKHOUSE_CONFIG' isn't readable by user with id '$USER'"
exit 1
fi

View File

@ -22,9 +22,9 @@ ENV SHA=nosha
ENV DATA="data"
CMD mkdir -p $BUILD_DIRECTORY && cd $BUILD_DIRECTORY && \
cmake $SOURCE_DIRECTORY -DCMAKE_CXX_COMPILER=/usr/bin/clang\+\+-11 -DCMAKE_C_COMPILER=/usr/bin/clang-11 -DCMAKE_EXPORT_COMPILE_COMMANDS=ON && \
cmake $SOURCE_DIRECTORY -DCMAKE_CXX_COMPILER=/usr/bin/clang\+\+-11 -DCMAKE_C_COMPILER=/usr/bin/clang-11 -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DENABLE_EMBEDDED_COMPILER=0 -DENABLE_S3=0 && \
mkdir -p $HTML_RESULT_DIRECTORY && \
$CODEGEN -b $BUILD_DIRECTORY -a -o $HTML_RESULT_DIRECTORY -p ClickHouse:$SOURCE_DIRECTORY:$SHA -d $DATA && \
$CODEGEN -b $BUILD_DIRECTORY -a -o $HTML_RESULT_DIRECTORY -p ClickHouse:$SOURCE_DIRECTORY:$SHA -d $DATA | ts '%Y-%m-%d %H:%M:%S' && \
cp -r $STATIC_DATA $HTML_RESULT_DIRECTORY/ &&\
$CODEINDEX $HTML_RESULT_DIRECTORY -d $DATA && \
$CODEINDEX $HTML_RESULT_DIRECTORY -d $DATA | ts '%Y-%m-%d %H:%M:%S' && \
mv $HTML_RESULT_DIRECTORY /test_output

View File

@ -374,9 +374,13 @@ function run_tests
01801_s3_cluster
# Depends on LLVM JIT
01072_nullable_jit
01852_jit_if
01865_jit_comparison_constant_result
01871_merge_tree_compile_expressions
# needs psql
01889_postgresql_protocol_null_fields
)
time clickhouse-test --hung-check -j 8 --order=random --use-skip-list \

View File

@ -56,17 +56,19 @@ function watchdog
sleep 3600
echo "Fuzzing run has timed out"
killall clickhouse-client ||:
for _ in {1..10}
do
if ! pgrep -f clickhouse-client
# Only kill by pid the particular client that runs the fuzzing, or else
# we can kill some clickhouse-client processes this script starts later,
# e.g. for checking server liveness.
if ! kill $fuzzer_pid
then
break
fi
sleep 1
done
killall -9 clickhouse-client ||:
kill -9 -- $fuzzer_pid ||:
}
function filter_exists
@ -85,7 +87,7 @@ function fuzz
{
# Obtain the list of newly added tests. They will be fuzzed in a more extreme way than other tests.
# Don't overwrite the NEW_TESTS_OPT so that it can be set from the environment.
NEW_TESTS="$(grep -P 'tests/queries/0_stateless/.*\.sql' ci-changed-files.txt | sed -r -e 's!^!ch/!' | sort -R)"
NEW_TESTS="$(sed -n 's!\(^tests/queries/0_stateless/.*\.sql\)$!ch/\1!p' ci-changed-files.txt | sort -R)"
# ci-changed-files.txt also contains files that have been deleted/renamed, filter them out.
NEW_TESTS="$(filter_exists $NEW_TESTS)"
if [[ -n "$NEW_TESTS" ]]
@ -95,14 +97,10 @@ function fuzz
NEW_TESTS_OPT="${NEW_TESTS_OPT:-}"
fi
export CLICKHOUSE_WATCHDOG_ENABLE=0 # interferes with gdb
clickhouse-server --config-file db/config.xml -- --path db 2>&1 | tail -100000 > server.log &
server_pid=$!
kill -0 $server_pid
while ! clickhouse-client --query "select 1" && kill -0 $server_pid ; do echo . ; sleep 1 ; done
clickhouse-client --query "select 1"
kill -0 $server_pid
echo Server started
echo "
handle all noprint
@ -113,19 +111,70 @@ thread apply all backtrace
continue
" > script.gdb
gdb -batch -command script.gdb -p "$(pidof clickhouse-server)" &
gdb -batch -command script.gdb -p $server_pid &
# Check connectivity after we attach gdb, because it might cause the server
# to freeze and the fuzzer will fail.
for _ in {1..60}
do
sleep 1
if clickhouse-client --query "select 1"
then
break
fi
done
clickhouse-client --query "select 1" # This checks that the server is responding
kill -0 $server_pid # This checks that it is our server that is started and not some other one
echo Server started and responded
fuzzer_exit_code=0
# SC2012: Use find instead of ls to better handle non-alphanumeric filenames. They are all alphanumeric.
# SC2046: Quote this to prevent word splitting. Actually I need word splitting.
# shellcheck disable=SC2012,SC2046
clickhouse-client --query-fuzzer-runs=1000 --queries-file $(ls -1 ch/tests/queries/0_stateless/*.sql | sort -R) $NEW_TESTS_OPT \
clickhouse-client \
--receive_timeout=10 \
--receive_data_timeout_ms=10000 \
--query-fuzzer-runs=1000 \
--queries-file $(ls -1 ch/tests/queries/0_stateless/*.sql | sort -R) \
$NEW_TESTS_OPT \
> >(tail -n 100000 > fuzzer.log) \
2>&1 \
|| fuzzer_exit_code=$?
2>&1 &
fuzzer_pid=$!
echo "Fuzzer pid is $fuzzer_pid"
# Start a watchdog that should kill the fuzzer on timeout.
# The shell won't kill the child sleep when we kill it, so we have to put it
# into a separate process group so that we can kill them all.
set -m
watchdog &
watchdog_pid=$!
set +m
# Check that the watchdog has started.
kill -0 $watchdog_pid
# Wait for the fuzzer to complete.
# Note that the 'wait || ...' thing is required so that the script doesn't
# exit because of 'set -e' when 'wait' returns nonzero code.
fuzzer_exit_code=0
wait "$fuzzer_pid" || fuzzer_exit_code=$?
echo "Fuzzer exit code is $fuzzer_exit_code"
kill -- -$watchdog_pid ||:
# If the server dies, most often the fuzzer returns code 210: connection
# refused, and sometimes also code 32: attempt to read after eof. For
# simplicity, check again whether the server is accepting connections, using
# clickhouse-client. We don't check for existence of server process, because
# the process is still present while the server is terminating and not
# accepting the connections anymore.
if clickhouse-client --query "select 1 format Null"
then
server_died=0
else
echo "Server live check returns $?"
server_died=1
fi
# Stop the server.
clickhouse-client --query "select elapsed, query from system.processes" ||:
killall clickhouse-server ||:
for _ in {1..10}
@ -137,6 +186,45 @@ continue
sleep 1
done
killall -9 clickhouse-server ||:
# Debug.
date
sleep 10
jobs
pstree -aspgT
# Make files with status and description we'll show for this check on Github.
task_exit_code=$fuzzer_exit_code
if [ "$server_died" == 1 ]
then
# The server has died.
task_exit_code=210
echo "failure" > status.txt
if ! grep --text -ao "Received signal.*\|Logical error.*\|Assertion.*failed\|Failed assertion.*\|.*runtime error: .*\|.*is located.*\|SUMMARY: AddressSanitizer:.*\|SUMMARY: MemorySanitizer:.*\|SUMMARY: ThreadSanitizer:.*\|.*_LIBCPP_ASSERT.*" server.log > description.txt
then
echo "Lost connection to server. See the logs." > description.txt
fi
elif [ "$fuzzer_exit_code" == "143" ] || [ "$fuzzer_exit_code" == "0" ]
then
# Variants of a normal run:
# 0 -- fuzzing ended earlier than timeout.
# 143 -- SIGTERM -- the fuzzer was killed by timeout.
task_exit_code=0
echo "success" > status.txt
echo "OK" > description.txt
else
# The server was alive, but the fuzzer returned some error. This might
# be some client-side error detected by fuzzing, or a problem in the
# fuzzer itself. Don't grep the server log in this case, because we will
# find a message about normal server termination (Received signal 15),
# which is confusing.
task_exit_code=$fuzzer_exit_code
echo "failure" > status.txt
{ grep --text -o "Found error:.*" fuzzer.log \
|| grep --text -o "Exception.*" fuzzer.log \
|| echo "Fuzzer failed ($fuzzer_exit_code). See the logs." ; } \
| tail -1 > description.txt
fi
}
case "$stage" in
@ -165,50 +253,7 @@ case "$stage" in
time configure
;&
"fuzz")
# Start a watchdog that should kill the fuzzer on timeout.
# The shell won't kill the child sleep when we kill it, so we have to put it
# into a separate process group so that we can kill them all.
set -m
watchdog &
watchdog_pid=$!
set +m
# Check that the watchdog has started
kill -0 $watchdog_pid
fuzzer_exit_code=0
time fuzz || fuzzer_exit_code=$?
kill -- -$watchdog_pid ||:
# Debug
date
sleep 10
jobs
pstree -aspgT
# Make files with status and description we'll show for this check on Github
task_exit_code=$fuzzer_exit_code
if [ "$fuzzer_exit_code" == 143 ]
then
# SIGTERM -- the fuzzer was killed by timeout, which means a normal run.
echo "success" > status.txt
echo "OK" > description.txt
task_exit_code=0
elif [ "$fuzzer_exit_code" == 210 ]
then
# Lost connection to the server. This probably means that the server died
# with abort.
echo "failure" > status.txt
if ! grep -ao "Received signal.*\|Logical error.*\|Assertion.*failed\|Failed assertion.*\|.*runtime error: .*\|.*is located.*\|SUMMARY: AddressSanitizer:.*\|SUMMARY: MemorySanitizer:.*\|SUMMARY: ThreadSanitizer:.*\|.*_LIBCPP_ASSERT.*" server.log > description.txt
then
echo "Lost connection to server. See the logs." > description.txt
fi
else
# Something different -- maybe the fuzzer itself died? Don't grep the
# server log in this case, because we will find a message about normal
# server termination (Received signal 15), which is confusing.
echo "failure" > status.txt
echo "Fuzzer failed ($fuzzer_exit_code). See the logs." > description.txt
fi
time fuzz
;&
"report")
cat > report.html <<EOF ||:

View File

@ -1,5 +1,5 @@
# docker build -t yandex/clickhouse-integration-tests-runner .
FROM ubuntu:18.04
FROM ubuntu:20.04
RUN apt-get update \
&& env DEBIAN_FRONTEND=noninteractive apt-get install --yes \
@ -14,7 +14,6 @@ RUN apt-get update \
wget \
git \
iproute2 \
module-init-tools \
cgroupfs-mount \
python3-pip \
tzdata \
@ -42,7 +41,6 @@ ENV TZ=Europe/Moscow
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
ENV DOCKER_CHANNEL stable
ENV DOCKER_VERSION 5:19.03.13~3-0~ubuntu-bionic
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
RUN add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -c -s) ${DOCKER_CHANNEL}"
@ -66,25 +64,28 @@ RUN python3 -m pip install \
dict2xml \
dicttoxml \
docker \
docker-compose==1.22.0 \
docker-compose==1.28.2 \
grpcio \
grpcio-tools \
kafka-python \
kazoo \
minio \
protobuf \
psycopg2-binary==2.7.5 \
psycopg2-binary==2.8.6 \
pymongo \
pytest \
pytest-timeout \
pytest-xdist \
redis \
tzlocal \
urllib3 \
requests-kerberos
requests-kerberos \
pyhdfs
COPY modprobe.sh /usr/local/bin/modprobe
COPY dockerd-entrypoint.sh /usr/local/bin/
COPY compose/ /compose/
COPY misc/ /misc/
RUN set -x \
&& addgroup --system dockremap \
@ -93,7 +94,6 @@ RUN set -x \
&& echo 'dockremap:165536:65536' >> /etc/subuid \
&& echo 'dockremap:165536:65536' >> /etc/subgid
VOLUME /var/lib/docker
EXPOSE 2375
ENTRYPOINT ["dockerd-entrypoint.sh"]
CMD ["sh", "-c", "pytest $PYTEST_OPTS"]

View File

@ -1,7 +1,5 @@
version: '2.3'
services:
cassandra1:
image: cassandra
image: cassandra:4.0
restart: always
ports:
- 9043:9042

View File

@ -4,7 +4,11 @@ services:
image: sequenceiq/hadoop-docker:2.7.0
hostname: hdfs1
restart: always
ports:
- 50075:50075
- 50070:50070
expose:
- ${HDFS_NAME_PORT}
- ${HDFS_DATA_PORT}
entrypoint: /etc/bootstrap.sh -d
volumes:
- type: ${HDFS_FS:-tmpfs}
source: ${HDFS_LOGS:-}
target: /usr/local/hadoop/logs

View File

@ -0,0 +1,23 @@
version: '2.3'
services:
bridge1:
image: yandex/clickhouse-jdbc-bridge
command: |
/bin/bash -c 'cat << EOF > config/datasources/self.json
{
"self": {
"jdbcUrl": "jdbc:clickhouse://instance:8123/test",
"username": "default",
"password": "",
"maximumPoolSize": 5
}
}
EOF
./docker-entrypoint.sh'
ports:
- 9020:9019
healthcheck:
test: ["CMD", "curl", "-s", "localhost:9019/ping"]
interval: 5s
timeout: 3s
retries: 30

View File

@ -15,10 +15,11 @@ services:
image: confluentinc/cp-kafka:5.2.0
hostname: kafka1
ports:
- "9092:9092"
- ${KAFKA_EXTERNAL_PORT}:${KAFKA_EXTERNAL_PORT}
environment:
KAFKA_ADVERTISED_LISTENERS: INSIDE://localhost:9092,OUTSIDE://kafka1:19092
KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:19092
KAFKA_ADVERTISED_LISTENERS: INSIDE://localhost:${KAFKA_EXTERNAL_PORT},OUTSIDE://kafka1:19092
KAFKA_ADVERTISED_HOST_NAME: kafka1
KAFKA_LISTENERS: INSIDE://0.0.0.0:${KAFKA_EXTERNAL_PORT},OUTSIDE://0.0.0.0:19092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
KAFKA_BROKER_ID: 1
@ -34,7 +35,7 @@ services:
image: confluentinc/cp-schema-registry:5.2.0
hostname: schema-registry
ports:
- "8081:8081"
- ${SCHEMA_REGISTRY_EXTERNAL_PORT}:${SCHEMA_REGISTRY_INTERNAL_PORT}
environment:
SCHEMA_REGISTRY_HOST_NAME: schema-registry
SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL: PLAINTEXT

View File

@ -11,19 +11,21 @@ services:
- ${KERBERIZED_HDFS_DIR}/../../hdfs_configs/bootstrap.sh:/etc/bootstrap.sh:ro
- ${KERBERIZED_HDFS_DIR}/secrets:/usr/local/hadoop/etc/hadoop/conf
- ${KERBERIZED_HDFS_DIR}/secrets/krb_long.conf:/etc/krb5.conf:ro
ports:
- 1006:1006
- 50070:50070
- 9010:9010
- type: ${KERBERIZED_HDFS_FS:-tmpfs}
source: ${KERBERIZED_HDFS_LOGS:-}
target: /var/log/hadoop-hdfs
expose:
- ${KERBERIZED_HDFS_NAME_PORT}
- ${KERBERIZED_HDFS_DATA_PORT}
depends_on:
- hdfskerberos
entrypoint: /etc/bootstrap.sh -d
hdfskerberos:
image: yandex/clickhouse-kerberos-kdc:${DOCKER_KERBEROS_KDC_TAG}
image: yandex/clickhouse-kerberos-kdc:${DOCKER_KERBEROS_KDC_TAG:-latest}
hostname: hdfskerberos
volumes:
- ${KERBERIZED_HDFS_DIR}/secrets:/tmp/keytab
- ${KERBERIZED_HDFS_DIR}/../../kerberos_image_config.sh:/config.sh
- /dev/urandom:/dev/random
ports: [88, 749]
expose: [88, 749]

View File

@ -23,13 +23,13 @@ services:
# restart: always
hostname: kerberized_kafka1
ports:
- "9092:9092"
- "9093:9093"
- ${KERBERIZED_KAFKA_EXTERNAL_PORT}:${KERBERIZED_KAFKA_EXTERNAL_PORT}
environment:
KAFKA_LISTENERS: OUTSIDE://:19092,UNSECURED_OUTSIDE://:19093,UNSECURED_INSIDE://:9093
KAFKA_ADVERTISED_LISTENERS: OUTSIDE://kerberized_kafka1:19092,UNSECURED_OUTSIDE://kerberized_kafka1:19093,UNSECURED_INSIDE://localhost:9093
KAFKA_LISTENERS: OUTSIDE://:19092,UNSECURED_OUTSIDE://:19093,UNSECURED_INSIDE://0.0.0.0:${KERBERIZED_KAFKA_EXTERNAL_PORT}
KAFKA_ADVERTISED_LISTENERS: OUTSIDE://kerberized_kafka1:19092,UNSECURED_OUTSIDE://kerberized_kafka1:19093,UNSECURED_INSIDE://localhost:${KERBERIZED_KAFKA_EXTERNAL_PORT}
# KAFKA_LISTENERS: INSIDE://kerberized_kafka1:9092,OUTSIDE://kerberized_kafka1:19092
# KAFKA_ADVERTISED_LISTENERS: INSIDE://localhost:9092,OUTSIDE://kerberized_kafka1:19092
KAFKA_ADVERTISED_HOST_NAME: kerberized_kafka1
KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL: GSSAPI
KAFKA_SASL_ENABLED_MECHANISMS: GSSAPI
KAFKA_SASL_KERBEROS_SERVICE_NAME: kafka

View File

@ -6,8 +6,8 @@ services:
volumes:
- data1-1:/data1
- ${MINIO_CERTS_DIR:-}:/certs
ports:
- "9001:9001"
expose:
- ${MINIO_PORT}
environment:
MINIO_ACCESS_KEY: minio
MINIO_SECRET_KEY: minio123
@ -20,14 +20,14 @@ services:
# HTTP proxies for Minio.
proxy1:
image: yandex/clickhouse-s3-proxy
ports:
expose:
- "8080" # Redirect proxy port
- "80" # Reverse proxy port
- "443" # Reverse proxy port (secure)
proxy2:
image: yandex/clickhouse-s3-proxy
ports:
expose:
- "8080"
- "80"
- "443"
@ -35,7 +35,7 @@ services:
# Empty container to run proxy resolver.
resolver:
image: yandex/clickhouse-python-bottle
ports:
expose:
- "8080"
tty: true
depends_on:

View File

@ -7,5 +7,5 @@ services:
MONGO_INITDB_ROOT_USERNAME: root
MONGO_INITDB_ROOT_PASSWORD: clickhouse
ports:
- 27018:27017
- ${MONGO_EXTERNAL_PORT}:${MONGO_INTERNAL_PORT}
command: --profile=2 --verbose

View File

@ -1,10 +1,24 @@
version: '2.3'
services:
mysql1:
mysql57:
image: mysql:5.7
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 3308:3306
command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency
MYSQL_ROOT_HOST: ${MYSQL_ROOT_HOST}
DATADIR: /mysql/
expose:
- ${MYSQL_PORT}
command: --server_id=100
--log-bin='mysql-bin-1.log'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3
--log-error=/mysql/error.log
--general-log=ON
--general-log-file=/mysql/general.log
volumes:
- type: ${MYSQL_LOGS_FS:-tmpfs}
source: ${MYSQL_LOGS:-}
target: /mysql/

View File

@ -12,3 +12,10 @@ services:
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3
--log-error=/var/log/mysqld/error.log
--general-log=ON
--general-log-file=/var/log/mysqld/general.log
volumes:
- type: ${MYSQL_LOGS_FS:-tmpfs}
source: ${MYSQL_LOGS:-}
target: /var/log/mysqld/

View File

@ -0,0 +1,23 @@
version: '2.3'
services:
mysql80:
image: mysql:8.0
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
MYSQL_ROOT_HOST: ${MYSQL_ROOT_HOST}
DATADIR: /mysql/
expose:
- ${MYSQL8_PORT}
command: --server_id=100 --log-bin='mysql-bin-1.log'
--default_authentication_plugin='mysql_native_password'
--default-time-zone='+3:00' --gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3
--log-error=/mysql/error.log
--general-log=ON
--general-log-file=/mysql/general.log
volumes:
- type: ${MYSQL8_LOGS_FS:-tmpfs}
source: ${MYSQL8_LOGS:-}
target: /mysql/

View File

@ -1,15 +0,0 @@
version: '2.3'
services:
mysql8_0:
image: mysql:8.0
restart: 'no'
environment:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 3309:3306
command: --server_id=100 --log-bin='mysql-bin-1.log'
--default_authentication_plugin='mysql_native_password'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3

View File

@ -1,6 +1,6 @@
version: '2.3'
services:
mysql1:
mysql_client:
image: mysql:5.7
restart: always
environment:

View File

@ -5,19 +5,64 @@ services:
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 3348:3306
MYSQL_ROOT_HOST: ${MYSQL_CLUSTER_ROOT_HOST}
DATADIR: /mysql/
expose:
- ${MYSQL_CLUSTER_PORT}
command: --server_id=100
--log-bin='mysql-bin-2.log'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3
--log-error=/mysql/2_error.log
--general-log=ON
--general-log-file=/mysql/2_general.log
volumes:
- type: ${MYSQL_CLUSTER_LOGS_FS:-tmpfs}
source: ${MYSQL_CLUSTER_LOGS:-}
target: /mysql/
mysql3:
image: mysql:5.7
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 3388:3306
MYSQL_ROOT_HOST: ${MYSQL_CLUSTER_ROOT_HOST}
DATADIR: /mysql/
expose:
- ${MYSQL_CLUSTER_PORT}
command: --server_id=100
--log-bin='mysql-bin-3.log'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3
--log-error=/mysql/3_error.log
--general-log=ON
--general-log-file=/mysql/3_general.log
volumes:
- type: ${MYSQL_CLUSTER_LOGS_FS:-tmpfs}
source: ${MYSQL_CLUSTER_LOGS:-}
target: /mysql/
mysql4:
image: mysql:5.7
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 3368:3306
MYSQL_ROOT_HOST: ${MYSQL_CLUSTER_ROOT_HOST}
DATADIR: /mysql/
expose:
- ${MYSQL_CLUSTER_PORT}
command: --server_id=100
--log-bin='mysql-bin-4.log'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3
--log-error=/mysql/4_error.log
--general-log=ON
--general-log-file=/mysql/4_general.log
volumes:
- type: ${MYSQL_CLUSTER_LOGS_FS:-tmpfs}
source: ${MYSQL_CLUSTER_LOGS:-}
target: /mysql/

View File

@ -2,12 +2,24 @@ version: '2.3'
services:
postgres1:
image: postgres
command: ["postgres", "-c", "logging_collector=on", "-c", "log_directory=/postgres/logs", "-c", "log_filename=postgresql.log", "-c", "log_statement=all"]
restart: always
environment:
POSTGRES_PASSWORD: mysecretpassword
ports:
- 5432:5432
expose:
- ${POSTGRES_PORT}
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
networks:
default:
aliases:
- postgre-sql.local
default:
aliases:
- postgre-sql.local
environment:
POSTGRES_HOST_AUTH_METHOD: "trust"
POSTGRES_PASSWORD: mysecretpassword
PGDATA: /postgres/data
volumes:
- type: ${POSTGRES_LOGS_FS:-tmpfs}
source: ${POSTGRES_DIR:-}
target: /postgres/

View File

@ -2,22 +2,43 @@ version: '2.3'
services:
postgres2:
image: postgres
command: ["postgres", "-c", "logging_collector=on", "-c", "log_directory=/postgres/logs", "-c", "log_filename=postgresql.log", "-c", "log_statement=all"]
restart: always
environment:
POSTGRES_HOST_AUTH_METHOD: "trust"
POSTGRES_PASSWORD: mysecretpassword
ports:
- 5421:5432
PGDATA: /postgres/data
expose:
- ${POSTGRES_PORT}
volumes:
- type: ${POSTGRES_LOGS_FS:-tmpfs}
source: ${POSTGRES2_DIR:-}
target: /postgres/
postgres3:
image: postgres
command: ["postgres", "-c", "logging_collector=on", "-c", "log_directory=/postgres/logs", "-c", "log_filename=postgresql.log", "-c", "log_statement=all"]
restart: always
environment:
POSTGRES_HOST_AUTH_METHOD: "trust"
POSTGRES_PASSWORD: mysecretpassword
ports:
- 5441:5432
PGDATA: /postgres/data
expose:
- ${POSTGRES_PORT}
volumes:
- type: ${POSTGRES_LOGS_FS:-tmpfs}
source: ${POSTGRES3_DIR:-}
target: /postgres/
postgres4:
image: postgres
command: ["postgres", "-c", "logging_collector=on", "-c", "log_directory=/postgres/logs", "-c", "log_filename=postgresql.log", "-c", "log_statement=all"]
restart: always
environment:
POSTGRES_HOST_AUTH_METHOD: "trust"
POSTGRES_PASSWORD: mysecretpassword
ports:
- 5461:5432
PGDATA: /postgres/data
expose:
- ${POSTGRES_PORT}
volumes:
- type: ${POSTGRES_LOGS_FS:-tmpfs}
source: ${POSTGRES4_DIR:-}
target: /postgres/

View File

@ -2,11 +2,15 @@ version: '2.3'
services:
rabbitmq1:
image: rabbitmq:3-management
image: rabbitmq:3-management-alpine
hostname: rabbitmq1
ports:
- "5672:5672"
- "15672:15672"
expose:
- ${RABBITMQ_PORT}
environment:
RABBITMQ_DEFAULT_USER: "root"
RABBITMQ_DEFAULT_PASS: "clickhouse"
RABBITMQ_LOG_BASE: /rabbitmq_logs/
volumes:
- type: ${RABBITMQ_LOGS_FS:-tmpfs}
source: ${RABBITMQ_LOGS:-}
target: /rabbitmq_logs/

View File

@ -4,5 +4,5 @@ services:
image: redis
restart: always
ports:
- 6380:6379
- ${REDIS_EXTERNAL_PORT}:${REDIS_INTERNAL_PORT}
command: redis-server --requirepass "clickhouse" --databases 32

View File

@ -0,0 +1,75 @@
version: '2.3'
services:
zoo1:
image: zookeeper:3.6.2
restart: always
environment:
ZOO_TICK_TIME: 500
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
ZOO_MY_ID: 1
JVMFLAGS: -Dzookeeper.forceSync=no
ZOO_SECURE_CLIENT_PORT: $ZOO_SECURE_CLIENT_PORT
command: ["zkServer.sh", "start-foreground"]
entrypoint: /zookeeper-ssl-entrypoint.sh
volumes:
- type: bind
source: /misc/zookeeper-ssl-entrypoint.sh
target: /zookeeper-ssl-entrypoint.sh
- type: bind
source: /misc/client.crt
target: /clickhouse-config/client.crt
- type: ${ZK_FS:-tmpfs}
source: ${ZK_DATA1:-}
target: /data
- type: ${ZK_FS:-tmpfs}
source: ${ZK_DATA_LOG1:-}
target: /datalog
zoo2:
image: zookeeper:3.6.2
restart: always
environment:
ZOO_TICK_TIME: 500
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888
ZOO_MY_ID: 2
JVMFLAGS: -Dzookeeper.forceSync=no
ZOO_SECURE_CLIENT_PORT: $ZOO_SECURE_CLIENT_PORT
command: ["zkServer.sh", "start-foreground"]
entrypoint: /zookeeper-ssl-entrypoint.sh
volumes:
- type: bind
source: /misc/zookeeper-ssl-entrypoint.sh
target: /zookeeper-ssl-entrypoint.sh
- type: bind
source: /misc/client.crt
target: /clickhouse-config/client.crt
- type: ${ZK_FS:-tmpfs}
source: ${ZK_DATA2:-}
target: /data
- type: ${ZK_FS:-tmpfs}
source: ${ZK_DATA_LOG2:-}
target: /datalog
zoo3:
image: zookeeper:3.6.2
restart: always
environment:
ZOO_TICK_TIME: 500
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
ZOO_MY_ID: 3
JVMFLAGS: -Dzookeeper.forceSync=no
ZOO_SECURE_CLIENT_PORT: $ZOO_SECURE_CLIENT_PORT
command: ["zkServer.sh", "start-foreground"]
entrypoint: /zookeeper-ssl-entrypoint.sh
volumes:
- type: bind
source: /misc/zookeeper-ssl-entrypoint.sh
target: /zookeeper-ssl-entrypoint.sh
- type: bind
source: /misc/client.crt
target: /clickhouse-config/client.crt
- type: ${ZK_FS:-tmpfs}
source: ${ZK_DATA3:-}
target: /data
- type: ${ZK_FS:-tmpfs}
source: ${ZK_DATA_LOG3:-}
target: /datalog
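
Each node above is started through `/zookeeper-ssl-entrypoint.sh` and reads `$ZOO_SECURE_CLIENT_PORT` from the environment. A rough smoke test sketch; the compose file name, the port value, and the assumption that the secure port is published to the host are all illustrative, not taken from the diff:

```
# Hypothetical check that the secure client port answers TLS after start-up.
export ZOO_SECURE_CLIENT_PORT=2281
docker-compose -f docker_compose_zookeeper_secure.yml up -d
openssl s_client -connect localhost:"$ZOO_SECURE_CLIENT_PORT" </dev/null 2>/dev/null \
    | grep -q 'BEGIN CERTIFICATE' && echo "TLS handshake OK"
```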

View File

@ -2,17 +2,17 @@
set -e
mkdir -p /etc/docker/
cat > /etc/docker/daemon.json << EOF
{
echo '{
"ipv6": true,
"fixed-cidr-v6": "fd00::/8",
"ip-forward": true,
"log-level": "debug",
"storage-driver": "overlay2",
"insecure-registries" : ["dockerhub-proxy.sas.yp-c.yandex.net:5000"],
"registry-mirrors" : ["http://dockerhub-proxy.sas.yp-c.yandex.net:5000"]
}
EOF
}' | dd of=/etc/docker/daemon.json 2>/dev/null
dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2375 &>/var/log/somefile &
dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2375 --default-address-pool base=172.17.0.0/12,size=24 &>/ClickHouse/tests/integration/dockerd.log &
set +e
reties=0
@ -27,6 +27,10 @@ while true; do
done
set -e
# cleanup for retry run if volume is not recreated
docker kill "$(docker ps -aq)" || true
docker rm "$(docker ps -aq)" || true
echo "Start tests"
export CLICKHOUSE_TESTS_SERVER_BIN_PATH=/clickhouse
export CLICKHOUSE_TESTS_CLIENT_BIN_PATH=/clickhouse
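
A side note on the retry cleanup in the hunk above: `docker kill "$(docker ps -aq)"` passes the whole newline-separated ID list as a single argument, which is why the `|| true` is needed once more than one container is left over. A variant that handles several leftovers (and none at all) could look like this; it is only a sketch, not what the commit does:

```
# Hypothetical alternative to the cleanup above: xargs splits the ID list into
# separate arguments, and --no-run-if-empty skips the call when nothing is left.
docker ps -aq | xargs --no-run-if-empty docker kill || true
docker ps -aq | xargs --no-run-if-empty docker rm   || true
```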

View File

@ -0,0 +1,19 @@
-----BEGIN CERTIFICATE-----
MIIC/TCCAeWgAwIBAgIJANjx1QSR77HBMA0GCSqGSIb3DQEBCwUAMBQxEjAQBgNV
BAMMCWxvY2FsaG9zdDAgFw0xODA3MzAxODE2MDhaGA8yMjkyMDUxNDE4MTYwOFow
FDESMBAGA1UEAwwJbG9jYWxob3N0MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIB
CgKCAQEAs9uSo6lJG8o8pw0fbVGVu0tPOljSWcVSXH9uiJBwlZLQnhN4SFSFohfI
4K8U1tBDTnxPLUo/V1K9yzoLiRDGMkwVj6+4+hE2udS2ePTQv5oaMeJ9wrs+5c9T
4pOtlq3pLAdm04ZMB1nbrEysceVudHRkQbGHzHp6VG29Fw7Ga6YpqyHQihRmEkTU
7UCYNA+Vk7aDPdMS/khweyTpXYZimaK9f0ECU3/VOeG3fH6Sp2X6FN4tUj/aFXEj
sRmU5G2TlYiSIUMF2JPdhSihfk1hJVALrHPTU38SOL+GyyBRWdNcrIwVwbpvsvPg
pryMSNxnpr0AK0dFhjwnupIv5hJIOQIDAQABo1AwTjAdBgNVHQ4EFgQUjPLb3uYC
kcamyZHK4/EV8jAP0wQwHwYDVR0jBBgwFoAUjPLb3uYCkcamyZHK4/EV8jAP0wQw
DAYDVR0TBAUwAwEB/zANBgkqhkiG9w0BAQsFAAOCAQEAM/ocuDvfPus/KpMVD51j
4IdlU8R0vmnYLQ+ygzOAo7+hUWP5j0yvq4ILWNmQX6HNvUggCgFv9bjwDFhb/5Vr
85ieWfTd9+LTjrOzTw4avdGwpX9G+6jJJSSq15tw5ElOIFb/qNA9O4dBiu8vn03C
L/zRSXrARhSqTW5w/tZkUcSTT+M5h28+Lgn9ysx4Ff5vi44LJ1NnrbJbEAIYsAAD
+UA+4MBFKx1r6hHINULev8+lCfkpwIaeS8RL+op4fr6kQPxnULw8wT8gkuc8I4+L
P9gg/xDHB44T3ADGZ5Ib6O0DJaNiToO6rnoaaxs0KkotbvDWvRoxEytSbXKoYjYp
0g==
-----END CERTIFICATE-----

View File

@ -81,8 +81,8 @@ if [[ ! -f "$ZOO_DATA_DIR/myid" ]]; then
echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"
fi
mkdir -p $(dirname $ZOO_SSL_KEYSTORE_LOCATION)
mkdir -p $(dirname $ZOO_SSL_TRUSTSTORE_LOCATION)
mkdir -p "$(dirname $ZOO_SSL_KEYSTORE_LOCATION)"
mkdir -p "$(dirname $ZOO_SSL_TRUSTSTORE_LOCATION)"
if [[ ! -f "$ZOO_SSL_KEYSTORE_LOCATION" ]]; then
keytool -genkeypair -alias zookeeper -keyalg RSA -validity 365 -keysize 2048 -dname "cn=zookeeper" -keypass password -keystore $ZOO_SSL_KEYSTORE_LOCATION -storepass password -deststoretype pkcs12
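
The change above only adds quoting around the two `dirname` command substitutions. A small illustration, using a made-up path, of why that matters:

```
# Illustration only (the path is made up): without the quotes, the result of
# $(dirname ...) is word-split by the shell, so a keystore location containing
# a space turns into several mkdir arguments instead of one directory.
ZOO_SSL_KEYSTORE_LOCATION='/conf/ssl certs/keystore.jks'
mkdir -p "$(dirname "$ZOO_SSL_KEYSTORE_LOCATION")"   # creates '/conf/ssl certs'
```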

View File

@ -552,6 +552,66 @@ create table query_metric_stats_denorm engine File(TSVWithNamesAndTypes,
order by test, query_index, metric_name
;
" 2> >(tee -a analyze/errors.log 1>&2)
# Fetch historical query variability thresholds from the CI database
if [ -v CHPC_DATABASE_URL ]
then
set +x # Don't show password in the log
client=(clickhouse-client
# Surprisingly, clickhouse-client doesn't understand --host 127.0.0.1:9000
# so I have to extract host and port with clickhouse-local. I tried to use
# Poco URI parser to support this in the client, but it's broken and can't
# parse host:port.
$(clickhouse-local --query "with '${CHPC_DATABASE_URL}' as url select '--host ' || domain(url) || ' --port ' || toString(port(url)) format TSV")
--secure
--user "${CHPC_DATABASE_USER}"
--password "${CHPC_DATABASE_PASSWORD}"
--config "right/config/client_config.xml"
--database perftest
--date_time_input_format=best_effort)
# Precision is going to be 1.5 times worse for PRs, because we run the queries
# less times. How do I know it? I ran this:
# SELECT quantilesExact(0., 0.1, 0.5, 0.75, 0.95, 1.)(p / m)
# FROM
# (
# SELECT
# quantileIf(0.95)(stat_threshold, pr_number = 0) AS m,
# quantileIf(0.95)(stat_threshold, (pr_number != 0) AND (abs(diff) < stat_threshold)) AS p
# FROM query_metrics_v2
# WHERE (event_date > (today() - toIntervalMonth(1))) AND (metric = 'client_time')
# GROUP BY
# test,
# query_index,
# query_display_name
# HAVING count(*) > 100
# )
#
# The file can be empty if the server is inaccessible, so we can't use
# TSVWithNamesAndTypes.
#
"${client[@]}" --query "
select test, query_index,
quantileExact(0.99)(abs(diff)) * 1.5 AS max_diff,
quantileExactIf(0.99)(stat_threshold, abs(diff) < stat_threshold) * 1.5 AS max_stat_threshold,
query_display_name
from query_metrics_v2
-- We use results at least one week in the past, so that the current
-- changes do not immediately influence the statistics, and we have
-- some time to notice that something is wrong.
where event_date between now() - interval 1 month - interval 1 week
and now() - interval 1 week
and metric = 'client_time'
and pr_number = 0
group by test, query_index, query_display_name
having count(*) > 100
" > analyze/historical-thresholds.tsv
set -x
else
touch analyze/historical-thresholds.tsv
fi
}
# Analyze results
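
The comment in the hunk above explains that `clickhouse-client` does not accept `--host host:port`, so host and port are pulled out of `$CHPC_DATABASE_URL` with `clickhouse-local` first. A standalone sketch of that extraction with a made-up URL:

```
# Illustration only; the URL is invented, the query mirrors the one used above.
url='https://ci-database.example.com:9440'
clickhouse-local --query "
    with '${url}' as url
    select '--host ' || domain(url) || ' --port ' || toString(port(url))
    format TSV"
# prints: --host ci-database.example.com --port 9440
```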
@ -596,6 +656,26 @@ create view query_metric_stats as
diff float, stat_threshold float')
;
create table report_thresholds engine File(TSVWithNamesAndTypes, 'report/thresholds.tsv')
as select
query_display_names.test test, query_display_names.query_index query_index,
ceil(greatest(0.1, historical_thresholds.max_diff,
test_thresholds.report_threshold), 2) changed_threshold,
ceil(greatest(0.2, historical_thresholds.max_stat_threshold,
test_thresholds.report_threshold + 0.1), 2) unstable_threshold,
query_display_names.query_display_name query_display_name
from query_display_names
left join file('analyze/historical-thresholds.tsv', TSV,
'test text, query_index int, max_diff float, max_stat_threshold float,
query_display_name text') historical_thresholds
on query_display_names.test = historical_thresholds.test
and query_display_names.query_index = historical_thresholds.query_index
and query_display_names.query_display_name = historical_thresholds.query_display_name
left join file('analyze/report-thresholds.tsv', TSV,
'test text, report_threshold float') test_thresholds
on query_display_names.test = test_thresholds.test
;
-- Main statistics for queries -- query time as reported in query log.
create table queries engine File(TSVWithNamesAndTypes, 'report/queries.tsv')
as select
@ -610,23 +690,23 @@ create table queries engine File(TSVWithNamesAndTypes, 'report/queries.tsv')
-- uncaught regressions, because for the default 7 runs we do for PRs,
-- the randomization distribution has only 16 values, so the max quantile
-- is actually 0.9375.
abs(diff) > report_threshold and abs(diff) >= stat_threshold as changed_fail,
abs(diff) > report_threshold - 0.05 and abs(diff) >= stat_threshold as changed_show,
abs(diff) > changed_threshold and abs(diff) >= stat_threshold as changed_fail,
abs(diff) > changed_threshold - 0.05 and abs(diff) >= stat_threshold as changed_show,
not changed_fail and stat_threshold > report_threshold + 0.10 as unstable_fail,
not changed_show and stat_threshold > report_threshold - 0.05 as unstable_show,
not changed_fail and stat_threshold > unstable_threshold as unstable_fail,
not changed_show and stat_threshold > unstable_threshold - 0.05 as unstable_show,
left, right, diff, stat_threshold,
if(report_threshold > 0, report_threshold, 0.10) as report_threshold,
query_metric_stats.test test, query_metric_stats.query_index query_index,
query_display_name
query_display_names.query_display_name query_display_name
from query_metric_stats
left join file('analyze/report-thresholds.tsv', TSV,
'test text, report_threshold float') thresholds
on query_metric_stats.test = thresholds.test
left join query_display_names
on query_metric_stats.test = query_display_names.test
and query_metric_stats.query_index = query_display_names.query_index
left join report_thresholds
on query_display_names.test = report_thresholds.test
and query_display_names.query_index = report_thresholds.query_index
and query_display_names.query_display_name = report_thresholds.query_display_name
-- 'server_time' is rounded down to ms, which might be bad for very short queries.
-- Use 'client_time' instead.
where metric_name = 'client_time'
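
The comment near the top of this hunk notes that with the default 7 runs per server the randomization distribution has only 16 values, so the highest reachable quantile is 0.9375. The arithmetic behind that figure, as a quick check:

```
# 16 values in the randomization distribution => quantile steps of 1/16,
# and the last step below the maximum is 15/16.
python3 -c "print(15 / 16)"   # 0.9375
```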
@ -889,7 +969,6 @@ create table all_query_metrics_tsv engine File(TSV, 'report/all-query-metrics.ts
order by test, query_index;
" 2> >(tee -a report/errors.log 1>&2)
# Prepare source data for metrics and flamegraphs for queries that were profiled
# by perf.py.
for version in {right,left}
@ -1148,6 +1227,55 @@ unset IFS
function upload_results
{
# Prepare info for the CI checks table.
rm ci-checks.tsv
clickhouse-local --query "
create view queries as select * from file('report/queries.tsv', TSVWithNamesAndTypes,
'changed_fail int, changed_show int, unstable_fail int, unstable_show int,
left float, right float, diff float, stat_threshold float,
test text, query_index int, query_display_name text');
create table ci_checks engine File(TSVWithNamesAndTypes, 'ci-checks.tsv')
as select
$PR_TO_TEST pull_request_number,
'$SHA_TO_TEST' commit_sha,
'Performance' check_name,
'$(sed -n 's/.*<!--status: \(.*\)-->/\1/p' report.html)' check_status,
-- TODO toDateTime() can't parse output of 'date', so no time for now.
($(date +%s) - $CHPC_CHECK_START_TIMESTAMP) * 1000 check_duration_ms,
fromUnixTimestamp($CHPC_CHECK_START_TIMESTAMP) check_start_time,
test_name,
test_status,
test_duration_ms,
report_url,
$PR_TO_TEST = 0
? 'https://github.com/ClickHouse/ClickHouse/commit/$SHA_TO_TEST'
: 'https://github.com/ClickHouse/ClickHouse/pull/$PR_TO_TEST' pull_request_url,
'' commit_url,
'' task_url,
'' base_ref,
'' base_repo,
'' head_ref,
'' head_repo
from (
select '' test_name,
'$(sed -n 's/.*<!--message: \(.*\)-->/\1/p' report.html)' test_status,
0 test_duration_ms,
'https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#fail1' report_url
union all
select test || ' #' || toString(query_index), 'slower' test_status, 0 test_duration_ms,
'https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#changes-in-performance.'
|| test || '.' || toString(query_index) report_url
from queries where changed_fail != 0 and diff > 0
union all
select test || ' #' || toString(query_index), 'unstable' test_status, 0 test_duration_ms,
'https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#unstable-queries.'
|| test || '.' || toString(query_index) report_url
from queries where unstable_fail != 0
)
;
"
if ! [ -v CHPC_DATABASE_URL ]
then
echo Database for test results is not specified, will not upload them.
@ -1216,6 +1344,10 @@ $REF_SHA $SHA_TO_TEST $(numactl --show | sed -n 's/^cpubind:[[:space:]]\+/numact
$REF_SHA $SHA_TO_TEST $(numactl --hardware | sed -n 's/^available:[[:space:]]\+/numactl-available /p')
EOF
# Also insert some data about the check into the CI checks table.
"${client[@]}" --query "INSERT INTO "'"'"gh-data"'"'".checks FORMAT TSVWithNamesAndTypes" \
< ci-checks.tsv
set -x
}

View File

@ -1,6 +1,9 @@
#!/bin/bash
set -ex
CHPC_CHECK_START_TIMESTAMP="$(date +%s)"
export CHPC_CHECK_START_TIMESTAMP
# Use the packaged repository to find the revision we will compare to.
function find_reference_sha
{

View File

@ -453,7 +453,10 @@ if args.report == 'main':
text += tableRow(r, attrs, anchor)
text += tableEnd()
tables.append(text)
# Don't add an empty table.
if very_unstable_queries:
tables.append(text)
add_unstable_queries()
@ -486,7 +489,7 @@ if args.report == 'main':
text = tableStart('Test Times')
text += tableHeader(columns, attrs)
allowed_average_run_time = 1.6 # 30 seconds per test at 7 runs
allowed_average_run_time = 3.75 # 60 seconds per test at (7 + 1) * 2 runs
for r in rows:
anchor = f'{currentTableAnchor()}.{r[0]}'
total_runs = (int(r[7]) + 1) * 2 # one prewarm run, two servers
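
The new constant keeps the budget of 60 seconds per test: with one prewarm run added to the 7 measured runs on each of the two servers there are 16 runs, hence 3.75 seconds per run on average. As a quick check (illustration only):

```
# 60 seconds per test spread over (7 + 1) runs on each of 2 servers:
python3 -c "print(60 / ((7 + 1) * 2))"   # 3.75
```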
@ -552,14 +555,15 @@ if args.report == 'main':
message_array.append(str(slower_queries) + ' slower')
if unstable_partial_queries:
unstable_queries += unstable_partial_queries
error_tests += unstable_partial_queries
very_unstable_queries += unstable_partial_queries
status = 'failure'
# Don't show mildly unstable queries, only the very unstable ones we
# treat as errors.
if very_unstable_queries:
status = 'failure'
if very_unstable_queries > 3:
error_tests += very_unstable_queries
status = 'failure'
message_array.append(str(very_unstable_queries) + ' unstable')
error_tests += slow_average_tests

View File

@ -2,7 +2,6 @@
FROM ubuntu:20.04
RUN apt-get update --yes && env DEBIAN_FRONTEND=noninteractive apt-get install wget unzip git openjdk-14-jdk maven python3 --yes --no-install-recommends
RUN wget https://github.com/sqlancer/sqlancer/archive/master.zip -O /sqlancer.zip
RUN mkdir /sqlancer && \
cd /sqlancer && \

Some files were not shown because too many files have changed in this diff