Merge and rebase with master

This commit is contained in:
commit a7933ddf31

CHANGELOG.md
@@ -1,3 +1,126 @@

### ClickHouse release 20.12

### ClickHouse release v20.12.3.3-stable, 2020-12-13

#### Backward Incompatible Change

* Enable `use_compact_format_in_distributed_parts_names` by default (see the documentation for the reference). [#16728](https://github.com/ClickHouse/ClickHouse/pull/16728) ([Azat Khuzhin](https://github.com/azat)).
* Accept user settings related to file formats (e.g. `format_csv_delimiter`) in the `SETTINGS` clause when creating a table that uses `File` engine, and use these settings in all `INSERT`s and `SELECT`s. The file format settings changed in the current user session, or in the `SETTINGS` clause of a DML query itself, no longer affect the query. [#16591](https://github.com/ClickHouse/ClickHouse/pull/16591) ([Alexander Kuzmenkov](https://github.com/akuzm)).
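A minimal sketch of the behaviour described in the previous item; the table name, columns and delimiter are made up for the example:

```sql
-- Format settings given in the CREATE query are stored with the table and
-- reused for every INSERT and SELECT on it; later session-level changes to
-- format_csv_delimiter no longer affect this table.
CREATE TABLE csv_table (id UInt32, name String)
ENGINE = File(CSV)
SETTINGS format_csv_delimiter = ';';
```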

#### New Feature

* Add `*.xz` compression/decompression support. It enables using `*.xz` files in the `file()` function. This closes [#8828](https://github.com/ClickHouse/ClickHouse/issues/8828). [#16578](https://github.com/ClickHouse/ClickHouse/pull/16578) ([Abi Palagashvili](https://github.com/fibersel)).
* Introduce the query `ALTER TABLE ... DROP|DETACH PART 'part_name'` (see the sketch after this list). [#15511](https://github.com/ClickHouse/ClickHouse/pull/15511) ([nvartolomei](https://github.com/nvartolomei)).
* Added new `ALTER UPDATE/DELETE IN PARTITION` syntax (also shown in the sketch after this list). [#13403](https://github.com/ClickHouse/ClickHouse/pull/13403) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Allow formatting named tuples as JSON objects when using JSON input/output formats, controlled by the `output_format_json_named_tuples_as_objects` setting, disabled by default. [#17175](https://github.com/ClickHouse/ClickHouse/pull/17175) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Add the possibility to input an enum value as its id in TSV and CSV formats by default. [#16834](https://github.com/ClickHouse/ClickHouse/pull/16834) ([Kruglov Pavel](https://github.com/Avogar)).
* Add `COLLATE` support for `Nullable`, `LowCardinality`, `Array` and `Tuple`, where the nested type is `String`. Also refactor the code associated with collations in ColumnString.cpp. [#16273](https://github.com/ClickHouse/ClickHouse/pull/16273) ([Kruglov Pavel](https://github.com/Avogar)).
* The new `tcpPort` function returns the TCP port this server listens on. [#17134](https://github.com/ClickHouse/ClickHouse/pull/17134) ([Ivan](https://github.com/abyss7)).
* Add new math functions: `acosh`, `asinh`, `atan2`, `atanh`, `cosh`, `hypot`, `log1p`, `sinh`. [#16636](https://github.com/ClickHouse/ClickHouse/pull/16636) ([Konstantin Malanchev](https://github.com/hombit)).
* Add the possibility to distribute merges between different replicas. Introduces the `execute_merges_on_single_replica_time_threshold` MergeTree setting. [#16424](https://github.com/ClickHouse/ClickHouse/pull/16424) ([filimonov](https://github.com/filimonov)).
* Add setting `aggregate_functions_null_for_empty` for SQL standard compatibility. This option rewrites all aggregate functions in a query, adding the `-OrNull` suffix to them (see the sketch after this list). Implements [#10273](https://github.com/ClickHouse/ClickHouse/issues/10273). [#16123](https://github.com/ClickHouse/ClickHouse/pull/16123) ([flynn](https://github.com/ucasFL)).
* Updated DateTime, DateTime64 parsing to accept string Date literal format. [#16040](https://github.com/ClickHouse/ClickHouse/pull/16040) ([Maksim Kita](https://github.com/kitaisreal)).
* Make it possible to change the path to the history file in `clickhouse-client` using the `--history_file` parameter. [#15960](https://github.com/ClickHouse/ClickHouse/pull/15960) ([Maksim Kita](https://github.com/kitaisreal)).
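A minimal sketch of the new `ALTER ... PART` and `ALTER ... IN PARTITION` syntax referenced above; the table, part and partition names are made up for the example:

```sql
-- Detach or drop a single data part by its name.
ALTER TABLE visits DETACH PART 'all_2_2_0';
ALTER TABLE visits DROP PART 'all_2_2_0';

-- Run a mutation only inside one partition.
ALTER TABLE visits UPDATE hits = hits + 1 IN PARTITION 201912 WHERE visit_date = '2019-12-01';
ALTER TABLE visits DELETE IN PARTITION 201912 WHERE visit_date = '2019-12-01';
```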
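And a rough sketch of what `aggregate_functions_null_for_empty` does; exact results may differ between versions:

```sql
-- By default, sum over an empty set returns 0.
SELECT sum(number) FROM numbers(0);

-- With the setting enabled the query is rewritten to sumOrNull, so an
-- aggregate over an empty set yields NULL, as the SQL standard expects.
SELECT sum(number) FROM numbers(0) SETTINGS aggregate_functions_null_for_empty = 1;
```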

#### Bug Fix

* Fix an issue where the server could stop accepting connections in very rare cases. [#17542](https://github.com/ClickHouse/ClickHouse/pull/17542) ([Amos Bird](https://github.com/amosbird)).
* Fixed `Function not implemented` error when executing `RENAME` query in `Atomic` database with ClickHouse running on Windows Subsystem for Linux. Fixes [#17661](https://github.com/ClickHouse/ClickHouse/issues/17661). [#17664](https://github.com/ClickHouse/ClickHouse/pull/17664) ([tavplubix](https://github.com/tavplubix)).
* Do not restore parts from WAL if `in_memory_parts_enable_wal` is disabled. [#17802](https://github.com/ClickHouse/ClickHouse/pull/17802) ([detailyang](https://github.com/detailyang)).
* Fix incorrect initialization of `max_compress_block_size` in `MergeTreeWriterSettings` with `min_compress_block_size`. [#17833](https://github.com/ClickHouse/ClickHouse/pull/17833) ([flynn](https://github.com/ucasFL)).
* Exception message about max table size to drop was displayed incorrectly. [#17764](https://github.com/ClickHouse/ClickHouse/pull/17764) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed possible segfault when there is not enough space when inserting into `Distributed` table. [#17737](https://github.com/ClickHouse/ClickHouse/pull/17737) ([tavplubix](https://github.com/tavplubix)).
* Fixed problem when ClickHouse fails to resume connection to MySQL servers. [#17681](https://github.com/ClickHouse/ClickHouse/pull/17681) ([Alexander Kazakov](https://github.com/Akazz)).
* It might be determined incorrectly whether a cluster is circular- (cross-) replicated or not when executing an `ON CLUSTER` query, due to a race condition when `pool_size` > 1. It's fixed. [#17640](https://github.com/ClickHouse/ClickHouse/pull/17640) ([tavplubix](https://github.com/tavplubix)).
* Exception `fmt::v7::format_error` can be logged in background for MergeTree tables. This fixes [#17613](https://github.com/ClickHouse/ClickHouse/issues/17613). [#17615](https://github.com/ClickHouse/ClickHouse/pull/17615) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* When clickhouse-client is used in interactive mode with multiline queries, a single-line comment was erroneously extended to the end of the query. This fixes [#13654](https://github.com/ClickHouse/ClickHouse/issues/13654). [#17565](https://github.com/ClickHouse/ClickHouse/pull/17565) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix an alter query hang when the corresponding mutation was killed on a different replica. Fixes [#16953](https://github.com/ClickHouse/ClickHouse/issues/16953). [#17499](https://github.com/ClickHouse/ClickHouse/pull/17499) ([alesapin](https://github.com/alesapin)).
* Fix issue when mark cache size was underestimated by clickhouse. It may happen when there are a lot of tiny files with marks. [#17496](https://github.com/ClickHouse/ClickHouse/pull/17496) ([alesapin](https://github.com/alesapin)).
* Fix `ORDER BY` with enabled setting `optimize_redundant_functions_in_order_by`. [#17471](https://github.com/ClickHouse/ClickHouse/pull/17471) ([Anton Popov](https://github.com/CurtizJ)).
* Fix duplicates after `DISTINCT` which were possible because of incorrect optimization. Fixes [#17294](https://github.com/ClickHouse/ClickHouse/issues/17294). [#17296](https://github.com/ClickHouse/ClickHouse/pull/17296) ([li chengxiang](https://github.com/chengxianglibra)). [#17439](https://github.com/ClickHouse/ClickHouse/pull/17439) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix crash while reading from `JOIN` table with `LowCardinality` types. Fixes [#17228](https://github.com/ClickHouse/ClickHouse/issues/17228). [#17397](https://github.com/ClickHouse/ClickHouse/pull/17397) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix `toInt256(inf)` stack overflow. Int256 is an experimental feature. Closes [#17235](https://github.com/ClickHouse/ClickHouse/issues/17235). [#17257](https://github.com/ClickHouse/ClickHouse/pull/17257) ([flynn](https://github.com/ucasFL)).
* Fix possible `Unexpected packet Data received from client` error logged for Distributed queries with `LIMIT`. [#17254](https://github.com/ClickHouse/ClickHouse/pull/17254) ([Azat Khuzhin](https://github.com/azat)).
* Fix set index invalidation when there are const columns in the subquery. This fixes [#17246](https://github.com/ClickHouse/ClickHouse/issues/17246). [#17249](https://github.com/ClickHouse/ClickHouse/pull/17249) ([Amos Bird](https://github.com/amosbird)).
* Fix possible wrong index analysis when the types of the index comparison are different. This fixes [#17122](https://github.com/ClickHouse/ClickHouse/issues/17122). [#17145](https://github.com/ClickHouse/ClickHouse/pull/17145) ([Amos Bird](https://github.com/amosbird)).
* Fix a ColumnConst comparison which led to a crash. This fixes [#17088](https://github.com/ClickHouse/ClickHouse/issues/17088). [#17135](https://github.com/ClickHouse/ClickHouse/pull/17135) ([Amos Bird](https://github.com/amosbird)).
* Multiple fixes for MaterializeMySQL (experimental feature). Fixes [#16923](https://github.com/ClickHouse/ClickHouse/issues/16923). Fixes [#15883](https://github.com/ClickHouse/ClickHouse/issues/15883). Fix MaterializeMySQL SYNC failure when MySQL `binlog_checksum` is modified. [#17091](https://github.com/ClickHouse/ClickHouse/pull/17091) ([Winter Zhang](https://github.com/zhang2014)).
* Fix a bug where `ON CLUSTER` queries could hang forever for non-leader ReplicatedMergeTree tables. [#17089](https://github.com/ClickHouse/ClickHouse/pull/17089) ([alesapin](https://github.com/alesapin)).
* Fixed a crash on `CREATE TABLE ... AS some_table` queries when `some_table` was created `AS table_function()`. Fixes [#16944](https://github.com/ClickHouse/ClickHouse/issues/16944). [#17072](https://github.com/ClickHouse/ClickHouse/pull/17072) ([tavplubix](https://github.com/tavplubix)).
* Fix the unfinished implementation of the function `fuzzBits`, related issue: [#16980](https://github.com/ClickHouse/ClickHouse/issues/16980). [#17051](https://github.com/ClickHouse/ClickHouse/pull/17051) ([hexiaoting](https://github.com/hexiaoting)).
* Fix LLVM's libunwind in the case when CFA register is RAX. This is the [bug](https://bugs.llvm.org/show_bug.cgi?id=48186) in [LLVM's libunwind](https://github.com/llvm/llvm-project/tree/master/libunwind). We already have workarounds for this bug. [#17046](https://github.com/ClickHouse/ClickHouse/pull/17046) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid unnecessary network errors for remote queries which may be cancelled during execution, like queries with `LIMIT`. [#17006](https://github.com/ClickHouse/ClickHouse/pull/17006) ([Azat Khuzhin](https://github.com/azat)).
* Fix `optimize_distributed_group_by_sharding_key` setting (that is disabled by default) for query with OFFSET only. [#16996](https://github.com/ClickHouse/ClickHouse/pull/16996) ([Azat Khuzhin](https://github.com/azat)).
* Fix for Merge tables over Distributed tables with JOIN. [#16993](https://github.com/ClickHouse/ClickHouse/pull/16993) ([Azat Khuzhin](https://github.com/azat)).
* Fixed wrong result in big integers (128, 256 bit) when casting from double. Big integers support is experimental. [#16986](https://github.com/ClickHouse/ClickHouse/pull/16986) ([Mike](https://github.com/myrrc)).
* Fix a possible server crash after `ALTER TABLE ... MODIFY COLUMN ... NewType` when a `SELECT` has a `WHERE` expression on the column being altered and the alter hasn't finished yet. [#16968](https://github.com/ClickHouse/ClickHouse/pull/16968) ([Amos Bird](https://github.com/amosbird)).
* Blame info was not calculated correctly in `clickhouse-git-import`. [#16959](https://github.com/ClickHouse/ClickHouse/pull/16959) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix order by optimization with monotonous functions. Fixes [#16107](https://github.com/ClickHouse/ClickHouse/issues/16107). [#16956](https://github.com/ClickHouse/ClickHouse/pull/16956) ([Anton Popov](https://github.com/CurtizJ)).
* Fix optimization of group by with enabled setting `optimize_aggregators_of_group_by_keys` and joins. Fixes [#12604](https://github.com/ClickHouse/ClickHouse/issues/12604). [#16951](https://github.com/ClickHouse/ClickHouse/pull/16951) ([Anton Popov](https://github.com/CurtizJ)).
* Fix possible error `Illegal type of argument` for queries with `ORDER BY`. Fixes [#16580](https://github.com/ClickHouse/ClickHouse/issues/16580). [#16928](https://github.com/ClickHouse/ClickHouse/pull/16928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix strange code in InterpreterShowAccessQuery. [#16866](https://github.com/ClickHouse/ClickHouse/pull/16866) ([tavplubix](https://github.com/tavplubix)).
* Prevent clickhouse server crashes when using the function `timeSeriesGroupSum`. The function is removed from newer ClickHouse releases. [#16865](https://github.com/ClickHouse/ClickHouse/pull/16865) ([filimonov](https://github.com/filimonov)).
* Fix rare silent crashes when query profiler is on and ClickHouse is installed on OS with glibc version that has (supposedly) broken asynchronous unwind tables for some functions. This fixes [#15301](https://github.com/ClickHouse/ClickHouse/issues/15301). This fixes [#13098](https://github.com/ClickHouse/ClickHouse/issues/13098). [#16846](https://github.com/ClickHouse/ClickHouse/pull/16846) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix crash when using `any` without any arguments. This is for [#16803](https://github.com/ClickHouse/ClickHouse/issues/16803) . cc @azat. [#16826](https://github.com/ClickHouse/ClickHouse/pull/16826) ([Amos Bird](https://github.com/amosbird)).
* If no memory could be allocated while writing table metadata to disk, a broken metadata file could be written. [#16772](https://github.com/ClickHouse/ClickHouse/pull/16772) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix trivial query optimization with partition predicate. [#16767](https://github.com/ClickHouse/ClickHouse/pull/16767) ([Azat Khuzhin](https://github.com/azat)).
* Fix the `IN` operator over several columns and tuples with the `transform_null_in` setting enabled (see the sketch after this list). Fixes [#15310](https://github.com/ClickHouse/ClickHouse/issues/15310). [#16722](https://github.com/ClickHouse/ClickHouse/pull/16722) ([Anton Popov](https://github.com/CurtizJ)).
* Return number of affected rows for INSERT queries via MySQL protocol. Previously ClickHouse used to always return 0, it's fixed. Fixes [#16605](https://github.com/ClickHouse/ClickHouse/issues/16605). [#16715](https://github.com/ClickHouse/ClickHouse/pull/16715) ([Winter Zhang](https://github.com/zhang2014)).
* Fix remote query failure when using 'if' suffix aggregate function. Fixes [#16574](https://github.com/ClickHouse/ClickHouse/issues/16574) Fixes [#16231](https://github.com/ClickHouse/ClickHouse/issues/16231) [#16610](https://github.com/ClickHouse/ClickHouse/pull/16610) ([Winter Zhang](https://github.com/zhang2014)).
* Fix inconsistent behavior caused by `select_sequential_consistency` for optimized trivial count query and system.tables. [#16309](https://github.com/ClickHouse/ClickHouse/pull/16309) ([Hao Chen](https://github.com/haoch)).
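For the `transform_null_in` fix mentioned above, a small sketch of the intended semantics; this is illustrative only and exact results may differ between versions:

```sql
-- By default NULL never matches inside IN, so this returns 0.
SELECT (1, NULL) IN ((1, NULL));

-- With transform_null_in enabled, NULLs are compared as ordinary values,
-- so the tuple matches and the query returns 1.
SELECT (1, NULL) IN ((1, NULL)) SETTINGS transform_null_in = 1;
```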

#### Improvement

* Remove empty parts after they were pruned by TTL, mutation, or collapsing merge algorithm. [#16895](https://github.com/ClickHouse/ClickHouse/pull/16895) ([Anton Popov](https://github.com/CurtizJ)).
* Enable compact format of directories for asynchronous sends in Distributed tables: `use_compact_format_in_distributed_parts_names` is set to 1 by default. [#16788](https://github.com/ClickHouse/ClickHouse/pull/16788) ([Azat Khuzhin](https://github.com/azat)).
* Abort multipart upload if no data was written to S3. [#16840](https://github.com/ClickHouse/ClickHouse/pull/16840) ([Pavel Kovalenko](https://github.com/Jokser)).
* Reresolve the IP of the `format_avro_schema_registry_url` in case of errors. [#16985](https://github.com/ClickHouse/ClickHouse/pull/16985) ([filimonov](https://github.com/filimonov)).
* Mask password in data_path in the system.distribution_queue. [#16727](https://github.com/ClickHouse/ClickHouse/pull/16727) ([Azat Khuzhin](https://github.com/azat)).
* Throw an error when a column transformer replaces a non-existing column. [#16183](https://github.com/ClickHouse/ClickHouse/pull/16183) ([hexiaoting](https://github.com/hexiaoting)).
* Turn off parallel parsing when there is not enough memory for all threads to work simultaneously. Also, there could be exceptions like "Memory limit exceeded" when someone tries to insert extremely huge rows (> `min_chunk_bytes_for_parallel_parsing`), because each piece to parse has to be an independent set of strings (one or more). [#16721](https://github.com/ClickHouse/ClickHouse/pull/16721) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Install script should always create subdirs in config folders. This is only relevant for Docker build with custom config. [#16936](https://github.com/ClickHouse/ClickHouse/pull/16936) ([filimonov](https://github.com/filimonov)).
* Correct grammar in error message in JSONEachRow, JSONCompactEachRow, and RegexpRow input formats. [#17205](https://github.com/ClickHouse/ClickHouse/pull/17205) ([nico piderman](https://github.com/sneako)).
* Set default `host` and `port` parameters for `SOURCE(CLICKHOUSE(...))` to current instance and set default `user` value to `'default'`. [#16997](https://github.com/ClickHouse/ClickHouse/pull/16997) ([vdimir](https://github.com/vdimir)).
* Throw an informative error message when doing `ATTACH/DETACH TABLE <DICTIONARY>`. Before this PR, `DETACH TABLE <dict>` worked but led to ill-formed in-memory metadata. [#16885](https://github.com/ClickHouse/ClickHouse/pull/16885) ([Amos Bird](https://github.com/amosbird)).
* Add cutToFirstSignificantSubdomainWithWWW(). [#16845](https://github.com/ClickHouse/ClickHouse/pull/16845) ([Azat Khuzhin](https://github.com/azat)).
* The server now refuses to start up with an exception message if a wrong config is given (`metric_log`.`collect_interval_milliseconds` is missing). [#16815](https://github.com/ClickHouse/ClickHouse/pull/16815) ([Ivan](https://github.com/abyss7)).
* Better exception message when configuration for distributed DDL is absent. This fixes [#5075](https://github.com/ClickHouse/ClickHouse/issues/5075). [#16769](https://github.com/ClickHouse/ClickHouse/pull/16769) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Usability improvement: better suggestions in syntax error message when `CODEC` expression is misplaced in `CREATE TABLE` query. This fixes [#12493](https://github.com/ClickHouse/ClickHouse/issues/12493). [#16768](https://github.com/ClickHouse/ClickHouse/pull/16768) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Remove empty directories for async INSERT at start of Distributed engine. [#16729](https://github.com/ClickHouse/ClickHouse/pull/16729) ([Azat Khuzhin](https://github.com/azat)).
* Workaround for using S3 with an nginx server as a proxy. Nginx currently does not accept URLs with an empty path like `http://domain.com?delete`, but vanilla aws-sdk-cpp produces this kind of URL. This commit uses a patched aws-sdk-cpp version, which makes URLs with "/" as the path in such cases, like `http://domain.com/?delete`. [#16709](https://github.com/ClickHouse/ClickHouse/pull/16709) ([ianton-ru](https://github.com/ianton-ru)).
* Allow `reinterpretAs*` functions to work for integers and floats of the same size (see the sketch after this list). Implements [#16640](https://github.com/ClickHouse/ClickHouse/issues/16640). [#16657](https://github.com/ClickHouse/ClickHouse/pull/16657) ([flynn](https://github.com/ucasFL)).
* Now the `<auxiliary_zookeepers>` configuration can be changed in `config.xml` and reloaded without restarting the server. [#16627](https://github.com/ClickHouse/ClickHouse/pull/16627) ([Amos Bird](https://github.com/amosbird)).
* Support SNI in https connections to remote resources. This will allow to connect to Cloudflare servers that require SNI. This fixes [#10055](https://github.com/ClickHouse/ClickHouse/issues/10055). [#16252](https://github.com/ClickHouse/ClickHouse/pull/16252) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Make it possible to connect to `clickhouse-server` secure endpoint which requires SNI. This is possible when `clickhouse-server` is hosted behind TLS proxy. [#16938](https://github.com/ClickHouse/ClickHouse/pull/16938) ([filimonov](https://github.com/filimonov)).
* Fix possible stack overflow if a loop of materialized views is created. This closes [#15732](https://github.com/ClickHouse/ClickHouse/issues/15732). [#16048](https://github.com/ClickHouse/ClickHouse/pull/16048) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Simplify the implementation of background tasks processing for the MergeTree table engines family. There should be no visible changes for user. [#15983](https://github.com/ClickHouse/ClickHouse/pull/15983) ([alesapin](https://github.com/alesapin)).
* Improvement for MaterializeMySQL (experimental feature). Throw an exception describing the required sync privileges when the MySQL sync user has the wrong privileges. [#15977](https://github.com/ClickHouse/ClickHouse/pull/15977) ([TCeason](https://github.com/TCeason)).
* Made `indexOf()` use BloomFilter. [#14977](https://github.com/ClickHouse/ClickHouse/pull/14977) ([achimbab](https://github.com/achimbab)).
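To illustrate the `reinterpretAs*` change referenced above, a small sketch; the numeric value shown is the IEEE-754 bit pattern of 1.0f and is given for illustration only:

```sql
-- Reinterpret the raw bytes of a Float32 as a UInt32 of the same size, and back.
SELECT reinterpretAsUInt32(toFloat32(1.0));          -- 1065353216 (0x3F800000)
SELECT reinterpretAsFloat32(toUInt32(1065353216));   -- 1
```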

#### Performance Improvement

* Use the Floyd-Rivest algorithm; it is the best for the ClickHouse use case of partial sorting. Benchmarks are in https://github.com/danlark1/miniselect and [here](https://drive.google.com/drive/folders/1DHEaeXgZuX6AJ9eByeZ8iQVQv0ueP8XM). [#16825](https://github.com/ClickHouse/ClickHouse/pull/16825) ([Danila Kutenin](https://github.com/danlark1)).
* Now the `ReplicatedMergeTree` engine family uses a separate thread pool for replicated fetches. The size of the pool is limited by the setting `background_fetches_pool_size`, which can be tuned with a server restart. The default value of the setting is 3, which means that the maximum number of parallel fetches is 3 (and it allows utilizing a 10G network). Fixes #520. [#16390](https://github.com/ClickHouse/ClickHouse/pull/16390) ([alesapin](https://github.com/alesapin)).
* Fixed uncontrolled growth of the state of `quantileTDigest`. [#16680](https://github.com/ClickHouse/ClickHouse/pull/16680) ([hrissan](https://github.com/hrissan)).
* Add `VIEW` subquery description to `EXPLAIN`. Limit push down optimisation for `VIEW`. Add local replicas of `Distributed` to query plan. [#14936](https://github.com/ClickHouse/ClickHouse/pull/14936) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix optimize_read_in_order/optimize_aggregation_in_order with max_threads > 0 and expression in ORDER BY. [#16637](https://github.com/ClickHouse/ClickHouse/pull/16637) ([Azat Khuzhin](https://github.com/azat)).
* Fix performance of reading from `Merge` tables over huge number of `MergeTree` tables. Fixes [#7748](https://github.com/ClickHouse/ClickHouse/issues/7748). [#16988](https://github.com/ClickHouse/ClickHouse/pull/16988) ([Anton Popov](https://github.com/CurtizJ)).
* Now we can safely prune partitions with an exact match. A useful case: suppose a table is partitioned by `intHash64(x) % 100` and the query has a condition on `intHash64(x) % 100` verbatim, not on `x` (see the sketch below). [#16253](https://github.com/ClickHouse/ClickHouse/pull/16253) ([Amos Bird](https://github.com/amosbird)).
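A sketch of the exact-match pruning case described in the last item; the table definition and names are made up for the example:

```sql
-- Partitioned by a hash bucket of x rather than by x itself.
CREATE TABLE hashed
(
    x UInt64,
    v String
)
ENGINE = MergeTree
PARTITION BY intHash64(x) % 100
ORDER BY x;

-- The WHERE condition repeats the partition expression verbatim,
-- so only the matching partition has to be read.
SELECT count() FROM hashed WHERE intHash64(x) % 100 = 42;
```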

#### Experimental Feature

* Add `EmbeddedRocksDB` table engine (can be used for dictionaries). [#15073](https://github.com/ClickHouse/ClickHouse/pull/15073) ([sundyli](https://github.com/sundy-li)).

#### Build/Testing/Packaging Improvement

* Improvements in test coverage building images. [#17233](https://github.com/ClickHouse/ClickHouse/pull/17233) ([alesapin](https://github.com/alesapin)).
* Update embedded timezone data to version 2020d (also update cctz to the latest master). [#17204](https://github.com/ClickHouse/ClickHouse/pull/17204) ([filimonov](https://github.com/filimonov)).
* Fix UBSan report in Poco. This closes [#12719](https://github.com/ClickHouse/ClickHouse/issues/12719). [#16765](https://github.com/ClickHouse/ClickHouse/pull/16765) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Do not instrument 3rd-party libraries with UBSan. [#16764](https://github.com/ClickHouse/ClickHouse/pull/16764) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix UBSan report in cache dictionaries. This closes [#12641](https://github.com/ClickHouse/ClickHouse/issues/12641). [#16763](https://github.com/ClickHouse/ClickHouse/pull/16763) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix UBSan report when trying to convert infinite floating point number to integer. This closes [#14190](https://github.com/ClickHouse/ClickHouse/issues/14190). [#16677](https://github.com/ClickHouse/ClickHouse/pull/16677) ([alexey-milovidov](https://github.com/alexey-milovidov)).
## ClickHouse release 20.11
### ClickHouse release v20.11.3.3-stable, 2020-11-13
@@ -41,9 +41,10 @@ if (SANITIZE)
if (COMPILER_CLANG)
set (TSAN_FLAGS "${TSAN_FLAGS} -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/tsan_suppressions.txt")
else()
message (WARNING "TSAN suppressions was not passed to the compiler (since the compiler is not clang)")
message (WARNING "Use the following command to pass them manually:")
message (WARNING " export TSAN_OPTIONS=\"$TSAN_OPTIONS suppressions=${CMAKE_SOURCE_DIR}/tests/tsan_suppressions.txt\"")
set (MESSAGE "TSAN suppressions was not passed to the compiler (since the compiler is not clang)\n")
set (MESSAGE "${MESSAGE}Use the following command to pass them manually:\n")
set (MESSAGE "${MESSAGE} export TSAN_OPTIONS=\"$TSAN_OPTIONS suppressions=${CMAKE_SOURCE_DIR}/tests/tsan_suppressions.txt\"")
message (WARNING "${MESSAGE}")
endif()

@@ -57,8 +58,18 @@ if (SANITIZE)
endif ()

elseif (SANITIZE STREQUAL "undefined")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} -fsanitize=undefined -fno-sanitize-recover=all -fno-sanitize=float-divide-by-zero -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/ubsan_suppressions.txt")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} -fsanitize=undefined -fno-sanitize-recover=all -fno-sanitize=float-divide-by-zero -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/ubsan_suppressions.txt")
set (UBSAN_FLAGS "-fsanitize=undefined -fno-sanitize-recover=all -fno-sanitize=float-divide-by-zero")
if (COMPILER_CLANG)
set (UBSAN_FLAGS "${UBSAN_FLAGS} -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/ubsan_suppressions.txt")
else()
set (MESSAGE "UBSAN suppressions was not passed to the compiler (since the compiler is not clang)\n")
set (MESSAGE "${MESSAGE}Use the following command to pass them manually:\n")
set (MESSAGE "${MESSAGE} export UBSAN_OPTIONS=\"$UBSAN_OPTIONS suppressions=${CMAKE_SOURCE_DIR}/tests/ubsan_suppressions.txt\"")
message (WARNING "${MESSAGE}")
endif()

set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${UBSAN_FLAGS}")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} ${UBSAN_FLAGS}")
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=undefined")
endif()
contrib/librdkafka (vendored)
@@ -1 +1 @@
Subproject commit 9902bc4fb18bb441fa55ca154b341cdda191e5d3
Subproject commit f2f6616419d567c9198aef0d1133a2e9b4f02276

contrib/libunwind (vendored)
@@ -1 +1 @@
Subproject commit 51b84d9b6d2548f1cbdcafe622d5a753853b6149
Subproject commit 8fe25d7dc70f2a4ea38c3e5a33fa9d4199b67a5a

@@ -148,6 +148,10 @@ def parse_env_variables(build_type, compiler, sanitizer, package_type, image_typ

    if split_binary:
        cmake_flags.append('-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1')
        # We can't always build utils because it requires too much space, but
        # we have to build them at least in some way in CI. The split build is
        # probably the least heavy disk-wise.
        cmake_flags.append('-DENABLE_UTILS=1')

    if clang_tidy:
        cmake_flags.append('-DENABLE_CLANG_TIDY=1')
@@ -15,6 +15,8 @@ For more information and documentation see https://clickhouse.yandex/.
$ docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 yandex/clickhouse-server
```

By default ClickHouse will be accessible only via the docker network. See the [networking section below](#networking).

### connect to it from a native client
```bash
$ docker run -it --rm --link some-clickhouse-server:clickhouse-server yandex/clickhouse-client --host clickhouse-server
@@ -22,6 +24,70 @@ $ docker run -it --rm --link some-clickhouse-server:clickhouse-server yandex/cli

More information about [ClickHouse client](https://clickhouse.yandex/docs/en/interfaces/cli/).

### connect to it using curl

```bash
echo "SELECT 'Hello, ClickHouse!'" | docker run -i --rm --link some-clickhouse-server:clickhouse-server curlimages/curl 'http://clickhouse-server:8123/?query=' -s --data-binary @-
```
More information about [ClickHouse HTTP Interface](https://clickhouse.tech/docs/en/interfaces/http/).

### stopping / removing the container

```bash
$ docker stop some-clickhouse-server
$ docker rm some-clickhouse-server
```

### networking

You can expose your ClickHouse instance running in docker by [mapping a particular port](https://docs.docker.com/config/containers/container-networking/) from inside the container to host ports:

```bash
$ docker run -d -p 18123:8123 -p 19000:9000 --name some-clickhouse-server --ulimit nofile=262144:262144 yandex/clickhouse-server
$ echo 'SELECT version()' | curl 'http://localhost:18123/' --data-binary @-
20.12.3.3
```

or by allowing the container to use [host ports directly](https://docs.docker.com/network/host/) with `--network=host` (this also allows achieving better network performance):

```bash
$ docker run -d --network=host --name some-clickhouse-server --ulimit nofile=262144:262144 yandex/clickhouse-server
$ echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @-
20.12.3.3
```

### Volumes

Typically you may want to mount the following folders inside your container to achieve persistence:

* `/var/lib/clickhouse/` - main folder where ClickHouse stores the data
* `/var/log/clickhouse-server/` - logs

```bash
$ docker run -d \
    -v $(realpath ./ch_data):/var/lib/clickhouse/ \
    -v $(realpath ./ch_logs):/var/log/clickhouse-server/ \
    --name some-clickhouse-server --ulimit nofile=262144:262144 yandex/clickhouse-server
```

You may also want to mount:

* `/etc/clickhouse-server/config.d/*.xml` - files with server configuration adjustments
* `/etc/clickhouse-server/users.d/*.xml` - files with user settings adjustments
* `/docker-entrypoint-initdb.d/` - folder with database initialization scripts (see below).

### Linux capabilities

ClickHouse has some advanced functionality which requires enabling several [linux capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html).

It is optional and can be enabled using the following [docker command line arguments](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities):

```bash
$ docker run -d \
    --cap-add=SYS_NICE --cap-add=NET_ADMIN --cap-add=IPC_LOCK \
    --name some-clickhouse-server --ulimit nofile=262144:262144 yandex/clickhouse-server
```

## Configuration

The container exposes port 8123 for the [HTTP interface](https://clickhouse.yandex/docs/en/interfaces/http_interface/) and port 9000 for the [native client](https://clickhouse.yandex/docs/en/interfaces/tcp/).
@@ -3,6 +3,7 @@
    <mysql_port remove="remove"/>
    <interserver_http_port remove="remove"/>
    <tcp_with_proxy_port remove="remove"/>
    <test_keeper_server remove="remove"/>
    <listen_host>::</listen_host>

    <logger>
@@ -91,6 +91,23 @@ The Linux kernel prior to 3.2 had a multitude of problems with IPv6 implementati

Use at least a 10 GB network, if possible. 1 Gb will also work, but it will be much worse for patching replicas with tens of terabytes of data, or for processing distributed queries with a large amount of intermediate data.

## Hypervisor configuration

If you are using OpenStack, set
```
cpu_mode=host-passthrough
```
in nova.conf.

If you are using libvirt, set
```
<cpu mode='host-passthrough'/>
```
in the XML configuration.

This is important for ClickHouse to be able to get correct information with the `cpuid` instruction.
Otherwise you may get `Illegal instruction` crashes when the hypervisor is run on old CPU models.

## ZooKeeper {#zookeeper}

You are probably already using ZooKeeper for other purposes. You can use the same installation of ZooKeeper, if it isn't already overloaded.
@@ -593,6 +593,18 @@ SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-0

For example, `timeSlots(toDateTime('2012-01-01 12:20:00'), toUInt32(600)) = [toDateTime('2012-01-01 12:00:00'), toDateTime('2012-01-01 12:30:00')]`.
This is needed for searching for hits that belong to the corresponding visit.

## toYYYYMM

Converts a date or date with time to a UInt32 number containing the year and month number (YYYY * 100 + MM).

## toYYYYMMDD

Converts a date or date with time to a UInt32 number containing the year, month and day number (YYYY * 10000 + MM * 100 + DD).

## toYYYYMMDDhhmmss

Converts a date or date with time to a UInt64 number containing the year, month, day and time (YYYY * 10000000000 + MM * 100000000 + DD * 1000000 + hh * 10000 + mm * 100 + ss).

## formatDateTime {#formatdatetime}

Formats a date-and-time as a string according to the given pattern. Important: the pattern is a constant expression, so it is not possible to use different patterns for a single column.
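A short illustration of the functions documented above; the expected output is shown in comments and may be formatted differently by your client:

```sql
SELECT
    toYYYYMM(toDate('2020-12-13'))                        AS ym,      -- 202012
    toYYYYMMDD(toDate('2020-12-13'))                      AS ymd,     -- 20201213
    toYYYYMMDDhhmmss(toDateTime('2020-12-13 01:02:03'))   AS ymdhms,  -- 20201213010203
    formatDateTime(toDateTime('2020-12-13 01:02:03'), '%Y-%m-%d %H:%M:%S') AS formatted;  -- 2020-12-13 01:02:03
```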
@@ -14,7 +14,7 @@ toc_title: LIMIT

## LIMIT … WITH TIES modifier {#limit-with-ties}

If `WITH TIES` is specified for `LIMIT n[,m]` and `ORDER BY expr_list` is declared, you will get in result first `n` or `n,m` rows and all rows with same `ORDER BY` fields values equal to row at position `n` for `LIMIT n` and `m` for `LIMIT n,m`.
If `WITH TIES` is specified for `LIMIT n[,m]` and `ORDER BY expr_list` is declared, then in addition to the unmodified result (the first n rows of a plain `LIMIT n`), the query also returns all rows whose `ORDER BY` field values are equal to those of row n (that is, if row n+1 has the same sort-key values as row n, it is returned as well).

This modifier can be combined with the [ORDER BY … WITH FILL modifier](../../../sql-reference/statements/select/order-by.md#orderby-with-fill).

@@ -38,7 +38,7 @@ SELECT * FROM (
└───┘
```

After the `WITH TIES` modifier is executed
After adding the `WITH TIES` modifier

``` sql
SELECT * FROM (
@@ -59,4 +59,8 @@ SELECT * FROM (
└───┘
```

because row number 6 has the same value "2" for field `n` as row number 5
Although `LIMIT 5` was specified, the 6th row's `n` value is 2, the same as in the 5th row, so it is also returned as a matching record.
In short, this modifier can be understood as controlling whether "tied" rows are included in the result.

``` sql,
``` sql
@ -41,42 +41,7 @@ namespace
|
||||
}
|
||||
|
||||
|
||||
void applyParamsToAccessRights(AccessRights & access, const ContextAccessParams & params)
|
||||
{
|
||||
static const AccessFlags table_ddl = AccessType::CREATE_DATABASE | AccessType::CREATE_TABLE | AccessType::CREATE_VIEW
|
||||
| AccessType::ALTER_TABLE | AccessType::ALTER_VIEW | AccessType::DROP_DATABASE | AccessType::DROP_TABLE | AccessType::DROP_VIEW
|
||||
| AccessType::TRUNCATE;
|
||||
|
||||
static const AccessFlags dictionary_ddl = AccessType::CREATE_DICTIONARY | AccessType::DROP_DICTIONARY;
|
||||
static const AccessFlags table_and_dictionary_ddl = table_ddl | dictionary_ddl;
|
||||
static const AccessFlags write_table_access = AccessType::INSERT | AccessType::OPTIMIZE;
|
||||
static const AccessFlags write_dcl_access = AccessType::ACCESS_MANAGEMENT - AccessType::SHOW_ACCESS;
|
||||
|
||||
if (params.readonly)
|
||||
access.revoke(write_table_access | table_and_dictionary_ddl | write_dcl_access | AccessType::SYSTEM | AccessType::KILL_QUERY);
|
||||
|
||||
if (params.readonly == 1)
|
||||
{
|
||||
/// Table functions are forbidden in readonly mode.
|
||||
/// For readonly = 2 they're allowed.
|
||||
access.revoke(AccessType::CREATE_TEMPORARY_TABLE);
|
||||
}
|
||||
|
||||
if (!params.allow_ddl)
|
||||
access.revoke(table_and_dictionary_ddl);
|
||||
|
||||
if (!params.allow_introspection)
|
||||
access.revoke(AccessType::INTROSPECTION);
|
||||
|
||||
if (params.readonly)
|
||||
{
|
||||
/// No grant option in readonly mode.
|
||||
access.revokeGrantOption(AccessType::ALL);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void addImplicitAccessRights(AccessRights & access)
|
||||
AccessRights addImplicitAccessRights(const AccessRights & access)
|
||||
{
|
||||
auto modifier = [&](const AccessFlags & flags, const AccessFlags & min_flags_with_children, const AccessFlags & max_flags_with_children, const std::string_view & database, const std::string_view & table, const std::string_view & column) -> AccessFlags
|
||||
{
|
||||
@ -150,23 +115,12 @@ namespace
|
||||
return res;
|
||||
};
|
||||
|
||||
access.modifyFlags(modifier);
|
||||
|
||||
/// Transform access to temporary tables into access to "_temporary_and_external_tables" database.
|
||||
if (access.isGranted(AccessType::CREATE_TEMPORARY_TABLE))
|
||||
access.grant(AccessFlags::allTableFlags() | AccessFlags::allColumnFlags(), DatabaseCatalog::TEMPORARY_DATABASE);
|
||||
AccessRights res = access;
|
||||
res.modifyFlags(modifier);
|
||||
|
||||
/// Anyone has access to the "system" database.
|
||||
access.grant(AccessType::SELECT, DatabaseCatalog::SYSTEM_DATABASE);
|
||||
}
|
||||
|
||||
|
||||
AccessRights calculateFinalAccessRights(const AccessRights & access_from_user_and_roles, const ContextAccessParams & params)
|
||||
{
|
||||
AccessRights res_access = access_from_user_and_roles;
|
||||
applyParamsToAccessRights(res_access, params);
|
||||
addImplicitAccessRights(res_access);
|
||||
return res_access;
|
||||
res.grant(AccessType::SELECT, DatabaseCatalog::SYSTEM_DATABASE);
|
||||
return res;
|
||||
}
|
||||
|
||||
|
||||
@ -176,6 +130,12 @@ namespace
|
||||
ids[0] = id;
|
||||
return ids;
|
||||
}
|
||||
|
||||
/// Helper for using in templates.
|
||||
std::string_view getDatabase() { return {}; }
|
||||
|
||||
template <typename... OtherArgs>
|
||||
std::string_view getDatabase(const std::string_view & arg1, const OtherArgs &...) { return arg1; }
|
||||
}
|
||||
|
||||
|
||||
@ -203,10 +163,7 @@ void ContextAccess::setUser(const UserPtr & user_) const
|
||||
/// User has been dropped.
|
||||
auto nothing_granted = std::make_shared<AccessRights>();
|
||||
access = nothing_granted;
|
||||
access_without_readonly = nothing_granted;
|
||||
access_with_allow_ddl = nothing_granted;
|
||||
access_with_allow_introspection = nothing_granted;
|
||||
access_from_user_and_roles = nothing_granted;
|
||||
access_with_implicit = nothing_granted;
|
||||
subscription_for_user_change = {};
|
||||
subscription_for_roles_changes = {};
|
||||
enabled_roles = nullptr;
|
||||
@ -270,12 +227,8 @@ void ContextAccess::setRolesInfo(const std::shared_ptr<const EnabledRolesInfo> &
|
||||
|
||||
void ContextAccess::calculateAccessRights() const
|
||||
{
|
||||
access_from_user_and_roles = std::make_shared<AccessRights>(mixAccessRightsFromUserAndRoles(*user, *roles_info));
|
||||
access = std::make_shared<AccessRights>(calculateFinalAccessRights(*access_from_user_and_roles, params));
|
||||
|
||||
access_without_readonly = nullptr;
|
||||
access_with_allow_ddl = nullptr;
|
||||
access_with_allow_introspection = nullptr;
|
||||
access = std::make_shared<AccessRights>(mixAccessRightsFromUserAndRoles(*user, *roles_info));
|
||||
access_with_implicit = std::make_shared<AccessRights>(addImplicitAccessRights(*access));
|
||||
|
||||
if (trace_log)
|
||||
{
|
||||
@ -287,6 +240,7 @@ void ContextAccess::calculateAccessRights() const
|
||||
}
|
||||
LOG_TRACE(trace_log, "Settings: readonly={}, allow_ddl={}, allow_introspection_functions={}", params.readonly, params.allow_ddl, params.allow_introspection);
|
||||
LOG_TRACE(trace_log, "List of all grants: {}", access->toString());
|
||||
LOG_TRACE(trace_log, "List of all grants including implicit: {}", access_with_implicit->toString());
|
||||
}
|
||||
}
|
||||
|
||||
@ -340,6 +294,7 @@ std::shared_ptr<const ContextAccess> ContextAccess::getFullAccess()
|
||||
static const std::shared_ptr<const ContextAccess> res = []
|
||||
{
|
||||
auto full_access = std::shared_ptr<ContextAccess>(new ContextAccess);
|
||||
full_access->is_full_access = true;
|
||||
full_access->access = std::make_shared<AccessRights>(AccessRights::getFullAccess());
|
||||
full_access->enabled_quota = EnabledQuota::getUnlimitedQuota();
|
||||
return full_access;
|
||||
@ -362,323 +317,303 @@ std::shared_ptr<const SettingsConstraints> ContextAccess::getSettingsConstraints
|
||||
}
|
||||
|
||||
|
||||
std::shared_ptr<const AccessRights> ContextAccess::getAccess() const
|
||||
std::shared_ptr<const AccessRights> ContextAccess::getAccessRights() const
|
||||
{
|
||||
std::lock_guard lock{mutex};
|
||||
return access;
|
||||
}
|
||||
|
||||
|
||||
template <bool grant_option, typename... Args>
|
||||
bool ContextAccess::isGrantedImpl2(const AccessFlags & flags, const Args &... args) const
|
||||
std::shared_ptr<const AccessRights> ContextAccess::getAccessRightsWithImplicit() const
|
||||
{
|
||||
bool access_granted;
|
||||
if constexpr (grant_option)
|
||||
access_granted = getAccess()->hasGrantOption(flags, args...);
|
||||
else
|
||||
access_granted = getAccess()->isGranted(flags, args...);
|
||||
|
||||
if (trace_log)
|
||||
LOG_TRACE(trace_log, "Access {}: {}{}", (access_granted ? "granted" : "denied"), (AccessRightsElement{flags, args...}.toString()),
|
||||
(grant_option ? " WITH GRANT OPTION" : ""));
|
||||
|
||||
return access_granted;
|
||||
std::lock_guard lock{mutex};
|
||||
return access_with_implicit;
|
||||
}
|
||||
|
||||
template <bool grant_option>
|
||||
bool ContextAccess::isGrantedImpl(const AccessFlags & flags) const
|
||||
|
||||
template <bool throw_if_denied, bool grant_option>
|
||||
bool ContextAccess::checkAccessImpl(const AccessFlags & flags) const
|
||||
{
|
||||
return isGrantedImpl2<grant_option>(flags);
|
||||
return checkAccessImpl2<throw_if_denied, grant_option>(flags);
|
||||
}
|
||||
|
||||
template <bool grant_option, typename... Args>
|
||||
bool ContextAccess::isGrantedImpl(const AccessFlags & flags, const std::string_view & database, const Args &... args) const
|
||||
template <bool throw_if_denied, bool grant_option, typename... Args>
|
||||
bool ContextAccess::checkAccessImpl(const AccessFlags & flags, const std::string_view & database, const Args &... args) const
|
||||
{
|
||||
return isGrantedImpl2<grant_option>(flags, database.empty() ? params.current_database : database, args...);
|
||||
return checkAccessImpl2<throw_if_denied, grant_option>(flags, database.empty() ? params.current_database : database, args...);
|
||||
}
|
||||
|
||||
template <bool grant_option>
|
||||
bool ContextAccess::isGrantedImpl(const AccessRightsElement & element) const
|
||||
template <bool throw_if_denied, bool grant_option>
|
||||
bool ContextAccess::checkAccessImpl(const AccessRightsElement & element) const
|
||||
{
|
||||
if (element.any_database)
|
||||
return isGrantedImpl<grant_option>(element.access_flags);
|
||||
return checkAccessImpl<throw_if_denied, grant_option>(element.access_flags);
|
||||
else if (element.any_table)
|
||||
return isGrantedImpl<grant_option>(element.access_flags, element.database);
|
||||
return checkAccessImpl<throw_if_denied, grant_option>(element.access_flags, element.database);
|
||||
else if (element.any_column)
|
||||
return isGrantedImpl<grant_option>(element.access_flags, element.database, element.table);
|
||||
return checkAccessImpl<throw_if_denied, grant_option>(element.access_flags, element.database, element.table);
|
||||
else
|
||||
return isGrantedImpl<grant_option>(element.access_flags, element.database, element.table, element.columns);
|
||||
return checkAccessImpl<throw_if_denied, grant_option>(element.access_flags, element.database, element.table, element.columns);
|
||||
}
|
||||
|
||||
template <bool grant_option>
|
||||
bool ContextAccess::isGrantedImpl(const AccessRightsElements & elements) const
|
||||
template <bool throw_if_denied, bool grant_option>
|
||||
bool ContextAccess::checkAccessImpl(const AccessRightsElements & elements) const
|
||||
{
|
||||
for (const auto & element : elements)
|
||||
if (!isGrantedImpl<grant_option>(element))
|
||||
if (!checkAccessImpl<throw_if_denied, grant_option>(element))
|
||||
return false;
|
||||
return true;
|
||||
}
|
||||
|
||||
bool ContextAccess::isGranted(const AccessFlags & flags) const { return isGrantedImpl<false>(flags); }
|
||||
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database) const { return isGrantedImpl<false>(flags, database); }
|
||||
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { return isGrantedImpl<false>(flags, database, table); }
|
||||
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { return isGrantedImpl<false>(flags, database, table, column); }
|
||||
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { return isGrantedImpl<false>(flags, database, table, columns); }
|
||||
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { return isGrantedImpl<false>(flags, database, table, columns); }
|
||||
bool ContextAccess::isGranted(const AccessRightsElement & element) const { return isGrantedImpl<false>(element); }
|
||||
bool ContextAccess::isGranted(const AccessRightsElements & elements) const { return isGrantedImpl<false>(elements); }
|
||||
|
||||
bool ContextAccess::hasGrantOption(const AccessFlags & flags) const { return isGrantedImpl<true>(flags); }
|
||||
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database) const { return isGrantedImpl<true>(flags, database); }
|
||||
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { return isGrantedImpl<true>(flags, database, table); }
|
||||
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { return isGrantedImpl<true>(flags, database, table, column); }
|
||||
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { return isGrantedImpl<true>(flags, database, table, columns); }
|
||||
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { return isGrantedImpl<true>(flags, database, table, columns); }
|
||||
bool ContextAccess::hasGrantOption(const AccessRightsElement & element) const { return isGrantedImpl<true>(element); }
|
||||
bool ContextAccess::hasGrantOption(const AccessRightsElements & elements) const { return isGrantedImpl<true>(elements); }
|
||||
|
||||
|
||||
template <bool grant_option, typename... Args>
|
||||
void ContextAccess::checkAccessImpl2(const AccessFlags & flags, const Args &... args) const
|
||||
template <bool throw_if_denied, bool grant_option, typename... Args>
|
||||
bool ContextAccess::checkAccessImpl2(const AccessFlags & flags, const Args &... args) const
|
||||
{
|
||||
if constexpr (grant_option)
|
||||
auto access_granted = [&]
|
||||
{
|
||||
if (hasGrantOption(flags, args...))
|
||||
return;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (isGranted(flags, args...))
|
||||
return;
|
||||
}
|
||||
|
||||
auto show_error = [&](const String & msg, int error_code)
|
||||
{
|
||||
throw Exception(user_name + ": " + msg, error_code);
|
||||
if (trace_log)
|
||||
LOG_TRACE(trace_log, "Access granted: {}{}", (AccessRightsElement{flags, args...}.toString()),
|
||||
(grant_option ? " WITH GRANT OPTION" : ""));
|
||||
return true;
|
||||
};
|
||||
|
||||
std::lock_guard lock{mutex};
|
||||
|
||||
if (!user)
|
||||
show_error("User has been dropped", ErrorCodes::UNKNOWN_USER);
|
||||
|
||||
if (grant_option && access->isGranted(flags, args...))
|
||||
auto access_denied = [&](const String & error_msg, int error_code)
|
||||
{
|
||||
show_error(
|
||||
"Not enough privileges. "
|
||||
"The required privileges have been granted, but without grant option. "
|
||||
"To execute this query it's necessary to have the grant "
|
||||
+ AccessRightsElement{flags, args...}.toString() + " WITH GRANT OPTION",
|
||||
if (trace_log)
|
||||
LOG_TRACE(trace_log, "Access denied: {}{}", (AccessRightsElement{flags, args...}.toString()),
|
||||
(grant_option ? " WITH GRANT OPTION" : ""));
|
||||
if constexpr (throw_if_denied)
|
||||
throw Exception(getUserName() + ": " + error_msg, error_code);
|
||||
return false;
|
||||
};
|
||||
|
||||
if (!flags || is_full_access)
|
||||
return access_granted();
|
||||
|
||||
if (!getUser())
|
||||
return access_denied("User has been dropped", ErrorCodes::UNKNOWN_USER);
|
||||
|
||||
/// If the current user was allowed to create a temporary table
|
||||
/// then he is allowed to do with it whatever he wants.
|
||||
if ((sizeof...(args) >= 2) && (getDatabase(args...) == DatabaseCatalog::TEMPORARY_DATABASE))
|
||||
return access_granted();
|
||||
|
||||
auto acs = getAccessRightsWithImplicit();
|
||||
bool granted;
|
||||
if constexpr (grant_option)
|
||||
granted = acs->hasGrantOption(flags, args...);
|
||||
else
|
||||
granted = acs->isGranted(flags, args...);
|
||||
|
||||
if (!granted)
|
||||
{
|
||||
if (grant_option && acs->isGranted(flags, args...))
|
||||
{
|
||||
return access_denied(
|
||||
"Not enough privileges. "
|
||||
"The required privileges have been granted, but without grant option. "
|
||||
"To execute this query it's necessary to have grant "
|
||||
+ AccessRightsElement{flags, args...}.toString() + " WITH GRANT OPTION",
|
||||
ErrorCodes::ACCESS_DENIED);
|
||||
}
|
||||
|
||||
return access_denied(
|
||||
"Not enough privileges. To execute this query it's necessary to have grant "
|
||||
+ AccessRightsElement{flags, args...}.toString() + (grant_option ? " WITH GRANT OPTION" : ""),
|
||||
ErrorCodes::ACCESS_DENIED);
|
||||
}
|
||||
|
||||
struct PrecalculatedFlags
|
||||
{
|
||||
const AccessFlags table_ddl = AccessType::CREATE_DATABASE | AccessType::CREATE_TABLE | AccessType::CREATE_VIEW
|
||||
| AccessType::ALTER_TABLE | AccessType::ALTER_VIEW | AccessType::DROP_DATABASE | AccessType::DROP_TABLE | AccessType::DROP_VIEW
|
||||
| AccessType::TRUNCATE;
|
||||
|
||||
const AccessFlags dictionary_ddl = AccessType::CREATE_DICTIONARY | AccessType::DROP_DICTIONARY;
|
||||
const AccessFlags table_and_dictionary_ddl = table_ddl | dictionary_ddl;
|
||||
const AccessFlags write_table_access = AccessType::INSERT | AccessType::OPTIMIZE;
|
||||
const AccessFlags write_dcl_access = AccessType::ACCESS_MANAGEMENT - AccessType::SHOW_ACCESS;
|
||||
|
||||
const AccessFlags not_readonly_flags = write_table_access | table_and_dictionary_ddl | write_dcl_access | AccessType::SYSTEM | AccessType::KILL_QUERY;
|
||||
const AccessFlags not_readonly_1_flags = AccessType::CREATE_TEMPORARY_TABLE;
|
||||
|
||||
const AccessFlags ddl_flags = table_ddl | dictionary_ddl;
|
||||
const AccessFlags introspection_flags = AccessType::INTROSPECTION;
|
||||
};
|
||||
static const PrecalculatedFlags precalc;
|
||||
|
||||
if (params.readonly)
|
||||
{
|
||||
if (!access_without_readonly)
|
||||
{
|
||||
Params changed_params = params;
|
||||
changed_params.readonly = 0;
|
||||
access_without_readonly = std::make_shared<AccessRights>(calculateFinalAccessRights(*access_from_user_and_roles, changed_params));
|
||||
}
|
||||
|
||||
if (access_without_readonly->isGranted(flags, args...))
|
||||
if constexpr (grant_option)
|
||||
return access_denied("Cannot change grants in readonly mode.", ErrorCodes::READONLY);
|
||||
if ((flags & precalc.not_readonly_flags) ||
|
||||
((params.readonly == 1) && (flags & precalc.not_readonly_1_flags)))
|
||||
{
|
||||
if (params.interface == ClientInfo::Interface::HTTP && params.http_method == ClientInfo::HTTPMethod::GET)
|
||||
show_error(
|
||||
{
|
||||
return access_denied(
|
||||
"Cannot execute query in readonly mode. "
|
||||
"For queries over HTTP, method GET implies readonly. You should use method POST for modifying queries",
|
||||
ErrorCodes::READONLY);
|
||||
}
|
||||
else
|
||||
show_error("Cannot execute query in readonly mode", ErrorCodes::READONLY);
|
||||
return access_denied("Cannot execute query in readonly mode", ErrorCodes::READONLY);
|
||||
}
|
||||
}
|
||||
|
||||
if (!params.allow_ddl)
|
||||
if (!params.allow_ddl && !grant_option)
|
||||
{
|
||||
if (!access_with_allow_ddl)
|
||||
{
|
||||
Params changed_params = params;
|
||||
changed_params.allow_ddl = true;
|
||||
access_with_allow_ddl = std::make_shared<AccessRights>(calculateFinalAccessRights(*access_from_user_and_roles, changed_params));
|
||||
}
|
||||
|
||||
if (access_with_allow_ddl->isGranted(flags, args...))
|
||||
{
|
||||
show_error("Cannot execute query. DDL queries are prohibited for the user", ErrorCodes::QUERY_IS_PROHIBITED);
|
||||
}
|
||||
if (flags & precalc.ddl_flags)
|
||||
return access_denied("Cannot execute query. DDL queries are prohibited for the user", ErrorCodes::QUERY_IS_PROHIBITED);
|
||||
}
|
||||
|
||||
if (!params.allow_introspection)
|
||||
if (!params.allow_introspection && !grant_option)
|
||||
{
|
||||
if (!access_with_allow_introspection)
|
||||
{
|
||||
Params changed_params = params;
|
||||
changed_params.allow_introspection = true;
|
||||
access_with_allow_introspection = std::make_shared<AccessRights>(calculateFinalAccessRights(*access_from_user_and_roles, changed_params));
|
||||
}
|
||||
|
||||
if (access_with_allow_introspection->isGranted(flags, args...))
|
||||
{
|
||||
show_error("Introspection functions are disabled, because setting 'allow_introspection_functions' is set to 0", ErrorCodes::FUNCTION_NOT_ALLOWED);
|
||||
}
|
||||
if (flags & precalc.introspection_flags)
|
||||
return access_denied("Introspection functions are disabled, because setting 'allow_introspection_functions' is set to 0", ErrorCodes::FUNCTION_NOT_ALLOWED);
|
||||
}
|
||||
|
||||
show_error(
|
||||
"Not enough privileges. To execute this query it's necessary to have the grant "
|
||||
+ AccessRightsElement{flags, args...}.toString() + (grant_option ? " WITH GRANT OPTION" : ""),
|
||||
ErrorCodes::ACCESS_DENIED);
|
||||
return access_granted();
|
||||
}

template <bool grant_option>
void ContextAccess::checkAccessImpl(const AccessFlags & flags) const

bool ContextAccess::isGranted(const AccessFlags & flags) const { return checkAccessImpl<false, false>(flags); }
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database) const { return checkAccessImpl<false, false>(flags, database); }
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { return checkAccessImpl<false, false>(flags, database, table); }
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { return checkAccessImpl<false, false>(flags, database, table, column); }
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { return checkAccessImpl<false, false>(flags, database, table, columns); }
bool ContextAccess::isGranted(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { return checkAccessImpl<false, false>(flags, database, table, columns); }
bool ContextAccess::isGranted(const AccessRightsElement & element) const { return checkAccessImpl<false, false>(element); }
bool ContextAccess::isGranted(const AccessRightsElements & elements) const { return checkAccessImpl<false, false>(elements); }

bool ContextAccess::hasGrantOption(const AccessFlags & flags) const { return checkAccessImpl<false, true>(flags); }
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database) const { return checkAccessImpl<false, true>(flags, database); }
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { return checkAccessImpl<false, true>(flags, database, table); }
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { return checkAccessImpl<false, true>(flags, database, table, column); }
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { return checkAccessImpl<false, true>(flags, database, table, columns); }
bool ContextAccess::hasGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { return checkAccessImpl<false, true>(flags, database, table, columns); }
bool ContextAccess::hasGrantOption(const AccessRightsElement & element) const { return checkAccessImpl<false, true>(element); }
bool ContextAccess::hasGrantOption(const AccessRightsElements & elements) const { return checkAccessImpl<false, true>(elements); }

void ContextAccess::checkAccess(const AccessFlags & flags) const { checkAccessImpl<true, false>(flags); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database) const { checkAccessImpl<true, false>(flags, database); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { checkAccessImpl<true, false>(flags, database, table); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { checkAccessImpl<true, false>(flags, database, table, column); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { checkAccessImpl<true, false>(flags, database, table, columns); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { checkAccessImpl<true, false>(flags, database, table, columns); }
void ContextAccess::checkAccess(const AccessRightsElement & element) const { checkAccessImpl<true, false>(element); }
void ContextAccess::checkAccess(const AccessRightsElements & elements) const { checkAccessImpl<true, false>(elements); }

void ContextAccess::checkGrantOption(const AccessFlags & flags) const { checkAccessImpl<true, true>(flags); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database) const { checkAccessImpl<true, true>(flags, database); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { checkAccessImpl<true, true>(flags, database, table); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { checkAccessImpl<true, true>(flags, database, table, column); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { checkAccessImpl<true, true>(flags, database, table, columns); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { checkAccessImpl<true, true>(flags, database, table, columns); }
void ContextAccess::checkGrantOption(const AccessRightsElement & element) const { checkAccessImpl<true, true>(element); }
void ContextAccess::checkGrantOption(const AccessRightsElements & elements) const { checkAccessImpl<true, true>(elements); }

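The forwarding functions above differ only in two compile-time flags: the first selects whether a denial throws (checkAccess/checkGrantOption) or just returns false (isGranted/hasGrantOption), the second selects whether the grant itself or the WITH GRANT OPTION right is required. A compressed sketch of that dispatch follows; the class, the lookup, and the right names are simplified stand-ins, not the real ContextAccess API.

#include <iostream>
#include <stdexcept>
#include <string>

// One template implementation, four thin public entry points selected by
// two boolean template parameters (mirrors the pattern above).
class AccessCheckerSketch
{
public:
    bool isGranted(const std::string & right) const { return checkImpl<false, false>(right); }
    bool hasGrantOption(const std::string & right) const { return checkImpl<false, true>(right); }
    void checkAccess(const std::string & right) const { checkImpl<true, false>(right); }
    void checkGrantOption(const std::string & right) const { checkImpl<true, true>(right); }

private:
    template <bool throw_if_denied, bool grant_option>
    bool checkImpl(const std::string & right) const
    {
        // Hypothetical lookup; the real code consults the calculated AccessRights.
        bool granted = grant_option ? (right == "SELECT") : true;
        if (!granted && throw_if_denied)
            throw std::runtime_error("Not enough privileges: " + right
                                     + (grant_option ? " WITH GRANT OPTION" : ""));
        return granted;
    }
};

int main()
{
    AccessCheckerSketch access;
    std::cout << access.isGranted("INSERT") << ' ' << access.hasGrantOption("INSERT") << '\n';
    access.checkAccess("INSERT");                          // granted: no exception
    try { access.checkGrantOption("INSERT"); }             // denied: throws
    catch (const std::exception & e) { std::cout << e.what() << '\n'; }
}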
template <bool throw_if_denied>
bool ContextAccess::checkAdminOptionImpl(const UUID & role_id) const
{
    checkAccessImpl2<grant_option>(flags);
    return checkAdminOptionImpl2<throw_if_denied>(to_array(role_id), [this](const UUID & id, size_t) { return manager->tryReadName(id); });
}

template <bool grant_option, typename... Args>
void ContextAccess::checkAccessImpl(const AccessFlags & flags, const std::string_view & database, const Args &... args) const
template <bool throw_if_denied>
bool ContextAccess::checkAdminOptionImpl(const UUID & role_id, const String & role_name) const
{
    checkAccessImpl2<grant_option>(flags, database.empty() ? params.current_database : database, args...);
    return checkAdminOptionImpl2<throw_if_denied>(to_array(role_id), [&role_name](const UUID &, size_t) { return std::optional<String>{role_name}; });
}

template <bool grant_option>
void ContextAccess::checkAccessImpl(const AccessRightsElement & element) const
template <bool throw_if_denied>
bool ContextAccess::checkAdminOptionImpl(const UUID & role_id, const std::unordered_map<UUID, String> & names_of_roles) const
{
    if (element.any_database)
        checkAccessImpl<grant_option>(element.access_flags);
    else if (element.any_table)
        checkAccessImpl<grant_option>(element.access_flags, element.database);
    else if (element.any_column)
        checkAccessImpl<grant_option>(element.access_flags, element.database, element.table);
    else
        checkAccessImpl<grant_option>(element.access_flags, element.database, element.table, element.columns);
    return checkAdminOptionImpl2<throw_if_denied>(to_array(role_id), [&names_of_roles](const UUID & id, size_t) { auto it = names_of_roles.find(id); return (it != names_of_roles.end()) ? it->second : std::optional<String>{}; });
}

template <bool grant_option>
void ContextAccess::checkAccessImpl(const AccessRightsElements & elements) const
template <bool throw_if_denied>
bool ContextAccess::checkAdminOptionImpl(const std::vector<UUID> & role_ids) const
{
    for (const auto & element : elements)
        checkAccessImpl<grant_option>(element);
    return checkAdminOptionImpl2<throw_if_denied>(role_ids, [this](const UUID & id, size_t) { return manager->tryReadName(id); });
}

void ContextAccess::checkAccess(const AccessFlags & flags) const { checkAccessImpl<false>(flags); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database) const { checkAccessImpl<false>(flags, database); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { checkAccessImpl<false>(flags, database, table); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { checkAccessImpl<false>(flags, database, table, column); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { checkAccessImpl<false>(flags, database, table, columns); }
void ContextAccess::checkAccess(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { checkAccessImpl<false>(flags, database, table, columns); }
void ContextAccess::checkAccess(const AccessRightsElement & element) const { checkAccessImpl<false>(element); }
void ContextAccess::checkAccess(const AccessRightsElements & elements) const { checkAccessImpl<false>(elements); }

void ContextAccess::checkGrantOption(const AccessFlags & flags) const { checkAccessImpl<true>(flags); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database) const { checkAccessImpl<true>(flags, database); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table) const { checkAccessImpl<true>(flags, database, table); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::string_view & column) const { checkAccessImpl<true>(flags, database, table, column); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const std::vector<std::string_view> & columns) const { checkAccessImpl<true>(flags, database, table, columns); }
void ContextAccess::checkGrantOption(const AccessFlags & flags, const std::string_view & database, const std::string_view & table, const Strings & columns) const { checkAccessImpl<true>(flags, database, table, columns); }
void ContextAccess::checkGrantOption(const AccessRightsElement & element) const { checkAccessImpl<true>(element); }
void ContextAccess::checkGrantOption(const AccessRightsElements & elements) const { checkAccessImpl<true>(elements); }

template <typename Container, typename GetNameFunction>
bool ContextAccess::checkAdminOptionImpl(bool throw_on_error, const Container & role_ids, const GetNameFunction & get_name_function) const
template <bool throw_if_denied>
bool ContextAccess::checkAdminOptionImpl(const std::vector<UUID> & role_ids, const Strings & names_of_roles) const
{
    return checkAdminOptionImpl2<throw_if_denied>(role_ids, [&names_of_roles](const UUID &, size_t i) { return std::optional<String>{names_of_roles[i]}; });
}

template <bool throw_if_denied>
bool ContextAccess::checkAdminOptionImpl(const std::vector<UUID> & role_ids, const std::unordered_map<UUID, String> & names_of_roles) const
{
    return checkAdminOptionImpl2<throw_if_denied>(role_ids, [&names_of_roles](const UUID & id, size_t) { auto it = names_of_roles.find(id); return (it != names_of_roles.end()) ? it->second : std::optional<String>{}; });
}

template <bool throw_if_denied, typename Container, typename GetNameFunction>
bool ContextAccess::checkAdminOptionImpl2(const Container & role_ids, const GetNameFunction & get_name_function) const
{
    if (!std::size(role_ids) || is_full_access)
        return true;

    auto show_error = [this](const String & msg, int error_code)
    {
        UNUSED(this);
        if constexpr (throw_if_denied)
            throw Exception(getUserName() + ": " + msg, error_code);
    };

    if (!getUser())
    {
        show_error("User has been dropped", ErrorCodes::UNKNOWN_USER);
        return false;
    }

    if (isGranted(AccessType::ROLE_ADMIN))
        return true;

    auto info = getRolesInfo();
    if (!info)
    {
        if (!user)
        {
            if (throw_on_error)
                throw Exception(user_name + ": User has been dropped", ErrorCodes::UNKNOWN_USER);
            else
                return false;
        }
        return true;
    }

    size_t i = 0;
    for (auto it = std::begin(role_ids); it != std::end(role_ids); ++it, ++i)
    {
        const UUID & role_id = *it;
        if (info->enabled_roles_with_admin_option.count(role_id))
        if (info && info->enabled_roles_with_admin_option.count(role_id))
            continue;

        auto role_name = get_name_function(role_id, i);
        if (!role_name)
            role_name = "ID {" + toString(role_id) + "}";
        String msg = "To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option";
        if (info->enabled_roles.count(role_id))
            msg = "Role " + backQuote(*role_name) + " is granted, but without ADMIN option. " + msg;
        if (throw_on_error)
            throw Exception(getUserName() + ": Not enough privileges. " + msg, ErrorCodes::ACCESS_DENIED);
        else
            return false;
        if (throw_if_denied)
        {
            auto role_name = get_name_function(role_id, i);
            if (!role_name)
                role_name = "ID {" + toString(role_id) + "}";

            if (info && info->enabled_roles.count(role_id))
                show_error("Not enough privileges. "
                           "Role " + backQuote(*role_name) + " is granted, but without ADMIN option. "
                           "To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option.",
                           ErrorCodes::ACCESS_DENIED);
            else
                show_error("Not enough privileges. "
                           "To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option.",
                           ErrorCodes::ACCESS_DENIED);
        }

        return false;
    }

    return true;
}
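checkAdminOptionImpl2 above is the single backend behind both the boolean hasAdminOption overloads and the throwing checkAdminOption overloads: it short-circuits for full access and ROLE_ADMIN, then requires every requested role to be granted WITH ADMIN OPTION. A rough caller-side sketch of that contract follows, using a hypothetical mock instead of the real ContextAccess.

#include <iostream>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical mock mirroring the contract above: every requested role must be
// held WITH ADMIN OPTION unless the caller has ROLE_ADMIN (or full access).
struct AdminOptionCheckerSketch
{
    bool has_role_admin = false;
    std::unordered_set<std::string> roles_with_admin_option;

    bool hasAdminOption(const std::string & role) const
    {
        return has_role_admin || roles_with_admin_option.count(role);
    }

    void checkAdminOption(const std::string & role) const
    {
        if (!hasAdminOption(role))
            throw std::runtime_error(
                "Not enough privileges. To execute this query it's necessary to have the role "
                + role + " granted with ADMIN option.");
    }
};

int main()
{
    AdminOptionCheckerSketch access;
    access.roles_with_admin_option = {"accountant"};

    std::cout << access.hasAdminOption("accountant") << '\n';   // 1: may grant it further
    try { access.checkAdminOption("auditor"); }                  // denied: no ADMIN OPTION
    catch (const std::exception & e) { std::cout << e.what() << '\n'; }
}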
|
||||
|
||||
bool ContextAccess::hasAdminOption(const UUID & role_id) const
|
||||
{
|
||||
return checkAdminOptionImpl(false, to_array(role_id), [this](const UUID & id, size_t) { return manager->tryReadName(id); });
|
||||
}
|
||||
bool ContextAccess::hasAdminOption(const UUID & role_id) const { return checkAdminOptionImpl<false>(role_id); }
|
||||
bool ContextAccess::hasAdminOption(const UUID & role_id, const String & role_name) const { return checkAdminOptionImpl<false>(role_id, role_name); }
|
||||
bool ContextAccess::hasAdminOption(const UUID & role_id, const std::unordered_map<UUID, String> & names_of_roles) const { return checkAdminOptionImpl<false>(role_id, names_of_roles); }
|
||||
bool ContextAccess::hasAdminOption(const std::vector<UUID> & role_ids) const { return checkAdminOptionImpl<false>(role_ids); }
|
||||
bool ContextAccess::hasAdminOption(const std::vector<UUID> & role_ids, const Strings & names_of_roles) const { return checkAdminOptionImpl<false>(role_ids, names_of_roles); }
|
||||
bool ContextAccess::hasAdminOption(const std::vector<UUID> & role_ids, const std::unordered_map<UUID, String> & names_of_roles) const { return checkAdminOptionImpl<false>(role_ids, names_of_roles); }
|
||||
|
||||
bool ContextAccess::hasAdminOption(const UUID & role_id, const String & role_name) const
|
||||
{
|
||||
return checkAdminOptionImpl(false, to_array(role_id), [&role_name](const UUID &, size_t) { return std::optional<String>{role_name}; });
|
||||
}
|
||||
|
||||
bool ContextAccess::hasAdminOption(const UUID & role_id, const std::unordered_map<UUID, String> & names_of_roles) const
|
||||
{
|
||||
return checkAdminOptionImpl(false, to_array(role_id), [&names_of_roles](const UUID & id, size_t) { auto it = names_of_roles.find(id); return (it != names_of_roles.end()) ? it->second : std::optional<String>{}; });
|
||||
}
|
||||
|
||||
bool ContextAccess::hasAdminOption(const std::vector<UUID> & role_ids) const
|
||||
{
|
||||
return checkAdminOptionImpl(false, role_ids, [this](const UUID & id, size_t) { return manager->tryReadName(id); });
|
||||
}
|
||||
|
||||
bool ContextAccess::hasAdminOption(const std::vector<UUID> & role_ids, const Strings & names_of_roles) const
|
||||
{
|
||||
return checkAdminOptionImpl(false, role_ids, [&names_of_roles](const UUID &, size_t i) { return std::optional<String>{names_of_roles[i]}; });
|
||||
}
|
||||
|
||||
bool ContextAccess::hasAdminOption(const std::vector<UUID> & role_ids, const std::unordered_map<UUID, String> & names_of_roles) const
|
||||
{
|
||||
return checkAdminOptionImpl(false, role_ids, [&names_of_roles](const UUID & id, size_t) { auto it = names_of_roles.find(id); return (it != names_of_roles.end()) ? it->second : std::optional<String>{}; });
|
||||
}
|
||||
|
||||
void ContextAccess::checkAdminOption(const UUID & role_id) const
|
||||
{
|
||||
checkAdminOptionImpl(true, to_array(role_id), [this](const UUID & id, size_t) { return manager->tryReadName(id); });
|
||||
}
|
||||
|
||||
void ContextAccess::checkAdminOption(const UUID & role_id, const String & role_name) const
|
||||
{
|
||||
checkAdminOptionImpl(true, to_array(role_id), [&role_name](const UUID &, size_t) { return std::optional<String>{role_name}; });
|
||||
}
|
||||
|
||||
void ContextAccess::checkAdminOption(const UUID & role_id, const std::unordered_map<UUID, String> & names_of_roles) const
|
||||
{
|
||||
checkAdminOptionImpl(true, to_array(role_id), [&names_of_roles](const UUID & id, size_t) { auto it = names_of_roles.find(id); return (it != names_of_roles.end()) ? it->second : std::optional<String>{}; });
|
||||
}
|
||||
|
||||
void ContextAccess::checkAdminOption(const std::vector<UUID> & role_ids) const
|
||||
{
|
||||
checkAdminOptionImpl(true, role_ids, [this](const UUID & id, size_t) { return manager->tryReadName(id); });
|
||||
}
|
||||
|
||||
void ContextAccess::checkAdminOption(const std::vector<UUID> & role_ids, const Strings & names_of_roles) const
|
||||
{
|
||||
checkAdminOptionImpl(true, role_ids, [&names_of_roles](const UUID &, size_t i) { return std::optional<String>{names_of_roles[i]}; });
|
||||
}
|
||||
|
||||
void ContextAccess::checkAdminOption(const std::vector<UUID> & role_ids, const std::unordered_map<UUID, String> & names_of_roles) const
|
||||
{
|
||||
checkAdminOptionImpl(true, role_ids, [&names_of_roles](const UUID & id, size_t) { auto it = names_of_roles.find(id); return (it != names_of_roles.end()) ? it->second : std::optional<String>{}; });
|
||||
}
|
||||
void ContextAccess::checkAdminOption(const UUID & role_id) const { checkAdminOptionImpl<true>(role_id); }
|
||||
void ContextAccess::checkAdminOption(const UUID & role_id, const String & role_name) const { checkAdminOptionImpl<true>(role_id, role_name); }
|
||||
void ContextAccess::checkAdminOption(const UUID & role_id, const std::unordered_map<UUID, String> & names_of_roles) const { checkAdminOptionImpl<true>(role_id, names_of_roles); }
|
||||
void ContextAccess::checkAdminOption(const std::vector<UUID> & role_ids) const { checkAdminOptionImpl<true>(role_ids); }
|
||||
void ContextAccess::checkAdminOption(const std::vector<UUID> & role_ids, const Strings & names_of_roles) const { checkAdminOptionImpl<true>(role_ids, names_of_roles); }
|
||||
void ContextAccess::checkAdminOption(const std::vector<UUID> & role_ids, const std::unordered_map<UUID, String> & names_of_roles) const { checkAdminOptionImpl<true>(role_ids, names_of_roles); }
|
||||
|
||||
}
|
||||
|
@ -96,7 +96,8 @@ public:
    std::shared_ptr<const SettingsConstraints> getSettingsConstraints() const;

    /// Returns the current access rights.
    std::shared_ptr<const AccessRights> getAccess() const;
    std::shared_ptr<const AccessRights> getAccessRights() const;
    std::shared_ptr<const AccessRights> getAccessRightsWithImplicit() const;

    /// Checks if a specified access is granted.
    bool isGranted(const AccessFlags & flags) const;
@ -166,41 +167,45 @@ private:
    void setSettingsAndConstraints() const;
    void calculateAccessRights() const;

    template <bool grant_option>
    bool isGrantedImpl(const AccessFlags & flags) const;
    template <bool throw_if_denied, bool grant_option>
    bool checkAccessImpl(const AccessFlags & flags) const;

    template <bool grant_option, typename... Args>
    bool isGrantedImpl(const AccessFlags & flags, const std::string_view & database, const Args &... args) const;
    template <bool throw_if_denied, bool grant_option, typename... Args>
    bool checkAccessImpl(const AccessFlags & flags, const std::string_view & database, const Args &... args) const;

    template <bool grant_option>
    bool isGrantedImpl(const AccessRightsElement & element) const;
    template <bool throw_if_denied, bool grant_option>
    bool checkAccessImpl(const AccessRightsElement & element) const;

    template <bool grant_option>
    bool isGrantedImpl(const AccessRightsElements & elements) const;
    template <bool throw_if_denied, bool grant_option>
    bool checkAccessImpl(const AccessRightsElements & elements) const;

    template <bool grant_option, typename... Args>
    bool isGrantedImpl2(const AccessFlags & flags, const Args &... args) const;
    template <bool throw_if_denied, bool grant_option, typename... Args>
    bool checkAccessImpl2(const AccessFlags & flags, const Args &... args) const;

    template <bool grant_option>
    void checkAccessImpl(const AccessFlags & flags) const;
    template <bool throw_if_denied>
    bool checkAdminOptionImpl(const UUID & role_id) const;

    template <bool grant_option, typename... Args>
    void checkAccessImpl(const AccessFlags & flags, const std::string_view & database, const Args &... args) const;
    template <bool throw_if_denied>
    bool checkAdminOptionImpl(const UUID & role_id, const String & role_name) const;

    template <bool grant_option>
    void checkAccessImpl(const AccessRightsElement & element) const;
    template <bool throw_if_denied>
    bool checkAdminOptionImpl(const UUID & role_id, const std::unordered_map<UUID, String> & names_of_roles) const;

    template <bool grant_option>
    void checkAccessImpl(const AccessRightsElements & elements) const;
    template <bool throw_if_denied>
    bool checkAdminOptionImpl(const std::vector<UUID> & role_ids) const;

    template <bool grant_option, typename... Args>
    void checkAccessImpl2(const AccessFlags & flags, const Args &... args) const;
    template <bool throw_if_denied>
    bool checkAdminOptionImpl(const std::vector<UUID> & role_ids, const Strings & names_of_roles) const;

    template <typename Container, typename GetNameFunction>
    bool checkAdminOptionImpl(bool throw_on_error, const Container & role_ids, const GetNameFunction & get_name_function) const;
    template <bool throw_if_denied>
    bool checkAdminOptionImpl(const std::vector<UUID> & role_ids, const std::unordered_map<UUID, String> & names_of_roles) const;

    template <bool throw_if_denied, typename Container, typename GetNameFunction>
    bool checkAdminOptionImpl2(const Container & role_ids, const GetNameFunction & get_name_function) const;

    const AccessControlManager * manager = nullptr;
    const Params params;
    bool is_full_access = false;
    mutable Poco::Logger * trace_log = nullptr;
    mutable UserPtr user;
    mutable String user_name;
@ -209,13 +214,10 @@ private:
    mutable ext::scope_guard subscription_for_roles_changes;
    mutable std::shared_ptr<const EnabledRolesInfo> roles_info;
    mutable std::shared_ptr<const AccessRights> access;
    mutable std::shared_ptr<const AccessRights> access_with_implicit;
    mutable std::shared_ptr<const EnabledRowPolicies> enabled_row_policies;
    mutable std::shared_ptr<const EnabledQuota> enabled_quota;
    mutable std::shared_ptr<const EnabledSettings> enabled_settings;
    mutable std::shared_ptr<const AccessRights> access_without_readonly;
    mutable std::shared_ptr<const AccessRights> access_with_allow_ddl;
    mutable std::shared_ptr<const AccessRights> access_with_allow_introspection;
    mutable std::shared_ptr<const AccessRights> access_from_user_and_roles;
    mutable std::mutex mutex;
};

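The access_without_readonly, access_with_allow_ddl and access_with_allow_introspection members above are mutable caches: each alternative rights set is computed at most once, lazily, from inside a const check, and the shared mutex keeps that computation thread-safe. A generic sketch of this compute-once pattern, with a placeholder type standing in for AccessRights:

#include <iostream>
#include <memory>
#include <mutex>
#include <string>

// Placeholder for AccessRights; only the caching pattern is illustrated.
struct RightsSketch { std::string label; };

class RightsCacheSketch
{
public:
    // Safe to call from const methods: the cache members are mutable.
    std::shared_ptr<const RightsSketch> getWithoutReadonly() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!access_without_readonly)                 // compute at most once
            access_without_readonly = std::make_shared<RightsSketch>(RightsSketch{"recalculated with readonly = 0"});
        return access_without_readonly;
    }

private:
    mutable std::mutex mutex;
    mutable std::shared_ptr<const RightsSketch> access_without_readonly;
};

int main()
{
    RightsCacheSketch cache;
    std::cout << cache.getWithoutReadonly()->label << '\n';   // computed here
    std::cout << cache.getWithoutReadonly()->label << '\n';   // served from the cache
}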
@ -33,7 +33,7 @@ struct AvgFraction

    /// Allow division by zero as sometimes we need to return NaN.
    /// Invoked only if either Numerator or Denominator are Decimal.
    Float64 NO_SANITIZE_UNDEFINED divideIfAnyDecimal(UInt32 num_scale, UInt32 denom_scale) const
    Float64 NO_SANITIZE_UNDEFINED divideIfAnyDecimal(UInt32 num_scale, UInt32 denom_scale [[maybe_unused]]) const
    {
        if constexpr (IsDecimalNumber<Numerator> && IsDecimalNumber<Denominator>)
        {
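The new signature marks denom_scale as [[maybe_unused]] because the scales are only read on the if constexpr branch taken for Decimal instantiations; on the other instantiations the parameter would otherwise trigger an unused-parameter warning. A standalone sketch of that idiom, using generic types rather than the AvgFraction code:

#include <iostream>
#include <type_traits>

// Divide two values, applying a scale only when the type is integral
// (standing in for "is a Decimal"); otherwise the scale is never read.
template <typename T>
double divideScaled(T numerator, T denominator, unsigned scale [[maybe_unused]])
{
    if constexpr (std::is_integral_v<T>)
        return (static_cast<double>(numerator) / scale) / (static_cast<double>(denominator) / scale);
    else
        return static_cast<double>(numerator) / static_cast<double>(denominator);  // scale unused here
}

int main()
{
    std::cout << divideScaled(150, 50, 100) << '\n';    // integral branch uses the scale
    std::cout << divideScaled(1.5, 0.5, 100) << '\n';   // floating branch ignores it
}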
@ -65,6 +65,7 @@ class IColumn;
    M(UInt64, distributed_connections_pool_size, DBMS_DEFAULT_DISTRIBUTED_CONNECTIONS_POOL_SIZE, "Maximum number of connections with one remote server in the pool.", 0) \
    M(UInt64, connections_with_failover_max_tries, DBMS_CONNECTION_POOL_WITH_FAILOVER_DEFAULT_MAX_TRIES, "The maximum number of attempts to connect to replicas.", 0) \
    M(UInt64, s3_min_upload_part_size, 512*1024*1024, "The minimum size of part to upload during multipart upload to S3.", 0) \
    M(UInt64, s3_max_single_part_upload_size, 64*1024*1024, "The maximum size of object to upload using singlepart upload to S3.", 0) \
    M(UInt64, s3_max_redirects, 10, "Max number of S3 redirects hops allowed.", 0) \
    M(Bool, extremes, false, "Calculate minimums and maximums of the result columns. They can be output in JSON-formats.", IMPORTANT) \
    M(Bool, use_uncompressed_cache, true, "Whether to use the cache of uncompressed blocks.", 0) \
@ -400,6 +401,7 @@ class IColumn;
    M(Bool, enable_global_with_statement, false, "Propagate WITH statements to UNION queries and all subqueries", 0) \
    M(Bool, aggregate_functions_null_for_empty, false, "Rewrite all aggregate functions in a query, adding -OrNull suffix to them", 0) \
    M(Bool, optimize_skip_merged_partitions, false, "Skip partitions with one part with level > 0 in optimize final", 0) \
    M(Bool, optimize_on_insert, true, "Do the same transformation for inserted block of data as if merge was done on this block.", 0) \
    \
    M(Bool, use_antlr_parser, false, "Parse incoming queries using ANTLR-generated parser", 0) \
    \
@ -407,8 +409,6 @@ class IColumn;
    \
    M(UInt64, max_memory_usage_for_all_queries, 0, "Obsolete. Will be removed after 2020-10-20", 0) \
    M(UInt64, multiple_joins_rewriter_version, 0, "Obsolete setting, does nothing. Will be removed after 2021-03-31", 0) \
    M(Bool, experimental_use_processors, true, "Obsolete setting, does nothing. Will be removed after 2020-11-29.", 0) \
    M(Bool, force_optimize_skip_unused_shards_no_nested, false, "Obsolete setting, does nothing. Will be removed after 2020-12-01. Use force_optimize_skip_unused_shards_nesting instead.", 0) \
    M(Bool, enable_debug_queries, false, "Enabled debug queries, but now is obsolete", 0) \
    M(Bool, allow_experimental_database_atomic, true, "Obsolete setting, does nothing. Will be removed after 2021-02-12", 0) \
    M(UnionMode, union_default_mode, UnionMode::DISTINCT, "Set default Union Mode in SelectWithUnion query. Possible values: empty string, 'ALL', 'DISTINCT'. If empty, query without Union Mode will throw exception.", 0)
@ -30,7 +30,6 @@ struct SortCursorImpl
    ColumnRawPtrs all_columns;
    SortDescription desc;
    size_t sort_columns_size = 0;
    size_t pos = 0;
    size_t rows = 0;

    /** Determines order if comparing columns are equal.
@ -49,15 +48,20 @@ struct SortCursorImpl
    /** Is there at least one column with Collator. */
    bool has_collation = false;

    /** We could use SortCursorImpl in case when columns aren't sorted
      * but we have their sorted permutation
      */
    IColumn::Permutation * permutation = nullptr;

    SortCursorImpl() {}

    SortCursorImpl(const Block & block, const SortDescription & desc_, size_t order_ = 0)
    SortCursorImpl(const Block & block, const SortDescription & desc_, size_t order_ = 0, IColumn::Permutation * perm = nullptr)
        : desc(desc_), sort_columns_size(desc.size()), order(order_), need_collation(desc.size())
    {
        reset(block);
        reset(block, perm);
    }

    SortCursorImpl(const Columns & columns, const SortDescription & desc_, size_t order_ = 0)
    SortCursorImpl(const Columns & columns, const SortDescription & desc_, size_t order_ = 0, IColumn::Permutation * perm = nullptr)
        : desc(desc_), sort_columns_size(desc.size()), order(order_), need_collation(desc.size())
    {
        for (auto & column_desc : desc)
@ -66,19 +70,19 @@ struct SortCursorImpl
                throw Exception("SortDescription should contain column position if SortCursor was used without header.",
                    ErrorCodes::LOGICAL_ERROR);
        }
        reset(columns, {});
        reset(columns, {}, perm);
    }

    bool empty() const { return rows == 0; }

    /// Set the cursor to the beginning of the new block.
    void reset(const Block & block)
    void reset(const Block & block, IColumn::Permutation * perm = nullptr)
    {
        reset(block.getColumns(), block);
        reset(block.getColumns(), block, perm);
    }

    /// Set the cursor to the beginning of the new block.
    void reset(const Columns & columns, const Block & block)
    void reset(const Columns & columns, const Block & block, IColumn::Permutation * perm = nullptr)
    {
        all_columns.clear();
        sort_columns.clear();
@ -96,18 +100,33 @@ struct SortCursorImpl
                : column_desc.column_number;
            sort_columns.push_back(columns[column_number].get());

            need_collation[j] = desc[j].collator != nullptr && sort_columns.back()->isCollationSupported(); /// TODO Nullable(String)
            need_collation[j] = desc[j].collator != nullptr && sort_columns.back()->isCollationSupported();
            has_collation |= need_collation[j];
        }

        pos = 0;
        rows = all_columns[0]->size();
        permutation = perm;
    }

    size_t getRow() const
    {
        if (permutation)
            return (*permutation)[pos];
        return pos;
    }

    /// We need a possibility to change pos (see MergeJoin).
    size_t & getPosRef() { return pos; }

    bool isFirst() const { return pos == 0; }
    bool isLast() const { return pos + 1 >= rows; }
    bool isValid() const { return pos < rows; }
    void next() { ++pos; }

    /// Prevent using pos instead of getRow()
private:
    size_t pos;
};

using SortCursorImpls = std::vector<SortCursorImpl>;
@ -127,7 +146,7 @@ struct SortCursorHelper

    bool ALWAYS_INLINE greater(const SortCursorHelper & rhs) const
    {
        return derived().greaterAt(rhs.derived(), impl->pos, rhs.impl->pos);
        return derived().greaterAt(rhs.derived(), impl->getRow(), rhs.impl->getRow());
    }

    /// Inverted so that the priority queue elements are removed in ascending order.

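The change above lets a cursor walk unsorted columns through a precomputed sorted permutation: pos still advances 0..rows-1, while getRow() maps it to the physical row index, so comparisons and row copies read the data in sorted order without materializing a sorted block. A minimal sketch of that indirection with plain vectors (illustrative types, not the real SortCursorImpl):

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>

// A tiny cursor over an unsorted column plus its sorted permutation,
// mirroring the pos/getRow() split above.
struct PermutedCursorSketch
{
    const std::vector<int> * column = nullptr;
    const std::vector<size_t> * permutation = nullptr;  // optional sorted order
    size_t pos = 0;

    size_t getRow() const { return permutation ? (*permutation)[pos] : pos; }
    bool isValid() const { return pos < column->size(); }
    void next() { ++pos; }
};

int main()
{
    std::vector<int> column{30, 10, 20};

    // Build the sorted permutation instead of sorting the column itself.
    std::vector<size_t> perm(column.size());
    std::iota(perm.begin(), perm.end(), 0);
    std::sort(perm.begin(), perm.end(), [&](size_t a, size_t b) { return column[a] < column[b]; });

    PermutedCursorSketch cursor{&column, &perm};
    for (; cursor.isValid(); cursor.next())
        std::cout << column[cursor.getRow()] << ' ';    // prints 10 20 30
    std::cout << '\n';
}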
@ -222,7 +222,7 @@ void MergingSortedBlockInputStream::merge(MutableColumns & merged_columns, TSort
            // std::cerr << "total_merged_rows: " << total_merged_rows << ", merged_rows: " << merged_rows << "\n";
            // std::cerr << "Inserting row\n";
            for (size_t i = 0; i < num_columns; ++i)
                merged_columns[i]->insertFrom(*current->all_columns[i], current->pos);
                merged_columns[i]->insertFrom(*current->all_columns[i], current->getRow());

            if (out_row_sources_buf)
            {
@ -35,8 +35,8 @@ public:
|
||||
};
|
||||
|
||||
|
||||
DatabaseAtomic::DatabaseAtomic(String name_, String metadata_path_, UUID uuid, Context & context_)
|
||||
: DatabaseOrdinary(name_, std::move(metadata_path_), "store/", "DatabaseAtomic (" + name_ + ")", context_)
|
||||
DatabaseAtomic::DatabaseAtomic(String name_, String metadata_path_, UUID uuid, const String & logger_name, const Context & context_)
|
||||
: DatabaseOrdinary(name_, std::move(metadata_path_), "store/", logger_name, context_)
|
||||
, path_to_table_symlinks(global_context.getPath() + "data/" + escapeForFileName(name_) + "/")
|
||||
, path_to_metadata_symlink(global_context.getPath() + "metadata/" + escapeForFileName(name_))
|
||||
, db_uuid(uuid)
|
||||
@ -46,6 +46,11 @@ DatabaseAtomic::DatabaseAtomic(String name_, String metadata_path_, UUID uuid, C
|
||||
tryCreateMetadataSymlink();
|
||||
}
|
||||
|
||||
DatabaseAtomic::DatabaseAtomic(String name_, String metadata_path_, UUID uuid, const Context & context_)
|
||||
: DatabaseAtomic(name_, std::move(metadata_path_), uuid, "DatabaseAtomic (" + name_ + ")", context_)
|
||||
{
|
||||
}
|
||||
|
||||
String DatabaseAtomic::getTableDataPath(const String & table_name) const
|
||||
{
|
||||
std::lock_guard lock(mutex);
|
||||
|
@ -20,8 +20,8 @@ namespace DB
|
||||
class DatabaseAtomic : public DatabaseOrdinary
|
||||
{
|
||||
public:
|
||||
|
||||
DatabaseAtomic(String name_, String metadata_path_, UUID uuid, Context & context_);
|
||||
DatabaseAtomic(String name_, String metadata_path_, UUID uuid, const String & logger_name, const Context & context_);
|
||||
DatabaseAtomic(String name_, String metadata_path_, UUID uuid, const Context & context_);
|
||||
|
||||
String getEngineName() const override { return "Atomic"; }
|
||||
UUID getUUID() const override { return db_uuid; }
|
||||
@ -51,14 +51,14 @@ public:
|
||||
void loadStoredObjects(Context & context, bool has_force_restore_data_flag, bool force_attach) override;
|
||||
|
||||
/// Atomic database cannot be detached if there is detached table which still in use
|
||||
void assertCanBeDetached(bool cleanup);
|
||||
void assertCanBeDetached(bool cleanup) override;
|
||||
|
||||
UUID tryGetTableUUID(const String & table_name) const override;
|
||||
|
||||
void tryCreateSymlink(const String & table_name, const String & actual_data_path, bool if_data_path_exist = false);
|
||||
void tryRemoveSymlink(const String & table_name);
|
||||
|
||||
void waitDetachedTableNotInUse(const UUID & uuid);
|
||||
void waitDetachedTableNotInUse(const UUID & uuid) override;
|
||||
|
||||
private:
|
||||
void commitAlterTable(const StorageID & table_id, const String & table_metadata_tmp_path, const String & table_metadata_path) override;
|
||||
|
@ -120,27 +120,32 @@ DatabasePtr DatabaseFactory::getImpl(const ASTCreateQuery & create, const String
|
||||
const auto & [remote_host_name, remote_port] = parseAddress(host_name_and_port, 3306);
|
||||
auto mysql_pool = mysqlxx::Pool(mysql_database_name, remote_host_name, mysql_user_name, mysql_user_password, remote_port);
|
||||
|
||||
if (engine_name == "MaterializeMySQL")
|
||||
if (engine_name == "MySQL")
|
||||
{
|
||||
MySQLClient client(remote_host_name, remote_port, mysql_user_name, mysql_user_password);
|
||||
auto mysql_database_settings = std::make_unique<ConnectionMySQLSettings>();
|
||||
|
||||
auto materialize_mode_settings = std::make_unique<MaterializeMySQLSettings>();
|
||||
mysql_database_settings->loadFromQueryContext(context);
|
||||
mysql_database_settings->loadFromQuery(*engine_define); /// higher priority
|
||||
|
||||
if (engine_define->settings)
|
||||
materialize_mode_settings->loadFromQuery(*engine_define);
|
||||
|
||||
return std::make_shared<DatabaseMaterializeMySQL>(
|
||||
context, database_name, metadata_path, engine_define, mysql_database_name, std::move(mysql_pool), std::move(client)
|
||||
, std::move(materialize_mode_settings));
|
||||
return std::make_shared<DatabaseConnectionMySQL>(
|
||||
context, database_name, metadata_path, engine_define, mysql_database_name, std::move(mysql_database_settings), std::move(mysql_pool));
|
||||
}
|
||||
|
||||
auto mysql_database_settings = std::make_unique<ConnectionMySQLSettings>();
|
||||
MySQLClient client(remote_host_name, remote_port, mysql_user_name, mysql_user_password);
|
||||
|
||||
mysql_database_settings->loadFromQueryContext(context);
|
||||
mysql_database_settings->loadFromQuery(*engine_define); /// higher priority
|
||||
auto materialize_mode_settings = std::make_unique<MaterializeMySQLSettings>();
|
||||
|
||||
return std::make_shared<DatabaseConnectionMySQL>(
|
||||
context, database_name, metadata_path, engine_define, mysql_database_name, std::move(mysql_database_settings), std::move(mysql_pool));
|
||||
if (engine_define->settings)
|
||||
materialize_mode_settings->loadFromQuery(*engine_define);
|
||||
|
||||
if (create.uuid == UUIDHelpers::Nil)
|
||||
return std::make_shared<DatabaseMaterializeMySQL<DatabaseOrdinary>>(
|
||||
context, database_name, metadata_path, uuid, mysql_database_name, std::move(mysql_pool), std::move(client)
|
||||
, std::move(materialize_mode_settings));
|
||||
else
|
||||
return std::make_shared<DatabaseMaterializeMySQL<DatabaseAtomic>>(
|
||||
context, database_name, metadata_path, uuid, mysql_database_name, std::move(mysql_pool), std::move(client)
|
||||
, std::move(materialize_mode_settings));
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
|
@ -400,7 +400,7 @@ void DatabaseOnDisk::iterateMetadataFiles(const Context & context, const Iterati
|
||||
{
|
||||
auto process_tmp_drop_metadata_file = [&](const String & file_name)
|
||||
{
|
||||
assert(getEngineName() != "Atomic");
|
||||
assert(getUUID() == UUIDHelpers::Nil);
|
||||
static const char * tmp_drop_ext = ".sql.tmp_drop";
|
||||
const std::string object_name = file_name.substr(0, file_name.size() - strlen(tmp_drop_ext));
|
||||
if (Poco::File(context.getPath() + getDataPath() + '/' + object_name).exists())
|
||||
|
@ -80,7 +80,7 @@ StoragePtr DatabaseWithOwnTablesBase::detachTableUnlocked(const String & table_n
|
||||
auto table_id = res->getStorageID();
|
||||
if (table_id.hasUUID())
|
||||
{
|
||||
assert(database_name == DatabaseCatalog::TEMPORARY_DATABASE || getEngineName() == "Atomic");
|
||||
assert(database_name == DatabaseCatalog::TEMPORARY_DATABASE || getUUID() != UUIDHelpers::Nil);
|
||||
DatabaseCatalog::instance().removeUUIDMapping(table_id.uuid);
|
||||
}
|
||||
|
||||
@ -102,7 +102,7 @@ void DatabaseWithOwnTablesBase::attachTableUnlocked(const String & table_name, c
|
||||
|
||||
if (table_id.hasUUID())
|
||||
{
|
||||
assert(database_name == DatabaseCatalog::TEMPORARY_DATABASE || getEngineName() == "Atomic");
|
||||
assert(database_name == DatabaseCatalog::TEMPORARY_DATABASE || getUUID() != UUIDHelpers::Nil);
|
||||
DatabaseCatalog::instance().addUUIDMapping(table_id.uuid, shared_from_this(), table);
|
||||
}
|
||||
|
||||
@ -131,7 +131,7 @@ void DatabaseWithOwnTablesBase::shutdown()
|
||||
kv.second->shutdown();
|
||||
if (table_id.hasUUID())
|
||||
{
|
||||
assert(getDatabaseName() == DatabaseCatalog::TEMPORARY_DATABASE || getEngineName() == "Atomic");
|
||||
assert(getDatabaseName() == DatabaseCatalog::TEMPORARY_DATABASE || getUUID() != UUIDHelpers::Nil);
|
||||
DatabaseCatalog::instance().removeUUIDMapping(table_id.uuid);
|
||||
}
|
||||
}
|
||||
|
@ -334,6 +334,10 @@ public:
    /// All tables and dictionaries should be detached before detaching the database.
    virtual bool shouldBeEmptyOnDetach() const { return true; }

    virtual void assertCanBeDetached(bool /*cleanup*/) {}

    virtual void waitDetachedTableNotInUse(const UUID & /*uuid*/) { assert(false); }

    /// Ask all tables to complete the background threads they are using and delete all table objects.
    virtual void shutdown() = 0;

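The two hooks added above give engines with table UUIDs (such as Atomic) a way to veto detaching the database and to wait until a detached table is no longer in use, while the trivial base implementations let other engines ignore them. A minimal sketch of a derived engine overriding both hooks, with simplified stand-in types rather than the real IDatabase interface:

#include <cassert>
#include <iostream>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Simplified stand-ins for IDatabase and UUID; only the two hooks are modelled.
using FakeUUID = std::string;

struct DatabaseBaseSketch
{
    virtual ~DatabaseBaseSketch() = default;
    virtual void assertCanBeDetached(bool /*cleanup*/) {}
    virtual void waitDetachedTableNotInUse(const FakeUUID & /*uuid*/) { assert(false); }
};

struct AtomicLikeDatabaseSketch : DatabaseBaseSketch
{
    std::unordered_set<FakeUUID> detached_tables_in_use;

    void assertCanBeDetached(bool /*cleanup*/) override
    {
        if (!detached_tables_in_use.empty())
            throw std::runtime_error("Cannot detach database: detached tables are still in use");
    }

    void waitDetachedTableNotInUse(const FakeUUID & uuid) override
    {
        // The real engine would block until the table is released; the sketch just reports it.
        std::cout << "waiting until detached table " << uuid << " is not in use\n";
    }
};

int main()
{
    AtomicLikeDatabaseSketch db;
    db.detached_tables_in_use.insert("00000000-0000-0000-0000-000000000001");
    db.waitDetachedTableNotInUse("00000000-0000-0000-0000-000000000001");
    try { db.assertCanBeDetached(true); }
    catch (const std::exception & e) { std::cout << e.what() << '\n'; }
}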
@ -8,6 +8,7 @@
|
||||
|
||||
# include <Interpreters/Context.h>
|
||||
# include <Databases/DatabaseOrdinary.h>
|
||||
# include <Databases/DatabaseAtomic.h>
|
||||
# include <Databases/MySQL/DatabaseMaterializeTablesIterator.h>
|
||||
# include <Databases/MySQL/MaterializeMySQLSyncThread.h>
|
||||
# include <Parsers/ASTCreateQuery.h>
|
||||
@ -22,21 +23,37 @@ namespace DB
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int NOT_IMPLEMENTED;
|
||||
extern const int LOGICAL_ERROR;
|
||||
}
|
||||
|
||||
DatabaseMaterializeMySQL::DatabaseMaterializeMySQL(
|
||||
const Context & context, const String & database_name_, const String & metadata_path_, const IAST * database_engine_define_
|
||||
, const String & mysql_database_name_, mysqlxx::Pool && pool_, MySQLClient && client_, std::unique_ptr<MaterializeMySQLSettings> settings_)
|
||||
: IDatabase(database_name_), global_context(context.getGlobalContext()), engine_define(database_engine_define_->clone())
|
||||
, nested_database(std::make_shared<DatabaseOrdinary>(database_name_, metadata_path_, context))
|
||||
, settings(std::move(settings_)), log(&Poco::Logger::get("DatabaseMaterializeMySQL"))
|
||||
template<>
|
||||
DatabaseMaterializeMySQL<DatabaseOrdinary>::DatabaseMaterializeMySQL(
|
||||
const Context & context, const String & database_name_, const String & metadata_path_, UUID /*uuid*/,
|
||||
const String & mysql_database_name_, mysqlxx::Pool && pool_, MySQLClient && client_, std::unique_ptr<MaterializeMySQLSettings> settings_)
|
||||
: DatabaseOrdinary(database_name_
|
||||
, metadata_path_
|
||||
, "data/" + escapeForFileName(database_name_) + "/"
|
||||
, "DatabaseMaterializeMySQL<Ordinary> (" + database_name_ + ")", context
|
||||
)
|
||||
, settings(std::move(settings_))
|
||||
, materialize_thread(context, database_name_, mysql_database_name_, std::move(pool_), std::move(client_), settings.get())
|
||||
{
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::rethrowExceptionIfNeed() const
|
||||
template<>
|
||||
DatabaseMaterializeMySQL<DatabaseAtomic>::DatabaseMaterializeMySQL(
|
||||
const Context & context, const String & database_name_, const String & metadata_path_, UUID uuid,
|
||||
const String & mysql_database_name_, mysqlxx::Pool && pool_, MySQLClient && client_, std::unique_ptr<MaterializeMySQLSettings> settings_)
|
||||
: DatabaseAtomic(database_name_, metadata_path_, uuid, "DatabaseMaterializeMySQL<Atomic> (" + database_name_ + ")", context)
|
||||
, settings(std::move(settings_))
|
||||
, materialize_thread(context, database_name_, mysql_database_name_, std::move(pool_), std::move(client_), settings.get())
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(mutex);
|
||||
}
|
||||
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::rethrowExceptionIfNeed() const
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(Base::mutex);
|
||||
|
||||
if (!settings->allows_query_when_mysql_lost && exception)
|
||||
{
|
||||
@ -46,129 +63,71 @@ void DatabaseMaterializeMySQL::rethrowExceptionIfNeed() const
|
||||
}
|
||||
catch (Exception & ex)
|
||||
{
|
||||
/// This method can be called from multiple threads
|
||||
/// and Exception can be modified concurrently by calling addMessage(...),
|
||||
/// so we rethrow a copy.
|
||||
throw Exception(ex);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::setException(const std::exception_ptr & exception_)
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::setException(const std::exception_ptr & exception_)
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(mutex);
|
||||
std::unique_lock<std::mutex> lock(Base::mutex);
|
||||
exception = exception_;
|
||||
}
|
||||
|
||||
ASTPtr DatabaseMaterializeMySQL::getCreateDatabaseQuery() const
|
||||
{
|
||||
const auto & create_query = std::make_shared<ASTCreateQuery>();
|
||||
create_query->database = database_name;
|
||||
create_query->set(create_query->storage, engine_define);
|
||||
return create_query;
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::loadStoredObjects(Context & context, bool has_force_restore_data_flag, bool force_attach)
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::loadStoredObjects(Context & context, bool has_force_restore_data_flag, bool force_attach)
|
||||
{
|
||||
Base::loadStoredObjects(context, has_force_restore_data_flag, force_attach);
|
||||
try
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(mutex);
|
||||
nested_database->loadStoredObjects(context, has_force_restore_data_flag, force_attach);
|
||||
materialize_thread.startSynchronization();
|
||||
started_up = true;
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
tryLogCurrentException(log, "Cannot load MySQL nested database stored objects.");
|
||||
tryLogCurrentException(Base::log, "Cannot load MySQL nested database stored objects.");
|
||||
|
||||
if (!force_attach)
|
||||
throw;
|
||||
}
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::shutdown()
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::createTable(const Context & context, const String & name, const StoragePtr & table, const ASTPtr & query)
|
||||
{
|
||||
materialize_thread.stopSynchronization();
|
||||
|
||||
auto iterator = nested_database->getTablesIterator(global_context, {});
|
||||
|
||||
/// We only shutdown the table, The tables is cleaned up when destructed database
|
||||
for (; iterator->isValid(); iterator->next())
|
||||
iterator->table()->shutdown();
|
||||
assertCalledFromSyncThreadOrDrop("create table");
|
||||
Base::createTable(context, name, table, query);
|
||||
}
|
||||
|
||||
bool DatabaseMaterializeMySQL::empty() const
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::dropTable(const Context & context, const String & name, bool no_delay)
|
||||
{
|
||||
return nested_database->empty();
|
||||
assertCalledFromSyncThreadOrDrop("drop table");
|
||||
Base::dropTable(context, name, no_delay);
|
||||
}
|
||||
|
||||
String DatabaseMaterializeMySQL::getDataPath() const
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::attachTable(const String & name, const StoragePtr & table, const String & relative_table_path)
|
||||
{
|
||||
return nested_database->getDataPath();
|
||||
assertCalledFromSyncThreadOrDrop("attach table");
|
||||
Base::attachTable(name, table, relative_table_path);
|
||||
}
|
||||
|
||||
String DatabaseMaterializeMySQL::getMetadataPath() const
|
||||
template<typename Base>
|
||||
StoragePtr DatabaseMaterializeMySQL<Base>::detachTable(const String & name)
|
||||
{
|
||||
return nested_database->getMetadataPath();
|
||||
assertCalledFromSyncThreadOrDrop("detach table");
|
||||
return Base::detachTable(name);
|
||||
}
|
||||
|
||||
String DatabaseMaterializeMySQL::getTableDataPath(const String & table_name) const
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::renameTable(const Context & context, const String & name, IDatabase & to_database, const String & to_name, bool exchange, bool dictionary)
|
||||
{
|
||||
return nested_database->getTableDataPath(table_name);
|
||||
}
|
||||
|
||||
String DatabaseMaterializeMySQL::getTableDataPath(const ASTCreateQuery & query) const
|
||||
{
|
||||
return nested_database->getTableDataPath(query);
|
||||
}
|
||||
|
||||
String DatabaseMaterializeMySQL::getObjectMetadataPath(const String & table_name) const
|
||||
{
|
||||
return nested_database->getObjectMetadataPath(table_name);
|
||||
}
|
||||
|
||||
UUID DatabaseMaterializeMySQL::tryGetTableUUID(const String & table_name) const
|
||||
{
|
||||
return nested_database->tryGetTableUUID(table_name);
|
||||
}
|
||||
|
||||
time_t DatabaseMaterializeMySQL::getObjectMetadataModificationTime(const String & name) const
|
||||
{
|
||||
return nested_database->getObjectMetadataModificationTime(name);
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::createTable(const Context & context, const String & name, const StoragePtr & table, const ASTPtr & query)
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
throw Exception("MaterializeMySQL database not support create table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
nested_database->createTable(context, name, table, query);
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::dropTable(const Context & context, const String & name, bool no_delay)
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
throw Exception("MaterializeMySQL database not support drop table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
nested_database->dropTable(context, name, no_delay);
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::attachTable(const String & name, const StoragePtr & table, const String & relative_table_path)
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
throw Exception("MaterializeMySQL database not support attach table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
nested_database->attachTable(name, table, relative_table_path);
|
||||
}
|
||||
|
||||
StoragePtr DatabaseMaterializeMySQL::detachTable(const String & name)
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
throw Exception("MaterializeMySQL database not support detach table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
return nested_database->detachTable(name);
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::renameTable(const Context & context, const String & name, IDatabase & to_database, const String & to_name, bool exchange, bool dictionary)
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
throw Exception("MaterializeMySQL database not support rename table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
assertCalledFromSyncThreadOrDrop("rename table");
|
||||
|
||||
if (exchange)
|
||||
throw Exception("MaterializeMySQL database not support exchange table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
@ -176,57 +135,37 @@ void DatabaseMaterializeMySQL::renameTable(const Context & context, const String
|
||||
if (dictionary)
|
||||
throw Exception("MaterializeMySQL database not support rename dictionary.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
if (to_database.getDatabaseName() != getDatabaseName())
|
||||
if (to_database.getDatabaseName() != Base::getDatabaseName())
|
||||
throw Exception("Cannot rename with other database for MaterializeMySQL database.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
nested_database->renameTable(context, name, *nested_database, to_name, exchange, dictionary);
|
||||
Base::renameTable(context, name, *this, to_name, exchange, dictionary);
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::alterTable(const Context & context, const StorageID & table_id, const StorageInMemoryMetadata & metadata)
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::alterTable(const Context & context, const StorageID & table_id, const StorageInMemoryMetadata & metadata)
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
throw Exception("MaterializeMySQL database not support alter table.", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
||||
nested_database->alterTable(context, table_id, metadata);
|
||||
assertCalledFromSyncThreadOrDrop("alter table");
|
||||
Base::alterTable(context, table_id, metadata);
|
||||
}
|
||||
|
||||
bool DatabaseMaterializeMySQL::shouldBeEmptyOnDetach() const
|
||||
template<typename Base>
|
||||
void DatabaseMaterializeMySQL<Base>::drop(const Context & context)
|
||||
{
|
||||
return false;
|
||||
/// Remove metadata info
|
||||
Poco::File metadata(Base::getMetadataPath() + "/.metadata");
|
||||
|
||||
if (metadata.exists())
|
||||
metadata.remove(false);
|
||||
|
||||
Base::drop(context);
|
||||
}
|
||||
|
||||
void DatabaseMaterializeMySQL::drop(const Context & context)
|
||||
{
|
||||
if (nested_database->shouldBeEmptyOnDetach())
|
||||
{
|
||||
for (auto iterator = nested_database->getTablesIterator(context, {}); iterator->isValid(); iterator->next())
|
||||
{
|
||||
TableExclusiveLockHolder table_lock = iterator->table()->lockExclusively(
|
||||
context.getCurrentQueryId(), context.getSettingsRef().lock_acquire_timeout);
|
||||
|
||||
nested_database->dropTable(context, iterator->name(), true);
|
||||
}
|
||||
|
||||
/// Remove metadata info
|
||||
Poco::File metadata(getMetadataPath() + "/.metadata");
|
||||
|
||||
if (metadata.exists())
|
||||
metadata.remove(false);
|
||||
}
|
||||
|
||||
nested_database->drop(context);
|
||||
}
|
||||
|
||||
bool DatabaseMaterializeMySQL::isTableExist(const String & name, const Context & context) const
|
||||
{
|
||||
return nested_database->isTableExist(name, context);
|
||||
}
|
||||
|
||||
StoragePtr DatabaseMaterializeMySQL::tryGetTable(const String & name, const Context & context) const
template<typename Base>
StoragePtr DatabaseMaterializeMySQL<Base>::tryGetTable(const String & name, const Context & context) const
{
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
{
StoragePtr nested_storage = nested_database->tryGetTable(name, context);
StoragePtr nested_storage = Base::tryGetTable(name, context);

if (!nested_storage)
return {};
@ -234,20 +173,71 @@ StoragePtr DatabaseMaterializeMySQL::tryGetTable(const String & name, const Cont
return std::make_shared<StorageMaterializeMySQL>(std::move(nested_storage), this);
}

return nested_database->tryGetTable(name, context);
return Base::tryGetTable(name, context);
}

DatabaseTablesIteratorPtr DatabaseMaterializeMySQL::getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name)
template<typename Base>
DatabaseTablesIteratorPtr DatabaseMaterializeMySQL<Base>::getTablesIterator(const Context & context, const DatabaseOnDisk::FilterByNameFunction & filter_by_table_name)
{
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
{
DatabaseTablesIteratorPtr iterator = nested_database->getTablesIterator(context, filter_by_table_name);
DatabaseTablesIteratorPtr iterator = Base::getTablesIterator(context, filter_by_table_name);
return std::make_unique<DatabaseMaterializeTablesIterator>(std::move(iterator), this);
}

return nested_database->getTablesIterator(context, filter_by_table_name);
return Base::getTablesIterator(context, filter_by_table_name);
}

template<typename Base>
void DatabaseMaterializeMySQL<Base>::assertCalledFromSyncThreadOrDrop(const char * method) const
{
if (!MaterializeMySQLSyncThread::isMySQLSyncThread() && started_up)
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "MaterializeMySQL database not support {}", method);
}

template<typename Base>
void DatabaseMaterializeMySQL<Base>::shutdownSynchronizationThread()
{
materialize_thread.stopSynchronization();
started_up = false;
}

template<typename Database, template<class> class Helper, typename... Args>
auto castToMaterializeMySQLAndCallHelper(Database * database, Args && ... args)
{
using Ordinary = DatabaseMaterializeMySQL<DatabaseOrdinary>;
using Atomic = DatabaseMaterializeMySQL<DatabaseAtomic>;
using ToOrdinary = typename std::conditional_t<std::is_const_v<Database>, const Ordinary *, Ordinary *>;
using ToAtomic = typename std::conditional_t<std::is_const_v<Database>, const Atomic *, Atomic *>;
if (auto * database_materialize = typeid_cast<ToOrdinary>(database))
return (database_materialize->*Helper<Ordinary>::v)(std::forward<Args>(args)...);
if (auto * database_materialize = typeid_cast<ToAtomic>(database))
return (database_materialize->*Helper<Atomic>::v)(std::forward<Args>(args)...);

throw Exception("LOGICAL_ERROR: cannot cast to DatabaseMaterializeMySQL, it is a bug.", ErrorCodes::LOGICAL_ERROR);
}

template<typename T> struct HelperSetException { static constexpr auto v = &T::setException; };
void setSynchronizationThreadException(const DatabasePtr & materialize_mysql_db, const std::exception_ptr & exception)
{
castToMaterializeMySQLAndCallHelper<IDatabase, HelperSetException>(materialize_mysql_db.get(), exception);
}

template<typename T> struct HelperStopSync { static constexpr auto v = &T::shutdownSynchronizationThread; };
void stopDatabaseSynchronization(const DatabasePtr & materialize_mysql_db)
{
castToMaterializeMySQLAndCallHelper<IDatabase, HelperStopSync>(materialize_mysql_db.get());
}

template<typename T> struct HelperRethrow { static constexpr auto v = &T::rethrowExceptionIfNeed; };
void rethrowSyncExceptionIfNeed(const IDatabase * materialize_mysql_db)
{
castToMaterializeMySQLAndCallHelper<const IDatabase, HelperRethrow>(materialize_mysql_db);
}

template class DatabaseMaterializeMySQL<DatabaseOrdinary>;
template class DatabaseMaterializeMySQL<DatabaseAtomic>;

}

#endif
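The block above reaches the right DatabaseMaterializeMySQL instantiation (over an Ordinary or Atomic nested database) by trying a typeid_cast to each one and then calling the member named by a small pointer-to-member trait. A minimal standalone sketch of that dispatch pattern, with hypothetical stand-in types rather than the ClickHouse classes:

#include <iostream>
#include <stdexcept>
#include <utility>

struct BaseDb { virtual ~BaseDb() = default; };            // stand-in for IDatabase

template <typename Nested>
struct MaterializedDb : Nested                              // stand-in for DatabaseMaterializeMySQL<Base>
{
    void stopSync() { std::cout << "stopping sync for " << Nested::name() << '\n'; }
};

struct OrdinaryDb : BaseDb { static const char * name() { return "Ordinary"; } };
struct AtomicDb   : BaseDb { static const char * name() { return "Atomic"; } };

/// Trait that names the member to call; the real code has one such struct per operation.
template <typename T> struct HelperStopSync { static constexpr auto v = &T::stopSync; };

template <typename Db, template <class> class Helper, typename... Args>
auto castAndCall(Db * db, Args &&... args)
{
    using Ord = MaterializedDb<OrdinaryDb>;
    using Atm = MaterializedDb<AtomicDb>;
    if (auto * p = dynamic_cast<Ord *>(db))                 // ClickHouse uses typeid_cast here
        return (p->*Helper<Ord>::v)(std::forward<Args>(args)...);
    if (auto * p = dynamic_cast<Atm *>(db))
        return (p->*Helper<Atm>::v)(std::forward<Args>(args)...);
    throw std::logic_error("not a materialized database");
}

int main()
{
    MaterializedDb<AtomicDb> db;
    castAndCall<BaseDb, HelperStopSync>(&db);               // prints "stopping sync for Atomic"
}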
|
@ -17,48 +17,34 @@ namespace DB
|
||||
*
|
||||
* All table structure and data will be written to the local file system
|
||||
*/
|
||||
class DatabaseMaterializeMySQL : public IDatabase
|
||||
template<typename Base>
|
||||
class DatabaseMaterializeMySQL : public Base
|
||||
{
|
||||
public:
|
||||
|
||||
DatabaseMaterializeMySQL(
|
||||
const Context & context, const String & database_name_, const String & metadata_path_,
|
||||
const IAST * database_engine_define_, const String & mysql_database_name_, mysqlxx::Pool && pool_,
|
||||
const Context & context, const String & database_name_, const String & metadata_path_, UUID uuid,
|
||||
const String & mysql_database_name_, mysqlxx::Pool && pool_,
|
||||
MySQLClient && client_, std::unique_ptr<MaterializeMySQLSettings> settings_);
|
||||
|
||||
void rethrowExceptionIfNeed() const;
|
||||
|
||||
void setException(const std::exception_ptr & exception);
|
||||
protected:
|
||||
const Context & global_context;
|
||||
|
||||
ASTPtr engine_define;
|
||||
DatabasePtr nested_database;
|
||||
std::unique_ptr<MaterializeMySQLSettings> settings;
|
||||
|
||||
Poco::Logger * log;
|
||||
MaterializeMySQLSyncThread materialize_thread;
|
||||
|
||||
std::exception_ptr exception;
|
||||
|
||||
std::atomic_bool started_up{false};
|
||||
|
||||
public:
|
||||
String getEngineName() const override { return "MaterializeMySQL"; }
|
||||
|
||||
ASTPtr getCreateDatabaseQuery() const override;
|
||||
|
||||
void loadStoredObjects(Context & context, bool has_force_restore_data_flag, bool force_attach) override;
|
||||
|
||||
void shutdown() override;
|
||||
|
||||
bool empty() const override;
|
||||
|
||||
String getDataPath() const override;
|
||||
|
||||
String getTableDataPath(const String & table_name) const override;
|
||||
|
||||
String getTableDataPath(const ASTCreateQuery & query) const override;
|
||||
|
||||
UUID tryGetTableUUID(const String & table_name) const override;
|
||||
|
||||
void createTable(const Context & context, const String & name, const StoragePtr & table, const ASTPtr & query) override;
|
||||
|
||||
void dropTable(const Context & context, const String & name, bool no_delay) override;
|
||||
@ -71,23 +57,22 @@ public:
|
||||
|
||||
void alterTable(const Context & context, const StorageID & table_id, const StorageInMemoryMetadata & metadata) override;
|
||||
|
||||
time_t getObjectMetadataModificationTime(const String & name) const override;
|
||||
|
||||
String getMetadataPath() const override;
|
||||
|
||||
String getObjectMetadataPath(const String & table_name) const override;
|
||||
|
||||
bool shouldBeEmptyOnDetach() const override;
|
||||
|
||||
void drop(const Context & context) override;
|
||||
|
||||
bool isTableExist(const String & name, const Context & context) const override;
|
||||
|
||||
StoragePtr tryGetTable(const String & name, const Context & context) const override;
|
||||
|
||||
DatabaseTablesIteratorPtr getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name) override;
|
||||
DatabaseTablesIteratorPtr getTablesIterator(const Context & context, const DatabaseOnDisk::FilterByNameFunction & filter_by_table_name) override;
|
||||
|
||||
void assertCalledFromSyncThreadOrDrop(const char * method) const;
|
||||
|
||||
void shutdownSynchronizationThread();
|
||||
};
|
||||
|
||||
|
||||
void setSynchronizationThreadException(const DatabasePtr & materialize_mysql_db, const std::exception_ptr & exception);
|
||||
void stopDatabaseSynchronization(const DatabasePtr & materialize_mysql_db);
|
||||
void rethrowSyncExceptionIfNeed(const IDatabase * materialize_mysql_db);
|
||||
|
||||
}
|
||||
|
||||
#endif
|
||||
|
@ -2,7 +2,6 @@

#include <Databases/IDatabase.h>
#include <Storages/StorageMaterializeMySQL.h>
#include <Databases/MySQL/DatabaseMaterializeMySQL.h>

namespace DB
{
@ -30,7 +29,7 @@ public:

UUID uuid() const override { return nested_iterator->uuid(); }

DatabaseMaterializeTablesIterator(DatabaseTablesIteratorPtr nested_iterator_, DatabaseMaterializeMySQL * database_)
DatabaseMaterializeTablesIterator(DatabaseTablesIteratorPtr nested_iterator_, const IDatabase * database_)
: nested_iterator(std::move(nested_iterator_)), database(database_)
{
}
@ -38,8 +37,7 @@ public:
private:
mutable std::vector<StoragePtr> tables;
DatabaseTablesIteratorPtr nested_iterator;
DatabaseMaterializeMySQL * database;

const IDatabase * database;
};

}
|
@ -71,15 +71,6 @@ static BlockIO tryToExecuteQuery(const String & query_to_execute, Context & quer
}
}

static inline DatabaseMaterializeMySQL & getDatabase(const String & database_name)
{
DatabasePtr database = DatabaseCatalog::instance().getDatabase(database_name);

if (DatabaseMaterializeMySQL * database_materialize = typeid_cast<DatabaseMaterializeMySQL *>(database.get()))
return *database_materialize;

throw Exception("LOGICAL_ERROR: cannot cast to DatabaseMaterializeMySQL, it is a bug.", ErrorCodes::LOGICAL_ERROR);
}

MaterializeMySQLSyncThread::~MaterializeMySQLSyncThread()
{
@ -190,7 +181,8 @@ void MaterializeMySQLSyncThread::synchronization()
{
client.disconnect();
tryLogCurrentException(log);
getDatabase(database_name).setException(std::current_exception());
auto db = DatabaseCatalog::instance().getDatabase(database_name);
setSynchronizationThreadException(db, std::current_exception());
}
}

@ -343,7 +335,7 @@ std::optional<MaterializeMetadata> MaterializeMySQLSyncThread::prepareSynchroniz
opened_transaction = false;

MaterializeMetadata metadata(
connection, getDatabase(database_name).getMetadataPath() + "/.metadata", mysql_database_name, opened_transaction);
connection, DatabaseCatalog::instance().getDatabase(database_name)->getMetadataPath() + "/.metadata", mysql_database_name, opened_transaction);

if (!metadata.need_dumping_tables.empty())
{
@ -348,11 +348,11 @@ public:
|
||||
DiskS3::Metadata metadata_,
|
||||
const String & s3_path_,
|
||||
std::optional<DiskS3::ObjectMetadata> object_metadata_,
|
||||
bool is_multipart,
|
||||
size_t min_upload_part_size,
|
||||
size_t max_single_part_upload_size,
|
||||
size_t buf_size_)
|
||||
: WriteBufferFromFileBase(buf_size_, nullptr, 0)
|
||||
, impl(WriteBufferFromS3(client_ptr_, bucket_, metadata_.s3_root_path + s3_path_, min_upload_part_size, is_multipart,std::move(object_metadata_), buf_size_))
|
||||
, impl(WriteBufferFromS3(client_ptr_, bucket_, metadata_.s3_root_path + s3_path_, min_upload_part_size, max_single_part_upload_size,std::move(object_metadata_), buf_size_))
|
||||
, metadata(std::move(metadata_))
|
||||
, s3_path(s3_path_)
|
||||
{
|
||||
@ -542,7 +542,7 @@ DiskS3::DiskS3(
|
||||
String s3_root_path_,
|
||||
String metadata_path_,
|
||||
size_t min_upload_part_size_,
|
||||
size_t min_multi_part_upload_size_,
|
||||
size_t max_single_part_upload_size_,
|
||||
size_t min_bytes_for_seek_,
|
||||
bool send_metadata_)
|
||||
: IDisk(std::make_unique<AsyncExecutor>())
|
||||
@ -553,7 +553,7 @@ DiskS3::DiskS3(
|
||||
, s3_root_path(std::move(s3_root_path_))
|
||||
, metadata_path(std::move(metadata_path_))
|
||||
, min_upload_part_size(min_upload_part_size_)
|
||||
, min_multi_part_upload_size(min_multi_part_upload_size_)
|
||||
, max_single_part_upload_size(max_single_part_upload_size_)
|
||||
, min_bytes_for_seek(min_bytes_for_seek_)
|
||||
, send_metadata(send_metadata_)
|
||||
{
|
||||
@ -665,7 +665,7 @@ std::unique_ptr<ReadBufferFromFileBase> DiskS3::readFile(const String & path, si
|
||||
return std::make_unique<SeekAvoidingReadBuffer>(std::move(reader), min_bytes_for_seek);
|
||||
}
|
||||
|
||||
std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path, size_t buf_size, WriteMode mode, size_t estimated_size, size_t)
|
||||
std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path, size_t buf_size, WriteMode mode, size_t, size_t)
|
||||
{
|
||||
bool exist = exists(path);
|
||||
if (exist && readMeta(path).read_only)
|
||||
@ -674,7 +674,6 @@ std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path,
|
||||
/// Path to store new S3 object.
|
||||
auto s3_path = getRandomName();
|
||||
auto object_metadata = createObjectMetadata(path);
|
||||
bool is_multipart = estimated_size >= min_multi_part_upload_size;
|
||||
if (!exist || mode == WriteMode::Rewrite)
|
||||
{
|
||||
/// If metadata file exists - remove and create new.
|
||||
@ -687,7 +686,8 @@ std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path,
|
||||
|
||||
LOG_DEBUG(&Poco::Logger::get("DiskS3"), "Write to file by path: {}. New S3 path: {}", backQuote(metadata_path + path), s3_root_path + s3_path);
|
||||
|
||||
return std::make_unique<WriteIndirectBufferFromS3>(client, bucket, metadata, s3_path, object_metadata, is_multipart, min_upload_part_size, buf_size);
|
||||
return std::make_unique<WriteIndirectBufferFromS3>(
|
||||
client, bucket, metadata, s3_path, object_metadata, min_upload_part_size, max_single_part_upload_size, buf_size);
|
||||
}
|
||||
else
|
||||
{
|
||||
@ -696,7 +696,8 @@ std::unique_ptr<WriteBufferFromFileBase> DiskS3::writeFile(const String & path,
|
||||
LOG_DEBUG(&Poco::Logger::get("DiskS3"), "Append to file by path: {}. New S3 path: {}. Existing S3 objects: {}.",
|
||||
backQuote(metadata_path + path), s3_root_path + s3_path, metadata.s3_objects.size());
|
||||
|
||||
return std::make_unique<WriteIndirectBufferFromS3>(client, bucket, metadata, s3_path, object_metadata, is_multipart, min_upload_part_size, buf_size);
|
||||
return std::make_unique<WriteIndirectBufferFromS3>(
|
||||
client, bucket, metadata, s3_path, object_metadata, min_upload_part_size, max_single_part_upload_size, buf_size);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -34,7 +34,7 @@ public:
|
||||
String s3_root_path_,
|
||||
String metadata_path_,
|
||||
size_t min_upload_part_size_,
|
||||
size_t min_multi_part_upload_size_,
|
||||
size_t max_single_part_upload_size_,
|
||||
size_t min_bytes_for_seek_,
|
||||
bool send_metadata_);
|
||||
|
||||
@ -133,7 +133,7 @@ private:
|
||||
const String s3_root_path;
|
||||
const String metadata_path;
|
||||
size_t min_upload_part_size;
|
||||
size_t min_multi_part_upload_size;
|
||||
size_t max_single_part_upload_size;
|
||||
size_t min_bytes_for_seek;
|
||||
bool send_metadata;
|
||||
|
||||
|
@ -148,7 +148,7 @@ void registerDiskS3(DiskFactory & factory)
|
||||
uri.key,
|
||||
metadata_path,
|
||||
context.getSettingsRef().s3_min_upload_part_size,
|
||||
config.getUInt64(config_prefix + ".min_multi_part_upload_size", 10 * 1024 * 1024),
|
||||
context.getSettingsRef().s3_max_single_part_upload_size,
|
||||
config.getUInt64(config_prefix + ".min_bytes_for_seek", 1024 * 1024),
|
||||
config.getBool(config_prefix + ".send_object_metadata", false));
|
||||
|
||||
|
311
src/Functions/array/arrayAggregation.cpp
Normal file
@ -0,0 +1,311 @@
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <DataTypes/DataTypesDecimal.h>
|
||||
#include <DataTypes/DataTypeDateTime64.h>
|
||||
#include <Columns/ColumnsNumber.h>
|
||||
#include <Columns/ColumnDecimal.h>
|
||||
#include "FunctionArrayMapped.h"
|
||||
#include <Functions/FunctionFactory.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
|
||||
extern const int ILLEGAL_COLUMN;
|
||||
}
|
||||
|
||||
enum class AggregateOperation
|
||||
{
|
||||
min,
|
||||
max,
|
||||
sum,
|
||||
average
|
||||
};
|
||||
|
||||
/**
* During array aggregation we derive the result type from the operation.
* For array min or array max we use the array element type as the result type.
* For array average we use Float64.
* For array sum we use Decimal128 for decimal numbers, Float64 for floating point numbers,
* UInt64 for unsigned integers and Int64 for signed integers.
*/
|
||||
template <typename ArrayElement, AggregateOperation operation>
|
||||
struct ArrayAggregateResultImpl;
|
||||
|
||||
template <typename ArrayElement>
|
||||
struct ArrayAggregateResultImpl<ArrayElement, AggregateOperation::min>
|
||||
{
|
||||
using Result = ArrayElement;
|
||||
};
|
||||
|
||||
template <typename ArrayElement>
|
||||
struct ArrayAggregateResultImpl<ArrayElement, AggregateOperation::max>
|
||||
{
|
||||
using Result = ArrayElement;
|
||||
};
|
||||
|
||||
template <typename ArrayElement>
|
||||
struct ArrayAggregateResultImpl<ArrayElement, AggregateOperation::average>
|
||||
{
|
||||
using Result = Float64;
|
||||
};
|
||||
|
||||
template <typename ArrayElement>
|
||||
struct ArrayAggregateResultImpl<ArrayElement, AggregateOperation::sum>
|
||||
{
|
||||
using Result = std::conditional_t<
|
||||
IsDecimalNumber<ArrayElement>,
|
||||
Decimal128,
|
||||
std::conditional_t<
|
||||
std::is_floating_point_v<ArrayElement>,
|
||||
Float64,
|
||||
std::conditional_t<std::is_signed_v<ArrayElement>, Int64, UInt64>>>;
|
||||
};
|
||||
|
||||
template <typename ArrayElement, AggregateOperation operation>
|
||||
using ArrayAggregateResult = typename ArrayAggregateResultImpl<ArrayElement, operation>::Result;
|
||||
|
||||
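A compile-time sketch of the result-type mapping described in the comment above, using plain standard types; ResultOf here is an illustrative stand-in for ArrayAggregateResultImpl, and decimal types are left out of the sketch:

#include <cstdint>
#include <type_traits>

using Float64 = double;

enum class AggregateOperation { min, max, sum, average };

template <typename T, AggregateOperation op>
struct ResultOf
{
    using type = std::conditional_t<
        op == AggregateOperation::average, Float64,
        std::conditional_t<
            op == AggregateOperation::min || op == AggregateOperation::max, T,
            // sum: widen to 64 bits, keeping signedness; floats go to Float64
            std::conditional_t<std::is_floating_point_v<T>, Float64,
                std::conditional_t<std::is_signed_v<T>, int64_t, uint64_t>>>>;
};

// The mapping the comment above describes:
static_assert(std::is_same_v<ResultOf<uint8_t,  AggregateOperation::sum>::type, uint64_t>);
static_assert(std::is_same_v<ResultOf<int32_t,  AggregateOperation::sum>::type, int64_t>);
static_assert(std::is_same_v<ResultOf<float,    AggregateOperation::sum>::type, Float64>);
static_assert(std::is_same_v<ResultOf<int16_t,  AggregateOperation::average>::type, Float64>);
static_assert(std::is_same_v<ResultOf<uint16_t, AggregateOperation::min>::type, uint16_t>);

int main() {}   // nothing to run; the checks are compile-time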
template<AggregateOperation aggregate_operation>
|
||||
struct ArrayAggregateImpl
|
||||
{
|
||||
static bool needBoolean() { return false; }
|
||||
static bool needExpression() { return false; }
|
||||
static bool needOneArray() { return false; }
|
||||
|
||||
static DataTypePtr getReturnType(const DataTypePtr & expression_return, const DataTypePtr & /*array_element*/)
|
||||
{
|
||||
DataTypePtr result;
|
||||
|
||||
auto call = [&](const auto & types)
|
||||
{
|
||||
using Types = std::decay_t<decltype(types)>;
|
||||
using DataType = typename Types::LeftType;
|
||||
|
||||
if constexpr (aggregate_operation == AggregateOperation::average)
|
||||
{
|
||||
result = std::make_shared<DataTypeFloat64>();
|
||||
|
||||
return true;
|
||||
}
|
||||
else if constexpr (IsDataTypeNumber<DataType>)
|
||||
{
|
||||
using NumberReturnType = ArrayAggregateResult<typename DataType::FieldType, aggregate_operation>;
|
||||
result = std::make_shared<DataTypeNumber<NumberReturnType>>();
|
||||
|
||||
return true;
|
||||
}
|
||||
else if constexpr (IsDataTypeDecimal<DataType> && !IsDataTypeDateOrDateTime<DataType>)
|
||||
{
|
||||
using DecimalReturnType = ArrayAggregateResult<typename DataType::FieldType, aggregate_operation>;
|
||||
UInt32 scale = getDecimalScale(*expression_return);
|
||||
result = std::make_shared<DataTypeDecimal<DecimalReturnType>>(DecimalUtils::maxPrecision<DecimalReturnType>(), scale);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
};
|
||||
|
||||
if (!callOnIndexAndDataType<void>(expression_return->getTypeId(), call))
|
||||
{
|
||||
throw Exception(
|
||||
"array aggregation function cannot be performed on type " + expression_return->getName(),
|
||||
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
}
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
template <typename Element>
|
||||
static bool executeType(const ColumnPtr & mapped, const ColumnArray::Offsets & offsets, ColumnPtr & res_ptr)
|
||||
{
|
||||
using Result = ArrayAggregateResult<Element, aggregate_operation>;
|
||||
using ColVecType = std::conditional_t<IsDecimalNumber<Element>, ColumnDecimal<Element>, ColumnVector<Element>>;
|
||||
using ColVecResult = std::conditional_t<IsDecimalNumber<Result>, ColumnDecimal<Result>, ColumnVector<Result>>;
|
||||
|
||||
/// For average over a decimal array we return Float64 as the result,
/// but to keep decimal precision we convert to Float64 only as the last step of the average computation.
static constexpr bool use_decimal_for_average_aggregation
|
||||
= aggregate_operation == AggregateOperation::average && IsDecimalNumber<Element>;
|
||||
|
||||
using AggregationType = std::conditional_t<use_decimal_for_average_aggregation, Decimal128, Result>;
|
||||
|
||||
|
||||
const ColVecType * column = checkAndGetColumn<ColVecType>(&*mapped);
|
||||
|
||||
/// Constant case.
|
||||
if (!column)
|
||||
{
|
||||
const ColumnConst * column_const = checkAndGetColumnConst<ColVecType>(&*mapped);
|
||||
|
||||
if (!column_const)
|
||||
return false;
|
||||
|
||||
const AggregationType x = column_const->template getValue<Element>(); // NOLINT
|
||||
const typename ColVecType::Container & data
|
||||
= checkAndGetColumn<ColVecType>(&column_const->getDataColumn())->getData();
|
||||
|
||||
typename ColVecResult::MutablePtr res_column;
|
||||
if constexpr (IsDecimalNumber<Element>)
|
||||
{
|
||||
res_column = ColVecResult::create(offsets.size(), data.getScale());
|
||||
}
|
||||
else
|
||||
res_column = ColVecResult::create(offsets.size());
|
||||
|
||||
typename ColVecResult::Container & res = res_column->getData();
|
||||
|
||||
size_t pos = 0;
|
||||
for (size_t i = 0; i < offsets.size(); ++i)
|
||||
{
|
||||
if constexpr (aggregate_operation == AggregateOperation::sum)
|
||||
{
|
||||
size_t array_size = offsets[i] - pos;
|
||||
/// Just multiply the value by array size.
|
||||
res[i] = x * array_size;
|
||||
}
|
||||
else if constexpr (aggregate_operation == AggregateOperation::min ||
|
||||
aggregate_operation == AggregateOperation::max)
|
||||
{
|
||||
res[i] = x;
|
||||
}
|
||||
else if constexpr (aggregate_operation == AggregateOperation::average)
|
||||
{
|
||||
if constexpr (IsDecimalNumber<Element>)
|
||||
{
|
||||
res[i] = DecimalUtils::convertTo<Result>(x, data.getScale());
|
||||
}
|
||||
else
|
||||
{
|
||||
res[i] = x;
|
||||
}
|
||||
}
|
||||
|
||||
pos = offsets[i];
|
||||
}
|
||||
|
||||
res_ptr = std::move(res_column);
|
||||
return true;
|
||||
}
|
||||
|
||||
const typename ColVecType::Container & data = column->getData();
|
||||
|
||||
typename ColVecResult::MutablePtr res_column;
|
||||
if constexpr (IsDecimalNumber<Element>)
|
||||
res_column = ColVecResult::create(offsets.size(), data.getScale());
|
||||
else
|
||||
res_column = ColVecResult::create(offsets.size());
|
||||
|
||||
typename ColVecResult::Container & res = res_column->getData();
|
||||
|
||||
size_t pos = 0;
|
||||
for (size_t i = 0; i < offsets.size(); ++i)
|
||||
{
|
||||
AggregationType s = 0;
|
||||
|
||||
/// Array is empty
|
||||
if (offsets[i] == pos)
|
||||
{
|
||||
res[i] = s;
|
||||
continue;
|
||||
}
|
||||
|
||||
size_t count = 1;
|
||||
s = data[pos]; // NOLINT
|
||||
++pos;
|
||||
|
||||
for (; pos < offsets[i]; ++pos)
|
||||
{
|
||||
auto element = data[pos];
|
||||
|
||||
if constexpr (aggregate_operation == AggregateOperation::sum ||
|
||||
aggregate_operation == AggregateOperation::average)
|
||||
{
|
||||
s += element;
|
||||
}
|
||||
else if constexpr (aggregate_operation == AggregateOperation::min)
|
||||
{
|
||||
if (element < s)
|
||||
{
|
||||
s = element;
|
||||
}
|
||||
}
|
||||
else if constexpr (aggregate_operation == AggregateOperation::max)
|
||||
{
|
||||
if (element > s)
|
||||
{
|
||||
s = element;
|
||||
}
|
||||
}
|
||||
|
||||
++count;
|
||||
}
|
||||
|
||||
if constexpr (aggregate_operation == AggregateOperation::average)
|
||||
{
|
||||
s = s / count;
|
||||
}
|
||||
|
||||
if constexpr (use_decimal_for_average_aggregation)
|
||||
{
|
||||
res[i] = DecimalUtils::convertTo<Result>(s, data.getScale());
|
||||
}
|
||||
else
|
||||
{
|
||||
res[i] = s;
|
||||
}
|
||||
}
|
||||
|
||||
res_ptr = std::move(res_column);
|
||||
return true;
|
||||
}
|
||||
|
||||
static ColumnPtr execute(const ColumnArray & array, ColumnPtr mapped)
|
||||
{
|
||||
const IColumn::Offsets & offsets = array.getOffsets();
|
||||
ColumnPtr res;
|
||||
|
||||
if (executeType<UInt8>(mapped, offsets, res) ||
|
||||
executeType<UInt16>(mapped, offsets, res) ||
|
||||
executeType<UInt32>(mapped, offsets, res) ||
|
||||
executeType<UInt64>(mapped, offsets, res) ||
|
||||
executeType<Int8>(mapped, offsets, res) ||
|
||||
executeType<Int16>(mapped, offsets, res) ||
|
||||
executeType<Int32>(mapped, offsets, res) ||
|
||||
executeType<Int64>(mapped, offsets, res) ||
|
||||
executeType<Float32>(mapped, offsets, res) ||
|
||||
executeType<Float64>(mapped, offsets, res) ||
|
||||
executeType<Decimal32>(mapped, offsets, res) ||
|
||||
executeType<Decimal64>(mapped, offsets, res) ||
|
||||
executeType<Decimal128>(mapped, offsets, res))
|
||||
return res;
|
||||
else
|
||||
throw Exception("Unexpected column for arraySum: " + mapped->getName(), ErrorCodes::ILLEGAL_COLUMN);
|
||||
}
|
||||
};
|
||||
|
||||
struct NameArrayMin { static constexpr auto name = "arrayMin"; };
|
||||
using FunctionArrayMin = FunctionArrayMapped<ArrayAggregateImpl<AggregateOperation::min>, NameArrayMin>;
|
||||
|
||||
struct NameArrayMax { static constexpr auto name = "arrayMax"; };
|
||||
using FunctionArrayMax = FunctionArrayMapped<ArrayAggregateImpl<AggregateOperation::max>, NameArrayMax>;
|
||||
|
||||
struct NameArraySum { static constexpr auto name = "arraySum"; };
|
||||
using FunctionArraySum = FunctionArrayMapped<ArrayAggregateImpl<AggregateOperation::sum>, NameArraySum>;
|
||||
|
||||
struct NameArrayAverage { static constexpr auto name = "arrayAvg"; };
|
||||
using FunctionArrayAverage = FunctionArrayMapped<ArrayAggregateImpl<AggregateOperation::average>, NameArrayAverage>;
|
||||
|
||||
void registerFunctionArrayAggregation(FunctionFactory & factory)
|
||||
{
|
||||
factory.registerFunction<FunctionArrayMin>();
|
||||
factory.registerFunction<FunctionArrayMax>();
|
||||
factory.registerFunction<FunctionArraySum>();
|
||||
factory.registerFunction<FunctionArrayAverage>();
|
||||
}
|
||||
|
||||
}
|
||||
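The execute() method above resolves the concrete array element type by simply trying every supported executeType<T> instantiation until one accepts the column. A small standalone sketch of that chained-dispatch style, with hypothetical column classes in place of ClickHouse's IColumn hierarchy:

#include <iostream>
#include <typeinfo>
#include <vector>

// Hypothetical type-erased column: a base class with concrete vector payloads.
struct IColumn { virtual ~IColumn() = default; };
template <typename T>
struct ColumnVector : IColumn { std::vector<T> data; };

// Try to handle the column as ColumnVector<T>; return false if it is some other type.
template <typename T>
static bool executeType(const IColumn & col, double & sum_out)
{
    const auto * concrete = dynamic_cast<const ColumnVector<T> *>(&col);
    if (!concrete)
        return false;
    sum_out = 0;
    for (T v : concrete->data)
        sum_out += static_cast<double>(v);
    return true;
}

// The first executeType<...> that matches wins; otherwise the type is unsupported.
static double sumColumn(const IColumn & col)
{
    double sum = 0;
    if (executeType<uint8_t>(col, sum) ||
        executeType<int32_t>(col, sum) ||
        executeType<double>(col, sum))
        return sum;
    throw std::bad_cast();
}

int main()
{
    ColumnVector<int32_t> col;
    col.data = {1, 2, 3};
    std::cout << sumColumn(col) << '\n';   // 6
}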
|
@ -1,146 +0,0 @@
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <DataTypes/DataTypesDecimal.h>
|
||||
#include <Columns/ColumnsNumber.h>
|
||||
#include <Columns/ColumnDecimal.h>
|
||||
#include "FunctionArrayMapped.h"
|
||||
#include <Functions/FunctionFactory.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
|
||||
extern const int ILLEGAL_COLUMN;
|
||||
}
|
||||
|
||||
struct ArraySumImpl
|
||||
{
|
||||
static bool needBoolean() { return false; }
|
||||
static bool needExpression() { return false; }
|
||||
static bool needOneArray() { return false; }
|
||||
|
||||
static DataTypePtr getReturnType(const DataTypePtr & expression_return, const DataTypePtr & /*array_element*/)
|
||||
{
|
||||
WhichDataType which(expression_return);
|
||||
|
||||
if (which.isNativeUInt())
|
||||
return std::make_shared<DataTypeUInt64>();
|
||||
|
||||
if (which.isNativeInt())
|
||||
return std::make_shared<DataTypeInt64>();
|
||||
|
||||
if (which.isFloat())
|
||||
return std::make_shared<DataTypeFloat64>();
|
||||
|
||||
if (which.isDecimal())
|
||||
{
|
||||
UInt32 scale = getDecimalScale(*expression_return);
|
||||
return std::make_shared<DataTypeDecimal<Decimal128>>(DecimalUtils::maxPrecision<Decimal128>(), scale);
|
||||
}
|
||||
|
||||
throw Exception("arraySum cannot add values of type " + expression_return->getName(), ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
}
|
||||
|
||||
template <typename Element, typename Result>
|
||||
static bool executeType(const ColumnPtr & mapped, const ColumnArray::Offsets & offsets, ColumnPtr & res_ptr)
|
||||
{
|
||||
using ColVecType = std::conditional_t<IsDecimalNumber<Element>, ColumnDecimal<Element>, ColumnVector<Element>>;
|
||||
using ColVecResult = std::conditional_t<IsDecimalNumber<Result>, ColumnDecimal<Result>, ColumnVector<Result>>;
|
||||
|
||||
const ColVecType * column = checkAndGetColumn<ColVecType>(&*mapped);
|
||||
|
||||
/// Constant case.
|
||||
if (!column)
|
||||
{
|
||||
const ColumnConst * column_const = checkAndGetColumnConst<ColVecType>(&*mapped);
|
||||
|
||||
if (!column_const)
|
||||
return false;
|
||||
|
||||
const Result x = column_const->template getValue<Element>(); // NOLINT
|
||||
|
||||
typename ColVecResult::MutablePtr res_column;
|
||||
if constexpr (IsDecimalNumber<Element>)
|
||||
{
|
||||
const typename ColVecType::Container & data =
|
||||
checkAndGetColumn<ColVecType>(&column_const->getDataColumn())->getData();
|
||||
res_column = ColVecResult::create(offsets.size(), data.getScale());
|
||||
}
|
||||
else
|
||||
res_column = ColVecResult::create(offsets.size());
|
||||
|
||||
typename ColVecResult::Container & res = res_column->getData();
|
||||
|
||||
size_t pos = 0;
|
||||
for (size_t i = 0; i < offsets.size(); ++i)
|
||||
{
|
||||
/// Just multiply the value by array size.
|
||||
res[i] = x * (offsets[i] - pos);
|
||||
pos = offsets[i];
|
||||
}
|
||||
|
||||
res_ptr = std::move(res_column);
|
||||
return true;
|
||||
}
|
||||
|
||||
const typename ColVecType::Container & data = column->getData();
|
||||
|
||||
typename ColVecResult::MutablePtr res_column;
|
||||
if constexpr (IsDecimalNumber<Element>)
|
||||
res_column = ColVecResult::create(offsets.size(), data.getScale());
|
||||
else
|
||||
res_column = ColVecResult::create(offsets.size());
|
||||
|
||||
typename ColVecResult::Container & res = res_column->getData();
|
||||
|
||||
size_t pos = 0;
|
||||
for (size_t i = 0; i < offsets.size(); ++i)
|
||||
{
|
||||
Result s = 0;
|
||||
for (; pos < offsets[i]; ++pos)
|
||||
{
|
||||
s += data[pos];
|
||||
}
|
||||
res[i] = s;
|
||||
}
|
||||
|
||||
res_ptr = std::move(res_column);
|
||||
return true;
|
||||
}
|
||||
|
||||
static ColumnPtr execute(const ColumnArray & array, ColumnPtr mapped)
|
||||
{
|
||||
const IColumn::Offsets & offsets = array.getOffsets();
|
||||
ColumnPtr res;
|
||||
|
||||
if (executeType< UInt8 , UInt64>(mapped, offsets, res) ||
|
||||
executeType< UInt16, UInt64>(mapped, offsets, res) ||
|
||||
executeType< UInt32, UInt64>(mapped, offsets, res) ||
|
||||
executeType< UInt64, UInt64>(mapped, offsets, res) ||
|
||||
executeType< Int8 , Int64>(mapped, offsets, res) ||
|
||||
executeType< Int16, Int64>(mapped, offsets, res) ||
|
||||
executeType< Int32, Int64>(mapped, offsets, res) ||
|
||||
executeType< Int64, Int64>(mapped, offsets, res) ||
|
||||
executeType<Float32,Float64>(mapped, offsets, res) ||
|
||||
executeType<Float64,Float64>(mapped, offsets, res) ||
|
||||
executeType<Decimal32, Decimal128>(mapped, offsets, res) ||
|
||||
executeType<Decimal64, Decimal128>(mapped, offsets, res) ||
|
||||
executeType<Decimal128, Decimal128>(mapped, offsets, res))
|
||||
return res;
|
||||
else
|
||||
throw Exception("Unexpected column for arraySum: " + mapped->getName(), ErrorCodes::ILLEGAL_COLUMN);
|
||||
}
|
||||
};
|
||||
|
||||
struct NameArraySum { static constexpr auto name = "arraySum"; };
|
||||
using FunctionArraySum = FunctionArrayMapped<ArraySumImpl, NameArraySum>;
|
||||
|
||||
void registerFunctionArraySum(FunctionFactory & factory)
|
||||
{
|
||||
factory.registerFunction<FunctionArraySum>();
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -9,7 +9,7 @@ void registerFunctionArrayCount(FunctionFactory & factory);
|
||||
void registerFunctionArrayExists(FunctionFactory & factory);
|
||||
void registerFunctionArrayAll(FunctionFactory & factory);
|
||||
void registerFunctionArrayCompact(FunctionFactory & factory);
|
||||
void registerFunctionArraySum(FunctionFactory & factory);
|
||||
void registerFunctionArrayAggregation(FunctionFactory & factory);
|
||||
void registerFunctionArrayFirst(FunctionFactory & factory);
|
||||
void registerFunctionArrayFirstIndex(FunctionFactory & factory);
|
||||
void registerFunctionsArrayFill(FunctionFactory & factory);
|
||||
@ -27,7 +27,7 @@ void registerFunctionsHigherOrder(FunctionFactory & factory)
|
||||
registerFunctionArrayExists(factory);
|
||||
registerFunctionArrayAll(factory);
|
||||
registerFunctionArrayCompact(factory);
|
||||
registerFunctionArraySum(factory);
|
||||
registerFunctionArrayAggregation(factory);
|
||||
registerFunctionArrayFirst(factory);
|
||||
registerFunctionArrayFirstIndex(factory);
|
||||
registerFunctionsArrayFill(factory);
|
||||
|
@ -120,6 +120,7 @@ SRCS(
|
||||
appendTrailingCharIfAbsent.cpp
|
||||
array/array.cpp
|
||||
array/arrayAUC.cpp
|
||||
array/arrayAggregation.cpp
|
||||
array/arrayAll.cpp
|
||||
array/arrayCompact.cpp
|
||||
array/arrayConcat.cpp
|
||||
@ -155,7 +156,6 @@ SRCS(
|
||||
array/arraySlice.cpp
|
||||
array/arraySort.cpp
|
||||
array/arraySplit.cpp
|
||||
array/arraySum.cpp
|
||||
array/arrayUniq.cpp
|
||||
array/arrayWithConstant.cpp
|
||||
array/arrayZip.cpp
|
||||
|
@ -368,8 +368,12 @@ namespace S3
throw Exception(
"Bucket name length is out of bounds in virtual hosted style S3 URI: " + bucket + " (" + uri.toString() + ")", ErrorCodes::BAD_ARGUMENTS);

/// Remove leading '/' from path to extract key.
key = uri.getPath().substr(1);
if (!uri.getPath().empty())
{
/// Remove leading '/' from path to extract key.
key = uri.getPath().substr(1);
}

if (key.empty() || key == "/")
throw Exception("Key name is empty in virtual hosted style S3 URI: " + key + " (" + uri.toString() + ")", ErrorCodes::BAD_ARGUMENTS);
boost::to_upper(name);
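The hunk above guards key extraction against a URI with an empty path before stripping the leading '/'. A tiny standalone sketch of the same check on a plain std::string path (the real code operates on a Poco::URI):

#include <iostream>
#include <stdexcept>
#include <string>

// Extract the object key from the path part of a virtual-hosted-style S3 URI,
// e.g. path "/folder/object.csv" -> key "folder/object.csv".
static std::string extractKey(const std::string & path)
{
    std::string key;
    if (!path.empty())
        key = path.substr(1);            // drop the leading '/' only when there is a path at all
    if (key.empty() || key == "/")
        throw std::invalid_argument("Key name is empty in virtual hosted style S3 URI");
    return key;
}

int main()
{
    std::cout << extractKey("/folder/object.csv") << '\n';   // folder/object.csv
}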
@ -5,12 +5,9 @@
|
||||
# include <IO/WriteBufferFromS3.h>
|
||||
# include <IO/WriteHelpers.h>
|
||||
|
||||
# include <aws/core/utils/memory/stl/AWSStreamFwd.h>
|
||||
# include <aws/core/utils/memory/stl/AWSStringStream.h>
|
||||
# include <aws/s3/S3Client.h>
|
||||
# include <aws/s3/model/CompleteMultipartUploadRequest.h>
|
||||
# include <aws/s3/model/AbortMultipartUploadRequest.h>
|
||||
# include <aws/s3/model/CreateMultipartUploadRequest.h>
|
||||
# include <aws/s3/model/CompleteMultipartUploadRequest.h>
|
||||
# include <aws/s3/model/PutObjectRequest.h>
|
||||
# include <aws/s3/model/UploadPartRequest.h>
|
||||
# include <common/logger_useful.h>
|
||||
@ -42,22 +39,19 @@ WriteBufferFromS3::WriteBufferFromS3(
|
||||
const String & bucket_,
|
||||
const String & key_,
|
||||
size_t minimum_upload_part_size_,
|
||||
bool is_multipart_,
|
||||
size_t max_single_part_upload_size_,
|
||||
std::optional<std::map<String, String>> object_metadata_,
|
||||
size_t buffer_size_)
|
||||
: BufferWithOwnMemory<WriteBuffer>(buffer_size_, nullptr, 0)
|
||||
, is_multipart(is_multipart_)
|
||||
, bucket(bucket_)
|
||||
, key(key_)
|
||||
, object_metadata(std::move(object_metadata_))
|
||||
, client_ptr(std::move(client_ptr_))
|
||||
, minimum_upload_part_size{minimum_upload_part_size_}
|
||||
, temporary_buffer{std::make_unique<WriteBufferFromOwnString>()}
|
||||
, last_part_size{0}
|
||||
{
|
||||
if (is_multipart)
|
||||
initiate();
|
||||
}
|
||||
, minimum_upload_part_size(minimum_upload_part_size_)
|
||||
, max_single_part_upload_size(max_single_part_upload_size_)
|
||||
, temporary_buffer(Aws::MakeShared<Aws::StringStream>("temporary buffer"))
|
||||
, last_part_size(0)
|
||||
{ }
|
||||
|
||||
void WriteBufferFromS3::nextImpl()
|
||||
{
|
||||
@ -68,16 +62,17 @@ void WriteBufferFromS3::nextImpl()
|
||||
|
||||
ProfileEvents::increment(ProfileEvents::S3WriteBytes, offset());
|
||||
|
||||
if (is_multipart)
|
||||
{
|
||||
last_part_size += offset();
|
||||
last_part_size += offset();
|
||||
|
||||
if (last_part_size > minimum_upload_part_size)
|
||||
{
|
||||
writePart(temporary_buffer->str());
|
||||
last_part_size = 0;
|
||||
temporary_buffer->restart();
|
||||
}
|
||||
/// Data size exceeds singlepart upload threshold, need to use multipart upload.
|
||||
if (multipart_upload_id.empty() && last_part_size > max_single_part_upload_size)
|
||||
createMultipartUpload();
|
||||
|
||||
if (!multipart_upload_id.empty() && last_part_size > minimum_upload_part_size)
|
||||
{
|
||||
writePart();
|
||||
last_part_size = 0;
|
||||
temporary_buffer = Aws::MakeShared<Aws::StringStream>("temporary buffer");
|
||||
}
|
||||
}
|
||||
|
||||
@ -88,17 +83,23 @@ void WriteBufferFromS3::finalize()
|
||||
|
||||
void WriteBufferFromS3::finalizeImpl()
|
||||
{
|
||||
if (!finalized)
|
||||
if (finalized)
|
||||
return;
|
||||
|
||||
next();
|
||||
|
||||
if (multipart_upload_id.empty())
|
||||
{
|
||||
next();
|
||||
|
||||
if (is_multipart)
|
||||
writePart(temporary_buffer->str());
|
||||
|
||||
complete();
|
||||
|
||||
finalized = true;
|
||||
makeSinglepartUpload();
|
||||
}
|
||||
else
|
||||
{
|
||||
/// Write rest of the data as last part.
|
||||
writePart();
|
||||
completeMultipartUpload();
|
||||
}
|
||||
|
||||
finalized = true;
|
||||
}
|
||||
|
||||
WriteBufferFromS3::~WriteBufferFromS3()
|
||||
@ -113,7 +114,7 @@ WriteBufferFromS3::~WriteBufferFromS3()
|
||||
}
|
||||
}
|
||||
|
||||
void WriteBufferFromS3::initiate()
|
||||
void WriteBufferFromS3::createMultipartUpload()
|
||||
{
|
||||
Aws::S3::Model::CreateMultipartUploadRequest req;
|
||||
req.SetBucket(bucket);
|
||||
@ -125,17 +126,17 @@ void WriteBufferFromS3::initiate()
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
{
|
||||
upload_id = outcome.GetResult().GetUploadId();
|
||||
LOG_DEBUG(log, "Multipart upload initiated. Upload id: {}", upload_id);
|
||||
multipart_upload_id = outcome.GetResult().GetUploadId();
|
||||
LOG_DEBUG(log, "Multipart upload has created. Upload id: {}", multipart_upload_id);
|
||||
}
|
||||
else
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
}
|
||||
|
||||
|
||||
void WriteBufferFromS3::writePart(const String & data)
|
||||
void WriteBufferFromS3::writePart()
|
||||
{
|
||||
if (data.empty())
|
||||
if (temporary_buffer->tellp() <= 0)
|
||||
return;
|
||||
|
||||
if (part_tags.size() == S3_WARN_MAX_PARTS)
|
||||
@ -149,93 +150,71 @@ void WriteBufferFromS3::writePart(const String & data)
|
||||
req.SetBucket(bucket);
|
||||
req.SetKey(key);
|
||||
req.SetPartNumber(part_tags.size() + 1);
|
||||
req.SetUploadId(upload_id);
|
||||
req.SetContentLength(data.size());
|
||||
req.SetBody(std::make_shared<Aws::StringStream>(data));
|
||||
req.SetUploadId(multipart_upload_id);
|
||||
req.SetContentLength(temporary_buffer->tellp());
|
||||
req.SetBody(temporary_buffer);
|
||||
|
||||
auto outcome = client_ptr->UploadPart(req);
|
||||
|
||||
LOG_TRACE(log, "Writing part. Bucket: {}, Key: {}, Upload_id: {}, Data size: {}", bucket, key, upload_id, data.size());
|
||||
LOG_TRACE(log, "Writing part. Bucket: {}, Key: {}, Upload_id: {}, Data size: {}", bucket, key, multipart_upload_id, temporary_buffer->tellp());
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
{
|
||||
auto etag = outcome.GetResult().GetETag();
|
||||
part_tags.push_back(etag);
|
||||
LOG_DEBUG(log, "Writing part finished. Total parts: {}, Upload_id: {}, Etag: {}", part_tags.size(), upload_id, etag);
|
||||
LOG_DEBUG(log, "Writing part finished. Total parts: {}, Upload_id: {}, Etag: {}", part_tags.size(), multipart_upload_id, etag);
|
||||
}
|
||||
else
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
}
|
||||
|
||||
|
||||
void WriteBufferFromS3::complete()
|
||||
void WriteBufferFromS3::completeMultipartUpload()
|
||||
{
|
||||
if (is_multipart)
|
||||
LOG_DEBUG(log, "Completing multipart upload. Bucket: {}, Key: {}, Upload_id: {}", bucket, key, multipart_upload_id);
|
||||
|
||||
Aws::S3::Model::CompleteMultipartUploadRequest req;
|
||||
req.SetBucket(bucket);
|
||||
req.SetKey(key);
|
||||
req.SetUploadId(multipart_upload_id);
|
||||
|
||||
Aws::S3::Model::CompletedMultipartUpload multipart_upload;
|
||||
for (size_t i = 0; i < part_tags.size(); ++i)
|
||||
{
|
||||
if (part_tags.empty())
|
||||
{
|
||||
LOG_DEBUG(log, "Multipart upload has no data. Aborting it. Bucket: {}, Key: {}, Upload_id: {}", bucket, key, upload_id);
|
||||
|
||||
Aws::S3::Model::AbortMultipartUploadRequest req;
|
||||
req.SetBucket(bucket);
|
||||
req.SetKey(key);
|
||||
req.SetUploadId(upload_id);
|
||||
|
||||
auto outcome = client_ptr->AbortMultipartUpload(req);
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
LOG_DEBUG(log, "Aborting multipart upload completed. Upload_id: {}", upload_id);
|
||||
else
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
LOG_DEBUG(log, "Completing multipart upload. Bucket: {}, Key: {}, Upload_id: {}", bucket, key, upload_id);
|
||||
|
||||
Aws::S3::Model::CompleteMultipartUploadRequest req;
|
||||
req.SetBucket(bucket);
|
||||
req.SetKey(key);
|
||||
req.SetUploadId(upload_id);
|
||||
|
||||
Aws::S3::Model::CompletedMultipartUpload multipart_upload;
|
||||
for (size_t i = 0; i < part_tags.size(); ++i)
|
||||
{
|
||||
Aws::S3::Model::CompletedPart part;
|
||||
multipart_upload.AddParts(part.WithETag(part_tags[i]).WithPartNumber(i + 1));
|
||||
}
|
||||
|
||||
req.SetMultipartUpload(multipart_upload);
|
||||
|
||||
auto outcome = client_ptr->CompleteMultipartUpload(req);
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
LOG_DEBUG(log, "Multipart upload completed. Upload_id: {}", upload_id);
|
||||
else
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
Aws::S3::Model::CompletedPart part;
|
||||
multipart_upload.AddParts(part.WithETag(part_tags[i]).WithPartNumber(i + 1));
|
||||
}
|
||||
|
||||
req.SetMultipartUpload(multipart_upload);
|
||||
|
||||
auto outcome = client_ptr->CompleteMultipartUpload(req);
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
LOG_DEBUG(log, "Multipart upload has completed. Upload_id: {}", multipart_upload_id);
|
||||
else
|
||||
{
|
||||
LOG_DEBUG(log, "Making single part upload. Bucket: {}, Key: {}", bucket, key);
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
}
|
||||
|
||||
Aws::S3::Model::PutObjectRequest req;
|
||||
req.SetBucket(bucket);
|
||||
req.SetKey(key);
|
||||
if (object_metadata.has_value())
|
||||
req.SetMetadata(object_metadata.value());
|
||||
void WriteBufferFromS3::makeSinglepartUpload()
|
||||
{
|
||||
if (temporary_buffer->tellp() <= 0)
|
||||
return;
|
||||
|
||||
/// This could be improved using an adapter to WriteBuffer.
|
||||
const std::shared_ptr<Aws::IOStream> input_data = Aws::MakeShared<Aws::StringStream>("temporary buffer", temporary_buffer->str());
|
||||
temporary_buffer = std::make_unique<WriteBufferFromOwnString>();
|
||||
req.SetBody(input_data);
|
||||
LOG_DEBUG(log, "Making single part upload. Bucket: {}, Key: {}", bucket, key);
|
||||
|
||||
auto outcome = client_ptr->PutObject(req);
|
||||
Aws::S3::Model::PutObjectRequest req;
|
||||
req.SetBucket(bucket);
|
||||
req.SetKey(key);
|
||||
req.SetContentLength(temporary_buffer->tellp());
|
||||
req.SetBody(temporary_buffer);
|
||||
if (object_metadata.has_value())
|
||||
req.SetMetadata(object_metadata.value());
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
LOG_DEBUG(log, "Single part upload has completed. Bucket: {}, Key: {}", bucket, key);
|
||||
else
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
}
|
||||
auto outcome = client_ptr->PutObject(req);
|
||||
|
||||
if (outcome.IsSuccess())
|
||||
LOG_DEBUG(log, "Single part upload has completed. Bucket: {}, Key: {}", bucket, key);
|
||||
else
|
||||
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -6,11 +6,13 @@
|
||||
|
||||
# include <memory>
|
||||
# include <vector>
|
||||
# include <common/logger_useful.h>
|
||||
# include <common/types.h>
|
||||
|
||||
# include <IO/BufferWithOwnMemory.h>
|
||||
# include <IO/HTTPCommon.h>
|
||||
# include <IO/WriteBuffer.h>
|
||||
# include <IO/WriteBufferFromString.h>
|
||||
|
||||
# include <aws/core/utils/memory/stl/AWSStringStream.h>
|
||||
|
||||
namespace Aws::S3
|
||||
{
|
||||
@ -19,24 +21,30 @@ class S3Client;
|
||||
|
||||
namespace DB
|
||||
{
|
||||
/* Perform S3 HTTP PUT request.
|
||||
|
||||
/**
* Buffer to write data to an S3 object with the specified bucket and key.
* If the data size written to the buffer is less than 'max_single_part_upload_size', the write is performed using a singlepart upload.
* Otherwise a multipart upload is used:
* data is divided into chunks larger than 'minimum_upload_part_size' (the last chunk can be smaller than this threshold),
* and each chunk is written as a part to S3.
*/
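A hedged sketch of the upload policy the comment above describes: bytes accumulate in a buffer, a multipart upload is created only once the accumulated size crosses the single-part threshold, and at finalize the remainder goes out either as one PutObject or as the last part. All names and thresholds below are illustrative, not the actual WriteBufferFromS3 members:

#include <cstddef>
#include <iostream>
#include <string>

// Stubs standing in for the AWS SDK requests issued by the real buffer.
static void putObject(size_t n)        { std::cout << "PutObject, " << n << " bytes\n"; }
static void createMultipartUpload()    { std::cout << "CreateMultipartUpload\n"; }
static void uploadPart(size_t n)       { std::cout << "UploadPart, " << n << " bytes\n"; }
static void completeMultipartUpload()  { std::cout << "CompleteMultipartUpload\n"; }

struct UploadSketch
{
    size_t min_upload_part_size = 16;          // tiny thresholds so the example is visible
    size_t max_single_part_upload_size = 64;
    std::string buffer;                        // accumulated, not yet uploaded bytes
    bool multipart_started = false;

    void write(const std::string & chunk)
    {
        buffer += chunk;
        if (!multipart_started && buffer.size() > max_single_part_upload_size)
        {
            createMultipartUpload();           // switch to multipart only once the data got big
            multipart_started = true;
        }
        if (multipart_started && buffer.size() > min_upload_part_size)
        {
            uploadPart(buffer.size());         // flush a full part
            buffer.clear();
        }
    }

    void finalize()
    {
        if (!multipart_started)
            putObject(buffer.size());          // small object: a single PUT is enough
        else
        {
            if (!buffer.empty())
                uploadPart(buffer.size());     // remaining data becomes the last part
            completeMultipartUpload();
        }
    }
};

int main()
{
    UploadSketch small, big;
    small.write(std::string(10, 'x'));  small.finalize();   // one PutObject
    big.write(std::string(100, 'x'));   big.finalize();     // multipart: parts + complete
}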
class WriteBufferFromS3 : public BufferWithOwnMemory<WriteBuffer>
|
||||
{
|
||||
private:
|
||||
bool is_multipart;
|
||||
|
||||
String bucket;
|
||||
String key;
|
||||
std::optional<std::map<String, String>> object_metadata;
|
||||
std::shared_ptr<Aws::S3::S3Client> client_ptr;
|
||||
size_t minimum_upload_part_size;
|
||||
std::unique_ptr<WriteBufferFromOwnString> temporary_buffer;
|
||||
size_t max_single_part_upload_size;
|
||||
/// Buffer to accumulate data.
|
||||
std::shared_ptr<Aws::StringStream> temporary_buffer;
|
||||
size_t last_part_size;
|
||||
|
||||
/// Upload in S3 is made in parts.
|
||||
/// We initiate upload, then upload each part and get ETag as a response, and then finish upload with listing all our parts.
|
||||
String upload_id;
|
||||
String multipart_upload_id;
|
||||
std::vector<String> part_tags;
|
||||
|
||||
Poco::Logger * log = &Poco::Logger::get("WriteBufferFromS3");
|
||||
@ -47,7 +55,7 @@ public:
|
||||
const String & bucket_,
|
||||
const String & key_,
|
||||
size_t minimum_upload_part_size_,
|
||||
bool is_multipart,
|
||||
size_t max_single_part_upload_size_,
|
||||
std::optional<std::map<String, String>> object_metadata_ = std::nullopt,
|
||||
size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE);
|
||||
|
||||
@ -61,9 +69,11 @@ public:
|
||||
private:
|
||||
bool finalized = false;
|
||||
|
||||
void initiate();
|
||||
void writePart(const String & data);
|
||||
void complete();
|
||||
void createMultipartUpload();
|
||||
void writePart();
|
||||
void completeMultipartUpload();
|
||||
|
||||
void makeSinglepartUpload();
|
||||
|
||||
void finalizeImpl();
|
||||
};
|
||||
|
@ -249,6 +249,19 @@ ReturnType readFloatTextPreciseImpl(T & x, ReadBuffer & buf)
}


// credit: https://johnnylee-sde.github.io/Fast-numeric-string-to-int/
static inline bool is_made_of_eight_digits_fast(uint64_t val) noexcept
{
return (((val & 0xF0F0F0F0F0F0F0F0) | (((val + 0x0606060606060606) & 0xF0F0F0F0F0F0F0F0) >> 4)) == 0x3333333333333333);
}

static inline bool is_made_of_eight_digits_fast(const char * chars) noexcept
{
uint64_t val;
::memcpy(&val, chars, 8);
return is_made_of_eight_digits_fast(val);
}
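A brief aside on the SWAR trick above: an ASCII digit byte has high nibble 0x3 and keeps high nibble 0x3 after adding 6 (i.e. it is at most '9'); the expression packs those two per-byte conditions into the two nibbles of each result byte, so a word made of eight digit bytes compares equal to 0x3333333333333333. A small illustrative test (not part of the patch):

#include <cassert>
#include <cstdint>
#include <cstring>

// Same SWAR predicate as above, copied here so the example stands alone.
static bool eight_digits(const char * chars)
{
    uint64_t val;
    std::memcpy(&val, chars, 8);
    return (((val & 0xF0F0F0F0F0F0F0F0) | (((val + 0x0606060606060606) & 0xF0F0F0F0F0F0F0F0) >> 4))
            == 0x3333333333333333);
}

int main()
{
    assert(eight_digits("12345678"));      // all eight bytes are ASCII digits
    assert(!eight_digits("1234a678"));     // 'a' breaks the high-nibble test
    assert(!eight_digits("1234567:"));     // ':' is 0x3A, rejected by the +6 test
}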
template <size_t N, typename T>
|
||||
static inline void readUIntTextUpToNSignificantDigits(T & x, ReadBuffer & buf)
|
||||
{
|
||||
@ -266,9 +279,6 @@ static inline void readUIntTextUpToNSignificantDigits(T & x, ReadBuffer & buf)
|
||||
else
|
||||
return;
|
||||
}
|
||||
|
||||
while (!buf.eof() && isNumericASCII(*buf.position()))
|
||||
++buf.position();
|
||||
}
|
||||
else
|
||||
{
|
||||
@ -283,10 +293,16 @@ static inline void readUIntTextUpToNSignificantDigits(T & x, ReadBuffer & buf)
|
||||
else
|
||||
return;
|
||||
}
|
||||
|
||||
while (!buf.eof() && isNumericASCII(*buf.position()))
|
||||
++buf.position();
|
||||
}
|
||||
|
||||
while (!buf.eof() && (buf.position() + 8 <= buf.buffer().end()) &&
|
||||
is_made_of_eight_digits_fast(buf.position()))
|
||||
{
|
||||
buf.position() += 8;
|
||||
}
|
||||
|
||||
while (!buf.eof() && isNumericASCII(*buf.position()))
|
||||
++buf.position();
|
||||
}
|
||||
|
||||
|
||||
@ -319,7 +335,6 @@ ReturnType readFloatTextFastImpl(T & x, ReadBuffer & in)
|
||||
++in.position();
|
||||
}
|
||||
|
||||
|
||||
auto count_after_sign = in.count();
|
||||
|
||||
constexpr int significant_digits = std::numeric_limits<UInt64>::digits10;
|
||||
|
@ -4,7 +4,7 @@
|
||||
#include <Storages/IStorage.h>
|
||||
#include <Databases/IDatabase.h>
|
||||
#include <Databases/DatabaseMemory.h>
|
||||
#include <Databases/DatabaseAtomic.h>
|
||||
#include <Databases/DatabaseOnDisk.h>
|
||||
#include <Poco/File.h>
|
||||
#include <Common/quoteString.h>
|
||||
#include <Storages/StorageMemory.h>
|
||||
@ -16,6 +16,15 @@
|
||||
#include <Common/CurrentMetrics.h>
|
||||
#include <common/logger_useful.h>
|
||||
|
||||
#if !defined(ARCADIA_BUILD)
|
||||
# include "config_core.h"
|
||||
#endif
|
||||
|
||||
#if USE_MYSQL
|
||||
# include <Databases/MySQL/MaterializeMySQLSyncThread.h>
|
||||
# include <Storages/StorageMaterializeMySQL.h>
|
||||
#endif
|
||||
|
||||
#include <filesystem>
|
||||
|
||||
|
||||
@ -217,6 +226,15 @@ DatabaseAndTable DatabaseCatalog::getTableImpl(
|
||||
exception->emplace("Table " + table_id.getNameForLogs() + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
|
||||
return {};
|
||||
}
|
||||
|
||||
#if USE_MYSQL
|
||||
/// It's definitely not the best place for this logic, but behaviour must be consistent with DatabaseMaterializeMySQL::tryGetTable(...)
|
||||
if (db_and_table.first->getEngineName() == "MaterializeMySQL")
|
||||
{
|
||||
if (!MaterializeMySQLSyncThread::isMySQLSyncThread())
|
||||
db_and_table.second = std::make_shared<StorageMaterializeMySQL>(std::move(db_and_table.second), db_and_table.first.get());
|
||||
}
|
||||
#endif
|
||||
return db_and_table;
|
||||
}
|
||||
|
||||
@ -286,7 +304,6 @@ void DatabaseCatalog::attachDatabase(const String & database_name, const Databas
|
||||
assertDatabaseDoesntExistUnlocked(database_name);
|
||||
databases.emplace(database_name, database);
|
||||
UUID db_uuid = database->getUUID();
|
||||
assert((db_uuid != UUIDHelpers::Nil) ^ (dynamic_cast<DatabaseAtomic *>(database.get()) == nullptr));
|
||||
if (db_uuid != UUIDHelpers::Nil)
|
||||
db_uuid_map.emplace(db_uuid, database);
|
||||
}
|
||||
@ -313,9 +330,8 @@ DatabasePtr DatabaseCatalog::detachDatabase(const String & database_name, bool d
|
||||
if (!db->empty())
|
||||
throw Exception("New table appeared in database being dropped or detached. Try again.",
|
||||
ErrorCodes::DATABASE_NOT_EMPTY);
|
||||
auto * database_atomic = typeid_cast<DatabaseAtomic *>(db.get());
|
||||
if (!drop && database_atomic)
|
||||
database_atomic->assertCanBeDetached(false);
|
||||
if (!drop)
|
||||
db->assertCanBeDetached(false);
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
|
@ -157,6 +157,35 @@ BlockIO InterpreterCreateQuery::createDatabase(ASTCreateQuery & create)
|
||||
if (!create.attach && fs::exists(metadata_path))
|
||||
throw Exception(ErrorCodes::DATABASE_ALREADY_EXISTS, "Metadata directory {} already exists", metadata_path.string());
|
||||
}
|
||||
else if (create.storage->engine->name == "MaterializeMySQL")
|
||||
{
|
||||
/// It creates nested database with Ordinary or Atomic engine depending on UUID in query and default engine setting.
|
||||
/// Do nothing if it's an internal ATTACH on server startup or short-syntax ATTACH query from user,
|
||||
/// because we got correct query from the metadata file in this case.
|
||||
/// If we got query from user, then normalize it first.
|
||||
bool attach_from_user = create.attach && !internal && !create.attach_short_syntax;
|
||||
bool create_from_user = !create.attach;
|
||||
|
||||
if (create_from_user)
|
||||
{
|
||||
const auto & default_engine = context.getSettingsRef().default_database_engine.value;
|
||||
if (create.uuid == UUIDHelpers::Nil && default_engine == DefaultDatabaseEngine::Atomic)
|
||||
create.uuid = UUIDHelpers::generateV4(); /// Will enable Atomic engine for nested database
|
||||
}
|
||||
else if (attach_from_user && create.uuid == UUIDHelpers::Nil)
|
||||
{
|
||||
/// Ambiguity is possible: should we attach nested database as Ordinary
|
||||
/// or throw "UUID must be specified" for Atomic? So we suggest short syntax for Ordinary.
|
||||
throw Exception("Use short attach syntax ('ATTACH DATABASE name;' without engine) to attach existing database "
|
||||
"or specify UUID to attach new database with Atomic engine", ErrorCodes::INCORRECT_QUERY);
|
||||
}
|
||||
|
||||
/// Set metadata path according to nested engine
|
||||
if (create.uuid == UUIDHelpers::Nil)
|
||||
metadata_path = metadata_path / "metadata" / database_name_escaped;
|
||||
else
|
||||
metadata_path = metadata_path / "store" / DatabaseCatalog::getPathForUUID(create.uuid);
|
||||
}
|
||||
else
|
||||
{
|
||||
bool is_on_cluster = context.getClientInfo().query_kind == ClientInfo::QueryKind::SECONDARY_QUERY;
|
||||
@ -655,7 +684,7 @@ void InterpreterCreateQuery::assertOrSetUUID(ASTCreateQuery & create, const Data
|
||||
|
||||
bool from_path = create.attach_from_path.has_value();
|
||||
|
||||
if (database->getEngineName() == "Atomic")
|
||||
if (database->getUUID() != UUIDHelpers::Nil)
|
||||
{
|
||||
if (create.attach && !from_path && create.uuid == UUIDHelpers::Nil)
|
||||
{
|
||||
|
@ -11,7 +11,14 @@
|
||||
#include <Common/escapeForFileName.h>
|
||||
#include <Common/quoteString.h>
|
||||
#include <Common/typeid_cast.h>
|
||||
#include <Databases/DatabaseAtomic.h>
|
||||
|
||||
#if !defined(ARCADIA_BUILD)
|
||||
# include "config_core.h"
|
||||
#endif
|
||||
|
||||
#if USE_MYSQL
|
||||
# include <Databases/MySQL/DatabaseMaterializeMySQL.h>
|
||||
#endif
|
||||
|
||||
|
||||
namespace DB
|
||||
@ -66,10 +73,7 @@ void InterpreterDropQuery::waitForTableToBeActuallyDroppedOrDetached(const ASTDr
|
||||
if (query.kind == ASTDropQuery::Kind::Drop)
|
||||
DatabaseCatalog::instance().waitTableFinallyDropped(uuid_to_wait);
|
||||
else if (query.kind == ASTDropQuery::Kind::Detach)
|
||||
{
|
||||
if (auto * atomic = typeid_cast<DatabaseAtomic *>(db.get()))
|
||||
atomic->waitDetachedTableNotInUse(uuid_to_wait);
|
||||
}
|
||||
db->waitDetachedTableNotInUse(uuid_to_wait);
|
||||
}
|
||||
|
||||
BlockIO InterpreterDropQuery::executeToTable(const ASTDropQuery & query)
|
||||
@ -122,7 +126,7 @@ BlockIO InterpreterDropQuery::executeToTableImpl(const ASTDropQuery & query, Dat
|
||||
table->checkTableCanBeDetached();
|
||||
table->shutdown();
|
||||
TableExclusiveLockHolder table_lock;
|
||||
if (database->getEngineName() != "Atomic")
|
||||
if (database->getUUID() == UUIDHelpers::Nil)
|
||||
table_lock = table->lockExclusively(context.getCurrentQueryId(), context.getSettingsRef().lock_acquire_timeout);
|
||||
/// Drop table from memory, don't touch data and metadata
|
||||
database->detachTable(table_id.table_name);
|
||||
@ -145,7 +149,7 @@ BlockIO InterpreterDropQuery::executeToTableImpl(const ASTDropQuery & query, Dat
|
||||
table->shutdown();
|
||||
|
||||
TableExclusiveLockHolder table_lock;
|
||||
if (database->getEngineName() != "Atomic")
|
||||
if (database->getUUID() == UUIDHelpers::Nil)
|
||||
table_lock = table->lockExclusively(context.getCurrentQueryId(), context.getSettingsRef().lock_acquire_timeout);
|
||||
|
||||
database->dropTable(context, table_id.table_name, query.no_delay);
|
||||
@ -282,6 +286,11 @@ BlockIO InterpreterDropQuery::executeToDatabaseImpl(const ASTDropQuery & query,
|
||||
bool drop = query.kind == ASTDropQuery::Kind::Drop;
|
||||
context.checkAccess(AccessType::DROP_DATABASE, database_name);
|
||||
|
||||
#if USE_MYSQL
|
||||
if (database->getEngineName() == "MaterializeMySQL")
|
||||
stopDatabaseSynchronization(database);
|
||||
#endif
|
||||
|
||||
if (database->shouldBeEmptyOnDetach())
|
||||
{
|
||||
/// DETACH or DROP all tables and dictionaries inside database.
|
||||
@ -312,9 +321,8 @@ BlockIO InterpreterDropQuery::executeToDatabaseImpl(const ASTDropQuery & query,
|
||||
/// Protects from concurrent CREATE TABLE queries
|
||||
auto db_guard = DatabaseCatalog::instance().getExclusiveDDLGuardForDatabase(database_name);
|
||||
|
||||
auto * database_atomic = typeid_cast<DatabaseAtomic *>(database.get());
|
||||
if (!drop && database_atomic)
|
||||
database_atomic->assertCanBeDetached(true);
|
||||
if (!drop)
|
||||
database->assertCanBeDetached(true);
|
||||
|
||||
/// DETACH or DROP database itself
|
||||
DatabaseCatalog::instance().detachDatabase(database_name, drop, database->shouldBeEmptyOnDetach());
|
||||
|
@ -1552,7 +1552,12 @@ void InterpreterSelectQuery::executeFetchColumns(
|
||||
throw Exception("Logical error in InterpreterSelectQuery: nowhere to read", ErrorCodes::LOGICAL_ERROR);
|
||||
|
||||
/// Specify the number of threads only if it wasn't specified in storage.
|
||||
if (!query_plan.getMaxThreads())
|
||||
///
|
||||
/// But in case of remote query and prefer_localhost_replica=1 (default)
|
||||
/// The inner local query (that is done in the same process, without
|
||||
/// network interaction), it will setMaxThreads earlier and distributed
|
||||
/// query will not update it.
|
||||
if (!query_plan.getMaxThreads() || is_remote)
|
||||
query_plan.setMaxThreads(max_threads_execute_query);
|
||||
|
||||
/// Aliases in table declaration.
|
||||
|
@ -128,24 +128,24 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
|
||||
const ASTPtr & query_ptr_, const Context & context_, const SelectQueryOptions & options_, const Names & required_result_column_names)
|
||||
: IInterpreterUnionOrSelectQuery(query_ptr_, context_, options_)
|
||||
{
|
||||
auto & ast = query_ptr->as<ASTSelectWithUnionQuery &>();
|
||||
ASTSelectWithUnionQuery * ast = query_ptr->as<ASTSelectWithUnionQuery>();
|
||||
|
||||
/// Normalize AST Tree
|
||||
if (!ast.is_normalized)
|
||||
if (!ast->is_normalized)
|
||||
{
|
||||
CustomizeASTSelectWithUnionQueryNormalizeVisitor::Data union_default_mode{context->getSettingsRef().union_default_mode};
|
||||
CustomizeASTSelectWithUnionQueryNormalizeVisitor(union_default_mode).visit(query_ptr);
|
||||
|
||||
/// After normalization, if it only has one ASTSelectWithUnionQuery child,
|
||||
/// we can lift it up, this can reduce one unnecessary recursion later.
|
||||
if (ast.list_of_selects->children.size() == 1 && ast.list_of_selects->children.at(0)->as<ASTSelectWithUnionQuery>())
|
||||
if (ast->list_of_selects->children.size() == 1 && ast->list_of_selects->children.at(0)->as<ASTSelectWithUnionQuery>())
|
||||
{
|
||||
query_ptr = std::move(ast.list_of_selects->children.at(0));
|
||||
ast = query_ptr->as<ASTSelectWithUnionQuery &>();
|
||||
query_ptr = std::move(ast->list_of_selects->children.at(0));
|
||||
ast = query_ptr->as<ASTSelectWithUnionQuery>();
|
||||
}
|
||||
}
|
||||
|
||||
size_t num_children = ast.list_of_selects->children.size();
|
||||
size_t num_children = ast->list_of_selects->children.size();
|
||||
if (!num_children)
|
||||
throw Exception("Logical error: no children in ASTSelectWithUnionQuery", ErrorCodes::LOGICAL_ERROR);
|
||||
|
||||
@ -161,7 +161,7 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
|
||||
/// Result header if there are no filtering by 'required_result_column_names'.
|
||||
/// We use it to determine positions of 'required_result_column_names' in SELECT clause.
|
||||
|
||||
Block full_result_header = getCurrentChildResultHeader(ast.list_of_selects->children.at(0), required_result_column_names);
|
||||
Block full_result_header = getCurrentChildResultHeader(ast->list_of_selects->children.at(0), required_result_column_names);
|
||||
|
||||
std::vector<size_t> positions_of_required_result_columns(required_result_column_names.size());
|
||||
|
||||
@ -171,7 +171,7 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
|
||||
for (size_t query_num = 1; query_num < num_children; ++query_num)
|
||||
{
|
||||
Block full_result_header_for_current_select
|
||||
= getCurrentChildResultHeader(ast.list_of_selects->children.at(query_num), required_result_column_names);
|
||||
= getCurrentChildResultHeader(ast->list_of_selects->children.at(query_num), required_result_column_names);
|
||||
|
||||
if (full_result_header_for_current_select.columns() != full_result_header.columns())
|
||||
throw Exception("Different number of columns in UNION ALL elements:\n"
|
||||
@ -192,7 +192,7 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
|
||||
= query_num == 0 ? required_result_column_names : required_result_column_names_for_other_selects[query_num];
|
||||
|
||||
nested_interpreters.emplace_back(
|
||||
buildCurrentChildInterpreter(ast.list_of_selects->children.at(query_num), current_required_result_column_names));
|
||||
buildCurrentChildInterpreter(ast->list_of_selects->children.at(query_num), current_required_result_column_names));
|
||||
}
|
||||
|
||||
/// Determine structure of the result.
|
||||
|
@ -180,12 +180,16 @@ class MergeJoinCursor
public:
MergeJoinCursor(const Block & block, const SortDescription & desc_)
: impl(SortCursorImpl(block, desc_))
{}
{
/// SortCursorImpl can work with permutation, but MergeJoinCursor can't.
if (impl.permutation)
throw Exception("Logical error: MergeJoinCursor doesn't support permutation", ErrorCodes::LOGICAL_ERROR);
}

size_t position() const { return impl.pos; }
size_t position() const { return impl.getRow(); }
size_t end() const { return impl.rows; }
bool atEnd() const { return impl.pos >= impl.rows; }
void nextN(size_t num) { impl.pos += num; }
bool atEnd() const { return impl.getRow() >= impl.rows; }
void nextN(size_t num) { impl.getPosRef() += num; }

void setCompareNullability(const MergeJoinCursor & rhs)
{
@ -254,10 +258,10 @@ private:
else if (cmp > 0)
rhs.impl.next();
else if (!cmp)
return Range{impl.pos, rhs.impl.pos, getEqualLength(), rhs.getEqualLength()};
return Range{impl.getRow(), rhs.impl.getRow(), getEqualLength(), rhs.getEqualLength()};
}

return Range{impl.pos, rhs.impl.pos, 0, 0};
return Range{impl.getRow(), rhs.impl.getRow(), 0, 0};
}

template <bool left_nulls, bool right_nulls>
@ -268,7 +272,7 @@ private:
const auto * left_column = impl.sort_columns[i];
const auto * right_column = rhs.impl.sort_columns[i];

int res = nullableCompareAt<left_nulls, right_nulls>(*left_column, *right_column, impl.pos, rhs.impl.pos);
int res = nullableCompareAt<left_nulls, right_nulls>(*left_column, *right_column, impl.getRow(), rhs.impl.getRow());
if (res)
return res;
}
@ -278,11 +282,11 @@ private:
/// Expects !atEnd()
size_t getEqualLength()
{
size_t pos = impl.pos + 1;
size_t pos = impl.getRow() + 1;
for (; pos < impl.rows; ++pos)
if (!samePrev(pos))
break;
return pos - impl.pos;
return pos - impl.getRow();
}

/// Expects lhs_pos > 0
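The `getEqualLength()` change above only swaps direct `pos` access for `getRow()`; the logic itself counts how many consecutive rows at the cursor share the same sort key. A standalone sketch of that counting idea over a plain sorted vector (illustrative names, not the real `SortCursorImpl` API):

#include <cstddef>
#include <vector>

/// Length of the run of equal values starting at `row` in a sorted column.
/// Mirrors the shape of MergeJoinCursor::getEqualLength(): scan forward while
/// the next value still equals the previous one.
template <typename T>
size_t equalRunLength(const std::vector<T> & sorted, size_t row)
{
    size_t pos = row + 1;
    for (; pos < sorted.size(); ++pos)
        if (sorted[pos] != sorted[pos - 1])
            break;
    return pos - row;
}

/// Example: equalRunLength(std::vector<int>{1, 1, 1, 2, 3}, 0) == 3.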
@ -449,12 +449,22 @@ bool ParserCreateTableQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expe
if (!s_rparen.ignore(pos, expected))
return false;

if (!storage_p.parse(pos, storage, expected) && !is_temporary)
auto storage_parse_result = storage_p.parse(pos, storage, expected);

if (storage_parse_result && s_as.ignore(pos, expected))
{
if (!select_p.parse(pos, select, expected))
return false;
}

if (!storage_parse_result && !is_temporary)
{
if (!s_as.ignore(pos, expected))
return false;
if (!table_function_p.parse(pos, as_table_function, expected))
{
return false;
}
}
}
else
@ -257,12 +257,12 @@ void AggregatingSortedAlgorithm::AggregatingMergedData::addRow(SortCursor & curs
|
||||
throw Exception("Can't add a row to the group because it was not started.", ErrorCodes::LOGICAL_ERROR);
|
||||
|
||||
for (auto & desc : def.columns_to_aggregate)
|
||||
desc.column->insertMergeFrom(*cursor->all_columns[desc.column_number], cursor->pos);
|
||||
desc.column->insertMergeFrom(*cursor->all_columns[desc.column_number], cursor->getRow());
|
||||
|
||||
for (auto & desc : def.columns_to_simple_aggregate)
|
||||
{
|
||||
auto & col = cursor->all_columns[desc.column_number];
|
||||
desc.add_function(desc.function.get(), desc.state.data(), &col, cursor->pos, arena.get());
|
||||
desc.add_function(desc.function.get(), desc.state.data(), &col, cursor->getRow(), arena.get());
|
||||
}
|
||||
}
|
||||
|
||||
@ -352,7 +352,7 @@ IMergingAlgorithm::Status AggregatingSortedAlgorithm::merge()
|
||||
return Status(merged_data.pull());
|
||||
}
|
||||
|
||||
merged_data.startGroup(current->all_columns, current->pos);
|
||||
merged_data.startGroup(current->all_columns, current->getRow());
|
||||
}
|
||||
|
||||
merged_data.addRow(current);
|
||||
|
@ -26,9 +26,9 @@ CollapsingSortedAlgorithm::CollapsingSortedAlgorithm(
|
||||
const String & sign_column,
|
||||
bool only_positive_sign_,
|
||||
size_t max_block_size,
|
||||
Poco::Logger * log_,
|
||||
WriteBuffer * out_row_sources_buf_,
|
||||
bool use_average_block_sizes,
|
||||
Poco::Logger * log_)
|
||||
bool use_average_block_sizes)
|
||||
: IMergingAlgorithmWithSharedChunks(num_inputs, std::move(description_), out_row_sources_buf_, max_row_refs)
|
||||
, merged_data(header.cloneEmptyColumns(), use_average_block_sizes, max_block_size)
|
||||
, sign_column_number(header.getPositionByName(sign_column))
|
||||
@ -123,7 +123,7 @@ IMergingAlgorithm::Status CollapsingSortedAlgorithm::merge()
|
||||
return Status(current.impl->order);
|
||||
}
|
||||
|
||||
Int8 sign = assert_cast<const ColumnInt8 &>(*current->all_columns[sign_column_number]).getData()[current->pos];
|
||||
Int8 sign = assert_cast<const ColumnInt8 &>(*current->all_columns[sign_column_number]).getData()[current->getRow()];
|
||||
|
||||
RowRef current_row;
|
||||
setRowRef(current_row, current);
|
||||
|
@ -33,9 +33,9 @@ public:
|
||||
const String & sign_column,
|
||||
bool only_positive_sign_, /// For select final. Skip rows with sum(sign) < 0.
|
||||
size_t max_block_size,
|
||||
WriteBuffer * out_row_sources_buf_,
|
||||
bool use_average_block_sizes,
|
||||
Poco::Logger * log_);
|
||||
Poco::Logger * log_,
|
||||
WriteBuffer * out_row_sources_buf_ = nullptr,
|
||||
bool use_average_block_sizes = false);
|
||||
|
||||
Status merge() override;
|
||||
|
||||
|
@ -164,12 +164,12 @@ IMergingAlgorithm::Status GraphiteRollupSortedAlgorithm::merge()
|
||||
return Status(current.impl->order);
|
||||
}
|
||||
|
||||
StringRef next_path = current->all_columns[columns_definition.path_column_num]->getDataAt(current->pos);
|
||||
StringRef next_path = current->all_columns[columns_definition.path_column_num]->getDataAt(current->getRow());
|
||||
bool new_path = is_first || next_path != current_group_path;
|
||||
|
||||
is_first = false;
|
||||
|
||||
time_t next_row_time = current->all_columns[columns_definition.time_column_num]->getUInt(current->pos);
|
||||
time_t next_row_time = current->all_columns[columns_definition.time_column_num]->getUInt(current->getRow());
|
||||
/// Is new key before rounding.
|
||||
bool is_new_key = new_path || next_row_time != current_time;
|
||||
|
||||
@ -227,7 +227,7 @@ IMergingAlgorithm::Status GraphiteRollupSortedAlgorithm::merge()
|
||||
/// and for rows with same maximum version - only last row.
|
||||
if (is_new_key
|
||||
|| current->all_columns[columns_definition.version_column_num]->compareAt(
|
||||
current->pos, current_subgroup_newest_row.row_num,
|
||||
current->getRow(), current_subgroup_newest_row.row_num,
|
||||
*(*current_subgroup_newest_row.all_columns)[columns_definition.version_column_num],
|
||||
/* nan_direction_hint = */ 1) >= 0)
|
||||
{
|
||||
@ -263,7 +263,7 @@ IMergingAlgorithm::Status GraphiteRollupSortedAlgorithm::merge()
|
||||
|
||||
void GraphiteRollupSortedAlgorithm::startNextGroup(SortCursor & cursor, Graphite::RollupRule next_rule)
|
||||
{
|
||||
merged_data.startNextGroup(cursor->all_columns, cursor->pos, next_rule, columns_definition);
|
||||
merged_data.startNextGroup(cursor->all_columns, cursor->getRow(), next_rule, columns_definition);
|
||||
}
|
||||
|
||||
void GraphiteRollupSortedAlgorithm::finishCurrentGroup()
|
||||
|
@ -29,6 +29,8 @@ public:
|
||||
/// between different algorithm objects in parallel FINAL.
|
||||
bool skip_last_row = false;
|
||||
|
||||
IColumn::Permutation * permutation = nullptr;
|
||||
|
||||
void swap(Input & other)
|
||||
{
|
||||
chunk.swap(other.chunk);
|
||||
|
@ -22,7 +22,7 @@ void IMergingAlgorithmWithDelayedChunk::initializeQueue(Inputs inputs)
|
||||
if (!current_inputs[source_num].chunk)
|
||||
continue;
|
||||
|
||||
cursors[source_num] = SortCursorImpl(current_inputs[source_num].chunk.getColumns(), description, source_num);
|
||||
cursors[source_num] = SortCursorImpl(current_inputs[source_num].chunk.getColumns(), description, source_num, current_inputs[source_num].permutation);
|
||||
}
|
||||
|
||||
queue = SortingHeap<SortCursor>(cursors);
|
||||
@ -37,7 +37,7 @@ void IMergingAlgorithmWithDelayedChunk::updateCursor(Input & input, size_t sourc
|
||||
last_chunk_sort_columns = std::move(cursors[source_num].sort_columns);
|
||||
|
||||
current_input.swap(input);
|
||||
cursors[source_num].reset(current_input.chunk.getColumns(), {});
|
||||
cursors[source_num].reset(current_input.chunk.getColumns(), {}, current_input.permutation);
|
||||
|
||||
queue.push(cursors[source_num]);
|
||||
}
|
||||
|
@ -39,7 +39,7 @@ void IMergingAlgorithmWithSharedChunks::initialize(Inputs inputs)
|
||||
|
||||
source.skip_last_row = inputs[source_num].skip_last_row;
|
||||
source.chunk = chunk_allocator.alloc(inputs[source_num].chunk);
|
||||
cursors[source_num] = SortCursorImpl(source.chunk->getColumns(), description, source_num);
|
||||
cursors[source_num] = SortCursorImpl(source.chunk->getColumns(), description, source_num, inputs[source_num].permutation);
|
||||
|
||||
source.chunk->all_columns = cursors[source_num].all_columns;
|
||||
source.chunk->sort_columns = cursors[source_num].sort_columns;
|
||||
@ -55,7 +55,7 @@ void IMergingAlgorithmWithSharedChunks::consume(Input & input, size_t source_num
|
||||
auto & source = sources[source_num];
|
||||
source.skip_last_row = input.skip_last_row;
|
||||
source.chunk = chunk_allocator.alloc(input.chunk);
|
||||
cursors[source_num].reset(source.chunk->getColumns(), {});
|
||||
cursors[source_num].reset(source.chunk->getColumns(), {}, input.permutation);
|
||||
|
||||
source.chunk->all_columns = cursors[source_num].all_columns;
|
||||
source.chunk->sort_columns = cursors[source_num].sort_columns;
|
||||
|
@ -139,7 +139,7 @@ IMergingAlgorithm::Status MergingSortedAlgorithm::mergeImpl(TSortingHeap & queue
|
||||
|
||||
//std::cerr << "total_merged_rows: " << total_merged_rows << ", merged_rows: " << merged_rows << "\n";
|
||||
//std::cerr << "Inserting row\n";
|
||||
merged_data.insertRow(current->all_columns, current->pos, current->rows);
|
||||
merged_data.insertRow(current->all_columns, current->getRow(), current->rows);
|
||||
|
||||
if (out_row_sources_buf)
|
||||
{
|
||||
|
@ -18,9 +18,9 @@ public:
|
||||
size_t num_inputs,
|
||||
SortDescription description_,
|
||||
size_t max_block_size,
|
||||
UInt64 limit_,
|
||||
WriteBuffer * out_row_sources_buf_,
|
||||
bool use_average_block_sizes);
|
||||
UInt64 limit_ = 0,
|
||||
WriteBuffer * out_row_sources_buf_ = nullptr,
|
||||
bool use_average_block_sizes = false);
|
||||
|
||||
void addInput();
|
||||
|
||||
|
@ -73,7 +73,7 @@ IMergingAlgorithm::Status ReplacingSortedAlgorithm::merge()
|
||||
if (version_column_number == -1
|
||||
|| selected_row.empty()
|
||||
|| current->all_columns[version_column_number]->compareAt(
|
||||
current->pos, selected_row.row_num,
|
||||
current->getRow(), selected_row.row_num,
|
||||
*(*selected_row.all_columns)[version_column_number],
|
||||
/* nan_direction_hint = */ 1) >= 0)
|
||||
{
|
||||
|
@ -136,7 +136,7 @@ struct RowRef
|
||||
{
|
||||
sort_columns = cursor.impl->sort_columns.data();
|
||||
num_columns = cursor.impl->sort_columns.size();
|
||||
row_num = cursor.impl->pos;
|
||||
row_num = cursor.impl->getRow();
|
||||
}
|
||||
|
||||
static bool checkEquals(size_t size, const IColumn ** lhs, size_t lhs_row, const IColumn ** rhs, size_t rhs_row)
|
||||
@ -192,7 +192,7 @@ struct RowRefWithOwnedChunk
|
||||
void set(SortCursor & cursor, SharedChunkPtr chunk)
|
||||
{
|
||||
owned_chunk = std::move(chunk);
|
||||
row_num = cursor.impl->pos;
|
||||
row_num = cursor.impl->getRow();
|
||||
all_columns = &owned_chunk->all_columns;
|
||||
sort_columns = &owned_chunk->sort_columns;
|
||||
}
|
||||
|
@ -688,10 +688,10 @@ IMergingAlgorithm::Status SummingSortedAlgorithm::merge()
|
||||
return Status(merged_data.pull());
|
||||
}
|
||||
|
||||
merged_data.startGroup(current->all_columns, current->pos);
|
||||
merged_data.startGroup(current->all_columns, current->getRow());
|
||||
}
|
||||
else
|
||||
merged_data.addRow(current->all_columns, current->pos);
|
||||
merged_data.addRow(current->all_columns, current->getRow());
|
||||
|
||||
if (!current->isLast())
|
||||
{
|
||||
|
@ -73,7 +73,7 @@ IMergingAlgorithm::Status VersionedCollapsingAlgorithm::merge()
|
||||
|
||||
RowRef current_row;
|
||||
|
||||
Int8 sign = assert_cast<const ColumnInt8 &>(*current->all_columns[sign_column_number]).getData()[current->pos];
|
||||
Int8 sign = assert_cast<const ColumnInt8 &>(*current->all_columns[sign_column_number]).getData()[current->getRow()];
|
||||
|
||||
setRowRef(current_row, current);
|
||||
|
||||
|
@ -27,9 +27,9 @@ public:
|
||||
sign_column,
|
||||
only_positive_sign,
|
||||
max_block_size,
|
||||
&Poco::Logger::get("CollapsingSortedTransform"),
|
||||
out_row_sources_buf_,
|
||||
use_average_block_sizes,
|
||||
&Poco::Logger::get("CollapsingSortedTransform"))
|
||||
use_average_block_sizes)
|
||||
{
|
||||
}
|
||||
|
||||
|
@ -100,7 +100,7 @@ Chunk MergeSorter::mergeImpl(TSortingHeap & queue)
|
||||
|
||||
/// Append a row from queue.
|
||||
for (size_t i = 0; i < num_columns; ++i)
|
||||
merged_columns[i]->insertFrom(*current->all_columns[i], current->pos);
|
||||
merged_columns[i]->insertFrom(*current->all_columns[i], current->getRow());
|
||||
|
||||
++total_merged_rows;
|
||||
++merged_rows;
|
||||
|
@ -20,7 +20,7 @@ class ASTStorage;
M(Milliseconds, kafka_poll_timeout_ms, 0, "Timeout for single poll from Kafka.", 0) \
/* default is min(max_block_size, kafka_max_block_size)*/ \
M(UInt64, kafka_poll_max_batch_size, 0, "Maximum amount of messages to be polled in a single Kafka poll.", 0) \
/* default is = min_insert_block_size / kafka_num_consumers */ \
/* default is = max_insert_block_size / kafka_num_consumers */ \
M(UInt64, kafka_max_block_size, 0, "Number of row collected by poll(s) for flushing data from Kafka.", 0) \
/* default is stream_flush_interval_ms */ \
M(Milliseconds, kafka_flush_interval_ms, 0, "Timeout for flushing data from Kafka.", 0) \

@ -22,7 +22,7 @@ void MergeTreeBlockOutputStream::write(const Block & block)
{
Stopwatch watch;

MergeTreeData::MutableDataPartPtr part = storage.writer.writeTempPart(current_block, metadata_snapshot);
MergeTreeData::MutableDataPartPtr part = storage.writer.writeTempPart(current_block, metadata_snapshot, optimize_on_insert);
storage.renameTempPartAndAdd(part, &storage.increment);

PartLog::addNewPart(storage.global_context, part, watch.elapsed());

@ -14,10 +14,11 @@ class StorageMergeTree;
class MergeTreeBlockOutputStream : public IBlockOutputStream
{
public:
MergeTreeBlockOutputStream(StorageMergeTree & storage_, const StorageMetadataPtr metadata_snapshot_, size_t max_parts_per_block_)
MergeTreeBlockOutputStream(StorageMergeTree & storage_, const StorageMetadataPtr metadata_snapshot_, size_t max_parts_per_block_, bool optimize_on_insert_)
: storage(storage_)
, metadata_snapshot(metadata_snapshot_)
, max_parts_per_block(max_parts_per_block_)
, optimize_on_insert(optimize_on_insert_)
{
}

@ -28,6 +29,7 @@ private:
StorageMergeTree & storage;
StorageMetadataPtr metadata_snapshot;
size_t max_parts_per_block;
bool optimize_on_insert;
};

}
@ -16,6 +16,14 @@

#include <Parsers/queryToString.h>

#include <Processors/Merges/Algorithms/ReplacingSortedAlgorithm.h>
#include <Processors/Merges/Algorithms/MergingSortedAlgorithm.h>
#include <Processors/Merges/Algorithms/CollapsingSortedAlgorithm.h>
#include <Processors/Merges/Algorithms/SummingSortedAlgorithm.h>
#include <Processors/Merges/Algorithms/AggregatingSortedAlgorithm.h>
#include <Processors/Merges/Algorithms/VersionedCollapsingAlgorithm.h>
#include <Processors/Merges/Algorithms/GraphiteRollupSortedAlgorithm.h>

namespace ProfileEvents
{
extern const Event MergeTreeDataWriterBlocks;
@ -194,7 +202,74 @@ BlocksWithPartition MergeTreeDataWriter::splitBlockIntoParts(const Block & block
return result;
}

MergeTreeData::MutableDataPartPtr MergeTreeDataWriter::writeTempPart(BlockWithPartition & block_with_partition, const StorageMetadataPtr & metadata_snapshot)
Block MergeTreeDataWriter::mergeBlock(const Block & block, SortDescription sort_description, Names & partition_key_columns, IColumn::Permutation *& permutation)
{
size_t block_size = block.rows();

auto get_merging_algorithm = [&]() -> std::shared_ptr<IMergingAlgorithm>
{
switch (data.merging_params.mode)
{
/// There is nothing to merge in single block in ordinary MergeTree
case MergeTreeData::MergingParams::Ordinary:
return nullptr;
case MergeTreeData::MergingParams::Replacing:
return std::make_shared<ReplacingSortedAlgorithm>(
block, 1, sort_description, data.merging_params.version_column, block_size + 1);
case MergeTreeData::MergingParams::Collapsing:
return std::make_shared<CollapsingSortedAlgorithm>(
block, 1, sort_description, data.merging_params.sign_column,
false, block_size + 1, &Poco::Logger::get("MergeTreeBlockOutputStream"));
case MergeTreeData::MergingParams::Summing:
return std::make_shared<SummingSortedAlgorithm>(
block, 1, sort_description, data.merging_params.columns_to_sum,
partition_key_columns, block_size + 1);
case MergeTreeData::MergingParams::Aggregating:
return std::make_shared<AggregatingSortedAlgorithm>(block, 1, sort_description, block_size + 1);
case MergeTreeData::MergingParams::VersionedCollapsing:
return std::make_shared<VersionedCollapsingAlgorithm>(
block, 1, sort_description, data.merging_params.sign_column, block_size + 1);
case MergeTreeData::MergingParams::Graphite:
return std::make_shared<GraphiteRollupSortedAlgorithm>(
block, 1, sort_description, block_size + 1, data.merging_params.graphite_params, time(nullptr));
}

__builtin_unreachable();
};

auto merging_algorithm = get_merging_algorithm();
if (!merging_algorithm)
return block;

Chunk chunk(block.getColumns(), block_size);

IMergingAlgorithm::Input input;
input.set(std::move(chunk));
input.permutation = permutation;

IMergingAlgorithm::Inputs inputs;
inputs.push_back(std::move(input));
merging_algorithm->initialize(std::move(inputs));

IMergingAlgorithm::Status status = merging_algorithm->merge();

/// Check that after first merge merging_algorithm is waiting for data from input 0.
if (status.required_source != 0)
throw Exception("Logical error: required source after the first merge is not 0.", ErrorCodes::LOGICAL_ERROR);

status = merging_algorithm->merge();

/// Check that merge is finished.
if (!status.is_finished)
throw Exception("Logical error: merge is not finished after the second merge.", ErrorCodes::LOGICAL_ERROR);

/// Merged Block is sorted and we don't need to use permutation anymore
permutation = nullptr;

return block.cloneWithColumns(status.chunk.getColumns());
}

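The new `mergeBlock()` above is what `optimize_on_insert` uses to pre-merge a single inserted block with the table's merging algorithm before a part is written. As a rough, self-contained illustration of the effect for a Replacing-style table (keep only the last row of each run of equal keys in an already-sorted block), with a hypothetical row type rather than the real `IMergingAlgorithm` interface:

#include <cstddef>
#include <string>
#include <utility>
#include <vector>

/// Hypothetical row: (sorting key, payload). The real code works on whole columns;
/// this only shows the row-level effect of pre-merging on insert.
using Row = std::pair<std::string, std::string>;

std::vector<Row> mergeSortedBlockKeepLast(const std::vector<Row> & sorted_rows)
{
    std::vector<Row> result;
    for (size_t i = 0; i < sorted_rows.size(); ++i)
    {
        /// The last row of each run of equal keys wins, mirroring Replacing semantics.
        if (i + 1 == sorted_rows.size() || sorted_rows[i + 1].first != sorted_rows[i].first)
            result.push_back(sorted_rows[i]);
    }
    return result;
}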
MergeTreeData::MutableDataPartPtr MergeTreeDataWriter::writeTempPart(BlockWithPartition & block_with_partition, const StorageMetadataPtr & metadata_snapshot, bool optimize_on_insert)
|
||||
{
|
||||
Block & block = block_with_partition.block;
|
||||
|
||||
@ -228,6 +303,38 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataWriter::writeTempPart(BlockWithPa
|
||||
else
|
||||
part_name = new_part_info.getPartName();
|
||||
|
||||
/// If we need to calculate some columns to sort.
|
||||
if (metadata_snapshot->hasSortingKey() || metadata_snapshot->hasSecondaryIndices())
|
||||
data.getSortingKeyAndSkipIndicesExpression(metadata_snapshot)->execute(block);
|
||||
|
||||
Names sort_columns = metadata_snapshot->getSortingKeyColumns();
|
||||
SortDescription sort_description;
|
||||
size_t sort_columns_size = sort_columns.size();
|
||||
sort_description.reserve(sort_columns_size);
|
||||
|
||||
for (size_t i = 0; i < sort_columns_size; ++i)
|
||||
sort_description.emplace_back(block.getPositionByName(sort_columns[i]), 1, 1);
|
||||
|
||||
ProfileEvents::increment(ProfileEvents::MergeTreeDataWriterBlocks);
|
||||
|
||||
/// Sort
|
||||
IColumn::Permutation * perm_ptr = nullptr;
|
||||
IColumn::Permutation perm;
|
||||
if (!sort_description.empty())
|
||||
{
|
||||
if (!isAlreadySorted(block, sort_description))
|
||||
{
|
||||
stableGetPermutation(block, sort_description, perm);
|
||||
perm_ptr = &perm;
|
||||
}
|
||||
else
|
||||
ProfileEvents::increment(ProfileEvents::MergeTreeDataWriterBlocksAlreadySorted);
|
||||
}
|
||||
|
||||
Names partition_key_columns = metadata_snapshot->getPartitionKey().column_names;
|
||||
if (optimize_on_insert)
|
||||
block = mergeBlock(block, sort_description, partition_key_columns, perm_ptr);
|
||||
|
||||
/// Size of part would not be greater than block.bytes() + epsilon
|
||||
size_t expected_size = block.bytes();
|
||||
|
||||
@ -274,34 +381,6 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataWriter::writeTempPart(BlockWithPa
|
||||
sync_guard.emplace(disk, full_path);
|
||||
}
|
||||
|
||||
/// If we need to calculate some columns to sort.
|
||||
if (metadata_snapshot->hasSortingKey() || metadata_snapshot->hasSecondaryIndices())
|
||||
data.getSortingKeyAndSkipIndicesExpression(metadata_snapshot)->execute(block);
|
||||
|
||||
Names sort_columns = metadata_snapshot->getSortingKeyColumns();
|
||||
SortDescription sort_description;
|
||||
size_t sort_columns_size = sort_columns.size();
|
||||
sort_description.reserve(sort_columns_size);
|
||||
|
||||
for (size_t i = 0; i < sort_columns_size; ++i)
|
||||
sort_description.emplace_back(block.getPositionByName(sort_columns[i]), 1, 1);
|
||||
|
||||
ProfileEvents::increment(ProfileEvents::MergeTreeDataWriterBlocks);
|
||||
|
||||
/// Sort
|
||||
IColumn::Permutation * perm_ptr = nullptr;
|
||||
IColumn::Permutation perm;
|
||||
if (!sort_description.empty())
|
||||
{
|
||||
if (!isAlreadySorted(block, sort_description))
|
||||
{
|
||||
stableGetPermutation(block, sort_description, perm);
|
||||
perm_ptr = &perm;
|
||||
}
|
||||
else
|
||||
ProfileEvents::increment(ProfileEvents::MergeTreeDataWriterBlocksAlreadySorted);
|
||||
}
|
||||
|
||||
if (metadata_snapshot->hasRowsTTL())
|
||||
updateTTL(metadata_snapshot->getRowsTTL(), new_data_part->ttl_infos, new_data_part->ttl_infos.table_ttl, block, true);
|
||||
|
||||
|
@ -45,7 +45,9 @@ public:
|
||||
/** All rows must correspond to same partition.
|
||||
* Returns part with unique name starting with 'tmp_', yet not added to MergeTreeData.
|
||||
*/
|
||||
MergeTreeData::MutableDataPartPtr writeTempPart(BlockWithPartition & block, const StorageMetadataPtr & metadata_snapshot);
|
||||
MergeTreeData::MutableDataPartPtr writeTempPart(BlockWithPartition & block, const StorageMetadataPtr & metadata_snapshot, bool optimize_on_insert);
|
||||
|
||||
Block mergeBlock(const Block & block, SortDescription sort_description, Names & partition_key_columns, IColumn::Permutation *& permutation);
|
||||
|
||||
private:
|
||||
MergeTreeData & data;
|
||||
|
@ -40,7 +40,8 @@ ReplicatedMergeTreeBlockOutputStream::ReplicatedMergeTreeBlockOutputStream(
|
||||
size_t quorum_timeout_ms_,
|
||||
size_t max_parts_per_block_,
|
||||
bool quorum_parallel_,
|
||||
bool deduplicate_)
|
||||
bool deduplicate_,
|
||||
bool optimize_on_insert_)
|
||||
: storage(storage_)
|
||||
, metadata_snapshot(metadata_snapshot_)
|
||||
, quorum(quorum_)
|
||||
@ -49,6 +50,7 @@ ReplicatedMergeTreeBlockOutputStream::ReplicatedMergeTreeBlockOutputStream(
|
||||
, quorum_parallel(quorum_parallel_)
|
||||
, deduplicate(deduplicate_)
|
||||
, log(&Poco::Logger::get(storage.getLogName() + " (Replicated OutputStream)"))
|
||||
, optimize_on_insert(optimize_on_insert_)
|
||||
{
|
||||
/// The quorum value `1` has the same meaning as if it is disabled.
|
||||
if (quorum == 1)
|
||||
@ -142,7 +144,7 @@ void ReplicatedMergeTreeBlockOutputStream::write(const Block & block)
|
||||
|
||||
/// Write part to the filesystem under temporary name. Calculate a checksum.
|
||||
|
||||
MergeTreeData::MutableDataPartPtr part = storage.writer.writeTempPart(current_block, metadata_snapshot);
|
||||
MergeTreeData::MutableDataPartPtr part = storage.writer.writeTempPart(current_block, metadata_snapshot, optimize_on_insert);
|
||||
|
||||
String block_id;
|
||||
|
||||
|
@ -29,7 +29,8 @@ public:
|
||||
size_t quorum_timeout_ms_,
|
||||
size_t max_parts_per_block_,
|
||||
bool quorum_parallel_,
|
||||
bool deduplicate_);
|
||||
bool deduplicate_,
|
||||
bool optimize_on_insert);
|
||||
|
||||
Block getHeader() const override;
|
||||
void writePrefix() override;
|
||||
@ -71,6 +72,8 @@ private:
|
||||
|
||||
using Logger = Poco::Logger;
|
||||
Poco::Logger * log;
|
||||
|
||||
bool optimize_on_insert;
|
||||
};
|
||||
|
||||
}
|
||||
|
@ -21,12 +21,13 @@
|
||||
#include <Processors/Pipe.h>
|
||||
#include <Processors/Transforms/FilterTransform.h>
|
||||
|
||||
#include <Databases/MySQL/DatabaseMaterializeMySQL.h>
|
||||
#include <Storages/SelectQueryInfo.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
StorageMaterializeMySQL::StorageMaterializeMySQL(const StoragePtr & nested_storage_, const DatabaseMaterializeMySQL * database_)
|
||||
StorageMaterializeMySQL::StorageMaterializeMySQL(const StoragePtr & nested_storage_, const IDatabase * database_)
|
||||
: StorageProxy(nested_storage_->getStorageID()), nested_storage(nested_storage_), database(database_)
|
||||
{
|
||||
auto nested_memory_metadata = nested_storage->getInMemoryMetadata();
|
||||
@ -45,7 +46,7 @@ Pipe StorageMaterializeMySQL::read(
|
||||
unsigned int num_streams)
|
||||
{
|
||||
/// If the background synchronization thread has exception.
|
||||
database->rethrowExceptionIfNeed();
|
||||
rethrowSyncExceptionIfNeed(database);
|
||||
|
||||
NameSet column_names_set = NameSet(column_names.begin(), column_names.end());
|
||||
auto lock = nested_storage->lockForShare(context.getCurrentQueryId(), context.getSettingsRef().lock_acquire_timeout);
|
||||
@ -106,7 +107,7 @@ Pipe StorageMaterializeMySQL::read(
|
||||
NamesAndTypesList StorageMaterializeMySQL::getVirtuals() const
|
||||
{
|
||||
/// If the background synchronization thread has exception.
|
||||
database->rethrowExceptionIfNeed();
|
||||
rethrowSyncExceptionIfNeed(database);
|
||||
return nested_storage->getVirtuals();
|
||||
}
|
||||
|
||||
|
@ -5,7 +5,6 @@
|
||||
#if USE_MYSQL
|
||||
|
||||
#include <Storages/StorageProxy.h>
|
||||
#include <Databases/MySQL/DatabaseMaterializeMySQL.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
@ -21,7 +20,7 @@ class StorageMaterializeMySQL final : public ext::shared_ptr_helper<StorageMater
|
||||
public:
|
||||
String getName() const override { return "MaterializeMySQL"; }
|
||||
|
||||
StorageMaterializeMySQL(const StoragePtr & nested_storage_, const DatabaseMaterializeMySQL * database_);
|
||||
StorageMaterializeMySQL(const StoragePtr & nested_storage_, const IDatabase * database_);
|
||||
|
||||
Pipe read(
|
||||
const Names & column_names, const StorageMetadataPtr & metadata_snapshot, SelectQueryInfo & query_info,
|
||||
@ -32,15 +31,18 @@ public:
|
||||
NamesAndTypesList getVirtuals() const override;
|
||||
ColumnSizeByName getColumnSizes() const override;
|
||||
|
||||
private:
|
||||
StoragePtr getNested() const override { return nested_storage; }
|
||||
|
||||
void drop() override { nested_storage->drop(); }
|
||||
|
||||
private:
|
||||
[[noreturn]] void throwNotAllowed() const
|
||||
{
|
||||
throw Exception("This method is not allowed for MaterializeMySQL", ErrorCodes::NOT_IMPLEMENTED);
|
||||
}
|
||||
|
||||
StoragePtr nested_storage;
|
||||
const DatabaseMaterializeMySQL * database;
|
||||
const IDatabase * database;
|
||||
};
|
||||
|
||||
}
|
||||
|
@ -23,7 +23,7 @@ namespace ErrorCodes
|
||||
|
||||
class MemorySource : public SourceWithProgress
|
||||
{
|
||||
using InitializerFunc = std::function<void(BlocksList::const_iterator &, size_t &, std::shared_ptr<const BlocksList> &)>;
|
||||
using InitializerFunc = std::function<void(std::shared_ptr<const Blocks> &)>;
|
||||
public:
|
||||
/// Blocks are stored in std::list which may be appended in another thread.
|
||||
/// We use pointer to the beginning of the list and its current size.
|
||||
@ -32,17 +32,15 @@ public:
|
||||
|
||||
MemorySource(
|
||||
Names column_names_,
|
||||
BlocksList::const_iterator first_,
|
||||
size_t num_blocks_,
|
||||
const StorageMemory & storage,
|
||||
const StorageMetadataPtr & metadata_snapshot,
|
||||
std::shared_ptr<const BlocksList> data_,
|
||||
InitializerFunc initializer_func_ = [](BlocksList::const_iterator &, size_t &, std::shared_ptr<const BlocksList> &) {})
|
||||
std::shared_ptr<const Blocks> data_,
|
||||
std::shared_ptr<std::atomic<size_t>> parallel_execution_index_,
|
||||
InitializerFunc initializer_func_ = {})
|
||||
: SourceWithProgress(metadata_snapshot->getSampleBlockForColumns(column_names_, storage.getVirtuals(), storage.getStorageID()))
|
||||
, column_names(std::move(column_names_))
|
||||
, current_it(first_)
|
||||
, num_blocks(num_blocks_)
|
||||
, data(data_)
|
||||
, parallel_execution_index(parallel_execution_index_)
|
||||
, initializer_func(std::move(initializer_func_))
|
||||
{
|
||||
}
|
||||
@ -52,16 +50,20 @@ public:
protected:
Chunk generate() override
{
if (!postponed_init_done)
if (initializer_func)
{
initializer_func(current_it, num_blocks, data);
postponed_init_done = true;
initializer_func(data);
initializer_func = {};
}

if (current_block_idx == num_blocks)
return {};
size_t current_index = getAndIncrementExecutionIndex();

const Block & src = *current_it;
if (current_index >= data->size())
{
return {};
}

const Block & src = (*data)[current_index];
Columns columns;
columns.reserve(column_names.size());

@ -69,20 +71,26 @@ protected:
for (const auto & name : column_names)
columns.push_back(src.getByName(name).column);

if (++current_block_idx < num_blocks)
++current_it;

return Chunk(std::move(columns), src.rows());
}

private:
const Names column_names;
BlocksList::const_iterator current_it;
size_t num_blocks;
size_t current_block_idx = 0;
size_t getAndIncrementExecutionIndex()
{
if (parallel_execution_index)
{
return (*parallel_execution_index)++;
}
else
{
return execution_index++;
}
}

std::shared_ptr<const BlocksList> data;
bool postponed_init_done = false;
const Names column_names;
size_t execution_index = 0;
std::shared_ptr<const Blocks> data;
std::shared_ptr<std::atomic<size_t>> parallel_execution_index;
InitializerFunc initializer_func;
};

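The reworked `MemorySource` above drops per-source block ranges in favour of a shared `std::atomic<size_t>` that parallel sources increment to claim the next block. A minimal standalone sketch of that claiming pattern, with a plain `std::vector<int>` standing in for the block list:

#include <atomic>
#include <cstdio>
#include <memory>
#include <thread>
#include <vector>

int main()
{
    std::vector<int> blocks{10, 20, 30, 40, 50};
    /// Shared index: each reader atomically claims the next unread block.
    auto next_block = std::make_shared<std::atomic<size_t>>(0);

    auto reader = [&](int id)
    {
        while (true)
        {
            size_t i = (*next_block)++;   /// claim one block
            if (i >= blocks.size())
                break;                    /// nothing left for this source
            std::printf("reader %d got block %d\n", id, blocks[i]);
        }
    };

    std::thread t1(reader, 1), t2(reader, 2);
    t1.join();
    t2.join();
}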
@ -107,7 +115,7 @@ public:
|
||||
metadata_snapshot->check(block, true);
|
||||
{
|
||||
std::lock_guard lock(storage.mutex);
|
||||
auto new_data = std::make_unique<BlocksList>(*(storage.data.get()));
|
||||
auto new_data = std::make_unique<Blocks>(*(storage.data.get()));
|
||||
new_data->push_back(block);
|
||||
storage.data.set(std::move(new_data));
|
||||
|
||||
@ -123,7 +131,7 @@ private:
|
||||
|
||||
|
||||
StorageMemory::StorageMemory(const StorageID & table_id_, ColumnsDescription columns_description_, ConstraintsDescription constraints_)
|
||||
: IStorage(table_id_), data(std::make_unique<const BlocksList>())
|
||||
: IStorage(table_id_), data(std::make_unique<const Blocks>())
|
||||
{
|
||||
StorageInMemoryMetadata storage_metadata;
|
||||
storage_metadata.setColumns(std::move(columns_description_));
|
||||
@ -155,21 +163,17 @@ Pipe StorageMemory::read(
|
||||
|
||||
return Pipe(std::make_shared<MemorySource>(
|
||||
column_names,
|
||||
data.get()->end(),
|
||||
0,
|
||||
*this,
|
||||
metadata_snapshot,
|
||||
data.get(),
|
||||
[this](BlocksList::const_iterator & current_it, size_t & num_blocks, std::shared_ptr<const BlocksList> & current_data)
|
||||
nullptr /* data */,
|
||||
nullptr /* parallel execution index */,
|
||||
[this](std::shared_ptr<const Blocks> & data_to_initialize)
|
||||
{
|
||||
current_data = data.get();
|
||||
current_it = current_data->begin();
|
||||
num_blocks = current_data->size();
|
||||
data_to_initialize = data.get();
|
||||
}));
|
||||
}
|
||||
|
||||
auto current_data = data.get();
|
||||
|
||||
size_t size = current_data->size();
|
||||
|
||||
if (num_streams > size)
|
||||
@ -177,23 +181,11 @@ Pipe StorageMemory::read(
|
||||
|
||||
Pipes pipes;
|
||||
|
||||
BlocksList::const_iterator it = current_data->begin();
|
||||
auto parallel_execution_index = std::make_shared<std::atomic<size_t>>(0);
|
||||
|
||||
size_t offset = 0;
|
||||
for (size_t stream = 0; stream < num_streams; ++stream)
|
||||
{
|
||||
size_t next_offset = (stream + 1) * size / num_streams;
|
||||
size_t num_blocks = next_offset - offset;
|
||||
|
||||
assert(num_blocks > 0);
|
||||
|
||||
pipes.emplace_back(std::make_shared<MemorySource>(column_names, it, num_blocks, *this, metadata_snapshot, current_data));
|
||||
|
||||
while (offset < next_offset)
|
||||
{
|
||||
++it;
|
||||
++offset;
|
||||
}
|
||||
pipes.emplace_back(std::make_shared<MemorySource>(column_names, *this, metadata_snapshot, current_data, parallel_execution_index));
|
||||
}
|
||||
|
||||
return Pipe::unitePipes(std::move(pipes));
|
||||
@ -208,7 +200,7 @@ BlockOutputStreamPtr StorageMemory::write(const ASTPtr & /*query*/, const Storag
|
||||
|
||||
void StorageMemory::drop()
|
||||
{
|
||||
data.set(std::make_unique<BlocksList>());
|
||||
data.set(std::make_unique<Blocks>());
|
||||
total_size_bytes.store(0, std::memory_order_relaxed);
|
||||
total_size_rows.store(0, std::memory_order_relaxed);
|
||||
}
|
||||
@ -233,7 +225,7 @@ void StorageMemory::mutate(const MutationCommands & commands, const Context & co
|
||||
auto in = interpreter->execute();
|
||||
|
||||
in->readPrefix();
|
||||
BlocksList out;
|
||||
Blocks out;
|
||||
Block block;
|
||||
while ((block = in->read()))
|
||||
{
|
||||
@ -241,17 +233,17 @@ void StorageMemory::mutate(const MutationCommands & commands, const Context & co
|
||||
}
|
||||
in->readSuffix();
|
||||
|
||||
std::unique_ptr<BlocksList> new_data;
|
||||
std::unique_ptr<Blocks> new_data;
|
||||
|
||||
// all column affected
|
||||
if (interpreter->isAffectingAllColumns())
|
||||
{
|
||||
new_data = std::make_unique<BlocksList>(out);
|
||||
new_data = std::make_unique<Blocks>(out);
|
||||
}
|
||||
else
|
||||
{
|
||||
/// just some of the column affected, we need update it with new column
|
||||
new_data = std::make_unique<BlocksList>(*(data.get()));
|
||||
new_data = std::make_unique<Blocks>(*(data.get()));
|
||||
auto data_it = new_data->begin();
|
||||
auto out_it = out.begin();
|
||||
|
||||
@ -284,7 +276,7 @@ void StorageMemory::mutate(const MutationCommands & commands, const Context & co
|
||||
void StorageMemory::truncate(
|
||||
const ASTPtr &, const StorageMetadataPtr &, const Context &, TableExclusiveLockHolder &)
|
||||
{
|
||||
data.set(std::make_unique<BlocksList>());
|
||||
data.set(std::make_unique<Blocks>());
|
||||
total_size_bytes.store(0, std::memory_order_relaxed);
|
||||
total_size_rows.store(0, std::memory_order_relaxed);
|
||||
}
|
||||
|
@ -91,7 +91,7 @@ public:
|
||||
|
||||
private:
|
||||
/// MultiVersion data storage, so that we can copy the list of blocks to readers.
|
||||
MultiVersion<BlocksList> data;
|
||||
MultiVersion<Blocks> data;
|
||||
|
||||
mutable std::mutex mutex;
|
||||
|
||||
|
@ -233,7 +233,7 @@ BlockOutputStreamPtr StorageMergeTree::write(const ASTPtr & /*query*/, const Sto
|
||||
|
||||
const auto & settings = context.getSettingsRef();
|
||||
return std::make_shared<MergeTreeBlockOutputStream>(
|
||||
*this, metadata_snapshot, settings.max_partitions_per_insert_block);
|
||||
*this, metadata_snapshot, settings.max_partitions_per_insert_block, context.getSettingsRef().optimize_on_insert);
|
||||
}
|
||||
|
||||
void StorageMergeTree::checkTableCanBeDropped() const
|
||||
|
@ -3861,7 +3861,8 @@ BlockOutputStreamPtr StorageReplicatedMergeTree::write(const ASTPtr & /*query*/,
|
||||
query_settings.insert_quorum_timeout.totalMilliseconds(),
|
||||
query_settings.max_partitions_per_insert_block,
|
||||
query_settings.insert_quorum_parallel,
|
||||
deduplicate);
|
||||
deduplicate,
|
||||
context.getSettingsRef().optimize_on_insert);
|
||||
}
|
||||
|
||||
|
||||
@ -4444,7 +4445,7 @@ PartitionCommandsResultInfo StorageReplicatedMergeTree::attachPartition(
|
||||
PartsTemporaryRename renamed_parts(*this, "detached/");
|
||||
MutableDataPartsVector loaded_parts = tryLoadPartsToAttach(partition, attach_part, query_context, renamed_parts);
|
||||
|
||||
ReplicatedMergeTreeBlockOutputStream output(*this, metadata_snapshot, 0, 0, 0, false, false); /// TODO Allow to use quorum here.
|
||||
ReplicatedMergeTreeBlockOutputStream output(*this, metadata_snapshot, 0, 0, 0, false, false, false); /// TODO Allow to use quorum here.
|
||||
for (size_t i = 0; i < loaded_parts.size(); ++i)
|
||||
{
|
||||
String old_name = loaded_parts[i]->name;
|
||||
|
@ -141,17 +141,18 @@ namespace
|
||||
public:
|
||||
StorageS3BlockOutputStream(
|
||||
const String & format,
|
||||
UInt64 min_upload_part_size,
|
||||
const Block & sample_block_,
|
||||
const Context & context,
|
||||
const CompressionMethod compression_method,
|
||||
const std::shared_ptr<Aws::S3::S3Client> & client,
|
||||
const String & bucket,
|
||||
const String & key)
|
||||
const String & key,
|
||||
size_t min_upload_part_size,
|
||||
size_t max_single_part_upload_size)
|
||||
: sample_block(sample_block_)
|
||||
{
|
||||
write_buf = wrapWriteBufferWithCompressionMethod(
|
||||
std::make_unique<WriteBufferFromS3>(client, bucket, key, min_upload_part_size, true), compression_method, 3);
|
||||
std::make_unique<WriteBufferFromS3>(client, bucket, key, min_upload_part_size, max_single_part_upload_size), compression_method, 3);
|
||||
writer = FormatFactory::instance().getOutput(format, *write_buf, sample_block, context);
|
||||
}
|
||||
|
||||
@ -192,6 +193,7 @@ StorageS3::StorageS3(
|
||||
const StorageID & table_id_,
|
||||
const String & format_name_,
|
||||
UInt64 min_upload_part_size_,
|
||||
UInt64 max_single_part_upload_size_,
|
||||
const ColumnsDescription & columns_,
|
||||
const ConstraintsDescription & constraints_,
|
||||
const Context & context_,
|
||||
@ -201,6 +203,7 @@ StorageS3::StorageS3(
|
||||
, global_context(context_.getGlobalContext())
|
||||
, format_name(format_name_)
|
||||
, min_upload_part_size(min_upload_part_size_)
|
||||
, max_single_part_upload_size(max_single_part_upload_size_)
|
||||
, compression_method(compression_method_)
|
||||
, name(uri_.storage_name)
|
||||
{
|
||||
@ -331,9 +334,15 @@ Pipe StorageS3::read(
|
||||
BlockOutputStreamPtr StorageS3::write(const ASTPtr & /*query*/, const StorageMetadataPtr & metadata_snapshot, const Context & /*context*/)
|
||||
{
|
||||
return std::make_shared<StorageS3BlockOutputStream>(
|
||||
format_name, min_upload_part_size, metadata_snapshot->getSampleBlock(),
|
||||
global_context, chooseCompressionMethod(uri.endpoint, compression_method),
|
||||
client, uri.bucket, uri.key);
|
||||
format_name,
|
||||
metadata_snapshot->getSampleBlock(),
|
||||
global_context,
|
||||
chooseCompressionMethod(uri.endpoint, compression_method),
|
||||
client,
|
||||
uri.bucket,
|
||||
uri.key,
|
||||
min_upload_part_size,
|
||||
max_single_part_upload_size);
|
||||
}
|
||||
|
||||
void registerStorageS3Impl(const String & name, StorageFactory & factory)
|
||||
@ -362,6 +371,7 @@ void registerStorageS3Impl(const String & name, StorageFactory & factory)
|
||||
}
|
||||
|
||||
UInt64 min_upload_part_size = args.local_context.getSettingsRef().s3_min_upload_part_size;
|
||||
UInt64 max_single_part_upload_size = args.local_context.getSettingsRef().s3_max_single_part_upload_size;
|
||||
|
||||
String compression_method;
|
||||
String format_name;
|
||||
@ -383,6 +393,7 @@ void registerStorageS3Impl(const String & name, StorageFactory & factory)
|
||||
args.table_id,
|
||||
format_name,
|
||||
min_upload_part_size,
|
||||
max_single_part_upload_size,
|
||||
args.columns,
|
||||
args.constraints,
|
||||
args.context,
|
||||
|
@ -31,6 +31,7 @@ public:
|
||||
const StorageID & table_id_,
|
||||
const String & format_name_,
|
||||
UInt64 min_upload_part_size_,
|
||||
UInt64 max_single_part_upload_size_,
|
||||
const ColumnsDescription & columns_,
|
||||
const ConstraintsDescription & constraints_,
|
||||
const Context & context_,
|
||||
@ -59,7 +60,8 @@ private:
|
||||
const Context & global_context;
|
||||
|
||||
String format_name;
|
||||
UInt64 min_upload_part_size;
|
||||
size_t min_upload_part_size;
|
||||
size_t max_single_part_upload_size;
|
||||
String compression_method;
|
||||
std::shared_ptr<Aws::S3::S3Client> client;
|
||||
String name;
|
||||
|
@ -67,6 +67,7 @@ StoragePtr TableFunctionS3::executeImpl(const ASTPtr & /*ast_function*/, const C
|
||||
Poco::URI uri (filename);
|
||||
S3::URI s3_uri (uri);
|
||||
UInt64 min_upload_part_size = context.getSettingsRef().s3_min_upload_part_size;
|
||||
UInt64 max_single_part_upload_size = context.getSettingsRef().s3_max_single_part_upload_size;
|
||||
|
||||
StoragePtr storage = StorageS3::create(
|
||||
s3_uri,
|
||||
@ -75,6 +76,7 @@ StoragePtr TableFunctionS3::executeImpl(const ASTPtr & /*ast_function*/, const C
|
||||
StorageID(getDatabaseName(), table_name),
|
||||
format,
|
||||
min_upload_part_size,
|
||||
max_single_part_upload_size,
|
||||
getActualTableStructure(context),
|
||||
ConstraintsDescription{},
|
||||
const_cast<Context &>(context),
|
||||
|
@ -15,6 +15,7 @@ from subprocess import check_call
|
||||
from subprocess import Popen
|
||||
from subprocess import PIPE
|
||||
from subprocess import CalledProcessError
|
||||
from subprocess import TimeoutExpired
|
||||
from datetime import datetime
|
||||
from time import time, sleep
|
||||
from errno import ESRCH
|
||||
@ -114,6 +115,7 @@ def get_db_engine(args):
|
||||
def run_single_test(args, ext, server_logs_level, client_options, case_file, stdout_file, stderr_file):
|
||||
# print(client_options)
|
||||
|
||||
start_time = datetime.now()
|
||||
if args.database:
|
||||
database = args.database
|
||||
os.environ.setdefault("CLICKHOUSE_DATABASE", database)
|
||||
@ -129,7 +131,11 @@ def run_single_test(args, ext, server_logs_level, client_options, case_file, std
|
||||
database = 'test_{suffix}'.format(suffix=random_str())
|
||||
|
||||
clickhouse_proc_create = Popen(shlex.split(args.client), stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)
|
||||
clickhouse_proc_create.communicate(("CREATE DATABASE " + database + get_db_engine(args)))
|
||||
try:
|
||||
clickhouse_proc_create.communicate(("CREATE DATABASE " + database + get_db_engine(args)), timeout=args.timeout)
|
||||
except TimeoutExpired:
|
||||
total_time = (datetime.now() - start_time).total_seconds()
|
||||
return clickhouse_proc_create, "", "Timeout creating database {} before test".format(database), total_time
|
||||
|
||||
os.environ["CLICKHOUSE_DATABASE"] = database
|
||||
|
||||
@ -152,14 +158,24 @@ def run_single_test(args, ext, server_logs_level, client_options, case_file, std
|
||||
# print(command)
|
||||
|
||||
proc = Popen(command, shell=True, env=os.environ)
|
||||
start_time = datetime.now()
|
||||
|
||||
while (datetime.now() - start_time).total_seconds() < args.timeout and proc.poll() is None:
|
||||
sleep(0.01)
|
||||
|
||||
if not args.database:
|
||||
clickhouse_proc_create = Popen(shlex.split(args.client), stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)
|
||||
clickhouse_proc_create.communicate(("DROP DATABASE " + database))
|
||||
seconds_left = max(args.timeout - (datetime.now() - start_time).total_seconds(), 10)
|
||||
try:
|
||||
clickhouse_proc_create.communicate(("DROP DATABASE " + database), timeout=seconds_left)
|
||||
except TimeoutExpired:
|
||||
# kill test process because it can also hung
|
||||
if proc.returncode is None:
|
||||
try:
|
||||
proc.kill()
|
||||
except OSError as e:
|
||||
if e.errno != ESRCH:
|
||||
raise
|
||||
return clickhouse_proc_create, "", "Timeout dropping database {} after test".format(database), total_time
|
||||
|
||||
total_time = (datetime.now() - start_time).total_seconds()
|
||||
|
||||
@ -305,7 +321,7 @@ def run_tests_array(all_tests_with_params):
|
||||
|
||||
if args.testname:
|
||||
clickhouse_proc = Popen(shlex.split(args.client), stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)
|
||||
clickhouse_proc.communicate(("SELECT 'Running test {suite}/{case} from pid={pid}';".format(pid = os.getpid(), case = case, suite = suite)))
|
||||
clickhouse_proc.communicate(("SELECT 'Running test {suite}/{case} from pid={pid}';".format(pid = os.getpid(), case = case, suite = suite)), timeout=10)
|
||||
|
||||
if clickhouse_proc.returncode != 0:
|
||||
failures += 1
|
||||
@ -330,6 +346,8 @@ def run_tests_array(all_tests_with_params):
|
||||
print(MSG_FAIL, end='')
|
||||
print_test_time(total_time)
|
||||
print(" - Timeout!")
|
||||
if stderr:
|
||||
print(stderr)
|
||||
else:
|
||||
counter = 1
|
||||
while proc.returncode != 0 and need_retry(stderr):
|
||||
|
@ -0,0 +1,8 @@
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <optimize_on_insert>0</optimize_on_insert>
        </default>
    </profiles>
</yandex>
@ -8,7 +8,8 @@ from helpers.test_tools import TSV
|
||||
|
||||
cluster = ClickHouseCluster(__file__)
|
||||
instance = cluster.add_instance('instance',
|
||||
main_configs=['configs/graphite_rollup.xml'])
|
||||
main_configs=['configs/graphite_rollup.xml'],
|
||||
user_configs=["configs/users.xml"])
|
||||
q = instance.query
|
||||
|
||||
|
||||
|
@ -4,6 +4,8 @@
|
||||
<default>
|
||||
<allow_experimental_database_materialize_mysql>1</allow_experimental_database_materialize_mysql>
|
||||
<allow_introspection_functions>1</allow_introspection_functions>
|
||||
<optimize_on_insert>0</optimize_on_insert>
|
||||
<default_database_engine>Ordinary</default_database_engine>
|
||||
</default>
|
||||
</profiles>
|
||||
|
||||
|
@ -0,0 +1,19 @@
|
||||
<?xml version="1.0"?>
|
||||
<yandex>
|
||||
<profiles>
|
||||
<default>
|
||||
<allow_experimental_database_materialize_mysql>1</allow_experimental_database_materialize_mysql>
|
||||
<default_database_engine>Atomic</default_database_engine>
|
||||
</default>
|
||||
</profiles>
|
||||
|
||||
<users>
|
||||
<default>
|
||||
<password></password>
|
||||
<networks incl="networks" replace="replace">
|
||||
<ip>::/0</ip>
|
||||
</networks>
|
||||
<profile>default</profile>
|
||||
</default>
|
||||
</users>
|
||||
</yandex>
|
@ -15,6 +15,7 @@ from multiprocessing.dummy import Pool
|
||||
|
||||
def check_query(clickhouse_node, query, result_set, retry_count=60, interval_seconds=3):
|
||||
lastest_result = ''
|
||||
|
||||
for i in range(retry_count):
|
||||
try:
|
||||
lastest_result = clickhouse_node.query(query)
|
||||
@ -35,6 +36,7 @@ def dml_with_materialize_mysql_database(clickhouse_node, mysql_node, service_nam
|
||||
clickhouse_node.query("DROP DATABASE IF EXISTS test_database")
|
||||
mysql_node.query("CREATE DATABASE test_database DEFAULT CHARACTER SET 'utf8'")
|
||||
# existed before the mapping was created
|
||||
|
||||
mysql_node.query("CREATE TABLE test_database.test_table_1 ("
|
||||
"`key` INT NOT NULL PRIMARY KEY, "
|
||||
"unsigned_tiny_int TINYINT UNSIGNED, tiny_int TINYINT, "
|
||||
@ -51,9 +53,10 @@ def dml_with_materialize_mysql_database(clickhouse_node, mysql_node, service_nam
|
||||
"_date Date, _datetime DateTime, _timestamp TIMESTAMP, _bool BOOLEAN) ENGINE = InnoDB;")
|
||||
|
||||
# it already has some data
|
||||
mysql_node.query(
|
||||
"INSERT INTO test_database.test_table_1 VALUES(1, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5, 6, -6, 3.2, -3.2, 3.4, -3.4, 'varchar', 'char', "
|
||||
"'2020-01-01', '2020-01-01 00:00:00', '2020-01-01 00:00:00', true);")
|
||||
mysql_node.query("""
|
||||
INSERT INTO test_database.test_table_1 VALUES(1, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5, 6, -6, 3.2, -3.2, 3.4, -3.4, 'varchar', 'char',
|
||||
'2020-01-01', '2020-01-01 00:00:00', '2020-01-01 00:00:00', true);
|
||||
""")
|
||||
|
||||
clickhouse_node.query(
|
||||
"CREATE DATABASE test_database ENGINE = MaterializeMySQL('{}:3306', 'test_database', 'root', 'clickhouse')".format(
|
||||
@ -65,9 +68,10 @@ def dml_with_materialize_mysql_database(clickhouse_node, mysql_node, service_nam
|
||||
"1\t1\t-1\t2\t-2\t3\t-3\t4\t-4\t5\t-5\t6\t-6\t3.2\t-3.2\t3.4\t-3.4\tvarchar\tchar\t2020-01-01\t"
|
||||
"2020-01-01 00:00:00\t2020-01-01 00:00:00\t1\n")
|
||||
|
||||
mysql_node.query(
|
||||
"INSERT INTO test_database.test_table_1 VALUES(2, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5, 6, -6, 3.2, -3.2, 3.4, -3.4, 'varchar', 'char', "
|
||||
"'2020-01-01', '2020-01-01 00:00:00', '2020-01-01 00:00:00', false);")
|
||||
mysql_node.query("""
|
||||
INSERT INTO test_database.test_table_1 VALUES(2, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5, 6, -6, 3.2, -3.2, 3.4, -3.4, 'varchar', 'char',
|
||||
'2020-01-01', '2020-01-01 00:00:00', '2020-01-01 00:00:00', false);
|
||||
""")
|
||||
|
||||
check_query(clickhouse_node, "SELECT * FROM test_database.test_table_1 ORDER BY key FORMAT TSV",
|
||||
"1\t1\t-1\t2\t-2\t3\t-3\t4\t-4\t5\t-5\t6\t-6\t3.2\t-3.2\t3.4\t-3.4\tvarchar\tchar\t2020-01-01\t"
|
||||
@ -76,14 +80,16 @@ def dml_with_materialize_mysql_database(clickhouse_node, mysql_node, service_nam
|
||||
|
||||
mysql_node.query("UPDATE test_database.test_table_1 SET unsigned_tiny_int = 2 WHERE `key` = 1")
|
||||
|
||||
check_query(clickhouse_node, "SELECT key, unsigned_tiny_int, tiny_int, unsigned_small_int,"
|
||||
" small_int, unsigned_medium_int, medium_int, unsigned_int, _int, unsigned_integer, _integer, "
|
||||
" unsigned_bigint, _bigint, unsigned_float, _float, unsigned_double, _double, _varchar, _char, "
|
||||
" _date, _datetime, /* exclude it, because ON UPDATE CURRENT_TIMESTAMP _timestamp, */ "
|
||||
" _bool FROM test_database.test_table_1 ORDER BY key FORMAT TSV",
|
||||
"1\t2\t-1\t2\t-2\t3\t-3\t4\t-4\t5\t-5\t6\t-6\t3.2\t-3.2\t3.4\t-3.4\tvarchar\tchar\t2020-01-01\t"
|
||||
"2020-01-01 00:00:00\t1\n2\t1\t-1\t2\t-2\t3\t-3\t4\t-4\t5\t-5\t6\t-6\t3.2\t-3.2\t3.4\t-3.4\t"
|
||||
"varchar\tchar\t2020-01-01\t2020-01-01 00:00:00\t0\n")
|
||||
check_query(clickhouse_node, """
|
||||
SELECT key, unsigned_tiny_int, tiny_int, unsigned_small_int,
|
||||
small_int, unsigned_medium_int, medium_int, unsigned_int, _int, unsigned_integer, _integer,
|
||||
unsigned_bigint, _bigint, unsigned_float, _float, unsigned_double, _double, _varchar, _char,
|
||||
_date, _datetime, /* exclude it, because ON UPDATE CURRENT_TIMESTAMP _timestamp, */
|
||||
_bool FROM test_database.test_table_1 ORDER BY key FORMAT TSV
|
||||
""",
|
||||
"1\t2\t-1\t2\t-2\t3\t-3\t4\t-4\t5\t-5\t6\t-6\t3.2\t-3.2\t3.4\t-3.4\tvarchar\tchar\t2020-01-01\t"
|
||||
"2020-01-01 00:00:00\t1\n2\t1\t-1\t2\t-2\t3\t-3\t4\t-4\t5\t-5\t6\t-6\t3.2\t-3.2\t3.4\t-3.4\t"
|
||||
"varchar\tchar\t2020-01-01\t2020-01-01 00:00:00\t0\n")
|
||||
|
||||
# update primary key
|
||||
mysql_node.query("UPDATE test_database.test_table_1 SET `key` = 3 WHERE `unsigned_tiny_int` = 2")
|
||||
@ -556,6 +562,12 @@ def err_sync_user_privs_with_materialize_mysql_database(clickhouse_node, mysql_n
|
||||
assert 'MySQL SYNC USER ACCESS ERR:' in str(exception.value)
|
||||
assert "priv_err_db" not in clickhouse_node.query("SHOW DATABASES")
|
||||
|
||||
mysql_node.query("GRANT SELECT ON priv_err_db.* TO 'test'@'%'")
|
||||
time.sleep(3)
|
||||
clickhouse_node.query("ATTACH DATABASE priv_err_db")
|
||||
clickhouse_node.query("DROP DATABASE priv_err_db")
|
||||
mysql_node.query("REVOKE SELECT ON priv_err_db.* FROM 'test'@'%'")
|
||||
|
||||
mysql_node.query("DROP DATABASE priv_err_db;")
|
||||
mysql_node.query("DROP USER 'test'@'%'")
|
||||
|
||||
|
@ -14,7 +14,10 @@ from . import materialize_with_ddl
|
||||
DOCKER_COMPOSE_PATH = get_docker_compose_path()
|
||||
|
||||
cluster = ClickHouseCluster(__file__)
|
||||
clickhouse_node = cluster.add_instance('node1', user_configs=["configs/users.xml"], with_mysql=False, stay_alive=True)

node_db_ordinary = cluster.add_instance('node1', user_configs=["configs/users.xml"], with_mysql=False, stay_alive=True)
node_db_atomic = cluster.add_instance('node2', user_configs=["configs/users_db_atomic.xml"], with_mysql=False, stay_alive=True)

@pytest.fixture(scope="module")
def started_cluster():

@ -119,39 +122,30 @@ def started_mysql_8_0():
        '--remove-orphans'])

def test_materialize_database_dml_with_mysql_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_dml_with_mysql_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.dml_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.materialize_mysql_database_with_datetime_and_decimal(clickhouse_node, started_mysql_5_7, "mysql1")

def test_materialize_database_dml_with_mysql_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_dml_with_mysql_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.dml_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")
    materialize_with_ddl.materialize_mysql_database_with_datetime_and_decimal(clickhouse_node, started_mysql_8_0, "mysql8_0")

@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_ddl_with_mysql_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.drop_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.create_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.rename_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.alter_add_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.alter_drop_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    # mysql 5.7 cannot support alter rename column
    # materialize_with_ddl.alter_rename_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.alter_rename_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
    materialize_with_ddl.alter_modify_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")

def test_materialize_database_ddl_with_mysql_5_7(started_cluster, started_mysql_5_7):
    try:
        materialize_with_ddl.drop_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
        materialize_with_ddl.create_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
        materialize_with_ddl.rename_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
        materialize_with_ddl.alter_add_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7,
                                                                              "mysql1")
        materialize_with_ddl.alter_drop_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7,
                                                                               "mysql1")
        # mysql 5.7 cannot support alter rename column
        # materialize_with_ddl.alter_rename_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")
        materialize_with_ddl.alter_rename_table_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7,
                                                                                "mysql1")
        materialize_with_ddl.alter_modify_column_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7,
                                                                                 "mysql1")
    except:
        print((clickhouse_node.query(
            "select '\n', thread_id, query_id, arrayStringConcat(arrayMap(x -> concat(demangle(addressToSymbol(x)), '\n ', addressToLine(x)), trace), '\n') AS sym from system.stack_trace format TSVRaw")))
        raise

def test_materialize_database_ddl_with_mysql_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_ddl_with_mysql_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.drop_table_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")
    materialize_with_ddl.create_table_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")
    materialize_with_ddl.rename_table_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")

@ -166,61 +160,72 @@ def test_materialize_database_ddl_with_mysql_8_0(started_cluster, started_mysql_
    materialize_with_ddl.alter_modify_column_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0,
                                                                             "mysql8_0")

def test_materialize_database_ddl_with_empty_transaction_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_ddl_with_empty_transaction_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.query_event_with_empty_transaction(clickhouse_node, started_mysql_5_7, "mysql1")

def test_materialize_database_ddl_with_empty_transaction_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_ddl_with_empty_transaction_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.query_event_with_empty_transaction(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_select_without_columns_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_select_without_columns_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.select_without_columns(clickhouse_node, started_mysql_5_7, "mysql1")

def test_select_without_columns_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_select_without_columns_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.select_without_columns(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_insert_with_modify_binlog_checksum_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_insert_with_modify_binlog_checksum_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.insert_with_modify_binlog_checksum(clickhouse_node, started_mysql_5_7, "mysql1")

def test_insert_with_modify_binlog_checksum_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_insert_with_modify_binlog_checksum_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.insert_with_modify_binlog_checksum(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_materialize_database_err_sync_user_privs_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_err_sync_user_privs_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.err_sync_user_privs_with_materialize_mysql_database(clickhouse_node, started_mysql_5_7, "mysql1")

def test_materialize_database_err_sync_user_privs_8_0(started_cluster, started_mysql_8_0):

@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_materialize_database_err_sync_user_privs_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.err_sync_user_privs_with_materialize_mysql_database(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_network_partition_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_network_partition_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.network_partition_test(clickhouse_node, started_mysql_5_7, "mysql1")

def test_network_partition_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_network_partition_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.network_partition_test(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_mysql_kill_sync_thread_restore_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_mysql_kill_sync_thread_restore_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.mysql_kill_sync_thread_restore_test(clickhouse_node, started_mysql_5_7, "mysql1")

def test_mysql_kill_sync_thread_restore_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_mysql_kill_sync_thread_restore_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.mysql_kill_sync_thread_restore_test(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_mysql_killed_while_insert_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_mysql_killed_while_insert_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.mysql_killed_while_insert(clickhouse_node, started_mysql_5_7, "mysql1")

def test_mysql_killed_while_insert_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_mysql_killed_while_insert_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.mysql_killed_while_insert(clickhouse_node, started_mysql_8_0, "mysql8_0")

def test_clickhouse_killed_while_insert_5_7(started_cluster, started_mysql_5_7):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_clickhouse_killed_while_insert_5_7(started_cluster, started_mysql_5_7, clickhouse_node):
    materialize_with_ddl.clickhouse_killed_while_insert(clickhouse_node, started_mysql_5_7, "mysql1")

def test_clickhouse_killed_while_insert_8_0(started_cluster, started_mysql_8_0):
@pytest.mark.parametrize(('clickhouse_node'), [node_db_ordinary, node_db_atomic])
def test_clickhouse_killed_while_insert_8_0(started_cluster, started_mysql_8_0, clickhouse_node):
    materialize_with_ddl.clickhouse_killed_while_insert(clickhouse_node, started_mysql_8_0, "mysql8_0")
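Not part of the commit, a minimal sketch of the pytest mechanism the hunks above rely on: `@pytest.mark.parametrize` runs each test body once per ClickHouse instance. The two strings below are hypothetical stand-ins for the real cluster nodes and `materialize_with_ddl` helpers; judging by the config file names, the two nodes differ in the default database engine (Ordinary vs. Atomic).

```python
# Sketch only: how parametrizing over two "nodes" duplicates every test case.
import pytest

node_db_ordinary = "node1: configs/users.xml"
node_db_atomic = "node2: configs/users_db_atomic.xml"

@pytest.mark.parametrize('clickhouse_node', [node_db_ordinary, node_db_atomic])
def test_example(clickhouse_node):
    # pytest generates test_example[node1...] and test_example[node2...],
    # so each MaterializeMySQL scenario is exercised against both instances.
    assert clickhouse_node
```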
@ -1,5 +1,10 @@
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <optimize_on_insert>0</optimize_on_insert>
        </default>
    </profiles>
    <users>
        <another>
            <password/>
@ -10,4 +15,4 @@
            <quota>default</quota>
        </another>
    </users>
</yandex>
</yandex>
@ -207,34 +207,26 @@ def test_show_profiles():

def test_allow_ddl():
    assert "Not enough privileges" in instance.query_and_get_error("CREATE TABLE tbl(a Int32) ENGINE=Log", user="robin")
    assert "DDL queries are prohibited" in instance.query_and_get_error("CREATE TABLE tbl(a Int32) ENGINE=Log",
                                                                        settings={"allow_ddl": 0})

    assert "Not enough privileges" in instance.query_and_get_error("GRANT CREATE ON tbl TO robin", user="robin")
    assert "DDL queries are prohibited" in instance.query_and_get_error("GRANT CREATE ON tbl TO robin",
                                                                        settings={"allow_ddl": 0})

    assert "it's necessary to have grant" in instance.query_and_get_error("CREATE TABLE tbl(a Int32) ENGINE=Log", user="robin")
    assert "it's necessary to have grant" in instance.query_and_get_error("GRANT CREATE ON tbl TO robin", user="robin")
    assert "DDL queries are prohibited" in instance.query_and_get_error("CREATE TABLE tbl(a Int32) ENGINE=Log", settings={"allow_ddl": 0})

    instance.query("GRANT CREATE ON tbl TO robin")
    instance.query("CREATE TABLE tbl(a Int32) ENGINE=Log", user="robin")
    instance.query("DROP TABLE tbl")

def test_allow_introspection():
    assert "Introspection functions are disabled" in instance.query_and_get_error("SELECT demangle('a')")
    assert "Not enough privileges" in instance.query_and_get_error("SELECT demangle('a')", user="robin")
    assert "Not enough privileges" in instance.query_and_get_error("SELECT demangle('a')", user="robin",
                                                                   settings={"allow_introspection_functions": 1})

    assert "Introspection functions are disabled" in instance.query_and_get_error("GRANT demangle ON *.* TO robin")
    assert "Not enough privileges" in instance.query_and_get_error("GRANT demangle ON *.* TO robin", user="robin")
    assert "Not enough privileges" in instance.query_and_get_error("GRANT demangle ON *.* TO robin", user="robin",
                                                                   settings={"allow_introspection_functions": 1})

    assert instance.query("SELECT demangle('a')", settings={"allow_introspection_functions": 1}) == "signed char\n"
    instance.query("GRANT demangle ON *.* TO robin", settings={"allow_introspection_functions": 1})

    assert "Introspection functions are disabled" in instance.query_and_get_error("SELECT demangle('a')")
    assert "it's necessary to have grant" in instance.query_and_get_error("SELECT demangle('a')", user="robin")
    assert "it's necessary to have grant" in instance.query_and_get_error("SELECT demangle('a')", user="robin", settings={"allow_introspection_functions": 1})

    instance.query("GRANT demangle ON *.* TO robin")
    assert "Introspection functions are disabled" in instance.query_and_get_error("SELECT demangle('a')", user="robin")
    assert instance.query("SELECT demangle('a')", user="robin", settings={"allow_introspection_functions": 1}) == "signed char\n"

    instance.query("ALTER USER robin SETTINGS allow_introspection_functions=1")
    assert instance.query("SELECT demangle('a')", user="robin") == "signed char\n"

@ -248,4 +240,4 @@ def test_allow_introspection():
    assert "Introspection functions are disabled" in instance.query_and_get_error("SELECT demangle('a')", user="robin")

    instance.query("REVOKE demangle ON *.* FROM robin", settings={"allow_introspection_functions": 1})
    assert "Not enough privileges" in instance.query_and_get_error("SELECT demangle('a')", user="robin")
    assert "it's necessary to have grant" in instance.query_and_get_error("SELECT demangle('a')", user="robin")
@ -9,6 +9,7 @@ import time
import avro.schema
from confluent_kafka.avro.cached_schema_registry_client import CachedSchemaRegistryClient
from confluent_kafka.avro.serializer.message_serializer import MessageSerializer
from confluent_kafka import admin

import kafka.errors
import pytest

@ -1161,6 +1162,66 @@ def test_kafka_materialized_view(kafka_cluster):

    kafka_check_result(result, True)

@pytest.mark.timeout(180)
def test_librdkafka_snappy_regression(kafka_cluster):
    """
    Regression for UB in snappy-c (that is used in librdkafka),
    backport PR is [1].

    [1]: https://github.com/ClickHouse-Extras/librdkafka/pull/3

    Example of corruption:

        2020.12.10 09:59:56.831507 [ 20 ] {} <Error> void DB::StorageKafka::threadFunc(size_t): Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected '"' before: 'foo"}': (while reading the value of key value): (at row 1)
        , Stack trace (when copying this message, always include the lines below):
    """

    # create topic with snappy compression
    admin_client = admin.AdminClient({'bootstrap.servers': 'localhost:9092'})
    topic_snappy = admin.NewTopic(topic='snappy_regression', num_partitions=1, replication_factor=1, config={
        'compression.type': 'snappy',
    })
    admin_client.create_topics(new_topics=[topic_snappy], validate_only=False)

    instance.query('''
        CREATE TABLE test.kafka (key UInt64, value String)
            ENGINE = Kafka
            SETTINGS kafka_broker_list = 'kafka1:19092',
                     kafka_topic_list = 'snappy_regression',
                     kafka_group_name = 'ch_snappy_regression',
                     kafka_format = 'JSONEachRow';
        ''')

    messages = []
    expected = []
    # To trigger this regression there should be duplicated messages.
    # Original reproducer is:
    #
    # $ gcc --version |& fgrep gcc
    # gcc (GCC) 10.2.0
    # $ yes foobarbaz | fold -w 80 | head -n10 >| in-…
    # $ make clean && make CFLAGS='-Wall -g -O2 -ftree-loop-vectorize -DNDEBUG=1 -DSG=1 -fPIC'
    # $ ./verify in
    # final comparision of in failed at 20 of 100
    value = 'foobarbaz'*10
    number_of_messages = 50
    for i in range(number_of_messages):
        messages.append(json.dumps({'key': i, 'value': value}))
        expected.append(f'{i}\t{value}')
    kafka_produce('snappy_regression', messages)

    expected = '\n'.join(expected)

    while True:
        result = instance.query('SELECT * FROM test.kafka')
        rows = len(result.strip('\n').split('\n'))
        print(rows)
        if rows == number_of_messages:
            break

    assert TSV(result) == TSV(expected)

    instance.query('DROP TABLE test.kafka')
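Not part of the commit: the snappy test above waits for all rows with a bare `while True` loop, bounded only by `@pytest.mark.timeout(180)`. A hedged sketch of the same wait with an explicit deadline follows; `fetch_row_count` is a hypothetical stand-in for the `instance.query(...)` call plus row counting, not part of the real test harness.

```python
# Sketch only: a bounded wait-for-rows helper equivalent to the while-True poll above.
import time

def wait_for_rows(fetch_row_count, expected_rows, timeout_s=120.0, poll_interval_s=0.5):
    """Poll fetch_row_count() until it reports expected_rows or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    rows = 0
    while time.monotonic() < deadline:
        rows = fetch_row_count()
        if rows >= expected_rows:
            return rows
        time.sleep(poll_interval_s)
    raise TimeoutError(f"only {rows} of {expected_rows} rows arrived within {timeout_s}s")
```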
@pytest.mark.timeout(180)
def test_kafka_materialized_view_with_subquery(kafka_cluster):
@ -306,7 +306,8 @@ def test_multipart_put(cluster, maybe_auth, positive):
        cluster.minio_redirect_host, cluster.minio_redirect_port, bucket, filename, maybe_auth, table_format)

    try:
        run_query(instance, put_query, stdin=csv_data, settings={'s3_min_upload_part_size': min_part_size_bytes})
        run_query(instance, put_query, stdin=csv_data, settings={'s3_min_upload_part_size': min_part_size_bytes,
                                                                 's3_max_single_part_upload_size': 0})
    except helpers.client.QueryRuntimeException:
        if positive:
            raise
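Not part of the commit: the extra `s3_max_single_part_upload_size: 0` setting in the hunk above presumably forces the multipart upload path, since with a zero threshold any payload counts as too large for a single-part upload. A toy, hypothetical illustration of that boundary condition (not ClickHouse code):

```python
# Toy illustration of why a zero single-part threshold forces multipart uploads.
def should_use_multipart(payload_size_bytes: int, max_single_part_bytes: int) -> bool:
    return payload_size_bytes > max_single_part_bytes

assert should_use_multipart(1, 0)        # threshold 0: even 1 byte goes multipart
assert not should_use_multipart(1, 64)   # normal threshold: small payloads stay single-part
```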
@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

DROP TABLE IF EXISTS merge_tree;
DROP TABLE IF EXISTS collapsing_merge_tree;
DROP TABLE IF EXISTS versioned_collapsing_merge_tree;

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

DROP TABLE IF EXISTS summing_composite_key;
CREATE TABLE summing_composite_key (d Date, k UInt64, FirstMap Nested(k1 UInt32, k2ID Int8, s Float64), SecondMap Nested(k1ID UInt64, k2Key String, k3Type Int32, s Int64)) ENGINE = SummingMergeTree(d, k, 1);

@ -42,13 +42,13 @@ $CLICKHOUSE_CLIENT -q "INSERT INTO $name (date, Sign, ki) SELECT
toDate(0) AS date,
toInt8(1) AS Sign,
toUInt64(0) AS ki
FROM system.numbers LIMIT 9000"
FROM system.numbers LIMIT 9000" --server_logs_file=/dev/null

$CLICKHOUSE_CLIENT -q "INSERT INTO $name (date, Sign, ki) SELECT
toDate(0) AS date,
toInt8(1) AS Sign,
number AS ki
FROM system.numbers LIMIT 9000, 9000"
FROM system.numbers LIMIT 9000, 9000" --server_logs_file=/dev/null

$CLICKHOUSE_CLIENT -q "INSERT INTO $name SELECT
toDate(0) AS date,

@ -67,7 +67,7 @@ number AS di09,
number AS di10,
[number, number+1] AS \`n.i\`,
[hex(number), hex(number+1)] AS \`n.s\`
FROM system.numbers LIMIT $res_rows"
FROM system.numbers LIMIT $res_rows" --server_logs_file=/dev/null

while [[ $(get_num_parts) -ne 1 ]] ; do $CLICKHOUSE_CLIENT -q "OPTIMIZE TABLE $name PARTITION 197001" --server_logs_file=/dev/null; done

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

SELECT '*** Replicated with sampling ***';

DROP TABLE IF EXISTS replicated_with_sampling;

@ -1,3 +1,5 @@
set optimize_on_insert = 0;

drop table if exists mult_tab;
create table mult_tab (date Date, value String, version UInt64, sign Int8) engine = VersionedCollapsingMergeTree(date, (date), 8192, sign, version);
insert into mult_tab select '2018-01-31', 'str_' || toString(number), 0, if(number % 2, 1, -1) from system.numbers limit 10;

@ -1,3 +1,5 @@
set optimize_on_insert = 0;

drop table if exists tab_00577;
create table tab_00577 (date Date, version UInt64, val UInt64) engine = ReplacingMergeTree(version) partition by date order by date settings enable_vertical_merge_algorithm = 1, vertical_merge_algorithm_min_rows_to_activate = 1, vertical_merge_algorithm_min_columns_to_activate = 0;
insert into tab_00577 values ('2018-01-01', 2, 2), ('2018-01-01', 1, 1);

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

DROP TABLE IF EXISTS test_00616;
DROP TABLE IF EXISTS replacing_00616;

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

DROP TABLE IF EXISTS partitioned_by_tuple;

CREATE TABLE partitioned_by_tuple (d Date, x UInt8, w String, y UInt8) ENGINE SummingMergeTree (y) PARTITION BY (d, x) ORDER BY (d, x, w);

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

DROP TABLE IF EXISTS partitioned_by_tuple_replica1_00661;
DROP TABLE IF EXISTS partitioned_by_tuple_replica2_00661;
CREATE TABLE partitioned_by_tuple_replica1_00661(d Date, x UInt8, w String, y UInt8) ENGINE = ReplicatedSummingMergeTree('/clickhouse/tables/test/partitioned_by_tuple_00661', '1') PARTITION BY (d, x) ORDER BY (d, x, w);

@ -1,4 +1,5 @@
SET send_logs_level = 'fatal';
SET optimize_on_insert = 0;

DROP TABLE IF EXISTS old_style;
CREATE TABLE old_style(d Date, x UInt32) ENGINE MergeTree(d, x, 8192);

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

SET send_logs_level = 'fatal';

DROP TABLE IF EXISTS old_style;

@ -1,3 +1,5 @@
SET optimize_on_insert = 0;

select '-- SummingMergeTree with Nullable column without duplicates.';

drop table if exists tst;
@ -0,0 +1 @@
1 String

@ -49,3 +49,7 @@ DROP DICTIONARY dict;
DROP TABLE test_01056_dict_data.dict_data;

DROP DATABASE test_01056_dict_data;

CREATE TABLE t1 (x String) ENGINE = Memory AS SELECT 1;
SELECT x, toTypeName(x) FROM t1;
DROP TABLE t1;