mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-22 23:52:03 +00:00
Merge remote-tracking branch 'rschu1ze/master' into substr-with-enums
Commit e6372d5528
@ -73,8 +73,3 @@ if (CMAKE_CROSSCOMPILING)

    message (STATUS "Cross-compiling for target: ${CMAKE_CXX_COMPILE_TARGET}")
endif ()

if (USE_MUSL)
    # Does not work for unknown reason
    set (ENABLE_RUST OFF CACHE INTERNAL "")
endif ()
@ -3,7 +3,7 @@ slug: /en/development/build-osx
sidebar_position: 65
sidebar_label: Build on macOS
title: How to Build ClickHouse on macOS
description: How to build ClickHouse on macOS
description: How to build ClickHouse on macOS for macOS
---

:::info You don't have to build ClickHouse yourself!
@ -7,42 +7,39 @@ description: Prerequisites and an overview of how to build ClickHouse

# Getting Started Guide for Building ClickHouse

The building of ClickHouse is supported on Linux, FreeBSD and macOS.
ClickHouse can be built on Linux, FreeBSD and macOS. If you use Windows, you can still build ClickHouse in a virtual machine running Linux, e.g. [VirtualBox](https://www.virtualbox.org/) with Ubuntu.

If you use Windows, you need to create a virtual machine with Ubuntu. To start working with a virtual machine please install VirtualBox. You can download Ubuntu from the website: https://www.ubuntu.com/#download. Please create a virtual machine from the downloaded image (you should reserve at least 4GB of RAM for it). To run a command-line terminal in Ubuntu, please locate a program containing the word “terminal” in its name (gnome-terminal, konsole etc.) or just press Ctrl+Alt+T.

ClickHouse cannot work or build on a 32-bit system. You should acquire access to a 64-bit system and you can continue reading.
ClickHouse requires a 64-bit system to compile and run; 32-bit systems do not work.

## Creating a Repository on GitHub {#creating-a-repository-on-github}

To start working with the ClickHouse repository you will need a GitHub account.
To start developing for ClickHouse you will need a [GitHub](https://github.com/) account. Please also generate an SSH key locally (if you don't have one already) and upload the public key to GitHub, as this is a prerequisite for contributing patches.

You probably already have one, but if you do not, please register at https://github.com. In case you do not have SSH keys, you should generate them and then upload them on GitHub. It is required for sending over your patches. It is also possible to use the same SSH keys that you use with any other SSH servers - probably you already have those.
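If you do not have an SSH key yet, one common way to generate one and print the public part to upload is (a sketch; the key type and comment are just typical defaults):

```sh
ssh-keygen -t ed25519 -C "you@example.com"   # writes ~/.ssh/id_ed25519 and ~/.ssh/id_ed25519.pub
cat ~/.ssh/id_ed25519.pub                    # paste this into GitHub -> Settings -> SSH and GPG keys
```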
Next, create a fork of the [ClickHouse repository](https://github.com/ClickHouse/ClickHouse/) in your personal account by clicking the "fork" button in the upper right corner.

Create a fork of the ClickHouse repository. To do that please click on the “fork” button in the upper right corner at https://github.com/ClickHouse/ClickHouse. It will fork your own copy of ClickHouse/ClickHouse to your account.
To contribute, e.g. a fix for an issue or a feature, please commit your changes to a branch in your fork, then create a "pull request" with the changes to the main repository.

The development process consists of first committing the intended changes into your fork of ClickHouse and then creating a “pull request” for these changes to be accepted into the main repository (ClickHouse/ClickHouse).
For working with Git repositories, please install `git`. On Ubuntu, run these commands in a terminal:

To work with Git repositories, please install `git`. To do that in Ubuntu you would run in the command line terminal:
```sh
sudo apt update
sudo apt install git
```

A brief manual on using Git can be found [here](https://education.github.com/git-cheat-sheet-education.pdf).
For a detailed manual on Git see [here](https://git-scm.com/book/en/v2).
A cheatsheet for using Git can be found [here](https://education.github.com/git-cheat-sheet-education.pdf). The detailed manual for Git is [here](https://git-scm.com/book/en/v2).

## Cloning a Repository to Your Development Machine {#cloning-a-repository-to-your-development-machine}

Next, you need to download the source files onto your working machine. This is called “to clone a repository” because it creates a local copy of the repository on your working machine.
First, download the source files to your working machine, i.e. clone the repository:

Run in your terminal:
```sh
git clone git@github.com:your_github_username/ClickHouse.git # replace placeholder with your GitHub user name
cd ClickHouse
```

This command creates a directory `ClickHouse/` containing the source code of ClickHouse. If you specify a custom checkout directory after the URL, it is important that this path does not contain whitespaces as it may lead to problems with the build later on.

This command will create a directory `ClickHouse/` containing the source code of ClickHouse. If you specify a custom checkout directory (after the URL), it is important that this path does not contain whitespaces as it may lead to problems with the build system.

To make library dependencies available for the build, the ClickHouse repository uses Git submodules, i.e. references to external repositories. These are not checked out by default. To do so, you can either
The ClickHouse repository uses Git submodules, i.e. references to external repositories (usually 3rd party libraries used by ClickHouse). These are not checked out by default. To do so, you can either

- run `git clone` with option `--recurse-submodules`,

@ -52,7 +49,7 @@ To make library dependencies available for the build, the ClickHouse repository

You can check the Git status with the command: `git submodule status`.
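For example, if the repository was cloned without `--recurse-submodules`, the submodules can still be fetched afterwards (a sketch; the `--jobs` value is only an illustrative degree of parallelism):

```sh
git submodule update --init --jobs 8   # fetch and check out all submodules
git submodule status                   # every submodule should now show a commit hash and a path
```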
If you get the following error message:

Permission denied (publickey).
fatal: Could not read from remote repository.

@ -60,7 +57,7 @@ If you get the following error message:
Please make sure you have the correct access rights
and the repository exists.

It generally means that the SSH keys for connecting to GitHub are missing. These keys are normally located in `~/.ssh`. For SSH keys to be accepted you need to upload them in the settings section of GitHub UI.
it generally means that the SSH keys for connecting to GitHub are missing. These keys are normally located in `~/.ssh`. For SSH keys to be accepted you need to upload them in GitHub's settings.

You can also clone the repository via the https protocol:

@ -74,12 +71,17 @@ You can also add original ClickHouse repo address to your local repository to pu

After successfully running this command you will be able to pull updates from the main ClickHouse repo by running `git pull upstream master`.

:::note
Instructions below assume you are building on Linux. If you are cross-compiling or building on macOS, please also check for operating system and architecture specific guides, such as building [on macOS for macOS](build-osx.md), [on Linux for macOS](build-cross-osx.md), [on Linux for Linux/RISC-V](build-cross-riscv.md) and so on.
:::

## Build System {#build-system}

ClickHouse uses CMake and Ninja for building.

- CMake - a meta-build system that can generate Ninja files (build tasks).
- Ninja - a smaller build system with a focus on speed, used to execute those cmake-generated tasks.

To install on Ubuntu, Debian or Mint run `sudo apt install cmake ninja-build`.
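As a minimal sketch of how the two tools are then used together (the `build` directory name, build type and target name are assumptions; the later build guides give the authoritative commands):

```sh
mkdir build
cmake -S . -B build -G Ninja -D CMAKE_BUILD_TYPE=RelWithDebInfo
ninja -C build clickhouse   # or plain `ninja -C build` to build all targets
```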
@ -1,13 +1,16 @@
---
slug: /en/engines/table-engines/special/distributed
sidebar_label: "Distributed"
sidebar_position: 10
sidebar_label: Distributed
slug: /en/engines/table-engines/special/distributed
---

# Distributed Table Engine

:::warning
To create a distributed table engine in the cloud, you can use the [remote and remoteSecure](../../../sql-reference/table-functions/remote) table functions. The `Distributed(...)` syntax cannot be used in ClickHouse Cloud.
:::

Tables with Distributed engine do not store any data of their own, but allow distributed query processing on multiple servers. Reading is automatically parallelized. During a read, the table indexes on remote servers are used, if there are any.
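For instance, a distributed read in ClickHouse Cloud would be written with the table function instead; the host name and credentials below are hypothetical and only illustrate the syntax:

```sql
SELECT count()
FROM remoteSecure('node1.example.com:9440', default.hits, 'some_user', 'some_password');
```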
## Creating a Table {#distributed-creating-a-table}

@ -22,6 +25,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
```

### From a Table {#distributed-from-a-table}

When the `Distributed` table is pointing to a table on the current server you can adopt that table's schema:

``` sql
@ -48,7 +52,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] AS [db2.]name2

Specifying the `sharding_key` is necessary for the following:

- For `INSERTs` into a distributed table (as the table engine needs the `sharding_key` to determine how to split the data). However, if the `insert_distributed_one_random_shard` setting is enabled, then `INSERTs` do not need the sharding key.
- For use with `optimize_skip_unused_shards`, as the `sharding_key` is necessary to determine what shards should be queried
#### policy_name

@ -122,9 +126,7 @@ SETTINGS
fsync_directories=0;
```

Data will be read from all servers in the `logs` cluster, from the `default.hits` table located on every server in the cluster. Data is not only read but is partially processed on the remote servers (to the extent that this is possible). For example, for a query with `GROUP BY`, data will be aggregated on remote servers, and the intermediate states of aggregate functions will be sent to the requestor server. Then data will be further aggregated.

Instead of the database name, you can use a constant expression that returns a string. For example: `currentDatabase()`.

@ -183,9 +185,7 @@ Clusters are configured in the [server configuration file](../../../operations/c
</remote_servers>
```

Here a cluster is defined with the name `logs` that consists of two shards, each of which contains two replicas. Shards refer to the servers that contain different parts of the data (in order to read all the data, you must access all the shards). Replicas are duplicating servers (in order to read all the data, you can access the data on any one of the replicas).

Cluster names must not contain dots.

@ -198,9 +198,7 @@ The parameters `host`, `port`, and optionally `user`, `password`, `secure`, `com
- `secure` - Whether to use a secure SSL/TLS connection. Usually also requires specifying the port (the default secure port is `9440`). The server should listen on `<tcp_port_secure>9440</tcp_port_secure>` and be configured with correct certificates.
- `compression` - Use data compression. Default value: `true`.

When specifying replicas, one of the available replicas will be selected for each of the shards when reading. You can configure the algorithm for load balancing (the preference for which replica to access) – see the [load_balancing](../../../operations/settings/settings.md#settings-load_balancing) setting. If the connection with the server is not established, there will be an attempt to connect with a short timeout. If the connection failed, the next replica will be selected, and so on for all the replicas. If the connection attempt failed for all the replicas, the attempt will be repeated the same way, several times. This works in favour of resiliency, but does not provide complete fault tolerance: a remote server might accept the connection, but might not work, or work poorly.

You can specify just one of the shards (in this case, query processing should be called remote, rather than distributed) or up to any number of shards. In each shard, you can specify from one to any number of replicas. You can specify a different number of replicas for each shard.
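Putting the parameters above together, a `logs` cluster with two shards of two replicas each looks roughly like this in the server configuration (a sketch; the host names are hypothetical and the complete reference snippet is the one elided by the hunk above):

```xml
<remote_servers>
    <logs>
        <shard>
            <replica><host>example01-01-1</host><port>9000</port></replica>
            <replica><host>example01-01-2</host><port>9000</port></replica>
        </shard>
        <shard>
            <replica><host>example01-02-1</host><port>9000</port></replica>
            <replica><host>example01-02-2</host><port>9000</port></replica>
        </shard>
    </logs>
</remote_servers>
```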
|
@ -393,40 +393,6 @@ Reverses the sequence of bytes in a string.
|
||||
|
||||
Reverses a sequence of Unicode code points in a string. Assumes that the string contains valid UTF-8 encoded text. If this assumption is violated, no exception is thrown and the result is undefined.
|
||||
|
||||
## format
|
||||
|
||||
Format the `pattern` string with the strings listed in the arguments, similar to formatting in Python. The pattern string can contain replacement fields surrounded by curly braces `{}`. Anything not contained in braces is considered literal text and copied verbatim into the output. Literal brace character can be escaped by two braces: `{{ '{{' }}` and `{{ '}}' }}`. Field names can be numbers (starting from zero) or empty (then they are implicitly given monotonically increasing numbers).
|
||||
|
||||
**Syntax**
|
||||
|
||||
```sql
|
||||
format(pattern, s0, s1, …)
|
||||
```
|
||||
|
||||
**Example**
|
||||
|
||||
``` sql
|
||||
SELECT format('{1} {0} {1}', 'World', 'Hello')
|
||||
```
|
||||
|
||||
```result
|
||||
┌─format('{1} {0} {1}', 'World', 'Hello')─┐
|
||||
│ Hello World Hello │
|
||||
└─────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
With implicit numbers:
|
||||
|
||||
``` sql
|
||||
SELECT format('{} {}', 'Hello', 'World')
|
||||
```
|
||||
|
||||
```result
|
||||
┌─format('{} {}', 'Hello', 'World')─┐
|
||||
│ Hello World │
|
||||
└───────────────────────────────────┘
|
||||
```
|
||||
|
||||
## concat
|
||||
|
||||
Concatenates the given arguments.
|
||||
|
@ -132,6 +132,40 @@ For more information, see [RE2](https://github.com/google/re2/blob/master/re2/re
|
||||
regexpQuoteMeta(s)
|
||||
```
|
||||
|
||||
## format
|
||||
|
||||
Format the `pattern` string with the values (strings, integers, etc.) listed in the arguments, similar to formatting in Python. The pattern string can contain replacement fields surrounded by curly braces `{}`. Anything not contained in braces is considered literal text and copied verbatim into the output. Literal brace character can be escaped by two braces: `{{ '{{' }}` and `{{ '}}' }}`. Field names can be numbers (starting from zero) or empty (then they are implicitly given monotonically increasing numbers).
|
||||
|
||||
**Syntax**
|
||||
|
||||
```sql
|
||||
format(pattern, s0, s1, …)
|
||||
```
|
||||
|
||||
**Example**
|
||||
|
||||
``` sql
|
||||
SELECT format('{1} {0} {1}', 'World', 'Hello')
|
||||
```
|
||||
|
||||
```result
|
||||
┌─format('{1} {0} {1}', 'World', 'Hello')─┐
|
||||
│ Hello World Hello │
|
||||
└─────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
With implicit numbers:
|
||||
|
||||
``` sql
|
||||
SELECT format('{} {}', 'Hello', 'World')
|
||||
```
|
||||
|
||||
```result
|
||||
┌─format('{} {}', 'Hello', 'World')─┐
|
||||
│ Hello World │
|
||||
└───────────────────────────────────┘
|
||||
```
|
||||
|
||||
## translate
|
||||
|
||||
Replaces characters in the string `s` using a one-to-one character mapping defined by `from` and `to` strings. `from` and `to` must be constant ASCII strings of the same size. Non-ASCII characters in the original string are not modified.
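A small illustration of the mapping (a sketch; the expected result is shown as a comment):

```sql
SELECT translate('Hello, World!', 'delor', 'DELOR');
-- returns 'HELLO, WORLD!': each of d, e, l, o, r is mapped to its uppercase counterpart
```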
@ -14,6 +14,7 @@
#include <Common/assertProcessUserMatchesDataOwner.h>
#include <Common/makeSocketAddress.h>
#include <Server/waitServersToFinish.h>
#include <base/getMemoryAmount.h>
#include <base/scope_guard.h>
#include <base/safeExit.h>
#include <Poco/Net/NetException.h>
@ -289,6 +290,33 @@ try
    if (!config().has("keeper_server"))
        throw Exception(ErrorCodes::NO_ELEMENTS_IN_CONFIG, "Keeper configuration (<keeper_server> section) not found in config");

    auto updateMemorySoftLimitInConfig = [&](Poco::Util::AbstractConfiguration & config)
    {
        UInt64 memory_soft_limit = 0;
        if (config.has("keeper_server.max_memory_usage_soft_limit"))
        {
            memory_soft_limit = config.getUInt64("keeper_server.max_memory_usage_soft_limit");
        }

        /// if memory soft limit is not set, we will use default value
        if (memory_soft_limit == 0)
        {
            Float64 ratio = 0.9;
            if (config.has("keeper_server.max_memory_usage_soft_limit_ratio"))
                ratio = config.getDouble("keeper_server.max_memory_usage_soft_limit_ratio");

            size_t physical_server_memory = getMemoryAmount();
            if (ratio > 0 && physical_server_memory > 0)
            {
                memory_soft_limit = static_cast<UInt64>(physical_server_memory * ratio);
                config.setUInt64("keeper_server.max_memory_usage_soft_limit", memory_soft_limit);
            }
        }
        LOG_INFO(log, "keeper_server.max_memory_usage_soft_limit is set to {}", formatReadableSizeWithBinarySuffix(memory_soft_limit));
    };

    updateMemorySoftLimitInConfig(config());

    std::string path;

    if (config().has("keeper_server.storage_path"))
@ -499,6 +527,8 @@ try
        {
            updateLevels(*config, logger());

            updateMemorySoftLimitInConfig(*config);

            if (config->has("keeper_server"))
                global_context->updateKeeperConfiguration(*config);
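For illustration, the two configuration keys read by `updateMemorySoftLimitInConfig` above correspond to a Keeper configuration fragment like the following (a sketch; the values are arbitrary and the surrounding configuration is omitted):

```xml
<keeper_server>
    <!-- absolute soft limit in bytes; used directly when non-zero -->
    <max_memory_usage_soft_limit>17179869184</max_memory_usage_soft_limit>
    <!-- otherwise the limit defaults to ratio * physical RAM (0.9 unless overridden) -->
    <max_memory_usage_soft_limit_ratio>0.9</max_memory_usage_soft_limit_ratio>
</keeper_server>
```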
@ -14,6 +14,10 @@ macro(configure_rustc)
    set(RUST_CFLAGS "${RUST_CFLAGS} --sysroot ${CMAKE_SYSROOT}")
endif()

if (USE_MUSL)
    set(RUST_CXXFLAGS "${RUST_CXXFLAGS} -D_LIBCPP_HAS_MUSL_LIBC=1")
endif ()

if(CCACHE_EXECUTABLE MATCHES "/sccache$")
    message(STATUS "Using RUSTC_WRAPPER: ${CCACHE_EXECUTABLE}")
    set(RUSTCWRAPPER "rustc-wrapper = \"${CCACHE_EXECUTABLE}\"")
@ -3,6 +3,8 @@
|
||||
#include <Common/Exception.h>
|
||||
#include <Common/CurrentMetrics.h>
|
||||
#include <Common/ProfileEvents.h>
|
||||
#include <IO/Operators.h>
|
||||
#include <IO/WriteBufferFromString.h>
|
||||
|
||||
|
||||
namespace ProfileEvents
|
||||
@ -155,25 +157,34 @@ RWLockImpl::getLock(RWLockImpl::Type type, const String & query_id, const std::c
|
||||
|
||||
if (type == Type::Write)
|
||||
{
|
||||
/// Always add a group for a writer (writes are never performed simultaneously).
|
||||
writers_queue.emplace_back(type); /// SM1: may throw (nothing to roll back)
|
||||
}
|
||||
else if (readers_queue.empty() ||
|
||||
(rdlock_owner == readers_queue.begin() && readers_queue.size() == 1 && !writers_queue.empty()))
|
||||
else
|
||||
{
|
||||
readers_queue.emplace_back(type); /// SM1: may throw (nothing to roll back)
|
||||
/// We don't always add a group to readers_queue here because multiple readers can use the same group.
|
||||
/// We can reuse the last group if the last group didn't get ownership yet,
|
||||
/// or even if it got ownership but there are no writers waiting in writers_queue.
|
||||
bool can_use_last_group = !readers_queue.empty() && (!readers_queue.back().ownership || writers_queue.empty());
|
||||
|
||||
if (!can_use_last_group)
|
||||
readers_queue.emplace_back(type); /// SM1: may throw (nothing to roll back)
|
||||
}
|
||||
|
||||
GroupsContainer::iterator it_group =
|
||||
(type == Type::Write) ? std::prev(writers_queue.end()) : std::prev(readers_queue.end());
|
||||
|
||||
/// Lock is free to acquire
|
||||
if (rdlock_owner == readers_queue.end() && wrlock_owner == writers_queue.end())
|
||||
{
|
||||
/// Set `rdlock_owner` or `wrlock_owner` and make it owner.
|
||||
(type == Read ? rdlock_owner : wrlock_owner) = it_group; /// SM2: nothrow
|
||||
grantOwnership(it_group);
|
||||
}
|
||||
else
|
||||
{
|
||||
/// Wait until our group becomes the lock owner
|
||||
const auto predicate = [&] () { return it_group == (type == Read ? rdlock_owner : wrlock_owner); };
|
||||
const auto predicate = [&] () { return it_group->ownership; };
|
||||
|
||||
if (lock_deadline_tp == std::chrono::time_point<std::chrono::steady_clock>::max())
|
||||
{
|
||||
@ -193,15 +204,20 @@ RWLockImpl::getLock(RWLockImpl::Type type, const String & query_id, const std::c
|
||||
/// Rollback(SM1): nothrow
|
||||
if (it_group->requests == 0)
|
||||
{
|
||||
/// When WRITE lock fails, we need to notify next read that is waiting,
|
||||
/// to avoid hanging the request, hence next=true.
|
||||
dropOwnerGroupAndPassOwnership(it_group, /* next= */ true);
|
||||
((type == Read) ? readers_queue : writers_queue).erase(it_group);
|
||||
}
|
||||
/// While we were waiting for this write lock (which has just failed) more readers could start waiting,
|
||||
/// we need to wake up them now.
|
||||
if ((rdlock_owner != readers_queue.end()) && writers_queue.empty())
|
||||
grantOwnershipToAllReaders();
|
||||
return nullptr;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Our group must be an owner here.
|
||||
chassert(it_group->ownership);
|
||||
|
||||
if (request_has_query_id)
|
||||
{
|
||||
try
|
||||
@ -216,7 +232,7 @@ RWLockImpl::getLock(RWLockImpl::Type type, const String & query_id, const std::c
|
||||
/// Methods std::list<>::emplace_back() and std::unordered_map<>::emplace() provide strong exception safety
|
||||
/// We only need to roll back the changes to these objects: owner_queries and the readers/writers queue
|
||||
if (it_group->requests == 0)
|
||||
dropOwnerGroupAndPassOwnership(it_group, /* next= */ false); /// Rollback(SM1): nothrow
|
||||
dropOwnerGroupAndPassOwnership(it_group); /// Rollback(SM1): nothrow
|
||||
|
||||
throw;
|
||||
}
|
||||
@ -237,19 +253,28 @@ RWLockImpl::getLock(RWLockImpl::Type type, const String & query_id, const std::c
|
||||
* it is guaranteed that all three steps have been executed successfully and the resulting state is consistent.
|
||||
* With the mutex locked the order of steps to restore the lock's state can be arbitrary
|
||||
*
|
||||
* We do not employ try-catch: if something bad happens, there is nothing we can do =(
|
||||
* We do not employ try-catch: if something bad happens and chassert() is disabled, there is nothing we can do
|
||||
* (we can't throw an exception here because RWLockImpl::unlock() is called from the destructor ~LockHolderImpl).
|
||||
*/
|
||||
void RWLockImpl::unlock(GroupsContainer::iterator group_it, const String & query_id) noexcept
|
||||
{
|
||||
std::lock_guard state_lock(internal_state_mtx);
|
||||
|
||||
/// All of these are Undefined behavior and nothing we can do!
|
||||
if (rdlock_owner == readers_queue.end() && wrlock_owner == writers_queue.end())
|
||||
/// Our group must be an owner here.
|
||||
if (!group_it->ownership)
|
||||
{
|
||||
chassert(false && "RWLockImpl::unlock() is called for a non-owner group");
|
||||
return;
|
||||
if (rdlock_owner != readers_queue.end() && group_it != rdlock_owner)
|
||||
return;
|
||||
if (wrlock_owner != writers_queue.end() && group_it != wrlock_owner)
|
||||
}
|
||||
|
||||
/// Check consistency.
|
||||
if ((group_it->type == Read)
|
||||
? !(rdlock_owner != readers_queue.end() && wrlock_owner == writers_queue.end())
|
||||
: !(wrlock_owner != writers_queue.end() && rdlock_owner == readers_queue.end() && group_it == wrlock_owner))
|
||||
{
|
||||
chassert(false && "RWLockImpl::unlock() found the rwlock inconsistent");
|
||||
return;
|
||||
}
|
||||
|
||||
/// If query_id is not empty it must be listed in parent->owner_queries
|
||||
if (query_id != NO_QUERY)
|
||||
@ -264,12 +289,26 @@ void RWLockImpl::unlock(GroupsContainer::iterator group_it, const String & query
|
||||
|
||||
/// If we are the last remaining referrer, remove this QNode and notify the next one
|
||||
if (--group_it->requests == 0) /// SM: nothrow
|
||||
dropOwnerGroupAndPassOwnership(group_it, /* next= */ false);
|
||||
dropOwnerGroupAndPassOwnership(group_it);
|
||||
}
|
||||
|
||||
|
||||
void RWLockImpl::dropOwnerGroupAndPassOwnership(GroupsContainer::iterator group_it, bool next) noexcept
|
||||
void RWLockImpl::dropOwnerGroupAndPassOwnership(GroupsContainer::iterator group_it) noexcept
|
||||
{
|
||||
/// All readers with ownership must finish before switching to write phase.
|
||||
/// Such readers has iterators from `readers_queue.begin()` to `rdlock_owner`, so if `rdlock_owner` is equal to `readers_queue.begin()`
|
||||
/// that means there is only one reader with ownership left in the readers_queue and we can proceed to generic procedure.
|
||||
if ((group_it->type == Read) && (rdlock_owner != readers_queue.begin()) && (rdlock_owner != readers_queue.end()))
|
||||
{
|
||||
if (rdlock_owner == group_it)
|
||||
--rdlock_owner;
|
||||
readers_queue.erase(group_it);
|
||||
/// If there are no writers waiting in writers_queue then we can wake up other readers.
|
||||
if (writers_queue.empty())
|
||||
grantOwnershipToAllReaders();
|
||||
return;
|
||||
}
|
||||
|
||||
rdlock_owner = readers_queue.end();
|
||||
wrlock_owner = writers_queue.end();
|
||||
|
||||
@ -278,42 +317,86 @@ void RWLockImpl::dropOwnerGroupAndPassOwnership(GroupsContainer::iterator group_
|
||||
readers_queue.erase(group_it);
|
||||
/// Prepare next phase
|
||||
if (!writers_queue.empty())
|
||||
{
|
||||
wrlock_owner = writers_queue.begin();
|
||||
}
|
||||
else
|
||||
{
|
||||
rdlock_owner = readers_queue.begin();
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
writers_queue.erase(group_it);
|
||||
/// Prepare next phase
|
||||
if (!readers_queue.empty())
|
||||
{
|
||||
if (next && readers_queue.size() > 1)
|
||||
{
|
||||
rdlock_owner = std::next(readers_queue.begin());
|
||||
}
|
||||
else
|
||||
{
|
||||
rdlock_owner = readers_queue.begin();
|
||||
}
|
||||
}
|
||||
rdlock_owner = readers_queue.begin();
|
||||
else
|
||||
{
|
||||
wrlock_owner = writers_queue.begin();
|
||||
}
|
||||
}
|
||||
|
||||
if (rdlock_owner != readers_queue.end())
|
||||
{
|
||||
rdlock_owner->cv.notify_all();
|
||||
grantOwnershipToAllReaders();
|
||||
}
|
||||
else if (wrlock_owner != writers_queue.end())
|
||||
{
|
||||
wrlock_owner->cv.notify_one();
|
||||
grantOwnership(wrlock_owner);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void RWLockImpl::grantOwnership(GroupsContainer::iterator group_it) noexcept
|
||||
{
|
||||
if (!group_it->ownership)
|
||||
{
|
||||
group_it->ownership = true;
|
||||
group_it->cv.notify_all();
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void RWLockImpl::grantOwnershipToAllReaders() noexcept
|
||||
{
|
||||
if (rdlock_owner != readers_queue.end())
|
||||
{
|
||||
size_t num_new_owners = 0;
|
||||
|
||||
for (;;)
|
||||
{
|
||||
if (!rdlock_owner->ownership)
|
||||
++num_new_owners;
|
||||
grantOwnership(rdlock_owner);
|
||||
if (std::next(rdlock_owner) == readers_queue.end())
|
||||
break;
|
||||
++rdlock_owner;
|
||||
}
|
||||
|
||||
/// There couldn't be more than one reader group which is not an owner.
|
||||
/// (Because we add a new reader group only if the last reader group is already an owner - see the `can_use_last_group` variable.)
|
||||
chassert(num_new_owners <= 1);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
std::unordered_map<String, size_t> RWLockImpl::getOwnerQueryIds() const
|
||||
{
|
||||
std::lock_guard lock{internal_state_mtx};
|
||||
return owner_queries;
|
||||
}
|
||||
|
||||
|
||||
String RWLockImpl::getOwnerQueryIdsDescription() const
|
||||
{
|
||||
auto map = getOwnerQueryIds();
|
||||
WriteBufferFromOwnString out;
|
||||
bool need_comma = false;
|
||||
for (const auto & [query_id, num_owners] : map)
|
||||
{
|
||||
if (need_comma)
|
||||
out << ", ";
|
||||
out << query_id;
|
||||
if (num_owners != 1)
|
||||
out << " (" << num_owners << ")";
|
||||
need_comma = true;
|
||||
}
|
||||
return out.str();
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -62,35 +62,42 @@ public:
|
||||
inline static const String NO_QUERY = String();
|
||||
inline static const auto default_locking_timeout_ms = std::chrono::milliseconds(120000);
|
||||
|
||||
/// Returns all query_id owning locks (both read and write) right now.
|
||||
/// !! This function are for debugging and logging purposes only, DO NOT use them for synchronization!
|
||||
std::unordered_map<String, size_t> getOwnerQueryIds() const;
|
||||
String getOwnerQueryIdsDescription() const;
|
||||
|
||||
private:
|
||||
/// Group of locking requests that should be granted simultaneously
|
||||
/// i.e. one or several readers or a single writer
|
||||
struct Group
|
||||
{
|
||||
const Type type;
|
||||
size_t requests;
|
||||
size_t requests = 0;
|
||||
|
||||
bool ownership = false; /// whether this group got ownership? (that means `cv` is notified and the locking requests should stop waiting)
|
||||
std::condition_variable cv; /// all locking requests of the group wait on this condvar
|
||||
|
||||
explicit Group(Type type_) : type{type_}, requests{0} {}
|
||||
explicit Group(Type type_) : type{type_} {}
|
||||
};
|
||||
|
||||
using GroupsContainer = std::list<Group>;
|
||||
using OwnerQueryIds = std::unordered_map<String, size_t>;
|
||||
using OwnerQueryIds = std::unordered_map<String /* query_id */, size_t /* num_owners */>;
|
||||
|
||||
mutable std::mutex internal_state_mtx;
|
||||
|
||||
GroupsContainer readers_queue;
|
||||
GroupsContainer writers_queue;
|
||||
GroupsContainer::iterator rdlock_owner{readers_queue.end()}; /// equals to readers_queue.begin() in read phase
|
||||
/// or readers_queue.end() otherwise
|
||||
GroupsContainer::iterator rdlock_owner{readers_queue.end()}; /// last group with ownership in readers_queue in read phase
|
||||
/// or readers_queue.end() in writer phase
|
||||
GroupsContainer::iterator wrlock_owner{writers_queue.end()}; /// equals to writers_queue.begin() in write phase
|
||||
/// or writers_queue.end() otherwise
|
||||
/// or writers_queue.end() in read phase
|
||||
OwnerQueryIds owner_queries;
|
||||
|
||||
RWLockImpl() = default;
|
||||
void unlock(GroupsContainer::iterator group_it, const String & query_id) noexcept;
|
||||
/// @param next - notify next after begin, used on writer lock failures
|
||||
void dropOwnerGroupAndPassOwnership(GroupsContainer::iterator group_it, bool next) noexcept;
|
||||
void dropOwnerGroupAndPassOwnership(GroupsContainer::iterator group_it) noexcept;
|
||||
void grantOwnership(GroupsContainer::iterator group_it) noexcept;
|
||||
void grantOwnershipToAllReaders() noexcept;
|
||||
};
|
||||
}
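The ownership flag introduced above does not change how the lock is consumed; a minimal usage sketch (the header path and query-id strings are assumptions, mirroring the unit tests below):

```cpp
#include <Common/RWLock.h>   // header path assumed for the RWLockImpl shown above
#include <chrono>

void rwLockUsageSketch()
{
    auto rw_lock = DB::RWLockImpl::create();

    {
        /// Shared (read) lock, released when the holder goes out of scope.
        auto read_holder = rw_lock->getLock(DB::RWLockImpl::Read, "query-1");
    }

    /// Exclusive (write) lock with a timeout: returns nullptr on timeout instead of throwing.
    auto write_holder = rw_lock->getLock(DB::RWLockImpl::Write, "query-2", std::chrono::milliseconds(200));
    if (!write_holder)
    {
        /// could not acquire the write lock within 200 ms
    }
}
```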
@ -24,6 +24,41 @@ namespace DB
|
||||
}
|
||||
|
||||
|
||||
namespace
|
||||
{
|
||||
class Events
|
||||
{
|
||||
public:
|
||||
Events() : start_time(std::chrono::steady_clock::now()) {}
|
||||
|
||||
void add(String && event, std::chrono::milliseconds correction = std::chrono::milliseconds::zero())
|
||||
{
|
||||
String timepoint = std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - start_time).count());
|
||||
if (timepoint.length() < 5)
|
||||
timepoint.insert(0, 5 - timepoint.length(), ' ');
|
||||
if (correction.count())
|
||||
std::this_thread::sleep_for(correction);
|
||||
std::lock_guard lock{mutex};
|
||||
//std::cout << timepoint << " : " << event << std::endl;
|
||||
events.emplace_back(std::move(event));
|
||||
}
|
||||
|
||||
void check(const Strings & expected_events)
|
||||
{
|
||||
std::lock_guard lock{mutex};
|
||||
EXPECT_EQ(events.size(), expected_events.size());
|
||||
for (size_t i = 0; i != events.size(); ++i)
|
||||
EXPECT_EQ(events[i], (i < expected_events.size() ? expected_events[i] : ""));
|
||||
}
|
||||
|
||||
private:
|
||||
const std::chrono::time_point<std::chrono::steady_clock> start_time;
|
||||
Strings events TSA_GUARDED_BY(mutex);
|
||||
mutable std::mutex mutex;
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
TEST(Common, RWLock1)
|
||||
{
|
||||
/// Tests with threads require this, because otherwise
|
||||
@ -287,3 +322,260 @@ TEST(Common, RWLockNotUpgradeableWithNoQuery)
|
||||
|
||||
read_thread.join();
|
||||
}
|
||||
|
||||
|
||||
TEST(Common, RWLockWriteLockTimeoutDuringRead)
|
||||
{
|
||||
/// 0 100 200 300 400
|
||||
/// <---------------------------------------- ra ---------------------------------------------->
|
||||
/// <----- wc (acquiring lock, failed by timeout) ----->
|
||||
/// <wd>
|
||||
///
|
||||
/// 0 : Locking ra
|
||||
/// 0 : Locked ra
|
||||
/// 100 : Locking wc
|
||||
/// 300 : Failed to lock wc
|
||||
/// 400 : Unlocking ra
|
||||
/// 400 : Unlocked ra
|
||||
/// 400 : Locking wd
|
||||
/// 400 : Locked wd
|
||||
/// 400 : Unlocking wd
|
||||
/// 400 : Unlocked wd
|
||||
|
||||
static auto rw_lock = RWLockImpl::create();
|
||||
Events events;
|
||||
|
||||
std::thread ra_thread([&] ()
|
||||
{
|
||||
events.add("Locking ra");
|
||||
auto ra = rw_lock->getLock(RWLockImpl::Read, "ra");
|
||||
events.add(ra ? "Locked ra" : "Failed to lock ra");
|
||||
EXPECT_NE(ra, nullptr);
|
||||
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(400));
|
||||
|
||||
events.add("Unlocking ra");
|
||||
ra.reset();
|
||||
events.add("Unlocked ra");
|
||||
});
|
||||
|
||||
std::thread wc_thread([&] ()
|
||||
{
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(100));
|
||||
events.add("Locking wc");
|
||||
auto wc = rw_lock->getLock(RWLockImpl::Write, "wc", std::chrono::milliseconds(200));
|
||||
events.add(wc ? "Locked wc" : "Failed to lock wc");
|
||||
EXPECT_EQ(wc, nullptr);
|
||||
});
|
||||
|
||||
ra_thread.join();
|
||||
wc_thread.join();
|
||||
|
||||
{
|
||||
events.add("Locking wd");
|
||||
auto wd = rw_lock->getLock(RWLockImpl::Write, "wd", std::chrono::milliseconds(1000));
|
||||
events.add(wd ? "Locked wd" : "Failed to lock wd");
|
||||
EXPECT_NE(wd, nullptr);
|
||||
events.add("Unlocking wd");
|
||||
wd.reset();
|
||||
events.add("Unlocked wd");
|
||||
}
|
||||
|
||||
events.check(
|
||||
{"Locking ra",
|
||||
"Locked ra",
|
||||
"Locking wc",
|
||||
"Failed to lock wc",
|
||||
"Unlocking ra",
|
||||
"Unlocked ra",
|
||||
"Locking wd",
|
||||
"Locked wd",
|
||||
"Unlocking wd",
|
||||
"Unlocked wd"});
|
||||
}
|
||||
|
||||
|
||||
TEST(Common, RWLockWriteLockTimeoutDuringTwoReads)
|
||||
{
|
||||
/// 0 100 200 300 400 500
|
||||
/// <---------------------------------------- ra ----------------------------------------------->
|
||||
/// <------ wc (acquiring lock, failed by timeout) ------->
|
||||
/// <-- rb (acquiring lock) --><---------- rb (locked) ------------>
|
||||
/// <wd>
|
||||
///
|
||||
/// 0 : Locking ra
|
||||
/// 0 : Locked ra
|
||||
/// 100 : Locking wc
|
||||
/// 200 : Locking rb
|
||||
/// 300 : Failed to lock wc
|
||||
/// 300 : Locked rb
|
||||
/// 400 : Unlocking ra
|
||||
/// 400 : Unlocked ra
|
||||
/// 500 : Unlocking rb
|
||||
/// 500 : Unlocked rb
|
||||
/// 501 : Locking wd
|
||||
/// 501 : Locked wd
|
||||
/// 501 : Unlocking wd
|
||||
/// 501 : Unlocked wd
|
||||
|
||||
static auto rw_lock = RWLockImpl::create();
|
||||
Events events;
|
||||
|
||||
std::thread ra_thread([&] ()
|
||||
{
|
||||
events.add("Locking ra");
|
||||
auto ra = rw_lock->getLock(RWLockImpl::Read, "ra");
|
||||
events.add(ra ? "Locked ra" : "Failed to lock ra");
|
||||
EXPECT_NE(ra, nullptr);
|
||||
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(400));
|
||||
|
||||
events.add("Unlocking ra");
|
||||
ra.reset();
|
||||
events.add("Unlocked ra");
|
||||
});
|
||||
|
||||
std::thread rb_thread([&] ()
|
||||
{
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(200));
|
||||
events.add("Locking rb");
|
||||
|
||||
auto rb = rw_lock->getLock(RWLockImpl::Read, "rb");
|
||||
|
||||
/// `correction` is used here to add an event to `events` a little later.
|
||||
/// (Because the event "Locked rb" happens at nearly the same time as "Failed to lock wc" and we don't want our test to be flaky.)
|
||||
auto correction = std::chrono::duration<int, std::milli>(50);
|
||||
events.add(rb ? "Locked rb" : "Failed to lock rb", correction);
|
||||
EXPECT_NE(rb, nullptr);
|
||||
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(200) - correction);
|
||||
events.add("Unlocking rb");
|
||||
rb.reset();
|
||||
events.add("Unlocked rb");
|
||||
});
|
||||
|
||||
std::thread wc_thread([&] ()
|
||||
{
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(100));
|
||||
events.add("Locking wc");
|
||||
auto wc = rw_lock->getLock(RWLockImpl::Write, "wc", std::chrono::milliseconds(200));
|
||||
events.add(wc ? "Locked wc" : "Failed to lock wc");
|
||||
EXPECT_EQ(wc, nullptr);
|
||||
});
|
||||
|
||||
ra_thread.join();
|
||||
rb_thread.join();
|
||||
wc_thread.join();
|
||||
|
||||
{
|
||||
events.add("Locking wd");
|
||||
auto wd = rw_lock->getLock(RWLockImpl::Write, "wd", std::chrono::milliseconds(1000));
|
||||
events.add(wd ? "Locked wd" : "Failed to lock wd");
|
||||
EXPECT_NE(wd, nullptr);
|
||||
events.add("Unlocking wd");
|
||||
wd.reset();
|
||||
events.add("Unlocked wd");
|
||||
}
|
||||
|
||||
events.check(
|
||||
{"Locking ra",
|
||||
"Locked ra",
|
||||
"Locking wc",
|
||||
"Locking rb",
|
||||
"Failed to lock wc",
|
||||
"Locked rb",
|
||||
"Unlocking ra",
|
||||
"Unlocked ra",
|
||||
"Unlocking rb",
|
||||
"Unlocked rb",
|
||||
"Locking wd",
|
||||
"Locked wd",
|
||||
"Unlocking wd",
|
||||
"Unlocked wd"});
|
||||
}
|
||||
|
||||
|
||||
TEST(Common, RWLockWriteLockTimeoutDuringWriteWithWaitingRead)
|
||||
{
|
||||
/// 0 100 200 300 400 500
|
||||
/// <--------------------------------------------------- wa -------------------------------------------------------->
|
||||
/// <------ wb (acquiring lock, failed by timeout) ------>
|
||||
/// <-- rc (acquiring lock, failed by timeout) -->
|
||||
/// <wd>
|
||||
///
|
||||
/// 0 : Locking wa
|
||||
/// 0 : Locked wa
|
||||
/// 100 : Locking wb
|
||||
/// 200 : Locking rc
|
||||
/// 300 : Failed to lock wb
|
||||
/// 400 : Failed to lock rc
|
||||
/// 500 : Unlocking wa
|
||||
/// 500 : Unlocked wa
|
||||
/// 501 : Locking wd
|
||||
/// 501 : Locked wd
|
||||
/// 501 : Unlocking wd
|
||||
/// 501 : Unlocked wd
|
||||
|
||||
static auto rw_lock = RWLockImpl::create();
|
||||
Events events;
|
||||
|
||||
std::thread wa_thread([&] ()
|
||||
{
|
||||
events.add("Locking wa");
|
||||
auto wa = rw_lock->getLock(RWLockImpl::Write, "wa");
|
||||
events.add(wa ? "Locked wa" : "Failed to lock wa");
|
||||
EXPECT_NE(wa, nullptr);
|
||||
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(500));
|
||||
|
||||
events.add("Unlocking wa");
|
||||
wa.reset();
|
||||
events.add("Unlocked wa");
|
||||
});
|
||||
|
||||
std::thread wb_thread([&] ()
|
||||
{
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(100));
|
||||
events.add("Locking wb");
|
||||
auto wc = rw_lock->getLock(RWLockImpl::Write, "wc", std::chrono::milliseconds(200));
|
||||
events.add(wc ? "Locked wb" : "Failed to lock wb");
|
||||
EXPECT_EQ(wc, nullptr);
|
||||
});
|
||||
|
||||
std::thread rc_thread([&] ()
|
||||
{
|
||||
std::this_thread::sleep_for(std::chrono::duration<int, std::milli>(200));
|
||||
events.add("Locking rc");
|
||||
auto rc = rw_lock->getLock(RWLockImpl::Read, "rc", std::chrono::milliseconds(200));
|
||||
events.add(rc ? "Locked rc" : "Failed to lock rc");
|
||||
EXPECT_EQ(rc, nullptr);
|
||||
});
|
||||
|
||||
wa_thread.join();
|
||||
wb_thread.join();
|
||||
rc_thread.join();
|
||||
|
||||
{
|
||||
events.add("Locking wd");
|
||||
auto wd = rw_lock->getLock(RWLockImpl::Write, "wd", std::chrono::milliseconds(1000));
|
||||
events.add(wd ? "Locked wd" : "Failed to lock wd");
|
||||
EXPECT_NE(wd, nullptr);
|
||||
events.add("Unlocking wd");
|
||||
wd.reset();
|
||||
events.add("Unlocked wd");
|
||||
}
|
||||
|
||||
events.check(
|
||||
{"Locking wa",
|
||||
"Locked wa",
|
||||
"Locking wb",
|
||||
"Locking rc",
|
||||
"Failed to lock wb",
|
||||
"Failed to lock rc",
|
||||
"Unlocking wa",
|
||||
"Unlocked wa",
|
||||
"Locking wd",
|
||||
"Locked wd",
|
||||
"Unlocking wd",
|
||||
"Unlocked wd"});
|
||||
}
|
||||
|
@ -43,7 +43,6 @@ struct Settings;
|
||||
M(UInt64, max_requests_batch_bytes_size, 100*1024, "Max size in bytes of batch of requests that can be sent to RAFT", 0) \
|
||||
M(UInt64, max_flush_batch_size, 1000, "Max size of batch of requests that can be flushed together", 0) \
|
||||
M(UInt64, max_requests_quick_batch_size, 100, "Max size of batch of requests to try to get before proceeding with RAFT. Keeper will not wait for requests but take only requests that are already in queue" , 0) \
|
||||
M(UInt64, max_memory_usage_soft_limit, 0, "Soft limit in bytes of keeper memory usage", 0) \
|
||||
M(Bool, quorum_reads, false, "Execute read requests as writes through whole RAFT consensus with similar speed", 0) \
|
||||
M(Bool, force_sync, true, "Call fsync on each change in RAFT changelog", 0) \
|
||||
M(Bool, compress_logs, false, "Write compressed coordination logs in ZSTD format", 0) \
|
||||
|
@ -59,6 +59,8 @@ void KeeperContext::initialize(const Poco::Util::AbstractConfiguration & config,
|
||||
}
|
||||
}
|
||||
|
||||
updateKeeperMemorySoftLimit(config);
|
||||
|
||||
digest_enabled = config.getBool("keeper_server.digest_enabled", false);
|
||||
ignore_system_path_on_startup = config.getBool("keeper_server.ignore_system_path_on_startup", false);
|
||||
|
||||
@ -375,4 +377,10 @@ void KeeperContext::initializeFeatureFlags(const Poco::Util::AbstractConfigurati
|
||||
feature_flags.logFlags(&Poco::Logger::get("KeeperContext"));
|
||||
}
|
||||
|
||||
void KeeperContext::updateKeeperMemorySoftLimit(const Poco::Util::AbstractConfiguration & config)
|
||||
{
|
||||
if (config.hasProperty("keeper_server.max_memory_usage_soft_limit"))
|
||||
memory_soft_limit = config.getUInt64("keeper_server.max_memory_usage_soft_limit");
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -53,6 +53,9 @@ public:
|
||||
|
||||
constexpr KeeperDispatcher * getDispatcher() const { return dispatcher; }
|
||||
|
||||
UInt64 getKeeperMemorySoftLimit() const { return memory_soft_limit; }
|
||||
void updateKeeperMemorySoftLimit(const Poco::Util::AbstractConfiguration & config);
|
||||
|
||||
/// set to true when we have preprocessed or committed all the logs
|
||||
/// that were already present locally during startup
|
||||
std::atomic<bool> local_logs_preprocessed = false;
|
||||
@ -92,6 +95,8 @@ private:
|
||||
|
||||
KeeperFeatureFlags feature_flags;
|
||||
KeeperDispatcher * dispatcher{nullptr};
|
||||
|
||||
std::atomic<UInt64> memory_soft_limit = 0;
|
||||
};
|
||||
|
||||
using KeeperContextPtr = std::shared_ptr<KeeperContext>;
|
||||
|
@ -143,7 +143,7 @@ void KeeperDispatcher::requestThread()
|
||||
if (shutdown_called)
|
||||
break;
|
||||
|
||||
Int64 mem_soft_limit = configuration_and_settings->coordination_settings->max_memory_usage_soft_limit;
|
||||
Int64 mem_soft_limit = keeper_context->getKeeperMemorySoftLimit();
|
||||
if (configuration_and_settings->standalone_keeper && mem_soft_limit > 0 && total_memory_tracker.get() >= mem_soft_limit && checkIfRequestIncreaseMem(request.request))
|
||||
{
|
||||
LOG_TRACE(log, "Processing requests refused because of max_memory_usage_soft_limit {}, the total used memory is {}, request type is {}", mem_soft_limit, total_memory_tracker.get(), request.request->getOpNum());
|
||||
@ -930,6 +930,8 @@ void KeeperDispatcher::updateConfiguration(const Poco::Util::AbstractConfigurati
|
||||
throw Exception(ErrorCodes::SYSTEM_ERROR, "Cannot push configuration update to queue");
|
||||
|
||||
snapshot_s3.updateS3Configuration(config, macros);
|
||||
|
||||
keeper_context->updateKeeperMemorySoftLimit(config);
|
||||
}
|
||||
|
||||
void KeeperDispatcher::updateKeeperStatLatency(uint64_t process_time_ms)
|
||||
|
@ -20,6 +20,11 @@ namespace ErrorCodes
|
||||
extern const int UNKNOWN_TYPE;
|
||||
}
|
||||
|
||||
ExternalResultDescription::ExternalResultDescription(const Block & sample_block_)
|
||||
{
|
||||
init(sample_block_);
|
||||
}
|
||||
|
||||
void ExternalResultDescription::init(const Block & sample_block_)
|
||||
{
|
||||
sample_block = sample_block_;
|
||||
|
@ -41,6 +41,9 @@ struct ExternalResultDescription
|
||||
Block sample_block;
|
||||
std::vector<std::pair<ValueType, bool /* is_nullable */>> types;
|
||||
|
||||
ExternalResultDescription() = default;
|
||||
explicit ExternalResultDescription(const Block & sample_block_);
|
||||
|
||||
void init(const Block & sample_block_);
|
||||
};
|
||||
|
||||
|
@ -36,7 +36,7 @@ void insertDefaultPostgreSQLValue(IColumn & column, const IColumn & sample_colum
|
||||
void insertPostgreSQLValue(
|
||||
IColumn & column, std::string_view value,
|
||||
const ExternalResultDescription::ValueType type, const DataTypePtr data_type,
|
||||
std::unordered_map<size_t, PostgreSQLArrayInfo> & array_info, size_t idx)
|
||||
const std::unordered_map<size_t, PostgreSQLArrayInfo> & array_info, size_t idx)
|
||||
{
|
||||
switch (type)
|
||||
{
|
||||
@ -125,8 +125,8 @@ void insertPostgreSQLValue(
|
||||
pqxx::array_parser parser{value};
|
||||
std::pair<pqxx::array_parser::juncture, std::string> parsed = parser.get_next();
|
||||
|
||||
size_t dimension = 0, max_dimension = 0, expected_dimensions = array_info[idx].num_dimensions;
|
||||
const auto parse_value = array_info[idx].pqxx_parser;
|
||||
size_t dimension = 0, max_dimension = 0, expected_dimensions = array_info.at(idx).num_dimensions;
|
||||
const auto parse_value = array_info.at(idx).pqxx_parser;
|
||||
std::vector<Row> dimensions(expected_dimensions + 1);
|
||||
|
||||
while (parsed.first != pqxx::array_parser::juncture::done)
|
||||
@ -138,7 +138,7 @@ void insertPostgreSQLValue(
|
||||
dimensions[dimension].emplace_back(parse_value(parsed.second));
|
||||
|
||||
else if (parsed.first == pqxx::array_parser::juncture::null_value)
|
||||
dimensions[dimension].emplace_back(array_info[idx].default_value);
|
||||
dimensions[dimension].emplace_back(array_info.at(idx).default_value);
|
||||
|
||||
else if (parsed.first == pqxx::array_parser::juncture::row_end)
|
||||
{
|
||||
|
@ -23,7 +23,7 @@ struct PostgreSQLArrayInfo
|
||||
void insertPostgreSQLValue(
|
||||
IColumn & column, std::string_view value,
|
||||
const ExternalResultDescription::ValueType type, const DataTypePtr data_type,
|
||||
std::unordered_map<size_t, PostgreSQLArrayInfo> & array_info, size_t idx);
|
||||
const std::unordered_map<size_t, PostgreSQLArrayInfo> & array_info, size_t idx);
|
||||
|
||||
void preparePostgreSQLArrayInfo(
|
||||
std::unordered_map<size_t, PostgreSQLArrayInfo> & array_info, size_t column_idx, const DataTypePtr data_type);
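The switch from `operator[]` to `.at()` is what allows `array_info` to be taken by const reference: `operator[]` may insert a default-constructed element, so it is not available on a `const` map. A tiny standalone illustration of the language rule (names invented for the example):

```cpp
#include <string>
#include <unordered_map>

size_t lookupLength(const std::unordered_map<size_t, std::string> & info, size_t idx)
{
    // return info[idx].size();   // does not compile: operator[] is non-const (it may insert)
    return info.at(idx).size();   // const-friendly; throws std::out_of_range if idx is absent
}
```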
@ -152,6 +152,9 @@ template <int UNROLL_TIMES>
|
||||
static NO_INLINE void deserializeBinarySSE2(ColumnString::Chars & data, ColumnString::Offsets & offsets, ReadBuffer & istr, size_t limit)
|
||||
{
|
||||
size_t offset = data.size();
|
||||
/// Avoiding calling resize in a loop improves the performance.
|
||||
data.resize(std::max(data.capacity(), static_cast<size_t>(4096)));
|
||||
|
||||
for (size_t i = 0; i < limit; ++i)
|
||||
{
|
||||
if (istr.eof())
|
||||
@ -171,7 +174,8 @@ static NO_INLINE void deserializeBinarySSE2(ColumnString::Chars & data, ColumnSt
|
||||
offset += size + 1;
|
||||
offsets.push_back(offset);
|
||||
|
||||
data.resize(offset);
|
||||
if (unlikely(offset > data.size()))
|
||||
data.resize(roundUpToPowerOfTwoOrZero(std::max(offset, data.size() * 2)));
|
||||
|
||||
if (size)
|
||||
{
|
||||
@ -203,6 +207,8 @@ static NO_INLINE void deserializeBinarySSE2(ColumnString::Chars & data, ColumnSt
|
||||
|
||||
data[offset - 1] = 0;
|
||||
}
|
||||
|
||||
data.resize(offset);
|
||||
}
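The change above replaces a per-string `data.resize(offset)` with geometric growth plus a single final trim, so the number of `resize` calls is logarithmic in the output size. A stripped-down sketch of the same idea (plain `std::vector`, no SSE, names invented for illustration):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

void appendAll(std::vector<char> & data, const std::vector<std::pair<const char *, size_t>> & chunks)
{
    size_t offset = data.size();
    data.resize(std::max<size_t>(data.capacity(), 4096));   /// start from a generously sized buffer

    for (const auto & [ptr, size] : chunks)
    {
        if (offset + size > data.size())
            data.resize(std::max(offset + size, data.size() * 2));   /// geometric growth, amortized O(1) per byte
        std::copy(ptr, ptr + size, data.begin() + offset);
        offset += size;
    }

    data.resize(offset);   /// trim to the bytes actually written
}
```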
@ -25,6 +25,7 @@ namespace ErrorCodes
|
||||
{
|
||||
extern const int UNKNOWN_TABLE;
|
||||
extern const int BAD_ARGUMENTS;
|
||||
extern const int LOGICAL_ERROR;
|
||||
}
|
||||
|
||||
|
||||
@ -186,20 +187,25 @@ PostgreSQLTableStructure::ColumnsInfoPtr readNamesAndTypesList(
|
||||
}
|
||||
else
|
||||
{
|
||||
std::tuple<std::string, std::string, std::string, uint16_t, std::string, std::string> row;
|
||||
std::tuple<std::string, std::string, std::string, uint16_t, std::string, std::string, std::string> row;
|
||||
while (stream >> row)
|
||||
{
|
||||
auto data_type = convertPostgreSQLDataType(
|
||||
const auto column_name = std::get<0>(row);
|
||||
const auto data_type = convertPostgreSQLDataType(
|
||||
std::get<1>(row), recheck_array,
|
||||
use_nulls && (std::get<2>(row) == /* not nullable */"f"),
|
||||
std::get<3>(row));
|
||||
|
||||
columns.push_back(NameAndTypePair(std::get<0>(row), data_type));
|
||||
columns.push_back(NameAndTypePair(column_name, data_type));
|
||||
auto attgenerated = std::get<6>(row);
|
||||
LOG_TEST(&Poco::Logger::get("kssenii"), "KSSENII: attgenerated: {}", attgenerated);
|
||||
|
||||
attributes.emplace_back(
|
||||
PostgreSQLTableStructure::PGAttribute{
|
||||
.atttypid = parse<int>(std::get<4>(row)),
|
||||
.atttypmod = parse<int>(std::get<5>(row)),
|
||||
attributes.emplace(
|
||||
column_name,
|
||||
PostgreSQLTableStructure::PGAttribute{
|
||||
.atttypid = parse<int>(std::get<4>(row)),
|
||||
.atttypmod = parse<int>(std::get<5>(row)),
|
||||
.attgenerated = attgenerated.empty() ? char{} : char(attgenerated[0])
|
||||
});
|
||||
|
||||
++i;
|
||||
@ -255,14 +261,19 @@ PostgreSQLTableStructure fetchPostgreSQLTableStructure(
|
||||
PostgreSQLTableStructure table;
|
||||
|
||||
auto where = fmt::format("relname = {}", quoteString(postgres_table));
|
||||
if (postgres_schema.empty())
|
||||
where += " AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')";
|
||||
else
|
||||
where += fmt::format(" AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = {})", quoteString(postgres_schema));
|
||||
|
||||
where += postgres_schema.empty()
|
||||
? " AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')"
|
||||
: fmt::format(" AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = {})", quoteString(postgres_schema));
|
||||
|
||||
std::string query = fmt::format(
|
||||
"SELECT attname AS name, format_type(atttypid, atttypmod) AS type, "
|
||||
"attnotnull AS not_null, attndims AS dims, atttypid as type_id, atttypmod as type_modifier "
|
||||
"SELECT attname AS name, " /// column name
|
||||
"format_type(atttypid, atttypmod) AS type, " /// data type
|
||||
"attnotnull AS not_null, " /// is nullable
|
||||
"attndims AS dims, " /// array dimensions
|
||||
"atttypid as type_id, "
|
||||
"atttypmod as type_modifier, "
|
||||
"attgenerated as generated " /// if column has GENERATED
|
||||
"FROM pg_attribute "
|
||||
"WHERE attrelid = (SELECT oid FROM pg_class WHERE {}) "
|
||||
"AND NOT attisdropped AND attnum > 0 "
|
||||
@ -274,11 +285,44 @@ PostgreSQLTableStructure fetchPostgreSQLTableStructure(
|
||||
if (!table.physical_columns)
|
||||
throw Exception(ErrorCodes::UNKNOWN_TABLE, "PostgreSQL table {} does not exist", postgres_table_with_schema);
|
||||
|
||||
for (const auto & column : table.physical_columns->columns)
|
||||
{
|
||||
table.physical_columns->names.push_back(column.name);
|
||||
}
|
||||
|
||||
bool check_generated = table.physical_columns->attributes.end() != std::find_if(
|
||||
table.physical_columns->attributes.begin(),
|
||||
table.physical_columns->attributes.end(),
|
||||
[](const auto & attr){ return attr.second.attgenerated == 's'; });
|
||||
|
||||
if (check_generated)
|
||||
{
|
||||
std::string attrdef_query = fmt::format(
|
||||
"SELECT adnum, pg_get_expr(adbin, adrelid) as generated_expression "
|
||||
"FROM pg_attrdef "
|
||||
"WHERE adrelid = (SELECT oid FROM pg_class WHERE {});", where);
|
||||
|
||||
pqxx::result result{tx.exec(attrdef_query)};
|
||||
for (const auto row : result)
|
||||
{
|
||||
size_t adnum = row[0].as<int>();
|
||||
if (!adnum || adnum > table.physical_columns->names.size())
|
||||
{
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR,
|
||||
"Received adnum {}, but currently fetched columns list has {} columns",
|
||||
adnum, table.physical_columns->attributes.size());
|
||||
}
|
||||
const auto column_name = table.physical_columns->names[adnum - 1];
|
||||
table.physical_columns->attributes.at(column_name).attr_def = row[1].as<std::string>();
|
||||
}
|
||||
}
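For context, this is the kind of PostgreSQL column the new `attgenerated` / `pg_attrdef` handling targets (a hypothetical table, shown only to illustrate what the two catalog queries return):

```sql
-- hypothetical table on the PostgreSQL side
CREATE TABLE measurements
(
    raw_value integer NOT NULL,
    doubled   integer GENERATED ALWAYS AS (raw_value * 2) STORED
);

-- pg_attribute.attgenerated is 's' (stored) for "doubled", and
-- pg_get_expr(adbin, adrelid) from pg_attrdef yields its expression text: (raw_value * 2)
```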
if (with_primary_key)
|
||||
{
|
||||
/// wiki.postgresql.org/wiki/Retrieve_primary_key_columns
|
||||
query = fmt::format(
|
||||
"SELECT a.attname, format_type(a.atttypid, a.atttypmod) AS data_type "
|
||||
"SELECT a.attname, " /// column name
|
||||
"format_type(a.atttypid, a.atttypmod) AS data_type " /// data type
|
||||
"FROM pg_index i "
|
||||
"JOIN pg_attribute a ON a.attrelid = i.indrelid "
|
||||
"AND a.attnum = ANY(i.indkey) "
|
||||
|
@ -16,13 +16,17 @@ struct PostgreSQLTableStructure
|
||||
{
|
||||
Int32 atttypid;
|
||||
Int32 atttypmod;
|
||||
bool atthasdef;
|
||||
char attgenerated;
|
||||
std::string attr_def;
|
||||
};
|
||||
using Attributes = std::vector<PGAttribute>;
|
||||
using Attributes = std::unordered_map<std::string, PGAttribute>;
|
||||
|
||||
struct ColumnsInfo
|
||||
{
|
||||
NamesAndTypesList columns;
|
||||
Attributes attributes;
|
||||
std::vector<std::string> names;
|
||||
ColumnsInfo(NamesAndTypesList && columns_, Attributes && attributes_) : columns(columns_), attributes(attributes_) {}
|
||||
};
|
||||
using ColumnsInfoPtr = std::shared_ptr<ColumnsInfo>;
|
||||
|
@@ -347,7 +347,13 @@ InputFormatPtr FormatFactory::getInput(
if (owned_buf)
format->addBuffer(std::move(owned_buf));
if (!settings.input_format_record_errors_file_path.toString().empty())
format->setErrorsLogger(std::make_shared<ParallelInputFormatErrorsLogger>(context));
{
if (parallel_parsing)
format->setErrorsLogger(std::make_shared<ParallelInputFormatErrorsLogger>(context));
else
format->setErrorsLogger(std::make_shared<InputFormatErrorsLogger>(context));
}

/// It's a kludge. Because I cannot remove context from values format.
/// (Not needed in the parallel_parsing case above because VALUES format doesn't support it.)

@ -564,6 +564,15 @@ namespace JSONUtils
|
||||
skipWhitespaceIfAny(in);
|
||||
}
|
||||
|
||||
bool checkAndSkipColon(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
if (!checkChar(':', in))
|
||||
return false;
|
||||
skipWhitespaceIfAny(in);
|
||||
return true;
|
||||
}
|
||||
|
||||
String readFieldName(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
@ -573,6 +582,12 @@ namespace JSONUtils
|
||||
return field;
|
||||
}
|
||||
|
||||
bool tryReadFieldName(ReadBuffer & in, String & field)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
return tryReadJSONStringInto(field, in) && checkAndSkipColon(in);
|
||||
}
|
||||
|
||||
String readStringField(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
@ -582,6 +597,15 @@ namespace JSONUtils
|
||||
return value;
|
||||
}
|
||||
|
||||
bool tryReadStringField(ReadBuffer & in, String & value)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
if (!tryReadJSONStringInto(value, in))
|
||||
return false;
|
||||
skipWhitespaceIfAny(in);
|
||||
return true;
|
||||
}
|
||||
|
||||
void skipArrayStart(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
@ -628,6 +652,15 @@ namespace JSONUtils
|
||||
skipWhitespaceIfAny(in);
|
||||
}
|
||||
|
||||
bool checkAndSkipObjectStart(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
if (!checkChar('{', in))
|
||||
return false;
|
||||
skipWhitespaceIfAny(in);
|
||||
return true;
|
||||
}
|
||||
|
||||
bool checkAndSkipObjectEnd(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
@ -644,6 +677,15 @@ namespace JSONUtils
|
||||
skipWhitespaceIfAny(in);
|
||||
}
|
||||
|
||||
bool checkAndSkipComma(ReadBuffer & in)
|
||||
{
|
||||
skipWhitespaceIfAny(in);
|
||||
if (!checkChar(',', in))
|
||||
return false;
|
||||
skipWhitespaceIfAny(in);
|
||||
return true;
|
||||
}
|
||||
|
||||
std::pair<String, String> readStringFieldNameAndValue(ReadBuffer & in)
|
||||
{
|
||||
auto field_name = readFieldName(in);
|
||||
@ -651,6 +693,11 @@ namespace JSONUtils
|
||||
return {field_name, field_value};
|
||||
}
|
||||
|
||||
bool tryReadStringFieldNameAndValue(ReadBuffer & in, std::pair<String, String> & field_and_value)
|
||||
{
|
||||
return tryReadFieldName(in, field_and_value.first) && tryReadStringField(in, field_and_value.second);
|
||||
}
|
||||
|
||||
NameAndTypePair readObjectWithNameAndType(ReadBuffer & in)
|
||||
{
|
||||
skipObjectStart(in);
|
||||
@ -673,6 +720,44 @@ namespace JSONUtils
|
||||
return name_and_type;
|
||||
}
|
||||
|
||||
bool tryReadObjectWithNameAndType(ReadBuffer & in, NameAndTypePair & name_and_type)
|
||||
{
|
||||
if (!checkAndSkipObjectStart(in))
|
||||
return false;
|
||||
|
||||
std::pair<String, String> first_field_and_value;
|
||||
if (!tryReadStringFieldNameAndValue(in, first_field_and_value))
|
||||
return false;
|
||||
|
||||
if (!checkAndSkipComma(in))
|
||||
return false;
|
||||
|
||||
std::pair<String, String> second_field_and_value;
|
||||
if (!tryReadStringFieldNameAndValue(in, second_field_and_value))
|
||||
return false;
|
||||
|
||||
if (first_field_and_value.first == "name" && second_field_and_value.first == "type")
|
||||
{
|
||||
auto type = DataTypeFactory::instance().tryGet(second_field_and_value.second);
|
||||
if (!type)
|
||||
return false;
|
||||
name_and_type = {first_field_and_value.second, type};
|
||||
}
|
||||
else if (second_field_and_value.first == "name" && first_field_and_value.first == "type")
|
||||
{
|
||||
auto type = DataTypeFactory::instance().tryGet(first_field_and_value.second);
|
||||
if (!type)
|
||||
return false;
|
||||
name_and_type = {second_field_and_value.second, type};
|
||||
}
|
||||
else
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
return checkAndSkipObjectEnd(in);
|
||||
}
|
||||
|
||||
NamesAndTypesList readMetadata(ReadBuffer & in)
|
||||
{
|
||||
auto field_name = readFieldName(in);
|
||||
@@ -693,6 +778,37 @@ namespace JSONUtils
return names_and_types;
}

bool tryReadMetadata(ReadBuffer & in, NamesAndTypesList & names_and_types)
{
String field_name;
if (!tryReadFieldName(in, field_name) || field_name != "meta")
return false;

if (!checkAndSkipArrayStart(in))
return false;

bool first = true;
while (!checkAndSkipArrayEnd(in))
{
if (!first)
{
if (!checkAndSkipComma(in))
return false;
}
else
{
first = false;
}

NameAndTypePair name_and_type;
if (!tryReadObjectWithNameAndType(in, name_and_type))
return false;
names_and_types.push_back(name_and_type);
}

return !names_and_types.empty();
}

void validateMetadataByHeader(const NamesAndTypesList & names_and_types_from_metadata, const Block & header)
{
for (const auto & [name, type] : names_and_types_from_metadata)

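As a rough illustration (not part of the patch), the "meta" header that tryReadMetadata consumes is the one emitted by ClickHouse's JSON output format; a trimmed sample with hypothetical column names:

{
    "meta":
    [
        {"name": "id", "type": "UInt32"},
        {"name": "message", "type": "String"}
    ],
    "data":
    [
        {"id": 1, "message": "hello"}
    ]
}
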
@ -112,6 +112,7 @@ namespace JSONUtils
|
||||
|
||||
void skipColon(ReadBuffer & in);
|
||||
void skipComma(ReadBuffer & in);
|
||||
bool checkAndSkipComma(ReadBuffer & in);
|
||||
|
||||
String readFieldName(ReadBuffer & in);
|
||||
|
||||
@ -122,9 +123,11 @@ namespace JSONUtils
|
||||
|
||||
void skipObjectStart(ReadBuffer & in);
|
||||
void skipObjectEnd(ReadBuffer & in);
|
||||
bool checkAndSkipObjectStart(ReadBuffer & in);
|
||||
bool checkAndSkipObjectEnd(ReadBuffer & in);
|
||||
|
||||
NamesAndTypesList readMetadata(ReadBuffer & in);
|
||||
bool tryReadMetadata(ReadBuffer & in, NamesAndTypesList & names_and_types);
|
||||
NamesAndTypesList readMetadataAndValidateHeader(ReadBuffer & in, const Block & header);
|
||||
void validateMetadataByHeader(const NamesAndTypesList & names_and_types_from_metadata, const Block & header);
|
||||
|
||||
|
@@ -1483,6 +1483,17 @@ public:
return getReturnTypeImplStatic(new_arguments, context);
}

/// Special case - one or both arguments are IPv6
if (isIPv6(arguments[0]) || isIPv6(arguments[1]))
{
DataTypes new_arguments {
isIPv6(arguments[0]) ? std::make_shared<DataTypeUInt128>() : arguments[0],
isIPv6(arguments[1]) ? std::make_shared<DataTypeUInt128>() : arguments[1],
};

return getReturnTypeImplStatic(new_arguments, context);
}

if constexpr (is_plus || is_minus)
{
@@ -2181,6 +2192,25 @@ ColumnPtr executeStringInteger(const ColumnsWithTypeAndName & arguments, const A
return executeImpl2(new_arguments, result_type, input_rows_count, right_nullmap);
}

/// Special case - one or both arguments are IPv6
if (isIPv6(arguments[0].type) || isIPv6(arguments[1].type))
{
ColumnsWithTypeAndName new_arguments {
{
isIPv6(arguments[0].type) ? castColumn(arguments[0], std::make_shared<DataTypeUInt128>()) : arguments[0].column,
isIPv6(arguments[0].type) ? std::make_shared<DataTypeUInt128>() : arguments[0].type,
arguments[0].name,
},
{
isIPv6(arguments[1].type) ? castColumn(arguments[1], std::make_shared<DataTypeUInt128>()) : arguments[1].column,
isIPv6(arguments[1].type) ? std::make_shared<DataTypeUInt128>() : arguments[1].type,
arguments[1].name
}
};

return executeImpl2(new_arguments, result_type, input_rows_count, right_nullmap);
}

const auto * const left_generic = left_argument.type.get();
const auto * const right_generic = right_argument.type.get();
ColumnPtr res;

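A hedged usage sketch of what these IPv6 special cases are meant to enable (exact result type and formatting may differ): arithmetic on IPv6 values by first casting them to UInt128, for example

SELECT toIPv6('2001:db8::1') + 1;
-- internally evaluated roughly as plus(CAST(ipv6_value AS UInt128), 1)
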
@@ -221,6 +221,18 @@ struct ConvertImpl
continue;
}

if constexpr (std::is_same_v<FromDataType, DataTypeIPv6> && std::is_same_v<ToDataType, DataTypeUInt128>)
{
static_assert(
std::is_same_v<DataTypeUInt128::FieldType, DataTypeUUID::FieldType::UnderlyingType>,
"UInt128 and IPv6 types must be same");

vec_to[i].items[1] = std::byteswap(vec_from[i].toUnderType().items[0]);
vec_to[i].items[0] = std::byteswap(vec_from[i].toUnderType().items[1]);

continue;
}

if constexpr (std::is_same_v<FromDataType, DataTypeUUID> != std::is_same_v<ToDataType, DataTypeUUID>)
{
throw Exception(ErrorCodes::NOT_IMPLEMENTED,

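A hedged sketch of the conversion above: the two 64-bit halves of the IPv6 value are byte-swapped into a UInt128 so the integer keeps the address's big-endian ordering. Assuming CAST is routed through this conversion path, something like

SELECT CAST(toIPv6('::1') AS UInt128);
-- expected to return 1
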
@ -14,14 +14,147 @@ namespace DB
|
||||
|
||||
REGISTER_FUNCTION(HashingSSL)
|
||||
{
|
||||
factory.registerFunction<FunctionMD4>();
|
||||
factory.registerFunction<FunctionHalfMD5>();
|
||||
factory.registerFunction<FunctionMD5>();
|
||||
factory.registerFunction<FunctionSHA1>();
|
||||
factory.registerFunction<FunctionSHA224>();
|
||||
factory.registerFunction<FunctionSHA256>();
|
||||
factory.registerFunction<FunctionSHA384>();
|
||||
factory.registerFunction<FunctionSHA512>();
|
||||
factory.registerFunction<FunctionMD4>(FunctionDocumentation{
|
||||
.description = R"(Calculates the MD4 hash of the given string.)",
|
||||
.syntax = "SELECT MD4(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The MD4 hash of the given input string returned as a [FixedString(16)](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(MD4('abc'));",
|
||||
R"(
|
||||
┌─hex(MD4('abc'))──────────────────┐
|
||||
│ A448017AAF21D8525FC10AE87AA6729D │
|
||||
└──────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionHalfMD5>(FunctionDocumentation{
|
||||
.description = R"(
|
||||
[Interprets](../..//sql-reference/functions/type-conversion-functions.md/#type_conversion_functions-reinterpretAsString) all the input
|
||||
parameters as strings and calculates the MD5 hash value for each of them. Then combines hashes, takes the first 8 bytes of the hash of the
|
||||
resulting string, and interprets them as [UInt64](../../../sql-reference/data-types/int-uint.md) in big-endian byte order. The function is
|
||||
relatively slow (5 million short strings per second per processor core).
|
||||
|
||||
Consider using the [sipHash64](../../sql-reference/functions/hash-functions.md/#hash_functions-siphash64) function instead.
|
||||
)",
|
||||
.syntax = "SELECT halfMD5(par1,par2,...,parN);",
|
||||
.arguments = {{"par1,par2,...,parN",
|
||||
R"(
|
||||
The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types calculated
|
||||
value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed
|
||||
Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).
|
||||
)"
|
||||
}},
|
||||
.returned_value
|
||||
= "The computed half MD5 hash of the given input params returned as a [UInt64](../../../sql-reference/data-types/int-uint.md) in big-endian byte order.",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(halfMD5('abc', 'cde', 'fgh'));",
|
||||
R"(
|
||||
┌─hex(halfMD5('abc', 'cde', 'fgh'))─┐
|
||||
│ 2C9506B7374CFAF4 │
|
||||
└───────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionMD5>(FunctionDocumentation{
|
||||
.description = R"(Calculates the MD5 hash of the given string.)",
|
||||
.syntax = "SELECT MD5(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The MD5 hash of the given input string returned as a [FixedString(16)](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(MD5('abc'));",
|
||||
R"(
|
||||
┌─hex(MD5('abc'))──────────────────┐
|
||||
│ 900150983CD24FB0D6963F7D28E17F72 │
|
||||
└──────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionSHA1>(FunctionDocumentation{
|
||||
.description = R"(Calculates the SHA1 hash of the given string.)",
|
||||
.syntax = "SELECT SHA1(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The SHA1 hash of the given input string returned as a [FixedString](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(SHA1('abc'));",
|
||||
R"(
|
||||
┌─hex(SHA1('abc'))─────────────────────────┐
|
||||
│ A9993E364706816ABA3E25717850C26C9CD0D89D │
|
||||
└──────────────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionSHA224>(FunctionDocumentation{
|
||||
.description = R"(Calculates the SHA224 hash of the given string.)",
|
||||
.syntax = "SELECT SHA224(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The SHA224 hash of the given input string returned as a [FixedString](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(SHA224('abc'));",
|
||||
R"(
|
||||
┌─hex(SHA224('abc'))───────────────────────────────────────┐
|
||||
│ 23097D223405D8228642A477BDA255B32AADBCE4BDA0B3F7E36C9DA7 │
|
||||
└──────────────────────────────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionSHA256>(FunctionDocumentation{
|
||||
.description = R"(Calculates the SHA256 hash of the given string.)",
|
||||
.syntax = "SELECT SHA256(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The SHA256 hash of the given input string returned as a [FixedString](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(SHA256('abc'));",
|
||||
R"(
|
||||
┌─hex(SHA256('abc'))───────────────────────────────────────────────┐
|
||||
│ BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD │
|
||||
└──────────────────────────────────────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionSHA384>(FunctionDocumentation{
|
||||
.description = R"(Calculates the SHA384 hash of the given string.)",
|
||||
.syntax = "SELECT SHA384(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The SHA384 hash of the given input string returned as a [FixedString](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(SHA384('abc'));",
|
||||
R"(
|
||||
┌─hex(SHA384('abc'))───────────────────────────────────────────────────────────────────────────────┐
|
||||
│ CB00753F45A35E8BB5A03D699AC65007272C32AB0EDED1631A8B605A43FF5BED8086072BA1E7CC2358BAECA134C825A7 │
|
||||
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionSHA512>(FunctionDocumentation{
|
||||
.description = R"(Calculates the SHA512 hash of the given string.)",
|
||||
.syntax = "SELECT SHA512(s);",
|
||||
.arguments = {{"s", "The input [String](../../sql-reference/data-types/string.md)."}},
|
||||
.returned_value
|
||||
= "The SHA512 hash of the given input string returned as a [FixedString](../../sql-reference/data-types/fixedstring.md).",
|
||||
.examples
|
||||
= {{"",
|
||||
"SELECT HEX(SHA512('abc'));",
|
||||
R"(
|
||||
┌─hex(SHA512('abc'))───────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ DDAF35A193617ABACC417349AE20413112E6FA4E89A97EA20A9EEEE64B55D39A2192992A274FC1A836BA3C23A3FEEBBD454D4423643CE80E2A9AC94FA54CA49F │
|
||||
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
|
||||
)"
|
||||
}}
|
||||
});
|
||||
factory.registerFunction<FunctionSHA512_256>(FunctionDocumentation{
|
||||
.description = R"(Calculates the SHA512_256 hash of the given string.)",
|
||||
.syntax = "SELECT SHA512_256(s);",
|
||||
|
@ -7,10 +7,10 @@
|
||||
#include <Functions/GatherUtils/Sinks.h>
|
||||
#include <Functions/GatherUtils/Sources.h>
|
||||
#include <Functions/IFunction.h>
|
||||
#include <Functions/formatString.h>
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <base/map.h>
|
||||
|
||||
#include "formatString.h"
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
@ -4,11 +4,11 @@
|
||||
#include <Functions/FunctionFactory.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
#include <Functions/IFunction.h>
|
||||
#include <Functions/formatString.h>
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <base/map.h>
|
||||
#include <base/range.h>
|
||||
|
||||
#include "formatString.h"
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
@ -1,35 +1,33 @@
|
||||
#include <Columns/ColumnFixedString.h>
|
||||
#include <Columns/ColumnString.h>
|
||||
#include <Columns/ColumnStringHelpers.h>
|
||||
#include <DataTypes/DataTypeString.h>
|
||||
#include <Functions/FunctionFactory.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
#include <Functions/IFunction.h>
|
||||
#include <Functions/formatString.h>
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <base/range.h>
|
||||
|
||||
#include <memory>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
#include "formatString.h"
|
||||
|
||||
namespace DB
|
||||
{
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int ILLEGAL_COLUMN;
|
||||
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
|
||||
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
|
||||
template <typename Name>
|
||||
class FormatFunction : public IFunction
|
||||
{
|
||||
public:
|
||||
static constexpr auto name = Name::name;
|
||||
static constexpr auto name = "format";
|
||||
|
||||
static FunctionPtr create(ContextPtr) { return std::make_shared<FormatFunction>(); }
|
||||
|
||||
@ -52,18 +50,6 @@ public:
|
||||
getName(),
|
||||
arguments.size());
|
||||
|
||||
for (const auto arg_idx : collections::range(0, arguments.size()))
|
||||
{
|
||||
const auto * arg = arguments[arg_idx].get();
|
||||
if (!isStringOrFixedString(arg))
|
||||
throw Exception(
|
||||
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
|
||||
"Illegal type {} of argument {} of function {}",
|
||||
arg->getName(),
|
||||
arg_idx + 1,
|
||||
getName());
|
||||
}
|
||||
|
||||
return std::make_shared<DataTypeString>();
|
||||
}
|
||||
|
||||
@ -83,6 +69,7 @@ public:
|
||||
std::vector<const ColumnString::Offsets *> offsets(arguments.size() - 1);
|
||||
std::vector<size_t> fixed_string_sizes(arguments.size() - 1);
|
||||
std::vector<std::optional<String>> constant_strings(arguments.size() - 1);
|
||||
std::vector<ColumnString::MutablePtr> converted_col_ptrs(arguments.size() - 1);
|
||||
|
||||
bool has_column_string = false;
|
||||
bool has_column_fixed_string = false;
|
||||
@ -106,8 +93,29 @@ public:
|
||||
constant_strings[i - 1] = const_col->getValue<String>();
|
||||
}
|
||||
else
|
||||
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of argument of function {}",
|
||||
column->getName(), getName());
|
||||
{
|
||||
/// A non-String/non-FixedString-type argument: use the default serialization to convert it to String
|
||||
auto full_column = column->convertToFullIfNeeded();
|
||||
auto serialization = arguments[i].type->getDefaultSerialization();
|
||||
auto converted_col_str = ColumnString::create();
|
||||
ColumnStringHelpers::WriteHelper write_helper(*converted_col_str, column->size());
|
||||
auto & write_buffer = write_helper.getWriteBuffer();
|
||||
FormatSettings format_settings;
|
||||
for (size_t row = 0; row < column->size(); ++row)
|
||||
{
|
||||
serialization->serializeText(*full_column, row, write_buffer, format_settings);
|
||||
write_helper.rowWritten();
|
||||
}
|
||||
write_helper.finalize();
|
||||
|
||||
/// Same as the normal `ColumnString` branch
|
||||
has_column_string = true;
|
||||
data[i - 1] = &converted_col_str->getChars();
|
||||
offsets[i - 1] = &converted_col_str->getOffsets();
|
||||
|
||||
/// Keep the pointer alive
|
||||
converted_col_ptrs[i - 1] = std::move(converted_col_str);
|
||||
}
|
||||
}
|
||||
|
||||
FormatStringImpl::formatExecute(
|
||||
@ -127,11 +135,7 @@ public:
|
||||
};
|
||||
|
||||
|
||||
struct NameFormat
|
||||
{
|
||||
static constexpr auto name = "format";
|
||||
};
|
||||
using FunctionFormat = FormatFunction<NameFormat>;
|
||||
using FunctionFormat = FormatFunction;
|
||||
|
||||
}
|
||||
|
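A hedged usage sketch of the behaviour the new else branch enables (previously non-String arguments made format() throw ILLEGAL_COLUMN): such arguments are now rendered with their default text serialization before substitution, for example

SELECT format('{} points, pi is roughly {}', 42, 3.14);
-- expected: '42 points, pi is roughly 3.14'
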
@@ -1591,7 +1591,7 @@ void skipToNextRowOrEof(PeekableReadBuffer & buf, const String & row_after_delim
if (skip_spaces)
skipWhitespaceIfAny(buf);

if (checkString(row_between_delimiter, buf))
if (buf.eof() || checkString(row_between_delimiter, buf))
break;
}
}

@ -2942,7 +2942,6 @@ void InterpreterSelectQuery::executeWindow(QueryPlan & query_plan)
|
||||
auto sorting_step = std::make_unique<SortingStep>(
|
||||
query_plan.getCurrentDataStream(),
|
||||
window.full_sort_description,
|
||||
window.partition_by,
|
||||
0 /* LIMIT */,
|
||||
sort_settings,
|
||||
settings.optimize_sorting_by_input_stream_properties);
|
||||
|
@@ -468,7 +468,8 @@ enum class OperatorType
StartIf,
FinishIf,
Cast,
Lambda
Lambda,
Not
};

/** Operator struct stores parameters of the operator:
@@ -2420,7 +2421,7 @@ const std::vector<std::pair<std::string_view, Operator>> ParserExpressionImpl::o

const std::vector<std::pair<std::string_view, Operator>> ParserExpressionImpl::unary_operators_table
{
{"NOT", Operator("not", 5, 1)},
{"NOT", Operator("not", 5, 1, OperatorType::Not)},
{"-", Operator("negate", 13, 1)},
{"−", Operator("negate", 13, 1)}
};
@@ -2592,7 +2593,16 @@ Action ParserExpressionImpl::tryParseOperand(Layers & layers, IParser::Pos & pos

if (cur_op != unary_operators_table.end())
{
layers.back()->pushOperator(cur_op->second);
if (cur_op->second.type == OperatorType::Not && pos->type == TokenType::OpeningRoundBracket)
{
++pos;
auto identifier = std::make_shared<ASTIdentifier>(cur_op->second.function_name);
layers.push_back(getFunctionLayer(identifier, layers.front()->is_table_function));
}
else
{
layers.back()->pushOperator(cur_op->second);
}
return Action::OPERAND;
}

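A hedged sketch of the parser case the new Not operator type covers: NOT immediately followed by an opening bracket is now parsed as a function-call layer, so both spellings below should yield the same not(equals(...)) AST:

SELECT NOT (number = 1) FROM numbers(3);
SELECT NOT(number = 1) FROM numbers(3);
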
@ -915,7 +915,6 @@ void addWindowSteps(QueryPlan & query_plan,
|
||||
auto sorting_step = std::make_unique<SortingStep>(
|
||||
query_plan.getCurrentDataStream(),
|
||||
window_description.full_sort_description,
|
||||
window_description.partition_by,
|
||||
0 /*limit*/,
|
||||
sort_settings,
|
||||
settings.optimize_sorting_by_input_stream_properties);
|
||||
|
@@ -128,7 +128,7 @@ Chunk IRowInputFormat::generate()

RowReadExtension info;
bool continue_reading = true;
for (size_t rows = 0; rows < params.max_block_size && continue_reading; ++rows)
for (size_t rows = 0; (rows < params.max_block_size || num_rows == 0) && continue_reading; ++rows)
{
try
{

@ -7,11 +7,6 @@
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int INCORRECT_DATA;
|
||||
}
|
||||
|
||||
JSONRowInputFormat::JSONRowInputFormat(ReadBuffer & in_, const Block & header_, Params params_, const FormatSettings & format_settings_)
|
||||
: JSONRowInputFormat(std::make_unique<PeekableReadBuffer>(in_), header_, params_, format_settings_)
|
||||
{
|
||||
@ -30,38 +25,24 @@ void JSONRowInputFormat::readPrefix()
|
||||
NamesAndTypesList names_and_types_from_metadata;
|
||||
|
||||
/// Try to parse metadata, if failed, try to parse data as JSONEachRow format.
|
||||
try
|
||||
if (JSONUtils::checkAndSkipObjectStart(*peekable_buf)
|
||||
&& JSONUtils::tryReadMetadata(*peekable_buf, names_and_types_from_metadata)
|
||||
&& JSONUtils::checkAndSkipComma(*peekable_buf)
|
||||
&& JSONUtils::skipUntilFieldInObject(*peekable_buf, "data")
|
||||
&& JSONUtils::checkAndSkipArrayStart(*peekable_buf))
|
||||
{
|
||||
JSONUtils::skipObjectStart(*peekable_buf);
|
||||
names_and_types_from_metadata = JSONUtils::readMetadata(*peekable_buf);
|
||||
JSONUtils::skipComma(*peekable_buf);
|
||||
if (!JSONUtils::skipUntilFieldInObject(*peekable_buf, "data"))
|
||||
throw Exception(ErrorCodes::INCORRECT_DATA, "Expected field \"data\" with table content");
|
||||
|
||||
JSONUtils::skipArrayStart(*peekable_buf);
|
||||
data_in_square_brackets = true;
|
||||
if (validate_types_from_metadata)
|
||||
{
|
||||
JSONUtils::validateMetadataByHeader(names_and_types_from_metadata, getPort().getHeader());
|
||||
}
|
||||
}
|
||||
catch (const ParsingException &)
|
||||
else
|
||||
{
|
||||
parse_as_json_each_row = true;
|
||||
}
|
||||
catch (const Exception & e)
|
||||
{
|
||||
if (e.code() != ErrorCodes::INCORRECT_DATA)
|
||||
throw;
|
||||
|
||||
parse_as_json_each_row = true;
|
||||
}
|
||||
|
||||
if (parse_as_json_each_row)
|
||||
{
|
||||
peekable_buf->rollbackToCheckpoint();
|
||||
JSONEachRowRowInputFormat::readPrefix();
|
||||
}
|
||||
else if (validate_types_from_metadata)
|
||||
{
|
||||
JSONUtils::validateMetadataByHeader(names_and_types_from_metadata, getPort().getHeader());
|
||||
}
|
||||
}
|
||||
|
||||
void JSONRowInputFormat::readSuffix()
|
||||
@ -103,16 +84,12 @@ NamesAndTypesList JSONRowSchemaReader::readSchema()
|
||||
skipBOMIfExists(*peekable_buf);
|
||||
PeekableReadBufferCheckpoint checkpoint(*peekable_buf);
|
||||
/// Try to parse metadata, if failed, try to parse data as JSONEachRow format
|
||||
try
|
||||
{
|
||||
JSONUtils::skipObjectStart(*peekable_buf);
|
||||
return JSONUtils::readMetadata(*peekable_buf);
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
peekable_buf->rollbackToCheckpoint(true);
|
||||
return JSONEachRowSchemaReader::readSchema();
|
||||
}
|
||||
NamesAndTypesList names_and_types;
|
||||
if (JSONUtils::checkAndSkipObjectStart(*peekable_buf) && JSONUtils::tryReadMetadata(*peekable_buf, names_and_types))
|
||||
return names_and_types;
|
||||
|
||||
peekable_buf->rollbackToCheckpoint(true);
|
||||
return JSONEachRowSchemaReader::readSchema();
|
||||
}
|
||||
|
||||
void registerInputFormatJSON(FormatFactory & factory)
|
||||
|
@ -126,10 +126,6 @@ void ParallelParsingInputFormat::parserThreadFunction(ThreadGroupPtr thread_grou
|
||||
first_parser_finished.set();
|
||||
}
|
||||
|
||||
// We suppose we will get at least some blocks for a non-empty buffer,
|
||||
// except at the end of file. Also see a matching assert in readImpl().
|
||||
assert(unit.is_last || !unit.chunk_ext.chunk.empty() || parsing_finished);
|
||||
|
||||
std::lock_guard<std::mutex> lock(mutex);
|
||||
unit.status = READY_TO_READ;
|
||||
reader_condvar.notify_all();
|
||||
@ -200,62 +196,69 @@ Chunk ParallelParsingInputFormat::generate()
|
||||
}
|
||||
|
||||
const auto inserter_unit_number = reader_ticket_number % processing_units.size();
|
||||
auto & unit = processing_units[inserter_unit_number];
|
||||
auto * unit = &processing_units[inserter_unit_number];
|
||||
|
||||
if (!next_block_in_current_unit.has_value())
|
||||
{
|
||||
// We have read out all the Blocks from the previous Processing Unit,
|
||||
// wait for the current one to become ready.
|
||||
std::unique_lock<std::mutex> lock(mutex);
|
||||
reader_condvar.wait(lock, [&](){ return unit.status == READY_TO_READ || parsing_finished; });
|
||||
|
||||
if (parsing_finished)
|
||||
while (true)
|
||||
{
|
||||
/**
|
||||
* Check for background exception and rethrow it before we return.
|
||||
*/
|
||||
if (background_exception)
|
||||
// We have read out all the Blocks from the previous Processing Unit,
|
||||
// wait for the current one to become ready.
|
||||
std::unique_lock<std::mutex> lock(mutex);
|
||||
reader_condvar.wait(lock, [&]() { return unit->status == READY_TO_READ || parsing_finished; });
|
||||
|
||||
if (parsing_finished)
|
||||
{
|
||||
lock.unlock();
|
||||
cancel();
|
||||
std::rethrow_exception(background_exception);
|
||||
/// Check for background exception and rethrow it before we return.
|
||||
if (background_exception)
|
||||
{
|
||||
lock.unlock();
|
||||
cancel();
|
||||
std::rethrow_exception(background_exception);
|
||||
}
|
||||
|
||||
return {};
|
||||
}
|
||||
|
||||
return {};
|
||||
assert(unit->status == READY_TO_READ);
|
||||
|
||||
if (!unit->chunk_ext.chunk.empty())
|
||||
break;
|
||||
|
||||
/// If this unit is last, parsing is finished.
|
||||
if (unit->is_last)
|
||||
{
|
||||
parsing_finished = true;
|
||||
return {};
|
||||
}
|
||||
|
||||
/// We can get zero blocks for an entire segment if format parser
|
||||
/// skipped all rows. For example, it can happen while using settings
|
||||
/// input_format_allow_errors_num/input_format_allow_errors_ratio
|
||||
/// and this segment contained only rows with errors.
|
||||
/// Process the next unit.
|
||||
++reader_ticket_number;
|
||||
unit = &processing_units[reader_ticket_number % processing_units.size()];
|
||||
}
|
||||
|
||||
assert(unit.status == READY_TO_READ);
|
||||
next_block_in_current_unit = 0;
|
||||
}
|
||||
|
||||
if (unit.chunk_ext.chunk.empty())
|
||||
{
|
||||
/*
|
||||
* Can we get zero blocks for an entire segment, when the format parser
|
||||
* skips it entire content and does not create any blocks? Probably not,
|
||||
* but if we ever do, we should add a loop around the above if, to skip
|
||||
* these. Also see a matching assert in the parser thread.
|
||||
*/
|
||||
assert(unit.is_last);
|
||||
parsing_finished = true;
|
||||
return {};
|
||||
}
|
||||
assert(next_block_in_current_unit.value() < unit->chunk_ext.chunk.size());
|
||||
|
||||
assert(next_block_in_current_unit.value() < unit.chunk_ext.chunk.size());
|
||||
|
||||
Chunk res = std::move(unit.chunk_ext.chunk.at(*next_block_in_current_unit));
|
||||
last_block_missing_values = std::move(unit.chunk_ext.block_missing_values[*next_block_in_current_unit]);
|
||||
last_approx_bytes_read_for_chunk = unit.chunk_ext.approx_chunk_sizes.at(*next_block_in_current_unit);
|
||||
Chunk res = std::move(unit->chunk_ext.chunk.at(*next_block_in_current_unit));
|
||||
last_block_missing_values = std::move(unit->chunk_ext.block_missing_values[*next_block_in_current_unit]);
|
||||
last_approx_bytes_read_for_chunk = unit->chunk_ext.approx_chunk_sizes.at(*next_block_in_current_unit);
|
||||
|
||||
next_block_in_current_unit.value() += 1;
|
||||
|
||||
if (*next_block_in_current_unit == unit.chunk_ext.chunk.size())
|
||||
if (*next_block_in_current_unit == unit->chunk_ext.chunk.size())
|
||||
{
|
||||
// parsing_finished reading this Processing Unit, move to the next one.
|
||||
next_block_in_current_unit.reset();
|
||||
++reader_ticket_number;
|
||||
|
||||
if (unit.is_last)
|
||||
if (unit->is_last)
|
||||
{
|
||||
// If it was the last unit, we're parsing_finished.
|
||||
parsing_finished = true;
|
||||
@ -264,7 +267,7 @@ Chunk ParallelParsingInputFormat::generate()
|
||||
{
|
||||
// Pass the unit back to the segmentator.
|
||||
std::lock_guard lock(mutex);
|
||||
unit.status = READY_TO_INSERT;
|
||||
unit->status = READY_TO_INSERT;
|
||||
segmentator_condvar.notify_all();
|
||||
}
|
||||
}
|
||||
|
@ -20,7 +20,7 @@ namespace
|
||||
const String DEFAULT_OUTPUT_FORMAT = "CSV";
|
||||
}
|
||||
|
||||
InputFormatErrorsLogger::InputFormatErrorsLogger(const ContextPtr & context)
|
||||
InputFormatErrorsLogger::InputFormatErrorsLogger(const ContextPtr & context) : max_block_size(context->getSettingsRef().max_block_size)
|
||||
{
|
||||
String output_format = context->getSettingsRef().errors_output_format;
|
||||
if (!FormatFactory::instance().isOutputFormat(output_format))
|
||||
@ -59,30 +59,47 @@ InputFormatErrorsLogger::InputFormatErrorsLogger(const ContextPtr & context)
|
||||
{std::make_shared<DataTypeUInt32>(), "offset"},
|
||||
{std::make_shared<DataTypeString>(), "reason"},
|
||||
{std::make_shared<DataTypeString>(), "raw_data"}};
|
||||
errors_columns = header.cloneEmptyColumns();
|
||||
|
||||
writer = context->getOutputFormat(output_format, *write_buf, header);
|
||||
}
|
||||
|
||||
|
||||
InputFormatErrorsLogger::~InputFormatErrorsLogger()
|
||||
{
|
||||
writer->finalize();
|
||||
writer->flush();
|
||||
write_buf->finalize();
|
||||
try
|
||||
{
|
||||
if (!errors_columns[0]->empty())
|
||||
writeErrors();
|
||||
writer->finalize();
|
||||
writer->flush();
|
||||
write_buf->finalize();
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
tryLogCurrentException("InputFormatErrorsLogger");
|
||||
}
|
||||
}
|
||||
|
||||
void InputFormatErrorsLogger::logErrorImpl(ErrorEntry entry)
|
||||
{
|
||||
auto error = header.cloneEmpty();
|
||||
auto columns = error.mutateColumns();
|
||||
columns[0]->insert(entry.time);
|
||||
database.empty() ? columns[1]->insertDefault() : columns[1]->insert(database);
|
||||
table.empty() ? columns[2]->insertDefault() : columns[2]->insert(table);
|
||||
columns[3]->insert(entry.offset);
|
||||
columns[4]->insert(entry.reason);
|
||||
columns[5]->insert(entry.raw_data);
|
||||
error.setColumns(std::move(columns));
|
||||
errors_columns[0]->insert(entry.time);
|
||||
database.empty() ? errors_columns[1]->insertDefault() : errors_columns[1]->insert(database);
|
||||
table.empty() ? errors_columns[2]->insertDefault() : errors_columns[2]->insert(table);
|
||||
errors_columns[3]->insert(entry.offset);
|
||||
errors_columns[4]->insert(entry.reason);
|
||||
errors_columns[5]->insert(entry.raw_data);
|
||||
|
||||
writer->write(error);
|
||||
if (errors_columns[0]->size() >= max_block_size)
|
||||
writeErrors();
|
||||
}
|
||||
|
||||
void InputFormatErrorsLogger::writeErrors()
|
||||
{
|
||||
auto block = header.cloneEmpty();
|
||||
block.setColumns(std::move(errors_columns));
|
||||
writer->write(block);
|
||||
errors_columns = header.cloneEmptyColumns();
|
||||
}
|
||||
|
||||
void InputFormatErrorsLogger::logError(ErrorEntry entry)
|
||||
|
@ -24,6 +24,7 @@ public:
|
||||
|
||||
virtual void logError(ErrorEntry entry);
|
||||
void logErrorImpl(ErrorEntry entry);
|
||||
void writeErrors();
|
||||
|
||||
private:
|
||||
Block header;
|
||||
@ -34,6 +35,9 @@ private:
|
||||
|
||||
String database;
|
||||
String table;
|
||||
|
||||
MutableColumns errors_columns;
|
||||
size_t max_block_size;
|
||||
};
|
||||
|
||||
using InputFormatErrorsLoggerPtr = std::shared_ptr<InputFormatErrorsLogger>;
|
||||
|
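A hedged usage sketch of the setting this logger serves; with the change above, bad rows are buffered in errors_columns and flushed to the errors file in blocks of up to max_block_size rows instead of one row at a time (table name and file path are hypothetical):

SET input_format_allow_errors_num = 10;
SET input_format_record_errors_file_path = 'parse_errors.csv';
INSERT INTO hypothetical_table FORMAT CSV
1,ok
oops,not_a_number
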
@ -1,4 +1,3 @@
|
||||
#include <memory>
|
||||
#include <stdexcept>
|
||||
#include <IO/Operators.h>
|
||||
#include <Processors/Merges/MergingSortedTransform.h>
|
||||
@ -10,8 +9,6 @@
|
||||
#include <QueryPipeline/QueryPipelineBuilder.h>
|
||||
#include <Common/JSONBuilder.h>
|
||||
|
||||
#include <Processors/ResizeProcessor.h>
|
||||
#include <Processors/Transforms/ScatterByPartitionTransform.h>
|
||||
|
||||
namespace CurrentMetrics
|
||||
{
|
||||
@ -79,21 +76,6 @@ SortingStep::SortingStep(
|
||||
output_stream->sort_scope = DataStream::SortScope::Global;
|
||||
}
|
||||
|
||||
SortingStep::SortingStep(
|
||||
const DataStream & input_stream,
|
||||
const SortDescription & description_,
|
||||
const SortDescription & partition_by_description_,
|
||||
UInt64 limit_,
|
||||
const Settings & settings_,
|
||||
bool optimize_sorting_by_input_stream_properties_)
|
||||
: SortingStep(input_stream, description_, limit_, settings_, optimize_sorting_by_input_stream_properties_)
|
||||
{
|
||||
partition_by_description = partition_by_description_;
|
||||
|
||||
output_stream->sort_description = result_description;
|
||||
output_stream->sort_scope = DataStream::SortScope::Stream;
|
||||
}
|
||||
|
||||
SortingStep::SortingStep(
|
||||
const DataStream & input_stream_,
|
||||
SortDescription prefix_description_,
|
||||
@ -135,11 +117,7 @@ void SortingStep::updateOutputStream()
|
||||
{
|
||||
output_stream = createOutputStream(input_streams.front(), input_streams.front().header, getDataStreamTraits());
|
||||
output_stream->sort_description = result_description;
|
||||
|
||||
if (partition_by_description.empty())
|
||||
output_stream->sort_scope = DataStream::SortScope::Global;
|
||||
else
|
||||
output_stream->sort_scope = DataStream::SortScope::Stream;
|
||||
output_stream->sort_scope = DataStream::SortScope::Global;
|
||||
}
|
||||
|
||||
void SortingStep::updateLimit(size_t limit_)
|
||||
@ -157,55 +135,6 @@ void SortingStep::convertToFinishSorting(SortDescription prefix_description_)
|
||||
prefix_description = std::move(prefix_description_);
|
||||
}
|
||||
|
||||
void SortingStep::scatterByPartitionIfNeeded(QueryPipelineBuilder& pipeline)
|
||||
{
|
||||
size_t threads = pipeline.getNumThreads();
|
||||
size_t streams = pipeline.getNumStreams();
|
||||
|
||||
if (!partition_by_description.empty() && threads > 1)
|
||||
{
|
||||
Block stream_header = pipeline.getHeader();
|
||||
|
||||
ColumnNumbers key_columns;
|
||||
key_columns.reserve(partition_by_description.size());
|
||||
for (auto & col : partition_by_description)
|
||||
{
|
||||
key_columns.push_back(stream_header.getPositionByName(col.column_name));
|
||||
}
|
||||
|
||||
pipeline.transform([&](OutputPortRawPtrs ports)
|
||||
{
|
||||
Processors processors;
|
||||
for (auto * port : ports)
|
||||
{
|
||||
auto scatter = std::make_shared<ScatterByPartitionTransform>(stream_header, threads, key_columns);
|
||||
connect(*port, scatter->getInputs().front());
|
||||
processors.push_back(scatter);
|
||||
}
|
||||
return processors;
|
||||
});
|
||||
|
||||
if (streams > 1)
|
||||
{
|
||||
pipeline.transform([&](OutputPortRawPtrs ports)
|
||||
{
|
||||
Processors processors;
|
||||
for (size_t i = 0; i < threads; ++i)
|
||||
{
|
||||
size_t output_it = i;
|
||||
auto resize = std::make_shared<ResizeProcessor>(stream_header, streams, 1);
|
||||
auto & inputs = resize->getInputs();
|
||||
|
||||
for (auto input_it = inputs.begin(); input_it != inputs.end(); output_it += threads, ++input_it)
|
||||
connect(*ports[output_it], *input_it);
|
||||
processors.push_back(resize);
|
||||
}
|
||||
return processors;
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void SortingStep::finishSorting(
|
||||
QueryPipelineBuilder & pipeline, const SortDescription & input_sort_desc, const SortDescription & result_sort_desc, const UInt64 limit_)
|
||||
{
|
||||
@ -331,12 +260,10 @@ void SortingStep::fullSortStreams(
|
||||
void SortingStep::fullSort(
|
||||
QueryPipelineBuilder & pipeline, const SortDescription & result_sort_desc, const UInt64 limit_, const bool skip_partial_sort)
|
||||
{
|
||||
scatterByPartitionIfNeeded(pipeline);
|
||||
|
||||
fullSortStreams(pipeline, sort_settings, result_sort_desc, limit_, skip_partial_sort);
|
||||
|
||||
/// If there are several streams, then we merge them into one
|
||||
if (pipeline.getNumStreams() > 1 && (partition_by_description.empty() || pipeline.getNumThreads() == 1))
|
||||
if (pipeline.getNumStreams() > 1)
|
||||
{
|
||||
auto transform = std::make_shared<MergingSortedTransform>(
|
||||
pipeline.getHeader(),
|
||||
@ -368,7 +295,6 @@ void SortingStep::transformPipeline(QueryPipelineBuilder & pipeline, const Build
|
||||
{
|
||||
bool need_finish_sorting = (prefix_description.size() < result_description.size());
|
||||
mergingSorted(pipeline, prefix_description, (need_finish_sorting ? 0 : limit));
|
||||
|
||||
if (need_finish_sorting)
|
||||
{
|
||||
finishSorting(pipeline, prefix_description, result_description, limit);
|
||||
|
@ -40,15 +40,6 @@ public:
|
||||
const Settings & settings_,
|
||||
bool optimize_sorting_by_input_stream_properties_);
|
||||
|
||||
/// Full with partitioning
|
||||
SortingStep(
|
||||
const DataStream & input_stream,
|
||||
const SortDescription & description_,
|
||||
const SortDescription & partition_by_description_,
|
||||
UInt64 limit_,
|
||||
const Settings & settings_,
|
||||
bool optimize_sorting_by_input_stream_properties_);
|
||||
|
||||
/// FinishSorting
|
||||
SortingStep(
|
||||
const DataStream & input_stream_,
|
||||
@ -92,24 +83,14 @@ public:
|
||||
bool skip_partial_sort = false);
|
||||
|
||||
private:
|
||||
void scatterByPartitionIfNeeded(QueryPipelineBuilder& pipeline);
|
||||
void updateOutputStream() override;
|
||||
|
||||
static void mergeSorting(
|
||||
QueryPipelineBuilder & pipeline,
|
||||
const Settings & sort_settings,
|
||||
const SortDescription & result_sort_desc,
|
||||
UInt64 limit_);
|
||||
static void
|
||||
mergeSorting(QueryPipelineBuilder & pipeline, const Settings & sort_settings, const SortDescription & result_sort_desc, UInt64 limit_);
|
||||
|
||||
void mergingSorted(
|
||||
QueryPipelineBuilder & pipeline,
|
||||
const SortDescription & result_sort_desc,
|
||||
UInt64 limit_);
|
||||
void mergingSorted(QueryPipelineBuilder & pipeline, const SortDescription & result_sort_desc, UInt64 limit_);
|
||||
void finishSorting(
|
||||
QueryPipelineBuilder & pipeline,
|
||||
const SortDescription & input_sort_desc,
|
||||
const SortDescription & result_sort_desc,
|
||||
UInt64 limit_);
|
||||
QueryPipelineBuilder & pipeline, const SortDescription & input_sort_desc, const SortDescription & result_sort_desc, UInt64 limit_);
|
||||
void fullSort(
|
||||
QueryPipelineBuilder & pipeline,
|
||||
const SortDescription & result_sort_desc,
|
||||
@ -120,9 +101,6 @@ private:
|
||||
|
||||
SortDescription prefix_description;
|
||||
const SortDescription result_description;
|
||||
|
||||
SortDescription partition_by_description;
|
||||
|
||||
UInt64 limit;
|
||||
bool always_read_till_end = false;
|
||||
|
||||
|
@ -67,8 +67,7 @@ void WindowStep::transformPipeline(QueryPipelineBuilder & pipeline, const BuildQ
|
||||
// This resize is needed for cases such as `over ()` when we don't have a
|
||||
// sort node, and the input might have multiple streams. The sort node would
|
||||
// have resized it.
|
||||
if (window_description.full_sort_description.empty())
|
||||
pipeline.resize(1);
|
||||
pipeline.resize(1);
|
||||
|
||||
pipeline.addSimpleTransform(
|
||||
[&](const Block & /*header*/)
|
||||
|
@ -1,129 +0,0 @@
|
||||
#include <Processors/Transforms/ScatterByPartitionTransform.h>
|
||||
|
||||
#include <Common/PODArray.h>
|
||||
#include <Core/ColumnNumbers.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
ScatterByPartitionTransform::ScatterByPartitionTransform(Block header, size_t output_size_, ColumnNumbers key_columns_)
|
||||
: IProcessor(InputPorts{header}, OutputPorts{output_size_, header})
|
||||
, output_size(output_size_)
|
||||
, key_columns(std::move(key_columns_))
|
||||
, hash(0)
|
||||
{}
|
||||
|
||||
IProcessor::Status ScatterByPartitionTransform::prepare()
|
||||
{
|
||||
auto & input = getInputs().front();
|
||||
|
||||
/// Check all outputs are finished or ready to get data.
|
||||
|
||||
bool all_finished = true;
|
||||
for (auto & output : outputs)
|
||||
{
|
||||
if (output.isFinished())
|
||||
continue;
|
||||
|
||||
all_finished = false;
|
||||
}
|
||||
|
||||
if (all_finished)
|
||||
{
|
||||
input.close();
|
||||
return Status::Finished;
|
||||
}
|
||||
|
||||
if (!all_outputs_processed)
|
||||
{
|
||||
auto output_it = outputs.begin();
|
||||
bool can_push = false;
|
||||
for (size_t i = 0; i < output_size; ++i, ++output_it)
|
||||
if (!was_output_processed[i] && output_it->canPush())
|
||||
can_push = true;
|
||||
if (!can_push)
|
||||
return Status::PortFull;
|
||||
return Status::Ready;
|
||||
}
|
||||
/// Try get chunk from input.
|
||||
|
||||
if (input.isFinished())
|
||||
{
|
||||
for (auto & output : outputs)
|
||||
output.finish();
|
||||
|
||||
return Status::Finished;
|
||||
}
|
||||
|
||||
input.setNeeded();
|
||||
if (!input.hasData())
|
||||
return Status::NeedData;
|
||||
|
||||
chunk = input.pull();
|
||||
has_data = true;
|
||||
was_output_processed.assign(outputs.size(), false);
|
||||
|
||||
return Status::Ready;
|
||||
}
|
||||
|
||||
void ScatterByPartitionTransform::work()
|
||||
{
|
||||
if (all_outputs_processed)
|
||||
generateOutputChunks();
|
||||
all_outputs_processed = true;
|
||||
|
||||
size_t chunk_number = 0;
|
||||
for (auto & output : outputs)
|
||||
{
|
||||
auto & was_processed = was_output_processed[chunk_number];
|
||||
auto & output_chunk = output_chunks[chunk_number];
|
||||
++chunk_number;
|
||||
|
||||
if (was_processed)
|
||||
continue;
|
||||
|
||||
if (output.isFinished())
|
||||
continue;
|
||||
|
||||
if (!output.canPush())
|
||||
{
|
||||
all_outputs_processed = false;
|
||||
continue;
|
||||
}
|
||||
|
||||
output.push(std::move(output_chunk));
|
||||
was_processed = true;
|
||||
}
|
||||
|
||||
if (all_outputs_processed)
|
||||
{
|
||||
has_data = false;
|
||||
output_chunks.clear();
|
||||
}
|
||||
}
|
||||
|
||||
void ScatterByPartitionTransform::generateOutputChunks()
|
||||
{
|
||||
auto num_rows = chunk.getNumRows();
|
||||
const auto & columns = chunk.getColumns();
|
||||
|
||||
hash.reset(num_rows);
|
||||
|
||||
for (const auto & column_number : key_columns)
|
||||
columns[column_number]->updateWeakHash32(hash);
|
||||
|
||||
const auto & hash_data = hash.getData();
|
||||
IColumn::Selector selector(num_rows);
|
||||
|
||||
for (size_t row = 0; row < num_rows; ++row)
|
||||
selector[row] = hash_data[row] % output_size;
|
||||
|
||||
output_chunks.resize(output_size);
|
||||
for (const auto & column : columns)
|
||||
{
|
||||
auto filtered_columns = column->scatter(output_size, selector);
|
||||
for (size_t i = 0; i < output_size; ++i)
|
||||
output_chunks[i].addColumn(std::move(filtered_columns[i]));
|
||||
}
|
||||
}
|
||||
|
||||
}
|
@ -1,34 +0,0 @@
|
||||
#pragma once
|
||||
#include <Common/WeakHash.h>
|
||||
#include <Core/ColumnNumbers.h>
|
||||
#include <Processors/IProcessor.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
struct ScatterByPartitionTransform : IProcessor
|
||||
{
|
||||
ScatterByPartitionTransform(Block header, size_t output_size_, ColumnNumbers key_columns_);
|
||||
|
||||
String getName() const override { return "ScatterByPartitionTransform"; }
|
||||
|
||||
Status prepare() override;
|
||||
void work() override;
|
||||
|
||||
private:
|
||||
|
||||
void generateOutputChunks();
|
||||
|
||||
size_t output_size;
|
||||
ColumnNumbers key_columns;
|
||||
|
||||
bool has_data = false;
|
||||
bool all_outputs_processed = true;
|
||||
std::vector<char> was_output_processed;
|
||||
Chunk chunk;
|
||||
|
||||
WeakHash32 hash;
|
||||
Chunks output_chunks;
|
||||
};
|
||||
|
||||
}
|
@@ -41,8 +41,8 @@ RWLockImpl::LockHolder IStorage::tryLockTimed(
{
const String type_str = type == RWLockImpl::Type::Read ? "READ" : "WRITE";
throw Exception(ErrorCodes::DEADLOCK_AVOIDED,
"{} locking attempt on \"{}\" has timed out! ({}ms) Possible deadlock avoided. Client should retry",
type_str, getStorageID(), acquire_timeout.count());
"{} locking attempt on \"{}\" has timed out! ({}ms) Possible deadlock avoided. Client should retry. Owner query ids: {}",
type_str, getStorageID(), acquire_timeout.count(), rwlock->getOwnerQueryIdsDescription());
}
return lock_holder;
}

@ -22,6 +22,23 @@ namespace ErrorCodes
|
||||
extern const int LOGICAL_ERROR;
|
||||
extern const int POSTGRESQL_REPLICATION_INTERNAL_ERROR;
|
||||
extern const int BAD_ARGUMENTS;
|
||||
extern const int ILLEGAL_COLUMN;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
using ArrayInfo = std::unordered_map<size_t, PostgreSQLArrayInfo>;
|
||||
|
||||
ArrayInfo createArrayInfos(const NamesAndTypesList & columns, const ExternalResultDescription & columns_description)
|
||||
{
|
||||
ArrayInfo array_info;
|
||||
for (size_t i = 0; i < columns.size(); ++i)
|
||||
{
|
||||
if (columns_description.types[i].first == ExternalResultDescription::ValueType::vtArray)
|
||||
preparePostgreSQLArrayInfo(array_info, i, columns_description.sample_block.getByPosition(i).type);
|
||||
}
|
||||
return array_info;
|
||||
}
|
||||
}
|
||||
|
||||
MaterializedPostgreSQLConsumer::MaterializedPostgreSQLConsumer(
|
||||
@ -40,126 +57,161 @@ MaterializedPostgreSQLConsumer::MaterializedPostgreSQLConsumer(
|
||||
, publication_name(publication_name_)
|
||||
, connection(connection_)
|
||||
, current_lsn(start_lsn)
|
||||
, final_lsn(start_lsn)
|
||||
, lsn_value(getLSNValue(start_lsn))
|
||||
, max_block_size(max_block_size_)
|
||||
, schema_as_a_part_of_table_name(schema_as_a_part_of_table_name_)
|
||||
{
|
||||
final_lsn = start_lsn;
|
||||
auto tx = std::make_shared<pqxx::nontransaction>(connection->getRef());
|
||||
current_lsn = advanceLSN(tx);
|
||||
LOG_TRACE(log, "Starting replication. LSN: {} (last: {})", getLSNValue(current_lsn), getLSNValue(final_lsn));
|
||||
tx->commit();
|
||||
|
||||
for (const auto & [table_name, storage_info] : storages_info_)
|
||||
storages.emplace(table_name, storage_info);
|
||||
}
|
||||
|
||||
|
||||
MaterializedPostgreSQLConsumer::StorageData::StorageData(const StorageInfo & storage_info)
|
||||
: storage(storage_info.storage), buffer(storage_info.storage->getInMemoryMetadataPtr(), storage_info.attributes)
|
||||
{
|
||||
auto table_id = storage_info.storage->getStorageID();
|
||||
LOG_TRACE(&Poco::Logger::get("StorageMaterializedPostgreSQL"),
|
||||
"New buffer for table {}, number of attributes: {}, number if columns: {}, structure: {}",
|
||||
table_id.getNameForLogs(), buffer.attributes.size(), buffer.getColumnsNum(), buffer.description.sample_block.dumpStructure());
|
||||
}
|
||||
|
||||
|
||||
MaterializedPostgreSQLConsumer::StorageData::Buffer::Buffer(
|
||||
StorageMetadataPtr storage_metadata, const PostgreSQLTableStructure::Attributes & attributes_)
|
||||
: attributes(attributes_)
|
||||
{
|
||||
const Block sample_block = storage_metadata->getSampleBlock();
|
||||
|
||||
/// Need to clear type, because in description.init() the types are appended
|
||||
description.types.clear();
|
||||
description.init(sample_block);
|
||||
|
||||
columns = description.sample_block.cloneEmptyColumns();
|
||||
const auto & storage_columns = storage_metadata->getColumns().getAllPhysical();
|
||||
auto insert_columns = std::make_shared<ASTExpressionList>();
|
||||
|
||||
auto columns_num = description.sample_block.columns();
|
||||
assert(columns_num == storage_columns.size());
|
||||
if (attributes.size() + 2 != columns_num) /// +2 because sign and version columns
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "Columns number mismatch. Attributes: {}, buffer: {}",
|
||||
attributes.size(), columns_num);
|
||||
|
||||
size_t idx = 0;
|
||||
for (const auto & column : storage_columns)
|
||||
{
|
||||
if (description.types[idx].first == ExternalResultDescription::ValueType::vtArray)
|
||||
preparePostgreSQLArrayInfo(array_info, idx, description.sample_block.getByPosition(idx).type);
|
||||
idx++;
|
||||
|
||||
insert_columns->children.emplace_back(std::make_shared<ASTIdentifier>(column.name));
|
||||
auto tx = std::make_shared<pqxx::nontransaction>(connection->getRef());
|
||||
current_lsn = advanceLSN(tx);
|
||||
tx->commit();
|
||||
}
|
||||
|
||||
columns_ast = std::move(insert_columns);
|
||||
for (const auto & [table_name, storage_info] : storages_info_)
|
||||
storages.emplace(table_name, StorageData(storage_info, log));
|
||||
|
||||
LOG_TRACE(log, "Starting replication. LSN: {} (last: {}), storages: {}",
|
||||
getLSNValue(current_lsn), getLSNValue(final_lsn), storages.size());
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::assertCorrectInsertion(StorageData::Buffer & buffer, size_t column_idx)
|
||||
MaterializedPostgreSQLConsumer::StorageData::StorageData(const StorageInfo & storage_info, Poco::Logger * log_)
|
||||
: storage(storage_info.storage)
|
||||
, table_description(storage_info.storage->getInMemoryMetadataPtr()->getSampleBlock())
|
||||
, columns_attributes(storage_info.attributes)
|
||||
, column_names(storage_info.storage->getInMemoryMetadataPtr()->getColumns().getNamesOfPhysical())
|
||||
, array_info(createArrayInfos(storage_info.storage->getInMemoryMetadataPtr()->getColumns().getAllPhysical(), table_description))
|
||||
{
|
||||
if (column_idx >= buffer.description.sample_block.columns()
|
||||
|| column_idx >= buffer.description.types.size()
|
||||
|| column_idx >= buffer.columns.size())
|
||||
throw Exception(
|
||||
ErrorCodes::LOGICAL_ERROR,
|
||||
auto columns_num = table_description.sample_block.columns();
|
||||
/// +2 because of _sign and _version columns
|
||||
if (columns_attributes.size() + 2 != columns_num)
|
||||
{
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR,
|
||||
"Columns number mismatch. Attributes: {}, buffer: {}",
|
||||
columns_attributes.size(), columns_num);
|
||||
}
|
||||
|
||||
LOG_TRACE(log_, "Adding definition for table {}, structure: {}",
|
||||
storage_info.storage->getStorageID().getNameForLogs(),
|
||||
table_description.sample_block.dumpStructure());
|
||||
}
|
||||
|
||||
MaterializedPostgreSQLConsumer::StorageData::Buffer::Buffer(
|
||||
ColumnsWithTypeAndName && columns_,
|
||||
const ExternalResultDescription & table_description_)
|
||||
{
|
||||
if (columns_.end() != std::find_if(
|
||||
columns_.begin(), columns_.end(),
|
||||
[](const auto & col) { return col.name == "_sign" || col.name == "_version"; }))
|
||||
{
|
||||
throw Exception(ErrorCodes::ILLEGAL_COLUMN,
|
||||
"PostgreSQL table cannot contain `_sign` or `_version` columns "
|
||||
"as they are reserved for internal usage");
|
||||
}
|
||||
|
||||
columns_.push_back(table_description_.sample_block.getByName("_sign"));
|
||||
columns_.push_back(table_description_.sample_block.getByName("_version"));
|
||||
|
||||
for (const auto & col : columns_)
|
||||
{
|
||||
if (!table_description_.sample_block.has(col.name))
|
||||
{
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR,
|
||||
"Having column {}, but no such column in table ({})",
|
||||
col.name, table_description_.sample_block.dumpStructure());
|
||||
}
|
||||
|
||||
const auto & actual_column = table_description_.sample_block.getByName(col.name);
|
||||
if (col.type != actual_column.type)
|
||||
{
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR,
|
||||
"Having column {} of type {}, but expected {}",
|
||||
col.name, col.type->getName(), actual_column.type->getName());
|
||||
}
|
||||
}
|
||||
|
||||
sample_block = Block(columns_);
|
||||
columns = sample_block.cloneEmptyColumns();
|
||||
|
||||
for (const auto & name : sample_block.getNames())
|
||||
columns_ast.children.emplace_back(std::make_shared<ASTIdentifier>(name));
|
||||
}
|
||||
|
||||
MaterializedPostgreSQLConsumer::StorageData::Buffer & MaterializedPostgreSQLConsumer::StorageData::getBuffer()
|
||||
{
|
||||
if (!buffer)
|
||||
{
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "Data buffer not initialized for {}",
|
||||
storage->getStorageID().getNameForLogs());
|
||||
}
|
||||
|
||||
return *buffer;
|
||||
}
|
||||
|
||||
void MaterializedPostgreSQLConsumer::StorageData::Buffer::assertInsertIsPossible(size_t col_idx) const
|
||||
{
|
||||
if (col_idx >= columns.size())
|
||||
{
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR,
|
||||
"Attempt to insert into buffer at position: "
|
||||
"{}, but block columns size is {}, types size: {}, columns size: {}, buffer structure: {}",
|
||||
column_idx,
|
||||
buffer.description.sample_block.columns(),
|
||||
buffer.description.types.size(), buffer.columns.size(),
|
||||
buffer.description.sample_block.dumpStructure());
|
||||
"{}, but block columns size is {} (full structure: {})",
|
||||
col_idx, columns.size(), sample_block.dumpStructure());
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::insertValue(StorageData::Buffer & buffer, const std::string & value, size_t column_idx)
|
||||
void MaterializedPostgreSQLConsumer::insertValue(StorageData & storage_data, const std::string & value, size_t column_idx)
|
||||
{
|
||||
assertCorrectInsertion(buffer, column_idx);
|
||||
auto & buffer = storage_data.getBuffer();
|
||||
buffer.assertInsertIsPossible(column_idx);
|
||||
|
||||
const auto & sample = buffer.description.sample_block.getByPosition(column_idx);
|
||||
bool is_nullable = buffer.description.types[column_idx].second;
|
||||
const auto & column_type_and_name = buffer.sample_block.getByPosition(column_idx);
|
||||
auto & column = buffer.columns[column_idx];
|
||||
|
||||
const size_t column_idx_in_table = storage_data.table_description.sample_block.getPositionByName(column_type_and_name.name);
|
||||
const auto & type_description = storage_data.table_description.types[column_idx_in_table];
|
||||
|
||||
try
|
||||
{
|
||||
if (is_nullable)
|
||||
if (column_type_and_name.type->isNullable())
|
||||
{
|
||||
ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*buffer.columns[column_idx]);
|
||||
const auto & data_type = assert_cast<const DataTypeNullable &>(*sample.type);
|
||||
ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*column);
|
||||
const auto & data_type = assert_cast<const DataTypeNullable &>(*column_type_and_name.type);
|
||||
|
||||
insertPostgreSQLValue(
|
||||
column_nullable.getNestedColumn(), value,
|
||||
buffer.description.types[column_idx].first, data_type.getNestedType(), buffer.array_info, column_idx);
|
||||
column_nullable.getNestedColumn(), value, type_description.first,
|
||||
data_type.getNestedType(), storage_data.array_info, column_idx_in_table);
|
||||
|
||||
column_nullable.getNullMapData().emplace_back(0);
|
||||
}
|
||||
else
|
||||
{
|
||||
insertPostgreSQLValue(
|
||||
*buffer.columns[column_idx], value,
|
||||
buffer.description.types[column_idx].first, sample.type,
|
||||
buffer.array_info, column_idx);
|
||||
*column, value, type_description.first, column_type_and_name.type,
|
||||
storage_data.array_info, column_idx_in_table);
|
||||
}
|
||||
}
|
||||
catch (const pqxx::conversion_error & e)
|
||||
{
|
||||
LOG_ERROR(log, "Conversion failed while inserting PostgreSQL value {}, will insert default value. Error: {}", value, e.what());
|
||||
insertDefaultValue(buffer, column_idx);
|
||||
LOG_ERROR(log, "Conversion failed while inserting PostgreSQL value {}, "
|
||||
"will insert default value. Error: {}", value, e.what());
|
||||
|
||||
insertDefaultPostgreSQLValue(*column, *column_type_and_name.column);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::insertDefaultValue(StorageData::Buffer & buffer, size_t column_idx)
|
||||
void MaterializedPostgreSQLConsumer::insertDefaultValue(StorageData & storage_data, size_t column_idx)
|
||||
{
|
||||
assertCorrectInsertion(buffer, column_idx);
|
||||
auto & buffer = storage_data.getBuffer();
|
||||
buffer.assertInsertIsPossible(column_idx);
|
||||
|
||||
const auto & sample = buffer.description.sample_block.getByPosition(column_idx);
|
||||
insertDefaultPostgreSQLValue(*buffer.columns[column_idx], *sample.column);
|
||||
const auto & column_type_and_name = buffer.sample_block.getByPosition(column_idx);
|
||||
auto & column = buffer.columns[column_idx];
|
||||
|
||||
insertDefaultPostgreSQLValue(*column, *column_type_and_name.column);
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::readString(const char * message, size_t & pos, size_t size, String & result)
|
||||
{
|
||||
assert(size > pos + 2);
|
||||
@ -173,7 +225,6 @@ void MaterializedPostgreSQLConsumer::readString(const char * message, size_t & p
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
template<typename T>
|
||||
T MaterializedPostgreSQLConsumer::unhexN(const char * message, size_t pos, size_t n)
|
||||
{
|
||||
@ -186,7 +237,6 @@ T MaterializedPostgreSQLConsumer::unhexN(const char * message, size_t pos, size_
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
Int64 MaterializedPostgreSQLConsumer::readInt64(const char * message, size_t & pos, [[maybe_unused]] size_t size)
|
||||
{
|
||||
assert(size >= pos + 16);
|
||||
@ -195,7 +245,6 @@ Int64 MaterializedPostgreSQLConsumer::readInt64(const char * message, size_t & p
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
Int32 MaterializedPostgreSQLConsumer::readInt32(const char * message, size_t & pos, [[maybe_unused]] size_t size)
|
||||
{
|
||||
assert(size >= pos + 8);
|
||||
@ -204,7 +253,6 @@ Int32 MaterializedPostgreSQLConsumer::readInt32(const char * message, size_t & p
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
Int16 MaterializedPostgreSQLConsumer::readInt16(const char * message, size_t & pos, [[maybe_unused]] size_t size)
|
||||
{
|
||||
assert(size >= pos + 4);
|
||||
@ -213,7 +261,6 @@ Int16 MaterializedPostgreSQLConsumer::readInt16(const char * message, size_t & p
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
Int8 MaterializedPostgreSQLConsumer::readInt8(const char * message, size_t & pos, [[maybe_unused]] size_t size)
|
||||
{
|
||||
assert(size >= pos + 2);
|
||||
@ -222,25 +269,23 @@ Int8 MaterializedPostgreSQLConsumer::readInt8(const char * message, size_t & pos
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
StorageData::Buffer & buffer, const char * message, size_t & pos, [[maybe_unused]] size_t size, PostgreSQLQuery type, bool old_value)
|
||||
StorageData & storage_data,
|
||||
const char * message,
|
||||
size_t & pos,
|
||||
size_t size,
|
||||
PostgreSQLQuery type,
|
||||
bool old_value)
|
||||
{
|
||||
Int16 num_columns = readInt16(message, pos, size);
|
||||
|
||||
/// Sanity check. In fact, it was already checked.
|
||||
if (static_cast<size_t>(num_columns) + 2 != buffer.getColumnsNum()) /// +2 -- sign and version columns
|
||||
throw Exception(ErrorCodes::POSTGRESQL_REPLICATION_INTERNAL_ERROR,
|
||||
"Number of columns does not match. Got: {}, expected {}, current buffer structure: {}",
|
||||
num_columns, buffer.getColumnsNum(), buffer.description.sample_block.dumpStructure());
|
||||
|
||||
auto proccess_column_value = [&](Int8 identifier, Int16 column_idx)
|
||||
{
|
||||
switch (identifier) // NOLINT(bugprone-switch-missing-default-case)
|
||||
{
|
||||
case 'n': /// NULL
|
||||
{
|
||||
insertDefaultValue(buffer, column_idx);
|
||||
insertDefaultValue(storage_data, column_idx);
|
||||
break;
|
||||
}
|
||||
case 't': /// Text formatted value
|
||||
@ -250,7 +295,7 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
for (Int32 i = 0; i < col_len; ++i)
|
||||
value += readInt8(message, pos, size);
|
||||
|
||||
insertValue(buffer, value, column_idx);
|
||||
insertValue(storage_data, value, column_idx);
|
||||
break;
|
||||
}
|
||||
case 'u': /// TOAST value && unchanged at the same time. Actual value is not sent.
|
||||
@ -258,13 +303,13 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
/// TOAST values are not supported. (TOAST values are values that are considered in postgres
|
||||
/// to be too large to be stored directly)
|
||||
LOG_WARNING(log, "Got TOAST value, which is not supported, default value will be used instead.");
|
||||
insertDefaultValue(buffer, column_idx);
|
||||
insertDefaultValue(storage_data, column_idx);
|
||||
break;
|
||||
}
|
||||
case 'b': /// Binary data.
|
||||
{
|
||||
LOG_WARNING(log, "We do not yet process this format of data, will insert default value");
|
||||
insertDefaultValue(buffer, column_idx);
|
||||
insertDefaultValue(storage_data, column_idx);
|
||||
break;
|
||||
}
|
||||
default:
|
||||
@ -272,7 +317,7 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
LOG_WARNING(log, "Unexpected identifier: {}. This is a bug! Please report an issue on github", identifier);
|
||||
chassert(false);
|
||||
|
||||
insertDefaultValue(buffer, column_idx);
|
||||
insertDefaultValue(storage_data, column_idx);
|
||||
break;
|
||||
}
|
||||
}
|
||||
@ -291,7 +336,7 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
"Got error while receiving value for column {}, will insert default value. Error: {}",
|
||||
column_idx, getCurrentExceptionMessage(true));
|
||||
|
||||
insertDefaultValue(buffer, column_idx);
|
||||
insertDefaultValue(storage_data, column_idx);
|
||||
/// Let's collect only the first exception.
|
||||
/// This delaying of error throw is needed because
|
||||
/// some errors can be ignored and just logged,
|
||||
@ -301,19 +346,20 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
}
|
||||
}
|
||||
|
||||
auto & columns = storage_data.getBuffer().columns;
|
||||
switch (type)
|
||||
{
|
||||
case PostgreSQLQuery::INSERT:
|
||||
{
|
||||
buffer.columns[num_columns]->insert(static_cast<Int8>(1));
|
||||
buffer.columns[num_columns + 1]->insert(lsn_value);
|
||||
columns[num_columns]->insert(static_cast<Int8>(1));
|
||||
columns[num_columns + 1]->insert(lsn_value);
|
||||
|
||||
break;
|
||||
}
|
||||
case PostgreSQLQuery::DELETE:
|
||||
{
|
||||
buffer.columns[num_columns]->insert(static_cast<Int8>(-1));
|
||||
buffer.columns[num_columns + 1]->insert(lsn_value);
|
||||
columns[num_columns]->insert(static_cast<Int8>(-1));
|
||||
columns[num_columns + 1]->insert(lsn_value);
|
||||
|
||||
break;
|
||||
}
|
||||
@ -321,11 +367,11 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
{
|
||||
/// Process old value in case changed value is a primary key.
|
||||
if (old_value)
|
||||
buffer.columns[num_columns]->insert(static_cast<Int8>(-1));
|
||||
columns[num_columns]->insert(static_cast<Int8>(-1));
|
||||
else
|
||||
buffer.columns[num_columns]->insert(static_cast<Int8>(1));
|
||||
columns[num_columns]->insert(static_cast<Int8>(1));
|
||||
|
||||
buffer.columns[num_columns + 1]->insert(lsn_value);
|
||||
columns[num_columns + 1]->insert(lsn_value);
|
||||
|
||||
break;
|
||||
}
|
||||
@ -335,7 +381,6 @@ void MaterializedPostgreSQLConsumer::readTupleData(
|
||||
std::rethrow_exception(error);
|
||||
}
|
||||
|
||||
|
||||
/// https://www.postgresql.org/docs/13/protocol-logicalrep-message-formats.html
|
||||
void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * replication_message, size_t size)
|
||||
{
|
||||
@ -366,10 +411,10 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
return;
|
||||
|
||||
Int8 new_tuple = readInt8(replication_message, pos, size);
|
||||
auto & buffer = storages.find(table_name)->second.buffer;
|
||||
auto & storage_data = storages.find(table_name)->second;
|
||||
|
||||
if (new_tuple)
|
||||
readTupleData(buffer, replication_message, pos, size, PostgreSQLQuery::INSERT);
|
||||
readTupleData(storage_data, replication_message, pos, size, PostgreSQLQuery::INSERT);
|
||||
|
||||
break;
|
||||
}
|
||||
@ -386,7 +431,7 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
if (!isSyncAllowed(relation_id, table_name))
|
||||
return;
|
||||
|
||||
auto & buffer = storages.find(table_name)->second.buffer;
|
||||
auto & storage_data = storages.find(table_name)->second;
|
||||
|
||||
auto proccess_identifier = [&](Int8 identifier) -> bool
|
||||
{
|
||||
@ -401,13 +446,13 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
/// it is much more efficient to use replica identity index, but support all possible cases.
|
||||
case 'O':
|
||||
{
|
||||
readTupleData(buffer, replication_message, pos, size, PostgreSQLQuery::UPDATE, true);
|
||||
readTupleData(storage_data, replication_message, pos, size, PostgreSQLQuery::UPDATE, true);
|
||||
break;
|
||||
}
|
||||
case 'N':
|
||||
{
|
||||
/// New row.
|
||||
readTupleData(buffer, replication_message, pos, size, PostgreSQLQuery::UPDATE);
|
||||
readTupleData(storage_data, replication_message, pos, size, PostgreSQLQuery::UPDATE);
|
||||
read_next = false;
|
||||
break;
|
||||
}
|
||||
@ -441,8 +486,8 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
/// 0 or 1 if replica identity is set to full. For now only default replica identity is supported (with primary keys).
|
||||
readInt8(replication_message, pos, size);
|
||||
|
||||
auto & buffer = storages.find(table_name)->second.buffer;
|
||||
readTupleData(buffer, replication_message, pos, size, PostgreSQLQuery::DELETE);
|
||||
auto & storage_data = storages.find(table_name)->second;
|
||||
readTupleData(storage_data, replication_message, pos, size, PostgreSQLQuery::DELETE);
|
||||
break;
|
||||
}
|
||||
case 'C': // Commit
|
||||
@ -490,8 +535,6 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
return;
|
||||
}
|
||||
|
||||
auto & buffer = storage_iter->second.buffer;
|
||||
|
||||
/// 'd' - default (primary key if any)
|
||||
/// 'n' - nothing
|
||||
/// 'f' - all columns (set replica identity full)
|
||||
@ -507,47 +550,94 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
return;
|
||||
}
|
||||
|
||||
auto log_table_structure_changed = [&](const std::string & reason)
|
||||
{
|
||||
LOG_INFO(log, "Table structure of the table {} changed ({}), "
|
||||
"will mark it as skipped from replication. "
|
||||
"Please perform manual DETACH and ATTACH of the table to bring it back",
|
||||
table_name, reason);
|
||||
};
|
||||
|
||||
Int16 num_columns = readInt16(replication_message, pos, size);
|
||||
|
||||
if (static_cast<size_t>(num_columns) + 2 != buffer.getColumnsNum()) /// +2 -- sign and version columns
|
||||
{
|
||||
markTableAsSkipped(relation_id, table_name);
|
||||
return;
|
||||
}
|
||||
auto & storage_data = storage_iter->second;
|
||||
const auto & description = storage_data.table_description;
|
||||
|
||||
if (static_cast<size_t>(num_columns) != buffer.attributes.size())
|
||||
const size_t actual_columns_num = storage_data.getColumnsNum();
|
||||
if (size_t(num_columns) > actual_columns_num - 2)
|
||||
{
|
||||
#ifndef NDEBUG
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR,
|
||||
"Mismatch in attributes size. Got {}, expected {}. It's a bug. Current buffer structure: {}",
|
||||
num_columns, buffer.attributes.size(), buffer.description.sample_block.dumpStructure());
|
||||
#else
|
||||
LOG_ERROR(log, "Mismatch in attributes size. Got {}, expected {}. It's a bug. Current buffer structure: {}",
|
||||
num_columns, buffer.attributes.size(), buffer.description.sample_block.dumpStructure());
|
||||
log_table_structure_changed(fmt::format("received {} columns, expected {}", num_columns, actual_columns_num - 2));
|
||||
markTableAsSkipped(relation_id, table_name);
|
||||
return;
|
||||
#endif
|
||||
}
|
||||
|
||||
Int32 data_type_id;
|
||||
Int32 type_modifier; /// For example, n in varchar(n)
|
||||
|
||||
std::set<std::string> all_columns(storage_data.column_names.begin(), storage_data.column_names.end());
|
||||
std::set<std::string> received_columns;
|
||||
ColumnsWithTypeAndName columns;
|
||||
|
||||
for (uint16_t i = 0; i < num_columns; ++i)
|
||||
{
|
||||
String column_name;
|
||||
readInt8(replication_message, pos, size); /// Marks column as part of replica identity index
|
||||
readString(replication_message, pos, size, column_name);
|
||||
|
||||
if (!all_columns.contains(column_name))
|
||||
{
|
||||
log_table_structure_changed(fmt::format("column {} is not known", column_name));
|
||||
markTableAsSkipped(relation_id, table_name);
|
||||
return;
|
||||
}
|
||||
|
||||
data_type_id = readInt32(replication_message, pos, size);
|
||||
type_modifier = readInt32(replication_message, pos, size);
|
||||
|
||||
if (buffer.attributes[i].atttypid != data_type_id || buffer.attributes[i].atttypmod != type_modifier)
|
||||
columns.push_back(description.sample_block.getByName(column_name));
|
||||
received_columns.emplace(column_name);
|
||||
|
||||
const auto & attributes_it = storage_data.columns_attributes.find(column_name);
|
||||
if (attributes_it == storage_data.columns_attributes.end())
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "No column {} in attributes", column_name);
|
||||
|
||||
const auto & attributes = attributes_it->second;
|
||||
if (attributes.atttypid != data_type_id || attributes.atttypmod != type_modifier)
|
||||
{
|
||||
log_table_structure_changed(fmt::format("column {} has a different type", column_name));
|
||||
markTableAsSkipped(relation_id, table_name);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
if (size_t(num_columns) < actual_columns_num)
|
||||
{
|
||||
std::vector<std::string> absent_columns;
|
||||
std::set_difference(
|
||||
all_columns.begin(), all_columns.end(),
|
||||
received_columns.begin(), received_columns.end(), std::back_inserter(absent_columns));
|
||||
|
||||
for (const auto & name : absent_columns)
|
||||
{
|
||||
if (name == "_sign" || name == "_version")
|
||||
continue;
|
||||
|
||||
const auto & attributes_it = storage_data.columns_attributes.find(name);
|
||||
if (attributes_it == storage_data.columns_attributes.end())
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "No column {} in attributes", name);
|
||||
|
||||
/// Column has a default value or it is a GENERATED columns.
|
||||
if (!attributes_it->second.attr_def.empty())
|
||||
continue;
|
||||
|
||||
log_table_structure_changed(fmt::format("column {} was not found", name));
|
||||
markTableAsSkipped(relation_id, table_name);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
storage_data.setBuffer(std::make_unique<StorageData::Buffer>(std::move(columns), description));
|
||||
tables_to_sync.insert(table_name);
|
||||
break;
|
||||
}
|
||||
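The relation-message handling above boils down to a small rule set: every column PostgreSQL reports must already be known, missing columns are tolerated only when they carry a default or GENERATED expression, and any type change marks the table as skipped. A condensed, illustrative Python restatement of that decision (the names and the `attr_def` field mirror the surrounding diff; the type/modifier comparison is omitted for brevity):

```python
def relation_message_matches(received_columns, known_columns, attributes) -> bool:
    """Mimics the structure check done for a Relation ('R') message.

    received_columns: column names sent by PostgreSQL for this relation
    known_columns:    physical columns of the tracked table (plus _sign/_version)
    attributes:       mapping column name -> object with an `attr_def` string
    """
    for name in received_columns:
        if name not in known_columns:
            return False  # unknown column -> table structure changed

    for name in set(known_columns) - set(received_columns):
        if name in ("_sign", "_version"):
            continue  # internal ClickHouse columns, never sent by PostgreSQL
        if not attributes[name].attr_def:
            return False  # absent column without a DEFAULT/GENERATED expression
    return True
```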
@ -563,7 +653,6 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::syncTables()
|
||||
{
|
||||
size_t synced_tables = 0;
|
||||
@ -571,8 +660,8 @@ void MaterializedPostgreSQLConsumer::syncTables()
|
||||
{
|
||||
auto table_name = *tables_to_sync.begin();
|
||||
auto & storage_data = storages.find(table_name)->second;
|
||||
Block result_rows = storage_data.buffer.description.sample_block.cloneWithColumns(std::move(storage_data.buffer.columns));
|
||||
storage_data.buffer.columns = storage_data.buffer.description.sample_block.cloneEmptyColumns();
|
||||
auto & buffer = storage_data.getBuffer();
|
||||
Block result_rows = buffer.sample_block.cloneWithColumns(std::move(buffer.columns));
|
||||
|
||||
try
|
||||
{
|
||||
@ -585,7 +674,7 @@ void MaterializedPostgreSQLConsumer::syncTables()
|
||||
|
||||
auto insert = std::make_shared<ASTInsertQuery>();
|
||||
insert->table_id = storage->getStorageID();
|
||||
insert->columns = storage_data.buffer.columns_ast;
|
||||
insert->columns = std::make_shared<ASTExpressionList>(buffer.columns_ast);
|
||||
|
||||
InterpreterInsertQuery interpreter(insert, insert_context, true);
|
||||
auto io = interpreter.execute();
|
||||
@ -603,10 +692,11 @@ void MaterializedPostgreSQLConsumer::syncTables()
|
||||
catch (...)
|
||||
{
|
||||
/// Retry this buffer later.
|
||||
storage_data.buffer.columns = result_rows.mutateColumns();
|
||||
buffer.columns = result_rows.mutateColumns();
|
||||
throw;
|
||||
}
|
||||
|
||||
storage_data.setBuffer(nullptr);
|
||||
tables_to_sync.erase(tables_to_sync.begin());
|
||||
}
|
||||
|
||||
@ -616,7 +706,6 @@ void MaterializedPostgreSQLConsumer::syncTables()
|
||||
updateLsn();
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::updateLsn()
|
||||
{
|
||||
try
|
||||
@ -632,7 +721,6 @@ void MaterializedPostgreSQLConsumer::updateLsn()
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
String MaterializedPostgreSQLConsumer::advanceLSN(std::shared_ptr<pqxx::nontransaction> tx)
|
||||
{
|
||||
std::string query_str = fmt::format("SELECT end_lsn FROM pg_replication_slot_advance('{}', '{}')", replication_slot_name, final_lsn);
|
||||
@ -644,7 +732,6 @@ String MaterializedPostgreSQLConsumer::advanceLSN(std::shared_ptr<pqxx::nontrans
|
||||
return final_lsn;
|
||||
}
|
||||
|
||||
|
||||
/// Sync for some table might not be allowed if:
|
||||
/// 1. Table schema changed and might break synchronization.
|
||||
/// 2. There is no storage for this table. (As a result of some exception or incorrect pg_publication)
|
||||
@ -700,7 +787,6 @@ bool MaterializedPostgreSQLConsumer::isSyncAllowed(Int32 relation_id, const Stri
|
||||
return false;
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::markTableAsSkipped(Int32 relation_id, const String & relation_name)
|
||||
{
|
||||
skip_list.insert({relation_id, ""}); /// Empty lsn string means - continue waiting for valid lsn.
|
||||
@ -712,12 +798,11 @@ void MaterializedPostgreSQLConsumer::markTableAsSkipped(Int32 relation_id, const
|
||||
relation_name, relation_id);
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::addNested(
|
||||
const String & postgres_table_name, StorageInfo nested_storage_info, const String & table_start_lsn)
|
||||
{
|
||||
assert(!storages.contains(postgres_table_name));
|
||||
storages.emplace(postgres_table_name, nested_storage_info);
|
||||
storages.emplace(postgres_table_name, StorageData(nested_storage_info, log));
|
||||
|
||||
auto it = deleted_tables.find(postgres_table_name);
|
||||
if (it != deleted_tables.end())
|
||||
@ -728,17 +813,15 @@ void MaterializedPostgreSQLConsumer::addNested(
|
||||
waiting_list[postgres_table_name] = table_start_lsn;
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::updateNested(const String & table_name, StorageInfo nested_storage_info, Int32 table_id, const String & table_start_lsn)
|
||||
{
|
||||
assert(!storages.contains(table_name));
|
||||
storages.emplace(table_name, nested_storage_info);
|
||||
storages.emplace(table_name, StorageData(nested_storage_info, log));
|
||||
|
||||
/// Set start position to valid lsn. Before it was an empty string. Further read for table allowed, if it has a valid lsn.
|
||||
skip_list[table_id] = table_start_lsn;
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::removeNested(const String & postgres_table_name)
|
||||
{
|
||||
auto it = storages.find(postgres_table_name);
|
||||
@ -747,7 +830,6 @@ void MaterializedPostgreSQLConsumer::removeNested(const String & postgres_table_
|
||||
deleted_tables.insert(postgres_table_name);
|
||||
}
|
||||
|
||||
|
||||
void MaterializedPostgreSQLConsumer::setSetting(const SettingChange & setting)
|
||||
{
|
||||
if (setting.name == "materialized_postgresql_max_block_size")
|
||||
@ -756,7 +838,6 @@ void MaterializedPostgreSQLConsumer::setSetting(const SettingChange & setting)
|
||||
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Unsupported setting: {}", setting.name);
|
||||
}
|
||||
|
||||
|
||||
/// Read binary changes from replication slot via COPY command (starting from current lsn in a slot).
|
||||
bool MaterializedPostgreSQLConsumer::consume()
|
||||
{
|
||||
|
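For context, `consume()` reads the raw logical-replication messages that PostgreSQL emits for the slot. A hedged sketch of inspecting such messages from Python with psycopg2 and the built-in `pg_logical_slot_peek_binary_changes` function (the connection string, slot and publication names are placeholders; the consumer itself streams the slot via COPY rather than peeking it):

```python
import psycopg2

# Placeholder connection string, slot and publication names.
conn = psycopg2.connect("dbname=postgres user=postgres host=localhost")
with conn.cursor() as cur:
    # Peek at (without consuming) the pending pgoutput messages of a logical slot.
    cur.execute(
        """
        SELECT lsn, xid, data
        FROM pg_logical_slot_peek_binary_changes(
            %s, NULL, NULL, 'proto_version', '1', 'publication_names', %s)
        """,
        ("my_slot", "my_publication"),
    )
    for lsn, xid, data in cur.fetchall():
        # Each message starts with a type byte ('B', 'I', 'U', 'D', 'C', 'R', ...),
        # the same identifiers processReplicationMessage() switches on.
        print(lsn, xid, bytes(data)[:1])
```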
@ -32,32 +32,37 @@ class MaterializedPostgreSQLConsumer
|
||||
private:
|
||||
struct StorageData
|
||||
{
|
||||
explicit StorageData(const StorageInfo & storage_info, Poco::Logger * log_);
|
||||
|
||||
size_t getColumnsNum() const { return table_description.sample_block.columns(); }
|
||||
|
||||
const Block & getSampleBlock() const { return table_description.sample_block; }
|
||||
|
||||
using ArrayInfo = std::unordered_map<size_t, PostgreSQLArrayInfo>;
|
||||
|
||||
const StoragePtr storage;
|
||||
const ExternalResultDescription table_description;
|
||||
const PostgreSQLTableStructure::Attributes columns_attributes;
|
||||
const Names column_names;
|
||||
const ArrayInfo array_info;
|
||||
|
||||
struct Buffer
|
||||
{
|
||||
ExternalResultDescription description;
|
||||
Block sample_block;
|
||||
MutableColumns columns;
|
||||
ASTExpressionList columns_ast;
|
||||
|
||||
/// Needed to pass to insert query columns list in syncTables().
|
||||
std::shared_ptr<ASTExpressionList> columns_ast;
|
||||
/// Needed for insertPostgreSQLValue() method to parse array
|
||||
std::unordered_map<size_t, PostgreSQLArrayInfo> array_info;
|
||||
/// To validate ddl.
|
||||
PostgreSQLTableStructure::Attributes attributes;
|
||||
explicit Buffer(ColumnsWithTypeAndName && columns_, const ExternalResultDescription & table_description_);
|
||||
|
||||
Buffer(StorageMetadataPtr storage_metadata, const PostgreSQLTableStructure::Attributes & attributes_);
|
||||
|
||||
size_t getColumnsNum() const
|
||||
{
|
||||
const auto & sample_block = description.sample_block;
|
||||
return sample_block.columns();
|
||||
}
|
||||
void assertInsertIsPossible(size_t col_idx) const;
|
||||
};
|
||||
|
||||
StoragePtr storage;
|
||||
Buffer buffer;
|
||||
Buffer & getBuffer();
|
||||
|
||||
explicit StorageData(const StorageInfo & storage_info);
|
||||
StorageData(const StorageData & other) = delete;
|
||||
void setBuffer(std::unique_ptr<Buffer> buffer_) { buffer = std::move(buffer_); }
|
||||
|
||||
private:
|
||||
std::unique_ptr<Buffer> buffer;
|
||||
};
|
||||
|
||||
using Storages = std::unordered_map<String, StorageData>;
|
||||
@ -97,8 +102,8 @@ private:
|
||||
|
||||
bool isSyncAllowed(Int32 relation_id, const String & relation_name);
|
||||
|
||||
static void insertDefaultValue(StorageData::Buffer & buffer, size_t column_idx);
|
||||
void insertValue(StorageData::Buffer & buffer, const std::string & value, size_t column_idx);
|
||||
static void insertDefaultValue(StorageData & storage_data, size_t column_idx);
|
||||
void insertValue(StorageData & storage_data, const std::string & value, size_t column_idx);
|
||||
|
||||
enum class PostgreSQLQuery
|
||||
{
|
||||
@ -107,7 +112,7 @@ private:
|
||||
DELETE
|
||||
};
|
||||
|
||||
void readTupleData(StorageData::Buffer & buffer, const char * message, size_t & pos, size_t size, PostgreSQLQuery type, bool old_value = false);
|
||||
void readTupleData(StorageData & storage_data, const char * message, size_t & pos, size_t size, PostgreSQLQuery type, bool old_value = false);
|
||||
|
||||
template<typename T>
|
||||
static T unhexN(const char * message, size_t pos, size_t n);
|
||||
@ -119,8 +124,6 @@ private:
|
||||
|
||||
void markTableAsSkipped(Int32 relation_id, const String & relation_name);
|
||||
|
||||
static void assertCorrectInsertion(StorageData::Buffer & buffer, size_t column_idx);
|
||||
|
||||
/// lsn - log sequence number, like wal offset (64 bit).
|
||||
static Int64 getLSNValue(const std::string & lsn)
|
||||
{
|
||||
|
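`getLSNValue` converts the textual LSN into a single 64-bit number so that positions can be compared. A minimal Python sketch of that conversion, assuming the standard PostgreSQL `X/Y` hexadecimal notation (the function name and sample values are illustrative):

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/16B6E60' into a 64-bit WAL position.

    The part before the slash holds the high 32 bits, the part after it
    the low 32 bits; both are hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


assert lsn_to_int("0/16B6E60") == 0x16B6E60
assert lsn_to_int("1/0") == 1 << 32
```

Comparing LSNs as plain integers is what lets the consumer decide whether a skipped table's recorded start position has already been reached.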
@ -337,6 +337,7 @@ void PostgreSQLReplicationHandler::startSynchronization(bool throw_on_error)
|
||||
dropReplicationSlot(tx);
|
||||
|
||||
initial_sync();
|
||||
LOG_DEBUG(log, "Loaded {} tables", nested_storages.size());
|
||||
}
|
||||
/// Synchronization and initial load already took place - do not create any new tables, just fetch StoragePtr's
|
||||
/// and pass them to replication consumer.
|
||||
@ -414,16 +415,18 @@ StorageInfo PostgreSQLReplicationHandler::loadFromSnapshot(postgres::Connection
|
||||
std::string query_str = fmt::format("SET TRANSACTION SNAPSHOT '{}'", snapshot_name);
|
||||
tx->exec(query_str);
|
||||
|
||||
auto table_structure = fetchTableStructure(*tx, table_name);
|
||||
if (!table_structure->physical_columns)
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "No table attributes");
|
||||
|
||||
auto table_attributes = table_structure->physical_columns->attributes;
|
||||
|
||||
/// Load from snapshot, which will show table state before creation of replication slot.
|
||||
/// Already connected to needed database, no need to add it to query.
|
||||
auto quoted_name = doubleQuoteWithSchema(table_name);
|
||||
query_str = fmt::format("SELECT * FROM ONLY {}", quoted_name);
|
||||
LOG_DEBUG(log, "Loading PostgreSQL table {}.{}", postgres_database, quoted_name);
|
||||
|
||||
auto table_structure = fetchTableStructure(*tx, table_name);
|
||||
if (!table_structure->physical_columns)
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "No table attributes");
|
||||
auto table_attributes = table_structure->physical_columns->attributes;
|
||||
LOG_DEBUG(log, "Loading PostgreSQL table {}.{}", postgres_database, quoted_name);
|
||||
|
||||
auto table_override = tryGetTableOverride(current_database_name, table_name);
|
||||
materialized_storage->createNestedIfNeeded(std::move(table_structure), table_override ? table_override->as<ASTTableOverride>() : nullptr);
|
||||
@ -449,7 +452,9 @@ StorageInfo PostgreSQLReplicationHandler::loadFromSnapshot(postgres::Connection
|
||||
|
||||
materialized_storage->set(nested_storage);
|
||||
auto nested_table_id = nested_storage->getStorageID();
|
||||
LOG_DEBUG(log, "Loaded table {}.{} (uuid: {})", nested_table_id.database_name, nested_table_id.table_name, toString(nested_table_id.uuid));
|
||||
|
||||
LOG_DEBUG(log, "Loaded table {}.{} (uuid: {})",
|
||||
nested_table_id.database_name, nested_table_id.table_name, toString(nested_table_id.uuid));
|
||||
|
||||
return StorageInfo(nested_storage, std::move(table_attributes));
|
||||
}
|
||||
|
@ -25,6 +25,8 @@
|
||||
#include <Parsers/ASTFunction.h>
|
||||
#include <Parsers/ASTIdentifier.h>
|
||||
#include <Parsers/ASTTablesInSelectQuery.h>
|
||||
#include <Parsers/ExpressionListParsers.h>
|
||||
#include <Parsers/formatAST.h>
|
||||
|
||||
#include <Interpreters/applyTableOverride.h>
|
||||
#include <Interpreters/InterpreterDropQuery.h>
|
||||
@ -195,7 +197,8 @@ void StorageMaterializedPostgreSQL::createNestedIfNeeded(PostgreSQLTableStructur
|
||||
const auto ast_create = getCreateNestedTableQuery(std::move(table_structure), table_override);
|
||||
auto table_id = getStorageID();
|
||||
auto tmp_nested_table_id = StorageID(table_id.database_name, getNestedTableName());
|
||||
LOG_DEBUG(log, "Creating clickhouse table for postgresql table {}", table_id.getNameForLogs());
|
||||
LOG_DEBUG(log, "Creating clickhouse table for postgresql table {} (ast: {})",
|
||||
table_id.getNameForLogs(), ast_create->formatForLogging());
|
||||
|
||||
InterpreterCreateQuery interpreter(ast_create, nested_context);
|
||||
interpreter.execute();
|
||||
@ -359,7 +362,8 @@ ASTPtr StorageMaterializedPostgreSQL::getColumnDeclaration(const DataTypePtr & d
|
||||
}
|
||||
|
||||
|
||||
std::shared_ptr<ASTExpressionList> StorageMaterializedPostgreSQL::getColumnsExpressionList(const NamesAndTypesList & columns) const
|
||||
std::shared_ptr<ASTExpressionList>
|
||||
StorageMaterializedPostgreSQL::getColumnsExpressionList(const NamesAndTypesList & columns, std::unordered_map<std::string, ASTPtr> defaults) const
|
||||
{
|
||||
auto columns_expression_list = std::make_shared<ASTExpressionList>();
|
||||
for (const auto & [name, type] : columns)
|
||||
@ -369,6 +373,12 @@ std::shared_ptr<ASTExpressionList> StorageMaterializedPostgreSQL::getColumnsExpr
|
||||
column_declaration->name = name;
|
||||
column_declaration->type = getColumnDeclaration(type);
|
||||
|
||||
if (auto it = defaults.find(name); it != defaults.end())
|
||||
{
|
||||
column_declaration->default_expression = it->second;
|
||||
column_declaration->default_specifier = "DEFAULT";
|
||||
}
|
||||
|
||||
columns_expression_list->children.emplace_back(column_declaration);
|
||||
}
|
||||
return columns_expression_list;
|
||||
@ -460,8 +470,28 @@ ASTPtr StorageMaterializedPostgreSQL::getCreateNestedTableQuery(
|
||||
}
|
||||
else
|
||||
{
|
||||
ordinary_columns_and_types = table_structure->physical_columns->columns;
|
||||
columns_declare_list->set(columns_declare_list->columns, getColumnsExpressionList(ordinary_columns_and_types));
|
||||
const auto columns = table_structure->physical_columns;
|
||||
std::unordered_map<std::string, ASTPtr> defaults;
|
||||
for (const auto & col : columns->columns)
|
||||
{
|
||||
const auto & attr = columns->attributes.at(col.name);
|
||||
if (!attr.attr_def.empty())
|
||||
{
|
||||
ParserExpression expr_parser;
|
||||
Expected expected;
|
||||
ASTPtr result;
|
||||
|
||||
Tokens tokens(attr.attr_def.data(), attr.attr_def.data() + attr.attr_def.size());
|
||||
IParser::Pos pos(tokens, DBMS_DEFAULT_MAX_PARSER_DEPTH);
|
||||
if (!expr_parser.parse(pos, result, expected))
|
||||
{
|
||||
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Failed to parse default expression: {}", attr.attr_def);
|
||||
}
|
||||
defaults.emplace(col.name, result);
|
||||
}
|
||||
}
|
||||
ordinary_columns_and_types = columns->columns;
|
||||
columns_declare_list->set(columns_declare_list->columns, getColumnsExpressionList(ordinary_columns_and_types, defaults));
|
||||
}
|
||||
|
||||
if (ordinary_columns_and_types.empty())
|
||||
|
@ -109,7 +109,8 @@ public:
|
||||
|
||||
ASTPtr getCreateNestedTableQuery(PostgreSQLTableStructurePtr table_structure, const ASTTableOverride * table_override);
|
||||
|
||||
std::shared_ptr<ASTExpressionList> getColumnsExpressionList(const NamesAndTypesList & columns) const;
|
||||
std::shared_ptr<ASTExpressionList> getColumnsExpressionList(
|
||||
const NamesAndTypesList & columns, std::unordered_map<std::string, ASTPtr> defaults = {}) const;
|
||||
|
||||
StoragePtr getNested() const;
|
||||
|
||||
|
@ -66,11 +66,6 @@ def get_scales(runner_type: str) -> Tuple[int, int]:
|
||||
# Let's have it the same as the other ASG
|
||||
# UPDATE THE COMMENT ON CHANGES
|
||||
scale_up = 3
|
||||
if runner_type.startswith("private-"):
|
||||
scale_up = 1
|
||||
elif runner_type == "limited-tester":
|
||||
# The limited runners should inflate and deflate faster
|
||||
scale_up = 2
|
||||
return scale_down, scale_up
|
||||
|
||||
|
||||
@ -120,7 +115,9 @@ def set_capacity(
|
||||
# Are we already at the capacity limits
|
||||
stop = stop or asg["MaxSize"] <= asg["DesiredCapacity"]
|
||||
# Let's calculate a new desired capacity
|
||||
desired_capacity = asg["DesiredCapacity"] + (capacity_deficit // scale_up)
|
||||
desired_capacity = (
|
||||
asg["DesiredCapacity"] + (capacity_deficit + scale_up - 1) // scale_up
|
||||
)
|
||||
desired_capacity = max(desired_capacity, asg["MinSize"])
|
||||
desired_capacity = min(desired_capacity, asg["MaxSize"])
|
||||
# Finally, should the capacity be even changed
|
||||
|
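The replacement formula above switches from floor to ceiling division, so any leftover deficit still adds at least one runner; that is also why the expected capacities in the test cases below each grow by one (for example `increase-1` moves from 16 to 17). A standalone Python illustration, assuming the deficit for that case is the 10 jobs queued beyond the current capacity of 13 with `scale_up = 3`:

```python
def old_desired(current: int, deficit: int, scale_up: int) -> int:
    # Floor division: a deficit smaller than scale_up was silently ignored.
    return current + deficit // scale_up


def new_desired(current: int, deficit: int, scale_up: int) -> int:
    # Ceiling division: any remaining deficit adds at least one runner.
    return current + (deficit + scale_up - 1) // scale_up


assert old_desired(13, 10, 3) == 16  # previous expectation in the test
assert new_desired(13, 10, 3) == 17  # updated expectation in the test
```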
@ -69,14 +69,14 @@ class TestSetCapacity(unittest.TestCase):
|
||||
# Do not change capacity
|
||||
TestCase("noqueue", 1, 13, 20, [Queue("in_progress", 155, "noqueue")], -1),
|
||||
TestCase(
|
||||
"w/reserve-1", 1, 13, 20, [Queue("queued", 15, "w/reserve-1")], -1
|
||||
"w/reserve-1", 1, 13, 20, [Queue("queued", 15, "w/reserve-1")], 14
|
||||
),
|
||||
# Increase capacity
|
||||
TestCase("increase-1", 1, 13, 20, [Queue("queued", 23, "increase-1")], 16),
|
||||
TestCase("increase-1", 1, 13, 20, [Queue("queued", 23, "increase-1")], 17),
|
||||
TestCase(
|
||||
"style-checker", 1, 13, 20, [Queue("queued", 33, "style-checker")], 19
|
||||
"style-checker", 1, 13, 20, [Queue("queued", 33, "style-checker")], 20
|
||||
),
|
||||
TestCase("increase-2", 1, 13, 20, [Queue("queued", 18, "increase-2")], 14),
|
||||
TestCase("increase-2", 1, 13, 20, [Queue("queued", 18, "increase-2")], 15),
|
||||
TestCase("increase-3", 1, 13, 20, [Queue("queued", 183, "increase-3")], 20),
|
||||
TestCase(
|
||||
"increase-w/o reserve",
|
||||
@ -87,21 +87,9 @@ class TestSetCapacity(unittest.TestCase):
|
||||
Queue("in_progress", 11, "increase-w/o reserve"),
|
||||
Queue("queued", 12, "increase-w/o reserve"),
|
||||
],
|
||||
16,
|
||||
17,
|
||||
),
|
||||
TestCase("lower-min", 10, 5, 20, [Queue("queued", 5, "lower-min")], 10),
|
||||
# scale up group with prefix private-
|
||||
TestCase(
|
||||
"private-increase",
|
||||
1,
|
||||
13,
|
||||
20,
|
||||
[
|
||||
Queue("in_progress", 12, "private-increase"),
|
||||
Queue("queued", 11, "private-increase"),
|
||||
],
|
||||
20,
|
||||
),
|
||||
# Decrease capacity
|
||||
TestCase("w/reserve", 1, 13, 20, [Queue("queued", 5, "w/reserve")], 5),
|
||||
TestCase(
|
||||
|
@ -15,6 +15,19 @@ if [[ "$ETH_DNS" ]] && [[ "${ETH_DNS#*: }" != *"$CLOUDFLARE_NS"* ]]; then
|
||||
resolvectl dns "$IFACE" "${new_dns[@]}"
|
||||
fi
|
||||
|
||||
# tune sysctl for network performance
|
||||
cat > /etc/sysctl.d/10-network-memory.conf << EOF
|
||||
net.core.netdev_max_backlog=2000
|
||||
net.core.rmem_max=1048576
|
||||
net.core.wmem_max=1048576
|
||||
net.ipv4.tcp_max_syn_backlog=1024
|
||||
net.ipv4.tcp_rmem=4096 131072 16777216
|
||||
net.ipv4.tcp_wmem=4096 87380 16777216
|
||||
net.ipv4.tcp_mem=4096 131072 16777216
|
||||
EOF
|
||||
|
||||
sysctl -p /etc/sysctl.d/10-network-memory.conf
|
||||
|
||||
mkdir /home/ubuntu/registrystorage
|
||||
|
||||
sed -i 's/preserve_hostname: false/preserve_hostname: true/g' /etc/cloud/cloud.cfg
|
||||
@ -22,4 +35,11 @@ sed -i 's/preserve_hostname: false/preserve_hostname: true/g' /etc/cloud/cloud.c
|
||||
REGISTRY_PROXY_USERNAME=robotclickhouse
|
||||
REGISTRY_PROXY_PASSWORD=$(aws ssm get-parameter --name dockerhub_robot_password --with-decryption | jq '.Parameter.Value' -r)
|
||||
|
||||
docker run -d --network=host -p 5000:5000 -v /home/ubuntu/registrystorage:/var/lib/registry -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 -e REGISTRY_STORAGE_DELETE_ENABLED=true -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io -e REGISTRY_PROXY_PASSWORD="$REGISTRY_PROXY_PASSWORD" -e REGISTRY_PROXY_USERNAME="$REGISTRY_PROXY_USERNAME" --restart=always --name registry registry:2
|
||||
docker run -d --network=host -p 5000:5000 -v /home/ubuntu/registrystorage:/var/lib/registry \
|
||||
-e REGISTRY_STORAGE_CACHE='' \
|
||||
-e REGISTRY_HTTP_ADDR=0.0.0.0:5000 \
|
||||
-e REGISTRY_STORAGE_DELETE_ENABLED=true \
|
||||
-e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
|
||||
-e REGISTRY_PROXY_PASSWORD="$REGISTRY_PROXY_PASSWORD" \
|
||||
-e REGISTRY_PROXY_USERNAME="$REGISTRY_PROXY_USERNAME" \
|
||||
--restart=always --name registry registry:2
|
||||
|
@ -216,7 +216,7 @@ def test_create_or_drop_tables_during_backup(db_engine, table_engine):
|
||||
node = nodes[randint(0, num_nodes - 1)]
|
||||
# "DROP TABLE IF EXISTS" still can throw some errors (e.g. "WRITE locking attempt on node0 has timed out!")
|
||||
# So we use query_and_get_answer_with_error() to ignore any errors.
|
||||
# `lock_acquire_timeout` is also reduced because we don't wait our test to wait too long.
|
||||
# `lock_acquire_timeout` is reduced because we don't wait our test to wait too long.
|
||||
node.query_and_get_answer_with_error(
|
||||
f"DROP TABLE IF EXISTS {table_name} SYNC",
|
||||
settings={"lock_acquire_timeout": 10},
|
||||
@ -227,15 +227,24 @@ def test_create_or_drop_tables_during_backup(db_engine, table_engine):
|
||||
table_name1 = f"mydb.tbl{randint(1, num_nodes)}"
|
||||
table_name2 = f"mydb.tbl{randint(1, num_nodes)}"
|
||||
node = nodes[randint(0, num_nodes - 1)]
|
||||
# `lock_acquire_timeout` is reduced because we don't wait our test to wait too long.
|
||||
node.query_and_get_answer_with_error(
|
||||
f"RENAME TABLE {table_name1} TO {table_name2}"
|
||||
f"RENAME TABLE {table_name1} TO {table_name2}",
|
||||
settings={"lock_acquire_timeout": 10},
|
||||
)
|
||||
|
||||
def truncate_tables():
|
||||
while time.time() < end_time:
|
||||
table_name = f"mydb.tbl{randint(1, num_nodes)}"
|
||||
node = nodes[randint(0, num_nodes - 1)]
|
||||
node.query(f"TRUNCATE TABLE IF EXISTS {table_name} SYNC")
|
||||
# "TRUNCATE TABLE IF EXISTS" still can throw some errors
|
||||
# (e.g. "WRITE locking attempt on node0 has timed out!" if the table engine is "Log").
|
||||
# So we use query_and_get_answer_with_error() to ignore any errors.
|
||||
# `lock_acquire_timeout` is reduced because we don't wait our test to wait too long.
|
||||
node.query_and_get_answer_with_error(
|
||||
f"TRUNCATE TABLE IF EXISTS {table_name} SYNC",
|
||||
settings={"lock_acquire_timeout": 10},
|
||||
)
|
||||
|
||||
def make_backups():
|
||||
ids = []
|
||||
|
@ -15,6 +15,7 @@
|
||||
<value>az-zoo1</value>
|
||||
</availability_zone>
|
||||
<server_id>1</server_id>
|
||||
<max_memory_usage_soft_limit>200000000</max_memory_usage_soft_limit>
|
||||
|
||||
<coordination_settings>
|
||||
<operation_timeout_ms>10000</operation_timeout_ms>
|
||||
@ -23,7 +24,6 @@
|
||||
<force_sync>false</force_sync>
|
||||
<election_timeout_lower_bound_ms>2000</election_timeout_lower_bound_ms>
|
||||
<election_timeout_upper_bound_ms>4000</election_timeout_upper_bound_ms>
|
||||
<max_memory_usage_soft_limit>200000000</max_memory_usage_soft_limit>
|
||||
|
||||
<async_replication>1</async_replication>
|
||||
</coordination_settings>
|
||||
|
@ -16,6 +16,7 @@
|
||||
<value>az-zoo2</value>
|
||||
<enable_auto_detection_on_cloud>1</enable_auto_detection_on_cloud>
|
||||
</availability_zone>
|
||||
<max_memory_usage_soft_limit>20000000</max_memory_usage_soft_limit>
|
||||
|
||||
<coordination_settings>
|
||||
<operation_timeout_ms>10000</operation_timeout_ms>
|
||||
@ -24,7 +25,6 @@
|
||||
<force_sync>false</force_sync>
|
||||
<election_timeout_lower_bound_ms>2000</election_timeout_lower_bound_ms>
|
||||
<election_timeout_upper_bound_ms>4000</election_timeout_upper_bound_ms>
|
||||
<max_memory_usage_soft_limit>20000000</max_memory_usage_soft_limit>
|
||||
|
||||
<async_replication>1</async_replication>
|
||||
</coordination_settings>
|
||||
|
@ -13,6 +13,8 @@
|
||||
<tcp_port>2181</tcp_port>
|
||||
<server_id>3</server_id>
|
||||
|
||||
<max_memory_usage_soft_limit>20000000</max_memory_usage_soft_limit>
|
||||
|
||||
<coordination_settings>
|
||||
<operation_timeout_ms>10000</operation_timeout_ms>
|
||||
<session_timeout_ms>15000</session_timeout_ms>
|
||||
@ -20,7 +22,6 @@
|
||||
<force_sync>false</force_sync>
|
||||
<election_timeout_lower_bound_ms>2000</election_timeout_lower_bound_ms>
|
||||
<election_timeout_upper_bound_ms>4000</election_timeout_upper_bound_ms>
|
||||
<max_memory_usage_soft_limit>20000000</max_memory_usage_soft_limit>
|
||||
|
||||
<async_replication>1</async_replication>
|
||||
</coordination_settings>
|
||||
|
@ -944,6 +944,100 @@ def test_symbols_in_publication_name(started_cluster):
|
||||
)
|
||||
|
||||
|
||||
def test_generated_columns(started_cluster):
|
||||
table = "test_generated_columns"
|
||||
|
||||
pg_manager.create_postgres_table(
|
||||
table,
|
||||
"",
|
||||
f"""CREATE TABLE {table} (
|
||||
key integer PRIMARY KEY,
|
||||
x integer,
|
||||
y integer GENERATED ALWAYS AS (x*2) STORED,
|
||||
z text);
|
||||
""",
|
||||
)
|
||||
|
||||
pg_manager.execute(f"insert into {table} (key, x, z) values (1,1,'1');")
|
||||
pg_manager.execute(f"insert into {table} (key, x, z) values (2,2,'2');")
|
||||
|
||||
pg_manager.create_materialized_db(
|
||||
ip=started_cluster.postgres_ip,
|
||||
port=started_cluster.postgres_port,
|
||||
settings=[
|
||||
f"materialized_postgresql_tables_list = '{table}'",
|
||||
"materialized_postgresql_backoff_min_ms = 100",
|
||||
"materialized_postgresql_backoff_max_ms = 100",
|
||||
],
|
||||
)
|
||||
|
||||
check_tables_are_synchronized(
|
||||
instance, table, postgres_database=pg_manager.get_default_database()
|
||||
)
|
||||
|
||||
pg_manager.execute(f"insert into {table} (key, x, z) values (3,3,'3');")
|
||||
pg_manager.execute(f"insert into {table} (key, x, z) values (4,4,'4');")
|
||||
|
||||
check_tables_are_synchronized(
|
||||
instance, table, postgres_database=pg_manager.get_default_database()
|
||||
)
|
||||
|
||||
pg_manager.execute(f"insert into {table} (key, x, z) values (5,5,'5');")
|
||||
pg_manager.execute(f"insert into {table} (key, x, z) values (6,6,'6');")
|
||||
|
||||
check_tables_are_synchronized(
|
||||
instance, table, postgres_database=pg_manager.get_default_database()
|
||||
)
|
||||
|
||||
|
||||
def test_default_columns(started_cluster):
|
||||
table = "test_default_columns"
|
||||
|
||||
pg_manager.create_postgres_table(
|
||||
table,
|
||||
"",
|
||||
f"""CREATE TABLE {table} (
|
||||
key integer PRIMARY KEY,
|
||||
x integer,
|
||||
y text DEFAULT 'y1',
|
||||
z integer,
|
||||
a text DEFAULT 'a1',
|
||||
b integer);
|
||||
""",
|
||||
)
|
||||
|
||||
pg_manager.execute(f"insert into {table} (key, x, z, b) values (1,1,1,1);")
|
||||
pg_manager.execute(f"insert into {table} (key, x, z, b) values (2,2,2,2);")
|
||||
|
||||
pg_manager.create_materialized_db(
|
||||
ip=started_cluster.postgres_ip,
|
||||
port=started_cluster.postgres_port,
|
||||
settings=[
|
||||
f"materialized_postgresql_tables_list = '{table}'",
|
||||
"materialized_postgresql_backoff_min_ms = 100",
|
||||
"materialized_postgresql_backoff_max_ms = 100",
|
||||
],
|
||||
)
|
||||
|
||||
check_tables_are_synchronized(
|
||||
instance, table, postgres_database=pg_manager.get_default_database()
|
||||
)
|
||||
|
||||
pg_manager.execute(f"insert into {table} (key, x, z, b) values (3,3,3,3);")
|
||||
pg_manager.execute(f"insert into {table} (key, x, z, b) values (4,4,4,4);")
|
||||
|
||||
check_tables_are_synchronized(
|
||||
instance, table, postgres_database=pg_manager.get_default_database()
|
||||
)
|
||||
|
||||
pg_manager.execute(f"insert into {table} (key, x, z, b) values (5,5,5,5);")
|
||||
pg_manager.execute(f"insert into {table} (key, x, z, b) values (6,6,6,6);")
|
||||
|
||||
check_tables_are_synchronized(
|
||||
instance, table, postgres_database=pg_manager.get_default_database()
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
cluster.start()
|
||||
input("Cluster created, press any key to destroy...")
|
||||
|
@ -22,16 +22,6 @@ select sum(number) over w as x, max(number) over w as y from t_01568 window w as
|
||||
21 8
|
||||
21 8
|
||||
21 8
|
||||
select sum(number) over w, max(number) over w from t_01568 window w as (partition by p) order by p;
|
||||
3 2
|
||||
3 2
|
||||
3 2
|
||||
12 5
|
||||
12 5
|
||||
12 5
|
||||
21 8
|
||||
21 8
|
||||
21 8
|
||||
select sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,2}', '', t_01568) window w as (partition by p) order by x, y;
|
||||
6 2
|
||||
6 2
|
||||
@ -51,25 +41,6 @@ select sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,
|
||||
42 8
|
||||
42 8
|
||||
42 8
|
||||
select sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,2}', '', t_01568) window w as (partition by p) order by x, y SETTINGS max_threads = 1;
|
||||
6 2
|
||||
6 2
|
||||
6 2
|
||||
6 2
|
||||
6 2
|
||||
6 2
|
||||
24 5
|
||||
24 5
|
||||
24 5
|
||||
24 5
|
||||
24 5
|
||||
24 5
|
||||
42 8
|
||||
42 8
|
||||
42 8
|
||||
42 8
|
||||
42 8
|
||||
42 8
|
||||
select distinct sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,2}', '', t_01568) window w as (partition by p) order by x, y;
|
||||
6 2
|
||||
24 5
|
||||
|
@ -15,12 +15,8 @@ from numbers(9);
|
||||
|
||||
select sum(number) over w as x, max(number) over w as y from t_01568 window w as (partition by p) order by x, y;
|
||||
|
||||
select sum(number) over w, max(number) over w from t_01568 window w as (partition by p) order by p;
|
||||
|
||||
select sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,2}', '', t_01568) window w as (partition by p) order by x, y;
|
||||
|
||||
select sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,2}', '', t_01568) window w as (partition by p) order by x, y SETTINGS max_threads = 1;
|
||||
|
||||
select distinct sum(number) over w as x, max(number) over w as y from remote('127.0.0.{1,2}', '', t_01568) window w as (partition by p) order by x, y;
|
||||
|
||||
-- window functions + aggregation w/shards
|
||||
|
@ -11,11 +11,9 @@ SETTINGS index_granularity = 1024, index_granularity_bytes = '10Mi';
|
||||
INSERT INTO order_by_desc SELECT number, repeat('a', 1024) FROM numbers(1024 * 300);
|
||||
OPTIMIZE TABLE order_by_desc FINAL;
|
||||
|
||||
SELECT s FROM order_by_desc ORDER BY u DESC LIMIT 10 FORMAT Null
|
||||
SETTINGS max_memory_usage = '400M';
|
||||
SELECT s FROM order_by_desc ORDER BY u DESC LIMIT 10 FORMAT Null;
|
||||
|
||||
SELECT s FROM order_by_desc ORDER BY u LIMIT 10 FORMAT Null
|
||||
SETTINGS max_memory_usage = '400M';
|
||||
SELECT s FROM order_by_desc ORDER BY u LIMIT 10 FORMAT Null;
|
||||
|
||||
SYSTEM FLUSH LOGS;
|
||||
|
||||
|
@ -2,7 +2,6 @@
|
||||
-- Please help shorten this list down to zero elements.
|
||||
SELECT name FROM system.functions WHERE NOT is_aggregate AND origin = 'System' AND alias_to = '' AND length(description) < 10
|
||||
AND name NOT IN (
|
||||
'MD4', 'MD5', 'SHA1', 'SHA224', 'SHA256', 'SHA384', 'SHA512', 'halfMD5',
|
||||
'aes_decrypt_mysql', 'aes_encrypt_mysql', 'decrypt', 'encrypt',
|
||||
'base64Decode', 'base64Encode', 'tryBase64Decode',
|
||||
'convertCharset',
|
||||
|
@ -1,100 +0,0 @@
|
||||
1
|
||||
-- { echoOn }
|
||||
|
||||
SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
GROUP BY ac, nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10;
|
||||
0 2 0
|
||||
1 2 0
|
||||
2 2 0
|
||||
SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
GROUP BY ac, nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10
|
||||
SETTINGS max_threads = 1;
|
||||
0 2 0
|
||||
1 2 0
|
||||
2 2 0
|
||||
SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 0
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
UNION ALL
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 1
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
UNION ALL
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 2
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
UNION ALL
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 3
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10;
|
||||
0 2 0
|
||||
1 2 0
|
||||
2 2 0
|
@ -1,119 +0,0 @@
|
||||
CREATE TABLE window_funtion_threading
|
||||
Engine = MergeTree
|
||||
ORDER BY (ac, nw)
|
||||
AS SELECT
|
||||
toUInt64(toFloat32(number % 2) % 20000000) as ac,
|
||||
toFloat32(1) as wg,
|
||||
toUInt16(toFloat32(number % 3) % 400) as nw
|
||||
FROM numbers_mt(10000000);
|
||||
|
||||
SELECT count() FROM (EXPLAIN PIPELINE SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
GROUP BY ac, nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10) where explain ilike '%ScatterByPartitionTransform%' SETTINGS max_threads = 4;
|
||||
|
||||
-- { echoOn }
|
||||
|
||||
SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
GROUP BY ac, nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10;
|
||||
|
||||
SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
GROUP BY ac, nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10
|
||||
SETTINGS max_threads = 1;
|
||||
|
||||
SELECT
|
||||
nw,
|
||||
sum(WR) AS R,
|
||||
sumIf(WR, uniq_rows = 1) AS UNR
|
||||
FROM
|
||||
(
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 0
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
UNION ALL
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 1
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
UNION ALL
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 2
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
UNION ALL
|
||||
SELECT
|
||||
uniq(nw) OVER (PARTITION BY ac) AS uniq_rows,
|
||||
AVG(wg) AS WR,
|
||||
ac,
|
||||
nw
|
||||
FROM window_funtion_threading
|
||||
WHERE (ac % 4) = 3
|
||||
GROUP BY
|
||||
ac,
|
||||
nw
|
||||
)
|
||||
GROUP BY nw
|
||||
ORDER BY nw ASC, R DESC
|
||||
LIMIT 10;
|
@ -0,0 +1 @@
|
||||
42 43
|
19
tests/queries/0_stateless/02918_template_format_deadlock.sh
Executable file
@ -0,0 +1,19 @@
|
||||
#!/usr/bin/env bash
|
||||
# Tags: no-fasttest
|
||||
|
||||
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
|
||||
# shellcheck source=../shell_config.sh
|
||||
. "$CURDIR"/../shell_config.sh
|
||||
|
||||
DATA_FILE=$CLICKHOUSE_TEST_UNIQUE_NAME
|
||||
TEMPLATE_FILE=$CLICKHOUSE_TEST_UNIQUE_NAME.template
|
||||
|
||||
echo "42 | 43
|
||||
Error line" > $DATA_FILE
|
||||
echo '${a:CSV} | ${b:CSV}' > $TEMPLATE_FILE
|
||||
|
||||
$CLICKHOUSE_LOCAL -q "select * from file('$DATA_FILE', Template, 'a UInt32, b UInt32') settings format_template_row='$TEMPLATE_FILE', input_format_allow_errors_num=1"
|
||||
|
||||
rm $DATA_FILE
|
||||
rm $TEMPLATE_FILE
|
||||
|
@ -0,0 +1,4 @@
|
||||
42
|
||||
100000
|
||||
42
|
||||
100000
|
23
tests/queries/0_stateless/02919_skip_lots_of_parsing_errors.sh
Executable file
@ -0,0 +1,23 @@
|
||||
#!/usr/bin/env bash
|
||||
# Tags: no-fasttest, no-cpu-aarch64
|
||||
|
||||
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
|
||||
# shellcheck source=../shell_config.sh
|
||||
. "$CURDIR"/../shell_config.sh
|
||||
|
||||
FILE=$CLICKHOUSE_TEST_UNIQUE_NAME
|
||||
ERRORS_FILE=$CLICKHOUSE_TEST_UNIQUE_NAME.errors
|
||||
|
||||
$CLICKHOUSE_LOCAL -q "select 'Error' from numbers(100000) format TSVRaw" > $FILE
|
||||
echo -e "42" >> $FILE
|
||||
|
||||
$CLICKHOUSE_LOCAL -q "select * from file('$FILE', CSV, 'x UInt32') settings input_format_allow_errors_ratio=1, max_block_size=10000, input_format_parallel_parsing=0, input_format_record_errors_file_path='$ERRORS_FILE'";
|
||||
$CLICKHOUSE_LOCAL -q "select count() from file('$ERRORS_FILE', CSV)"
|
||||
rm $ERRORS_FILE
|
||||
|
||||
$CLICKHOUSE_LOCAL -q "select * from file('$FILE', CSV, 'x UInt32') settings input_format_allow_errors_ratio=1, max_block_size=10000, input_format_parallel_parsing=1, input_format_record_errors_file_path='$ERRORS_FILE'";
|
||||
$CLICKHOUSE_LOCAL -q "select count() from file('$ERRORS_FILE', CSV)"
|
||||
rm $ERRORS_FILE
|
||||
|
||||
rm $FILE
|
||||
|
@ -0,0 +1 @@
|
||||
2
|
@ -0,0 +1 @@
|
||||
SELECT NOT (0) + NOT (0);
|
@ -0,0 +1,70 @@
-- Const string + non-const arbitrary type
The answer to all questions is 42.
The answer to all questions is 43.
The answer to all questions is 44.
The answer to all questions is 45.
The answer to all questions is 46.
The answer to all questions is 47.
The answer to all questions is 48.
The answer to all questions is 49.
The answer to all questions is 50.
The answer to all questions is 51.
The answer to all questions is 52.
The answer to all questions is 53.
The answer to all questions is 42.42.
The answer to all questions is 43.43.
The answer to all questions is 44.
The answer to all questions is true.
The answer to all questions is false.
The answer to all questions is foo.
The answer to all questions is bar.
The answer to all questions is foo.
The answer to all questions is bar.
The answer to all questions is foo.
The answer to all questions is bar.
The answer to all questions is foo.
The answer to all questions is bar.
The answer to all questions is 42.
The answer to all questions is 42.
The answer to all questions is fae310ca-d52a-4923-9e9b-02bf67f4b009.
The answer to all questions is 2023-11-14.
The answer to all questions is 2123-11-14.
The answer to all questions is 2023-11-14 05:50:12.
The answer to all questions is 2023-11-14 05:50:12.123.
The answer to all questions is hallo.
The answer to all questions is [\'foo\',\'bar\'].
The answer to all questions is {"foo":"bar"}.
The answer to all questions is (42,\'foo\').
The answer to all questions is {42:\'foo\'}.
The answer to all questions is 122.233.64.201.
The answer to all questions is 2001:1:130f:2:3:9c0:876a:130b.
The answer to all questions is (42,43).
The answer to all questions is [(0,0),(10,0),(10,10),(0,10)].
The answer to all questions is [[(20,20),(50,20),(50,50),(20,50)],[(30,30),(50,50),(50,30)]].
The answer to all questions is [[[(0,0),(10,0),(10,10),(0,10)]],[[(20,20),(50,20),(50,50),(20,50)],[(30,30),(50,50),(50,30)]]].
-- Nested
The [\'foo\',\'bar\'] to all questions is [\'qaz\',\'qux\'].
-- NULL arguments
\N
\N
\N
\N
\N
\N
\N
-- Various arguments tests
The Non-const to all questions is strings
The Two arguments to all questions is test
The Three to all questions is arguments and test
The 3 to all questions is arguments test and with int type
The 42 to all questions is 144
The 42 to all questions is 144 and 255
The 42 to all questions is 144
The 42 to all questions is 144 and 255
-- Single argument tests
The answer to all questions is 42.
The answer to all questions is 42.
The answer to all questions is foo.
The answer to all questions is foo.
\N
\N
@ -0,0 +1,85 @@
-- Tags: no-fasttest
-- no-fasttest: json type needs rapidjson library, geo types need s2 geometry

SET allow_experimental_object_type = 1;
SET allow_suspicious_low_cardinality_types=1;

SELECT '-- Const string + non-const arbitrary type';
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(42 :: Int8));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(43 :: Int16));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(44 :: Int32));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(45 :: Int64));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(46 :: Int128));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(47 :: Int256));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(48 :: UInt8));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(49 :: UInt16));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(50 :: UInt32));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(51 :: UInt64));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(52 :: UInt128));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(53 :: UInt256));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(42.42 :: Float32));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(43.43 :: Float64));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(44.44 :: Decimal(2)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(true :: Bool));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(false :: Bool));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('foo' :: String));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('bar' :: FixedString(3)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('foo' :: Nullable(String)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('bar' :: Nullable(FixedString(3))));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('foo' :: LowCardinality(String)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('bar' :: LowCardinality(FixedString(3))));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('foo' :: LowCardinality(Nullable(String))));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('bar' :: LowCardinality(Nullable(FixedString(3)))));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(42 :: LowCardinality(Nullable(UInt32))));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(42 :: LowCardinality(UInt32)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('fae310ca-d52a-4923-9e9b-02bf67f4b009' :: UUID));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('2023-11-14' :: Date));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('2123-11-14' :: Date32));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('2023-11-14 05:50:12' :: DateTime('Europe/Amsterdam')));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('2023-11-14 05:50:12.123' :: DateTime64(3, 'Europe/Amsterdam')));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('hallo' :: Enum('hallo' = 1)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(['foo', 'bar'] :: Array(String)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('{"foo": "bar"}' :: JSON));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize((42, 'foo') :: Tuple(Int32, String)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize(map(42, 'foo') :: Map(Int32, String)));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('122.233.64.201' :: IPv4));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize('2001:0001:130F:0002:0003:09C0:876A:130B' :: IPv6));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize((42, 43) :: Point));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize([(0,0),(10,0),(10,10),(0,10)] :: Ring));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize([[(20, 20), (50, 20), (50, 50), (20, 50)], [(30, 30), (50, 50), (50, 30)]] :: Polygon));
SELECT format('The {0} to all questions is {1}.', 'answer', materialize([[[(0, 0), (10, 0), (10, 10), (0, 10)]], [[(20, 20), (50, 20), (50, 50), (20, 50)],[(30, 30), (50, 50), (50, 30)]]] :: MultiPolygon));

SELECT '-- Nested';
DROP TABLE IF EXISTS format_nested;
CREATE TABLE format_nested(attrs Nested(k String, v String)) ENGINE = MergeTree ORDER BY tuple();
INSERT INTO format_nested VALUES (['foo', 'bar'], ['qaz', 'qux']);
SELECT format('The {0} to all questions is {1}.', attrs.k, attrs.v) FROM format_nested;
DROP TABLE format_nested;

SELECT '-- NULL arguments';
SELECT format('The {0} to all questions is {1}', NULL, NULL);
SELECT format('The {0} to all questions is {1}', NULL, materialize(NULL :: Nullable(UInt64)));
SELECT format('The {0} to all questions is {1}', materialize(NULL :: Nullable(UInt64)), materialize(NULL :: Nullable(UInt64)));
SELECT format('The {0} to all questions is {1}', 42, materialize(NULL :: Nullable(UInt64)));
SELECT format('The {0} to all questions is {1}', '42', materialize(NULL :: Nullable(UInt64)));
SELECT format('The {0} to all questions is {1}', 42, materialize(NULL :: Nullable(UInt64)), materialize(NULL :: Nullable(UInt64)));
SELECT format('The {0} to all questions is {1}', '42', materialize(NULL :: Nullable(UInt64)), materialize(NULL :: Nullable(UInt64)));

SELECT '-- Various arguments tests';
SELECT format('The {0} to all questions is {1}', materialize('Non-const'), materialize(' strings'));
SELECT format('The {0} to all questions is {1}', 'Two arguments ', 'test');
SELECT format('The {0} to all questions is {1} and {2}', 'Three ', 'arguments', ' test');
SELECT format('The {0} to all questions is {1} and {2}', materialize(3 :: Int64), ' arguments test', ' with int type');
SELECT format('The {0} to all questions is {1}', materialize(42 :: Int32), materialize(144 :: UInt64));
SELECT format('The {0} to all questions is {1} and {2}', materialize(42 :: Int32), materialize(144 :: UInt64), materialize(255 :: UInt32));
SELECT format('The {0} to all questions is {1}', 42, 144);
SELECT format('The {0} to all questions is {1} and {2}', 42, 144, 255);

SELECT '-- Single argument tests';
SELECT format('The answer to all questions is {0}.', 42);
SELECT format('The answer to all questions is {0}.', materialize(42));
SELECT format('The answer to all questions is {0}.', 'foo');
SELECT format('The answer to all questions is {0}.', materialize('foo'));
SELECT format('The answer to all questions is {0}.', NULL);
SELECT format('The answer to all questions is {0}.', materialize(NULL :: Nullable(UInt64)));
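As an aside, and not part of the test above: the `format` function also accepts placeholder indexes out of order and reused, as in the example documented for the function. A minimal sketch:

```sql
-- Placeholders may be reused and given in any order.
SELECT format('{1} {0} {1}', 'World', 'Hello'); -- Hello World Hello
```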
@ -0,0 +1 @@
11111111111111110000000000000000111111111111111100000000000000001111111111111111000000000000000011111111111111110000000000000000 00000000000000001111111111111111000000000000000011111111111111110000000000000000111111111111111100000000000000001111111111111111 10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010 01010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101 10101010101010100000000000000000101010101010101000000000000000001010101010101010000000000000000010101010101010100000000000000000 10101010101010100000000000000000101010101010101000000000000000001010101010101010000000000000000010101010101010100000000000000000 1010101010101010000000000000000010101010101010100000000000000000101010101010101000000000000000001010101010101010 1010101010101010000000000000000010101010101010100000000000000000101010101010101000000000000000001010101010101010 01010101010101010000000000000000010101010101010100000000000000000101010101010101000000000000000001010101010101010000000000000000 01010101010101010000000000000000010101010101010100000000000000000101010101010101000000000000000001010101010101010000000000000000 0101010101010101000000000000000001010101010101010000000000000000010101010101010100000000000000000101010101010101 0101010101010101000000000000000001010101010101010000000000000000010101010101010100000000000000000101010101010101 11111111111111111010101010101010111111111111111110101010101010101111111111111111101010101010101011111111111111111010101010101010 11111111111111111010101010101010111111111111111110101010101010101111111111111111101010101010101011111111111111111010101010101010 10101010101010101111111111111111101010101010101011111111111111111010101010101010111111111111111110101010101010101111111111111111 10101010101010101111111111111111101010101010101011111111111111111010101010101010111111111111111110101010101010101111111111111111 11111111111111110101010101010101111111111111111101010101010101011111111111111111010101010101010111111111111111110101010101010101 11111111111111110101010101010101111111111111111101010101010101011111111111111111010101010101010111111111111111110101010101010101 01010101010101011111111111111111010101010101010111111111111111110101010101010101111111111111111101010101010101011111111111111111 01010101010101011111111111111111010101010101010111111111111111110101010101010101111111111111111101010101010101011111111111111111
7
tests/queries/0_stateless/02935_ipv6_bit_operations.sql
Normal file
@ -0,0 +1,7 @@
WITH toIPv6('FFFF:0000:FFFF:0000:FFFF:0000:FFFF:0000') AS ip1, toIPv6('0000:FFFF:0000:FFFF:0000:FFFF:0000:FFFF') AS ip2,
     CAST('226854911280625642308916404954512140970', 'UInt128') AS n1, CAST('113427455640312821154458202477256070485', 'UInt128') AS n2
SELECT bin(ip1), bin(ip2), bin(n1), bin(n2),
       bin(bitAnd(ip1, n1)), bin(bitAnd(n1, ip1)), bin(bitAnd(ip2, n1)), bin(bitAnd(n1, ip2)),
       bin(bitAnd(ip1, n2)), bin(bitAnd(n2, ip1)), bin(bitAnd(ip2, n2)), bin(bitAnd(n2, ip2)),
       bin(bitOr(ip1, n1)), bin(bitOr(n1, ip1)), bin(bitOr(ip2, n1)), bin(bitOr(n1, ip2)),
       bin(bitOr(ip1, n2)), bin(bitOr(n2, ip1)), bin(bitOr(ip2, n2)), bin(bitOr(n2, ip2));
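For context, a minimal standalone sketch of the same idea, assuming `bitAnd` accepts an IPv6 value together with a UInt128 mask as exercised by the test above; the address and mask below are illustrative, not taken from the test.

```sql
-- Mask out everything except the low 64 bits of an IPv6 address; bin() shows the result in binary.
WITH
    toIPv6('2001:db8::1') AS ip,
    CAST('18446744073709551615', 'UInt128') AS low64_mask  -- 2^64 - 1, illustrative
SELECT bin(bitAnd(ip, low64_mask));
```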
@ -0,0 +1,216 @@
1 901 19
1 911 19
1 921 19
1 931 19
1 941 19
1 951 20
1 961 20
1 971 20
1 981 20
1 991 20
2 902 19
2 912 19
2 922 19
2 932 19
2 942 19
2 952 20
2 962 20
2 972 20
2 982 20
2 992 20
3 903 19
3 913 19
3 923 19
3 933 19
3 943 19
3 953 20
3 963 20
3 973 20
3 983 20
3 993 20
4 904 19
4 914 19
4 924 19
4 934 19
4 944 19
4 954 20
4 964 20
4 974 20
4 984 20
4 994 20
5 905 19
5 915 19
5 925 19
5 935 19
5 945 19
5 955 20
5 965 20
5 975 20
5 985 20
5 995 20
6 906 19
6 916 19
6 926 19
6 936 19
6 946 19
6 956 20
6 966 20
6 976 20
6 986 20
6 996 20
7 907 19
7 917 19
7 927 19
7 937 19
7 947 19
7 957 20
7 967 20
7 977 20
7 987 20
7 997 20
8 908 19
8 918 19
8 928 19
8 938 19
8 948 19
8 958 20
8 968 20
8 978 20
8 988 20
8 998 20
9 909 19
9 919 19
9 929 19
9 939 19
9 949 19
9 959 20
9 969 20
9 979 20
9 989 20
9 999 20
1 1301 19
1 1311 19
1 1321 19
1 1331 19
1 1341 19
1 1351 19
1 1361 19
1 1371 20
1 1381 20
1 1391 20
1 1401 20
1 1411 20
1 1421 20
1 1431 20
2 1302 19
2 1312 19
2 1322 19
2 1332 19
2 1342 19
2 1352 19
2 1362 19
2 1372 20
2 1382 20
2 1392 20
2 1402 20
2 1412 20
2 1422 20
2 1432 20
3 1303 19
3 1313 19
3 1323 19
3 1333 19
3 1343 19
3 1353 19
3 1363 19
3 1373 20
3 1383 20
3 1393 20
3 1403 20
3 1413 20
3 1423 20
3 1433 20
4 1304 19
4 1314 19
4 1324 19
4 1334 19
4 1344 19
4 1354 19
4 1364 19
4 1374 20
4 1384 20
4 1394 20
4 1404 20
4 1414 20
4 1424 20
4 1434 20
5 1305 19
5 1315 19
5 1325 19
5 1335 19
5 1345 19
5 1355 19
5 1365 19
5 1375 20
5 1385 20
5 1395 20
5 1405 20
5 1415 20
5 1425 20
5 1435 20
6 1306 19
6 1316 19
6 1326 19
6 1336 19
6 1346 19
6 1356 19
6 1366 19
6 1376 20
6 1386 20
6 1396 20
6 1406 20
6 1416 20
6 1426 20
6 1436 20
7 1307 19
7 1317 19
7 1327 19
7 1337 19
7 1347 19
7 1357 19
7 1367 19
7 1377 20
7 1387 20
7 1397 20
7 1407 20
7 1417 20
7 1427 20
7 1437 20
8 1308 19
8 1318 19
8 1328 19
8 1338 19
8 1348 19
8 1358 19
8 1368 19
8 1378 20
8 1388 20
8 1398 20
8 1408 20
8 1418 20
8 1428 20
8 1438 20
9 1309 19
9 1319 19
9 1329 19
9 1339 19
9 1349 19
9 1359 19
9 1369 19
9 1379 20
9 1389 20
9 1399 20
9 1409 20
9 1419 20
9 1429 20
9 1439 20
@ -0,0 +1,158 @@
DROP TABLE IF EXISTS posts;
DROP TABLE IF EXISTS post_metrics;

CREATE TABLE IF NOT EXISTS posts
(
    `page_id` LowCardinality(String),
    `post_id` String CODEC(LZ4),
    `host_id` UInt32 CODEC(T64, LZ4),
    `path_id` UInt32,
    `created` DateTime CODEC(T64, LZ4),
    `as_of` DateTime CODEC(T64, LZ4)
)
ENGINE = ReplacingMergeTree(as_of)
PARTITION BY toStartOfMonth(created)
ORDER BY (page_id, post_id)
TTL created + toIntervalMonth(26);


INSERT INTO posts SELECT
    repeat('a', (number % 10) + 1),
    toString(number),
    number % 10,
    number,
    now() - toIntervalMinute(number),
    now()
FROM numbers(1000);


CREATE TABLE IF NOT EXISTS post_metrics
(
    `page_id` LowCardinality(String),
    `post_id` String CODEC(LZ4),
    `created` DateTime CODEC(T64, LZ4),
    `impressions` UInt32 CODEC(T64, LZ4),
    `clicks` UInt32 CODEC(T64, LZ4),
    `as_of` DateTime CODEC(T64, LZ4)
)
ENGINE = ReplacingMergeTree(as_of)
PARTITION BY toStartOfMonth(created)
ORDER BY (page_id, post_id)
TTL created + toIntervalMonth(26);


INSERT INTO post_metrics SELECT
    repeat('a', (number % 10) + 1),
    toString(number),
    now() - toIntervalMinute(number),
    number * 100,
    number * 10,
    now()
FROM numbers(1000);


SELECT
    host_id,
    path_id,
    max(rank) AS rank
FROM
(
    WITH
        as_of_posts AS
        (
            SELECT
                *,
                row_number() OVER (PARTITION BY (page_id, post_id) ORDER BY as_of DESC) AS row_num
            FROM posts
            WHERE (created >= subtractHours(now(), 24)) AND (host_id > 0)
        ),
        as_of_post_metrics AS
        (
            SELECT
                *,
                row_number() OVER (PARTITION BY (page_id, post_id) ORDER BY as_of DESC) AS row_num
            FROM post_metrics
            WHERE created >= subtractHours(now(), 24)
        )
    SELECT
        page_id,
        post_id,
        host_id,
        path_id,
        impressions,
        clicks,
        ntile(20) OVER (PARTITION BY page_id ORDER BY clicks ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS rank
    FROM as_of_posts
    GLOBAL LEFT JOIN as_of_post_metrics USING (page_id, post_id, row_num)
    WHERE (row_num = 1) AND (impressions > 0)
) AS t
WHERE t.rank > 18
GROUP BY
    host_id,
    path_id
ORDER BY host_id, path_id;


INSERT INTO posts SELECT
    repeat('a', (number % 10) + 1),
    toString(number),
    number % 10,
    number,
    now() - toIntervalMinute(number),
    now()
FROM numbers(100000);


INSERT INTO post_metrics SELECT
    repeat('a', (number % 10) + 1),
    toString(number),
    now() - toIntervalMinute(number),
    number * 100,
    number * 10,
    now()
FROM numbers(100000);


SELECT
    host_id,
    path_id,
    max(rank) AS rank
FROM
(
    WITH
        as_of_posts AS
        (
            SELECT
                *,
                row_number() OVER (PARTITION BY (page_id, post_id) ORDER BY as_of DESC) AS row_num
            FROM posts
            WHERE (created >= subtractHours(now(), 24)) AND (host_id > 0)
        ),
        as_of_post_metrics AS
        (
            SELECT
                *,
                row_number() OVER (PARTITION BY (page_id, post_id) ORDER BY as_of DESC) AS row_num
            FROM post_metrics
            WHERE created >= subtractHours(now(), 24)
        )
    SELECT
        page_id,
        post_id,
        host_id,
        path_id,
        impressions,
        clicks,
        ntile(20) OVER (PARTITION BY page_id ORDER BY clicks ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS rank
    FROM as_of_posts
    GLOBAL LEFT JOIN as_of_post_metrics USING (page_id, post_id, row_num)
    WHERE (row_num = 1) AND (impressions > 0)
) AS t
WHERE t.rank > 18
GROUP BY
    host_id,
    path_id
ORDER BY host_id, path_id;

DROP TABLE posts;
DROP TABLE post_metrics;
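For context, a minimal standalone sketch of how `ntile` buckets rows inside each partition, which is the mechanism the query above relies on for ranking; the partitioning expression and bucket count below are illustrative, not taken from the test.

```sql
-- numbers(8) split into two groups; each group of four rows lands in buckets 1 through 4.
SELECT
    number % 2 AS grp,
    number,
    ntile(4) OVER (PARTITION BY grp ORDER BY number ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS bucket
FROM numbers(8)
ORDER BY grp, number;
```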
@ -30,6 +30,7 @@ AppleClang
Approximative
ArrayJoin
ArrowStream
AsyncInsertCacheSize
AsynchronousHeavyMetricsCalculationTimeSpent
AsynchronousHeavyMetricsUpdateInterval
AsynchronousInsert
@ -38,11 +39,6 @@ AsynchronousInsertThreadsActive
AsynchronousMetricsCalculationTimeSpent
AsynchronousMetricsUpdateInterval
AsynchronousReadWait
AsyncInsertCacheSize
TablesLoaderBackgroundThreads
TablesLoaderBackgroundThreadsActive
TablesLoaderForegroundThreads
TablesLoaderForegroundThreadsActive
Authenticator
Authenticators
AutoFDO
@ -887,6 +883,10 @@ TabSeparatedRawWithNamesAndTypes
TabSeparatedWithNames
TabSeparatedWithNamesAndTypes
Tabix
TablesLoaderBackgroundThreads
TablesLoaderBackgroundThreadsActive
TablesLoaderForegroundThreads
TablesLoaderForegroundThreadsActive
TablesToDropQueueSize
TargetSpecific
Telegraf
@ -1233,6 +1233,7 @@ changelogs
charset
charsets
chconn
cheatsheet
checkouting
checksummed
checksumming