Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-11-21 23:21:59 +00:00)

Commit df42e26146: Merge branch 'master' into libcxx-as-submodule
.clang-format

@@ -50,7 +50,8 @@ IncludeCategories:
   - Regex: '.*'
     Priority: 40
 ReflowComments: false
-AlignEscapedNewlinesLeft: true
+AlignEscapedNewlinesLeft: false
+AlignEscapedNewlines: DontAlign
 
 # Not changed:
 AccessModifierOffset: -4
.github/PULL_REQUEST_TEMPLATE.md (19 changed lines, vendored)
@@ -1 +1,20 @@
 I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
+
+For changelog. Remove if this is non-significant change.
+
+Category (leave one):
+- New Feature
+- Bug Fix
+- Improvement
+- Performance Improvement
+- Backward Incompatible Change
+- Build/Testing/Packaging Improvement
+- Other
+
+Short description (up to few sentences):
+
+...
+
+Detailed description (optional):
+
+...
.gitignore (3 changed lines, vendored)
@@ -248,3 +248,6 @@ website/package-lock.json
 
 # Ignore files for locally disabled tests
 /dbms/tests/queries/**/*.disabled
+
+# cquery cache
+/.cquery-cache
.gitmodules (2 changed lines, vendored)
@@ -36,7 +36,7 @@
 	url = https://github.com/ClickHouse-Extras/llvm
 [submodule "contrib/mariadb-connector-c"]
 	path = contrib/mariadb-connector-c
-	url = https://github.com/MariaDB/mariadb-connector-c.git
+	url = https://github.com/ClickHouse-Extras/mariadb-connector-c.git
 [submodule "contrib/jemalloc"]
 	path = contrib/jemalloc
 	url = https://github.com/jemalloc/jemalloc.git
@@ -1,4 +1 @@
-* Настройка `enable_optimize_predicate_expression` выключена по-умолчанию.
 
-### Улучшения:
-* Файлы *-preprocessed.xml записываются в директорию с данными (/var/lib/clickhouse/preprocessed_configs). Для /etc/clickhouse-server больше не нужен +w для пользователя clickhouse. Для удобства создан симлинк /var/lib/clickhouse/preprocessed_configs -> /etc/clickhouse-server/preprocessed
CHANGELOG.md (144 changed lines)
@@ -1,9 +1,131 @@
+## ClickHouse release 18.16.1, 2018-12-21
+
+### Bug fixes:
+
+* Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
+* JIT compilation of aggregate functions now works with LowCardinality columns. [#3838](https://github.com/yandex/ClickHouse/issues/3838)
+
+### Improvements:
+
+* Added the `low_cardinality_allow_in_native_format` setting (enabled by default). When disabled, LowCardinality columns will be converted to ordinary columns for SELECT queries and ordinary columns will be expected for INSERT queries. [#3879](https://github.com/yandex/ClickHouse/pull/3879)
+
+### Build improvements:
+
+* Fixes for builds on macOS and ARM.
+
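A sketch of how the `low_cardinality_allow_in_native_format` entry above might be exercised; the table and column names are hypothetical, and the experimental-type setting shown is an assumption about what was still required in releases of that era:

```sql
SET allow_experimental_low_cardinality_type = 1;  -- assumed: needed while LowCardinality was experimental
CREATE TABLE visits (site LowCardinality(String), hits UInt64) ENGINE = MergeTree ORDER BY site;

-- With the setting disabled, LowCardinality columns are converted to ordinary
-- columns in the Native format, e.g. for older clients.
SET low_cardinality_allow_in_native_format = 0;
SELECT site, sum(hits) FROM visits GROUP BY site;
```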
+## ClickHouse release 18.16.0, 2018-12-14
+
+### New features:
+
+* `DEFAULT` expressions are evaluated for missing fields when loading data in semi-structured input formats (`JSONEachRow`, `TSKV`). [#3555](https://github.com/yandex/ClickHouse/pull/3555)
+* The `ALTER TABLE` query now has the `MODIFY ORDER BY` action for changing the sorting key when adding or removing a table column. This is useful for tables in the `MergeTree` family that perform additional tasks when merging based on this sorting key, such as `SummingMergeTree`, `AggregatingMergeTree`, and so on. [#3581](https://github.com/yandex/ClickHouse/pull/3581) [#3755](https://github.com/yandex/ClickHouse/pull/3755)
+* For tables in the `MergeTree` family, now you can specify a different sorting key (`ORDER BY`) and index (`PRIMARY KEY`). The sorting key can be longer than the index. [#3581](https://github.com/yandex/ClickHouse/pull/3581)
+* Added the `hdfs` table function and the `HDFS` table engine for importing and exporting data to HDFS. [chenxing-xc](https://github.com/yandex/ClickHouse/pull/3617)
+* Added functions for working with base64: `base64Encode`, `base64Decode`, `tryBase64Decode`. [Alexander Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3350)
+* Now you can use a parameter to configure the precision of the `uniqCombined` aggregate function (select the number of HyperLogLog cells). [#3406](https://github.com/yandex/ClickHouse/pull/3406)
+* Added the `system.contributors` table that contains the names of everyone who made commits in ClickHouse. [#3452](https://github.com/yandex/ClickHouse/pull/3452)
+* Added the ability to omit the partition for the `ALTER TABLE ... FREEZE` query in order to back up all partitions at once. [#3514](https://github.com/yandex/ClickHouse/pull/3514)
+* Added `dictGet` and `dictGetOrDefault` functions that don't require specifying the type of return value. The type is determined automatically from the dictionary description. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3564)
+* Now you can specify comments for a column in the table description and change it using `ALTER`. [#3377](https://github.com/yandex/ClickHouse/pull/3377)
+* Reading is supported for `Join` type tables with simple keys. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
+* Now you can specify the options `join_use_nulls`, `max_rows_in_join`, `max_bytes_in_join`, and `join_overflow_mode` when creating a `Join` type table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
+* Added the `joinGet` function that allows you to use a `Join` type table like a dictionary. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
+* Added the `partition_key`, `sorting_key`, `primary_key`, and `sampling_key` columns to the `system.tables` table in order to provide information about table keys. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
+* Added the `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, and `is_in_sampling_key` columns to the `system.columns` table. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
+* Added the `min_time` and `max_time` columns to the `system.parts` table. These columns are populated when the partitioning key is an expression consisting of `DateTime` columns. [Emmanuel Donin de Rosière](https://github.com/yandex/ClickHouse/pull/3800)
+
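A sketch tying together several of the entries above (`MODIFY ORDER BY`, column comments, `FREEZE` without a partition, the base64 functions, and `joinGet`); all table and column names here are hypothetical:

```sql
-- Column comments and a SummingMergeTree sorting key.
CREATE TABLE events
(
    date Date,
    user_id UInt64,
    metric UInt64 COMMENT 'per-row counter'
) ENGINE = SummingMergeTree PARTITION BY toYYYYMM(date) ORDER BY (date, user_id);

-- Add a column and extend the sorting key in the same ALTER.
ALTER TABLE events ADD COLUMN site_id UInt32, MODIFY ORDER BY (date, user_id, site_id);

-- Back up all partitions at once: FREEZE without a PARTITION clause.
ALTER TABLE events FREEZE;

-- base64 helpers; tryBase64Decode returns an empty string instead of throwing.
SELECT base64Encode('ClickHouse') AS enc, base64Decode(enc) AS dec, tryBase64Decode('???') AS bad;

-- A Join table read directly and used as a dictionary via joinGet.
CREATE TABLE user_names (user_id UInt64, name String) ENGINE = Join(ANY, LEFT, user_id);
INSERT INTO user_names VALUES (1, 'alice');
SELECT joinGet('user_names', 'name', toUInt64(1)) AS name;
```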
+### Bug fixes:
+
+* Fixes and performance improvements for the `LowCardinality` data type. `GROUP BY` using `LowCardinality(Nullable(...))`. Getting the values of `extremes`. Processing high-order functions. `LEFT ARRAY JOIN`. Distributed `GROUP BY`. Functions that return `Array`. Execution of `ORDER BY`. Writing to `Distributed` tables (nicelulu). Backward compatibility for `INSERT` queries from old clients that implement the `Native` protocol. Support for `LowCardinality` for `JOIN`. Improved performance when working in a single stream. [#3823](https://github.com/yandex/ClickHouse/pull/3823) [#3803](https://github.com/yandex/ClickHouse/pull/3803) [#3799](https://github.com/yandex/ClickHouse/pull/3799) [#3769](https://github.com/yandex/ClickHouse/pull/3769) [#3744](https://github.com/yandex/ClickHouse/pull/3744) [#3681](https://github.com/yandex/ClickHouse/pull/3681) [#3651](https://github.com/yandex/ClickHouse/pull/3651) [#3649](https://github.com/yandex/ClickHouse/pull/3649) [#3641](https://github.com/yandex/ClickHouse/pull/3641) [#3632](https://github.com/yandex/ClickHouse/pull/3632) [#3568](https://github.com/yandex/ClickHouse/pull/3568) [#3523](https://github.com/yandex/ClickHouse/pull/3523) [#3518](https://github.com/yandex/ClickHouse/pull/3518)
+* Fixed how the `select_sequential_consistency` option works. Previously, when this setting was enabled, an incomplete result was sometimes returned after beginning to write to a new partition. [#2863](https://github.com/yandex/ClickHouse/pull/2863)
+* Databases are correctly specified when executing DDL `ON CLUSTER` queries and `ALTER UPDATE/DELETE`. [#3772](https://github.com/yandex/ClickHouse/pull/3772) [#3460](https://github.com/yandex/ClickHouse/pull/3460)
+* Databases are correctly specified for subqueries inside a VIEW. [#3521](https://github.com/yandex/ClickHouse/pull/3521)
+* Fixed a bug in `PREWHERE` with `FINAL` for `VersionedCollapsingMergeTree`. [7167bfd7](https://github.com/yandex/ClickHouse/commit/7167bfd7b365538f7a91c4307ad77e552ab4e8c1)
+* Now you can use `KILL QUERY` to cancel queries that have not started yet because they are waiting for the table to be locked. [#3517](https://github.com/yandex/ClickHouse/pull/3517)
+* Corrected date and time calculations if the clocks were moved back at midnight (this happens in Iran, and happened in Moscow from 1981 to 1983). Previously, this led to the time being reset a day earlier than necessary, and also caused incorrect formatting of the date and time in text format. [#3819](https://github.com/yandex/ClickHouse/pull/3819)
+* Fixed bugs in some cases of `VIEW` and subqueries that omit the database. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3521)
+* Fixed a race condition when simultaneously reading from a `MATERIALIZED VIEW` and deleting a `MATERIALIZED VIEW` due to not locking the internal `MATERIALIZED VIEW`. [#3404](https://github.com/yandex/ClickHouse/pull/3404) [#3694](https://github.com/yandex/ClickHouse/pull/3694)
+* Fixed the error `Lock handler cannot be nullptr.` [#3689](https://github.com/yandex/ClickHouse/pull/3689)
+* Fixed query processing when the `compile_expressions` option is enabled (it's enabled by default). Nondeterministic constant expressions like the `now` function are no longer unfolded. [#3457](https://github.com/yandex/ClickHouse/pull/3457)
+* Fixed a crash when specifying a non-constant scale argument in `toDecimal32/64/128` functions.
+* Fixed an error when trying to insert an array with `NULL` elements in the `Values` format into a column of type `Array` without `Nullable` (if `input_format_values_interpret_expressions` = 1). [#3487](https://github.com/yandex/ClickHouse/pull/3487) [#3503](https://github.com/yandex/ClickHouse/pull/3503)
+* Fixed continuous error logging in `DDLWorker` if ZooKeeper is not available. [8f50c620](https://github.com/yandex/ClickHouse/commit/8f50c620334988b28018213ec0092fe6423847e2)
+* Fixed the return type for `quantile*` functions from `Date` and `DateTime` types of arguments. [#3580](https://github.com/yandex/ClickHouse/pull/3580)
+* Fixed the `WITH` clause if it specifies a simple alias without expressions. [#3570](https://github.com/yandex/ClickHouse/pull/3570)
+* Fixed processing of queries with named sub-queries and qualified column names when `enable_optimize_predicate_expression` is enabled. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3588)
+* Fixed the error `Attempt to attach to nullptr thread group` when working with materialized views. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3623)
+* Fixed a crash when passing certain incorrect arguments to the `arrayReverse` function. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
+* Fixed the buffer overflow in the `extractURLParameter` function. Improved performance. Added correct processing of strings containing zero bytes. [141e9799](https://github.com/yandex/ClickHouse/commit/141e9799e49201d84ea8e951d1bed4fb6d3dacb5)
+* Fixed buffer overflow in the `lowerUTF8` and `upperUTF8` functions. Removed the ability to execute these functions over `FixedString` type arguments. [#3662](https://github.com/yandex/ClickHouse/pull/3662)
+* Fixed a rare race condition when deleting `MergeTree` tables. [#3680](https://github.com/yandex/ClickHouse/pull/3680)
+* Fixed a race condition when reading from `Buffer` tables and simultaneously performing `ALTER` or `DROP` on the target tables. [#3719](https://github.com/yandex/ClickHouse/pull/3719)
+* Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
+
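For the `KILL QUERY` entry above, the pattern that now also reaches lock-waiting queries is the usual one (the `query_id` value is hypothetical):

```sql
-- Cancel a query that has not started yet because it is waiting for a table lock.
KILL QUERY WHERE query_id = 'some-query-id' ASYNC;
```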
+### Improvements:
+
+* The server does not write the processed configuration files to the `/etc/clickhouse-server/` directory. Instead, it saves them in the `preprocessed_configs` directory inside `path`. This means that the `/etc/clickhouse-server/` directory doesn't have write access for the `clickhouse` user, which improves security. [#2443](https://github.com/yandex/ClickHouse/pull/2443)
+* The `min_merge_bytes_to_use_direct_io` option is set to 10 GiB by default. A merge that forms large parts of tables from the MergeTree family will be performed in `O_DIRECT` mode, which prevents excessive page cache eviction. [#3504](https://github.com/yandex/ClickHouse/pull/3504)
+* Accelerated server start when there is a very large number of tables. [#3398](https://github.com/yandex/ClickHouse/pull/3398)
+* Added a connection pool and HTTP `Keep-Alive` for connections between replicas. [#3594](https://github.com/yandex/ClickHouse/pull/3594)
+* If the query syntax is invalid, the `400 Bad Request` code is returned in the `HTTP` interface (500 was returned previously). [31bc680a](https://github.com/yandex/ClickHouse/commit/31bc680ac5f4bb1d0360a8ba4696fa84bb47d6ab)
+* The `join_default_strictness` option is set to `ALL` by default for compatibility. [120e2cbe](https://github.com/yandex/ClickHouse/commit/120e2cbe2ff4fbad626c28042d9b28781c805afe)
+* Removed logging to `stderr` from the `re2` library for invalid or complex regular expressions. [#3723](https://github.com/yandex/ClickHouse/pull/3723)
+* Added for the `Kafka` table engine: checks for subscriptions before beginning to read from Kafka; the kafka_max_block_size setting for the table. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3396)
+* The `cityHash64`, `farmHash64`, `metroHash64`, `sipHash64`, `halfMD5`, `murmurHash2_32`, `murmurHash2_64`, `murmurHash3_32`, and `murmurHash3_64` functions now work for any number of arguments and for arguments in the form of tuples. [#3451](https://github.com/yandex/ClickHouse/pull/3451) [#3519](https://github.com/yandex/ClickHouse/pull/3519)
+* The `arrayReverse` function now works with any types of arrays. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
+* Added an optional parameter: the slot size for the `timeSlots` function. [Kirill Shvakov](https://github.com/yandex/ClickHouse/pull/3724)
+* For `FULL` and `RIGHT JOIN`, the `max_block_size` setting is used for a stream of non-joined data from the right table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3699)
+* Added the `--secure` command line parameter in `clickhouse-benchmark` and `clickhouse-performance-test` to enable TLS. [#3688](https://github.com/yandex/ClickHouse/pull/3688) [#3690](https://github.com/yandex/ClickHouse/pull/3690)
+* Type conversion when the structure of a `Buffer` type table does not match the structure of the destination table. [Vitaly Baranov](https://github.com/yandex/ClickHouse/pull/3603)
+* Added the `tcp_keep_alive_timeout` option to enable keep-alive packets after inactivity for the specified time interval. [#3441](https://github.com/yandex/ClickHouse/pull/3441)
+* Removed unnecessary quoting of values for the partition key in the `system.parts` table if it consists of a single column. [#3652](https://github.com/yandex/ClickHouse/pull/3652)
+* The modulo function works for `Date` and `DateTime` data types. [#3385](https://github.com/yandex/ClickHouse/pull/3385)
+* Added synonyms for the `POWER`, `LN`, `LCASE`, `UCASE`, `REPLACE`, `LOCATE`, `SUBSTR`, and `MID` functions. [#3774](https://github.com/yandex/ClickHouse/pull/3774) [#3763](https://github.com/yandex/ClickHouse/pull/3763) Some function names are case-insensitive for compatibility with the SQL standard. Added syntactic sugar `SUBSTRING(expr FROM start FOR length)` for compatibility with SQL. [#3804](https://github.com/yandex/ClickHouse/pull/3804)
+* Added the ability to `mlock` memory pages corresponding to `clickhouse-server` executable code to prevent it from being forced out of memory. This feature is disabled by default. [#3553](https://github.com/yandex/ClickHouse/pull/3553)
+* Improved performance when reading from `O_DIRECT` (with the `min_bytes_to_use_direct_io` option enabled). [#3405](https://github.com/yandex/ClickHouse/pull/3405)
+* Improved performance of the `dictGet...OrDefault` function for a constant key argument and a non-constant default argument. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3563)
+* The `firstSignificantSubdomain` function now processes the domains `gov`, `mil`, and `edu`. [Igor Hatarist](https://github.com/yandex/ClickHouse/pull/3601) Improved performance. [#3628](https://github.com/yandex/ClickHouse/pull/3628)
+* Ability to specify custom environment variables for starting `clickhouse-server` using the `SYS-V init.d` script by defining `CLICKHOUSE_PROGRAM_ENV` in `/etc/default/clickhouse`.
+[Pavlo Bashynskyi](https://github.com/yandex/ClickHouse/pull/3612)
+* Correct return code for the clickhouse-server init script. [#3516](https://github.com/yandex/ClickHouse/pull/3516)
+* The `system.metrics` table now has the `VersionInteger` metric, and `system.build_options` has the added line `VERSION_INTEGER`, which contains the numeric form of the ClickHouse version, such as `18016000`. [#3644](https://github.com/yandex/ClickHouse/pull/3644)
+* Removed the ability to compare the `Date` type with a number to avoid potential errors like `date = 2018-12-17`, where quotes around the date are omitted by mistake. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
+* Fixed the behavior of stateful functions like `rowNumberInAllBlocks`. They previously output a result that was one number larger due to starting during query analysis. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3729)
+* If the `force_restore_data` file can't be deleted, an error message is displayed. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3794)
+
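A few of the SQL-visible improvements above, sketched as one query; the literal values are illustrative only:

```sql
SELECT
    SUBSTRING('ClickHouse' FROM 1 FOR 5) AS prefix,  -- SQL-standard syntax
    POWER(2, 10) AS p,                               -- case-insensitive synonym of pow
    toDate('2018-12-17') % 7 AS bucket,              -- modulo now works for Date
    timeSlots(toDateTime('2018-12-17 10:00:00'), toUInt32(3600), toUInt32(900)) AS slots;  -- optional slot size
```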
+### Build improvements:
+
+* Updated the `jemalloc` library, which fixes a potential memory leak. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3557)
+* Profiling with `jemalloc` is enabled by default for debug builds. [2cc82f5c](https://github.com/yandex/ClickHouse/commit/2cc82f5cbe266421cd4c1165286c2c47e5ffcb15)
+* Added the ability to run integration tests when only `Docker` is installed on the system. [#3650](https://github.com/yandex/ClickHouse/pull/3650)
+* Added the fuzz expression test in SELECT queries. [#3442](https://github.com/yandex/ClickHouse/pull/3442)
+* Added a stress test for commits, which performs functional tests in parallel and in random order to detect more race conditions. [#3438](https://github.com/yandex/ClickHouse/pull/3438)
+* Improved the method for starting clickhouse-server in a Docker image. [Elghazal Ahmed](https://github.com/yandex/ClickHouse/pull/3663)
+* For a Docker image, added support for initializing databases using files in the `/docker-entrypoint-initdb.d` directory. [Konstantin Lebedev](https://github.com/yandex/ClickHouse/pull/3695)
+* Fixes for builds on ARM. [#3709](https://github.com/yandex/ClickHouse/pull/3709)
+
+### Backward incompatible changes:
+
+* Removed the ability to compare the `Date` type with a number. Instead of `toDate('2018-12-18') = 17883`, you must use explicit type conversion `= toDate(17883)` [#3687](https://github.com/yandex/ClickHouse/pull/3687)
+
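The backward incompatible change above in practice; the table name is hypothetical:

```sql
-- No longer accepted: WHERE date = 17883
SELECT count() FROM events WHERE date = toDate(17883);  -- explicit conversion is now required
```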
+## ClickHouse release 18.14.19, 2018-12-19
+
+### Bug fixes:
+
+* Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
+* Databases are correctly specified when executing DDL `ON CLUSTER` queries. [#3460](https://github.com/yandex/ClickHouse/pull/3460)
+* Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
+
+### Build improvements:
+
+* Fixes for builds on ARM.
+
 ## ClickHouse release 18.14.18, 2018-12-04
 
 ### Bug fixes:
 * Fixed error in `dictGet...` function for dictionaries of type `range`, if one of the arguments is constant and other is not. [#3751](https://github.com/yandex/ClickHouse/pull/3751)
 * Fixed error that caused messages `netlink: '...': attribute type 1 has an invalid length` to be printed in Linux kernel log, that was happening only on fresh enough versions of Linux kernel. [#3749](https://github.com/yandex/ClickHouse/pull/3749)
-* Fixed segfault in function `empty` for argument of `FixedString` type. [#3703](https://github.com/yandex/ClickHouse/pull/3703)
+* Fixed segfault in function `empty` for argument of `FixedString` type. [Daniel, Dao Quang Minh](https://github.com/yandex/ClickHouse/pull/3703)
 * Fixed excessive memory allocation when using large value of `max_query_size` setting (a memory chunk of `max_query_size` bytes was preallocated at once). [#3720](https://github.com/yandex/ClickHouse/pull/3720)
 
 ### Build changes:
@@ -90,7 +212,7 @@
 
 ### Improvements:
 
-* Significantly reduced memory consumption for requests with `ORDER BY` and `LIMIT`. See the `max_bytes_before_remerge_sort` setting. [#3205](https://github.com/yandex/ClickHouse/pull/3205)
+* Significantly reduced memory consumption for queries with `ORDER BY` and `LIMIT`. See the `max_bytes_before_remerge_sort` setting. [#3205](https://github.com/yandex/ClickHouse/pull/3205)
 * In the absence of `JOIN` (`LEFT`, `INNER`, ...), `INNER JOIN` is assumed. [#3147](https://github.com/yandex/ClickHouse/pull/3147)
 * Qualified asterisks work correctly in queries with `JOIN`. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3202)
 * The `ODBC` table engine correctly chooses the method for quoting identifiers in the SQL dialect of a remote database. [Alexandr Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3210)
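The `max_bytes_before_remerge_sort` entry above might be exercised like this; the threshold and table name are illustrative:

```sql
-- Re-merge sorted blocks during ORDER BY ... LIMIT to bound memory usage.
SET max_bytes_before_remerge_sort = 1000000000;
SELECT user_id, count() AS c FROM events GROUP BY user_id ORDER BY c DESC LIMIT 100;
```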
@@ -127,7 +249,7 @@
 * If after merging data parts, the checksum for the resulting part differs from the result of the same merge in another replica, the result of the merge is deleted and the data part is downloaded from the other replica (this is the correct behavior). But after downloading the data part, it couldn't be added to the working set because of an error that the part already exists (because the data part was deleted with some delay after the merge). This led to cyclical attempts to download the same data. [#3194](https://github.com/yandex/ClickHouse/pull/3194)
 * Fixed incorrect calculation of total memory consumption by queries (because of incorrect calculation, the `max_memory_usage_for_all_queries` setting worked incorrectly and the `MemoryTracking` metric had an incorrect value). This error occurred in version 18.12.13. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3344)
 * Fixed the functionality of `CREATE TABLE ... ON CLUSTER ... AS SELECT ...` This error occurred in version 18.12.13. [#3247](https://github.com/yandex/ClickHouse/pull/3247)
-* Fixed unnecessary preparation of data structures for `JOIN`s on the server that initiates the request if the `JOIN` is only performed on remote servers. [#3340](https://github.com/yandex/ClickHouse/pull/3340)
+* Fixed unnecessary preparation of data structures for `JOIN`s on the server that initiates the query if the `JOIN` is only performed on remote servers. [#3340](https://github.com/yandex/ClickHouse/pull/3340)
 * Fixed bugs in the `Kafka` engine: deadlocks after exceptions when starting to read data, and locks upon completion [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3215).
 * For `Kafka` tables, the optional `schema` parameter was not passed (the schema of the `Cap'n'Proto` format). [Vojtech Splichal](https://github.com/yandex/ClickHouse/pull/3150)
 * If the ensemble of ZooKeeper servers has servers that accept the connection but then immediately close it instead of responding to the handshake, ClickHouse chooses to connect another server. Previously, this produced the error `Cannot read all data. Bytes read: 0. Bytes expected: 4.` and the server couldn't start. [8218cf3a](https://github.com/yandex/ClickHouse/commit/8218cf3a5f39a43401953769d6d12a0bb8d29da9)
@@ -208,7 +330,7 @@
 
 * Added the `DECIMAL(digits, scale)` data type (`Decimal32(scale)`, `Decimal64(scale)`, `Decimal128(scale)`). To enable it, use the setting `allow_experimental_decimal_type`. [#2846](https://github.com/yandex/ClickHouse/pull/2846) [#2970](https://github.com/yandex/ClickHouse/pull/2970) [#3008](https://github.com/yandex/ClickHouse/pull/3008) [#3047](https://github.com/yandex/ClickHouse/pull/3047)
 * New `WITH ROLLUP` modifier for `GROUP BY` (alternative syntax: `GROUP BY ROLLUP(...)`). [#2948](https://github.com/yandex/ClickHouse/pull/2948)
-* In requests with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2787)
+* In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2787)
 * Added support for JOIN with table functions. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2907)
 * Autocomplete by pressing Tab in clickhouse-client. [Sergey Shcherbin](https://github.com/yandex/ClickHouse/pull/2447)
 * Ctrl+C in clickhouse-client clears a query that was entered. [#2877](https://github.com/yandex/ClickHouse/pull/2877)
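A sketch of the `DECIMAL` and `WITH ROLLUP` entries above; the table is hypothetical, and the `SET` reflects that the type was gated behind an experimental setting at the time:

```sql
SET allow_experimental_decimal_type = 1;
CREATE TABLE payments (d Date, city String, amount Decimal64(2)) ENGINE = MergeTree ORDER BY d;
SELECT city, sum(amount) AS total FROM payments GROUP BY city WITH ROLLUP;  -- adds a grand-total row
```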
@@ -294,7 +416,7 @@
 
 ### Backward incompatible changes:
 
-* In requests with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level.
+* In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level.
 
 ### Build changes:
 
@@ -338,7 +460,7 @@
 * Fixed an error for concurrent `Set` or `Join`. [Amos Bird](https://github.com/yandex/ClickHouse/pull/2823)
 * Fixed the `Block structure mismatch in UNION stream: different number of columns` error that occurred for `UNION ALL` queries inside a sub-query if one of the `SELECT` queries contains duplicate column names. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2094)
 * Fixed a memory leak if an exception occurred when connecting to a MySQL server.
-* Fixed incorrect clickhouse-client response code in case of a request error.
+* Fixed incorrect clickhouse-client response code in case of a query error.
 * Fixed incorrect behavior of materialized views containing DISTINCT. [#2795](https://github.com/yandex/ClickHouse/issues/2795)
 
 ### Backward incompatible changes
@@ -452,7 +574,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si
 * Fixed a problem with a very small timeout for sockets (one second) for reading and writing when sending and downloading replicated data, which made it impossible to download larger parts if there is a load on the network or disk (it resulted in cyclical attempts to download parts). This error occurred in version 1.1.54388.
 * Fixed issues when using chroot in ZooKeeper if you inserted duplicate data blocks in the table.
 * The `has` function now works correctly for an array with Nullable elements ([#2115](https://github.com/yandex/ClickHouse/issues/2115)).
-* The `system.tables` table now works correctly when used in distributed queries. The `metadata_modification_time` and `engine_full` columns are now non-virtual. Fixed an error that occurred if only these columns were requested from the table.
+* The `system.tables` table now works correctly when used in distributed queries. The `metadata_modification_time` and `engine_full` columns are now non-virtual. Fixed an error that occurred if only these columns were queried from the table.
 * Fixed how an empty `TinyLog` table works after inserting an empty data block ([#2563](https://github.com/yandex/ClickHouse/issues/2563)).
 * The `system.zookeeper` table works if the value of the node in ZooKeeper is NULL.
 
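The `has` fix above, as a one-liner:

```sql
SELECT has([1, NULL, 3], NULL) AS found;  -- returns 1 now that Nullable elements are handled
```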
@@ -701,7 +823,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si
 * Added the `parseDateTimeBestEffort`, `parseDateTimeBestEffortOrZero`, and `parseDateTimeBestEffortOrNull` functions to read the DateTime from a string containing text in a wide variety of possible formats.
 * Data can be partially reloaded from external dictionaries during updating (load just the records in which the value of the specified field greater than in the previous download) (Arsen Hakobyan).
 * Added the `cluster` table function. Example: `cluster(cluster_name, db, table)`. The `remote` table function can accept the cluster name as the first argument, if it is specified as an identifier.
-* The `remote` and `cluster` table functions can be used in `INSERT` requests.
+* The `remote` and `cluster` table functions can be used in `INSERT` queries.
 * Added the `create_table_query` and `engine_full` virtual columns to the `system.tables` table. The `metadata_modification_time` column is virtual.
 * Added the `data_path` and `metadata_path` columns to `system.tables` and `system.databases` tables, and added the `path` column to the `system.parts` and `system.parts_columns` tables.
 * Added additional information about merges in the `system.part_log` table.
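A sketch of the `cluster` table function and the best-effort DateTime parsing entries above; the cluster and table names are hypothetical:

```sql
SELECT parseDateTimeBestEffort('Sat, 18 Aug 2018 07:22:16 GMT') AS t;  -- free-form input
SELECT count() FROM cluster('my_cluster', 'default', 'events');
```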
@@ -926,7 +1048,7 @@ This release contains bug fixes for the previous release 1.1.54310:
 ### New features:
 
 * Custom partitioning key for the MergeTree family of table engines.
-* [ Kafka](https://clickhouse.yandex/docs/en/single/index.html#document-table_engines/kafka) table engine.
+* [Kafka](https://clickhouse.yandex/docs/en/operations/table_engines/kafka/) table engine.
 * Added support for loading [CatBoost](https://catboost.yandex/) models and applying them to data stored in ClickHouse.
 * Added support for time zones with non-integer offsets from UTC.
 * Added support for arithmetic operations with time intervals.
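The Kafka engine and custom partitioning key entries above, sketched with hypothetical broker, topic, and table names (the engine took positional arguments at the time):

```sql
CREATE TABLE queue (ts DateTime, msg String)
    ENGINE = Kafka('localhost:9092', 'my_topic', 'my_group', 'JSONEachRow');

-- Custom partitioning key for a MergeTree table.
CREATE TABLE log (d Date, msg String) ENGINE = MergeTree PARTITION BY toMonday(d) ORDER BY d;
```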
@@ -1040,7 +1162,7 @@ This release contains bug fixes for the previous release 1.1.54310:
 
 ### Please note when upgrading:
 
-* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT requests will fail with the message "Merges are processing significantly slower than inserts." Use the ` SELECT * FROM system.merges` request to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting. To do this, go to the <merge_tree> section in config.xml, set `<merge_tree>``<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server.
+* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT queries will fail with the message "Merges are processing significantly slower than inserts." Use the ` SELECT * FROM system.merges` query to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting. To do this, go to the <merge_tree> section in config.xml, set `<merge_tree>``<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server.
 
 ## ClickHouse release 1.1.54284, 2017-08-29
 
@@ -1133,7 +1255,7 @@ This release contains bug fixes for the previous release 1.1.54276:
 ### New features:
 
 * Distributed DDL (for example, `CREATE TABLE ON CLUSTER`)
-* The replicated request `ALTER TABLE CLEAR COLUMN IN PARTITION.`
+* The replicated query `ALTER TABLE CLEAR COLUMN IN PARTITION.`
 * The engine for Dictionary tables (access to dictionary data in the form of a table).
 * Dictionary database engine (this type of database automatically has Dictionary tables available for all the connected external dictionaries).
 * You can check for updates to the dictionary by sending a request to the source.
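The replicated `ALTER TABLE ... CLEAR COLUMN IN PARTITION` query above, with hypothetical table, column, and partition values:

```sql
ALTER TABLE events CLEAR COLUMN site_id IN PARTITION '201812';  -- resets the column to defaults in that partition
```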
CHANGELOG_RU.md (124 changed lines)
@@ -1,9 +1,129 @@
+## ClickHouse release 18.16.1, 2018-12-21
+
+### Исправления ошибок:
+
+* Исправлена проблема, приводившая к невозможности обновить словари с источником ODBC. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
+* JIT-компиляция агрегатных функций теперь работает с LowCardinality столбцами. [#3838](https://github.com/yandex/ClickHouse/issues/3838)
+
+### Улучшения:
+
+* Добавлена настройка `low_cardinality_allow_in_native_format` (по умолчанию включена). Если её выключить, столбцы LowCardinality в Native формате будут преобразовываться в соответствующий обычный тип при SELECT и из этого типа при INSERT. [#3879](https://github.com/yandex/ClickHouse/pull/3879)
+
+### Улучшения сборки:
+* Исправления сборки под macOS и ARM.
+
+## ClickHouse release 18.16.0, 2018-12-14
+
+### Новые возможности:
+
+* Вычисление `DEFAULT` выражений для отсутствующих полей при загрузке данных в полуструктурированных форматах (`JSONEachRow`, `TSKV`). [#3555](https://github.com/yandex/ClickHouse/pull/3555)
+* Для запроса `ALTER TABLE` добавлено действие `MODIFY ORDER BY` для изменения ключа сортировки при одновременном добавлении или удалении столбца таблицы. Это полезно для таблиц семейства `MergeTree`, выполняющих дополнительную работу при слияниях, согласно этому ключу сортировки, как например, `SummingMergeTree`, `AggregatingMergeTree` и т. п. [#3581](https://github.com/yandex/ClickHouse/pull/3581) [#3755](https://github.com/yandex/ClickHouse/pull/3755)
+* Для таблиц семейства `MergeTree` появилась возможность указать различный ключ сортировки (`ORDER BY`) и индекс (`PRIMARY KEY`). Ключ сортировки может быть длиннее, чем индекс. [#3581](https://github.com/yandex/ClickHouse/pull/3581)
+* Добавлена табличная функция `hdfs` и движок таблиц `HDFS` для импорта и экспорта данных в HDFS. [chenxing-xc](https://github.com/yandex/ClickHouse/pull/3617)
+* Добавлены функции для работы с base64: `base64Encode`, `base64Decode`, `tryBase64Decode`. [Alexander Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3350)
+* Для агрегатной функции `uniqCombined` появилась возможность настраивать точность работы с помощью параметра (выбирать количество ячеек HyperLogLog). [#3406](https://github.com/yandex/ClickHouse/pull/3406)
+* Добавлена таблица `system.contributors`, содержащая имена всех, кто делал коммиты в ClickHouse. [#3452](https://github.com/yandex/ClickHouse/pull/3452)
+* Добавлена возможность не указывать партицию для запроса `ALTER TABLE ... FREEZE` для бэкапа сразу всех партиций. [#3514](https://github.com/yandex/ClickHouse/pull/3514)
+* Добавлены функции `dictGet`, `dictGetOrDefault` без указания типа возвращаемого значения. Тип определяется автоматически из описания словаря. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3564)
+* Возможность указания комментария для столбца в описании таблицы и изменения его с помощью `ALTER`. [#3377](https://github.com/yandex/ClickHouse/pull/3377)
+* Возможность чтения из таблицы типа `Join` в случае простых ключей. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
+* Возможность указания настроек `join_use_nulls`, `max_rows_in_join`, `max_bytes_in_join`, `join_overflow_mode` при создании таблицы типа `Join`. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
+* Добавлена функция `joinGet`, позволяющая использовать таблицы типа `Join` как словарь. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
+* Добавлены столбцы `partition_key`, `sorting_key`, `primary_key`, `sampling_key` в таблицу `system.tables`, позволяющие получить информацию о ключах таблицы. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
+* Добавлены столбцы `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, `is_in_sampling_key` в таблицу `system.columns`. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
+* Добавлены столбцы `min_time`, `max_time` в таблицу `system.parts`. Эти столбцы заполняются, если ключ партиционирования является выражением от столбцов типа `DateTime`. [Emmanuel Donin de Rosière](https://github.com/yandex/ClickHouse/pull/3800)
+
+### Исправления ошибок:
+
+* Исправления и улучшения производительности для типа данных `LowCardinality`. `GROUP BY` по `LowCardinality(Nullable(...))`. Получение `extremes` значений. Выполнение функций высшего порядка. `LEFT ARRAY JOIN`. Распределённый `GROUP BY`. Функции, возвращающие `Array`. Выполнение `ORDER BY`. Запись в `Distributed` таблицы (nicelulu). Обратная совместимость для запросов `INSERT` от старых клиентов, реализующих `Native` протокол. Поддержка `LowCardinality` для `JOIN`. Производительность при работе в один поток. [#3823](https://github.com/yandex/ClickHouse/pull/3823) [#3803](https://github.com/yandex/ClickHouse/pull/3803) [#3799](https://github.com/yandex/ClickHouse/pull/3799) [#3769](https://github.com/yandex/ClickHouse/pull/3769) [#3744](https://github.com/yandex/ClickHouse/pull/3744) [#3681](https://github.com/yandex/ClickHouse/pull/3681) [#3651](https://github.com/yandex/ClickHouse/pull/3651) [#3649](https://github.com/yandex/ClickHouse/pull/3649) [#3641](https://github.com/yandex/ClickHouse/pull/3641) [#3632](https://github.com/yandex/ClickHouse/pull/3632) [#3568](https://github.com/yandex/ClickHouse/pull/3568) [#3523](https://github.com/yandex/ClickHouse/pull/3523) [#3518](https://github.com/yandex/ClickHouse/pull/3518)
+* Исправлена работа настройки `select_sequential_consistency`. Ранее, при включенной настройке, после начала записи в новую партицию, мог возвращаться неполный результат. [#2863](https://github.com/yandex/ClickHouse/pull/2863)
+* Корректное указание базы данных при выполнении DDL запросов `ON CLUSTER`, а также при выполнении `ALTER UPDATE/DELETE`. [#3772](https://github.com/yandex/ClickHouse/pull/3772) [#3460](https://github.com/yandex/ClickHouse/pull/3460)
+* Корректное указание базы данных для подзапросов внутри VIEW. [#3521](https://github.com/yandex/ClickHouse/pull/3521)
+* Исправлена работа `PREWHERE` с `FINAL` для `VersionedCollapsingMergeTree`. [7167bfd7](https://github.com/yandex/ClickHouse/commit/7167bfd7b365538f7a91c4307ad77e552ab4e8c1)
+* Возможность с помощью запроса `KILL QUERY` отмены запросов, которые ещё не начали выполняться из-за ожидания блокировки таблицы. [#3517](https://github.com/yandex/ClickHouse/pull/3517)
+* Исправлены расчёты с датой и временем в случае, если стрелки часов были переведены назад в полночь (это происходит в Иране, а также было Москве с 1981 по 1983 год). Ранее это приводило к тому, что стрелки часов переводились на сутки раньше, чем нужно, а также приводило к некорректному форматированию даты-с-временем в текстовом виде. [#3819](https://github.com/yandex/ClickHouse/pull/3819)
+* Исправлена работа некоторых случаев `VIEW` и подзапросов без указания базы данных. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3521)
+* Исправлен race condition при одновременном чтении из `MATERIALIZED VIEW` и удалением `MATERIALIZED VIEW` из-за отсутствия блокировки внутренней таблицы `MATERIALIZED VIEW`. [#3404](https://github.com/yandex/ClickHouse/pull/3404) [#3694](https://github.com/yandex/ClickHouse/pull/3694)
+* Исправлена ошибка `Lock handler cannot be nullptr.` [#3689](https://github.com/yandex/ClickHouse/pull/3689)
+* Исправления выполнения запросов при включенной настройке `compile_expressions` (включена по-умолчанию) - убрана свёртка недетерминированных константных выражений, как например, функции `now`. [#3457](https://github.com/yandex/ClickHouse/pull/3457)
+* Исправлено падение при указании неконстантного аргумента scale в функциях `toDecimal32/64/128`.
+* Исправлена ошибка при попытке вставки в формате `Values` массива с `NULL` элементами в столбец типа `Array` без `Nullable` (в случае `input_format_values_interpret_expressions` = 1). [#3487](https://github.com/yandex/ClickHouse/pull/3487) [#3503](https://github.com/yandex/ClickHouse/pull/3503)
+* Исправлено непрерывное логгирование ошибок в `DDLWorker`, если ZooKeeper недоступен. [8f50c620](https://github.com/yandex/ClickHouse/commit/8f50c620334988b28018213ec0092fe6423847e2)
+* Исправлен тип возвращаемого значения для функций `quantile*` от аргументов типа `Date` и `DateTime`. [#3580](https://github.com/yandex/ClickHouse/pull/3580)
+* Исправлена работа секции `WITH`, если она задаёт простой алиас без выражений. [#3570](https://github.com/yandex/ClickHouse/pull/3570)
+* Исправлена обработка запросов с именованными подзапросами и квалифицированными именами столбцов при включенной настройке `enable_optimize_predicate_expression`. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3588)
+* Исправлена ошибка `Attempt to attach to nullptr thread group` при работе материализованных представлений. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3623)
+* Исправлено падение при передаче некоторых некорректных аргументов в функцию `arrayReverse`. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
+* Исправлен buffer overflow в функции `extractURLParameter`. Увеличена производительность. Добавлена корректная обработка строк, содержащих нулевые байты. [141e9799](https://github.com/yandex/ClickHouse/commit/141e9799e49201d84ea8e951d1bed4fb6d3dacb5)
+* Исправлен buffer overflow в функциях `lowerUTF8`, `upperUTF8`. Удалена возможность выполнения этих функций над аргументами типа `FixedString`. [#3662](https://github.com/yandex/ClickHouse/pull/3662)
+* Исправлен редкий race condition при удалении таблиц типа `MergeTree`. [#3680](https://github.com/yandex/ClickHouse/pull/3680)
+* Исправлен race condition при чтении из таблиц типа `Buffer` и одновременном `ALTER` либо `DROP` таблиц назначения. [#3719](https://github.com/yandex/ClickHouse/pull/3719)
+* Исправлен segfault в случае превышения ограничения `max_temporary_non_const_columns`. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
+
+### Улучшения:
+
+* Обработанные конфигурационные файлы записываются сервером не в `/etc/clickhouse-server/` директорию, а в директорию `preprocessed_configs` внутри `path`. Это позволяет оставить директорию `/etc/clickhouse-server/` недоступной для записи пользователем `clickhouse`, что повышает безопасность. [#2443](https://github.com/yandex/ClickHouse/pull/2443)
+* Настройка `min_merge_bytes_to_use_direct_io` выставлена по-умолчанию в 10 GiB. Слияния, образующие крупные куски таблиц семейства MergeTree, будут производиться в режиме `O_DIRECT`, что исключает вымывание кэша. [#3504](https://github.com/yandex/ClickHouse/pull/3504)
+* Ускорен запуск сервера в случае наличия очень большого количества таблиц. [#3398](https://github.com/yandex/ClickHouse/pull/3398)
+* Добавлен пул соединений и HTTP `Keep-Alive` для соединения между репликами. [#3594](https://github.com/yandex/ClickHouse/pull/3594)
+* В случае ошибки синтаксиса запроса, в `HTTP` интерфейсе возвращается код `400 Bad Request` (ранее возвращался код 500). [31bc680a](https://github.com/yandex/ClickHouse/commit/31bc680ac5f4bb1d0360a8ba4696fa84bb47d6ab)
+* Для настройки `join_default_strictness` выбрано значение по-умолчанию `ALL` для совместимости. [120e2cbe](https://github.com/yandex/ClickHouse/commit/120e2cbe2ff4fbad626c28042d9b28781c805afe)
+* Убрано логгирование в `stderr` из библиотеки `re2` в случае некорректных или сложных регулярных выражений. [#3723](https://github.com/yandex/ClickHouse/pull/3723)
+* Для движка таблиц `Kafka`: проверка наличия подписок перед началом чтения из Kafka; настройка таблицы kafka_max_block_size. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3396)
+* Функции `cityHash64`, `farmHash64`, `metroHash64`, `sipHash64`, `halfMD5`, `murmurHash2_32`, `murmurHash2_64`, `murmurHash3_32`, `murmurHash3_64` теперь работают для произвольного количества аргументов, а также для аргументов-кортежей. [#3451](https://github.com/yandex/ClickHouse/pull/3451) [#3519](https://github.com/yandex/ClickHouse/pull/3519)
+* Функция `arrayReverse` теперь работает с любыми типами массивов. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
+* Добавлен опциональный параметр - размер слота для функции `timeSlots`. [Kirill Shvakov](https://github.com/yandex/ClickHouse/pull/3724)
+* Для `FULL` и `RIGHT JOIN` учитывается настройка `max_block_size` для потока неприсоединённых данных из правой таблицы. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3699)
+* В `clickhouse-benchmark` и `clickhouse-performance-test` добавлен параметр командной строки `--secure` для включения TLS. [#3688](https://github.com/yandex/ClickHouse/pull/3688) [#3690](https://github.com/yandex/ClickHouse/pull/3690)
+* Преобразование типов в случае, если структура таблицы типа `Buffer` не соответствует структуре таблицы назначения. [Vitaly Baranov](https://github.com/yandex/ClickHouse/pull/3603)
+* Добавлена настройка `tcp_keep_alive_timeout` для включения keep-alive пакетов после неактивности в течение указанного интервала времени. [#3441](https://github.com/yandex/ClickHouse/pull/3441)
+* Убрано излишнее квотирование значений ключа партиции в таблице `system.parts`, если он состоит из одного столбца. [#3652](https://github.com/yandex/ClickHouse/pull/3652)
+* Функция деления с остатком работает для типов данных `Date` и `DateTime`. [#3385](https://github.com/yandex/ClickHouse/pull/3385)
+* Добавлены синонимы функций `POWER`, `LN`, `LCASE`, `UCASE`, `REPLACE`, `LOCATE`, `SUBSTR`, `MID`. [#3774](https://github.com/yandex/ClickHouse/pull/3774) [#3763](https://github.com/yandex/ClickHouse/pull/3763) Некоторые имена функций сделаны регистронезависимыми для совместимости со стандартом SQL. Добавлен синтаксический сахар `SUBSTRING(expr FROM start FOR length)` для совместимости с SQL. [#3804](https://github.com/yandex/ClickHouse/pull/3804)
+* Добавлена возможность фиксации (`mlock`) страниц памяти, соответствующих исполняемому коду `clickhouse-server` для предотвращения вытеснения их из памяти. Возможность выключена по-умолчанию. [#3553](https://github.com/yandex/ClickHouse/pull/3553)
+* Увеличена производительность чтения с `O_DIRECT` (с включенной опцией `min_bytes_to_use_direct_io`). [#3405](https://github.com/yandex/ClickHouse/pull/3405)
+* Улучшена производительность работы функции `dictGet...OrDefault` в случае константного аргумента-ключа и неконстантного аргумента-default. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3563)
+* В функции `firstSignificantSubdomain` добавлена обработка доменов `gov`, `mil`, `edu`. [Igor Hatarist](https://github.com/yandex/ClickHouse/pull/3601) Увеличена производительность работы. [#3628](https://github.com/yandex/ClickHouse/pull/3628)
+* Возможность указания произвольных переменных окружения для запуска `clickhouse-server` посредством `SYS-V init.d`-скрипта с помощью указания `CLICKHOUSE_PROGRAM_ENV` в `/etc/default/clickhouse`.
+[Pavlo Bashynskyi](https://github.com/yandex/ClickHouse/pull/3612)
+* Правильный код возврата init-скрипта clickhouse-server. [#3516](https://github.com/yandex/ClickHouse/pull/3516)
+* В таблицу `system.metrics` добавлена метрика `VersionInteger`, а в `system.build_options` добавлена строчка `VERSION_INTEGER`, содержащая версию ClickHouse в числовом представлении, вида `18016000`. [#3644](https://github.com/yandex/ClickHouse/pull/3644)
+* Удалена возможность сравнения типа `Date` с числом, чтобы избежать потенциальных ошибок вида `date = 2018-12-17`, где ошибочно не указаны кавычки вокруг даты. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
+* Исправлено поведение функций с состоянием типа `rowNumberInAllBlocks` - раньше они выдавали число на единицу больше вследствие их запуска во время анализа запроса. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3729)
+* При невозможности удалить файл `force_restore_data`, выводится сообщение об ошибке. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3794)
+
+### Улучшение сборки:
+
+* Обновлена библиотека `jemalloc`, что исправляет потенциальную утечку памяти. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3557)
+* Для debug сборок включено по-умолчанию профилирование `jemalloc`. [2cc82f5c](https://github.com/yandex/ClickHouse/commit/2cc82f5cbe266421cd4c1165286c2c47e5ffcb15)
+* Добавлена возможность запуска интеграционных тестов, при наличии установленным в системе лишь `Docker`. [#3650](https://github.com/yandex/ClickHouse/pull/3650)
+* Добавлен fuzz тест выражений в SELECT запросах. [#3442](https://github.com/yandex/ClickHouse/pull/3442)
+* Добавлен покоммитный стресс-тест, выполняющий функциональные тесты параллельно и в произвольном порядке, позволяющий обнаружить больше race conditions. [#3438](https://github.com/yandex/ClickHouse/pull/3438)
+* Улучшение способа запуска clickhouse-server в Docker образе. [Elghazal Ahmed](https://github.com/yandex/ClickHouse/pull/3663)
+* Для Docker образа добавлена поддержка инициализации базы данных с помощью файлов в директории `/docker-entrypoint-initdb.d`. [Konstantin Lebedev](https://github.com/yandex/ClickHouse/pull/3695)
+* Исправления для сборки под ARM. [#3709](https://github.com/yandex/ClickHouse/pull/3709)
+
+### Обратно несовместимые изменения:
+
+* Удалена возможность сравнения типа `Date` с числом, необходимо вместо `toDate('2018-12-18') = 17883`, использовать явное приведение типов `= toDate(17883)` [#3687](https://github.com/yandex/ClickHouse/pull/3687)
+
+## ClickHouse release 18.14.19, 2018-12-19
+
+### Исправления ошибок:
+
+* Исправлена проблема, приводившая к невозможности обновить словари с источником ODBC. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
+* Исправлен segfault в случае превышения ограничения `max_temporary_non_const_columns`. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
+* Корректное указание базы данных при выполнении DDL запросов `ON CLUSTER`. [#3460](https://github.com/yandex/ClickHouse/pull/3460)
+
+### Улучшения сборки:
+* Исправления сборки под ARM.
+
 ## ClickHouse release 18.14.18, 2018-12-04
 
 ### Исправления ошибок:
 * Исправлена ошибка в функции `dictGet...` для словарей типа `range`, если один из аргументов константный, а другой - нет. [#3751](https://github.com/yandex/ClickHouse/pull/3751)
 * Исправлена ошибка, приводящая к выводу сообщений `netlink: '...': attribute type 1 has an invalid length` в логе ядра Linux, проявляющаяся на достаточно новых ядрах Linux. [#3749](https://github.com/yandex/ClickHouse/pull/3749)
-* Исправлен segfault при выполнении функции `empty` от аргумента типа `FixedString`. [#3703](https://github.com/yandex/ClickHouse/pull/3703)
+* Исправлен segfault при выполнении функции `empty` от аргумента типа `FixedString`. [Daniel, Dao Quang Minh](https://github.com/yandex/ClickHouse/pull/3703)
 * Исправлена избыточная аллокация памяти при большом значении настройки `max_query_size` (кусок памяти размера `max_query_size` выделялся сразу). [#3720](https://github.com/yandex/ClickHouse/pull/3720)
 
 ### Улучшения процесса сборки ClickHouse:

@@ -897,7 +1017,7 @@
 
 ### Новые возможности:
 * Произвольный ключ партиционирования для таблиц семейства MergeTree.
-* Движок таблиц [Kafka](https://clickhouse.yandex/docs/en/single/index.html#document-table_engines/kafka).
+* Движок таблиц [Kafka](https://clickhouse.yandex/docs/en/operations/table_engines/kafka/).
 * Возможность загружать модели [CatBoost](https://catboost.yandex/) и применять их к данным, хранящимся в ClickHouse.
 * Поддержка часовых поясов с нецелым смещением от UTC.
 * Поддержка операций с временными интервалами.
CMakeLists.txt

@@ -25,18 +25,6 @@ endif ()
 # Write compile_commands.json
 set(CMAKE_EXPORT_COMPILE_COMMANDS 1)
 
-set(PARALLEL_COMPILE_JOBS "" CACHE STRING "Define the maximum number of concurrent compilation jobs")
-if (PARALLEL_COMPILE_JOBS)
-    set_property(GLOBAL APPEND PROPERTY JOB_POOLS compile_job_pool="${PARALLEL_COMPILE_JOBS}")
-    set(CMAKE_JOB_POOL_COMPILE compile_job_pool)
-endif ()
-
-set(PARALLEL_LINK_JOBS "" CACHE STRING "Define the maximum number of concurrent link jobs")
-if (LLVM_PARALLEL_LINK_JOBS)
-    set_property(GLOBAL APPEND PROPERTY JOB_POOLS link_job_pool=${PARALLEL_LINK_JOBS})
-    set(CMAKE_JOB_POOL_LINK link_job_pool)
-endif ()
-
 include (cmake/find_ccache.cmake)
 
 if (NOT CMAKE_BUILD_TYPE OR CMAKE_BUILD_TYPE STREQUAL "None")
@@ -127,7 +115,10 @@ endif ()
 
 include (cmake/test_cpu.cmake)
 
-option (ARCH_NATIVE "Enable -march=native compiler flag" ${ARCH_ARM})
+if(NOT COMPILER_CLANG) # clang: error: the clang compiler does not support '-march=native'
+    option(ARCH_NATIVE "Enable -march=native compiler flag" ${ARCH_ARM})
+endif()
+
 if (ARCH_NATIVE)
     set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native")
 endif ()
@ -159,43 +150,8 @@ set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${COMPILER_FLAGS} -fn
|
||||
set (CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} -O3 ${CMAKE_C_FLAGS_ADD}")
|
||||
set (CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -O0 -g3 -ggdb3 -fno-inline ${CMAKE_C_FLAGS_ADD}")
|
||||
|
||||
set(THREADS_PREFER_PTHREAD_FLAG ON)
|
||||
find_package (Threads)
|
||||
|
||||
include (cmake/test_compiler.cmake)
|
||||
|
||||
if (OS_LINUX AND COMPILER_CLANG)
|
||||
option (USE_LIBCXX "Use libc++ and libc++abi instead of libstdc++ (only make sense on Linux with Clang)" ${HAVE_LIBCXX})
|
||||
|
||||
if (USE_LIBCXX)
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++") # Ok for clang6, for older can cause 'not used option' warning
|
||||
set (CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -D_LIBCPP_DEBUG=0") # More checks in debug build.
|
||||
if (MAKE_STATIC_LIBRARIES)
|
||||
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
|
||||
link_libraries (-nodefaultlibs -Wl,-Bstatic -stdlib=libc++ c++ c++abi gcc_eh ${BUILTINS_LIB_PATH} rt -Wl,-Bdynamic dl pthread m c)
|
||||
else ()
|
||||
link_libraries (-stdlib=libc++ c++ c++abi)
|
||||
endif ()
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
if (COMPILER_GCC)
|
||||
set (STATIC_STDLIB_FLAGS "-static-libgcc -static-libstdc++")
|
||||
else ()
|
||||
set (STATIC_STDLIB_FLAGS "")
|
||||
endif ()
|
||||
|
||||
if (MAKE_STATIC_LIBRARIES AND NOT APPLE AND NOT (COMPILER_CLANG AND OS_FREEBSD))
|
||||
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
|
||||
|
||||
# Along with executables, we also build example of shared library for "library dictionary source"; and it also should be self-contained.
|
||||
set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
|
||||
endif ()
|
||||
|
||||
if (USE_STATIC_LIBRARIES AND HAVE_NO_PIE)
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG_NO_PIE}")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${FLAG_NO_PIE}")
|
||||
endif ()
|
||||
include (cmake/use_libcxx.cmake)
|
||||
|
||||
if (NOT MAKE_STATIC_LIBRARIES)
|
||||
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
|
||||
@ -257,6 +213,7 @@ include (cmake/find_odbc.cmake)
|
||||
# openssl, zlib, odbc before poco
|
||||
include (cmake/find_poco.cmake)
|
||||
include (cmake/find_lz4.cmake)
|
||||
include (cmake/find_xxhash.cmake)
|
||||
include (cmake/find_sparsehash.cmake)
|
||||
include (cmake/find_rt.cmake)
|
||||
include (cmake/find_execinfo.cmake)
|
||||
|
@ -1,4 +1,4 @@
|
||||
# ClickHouse
|
||||
[![ClickHouse — open source distributed column-oriented DBMS](https://github.com/yandex/ClickHouse/raw/master/website/images/logo-400x240.png)](https://clickhouse.yandex)
|
||||
|
||||
ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time.
|
||||
|
||||
|
@ -1,4 +1,11 @@
|
||||
option (ENABLE_BASE64 "Enable base64" ON)
|
||||
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/base64/lib/lib.c")
|
||||
set (MISSING_INTERNAL_BASE64_LIBRARY 1)
|
||||
message (WARNING "submodule contrib/base64 is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
endif ()
|
||||
|
||||
if (NOT MISSING_INTERNAL_BASE64_LIBRARY)
|
||||
option (ENABLE_BASE64 "Enable base64" ON)
|
||||
endif ()
|
||||
|
||||
if (ENABLE_BASE64)
|
||||
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/base64")
|
||||
@ -9,4 +16,3 @@ if (ENABLE_BASE64)
|
||||
set (USE_BASE64 1)
|
||||
endif()
|
||||
endif ()
|
||||
|
||||
|
@ -15,12 +15,15 @@ if (NOT USE_INTERNAL_HDFS3_LIBRARY)
|
||||
endif ()
|
||||
|
||||
if (HDFS3_LIBRARY AND HDFS3_INCLUDE_DIR)
|
||||
else ()
|
||||
set(USE_HDFS 1)
|
||||
elseif (LIBGSASL_LIBRARY AND LIBXML2_LIBRARY)
|
||||
set(HDFS3_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include")
|
||||
set(HDFS3_LIBRARY hdfs3)
|
||||
set(USE_HDFS 1)
|
||||
else()
|
||||
set(USE_INTERNAL_HDFS3_LIBRARY 0)
|
||||
endif()
|
||||
set (USE_HDFS 1)
|
||||
|
||||
endif()
|
||||
|
||||
message (STATUS "Using hdfs3: ${HDFS3_INCLUDE_DIR} : ${HDFS3_LIBRARY}")
|
||||
message (STATUS "Using hdfs3=${USE_HDFS}: ${HDFS3_INCLUDE_DIR} : ${HDFS3_LIBRARY}")
|
||||
|
@ -1,16 +1,15 @@
|
||||
option (ENABLE_ICU "Enable ICU" ON)
|
||||
option(ENABLE_ICU "Enable ICU" ON)
|
||||
|
||||
if (ENABLE_ICU)
|
||||
find_package(ICU COMPONENTS data i18n uc) # TODO: remove Modules/FindICU.cmake after cmake 3.7
|
||||
if(ENABLE_ICU)
|
||||
find_package(ICU COMPONENTS i18n uc data) # TODO: remove Modules/FindICU.cmake after cmake 3.7
|
||||
#set (ICU_LIBRARIES ${ICU_I18N_LIBRARY} ${ICU_UC_LIBRARY} ${ICU_DATA_LIBRARY} CACHE STRING "")
|
||||
set (ICU_LIBRARIES ICU::i18n ICU::uc ICU::data CACHE STRING "")
|
||||
if (ICU_FOUND)
|
||||
if(ICU_FOUND)
|
||||
set(USE_ICU 1)
|
||||
endif ()
|
||||
endif ()
|
||||
endif()
|
||||
endif()
|
||||
|
||||
if (USE_ICU)
|
||||
message (STATUS "Using icu=${USE_ICU}: ${ICU_INCLUDE_DIR} : ${ICU_LIBRARIES}")
|
||||
else ()
|
||||
message (STATUS "Build without ICU (support for collations and charset conversion functions will be disabled)")
|
||||
endif ()
|
||||
if(USE_ICU)
|
||||
message(STATUS "Using icu=${USE_ICU}: ${ICU_INCLUDE_DIR} : ${ICU_LIBRARIES}")
|
||||
else()
|
||||
message(STATUS "Build without ICU (support for collations and charset conversion functions will be disabled)")
|
||||
endif()
|
||||
|
@ -1,10 +1,13 @@
|
||||
if (NOT APPLE)
|
||||
if (NOT APPLE AND NOT ARCH_32)
|
||||
option (USE_INTERNAL_LIBGSASL_LIBRARY "Set to FALSE to use system libgsasl library instead of bundled" ${NOT_UNBUNDLED})
|
||||
endif ()
|
||||
|
||||
if (USE_INTERNAL_LIBGSASL_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src/gsasl.h")
|
||||
message (WARNING "submodule contrib/libgsasl is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set (USE_INTERNAL_LIBGSASL_LIBRARY 0)
|
||||
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src/gsasl.h")
|
||||
if (USE_INTERNAL_LIBGSASL_LIBRARY)
|
||||
message (WARNING "submodule contrib/libgsasl is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set (USE_INTERNAL_LIBGSASL_LIBRARY 0)
|
||||
endif ()
|
||||
set (MISSING_INTERNAL_LIBGSASL_LIBRARY 1)
|
||||
endif ()
|
||||
|
||||
if (NOT USE_INTERNAL_LIBGSASL_LIBRARY)
|
||||
@ -13,7 +16,7 @@ if (NOT USE_INTERNAL_LIBGSASL_LIBRARY)
|
||||
endif ()
|
||||
|
||||
if (LIBGSASL_LIBRARY AND LIBGSASL_INCLUDE_DIR)
|
||||
else ()
|
||||
elseif (NOT MISSING_INTERNAL_LIBGSASL_LIBRARY AND NOT APPLE AND NOT ARCH_32)
|
||||
set (LIBGSASL_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src ${ClickHouse_SOURCE_DIR}/contrib/libgsasl/linux_x86_64/include)
|
||||
set (USE_INTERNAL_LIBGSASL_LIBRARY 1)
|
||||
set (LIBGSASL_LIBRARY libgsasl)
|
||||
|
@ -1,8 +1,11 @@
|
||||
option (USE_INTERNAL_LIBXML2_LIBRARY "Set to FALSE to use system libxml2 library instead of bundled" ${NOT_UNBUNDLED})
|
||||
|
||||
if (USE_INTERNAL_LIBXML2_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libxml2/libxml.h")
|
||||
message (WARNING "submodule contrib/libxml2 is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set (USE_INTERNAL_LIBXML2_LIBRARY 0)
|
||||
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libxml2/libxml.h")
|
||||
if (USE_INTERNAL_LIBXML2_LIBRARY)
|
||||
message (WARNING "submodule contrib/libxml2 is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set (USE_INTERNAL_LIBXML2_LIBRARY 0)
|
||||
endif ()
|
||||
set (MISSING_INTERNAL_LIBXML2_LIBRARY 1)
|
||||
endif ()
|
||||
|
||||
if (NOT USE_INTERNAL_LIBXML2_LIBRARY)
|
||||
@ -11,7 +14,7 @@ if (NOT USE_INTERNAL_LIBXML2_LIBRARY)
|
||||
endif ()
|
||||
|
||||
if (LIBXML2_LIBRARY AND LIBXML2_INCLUDE_DIR)
|
||||
else ()
|
||||
elseif (NOT MISSING_INTERNAL_LIBXML2_LIBRARY)
|
||||
set (LIBXML2_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libxml2/include ${ClickHouse_SOURCE_DIR}/contrib/libxml2-cmake/linux_x86_64/include)
|
||||
set (USE_INTERNAL_LIBXML2_LIBRARY 1)
|
||||
set (LIBXML2_LIBRARY libxml2)
|
||||
|
@ -1,4 +1,4 @@
|
||||
if (NOT ARCH_ARM)
|
||||
if (NOT ARCH_ARM AND NOT ARCH_32)
|
||||
option (ENABLE_RDKAFKA "Enable kafka" ON)
|
||||
endif ()
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
#if (OS_LINUX)
|
||||
option (USE_INTERNAL_SSL_LIBRARY "Set to FALSE to use system *ssl library instead of bundled" ${NOT_UNBUNDLED})
|
||||
#endif ()
|
||||
if(NOT ARCH_32)
|
||||
option(USE_INTERNAL_SSL_LIBRARY "Set to FALSE to use system *ssl library instead of bundled" ${NOT_UNBUNDLED})
|
||||
endif()
|
||||
|
||||
set (OPENSSL_USE_STATIC_LIBS ${USE_STATIC_LIBRARIES})
|
||||
|
||||
|
10
cmake/find_xxhash.cmake
Normal file
10
cmake/find_xxhash.cmake
Normal file
@ -0,0 +1,10 @@
|
||||
if (LZ4_INCLUDE_DIR)
|
||||
if (NOT EXISTS "${LZ4_INCLUDE_DIR}/xxhash.h")
|
||||
message (WARNING "LZ4 library does not have XXHash. Support for XXHash will be disabled.")
|
||||
set (USE_XXHASH 0)
|
||||
else ()
|
||||
set (USE_XXHASH 1)
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
message (STATUS "Using xxhash=${USE_XXHASH}")
|
@ -1,4 +1,4 @@
|
||||
if (NOT OS_FREEBSD)
|
||||
if (NOT OS_FREEBSD AND NOT ARCH_32)
|
||||
option (USE_INTERNAL_ZLIB_LIBRARY "Set to FALSE to use system zlib library instead of bundled" ${NOT_UNBUNDLED})
|
||||
endif ()
|
||||
|
||||
@ -8,23 +8,24 @@ endif ()
|
||||
|
||||
if (NOT ZLIB_FOUND)
|
||||
if (NOT MSVC)
|
||||
set (INTERNAL_ZLIB_NAME "zlib-ng")
|
||||
set (INTERNAL_ZLIB_NAME "zlib-ng" CACHE INTERNAL "")
|
||||
else ()
|
||||
set (INTERNAL_ZLIB_NAME "zlib")
|
||||
set (INTERNAL_ZLIB_NAME "zlib" CACHE INTERNAL "")
|
||||
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/${INTERNAL_ZLIB_NAME}")
|
||||
message (WARNING "Will use standard zlib, please clone manually:\n git clone https://github.com/madler/zlib.git ${ClickHouse_SOURCE_DIR}/contrib/${INTERNAL_ZLIB_NAME}")
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
set (USE_INTERNAL_ZLIB_LIBRARY 1)
|
||||
set (ZLIB_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/${INTERNAL_ZLIB_NAME}" "${ClickHouse_BINARY_DIR}/contrib/${INTERNAL_ZLIB_NAME}") # generated zconf.h
|
||||
set (ZLIB_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/${INTERNAL_ZLIB_NAME}" "${ClickHouse_BINARY_DIR}/contrib/${INTERNAL_ZLIB_NAME}" CACHE INTERNAL "") # generated zconf.h
|
||||
set (ZLIB_INCLUDE_DIRS ${ZLIB_INCLUDE_DIR}) # for poco
|
||||
set (ZLIB_INCLUDE_DIRECTORIES ${ZLIB_INCLUDE_DIR}) # for protobuf
|
||||
set (ZLIB_FOUND 1) # for poco
|
||||
if (USE_STATIC_LIBRARIES)
|
||||
set (ZLIB_LIBRARIES zlibstatic)
|
||||
set (ZLIB_LIBRARIES zlibstatic CACHE INTERNAL "")
|
||||
else ()
|
||||
set (ZLIB_LIBRARIES zlib)
|
||||
set (ZLIB_LIBRARIES zlib CACHE INTERNAL "")
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
message (STATUS "Using zlib: ${ZLIB_INCLUDE_DIR} : ${ZLIB_LIBRARIES}")
|
||||
message (STATUS "Using ${INTERNAL_ZLIB_NAME}: ${ZLIB_INCLUDE_DIR} : ${ZLIB_LIBRARIES}")
|
||||
|
37
cmake/limit_jobs.cmake
Normal file
37
cmake/limit_jobs.cmake
Normal file
@ -0,0 +1,37 @@
|
||||
# Usage:
|
||||
# set (MAX_COMPILER_MEMORY 2000 CACHE INTERNAL "") # In megabytes
|
||||
# set (MAX_LINKER_MEMORY 3500 CACHE INTERNAL "")
|
||||
# include (cmake/limit_jobs.cmake)
|
||||
|
||||
cmake_host_system_information(RESULT AVAILABLE_PHYSICAL_MEMORY QUERY AVAILABLE_PHYSICAL_MEMORY) # Not available under freebsd
|
||||
|
||||
option(PARALLEL_COMPILE_JOBS "Define the maximum number of concurrent compilation jobs" "")
|
||||
if (NOT PARALLEL_COMPILE_JOBS AND AVAILABLE_PHYSICAL_MEMORY AND MAX_COMPILER_MEMORY)
|
||||
math(EXPR PARALLEL_COMPILE_JOBS ${AVAILABLE_PHYSICAL_MEMORY}/${MAX_COMPILER_MEMORY})
|
||||
if (NOT PARALLEL_COMPILE_JOBS)
|
||||
set (PARALLEL_COMPILE_JOBS 1)
|
||||
endif ()
|
||||
endif ()
|
||||
if (PARALLEL_COMPILE_JOBS)
|
||||
set(CMAKE_JOB_POOL_COMPILE compile_job_pool${CMAKE_CURRENT_SOURCE_DIR})
|
||||
string (REGEX REPLACE "[^a-zA-Z0-9]+" "_" CMAKE_JOB_POOL_COMPILE ${CMAKE_JOB_POOL_COMPILE})
|
||||
set_property(GLOBAL APPEND PROPERTY JOB_POOLS ${CMAKE_JOB_POOL_COMPILE}=${PARALLEL_COMPILE_JOBS})
|
||||
endif ()
|
||||
|
||||
option(PARALLEL_LINK_JOBS "Define the maximum number of concurrent link jobs" "")
|
||||
if (NOT PARALLEL_LINK_JOBS AND AVAILABLE_PHYSICAL_MEMORY AND MAX_LINKER_MEMORY)
|
||||
math(EXPR PARALLEL_LINK_JOBS ${AVAILABLE_PHYSICAL_MEMORY}/${MAX_LINKER_MEMORY})
|
||||
if (NOT PARALLEL_LINK_JOBS)
|
||||
set (PARALLEL_LINK_JOBS 1)
|
||||
endif ()
|
||||
endif ()
|
||||
if (PARALLEL_COMPILE_JOBS OR PARALLEL_LINK_JOBS)
|
||||
message(STATUS "${CMAKE_CURRENT_SOURCE_DIR}: Have ${AVAILABLE_PHYSICAL_MEMORY} megabytes of memory. Limiting concurrent linkers jobs to ${PARALLEL_LINK_JOBS} and compiler jobs to ${PARALLEL_COMPILE_JOBS}")
|
||||
endif ()
|
||||
|
||||
if (LLVM_PARALLEL_LINK_JOBS)
|
||||
set(CMAKE_JOB_POOL_LINK link_job_pool${CMAKE_CURRENT_SOURCE_DIR})
|
||||
string (REGEX REPLACE "[^a-zA-Z0-9]+" "_" CMAKE_JOB_POOL_LINK ${CMAKE_JOB_POOL_LINK})
|
||||
set_property(GLOBAL APPEND PROPERTY JOB_POOLS ${CMAKE_JOB_POOL_LINK}=${PARALLEL_LINK_JOBS})
|
||||
endif ()
|
||||
|
@ -25,8 +25,8 @@ if (SANITIZE)
|
||||
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libtsan")
|
||||
endif ()
|
||||
elseif (SANITIZE STREQUAL "undefined")
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} -fsanitize=undefined")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} -fsanitize=undefined")
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} -fsanitize=undefined -fno-sanitize-recover=all")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} -fsanitize=undefined -fno-sanitize-recover=all")
|
||||
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=undefined")
|
||||
if (MAKE_STATIC_LIBRARIES AND CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
|
||||
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libubsan")
|
||||
|
49
cmake/use_libcxx.cmake
Normal file
49
cmake/use_libcxx.cmake
Normal file
@ -0,0 +1,49 @@
|
||||
# Uses MAKE_STATIC_LIBRARIES
|
||||
|
||||
|
||||
set(THREADS_PREFER_PTHREAD_FLAG ON)
|
||||
find_package (Threads)
|
||||
|
||||
include (cmake/test_compiler.cmake)
|
||||
include (cmake/arch.cmake)
|
||||
|
||||
if (OS_LINUX AND COMPILER_CLANG)
|
||||
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}")
|
||||
|
||||
option (USE_LIBCXX "Use libc++ and libc++abi instead of libstdc++ (only make sense on Linux with Clang)" ${HAVE_LIBCXX})
|
||||
set (LIBCXX_PATH "" CACHE STRING "Use custom path for libc++. It should be used for MSan.")
|
||||
|
||||
if (USE_LIBCXX)
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++") # Ok for clang6, for older can cause 'not used option' warning
|
||||
set (CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -D_LIBCPP_DEBUG=0") # More checks in debug build.
|
||||
if (MAKE_STATIC_LIBRARIES)
|
||||
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
|
||||
link_libraries (-nodefaultlibs -Wl,-Bstatic -stdlib=libc++ c++ c++abi gcc_eh ${BUILTINS_LIB_PATH} rt -Wl,-Bdynamic dl pthread m c)
|
||||
else ()
|
||||
link_libraries (-stdlib=libc++ c++ c++abi)
|
||||
endif ()
|
||||
|
||||
if (LIBCXX_PATH)
|
||||
# include_directories (SYSTEM BEFORE "${LIBCXX_PATH}/include" "${LIBCXX_PATH}/include/c++/v1")
|
||||
link_directories ("${LIBCXX_PATH}/lib")
|
||||
endif ()
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
if (USE_LIBCXX)
|
||||
set (STATIC_STDLIB_FLAGS "")
|
||||
else ()
|
||||
set (STATIC_STDLIB_FLAGS "-static-libgcc -static-libstdc++")
|
||||
endif ()
|
||||
|
||||
if (MAKE_STATIC_LIBRARIES AND NOT APPLE AND NOT (COMPILER_CLANG AND OS_FREEBSD))
|
||||
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
|
||||
|
||||
# Along with executables, we also build example of shared library for "library dictionary source"; and it also should be self-contained.
|
||||
set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
|
||||
endif ()
|
||||
|
||||
if (USE_STATIC_LIBRARIES AND HAVE_NO_PIE)
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG_NO_PIE}")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${FLAG_NO_PIE}")
|
||||
endif ()
|
5
contrib/CMakeLists.txt
vendored
5
contrib/CMakeLists.txt
vendored
@ -2,9 +2,9 @@
|
||||
|
||||
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-stringop-overflow")
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-implicit-fallthrough -Wno-class-memaccess -std=c++1z")
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-implicit-fallthrough -Wno-class-memaccess -Wno-sign-compare -std=c++1z")
|
||||
elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-format -Wno-parentheses-equality -Wno-tautological-constant-out-of-range-compare -Wno-implicit-function-declaration -Wno-return-type -Wno-pointer-bool-conversion -Wno-enum-conversion")
|
||||
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-format -Wno-parentheses-equality -Wno-tautological-constant-compare -Wno-tautological-constant-out-of-range-compare -Wno-implicit-function-declaration -Wno-return-type -Wno-pointer-bool-conversion -Wno-enum-conversion -Wno-int-conversion -Wno-switch")
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-format -Wno-inconsistent-missing-override -std=c++1z")
|
||||
endif ()
|
||||
|
||||
@ -228,6 +228,7 @@ if (USE_INTERNAL_HDFS3_LIBRARY)
|
||||
if (USE_INTERNAL_PROTOBUF_LIBRARY)
|
||||
set(protobuf_BUILD_TESTS OFF CACHE INTERNAL "" FORCE)
|
||||
set(protobuf_BUILD_SHARED_LIBS OFF CACHE INTERNAL "" FORCE)
|
||||
set(protobuf_WITH_ZLIB 0 CACHE INTERNAL "" FORCE) # actually will use zlib, but skip find
|
||||
add_subdirectory(protobuf/cmake)
|
||||
endif ()
|
||||
add_subdirectory(libhdfs3-cmake)
|
||||
|
@ -1,7 +1,5 @@
|
||||
SET(LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/base64)
|
||||
|
||||
set(base64_compile_instructions "")
|
||||
LIST(LENGTH base64_compile_instructions 0)
|
||||
macro(cast_to_bool var instruction)
|
||||
if (HAVE_${var})
|
||||
set(base64_${var} 1)
|
||||
@ -11,27 +9,20 @@ macro(cast_to_bool var instruction)
|
||||
endif()
|
||||
endmacro()
|
||||
|
||||
cast_to_bool(NEON32 "") # TODO flags
|
||||
cast_to_bool(NEON64 "") # TODO flags
|
||||
cast_to_bool(SSSE3 "-mssse3")
|
||||
cast_to_bool(SSE41 "-msse4.1")
|
||||
cast_to_bool(SSE42 "-msse4.2")
|
||||
cast_to_bool(AVX "-mavx")
|
||||
cast_to_bool(AVX2 "-mavx2")
|
||||
|
||||
# write config.h file, to include it in application
|
||||
file(READ config-header.tpl header)
|
||||
file(WRITE config.h ${header})
|
||||
file(APPEND config.h "#define HAVE_SSSE3 ${base64_SSSE3}\n")
|
||||
file(APPEND config.h "#define HAVE_SSE41 ${base64_SSE41}\n")
|
||||
file(APPEND config.h "#define HAVE_SSE42 ${base64_SSE42}\n")
|
||||
file(APPEND config.h "#define HAVE_AVX ${base64_AVX}\n")
|
||||
file(APPEND config.h "#define HAVE_AVX2 ${base64_AVX2}\n")
|
||||
|
||||
set(HAVE_FAST_UNALIGNED_ACCESS 0)
|
||||
if (${base64_SSSE3} OR ${base64_SSE41} OR ${base64_SSE42} OR ${base64_AVX} OR ${base64_AVX2})
|
||||
if(HAVE_SSSE3 OR HAVE_SSE41 OR HAVE_SSE42 OR HAVE_AVX OR HAVE_AVX2)
|
||||
set(HAVE_FAST_UNALIGNED_ACCESS 1)
|
||||
endif ()
|
||||
|
||||
file(APPEND config.h "#define HAVE_FAST_UNALIGNED_ACCESS " ${HAVE_FAST_UNALIGNED_ACCESS} "\n")
|
||||
configure_file(config.h.in ${CMAKE_CURRENT_BINARY_DIR}/config.h)
|
||||
|
||||
add_library(base64 ${LINK_MODE}
|
||||
${LIBRARY_DIR}/lib/lib.c
|
||||
@ -46,7 +37,7 @@ add_library(base64 ${LINK_MODE}
|
||||
${LIBRARY_DIR}/lib/arch/ssse3/codec.c
|
||||
|
||||
${LIBRARY_DIR}/lib/codecs.h
|
||||
config.h)
|
||||
${CMAKE_CURRENT_BINARY_DIR}/config.h)
|
||||
|
||||
target_compile_options(base64 PRIVATE ${base64_SSSE3_opt} ${base64_SSE41_opt} ${base64_SSE42_opt} ${base64_AVX_opt} ${base64_AVX2_opt})
|
||||
target_include_directories(base64 PRIVATE ${LIBRARY_DIR}/include .)
|
||||
target_include_directories(base64 PRIVATE ${LIBRARY_DIR}/include ${CMAKE_CURRENT_BINARY_DIR})
|
||||
|
@ -1,2 +0,0 @@
|
||||
#define HAVE_NEON32 0
|
||||
#define HAVE_NEON64 0
|
8
contrib/base64-cmake/config.h.in
Normal file
8
contrib/base64-cmake/config.h.in
Normal file
@ -0,0 +1,8 @@
|
||||
#define HAVE_NEON32 @base64_NEON32@
|
||||
#define HAVE_NEON64 @base64_NEON64@
|
||||
#cmakedefine HAVE_SSSE3 @base64_SSSE3@
|
||||
#cmakedefine HAVE_SSE41 @base64_SSE41@
|
||||
#cmakedefine HAVE_SSE42 @base64_SSE42@
|
||||
#cmakedefine HAVE_AVX @base64_AVX@
|
||||
#cmakedefine HAVE_AVX2 @base64_AVX2@
|
||||
#cmakedefine HAVE_FAST_UNALIGNED_ACCESS @HAVE_FAST_UNALIGNED_ACCESS@
|
2
contrib/mariadb-connector-c
vendored
2
contrib/mariadb-connector-c
vendored
@ -1 +1 @@
|
||||
Subproject commit a0fd36cc5a5313414a5a2ebe9322577a29b4782a
|
||||
Subproject commit d85d0e98999cd9e28ceb66645999b4a9ce85370e
|
@ -2,15 +2,23 @@ if (USE_INCLUDE_WHAT_YOU_USE)
|
||||
set (CMAKE_CXX_INCLUDE_WHAT_YOU_USE ${IWYU_PATH})
|
||||
endif ()
|
||||
|
||||
include(${CMAKE_CURRENT_SOURCE_DIR}/cmake/find_vectorclass.cmake)
|
||||
set (MAX_COMPILER_MEMORY 2500 CACHE INTERNAL "")
|
||||
if (MAKE_STATIC_LIBRARIES)
|
||||
set (MAX_LINKER_MEMORY 3500 CACHE INTERNAL "")
|
||||
else()
|
||||
set (MAX_LINKER_MEMORY 2500 CACHE INTERNAL "")
|
||||
endif ()
|
||||
include (../cmake/limit_jobs.cmake)
|
||||
|
||||
include(cmake/find_vectorclass.cmake)
|
||||
|
||||
set (CONFIG_VERSION ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config_version.h)
|
||||
set (CONFIG_COMMON ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config.h)
|
||||
|
||||
include (cmake/version.cmake)
|
||||
message (STATUS "Will build ${VERSION_FULL}")
|
||||
configure_file (${CMAKE_CURRENT_SOURCE_DIR}/src/Common/config.h.in ${CONFIG_COMMON})
|
||||
configure_file (${CMAKE_CURRENT_SOURCE_DIR}/src/Common/config_version.h.in ${CONFIG_VERSION})
|
||||
configure_file (src/Common/config.h.in ${CONFIG_COMMON})
|
||||
configure_file (src/Common/config_version.h.in ${CONFIG_VERSION})
|
||||
|
||||
if (NOT MSVC)
|
||||
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wextra")
|
||||
@ -53,7 +61,7 @@ add_subdirectory (src)
|
||||
set(dbms_headers)
|
||||
set(dbms_sources)
|
||||
|
||||
include(${ClickHouse_SOURCE_DIR}/cmake/dbms_glob_sources.cmake)
|
||||
include(../cmake/dbms_glob_sources.cmake)
|
||||
|
||||
add_headers_and_sources(clickhouse_common_io src/Common)
|
||||
add_headers_and_sources(clickhouse_common_io src/Common/HashTable)
|
||||
@ -291,6 +299,11 @@ target_include_directories (clickhouse_common_io BEFORE PRIVATE ${COMMON_INCLUDE
|
||||
add_subdirectory (programs)
|
||||
add_subdirectory (tests)
|
||||
|
||||
if (GLIBC_COMPATIBILITY AND NOT CLICKHOUSE_SPLIT_BINARY)
|
||||
MESSAGE(STATUS "Some symbols from glibc will be replaced for compatibility")
|
||||
target_link_libraries(dbms PUBLIC glibc-compatibility)
|
||||
endif()
|
||||
|
||||
if (ENABLE_TESTS)
|
||||
macro (grep_gtest_sources BASE_DIR DST_VAR)
|
||||
# Cold match files that are not in tests/ directories
|
||||
|
@ -1,11 +1,11 @@
|
||||
# This strings autochanged from release_lib.sh:
|
||||
set(VERSION_REVISION 54409 CACHE STRING "")
|
||||
set(VERSION_REVISION 54412 CACHE STRING "") # changed manually for tests
|
||||
set(VERSION_MAJOR 18 CACHE STRING "")
|
||||
set(VERSION_MINOR 14 CACHE STRING "")
|
||||
set(VERSION_PATCH 17 CACHE STRING "")
|
||||
set(VERSION_GITHASH ac2895d769c3dcf070530dec7fcfdcf87bfa852a CACHE STRING "")
|
||||
set(VERSION_DESCRIBE v18.14.17-testing CACHE STRING "")
|
||||
set(VERSION_STRING 18.14.17 CACHE STRING "")
|
||||
set(VERSION_MINOR 16 CACHE STRING "")
|
||||
set(VERSION_PATCH 0 CACHE STRING "")
|
||||
set(VERSION_GITHASH b9b48c646c253358340bd39fd57754e92f88cd8a CACHE STRING "")
|
||||
set(VERSION_DESCRIBE v18.16.0-testing CACHE STRING "")
|
||||
set(VERSION_STRING 18.16.0 CACHE STRING "")
|
||||
# end of autochange
|
||||
|
||||
set(VERSION_EXTRA "" CACHE STRING "")
|
||||
|
@ -27,12 +27,12 @@ elseif (EXISTS ${INTERNAL_COMPILER_BIN_ROOT}${INTERNAL_COMPILER_EXECUTABLE})
|
||||
endif ()
|
||||
|
||||
if (COPY_HEADERS_COMPILER AND OS_LINUX)
|
||||
add_custom_target (copy-headers ALL env CLANG=${COPY_HEADERS_COMPILER} BUILD_PATH=${ClickHouse_BINARY_DIR} DESTDIR=${ClickHouse_SOURCE_DIR} ${ClickHouse_SOURCE_DIR}/copy_headers.sh ${ClickHouse_SOURCE_DIR} ${TMP_HEADERS_DIR} DEPENDS ${COPY_HEADERS_DEPENDS} WORKING_DIRECTORY ${ClickHouse_SOURCE_DIR} SOURCES ${ClickHouse_SOURCE_DIR}/copy_headers.sh)
|
||||
add_custom_target (copy-headers env CLANG=${COPY_HEADERS_COMPILER} BUILD_PATH=${ClickHouse_BINARY_DIR} DESTDIR=${ClickHouse_SOURCE_DIR} ${ClickHouse_SOURCE_DIR}/copy_headers.sh ${ClickHouse_SOURCE_DIR} ${TMP_HEADERS_DIR} DEPENDS ${COPY_HEADERS_DEPENDS} WORKING_DIRECTORY ${ClickHouse_SOURCE_DIR} SOURCES ${ClickHouse_SOURCE_DIR}/copy_headers.sh)
|
||||
|
||||
if (USE_INTERNAL_LLVM_LIBRARY)
|
||||
set (CLANG_HEADERS_DIR "${ClickHouse_SOURCE_DIR}/contrib/llvm/clang/lib/Headers")
|
||||
set (CLANG_HEADERS_DEST "${TMP_HEADERS_DIR}/usr/local/lib/clang/${LLVM_VERSION}/include") # original: ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include
|
||||
add_custom_target (copy-headers-clang ALL ${CMAKE_COMMAND} -E make_directory ${CLANG_HEADERS_DEST} && ${CMAKE_COMMAND} -E copy_if_different ${CLANG_HEADERS_DIR}/* ${CLANG_HEADERS_DEST} )
|
||||
add_custom_target (copy-headers-clang ${CMAKE_COMMAND} -E make_directory ${CLANG_HEADERS_DEST} && ${CMAKE_COMMAND} -E copy_if_different ${CLANG_HEADERS_DIR}/* ${CLANG_HEADERS_DEST} )
|
||||
add_dependencies (copy-headers copy-headers-clang)
|
||||
endif ()
|
||||
endif ()
|
||||
|
@ -43,6 +43,7 @@
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <IO/UseSSL.h>
|
||||
#include <DataStreams/AsynchronousBlockInputStream.h>
|
||||
#include <DataStreams/AddingDefaultsBlockInputStream.h>
|
||||
#include <DataStreams/InternalTextLogsRowOutputStream.h>
|
||||
#include <Parsers/ParserQuery.h>
|
||||
#include <Parsers/ASTSetQuery.h>
|
||||
@ -60,6 +61,7 @@
|
||||
#include <Functions/registerFunctions.h>
|
||||
#include <AggregateFunctions/registerAggregateFunctions.h>
|
||||
#include <Common/Config/configReadClient.h>
|
||||
#include <Storages/ColumnsDescription.h>
|
||||
|
||||
#if USE_READLINE
|
||||
#include "Suggest.h" // Y_IGNORE
|
||||
@ -69,7 +71,6 @@
|
||||
#pragma GCC optimize("-fno-var-tracking-assignments")
|
||||
#endif
|
||||
|
||||
|
||||
/// http://en.wikipedia.org/wiki/ANSI_escape_code
|
||||
|
||||
/// Similar codes \e[s, \e[u don't work in VT100 and Mosh.
|
||||
@ -616,8 +617,14 @@ private:
|
||||
{
|
||||
std::cerr << std::endl
|
||||
<< "Exception on client:" << std::endl
|
||||
<< "Code: " << e.code() << ". " << e.displayText() << std::endl
|
||||
<< std::endl;
|
||||
<< "Code: " << e.code() << ". " << e.displayText() << std::endl;
|
||||
|
||||
if (config().getBool("stacktrace", false))
|
||||
std::cerr << "Stack trace:" << std::endl
|
||||
<< e.getStackTrace().toString() << std::endl;
|
||||
|
||||
std::cerr << std::endl;
|
||||
|
||||
}
|
||||
|
||||
/// Client-side exception during query execution can result in the loss of
|
||||
@ -659,6 +666,12 @@ private:
|
||||
const bool test_mode = config().has("testmode");
|
||||
if (config().has("multiquery"))
|
||||
{
|
||||
{ /// disable logs if expects errors
|
||||
TestHint test_hint(test_mode, text);
|
||||
if (test_hint.clientError() || test_hint.serverError())
|
||||
process("SET send_logs_level = 'none'");
|
||||
}
|
||||
|
||||
/// Several queries separated by ';'.
|
||||
/// INSERT data is ended by the end of line, not ';'.
|
||||
|
||||
@ -875,11 +888,12 @@ private:
|
||||
|
||||
/// Receive description of table structure.
|
||||
Block sample;
|
||||
if (receiveSampleBlock(sample))
|
||||
ColumnsDescription columns_description;
|
||||
if (receiveSampleBlock(sample, columns_description))
|
||||
{
|
||||
/// If structure was received (thus, server has not thrown an exception),
|
||||
/// send our data with that structure.
|
||||
sendData(sample);
|
||||
sendData(sample, columns_description);
|
||||
receiveEndOfQuery();
|
||||
}
|
||||
}
|
||||
@ -917,7 +931,7 @@ private:
|
||||
}
|
||||
|
||||
|
||||
void sendData(Block & sample)
|
||||
void sendData(Block & sample, const ColumnsDescription & columns_description)
|
||||
{
|
||||
/// If INSERT data must be sent.
|
||||
const ASTInsertQuery * parsed_insert_query = typeid_cast<const ASTInsertQuery *>(&*parsed_query);
|
||||
@ -928,19 +942,19 @@ private:
|
||||
{
|
||||
/// Send data contained in the query.
|
||||
ReadBufferFromMemory data_in(parsed_insert_query->data, parsed_insert_query->end - parsed_insert_query->data);
|
||||
sendDataFrom(data_in, sample);
|
||||
sendDataFrom(data_in, sample, columns_description);
|
||||
}
|
||||
else if (!is_interactive)
|
||||
{
|
||||
/// Send data read from stdin.
|
||||
sendDataFrom(std_in, sample);
|
||||
sendDataFrom(std_in, sample, columns_description);
|
||||
}
|
||||
else
|
||||
throw Exception("No data to insert", ErrorCodes::NO_DATA_TO_INSERT);
|
||||
}
|
||||
|
||||
|
||||
void sendDataFrom(ReadBuffer & buf, Block & sample)
|
||||
void sendDataFrom(ReadBuffer & buf, Block & sample, const ColumnsDescription & columns_description)
|
||||
{
|
||||
String current_format = insert_format;
|
||||
|
||||
@ -952,6 +966,10 @@ private:
|
||||
BlockInputStreamPtr block_input = context.getInputFormat(
|
||||
current_format, buf, sample, insert_format_max_block_size);
|
||||
|
||||
const auto & column_defaults = columns_description.defaults;
|
||||
if (!column_defaults.empty())
|
||||
block_input = std::make_shared<AddingDefaultsBlockInputStream>(block_input, column_defaults, context);
|
||||
|
||||
BlockInputStreamPtr async_block_input = std::make_shared<AsynchronousBlockInputStream>(block_input);
|
||||
|
||||
async_block_input->readPrefix();
|
||||
@ -1089,7 +1107,7 @@ private:
|
||||
|
||||
|
||||
/// Receive the block that serves as an example of the structure of table where data will be inserted.
|
||||
bool receiveSampleBlock(Block & out)
|
||||
bool receiveSampleBlock(Block & out, ColumnsDescription & columns_description)
|
||||
{
|
||||
while (true)
|
||||
{
|
||||
@ -1110,6 +1128,10 @@ private:
|
||||
onLogData(packet.block);
|
||||
break;
|
||||
|
||||
case Protocol::Server::TableColumns:
|
||||
columns_description = ColumnsDescription::parse(packet.multistring_message[1]);
|
||||
return receiveSampleBlock(out, columns_description);
|
||||
|
||||
default:
|
||||
throw NetException("Unexpected packet from server (expected Data, Exception or Log, got "
|
||||
+ String(Protocol::Server::toString(packet.type)) + ")", ErrorCodes::UNEXPECTED_PACKET_FROM_SERVER);
|
||||
|
@ -5,6 +5,7 @@
|
||||
#include <iostream>
|
||||
#include <Core/Types.h>
|
||||
#include <Common/Exception.h>
|
||||
#include <Parsers/Lexer.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
@ -27,25 +28,26 @@ public:
|
||||
if (!enabled_)
|
||||
return;
|
||||
|
||||
/// TODO: This is absolutely wrong. Fragment may be contained inside string literal.
|
||||
size_t pos = query.find("--");
|
||||
String full_comment;
|
||||
Lexer lexer(query.data(), query.data() + query.size());
|
||||
|
||||
if (pos != String::npos && query.find("--", pos + 2) != String::npos)
|
||||
return; /// It's not last comment. Hint belongs to commented query. /// TODO Absolutely wrong: there maybe the following comment for the next query.
|
||||
|
||||
if (pos != String::npos)
|
||||
for (Token token = lexer.nextToken(); !token.isEnd(); token = lexer.nextToken())
|
||||
{
|
||||
/// TODO: This is also wrong. Comment may already have ended by line break.
|
||||
pos = query.find('{', pos + 2);
|
||||
if (token.type == TokenType::Comment)
|
||||
full_comment += String(token.begin, token.begin + token.size()) + ' ';
|
||||
}
|
||||
|
||||
if (pos != String::npos)
|
||||
if (!full_comment.empty())
|
||||
{
|
||||
size_t pos_start = full_comment.find('{', 0);
|
||||
if (pos_start != String::npos)
|
||||
{
|
||||
String hint = query.substr(pos + 1);
|
||||
|
||||
/// TODO: And this is wrong for the same reason.
|
||||
pos = hint.find('}');
|
||||
hint.resize(pos);
|
||||
parse(hint);
|
||||
size_t pos_end = full_comment.find('}', pos_start);
|
||||
if (pos_end != String::npos)
|
||||
{
|
||||
String hint(full_comment.begin() + pos_start + 1, full_comment.begin() + pos_end);
|
||||
parse(hint);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -61,7 +61,7 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
|
||||
("block-size,b", boost::program_options::value<unsigned>()->default_value(DBMS_DEFAULT_BUFFER_SIZE), "compress in blocks of specified size")
|
||||
("hc", "use LZ4HC instead of LZ4")
|
||||
("zstd", "use ZSTD instead of LZ4")
|
||||
("level", "compression level")
|
||||
("level", boost::program_options::value<int>(), "compression level")
|
||||
("none", "use no compression instead of LZ4")
|
||||
("stat", "print block statistics of compressed data")
|
||||
;
|
||||
@ -94,7 +94,9 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
|
||||
else if (use_none)
|
||||
method = DB::CompressionMethod::NONE;
|
||||
|
||||
DB::CompressionSettings settings(method, options.count("level") > 0 ? options["level"].as<int>() : DB::CompressionSettings::getDefaultLevel(method));
|
||||
DB::CompressionSettings settings(method, options.count("level")
|
||||
? options["level"].as<int>()
|
||||
: DB::CompressionSettings::getDefaultLevel(method));
|
||||
|
||||
DB::ReadBufferFromFileDescriptor rb(STDIN_FILENO);
|
||||
DB::WriteBufferFromFileDescriptor wb(STDOUT_FILENO);
|
||||
|
@ -12,3 +12,4 @@
|
||||
#cmakedefine01 ENABLE_CLICKHOUSE_COMPRESSOR
|
||||
#cmakedefine01 ENABLE_CLICKHOUSE_FORMAT
|
||||
#cmakedefine01 ENABLE_CLICKHOUSE_OBFUSCATOR
|
||||
#cmakedefine01 ENABLE_CLICKHOUSE_ODBC_BRIDGE
|
||||
|
@ -481,7 +481,7 @@ String DB::TaskShard::getHostNameExample() const
|
||||
}
|
||||
|
||||
|
||||
static bool isExtedndedDefinitionStorage(const ASTPtr & storage_ast)
|
||||
static bool isExtendedDefinitionStorage(const ASTPtr & storage_ast)
|
||||
{
|
||||
const ASTStorage & storage = typeid_cast<const ASTStorage &>(*storage_ast);
|
||||
return storage.partition_by || storage.order_by || storage.sample_by;
|
||||
@ -503,7 +503,7 @@ static ASTPtr extractPartitionKey(const ASTPtr & storage_ast)
|
||||
ASTPtr arguments_ast = engine.arguments->clone();
|
||||
ASTs & arguments = typeid_cast<ASTExpressionList &>(*arguments_ast).children;
|
||||
|
||||
if (isExtedndedDefinitionStorage(storage_ast))
|
||||
if (isExtendedDefinitionStorage(storage_ast))
|
||||
{
|
||||
if (storage.partition_by)
|
||||
return storage.partition_by->clone();
|
||||
|
@ -53,7 +53,7 @@ int mainEntryClickHouseFormat(int argc, char ** argv);
|
||||
#if ENABLE_CLICKHOUSE_COPIER || !defined(ENABLE_CLICKHOUSE_COPIER)
|
||||
int mainEntryClickHouseClusterCopier(int argc, char ** argv);
|
||||
#endif
|
||||
#if ENABLE_CLICKHOUSE_OBFUSCATOR
|
||||
#if ENABLE_CLICKHOUSE_OBFUSCATOR || !defined(ENABLE_CLICKHOUSE_OBFUSCATOR)
|
||||
int mainEntryClickHouseObfuscator(int argc, char ** argv);
|
||||
#endif
|
||||
#if ENABLE_CLICKHOUSE_ODBC_BRIDGE || !defined(ENABLE_CLICKHOUSE_ODBC_BRIDGE)
|
||||
@ -102,7 +102,7 @@ std::pair<const char *, MainFunc> clickhouse_applications[] =
|
||||
#if ENABLE_CLICKHOUSE_COPIER || !defined(ENABLE_CLICKHOUSE_COPIER)
|
||||
{"copier", mainEntryClickHouseClusterCopier},
|
||||
#endif
|
||||
#if ENABLE_CLICKHOUSE_OBFUSCATOR
|
||||
#if ENABLE_CLICKHOUSE_OBFUSCATOR || !defined(ENABLE_CLICKHOUSE_OBFUSCATOR)
|
||||
{"obfuscator", mainEntryClickHouseObfuscator},
|
||||
#endif
|
||||
#if ENABLE_CLICKHOUSE_ODBC_BRIDGE || !defined(ENABLE_CLICKHOUSE_ODBC_BRIDGE)
|
||||
|
@ -121,9 +121,6 @@ void ODBCBridge::initialize(Application & self)
|
||||
if (is_help)
|
||||
return;
|
||||
|
||||
if (!config().has("logger.log"))
|
||||
config().setBool("logger.console", true);
|
||||
|
||||
config().setString("logger", "ODBCBridge");
|
||||
|
||||
buildLoggers(config());
|
||||
|
@ -2,7 +2,11 @@
|
||||
|
||||
#include <memory>
|
||||
#include <sys/resource.h>
|
||||
#include <sys/stat.h>
|
||||
#include <sys/types.h>
|
||||
#include <errno.h>
|
||||
#include <pwd.h>
|
||||
#include <unistd.h>
|
||||
#include <Poco/Version.h>
|
||||
#include <Poco/DirectoryIterator.h>
|
||||
#include <Poco/Net/HTTPServer.h>
|
||||
@ -70,6 +74,8 @@ namespace ErrorCodes
|
||||
extern const int EXCESSIVE_ELEMENT_IN_CONFIG;
|
||||
extern const int INVALID_CONFIG_PARAMETER;
|
||||
extern const int SYSTEM_ERROR;
|
||||
extern const int FAILED_TO_GETPWUID;
|
||||
extern const int MISMATCHING_USERS_FOR_PROCESS_AND_DATA;
|
||||
}
|
||||
|
||||
|
||||
@ -83,6 +89,26 @@ static std::string getCanonicalPath(std::string && path)
|
||||
return std::move(path);
|
||||
}
|
||||
|
||||
static std::string getUserName(uid_t user_id)
|
||||
{
|
||||
/// Try to convert user id into user name.
|
||||
auto buffer_size = sysconf(_SC_GETPW_R_SIZE_MAX);
|
||||
if (buffer_size <= 0)
|
||||
buffer_size = 1024;
|
||||
std::string buffer;
|
||||
buffer.reserve(buffer_size);
|
||||
|
||||
struct passwd passwd_entry;
|
||||
struct passwd * result = nullptr;
|
||||
const auto error = getpwuid_r(user_id, &passwd_entry, buffer.data(), buffer_size, &result);
|
||||
|
||||
if (error)
|
||||
throwFromErrno("Failed to find user name for " + toString(user_id), ErrorCodes::FAILED_TO_GETPWUID, error);
|
||||
else if (result)
|
||||
return result->pw_name;
|
||||
return toString(user_id);
|
||||
}
|
||||
|
||||
void Server::uninitialize()
|
||||
{
|
||||
logger().information("shutting down");
|
||||
@ -166,6 +192,26 @@ int Server::main(const std::vector<std::string> & /*args*/)
|
||||
std::string path = getCanonicalPath(config().getString("path", DBMS_DEFAULT_PATH));
|
||||
std::string default_database = config().getString("default_database", "default");
|
||||
|
||||
/// Check that the process' user id matches the owner of the data.
|
||||
const auto effective_user_id = geteuid();
|
||||
struct stat statbuf;
|
||||
if (stat(path.c_str(), &statbuf) == 0 && effective_user_id != statbuf.st_uid)
|
||||
{
|
||||
const auto effective_user = getUserName(effective_user_id);
|
||||
const auto data_owner = getUserName(statbuf.st_uid);
|
||||
std::string message = "Effective user of the process (" + effective_user +
|
||||
") does not match the owner of the data (" + data_owner + ").";
|
||||
if (effective_user_id == 0)
|
||||
{
|
||||
message += " Run under 'sudo -u " + data_owner + "'.";
|
||||
throw Exception(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
|
||||
}
|
||||
else
|
||||
{
|
||||
LOG_WARNING(log, message);
|
||||
}
|
||||
}
|
||||
|
||||
global_context->setPath(path);
|
||||
|
||||
/// Create directories for 'path' and for default database, if not exist.
|
||||
|
@ -30,6 +30,7 @@
|
||||
#include <Storages/StorageMemory.h>
|
||||
#include <Storages/StorageReplicatedMergeTree.h>
|
||||
#include <Core/ExternalTable.h>
|
||||
#include <Storages/ColumnDefault.h>
|
||||
#include <DataTypes/DataTypeLowCardinality.h>
|
||||
|
||||
#include "TCPHandler.h"
|
||||
@ -360,20 +361,16 @@ void TCPHandler::processInsertQuery(const Settings & global_settings)
|
||||
*/
|
||||
state.io.out->writePrefix();
|
||||
|
||||
/// Send block to the client - table structure.
|
||||
Block block = state.io.out->getHeader();
|
||||
|
||||
/// Support insert from old clients without low cardinality type.
|
||||
if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
|
||||
/// Send ColumnsDescription for insertion table
|
||||
if (client_revision >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA)
|
||||
{
|
||||
for (auto & col : block)
|
||||
{
|
||||
col.type = recursiveRemoveLowCardinality(col.type);
|
||||
col.column = recursiveRemoveLowCardinality(col.column);
|
||||
}
|
||||
const auto & db_and_table = query_context.getInsertionTable();
|
||||
if (auto * columns = ColumnsDescription::loadFromContext(query_context, db_and_table.first, db_and_table.second))
|
||||
sendTableColumns(*columns);
|
||||
}
|
||||
|
||||
sendData(block);
|
||||
/// Send block to the client - table structure.
|
||||
sendData(state.io.out->getHeader());
|
||||
|
||||
readData(global_settings);
|
||||
state.io.out->writeSuffix();
|
||||
@ -389,6 +386,7 @@ void TCPHandler::processOrdinaryQuery()
|
||||
/// Send header-block, to allow client to prepare output format for data to send.
|
||||
{
|
||||
Block header = state.io.in->getHeader();
|
||||
|
||||
if (header)
|
||||
sendData(header);
|
||||
}
|
||||
@ -762,7 +760,8 @@ void TCPHandler::initBlockInput()
|
||||
state.block_in = std::make_shared<NativeBlockInputStream>(
|
||||
*state.maybe_compressed_in,
|
||||
header,
|
||||
client_revision);
|
||||
client_revision,
|
||||
!connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
|
||||
}
|
||||
}
|
||||
|
||||
@ -783,7 +782,8 @@ void TCPHandler::initBlockOutput(const Block & block)
|
||||
state.block_out = std::make_shared<NativeBlockOutputStream>(
|
||||
*state.maybe_compressed_out,
|
||||
client_revision,
|
||||
block.cloneEmpty());
|
||||
block.cloneEmpty(),
|
||||
!connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
|
||||
}
|
||||
}
|
||||
|
||||
@ -795,7 +795,8 @@ void TCPHandler::initLogsBlockOutput(const Block & block)
|
||||
state.logs_block_out = std::make_shared<NativeBlockOutputStream>(
|
||||
*out,
|
||||
client_revision,
|
||||
block.cloneEmpty());
|
||||
block.cloneEmpty(),
|
||||
!connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
|
||||
}
|
||||
}
|
||||
|
||||
@ -860,6 +861,16 @@ void TCPHandler::sendLogData(const Block & block)
|
||||
out->next();
|
||||
}
|
||||
|
||||
void TCPHandler::sendTableColumns(const ColumnsDescription & columns)
|
||||
{
|
||||
writeVarUInt(Protocol::Server::TableColumns, *out);
|
||||
|
||||
/// Send external table name (empty name is the main table)
|
||||
writeStringBinary("", *out);
|
||||
writeStringBinary(columns.toString(), *out);
|
||||
|
||||
out->next();
|
||||
}
|
||||
|
||||
void TCPHandler::sendException(const Exception & e, bool with_stack_trace)
|
||||
{
|
||||
|
@ -25,6 +25,7 @@ namespace Poco { class Logger; }
|
||||
namespace DB
|
||||
{
|
||||
|
||||
struct ColumnsDescription;
|
||||
|
||||
/// State of query processing.
|
||||
struct QueryState
|
||||
@ -144,6 +145,7 @@ private:
|
||||
void sendHello();
|
||||
void sendData(const Block & block); /// Write a block to the network.
|
||||
void sendLogData(const Block & block);
|
||||
void sendTableColumns(const ColumnsDescription & columns);
|
||||
void sendException(const Exception & e, bool with_stack_trace);
|
||||
void sendProgress();
|
||||
void sendLogs();
|
||||
|
@ -187,6 +187,20 @@
|
||||
</replica>
|
||||
</shard>
|
||||
</test_shard_localhost_secure>
|
||||
<test_unavailable_shard>
|
||||
<shard>
|
||||
<replica>
|
||||
<host>localhost</host>
|
||||
<port>9000</port>
|
||||
</replica>
|
||||
</shard>
|
||||
<shard>
|
||||
<replica>
|
||||
<host>localhost</host>
|
||||
<port>1</port>
|
||||
</replica>
|
||||
</shard>
|
||||
</test_unavailable_shard>
|
||||
</remote_servers>
|
||||
|
||||
|
||||
|
@ -85,7 +85,7 @@ public:
|
||||
const ColumnArray & first_array_column = static_cast<const ColumnArray &>(*columns[0]);
|
||||
const IColumn::Offsets & offsets = first_array_column.getOffsets();
|
||||
|
||||
size_t begin = row_num == 0 ? 0 : offsets[row_num - 1];
|
||||
size_t begin = offsets[row_num - 1];
|
||||
size_t end = offsets[row_num];
|
||||
|
||||
/// Sanity check. NOTE We can implement specialization for a case with single argument, if the check will hurt performance.
|
||||
|
@ -25,7 +25,7 @@ struct AggregateFunctionAvgData
|
||||
UInt64 count = 0;
|
||||
|
||||
template <typename ResultT>
|
||||
ResultT result() const
|
||||
ResultT NO_SANITIZE_UNDEFINED result() const
|
||||
{
|
||||
if constexpr (std::is_floating_point_v<ResultT>)
|
||||
if constexpr (std::numeric_limits<ResultT>::is_iec559)
|
||||
|
@ -0,0 +1,36 @@
|
||||
#include <AggregateFunctions/AggregateFunctionFactory.h>
|
||||
#include <AggregateFunctions/AggregateFunctionBoundingRatio.h>
|
||||
#include <AggregateFunctions/FactoryHelpers.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
|
||||
AggregateFunctionPtr createAggregateFunctionRate(const std::string & name, const DataTypes & argument_types, const Array & parameters)
|
||||
{
|
||||
assertNoParameters(name, parameters);
|
||||
assertBinary(name, argument_types);
|
||||
|
||||
if (argument_types.size() < 2)
|
||||
throw Exception("Aggregate function " + name + " requires at least two arguments",
|
||||
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
|
||||
|
||||
return std::make_shared<AggregateFunctionBoundingRatio>(argument_types);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void registerAggregateFunctionRate(AggregateFunctionFactory & factory)
|
||||
{
|
||||
factory.registerFunction("boundingRatio", createAggregateFunctionRate, AggregateFunctionFactory::CaseInsensitive);
|
||||
}
|
||||
|
||||
}
|
162
dbms/src/AggregateFunctions/AggregateFunctionBoundingRatio.h
Normal file
162
dbms/src/AggregateFunctions/AggregateFunctionBoundingRatio.h
Normal file
@ -0,0 +1,162 @@
|
||||
#pragma once
|
||||
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <Columns/ColumnsNumber.h>
|
||||
#include <Common/FieldVisitors.h>
|
||||
#include <IO/ReadHelpers.h>
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <AggregateFunctions/Helpers.h>
|
||||
#include <AggregateFunctions/IAggregateFunction.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int BAD_ARGUMENTS;
|
||||
}
|
||||
|
||||
/** Tracks the leftmost and rightmost (x, y) data points.
|
||||
*/
|
||||
struct AggregateFunctionBoundingRatioData
|
||||
{
|
||||
struct Point
|
||||
{
|
||||
Float64 x;
|
||||
Float64 y;
|
||||
};
|
||||
|
||||
bool empty = true;
|
||||
Point left;
|
||||
Point right;
|
||||
|
||||
void add(Float64 x, Float64 y)
|
||||
{
|
||||
Point point{x, y};
|
||||
|
||||
if (empty)
|
||||
{
|
||||
left = point;
|
||||
right = point;
|
||||
empty = false;
|
||||
}
|
||||
else if (point.x < left.x)
|
||||
{
|
||||
left = point;
|
||||
}
|
||||
else if (point.x > right.x)
|
||||
{
|
||||
right = point;
|
||||
}
|
||||
}
|
||||
|
||||
void merge(const AggregateFunctionBoundingRatioData & other)
|
||||
{
|
||||
if (empty)
|
||||
{
|
||||
*this = other;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (other.left.x < left.x)
|
||||
left = other.left;
|
||||
if (other.right.x > right.x)
|
||||
right = other.right;
|
||||
}
|
||||
}
|
||||
|
||||
void serialize(WriteBuffer & buf) const
|
||||
{
|
||||
writeBinary(empty, buf);
|
||||
|
||||
if (!empty)
|
||||
{
|
||||
writePODBinary(left, buf);
|
||||
writePODBinary(right, buf);
|
||||
}
|
||||
}
|
||||
|
||||
void deserialize(ReadBuffer & buf)
|
||||
{
|
||||
readBinary(empty, buf);
|
||||
|
||||
if (!empty)
|
||||
{
|
||||
readPODBinary(left, buf);
|
||||
readPODBinary(right, buf);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
class AggregateFunctionBoundingRatio final : public IAggregateFunctionDataHelper<AggregateFunctionBoundingRatioData, AggregateFunctionBoundingRatio>
|
||||
{
|
||||
private:
|
||||
/** Calculates the slope of a line between leftmost and rightmost data points.
|
||||
* (y2 - y1) / (x2 - x1)
|
||||
*/
|
||||
Float64 NO_SANITIZE_UNDEFINED getBoundingRatio(const AggregateFunctionBoundingRatioData & data) const
|
||||
{
|
||||
if (data.empty)
|
||||
return std::numeric_limits<Float64>::quiet_NaN();
|
||||
|
||||
return (data.right.y - data.left.y) / (data.right.x - data.left.x);
|
||||
}
|
||||
|
||||
public:
|
||||
String getName() const override
|
||||
{
|
||||
return "boundingRatio";
|
||||
}
|
||||
|
||||
AggregateFunctionBoundingRatio(const DataTypes & arguments)
|
||||
{
|
||||
const auto x_arg = arguments.at(0).get();
|
||||
const auto y_arg = arguments.at(0).get();
|
||||
|
||||
if (!x_arg->isValueRepresentedByNumber() || !y_arg->isValueRepresentedByNumber())
|
||||
throw Exception("Illegal types of arguments of aggregate function " + getName() + ", must have number representation.",
|
||||
ErrorCodes::BAD_ARGUMENTS);
|
||||
}
|
||||
|
||||
DataTypePtr getReturnType() const override
|
||||
{
|
||||
return std::make_shared<DataTypeFloat64>();
|
||||
}
|
||||
|
||||
void add(AggregateDataPtr place, const IColumn ** columns, const size_t row_num, Arena *) const override
|
||||
{
|
||||
/// TODO Inefficient.
|
||||
const auto x = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[0])[row_num]);
|
||||
const auto y = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[1])[row_num]);
|
||||
data(place).add(x, y);
|
||||
}
|
||||
|
||||
void merge(AggregateDataPtr place, ConstAggregateDataPtr rhs, Arena *) const override
|
||||
{
|
||||
data(place).merge(data(rhs));
|
||||
}
|
||||
|
||||
void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override
|
||||
{
|
||||
data(place).serialize(buf);
|
||||
}
|
||||
|
||||
void deserialize(AggregateDataPtr place, ReadBuffer & buf, Arena *) const override
|
||||
{
|
||||
data(place).deserialize(buf);
|
||||
}
|
||||
|
||||
void insertResultInto(ConstAggregateDataPtr place, IColumn & to) const override
|
||||
{
|
||||
static_cast<ColumnFloat64 &>(to).getData().push_back(getBoundingRatio(data(place)));
|
||||
}
|
||||
|
||||
const char * getHeaderFilePath() const override
|
||||
{
|
||||
return __FILE__;
|
||||
}
|
||||
};
|
||||
|
||||
}
|
@ -146,7 +146,7 @@ public:
|
||||
const ColumnArray & first_array_column = static_cast<const ColumnArray &>(*columns[0]);
|
||||
const IColumn::Offsets & offsets = first_array_column.getOffsets();
|
||||
|
||||
size_t begin = row_num == 0 ? 0 : offsets[row_num - 1];
|
||||
size_t begin = offsets[row_num - 1];
|
||||
size_t end = offsets[row_num];
|
||||
|
||||
/// Sanity check. NOTE We can implement specialization for a case with single argument, if the check will hurt performance.
|
||||
|
@ -36,7 +36,7 @@ template <typename T>
|
||||
struct GroupArrayNumericData
|
||||
{
|
||||
// Switch to ordinary Allocator after 4096 bytes to avoid fragmentation and trash in Arena
|
||||
using Allocator = MixedArenaAllocator<4096>;
|
||||
using Allocator = MixedAlignedArenaAllocator<alignof(T), 4096>;
|
||||
using Array = PODArray<T, 32, Allocator>;
|
||||
|
||||
Array value;
|
||||
@ -77,12 +77,14 @@ public:
|
||||
|
||||
if (!limit_num_elems)
|
||||
{
|
||||
cur_elems.value.insert(rhs_elems.value.begin(), rhs_elems.value.end(), arena);
|
||||
if (rhs_elems.value.size())
|
||||
cur_elems.value.insert(rhs_elems.value.begin(), rhs_elems.value.end(), arena);
|
||||
}
|
||||
else
|
||||
{
|
||||
UInt64 elems_to_insert = std::min(static_cast<size_t>(max_elems) - cur_elems.value.size(), rhs_elems.value.size());
|
||||
cur_elems.value.insert(rhs_elems.value.begin(), rhs_elems.value.begin() + elems_to_insert, arena);
|
||||
if (elems_to_insert)
|
||||
cur_elems.value.insert(rhs_elems.value.begin(), rhs_elems.value.begin() + elems_to_insert, arena);
|
||||
}
|
||||
}
|
||||
|
||||
@ -119,10 +121,13 @@ public:
|
||||
ColumnArray & arr_to = static_cast<ColumnArray &>(to);
|
||||
ColumnArray::Offsets & offsets_to = arr_to.getOffsets();
|
||||
|
||||
offsets_to.push_back((offsets_to.size() == 0 ? 0 : offsets_to.back()) + size);
|
||||
offsets_to.push_back(offsets_to.back() + size);
|
||||
|
||||
typename ColumnVector<T>::Container & data_to = static_cast<ColumnVector<T> &>(arr_to.getData()).getData();
|
||||
data_to.insert(this->data(place).value.begin(), this->data(place).value.end());
|
||||
if (size)
|
||||
{
|
||||
typename ColumnVector<T>::Container & data_to = static_cast<ColumnVector<T> &>(arr_to.getData()).getData();
|
||||
data_to.insert(this->data(place).value.begin(), this->data(place).value.end());
|
||||
}
|
||||
}
|
||||
|
||||
bool allocatesMemoryInArena() const override
|
||||
@ -370,7 +375,7 @@ public:
|
||||
auto & column_array = static_cast<ColumnArray &>(to);
|
||||
|
||||
auto & offsets = column_array.getOffsets();
|
||||
offsets.push_back((offsets.size() == 0 ? 0 : offsets.back()) + data(place).elems);
|
||||
offsets.push_back(offsets.back() + data(place).elems);
|
||||
|
||||
auto & column_data = column_array.getData();
|
||||
|
||||
|
@ -83,7 +83,7 @@ public:
|
||||
const typename State::Set & set = this->data(place).value;
|
||||
size_t size = set.size();
|
||||
|
||||
offsets_to.push_back((offsets_to.size() == 0 ? 0 : offsets_to.back()) + size);
|
||||
offsets_to.push_back(offsets_to.back() + size);
|
||||
|
||||
typename ColumnVector<T>::Container & data_to = static_cast<ColumnVector<T> &>(arr_to.getData()).getData();
|
||||
size_t old_size = data_to.size();
|
||||
@ -195,7 +195,7 @@ public:
|
||||
for (auto & rhs_elem : rhs_set)
|
||||
{
|
||||
cur_set.emplace(rhs_elem, it, inserted);
|
||||
if (inserted)
|
||||
if (inserted && it->size)
|
||||
it->data = arena->insert(it->data, it->size);
|
||||
}
|
||||
}
|
||||
@ -207,7 +207,7 @@ public:
|
||||
IColumn & data_to = arr_to.getData();
|
||||
|
||||
auto & set = this->data(place).value;
|
||||
offsets_to.push_back((offsets_to.size() == 0 ? 0 : offsets_to.back()) + set.size());
|
||||
offsets_to.push_back(offsets_to.back() + set.size());
|
||||
|
||||
for (auto & elem : set)
|
||||
{
|
||||
|
@ -17,6 +17,7 @@ namespace ErrorCodes
|
||||
extern const int PARAMETER_OUT_OF_BOUND;
|
||||
}
|
||||
|
||||
|
||||
namespace
|
||||
{
|
||||
|
||||
@ -44,6 +45,8 @@ AggregateFunctionPtr createAggregateFunctionHistogram(const std::string & name,
|
||||
throw Exception("Illegal type " + arguments[0]->getName() + " of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
|
||||
return res;
|
||||
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -38,7 +38,7 @@ struct MaxIntersectionsData
|
||||
using Value = std::pair<T, Int64>;
|
||||
|
||||
// Switch to ordinary Allocator after 4096 bytes to avoid fragmentation and trash in Arena
|
||||
using Allocator = MixedArenaAllocator<4096>;
|
||||
using Allocator = MixedAlignedArenaAllocator<alignof(Value), 4096>;
|
||||
using Array = PODArray<Value, 32, Allocator>;
|
||||
|
||||
Array value;
|
||||
|
@ -138,7 +138,7 @@ public:
|
||||
ColumnArray::Offsets & offsets_to = arr_to.getOffsets();
|
||||
|
||||
size_t size = levels.size();
|
||||
offsets_to.push_back((offsets_to.size() == 0 ? 0 : offsets_to.back()) + size);
|
||||
offsets_to.push_back(offsets_to.back() + size);
|
||||
|
||||
if (!size)
|
||||
return;
|
||||
|
@ -68,12 +68,12 @@ struct VarMoments
|
||||
readPODBinary(*this, buf);
|
||||
}
|
||||
|
||||
T getPopulation() const
|
||||
T NO_SANITIZE_UNDEFINED getPopulation() const
|
||||
{
|
||||
return (m2 - m1 * m1 / m0) / m0;
|
||||
}
|
||||
|
||||
T getSample() const
|
||||
T NO_SANITIZE_UNDEFINED getSample() const
|
||||
{
|
||||
if (m0 == 0)
|
||||
return std::numeric_limits<T>::quiet_NaN();
|
||||
@ -177,12 +177,12 @@ struct CovarMoments
|
||||
readPODBinary(*this, buf);
|
||||
}
|
||||
|
||||
T getPopulation() const
|
||||
T NO_SANITIZE_UNDEFINED getPopulation() const
|
||||
{
|
||||
return (xy - x1 * y1 / m0) / m0;
|
||||
}
|
||||
|
||||
T getSample() const
|
||||
T NO_SANITIZE_UNDEFINED getSample() const
|
||||
{
|
||||
if (m0 == 0)
|
||||
return std::numeric_limits<T>::quiet_NaN();
|
||||
@ -232,7 +232,7 @@ struct CorrMoments
|
||||
readPODBinary(*this, buf);
|
||||
}
|
||||
|
||||
T get() const
|
||||
T NO_SANITIZE_UNDEFINED get() const
|
||||
{
|
||||
return (m0 * xy - x1 * y1) / sqrt((m0 * x2 - x1 * x1) * (m0 * y2 - y1 * y1));
|
||||
}
|
||||
|
@@ -83,7 +83,7 @@ public:
const ColumnArray & array_column = static_cast<const ColumnArray &>(*columns[0]);
const IColumn::Offsets & offsets = array_column.getOffsets();
const auto & keys_vec = static_cast<const ColVecType &>(array_column.getData());
const size_t keys_vec_offset = row_num == 0 ? 0 : offsets[row_num - 1];
const size_t keys_vec_offset = offsets[row_num - 1];
const size_t keys_vec_size = (offsets[row_num] - keys_vec_offset);

// Columns 1..n contain arrays of numeric values to sum

@@ -93,7 +93,7 @@ public:
Field value;
const ColumnArray & array_column = static_cast<const ColumnArray &>(*columns[col + 1]);
const IColumn::Offsets & offsets = array_column.getOffsets();
const size_t values_vec_offset = row_num == 0 ? 0 : offsets[row_num - 1];
const size_t values_vec_offset = offsets[row_num - 1];
const size_t values_vec_size = (offsets[row_num] - values_vec_offset);

// Expect key and value arrays to be of same length
@@ -93,7 +93,7 @@ public:
auto result_vec = set.topK(threshold);
size_t size = result_vec.size();

offsets_to.push_back((offsets_to.size() == 0 ? 0 : offsets_to.back()) + size);
offsets_to.push_back(offsets_to.back() + size);

typename ColumnVector<T>::Container & data_to = static_cast<ColumnVector<T> &>(arr_to.getData()).getData();
size_t old_size = data_to.size();

@@ -212,7 +212,7 @@ public:
IColumn & data_to = arr_to.getData();

auto result_vec = this->data(place).value.topK(threshold);
offsets_to.push_back((offsets_to.size() == 0 ? 0 : offsets_to.back()) + result_vec.size());
offsets_to.push_back(offsets_to.back() + result_vec.size());

for (auto & elem : result_vec)
{
@@ -28,8 +28,8 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
if (which.idx == TypeIndex::TYPE) return new AggregateFunctionTemplate<TYPE>(std::forward<TArgs>(args)...);
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<UInt8>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<UInt16>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<Int8>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<Int16>(std::forward<TArgs>(args)...);
return nullptr;
}

@@ -41,8 +41,8 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
if (which.idx == TypeIndex::TYPE) return new AggregateFunctionTemplate<TYPE, Data>(std::forward<TArgs>(args)...);
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<UInt8, Data>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<UInt16, Data>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<Int8, Data>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<Int16, Data>(std::forward<TArgs>(args)...);
return nullptr;
}

@@ -54,8 +54,8 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
if (which.idx == TypeIndex::TYPE) return new AggregateFunctionTemplate<TYPE, Data<TYPE>>(std::forward<TArgs>(args)...);
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<UInt8, Data<UInt8>>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<UInt16, Data<UInt16>>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<Int8, Data<Int8>>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<Int16, Data<Int16>>(std::forward<TArgs>(args)...);
return nullptr;
}

@@ -106,8 +106,8 @@ static IAggregateFunction * createWithTwoNumericTypesSecond(const IDataType & se
if (which.idx == TypeIndex::TYPE) return new AggregateFunctionTemplate<FirstType, TYPE>(std::forward<TArgs>(args)...);
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<FirstType, UInt8>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<FirstType, UInt16>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<FirstType, Int8>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<FirstType, Int16>(std::forward<TArgs>(args)...);
return nullptr;
}

@@ -121,9 +121,9 @@ static IAggregateFunction * createWithTwoNumericTypes(const IDataType & first_ty
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Enum8)
return createWithTwoNumericTypesSecond<UInt8, AggregateFunctionTemplate>(second_type, std::forward<TArgs>(args)...);
return createWithTwoNumericTypesSecond<Int8, AggregateFunctionTemplate>(second_type, std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16)
return createWithTwoNumericTypesSecond<UInt16, AggregateFunctionTemplate>(second_type, std::forward<TArgs>(args)...);
return createWithTwoNumericTypesSecond<Int16, AggregateFunctionTemplate>(second_type, std::forward<TArgs>(args)...);
return nullptr;
}
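Each of these dispatch hunks swaps Enum8/Enum16 from unsigned to signed template arguments, since the enums store signed underlying values and an unsigned instantiation would reinterpret negative values. A standalone illustration of the failure mode:

#include <cstdint>

int main()
{
    int8_t enum_value = -1;                          /// a legal Enum8 underlying value
    uint8_t misread = static_cast<uint8_t>(enum_value);
    return misread == 255 ? 0 : 1;                   /// -1 shows up as 255 when treated as unsigned
}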
@@ -15,6 +15,7 @@ void registerAggregateFunctionGroupArrayInsertAt(AggregateFunctionFactory &);
void registerAggregateFunctionsQuantile(AggregateFunctionFactory &);
void registerAggregateFunctionsSequenceMatch(AggregateFunctionFactory &);
void registerAggregateFunctionWindowFunnel(AggregateFunctionFactory &);
void registerAggregateFunctionRate(AggregateFunctionFactory &);
void registerAggregateFunctionsMinMaxAny(AggregateFunctionFactory &);
void registerAggregateFunctionsStatisticsStable(AggregateFunctionFactory &);
void registerAggregateFunctionsStatisticsSimple(AggregateFunctionFactory &);

@@ -50,6 +51,7 @@ void registerAggregateFunctions()
registerAggregateFunctionsQuantile(factory);
registerAggregateFunctionsSequenceMatch(factory);
registerAggregateFunctionWindowFunnel(factory);
registerAggregateFunctionRate(factory);
registerAggregateFunctionsMinMaxAny(factory);
registerAggregateFunctionsStatisticsStable(factory);
registerAggregateFunctionsStatisticsSimple(factory);
@@ -603,6 +603,10 @@ Connection::Packet Connection::receivePacket()
res.block = receiveLogData();
return res;

case Protocol::Server::TableColumns:
res.multistring_message = receiveMultistringMessage(res.type);
return res;

case Protocol::Server::EndOfStream:
return res;

@@ -712,6 +716,16 @@ std::unique_ptr<Exception> Connection::receiveException()
}


std::vector<String> Connection::receiveMultistringMessage(UInt64 msg_type)
{
size_t num = Protocol::Server::stringsInMessage(msg_type);
std::vector<String> out(num);
for (size_t i = 0; i < num; ++i)
readStringBinary(out[i], *in);
return out;
}


Progress Connection::receiveProgress()
{
//LOG_TRACE(log_wrapper.get(), "Receiving progress");
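The TableColumns packet added above carries a fixed number of length-prefixed strings. A self-contained sketch of the decoding that readStringBinary is assumed to perform here (varint length followed by the payload bytes, as in the native protocol; names below are illustrative):

#include <cstdint>
#include <string>
#include <vector>

struct MemoryReader
{
    const uint8_t * pos;

    uint64_t readVarUInt()
    {
        uint64_t x = 0;
        for (size_t shift = 0;; shift += 7)
        {
            uint8_t byte = *pos++;
            x |= uint64_t(byte & 0x7F) << shift;   /// 7 payload bits per byte
            if (!(byte & 0x80))                    /// high bit clear: last byte
                return x;
        }
    }

    std::string readStringBinary()
    {
        uint64_t size = readVarUInt();             /// length prefix
        std::string s(reinterpret_cast<const char *>(pos), size);
        pos += size;
        return s;
    }
};

std::vector<std::string> receiveMultistringMessage(MemoryReader & in, size_t num_strings)
{
    std::vector<std::string> out(num_strings);
    for (auto & s : out)
        s = in.readStringBinary();
    return out;
}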
@@ -1,5 +1,7 @@
#pragma once

#include <optional>

#include <common/logger_useful.h>

#include <Poco/Net/StreamSocket.h>

@@ -96,6 +98,7 @@ public:

Block block;
std::unique_ptr<Exception> exception;
std::vector<String> multistring_message;
Progress progress;
BlockStreamProfileInfo profile_info;

@@ -254,6 +257,7 @@ private:
Block receiveLogData();
Block receiveDataImpl(BlockInputStreamPtr & stream);

std::vector<String> receiveMultistringMessage(UInt64 msg_type);
std::unique_ptr<Exception> receiveException();
Progress receiveProgress();
BlockStreamProfileInfo receiveProfileInfo();

@@ -0,0 +1,3 @@
if (ENABLE_TESTS)
add_subdirectory (tests)
endif ()
@@ -378,7 +378,7 @@ const char * ColumnAggregateFunction::deserializeAndInsertFromArena(const char *
* as we cannot legally compare pointers after last element + 1 of some valid memory region.
* Probably this will not work under UBSan.
*/
ReadBufferFromMemory read_buffer(src_arena, std::numeric_limits<char *>::max() - src_arena);
ReadBufferFromMemory read_buffer(src_arena, std::numeric_limits<char *>::max() - src_arena - 1);
func->deserialize(data.back(), read_buffer, &dst_arena);

return read_buffer.position();
@@ -8,6 +8,8 @@
#include <Columns/ColumnConst.h>
#include <Columns/ColumnsCommon.h>

#include <common/unaligned.h>

#include <DataStreams/ColumnGathererStream.h>

#include <Common/Exception.h>
@@ -132,13 +134,13 @@ StringRef ColumnArray::getDataAt(size_t n) const
* since it contains only the data laid in succession, but not the offsets.
*/

size_t array_size = sizeAt(n);
if (array_size == 0)
return StringRef();

size_t offset_of_first_elem = offsetAt(n);
StringRef first = getData().getDataAtWithTerminatingZero(offset_of_first_elem);

size_t array_size = sizeAt(n);
if (array_size == 0)
return StringRef(first.data, 0);

size_t offset_of_last_elem = getOffsets()[n] - 1;
StringRef last = getData().getDataAtWithTerminatingZero(offset_of_last_elem);
@@ -164,7 +166,7 @@ void ColumnArray::insertData(const char * pos, size_t length)
if (pos != end)
throw Exception("Incorrect length argument for method ColumnArray::insertData", ErrorCodes::BAD_ARGUMENTS);

getOffsets().push_back((getOffsets().size() == 0 ? 0 : getOffsets().back()) + elems);
getOffsets().push_back(getOffsets().back() + elems);
}


@@ -186,13 +188,13 @@ StringRef ColumnArray::serializeValueIntoArena(size_t n, Arena & arena, char con

const char * ColumnArray::deserializeAndInsertFromArena(const char * pos)
{
size_t array_size = *reinterpret_cast<const size_t *>(pos);
size_t array_size = unalignedLoad<size_t>(pos);
pos += sizeof(array_size);

for (size_t i = 0; i < array_size; ++i)
pos = getData().deserializeAndInsertFromArena(pos);

getOffsets().push_back((getOffsets().size() == 0 ? 0 : getOffsets().back()) + array_size);
getOffsets().push_back(getOffsets().back() + array_size);
return pos;
}

@@ -214,7 +216,7 @@ void ColumnArray::insert(const Field & x)
size_t size = array.size();
for (size_t i = 0; i < size; ++i)
getData().insert(array[i]);
getOffsets().push_back((getOffsets().size() == 0 ? 0 : getOffsets().back()) + size);
getOffsets().push_back(getOffsets().back() + size);
}


@@ -225,13 +227,13 @@ void ColumnArray::insertFrom(const IColumn & src_, size_t n)
size_t offset = src.offsetAt(n);

getData().insertRangeFrom(src.getData(), offset, size);
getOffsets().push_back((getOffsets().size() == 0 ? 0 : getOffsets().back()) + size);
getOffsets().push_back(getOffsets().back() + size);
}


void ColumnArray::insertDefault()
{
getOffsets().push_back(getOffsets().size() == 0 ? 0 : getOffsets().back());
getOffsets().push_back(getOffsets().back());
}


@@ -320,17 +322,11 @@ bool ColumnArray::hasEqualOffsets(const ColumnArray & other) const

ColumnPtr ColumnArray::convertToFullColumnIfConst() const
{
ColumnPtr new_data;

if (ColumnPtr full_column = getData().convertToFullColumnIfConst())
new_data = full_column;
else
new_data = data;

return ColumnArray::create(new_data, offsets);
/// It is possible to have an array with constant data and non-constant offsets.
/// Example is the result of expression: replicate('hello', [1])
return ColumnArray::create(data->convertToFullColumnIfConst(), offsets);
}


void ColumnArray::getExtremes(Field & min, Field & max) const
{
min = Array();

@@ -124,8 +124,8 @@ private:
ColumnPtr data;
ColumnPtr offsets;

size_t ALWAYS_INLINE offsetAt(size_t i) const { return i == 0 ? 0 : getOffsets()[i - 1]; }
size_t ALWAYS_INLINE sizeAt(size_t i) const { return i == 0 ? getOffsets()[0] : (getOffsets()[i] - getOffsets()[i - 1]); }
size_t ALWAYS_INLINE offsetAt(ssize_t i) const { return getOffsets()[i - 1]; }
size_t ALWAYS_INLINE sizeAt(ssize_t i) const { return getOffsets()[i] - getOffsets()[i - 1]; }


/// Multiply values if the nested column is ColumnVector<T>.
@@ -2,12 +2,15 @@
#include <Common/Arena.h>
#include <Common/SipHash.h>

#include <common/unaligned.h>

#include <IO/WriteHelpers.h>

#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnDecimal.h>
#include <DataStreams/ColumnGathererStream.h>


template <typename T> bool decimalLess(T x, T y, UInt32 x_scale, UInt32 y_scale);

namespace DB

@@ -41,7 +44,7 @@ StringRef ColumnDecimal<T>::serializeValueIntoArena(size_t n, Arena & arena, cha
template <typename T>
const char * ColumnDecimal<T>::deserializeAndInsertFromArena(const char * pos)
{
data.push_back(*reinterpret_cast<const T *>(pos));
data.push_back(unalignedLoad<T>(pos));
return pos + sizeof(T);
}
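Several hunks in this commit replace `*reinterpret_cast<const T *>(pos)` with `unalignedLoad<T>(pos)`: arena bytes carry no alignment guarantee, and dereferencing a misaligned pointer is undefined behaviour. A sketch of the helper, assuming common/unaligned.h provides the usual memcpy idiom:

#include <cstring>

template <typename T>
T unalignedLoad(const void * address)
{
    T result;
    std::memcpy(&result, address, sizeof(result));   /// lowered to a plain load on x86
    return result;
}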
@@ -3,6 +3,7 @@
#include <cmath>

#include <Columns/IColumn.h>
#include <Columns/ColumnVectorHelper.h>


namespace DB

@@ -53,13 +54,13 @@ private:

/// A ColumnVector for Decimals
template <typename T>
class ColumnDecimal final : public COWPtrHelper<IColumn, ColumnDecimal<T>>
class ColumnDecimal final : public COWPtrHelper<ColumnVectorHelper, ColumnDecimal<T>>
{
static_assert(IsDecimalNumber<T>);

private:
using Self = ColumnDecimal;
friend class COWPtrHelper<IColumn, Self>;
friend class COWPtrHelper<ColumnVectorHelper, Self>;

public:
using Container = DecimalPaddedPODArray<T>;
@@ -1,9 +1,10 @@
#pragma once

#include <string.h> // memcpy
#include <string.h> // memcmp

#include <Common/PODArray.h>
#include <Columns/IColumn.h>
#include <Columns/ColumnVectorHelper.h>


namespace DB

@@ -12,10 +13,10 @@ namespace DB
/** A column of values of "fixed-length string" type.
* If you insert a smaller string, it will be padded with zero bytes.
*/
class ColumnFixedString final : public COWPtrHelper<IColumn, ColumnFixedString>
class ColumnFixedString final : public COWPtrHelper<ColumnVectorHelper, ColumnFixedString>
{
public:
friend class COWPtrHelper<IColumn, ColumnFixedString>;
friend class COWPtrHelper<ColumnVectorHelper, ColumnFixedString>;

using Chars = PaddedPODArray<UInt8>;
@@ -212,13 +212,6 @@ void ColumnLowCardinality::insertData(const char * pos, size_t length)
idx.check(getDictionary().size());
}

void ColumnLowCardinality::insertDataWithTerminatingZero(const char * pos, size_t length)
{
compactIfSharedDictionary();
idx.insertPosition(dictionary.getColumnUnique().uniqueInsertDataWithTerminatingZero(pos, length));
idx.check(getDictionary().size());
}

StringRef ColumnLowCardinality::serializeValueIntoArena(size_t n, Arena & arena, char const *& begin) const
{
return getDictionary().serializeValueIntoArena(getIndexes().getUInt(n), arena, begin);

@@ -73,8 +73,6 @@ public:
void insertRangeFromDictionaryEncodedColumn(const IColumn & keys, const IColumn & positions);

void insertData(const char * pos, size_t length) override;
void insertDataWithTerminatingZero(const char * pos, size_t length) override;


void popBack(size_t n) override { idx.popBack(n); }
@@ -22,8 +22,7 @@ ColumnNullable::ColumnNullable(MutableColumnPtr && nested_column_, MutableColumn
: nested_column(std::move(nested_column_)), null_map(std::move(null_map_))
{
/// ColumnNullable cannot have constant nested column. But constant argument could be passed. Materialize it.
if (ColumnPtr nested_column_materialized = getNestedColumn().convertToFullColumnIfConst())
nested_column = nested_column_materialized;
nested_column = getNestedColumn().convertToFullColumnIfConst();

if (!getNestedColumn().canBeInsideNullable())
throw Exception{getNestedColumn().getName() + " cannot be inside Nullable column", ErrorCodes::ILLEGAL_COLUMN};
@@ -5,6 +5,8 @@
#include <Columns/ColumnsCommon.h>
#include <DataStreams/ColumnGathererStream.h>

#include <common/unaligned.h>


namespace DB
{

@@ -146,7 +148,7 @@ ColumnPtr ColumnString::permute(const Permutation & perm, size_t limit) const
for (size_t i = 0; i < limit; ++i)
{
size_t j = perm[i];
size_t string_offset = j == 0 ? 0 : offsets[j - 1];
size_t string_offset = offsets[j - 1];
size_t string_size = offsets[j] - string_offset;

memcpySmallAllowReadWriteOverflow15(&res_chars[current_new_offset], &chars[string_offset], string_size);

@@ -176,7 +178,7 @@ StringRef ColumnString::serializeValueIntoArena(size_t n, Arena & arena, char co

const char * ColumnString::deserializeAndInsertFromArena(const char * pos)
{
const size_t string_size = *reinterpret_cast<const size_t *>(pos);
const size_t string_size = unalignedLoad<size_t>(pos);
pos += sizeof(string_size);

const size_t old_size = chars.size();

@@ -217,7 +219,7 @@ ColumnPtr ColumnString::indexImpl(const PaddedPODArray<Type> & indexes, size_t l
for (size_t i = 0; i < limit; ++i)
{
size_t j = indexes[i];
size_t string_offset = j == 0 ? 0 : offsets[j - 1];
size_t string_offset = offsets[j - 1];
size_t string_size = offsets[j] - string_offset;

memcpySmallAllowReadWriteOverflow15(&res_chars[current_new_offset], &chars[string_offset], string_size);
@@ -31,10 +31,10 @@ private:
/// For convenience, every string ends with terminating zero byte. Note that strings could contain zero bytes in the middle.
Chars chars;

size_t ALWAYS_INLINE offsetAt(size_t i) const { return i == 0 ? 0 : offsets[i - 1]; }
size_t ALWAYS_INLINE offsetAt(ssize_t i) const { return offsets[i - 1]; }

/// Size of i-th element, including terminating zero.
size_t ALWAYS_INLINE sizeAt(size_t i) const { return i == 0 ? offsets[0] : (offsets[i] - offsets[i - 1]); }
size_t ALWAYS_INLINE sizeAt(ssize_t i) const { return offsets[i] - offsets[i - 1]; }

template <bool positive>
struct less;

@@ -153,12 +153,14 @@ public:
const size_t new_size = old_size + length + 1;

chars.resize(new_size);
memcpy(&chars[old_size], pos, length);
if (length)
memcpy(&chars[old_size], pos, length);
chars[old_size + length] = 0;
offsets.push_back(new_size);
}
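The new `if (length)` guard is not cosmetic: passing a null pointer to memcpy is undefined behaviour even with a count of zero, and a zero-length insert may arrive with pos == nullptr. Minimal illustration:

#include <cstring>

void safeAppend(char * dst, const char * src, size_t length)
{
    if (length)                        /// memcpy(dst, nullptr, 0) is UB per the C standard
        std::memcpy(dst, src, length);
}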
void insertDataWithTerminatingZero(const char * pos, size_t length) override
/// Like getData, but inserting data should be zero-ending (i.e. length is 1 byte greater than real string size).
void insertDataWithTerminatingZero(const char * pos, size_t length)
{
const size_t old_size = chars.size();
const size_t new_size = old_size + length;

@@ -202,7 +204,7 @@ public:
void insertDefault() override
{
chars.push_back(0);
offsets.push_back(offsets.size() == 0 ? 1 : (offsets.back() + 1));
offsets.push_back(offsets.back() + 1);
}

int compareAt(size_t n, size_t m, const IColumn & rhs_, int /*nan_direction_hint*/) const override
@@ -13,6 +13,9 @@
#include <Common/typeid_cast.h>
#include <ext/range.h>

#include <common/unaligned.h>


namespace DB
{

@@ -44,7 +47,6 @@ public:
IColumnUnique::IndexesWithOverflow uniqueInsertRangeWithOverflow(const IColumn & src, size_t start, size_t length,
size_t max_dictionary_size) override;
size_t uniqueInsertData(const char * pos, size_t length) override;
size_t uniqueInsertDataWithTerminatingZero(const char * pos, size_t length) override;
size_t uniqueDeserializeAndInsertFromArena(const char * pos, const char *& new_pos) override;

size_t getDefaultValueIndex() const override { return 0; }

@@ -100,6 +102,7 @@ private:

ColumnPtr column_holder;
bool is_nullable;
size_t size_of_value_if_fixed = 0;
ReverseIndex<UInt64, ColumnType> index;

/// For DataTypeNullable, stores null map.

@@ -151,6 +154,7 @@ template <typename ColumnType>
ColumnUnique<ColumnType>::ColumnUnique(const ColumnUnique & other)
: column_holder(other.column_holder)
, is_nullable(other.is_nullable)
, size_of_value_if_fixed (other.size_of_value_if_fixed)
, index(numSpecialValues(is_nullable), 0)
{
index.setColumn(getRawColumnPtr());

@@ -166,6 +170,9 @@ ColumnUnique<ColumnType>::ColumnUnique(const IDataType & type)
column_holder = holder_type.createColumn()->cloneResized(numSpecialValues());
index.setColumn(getRawColumnPtr());
createNullMask();

if (column_holder->valuesHaveFixedSize())
size_of_value_if_fixed = column_holder->sizeOfValueIfFixed();
}

template <typename ColumnType>

@@ -181,6 +188,9 @@ ColumnUnique<ColumnType>::ColumnUnique(MutableColumnPtr && holder, bool is_nulla

index.setColumn(getRawColumnPtr());
createNullMask();

if (column_holder->valuesHaveFixedSize())
size_of_value_if_fixed = column_holder->sizeOfValueIfFixed();
}

template <typename ColumnType>
@@ -243,20 +253,11 @@ size_t ColumnUnique<ColumnType>::uniqueInsert(const Field & x)
if (x.getType() == Field::Types::Null)
return getNullValueIndex();

auto column = getRawColumnPtr();
auto prev_size = static_cast<UInt64>(column->size());
if (size_of_value_if_fixed)
return uniqueInsertData(&x.get<char>(), size_of_value_if_fixed);

if ((*column)[getNestedTypeDefaultValueIndex()] == x)
return getNestedTypeDefaultValueIndex();

column->insert(x);
auto pos = index.insert(prev_size);
if (pos != prev_size)
column->popBack(1);

updateNullMask();

return pos;
auto & val = x.get<String>();
return uniqueInsertData(val.data(), val.size());
}

template <typename ColumnType>

@@ -280,50 +281,13 @@ size_t ColumnUnique<ColumnType>::uniqueInsertData(const char * pos, size_t lengt
if (column->getDataAt(getNestedTypeDefaultValueIndex()) == StringRef(pos, length))
return getNestedTypeDefaultValueIndex();

UInt64 size = column->size();
UInt64 insertion_point = index.getInsertionPoint(StringRef(pos, length));

if (insertion_point == size)
{
column->insertData(pos, length);
index.insertFromLastRow();
}
auto insertion_point = index.insert(StringRef(pos, length));

updateNullMask();

return insertion_point;
}

template <typename ColumnType>
size_t ColumnUnique<ColumnType>::uniqueInsertDataWithTerminatingZero(const char * pos, size_t length)
{
if (std::is_same<ColumnType, ColumnString>::value)
return uniqueInsertData(pos, length - 1);

if (column_holder->valuesHaveFixedSize())
return uniqueInsertData(pos, length);

/// Don't know if data actually has terminating zero. So, insert it firstly.

auto column = getRawColumnPtr();
size_t prev_size = column->size();
column->insertDataWithTerminatingZero(pos, length);

if (column->compareAt(getNestedTypeDefaultValueIndex(), prev_size, *column, 1) == 0)
{
column->popBack(1);
return getNestedTypeDefaultValueIndex();
}

auto position = index.insert(prev_size);
if (position != prev_size)
column->popBack(1);

updateNullMask();

return static_cast<size_t>(position);
}

template <typename ColumnType>
StringRef ColumnUnique<ColumnType>::serializeValueIntoArena(size_t n, Arena & arena, char const *& begin) const
{
@@ -362,23 +326,20 @@ size_t ColumnUnique<ColumnType>::uniqueDeserializeAndInsertFromArena(const char
}
}

auto column = getRawColumnPtr();
size_t prev_size = column->size();
new_pos = column->deserializeAndInsertFromArena(pos);

if (column->compareAt(getNestedTypeDefaultValueIndex(), prev_size, *column, 1) == 0)
/// Numbers, FixedString
if (size_of_value_if_fixed)
{
column->popBack(1);
return getNestedTypeDefaultValueIndex();
new_pos = pos + size_of_value_if_fixed;
return uniqueInsertData(pos, size_of_value_if_fixed);
}

auto index_pos = index.insert(prev_size);
if (index_pos != prev_size)
column->popBack(1);
/// String
const size_t string_size = unalignedLoad<size_t>(pos);
pos += sizeof(string_size);
new_pos = pos + string_size;

updateNullMask();

return static_cast<size_t>(index_pos);
/// -1 because of terminating zero
return uniqueInsertData(pos, string_size - 1);
}
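The rewritten deserialization assumes the layout produced by serializeValueIntoArena: fixed-size values are stored as raw bytes, while strings are a size_t length that counts the terminating zero, followed by the bytes. A sketch of the matching writer side, under that assumption and with illustrative names:

#include <cstring>
#include <string>
#include <vector>

void serializeString(std::vector<char> & arena, const std::string & s)
{
    size_t size = s.size() + 1;                            /// +1 for the terminating zero
    const char * p = reinterpret_cast<const char *>(&size);
    arena.insert(arena.end(), p, p + sizeof(size));        /// length prefix
    arena.insert(arena.end(), s.data(), s.data() + s.size());
    arena.push_back('\0');                                 /// counted by the prefix above
}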
template <typename ColumnType>

@@ -482,20 +443,14 @@ MutableColumnPtr ColumnUnique<ColumnType>::uniqueInsertRangeImpl(
if (secondary_index)
next_position += secondary_index->size();

auto check_inserted_position = [&next_position](UInt64 inserted_position)
auto insert_key = [&](const StringRef & ref, ReverseIndex<UInt64, ColumnType> & cur_index) -> MutableColumnPtr
{
if (inserted_position != next_position)
throw Exception("Inserted position " + toString(inserted_position)
+ " is not equal with expected " + toString(next_position), ErrorCodes::LOGICAL_ERROR);
};
auto inserted_pos = cur_index.insert(ref);
positions[num_added_rows] = inserted_pos;
if (inserted_pos == next_position)
return update_position(next_position);

auto insert_key = [&](const StringRef & ref, ReverseIndex<UInt64, ColumnType> * cur_index)
{
positions[num_added_rows] = next_position;
cur_index->getColumn()->insertData(ref.data, ref.size);
auto inserted_pos = cur_index->insertFromLastRow();
check_inserted_position(inserted_pos);
return update_position(next_position);
return nullptr;
};

for (; num_added_rows < length; ++num_added_rows)

@@ -509,29 +464,21 @@ MutableColumnPtr ColumnUnique<ColumnType>::uniqueInsertRangeImpl(
else
{
auto ref = src_column->getDataAt(row);
auto cur_index = &index;
bool inserted = false;
MutableColumnPtr res = nullptr;

while (!inserted)
if (secondary_index && next_position >= max_dictionary_size)
{
auto insertion_point = cur_index->getInsertionPoint(ref);

if (insertion_point == cur_index->lastInsertionPoint())
{
if (secondary_index && cur_index != secondary_index && next_position >= max_dictionary_size)
{
cur_index = secondary_index;
continue;
}

if (auto res = insert_key(ref, cur_index))
return res;
}
auto insertion_point = index.getInsertionPoint(ref);
if (insertion_point == index.lastInsertionPoint())
res = insert_key(ref, *secondary_index);
else
positions[num_added_rows] = insertion_point;

inserted = true;
positions[num_added_rows] = insertion_point;
}
else
res = insert_key(ref, index);

if (res)
return res;
}
}
@@ -1,8 +1,9 @@
#pragma once

#include <cmath>

#include <Columns/IColumn.h>
#include <Columns/ColumnVectorHelper.h>
#include <common/unaligned.h>


namespace DB

@@ -86,47 +87,16 @@ template <> struct CompareHelper<Float32> : public FloatCompareHelper<Float32> {
template <> struct CompareHelper<Float64> : public FloatCompareHelper<Float64> {};


/** To implement `get64` function.
*/
template <typename T>
inline UInt64 unionCastToUInt64(T x) { return x; }

template <> inline UInt64 unionCastToUInt64(Float64 x)
{
union
{
Float64 src;
UInt64 res;
};

src = x;
return res;
}

template <> inline UInt64 unionCastToUInt64(Float32 x)
{
union
{
Float32 src;
UInt64 res;
};

res = 0;
src = x;
return res;
}
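The deleted unionCastToUInt64 helpers relied on union type punning, which is undefined behaviour in C++ (only the last-written union member may be read). The portable replacement is the memcpy idiom; a sketch of an equivalent, assuming callers were migrated to something like it (not shown in this hunk):

#include <cstdint>
#include <cstring>

inline uint64_t bitCastToUInt64(double x)
{
    uint64_t result = 0;
    std::memcpy(&result, &x, sizeof(x));   /// compilers fold this into a single register move
    return result;
}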
/** A template for columns that use a simple array to store.
*/
template <typename T>
class ColumnVector final : public COWPtrHelper<IColumn, ColumnVector<T>>
class ColumnVector final : public COWPtrHelper<ColumnVectorHelper, ColumnVector<T>>
{
static_assert(!IsDecimalNumber<T>);

private:
using Self = ColumnVector;
friend class COWPtrHelper<IColumn, Self>;
friend class COWPtrHelper<ColumnVectorHelper, Self>;

struct less;
struct greater;

@@ -164,7 +134,7 @@ public:

void insertData(const char * pos, size_t /*length*/) override
{
data.push_back(*reinterpret_cast<const T *>(pos));
data.push_back(unalignedLoad<T>(pos));
}

void insertDefault() override
dbms/src/Columns/ColumnVectorHelper.h (new file, 39 lines)
@@ -0,0 +1,39 @@
#pragma once

#include <Columns/IColumn.h>


namespace DB
{

/** Allows to access internal array of ColumnVector or ColumnFixedString without cast to concrete type.
* We will inherit ColumnVector and ColumnFixedString from this class instead of IColumn.
* Assumes data layout of ColumnVector, ColumnFixedString and PODArray.
*
* Why it is needed?
*
* There are some algorithms that specialize on the size of data type but doesn't care about concrete type.
* The same specialization may work for UInt64, Int64, Float64, FixedString(8), if it only does byte moving and hashing.
* To avoid code bloat and compile time increase, we can use single template instantiation for these cases
* and just static_cast pointer to some single column type (e. g. ColumnUInt64) assuming that all types have identical memory layout.
*
* But this static_cast (downcast to unrelated type) is illegal according to the C++ standard and UBSan warns about it.
* To allow functional tests to work under UBSan we have to separate some base class that will present the memory layout in explicit way,
* and we will do static_cast to this class.
*/
class ColumnVectorHelper : public IColumn
{
public:
const char * getRawDataBegin() const
{
return *reinterpret_cast<const char * const *>(reinterpret_cast<const char *>(this) + sizeof(*this));
}

template <size_t ELEMENT_SIZE>
void insertRawData(const char * ptr)
{
return reinterpret_cast<PODArrayBase<ELEMENT_SIZE, 4096, Allocator<false>, 15, 16> *>(reinterpret_cast<char *>(this) + sizeof(*this))->push_back_raw(ptr);
}
};

}
@@ -45,7 +45,7 @@ public:
/** If column isn't constant, returns nullptr (or itself).
* If column is constant, transforms constant to full column (if column type allows such tranform) and return it.
*/
virtual Ptr convertToFullColumnIfConst() const { return {}; }
virtual Ptr convertToFullColumnIfConst() const { return getPtr(); }

/// If column isn't ColumnLowCardinality, return itself.
/// If column is ColumnLowCardinality, transforms is to full column.

@@ -143,13 +143,6 @@ public:
/// Parameter length could be ignored if column values have fixed size.
virtual void insertData(const char * pos, size_t length) = 0;

/// Like getData, but has special behavior for columns that contain variable-length strings.
/// In this special case inserting data should be zero-ending (i.e. length is 1 byte greater than real string size).
virtual void insertDataWithTerminatingZero(const char * pos, size_t length)
{
insertData(pos, length);
}

/// Appends "default value".
/// Is used when there are need to increase column size, but inserting value doesn't make sense.
/// For example, ColumnNullable(Nested) absolutely ignores values of nested column if it is marked as NULL.
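With getPtr() as the default, convertToFullColumnIfConst never returns null, so call sites can drop their null checks (the ColumnNullable hunk earlier in this diff is one such simplification). A toy model of the new contract, with stand-in names:

#include <memory>

struct Column;
using ColumnPtr = std::shared_ptr<Column>;

struct Column : std::enable_shared_from_this<Column>
{
    virtual ~Column() = default;

    /// Old contract: return nullptr unless the column was constant.
    /// New contract: always return a usable column (itself when already full).
    virtual ColumnPtr convertToFullColumnIfConst() const
    {
        return std::const_pointer_cast<Column>(shared_from_this());
    }
};

void materialize(ColumnPtr & column)
{
    column = column->convertToFullColumnIfConst();   /// no null check needed any more
}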
@@ -107,7 +107,7 @@ public:
if (s != offsets.size())
throw Exception("Size of offsets doesn't match size of column.", ErrorCodes::SIZES_OF_COLUMNS_DOESNT_MATCH);

return cloneDummy(s == 0 ? 0 : offsets.back());
return cloneDummy(offsets.back());
}

MutableColumns scatter(ColumnIndex num_columns, const Selector & selector) const override
@@ -51,7 +51,6 @@ public:
/// Is used to optimize some computations (in aggregation, for example).
/// Parameter length could be ignored if column values have fixed size.
virtual size_t uniqueInsertData(const char * pos, size_t length) = 0;
virtual size_t uniqueInsertDataWithTerminatingZero(const char * pos, size_t length) = 0;

virtual size_t getDefaultValueIndex() const = 0; /// Nullable ? getNullValueIndex : getNestedTypeDefaultValueIndex
virtual size_t getNullValueIndex() const = 0; /// Throws if not nullable.
@@ -6,6 +6,8 @@
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <ext/range.h>
#include <common/unaligned.h>


namespace DB
{

@@ -56,32 +58,15 @@ namespace
};


template <typename Hash>
struct ReverseIndexHash : public Hash
struct ReverseIndexHash
{
template <typename T>
size_t operator()(T) const
{
throw Exception("operator()(key) is not implemented for ReverseIndexHash.", ErrorCodes::LOGICAL_ERROR);
}

template <typename State, typename T>
size_t operator()(const State & state, T key) const
{
auto index = key;
if constexpr (State::has_base_index)
index -= state.base_index;

return Hash::operator()(state.index_column->getElement(index));
}
};

using ReverseIndexStringHash = ReverseIndexHash<StringRefHash>;

template <typename IndexType>
using ReverseIndexNumberHash = ReverseIndexHash<DefaultHash<IndexType>>;


template <typename IndexType, typename Hash, typename HashTable, typename ColumnType, bool string_hash, bool has_base_index>
struct ReverseIndexHashTableCell
: public HashTableCell<IndexType, Hash, ReverseIndexHashTableState<ColumnType, string_hash, has_base_index>>

@@ -99,6 +84,7 @@ namespace
static_assert(!std::is_same_v<typename std::decay<T>::type, typename std::decay<IndexType>::type>);
return false;
}

/// Special case when we want to compare with something not in index_column.
/// When we compare something inside column default keyEquals checks only that row numbers are equal.
bool keyEquals(const StringRef & object, size_t hash_ [[maybe_unused]], const State & state) const

@@ -126,7 +112,11 @@ namespace
if constexpr (string_hash)
return (*state.saved_hash_column)[index];
else
return hash(state, key);
{
using ValueType = typename ColumnType::value_type;
ValueType value = unalignedLoad<ValueType>(state.index_column->getDataAt(index).data);
return DefaultHash<ValueType>()(value);
}
}
};

@@ -147,28 +137,28 @@ namespace
IndexType,
ReverseIndexHashTableCell<
IndexType,
ReverseIndexStringHash,
ReverseIndexHash,
ReverseIndexStringHashTable<IndexType, ColumnType, has_base_index>,
ColumnType,
true,
has_base_index>,
ReverseIndexStringHash>
ReverseIndexHash>
{
using Base = HashTableWithPublicState<
IndexType,
ReverseIndexHashTableCell<
IndexType,
ReverseIndexStringHash,
ReverseIndexHash,
ReverseIndexStringHashTable<IndexType, ColumnType, has_base_index>,
ColumnType,
true,
has_base_index>,
ReverseIndexStringHash>;
ReverseIndexHash>;
public:
using Base::Base;
friend struct ReverseIndexHashTableCell<
IndexType,
ReverseIndexStringHash,
ReverseIndexHash,
ReverseIndexStringHashTable<IndexType, ColumnType, has_base_index>,
ColumnType,
true,

@@ -180,28 +170,28 @@ namespace
IndexType,
ReverseIndexHashTableCell<
IndexType,
ReverseIndexNumberHash<typename ColumnType::value_type>,
ReverseIndexHash,
ReverseIndexNumberHashTable<IndexType, ColumnType, has_base_index>,
ColumnType,
false,
has_base_index>,
ReverseIndexNumberHash<typename ColumnType::value_type>>
ReverseIndexHash>
{
using Base = HashTableWithPublicState<
IndexType,
ReverseIndexHashTableCell<
IndexType,
ReverseIndexNumberHash<typename ColumnType::value_type>,
ReverseIndexHash,
ReverseIndexNumberHashTable<IndexType, ColumnType, has_base_index>,
ColumnType,
false,
has_base_index>,
ReverseIndexNumberHash<typename ColumnType::value_type>>;
ReverseIndexHash>;
public:
using Base::Base;
friend struct ReverseIndexHashTableCell<
IndexType,
ReverseIndexNumberHash<typename ColumnType::value_type>,
ReverseIndexHash,
ReverseIndexNumberHashTable<IndexType, ColumnType, has_base_index>,
ColumnType,
false,

@@ -253,8 +243,7 @@ public:
static constexpr bool is_numeric_column = isNumericColumn(static_cast<ColumnType *>(nullptr));
static constexpr bool use_saved_hash = !is_numeric_column;

UInt64 insert(UInt64 from_position); /// Insert into index column[from_position];
UInt64 insertFromLastRow();
UInt64 insert(const StringRef & data);
UInt64 getInsertionPoint(const StringRef & data);
UInt64 lastInsertionPoint() const { return size() + base_index; }

@@ -302,7 +291,7 @@ private:
if constexpr (is_numeric_column)
{
using ValueType = typename ColumnType::value_type;
ValueType value = *reinterpret_cast<const ValueType *>(ref.data);
ValueType value = unalignedLoad<ValueType>(ref.data);
return DefaultHash<ValueType>()(value);
}
else

@@ -367,7 +356,7 @@ void ReverseIndex<IndexType, ColumnType>::buildIndex()
else
hash = getHash(column->getDataAt(row));

index->emplace(row + base_index, iterator, inserted, hash);
index->emplace(row + base_index, iterator, inserted, hash, column->getDataAt(row));

if (!inserted)
throw Exception("Duplicating keys found in ReverseIndex.", ErrorCodes::LOGICAL_ERROR);

@@ -390,7 +379,7 @@ ColumnUInt64::MutablePtr ReverseIndex<IndexType, ColumnType>::calcHashes() const
}

template <typename IndexType, typename ColumnType>
UInt64 ReverseIndex<IndexType, ColumnType>::insert(UInt64 from_position)
UInt64 ReverseIndex<IndexType, ColumnType>::insert(const StringRef & data)
{
if (!index)
buildIndex();

@@ -399,42 +388,35 @@ UInt64 ReverseIndex<IndexType, ColumnType>::insert(UInt64 from_position)
IteratorType iterator;
bool inserted;

auto hash = getHash(column->getDataAt(from_position));
auto hash = getHash(data);
UInt64 num_rows = size();

if constexpr (use_saved_hash)
{
auto & data = saved_hash->getData();
if (data.size() <= from_position)
data.resize(from_position + 1);
data[from_position] = hash;
if (data.size() <= num_rows)
data.resize(num_rows + 1);
data[num_rows] = hash;
}
else
column->insertData(data.data, data.size);

index->emplace(num_rows + base_index, iterator, inserted, hash, data);

if constexpr (use_saved_hash)
{
if (inserted)
column->insertData(data.data, data.size);
}
else
{
if (!inserted)
column->popBack(1);
}

index->emplace(from_position + base_index, iterator, inserted, hash);

return *iterator;
}

template <typename IndexType, typename ColumnType>
UInt64 ReverseIndex<IndexType, ColumnType>::insertFromLastRow()
{
if (!column)
throw Exception("ReverseIndex can't insert row from column because index column wasn't set.",
ErrorCodes::LOGICAL_ERROR);

UInt64 num_rows = size();

if (num_rows == 0)
throw Exception("ReverseIndex can't insert row from column because it is empty.", ErrorCodes::LOGICAL_ERROR);

UInt64 position = num_rows - 1;
UInt64 inserted_pos = insert(position);
if (position + base_index != inserted_pos)
throw Exception("Can't insert into reverse index from last row (" + toString(position + base_index)
+ ") because the same row is in position " + toString(inserted_pos), ErrorCodes::LOGICAL_ERROR);

return inserted_pos;
}

template <typename IndexType, typename ColumnType>
UInt64 ReverseIndex<IndexType, ColumnType>::getInsertionPoint(const StringRef & data)
{
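The net effect of this ReverseIndex rework is that insert() now takes the value itself and deduplicates-or-appends in one step. A much simplified stand-in for the data structure, showing only the contract the callers above rely on:

#include <string>
#include <unordered_map>
#include <vector>

struct SimpleReverseIndex
{
    std::vector<std::string> dictionary;            /// the unique values, in insertion order
    std::unordered_map<std::string, size_t> index;  /// value -> position in dictionary

    /// Returns the position of value; appends it first if it was not seen yet.
    size_t insert(const std::string & value)
    {
        auto [it, inserted] = index.try_emplace(value, dictionary.size());
        if (inserted)
            dictionary.push_back(value);            /// first occurrence
        return it->second;
    }
};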
dbms/src/Columns/tests/CMakeLists.txt (new file, 4 lines)
@@ -0,0 +1,4 @@
set(SRCS)

add_executable (column_unique column_unique.cpp ${SRCS})
target_link_libraries (column_unique PRIVATE dbms gtest_main)
dbms/src/Columns/tests/column_unique.cpp (new file, 193 lines)
@@ -0,0 +1,193 @@
#include <Columns/ColumnUnique.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnNullable.h>

#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeNullable.h>

#pragma GCC diagnostic ignored "-Wsign-compare"
#ifdef __clang__
#pragma clang diagnostic ignored "-Wzero-as-null-pointer-constant"
#endif

#include <gtest/gtest.h>

#include <unordered_map>
#include <vector>

using namespace DB;

TEST(column_unique, column_unique_unique_insert_range_Test)
{
std::unordered_map<String, size_t> ref_map;
auto data_type = std::make_shared<DataTypeString>();
auto column_unique = ColumnUnique<ColumnString>::create(*data_type);
auto column_string = ColumnString::create();

size_t num_values = 1000000;
size_t mod_to = 1000;

std::vector<size_t> indexes(num_values);
for (size_t i = 0; i < num_values; ++i)
{
String str = toString(i % mod_to);
column_string->insertData(str.data(), str.size());

if (ref_map.count(str) == 0)
ref_map[str] = ref_map.size();

indexes[i]= ref_map[str];
}

auto idx = column_unique->uniqueInsertRangeFrom(*column_string, 0, num_values);
ASSERT_EQ(idx->size(), num_values);

for (size_t i = 0; i < num_values; ++i)
{
ASSERT_EQ(indexes[i] + 1, idx->getUInt(i)) << "Different indexes at position " << i;
}

auto & nested = column_unique->getNestedColumn();
ASSERT_EQ(nested->size(), mod_to + 1);

for (size_t i = 0; i < mod_to; ++i)
{
ASSERT_EQ(std::to_string(i), nested->getDataAt(i + 1).toString());
}
}

TEST(column_unique, column_unique_unique_insert_range_with_overflow_Test)
{
std::unordered_map<String, size_t> ref_map;
auto data_type = std::make_shared<DataTypeString>();
auto column_unique = ColumnUnique<ColumnString>::create(*data_type);
auto column_string = ColumnString::create();

size_t num_values = 1000000;
size_t mod_to = 1000;

std::vector<size_t> indexes(num_values);
for (size_t i = 0; i < num_values; ++i)
{
String str = toString(i % mod_to);
column_string->insertData(str.data(), str.size());

if (ref_map.count(str) == 0)
ref_map[str] = ref_map.size();

indexes[i]= ref_map[str];
}

size_t max_val = mod_to / 2;
size_t max_dict_size = max_val + 1;
auto idx_with_overflow = column_unique->uniqueInsertRangeWithOverflow(*column_string, 0, num_values, max_dict_size);
auto & idx = idx_with_overflow.indexes;
auto & add_keys = idx_with_overflow.overflowed_keys;

ASSERT_EQ(idx->size(), num_values);

for (size_t i = 0; i < num_values; ++i)
{
ASSERT_EQ(indexes[i] + 1, idx->getUInt(i)) << "Different indexes at position " << i;
}

auto & nested = column_unique->getNestedColumn();
ASSERT_EQ(nested->size(), max_dict_size);
ASSERT_EQ(add_keys->size(), mod_to - max_val);

for (size_t i = 0; i < max_val; ++i)
{
ASSERT_EQ(std::to_string(i), nested->getDataAt(i + 1).toString());
}

for (size_t i = 0; i < mod_to - max_val; ++i)
{
ASSERT_EQ(std::to_string(max_val + i), add_keys->getDataAt(i).toString());
}
}

template <typename ColumnType>
void column_unique_unique_deserialize_from_arena_impl(ColumnType & column, const IDataType & data_type)
{
size_t num_values = column.size();

{
/// Check serialization is reversible.
Arena arena;
auto column_unique_pattern = ColumnUnique<ColumnString>::create(data_type);
auto column_unique = ColumnUnique<ColumnString>::create(data_type);
auto idx = column_unique_pattern->uniqueInsertRangeFrom(column, 0, num_values);

const char * pos = nullptr;
for (size_t i = 0; i < num_values; ++i)
{
auto ref = column_unique_pattern->serializeValueIntoArena(idx->getUInt(i), arena, pos);
const char * new_pos;
column_unique->uniqueDeserializeAndInsertFromArena(ref.data, new_pos);
ASSERT_EQ(new_pos - ref.data, ref.size) << "Deserialized data has different sizes at position " << i;

ASSERT_EQ(column_unique_pattern->getNestedNotNullableColumn()->getDataAt(idx->getUInt(i)),
column_unique->getNestedNotNullableColumn()->getDataAt(idx->getUInt(i)))
<< "Deserialized data is different from pattern at position " << i;

}
}

{
/// Check serialization the same with ordinary column.
Arena arena_string;
Arena arena_lc;
auto column_unique = ColumnUnique<ColumnString>::create(data_type);
auto idx = column_unique->uniqueInsertRangeFrom(column, 0, num_values);

const char * pos_string = nullptr;
const char * pos_lc = nullptr;
for (size_t i = 0; i < num_values; ++i)
{
auto ref_string = column.serializeValueIntoArena(i, arena_string, pos_string);
auto ref_lc = column_unique->serializeValueIntoArena(idx->getUInt(i), arena_lc, pos_lc);
ASSERT_EQ(ref_string, ref_lc) << "Serialized data is different from pattern at position " << i;
}
}
}

TEST(column_unique, column_unique_unique_deserialize_from_arena_String_Test)
{
auto data_type = std::make_shared<DataTypeString>();
auto column_string = ColumnString::create();

size_t num_values = 1000000;
size_t mod_to = 1000;

std::vector<size_t> indexes(num_values);
for (size_t i = 0; i < num_values; ++i)
{
String str = toString(i % mod_to);
column_string->insertData(str.data(), str.size());
}

column_unique_unique_deserialize_from_arena_impl(*column_string, *data_type);
}

TEST(column_unique, column_unique_unique_deserialize_from_arena_Nullable_String_Test)
{
auto data_type = std::make_shared<DataTypeNullable>(std::make_shared<DataTypeString>());
auto column_string = ColumnString::create();
auto null_mask = ColumnUInt8::create();

size_t num_values = 1000000;
size_t mod_to = 1000;

std::vector<size_t> indexes(num_values);
for (size_t i = 0; i < num_values; ++i)
{
String str = toString(i % mod_to);
column_string->insertData(str.data(), str.size());

null_mask->insertValue(i % 3 ? 1 : 0);
}

auto column = ColumnNullable::create(std::move(column_string), std::move(null_mask));
column_unique_unique_deserialize_from_arena_impl(*column, *data_type);
}
@@ -48,6 +48,7 @@ protected:
#endif

/** Allocator with optimization to place small memory ranges in automatic memory.
* TODO alignment
*/
template <typename Base, size_t N = 64>
class AllocatorWithStackMemory : private Base
@@ -36,7 +36,7 @@ private:
static constexpr size_t pad_right = 15;

/// Contiguous chunk of memory and pointer to free space inside it. Member of single-linked list.
struct Chunk : private Allocator<false> /// empty base optimization
struct alignas(16) Chunk : private Allocator<false> /// empty base optimization
{
char * begin;
char * pos;

@@ -149,6 +149,12 @@ public:
} while (true);
}

template <typename T>
T * alloc()
{
return reinterpret_cast<T *>(alignedAlloc(sizeof(T), alignof(T)));
}

/** Rollback just performed allocation.
* Must pass size not more that was just allocated.
*/
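Usage sketch for the typed helper added above, assuming this Arena header; Node is an illustrative type:

#include <Common/Arena.h>
#include <cstdint>

struct Node
{
    uint64_t key;
    Node * next;
};

/// Equivalent to arena.alignedAlloc(sizeof(Node), alignof(Node)), without
/// spelling the size and alignment by hand.
Node * allocateNode(DB::Arena & arena)
{
    return arena.alloc<Node>();
}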
@@ -38,6 +38,7 @@ public:
};


/// Allocates in Arena with proper alignment.
template <size_t alignment>
class AlignedArenaAllocator
{

@@ -69,14 +70,14 @@ public:


/// Switches to ordinary Allocator after REAL_ALLOCATION_TRESHOLD bytes to avoid fragmentation and trash in Arena.
template <size_t REAL_ALLOCATION_TRESHOLD = 4096, typename TRealAllocator = Allocator<false>, typename TArenaAllocator = ArenaAllocator>
template <size_t REAL_ALLOCATION_TRESHOLD = 4096, typename TRealAllocator = Allocator<false>, typename TArenaAllocator = ArenaAllocator, size_t alignment = 0>
class MixedArenaAllocator : private TRealAllocator
{
public:

void * alloc(size_t size, Arena * arena)
{
return (size < REAL_ALLOCATION_TRESHOLD) ? TArenaAllocator::alloc(size, arena) : TRealAllocator::alloc(size);
return (size < REAL_ALLOCATION_TRESHOLD) ? TArenaAllocator::alloc(size, arena) : TRealAllocator::alloc(size, alignment);
}

void * realloc(void * buf, size_t old_size, size_t new_size, Arena * arena)

@@ -87,9 +88,9 @@ public:
return TArenaAllocator::realloc(buf, old_size, new_size, arena);

if (old_size >= REAL_ALLOCATION_TRESHOLD)
return TRealAllocator::realloc(buf, old_size, new_size);
return TRealAllocator::realloc(buf, old_size, new_size, alignment);

void * new_buf = TRealAllocator::alloc(new_size);
void * new_buf = TRealAllocator::alloc(new_size, alignment);
memcpy(new_buf, buf, old_size);
return new_buf;
}

@@ -103,11 +104,11 @@ public:


template <size_t alignment, size_t REAL_ALLOCATION_TRESHOLD = 4096>
using MixedAlignedArenaAllocator = MixedArenaAllocator<REAL_ALLOCATION_TRESHOLD, Allocator<false>, AlignedArenaAllocator<alignment>>;
using MixedAlignedArenaAllocator = MixedArenaAllocator<REAL_ALLOCATION_TRESHOLD, Allocator<false>, AlignedArenaAllocator<alignment>, alignment>;


template <size_t N = 64, typename Base = ArenaAllocator>
class ArenaAllocatorWithStackMemoty : public Base
class ArenaAllocatorWithStackMemory : public Base
{
char stack_memory[N];
@@ -55,26 +55,6 @@ public:
         return locus;
     }

-    void readText(ReadBuffer & in)
-    {
-        for (size_t i = 0; i < BITSET_SIZE; ++i)
-        {
-            if (i != 0)
-                assertChar(',', in);
-            readIntText(bitset[i], in);
-        }
-    }
-
-    void writeText(WriteBuffer & out) const
-    {
-        for (size_t i = 0; i < BITSET_SIZE; ++i)
-        {
-            if (i != 0)
-                writeCString(",", out);
-            writeIntText(bitset[i], out);
-        }
-    }
-
 private:
     /// number of bytes in bitset
     static constexpr size_t BITSET_SIZE = (static_cast<size_t>(bucket_count) * content_width + 7) / 8;

@@ -165,7 +145,9 @@ private:
     bool fits_in_byte;
 };

-/** The `Locus` structure contains the necessary information to find for each cell
+/** TODO This code looks very suboptimal.
+ *
+ * The `Locus` structure contains the necessary information to find for each cell
  * the corresponding byte and offset, in bits, from the beginning of the cell. Since in general
  * case the size of one byte is not divisible by the size of one cell, cases possible
  * when one cell overlaps two bytes. Therefore, the `Locus` structure contains two

@@ -219,13 +201,20 @@ private:

     void ALWAYS_INLINE init(BucketIndex bucket_index)
     {
+        /// offset in bits to the leftmost bit
         size_t l = static_cast<size_t>(bucket_index) * content_width;
-        index_l = l >> 3;
-        offset_l = l & 7;
-
-        size_t r = static_cast<size_t>(bucket_index + 1) * content_width;
-        index_r = r >> 3;
-        offset_r = r & 7;
+
+        /// offset of byte that contains the leftmost bit
+        index_l = l / 8;
+
+        /// offset in bits to the leftmost bit at that byte
+        offset_l = l % 8;
+
+        /// offset of byte that contains the rightmost bit
+        index_r = (l + content_width - 1) / 8;
+
+        /// offset in bits to the next to the rightmost bit at that byte; or zero if the rightmost bit is the rightmost bit in that byte.
+        offset_r = (l + content_width) % 8;
     }

     UInt8 ALWAYS_INLINE read(UInt8 value_l) const
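A quick worked example of the Locus arithmetic above, assuming content_width = 3 bits per cell (variable names are illustrative, not part of the patch):

#include <cstddef>
#include <cstdio>

int main()
{
    const size_t content_width = 3;   /// bits per cell
    for (size_t bucket_index = 0; bucket_index < 4; ++bucket_index)
    {
        size_t l = bucket_index * content_width;        /// bit offset of the cell's leftmost bit
        size_t index_l = l / 8;                         /// byte holding the leftmost bit
        size_t offset_l = l % 8;                        /// bit position inside that byte
        size_t index_r = (l + content_width - 1) / 8;   /// byte holding the rightmost bit
        size_t offset_r = (l + content_width) % 8;      /// bit just past the rightmost bit (0 if byte-aligned)
        /// Cell 2 spans bits 6..8, i.e. it straddles bytes 0 and 1 (index_l != index_r).
        printf("cell %zu: bytes %zu..%zu, offsets %zu..%zu\n", bucket_index, index_l, index_r, offset_l, offset_r);
    }
}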
@@ -24,7 +24,7 @@ namespace DB
 {

 /// For cutting prerpocessed path to this base
-std::string main_config_path;
+static std::string main_config_path;

 /// Extracts from a string the first encountered number consisting of at least two digits.
 static std::string numberFromHost(const std::string & s)

@@ -447,6 +447,11 @@ XMLDocumentPtr ConfigProcessor::processConfig(
         merge(config, with);
         contributing_files.push_back(merge_file);
     }
+    catch (Exception & e)
+    {
+        e.addMessage("while merging config '" + path + "' with '" + merge_file + "'");
+        throw;
+    }
     catch (Poco::Exception & e)
     {
         throw Poco::Exception("Failed to merge config with '" + merge_file + "': " + e.displayText());

@@ -479,6 +484,11 @@ XMLDocumentPtr ConfigProcessor::processConfig(

         doIncludesRecursive(config, include_from, getRootNode(config.get()), zk_node_cache, zk_changed_event, contributing_zk_paths);
     }
+    catch (Exception & e)
+    {
+        e.addMessage("while preprocessing config '" + path + "'");
+        throw;
+    }
     catch (Poco::Exception & e)
    {
         throw Poco::Exception("Failed to preprocess config '" + path + "': " + e.displayText(), e);
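The added catch blocks follow the usual enrich-and-rethrow idiom: annotate the in-flight exception with the file being processed, then let it propagate. DB::Exception's addMessage appends context to the same exception object; the self-contained sketch below approximates this with standard types (illustrative, not the DB::Exception API):

#include <cstdio>
#include <stdexcept>
#include <string>

void process(const std::string & path)
{
    try
    {
        throw std::runtime_error("bad XML");   /// some failing step inside
    }
    catch (const std::exception & e)
    {
        /// Rethrow with the file context prepended, preserving the original message.
        throw std::runtime_error("while preprocessing config '" + path + "': " + e.what());
    }
}

int main()
{
    try { process("/path/to/config.xml"); }
    catch (const std::exception & e) { printf("%s\n", e.what()); }
}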
@@ -81,7 +81,7 @@ void ConfigReloader::reloadIfNewer(bool force, bool throw_on_error, bool fallbac
     std::lock_guard<std::mutex> lock(reload_mutex);

     FilesChangesTracker new_files = getNewFileList();
-    if (force || new_files.isDifferOrNewerThan(files))
+    if (force || need_reload_from_zk || new_files.isDifferOrNewerThan(files))
     {
         ConfigProcessor config_processor(path);
         ConfigProcessor::LoadedConfig loaded_config;

@@ -94,6 +94,17 @@ void ConfigReloader::reloadIfNewer(bool force, bool throw_on_error, bool fallbac
             loaded_config = config_processor.loadConfigWithZooKeeperIncludes(
                 zk_node_cache, zk_changed_event, fallback_to_preprocessed);
         }
+        catch (const Coordination::Exception & e)
+        {
+            if (Coordination::isHardwareError(e.code))
+                need_reload_from_zk = true;
+
+            if (throw_on_error)
+                throw;
+
+            tryLogCurrentException(log, "ZooKeeper error when loading config from `" + path + "'");
+            return;
+        }
         catch (...)
         {
             if (throw_on_error)

@@ -110,7 +121,10 @@ void ConfigReloader::reloadIfNewer(bool force, bool throw_on_error, bool fallbac
          * When file has been written (and contain valid data), we don't load new data since modification time remains the same.
          */
         if (!loaded_config.loaded_from_preprocessed)
+        {
             files = std::move(new_files);
+            need_reload_from_zk = false;
+        }

         try
         {

@@ -75,6 +75,7 @@ private:
     std::string preprocessed_dir;
     FilesChangesTracker files;
    zkutil::ZooKeeperNodeCache zk_node_cache;
+    bool need_reload_from_zk = false;
    zkutil::EventPtr zk_changed_event = std::make_shared<Poco::Event>();

     Updater updater;
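The need_reload_from_zk flag acts as a retry latch: a transient ZooKeeper failure arms it, the next periodic pass retries even though no local file changed, and only a successful non-fallback load disarms it. A reduced sketch of that control flow (illustrative names and stand-in bodies):

#include <stdexcept>

struct ReloaderSketch
{
    bool need_reload = false;   /// armed on transient errors, disarmed on success

    bool filesChanged() { return false; }   /// stand-in for mtime-based change detection
    void loadConfig() {}                    /// stand-in; may throw on transient errors

    void reloadIfNewer(bool force)
    {
        if (!force && !need_reload && !filesChanged())
            return;   /// nothing to do and no pending retry

        try
        {
            loadConfig();
            need_reload = false;   /// success: stop retrying
        }
        catch (const std::runtime_error &)
        {
            need_reload = true;    /// transient failure: retry on the next pass
        }
    }
};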
@@ -402,6 +402,9 @@ namespace ErrorCodes
     extern const int SYSTEM_ERROR = 425;
     extern const int NULL_POINTER_DEREFERENCE = 426;
     extern const int CANNOT_COMPILE_REGEXP = 427;
+    extern const int UNKNOWN_LOG_LEVEL = 428;
+    extern const int FAILED_TO_GETPWUID = 429;
+    extern const int MISMATCHING_USERS_FOR_PROCESS_AND_DATA = 430;

     extern const int KEEPER_EXCEPTION = 999;
     extern const int POCO_EXCEPTION = 1000;
@@ -142,7 +142,7 @@ struct HashTableCell

     /// Deserialization, in binary and text form.
     void read(DB::ReadBuffer & rb) { DB::readBinary(key, rb); }
-    void readText(DB::ReadBuffer & rb) { DB::writeDoubleQuoted(key, rb); }
+    void readText(DB::ReadBuffer & rb) { DB::readDoubleQuoted(key, rb); }
 };

@@ -658,12 +658,8 @@ protected:
         return false;
     }


-    /// Only for non-zero keys. Find the right place, insert the key there, if it does not already exist. Set iterator to the cell in output parameter.
-    void ALWAYS_INLINE emplaceNonZero(Key x, iterator & it, bool & inserted, size_t hash_value)
+    void ALWAYS_INLINE emplaceNonZeroImpl(size_t place_value, Key x, iterator & it, bool & inserted, size_t hash_value)
     {
-        size_t place_value = findCell(x, hash_value, grower.place(hash_value));
-
         it = iterator(this, &buf[place_value]);

         if (!buf[place_value].isZero(*this))

@@ -698,6 +694,21 @@ protected:
         }
     }

+    /// Only for non-zero keys. Find the right place, insert the key there, if it does not already exist. Set iterator to the cell in output parameter.
+    void ALWAYS_INLINE emplaceNonZero(Key x, iterator & it, bool & inserted, size_t hash_value)
+    {
+        size_t place_value = findCell(x, hash_value, grower.place(hash_value));
+        emplaceNonZeroImpl(place_value, x, it, inserted, hash_value);
+    }
+
+    /// Same but find place using object. Hack for ReverseIndex.
+    template <typename ObjectToCompareWith>
+    void ALWAYS_INLINE emplaceNonZero(Key x, iterator & it, bool & inserted, size_t hash_value, const ObjectToCompareWith & object)
+    {
+        size_t place_value = findCell(object, hash_value, grower.place(hash_value));
+        emplaceNonZeroImpl(place_value, x, it, inserted, hash_value);
+    }
+

 public:
     /// Insert a value. In the case of any more complex values, it is better to use the `emplace` function.

@@ -753,6 +764,13 @@ public:
         emplaceNonZero(x, it, inserted, hash_value);
     }

+    /// Same, but search position by object. Hack for ReverseIndex.
+    template <typename ObjectToCompareWith>
+    void ALWAYS_INLINE emplace(Key x, iterator & it, bool & inserted, size_t hash_value, const ObjectToCompareWith & object)
+    {
+        if (!emplaceIfZero(x, it, inserted, hash_value))
+            emplaceNonZero(x, it, inserted, hash_value, object);
+    }

     /// Copy the cell from another hash table. It is assumed that the cell is not zero, and also that there was no such key in the table yet.
     void ALWAYS_INLINE insertUniqueNonZero(const Cell * cell, size_t hash_value)
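For context, the emplace contract these overloads extend is the usual two-output pattern: the caller gets an iterator plus an inserted flag and must initialize the mapped value only on first insertion. A hedged sketch of the calling convention, written against std::unordered_map for self-containment (the HashTable returns the same pair through output parameters):

#include <cstdio>
#include <unordered_map>

int main()
{
    std::unordered_map<int, int> counts;

    for (int key : {1, 2, 1, 3, 1})
    {
        /// try_emplace returns {iterator, inserted}, analogous to the
        /// HashTable's emplace(key, it, inserted) output parameters.
        auto [it, inserted] = counts.try_emplace(key, 0);
        if (inserted)
            it->second = 0;   /// first time this key is seen: initialize the value
        ++it->second;
    }

    printf("%d\n", counts[1]);   /// prints 3
}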
dbms/src/Common/PODArray.cpp (new file, 8 lines)

@@ -0,0 +1,8 @@
+#include <Common/PODArray.h>
+
+namespace DB
+{
+/// Used for left padding of PODArray when empty
+const char EmptyPODArray[EmptyPODArraySize]{};
+
+}
@@ -20,6 +20,11 @@
 namespace DB
 {

+inline constexpr size_t integerRoundUp(size_t value, size_t dividend)
+{
+    return ((value + dividend - 1) / dividend) * dividend;
+}
+
 /** A dynamic array for POD types.
   * Designed for a small number of large arrays (rather than a lot of small ones).
   * To be more precise - for use in ColumnVector.

@@ -37,6 +42,10 @@ namespace DB
   * The template parameter `pad_right` - always allocate at the end of the array as many unused bytes.
   * Can be used to make optimistic reading, writing, copying with unaligned SIMD instructions.
   *
+  * The template parameter `pad_left` - always allocate memory before 0th element of the array (rounded up to the whole number of elements)
+  * and zero initialize -1th element. It allows to use -1th element that will have value 0.
+  * This gives performance benefits when converting an array of offsets to array of sizes.
+  *
   * Some methods using allocator have TAllocatorParams variadic arguments.
   * These arguments will be passed to corresponding methods of TAllocator.
   * Example: pointer to Arena, that is used for allocations.
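Why a zero-initialized -1th element pays off: converting cumulative offsets to per-row sizes becomes one branch-free subtraction per element, because offsets[-1] reads as 0 for the first row. (The integerRoundUp helper added above is what rounds the paddings to whole elements; e.g. integerRoundUp(10, 8) == 16.) A small sketch with a plain array standing in for the padded column (illustrative, not the PODArray API):

#include <cstdio>

int main()
{
    /// storage[0] plays the role of the zero-initialized -1th element;
    /// offsets points at storage + 1, so offsets[-1] is a valid read of 0.
    unsigned storage[5] = {0, 3, 5, 9, 12};
    unsigned * offsets = storage + 1;

    unsigned sizes[4];
    for (int i = 0; i < 4; ++i)
        sizes[i] = offsets[i] - offsets[i - 1];   /// no special case for i == 0

    for (unsigned s : sizes)
        printf("%u ", s);   /// prints: 3 2 4 3
}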
@@ -45,31 +54,38 @@ namespace DB
   * Because sometimes we have many small objects, that share same allocator with same parameters,
   * and we must avoid larger object size due to storing the same parameters in each object.
   * This is required for states of aggregate functions.
   *
-  * TODO Pass alignment to Allocator.
   * TODO Allow greater alignment than alignof(T). Example: array of char aligned to page size.
   */
-template <typename T, size_t INITIAL_SIZE = 4096, typename TAllocator = Allocator<false>, size_t pad_right_ = 0>
-class PODArray : private boost::noncopyable, private TAllocator /// empty base optimization
+static constexpr size_t EmptyPODArraySize = 1024;
+extern const char EmptyPODArray[EmptyPODArraySize];
+
+/** Base class that depend only on size of element, not on element itself.
+  * You can static_cast to this class if you want to insert some data regardless to the actual type T.
+  */
+template <size_t ELEMENT_SIZE, size_t INITIAL_SIZE, typename TAllocator, size_t pad_right_, size_t pad_left_>
+class PODArrayBase : private boost::noncopyable, private TAllocator /// empty base optimization
 {
 protected:
     /// Round padding up to an whole number of elements to simplify arithmetic.
-    static constexpr size_t pad_right = (pad_right_ + sizeof(T) - 1) / sizeof(T) * sizeof(T);
+    static constexpr size_t pad_right = integerRoundUp(pad_right_, ELEMENT_SIZE);
+    /// pad_left is also rounded up to 16 bytes to maintain alignment of allocated memory.
+    static constexpr size_t pad_left = integerRoundUp(integerRoundUp(pad_left_, ELEMENT_SIZE), 16);
+    /// Empty array will point to this static memory as padding.
+    static constexpr char * null = pad_left ? const_cast<char *>(EmptyPODArray) + EmptyPODArraySize : nullptr;
+    static_assert(pad_left <= EmptyPODArraySize && "Left Padding exceeds EmptyPODArraySize. Is the element size too large?");

-    char * c_start = nullptr;
-    char * c_end = nullptr;
-    char * c_end_of_storage = nullptr; /// Does not include pad_right.
-
-    T * t_start() { return reinterpret_cast<T *>(c_start); }
-    T * t_end() { return reinterpret_cast<T *>(c_end); }
-    T * t_end_of_storage() { return reinterpret_cast<T *>(c_end_of_storage); }
-
-    const T * t_start() const { return reinterpret_cast<const T *>(c_start); }
-    const T * t_end() const { return reinterpret_cast<const T *>(c_end); }
-    const T * t_end_of_storage() const { return reinterpret_cast<const T *>(c_end_of_storage); }
+    char * c_start = null; /// Does not include pad_left.
+    char * c_end = null;
+    char * c_end_of_storage = null; /// Does not include pad_right.

     /// The amount of memory occupied by the num_elements of the elements.
-    static size_t byte_size(size_t num_elements) { return num_elements * sizeof(T); }
+    static size_t byte_size(size_t num_elements) { return num_elements * ELEMENT_SIZE; }

     /// Minimum amount of memory to allocate for num_elements, including padding.
-    static size_t minimum_memory_for_elements(size_t num_elements) { return byte_size(num_elements) + pad_right; }
+    static size_t minimum_memory_for_elements(size_t num_elements) { return byte_size(num_elements) + pad_right + pad_left; }

     void alloc_for_num_elements(size_t num_elements)
     {

@@ -79,22 +95,25 @@ protected:
     template <typename ... TAllocatorParams>
     void alloc(size_t bytes, TAllocatorParams &&... allocator_params)
     {
-        c_start = c_end = reinterpret_cast<char *>(TAllocator::alloc(bytes, std::forward<TAllocatorParams>(allocator_params)...));
-        c_end_of_storage = c_start + bytes - pad_right;
+        c_start = c_end = reinterpret_cast<char *>(TAllocator::alloc(bytes, std::forward<TAllocatorParams>(allocator_params)...)) + pad_left;
+        c_end_of_storage = c_start + bytes - pad_right - pad_left;
+
+        if (pad_left)
+            memset(c_start - ELEMENT_SIZE, 0, ELEMENT_SIZE);
     }

     void dealloc()
     {
-        if (c_start == nullptr)
+        if (c_start == null)
             return;

-        TAllocator::free(c_start, allocated_bytes());
+        TAllocator::free(c_start - pad_left, allocated_bytes());
     }

     template <typename ... TAllocatorParams>
     void realloc(size_t bytes, TAllocatorParams &&... allocator_params)
     {
-        if (c_start == nullptr)
+        if (c_start == null)
         {
             alloc(bytes, std::forward<TAllocatorParams>(allocator_params)...);
             return;

@@ -102,15 +121,20 @@ protected:

         ptrdiff_t end_diff = c_end - c_start;

-        c_start = reinterpret_cast<char *>(TAllocator::realloc(c_start, allocated_bytes(), bytes, std::forward<TAllocatorParams>(allocator_params)...));
+        c_start = reinterpret_cast<char *>(
+                TAllocator::realloc(c_start - pad_left, allocated_bytes(), bytes, std::forward<TAllocatorParams>(allocator_params)...))
+            + pad_left;

         c_end = c_start + end_diff;
-        c_end_of_storage = c_start + bytes - pad_right;
+        c_end_of_storage = c_start + bytes - pad_right - pad_left;
+
+        if (pad_left)
+            memset(c_start - ELEMENT_SIZE, 0, ELEMENT_SIZE);
     }

     bool isInitialized() const
     {
-        return (c_start != nullptr) && (c_end != nullptr) && (c_end_of_storage != nullptr);
+        return (c_start != null) && (c_end != null) && (c_end_of_storage != null);
     }

     bool isAllocatedFromStack() const

@@ -124,9 +148,9 @@ protected:
     {
         if (size() == 0)
         {
-            // The allocated memory should be multiplication of sizeof(T) to hold the element, otherwise,
+            // The allocated memory should be multiplication of ELEMENT_SIZE to hold the element, otherwise,
             // memory issue such as corruption could appear in edge case.
-            realloc(std::max(((INITIAL_SIZE - 1) / sizeof(T) + 1) * sizeof(T), minimum_memory_for_elements(1)),
+            realloc(std::max(((INITIAL_SIZE - 1) / ELEMENT_SIZE + 1) * ELEMENT_SIZE, minimum_memory_for_elements(1)),
                 std::forward<TAllocatorParams>(allocator_params)...);
         }
         else
@@ -134,83 +158,13 @@ protected:
     }

 public:
-    using value_type = T;
+    bool empty() const { return c_end == c_start; }
+    size_t size() const { return (c_end - c_start) / ELEMENT_SIZE; }
+    size_t capacity() const { return (c_end_of_storage - c_start) / ELEMENT_SIZE; }

-    size_t allocated_bytes() const { return c_end_of_storage - c_start + pad_right; }
+    size_t allocated_bytes() const { return c_end_of_storage - c_start + pad_right + pad_left; }

-    /// You can not just use `typedef`, because there is ambiguity for the constructors and `assign` functions.
-    struct iterator : public boost::iterator_adaptor<iterator, T*>
-    {
-        iterator() {}
-        iterator(T * ptr_) : iterator::iterator_adaptor_(ptr_) {}
-    };
-
-    struct const_iterator : public boost::iterator_adaptor<const_iterator, const T*>
-    {
-        const_iterator() {}
-        const_iterator(const T * ptr_) : const_iterator::iterator_adaptor_(ptr_) {}
-    };
-
-
-    PODArray() {}
-
-    PODArray(size_t n)
-    {
-        alloc_for_num_elements(n);
-        c_end += byte_size(n);
-    }
-
-    PODArray(size_t n, const T & x)
-    {
-        alloc_for_num_elements(n);
-        assign(n, x);
-    }
-
-    PODArray(const_iterator from_begin, const_iterator from_end)
-    {
-        alloc_for_num_elements(from_end - from_begin);
-        insert(from_begin, from_end);
-    }
-
-    PODArray(std::initializer_list<T> il) : PODArray(std::begin(il), std::end(il)) {}
-
-    ~PODArray()
-    {
-        dealloc();
-    }
-
-    PODArray(PODArray && other)
-    {
-        this->swap(other);
-    }
-
-    PODArray & operator=(PODArray && other)
-    {
-        this->swap(other);
-        return *this;
-    }
-
-    T * data() { return t_start(); }
-    const T * data() const { return t_start(); }
-
-    size_t size() const { return t_end() - t_start(); }
-    bool empty() const { return t_end() == t_start(); }
-    size_t capacity() const { return t_end_of_storage() - t_start(); }
-
-    T & operator[] (size_t n) { return t_start()[n]; }
-    const T & operator[] (size_t n) const { return t_start()[n]; }
-
-    T & front() { return t_start()[0]; }
-    T & back() { return t_end()[-1]; }
-    const T & front() const { return t_start()[0]; }
-    const T & back() const { return t_end()[-1]; }
-
-    iterator begin() { return t_start(); }
-    iterator end() { return t_end(); }
-    const_iterator begin() const { return t_start(); }
-    const_iterator end() const { return t_end(); }
-    const_iterator cbegin() const { return t_start(); }
-    const_iterator cend() const { return t_end(); }
-    void clear() { c_end = c_start; }
+    void clear() { c_end = c_start; }

     template <typename ... TAllocatorParams>
     void reserve(size_t n, TAllocatorParams &&... allocator_params)

@@ -231,42 +185,141 @@ public:
         c_end = c_start + byte_size(n);
     }

+    const char * raw_data() const
+    {
+        return c_start;
+    }
+
+    template <typename ... TAllocatorParams>
+    void push_back_raw(const char * ptr, TAllocatorParams &&... allocator_params)
+    {
+        if (unlikely(c_end == c_end_of_storage))
+            reserveForNextSize(std::forward<TAllocatorParams>(allocator_params)...);
+
+        memcpy(c_end, ptr, ELEMENT_SIZE);
+        c_end += byte_size(1);
+    }
+
+    ~PODArrayBase()
+    {
+        dealloc();
+    }
+};
+
+template <typename T, size_t INITIAL_SIZE = 4096, typename TAllocator = Allocator<false>, size_t pad_right_ = 0, size_t pad_left_ = 0>
+class PODArray : public PODArrayBase<sizeof(T), INITIAL_SIZE, TAllocator, pad_right_, pad_left_>
+{
+protected:
+    using Base = PODArrayBase<sizeof(T), INITIAL_SIZE, TAllocator, pad_right_, pad_left_>;
+
+    T * t_start() { return reinterpret_cast<T *>(this->c_start); }
+    T * t_end() { return reinterpret_cast<T *>(this->c_end); }
+    T * t_end_of_storage() { return reinterpret_cast<T *>(this->c_end_of_storage); }
+
+    const T * t_start() const { return reinterpret_cast<const T *>(this->c_start); }
+    const T * t_end() const { return reinterpret_cast<const T *>(this->c_end); }
+    const T * t_end_of_storage() const { return reinterpret_cast<const T *>(this->c_end_of_storage); }
+
+public:
+    using value_type = T;
+
+    /// You can not just use `typedef`, because there is ambiguity for the constructors and `assign` functions.
+    struct iterator : public boost::iterator_adaptor<iterator, T*>
+    {
+        iterator() {}
+        iterator(T * ptr_) : iterator::iterator_adaptor_(ptr_) {}
+    };
+
+    struct const_iterator : public boost::iterator_adaptor<const_iterator, const T*>
+    {
+        const_iterator() {}
+        const_iterator(const T * ptr_) : const_iterator::iterator_adaptor_(ptr_) {}
+    };
+
+
+    PODArray() {}
+
+    PODArray(size_t n)
+    {
+        this->alloc_for_num_elements(n);
+        this->c_end += this->byte_size(n);
+    }
+
+    PODArray(size_t n, const T & x)
+    {
+        this->alloc_for_num_elements(n);
+        assign(n, x);
+    }
+
+    PODArray(const_iterator from_begin, const_iterator from_end)
+    {
+        this->alloc_for_num_elements(from_end - from_begin);
+        insert(from_begin, from_end);
+    }
+
+    PODArray(std::initializer_list<T> il) : PODArray(std::begin(il), std::end(il)) {}
+
+    PODArray(PODArray && other)
+    {
+        this->swap(other);
+    }
+
+    PODArray & operator=(PODArray && other)
+    {
+        this->swap(other);
+        return *this;
+    }
+
+    T * data() { return t_start(); }
+    const T * data() const { return t_start(); }
+
+    /// The index is signed to access -1th element without pointer overflow.
+    T & operator[] (ssize_t n) { return t_start()[n]; }
+    const T & operator[] (ssize_t n) const { return t_start()[n]; }
+
+    T & front() { return t_start()[0]; }
+    T & back() { return t_end()[-1]; }
+    const T & front() const { return t_start()[0]; }
+    const T & back() const { return t_end()[-1]; }
+
+    iterator begin() { return t_start(); }
+    iterator end() { return t_end(); }
+    const_iterator begin() const { return t_start(); }
+    const_iterator end() const { return t_end(); }
+    const_iterator cbegin() const { return t_start(); }
+    const_iterator cend() const { return t_end(); }

     /// Same as resize, but zeroes new elements.
     void resize_fill(size_t n)
     {
-        size_t old_size = size();
+        size_t old_size = this->size();
         if (n > old_size)
         {
-            reserve(n);
-            memset(c_end, 0, byte_size(n - old_size));
+            this->reserve(n);
+            memset(this->c_end, 0, this->byte_size(n - old_size));
         }
-        c_end = c_start + byte_size(n);
+        this->c_end = this->c_start + this->byte_size(n);
     }

     void resize_fill(size_t n, const T & value)
     {
-        size_t old_size = size();
+        size_t old_size = this->size();
         if (n > old_size)
         {
-            reserve(n);
+            this->reserve(n);
             std::fill(t_end(), t_end() + n - old_size, value);
         }
-        c_end = c_start + byte_size(n);
+        this->c_end = this->c_start + this->byte_size(n);
     }

-    void clear()
-    {
-        c_end = c_start;
-    }

     template <typename ... TAllocatorParams>
     void push_back(const T & x, TAllocatorParams &&... allocator_params)
     {
-        if (unlikely(c_end == c_end_of_storage))
-            reserveForNextSize(std::forward<TAllocatorParams>(allocator_params)...);
+        if (unlikely(this->c_end == this->c_end_of_storage))
+            this->reserveForNextSize(std::forward<TAllocatorParams>(allocator_params)...);

         *t_end() = x;
-        c_end += byte_size(1);
+        this->c_end += this->byte_size(1);
     }

     /** This method doesn't allow to pass parameters for Allocator,
@@ -275,25 +328,25 @@ public:
     template <typename... Args>
     void emplace_back(Args &&... args)
     {
-        if (unlikely(c_end == c_end_of_storage))
-            reserveForNextSize();
+        if (unlikely(this->c_end == this->c_end_of_storage))
+            this->reserveForNextSize();

         new (t_end()) T(std::forward<Args>(args)...);
-        c_end += byte_size(1);
+        this->c_end += this->byte_size(1);
     }

     void pop_back()
     {
-        c_end -= byte_size(1);
+        this->c_end -= this->byte_size(1);
     }

     /// Do not insert into the array a piece of itself. Because with the resize, the iterators on themselves can be invalidated.
     template <typename It1, typename It2, typename ... TAllocatorParams>
     void insertPrepare(It1 from_begin, It2 from_end, TAllocatorParams &&... allocator_params)
     {
-        size_t required_capacity = size() + (from_end - from_begin);
-        if (required_capacity > capacity())
-            reserve(roundUpToPowerOfTwoOrZero(required_capacity), std::forward<TAllocatorParams>(allocator_params)...);
+        size_t required_capacity = this->size() + (from_end - from_begin);
+        if (required_capacity > this->capacity())
+            this->reserve(roundUpToPowerOfTwoOrZero(required_capacity), std::forward<TAllocatorParams>(allocator_params)...);
     }

     /// Do not insert into the array a piece of itself. Because with the resize, the iterators on themselves can be invalidated.

@@ -310,9 +363,9 @@ public:
     {
         static_assert(pad_right_ >= 15);
         insertPrepare(from_begin, from_end, std::forward<TAllocatorParams>(allocator_params)...);
-        size_t bytes_to_copy = byte_size(from_end - from_begin);
-        memcpySmallAllowReadWriteOverflow15(c_end, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
-        c_end += bytes_to_copy;
+        size_t bytes_to_copy = this->byte_size(from_end - from_begin);
+        memcpySmallAllowReadWriteOverflow15(this->c_end, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
+        this->c_end += bytes_to_copy;
     }

     template <typename It1, typename It2>

@@ -320,22 +373,22 @@ public:
     {
         insertPrepare(from_begin, from_end);

-        size_t bytes_to_copy = byte_size(from_end - from_begin);
+        size_t bytes_to_copy = this->byte_size(from_end - from_begin);
         size_t bytes_to_move = (end() - it) * sizeof(T);

         if (unlikely(bytes_to_move))
-            memcpy(c_end + bytes_to_copy - bytes_to_move, c_end - bytes_to_move, bytes_to_move);
+            memcpy(this->c_end + bytes_to_copy - bytes_to_move, this->c_end - bytes_to_move, bytes_to_move);

-        memcpy(c_end - bytes_to_move, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
-        c_end += bytes_to_copy;
+        memcpy(this->c_end - bytes_to_move, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
+        this->c_end += bytes_to_copy;
     }

     template <typename It1, typename It2>
     void insert_assume_reserved(It1 from_begin, It2 from_end)
     {
-        size_t bytes_to_copy = byte_size(from_end - from_begin);
-        memcpy(c_end, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
-        c_end += bytes_to_copy;
+        size_t bytes_to_copy = this->byte_size(from_end - from_begin);
+        memcpy(this->c_end, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
+        this->c_end += bytes_to_copy;
     }

     void swap(PODArray & rhs)
@@ -343,7 +396,7 @@ public:
         /// Swap two PODArray objects, arr1 and arr2, that satisfy the following conditions:
         /// - The elements of arr1 are stored on stack.
         /// - The elements of arr2 are stored on heap.
-        auto swap_stack_heap = [](PODArray & arr1, PODArray & arr2)
+        auto swap_stack_heap = [this](PODArray & arr1, PODArray & arr2)
         {
             size_t stack_size = arr1.size();
             size_t stack_allocated = arr1.allocated_bytes();

@@ -357,27 +410,27 @@
             /// arr1 takes ownership of the heap memory of arr2.
             arr1.c_start = arr2.c_start;
             arr1.c_end_of_storage = arr1.c_start + heap_allocated - arr1.pad_right;
-            arr1.c_end = arr1.c_start + byte_size(heap_size);
+            arr1.c_end = arr1.c_start + this->byte_size(heap_size);

             /// Allocate stack space for arr2.
             arr2.alloc(stack_allocated);
             /// Copy the stack content.
-            memcpy(arr2.c_start, stack_c_start, byte_size(stack_size));
-            arr2.c_end = arr2.c_start + byte_size(stack_size);
+            memcpy(arr2.c_start, stack_c_start, this->byte_size(stack_size));
+            arr2.c_end = arr2.c_start + this->byte_size(stack_size);
         };

-        auto do_move = [](PODArray & src, PODArray & dest)
+        auto do_move = [this](PODArray & src, PODArray & dest)
         {
             if (src.isAllocatedFromStack())
             {
                 dest.dealloc();
                 dest.alloc(src.allocated_bytes());
-                memcpy(dest.c_start, src.c_start, byte_size(src.size()));
+                memcpy(dest.c_start, src.c_start, this->byte_size(src.size()));
                 dest.c_end = dest.c_start + (src.c_end - src.c_start);

-                src.c_start = nullptr;
-                src.c_end = nullptr;
-                src.c_end_of_storage = nullptr;
+                src.c_start = Base::null;
+                src.c_end = Base::null;
+                src.c_end_of_storage = Base::null;
             }
             else
             {

@@ -387,28 +440,28 @@ public:
             }
         };

-        if (!isInitialized() && !rhs.isInitialized())
+        if (!this->isInitialized() && !rhs.isInitialized())
             return;
-        else if (!isInitialized() && rhs.isInitialized())
+        else if (!this->isInitialized() && rhs.isInitialized())
         {
             do_move(rhs, *this);
             return;
         }
-        else if (isInitialized() && !rhs.isInitialized())
+        else if (this->isInitialized() && !rhs.isInitialized())
        {
             do_move(*this, rhs);
             return;
         }

-        if (isAllocatedFromStack() && rhs.isAllocatedFromStack())
+        if (this->isAllocatedFromStack() && rhs.isAllocatedFromStack())
         {
-            size_t min_size = std::min(size(), rhs.size());
-            size_t max_size = std::max(size(), rhs.size());
+            size_t min_size = std::min(this->size(), rhs.size());
+            size_t max_size = std::max(this->size(), rhs.size());

             for (size_t i = 0; i < min_size; ++i)
                 std::swap(this->operator[](i), rhs[i]);

-            if (size() == max_size)
+            if (this->size() == max_size)
             {
                 for (size_t i = min_size; i < max_size; ++i)
                     rhs[i] = this->operator[](i);

@@ -419,33 +472,33 @@
                     this->operator[](i) = rhs[i];
             }

-            size_t lhs_size = size();
-            size_t lhs_allocated = allocated_bytes();
+            size_t lhs_size = this->size();
+            size_t lhs_allocated = this->allocated_bytes();

             size_t rhs_size = rhs.size();
             size_t rhs_allocated = rhs.allocated_bytes();

-            c_end_of_storage = c_start + rhs_allocated - pad_right;
-            rhs.c_end_of_storage = rhs.c_start + lhs_allocated - pad_right;
+            this->c_end_of_storage = this->c_start + rhs_allocated - Base::pad_right;
+            rhs.c_end_of_storage = rhs.c_start + lhs_allocated - Base::pad_right;

-            c_end = c_start + byte_size(rhs_size);
-            rhs.c_end = rhs.c_start + byte_size(lhs_size);
+            this->c_end = this->c_start + this->byte_size(rhs_size);
+            rhs.c_end = rhs.c_start + this->byte_size(lhs_size);
         }
-        else if (isAllocatedFromStack() && !rhs.isAllocatedFromStack())
+        else if (this->isAllocatedFromStack() && !rhs.isAllocatedFromStack())
             swap_stack_heap(*this, rhs);
-        else if (!isAllocatedFromStack() && rhs.isAllocatedFromStack())
+        else if (!this->isAllocatedFromStack() && rhs.isAllocatedFromStack())
             swap_stack_heap(rhs, *this);
         else
         {
-            std::swap(c_start, rhs.c_start);
-            std::swap(c_end, rhs.c_end);
-            std::swap(c_end_of_storage, rhs.c_end_of_storage);
+            std::swap(this->c_start, rhs.c_start);
+            std::swap(this->c_end, rhs.c_end);
+            std::swap(this->c_end_of_storage, rhs.c_end_of_storage);
         }
     }

     void assign(size_t n, const T & x)
     {
-        resize(n);
+        this->resize(n);
         std::fill(begin(), end(), x);
     }
@@ -453,12 +506,12 @@ public:
     void assign(It1 from_begin, It2 from_end)
     {
         size_t required_capacity = from_end - from_begin;
-        if (required_capacity > capacity())
-            reserve(roundUpToPowerOfTwoOrZero(required_capacity));
+        if (required_capacity > this->capacity())
+            this->reserve(roundUpToPowerOfTwoOrZero(required_capacity));

-        size_t bytes_to_copy = byte_size(required_capacity);
-        memcpy(c_start, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
-        c_end = c_start + bytes_to_copy;
+        size_t bytes_to_copy = this->byte_size(required_capacity);
+        memcpy(this->c_start, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
+        this->c_end = this->c_start + bytes_to_copy;
     }

     void assign(const PODArray & from)

@@ -469,7 +522,7 @@ public:

     bool operator== (const PODArray & other) const
     {
-        if (size() != other.size())
+        if (this->size() != other.size())
             return false;

         const_iterator this_it = begin();

@@ -501,15 +554,9 @@ void swap(PODArray<T, INITIAL_SIZE, TAllocator, pad_right_> & lhs, PODArray<T, I

 /** For columns. Padding is enough to read and write xmm-register at the address of the last element. */
 template <typename T, size_t INITIAL_SIZE = 4096, typename TAllocator = Allocator<false>>
-using PaddedPODArray = PODArray<T, INITIAL_SIZE, TAllocator, 15>;
-
-
-inline constexpr size_t integerRound(size_t value, size_t dividend)
-{
-    return ((value + dividend - 1) / dividend) * dividend;
-}
+using PaddedPODArray = PODArray<T, INITIAL_SIZE, TAllocator, 15, 16>;

 template <typename T, size_t stack_size_in_bytes>
-using PODArrayWithStackMemory = PODArray<T, 0, AllocatorWithStackMemory<Allocator<false>, integerRound(stack_size_in_bytes, sizeof(T))>>;
+using PODArrayWithStackMemory = PODArray<T, 0, AllocatorWithStackMemory<Allocator<false>, integerRoundUp(stack_size_in_bytes, sizeof(T))>>;

 }
@@ -5,6 +5,7 @@
 #include <Core/Types.h>
 #include <Poco/UTF8Encoding.h>
 #include <Poco/Unicode.h>
+#include <common/unaligned.h>
 #include <ext/range.h>
 #include <stdint.h>
 #include <string.h>

@@ -121,9 +122,9 @@ protected:
     CRTP & self() { return static_cast<CRTP &>(*this); }
     const CRTP & self() const { return const_cast<VolnitskyBase *>(this)->self(); }

-    static const Ngram & toNGram(const UInt8 * const pos)
+    static Ngram toNGram(const UInt8 * const pos)
     {
-        return *reinterpret_cast<const Ngram *>(pos);
+        return unalignedLoad<Ngram>(pos);
     }

     void putNGramBase(const Ngram ngram, const int offset)
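Dereferencing a reinterpret_cast pointer at an arbitrary string offset is an unaligned (and strict-aliasing-unsafe) access. The standard fix, and presumably what common/unaligned.h boils down to, is a memcpy that the compiler folds into a single plain load; a hedged sketch:

#include <cstdint>
#include <cstring>

/// Reads a T from a possibly misaligned address without undefined behaviour.
/// Compilers turn this memcpy into one ordinary load on x86 and ARM.
template <typename T>
T unalignedLoad(const void * address)
{
    T res;
    std::memcpy(&res, address, sizeof(res));
    return res;
}

int main()
{
    const char buf[] = "abcdefgh";
    /// buf + 1 is misaligned for uint16_t; this read is still well-defined.
    uint16_t ngram = unalignedLoad<uint16_t>(buf + 1);
    return ngram != 0 ? 0 : 1;
}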
@@ -20,23 +20,21 @@ ZooKeeperNodeCache::ZNode ZooKeeperNodeCache::get(const std::string & path, Even

 ZooKeeperNodeCache::ZNode ZooKeeperNodeCache::get(const std::string & path, Coordination::WatchCallback caller_watch_callback)
 {
-    zkutil::ZooKeeperPtr zookeeper;
     std::unordered_set<std::string> invalidated_paths;
     {
         std::lock_guard<std::mutex> lock(context->mutex);

-        if (!context->zookeeper)
+        if (context->all_paths_invalidated)
         {
             /// Possibly, there was a previous session and it has expired. Clear the cache.
             path_to_cached_znode.clear();
-
-            context->zookeeper = get_zookeeper();
+            context->all_paths_invalidated = false;
         }
-        zookeeper = context->zookeeper;

         invalidated_paths.swap(context->invalidated_paths);
     }

+    zkutil::ZooKeeperPtr zookeeper = get_zookeeper();
     if (!zookeeper)
         throw DB::Exception("Could not get znode: `" + path + "'. ZooKeeper not configured.", DB::ErrorCodes::NO_ZOOKEEPER);

@@ -65,8 +63,8 @@ ZooKeeperNodeCache::ZNode ZooKeeperNodeCache::get(const std::string & path, Coor
             changed = owned_context->invalidated_paths.emplace(response.path).second;
         else if (response.state == Coordination::EXPIRED_SESSION)
         {
-            owned_context->zookeeper = nullptr;
             owned_context->invalidated_paths.clear();
+            owned_context->all_paths_invalidated = true;
             changed = true;
         }
     }

@@ -53,8 +53,8 @@ private:
     struct Context
     {
         std::mutex mutex;
-        zkutil::ZooKeeperPtr zookeeper;
         std::unordered_set<std::string> invalidated_paths;
+        bool all_paths_invalidated = false;
     };

     std::shared_ptr<Context> context;
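The shape of the change: instead of caching a ZooKeeper client in the shared context, the watch callback only flips an all_paths_invalidated flag on session expiry, and the getter drops its local cache and fetches a fresh client on every call. A reduced sketch of that flag-based protocol (illustrative types, not the zkutil API):

#include <map>
#include <mutex>
#include <string>

struct NodeCacheSketch
{
    std::mutex mutex;
    bool all_paths_invalidated = false;        /// set by the watch callback on session expiry
    std::map<std::string, std::string> cache;  /// local cache, touched only by the getter

    void onSessionExpired()
    {
        std::lock_guard<std::mutex> lock(mutex);
        all_paths_invalidated = true;          /// cheap: no client state to reset here
    }

    std::string get(const std::string & path)
    {
        {
            std::lock_guard<std::mutex> lock(mutex);
            if (all_paths_invalidated)
            {
                cache.clear();                 /// previous session's data may be stale
                all_paths_invalidated = false;
            }
        }
        /// A fresh client would be obtained outside the lock on every call
        /// (get_zookeeper() in the real code); elided here.
        auto it = cache.find(path);
        return it != cache.end() ? it->second : std::string{};
    }
};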
@@ -9,11 +9,13 @@
 #cmakedefine01 USE_RDKAFKA
 #cmakedefine01 USE_CAPNP
 #cmakedefine01 USE_EMBEDDED_COMPILER
-#cmakedefine01 LLVM_HAS_RTTI
 #cmakedefine01 USE_POCO_SQLODBC
 #cmakedefine01 USE_POCO_DATAODBC
 #cmakedefine01 USE_POCO_MONGODB
 #cmakedefine01 USE_POCO_NETSSL
-#cmakedefine01 CLICKHOUSE_SPLIT_BINARY
 #cmakedefine01 USE_BASE64
 #cmakedefine01 USE_HDFS
 #cmakedefine01 USE_XXHASH
+
+#cmakedefine01 CLICKHOUSE_SPLIT_BINARY
+#cmakedefine01 LLVM_HAS_RTTI
@@ -3,10 +3,12 @@
 #include <cstdint>
 #include <limits>

+#include <Core/Defines.h>
+

+/// On overlow, the function returns unspecified value.

-inline uint64_t intExp2(int x)
+inline NO_SANITIZE_UNDEFINED uint64_t intExp2(int x)
 {
     return 1ULL << x;
 }

@@ -32,7 +34,8 @@ inline uint64_t intExp10(int x)
     return table[x];
 }

-namespace common {
+namespace common
+{

 inline int exp10_i32(int x)
 {

@@ -123,4 +126,4 @@ inline __int128 exp10_i128(int x)
     return values[x];
 }

-} // common
+}
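Why intExp2 needs the attribute: `1ULL << 64` is undefined behaviour, so UBSan would trap even though callers tolerate an unspecified result for out-of-range exponents. A small self-contained sketch of the pattern, reusing the NO_SANITIZE_UNDEFINED macro that this same commit adds to Defines.h (shown later in this diff):

#include <cstdint>

#if defined(__clang__)
    #define NO_SANITIZE_UNDEFINED __attribute__((__no_sanitize__("undefined")))
#else
    #define NO_SANITIZE_UNDEFINED
#endif

/// On overflow the result is unspecified, but UBSan is told not to trap:
/// callers are expected to range-check when an exact value matters.
inline NO_SANITIZE_UNDEFINED uint64_t intExp2(int x)
{
    return 1ULL << x;   /// x >= 64 would otherwise be flagged as UB
}

int main()
{
    return intExp2(10) == 1024 ? 0 : 1;
}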
@@ -267,55 +267,55 @@ inline bool_if_safe_conversion<A, B> equalsOp(A a, B b)
 }

 template <>
-inline bool equalsOp<DB::Float64, DB::UInt64>(DB::Float64 f, DB::UInt64 u)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::Float64, DB::UInt64>(DB::Float64 f, DB::UInt64 u)
 {
     return static_cast<DB::UInt64>(f) == u && f == static_cast<DB::Float64>(u);
 }

 template <>
-inline bool equalsOp<DB::UInt64, DB::Float64>(DB::UInt64 u, DB::Float64 f)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::UInt64, DB::Float64>(DB::UInt64 u, DB::Float64 f)
 {
     return u == static_cast<DB::UInt64>(f) && static_cast<DB::Float64>(u) == f;
 }

 template <>
-inline bool equalsOp<DB::Float64, DB::Int64>(DB::Float64 f, DB::Int64 u)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::Float64, DB::Int64>(DB::Float64 f, DB::Int64 u)
 {
     return static_cast<DB::Int64>(f) == u && f == static_cast<DB::Float64>(u);
 }

 template <>
-inline bool equalsOp<DB::Int64, DB::Float64>(DB::Int64 u, DB::Float64 f)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::Int64, DB::Float64>(DB::Int64 u, DB::Float64 f)
 {
     return u == static_cast<DB::Int64>(f) && static_cast<DB::Float64>(u) == f;
 }

 template <>
-inline bool equalsOp<DB::Float32, DB::UInt64>(DB::Float32 f, DB::UInt64 u)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::Float32, DB::UInt64>(DB::Float32 f, DB::UInt64 u)
 {
     return static_cast<DB::UInt64>(f) == u && f == static_cast<DB::Float32>(u);
 }

 template <>
-inline bool equalsOp<DB::UInt64, DB::Float32>(DB::UInt64 u, DB::Float32 f)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::UInt64, DB::Float32>(DB::UInt64 u, DB::Float32 f)
 {
     return u == static_cast<DB::UInt64>(f) && static_cast<DB::Float32>(u) == f;
 }

 template <>
-inline bool equalsOp<DB::Float32, DB::Int64>(DB::Float32 f, DB::Int64 u)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::Float32, DB::Int64>(DB::Float32 f, DB::Int64 u)
 {
     return static_cast<DB::Int64>(f) == u && f == static_cast<DB::Float32>(u);
 }

 template <>
-inline bool equalsOp<DB::Int64, DB::Float32>(DB::Int64 u, DB::Float32 f)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::Int64, DB::Float32>(DB::Int64 u, DB::Float32 f)
 {
     return u == static_cast<DB::Int64>(f) && static_cast<DB::Float32>(u) == f;
 }

 template <>
-inline bool equalsOp<DB::UInt128, DB::Float64>(DB::UInt128 u, DB::Float64 f)
+inline bool NO_SANITIZE_UNDEFINED equalsOp<DB::UInt128, DB::Float64>(DB::UInt128 u, DB::Float64 f)
 {
     return u.low == 0 && equalsOp(static_cast<UInt64>(u.high), f);
 }

@@ -338,7 +338,7 @@ inline bool equalsOp<DB::Float32, DB::UInt128>(DB::Float32 f, DB::UInt128 u)
     return equalsOp(static_cast<DB::Float64>(f), u);
 }

-inline bool greaterOp(DB::Int128 i, DB::Float64 f)
+inline bool NO_SANITIZE_UNDEFINED greaterOp(DB::Int128 i, DB::Float64 f)
 {
     static constexpr __int128 min_int128 = __int128(0x8000000000000000ll) << 64;
     static constexpr __int128 max_int128 = (__int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll;

@@ -350,7 +350,7 @@ inline bool greaterOp(DB::Int128 i, DB::Float64 f)
         || (f < static_cast<DB::Float64>(max_int128) && i > static_cast<DB::Int128>(f));
 }

-inline bool greaterOp(DB::Float64 f, DB::Int128 i)
+inline bool NO_SANITIZE_UNDEFINED greaterOp(DB::Float64 f, DB::Int128 i)
 {
     static constexpr __int128 min_int128 = __int128(0x8000000000000000ll) << 64;
     static constexpr __int128 max_int128 = (__int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll;

@@ -365,8 +365,8 @@ inline bool greaterOp(DB::Float64 f, DB::Int128 i)
 inline bool greaterOp(DB::Int128 i, DB::Float32 f) { return greaterOp(i, static_cast<DB::Float64>(f)); }
 inline bool greaterOp(DB::Float32 f, DB::Int128 i) { return greaterOp(static_cast<DB::Float64>(f), i); }

-inline bool equalsOp(DB::Int128 i, DB::Float64 f) { return i == static_cast<DB::Int128>(f) && static_cast<DB::Float64>(i) == f; }
-inline bool equalsOp(DB::Int128 i, DB::Float32 f) { return i == static_cast<DB::Int128>(f) && static_cast<DB::Float32>(i) == f; }
+inline bool NO_SANITIZE_UNDEFINED equalsOp(DB::Int128 i, DB::Float64 f) { return i == static_cast<DB::Int128>(f) && static_cast<DB::Float64>(i) == f; }
+inline bool NO_SANITIZE_UNDEFINED equalsOp(DB::Int128 i, DB::Float32 f) { return i == static_cast<DB::Int128>(f) && static_cast<DB::Float32>(i) == f; }
 inline bool equalsOp(DB::Float64 f, DB::Int128 i) { return equalsOp(i, f); }
 inline bool equalsOp(DB::Float32 f, DB::Int128 i) { return equalsOp(i, f); }
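The two-sided check in these specializations guards against rounding in both casts. Worked example: u = 2^53 + 1 is not representable as Float64, so the naive `f == static_cast<double>(u)` leg alone would report equality with f = 2^53; the extra `static_cast<uint64_t>(f) == u` leg rejects it:

#include <cstdint>
#include <cstdio>

int main()
{
    uint64_t u = (1ULL << 53) + 1;   /// rounds to 2^53 when converted to double
    double f = 9007199254740992.0;   /// exactly 2^53

    bool naive = (f == static_cast<double>(u));          /// true: u was rounded to f
    bool accurate = static_cast<uint64_t>(f) == u
        && f == static_cast<double>(u);                  /// false: first leg fails

    printf("naive=%d accurate=%d\n", naive, accurate);   /// prints: naive=1 accurate=0
}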
|
@ -99,6 +99,13 @@ void Block::insertUnique(ColumnWithTypeAndName && elem)
|
||||
}
|
||||
|
||||
|
||||
void Block::erase(const std::set<size_t> & positions)
|
||||
{
|
||||
for (auto it = positions.rbegin(); it != positions.rend(); ++it)
|
||||
erase(*it);
|
||||
}
|
||||
|
||||
|
||||
void Block::erase(size_t position)
|
||||
{
|
||||
if (data.empty())
|
||||
|
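The reverse iteration is what keeps this loop correct: erasing from the highest position down means the remaining (smaller) positions are not shifted by earlier erasures. The same idea on a plain vector (illustrative, not Block's internals):

#include <cstdio>
#include <set>
#include <vector>

int main()
{
    std::vector<char> columns = {'a', 'b', 'c', 'd', 'e'};
    std::set<size_t> positions = {1, 3};   /// std::set iterates in ascending order

    /// Erase back-to-front so index 1 is still valid after index 3 is gone.
    for (auto it = positions.rbegin(); it != positions.rend(); ++it)
        columns.erase(columns.begin() + *it);

    for (char c : columns)
        printf("%c", c);   /// prints: ace
}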
@@ -2,6 +2,7 @@

 #include <vector>
 #include <list>
+#include <set>
 #include <map>
 #include <initializer_list>

@@ -51,6 +52,8 @@ public:
     void insertUnique(ColumnWithTypeAndName && elem);
     /// remove the column at the specified position
     void erase(size_t position);
+    /// remove the columns at the specified positions
+    void erase(const std::set<size_t> & positions);
     /// remove the column with the specified name
     void erase(const String & name);

@@ -94,8 +97,8 @@ public:
     /// Approximate number of allocated bytes in memory - for profiling and limits.
     size_t allocatedBytes() const;

-    operator bool() const { return !data.empty(); }
-    bool operator!() const { return data.empty(); }
+    operator bool() const { return !!columns(); }
+    bool operator!() const { return !this->operator bool(); }

     /** Get a list of column names separated by commas. */
     std::string dumpNames() const;
@@ -58,4 +58,20 @@ void BlockInfo::read(ReadBuffer & in)
     }
 }

+void BlockMissingValues::setBit(size_t column_idx, size_t row_idx)
+{
+    RowsBitMask & mask = rows_mask_by_column_id[column_idx];
+    mask.resize(row_idx + 1);
+    mask[row_idx] = true;
+}
+
+const BlockMissingValues::RowsBitMask & BlockMissingValues::getDefaultsBitmask(size_t column_idx) const
+{
+    static RowsBitMask none;
+    auto it = rows_mask_by_column_id.find(column_idx);
+    if (it != rows_mask_by_column_id.end())
+        return it->second;
+    return none;
+}
+
 }
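Usage shape of this bitmask: while parsing semi-structured input, setBit marks (column, row) cells that were absent, and the defaults-filling stream later walks getDefaultsBitmask per column. A minimal stand-in with the same two operations (illustrative, not the DB class):

#include <cstdio>
#include <unordered_map>
#include <vector>

struct MissingValuesSketch
{
    using RowsBitMask = std::vector<bool>;
    std::unordered_map<size_t, RowsBitMask> by_column;

    void setBit(size_t column_idx, size_t row_idx)
    {
        auto & mask = by_column[column_idx];
        if (mask.size() <= row_idx)
            mask.resize(row_idx + 1);   /// grow lazily; untouched rows stay false
        mask[row_idx] = true;
    }

    const RowsBitMask & getDefaultsBitmask(size_t column_idx) const
    {
        static const RowsBitMask none;  /// empty mask for columns with no gaps
        auto it = by_column.find(column_idx);
        return it != by_column.end() ? it->second : none;
    }
};

int main()
{
    MissingValuesSketch mv;
    mv.setBit(/*column*/ 2, /*row*/ 5);                 /// column 2 had no value in row 5
    printf("%zu\n", mv.getDefaultsBitmask(2).size());   /// prints 6
}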
@@ -1,5 +1,7 @@
 #pragma once

+#include <unordered_map>
+
 #include <Core/Types.h>

@@ -43,4 +45,24 @@ struct BlockInfo
     void read(ReadBuffer & in);
 };

+/// Block extention to support delayed defaults. AddingDefaultsBlockInputStream uses it to replace missing values with column defaults.
+class BlockMissingValues
+{
+public:
+    using RowsBitMask = std::vector<bool>; /// a bit per row for a column
+
+    const RowsBitMask & getDefaultsBitmask(size_t column_idx) const;
+    void setBit(size_t column_idx, size_t row_idx);
+    bool empty() const { return rows_mask_by_column_id.empty(); }
+    size_t size() const { return rows_mask_by_column_id.size(); }
+    void clear() { rows_mask_by_column_id.clear(); }
+
+private:
+    using RowsMaskByColumnId = std::unordered_map<size_t, RowsBitMask>;
+
+    /// If rows_mask_by_column_id[column_id][row_id] is true related value in Block should be replaced with column default.
+    /// It could contain less columns and rows then related block.
+    RowsMaskByColumnId rows_mask_by_column_id;
+};
+
 }
@@ -51,6 +51,7 @@
 /// Two-level (bucketed) aggregation is incompatible if servers are inconsistent in these rules
 /// (keys will be placed in different buckets and result will not be fully aggregated).
 #define DBMS_MIN_REVISION_WITH_CURRENT_AGGREGATION_VARIANT_SELECTION_METHOD 54408
+#define DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA 54410

 #define DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE 54405

@@ -60,8 +61,6 @@
 /// The boundary on which the blocks for asynchronous file operations should be aligned.
 #define DEFAULT_AIO_FILE_BLOCK_SIZE 4096

-#define DEFAULT_QUERY_LOG_FLUSH_INTERVAL_MILLISECONDS 7500
-
 #define DEFAULT_HTTP_READ_BUFFER_TIMEOUT 1800
 #define DEFAULT_HTTP_READ_BUFFER_CONNECTION_TIMEOUT 1
 /// Maximum namber of http-connections between two endpoints

@@ -105,3 +104,13 @@
 #elif defined(__SANITIZE_THREAD__)
     #define THREAD_SANITIZER 1
 #endif

+/// Explicitly allow undefined behaviour for certain functions. Use it as a function attribute.
+/// It is useful in case when compiler cannot see (and exploit) it, but UBSan can.
+/// Example: multiplication of signed integers with possibility of overflow when both sides are from user input.
+#if defined(__clang__)
+    #define NO_SANITIZE_UNDEFINED __attribute__((__no_sanitize__("undefined")))
+#else
+    /// It does not work in GCC. GCC 7 cannot recognize this attribute and GCC 8 simply ignores it.
+    #define NO_SANITIZE_UNDEFINED
+#endif
Some files were not shown because too many files have changed in this diff.