Merge branch 'master' of github.com:yandex/ClickHouse

BayoNet 2018-12-25 08:57:05 +03:00
commit 4e43a39c95
319 changed files with 4626 additions and 2381 deletions

.gitignore

@ -248,3 +248,6 @@ website/package-lock.json
# Ignore files for locally disabled tests
/dbms/tests/queries/**/*.disabled
# cquery cache
/.cquery-cache

CHANGELOG.md

@ -1,9 +1,131 @@
## ClickHouse release 18.16.1, 2018-12-21
### Bug fixes:
* Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
* JIT compilation of aggregate functions now works with LowCardinality columns. [#3838](https://github.com/yandex/ClickHouse/issues/3838)
### Improvements:
* Added the `low_cardinality_allow_in_native_format` setting (enabled by default). When disabled, LowCardinality columns will be converted to ordinary columns for SELECT queries and ordinary columns will be expected for INSERT queries. [#3879](https://github.com/yandex/ClickHouse/pull/3879)
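A minimal sketch of toggling the new setting for a session (only the setting name and default come from the entry above):

```sql
-- When disabled, LowCardinality columns travel over the Native protocol
-- as ordinary columns, e.g. for clients that predate LowCardinality.
SET low_cardinality_allow_in_native_format = 0;
```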
### Build improvements:
* Fixes for builds on macOS and ARM.
## ClickHouse release 18.16.0, 2018-12-14
### New features:
* `DEFAULT` expressions are evaluated for missing fields when loading data in semi-structured input formats (`JSONEachRow`, `TSKV`). [#3555](https://github.com/yandex/ClickHouse/pull/3555)
* The `ALTER TABLE` query now has the `MODIFY ORDER BY` action for changing the sorting key when adding or removing a table column. This is useful for tables in the `MergeTree` family that perform additional tasks when merging based on this sorting key, such as `SummingMergeTree`, `AggregatingMergeTree`, and so on (see the sketch after this list). [#3581](https://github.com/yandex/ClickHouse/pull/3581) [#3755](https://github.com/yandex/ClickHouse/pull/3755)
* For tables in the `MergeTree` family, now you can specify a different sorting key (`ORDER BY`) and index (`PRIMARY KEY`). The sorting key can be longer than the index. [#3581](https://github.com/yandex/ClickHouse/pull/3581)
* Added the `hdfs` table function and the `HDFS` table engine for importing and exporting data to HDFS. [chenxing-xc](https://github.com/yandex/ClickHouse/pull/3617)
* Added functions for working with base64: `base64Encode`, `base64Decode`, `tryBase64Decode`. [Alexander Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3350)
* Now you can use a parameter to configure the precision of the `uniqCombined` aggregate function (select the number of HyperLogLog cells). [#3406](https://github.com/yandex/ClickHouse/pull/3406)
* Added the `system.contributors` table that contains the names of everyone who made commits in ClickHouse. [#3452](https://github.com/yandex/ClickHouse/pull/3452)
* Added the ability to omit the partition for the `ALTER TABLE ... FREEZE` query in order to back up all partitions at once. [#3514](https://github.com/yandex/ClickHouse/pull/3514)
* Added `dictGet` and `dictGetOrDefault` functions that don't require specifying the type of return value. The type is determined automatically from the dictionary description. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3564)
* Now you can specify comments for a column in the table description and change it using `ALTER`. [#3377](https://github.com/yandex/ClickHouse/pull/3377)
* Reading is supported for `Join` type tables with simple keys. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Now you can specify the options `join_use_nulls`, `max_rows_in_join`, `max_bytes_in_join`, and `join_overflow_mode` when creating a `Join` type table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Added the `joinGet` function that allows you to use a `Join` type table like a dictionary. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Added the `partition_key`, `sorting_key`, `primary_key`, and `sampling_key` columns to the `system.tables` table in order to provide information about table keys. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
* Added the `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, and `is_in_sampling_key` columns to the `system.columns` table. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
* Added the `min_time` and `max_time` columns to the `system.parts` table. These columns are populated when the partitioning key is an expression consisting of `DateTime` columns. [Emmanuel Donin de Rosière](https://github.com/yandex/ClickHouse/pull/3800)
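Below is a hedged sketch of several of the features above in ClickHouse SQL; the `hits` table and all column names are hypothetical, not taken from the release:

```sql
-- A sorting key (ORDER BY) that is longer than the index (PRIMARY KEY).
CREATE TABLE hits
(
    site_id UInt32,
    event_date Date,
    user_id UInt64,
    views UInt64
)
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (site_id, event_date, user_id)
PRIMARY KEY (site_id, event_date);

-- MODIFY ORDER BY: extend the sorting key while adding the column in the same ALTER.
ALTER TABLE hits ADD COLUMN browser String, MODIFY ORDER BY (site_id, event_date, user_id, browser);

-- Omitting the partition backs up all partitions at once.
ALTER TABLE hits FREEZE;

-- base64 helpers; tryBase64Decode returns an empty string on invalid input instead of throwing.
SELECT base64Encode('ClickHouse') AS enc, base64Decode(enc) AS dec, tryBase64Decode('not base64') AS safe;
```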
### Bug fixes:
* Fixes and performance improvements for the `LowCardinality` data type. `GROUP BY` using `LowCardinality(Nullable(...))`. Getting the values of `extremes`. Processing of higher-order functions. `LEFT ARRAY JOIN`. Distributed `GROUP BY`. Functions that return `Array`. Execution of `ORDER BY`. Writing to `Distributed` tables (nicelulu). Backward compatibility for `INSERT` queries from old clients that implement the `Native` protocol. Support for `LowCardinality` for `JOIN`. Improved performance when working in a single stream. [#3823](https://github.com/yandex/ClickHouse/pull/3823) [#3803](https://github.com/yandex/ClickHouse/pull/3803) [#3799](https://github.com/yandex/ClickHouse/pull/3799) [#3769](https://github.com/yandex/ClickHouse/pull/3769) [#3744](https://github.com/yandex/ClickHouse/pull/3744) [#3681](https://github.com/yandex/ClickHouse/pull/3681) [#3651](https://github.com/yandex/ClickHouse/pull/3651) [#3649](https://github.com/yandex/ClickHouse/pull/3649) [#3641](https://github.com/yandex/ClickHouse/pull/3641) [#3632](https://github.com/yandex/ClickHouse/pull/3632) [#3568](https://github.com/yandex/ClickHouse/pull/3568) [#3523](https://github.com/yandex/ClickHouse/pull/3523) [#3518](https://github.com/yandex/ClickHouse/pull/3518)
* Fixed how the `select_sequential_consistency` option works. Previously, when this setting was enabled, an incomplete result was sometimes returned after beginning to write to a new partition. [#2863](https://github.com/yandex/ClickHouse/pull/2863)
* Databases are correctly specified when executing DDL `ON CLUSTER` queries and `ALTER UPDATE/DELETE`. [#3772](https://github.com/yandex/ClickHouse/pull/3772) [#3460](https://github.com/yandex/ClickHouse/pull/3460)
* Databases are correctly specified for subqueries inside a VIEW. [#3521](https://github.com/yandex/ClickHouse/pull/3521)
* Fixed a bug in `PREWHERE` with `FINAL` for `VersionedCollapsingMergeTree`. [7167bfd7](https://github.com/yandex/ClickHouse/commit/7167bfd7b365538f7a91c4307ad77e552ab4e8c1)
* Now you can use `KILL QUERY` to cancel queries that have not started yet because they are waiting for the table to be locked (see the sketch after this list). [#3517](https://github.com/yandex/ClickHouse/pull/3517)
* Corrected date and time calculations if the clocks were moved back at midnight (this happens in Iran, and happened in Moscow from 1981 to 1983). Previously, this led to the time being reset a day earlier than necessary, and also caused incorrect formatting of the date and time in text format. [#3819](https://github.com/yandex/ClickHouse/pull/3819)
* Fixed bugs in some cases of `VIEW` and subqueries that omit the database. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3521)
* Fixed a race condition when simultaneously reading from a `MATERIALIZED VIEW` and deleting a `MATERIALIZED VIEW` due to not locking the internal `MATERIALIZED VIEW`. [#3404](https://github.com/yandex/ClickHouse/pull/3404) [#3694](https://github.com/yandex/ClickHouse/pull/3694)
* Fixed the error `Lock handler cannot be nullptr.` [#3689](https://github.com/yandex/ClickHouse/pull/3689)
* Fixed query processing when the `compile_expressions` option is enabled (it's enabled by default). Nondeterministic constant expressions like the `now` function are no longer constant-folded. [#3457](https://github.com/yandex/ClickHouse/pull/3457)
* Fixed a crash when specifying a non-constant scale argument in `toDecimal32/64/128` functions.
* Fixed an error when trying to insert an array with `NULL` elements in the `Values` format into a column of type `Array` without `Nullable` (if `input_format_values_interpret_expressions` = 1). [#3487](https://github.com/yandex/ClickHouse/pull/3487) [#3503](https://github.com/yandex/ClickHouse/pull/3503)
* Fixed continuous error logging in `DDLWorker` if ZooKeeper is not available. [8f50c620](https://github.com/yandex/ClickHouse/commit/8f50c620334988b28018213ec0092fe6423847e2)
* Fixed the return type for `quantile*` functions from `Date` and `DateTime` types of arguments. [#3580](https://github.com/yandex/ClickHouse/pull/3580)
* Fixed the `WITH` clause if it specifies a simple alias without expressions. [#3570](https://github.com/yandex/ClickHouse/pull/3570)
* Fixed processing of queries with named sub-queries and qualified column names when `enable_optimize_predicate_expression` is enabled. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3588)
* Fixed the error `Attempt to attach to nullptr thread group` when working with materialized views. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3623)
* Fixed a crash when passing certain incorrect arguments to the `arrayReverse` function. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
* Fixed the buffer overflow in the `extractURLParameter` function. Improved performance. Added correct processing of strings containing zero bytes. [141e9799](https://github.com/yandex/ClickHouse/commit/141e9799e49201d84ea8e951d1bed4fb6d3dacb5)
* Fixed buffer overflow in the `lowerUTF8` and `upperUTF8` functions. Removed the ability to execute these functions over `FixedString` type arguments. [#3662](https://github.com/yandex/ClickHouse/pull/3662)
* Fixed a rare race condition when deleting `MergeTree` tables. [#3680](https://github.com/yandex/ClickHouse/pull/3680)
* Fixed a race condition when reading from `Buffer` tables and simultaneously performing `ALTER` or `DROP` on the target tables. [#3719](https://github.com/yandex/ClickHouse/pull/3719)
* Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
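For the `KILL QUERY` fix above, a usage sketch; the `query_id` value is a placeholder:

```sql
-- Queries stuck waiting for a table lock are now visible and cancellable before they start.
SELECT query_id, query FROM system.processes;
KILL QUERY WHERE query_id = 'placeholder-query-id';
```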
### Improvements:
* The server does not write the processed configuration files to the `/etc/clickhouse-server/` directory. Instead, it saves them in the `preprocessed_configs` directory inside `path`. This means the `clickhouse` user no longer needs write access to the `/etc/clickhouse-server/` directory, which improves security. [#2443](https://github.com/yandex/ClickHouse/pull/2443)
* The `min_merge_bytes_to_use_direct_io` option is set to 10 GiB by default. A merge that forms large parts of tables from the MergeTree family will be performed in `O_DIRECT` mode, which prevents excessive page cache eviction. [#3504](https://github.com/yandex/ClickHouse/pull/3504)
* Accelerated server start when there is a very large number of tables. [#3398](https://github.com/yandex/ClickHouse/pull/3398)
* Added a connection pool and HTTP `Keep-Alive` for connections between replicas. [#3594](https://github.com/yandex/ClickHouse/pull/3594)
* If the query syntax is invalid, the `400 Bad Request` code is returned in the `HTTP` interface (500 was returned previously). [31bc680a](https://github.com/yandex/ClickHouse/commit/31bc680ac5f4bb1d0360a8ba4696fa84bb47d6ab)
* The `join_default_strictness` option is set to `ALL` by default for compatibility. [120e2cbe](https://github.com/yandex/ClickHouse/commit/120e2cbe2ff4fbad626c28042d9b28781c805afe)
* Removed logging to `stderr` from the `re2` library for invalid or complex regular expressions. [#3723](https://github.com/yandex/ClickHouse/pull/3723)
* Added to the `Kafka` table engine: checks for subscriptions before beginning to read from Kafka, and the `kafka_max_block_size` setting for the table. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3396)
* The `cityHash64`, `farmHash64`, `metroHash64`, `sipHash64`, `halfMD5`, `murmurHash2_32`, `murmurHash2_64`, `murmurHash3_32`, and `murmurHash3_64` functions now work for any number of arguments and for arguments in the form of tuples. [#3451](https://github.com/yandex/ClickHouse/pull/3451) [#3519](https://github.com/yandex/ClickHouse/pull/3519)
* The `arrayReverse` function now works with any types of arrays. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
* Added an optional parameter: the slot size for the `timeSlots` function. [Kirill Shvakov](https://github.com/yandex/ClickHouse/pull/3724)
* For `FULL` and `RIGHT JOIN`, the `max_block_size` setting is used for a stream of non-joined data from the right table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3699)
* Added the `--secure` command line parameter in `clickhouse-benchmark` and `clickhouse-performance-test` to enable TLS. [#3688](https://github.com/yandex/ClickHouse/pull/3688) [#3690](https://github.com/yandex/ClickHouse/pull/3690)
* Type conversion when the structure of a `Buffer` type table does not match the structure of the destination table. [Vitaly Baranov](https://github.com/yandex/ClickHouse/pull/3603)
* Added the `tcp_keep_alive_timeout` option to enable keep-alive packets after inactivity for the specified time interval. [#3441](https://github.com/yandex/ClickHouse/pull/3441)
* Removed unnecessary quoting of values for the partition key in the `system.parts` table if it consists of a single column. [#3652](https://github.com/yandex/ClickHouse/pull/3652)
* The modulo function works for `Date` and `DateTime` data types. [#3385](https://github.com/yandex/ClickHouse/pull/3385)
* Added synonyms for the `POWER`, `LN`, `LCASE`, `UCASE`, `REPLACE`, `LOCATE`, `SUBSTR`, and `MID` functions. [#3774](https://github.com/yandex/ClickHouse/pull/3774) [#3763](https://github.com/yandex/ClickHouse/pull/3763) Some function names are case-insensitive for compatibility with the SQL standard. Added syntactic sugar `SUBSTRING(expr FROM start FOR length)` for compatibility with SQL (see the sketch after this list). [#3804](https://github.com/yandex/ClickHouse/pull/3804)
* Added the ability to `mlock` memory pages corresponding to `clickhouse-server` executable code to prevent it from being forced out of memory. This feature is disabled by default. [#3553](https://github.com/yandex/ClickHouse/pull/3553)
* Improved performance when reading from `O_DIRECT` (with the `min_bytes_to_use_direct_io` option enabled). [#3405](https://github.com/yandex/ClickHouse/pull/3405)
* Improved performance of the `dictGet...OrDefault` function for a constant key argument and a non-constant default argument. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3563)
* The `firstSignificantSubdomain` function now processes the domains `gov`, `mil`, and `edu`. [Igor Hatarist](https://github.com/yandex/ClickHouse/pull/3601) Improved performance. [#3628](https://github.com/yandex/ClickHouse/pull/3628)
* Ability to specify custom environment variables for starting `clickhouse-server` using the `SYS-V init.d` script by defining `CLICKHOUSE_PROGRAM_ENV` in `/etc/default/clickhouse`. [Pavlo Bashynskyi](https://github.com/yandex/ClickHouse/pull/3612)
* Correct return code for the clickhouse-server init script. [#3516](https://github.com/yandex/ClickHouse/pull/3516)
* The `system.metrics` table now has the `VersionInteger` metric, and `system.build_options` now includes `VERSION_INTEGER`, which contains the numeric form of the ClickHouse version, such as `18016000`. [#3644](https://github.com/yandex/ClickHouse/pull/3644)
* Removed the ability to compare the `Date` type with a number to avoid potential errors like `date = 2018-12-17`, where quotes around the date are omitted by mistake. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
* Fixed the behavior of stateful functions like `rowNumberInAllBlocks`. Previously they produced a result that was one number too large because they started counting during query analysis. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3729)
* If the `force_restore_data` file can't be deleted, an error message is displayed. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3794)
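A few of the improvements above, sketched as single expressions (nothing here goes beyond what the entries state):

```sql
SELECT
    SUBSTRING('ClickHouse' FROM 6 FOR 5) AS tail,  -- SQL-style syntactic sugar; returns 'House'
    LCASE('AbC') AS lower_case,                    -- case-insensitive synonym for lower
    POWER(2, 10) AS p,                             -- synonym for pow
    cityHash64('a', 1, (2, 'b')) AS h,             -- any number of arguments, tuples included
    toDate('2018-12-17') % 7 AS m;                 -- the modulo function now works for Date
```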
### Build improvements:
* Updated the `jemalloc` library, which fixes a potential memory leak. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3557)
* Profiling with `jemalloc` is enabled by default for debug builds. [2cc82f5c](https://github.com/yandex/ClickHouse/commit/2cc82f5cbe266421cd4c1165286c2c47e5ffcb15)
* Added the ability to run integration tests when only `Docker` is installed on the system. [#3650](https://github.com/yandex/ClickHouse/pull/3650)
* Added a fuzz test for expressions in `SELECT` queries. [#3442](https://github.com/yandex/ClickHouse/pull/3442)
* Added a stress test for commits, which performs functional tests in parallel and in random order to detect more race conditions. [#3438](https://github.com/yandex/ClickHouse/pull/3438)
* Improved the method for starting clickhouse-server in a Docker image. [Elghazal Ahmed](https://github.com/yandex/ClickHouse/pull/3663)
* For a Docker image, added support for initializing databases using files in the `/docker-entrypoint-initdb.d` directory. [Konstantin Lebedev](https://github.com/yandex/ClickHouse/pull/3695)
* Fixes for builds on ARM. [#3709](https://github.com/yandex/ClickHouse/pull/3709)
### Backward incompatible changes:
* Removed the ability to compare the `Date` type with a number. Instead of `toDate('2018-12-18') = 17883`, you must use explicit type conversion `= toDate(17883)` [#3687](https://github.com/yandex/ClickHouse/pull/3687)
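A before/after sketch of this change, using the example from the entry itself:

```sql
-- No longer accepted: comparing Date with a number.
-- SELECT toDate('2018-12-18') = 17883;
-- Required now: an explicit conversion.
SELECT toDate('2018-12-18') = toDate(17883);
```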
## ClickHouse release 18.14.19, 2018-12-19
### Bug fixes:
* Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
* Databases are correctly specified when executing DDL `ON CLUSTER` queries. [#3460](https://github.com/yandex/ClickHouse/pull/3460)
* Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
### Build improvements:
* Fixes for builds on ARM.
## ClickHouse release 18.14.18, 2018-12-04
### Bug fixes:
* Fixed an error in the `dictGet...` function for dictionaries of type `range`, if one of the arguments is constant and the other is not. [#3751](https://github.com/yandex/ClickHouse/pull/3751)
* Fixed an error that caused messages `netlink: '...': attribute type 1 has an invalid length` to be printed in the Linux kernel log, which happened only on sufficiently recent versions of the Linux kernel. [#3749](https://github.com/yandex/ClickHouse/pull/3749)
* Fixed segfault in the function `empty` for an argument of `FixedString` type. [Daniel, Dao Quang Minh](https://github.com/yandex/ClickHouse/pull/3703)
* Fixed excessive memory allocation when using a large value of the `max_query_size` setting (a memory chunk of `max_query_size` bytes was preallocated at once). [#3720](https://github.com/yandex/ClickHouse/pull/3720)
### Build changes:
@ -90,7 +212,7 @@
### Improvements:
* Significantly reduced memory consumption for queries with `ORDER BY` and `LIMIT`. See the `max_bytes_before_remerge_sort` setting. [#3205](https://github.com/yandex/ClickHouse/pull/3205)
* In the absence of `JOIN` (`LEFT`, `INNER`, ...), `INNER JOIN` is assumed. [#3147](https://github.com/yandex/ClickHouse/pull/3147)
* Qualified asterisks work correctly in queries with `JOIN`. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3202)
* The `ODBC` table engine correctly chooses the method for quoting identifiers in the SQL dialect of a remote database. [Alexandr Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3210)
@ -127,7 +249,7 @@
* If after merging data parts, the checksum for the resulting part differs from the result of the same merge in another replica, the result of the merge is deleted and the data part is downloaded from the other replica (this is the correct behavior). But after downloading the data part, it couldn't be added to the working set because of an error that the part already exists (because the data part was deleted with some delay after the merge). This led to cyclical attempts to download the same data. [#3194](https://github.com/yandex/ClickHouse/pull/3194)
* Fixed incorrect calculation of total memory consumption by queries (because of incorrect calculation, the `max_memory_usage_for_all_queries` setting worked incorrectly and the `MemoryTracking` metric had an incorrect value). This error occurred in version 18.12.13. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3344)
* Fixed the functionality of `CREATE TABLE ... ON CLUSTER ... AS SELECT ...`. This error occurred in version 18.12.13. [#3247](https://github.com/yandex/ClickHouse/pull/3247)
* Fixed unnecessary preparation of data structures for `JOIN`s on the server that initiates the query if the `JOIN` is only performed on remote servers. [#3340](https://github.com/yandex/ClickHouse/pull/3340)
* Fixed bugs in the `Kafka` engine: deadlocks after exceptions when starting to read data, and locks upon completion. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3215)
* For `Kafka` tables, the optional `schema` parameter was not passed (the schema of the `Cap'n'Proto` format). [Vojtech Splichal](https://github.com/yandex/ClickHouse/pull/3150)
* If the ensemble of ZooKeeper servers has servers that accept the connection but then immediately close it instead of responding to the handshake, ClickHouse chooses to connect another server. Previously, this produced the error `Cannot read all data. Bytes read: 0. Bytes expected: 4.` and the server couldn't start. [8218cf3a](https://github.com/yandex/ClickHouse/commit/8218cf3a5f39a43401953769d6d12a0bb8d29da9)
@ -208,7 +330,7 @@
* Added the `DECIMAL(digits, scale)` data type (`Decimal32(scale)`, `Decimal64(scale)`, `Decimal128(scale)`). To enable it, use the setting `allow_experimental_decimal_type` (see the sketch after this list). [#2846](https://github.com/yandex/ClickHouse/pull/2846) [#2970](https://github.com/yandex/ClickHouse/pull/2970) [#3008](https://github.com/yandex/ClickHouse/pull/3008) [#3047](https://github.com/yandex/ClickHouse/pull/3047)
* New `WITH ROLLUP` modifier for `GROUP BY` (alternative syntax: `GROUP BY ROLLUP(...)`). [#2948](https://github.com/yandex/ClickHouse/pull/2948)
* In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2787)
* Added support for JOIN with table functions. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2907)
* Autocomplete by pressing Tab in clickhouse-client. [Sergey Shcherbin](https://github.com/yandex/ClickHouse/pull/2447)
* Ctrl+C in clickhouse-client clears a query that was entered. [#2877](https://github.com/yandex/ClickHouse/pull/2877)
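A hedged sketch of the `DECIMAL` and `WITH ROLLUP` entries above; the `hits` table and its columns are hypothetical:

```sql
SET allow_experimental_decimal_type = 1;   -- required in this release, per the entry above
SELECT toDecimal64('123.4567', 4) AS d;    -- Decimal64 with scale 4

-- WITH ROLLUP adds subtotal rows per site_id and a grand-total row.
SELECT site_id, event_date, sum(views) AS v
FROM hits
GROUP BY site_id, event_date WITH ROLLUP;
```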
@ -294,7 +416,7 @@
### Backward incompatible changes:
* In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level.
### Build changes:
@ -338,7 +460,7 @@
* Fixed an error for concurrent `Set` or `Join`. [Amos Bird](https://github.com/yandex/ClickHouse/pull/2823)
* Fixed the `Block structure mismatch in UNION stream: different number of columns` error that occurred for `UNION ALL` queries inside a sub-query if one of the `SELECT` queries contains duplicate column names. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2094)
* Fixed a memory leak if an exception occurred when connecting to a MySQL server.
* Fixed incorrect clickhouse-client response code in case of a query error.
* Fixed incorrect behavior of materialized views containing DISTINCT. [#2795](https://github.com/yandex/ClickHouse/issues/2795)
### Backward incompatible changes
@ -452,7 +574,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si
* Fixed a problem with a very small timeout for sockets (one second) for reading and writing when sending and downloading replicated data, which made it impossible to download larger parts if there is a load on the network or disk (it resulted in cyclical attempts to download parts). This error occurred in version 1.1.54388.
* Fixed issues when using chroot in ZooKeeper if you inserted duplicate data blocks in the table.
* The `has` function now works correctly for an array with Nullable elements ([#2115](https://github.com/yandex/ClickHouse/issues/2115)).
* The `system.tables` table now works correctly when used in distributed queries. The `metadata_modification_time` and `engine_full` columns are now non-virtual. Fixed an error that occurred if only these columns were queried from the table.
* Fixed how an empty `TinyLog` table works after inserting an empty data block ([#2563](https://github.com/yandex/ClickHouse/issues/2563)).
* The `system.zookeeper` table works if the value of the node in ZooKeeper is NULL.
@ -701,7 +823,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si
* Added the `parseDateTimeBestEffort`, `parseDateTimeBestEffortOrZero`, and `parseDateTimeBestEffortOrNull` functions to read the DateTime from a string containing text in a wide variety of possible formats.
* Data can be partially reloaded from external dictionaries during updating (load just the records in which the value of the specified field is greater than in the previous download) (Arsen Hakobyan).
* Added the `cluster` table function. Example: `cluster(cluster_name, db, table)`. The `remote` table function can accept the cluster name as the first argument, if it is specified as an identifier (see the sketch after this list).
* The `remote` and `cluster` table functions can be used in `INSERT` queries.
* Added the `create_table_query` and `engine_full` virtual columns to the `system.tables` table. The `metadata_modification_time` column is virtual.
* Added the `data_path` and `metadata_path` columns to the `system.tables` and `system.databases` tables, and added the `path` column to the `system.parts` and `system.parts_columns` tables.
* Added additional information about merges in the `system.part_log` table.
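Sketches of the `parseDateTimeBestEffort` and `cluster`/`remote` entries above; `my_cluster` is a hypothetical cluster name from the server configuration, and `default.hits` a hypothetical table:

```sql
-- Reads DateTime from a wide variety of textual formats; the OrNull variant yields NULL on failure.
SELECT parseDateTimeBestEffortOrNull('18 Dec 2018 10:00:00 +0300') AS t;

-- Query every shard of a configured cluster.
SELECT count() FROM cluster('my_cluster', default, hits);

-- Table functions can now be the target of an INSERT (syntax sketch).
INSERT INTO FUNCTION remote('127.0.0.1', default.hits) VALUES (1, '2018-12-18', 42, 1);
```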
@ -1040,7 +1162,7 @@ This release contains bug fixes for the previous release 1.1.54310:
### Please note when upgrading:
* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT queries will fail with the message "Merges are processing significantly slower than inserts." Use the `SELECT * FROM system.merges` query to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting. To do this, go to the `<merge_tree>` section in config.xml, set `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>`, and restart the server.
## ClickHouse release 1.1.54284, 2017-08-29
@ -1133,7 +1255,7 @@ This release contains bug fixes for the previous release 1.1.54276:
### New features:
* Distributed DDL (for example, `CREATE TABLE ON CLUSTER`).
* The replicated query `ALTER TABLE CLEAR COLUMN IN PARTITION`.
* The engine for Dictionary tables (access to dictionary data in the form of a table).
* Dictionary database engine (this type of database automatically has Dictionary tables available for all the connected external dictionaries).
* You can check for updates to the dictionary by sending a request to the source.

CHANGELOG_RU.md

@ -1,100 +1,129 @@
## ClickHouse release 18.16.1, 2018-12-21
### Bug fixes:
* Fixed an issue that made it impossible to update dictionaries with the ODBC source. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
* JIT compilation of aggregate functions now works with LowCardinality columns. [#3838](https://github.com/yandex/ClickHouse/issues/3838)
### Improvements:
* Added the `low_cardinality_allow_in_native_format` setting (enabled by default). When disabled, LowCardinality columns in the Native format are converted to the corresponding ordinary type for SELECT and expected in that type for INSERT. [#3879](https://github.com/yandex/ClickHouse/pull/3879)
### Build improvements:
* Fixes for builds on macOS and ARM.
## ClickHouse release 18.16.0, 2018-12-14 ## ClickHouse release 18.16.0, 2018-12-14
### Новые возможности: ### Новые возможности:
* Вычисление `DEFAULT` выражений для отсутствующих полей при загрузке данных в полуструктурированных форматах (`JSONEachRow`, `TSKV`). * Вычисление `DEFAULT` выражений для отсутствующих полей при загрузке данных в полуструктурированных форматах (`JSONEachRow`, `TSKV`). [#3555](https://github.com/yandex/ClickHouse/pull/3555)
* Для запроса `ALTER TABLE` добавлено действие `MODIFY ORDER BY` для изменения ключа сортировки при одновременном добавлении или удалении столбца таблицы. Это полезно для таблиц семейства `MergeTree`, выполняющих дополнительную работу при слияниях, согласно этому ключу сортировки, как например, `SummingMergeTree`, `AggregatingMergeTree` и т. п. * Для запроса `ALTER TABLE` добавлено действие `MODIFY ORDER BY` для изменения ключа сортировки при одновременном добавлении или удалении столбца таблицы. Это полезно для таблиц семейства `MergeTree`, выполняющих дополнительную работу при слияниях, согласно этому ключу сортировки, как например, `SummingMergeTree`, `AggregatingMergeTree` и т. п. [#3581](https://github.com/yandex/ClickHouse/pull/3581) [#3755](https://github.com/yandex/ClickHouse/pull/3755)
* Для таблиц семейства `MergeTree` появилась возможность указать различный ключ сортировки (`ORDER BY`) и индекс (`PRIMARY KEY`). Ключ сортировки может быть длиннее, чем индекс. * Для таблиц семейства `MergeTree` появилась возможность указать различный ключ сортировки (`ORDER BY`) и индекс (`PRIMARY KEY`). Ключ сортировки может быть длиннее, чем индекс. [#3581](https://github.com/yandex/ClickHouse/pull/3581)
* Добавлена табличная функция `hdfs` и движок таблиц `HDFS` для импорта и экспорта данных в HDFS. * Добавлена табличная функция `hdfs` и движок таблиц `HDFS` для импорта и экспорта данных в HDFS. [chenxing-xc](https://github.com/yandex/ClickHouse/pull/3617)
* Добавлены функции для работы с base64: `base64Encode`, `base64Decode`, `tryBase64Decode`. * Добавлены функции для работы с base64: `base64Encode`, `base64Decode`, `tryBase64Decode`. [Alexander Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3350)
* Для агрегатной функции `uniqCombined` появилась возможность настраивать точность работы с помощью параметра (выбирать количество ячеек HyperLogLog). * Для агрегатной функции `uniqCombined` появилась возможность настраивать точность работы с помощью параметра (выбирать количество ячеек HyperLogLog). [#3406](https://github.com/yandex/ClickHouse/pull/3406)
* Добавлена таблица `system.contributors`, содержащая имена всех, кто делал коммиты в ClickHouse. * Добавлена таблица `system.contributors`, содержащая имена всех, кто делал коммиты в ClickHouse. [#3452](https://github.com/yandex/ClickHouse/pull/3452)
* Добавлена возможность не указывать партицию для запроса `ALTER TABLE ... FREEZE` для бэкапа сразу всех партиций. * Добавлена возможность не указывать партицию для запроса `ALTER TABLE ... FREEZE` для бэкапа сразу всех партиций. [#3514](https://github.com/yandex/ClickHouse/pull/3514)
* Добавлены функции `dictGet`, `dictGetOrDefault` без указания типа возвращаемого значения. Тип определяется автоматически из описания словаря. * Добавлены функции `dictGet`, `dictGetOrDefault` без указания типа возвращаемого значения. Тип определяется автоматически из описания словаря. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3564)
* Возможность указания комментария для столбца в описании таблицы и изменения его с помощью `ALTER`. * Возможность указания комментария для столбца в описании таблицы и изменения его с помощью `ALTER`. [#3377](https://github.com/yandex/ClickHouse/pull/3377)
* Возможность чтения из таблицы типа `Join` в случае простых ключей. * Возможность чтения из таблицы типа `Join` в случае простых ключей. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Возможность указания настроек `join_use_nulls`, `max_rows_in_join`, `max_bytes_in_join`, `join_overflow_mode` при создании таблицы типа `Join`. * Возможность указания настроек `join_use_nulls`, `max_rows_in_join`, `max_bytes_in_join`, `join_overflow_mode` при создании таблицы типа `Join`. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Добавлена функция `joinGet`, позволяющая использовать таблицы типа `Join` как словарь. * Добавлена функция `joinGet`, позволяющая использовать таблицы типа `Join` как словарь. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Добавлены столбцы `partition_key`, `sorting_key`, `primary_key`, `sampling_key` в таблицу `system.tables`, позволяющие получить информацию о ключах таблицы. * Добавлены столбцы `partition_key`, `sorting_key`, `primary_key`, `sampling_key` в таблицу `system.tables`, позволяющие получить информацию о ключах таблицы. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
* Добавлены столбцы `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, `is_in_sampling_key` в таблицу `system.columns`. * Добавлены столбцы `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, `is_in_sampling_key` в таблицу `system.columns`. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
* Добавлены столбцы `min_time`, `max_time` в таблицу `system.parts`. Эти столбцы заполняются, если ключ партиционирования является выражением от столбцов типа `DateTime`. * Добавлены столбцы `min_time`, `max_time` в таблицу `system.parts`. Эти столбцы заполняются, если ключ партиционирования является выражением от столбцов типа `DateTime`. [Emmanuel Donin de Rosière](https://github.com/yandex/ClickHouse/pull/3800)
### Исправления ошибок: ### Исправления ошибок:
* Исправления и улучшения производительности для типа данных `LowCardinality`. `GROUP BY` по `LowCardinality(Nullable(...))`. Получение `extremes` значений. Выполнение функций высшего порядка. `LEFT ARRAY JOIN`. Распределённый `GROUP BY`. Функции, возвращающие `Array`. Выполнение `ORDER BY`. Запись в `Distributed` таблицы (nicelulu). Обратная совместимость для запросов `INSERT` от старых клиентов, реализующих `Native` протокол. Поддержка `LowCardinality` для `JOIN`. Производительность при работе в один поток. * Исправления и улучшения производительности для типа данных `LowCardinality`. `GROUP BY` по `LowCardinality(Nullable(...))`. Получение `extremes` значений. Выполнение функций высшего порядка. `LEFT ARRAY JOIN`. Распределённый `GROUP BY`. Функции, возвращающие `Array`. Выполнение `ORDER BY`. Запись в `Distributed` таблицы (nicelulu). Обратная совместимость для запросов `INSERT` от старых клиентов, реализующих `Native` протокол. Поддержка `LowCardinality` для `JOIN`. Производительность при работе в один поток. [#3823](https://github.com/yandex/ClickHouse/pull/3823) [#3803](https://github.com/yandex/ClickHouse/pull/3803) [#3799](https://github.com/yandex/ClickHouse/pull/3799) [#3769](https://github.com/yandex/ClickHouse/pull/3769) [#3744](https://github.com/yandex/ClickHouse/pull/3744) [#3681](https://github.com/yandex/ClickHouse/pull/3681) [#3651](https://github.com/yandex/ClickHouse/pull/3651) [#3649](https://github.com/yandex/ClickHouse/pull/3649) [#3641](https://github.com/yandex/ClickHouse/pull/3641) [#3632](https://github.com/yandex/ClickHouse/pull/3632) [#3568](https://github.com/yandex/ClickHouse/pull/3568) [#3523](https://github.com/yandex/ClickHouse/pull/3523) [#3518](https://github.com/yandex/ClickHouse/pull/3518)
* Исправлена работа настройки `select_sequential_consistency`. Ранее, при включенной настройке, после начала записи в новую партицию, мог возвращаться неполный результат. * Исправлена работа настройки `select_sequential_consistency`. Ранее, при включенной настройке, после начала записи в новую партицию, мог возвращаться неполный результат. [#2863](https://github.com/yandex/ClickHouse/pull/2863)
* Корректное указание базы данных при выполнении DDL запросов `ON CLUSTER`, а также при выполнении `ALTER UPDATE/DELETE`. * Корректное указание базы данных при выполнении DDL запросов `ON CLUSTER`, а также при выполнении `ALTER UPDATE/DELETE`. [#3772](https://github.com/yandex/ClickHouse/pull/3772) [#3460](https://github.com/yandex/ClickHouse/pull/3460)
* Корректное указание базы данных для подзапросов внутри VIEW. * Корректное указание базы данных для подзапросов внутри VIEW. [#3521](https://github.com/yandex/ClickHouse/pull/3521)
* Исправлена работа `PREWHERE` с `FINAL` для `VersionedCollapsingMergeTree`. * Исправлена работа `PREWHERE` с `FINAL` для `VersionedCollapsingMergeTree`. [7167bfd7](https://github.com/yandex/ClickHouse/commit/7167bfd7b365538f7a91c4307ad77e552ab4e8c1)
* Возможность с помощью запроса `KILL QUERY` отмены запросов, которые ещё не начали выполняться из-за ожидания блокировки таблицы. * Возможность с помощью запроса `KILL QUERY` отмены запросов, которые ещё не начали выполняться из-за ожидания блокировки таблицы. [#3517](https://github.com/yandex/ClickHouse/pull/3517)
* Исправлены расчёты с датой и временем в случае, если стрелки часов были переведены назад в полночь (это происходит в Иране, а также было Москве с 1981 по 1983 год). Ранее это приводило к тому, что стрелки часов переводились на сутки раньше, чем нужно, а также приводило к некорректному форматированию даты-с-временем в текстовом виде. * Исправлены расчёты с датой и временем в случае, если стрелки часов были переведены назад в полночь (это происходит в Иране, а также было Москве с 1981 по 1983 год). Ранее это приводило к тому, что стрелки часов переводились на сутки раньше, чем нужно, а также приводило к некорректному форматированию даты-с-временем в текстовом виде. [#3819](https://github.com/yandex/ClickHouse/pull/3819)
* Исправлена работа некоторых случаев `VIEW` и подзапросов без указания базы данных. * Исправлена работа некоторых случаев `VIEW` и подзапросов без указания базы данных. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3521)
* Исправлен race condition при одновременном чтении из `MATERIALIZED VIEW` и удалением `MATERIALIZED VIEW` из-за отсутствия блокировки внутренней таблицы `MATERIALIZED VIEW`. * Исправлен race condition при одновременном чтении из `MATERIALIZED VIEW` и удалением `MATERIALIZED VIEW` из-за отсутствия блокировки внутренней таблицы `MATERIALIZED VIEW`. [#3404](https://github.com/yandex/ClickHouse/pull/3404) [#3694](https://github.com/yandex/ClickHouse/pull/3694)
* Исправлена ошибка `Lock handler cannot be nullptr.` * Исправлена ошибка `Lock handler cannot be nullptr.` [#3689](https://github.com/yandex/ClickHouse/pull/3689)
* Исправления выполнения запросов при включенной настройке `compile_expressions`ыключена по-умолчанию) - убрана свёртка недетерминированных константных выражений, как например, функции `now`. * Исправления выполнения запросов при включенной настройке `compile_expressions` (включена по-умолчанию) - убрана свёртка недетерминированных константных выражений, как например, функции `now`. [#3457](https://github.com/yandex/ClickHouse/pull/3457)
* Исправлено падение при указании неконстантного аргумента scale в функциях `toDecimal32/64/128`. * Исправлено падение при указании неконстантного аргумента scale в функциях `toDecimal32/64/128`.
* Исправлена ошибка при попытке вставки в формате `Values` массива с `NULL` элементами в столбец типа `Array` без `Nullable` (в случае `input_format_values_interpret_expressions` = 1). * Исправлена ошибка при попытке вставки в формате `Values` массива с `NULL` элементами в столбец типа `Array` без `Nullable` (в случае `input_format_values_interpret_expressions` = 1). [#3487](https://github.com/yandex/ClickHouse/pull/3487) [#3503](https://github.com/yandex/ClickHouse/pull/3503)
* Исправлено непрерывное логгирование ошибок в `DDLWorker`, если ZooKeeper недоступен. * Исправлено непрерывное логгирование ошибок в `DDLWorker`, если ZooKeeper недоступен. [8f50c620](https://github.com/yandex/ClickHouse/commit/8f50c620334988b28018213ec0092fe6423847e2)
* Исправлен тип возвращаемого значения для функций `quantile*` от аргументов типа `Date` и `DateTime`. * Исправлен тип возвращаемого значения для функций `quantile*` от аргументов типа `Date` и `DateTime`. [#3580](https://github.com/yandex/ClickHouse/pull/3580)
* Исправлена работа секции `WITH`, если она задаёт простой алиас без выражений. * Исправлена работа секции `WITH`, если она задаёт простой алиас без выражений. [#3570](https://github.com/yandex/ClickHouse/pull/3570)
* Исправлена обработка запросов с именованными подзапросами и квалифицированными именами столбцов при включенной настройке `enable_optimize_predicate_expression`. * Исправлена обработка запросов с именованными подзапросами и квалифицированными именами столбцов при включенной настройке `enable_optimize_predicate_expression`. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3588)
* Исправлена ошибка `Attempt to attach to nullptr thread group` при работе материализованных представлений. * Исправлена ошибка `Attempt to attach to nullptr thread group` при работе материализованных представлений. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3623)
* Исправлено падение при передаче некоторых некорректных аргументов в функцию `arrayReverse`. * Исправлено падение при передаче некоторых некорректных аргументов в функцию `arrayReverse`. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
* Исправлен buffer overflow в функции `extractURLParameter`. Увеличена производительность. Добавлена корректная обработка строк, содержащих нулевые байты. * Исправлен buffer overflow в функции `extractURLParameter`. Увеличена производительность. Добавлена корректная обработка строк, содержащих нулевые байты. [141e9799](https://github.com/yandex/ClickHouse/commit/141e9799e49201d84ea8e951d1bed4fb6d3dacb5)
* Исправлен buffer overflow в функциях `lowerUTF8`, `upperUTF8`. Удалена возможность выполнения этих функций над аргументами типа `FixedString`. * Исправлен buffer overflow в функциях `lowerUTF8`, `upperUTF8`. Удалена возможность выполнения этих функций над аргументами типа `FixedString`. [#3662](https://github.com/yandex/ClickHouse/pull/3662)
* Исправлен редкий race condition при удалении таблиц типа `MergeTree`. * Исправлен редкий race condition при удалении таблиц типа `MergeTree`. [#3680](https://github.com/yandex/ClickHouse/pull/3680)
* Исправлен race condition при чтении из таблиц типа `Buffer` и одновременном `ALTER` либо `DROP` таблиц назначения. * Исправлен race condition при чтении из таблиц типа `Buffer` и одновременном `ALTER` либо `DROP` таблиц назначения. [#3719](https://github.com/yandex/ClickHouse/pull/3719)
* Исправлен segfault в случае превышения ограничения `max_temporary_non_const_columns`. * Исправлен segfault в случае превышения ограничения `max_temporary_non_const_columns`. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
### Improvements:

* The server writes processed configuration files not to the `/etc/clickhouse-server/` directory, but to the `preprocessed_configs` directory inside `path`. This allows keeping `/etc/clickhouse-server/` non-writable for the `clickhouse` user, which improves security. [#2443](https://github.com/yandex/ClickHouse/pull/2443)
* The `min_merge_bytes_to_use_direct_io` setting is set to 10 GiB by default. Merges that form large parts of tables from the MergeTree family are performed in `O_DIRECT` mode, which prevents page cache eviction. [#3504](https://github.com/yandex/ClickHouse/pull/3504)
* Accelerated server startup when there is a very large number of tables. [#3398](https://github.com/yandex/ClickHouse/pull/3398)
* Added a connection pool and HTTP `Keep-Alive` for connections between replicas. [#3594](https://github.com/yandex/ClickHouse/pull/3594)
* If the query syntax is invalid, the `400 Bad Request` code is returned in the `HTTP` interface (500 was returned previously). [31bc680a](https://github.com/yandex/ClickHouse/commit/31bc680ac5f4bb1d0360a8ba4696fa84bb47d6ab)
* The `join_default_strictness` option is set to `ALL` by default for compatibility. [120e2cbe](https://github.com/yandex/ClickHouse/commit/120e2cbe2ff4fbad626c28042d9b28781c805afe)
* Removed logging to `stderr` from the `re2` library for invalid or complex regular expressions. [#3723](https://github.com/yandex/ClickHouse/pull/3723)
* For the `Kafka` table engine: checks for subscriptions before beginning to read from Kafka; the `kafka_max_block_size` setting for the table. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3396)
* The `cityHash64`, `farmHash64`, `metroHash64`, `sipHash64`, `halfMD5`, `murmurHash2_32`, `murmurHash2_64`, `murmurHash3_32`, and `murmurHash3_64` functions now work for any number of arguments, as well as for arguments in the form of tuples (see the example after this list). [#3451](https://github.com/yandex/ClickHouse/pull/3451) [#3519](https://github.com/yandex/ClickHouse/pull/3519)
* The `arrayReverse` function now works with any types of arrays. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
* Added an optional parameter: the slot size for the `timeSlots` function (see the example after this list). [Kirill Shvakov](https://github.com/yandex/ClickHouse/pull/3724)
* For `FULL` and `RIGHT JOIN`, the `max_block_size` setting is used for the stream of non-joined data from the right table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3699)
* Added the `--secure` command-line parameter in `clickhouse-benchmark` and `clickhouse-performance-test` to enable TLS. [#3688](https://github.com/yandex/ClickHouse/pull/3688) [#3690](https://github.com/yandex/ClickHouse/pull/3690)
* Type conversion when the structure of a `Buffer` type table does not match the structure of the destination table. [Vitaly Baranov](https://github.com/yandex/ClickHouse/pull/3603)
* Added the `tcp_keep_alive_timeout` option to enable keep-alive packets after inactivity for the specified time interval. [#3441](https://github.com/yandex/ClickHouse/pull/3441)
* Removed unnecessary quoting of values for the partition key in the `system.parts` table if it consists of a single column. [#3652](https://github.com/yandex/ClickHouse/pull/3652)
* The modulo function works for the `Date` and `DateTime` data types. [#3385](https://github.com/yandex/ClickHouse/pull/3385)
* Added synonyms for the `POWER`, `LN`, `LCASE`, `UCASE`, `REPLACE`, `LOCATE`, `SUBSTR`, and `MID` functions. [#3774](https://github.com/yandex/ClickHouse/pull/3774) [#3763](https://github.com/yandex/ClickHouse/pull/3763) Some function names are case-insensitive for compatibility with the SQL standard. Added syntactic sugar `SUBSTRING(expr FROM start FOR length)` for compatibility with SQL (see the example after this list). [#3804](https://github.com/yandex/ClickHouse/pull/3804)
* Added the ability to `mlock` memory pages corresponding to `clickhouse-server` executable code to prevent them from being forced out of memory. This feature is disabled by default. [#3553](https://github.com/yandex/ClickHouse/pull/3553)
* Improved performance of reading with `O_DIRECT` (with the `min_bytes_to_use_direct_io` option enabled). [#3405](https://github.com/yandex/ClickHouse/pull/3405)
* Improved performance of the `dictGet...OrDefault` function for a constant key argument and a non-constant default argument. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3563)
* The `firstSignificantSubdomain` function now processes the domains `gov`, `mil`, and `edu`. [Igor Hatarist](https://github.com/yandex/ClickHouse/pull/3601) Improved performance. [#3628](https://github.com/yandex/ClickHouse/pull/3628)
* Ability to specify custom environment variables for starting `clickhouse-server` using the `SYS-V init.d` script by defining `CLICKHOUSE_PROGRAM_ENV` in `/etc/default/clickhouse`. [Pavlo Bashynskyi](https://github.com/yandex/ClickHouse/pull/3612)
* Correct return code for the clickhouse-server init script. [#3516](https://github.com/yandex/ClickHouse/pull/3516)
* The `system.metrics` table now has the `VersionInteger` metric, and `system.build_options` has the added line `VERSION_INTEGER`, which contains the numeric form of the ClickHouse version, such as `18016000`. [#3644](https://github.com/yandex/ClickHouse/pull/3644)
* Removed the ability to compare the `Date` type with a number, to avoid potential errors like `date = 2018-12-17`, where quotes around the date were omitted by mistake. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
* Fixed the behavior of stateful functions like `rowNumberInAllBlocks`. They previously output a result that was one number larger because they started during query analysis. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3729)
* If the `force_restore_data` file can't be deleted, an error message is displayed. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3794)
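A few of these improvements are easiest to see in a query. The following sketch is purely illustrative (an 18.16 server is assumed and the literal values are made up); it shows hashing a tuple, the optional slot-size argument of `timeSlots`, and the SQL-compatible `SUBSTRING(expr FROM start FOR length)` form:

```sql
-- Hash functions accept any number of arguments, including tuples.
SELECT sipHash64('hello', 42) AS h1, cityHash64(('hello', 42)) AS h2;

-- timeSlots with the optional slot size (600 seconds instead of the default 1800).
SELECT timeSlots(toDateTime('2018-12-18 10:15:00'), toUInt32(3600), 600);

-- SQL-standard syntactic sugar, equivalent to substring('ClickHouse', 1, 5).
SELECT SUBSTRING('ClickHouse' FROM 1 FOR 5);
```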
### Build improvements:

* Updated the `jemalloc` library, which fixes a potential memory leak. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3557)
* Profiling with `jemalloc` is enabled by default for debug builds. [2cc82f5c](https://github.com/yandex/ClickHouse/commit/2cc82f5cbe266421cd4c1165286c2c47e5ffcb15)
* Added the ability to run integration tests when only `Docker` is installed on the system. [#3650](https://github.com/yandex/ClickHouse/pull/3650)
* Added a fuzz expression test in SELECT queries. [#3442](https://github.com/yandex/ClickHouse/pull/3442)
* Added a per-commit stress test, which performs functional tests in parallel and in random order to detect more race conditions. [#3438](https://github.com/yandex/ClickHouse/pull/3438)
* Improved the method for starting clickhouse-server in a Docker image. [Elghazal Ahmed](https://github.com/yandex/ClickHouse/pull/3663)
* For a Docker image, added support for initializing databases using files in the `/docker-entrypoint-initdb.d` directory. [Konstantin Lebedev](https://github.com/yandex/ClickHouse/pull/3695)
* Fixes for builds on ARM. [#3709](https://github.com/yandex/ClickHouse/pull/3709)
### Backward incompatible changes:

* Removed the ability to compare the `Date` type with a number. Instead of `toDate('2018-12-18') = 17883`, you must use explicit type conversion `= toDate(17883)`, as shown below. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
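A hypothetical `events` table with a `date` column illustrates the migration (table and column names are made up for illustration):

```sql
-- No longer accepted: comparing a Date column with a bare number.
-- SELECT count() FROM events WHERE date = 17883;

-- Use an explicit conversion of the day number instead:
SELECT count() FROM events WHERE date = toDate(17883);

-- Or, more commonly, compare with a date literal:
SELECT count() FROM events WHERE date = toDate('2018-12-18');
```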
## ClickHouse release 18.14.19, 2018-12-19
### Bug fixes:

* Fixed an error that led to problems with updating dictionaries with the ODBC source. [#3825](https://github.com/yandex/ClickHouse/issues/3825), [#3829](https://github.com/yandex/ClickHouse/issues/3829)
* Fixed a segfault when the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
* Correct specification of the database when executing `ON CLUSTER` DDL queries. [#3460](https://github.com/yandex/ClickHouse/pull/3460)

### Build improvements:

* Fixes for builds on ARM.
## ClickHouse release 18.14.18, 2018-12-04

### Bug fixes:

* Fixed an error in the `dictGet...` function for dictionaries of type `range` if one of the arguments is constant and the other is not. [#3751](https://github.com/yandex/ClickHouse/pull/3751)
* Fixed an error that caused messages `netlink: '...': attribute type 1 has an invalid length` to be printed in the Linux kernel log, which happened only on sufficiently fresh versions of the Linux kernel. [#3749](https://github.com/yandex/ClickHouse/pull/3749)
* Fixed a segfault in the `empty` function for arguments of `FixedString` type. [Daniel, Dao Quang Minh](https://github.com/yandex/ClickHouse/pull/3703)
* Fixed excessive memory allocation when using a large value of the `max_query_size` setting (a memory chunk of size `max_query_size` bytes was preallocated at once). [#3720](https://github.com/yandex/ClickHouse/pull/3720)

### ClickHouse build improvements:

@@ -25,17 +25,10 @@ endif ()
 # Write compile_commands.json
 set(CMAKE_EXPORT_COMPILE_COMMANDS 1)
-set(PARALLEL_COMPILE_JOBS "" CACHE STRING "Define the maximum number of concurrent compilation jobs")
-if (PARALLEL_COMPILE_JOBS)
-    set_property(GLOBAL APPEND PROPERTY JOB_POOLS compile_job_pool="${PARALLEL_COMPILE_JOBS}")
-    set(CMAKE_JOB_POOL_COMPILE compile_job_pool)
-endif ()
-set(PARALLEL_LINK_JOBS "" CACHE STRING "Define the maximum number of concurrent link jobs")
-if (LLVM_PARALLEL_LINK_JOBS)
-    set_property(GLOBAL APPEND PROPERTY JOB_POOLS link_job_pool=${PARALLEL_LINK_JOBS})
-    set(CMAKE_JOB_POOL_LINK link_job_pool)
-endif ()
+set (MAX_COMPILER_MEMORY 2000 CACHE INTERNAL "")
+set (MAX_LINKER_MEMORY 3500 CACHE INTERNAL "")
+include (cmake/limit_jobs.cmake)
 include (cmake/find_ccache.cmake)

@@ -162,51 +155,8 @@ set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${COMPILER_FLAGS} -fn
 set (CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} -O3 ${CMAKE_C_FLAGS_ADD}")
 set (CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -O0 -g3 -ggdb3 -fno-inline ${CMAKE_C_FLAGS_ADD}")
-set(THREADS_PREFER_PTHREAD_FLAG ON)
-find_package (Threads)
-include (cmake/test_compiler.cmake)
-if (OS_LINUX AND COMPILER_CLANG)
-    set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}")
-    option (USE_LIBCXX "Use libc++ and libc++abi instead of libstdc++ (only make sense on Linux with Clang)" ${HAVE_LIBCXX})
-    set (LIBCXX_PATH "" CACHE STRING "Use custom path for libc++. It should be used for MSan.")
-    if (USE_LIBCXX)
-        set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++") # Ok for clang6, for older can cause 'not used option' warning
-        set (CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -D_LIBCPP_DEBUG=0") # More checks in debug build.
-        if (MAKE_STATIC_LIBRARIES)
-            execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
-            link_libraries (-nodefaultlibs -Wl,-Bstatic -stdlib=libc++ c++ c++abi gcc_eh ${BUILTINS_LIB_PATH} rt -Wl,-Bdynamic dl pthread m c)
-        else ()
-            link_libraries (-stdlib=libc++ c++ c++abi)
-        endif ()
-        if (LIBCXX_PATH)
-            # include_directories (SYSTEM BEFORE "${LIBCXX_PATH}/include" "${LIBCXX_PATH}/include/c++/v1")
-            link_directories ("${LIBCXX_PATH}/lib")
-        endif ()
-    endif ()
-endif ()
-if (USE_LIBCXX)
-    set (STATIC_STDLIB_FLAGS "")
-else ()
-    set (STATIC_STDLIB_FLAGS "-static-libgcc -static-libstdc++")
-endif ()
-if (MAKE_STATIC_LIBRARIES AND NOT APPLE AND NOT (COMPILER_CLANG AND OS_FREEBSD))
-    set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
-    # Along with executables, we also build example of shared library for "library dictionary source"; and it also should be self-contained.
-    set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
-endif ()
-if (USE_STATIC_LIBRARIES AND HAVE_NO_PIE)
-    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG_NO_PIE}")
-    set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${FLAG_NO_PIE}")
-endif ()
+include (cmake/use_libcxx.cmake)

 if (NOT MAKE_STATIC_LIBRARIES)
     set(CMAKE_POSITION_INDEPENDENT_CODE ON)

@@ -268,6 +218,7 @@ include (cmake/find_odbc.cmake)
 # openssl, zlib, odbc before poco
 include (cmake/find_poco.cmake)
 include (cmake/find_lz4.cmake)
+include (cmake/find_xxhash.cmake)
 include (cmake/find_sparsehash.cmake)
 include (cmake/find_rt.cmake)
 include (cmake/find_execinfo.cmake)

@@ -1,4 +1,11 @@
+if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/base64/lib/lib.c")
+    set (MISSING_INTERNAL_BASE64_LIBRARY 1)
+    message (WARNING "submodule contrib/base64 is missing. to fix try run: \n git submodule update --init --recursive")
+endif ()
+
+if (NOT MISSING_INTERNAL_BASE64_LIBRARY)
     option (ENABLE_BASE64 "Enable base64" ON)
+endif ()

 if (ENABLE_BASE64)
     if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/base64")

@@ -9,4 +16,3 @@ if (ENABLE_BASE64)
         set (USE_BASE64 1)
     endif()
 endif ()

@@ -16,7 +16,7 @@ endif ()
 if (HDFS3_LIBRARY AND HDFS3_INCLUDE_DIR)
     set(USE_HDFS 1)
-elseif (LIBGSASL_LIBRARY)
+elseif (LIBGSASL_LIBRARY AND LIBXML2_LIBRARY)
     set(HDFS3_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include")
     set(HDFS3_LIBRARY hdfs3)
     set(USE_HDFS 1)

@@ -2,10 +2,13 @@ if (NOT APPLE AND NOT ARCH_32)
     option (USE_INTERNAL_LIBGSASL_LIBRARY "Set to FALSE to use system libgsasl library instead of bundled" ${NOT_UNBUNDLED})
 endif ()

-if (USE_INTERNAL_LIBGSASL_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src/gsasl.h")
+if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src/gsasl.h")
+    if (USE_INTERNAL_LIBGSASL_LIBRARY)
        message (WARNING "submodule contrib/libgsasl is missing. to fix try run: \n git submodule update --init --recursive")
        set (USE_INTERNAL_LIBGSASL_LIBRARY 0)
+    endif ()
+    set (MISSING_INTERNAL_LIBGSASL_LIBRARY 1)
 endif ()

 if (NOT USE_INTERNAL_LIBGSASL_LIBRARY)
     find_library (LIBGSASL_LIBRARY gsasl)

@@ -13,7 +16,7 @@
 endif ()

 if (LIBGSASL_LIBRARY AND LIBGSASL_INCLUDE_DIR)
-elseif (NOT APPLE AND NOT ARCH_32)
+elseif (NOT MISSING_INTERNAL_LIBGSASL_LIBRARY AND NOT APPLE AND NOT ARCH_32)
     set (LIBGSASL_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src ${ClickHouse_SOURCE_DIR}/contrib/libgsasl/linux_x86_64/include)
     set (USE_INTERNAL_LIBGSASL_LIBRARY 1)
     set (LIBGSASL_LIBRARY libgsasl)

@@ -1,9 +1,12 @@
 option (USE_INTERNAL_LIBXML2_LIBRARY "Set to FALSE to use system libxml2 library instead of bundled" ${NOT_UNBUNDLED})

-if (USE_INTERNAL_LIBXML2_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libxml2/libxml.h")
+if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libxml2/libxml.h")
+    if (USE_INTERNAL_LIBXML2_LIBRARY)
        message (WARNING "submodule contrib/libxml2 is missing. to fix try run: \n git submodule update --init --recursive")
        set (USE_INTERNAL_LIBXML2_LIBRARY 0)
+    endif ()
+    set (MISSING_INTERNAL_LIBXML2_LIBRARY 1)
 endif ()

 if (NOT USE_INTERNAL_LIBXML2_LIBRARY)
     find_library (LIBXML2_LIBRARY libxml2)

@@ -11,7 +14,7 @@
 endif ()

 if (LIBXML2_LIBRARY AND LIBXML2_INCLUDE_DIR)
-else ()
+elseif (NOT MISSING_INTERNAL_LIBXML2_LIBRARY)
     set (LIBXML2_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libxml2/include ${ClickHouse_SOURCE_DIR}/contrib/libxml2-cmake/linux_x86_64/include)
     set (USE_INTERNAL_LIBXML2_LIBRARY 1)
     set (LIBXML2_LIBRARY libxml2)

cmake/find_xxhash.cmake (new file):

@@ -0,0 +1,10 @@
if (LZ4_INCLUDE_DIR)
    if (NOT EXISTS "${LZ4_INCLUDE_DIR}/xxhash.h")
        message (WARNING "LZ4 library does not have XXHash. Support for XXHash will be disabled.")
        set (USE_XXHASH 0)
    else ()
        set (USE_XXHASH 1)
    endif ()
endif ()

message (STATUS "Using xxhash=${USE_XXHASH}")

cmake/limit_jobs.cmake (new file):

@@ -0,0 +1,35 @@
# Usage:
#   set (MAX_COMPILER_MEMORY 2000 CACHE INTERNAL "")  # In megabytes
#   set (MAX_LINKER_MEMORY 3500 CACHE INTERNAL "")
#   include (cmake/limit_jobs.cmake)

cmake_host_system_information(RESULT AVAILABLE_PHYSICAL_MEMORY QUERY AVAILABLE_PHYSICAL_MEMORY) # Not available under freebsd

option(PARALLEL_COMPILE_JOBS "Define the maximum number of concurrent compilation jobs" "")
if (NOT PARALLEL_COMPILE_JOBS AND AVAILABLE_PHYSICAL_MEMORY)
    math(EXPR PARALLEL_COMPILE_JOBS ${AVAILABLE_PHYSICAL_MEMORY}/2500) # ~2.5gb max per one compiler
    if (NOT PARALLEL_COMPILE_JOBS)
        set (PARALLEL_COMPILE_JOBS 1)
    endif ()
endif ()
if (PARALLEL_COMPILE_JOBS)
    set_property(GLOBAL APPEND PROPERTY JOB_POOLS compile_job_pool=${PARALLEL_COMPILE_JOBS})
    set(CMAKE_JOB_POOL_COMPILE compile_job_pool)
endif ()

option(PARALLEL_LINK_JOBS "Define the maximum number of concurrent link jobs" "")
if (NOT PARALLEL_LINK_JOBS AND AVAILABLE_PHYSICAL_MEMORY)
    math(EXPR PARALLEL_LINK_JOBS ${AVAILABLE_PHYSICAL_MEMORY}/4000) # ~4gb max per one linker
    if (NOT PARALLEL_LINK_JOBS)
        set (PARALLEL_LINK_JOBS 1)
    endif ()
endif ()
if (PARALLEL_COMPILE_JOBS OR PARALLEL_LINK_JOBS)
    message(STATUS "Have ${AVAILABLE_PHYSICAL_MEMORY} megabytes of memory. Limiting concurrent linkers jobs to ${PARALLEL_LINK_JOBS} and compiler jobs to ${PARALLEL_COMPILE_JOBS}")
endif ()
if (PARALLEL_LINK_JOBS) # Fixed: the original condition tested LLVM_PARALLEL_LINK_JOBS, so the link pool was never applied.
    set_property(GLOBAL APPEND PROPERTY JOB_POOLS link_job_pool=${PARALLEL_LINK_JOBS})
    set(CMAKE_JOB_POOL_LINK link_job_pool)
endif ()

cmake/use_libcxx.cmake (new file):

@@ -0,0 +1,49 @@
# Uses MAKE_STATIC_LIBRARIES

set(THREADS_PREFER_PTHREAD_FLAG ON)
find_package (Threads)

include (cmake/test_compiler.cmake)
include (cmake/arch.cmake)

if (OS_LINUX AND COMPILER_CLANG)
    set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}")

    option (USE_LIBCXX "Use libc++ and libc++abi instead of libstdc++ (only make sense on Linux with Clang)" ${HAVE_LIBCXX})
    set (LIBCXX_PATH "" CACHE STRING "Use custom path for libc++. It should be used for MSan.")

    if (USE_LIBCXX)
        set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++") # Ok for clang6, for older can cause 'not used option' warning
        set (CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -D_LIBCPP_DEBUG=0") # More checks in debug build.
        if (MAKE_STATIC_LIBRARIES)
            execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
            link_libraries (-nodefaultlibs -Wl,-Bstatic -stdlib=libc++ c++ c++abi gcc_eh ${BUILTINS_LIB_PATH} rt -Wl,-Bdynamic dl pthread m c)
        else ()
            link_libraries (-stdlib=libc++ c++ c++abi)
        endif ()

        if (LIBCXX_PATH)
            # include_directories (SYSTEM BEFORE "${LIBCXX_PATH}/include" "${LIBCXX_PATH}/include/c++/v1")
            link_directories ("${LIBCXX_PATH}/lib")
        endif ()
    endif ()
endif ()

if (USE_LIBCXX)
    set (STATIC_STDLIB_FLAGS "")
else ()
    set (STATIC_STDLIB_FLAGS "-static-libgcc -static-libstdc++")
endif ()

if (MAKE_STATIC_LIBRARIES AND NOT APPLE AND NOT (COMPILER_CLANG AND OS_FREEBSD))
    set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
    # Along with executables, we also build example of shared library for "library dictionary source"; and it also should be self-contained.
    set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${STATIC_STDLIB_FLAGS}")
endif ()

if (USE_STATIC_LIBRARIES AND HAVE_NO_PIE)
    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG_NO_PIE}")
    set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${FLAG_NO_PIE}")
endif ()

@@ -2,7 +2,7 @@
 if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-stringop-overflow")
-    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-implicit-fallthrough -Wno-class-memaccess -std=c++1z")
+    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-implicit-fallthrough -Wno-class-memaccess -Wno-sign-compare -std=c++1z")
 elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-format -Wno-parentheses-equality")
     set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-format -std=c++1z")

@@ -27,12 +27,12 @@ elseif (EXISTS ${INTERNAL_COMPILER_BIN_ROOT}${INTERNAL_COMPILER_EXECUTABLE})
 endif ()

 if (COPY_HEADERS_COMPILER AND OS_LINUX)
-    add_custom_target (copy-headers ALL env CLANG=${COPY_HEADERS_COMPILER} BUILD_PATH=${ClickHouse_BINARY_DIR} DESTDIR=${ClickHouse_SOURCE_DIR} ${ClickHouse_SOURCE_DIR}/copy_headers.sh ${ClickHouse_SOURCE_DIR} ${TMP_HEADERS_DIR} DEPENDS ${COPY_HEADERS_DEPENDS} WORKING_DIRECTORY ${ClickHouse_SOURCE_DIR} SOURCES ${ClickHouse_SOURCE_DIR}/copy_headers.sh)
+    add_custom_target (copy-headers env CLANG=${COPY_HEADERS_COMPILER} BUILD_PATH=${ClickHouse_BINARY_DIR} DESTDIR=${ClickHouse_SOURCE_DIR} ${ClickHouse_SOURCE_DIR}/copy_headers.sh ${ClickHouse_SOURCE_DIR} ${TMP_HEADERS_DIR} DEPENDS ${COPY_HEADERS_DEPENDS} WORKING_DIRECTORY ${ClickHouse_SOURCE_DIR} SOURCES ${ClickHouse_SOURCE_DIR}/copy_headers.sh)

     if (USE_INTERNAL_LLVM_LIBRARY)
         set (CLANG_HEADERS_DIR "${ClickHouse_SOURCE_DIR}/contrib/llvm/clang/lib/Headers")
         set (CLANG_HEADERS_DEST "${TMP_HEADERS_DIR}/usr/local/lib/clang/${LLVM_VERSION}/include") # original: ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include
-        add_custom_target (copy-headers-clang ALL ${CMAKE_COMMAND} -E make_directory ${CLANG_HEADERS_DEST} && ${CMAKE_COMMAND} -E copy_if_different ${CLANG_HEADERS_DIR}/* ${CLANG_HEADERS_DEST} )
+        add_custom_target (copy-headers-clang ${CMAKE_COMMAND} -E make_directory ${CLANG_HEADERS_DEST} && ${CMAKE_COMMAND} -E copy_if_different ${CLANG_HEADERS_DIR}/* ${CLANG_HEADERS_DEST} )
         add_dependencies (copy-headers copy-headers-clang)
     endif ()
 endif ()

@@ -61,7 +61,7 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
         ("block-size,b", boost::program_options::value<unsigned>()->default_value(DBMS_DEFAULT_BUFFER_SIZE), "compress in blocks of specified size")
         ("hc", "use LZ4HC instead of LZ4")
         ("zstd", "use ZSTD instead of LZ4")
-        ("level", "compression level")
+        ("level", boost::program_options::value<int>(), "compression level")
         ("none", "use no compression instead of LZ4")
         ("stat", "print block statistics of compressed data")
     ;

@@ -94,7 +94,9 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
         else if (use_none)
             method = DB::CompressionMethod::NONE;

-        DB::CompressionSettings settings(method, options.count("level") > 0 ? options["level"].as<int>() : DB::CompressionSettings::getDefaultLevel(method));
+        DB::CompressionSettings settings(method, options.count("level")
+            ? options["level"].as<int>()
+            : DB::CompressionSettings::getDefaultLevel(method));

         DB::ReadBufferFromFileDescriptor rb(STDIN_FILENO);
         DB::WriteBufferFromFileDescriptor wb(STDOUT_FILENO);

@@ -2,7 +2,11 @@
 #include <memory>
 #include <sys/resource.h>
+#include <sys/stat.h>
+#include <sys/types.h>
 #include <errno.h>
+#include <pwd.h>
+#include <unistd.h>
 #include <Poco/Version.h>
 #include <Poco/DirectoryIterator.h>
 #include <Poco/Net/HTTPServer.h>

@@ -70,6 +74,8 @@ namespace ErrorCodes
     extern const int EXCESSIVE_ELEMENT_IN_CONFIG;
     extern const int INVALID_CONFIG_PARAMETER;
     extern const int SYSTEM_ERROR;
+    extern const int FAILED_TO_GETPWUID;
+    extern const int MISMATCHING_USERS_FOR_PROCESS_AND_DATA;
 }

@@ -83,6 +89,26 @@ static std::string getCanonicalPath(std::string && path)
     return std::move(path);
 }

+static std::string getUserName(uid_t user_id)
+{
+    /// Try to convert user id into user name.
+    auto buffer_size = sysconf(_SC_GETPW_R_SIZE_MAX);
+    if (buffer_size <= 0)
+        buffer_size = 1024;
+    std::string buffer;
+    buffer.reserve(buffer_size);
+
+    struct passwd passwd_entry;
+    struct passwd * result = nullptr;
+    const auto error = getpwuid_r(user_id, &passwd_entry, buffer.data(), buffer_size, &result);
+
+    if (error)
+        throwFromErrno("Failed to find user name for " + toString(user_id), ErrorCodes::FAILED_TO_GETPWUID, error);
+    else if (result)
+        return result->pw_name;
+    return toString(user_id);
+}
+
 void Server::uninitialize()
 {
     logger().information("shutting down");

@@ -166,6 +192,26 @@ int Server::main(const std::vector<std::string> & /*args*/)
     std::string path = getCanonicalPath(config().getString("path", DBMS_DEFAULT_PATH));
     std::string default_database = config().getString("default_database", "default");

+    /// Check that the process' user id matches the owner of the data.
+    const auto effective_user_id = geteuid();
+    struct stat statbuf;
+    if (stat(path.c_str(), &statbuf) == 0 && effective_user_id != statbuf.st_uid)
+    {
+        const auto effective_user = getUserName(effective_user_id);
+        const auto data_owner = getUserName(statbuf.st_uid);
+        std::string message = "Effective user of the process (" + effective_user +
+            ") does not match the owner of the data (" + data_owner + ").";
+        if (effective_user_id == 0)
+        {
+            message += " Run under 'sudo -u " + data_owner + "'.";
+            throw Exception(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
+        }
+        else
+        {
+            LOG_WARNING(log, message);
+        }
+    }
+
     global_context->setPath(path);

     /// Create directories for 'path' and for default database, if not exist.

@@ -370,19 +370,7 @@ void TCPHandler::processInsertQuery(const Settings & global_settings)
     }

     /// Send block to the client - table structure.
-    Block block = state.io.out->getHeader();
-
-    /// Support insert from old clients without low cardinality type.
-    if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
-    {
-        for (auto & col : block)
-        {
-            col.type = recursiveRemoveLowCardinality(col.type);
-            col.column = recursiveRemoveLowCardinality(col.column);
-        }
-    }
-
-    sendData(block);
+    sendData(state.io.out->getHeader());

     readData(global_settings);
     state.io.out->writeSuffix();

@@ -399,16 +387,6 @@ void TCPHandler::processOrdinaryQuery()
     {
         Block header = state.io.in->getHeader();

-        /// Send data to old clients without low cardinality type.
-        if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
-        {
-            for (auto & column : header)
-            {
-                column.column = recursiveRemoveLowCardinality(column.column);
-                column.type = recursiveRemoveLowCardinality(column.type);
-            }
-        }
-
         if (header)
             sendData(header);
     }

@@ -782,7 +760,8 @@ void TCPHandler::initBlockInput()
         state.block_in = std::make_shared<NativeBlockInputStream>(
             *state.maybe_compressed_in,
             header,
-            client_revision);
+            client_revision,
+            !connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
     }
 }

@@ -803,7 +782,8 @@ void TCPHandler::initBlockOutput(const Block & block)
         state.block_out = std::make_shared<NativeBlockOutputStream>(
             *state.maybe_compressed_out,
             client_revision,
-            block.cloneEmpty());
+            block.cloneEmpty(),
+            !connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
     }
 }

@@ -815,7 +795,8 @@ void TCPHandler::initLogsBlockOutput(const Block & block)
         state.logs_block_out = std::make_shared<NativeBlockOutputStream>(
             *out,
             client_revision,
-            block.cloneEmpty());
+            block.cloneEmpty(),
+            !connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
     }
 }

@@ -25,6 +25,7 @@ namespace Poco { class Logger; }
 namespace DB
 {

+struct ColumnsDescription;

 /// State of query processing.
 struct QueryState

@@ -187,6 +187,20 @@
             </replica>
         </shard>
     </test_shard_localhost_secure>
+    <test_unavailable_shard>
+        <shard>
+            <replica>
+                <host>localhost</host>
+                <port>9000</port>
+            </replica>
+        </shard>
+        <shard>
+            <replica>
+                <host>localhost</host>
+                <port>1</port>
+            </replica>
+        </shard>
+    </test_unavailable_shard>
 </remote_servers>

@@ -0,0 +1,36 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/AggregateFunctionBoundingRatio.h>
#include <AggregateFunctions/FactoryHelpers.h>

namespace DB
{

namespace ErrorCodes
{
    extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}

namespace
{

AggregateFunctionPtr createAggregateFunctionRate(const std::string & name, const DataTypes & argument_types, const Array & parameters)
{
    assertNoParameters(name, parameters);
    assertBinary(name, argument_types);

    if (argument_types.size() < 2)
        throw Exception("Aggregate function " + name + " requires at least two arguments",
            ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);

    return std::make_shared<AggregateFunctionBoundingRatio>(argument_types);
}

}

void registerAggregateFunctionRate(AggregateFunctionFactory & factory)
{
    factory.registerFunction("boundingRatio", createAggregateFunctionRate, AggregateFunctionFactory::CaseInsensitive);
}

}

@@ -0,0 +1,162 @@
#pragma once

#include <DataTypes/DataTypesNumber.h>
#include <Columns/ColumnsNumber.h>
#include <Common/FieldVisitors.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <AggregateFunctions/Helpers.h>
#include <AggregateFunctions/IAggregateFunction.h>

namespace DB
{

namespace ErrorCodes
{
    extern const int BAD_ARGUMENTS;
}

/** Tracks the leftmost and rightmost (x, y) data points.
  */
struct AggregateFunctionBoundingRatioData
{
    struct Point
    {
        Float64 x;
        Float64 y;
    };

    bool empty = true;
    Point left;
    Point right;

    void add(Float64 x, Float64 y)
    {
        Point point{x, y};

        if (empty)
        {
            left = point;
            right = point;
            empty = false;
        }
        else if (point.x < left.x)
        {
            left = point;
        }
        else if (point.x > right.x)
        {
            right = point;
        }
    }

    void merge(const AggregateFunctionBoundingRatioData & other)
    {
        if (empty)
        {
            *this = other;
        }
        else
        {
            if (other.left.x < left.x)
                left = other.left;
            if (other.right.x > right.x)
                right = other.right;
        }
    }

    void serialize(WriteBuffer & buf) const
    {
        writeBinary(empty, buf);

        if (!empty)
        {
            writePODBinary(left, buf);
            writePODBinary(right, buf);
        }
    }

    void deserialize(ReadBuffer & buf)
    {
        readBinary(empty, buf);

        if (!empty)
        {
            readPODBinary(left, buf);
            readPODBinary(right, buf);
        }
    }
};

class AggregateFunctionBoundingRatio final : public IAggregateFunctionDataHelper<AggregateFunctionBoundingRatioData, AggregateFunctionBoundingRatio>
{
private:
    /** Calculates the slope of a line between leftmost and rightmost data points.
      * (y2 - y1) / (x2 - x1)
      */
    Float64 getBoundingRatio(const AggregateFunctionBoundingRatioData & data) const
    {
        if (data.empty)
            return std::numeric_limits<Float64>::quiet_NaN();

        return (data.right.y - data.left.y) / (data.right.x - data.left.x);
    }

public:
    String getName() const override
    {
        return "boundingRatio";
    }

    AggregateFunctionBoundingRatio(const DataTypes & arguments)
    {
        const auto x_arg = arguments.at(0).get();
        const auto y_arg = arguments.at(1).get();    /// Fixed: the original read arguments.at(0) twice and never validated the second argument.

        if (!x_arg->isValueRepresentedByNumber() || !y_arg->isValueRepresentedByNumber())
            throw Exception("Illegal types of arguments of aggregate function " + getName() + ", must have number representation.",
                ErrorCodes::BAD_ARGUMENTS);
    }

    DataTypePtr getReturnType() const override
    {
        return std::make_shared<DataTypeFloat64>();
    }

    void add(AggregateDataPtr place, const IColumn ** columns, const size_t row_num, Arena *) const override
    {
        /// TODO Inefficient.
        const auto x = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[0])[row_num]);
        const auto y = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[1])[row_num]);
        data(place).add(x, y);
    }

    void merge(AggregateDataPtr place, ConstAggregateDataPtr rhs, Arena *) const override
    {
        data(place).merge(data(rhs));
    }

    void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override
    {
        data(place).serialize(buf);
    }

    void deserialize(AggregateDataPtr place, ReadBuffer & buf, Arena *) const override
    {
        data(place).deserialize(buf);
    }

    void insertResultInto(ConstAggregateDataPtr place, IColumn & to) const override
    {
        static_cast<ColumnFloat64 &>(to).getData().push_back(getBoundingRatio(data(place)));
    }

    const char * getHeaderFilePath() const override
    {
        return __FILE__;
    }
};

}
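The class above is registered as the `boundingRatio` aggregate function (case-insensitively, per the factory source earlier). A minimal usage sketch, assuming a set of timestamped counter readings (the values below are invented), computes the average rate of change between the earliest and latest points:

```sql
-- boundingRatio(x, y) returns (y_last - y_first) / (x_last - x_first),
-- where 'first' and 'last' are the points with minimal and maximal x.
SELECT boundingRatio(toUInt32(ts), requests) AS avg_requests_per_second
FROM
(
    SELECT toDateTime('2018-12-18 00:00:00') AS ts, 0 AS requests
    UNION ALL
    SELECT toDateTime('2018-12-18 00:01:00'), 1200
);
-- Expected result: (1200 - 0) / 60 = 20.
```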

@@ -17,6 +17,7 @@ namespace ErrorCodes
     extern const int PARAMETER_OUT_OF_BOUND;
 }

 namespace
 {

@@ -44,6 +45,8 @@ AggregateFunctionPtr createAggregateFunctionHistogram(const std::string & name,
         throw Exception("Illegal type " + arguments[0]->getName() + " of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

     return res;
+
+    return nullptr;
 }

@@ -15,6 +15,7 @@ void registerAggregateFunctionGroupArrayInsertAt(AggregateFunctionFactory &);
 void registerAggregateFunctionsQuantile(AggregateFunctionFactory &);
 void registerAggregateFunctionsSequenceMatch(AggregateFunctionFactory &);
 void registerAggregateFunctionWindowFunnel(AggregateFunctionFactory &);
+void registerAggregateFunctionRate(AggregateFunctionFactory &);
 void registerAggregateFunctionsMinMaxAny(AggregateFunctionFactory &);
 void registerAggregateFunctionsStatisticsStable(AggregateFunctionFactory &);
 void registerAggregateFunctionsStatisticsSimple(AggregateFunctionFactory &);

@@ -50,6 +51,7 @@ void registerAggregateFunctions()
         registerAggregateFunctionsQuantile(factory);
         registerAggregateFunctionsSequenceMatch(factory);
         registerAggregateFunctionWindowFunnel(factory);
+        registerAggregateFunctionRate(factory);
         registerAggregateFunctionsMinMaxAny(factory);
         registerAggregateFunctionsStatisticsStable(factory);
         registerAggregateFunctionsStatisticsSimple(factory);

@@ -322,17 +322,11 @@ bool ColumnArray::hasEqualOffsets(const ColumnArray & other) const
 ColumnPtr ColumnArray::convertToFullColumnIfConst() const
 {
-    ColumnPtr new_data;
-
-    if (ColumnPtr full_column = getData().convertToFullColumnIfConst())
-        new_data = full_column;
-    else
-        new_data = data;
-
-    return ColumnArray::create(new_data, offsets);
+    /// It is possible to have an array with constant data and non-constant offsets.
+    /// Example is the result of expression: replicate('hello', [1])
+    return ColumnArray::create(data->convertToFullColumnIfConst(), offsets);
 }

 void ColumnArray::getExtremes(Field & min, Field & max) const
 {
     min = Array();

@@ -22,8 +22,7 @@ ColumnNullable::ColumnNullable(MutableColumnPtr && nested_column_, MutableColumn
     : nested_column(std::move(nested_column_)), null_map(std::move(null_map_))
 {
     /// ColumnNullable cannot have constant nested column. But constant argument could be passed. Materialize it.
-    if (ColumnPtr nested_column_materialized = getNestedColumn().convertToFullColumnIfConst())
-        nested_column = nested_column_materialized;
+    nested_column = getNestedColumn().convertToFullColumnIfConst();

     if (!getNestedColumn().canBeInsideNullable())
         throw Exception{getNestedColumn().getName() + " cannot be inside Nullable column", ErrorCodes::ILLEGAL_COLUMN};

@@ -45,7 +45,7 @@ public:
     /** If column isn't constant, returns nullptr (or itself).
       * If column is constant, transforms constant to full column (if column type allows such tranform) and return it.
       */
-    virtual Ptr convertToFullColumnIfConst() const { return {}; }
+    virtual Ptr convertToFullColumnIfConst() const { return getPtr(); }

     /// If column isn't ColumnLowCardinality, return itself.
     /// If column is ColumnLowCardinality, transforms is to full column.

@@ -24,7 +24,7 @@ namespace DB
 {

 /// For cutting prerpocessed path to this base
-std::string main_config_path;
+static std::string main_config_path;

 /// Extracts from a string the first encountered number consisting of at least two digits.
 static std::string numberFromHost(const std::string & s)

@@ -402,6 +402,9 @@ namespace ErrorCodes
     extern const int SYSTEM_ERROR = 425;
     extern const int NULL_POINTER_DEREFERENCE = 426;
     extern const int CANNOT_COMPILE_REGEXP = 427;
+    extern const int UNKNOWN_LOG_LEVEL = 428;
+    extern const int FAILED_TO_GETPWUID = 429;
+    extern const int MISMATCHING_USERS_FOR_PROCESS_AND_DATA = 430;

     extern const int KEEPER_EXCEPTION = 999;
     extern const int POCO_EXCEPTION = 1000;

@@ -9,11 +9,13 @@
 #cmakedefine01 USE_RDKAFKA
 #cmakedefine01 USE_CAPNP
 #cmakedefine01 USE_EMBEDDED_COMPILER
-#cmakedefine01 LLVM_HAS_RTTI
 #cmakedefine01 USE_POCO_SQLODBC
 #cmakedefine01 USE_POCO_DATAODBC
 #cmakedefine01 USE_POCO_MONGODB
 #cmakedefine01 USE_POCO_NETSSL
-#cmakedefine01 CLICKHOUSE_SPLIT_BINARY
 #cmakedefine01 USE_BASE64
 #cmakedefine01 USE_HDFS
+#cmakedefine01 USE_XXHASH
+#cmakedefine01 CLICKHOUSE_SPLIT_BINARY
+#cmakedefine01 LLVM_HAS_RTTI

@@ -97,8 +97,8 @@ public:
     /// Approximate number of allocated bytes in memory - for profiling and limits.
     size_t allocatedBytes() const;

-    operator bool() const { return !data.empty(); }
-    bool operator!() const { return data.empty(); }
+    operator bool() const { return !!columns(); }
+    bool operator!() const { return !this->operator bool(); }

     /** Get a list of column names separated by commas. */
     std::string dumpNames() const;

@@ -27,7 +27,6 @@ bool callOnBasicType(TypeIndex number, F && f)
         case TypeIndex::UInt16: return f(TypePair<T, UInt16>());
         case TypeIndex::UInt32: return f(TypePair<T, UInt32>());
         case TypeIndex::UInt64: return f(TypePair<T, UInt64>());
-        //case TypeIndex::UInt128>: return f(TypePair<T, UInt128>());

         case TypeIndex::Int8: return f(TypePair<T, Int8>());
         case TypeIndex::Int16: return f(TypePair<T, Int16>());

@@ -35,6 +34,9 @@ bool callOnBasicType(TypeIndex number, F && f)
         case TypeIndex::Int64: return f(TypePair<T, Int64>());
         case TypeIndex::Int128: return f(TypePair<T, Int128>());

+        case TypeIndex::Enum8: return f(TypePair<T, Int8>());
+        case TypeIndex::Enum16: return f(TypePair<T, Int16>());
+
         default:
             break;
     }

@@ -89,13 +91,16 @@ inline bool callOnBasicTypes(TypeIndex type_num1, TypeIndex type_num2, F && f)
         case TypeIndex::UInt16: return callOnBasicType<UInt16, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
         case TypeIndex::UInt32: return callOnBasicType<UInt32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
         case TypeIndex::UInt64: return callOnBasicType<UInt64, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
-        //case TypeIndex::UInt128: return callOnBasicType<UInt128, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));

         case TypeIndex::Int8: return callOnBasicType<Int8, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
         case TypeIndex::Int16: return callOnBasicType<Int16, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
         case TypeIndex::Int32: return callOnBasicType<Int32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
         case TypeIndex::Int64: return callOnBasicType<Int64, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
         case TypeIndex::Int128: return callOnBasicType<Int128, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));

+        case TypeIndex::Enum8: return callOnBasicType<Int8, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
+        case TypeIndex::Enum16: return callOnBasicType<Int16, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
+
         default:
             break;
     }

@@ -29,8 +29,8 @@ NativeBlockInputStream::NativeBlockInputStream(ReadBuffer & istr_, UInt64 server
 {
 }

-NativeBlockInputStream::NativeBlockInputStream(ReadBuffer & istr_, const Block & header_, UInt64 server_revision_)
-    : istr(istr_), header(header_), server_revision(server_revision_)
+NativeBlockInputStream::NativeBlockInputStream(ReadBuffer & istr_, const Block & header_, UInt64 server_revision_, bool convert_types_to_low_cardinality_)
+    : istr(istr_), header(header_), server_revision(server_revision_), convert_types_to_low_cardinality(convert_types_to_low_cardinality_)
 {
 }

@@ -154,7 +154,8 @@ Block NativeBlockInputStream::readImpl()
         column.column = std::move(read_column);

         /// Support insert from old clients without low cardinality type.
-        if (header && server_revision && server_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
+        bool revision_without_low_cardinality = server_revision && server_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE;
+        if (header && (convert_types_to_low_cardinality || revision_without_low_cardinality))
         {
             column.column = recursiveLowCardinalityConversion(column.column, column.type, header.getByPosition(i).type);
             column.type = header.getByPosition(i).type;

@@ -65,7 +65,7 @@ public:
     /// For cases when data structure (header) is known in advance.
     /// NOTE We may use header for data validation and/or type conversions. It is not implemented.
-    NativeBlockInputStream(ReadBuffer & istr_, const Block & header_, UInt64 server_revision_);
+    NativeBlockInputStream(ReadBuffer & istr_, const Block & header_, UInt64 server_revision_, bool convert_types_to_low_cardinality_ = false);

     /// For cases when we have an index. It allows to skip columns. Only columns specified in the index will be read.
     NativeBlockInputStream(ReadBuffer & istr_, UInt64 server_revision_,

@@ -91,6 +91,8 @@ private:
     IndexForNativeFormat::Blocks::const_iterator index_block_end;
     IndexOfBlockForNativeFormat::Columns::const_iterator index_column_it;

+    bool convert_types_to_low_cardinality = false;
+
     /// If an index is specified, then `istr` must be CompressedReadBufferFromFile. Unused otherwise.
     CompressedReadBufferFromFile * istr_concrete = nullptr;

@@ -21,10 +21,10 @@ namespace ErrorCodes

 NativeBlockOutputStream::NativeBlockOutputStream(
-    WriteBuffer & ostr_, UInt64 client_revision_, const Block & header_,
+    WriteBuffer & ostr_, UInt64 client_revision_, const Block & header_, bool remove_low_cardinality_,
     WriteBuffer * index_ostr_, size_t initial_size_of_file_)
-    : ostr(ostr_), client_revision(client_revision_), header(header_),
-    index_ostr(index_ostr_), initial_size_of_file(initial_size_of_file_)
+    : ostr(ostr_), client_revision(client_revision_), header(header_),
+    index_ostr(index_ostr_), initial_size_of_file(initial_size_of_file_), remove_low_cardinality(remove_low_cardinality_)
 {
     if (index_ostr)
     {

@@ -46,12 +46,7 @@ void NativeBlockOutputStream::writeData(const IDataType & type, const ColumnPtr
     /** If there are columns-constants - then we materialize them.
       * (Since the data type does not know how to serialize / deserialize constants.)
       */
-    ColumnPtr full_column;
-
-    if (ColumnPtr converted = column->convertToFullColumnIfConst())
-        full_column = converted;
-    else
-        full_column = column;
+    ColumnPtr full_column = column->convertToFullColumnIfConst();

     IDataType::SerializeBinaryBulkSettings settings;
     settings.getter = [&ostr](IDataType::SubstreamPath) -> WriteBuffer * { return &ostr; };

@@ -104,7 +99,7 @@ void NativeBlockOutputStream::write(const Block & block)
         ColumnWithTypeAndName column = block.safeGetByPosition(i);

         /// Send data to old clients without low cardinality type.
-        if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
+        if (remove_low_cardinality || (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE))
         {
             column.column = recursiveRemoveLowCardinality(column.column);
             column.type = recursiveRemoveLowCardinality(column.type);

@@ -23,7 +23,7 @@ public:
     /** If non-zero client_revision is specified, additional block information can be written.
       */
     NativeBlockOutputStream(
-        WriteBuffer & ostr_, UInt64 client_revision_, const Block & header_,
+        WriteBuffer & ostr_, UInt64 client_revision_, const Block & header_, bool remove_low_cardinality_ = false,
         WriteBuffer * index_ostr_ = nullptr, size_t initial_size_of_file_ = 0);

     Block getHeader() const override { return header; }

@@ -42,6 +42,8 @@ private:
     size_t initial_size_of_file;  /// The initial size of the data file, if `append` done. Used for the index.
     /// If you need to write index, then `ostr` must be a CompressedWriteBuffer.
     CompressedWriteBuffer * ostr_concrete = nullptr;
+
+    bool remove_low_cardinality;
 };

 }

@@ -127,10 +127,7 @@ Block TotalsHavingBlockInputStream::readImpl()
             expression->execute(finalized);

             size_t filter_column_pos = finalized.getPositionByName(filter_column_name);
-            ColumnPtr filter_column_ptr = finalized.safeGetByPosition(filter_column_pos).column;
-
-            if (ColumnPtr materialized = filter_column_ptr->convertToFullColumnIfConst())
-                filter_column_ptr = materialized;
+            ColumnPtr filter_column_ptr = finalized.safeGetByPosition(filter_column_pos).column->convertToFullColumnIfConst();

             FilterDescription filter_description(*filter_column_ptr);
@@ -14,9 +14,7 @@ Block materializeBlock(const Block & block)
     for (size_t i = 0; i < columns; ++i)
     {
         auto & element = res.getByPosition(i);
-        auto & src = element.column;
-        if (ColumnPtr converted = src->convertToFullColumnIfConst())
-            src = converted;
+        element.column = element.column->convertToFullColumnIfConst();
     }

     return res;
@@ -19,6 +19,7 @@ void registerDataTypeInterval(DataTypeFactory & factory)
     factory.registerSimpleDataType("IntervalDay", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Day)); });
     factory.registerSimpleDataType("IntervalWeek", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Week)); });
     factory.registerSimpleDataType("IntervalMonth", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Month)); });
+    factory.registerSimpleDataType("IntervalQuarter", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Quarter)); });
     factory.registerSimpleDataType("IntervalYear", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Year)); });
 }
@@ -25,6 +25,7 @@ public:
         Day,
         Week,
         Month,
+        Quarter,
         Year
     };
@@ -46,6 +47,7 @@ public:
             case Day: return "Day";
             case Week: return "Week";
             case Month: return "Month";
+            case Quarter: return "Quarter";
             case Year: return "Year";
             default: __builtin_unreachable();
         }
@@ -8,7 +8,6 @@
 #include <DataTypes/DataTypeNullable.h>
 #include <DataTypes/DataTypeNothing.h>
 #include <DataTypes/getLeastSupertype.h>
-#include <Interpreters/convertFieldToType.h>
 #include <Common/Exception.h>
 #include <ext/size.h>
@@ -26,7 +26,7 @@ using DataTypes = std::vector<DataTypePtr>;
 /** Properties of data type.
   * Contains methods for serialization/deserialization.
   * Implementations of this interface represent a data type (example: UInt8)
-  * or parapetric family of data types (example: Array(...)).
+  * or parametric family of data types (example: Array(...)).
   *
   * DataType is totally immutable object. You can always share them.
   */
@@ -1,8 +1,7 @@
 include(${ClickHouse_SOURCE_DIR}/cmake/dbms_glob_sources.cmake)

-add_headers_and_sources(clickhouse_functions .)
 add_headers_and_sources(clickhouse_functions ./GatherUtils)
-add_headers_and_sources(clickhouse_functions ./Conditional)
+add_headers_and_sources(clickhouse_functions .)

 list(REMOVE_ITEM clickhouse_functions_sources IFunction.cpp FunctionFactory.cpp FunctionHelpers.cpp)
@@ -21,7 +20,8 @@ target_link_libraries(clickhouse_functions
     ${METROHASH_LIBRARIES}
     murmurhash
     ${BASE64_LIBRARY}
-    ${OPENSSL_CRYPTO_LIBRARY})
+    ${OPENSSL_CRYPTO_LIBRARY}
+    ${LZ4_LIBRARY})

 target_include_directories (clickhouse_functions SYSTEM BEFORE PUBLIC ${DIVIDE_INCLUDE_DIR})
@@ -11,13 +11,14 @@
 #include <Columns/ColumnDecimal.h>
 #include <Columns/ColumnConst.h>
 #include <Columns/ColumnAggregateFunction.h>
-#include <Functions/IFunction.h>
-#include <Functions/FunctionHelpers.h>
+#include "IFunction.h"
+#include "FunctionHelpers.h"
+#include "intDiv.h"
+#include "castTypeToEither.h"
+#include "FunctionFactory.h"
 #include <DataTypes/NumberTraits.h>
 #include <Common/typeid_cast.h>
 #include <Common/Arena.h>
-#include <Functions/intDiv.h>
-#include <Functions/castTypeToEither.h>

 #include <Common/config.h>
 #if USE_EMBEDDED_COMPILER
@@ -113,6 +113,21 @@ struct AddMonthsImpl
     }
 };

+struct AddQuartersImpl
+{
+    static constexpr auto name = "addQuarters";
+
+    static inline UInt32 execute(UInt32 t, Int64 delta, const DateLUTImpl & time_zone)
+    {
+        return time_zone.addQuarters(t, delta);
+    }
+
+    static inline UInt16 execute(UInt16 d, Int64 delta, const DateLUTImpl & time_zone)
+    {
+        return time_zone.addQuarters(DayNum(d), delta);
+    }
+};
+
 struct AddYearsImpl
 {
     static constexpr auto name = "addYears";
@@ -149,6 +164,7 @@ struct SubtractHoursImpl : SubtractIntervalImpl<AddHoursImpl> { static constexpr
 struct SubtractDaysImpl : SubtractIntervalImpl<AddDaysImpl> { static constexpr auto name = "subtractDays"; };
 struct SubtractWeeksImpl : SubtractIntervalImpl<AddWeeksImpl> { static constexpr auto name = "subtractWeeks"; };
 struct SubtractMonthsImpl : SubtractIntervalImpl<AddMonthsImpl> { static constexpr auto name = "subtractMonths"; };
+struct SubtractQuartersImpl : SubtractIntervalImpl<AddQuartersImpl> { static constexpr auto name = "subtractQuarters"; };
 struct SubtractYearsImpl : SubtractIntervalImpl<AddYearsImpl> { static constexpr auto name = "subtractYears"; };
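Both new structs delegate to `DateLUTImpl::addQuarters`, which is not shown in this diff. Since a quarter is exactly three months, a plausible implementation (an assumption about the date-LUT code, not part of this patch) is a thin wrapper over the existing month arithmetic:

```cpp
// Hedged sketch of DateLUTImpl::addQuarters, assuming it reuses addMonths;
// the actual definition lives in the common DateLUTImpl library.
inline time_t addQuarters(time_t t, Int64 delta) const
{
    return addMonths(t, delta * 3);    // 1 quarter == 3 months
}

inline DayNum addQuarters(DayNum d, Int64 delta) const
{
    return addMonths(d, delta * 3);
}
```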
@@ -89,6 +89,7 @@ void registerFunctionsConversion(FunctionFactory & factory)
     factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalDay, PositiveMonotonicity>>();
     factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalWeek, PositiveMonotonicity>>();
     factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalMonth, PositiveMonotonicity>>();
+    factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalQuarter, PositiveMonotonicity>>();
     factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalYear, PositiveMonotonicity>>();
 }
@@ -738,6 +738,7 @@ DEFINE_NAME_TO_INTERVAL(Hour)
 DEFINE_NAME_TO_INTERVAL(Day)
 DEFINE_NAME_TO_INTERVAL(Week)
 DEFINE_NAME_TO_INTERVAL(Month)
+DEFINE_NAME_TO_INTERVAL(Quarter)
 DEFINE_NAME_TO_INTERVAL(Year)

 #undef DEFINE_NAME_TO_INTERVAL
@@ -1138,6 +1139,9 @@ struct ToIntMonotonicity
     static IFunction::Monotonicity get(const IDataType & type, const Field & left, const Field & right)
     {
+        if (!type.isValueRepresentedByNumber())
+            return {};
+
         size_t size_of_type = type.getSizeOfValueInMemory();

         /// If type is expanding
@@ -346,11 +346,8 @@ private:
         String attr_name = attr_name_col->getValue<String>();

         const ColumnWithTypeAndName & key_col_with_type = block.getByPosition(arguments[2]);
-        ColumnPtr key_col = key_col_with_type.column;
-
         /// Functions in external dictionaries only support full-value (not constant) columns with keys.
-        if (ColumnPtr key_col_materialized = key_col_with_type.column->convertToFullColumnIfConst())
-            key_col = key_col_materialized;
+        ColumnPtr key_col = key_col_with_type.column->convertToFullColumnIfConst();

         if (checkColumn<ColumnTuple>(key_col.get()))
         {
@@ -578,11 +575,8 @@ private:
         String attr_name = attr_name_col->getValue<String>();

         const ColumnWithTypeAndName & key_col_with_type = block.getByPosition(arguments[2]);
-        ColumnPtr key_col = key_col_with_type.column;
-
         /// Functions in external dictionaries only support full-value (not constant) columns with keys.
-        if (ColumnPtr key_col_materialized = key_col_with_type.column->convertToFullColumnIfConst())
-            key_col = key_col_materialized;
+        ColumnPtr key_col = key_col_with_type.column->convertToFullColumnIfConst();

         const auto & key_columns = typeid_cast<const ColumnTuple &>(*key_col).getColumns();
         const auto & key_types = static_cast<const DataTypeTuple &>(*key_col_with_type.type).getElements();
@@ -813,11 +807,9 @@ private:
         String attr_name = attr_name_col->getValue<String>();

         const ColumnWithTypeAndName & key_col_with_type = block.getByPosition(arguments[2]);
-        ColumnPtr key_col = key_col_with_type.column;
-
         /// Functions in external dictionaries only support full-value (not constant) columns with keys.
-        if (ColumnPtr key_col_materialized = key_col_with_type.column->convertToFullColumnIfConst())
-            key_col = key_col_materialized;
+        ColumnPtr key_col = key_col_with_type.column->convertToFullColumnIfConst();

         if (checkColumn<ColumnTuple>(key_col.get()))
         {
@@ -1079,11 +1071,9 @@ private:
         String attr_name = attr_name_col->getValue<String>();

         const ColumnWithTypeAndName & key_col_with_type = block.getByPosition(arguments[2]);
-        ColumnPtr key_col = key_col_with_type.column;
-
         /// Functions in external dictionaries only support full-value (not constant) columns with keys.
-        if (ColumnPtr key_col_materialized = key_col_with_type.column->convertToFullColumnIfConst())
-            key_col = key_col_materialized;
+        ColumnPtr key_col = key_col_with_type.column->convertToFullColumnIfConst();

         const auto & key_columns = typeid_cast<const ColumnTuple &>(*key_col).getColumns();
         const auto & key_types = static_cast<const DataTypeTuple &>(*key_col_with_type.type).getElements();
@@ -1691,7 +1681,7 @@ static const PaddedPODArray<T> & getColumnDataAsPaddedPODArray(const IColumn & c
         }
     }

-    const auto full_column = column.isColumnConst() ? column.convertToFullColumnIfConst() : column.getPtr();
+    const auto full_column = column.convertToFullColumnIfConst();

     // With type conversion and const columns we need to use backup storage here
     const auto size = full_column->size();
@@ -227,7 +227,6 @@ protected:
     bool executeOperationTyped(const IColumn * in_untyped, PaddedPODArray<OutputType> & dst, const IColumn * centroids_array_untyped)
     {
         const auto maybe_const = in_untyped->convertToFullColumnIfConst();
-        if (maybe_const)
-            in_untyped = maybe_const.get();
+        in_untyped = maybe_const.get();

         const auto in_vector = checkAndGetColumn<ColumnVector<InputType>>(in_untyped);
@@ -1,5 +1,6 @@
 #include <Functions/FunctionFactory.h>
 #include <Functions/FunctionsHashing.h>
+#include <Common/config.h>

 namespace DB
@@ -20,10 +21,17 @@ void registerFunctionsHashing(FunctionFactory & factory)
     factory.registerFunction<FunctionIntHash32>();
     factory.registerFunction<FunctionIntHash64>();
     factory.registerFunction<FunctionURLHash>();
+    factory.registerFunction<FunctionJavaHash>();
+    factory.registerFunction<FunctionHiveHash>();
     factory.registerFunction<FunctionMurmurHash2_32>();
     factory.registerFunction<FunctionMurmurHash2_64>();
     factory.registerFunction<FunctionMurmurHash3_32>();
     factory.registerFunction<FunctionMurmurHash3_64>();
     factory.registerFunction<FunctionMurmurHash3_128>();
+
+#if USE_XXHASH
+    factory.registerFunction<FunctionXxHash32>();
+    factory.registerFunction<FunctionXxHash64>();
+#endif
 }

 }
@@ -8,6 +8,11 @@
 #include <murmurhash2.h>
 #include <murmurhash3.h>

+#include <Common/config.h>
+#if USE_XXHASH
+    #include <xxhash.h>
+#endif
+
 #include <Poco/ByteOrder.h>
 #include <Common/SipHash.h>
@@ -41,6 +46,7 @@ namespace ErrorCodes
 {
     extern const int LOGICAL_ERROR;
     extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
+    extern const int NOT_IMPLEMENTED;
 }
@@ -116,6 +122,7 @@ struct HalfMD5Impl
     /// If true, it will use intHash32 or intHash64 to hash POD types. This behaviour is intended for better performance of some functions.
     /// Otherwise it will hash bytes in memory as a string using corresponding hash function.
+
     static constexpr bool use_int_hash_for_pods = false;
 };
@@ -298,6 +305,51 @@ struct MurmurHash3Impl64
     static constexpr bool use_int_hash_for_pods = false;
 };

+/// http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/478a4add975b/src/share/classes/java/lang/String.java#l1452
+/// Care should be taken to do all calculation in unsigned integers (to avoid undefined behaviour on overflow)
+/// but obtain the same result as it is done in signed integers with two's complement arithmetic.
+struct JavaHashImpl
+{
+    static constexpr auto name = "javaHash";
+    using ReturnType = Int32;
+
+    static Int32 apply(const char * data, const size_t size)
+    {
+        UInt32 h = 0;
+        for (size_t i = 0; i < size; ++i)
+            h = 31 * h + static_cast<UInt32>(static_cast<Int8>(data[i]));
+        return static_cast<Int32>(h);
+    }
+
+    static Int32 combineHashes(Int32, Int32)
+    {
+        throw Exception("Java hash is not combinable for multiple arguments", ErrorCodes::NOT_IMPLEMENTED);
+    }
+
+    static constexpr bool use_int_hash_for_pods = false;
+};
+
+/// This is just JavaHash with zeroed out sign bit.
+/// This function is used in Hive for versions before 3.0,
+/// after 3.0, Hive uses murmur-hash3.
+struct HiveHashImpl
+{
+    static constexpr auto name = "hiveHash";
+    using ReturnType = Int32;
+
+    static Int32 apply(const char * data, const size_t size)
+    {
+        return static_cast<Int32>(0x7FFFFFFF & static_cast<UInt32>(JavaHashImpl::apply(data, size)));
+    }
+
+    static Int32 combineHashes(Int32, Int32)
+    {
+        throw Exception("Hive hash is not combinable for multiple arguments", ErrorCodes::NOT_IMPLEMENTED);
+    }
+
+    static constexpr bool use_int_hash_for_pods = false;
+};
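Both structs above are pure byte-loop hashes, so they can be sanity-checked outside of ClickHouse. The following self-contained sketch reproduces the `h = 31 * h + byte` recurrence and the sign-bit masking, making the relationship between `javaHash` and `hiveHash` explicit:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Same recurrence as JavaHashImpl::apply: h = 31 * h + signed byte, computed
// in unsigned arithmetic to avoid signed-overflow UB, then reinterpreted.
static int32_t java_hash(const char * data, size_t size)
{
    uint32_t h = 0;
    for (size_t i = 0; i < size; ++i)
        h = 31 * h + static_cast<uint32_t>(static_cast<int8_t>(data[i]));
    return static_cast<int32_t>(h);
}

// hiveHash is javaHash with the sign bit cleared.
static int32_t hive_hash(const char * data, size_t size)
{
    return static_cast<int32_t>(0x7FFFFFFF & static_cast<uint32_t>(java_hash(data, size)));
}

int main()
{
    const char * s = "ClickHouse";
    printf("javaHash: %d\n", java_hash(s, strlen(s)));
    printf("hiveHash: %d\n", hive_hash(s, strlen(s)));
    return 0;
}
```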
 struct MurmurHash3Impl128
 {
     static constexpr auto name = "murmurHash3_128";
@@ -356,6 +408,49 @@ struct ImplMetroHash64
 };

+#if USE_XXHASH
+
+struct ImplXxHash32
+{
+    static constexpr auto name = "xxHash32";
+    using ReturnType = UInt32;
+
+    static auto apply(const char * s, const size_t len) { return XXH32(s, len, 0); }
+    /**
+      * With the current implementation, for more than one argument the result is
+      * not reproducible from outside of ClickHouse.
+      *
+      * The proper way of combining several inputs is to use the streaming mode of the hash function
+      * https://github.com/Cyan4973/xxHash/issues/114#issuecomment-334908566
+      *
+      * In the common case this is doable via init_state / update_state / finalize_state
+      */
+    static auto combineHashes(UInt32 h1, UInt32 h2) { return IntHash32Impl::apply(h1) ^ h2; }
+
+    static constexpr bool use_int_hash_for_pods = false;
+};
+
+struct ImplXxHash64
+{
+    static constexpr auto name = "xxHash64";
+    using ReturnType = UInt64;
+    using uint128_t = CityHash_v1_0_2::uint128;
+
+    static auto apply(const char * s, const size_t len) { return XXH64(s, len, 0); }
+
+    /*
+       With the current implementation, for more than one argument the result is
+       not reproducible from outside of ClickHouse (see the comment on ImplXxHash32).
+     */
+    static auto combineHashes(UInt64 h1, UInt64 h2) { return CityHash_v1_0_2::Hash128to64(uint128_t(h1, h2)); }
+
+    static constexpr bool use_int_hash_for_pods = false;
+};
+
+#endif
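The comment in `ImplXxHash32` points at the streaming interface as the reproducible way to hash several arguments. For reference, a hedged sketch of what that looks like with the stock xxHash C API (this is the alternative the comment recommends, not what the code above implements; feeding one state is equivalent to hashing the concatenated inputs):

```cpp
#include <xxhash.h>
#include <cstdint>
#include <cstddef>

// Feed all arguments into one XXH32 state so the result matches what an
// external xxHash user would get for the concatenation of the inputs.
uint32_t xxhash32_streaming(const char * a, size_t a_len,
                            const char * b, size_t b_len)
{
    XXH32_state_t * state = XXH32_createState();
    XXH32_reset(state, 0);           // seed 0, as in ImplXxHash32::apply
    XXH32_update(state, a, a_len);   // update_state per argument
    XXH32_update(state, b, b_len);
    uint32_t h = XXH32_digest(state);
    XXH32_freeState(state);
    return h;
}
```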
 template <typename Impl>
 class FunctionStringHashFixedString : public IFunction
 {
@@ -978,4 +1073,12 @@ using FunctionMurmurHash2_64 = FunctionAnyHash<MurmurHash2Impl64>;
 using FunctionMurmurHash3_32 = FunctionAnyHash<MurmurHash3Impl32>;
 using FunctionMurmurHash3_64 = FunctionAnyHash<MurmurHash3Impl64>;
 using FunctionMurmurHash3_128 = FunctionStringHashFixedString<MurmurHash3Impl128>;
+using FunctionJavaHash = FunctionAnyHash<JavaHashImpl>;
+using FunctionHiveHash = FunctionAnyHash<HiveHashImpl>;
+
+#if USE_XXHASH
+using FunctionXxHash32 = FunctionAnyHash<ImplXxHash32>;
+using FunctionXxHash64 = FunctionAnyHash<ImplXxHash64>;
+#endif

 }
@@ -10,6 +10,7 @@ void registerFunctionsRound(FunctionFactory & factory)
     factory.registerFunction<FunctionFloor>("floor", FunctionFactory::CaseInsensitive);
     factory.registerFunction<FunctionCeil>("ceil", FunctionFactory::CaseInsensitive);
     factory.registerFunction<FunctionTrunc>("trunc", FunctionFactory::CaseInsensitive);
+    factory.registerFunction<FunctionRoundDown>();

     /// Compatibility aliases.
     factory.registerAlias("ceiling", "ceil", FunctionFactory::CaseInsensitive);
@@ -1,14 +1,19 @@
 #pragma once

+#include <Columns/ColumnArray.h>
 #include <Functions/FunctionUnaryArithmetic.h>
 #include <Functions/FunctionHelpers.h>
 #include <IO/WriteHelpers.h>
+#include <DataTypes/getLeastSupertype.h>
+#include <DataTypes/DataTypeArray.h>
+#include <Interpreters/castColumn.h>

 #include <common/intExp.h>
 #include <cmath>
 #include <type_traits>
 #include <array>
 #include <ext/bit_cast.h>
+#include <algorithm>

 #if __SSE4_1__
 #include <smmintrin.h>
@@ -24,6 +29,7 @@ namespace ErrorCodes
     extern const int ILLEGAL_TYPE_OF_ARGUMENT;
     extern const int ILLEGAL_COLUMN;
     extern const int LOGICAL_ERROR;
+    extern const int BAD_ARGUMENTS;
 }
@@ -133,6 +139,9 @@ struct IntegerRoundingComputation
     static ALWAYS_INLINE void compute(const T * __restrict in, size_t scale, T * __restrict out)
     {
+        if (scale > size_t(std::numeric_limits<T>::max()))
+            *out = 0;
+        else
             *out = compute(*in, scale);
     }
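The new guard exists because `scale` arrives as a `size_t` while the column type `T` may be much narrower; once the requested scale exceeds `T`'s maximum, every representable value rounds to zero, and calling `compute(*in, scale)` with a silently truncated scale would be wrong. A small standalone illustration of the regime the guard handles (sketch, not ClickHouse code):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

int main()
{
    using T = int8_t;       // narrow column type
    size_t scale = 1000;    // round to a multiple of 1000
    assert(scale > size_t(std::numeric_limits<T>::max()));

    // Every int8_t value x satisfies |x| <= 127 < 1000, so truncating x to a
    // multiple of 1000 always yields 0, which is what the guard returns.
    for (int x = -128; x <= 127; ++x)
        assert((x / 1000) * 1000 == 0);
    return 0;
}
```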
@@ -556,6 +565,154 @@ public:
 };

+/** Rounds down to a number within explicitly specified array.
+  * If the value is less than the minimal bound - returns the minimal bound.
+  */
+class FunctionRoundDown : public IFunction
+{
+public:
+    static constexpr auto name = "roundDown";
+    static FunctionPtr create(const Context & context) { return std::make_shared<FunctionRoundDown>(context); }
+    FunctionRoundDown(const Context & context) : context(context) {}
+
+public:
+    String getName() const override { return name; }
+
+    bool isVariadic() const override { return false; }
+    size_t getNumberOfArguments() const override { return 2; }
+    bool useDefaultImplementationForConstants() const override { return true; }
+    ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
+
+    DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
+    {
+        const DataTypePtr & type_x = arguments[0];
+
+        if (!(isNumber(type_x) || isDecimal(type_x)))
+            throw Exception{"Unsupported type " + type_x->getName()
+                + " of first argument of function " + getName()
+                + ", must be numeric type.", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
+
+        const DataTypeArray * type_arr = checkAndGetDataType<DataTypeArray>(arguments[1].get());
+
+        if (!type_arr)
+            throw Exception{"Second argument of function " + getName()
+                + ", must be array of boundaries to round to.", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
+
+        const auto type_arr_nested = type_arr->getNestedType();
+
+        if (!(isNumber(type_arr_nested) || isDecimal(type_arr_nested)))
+        {
+            throw Exception{"Elements of array of second argument of function " + getName()
+                + " must be numeric type.", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
+        }
+        return getLeastSupertype({type_x, type_arr_nested});
+    }
+
+    void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t) override
+    {
+        auto in_column = block.getByPosition(arguments[0]).column;
+        const auto & in_type = block.getByPosition(arguments[0]).type;
+
+        auto array_column = block.getByPosition(arguments[1]).column;
+        const auto & array_type = block.getByPosition(arguments[1]).type;
+
+        const auto & return_type = block.getByPosition(result).type;
+        auto column_result = return_type->createColumn();
+        auto out = column_result.get();
+
+        if (!in_type->equals(*return_type))
+            in_column = castColumn(block.getByPosition(arguments[0]), return_type, context);
+
+        if (!array_type->equals(*return_type))
+            array_column = castColumn(block.getByPosition(arguments[1]), std::make_shared<DataTypeArray>(return_type), context);
+
+        const auto in = in_column.get();
+        auto boundaries = typeid_cast<const ColumnConst &>(*array_column).getValue<Array>();
+        size_t num_boundaries = boundaries.size();
+        if (!num_boundaries)
+            throw Exception("Empty array is illegal for boundaries in " + getName() + " function", ErrorCodes::BAD_ARGUMENTS);
+
+        if (!executeNum<UInt8>(in, out, boundaries)
+            && !executeNum<UInt16>(in, out, boundaries)
+            && !executeNum<UInt32>(in, out, boundaries)
+            && !executeNum<UInt64>(in, out, boundaries)
+            && !executeNum<Int8>(in, out, boundaries)
+            && !executeNum<Int16>(in, out, boundaries)
+            && !executeNum<Int32>(in, out, boundaries)
+            && !executeNum<Int64>(in, out, boundaries)
+            && !executeNum<Float32>(in, out, boundaries)
+            && !executeNum<Float64>(in, out, boundaries)
+            && !executeDecimal<Decimal32>(in, out, boundaries)
+            && !executeDecimal<Decimal64>(in, out, boundaries)
+            && !executeDecimal<Decimal128>(in, out, boundaries))
+        {
+            throw Exception{"Illegal column " + in->getName() + " of first argument of function " + getName(), ErrorCodes::ILLEGAL_COLUMN};
+        }
+
+        block.getByPosition(result).column = std::move(column_result);
+    }
+
+private:
+    template <typename T>
+    bool executeNum(const IColumn * in_untyped, IColumn * out_untyped, const Array & boundaries)
+    {
+        const auto in = checkAndGetColumn<ColumnVector<T>>(in_untyped);
+        auto out = typeid_cast<ColumnVector<T> *>(out_untyped);
+        if (!in || !out)
+            return false;
+
+        executeImplNumToNum(in->getData(), out->getData(), boundaries);
+        return true;
+    }
+
+    template <typename T>
+    bool executeDecimal(const IColumn * in_untyped, IColumn * out_untyped, const Array & boundaries)
+    {
+        const auto in = checkAndGetColumn<ColumnDecimal<T>>(in_untyped);
+        auto out = typeid_cast<ColumnDecimal<T> *>(out_untyped);
+        if (!in || !out)
+            return false;
+
+        executeImplNumToNum(in->getData(), out->getData(), boundaries);
+        return true;
+    }
+
+    template <typename Container>
+    void executeImplNumToNum(const Container & src, Container & dst, const Array & boundaries)
+    {
+        using ValueType = typename Container::value_type;
+        std::vector<ValueType> boundary_values(boundaries.size());
+        for (size_t i = 0; i < boundaries.size(); ++i)
+            boundary_values[i] = boundaries[i].get<ValueType>();
+
+        std::sort(boundary_values.begin(), boundary_values.end());
+        boundary_values.erase(std::unique(boundary_values.begin(), boundary_values.end()), boundary_values.end());
+
+        size_t size = src.size();
+        dst.resize(size);
+        for (size_t i = 0; i < size; ++i)
+        {
+            auto it = std::upper_bound(boundary_values.begin(), boundary_values.end(), src[i]);
+            if (it == boundary_values.end())
+            {
+                dst[i] = boundary_values.back();
+            }
+            else if (it == boundary_values.begin())
+            {
+                dst[i] = boundary_values.front();
+            }
+            else
+            {
+                dst[i] = *(it - 1);
+            }
+        }
+    }
+
+private:
+    const Context & context;
+};
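The heavy lifting in `FunctionRoundDown` happens in `executeImplNumToNum`: the boundary array is sorted and deduplicated once, then every input value is mapped with one `std::upper_bound` per row, clamping values below the minimal bound up to it. The same lookup, extracted into a standalone sketch with plain standard containers:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Mirror of the roundDown lookup: largest boundary <= x, clamped to the
// smallest boundary when x lies below all of them.
double round_down(double x, std::vector<double> bounds)
{
    std::sort(bounds.begin(), bounds.end());
    bounds.erase(std::unique(bounds.begin(), bounds.end()), bounds.end());

    auto it = std::upper_bound(bounds.begin(), bounds.end(), x);
    if (it == bounds.end())
        return bounds.back();     // x >= every boundary
    if (it == bounds.begin())
        return bounds.front();    // x below the minimal bound: clamp up
    return *(it - 1);             // previous boundary, i.e. round down
}

int main()
{
    std::vector<double> bounds{0, 1, 10, 100};
    for (double x : {-5.0, 0.5, 31.0, 1000.0})
        printf("roundDown(%g) = %g\n", x, round_down(x, bounds));
    // Expected: -5 -> 0 (clamped), 0.5 -> 0, 31 -> 10, 1000 -> 100.
}
```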
 struct NameRound { static constexpr auto name = "round"; };
 struct NameCeil { static constexpr auto name = "ceil"; };
 struct NameFloor { static constexpr auto name = "floor"; };
@@ -1080,7 +1080,7 @@ void registerFunctionsStringSearch(FunctionFactory & factory)
     factory.registerFunction<FunctionReplaceAll>();
     factory.registerFunction<FunctionReplaceRegexpOne>();
     factory.registerFunction<FunctionReplaceRegexpAll>();
-    factory.registerFunction<FunctionPosition>();
+    factory.registerFunction<FunctionPosition>(FunctionFactory::CaseInsensitive);
     factory.registerFunction<FunctionPositionUTF8>();
     factory.registerFunction<FunctionPositionCaseInsensitive>();
     factory.registerFunction<FunctionPositionCaseInsensitiveUTF8>();
@@ -157,10 +157,7 @@ ColumnPtr wrapInNullable(const ColumnPtr & src, const Block & block, const Colum
     if (!result_null_map_column)
         return makeNullable(src);

-    if (src_not_nullable->isColumnConst())
-        return ColumnNullable::create(src_not_nullable->convertToFullColumnIfConst(), result_null_map_column);
-    else
-        return ColumnNullable::create(src_not_nullable, result_null_map_column);
+    return ColumnNullable::create(src_not_nullable->convertToFullColumnIfConst(), result_null_map_column);
 }
@@ -431,9 +428,7 @@ void PreparedFunctionImpl::execute(Block & block, const ColumnNumbers & args, si
             executeWithoutLowCardinalityColumns(block_without_low_cardinality, args, result, block_without_low_cardinality.rows(), dry_run);

-            auto & keys = block_without_low_cardinality.safeGetByPosition(result).column;
-            if (auto full_column = keys->convertToFullColumnIfConst())
-                keys = full_column;
+            auto keys = block_without_low_cardinality.safeGetByPosition(result).column->convertToFullColumnIfConst();

             auto res_mut_dictionary = DataTypeLowCardinality::createColumnUnique(*res_low_cardinality_type->getDictionaryType());
             ColumnPtr res_indexes = res_mut_dictionary->uniqueInsertRangeFrom(*keys, 0, keys->size());
@@ -48,7 +48,7 @@ template <> struct FunctionUnaryArithmeticMonotonicity<NameAbs>
 void registerFunctionAbs(FunctionFactory & factory)
 {
-    factory.registerFunction<FunctionAbs>();
+    factory.registerFunction<FunctionAbs>(FunctionFactory::CaseInsensitive);
 }

 }
@@ -0,0 +1,18 @@
+#include <Functions/IFunction.h>
+#include <Functions/FunctionFactory.h>
+#include <Functions/FunctionDateOrDateTimeAddInterval.h>
+
+namespace DB
+{
+
+using FunctionAddQuarters = FunctionDateOrDateTimeAddInterval<AddQuartersImpl>;
+
+void registerFunctionAddQuarters(FunctionFactory & factory)
+{
+    factory.registerFunction<FunctionAddQuarters>();
+}
+
+}
@@ -69,8 +69,7 @@ public:
             if (!arg.type->equals(*elem_type))
                 preprocessed_column = castColumn(arg, elem_type, context);

-            if (ColumnPtr materialized_column = preprocessed_column->convertToFullColumnIfConst())
-                preprocessed_column = materialized_column;
+            preprocessed_column = preprocessed_column->convertToFullColumnIfConst();

             columns_holder[i] = std::move(preprocessed_column);
             columns[i] = columns_holder[i].get();
@@ -61,21 +61,10 @@ private:
     static constexpr size_t INITIAL_SIZE_DEGREE = 9;

     template <typename T>
-    bool executeNumber(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values);
-    bool executeString(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values);
-
-    bool execute128bit(
-        const ColumnArray::Offsets & offsets,
-        const ColumnRawPtrs & columns,
-        const ColumnRawPtrs & null_maps,
-        ColumnUInt32::Container & res_values,
-        bool has_nullable_columns);
-
-    void executeHashed(
-        const ColumnArray::Offsets & offsets,
-        const ColumnRawPtrs & columns,
-        ColumnUInt32::Container & res_values);
+    bool executeNumber(const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values);
+    bool executeString(const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values);
+    bool execute128bit(const ColumnArray::Offsets & offsets, const ColumnRawPtrs & columns, ColumnUInt32::Container & res_values);
+    bool executeHashed(const ColumnArray::Offsets & offsets, const ColumnRawPtrs & columns, ColumnUInt32::Container & res_values);
 };

@@ -83,14 +72,14 @@ template <typename Derived>
 void FunctionArrayEnumerateExtended<Derived>::executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t /*input_rows_count*/)
 {
     const ColumnArray::Offsets * offsets = nullptr;
-    ColumnRawPtrs data_columns;
-    data_columns.reserve(arguments.size());
-    bool has_nullable_columns = false;
+    size_t num_arguments = arguments.size();
+    ColumnRawPtrs data_columns(num_arguments);
+    Columns array_holders;
+    ColumnPtr offsets_column;

-    for (size_t i = 0; i < arguments.size(); ++i)
+    for (size_t i = 0; i < num_arguments; ++i)
     {
-        ColumnPtr array_ptr = block.getByPosition(arguments[i]).column;
+        const ColumnPtr & array_ptr = block.getByPosition(arguments[i]).column;
         const ColumnArray * array = checkAndGetColumn<ColumnArray>(array_ptr.get());
         if (!array)
         {
@@ -100,101 +89,84 @@ void FunctionArrayEnumerateExtended<Derived>::executeImpl(Block & block, const C
                 throw Exception("Illegal column " + block.getByPosition(arguments[i]).column->getName()
                     + " of " + toString(i + 1) + "-th argument of function " + getName(),
                     ErrorCodes::ILLEGAL_COLUMN);
-            array_ptr = const_array->convertToFullColumn();
-            array = checkAndGetColumn<ColumnArray>(array_ptr.get());
+            array_holders.emplace_back(const_array->convertToFullColumn());
+            array = checkAndGetColumn<ColumnArray>(array_holders.back().get());
         }

         const ColumnArray::Offsets & offsets_i = array->getOffsets();
         if (i == 0)
+        {
             offsets = &offsets_i;
+            offsets_column = array->getOffsetsPtr();
+        }
         else if (offsets_i != *offsets)
             throw Exception("Lengths of all arrays passed to " + getName() + " must be equal.",
                 ErrorCodes::SIZES_OF_ARRAYS_DOESNT_MATCH);

         auto * array_data = &array->getData();
-        data_columns.push_back(array_data);
+        data_columns[i] = array_data;
     }

-    size_t num_columns = data_columns.size();
-    ColumnRawPtrs original_data_columns(num_columns);
-    ColumnRawPtrs null_maps(num_columns);
+    const NullMap * null_map = nullptr;

-    for (size_t i = 0; i < num_columns; ++i)
+    for (size_t i = 0; i < num_arguments; ++i)
     {
-        original_data_columns[i] = data_columns[i];
         if (data_columns[i]->isColumnNullable())
         {
-            has_nullable_columns = true;
             const auto & nullable_col = static_cast<const ColumnNullable &>(*data_columns[i]);
-            data_columns[i] = &nullable_col.getNestedColumn();
-            null_maps[i] = &nullable_col.getNullMapColumn();
+            if (num_arguments == 1)
+                data_columns[i] = &nullable_col.getNestedColumn();
+
+            null_map = &nullable_col.getNullMapData();
+            break;
         }
-        else
-            null_maps[i] = nullptr;
     }

-    const ColumnArray * first_array = checkAndGetColumn<ColumnArray>(block.getByPosition(arguments.at(0)).column.get());
-    const IColumn * first_null_map = null_maps[0];
     auto res_nested = ColumnUInt32::create();

     ColumnUInt32::Container & res_values = res_nested->getData();
     if (!offsets->empty())
         res_values.resize(offsets->back());

-    if (num_columns == 1)
+    if (num_arguments == 1)
     {
-        if (!(executeNumber<UInt8>(first_array, first_null_map, res_values)
-            || executeNumber<UInt16>(first_array, first_null_map, res_values)
-            || executeNumber<UInt32>(first_array, first_null_map, res_values)
-            || executeNumber<UInt64>(first_array, first_null_map, res_values)
-            || executeNumber<Int8>(first_array, first_null_map, res_values)
-            || executeNumber<Int16>(first_array, first_null_map, res_values)
-            || executeNumber<Int32>(first_array, first_null_map, res_values)
-            || executeNumber<Int64>(first_array, first_null_map, res_values)
-            || executeNumber<Float32>(first_array, first_null_map, res_values)
-            || executeNumber<Float64>(first_array, first_null_map, res_values)
-            || executeString (first_array, first_null_map, res_values)))
-            executeHashed(*offsets, original_data_columns, res_values);
+        executeNumber<UInt8>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<UInt16>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<UInt32>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<UInt64>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int8>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int16>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int32>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int64>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Float32>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Float64>(*offsets, *data_columns[0], null_map, res_values)
+            || executeString(*offsets, *data_columns[0], null_map, res_values)
+            || executeHashed(*offsets, data_columns, res_values);
     }
     else
     {
-        if (!execute128bit(*offsets, data_columns, null_maps, res_values, has_nullable_columns))
-            executeHashed(*offsets, original_data_columns, res_values);
+        execute128bit(*offsets, data_columns, res_values)
+            || executeHashed(*offsets, data_columns, res_values);
     }

-    block.getByPosition(result).column = ColumnArray::create(std::move(res_nested), first_array->getOffsetsPtr());
+    block.getByPosition(result).column = ColumnArray::create(std::move(res_nested), offsets_column);
 }

 template <typename Derived>
 template <typename T>
-bool FunctionArrayEnumerateExtended<Derived>::executeNumber(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values)
+bool FunctionArrayEnumerateExtended<Derived>::executeNumber(
+    const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values)
 {
-    const IColumn * inner_col;
-
-    const auto & array_data = array->getData();
-    if (array_data.isColumnNullable())
-    {
-        const auto & nullable_col = static_cast<const ColumnNullable &>(array_data);
-        inner_col = &nullable_col.getNestedColumn();
-    }
-    else
-        inner_col = &array_data;
-
-    const ColumnVector<T> * nested = checkAndGetColumn<ColumnVector<T>>(inner_col);
-    if (!nested)
+    const ColumnVector<T> * data_concrete = checkAndGetColumn<ColumnVector<T>>(&data);
+    if (!data_concrete)
         return false;
-    const ColumnArray::Offsets & offsets = array->getOffsets();
-    const typename ColumnVector<T>::Container & values = nested->getData();
+    const auto & values = data_concrete->getData();

     using ValuesToIndices = ClearableHashMap<T, UInt32, DefaultHash<T>, HashTableGrower<INITIAL_SIZE_DEGREE>,
         HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(T)>>;

-    const PaddedPODArray<UInt8> * null_map_data = nullptr;
-    if (null_map)
-        null_map_data = &static_cast<const ColumnUInt8 *>(null_map)->getData();
-
     ValuesToIndices indices;
     size_t prev_off = 0;
     if constexpr (std::is_same_v<Derived, FunctionArrayEnumerateUniq>)
@@ -207,7 +179,7 @@ bool FunctionArrayEnumerateExtended<Derived>::executeNumber(const ColumnArray *
             size_t off = offsets[i];
             for (size_t j = prev_off; j < off; ++j)
             {
-                if (null_map_data && ((*null_map_data)[j] == 1))
+                if (null_map && (*null_map)[j])
                     res_values[j] = ++null_count;
                 else
                     res_values[j] = ++indices[values[j]];
@@ -226,7 +198,7 @@ bool FunctionArrayEnumerateExtended<Derived>::executeNumber(const ColumnArray *
             size_t off = offsets[i];
             for (size_t j = prev_off; j < off; ++j)
             {
-                if (null_map_data && ((*null_map_data)[j] == 1))
+                if (null_map && (*null_map)[j])
                 {
                     if (!null_index)
                         null_index = ++rank;
@@ -247,32 +219,17 @@ bool FunctionArrayEnumerateExtended<Derived>::executeNumber(const ColumnArray *
 }

 template <typename Derived>
-bool FunctionArrayEnumerateExtended<Derived>::executeString(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values)
+bool FunctionArrayEnumerateExtended<Derived>::executeString(
+    const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values)
 {
-    const IColumn * inner_col;
-
-    const auto & array_data = array->getData();
-    if (array_data.isColumnNullable())
-    {
-        const auto & nullable_col = static_cast<const ColumnNullable &>(array_data);
-        inner_col = &nullable_col.getNestedColumn();
-    }
-    else
-        inner_col = &array_data;
-
-    const ColumnString * nested = checkAndGetColumn<ColumnString>(inner_col);
-    if (!nested)
+    const ColumnString * values = checkAndGetColumn<ColumnString>(&data);
+    if (!values)
         return false;
-    const ColumnArray::Offsets & offsets = array->getOffsets();

     size_t prev_off = 0;
     using ValuesToIndices = ClearableHashMap<StringRef, UInt32, StringRefHash, HashTableGrower<INITIAL_SIZE_DEGREE>,
         HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(StringRef)>>;

-    const PaddedPODArray<UInt8> * null_map_data = nullptr;
-    if (null_map)
-        null_map_data = &static_cast<const ColumnUInt8 *>(null_map)->getData();
-
     ValuesToIndices indices;
     if constexpr (std::is_same_v<Derived, FunctionArrayEnumerateUniq>)
     {
@@ -284,10 +241,10 @@ bool FunctionArrayEnumerateExtended<Derived>::executeString(const ColumnArray *
             size_t off = offsets[i];
             for (size_t j = prev_off; j < off; ++j)
             {
-                if (null_map_data && ((*null_map_data)[j] == 1))
+                if (null_map && (*null_map)[j])
                     res_values[j] = ++null_count;
                 else
-                    res_values[j] = ++indices[nested->getDataAt(j)];
+                    res_values[j] = ++indices[values->getDataAt(j)];
             }
             prev_off = off;
         }
@@ -303,7 +260,7 @@ bool FunctionArrayEnumerateExtended<Derived>::executeString(const ColumnArray *
             size_t off = offsets[i];
             for (size_t j = prev_off; j < off; ++j)
             {
-                if (null_map_data && ((*null_map_data)[j] == 1))
+                if (null_map && (*null_map)[j])
                 {
                     if (!null_index)
                         null_index = ++rank;
@@ -311,7 +268,7 @@ bool FunctionArrayEnumerateExtended<Derived>::executeString(const ColumnArray *
                 }
                 else
                 {
-                    auto & idx = indices[nested->getDataAt(j)];
+                    auto & idx = indices[values->getDataAt(j)];
                     if (!idx)
                         idx = ++rank;
                     res_values[j] = idx;
@@ -327,9 +284,7 @@ template <typename Derived>
 bool FunctionArrayEnumerateExtended<Derived>::execute128bit(
     const ColumnArray::Offsets & offsets,
     const ColumnRawPtrs & columns,
-    const ColumnRawPtrs & null_maps,
-    ColumnUInt32::Container & res_values,
-    bool has_nullable_columns)
+    ColumnUInt32::Container & res_values)
 {
     size_t count = columns.size();
     size_t keys_bytes = 0;
@@ -342,8 +297,6 @@ bool FunctionArrayEnumerateExtended<Derived>::execute128bit(
         key_sizes[j] = columns[j]->sizeOfValueIfFixed();
         keys_bytes += key_sizes[j];
     }
-    if (has_nullable_columns)
-        keys_bytes += std::tuple_size<KeysNullMap<UInt128>>::value;

     if (keys_bytes > 16)
         return false;
@@ -361,29 +314,7 @@ bool FunctionArrayEnumerateExtended<Derived>::execute128bit(
             indices.clear();
             size_t off = offsets[i];
             for (size_t j = prev_off; j < off; ++j)
-            {
-                if (has_nullable_columns)
-                {
-                    KeysNullMap<UInt128> bitmap{};
-
-                    for (size_t i = 0; i < columns.size(); ++i)
-                    {
-                        if (null_maps[i])
-                        {
-                            const auto & null_map = static_cast<const ColumnUInt8 &>(*null_maps[i]).getData();
-                            if (null_map[j] == 1)
-                            {
-                                size_t bucket = i / 8;
-                                size_t offset = i % 8;
-                                bitmap[bucket] |= UInt8(1) << offset;
-                            }
-                        }
-                    }
-                    res_values[j] = ++indices[packFixed<UInt128>(j, count, columns, key_sizes, bitmap)];
-                }
-                else
-                    res_values[j] = ++indices[packFixed<UInt128>(j, count, columns, key_sizes)];
-            }
+                res_values[j] = ++indices[packFixed<UInt128>(j, count, columns, key_sizes)];
             prev_off = off;
         }
     }
@@ -396,37 +327,12 @@ bool FunctionArrayEnumerateExtended<Derived>::execute128bit(
             size_t off = offsets[i];
             size_t rank = 0;
             for (size_t j = prev_off; j < off; ++j)
-            {
-                if (has_nullable_columns)
-                {
-                    KeysNullMap<UInt128> bitmap{};
-
-                    for (size_t i = 0; i < columns.size(); ++i)
-                    {
-                        if (null_maps[i])
-                        {
-                            const auto & null_map = static_cast<const ColumnUInt8 &>(*null_maps[i]).getData();
-                            if (null_map[j] == 1)
-                            {
-                                size_t bucket = i / 8;
-                                size_t offset = i % 8;
-                                bitmap[bucket] |= UInt8(1) << offset;
-                            }
-                        }
-                    }
-                    auto & idx = indices[packFixed<UInt128>(j, count, columns, key_sizes, bitmap)];
-                    if (!idx)
-                        idx = ++rank;
-                    res_values[j] = idx;
-                }
-                else
             {
                 auto & idx = indices[packFixed<UInt128>(j, count, columns, key_sizes)];
                 if (!idx)
                     idx = ++rank;
                 res_values[j] = idx;
             }
-            }
             prev_off = off;
         }
     }
@@ -435,7 +341,7 @@ bool FunctionArrayEnumerateExtended<Derived>::execute128bit(
 }

 template <typename Derived>
-void FunctionArrayEnumerateExtended<Derived>::executeHashed(
+bool FunctionArrayEnumerateExtended<Derived>::executeHashed(
     const ColumnArray::Offsets & offsets,
     const ColumnRawPtrs & columns,
     ColumnUInt32::Container & res_values)
@@ -479,6 +385,8 @@ void FunctionArrayEnumerateExtended<Derived>::executeHashed(
             prev_off = off;
         }
     }
+
+    return true;
 }

 }
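The dispatch chain above (typed numeric columns, then strings, then 128-bit packed keys, then the generic hashed fallback) all computes the same thing: a per-element occurrence counter that resets at every array boundary. A compact standalone sketch of that inner loop using standard containers (the real code uses `ClearableHashMap` with stack-backed allocation for speed):

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

// arrayEnumerateUniq semantics: res[j] is the 1-based count of how many times
// the value data[j] has been seen so far within the current array.
std::vector<uint32_t> enumerate_uniq(const std::vector<int64_t> & data,
                                     const std::vector<size_t> & offsets)
{
    std::vector<uint32_t> res(data.size());
    std::unordered_map<int64_t, uint32_t> indices;
    size_t prev_off = 0;
    for (size_t off : offsets)   // one end offset per array, like ColumnArray::Offsets
    {
        indices.clear();         // the "clearable" part of ClearableHashMap
        for (size_t j = prev_off; j < off; ++j)
            res[j] = ++indices[data[j]];
        prev_off = off;
    }
    return res;
}

int main()
{
    // Two arrays flattened: [10, 20, 10] and [7, 7]
    auto res = enumerate_uniq({10, 20, 10, 7, 7}, {3, 5});
    for (uint32_t v : res)
        printf("%u ", v);        // prints: 1 1 2 1 2
    printf("\n");
}
```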
@@ -838,14 +838,8 @@ private:
                     null_map_data, nullptr);
             else
             {
-                /// If item_arg is tuple and have constants.
-
-                if (ColumnPtr materialized_tuple = item_arg.convertToFullColumnIfConst())
-                    ArrayIndexGenericImpl<IndexConv, false>::vector(
-                        col_nested, col_array->getOffsets(), *materialized_tuple, col_res->getData(),
-                        null_map_data, null_map_item);
-                else
-                    ArrayIndexGenericImpl<IndexConv, false>::vector(
-                        col_nested, col_array->getOffsets(), item_arg, col_res->getData(),
+                ArrayIndexGenericImpl<IndexConv, false>::vector(
+                    col_nested, col_array->getOffsets(), *item_arg.convertToFullColumnIfConst(), col_res->getData(),
                     null_map_data, null_map_item);
             }
@@ -23,9 +23,7 @@ struct ArrayMapImpl
     static ColumnPtr execute(const ColumnArray & array, ColumnPtr mapped)
     {
-        return mapped->isColumnConst()
-            ? ColumnArray::create(mapped->convertToFullColumnIfConst(), array.getOffsetsPtr())
-            : ColumnArray::create(mapped, array.getOffsetsPtr());
+        return ColumnArray::create(mapped->convertToFullColumnIfConst(), array.getOffsetsPtr());
     }
 };
@@ -63,37 +63,23 @@ private:
     static constexpr size_t INITIAL_SIZE_DEGREE = 9;

     template <typename T>
-    bool executeNumber(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values);
-    bool executeString(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values);
-
-    bool execute128bit(
-        const ColumnArray::Offsets & offsets,
-        const ColumnRawPtrs & columns,
-        const ColumnRawPtrs & null_maps,
-        ColumnUInt32::Container & res_values,
-        bool has_nullable_columns);
-
-    void executeHashed(
-        const ColumnArray::Offsets & offsets,
-        const ColumnRawPtrs & columns,
-        ColumnUInt32::Container & res_values);
+    bool executeNumber(const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values);
+    bool executeString(const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values);
+    bool execute128bit(const ColumnArray::Offsets & offsets, const ColumnRawPtrs & columns, ColumnUInt32::Container & res_values);
+    bool executeHashed(const ColumnArray::Offsets & offsets, const ColumnRawPtrs & columns, ColumnUInt32::Container & res_values);
 };

 void FunctionArrayUniq::executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t /*input_rows_count*/)
 {
-    Columns array_columns(arguments.size());
     const ColumnArray::Offsets * offsets = nullptr;
-    ColumnRawPtrs data_columns(arguments.size());
-    ColumnRawPtrs original_data_columns(arguments.size());
-    ColumnRawPtrs null_maps(arguments.size());
-
-    bool has_nullable_columns = false;
-
-    for (size_t i = 0; i < arguments.size(); ++i)
+    size_t num_arguments = arguments.size();
+    ColumnRawPtrs data_columns(num_arguments);
+
+    Columns array_holders;
+    for (size_t i = 0; i < num_arguments; ++i)
     {
-        ColumnPtr array_ptr = block.getByPosition(arguments[i]).column;
+        const ColumnPtr & array_ptr = block.getByPosition(arguments[i]).column;
         const ColumnArray * array = checkAndGetColumn<ColumnArray>(array_ptr.get());
         if (!array)
         {
@@ -101,14 +87,12 @@ void FunctionArrayUniq::executeImpl(Block & block, const ColumnNumbers & argumen
                 block.getByPosition(arguments[i]).column.get());
             if (!const_array)
                 throw Exception("Illegal column " + block.getByPosition(arguments[i]).column->getName()
-                    + " of " + toString(i + 1) + getOrdinalSuffix(i + 1) + " argument of function " + getName(),
+                    + " of " + toString(i + 1) + "-th argument of function " + getName(),
                     ErrorCodes::ILLEGAL_COLUMN);
-            array_ptr = const_array->convertToFullColumn();
-            array = static_cast<const ColumnArray *>(array_ptr.get());
+            array_holders.emplace_back(const_array->convertToFullColumn());
+            array = checkAndGetColumn<ColumnArray>(array_holders.back().get());
         }
-
-        array_columns[i] = array_ptr;
         const ColumnArray::Offsets & offsets_i = array->getOffsets();
         if (i == 0)
             offsets = &offsets_i;
@@ -116,78 +100,65 @@ void FunctionArrayUniq::executeImpl(Block & block, const ColumnNumbers & argumen
             throw Exception("Lengths of all arrays passed to " + getName() + " must be equal.",
                 ErrorCodes::SIZES_OF_ARRAYS_DOESNT_MATCH);

-        data_columns[i] = &array->getData();
-        original_data_columns[i] = data_columns[i];
+        auto * array_data = &array->getData();
+        data_columns[i] = array_data;
+    }

+    const NullMap * null_map = nullptr;
+
+    for (size_t i = 0; i < num_arguments; ++i)
+    {
         if (data_columns[i]->isColumnNullable())
         {
-            has_nullable_columns = true;
             const auto & nullable_col = static_cast<const ColumnNullable &>(*data_columns[i]);
-            data_columns[i] = &nullable_col.getNestedColumn();
-            null_maps[i] = &nullable_col.getNullMapColumn();
+            if (num_arguments == 1)
+                data_columns[i] = &nullable_col.getNestedColumn();
+
+            null_map = &nullable_col.getNullMapData();
+            break;
         }
-        else
-            null_maps[i] = nullptr;
     }

-    const ColumnArray * first_array = static_cast<const ColumnArray *>(array_columns[0].get());
-    const IColumn * first_null_map = null_maps[0];
     auto res = ColumnUInt32::create();
     ColumnUInt32::Container & res_values = res->getData();
     res_values.resize(offsets->size());

-    if (arguments.size() == 1)
+    if (num_arguments == 1)
     {
-        if (!(executeNumber<UInt8>(first_array, first_null_map, res_values)
-            || executeNumber<UInt16>(first_array, first_null_map, res_values)
-            || executeNumber<UInt32>(first_array, first_null_map, res_values)
-            || executeNumber<UInt64>(first_array, first_null_map, res_values)
-            || executeNumber<Int8>(first_array, first_null_map, res_values)
-            || executeNumber<Int16>(first_array, first_null_map, res_values)
-            || executeNumber<Int32>(first_array, first_null_map, res_values)
-            || executeNumber<Int64>(first_array, first_null_map, res_values)
-            || executeNumber<Float32>(first_array, first_null_map, res_values)
-            || executeNumber<Float64>(first_array, first_null_map, res_values)
-            || executeString(first_array, first_null_map, res_values)))
-            executeHashed(*offsets, original_data_columns, res_values);
+        executeNumber<UInt8>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<UInt16>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<UInt32>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<UInt64>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int8>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int16>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int32>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Int64>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Float32>(*offsets, *data_columns[0], null_map, res_values)
+            || executeNumber<Float64>(*offsets, *data_columns[0], null_map, res_values)
+            || executeString(*offsets, *data_columns[0], null_map, res_values)
+            || executeHashed(*offsets, data_columns, res_values);
     }
     else
     {
-        if (!execute128bit(*offsets, data_columns, null_maps, res_values, has_nullable_columns))
-            executeHashed(*offsets, original_data_columns, res_values);
+        execute128bit(*offsets, data_columns, res_values)
+            || executeHashed(*offsets, data_columns, res_values);
     }

     block.getByPosition(result).column = std::move(res);
 }

 template <typename T>
-bool FunctionArrayUniq::executeNumber(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values)
+bool FunctionArrayUniq::executeNumber(const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values)
 {
-    const IColumn * inner_col;
-
-    const auto & array_data = array->getData();
-    if (array_data.isColumnNullable())
-    {
-        const auto & nullable_col = static_cast<const ColumnNullable &>(array_data);
-        inner_col = &nullable_col.getNestedColumn();
-    }
-    else
-        inner_col = &array_data;
-
-    const ColumnVector<T> * nested = checkAndGetColumn<ColumnVector<T>>(inner_col);
+    const ColumnVector<T> * nested = checkAndGetColumn<ColumnVector<T>>(&data);
     if (!nested)
         return false;
-    const ColumnArray::Offsets & offsets = array->getOffsets();
-    const typename ColumnVector<T>::Container & values = nested->getData();
+    const auto & values = nested->getData();

     using Set = ClearableHashSet<T, DefaultHash<T>, HashTableGrower<INITIAL_SIZE_DEGREE>,
         HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(T)>>;

-    const PaddedPODArray<UInt8> * null_map_data = nullptr;
-    if (null_map)
-        null_map_data = &static_cast<const ColumnUInt8 *>(null_map)->getData();
-
     Set set;
     ColumnArray::Offset prev_off = 0;
     for (size_t i = 0; i < offsets.size(); ++i)
@@ -197,7 +168,7 @@ bool FunctionArrayUniq::executeNumber(const ColumnArray * array, const IColumn *
         ColumnArray::Offset off = offsets[i];
         for (ColumnArray::Offset j = prev_off; j < off; ++j)
         {
-            if (null_map_data && ((*null_map_data)[j] == 1))
+            if (null_map && (*null_map)[j])
                 found_null = true;
             else
                 set.insert(values[j]);
@@ -209,31 +180,15 @@ bool FunctionArrayUniq::executeNumber(const ColumnArray * array, const IColumn *
     return true;
 }

-bool FunctionArrayUniq::executeString(const ColumnArray * array, const IColumn * null_map, ColumnUInt32::Container & res_values)
+bool FunctionArrayUniq::executeString(const ColumnArray::Offsets & offsets, const IColumn & data, const NullMap * null_map, ColumnUInt32::Container & res_values)
 {
-    const IColumn * inner_col;
-
-    const auto & array_data = array->getData();
-    if (array_data.isColumnNullable())
-    {
-        const auto & nullable_col = static_cast<const ColumnNullable &>(array_data);
-        inner_col = &nullable_col.getNestedColumn();
-    }
-    else
-        inner_col = &array_data;
-
-    const ColumnString * nested = checkAndGetColumn<ColumnString>(inner_col);
+    const ColumnString * nested = checkAndGetColumn<ColumnString>(&data);
     if (!nested)
         return false;
-    const ColumnArray::Offsets & offsets = array->getOffsets();

     using Set = ClearableHashSet<StringRef, StringRefHash, HashTableGrower<INITIAL_SIZE_DEGREE>,
         HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(StringRef)>>;

-    const PaddedPODArray<UInt8> * null_map_data = nullptr;
-    if (null_map)
-        null_map_data = &static_cast<const ColumnUInt8 *>(null_map)->getData();
-
     Set set;
     ColumnArray::Offset prev_off = 0;
     for (size_t i = 0; i < offsets.size(); ++i)
@@ -243,7 +198,7 @@ bool FunctionArrayUniq::executeString(const ColumnArray * array, const IColumn *
         ColumnArray::Offset off = offsets[i];
         for (ColumnArray::Offset j = prev_off; j < off; ++j)
{ {
if (null_map_data && ((*null_map_data)[j] == 1)) if (null_map && (*null_map)[j])
found_null = true; found_null = true;
else else
set.insert(nested->getDataAt(j)); set.insert(nested->getDataAt(j));
@ -259,9 +214,7 @@ bool FunctionArrayUniq::executeString(const ColumnArray * array, const IColumn *
bool FunctionArrayUniq::execute128bit( bool FunctionArrayUniq::execute128bit(
const ColumnArray::Offsets & offsets, const ColumnArray::Offsets & offsets,
const ColumnRawPtrs & columns, const ColumnRawPtrs & columns,
const ColumnRawPtrs & null_maps, ColumnUInt32::Container & res_values)
ColumnUInt32::Container & res_values,
bool has_nullable_columns)
{ {
size_t count = columns.size(); size_t count = columns.size();
size_t keys_bytes = 0; size_t keys_bytes = 0;
@ -274,8 +227,6 @@ bool FunctionArrayUniq::execute128bit(
key_sizes[j] = columns[j]->sizeOfValueIfFixed(); key_sizes[j] = columns[j]->sizeOfValueIfFixed();
keys_bytes += key_sizes[j]; keys_bytes += key_sizes[j];
} }
if (has_nullable_columns)
keys_bytes += std::tuple_size<KeysNullMap<UInt128>>::value;
if (keys_bytes > 16) if (keys_bytes > 16)
return false; return false;
@ -283,19 +234,6 @@ bool FunctionArrayUniq::execute128bit(
using Set = ClearableHashSet<UInt128, UInt128HashCRC32, HashTableGrower<INITIAL_SIZE_DEGREE>, using Set = ClearableHashSet<UInt128, UInt128HashCRC32, HashTableGrower<INITIAL_SIZE_DEGREE>,
HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(UInt128)>>; HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(UInt128)>>;
/// Suppose that, for a given row, each of the N columns has an array whose length is M.
/// Denote arr_i each of these arrays (1 <= i <= N). Then the following is performed:
///
/// col1 ... colN
///
/// arr_1[1], ..., arr_N[1] -> pack into a binary blob b1
/// .
/// .
/// .
/// arr_1[M], ..., arr_N[M] -> pack into a binary blob bM
///
/// Each binary blob is inserted into a hash table.
///
Set set; Set set;
ColumnArray::Offset prev_off = 0; ColumnArray::Offset prev_off = 0;
for (ColumnArray::Offset i = 0; i < offsets.size(); ++i) for (ColumnArray::Offset i = 0; i < offsets.size(); ++i)
@ -303,29 +241,7 @@ bool FunctionArrayUniq::execute128bit(
set.clear(); set.clear();
ColumnArray::Offset off = offsets[i]; ColumnArray::Offset off = offsets[i];
for (ColumnArray::Offset j = prev_off; j < off; ++j) for (ColumnArray::Offset j = prev_off; j < off; ++j)
{
if (has_nullable_columns)
{
KeysNullMap<UInt128> bitmap{};
for (ColumnArray::Offset i = 0; i < columns.size(); ++i)
{
if (null_maps[i])
{
const auto & null_map = static_cast<const ColumnUInt8 &>(*null_maps[i]).getData();
if (null_map[j] == 1)
{
ColumnArray::Offset bucket = i / 8;
ColumnArray::Offset offset = i % 8;
bitmap[bucket] |= UInt8(1) << offset;
}
}
}
set.insert(packFixed<UInt128>(j, count, columns, key_sizes, bitmap));
}
else
set.insert(packFixed<UInt128>(j, count, columns, key_sizes)); set.insert(packFixed<UInt128>(j, count, columns, key_sizes));
}
res_values[i] = set.size(); res_values[i] = set.size();
prev_off = off; prev_off = off;
@ -334,7 +250,7 @@ bool FunctionArrayUniq::execute128bit(
return true; return true;
} }
void FunctionArrayUniq::executeHashed( bool FunctionArrayUniq::executeHashed(
const ColumnArray::Offsets & offsets, const ColumnArray::Offsets & offsets,
const ColumnRawPtrs & columns, const ColumnRawPtrs & columns,
ColumnUInt32::Container & res_values) ColumnUInt32::Container & res_values)
@ -356,6 +272,8 @@ void FunctionArrayUniq::executeHashed(
res_values[i] = set.size(); res_values[i] = set.size();
prev_off = off; prev_off = off;
} }
return true;
} }
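Taken together, the refactoring above gives every fast path the same shape: walk the flattened values slice by slice using the offsets column, count distinct elements per slice with a reusable hash set, and treat NULLs as at most one extra value. A minimal standalone sketch of that shape, using std::unordered_set in place of ClickHouse's ClearableHashSet and plain vectors in place of columns (all names here are illustrative, not from the codebase):

#include <cstdint>
#include <iostream>
#include <unordered_set>
#include <vector>

/// Counts distinct elements in each array of a flattened column.
/// `values` holds all arrays back to back; `offsets[i]` is the end
/// position of array i, as in ClickHouse's ColumnArray layout.
/// `null_map`, if non-null, marks NULL elements; all NULLs in one
/// array count as a single distinct value.
std::vector<uint32_t> arrayUniqSketch(
    const std::vector<int64_t> & values,
    const std::vector<size_t> & offsets,
    const std::vector<uint8_t> * null_map)
{
    std::vector<uint32_t> res(offsets.size());
    std::unordered_set<int64_t> set;

    size_t prev_off = 0;
    for (size_t i = 0; i < offsets.size(); ++i)
    {
        set.clear();
        bool found_null = false;
        for (size_t j = prev_off; j < offsets[i]; ++j)
        {
            if (null_map && (*null_map)[j])
                found_null = true;
            else
                set.insert(values[j]);
        }
        res[i] = static_cast<uint32_t>(set.size() + found_null);
        prev_off = offsets[i];
    }
    return res;
}

int main()
{
    /// Two arrays: [1, 2, 2] and [3, 3].
    std::vector<int64_t> values{1, 2, 2, 3, 3};
    std::vector<size_t> offsets{3, 5};
    for (uint32_t n : arrayUniqSketch(values, offsets, nullptr))
        std::cout << n << '\n'; /// Prints 2, then 1.
}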


@@ -638,9 +638,7 @@ private:
    static ColumnPtr materializeColumnIfConst(const ColumnPtr & column)
    {
        return column->convertToFullColumnIfConst();
    }

    static ColumnPtr makeNullableColumnIfNot(const ColumnPtr & column)
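This simplification, repeated across many files below, leans on a contract change implied by the diff: IColumn::convertToFullColumnIfConst now returns the column itself when there is nothing to convert, instead of nullptr, so the two-branch fallback disappears. A toy model of that contract (simplified, illustrative types only, not the real IColumn interface):

#include <iostream>
#include <memory>
#include <string>

struct IColumn;
using ColumnPtr = std::shared_ptr<IColumn>;

struct IColumn : std::enable_shared_from_this<IColumn>
{
    std::string name;
    bool is_const;
    IColumn(std::string name_, bool is_const_) : name(std::move(name_)), is_const(is_const_) {}

    /// New contract: never returns nullptr. Const columns are expanded,
    /// everything else is returned unchanged.
    ColumnPtr convertToFullColumnIfConst()
    {
        if (is_const)
            return std::make_shared<IColumn>("Full(" + name + ")", false);
        return shared_from_this();
    }
};

int main()
{
    auto full = std::make_shared<IColumn>("UInt64", false);
    auto constant = std::make_shared<IColumn>("Const(UInt64)", true);

    /// A single call now replaces the old
    /// "if (converted) use it, else keep the original" dance.
    std::cout << full->convertToFullColumnIfConst()->name << '\n';     /// UInt64
    std::cout << constant->convertToFullColumnIfConst()->name << '\n'; /// Full(UInt64)
}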


@@ -34,11 +34,7 @@ public:
    void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t /*input_rows_count*/) override
    {
        block.getByPosition(result).column = block.getByPosition(arguments[0]).column->convertToFullColumnIfConst();
    }
};


@@ -9,7 +9,7 @@ using FunctionRand = FunctionRandom<UInt32, NameRand>;
void registerFunctionRand(FunctionFactory & factory)
{
    factory.registerFunction<FunctionRand>(FunctionFactory::CaseInsensitive);
}

}


@@ -0,0 +1,114 @@
#include <Columns/ColumnString.h>
#include <DataTypes/DataTypeString.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionHelpers.h>
#include <common/find_symbols.h>

namespace DB
{

namespace ErrorCodes
{
    extern const int ILLEGAL_COLUMN;
    extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}

class FunctionRegexpQuoteMeta : public IFunction
{
public:
    static constexpr auto name = "regexpQuoteMeta";

    static FunctionPtr create(const Context &)
    {
        return std::make_shared<FunctionRegexpQuoteMeta>();
    }

    String getName() const override
    {
        return name;
    }

    size_t getNumberOfArguments() const override
    {
        return 1;
    }

    bool useDefaultImplementationForConstants() const override
    {
        return true;
    }

    DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
    {
        if (!WhichDataType(arguments[0].type).isString())
            throw Exception(
                "Illegal type " + arguments[0].type->getName() + " of 1 argument of function " + getName() + ". Must be String.",
                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

        return std::make_shared<DataTypeString>();
    }

    void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t input_rows_count) override
    {
        const ColumnPtr & column_string = block.getByPosition(arguments[0]).column;
        const ColumnString * input = checkAndGetColumn<ColumnString>(column_string.get());

        if (!input)
            throw Exception(
                "Illegal column " + block.getByPosition(arguments[0]).column->getName() + " of first argument of function " + getName(),
                ErrorCodes::ILLEGAL_COLUMN);

        auto dst_column = ColumnString::create();
        auto & dst_data = dst_column->getChars();
        auto & dst_offsets = dst_column->getOffsets();

        dst_offsets.resize(input_rows_count);

        const ColumnString::Offsets & src_offsets = input->getOffsets();

        auto src_begin = reinterpret_cast<const char *>(input->getChars().data());
        auto src_pos = src_begin;

        for (size_t row_idx = 0; row_idx < input_rows_count; ++row_idx)
        {
            /// NOTE This implementation slightly differs from re2::RE2::QuoteMeta.
            /// It escapes zero byte as \0 instead of \x00
            /// and it escapes only required characters.
            /// This is Ok. Look at comments in re2.cc

            const char * src_end = src_begin + src_offsets[row_idx] - 1;

            while (true)
            {
                const char * next_src_pos = find_first_symbols<'\0', '\\', '|', '(', ')', '^', '$', '.', '[', ']', '?', '*', '+', '{', ':', '-'>(src_pos, src_end);

                size_t bytes_to_copy = next_src_pos - src_pos;
                size_t old_dst_size = dst_data.size();
                dst_data.resize(old_dst_size + bytes_to_copy);
                memcpySmallAllowReadWriteOverflow15(dst_data.data() + old_dst_size, src_pos, bytes_to_copy);
                src_pos = next_src_pos + 1;

                if (next_src_pos == src_end)
                {
                    dst_data.emplace_back('\0');
                    break;
                }

                dst_data.emplace_back('\\');
                dst_data.emplace_back(*next_src_pos);
            }

            dst_offsets[row_idx] = dst_data.size();
        }

        block.getByPosition(result).column = std::move(dst_column);
    }
};

void registerFunctionRegexpQuoteMeta(FunctionFactory & factory)
{
    factory.registerFunction<FunctionRegexpQuoteMeta>();
}

}
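For reference, the escaping rule implemented above reduces to: copy bytes through and put a backslash before each regexp metacharacter in the scanned set. A standalone sketch on std::string (the real function additionally escapes the zero byte and works directly on column buffers via find_first_symbols; NUL-free input is assumed here):

#include <iostream>
#include <string>

/// Escape the same metacharacter set the function above scans for,
/// minus '\0', which a std::string-based sketch does not need.
std::string regexpQuoteMetaSketch(const std::string & s)
{
    static const std::string meta = "\\|()^$.[]?*+{:-";
    std::string res;
    res.reserve(s.size() * 2);
    for (char c : s)
    {
        if (meta.find(c) != std::string::npos)
            res += '\\';
        res += c;
    }
    return res;
}

int main()
{
    std::cout << regexpQuoteMetaSketch("dbms/src/Functions (v18.16)") << '\n';
    /// Prints: dbms/src/Functions \(v18\.16\)
}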


@@ -47,6 +47,7 @@ void registerFunctionAddHours(FunctionFactory &);
void registerFunctionAddDays(FunctionFactory &);
void registerFunctionAddWeeks(FunctionFactory &);
void registerFunctionAddMonths(FunctionFactory &);
void registerFunctionAddQuarters(FunctionFactory &);
void registerFunctionAddYears(FunctionFactory &);
void registerFunctionSubtractSeconds(FunctionFactory &);
void registerFunctionSubtractMinutes(FunctionFactory &);
@@ -54,6 +55,7 @@ void registerFunctionSubtractHours(FunctionFactory &);
void registerFunctionSubtractDays(FunctionFactory &);
void registerFunctionSubtractWeeks(FunctionFactory &);
void registerFunctionSubtractMonths(FunctionFactory &);
void registerFunctionSubtractQuarters(FunctionFactory &);
void registerFunctionSubtractYears(FunctionFactory &);
void registerFunctionDateDiff(FunctionFactory &);
void registerFunctionToTimeZone(FunctionFactory &);
@@ -106,6 +108,7 @@ void registerFunctionsDateTime(FunctionFactory & factory)
    registerFunctionAddDays(factory);
    registerFunctionAddWeeks(factory);
    registerFunctionAddMonths(factory);
    registerFunctionAddQuarters(factory);
    registerFunctionAddYears(factory);
    registerFunctionSubtractSeconds(factory);
    registerFunctionSubtractMinutes(factory);
@@ -113,6 +116,7 @@ void registerFunctionsDateTime(FunctionFactory & factory)
    registerFunctionSubtractDays(factory);
    registerFunctionSubtractWeeks(factory);
    registerFunctionSubtractMonths(factory);
    registerFunctionSubtractQuarters(factory);
    registerFunctionSubtractYears(factory);
    registerFunctionDateDiff(factory);
    registerFunctionToTimeZone(factory);


@@ -21,6 +21,8 @@ void registerFunctionSubstringUTF8(FunctionFactory &);
void registerFunctionAppendTrailingCharIfAbsent(FunctionFactory &);
void registerFunctionStartsWith(FunctionFactory &);
void registerFunctionEndsWith(FunctionFactory &);
void registerFunctionTrim(FunctionFactory &);
void registerFunctionRegexpQuoteMeta(FunctionFactory &);

#if USE_BASE64
void registerFunctionBase64Encode(FunctionFactory &);
@@ -46,6 +48,8 @@ void registerFunctionsString(FunctionFactory & factory)
    registerFunctionAppendTrailingCharIfAbsent(factory);
    registerFunctionStartsWith(factory);
    registerFunctionEndsWith(factory);
    registerFunctionTrim(factory);
    registerFunctionRegexpQuoteMeta(factory);
#if USE_BASE64
    registerFunctionBase64Encode(factory);
    registerFunctionBase64Decode(factory);


@@ -147,7 +147,7 @@ private:
void registerFunctionReverse(FunctionFactory & factory)
{
    factory.registerFunction<FunctionBuilderReverse>(FunctionFactory::CaseInsensitive);
}

}


@@ -0,0 +1,18 @@
#include <Functions/IFunction.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionDateOrDateTimeAddInterval.h>

namespace DB
{

using FunctionSubtractQuarters = FunctionDateOrDateTimeAddInterval<SubtractQuartersImpl>;

void registerFunctionSubtractQuarters(FunctionFactory & factory)
{
    factory.registerFunction<FunctionSubtractQuarters>();
}

}

dbms/src/Functions/trim.cpp (new file)

@@ -0,0 +1,146 @@
#include <Columns/ColumnString.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionStringToString.h>

#if __SSE4_2__
#include <nmmintrin.h>
#endif

namespace DB
{

namespace ErrorCodes
{
    extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}

struct TrimModeLeft
{
    static constexpr auto name = "trimLeft";
    static constexpr bool trim_left = true;
    static constexpr bool trim_right = false;
};

struct TrimModeRight
{
    static constexpr auto name = "trimRight";
    static constexpr bool trim_left = false;
    static constexpr bool trim_right = true;
};

struct TrimModeBoth
{
    static constexpr auto name = "trimBoth";
    static constexpr bool trim_left = true;
    static constexpr bool trim_right = true;
};

template <typename mode>
class FunctionTrimImpl
{
public:
    static void vector(
        const ColumnString::Chars & data,
        const ColumnString::Offsets & offsets,
        ColumnString::Chars & res_data,
        ColumnString::Offsets & res_offsets)
    {
        size_t size = offsets.size();
        res_offsets.resize(size);
        res_data.reserve(data.size());

        size_t prev_offset = 0;
        size_t res_offset = 0;

        const UInt8 * start;
        size_t length;

        for (size_t i = 0; i < size; ++i)
        {
            execute(reinterpret_cast<const UInt8 *>(&data[prev_offset]), offsets[i] - prev_offset - 1, start, length);

            res_data.resize(res_data.size() + length + 1);
            memcpy(&res_data[res_offset], start, length);
            res_offset += length + 1;
            res_data[res_offset - 1] = '\0';

            res_offsets[i] = res_offset;
            prev_offset = offsets[i];
        }
    }

    static void vector_fixed(const ColumnString::Chars &, size_t, ColumnString::Chars &)
    {
        throw Exception("Functions trimLeft, trimRight and trimBoth cannot work with FixedString argument", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
    }

private:
    static void execute(const UInt8 * data, size_t size, const UInt8 *& res_data, size_t & res_size)
    {
        size_t chars_to_trim_left = 0;
        size_t chars_to_trim_right = 0;
        char whitespace = ' ';

#if __SSE4_2__
        const auto bytes_sse = sizeof(__m128i);
        const auto size_sse = size - (size % bytes_sse);
        const auto whitespace_mask = _mm_set1_epi8(whitespace);
        constexpr auto base_sse_mode = _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH | _SIDD_NEGATIVE_POLARITY;
        auto mask = bytes_sse;
#endif

        if constexpr (mode::trim_left)
        {
#if __SSE4_2__
            /// skip whitespace from left in blocks of up to 16 characters
            /// Avoid gcc bug: _mm_cmpistri: error: the third argument must be an 8-bit immediate
            enum { left_sse_mode = base_sse_mode | _SIDD_LEAST_SIGNIFICANT };
            while (mask == bytes_sse && chars_to_trim_left < size_sse)
            {
                const auto chars = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + chars_to_trim_left));
                mask = _mm_cmpistri(whitespace_mask, chars, left_sse_mode);
                chars_to_trim_left += mask;
            }
#endif
            /// skip remaining whitespace from left, character by character
            while (chars_to_trim_left < size && data[chars_to_trim_left] == whitespace)
                ++chars_to_trim_left;
        }

        if constexpr (mode::trim_right)
        {
            const auto trim_right_size = size - chars_to_trim_left;
#if __SSE4_2__
            /// try to skip whitespace from right in blocks of up to 16 characters
            /// Avoid gcc bug: _mm_cmpistri: error: the third argument must be an 8-bit immediate
            enum { right_sse_mode = base_sse_mode | _SIDD_MOST_SIGNIFICANT };
            const auto trim_right_size_sse = trim_right_size - (trim_right_size % bytes_sse);
            while (mask == bytes_sse && chars_to_trim_right < trim_right_size_sse)
            {
                const auto chars = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + size - chars_to_trim_right - bytes_sse));
                mask = _mm_cmpistri(whitespace_mask, chars, right_sse_mode);
                chars_to_trim_right += mask;
            }
#endif
            /// skip remaining whitespace from right, character by character
            while (chars_to_trim_right < trim_right_size && data[size - chars_to_trim_right - 1] == whitespace)
                ++chars_to_trim_right;
        }

        res_data = data + chars_to_trim_left;
        res_size = size - chars_to_trim_left - chars_to_trim_right;
    }
};

using FunctionTrimLeft = FunctionStringToString<FunctionTrimImpl<TrimModeLeft>, TrimModeLeft>;
using FunctionTrimRight = FunctionStringToString<FunctionTrimImpl<TrimModeRight>, TrimModeRight>;
using FunctionTrimBoth = FunctionStringToString<FunctionTrimImpl<TrimModeBoth>, TrimModeBoth>;

void registerFunctionTrim(FunctionFactory & factory)
{
    factory.registerFunction<FunctionTrimLeft>();
    factory.registerFunction<FunctionTrimRight>();
    factory.registerFunction<FunctionTrimBoth>();
}

}
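The SSE4.2 intrinsics above only accelerate long runs of spaces; the scalar tail loops define the semantics: count leading and trailing ' ' bytes (only the space character, not general whitespace) and return the middle slice. A plain scalar sketch of the same contract, with hypothetical names:

#include <cstddef>
#include <iostream>
#include <string>

/// Scalar equivalent of FunctionTrimImpl::execute above: compute how many
/// leading and trailing ' ' bytes to drop and return the middle slice.
/// Only the space character is trimmed, matching the code above.
std::string trimBothSketch(const std::string & s)
{
    size_t left = 0;
    while (left < s.size() && s[left] == ' ')
        ++left;

    size_t right = 0;
    while (right < s.size() - left && s[s.size() - right - 1] == ' ')
        ++right;

    return s.substr(left, s.size() - left - right);
}

int main()
{
    std::cout << '[' << trimBothSketch("  hello world  ") << "]\n"; /// [hello world]
    std::cout << '[' << trimBothSketch("   ") << "]\n";             /// []
}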


@@ -65,14 +65,11 @@ public:
        Columns tuple_columns(tuple_size);
        for (size_t i = 0; i < tuple_size; ++i)
        {
            /** If tuple is mixed of constant and not constant columns,
              * convert all to non-constant columns,
              * because many places in code expect all non-constant columns in non-constant tuple.
              */
            tuple_columns[i] = block.getByPosition(arguments[i]).column->convertToFullColumnIfConst();
        }
        block.getByPosition(result).column = ColumnTuple::create(tuple_columns);
    }


@@ -76,7 +76,7 @@ public:
    {
        nextIfAtEnd();
        size_t bytes_to_copy = std::min(static_cast<size_t>(working_buffer.end() - pos), n - bytes_copied);
        memcpy(pos, from + bytes_copied, bytes_to_copy);
        pos += bytes_to_copy;
        bytes_copied += bytes_to_copy;
    }


@@ -86,6 +86,25 @@ ReturnType parseDateTimeBestEffortImpl(time_t & res, ReadBuffer & in, const Date
    bool is_pm = false;

    auto read_alpha_month = [&month] (const auto & alpha)
    {
        if (0 == strncasecmp(alpha, "Jan", 3)) month = 1;
        else if (0 == strncasecmp(alpha, "Feb", 3)) month = 2;
        else if (0 == strncasecmp(alpha, "Mar", 3)) month = 3;
        else if (0 == strncasecmp(alpha, "Apr", 3)) month = 4;
        else if (0 == strncasecmp(alpha, "May", 3)) month = 5;
        else if (0 == strncasecmp(alpha, "Jun", 3)) month = 6;
        else if (0 == strncasecmp(alpha, "Jul", 3)) month = 7;
        else if (0 == strncasecmp(alpha, "Aug", 3)) month = 8;
        else if (0 == strncasecmp(alpha, "Sep", 3)) month = 9;
        else if (0 == strncasecmp(alpha, "Oct", 3)) month = 10;
        else if (0 == strncasecmp(alpha, "Nov", 3)) month = 11;
        else if (0 == strncasecmp(alpha, "Dec", 3)) month = 12;
        else
            return false;

        return true;
    };

    while (!in.eof())
    {
        char digits[14];
@@ -205,6 +224,10 @@ ReturnType parseDateTimeBestEffortImpl(time_t & res, ReadBuffer & in, const Date
            /// hh - only if already have day of month
            /// DD/MM/YYYY
            /// DD/MM/YY
            /// DD.MM.YYYY
            /// DD.MM.YY
            /// DD-MM-YYYY
            /// DD-MM-YY
            /// DD

            UInt8 hour_or_day_of_month = 0;
@@ -244,7 +267,7 @@ ReturnType parseDateTimeBestEffortImpl(time_t & res, ReadBuffer & in, const Date
                    return on_error("Cannot read DateTime: unexpected number of decimal digits after hour and minute: " + toString(num_digits), ErrorCodes::CANNOT_PARSE_DATETIME);
                }
            }
            else if (checkChar('/', in) || checkChar('.', in) || checkChar('-', in))
            {
                if (day_of_month)
                    return on_error("Cannot read DateTime: day of month is duplicated", ErrorCodes::CANNOT_PARSE_DATETIME);
@@ -260,10 +283,23 @@ ReturnType parseDateTimeBestEffortImpl(time_t & res, ReadBuffer & in, const Date
                    readDecimalNumber<2>(month, digits);
                else if (num_digits == 1)
                    readDecimalNumber<1>(month, digits);
                else if (num_digits == 0)
                {
                    /// Month in alphabetical form
                    char alpha[9]; /// The longest month name: September
                    size_t num_alpha = readAlpha(alpha, sizeof(alpha), in);

                    if (num_alpha < 3)
                        return on_error("Cannot read DateTime: unexpected number of alphabetical characters after day of month: " + toString(num_alpha), ErrorCodes::CANNOT_PARSE_DATETIME);
                    if (!read_alpha_month(alpha))
                        return on_error("Cannot read DateTime: alphabetical characters after day of month don't look like month: " + std::string(alpha, 3), ErrorCodes::CANNOT_PARSE_DATETIME);
                }
                else
                    return on_error("Cannot read DateTime: unexpected number of decimal digits after day of month: " + toString(num_digits), ErrorCodes::CANNOT_PARSE_DATETIME);

                if (checkChar('/', in) || checkChar('.', in) || checkChar('-', in))
                {
                    if (year)
                        return on_error("Cannot read DateTime: year component is duplicated", ErrorCodes::CANNOT_PARSE_DATETIME);
@@ -401,19 +437,9 @@ ReturnType parseDateTimeBestEffortImpl(time_t & res, ReadBuffer & in, const Date
        {
            bool has_day_of_week = false;

            if (read_alpha_month(alpha))
            {
            }
            else if (0 == strncasecmp(alpha, "UTC", 3)) has_time_zone_offset = true;
            else if (0 == strncasecmp(alpha, "GMT", 3)) has_time_zone_offset = true;
            else if (0 == strncasecmp(alpha, "MSK", 3)) { has_time_zone_offset = true; time_zone_offset_hour = 3; }


@@ -34,7 +34,7 @@ class ReadBuffer;
  * YYYYMM - 6 digits is a year, month if year was not already read
  * hhmmss - 6 digits is a time if year was already read
  *
  * .nnnnnnn - any number of digits after point is fractional part of second, if it is not YYYY.MM.DD or DD.MM.YYYY
  *
  * T - means that time will follow
  *


@@ -615,7 +615,6 @@ void NO_INLINE Aggregator::executeImplCase(
    AggregateDataPtr overflow_row) const
{
    /// NOTE When editing this code, also pay attention to SpecializedAggregator.h.

    /// For all rows.
    typename Method::Key prev_key;
@@ -773,13 +772,8 @@ bool Aggregator::executeOnBlock(const Block & block, AggregatedDataVariants & re
    /// Remember the columns we will work with
    for (size_t i = 0; i < params.keys_size; ++i)
    {
        materialized_columns.push_back(block.safeGetByPosition(params.keys[i]).column->convertToFullColumnIfConst());
        key_columns[i] = materialized_columns.back().get();

        if (const auto * low_cardinality_column = typeid_cast<const ColumnLowCardinality *>(key_columns[i]))
        {
@@ -798,13 +792,8 @@ bool Aggregator::executeOnBlock(const Block & block, AggregatedDataVariants & re
    {
        for (size_t j = 0; j < aggregate_columns[i].size(); ++j)
        {
            materialized_columns.push_back(block.safeGetByPosition(params.aggregates[i].arguments[j]).column->convertToFullColumnIfConst());
            aggregate_columns[i][j] = materialized_columns.back().get();

            if (auto * col_low_cardinality = typeid_cast<const ColumnLowCardinality *>(aggregate_columns[i][j]))
            {


@@ -331,6 +331,9 @@ private:
        auto result = ColumnFloat64::create(column_size);
        auto result_buf = result->getData().data();

        if (!column_size)
            return result;

        /// Prepare float features.
        PODArray<const float *> float_features(column_size);
        auto float_features_buf = float_features.data();


@@ -397,14 +397,24 @@ void Cluster::initMisc()

std::unique_ptr<Cluster> Cluster::getClusterWithSingleShard(size_t index) const
{
    return std::unique_ptr<Cluster>{ new Cluster(*this, {index}) };
}

std::unique_ptr<Cluster> Cluster::getClusterWithMultipleShards(const std::vector<size_t> & indices) const
{
    return std::unique_ptr<Cluster>{ new Cluster(*this, indices) };
}

Cluster::Cluster(const Cluster & from, const std::vector<size_t> & indices)
    : shards_info{}
{
    for (size_t index : indices)
    {
        shards_info.emplace_back(from.shards_info.at(index));

        if (!from.addresses_with_failover.empty())
            addresses_with_failover.emplace_back(from.addresses_with_failover.at(index));
    }

    initMisc();
}
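The constructor now copies an arbitrary set of shards instead of exactly one, switching to bounds-checked at() along the way; presumably this is what the optimize_skip_unused_shards setting added further down builds on. A reduced sketch of the selection step (ShardInfo is a toy stand-in for Cluster::ShardInfo):

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct ShardInfo { std::string name; };

/// Copy the shards named by `indices`, preserving their order.
std::vector<ShardInfo> selectShards(const std::vector<ShardInfo> & from, const std::vector<size_t> & indices)
{
    std::vector<ShardInfo> result;
    for (size_t index : indices)
        result.emplace_back(from.at(index)); /// throws std::out_of_range on a bad index
    return result;
}

int main()
{
    std::vector<ShardInfo> cluster{{"shard0"}, {"shard1"}, {"shard2"}};
    for (const auto & shard : selectShards(cluster, {0, 2}))
        std::cout << shard.name << '\n'; /// shard0, shard2
}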


@@ -143,6 +143,9 @@ public:
    /// Get a subcluster consisting of one shard - index by count (from 0) of the shard of this cluster.
    std::unique_ptr<Cluster> getClusterWithSingleShard(size_t index) const;

    /// Get a subcluster consisting of one or multiple shards - indexes by count (from 0) of the shard of this cluster.
    std::unique_ptr<Cluster> getClusterWithMultipleShards(const std::vector<size_t> & indices) const;

private:
    using SlotToShard = std::vector<UInt64>;
    SlotToShard slot_to_shard;
@@ -153,8 +156,8 @@ public:
private:
    void initMisc();

    /// For getClusterWithMultipleShards implementation.
    Cluster(const Cluster & from, const std::vector<size_t> & indices);

    String hash_of_addresses;
    /// Description of the cluster shards.


@@ -204,7 +204,6 @@ static bool isSupportedAlterType(int type)
    ASTAlterCommand::ADD_COLUMN,
    ASTAlterCommand::DROP_COLUMN,
    ASTAlterCommand::MODIFY_COLUMN,
    ASTAlterCommand::DROP_PARTITION,
    ASTAlterCommand::DELETE,
    ASTAlterCommand::UPDATE,
@@ -692,17 +691,31 @@ void DDLWorker::processTaskAlter(
    auto lock = createSimpleZooKeeperLock(zookeeper, shard_path, "lock", task.host_id_str);

    pcg64 rng(randomSeed());

    auto is_already_executed = [&]() -> bool
    {
        String executed_by;
        if (zookeeper->tryGet(is_executed_path, executed_by))
        {
            is_executed_by_any_replica = true;
            LOG_DEBUG(log, "Task " << task.entry_name << " has already been executed by another replica ("
                << executed_by << ") of the same shard.");
            return true;
        }
        return false;
    };

    static const size_t max_tries = 20;
    for (size_t num_tries = 0; num_tries < max_tries; ++num_tries)
    {
        if (is_already_executed())
            break;

        if (lock->tryLock())
        {
            if (is_already_executed())
                break;

            tryExecuteQuery(rewritten_query, task, task.execution_status);

            if (execute_on_leader_replica && task.execution_status.code == ErrorCodes::NOT_IMPLEMENTED)
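The lambda turns the loop into a classic double-checked pattern: a cheap check before taking the distributed lock, and a mandatory re-check after winning it, because another replica may have executed the task in between. A thread-based sketch of the same pattern (std::mutex and an atomic flag standing in for the ZooKeeper lock and the is_executed node):

#include <atomic>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::mutex zk_lock;                /// stands in for the ZooKeeper lock
std::atomic<bool> executed{false}; /// stands in for the is_executed node
int executions = 0;                /// how many replicas actually ran the task

void tryProcessTask()
{
    if (executed.load()) /// cheap first check, no lock taken
        return;

    std::lock_guard<std::mutex> guard(zk_lock);
    if (executed.load()) /// re-check after winning the lock
        return;

    ++executions; /// "execute" the task exactly once
    executed.store(true);
}

int main()
{
    std::vector<std::thread> replicas;
    for (int i = 0; i < 8; ++i)
        replicas.emplace_back(tryProcessTask);
    for (auto & t : replicas)
        t.join();
    std::cout << executions << '\n'; /// always prints 1
}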


@@ -9,6 +9,8 @@
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Interpreters/ExecuteScalarSubqueriesVisitor.h>

#include <DataTypes/DataTypeAggregateFunction.h>

namespace DB
{
@@ -98,6 +100,11 @@ void ExecuteScalarSubqueriesMatcher::visit(const ASTSubquery & subquery, ASTPtr
    size_t columns = block.columns();
    if (columns == 1)
    {
        if (typeid_cast<const DataTypeAggregateFunction *>(block.safeGetByPosition(0).type.get()))
        {
            throw Exception("Scalar subquery can't return an aggregate function state", ErrorCodes::INCORRECT_RESULT_OF_SCALAR_SUBQUERY);
        }

        auto lit = std::make_unique<ASTLiteral>((*block.safeGetByPosition(0).column)[0]);
        lit->alias = subquery.alias;
        lit->prefer_alias_to_column_name = subquery.prefer_alias_to_column_name;
@@ -116,6 +123,11 @@ void ExecuteScalarSubqueriesMatcher::visit(const ASTSubquery & subquery, ASTPtr
    exp_list->children.resize(columns);
    for (size_t i = 0; i < columns; ++i)
    {
        if (typeid_cast<const DataTypeAggregateFunction *>(block.safeGetByPosition(i).type.get()))
        {
            throw Exception("Scalar subquery can't return an aggregate function state", ErrorCodes::INCORRECT_RESULT_OF_SCALAR_SUBQUERY);
        }

        exp_list->children[i] = addTypeConversion(
            std::make_unique<ASTLiteral>((*block.safeGetByPosition(i).column)[0]),
            block.safeGetByPosition(i).type->getName());


@@ -375,10 +375,7 @@ void ExpressionAction::execute(Block & block, bool dry_run) const
        if (array_joined_columns.empty())
            throw Exception("No arrays to join", ErrorCodes::LOGICAL_ERROR);

        ColumnPtr any_array_ptr = block.getByName(*array_joined_columns.begin()).column->convertToFullColumnIfConst();
        const ColumnArray * any_array = typeid_cast<const ColumnArray *>(&*any_array_ptr);
        if (!any_array)
            throw Exception("ARRAY JOIN of not array: " + *array_joined_columns.begin(), ErrorCodes::TYPE_MISMATCH);
@@ -416,10 +413,10 @@ void ExpressionAction::execute(Block & block, bool dry_run) const
                Block tmp_block{src_col, column_of_max_length, {{}, src_col.type, {}}};
                function_arrayResize->build({src_col, column_of_max_length})->execute(tmp_block, {0, 1}, 2, rows);
                src_col.column = tmp_block.safeGetByPosition(2).column;
                any_array_ptr = src_col.column->convertToFullColumnIfConst();
            }

            any_array = typeid_cast<const ColumnArray *>(&*any_array_ptr);
        }
        else if (array_join_is_left && !unaligned_array_join)
@@ -434,10 +431,7 @@ void ExpressionAction::execute(Block & block, bool dry_run) const
                non_empty_array_columns[name] = tmp_block.safeGetByPosition(1).column;
            }

            any_array_ptr = non_empty_array_columns.begin()->second->convertToFullColumnIfConst();
            any_array = &typeid_cast<const ColumnArray &>(*any_array_ptr);
        }
@@ -452,9 +446,7 @@ void ExpressionAction::execute(Block & block, bool dry_run) const
                    throw Exception("ARRAY JOIN of not array: " + current.name, ErrorCodes::TYPE_MISMATCH);

                ColumnPtr array_ptr = (array_join_is_left && !unaligned_array_join) ? non_empty_array_columns[current.name] : current.column;
                array_ptr = array_ptr->convertToFullColumnIfConst();

                const ColumnArray & array = typeid_cast<const ColumnArray &>(*array_ptr);
                if (!unaligned_array_join && !array.hasEqualOffsets(typeid_cast<const ColumnArray &>(*any_array_ptr)))


@@ -143,15 +143,17 @@ void ExpressionAnalyzer::analyzeAggregation()

    ExpressionActionsPtr temp_actions = std::make_shared<ExpressionActions>(source_columns, context);

    if (select_query)
    {
        bool is_array_join_left;
        ASTPtr array_join_expression_list = select_query->array_join_expression_list(is_array_join_left);
        if (array_join_expression_list)
        {
            getRootActions(array_join_expression_list, true, temp_actions);
            addMultipleArrayJoinAction(temp_actions, is_array_join_left);

            array_join_columns = temp_actions->getSampleBlock().getNamesAndTypesList();
        }

        const ASTTablesInSelectQueryElement * join = select_query->join();
        if (join)
        {
@@ -512,7 +514,7 @@ void ExpressionAnalyzer::initChain(ExpressionActionsChain & chain, const NamesAn
}

/// "Big" ARRAY JOIN.
void ExpressionAnalyzer::addMultipleArrayJoinAction(ExpressionActionsPtr & actions, bool array_join_is_left) const
{
    NameSet result_columns;
    for (const auto & result_source : syntax->array_join_result_to_source)
@@ -525,22 +527,24 @@ void ExpressionAnalyzer::addMultipleArrayJoinAction(ExpressionActionsPtr & actio
        result_columns.insert(result_source.first);
    }

    actions->add(ExpressionAction::arrayJoin(result_columns, array_join_is_left, context));
}

bool ExpressionAnalyzer::appendArrayJoin(ExpressionActionsChain & chain, bool only_types)
{
    assertSelect();

    bool is_array_join_left;
    ASTPtr array_join_expression_list = select_query->array_join_expression_list(is_array_join_left);
    if (!array_join_expression_list)
        return false;

    initChain(chain, source_columns);
    ExpressionActionsChain::Step & step = chain.steps.back();

    getRootActions(array_join_expression_list, only_types, step.actions);

    addMultipleArrayJoinAction(step.actions, is_array_join_left);

    return true;
}


@@ -240,7 +240,7 @@ private:
    /// Find global subqueries in the GLOBAL IN/JOIN sections. Fills in external_tables.
    void initGlobalSubqueriesAndExternalTables();

    void addMultipleArrayJoinAction(ExpressionActionsPtr & actions, bool is_left) const;

    void addJoinAction(ExpressionActionsPtr & actions, bool only_types) const;
@@ -269,7 +269,7 @@ private:
    void assertAggregation() const;

    /**
      * Create Set from a subquery or a table expression in the query. The created set is suitable for using the index.
      * The set will not be created if its size hits the limit.
      */
    void tryMakeSetForIndexFromSubquery(const ASTPtr & subquery_or_table_name);


@@ -221,8 +221,11 @@ BlockIO InterpreterKillQueryQuery::execute()

Block InterpreterKillQueryQuery::getSelectFromSystemProcessesResult()
{
    String system_processes_query = "SELECT query_id, user, query FROM system.processes";
    auto & where_expression = static_cast<ASTKillQueryQuery &>(*query_ptr).where_expression;
    if (where_expression)
        system_processes_query += " WHERE " + queryToString(where_expression);

    BlockIO system_processes_io = executeQuery(system_processes_query, context, true);
    Block res = system_processes_io.in->read();


@@ -437,14 +437,8 @@ bool Join::insertFromBlock(const Block & block)
    /// Memoize key columns to work with.
    for (size_t i = 0; i < keys_size; ++i)
    {
        materialized_columns.emplace_back(recursiveRemoveLowCardinality(block.getByName(key_names_right[i]).column->convertToFullColumnIfConst()));
        key_columns[i] = materialized_columns.back().get();
    }

    /// We will insert to the map only keys, where all components are not NULL.
@@ -483,11 +477,7 @@ bool Join::insertFromBlock(const Block & block)
        /// Rare case, when joined columns are constant. To avoid code bloat, simply materialize them.
        for (size_t i = 0; i < size; ++i)
            stored_block->safeGetByPosition(i).column = stored_block->safeGetByPosition(i).column->convertToFullColumnIfConst();

        /// In case of LEFT and FULL joins, if use_nulls, convert joined columns to Nullable.
        if (use_nulls && (kind == ASTTableJoin::Kind::Left || kind == ASTTableJoin::Kind::Full))
@@ -685,14 +675,8 @@ void Join::joinBlockImpl(
    /// Memoize key columns to work with.
    for (size_t i = 0; i < keys_size; ++i)
    {
        materialized_columns.emplace_back(recursiveRemoveLowCardinality(block.getByName(key_names_left[i]).column->convertToFullColumnIfConst()));
        key_columns[i] = materialized_columns.back().get();
    }

    /// Keys with NULL value in any column won't join to anything.
@@ -710,10 +694,7 @@ void Join::joinBlockImpl(
    {
        for (size_t i = 0; i < existing_columns; ++i)
        {
            block.getByPosition(i).column = block.getByPosition(i).column->convertToFullColumnIfConst();

            /// If use_nulls, convert left columns (except keys) to Nullable.
            if (use_nulls)


@@ -8,6 +8,7 @@
#include <Parsers/ASTTablesInSelectQuery.h>
#include <DataTypes/NestedUtils.h>
#include <Common/typeid_cast.h>
#include "InDepthNodeVisitor.h"

namespace DB
{


@@ -121,15 +121,10 @@ void Set::setHeader(const Block & block)
    /// Remember the columns we will work with
    for (size_t i = 0; i < keys_size; ++i)
    {
        materialized_columns.emplace_back(block.safeGetByPosition(i).column->convertToFullColumnIfConst());
        key_columns.emplace_back(materialized_columns.back().get());
        data_types.emplace_back(block.safeGetByPosition(i).type);

        /// Convert low cardinality column to full.
        if (auto * low_cardinality_type = typeid_cast<const DataTypeLowCardinality *>(data_types.back().get()))
        {
@@ -175,20 +170,8 @@ bool Set::insertFromBlock(const Block & block)
    /// Remember the columns we will work with
    for (size_t i = 0; i < keys_size; ++i)
    {
        materialized_columns.emplace_back(block.safeGetByPosition(i).column->convertToFullColumnIfConst()->convertToFullColumnIfLowCardinality());
        key_columns.emplace_back(materialized_columns.back().get());
    }

    size_t rows = block.rows();
@@ -365,18 +348,13 @@ ColumnPtr Set::execute(const Block & block, bool negative) const
    for (size_t i = 0; i < num_key_columns; ++i)
    {
        if (!removeNullable(data_types[i])->equals(*removeNullable(block.safeGetByPosition(i).type)))
            throw Exception("Types of column " + toString(i + 1) + " in section IN don't match: "
                + data_types[i]->getName() + " on the right, " + block.safeGetByPosition(i).type->getName() +
                " on the left.", ErrorCodes::TYPE_MISMATCH);

        materialized_columns.emplace_back(block.safeGetByPosition(i).column->convertToFullColumnIfConst());
        key_columns.emplace_back() = materialized_columns.back().get();
    }

    /// We will check existence in Set only for keys, where all components are not NULL.
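All of these call sites share one subtlety: key_columns holds raw pointers, so every materialized column must stay owned somewhere for as long as the pointers are used. Pushing into materialized_columns first and only then taking back().get() is the idiom. A reduced sketch with toy types (illustrative names only, not real column classes):

#include <iostream>
#include <memory>
#include <string>
#include <vector>

using ColumnPtr = std::shared_ptr<std::string>; /// toy stand-in for a column

/// Pretend conversion: returns a new object for "const" columns,
/// the input itself otherwise (the contract used throughout this commit).
ColumnPtr convertToFullColumnIfConst(const ColumnPtr & col)
{
    if (col->rfind("Const:", 0) == 0)
        return std::make_shared<std::string>(col->substr(6));
    return col;
}

int main()
{
    std::vector<ColumnPtr> block{std::make_shared<std::string>("Const:42"),
                                 std::make_shared<std::string>("plain")};

    std::vector<ColumnPtr> materialized_columns;  /// owners: keep results alive
    std::vector<const std::string *> key_columns; /// raw pointers into the owners

    for (const auto & col : block)
    {
        materialized_columns.emplace_back(convertToFullColumnIfConst(col));
        key_columns.emplace_back(materialized_columns.back().get());
    }

    for (const auto * key : key_columns)
        std::cout << *key << '\n'; /// 42, plain
}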


@@ -89,6 +89,7 @@ struct Settings
    M(SettingBool, skip_unavailable_shards, false, "Silently skip unavailable shards.") \
    \
    M(SettingBool, distributed_group_by_no_merge, false, "Do not merge aggregation states from different servers for distributed query processing - in case it is for certain that there are different keys on different shards.") \
    M(SettingBool, optimize_skip_unused_shards, false, "Assumes that data is distributed by sharding_key. Optimization to skip unused shards if SELECT query filters by sharding_key.") \
    \
    M(SettingUInt64, merge_tree_min_rows_for_concurrent_read, (20 * 8192), "If at least as many lines are read from one file, the reading can be parallelized.") \
    M(SettingUInt64, merge_tree_min_rows_for_seek, 0, "You can skip reading more than that number of rows at the price of one seek per file.") \
@@ -277,7 +278,7 @@ struct Settings
    M(SettingBool, log_profile_events, true, "Log query performance statistics into the query_log and query_thread_log.") \
    M(SettingBool, log_query_settings, true, "Log query settings into the query_log.") \
    M(SettingBool, log_query_threads, true, "Log query threads into system.query_thread_log table. This setting has an effect only when 'log_queries' is true.") \
    M(SettingLogsLevel, send_logs_level, "none", "Send server text logs with specified minimum level to client. Valid values: 'trace', 'debug', 'information', 'warning', 'error', 'none'") \
    M(SettingBool, enable_optimize_predicate_expression, 0, "If it is set to true, optimize predicates to subqueries.") \
    \
    M(SettingUInt64, low_cardinality_max_dictionary_size, 8192, "Maximum size (in rows) of shared global dictionary for LowCardinality type.") \
@@ -294,7 +295,7 @@ struct Settings
    M(SettingBool, parallel_view_processing, false, "Enables pushing to attached views concurrently instead of sequentially.") \
    M(SettingBool, enable_debug_queries, false, "Enables debug queries such as AST.") \
    M(SettingBool, enable_unaligned_array_join, false, "Allow ARRAY JOIN with multiple arrays that have different sizes. When this setting is enabled, arrays will be resized to the longest one.") \
    M(SettingBool, low_cardinality_allow_in_native_format, true, "Use LowCardinality type in Native format. Otherwise, convert LowCardinality columns to ordinary for select query, and convert ordinary columns to required LowCardinality for insert query.") \

#define DECLARE(TYPE, NAME, DEFAULT, DESCRIPTION) \
    TYPE NAME {DEFAULT};


@@ -23,6 +23,7 @@ namespace ErrorCodes
    extern const int UNKNOWN_DISTRIBUTED_PRODUCT_MODE;
    extern const int UNKNOWN_GLOBAL_SUBQUERIES_METHOD;
    extern const int UNKNOWN_JOIN_STRICTNESS;
+   extern const int UNKNOWN_LOG_LEVEL;
    extern const int SIZE_OF_FIXED_STRING_DOESNT_MATCH;
    extern const int BAD_ARGUMENTS;
}
@@ -674,4 +675,58 @@ void SettingDateTimeInputFormat::write(WriteBuffer & buf) const
    writeBinary(toString(), buf);
}
+ const std::vector<String> SettingLogsLevel::log_levels =
+ {
+     "none",
+     "trace",
+     "debug",
+     "information",
+     "warning",
+     "error"
+ };
+
+ SettingLogsLevel::SettingLogsLevel(const String & level)
+ {
+     set(level);
+ }
+
+ void SettingLogsLevel::set(const String & level)
+ {
+     auto it = std::find(log_levels.begin(), log_levels.end(), level);
+     if (it == log_levels.end())
+         throw Exception("Log level '" + level + "' not allowed.", ErrorCodes::UNKNOWN_LOG_LEVEL);
+
+     value = *it;
+     changed = true;
+ }
+
+ void SettingLogsLevel::set(const Field & level)
+ {
+     set(safeGet<String>(level));
+ }
+
+ void SettingLogsLevel::set(ReadBuffer & buf)
+ {
+     String x;
+     readBinary(x, buf);
+     set(x);
+ }
+
+ String SettingLogsLevel::toString() const
+ {
+     return value;
+ }
+
+ void SettingLogsLevel::write(WriteBuffer & buf) const
+ {
+     writeBinary(toString(), buf);
+ }

}
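
The effect of switching `send_logs_level` from `SettingString` to this type is strict validation on assignment. A short usage sketch, assuming the class defined above is in scope:

```cpp
// Usage sketch; SettingLogsLevel, Exception and ErrorCodes::UNKNOWN_LOG_LEVEL
// are the names from the diff above, the surrounding scaffolding is assumed.
SettingLogsLevel level("none");    // the constructor funnels through set()
level.set(String("debug"));        // accepted: "debug" is in log_levels

try
{
    level.set(String("verbose"));  // rejected: not in the whitelist
}
catch (const Exception &)
{
    // "Log level 'verbose' not allowed." (ErrorCodes::UNKNOWN_LOG_LEVEL),
    // so a typo in SET send_logs_level = '...' now fails fast instead of
    // being stored silently, as the old SettingString would have done.
}
```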
View File
@@ -404,4 +404,25 @@ struct SettingDateTimeInputFormat
    void write(WriteBuffer & buf) const;
};

+ class SettingLogsLevel
+ {
+ public:
+     String value;
+     bool changed = false;
+
+     static const std::vector<String> log_levels;
+
+     SettingLogsLevel(const String & level);
+
+     operator String() const { return value; }
+
+     void set(const String & level);
+     void set(const Field & level);
+     void set(ReadBuffer & buf);
+
+     String toString() const;
+     void write(WriteBuffer & buf) const;
+ };

}
View File
@@ -108,6 +108,9 @@ void NO_INLINE Aggregator::executeSpecialized(
    AggregateDataPtr overflow_row) const
{
    typename Method::State state;
+     if constexpr (Method::low_cardinality_optimization)
+         state.init(key_columns, aggregation_state_cache);
+     else
        state.init(key_columns);

    if (!no_more_keys)
@@ -133,15 +136,19 @@ void NO_INLINE Aggregator::executeSpecializedCase(
    AggregateDataPtr overflow_row) const
{
    /// For all rows.
-     typename Method::iterator it;
    typename Method::Key prev_key;
+     AggregateDataPtr value = nullptr;
    for (size_t i = 0; i < rows; ++i)
    {
-         bool inserted;          /// Inserted a new key, or was this key already?
+         bool inserted = false;  /// Inserted a new key, or was this key already?
-         bool overflow = false;  /// New key did not fit in the hash table because of no_more_keys.

        /// Get the key to insert into the hash table.
-         typename Method::Key key = state.getKey(key_columns, params.keys_size, i, key_sizes, keys, *aggregates_pool);
+         typename Method::Key key;
+         if constexpr (!Method::low_cardinality_optimization)
+             key = state.getKey(key_columns, params.keys_size, i, key_sizes, keys, *aggregates_pool);

+         AggregateDataPtr * aggregate_data = nullptr;
+         typename Method::iterator it;   /// Is not used if Method::low_cardinality_optimization

        if (!no_more_keys) /// Insert.
        {
@@ -150,8 +157,6 @@ void NO_INLINE Aggregator::executeSpecializedCase(
            {
                if (i != 0 && key == prev_key)
                {
-                     AggregateDataPtr value = Method::getAggregateData(it->second);

                    /// Add values into aggregate functions.
                    AggregateFunctionsList::forEach(AggregateFunctionsUpdater(
                        aggregate_functions, offsets_of_aggregate_states, aggregate_columns, value, i, aggregates_pool));
@@ -163,19 +168,29 @@ void NO_INLINE Aggregator::executeSpecializedCase(
                prev_key = key;
            }

+             if constexpr (Method::low_cardinality_optimization)
+                 aggregate_data = state.emplaceKeyFromRow(method.data, i, inserted, params.keys_size, keys, *aggregates_pool);
+             else
+             {
                method.data.emplace(key, it, inserted);
+                 aggregate_data = &Method::getAggregateData(it->second);
+             }
        }
        else
        {
            /// Add only if the key already exists.
-             inserted = false;
+             if constexpr (Method::low_cardinality_optimization)
+                 aggregate_data = state.findFromRow(method.data, i);
+             else
+             {
                it = method.data.find(key);
-                 if (method.data.end() == it)
-                     overflow = true;
+                 if (method.data.end() != it)
+                     aggregate_data = &Method::getAggregateData(it->second);
+             }
        }

        /// If the key does not fit, and the data does not need to be aggregated in a separate row, then there's nothing to do.
-         if (no_more_keys && overflow && !overflow_row)
+         if (!aggregate_data && !overflow_row)
        {
            method.onExistingKey(key, keys, *aggregates_pool);
            continue;
@@ -184,9 +199,9 @@ void NO_INLINE Aggregator::executeSpecializedCase(
        /// If a new key is inserted, initialize the states of the aggregate functions, and possibly some stuff related to the key.
        if (inserted)
        {
-             AggregateDataPtr & aggregate_data = Method::getAggregateData(it->second);
-             aggregate_data = nullptr;
+             *aggregate_data = nullptr;

+             if constexpr (!Method::low_cardinality_optimization)
                method.onNewKey(*it, params.keys_size, keys, *aggregates_pool);

            AggregateDataPtr place = aggregates_pool->alignedAlloc(total_size_of_aggregate_states, align_aggregate_states);
@@ -194,12 +209,15 @@ void NO_INLINE Aggregator::executeSpecializedCase(
            AggregateFunctionsList::forEach(AggregateFunctionsCreator(
                aggregate_functions, offsets_of_aggregate_states, place));

-             aggregate_data = place;
+             *aggregate_data = place;

+             if constexpr (Method::low_cardinality_optimization)
+                 state.cacheAggregateData(i, place);
        }
        else
            method.onExistingKey(key, keys, *aggregates_pool);

-         AggregateDataPtr value = (!no_more_keys || !overflow) ? Method::getAggregateData(it->second) : overflow_row;
+         value = aggregate_data ? *aggregate_data : overflow_row;

        /// Add values into the aggregate functions.
        AggregateFunctionsList::forEach(AggregateFunctionsUpdater(
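
The `low_cardinality_optimization` branches above exist because, for a LowCardinality key, every row carries a dictionary position rather than a full key, and all rows with the same position hit the same hash-table slot; `emplaceKeyFromRow` and `cacheAggregateData` let the state remember that slot per position instead of re-probing the table for each row. A simplified, self-contained model of the caching idea (illustrative types, not the actual ClickHouse implementation):

```cpp
// Illustrative model of the per-dictionary-position cache; the types and
// names are simplified stand-ins, not the actual ClickHouse implementation.
#include <cstddef>
#include <unordered_map>
#include <vector>

using AggregateDataPtr = char *;

struct LowCardinalityKeyState
{
    const std::vector<size_t> * positions = nullptr;  /// row index -> dictionary position
    std::vector<AggregateDataPtr *> cache;            /// dictionary position -> hash table slot

    void init(const std::vector<size_t> & pos, size_t dictionary_size)
    {
        positions = &pos;
        cache.assign(dictionary_size, nullptr);
    }

    /// Analogue of emplaceKeyFromRow(): probe the hash table at most once
    /// per distinct dictionary position, then serve repeats from the cache.
    AggregateDataPtr * emplaceKeyFromRow(
        std::unordered_map<size_t, AggregateDataPtr> & map, size_t row, bool & inserted)
    {
        size_t pos = (*positions)[row];
        if (cache[pos])
        {
            inserted = false;
            return cache[pos];
        }
        auto [it, is_new] = map.try_emplace(pos, nullptr);
        inserted = is_new;
        /// Value references in unordered_map stay valid across rehashes,
        /// so caching a pointer to the mapped value is safe here.
        return cache[pos] = &it->second;
    }
};
```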
View File
@@ -676,9 +676,13 @@ void optimizeUsing(const ASTSelectQuery * select_query)
void getArrayJoinedColumns(ASTPtr & query, SyntaxAnalyzerResult & result, const ASTSelectQuery * select_query,
                           const Names & source_columns, const NameSet & source_columns_set)
{
-     if (select_query && select_query->array_join_expression_list())
+     if (!select_query)
+         return;
+
+     ASTPtr array_join_expression_list = select_query->array_join_expression_list();
+     if (array_join_expression_list)
    {
-         ASTs & array_join_asts = select_query->array_join_expression_list()->children;
+         ASTs & array_join_asts = array_join_expression_list->children;
        for (const auto & ast : array_join_asts)
        {
            const String nested_table_name = ast->getColumnName();
View File
@@ -1,18 +1,20 @@
- #include <Core/Block.h>
+ #include <Interpreters/evaluateConstantExpression.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnsNumber.h>
- #include <Parsers/ASTIdentifier.h>
- #include <Parsers/ASTLiteral.h>
- #include <Parsers/ASTFunction.h>
- #include <Parsers/ExpressionElementParsers.h>
+ #include <Core/Block.h>
#include <DataTypes/DataTypesNumber.h>
#include <Interpreters/Context.h>
- #include <Interpreters/SyntaxAnalyzer.h>
- #include <Interpreters/ExpressionAnalyzer.h>
+ #include <Interpreters/convertFieldToType.h>
#include <Interpreters/ExpressionActions.h>
- #include <Interpreters/evaluateConstantExpression.h>
- #include <Common/typeid_cast.h>
+ #include <Interpreters/ExpressionAnalyzer.h>
+ #include <Interpreters/SyntaxAnalyzer.h>
+ #include <Parsers/ASTFunction.h>
+ #include <Parsers/ASTIdentifier.h>
+ #include <Parsers/ASTLiteral.h>
+ #include <Parsers/ExpressionElementParsers.h>
#include <TableFunctions/TableFunctionFactory.h>
+ #include <Common/typeid_cast.h>

namespace DB
@@ -57,7 +59,7 @@ std::pair<Field, std::shared_ptr<const IDataType>> evaluateConstantExpression(co
ASTPtr evaluateConstantExpressionAsLiteral(const ASTPtr & node, const Context & context)
{
-     /// Branch with string in qery.
+     /// Branch with string in query.
    if (typeid_cast<const ASTLiteral *>(node.get()))
        return node;
@@ -77,4 +79,236 @@ ASTPtr evaluateConstantExpressionOrIdentifierAsLiteral(const ASTPtr & node, cons
    return evaluateConstantExpressionAsLiteral(node, context);
}
+ namespace
+ {
+     using Conjunction = ColumnsWithTypeAndName;
+     using Disjunction = std::vector<Conjunction>;
+
+     Disjunction analyzeEquals(const ASTIdentifier * identifier, const ASTLiteral * literal, const ExpressionActionsPtr & expr)
+     {
+         if (!identifier || !literal)
+         {
+             return {};
+         }
+
+         for (const auto & name_and_type : expr->getRequiredColumnsWithTypes())
+         {
+             const auto & name = name_and_type.name;
+             const auto & type = name_and_type.type;
+
+             if (name == identifier->name)
+             {
+                 ColumnWithTypeAndName column;
+                 // FIXME: what to do if field is not convertable?
+                 column.column = type->createColumnConst(1, convertFieldToType(literal->value, *type));
+                 column.name = name;
+                 column.type = type;
+                 return {{std::move(column)}};
+             }
+         }
+
+         return {};
+     }
+
+     Disjunction andDNF(const Disjunction & left, const Disjunction & right)
+     {
+         if (left.empty())
+         {
+             return right;
+         }
+
+         Disjunction result;
+
+         for (const auto & conjunct1 : left)
+         {
+             for (const auto & conjunct2 : right)
+             {
+                 Conjunction new_conjunct{conjunct1};
+                 new_conjunct.insert(new_conjunct.end(), conjunct2.begin(), conjunct2.end());
+                 result.emplace_back(new_conjunct);
+             }
+         }
+
+         return result;
+     }
+
+     Disjunction analyzeFunction(const ASTFunction * fn, const ExpressionActionsPtr & expr)
+     {
+         if (!fn)
+         {
+             return {};
+         }
+
+         // TODO: enumerate all possible function names!
+         if (fn->name == "equals")
+         {
+             const auto * left = fn->arguments->children.front().get();
+             const auto * right = fn->arguments->children.back().get();
+             const auto * identifier = typeid_cast<const ASTIdentifier *>(left) ? typeid_cast<const ASTIdentifier *>(left)
+                                                                               : typeid_cast<const ASTIdentifier *>(right);
+             const auto * literal = typeid_cast<const ASTLiteral *>(left) ? typeid_cast<const ASTLiteral *>(left)
+                                                                          : typeid_cast<const ASTLiteral *>(right);
+
+             return analyzeEquals(identifier, literal, expr);
+         }
+         else if (fn->name == "in")
+         {
+             const auto * left = fn->arguments->children.front().get();
+             const auto * right = fn->arguments->children.back().get();
+             const auto * identifier = typeid_cast<const ASTIdentifier *>(left);
+             const auto * inner_fn = typeid_cast<const ASTFunction *>(right);
+
+             if (!inner_fn)
+             {
+                 return {};
+             }
+
+             const auto * tuple = typeid_cast<const ASTExpressionList *>(inner_fn->children.front().get());
+
+             if (!tuple)
+             {
+                 return {};
+             }
+
+             Disjunction result;
+
+             for (const auto & child : tuple->children)
+             {
+                 const auto * literal = typeid_cast<const ASTLiteral *>(child.get());
+                 const auto dnf = analyzeEquals(identifier, literal, expr);
+
+                 if (dnf.empty())
+                 {
+                     return {};
+                 }
+
+                 result.insert(result.end(), dnf.begin(), dnf.end());
+             }
+
+             return result;
+         }
+         else if (fn->name == "or")
+         {
+             const auto * args = typeid_cast<const ASTExpressionList *>(fn->children.front().get());
+
+             if (!args)
+             {
+                 return {};
+             }
+
+             Disjunction result;
+
+             for (const auto & arg : args->children)
+             {
+                 const auto dnf = analyzeFunction(typeid_cast<const ASTFunction *>(arg.get()), expr);
+
+                 if (dnf.empty())
+                 {
+                     return {};
+                 }
+
+                 result.insert(result.end(), dnf.begin(), dnf.end());
+             }
+
+             return result;
+         }
+         else if (fn->name == "and")
+         {
+             const auto * args = typeid_cast<const ASTExpressionList *>(fn->children.front().get());
+
+             if (!args)
+             {
+                 return {};
+             }
+
+             Disjunction result;
+
+             for (const auto & arg : args->children)
+             {
+                 const auto dnf = analyzeFunction(typeid_cast<const ASTFunction *>(arg.get()), expr);
+
+                 if (dnf.empty())
+                 {
+                     continue;
+                 }
+
+                 result = andDNF(result, dnf);
+             }
+
+             return result;
+         }
+
+         return {};
+     }
+ }
+
+ std::optional<Blocks> evaluateExpressionOverConstantCondition(const ASTPtr & node, const ExpressionActionsPtr & target_expr)
+ {
+     Blocks result;
+
+     // TODO: `node` may be always-false literal.
+
+     if (const auto fn = typeid_cast<const ASTFunction *>(node.get()))
+     {
+         const auto dnf = analyzeFunction(fn, target_expr);
+
+         if (dnf.empty())
+         {
+             return {};
+         }
+
+         auto hasRequiredColumns = [&target_expr](const Block & block) -> bool
+         {
+             for (const auto & name : target_expr->getRequiredColumns())
+             {
+                 bool hasColumn = false;
+                 for (const auto & column_name : block.getNames())
+                 {
+                     if (column_name == name)
+                     {
+                         hasColumn = true;
+                         break;
+                     }
+                 }
+
+                 if (!hasColumn)
+                     return false;
+             }
+
+             return true;
+         };
+
+         for (const auto & conjunct : dnf)
+         {
+             Block block(conjunct);
+
+             // Block should contain all required columns from `target_expr`.
+             if (!hasRequiredColumns(block))
+             {
+                 return {};
+             }
+
+             target_expr->execute(block);
+
+             if (block.rows() == 1)
+             {
+                 result.push_back(block);
+             }
+             else if (block.rows() == 0)
+             {
+                 // Filter out cases like "WHERE a = 1 AND a = 2".
+                 continue;
+             }
+             else
+             {
+                 // FIXME: shouldn't happen
+                 return {};
+             }
+         }
+     }
+
+     return {result};
+ }

}
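
A concrete reading of the code above: the condition is rewritten into disjunctive normal form over (column = constant) atoms, with `andDNF` doing the distribution step, e.g. (A OR B) AND C becomes (A AND C) OR (B AND C); so `(a = 1 OR a = 2) AND b = 'x'` yields the two conjunctions {a=1, b='x'} and {a=2, b='x'}, each materialized as a one-row block and run through `target_expr`. A hedged sketch of a call site (the sharding-key use case and variable names are assumptions for illustration):

```cpp
// Hypothetical call site, e.g. pruning shards of a Distributed table;
// `condition` (the WHERE clause AST) and `sharding_expr` are assumptions.
std::optional<Blocks> blocks
    = evaluateExpressionOverConstantCondition(condition, sharding_expr);

if (!blocks)
{
    /// The condition could not be folded: every shard must be queried.
}
else if (blocks->empty())
{
    /// Provably always false (e.g. WHERE a = 1 AND a = 2): nothing matches.
}
else
{
    /// One single-row block per surviving conjunction, each holding one
    /// value of the target expression, e.g. one sharding-key value.
}
```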
View File
@@ -1,17 +1,22 @@
#pragma once

- #include <memory>
+ #include <Core/Block.h>
#include <Core/Field.h>
#include <Parsers/IAST.h>
#include <Parsers/IParser.h>
+ #include <memory>
+ #include <optional>

namespace DB
{

class Context;
+ class ExpressionActions;
class IDataType;

+ using ExpressionActionsPtr = std::shared_ptr<ExpressionActions>;

/** Evaluate constant expression and its type.
  * Used in rare cases - for elements of set for IN, for data to INSERT.
@@ -20,17 +25,24 @@ class IDataType;
std::pair<Field, std::shared_ptr<const IDataType>> evaluateConstantExpression(const ASTPtr & node, const Context & context);

- /** Evaluate constant expression
-  * and returns ASTLiteral with its value.
+ /** Evaluate constant expression and returns ASTLiteral with its value.
  */
ASTPtr evaluateConstantExpressionAsLiteral(const ASTPtr & node, const Context & context);

- /** Evaluate constant expression
-  * and returns ASTLiteral with its value.
+ /** Evaluate constant expression and returns ASTLiteral with its value.
  * Also, if AST is identifier, then return string literal with its name.
  * Useful in places where some name may be specified as identifier, or as result of a constant expression.
  */
ASTPtr evaluateConstantExpressionOrIdentifierAsLiteral(const ASTPtr & node, const Context & context);

+ /** Try to fold condition to countable set of constant values.
+  * @param condition a condition that we try to fold.
+  * @param target_expr expression evaluated over a set of constants.
+  * @return optional blocks each with a single row and a single column for target expression,
+  *         or empty blocks if condition is always false,
+  *         or nothing if condition can't be folded to a set of constants.
+  */
+ std::optional<Blocks> evaluateExpressionOverConstantCondition(const ASTPtr & condition, const ExpressionActionsPtr & target_expr);

}
View File
@@ -67,8 +67,7 @@ void evaluateMissingDefaults(Block & block,
        if (copy_block.has(col->name))
        {
            auto evaluated_col = copy_block.getByName(col->name);
-             if (ColumnPtr converted = evaluated_col.column->convertToFullColumnIfConst())
-                 evaluated_col.column = converted;
+             evaluated_col.column = evaluated_col.column->convertToFullColumnIfConst();

            block.insert(pos, std::move(evaluated_col));
        }
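
The one-line form is only correct if `IColumn::convertToFullColumnIfConst` returns the column itself when there is nothing to convert, rather than the null pointer the old call site assumed; under the old nullptr-returning contract the unconditional assignment would have wiped out non-const columns.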
View File
@@ -25,11 +25,6 @@ ASTPtr ASTAlterCommand::clone() const
        res->column = column->clone();
        res->children.push_back(res->column);
    }
-     if (primary_key)
-     {
-         res->primary_key = primary_key->clone();
-         res->children.push_back(res->primary_key);
-     }
    if (order_by)
    {
        res->order_by = order_by->clone();
@@ -56,7 +51,7 @@ void ASTAlterCommand::formatImpl(
    if (type == ASTAlterCommand::ADD_COLUMN)
    {
-         settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "ADD COLUMN " << (settings.hilite ? hilite_none : "");
+         settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "ADD COLUMN " << (if_not_exists ? "IF NOT EXISTS " : "") << (settings.hilite ? hilite_none : "");
        col_decl->formatImpl(settings, state, frame);

        /// AFTER
@@ -69,7 +64,7 @@ void ASTAlterCommand::formatImpl(
    else if (type == ASTAlterCommand::DROP_COLUMN)
    {
        settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str
-             << (clear_column ? "CLEAR " : "DROP ") << "COLUMN " << (settings.hilite ? hilite_none : "");
+             << (clear_column ? "CLEAR " : "DROP ") << "COLUMN " << (if_exists ? "IF EXISTS " : "") << (settings.hilite ? hilite_none : "");
        column->formatImpl(settings, state, frame);
        if (partition)
        {
@@ -79,14 +74,9 @@ void ASTAlterCommand::formatImpl(
    }
    else if (type == ASTAlterCommand::MODIFY_COLUMN)
    {
-         settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY COLUMN " << (settings.hilite ? hilite_none : "");
+         settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY COLUMN " << (if_exists ? "IF EXISTS " : "") << (settings.hilite ? hilite_none : "");
        col_decl->formatImpl(settings, state, frame);
    }
-     else if (type == ASTAlterCommand::MODIFY_PRIMARY_KEY)
-     {
-         settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY PRIMARY KEY " << (settings.hilite ? hilite_none : "");
-         primary_key->formatImpl(settings, state, frame);
-     }
    else if (type == ASTAlterCommand::MODIFY_ORDER_BY)
    {
        settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY ORDER BY " << (settings.hilite ? hilite_none : "");
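
With the two flags wired into the formatter, statements such as `ALTER TABLE t ADD COLUMN IF NOT EXISTS c UInt8` and `ALTER TABLE t DROP COLUMN IF EXISTS c` round-trip through parsing and formatting; the flags themselves are declared in the header change below. The `MODIFY PRIMARY KEY` action removed here is dropped in favour of the separate `MODIFY ORDER BY` action.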
View File
@@ -26,7 +26,6 @@ public:
        DROP_COLUMN,
        MODIFY_COLUMN,
        COMMENT_COLUMN,
-         MODIFY_PRIMARY_KEY,
        MODIFY_ORDER_BY,
        DROP_PARTITION,
@@ -55,10 +54,6 @@ public:
      */
    ASTPtr column;

-     /** For MODIFY PRIMARY KEY
-      */
-     ASTPtr primary_key;

    /** For MODIFY ORDER BY
      */
    ASTPtr order_by;
@@ -83,6 +78,10 @@ public:
    bool clear_column = false;   /// for CLEAR COLUMN (do not drop column from metadata)

+     bool if_not_exists = false;  /// option for ADD_COLUMN
+     bool if_exists = false;      /// option for DROP_COLUMN, MODIFY_COLUMN, COMMENT_COLUMN

    /** For FETCH PARTITION - the path in ZK to the shard, from which to download the partition.
      */
    String from;
View File
@@ -1,6 +1,6 @@
#pragma once

- #include <Parsers/IAST.h>
+ #include "IAST.h"
#include <Core/Field.h>
#include <Common/FieldVisitors.h>
@@ -18,7 +18,7 @@ public:
    ASTEnumElement(const String & name, const Field & value)
        : name{name}, value {value} {}

-     String getID() const override { return "EnumElement"; }
+     String getID(char) const override { return "EnumElement"; }

    ASTPtr clone() const override
    {
View File
@@ -13,10 +13,12 @@ void ASTKillQueryQuery::formatQueryImpl(const FormatSettings & settings, FormatS
    settings.ostr << (settings.hilite ? hilite_keyword : "") << "KILL QUERY";
    formatOnCluster(settings);
-     settings.ostr << " WHERE " << (settings.hilite ? hilite_none : "");

    if (where_expression)
+     {
+         settings.ostr << " WHERE " << (settings.hilite ? hilite_none : "");
        where_expression->formatImpl(settings, state, frame);
+     }

    settings.ostr << " " << (settings.hilite ? hilite_keyword : "") << (test ? "TEST" : (sync ? "SYNC" : "ASYNC")) << (settings.hilite ? hilite_none : "");
}
View File
@@ -15,8 +15,12 @@ public:
    ASTPtr clone() const override
    {
        auto clone = std::make_shared<ASTKillQueryQuery>(*this);
+         if (where_expression)
+         {
            clone->where_expression = where_expression->clone();
            clone->children = {clone->where_expression};
+         }
        return clone;
    }
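
The guard matters because `where_expression` may now legitimately be absent (see the formatter change above), and the old `clone()` dereferenced it unconditionally. A minimal sketch of the failure mode, assuming the class as declared here:

```cpp
ASTKillQueryQuery query;     // where_expression stays a null ASTPtr
auto copy = query.clone();   // old body: where_expression->clone() dereferences null;
                             // new body: the if is skipped and children is left alone
```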
View File
@@ -283,23 +283,21 @@ bool ASTSelectQuery::final() const
}

- ASTPtr ASTSelectQuery::array_join_expression_list() const
+ ASTPtr ASTSelectQuery::array_join_expression_list(bool & is_left) const
{
    const ASTArrayJoin * array_join = getFirstArrayJoin(*this);
    if (!array_join)
        return {};

+     is_left = (array_join->kind == ASTArrayJoin::Kind::Left);
    return array_join->expression_list;
}

- bool ASTSelectQuery::array_join_is_left() const
+ ASTPtr ASTSelectQuery::array_join_expression_list() const
{
-     const ASTArrayJoin * array_join = getFirstArrayJoin(*this);
-     if (!array_join)
-         return {};
-
-     return array_join->kind == ASTArrayJoin::Kind::Left;
+     bool is_left;
+     return array_join_expression_list(is_left);
}
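
Callers that previously combined `array_join_expression_list()` with the removed `array_join_is_left()` now get both answers in one call through the out-parameter. A hedged sketch of the new call pattern:

```cpp
bool is_left = false;
if (ASTPtr expr_list = select_query->array_join_expression_list(is_left))
{
    /// expr_list holds the ARRAY JOIN expressions; is_left distinguishes
    /// LEFT ARRAY JOIN (rows with empty arrays are kept) from plain ARRAY JOIN.
}
```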
Some files were not shown because too many files have changed in this diff.