Merge branch 'master' of github.com:yandex/ClickHouse

This commit is contained in:
Ivan Blinkov 2018-12-21 14:37:09 +03:00
commit b7bb966ca6
92 changed files with 2305 additions and 527 deletions

.gitignore vendored
View File

@ -248,3 +248,6 @@ website/package-lock.json
# Ignore files for locally disabled tests
/dbms/tests/queries/**/*.disabled
# cquery cache
/.cquery-cache

View File

@ -1,3 +1,98 @@
## ClickHouse release 18.16.0, 2018-12-14
### New features:
* `DEFAULT` expressions are evaluated for missing fields when loading data in semi-structured input formats (`JSONEachRow`, `TSKV`). [#3555](https://github.com/yandex/ClickHouse/pull/3555)
* The `ALTER TABLE` query now has the `MODIFY ORDER BY` action for changing the sorting key when adding or removing a table column. This is useful for tables in the `MergeTree` family that perform additional tasks when merging based on this sorting key, such as `SummingMergeTree`, `AggregatingMergeTree`, and so on. [#3581](https://github.com/yandex/ClickHouse/pull/3581) [#3755](https://github.com/yandex/ClickHouse/pull/3755)
* For tables in the `MergeTree` family, now you can specify a different sorting key (`ORDER BY`) and index (`PRIMARY KEY`). The sorting key can be longer than the index (see the sketch after this list). [#3581](https://github.com/yandex/ClickHouse/pull/3581)
* Added the `hdfs` table function and the `HDFS` table engine for importing and exporting data to HDFS. [chenxing-xc](https://github.com/yandex/ClickHouse/pull/3617)
* Added functions for working with base64: `base64Encode`, `base64Decode`, `tryBase64Decode`. [Alexander Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3350)
* Now you can use a parameter to configure the precision of the `uniqCombined` aggregate function (select the number of HyperLogLog cells). [#3406](https://github.com/yandex/ClickHouse/pull/3406)
* Added the `system.contributors` table that contains the names of everyone who made commits in ClickHouse. [#3452](https://github.com/yandex/ClickHouse/pull/3452)
* Added the ability to omit the partition for the `ALTER TABLE ... FREEZE` query in order to back up all partitions at once. [#3514](https://github.com/yandex/ClickHouse/pull/3514)
* Added `dictGet` and `dictGetOrDefault` functions that don't require specifying the type of return value. The type is determined automatically from the dictionary description. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3564)
* Now you can specify comments for a column in the table description and change it using `ALTER`. [#3377](https://github.com/yandex/ClickHouse/pull/3377)
* Reading is supported for `Join` type tables with simple keys. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Now you can specify the options `join_use_nulls`, `max_rows_in_join`, `max_bytes_in_join`, and `join_overflow_mode` when creating a `Join` type table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Added the `joinGet` function that allows you to use a `Join` type table like a dictionary. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3728)
* Added the `partition_key`, `sorting_key`, `primary_key`, and `sampling_key` columns to the `system.tables` table in order to provide information about table keys. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
* Added the `is_in_partition_key`, `is_in_sorting_key`, `is_in_primary_key`, and `is_in_sampling_key` columns to the `system.columns` table. [#3609](https://github.com/yandex/ClickHouse/pull/3609)
* Added the `min_time` and `max_time` columns to the `system.parts` table. These columns are populated when the partitioning key is an expression consisting of `DateTime` columns. [Emmanuel Donin de Rosière](https://github.com/yandex/ClickHouse/pull/3800)
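To illustrate the separation of sorting key and index together with `MODIFY ORDER BY`, here is a minimal sketch; the table and column names are hypothetical and not taken from this release.

```sql
CREATE TABLE events
(
    date Date,
    user_id UInt64,
    hits UInt64
)
ENGINE = SummingMergeTree
PRIMARY KEY date               -- the index (primary key)
ORDER BY (date, user_id);      -- the sorting key may be longer than the index

-- Extend the sorting key when adding a column in the same ALTER:
ALTER TABLE events
    ADD COLUMN device String,
    MODIFY ORDER BY (date, user_id, device);
```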
### Bug fixes:
* Fixes and performance improvements for the `LowCardinality` data type. `GROUP BY` using `LowCardinality(Nullable(...))`. Getting the values of `extremes`. Processing higher-order functions. `LEFT ARRAY JOIN`. Distributed `GROUP BY`. Functions that return `Array`. Execution of `ORDER BY`. Writing to `Distributed` tables (nicelulu). Backward compatibility for `INSERT` queries from old clients that implement the `Native` protocol. Support for `LowCardinality` for `JOIN`. Improved performance when working in a single stream. [#3823](https://github.com/yandex/ClickHouse/pull/3823) [#3803](https://github.com/yandex/ClickHouse/pull/3803) [#3799](https://github.com/yandex/ClickHouse/pull/3799) [#3769](https://github.com/yandex/ClickHouse/pull/3769) [#3744](https://github.com/yandex/ClickHouse/pull/3744) [#3681](https://github.com/yandex/ClickHouse/pull/3681) [#3651](https://github.com/yandex/ClickHouse/pull/3651) [#3649](https://github.com/yandex/ClickHouse/pull/3649) [#3641](https://github.com/yandex/ClickHouse/pull/3641) [#3632](https://github.com/yandex/ClickHouse/pull/3632) [#3568](https://github.com/yandex/ClickHouse/pull/3568) [#3523](https://github.com/yandex/ClickHouse/pull/3523) [#3518](https://github.com/yandex/ClickHouse/pull/3518)
* Fixed how the `select_sequential_consistency` option works. Previously, when this setting was enabled, an incomplete result was sometimes returned after beginning to write to a new partition. [#2863](https://github.com/yandex/ClickHouse/pull/2863)
* Databases are correctly specified when executing DDL `ON CLUSTER` queries and `ALTER UPDATE/DELETE`. [#3772](https://github.com/yandex/ClickHouse/pull/3772) [#3460](https://github.com/yandex/ClickHouse/pull/3460)
* Databases are correctly specified for subqueries inside a VIEW. [#3521](https://github.com/yandex/ClickHouse/pull/3521)
* Fixed a bug in `PREWHERE` with `FINAL` for `VersionedCollapsingMergeTree`. [7167bfd7](https://github.com/yandex/ClickHouse/commit/7167bfd7b365538f7a91c4307ad77e552ab4e8c1)
* Now you can use `KILL QUERY` to cancel queries that have not started yet because they are waiting for the table to be locked. [#3517](https://github.com/yandex/ClickHouse/pull/3517)
* Corrected date and time calculations if the clocks were moved back at midnight (this happens in Iran, and happened in Moscow from 1981 to 1983). Previously, this led to the time being reset a day earlier than necessary, and also caused incorrect formatting of the date and time in text format. [#3819](https://github.com/yandex/ClickHouse/pull/3819)
* Fixed bugs in some cases of `VIEW` and subqueries that omit the database. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3521)
* Fixed a race condition when simultaneously reading from a `MATERIALIZED VIEW` and deleting a `MATERIALIZED VIEW` due to not locking the internal `MATERIALIZED VIEW`. [#3404](https://github.com/yandex/ClickHouse/pull/3404) [#3694](https://github.com/yandex/ClickHouse/pull/3694)
* Fixed the error `Lock handler cannot be nullptr.` [#3689](https://github.com/yandex/ClickHouse/pull/3689)
* Fixed query processing when the `compile_expressions` option is enabled (it's enabled by default). Nondeterministic constant expressions like the `now` function are no longer unfolded. [#3457](https://github.com/yandex/ClickHouse/pull/3457)
* Fixed a crash when specifying a non-constant scale argument in `toDecimal32/64/128` functions.
* Fixed an error when trying to insert an array with `NULL` elements in the `Values` format into a column of type `Array` without `Nullable` (if `input_format_values_interpret_expressions` = 1). [#3487](https://github.com/yandex/ClickHouse/pull/3487) [#3503](https://github.com/yandex/ClickHouse/pull/3503)
* Fixed continuous error logging in `DDLWorker` if ZooKeeper is not available. [8f50c620](https://github.com/yandex/ClickHouse/commit/8f50c620334988b28018213ec0092fe6423847e2)
* Fixed the return type for `quantile*` functions from `Date` and `DateTime` types of arguments. [#3580](https://github.com/yandex/ClickHouse/pull/3580)
* Fixed the `WITH` clause if it specifies a simple alias without expressions. [#3570](https://github.com/yandex/ClickHouse/pull/3570)
* Fixed processing of queries with named sub-queries and qualified column names when `enable_optimize_predicate_expression` is enabled. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3588)
* Fixed the error `Attempt to attach to nullptr thread group` when working with materialized views. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3623)
* Fixed a crash when passing certain incorrect arguments to the `arrayReverse` function. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
* Fixed the buffer overflow in the `extractURLParameter` function. Improved performance. Added correct processing of strings containing zero bytes. [141e9799](https://github.com/yandex/ClickHouse/commit/141e9799e49201d84ea8e951d1bed4fb6d3dacb5)
* Fixed buffer overflow in the `lowerUTF8` and `upperUTF8` functions. Removed the ability to execute these functions over `FixedString` type arguments. [#3662](https://github.com/yandex/ClickHouse/pull/3662)
* Fixed a rare race condition when deleting `MergeTree` tables. [#3680](https://github.com/yandex/ClickHouse/pull/3680)
* Fixed a race condition when reading from `Buffer` tables and simultaneously performing `ALTER` or `DROP` on the target tables. [#3719](https://github.com/yandex/ClickHouse/pull/3719)
* Fixed a segfault if the `max_temporary_non_const_columns` limit was exceeded. [#3788](https://github.com/yandex/ClickHouse/pull/3788)
### Improvements:
* The server does not write the processed configuration files to the `/etc/clickhouse-server/` directory. Instead, it saves them in the `preprocessed_configs` directory inside `path`. As a result, the `clickhouse` user does not need write access to the `/etc/clickhouse-server/` directory, which improves security. [#2443](https://github.com/yandex/ClickHouse/pull/2443)
* The `min_merge_bytes_to_use_direct_io` option is set to 10 GiB by default. Merges that form large parts of tables from the MergeTree family are performed in `O_DIRECT` mode, which prevents excessive page cache eviction. [#3504](https://github.com/yandex/ClickHouse/pull/3504)
* Accelerated server start when there is a very large number of tables. [#3398](https://github.com/yandex/ClickHouse/pull/3398)
* Added a connection pool and HTTP `Keep-Alive` for connections between replicas. [#3594](https://github.com/yandex/ClickHouse/pull/3594)
* If the query syntax is invalid, the `400 Bad Request` code is returned in the `HTTP` interface (500 was returned previously). [31bc680a](https://github.com/yandex/ClickHouse/commit/31bc680ac5f4bb1d0360a8ba4696fa84bb47d6ab)
* The `join_default_strictness` option is set to `ALL` by default for compatibility. [120e2cbe](https://github.com/yandex/ClickHouse/commit/120e2cbe2ff4fbad626c28042d9b28781c805afe)
* Removed logging to `stderr` from the `re2` library for invalid or complex regular expressions. [#3723](https://github.com/yandex/ClickHouse/pull/3723)
* The `Kafka` table engine now checks subscriptions before it starts reading from Kafka, and supports the `kafka_max_block_size` setting for the table. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3396)
* The `cityHash64`, `farmHash64`, `metroHash64`, `sipHash64`, `halfMD5`, `murmurHash2_32`, `murmurHash2_64`, `murmurHash3_32`, and `murmurHash3_64` functions now work for any number of arguments and for arguments in the form of tuples. [#3451](https://github.com/yandex/ClickHouse/pull/3451) [#3519](https://github.com/yandex/ClickHouse/pull/3519)
* The `arrayReverse` function now works with any types of arrays. [73e3a7b6](https://github.com/yandex/ClickHouse/commit/73e3a7b662161d6005e7727d8a711b930386b871)
* Added an optional parameter: the slot size for the `timeSlots` function. [Kirill Shvakov](https://github.com/yandex/ClickHouse/pull/3724)
* For `FULL` and `RIGHT JOIN`, the `max_block_size` setting is used for a stream of non-joined data from the right table. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3699)
* Added the `--secure` command line parameter in `clickhouse-benchmark` and `clickhouse-performance-test` to enable TLS. [#3688](https://github.com/yandex/ClickHouse/pull/3688) [#3690](https://github.com/yandex/ClickHouse/pull/3690)
* Added type conversion when the structure of a `Buffer` type table does not match the structure of the destination table. [Vitaly Baranov](https://github.com/yandex/ClickHouse/pull/3603)
* Added the `tcp_keep_alive_timeout` option to enable keep-alive packets after inactivity for the specified time interval. [#3441](https://github.com/yandex/ClickHouse/pull/3441)
* Removed unnecessary quoting of values for the partition key in the `system.parts` table if it consists of a single column. [#3652](https://github.com/yandex/ClickHouse/pull/3652)
* The modulo function works for `Date` and `DateTime` data types. [#3385](https://github.com/yandex/ClickHouse/pull/3385)
* Added synonyms for the `POWER`, `LN`, `LCASE`, `UCASE`, `REPLACE`, `LOCATE`, `SUBSTR`, and `MID` functions. [#3774](https://github.com/yandex/ClickHouse/pull/3774) [#3763](https://github.com/yandex/ClickHouse/pull/3763) Some function names are case-insensitive for compatibility with the SQL standard. Added syntactic sugar `SUBSTRING(expr FROM start FOR length)` for compatibility with SQL (see the sketch after this list). [#3804](https://github.com/yandex/ClickHouse/pull/3804)
* Added the ability to `mlock` memory pages corresponding to `clickhouse-server` executable code to prevent it from being forced out of memory. This feature is disabled by default. [#3553](https://github.com/yandex/ClickHouse/pull/3553)
* Improved performance when reading from `O_DIRECT` (with the `min_bytes_to_use_direct_io` option enabled). [#3405](https://github.com/yandex/ClickHouse/pull/3405)
* Improved performance of the `dictGet...OrDefault` function for a constant key argument and a non-constant default argument. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3563)
* The `firstSignificantSubdomain` function now processes the domains `gov`, `mil`, and `edu`. [Igor Hatarist](https://github.com/yandex/ClickHouse/pull/3601) Improved performance. [#3628](https://github.com/yandex/ClickHouse/pull/3628)
* Ability to specify custom environment variables for starting `clickhouse-server` using the `SYS-V init.d` script by defining `CLICKHOUSE_PROGRAM_ENV` in `/etc/default/clickhouse`. [Pavlo Bashynskyi](https://github.com/yandex/ClickHouse/pull/3612)
* Correct return code for the clickhouse-server init script. [#3516](https://github.com/yandex/ClickHouse/pull/3516)
* The `system.metrics` table now has the `VersionInteger` metric, and `system.build_options` has the added line `VERSION_INTEGER`, which contains the numeric form of the ClickHouse version, such as `18016000`. [#3644](https://github.com/yandex/ClickHouse/pull/3644)
* Removed the ability to compare the `Date` type with a number to avoid potential errors like `date = 2018-12-17`, where quotes around the date are omitted by mistake. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
* Fixed the behavior of stateful functions like `rowNumberInAllBlocks`. Previously, they produced a result that was one number too large because they started counting during query analysis. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3729)
* If the `force_restore_data` file can't be deleted, an error message is displayed. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3794)
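A small sketch of the SQL-compatibility additions above (case-insensitive synonyms and the `SUBSTRING ... FROM ... FOR ...` sugar); the literal values are only examples.

```sql
SELECT
    SUBSTRING('ClickHouse' FROM 6 FOR 5) AS tail,   -- same as substring('ClickHouse', 6, 5), returns 'House'
    LCASE('AbC') AS lower_case,                     -- synonym added for lowercasing
    POWER(2, 10) AS p;                              -- synonym for pow(2, 10)
```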
### Build improvements:
* Updated the `jemalloc` library, which fixes a potential memory leak. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3557)
* Profiling with `jemalloc` is enabled by default for debug builds. [2cc82f5c](https://github.com/yandex/ClickHouse/commit/2cc82f5cbe266421cd4c1165286c2c47e5ffcb15)
* Added the ability to run integration tests when only `Docker` is installed on the system. [#3650](https://github.com/yandex/ClickHouse/pull/3650)
* Added a fuzz test of expressions in `SELECT` queries. [#3442](https://github.com/yandex/ClickHouse/pull/3442)
* Added a stress test for commits, which performs functional tests in parallel and in random order to detect more race conditions. [#3438](https://github.com/yandex/ClickHouse/pull/3438)
* Improved the method for starting clickhouse-server in a Docker image. [Elghazal Ahmed](https://github.com/yandex/ClickHouse/pull/3663)
* For a Docker image, added support for initializing databases using files in the `/docker-entrypoint-initdb.d` directory. [Konstantin Lebedev](https://github.com/yandex/ClickHouse/pull/3695)
* Fixes for builds on ARM. [#3709](https://github.com/yandex/ClickHouse/pull/3709)
### Backward incompatible changes:
* Removed the ability to compare the `Date` type with a number. Instead of `toDate('2018-12-18') = 17883`, you must use explicit type conversion `= toDate(17883)`. [#3687](https://github.com/yandex/ClickHouse/pull/3687)
## ClickHouse release 18.14.18, 2018-12-04
### Bug fixes:
@ -90,7 +185,7 @@
### Improvements:
* Significantly reduced memory consumption for queries with `ORDER BY` and `LIMIT`. See the `max_bytes_before_remerge_sort` setting. [#3205](https://github.com/yandex/ClickHouse/pull/3205)
* In the absence of `JOIN` (`LEFT`, `INNER`, ...), `INNER JOIN` is assumed. [#3147](https://github.com/yandex/ClickHouse/pull/3147)
* Qualified asterisks work correctly in queries with `JOIN`. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3202)
* The `ODBC` table engine correctly chooses the method for quoting identifiers in the SQL dialect of a remote database. [Alexandr Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3210)
@ -127,7 +222,7 @@
* If after merging data parts, the checksum for the resulting part differs from the result of the same merge in another replica, the result of the merge is deleted and the data part is downloaded from the other replica (this is the correct behavior). But after downloading the data part, it couldn't be added to the working set because of an error that the part already exists (because the data part was deleted with some delay after the merge). This led to cyclical attempts to download the same data. [#3194](https://github.com/yandex/ClickHouse/pull/3194)
* Fixed incorrect calculation of total memory consumption by queries (because of incorrect calculation, the `max_memory_usage_for_all_queries` setting worked incorrectly and the `MemoryTracking` metric had an incorrect value). This error occurred in version 18.12.13. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3344)
* Fixed the functionality of `CREATE TABLE ... ON CLUSTER ... AS SELECT ...` This error occurred in version 18.12.13. [#3247](https://github.com/yandex/ClickHouse/pull/3247)
* Fixed unnecessary preparation of data structures for `JOIN`s on the server that initiates the query if the `JOIN` is only performed on remote servers. [#3340](https://github.com/yandex/ClickHouse/pull/3340)
* Fixed bugs in the `Kafka` engine: deadlocks after exceptions when starting to read data, and locks upon completion. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3215)
* For `Kafka` tables, the optional `schema` parameter was not passed (the schema of the `Cap'n'Proto` format). [Vojtech Splichal](https://github.com/yandex/ClickHouse/pull/3150)
* If the ensemble of ZooKeeper servers has servers that accept the connection but then immediately close it instead of responding to the handshake, ClickHouse chooses to connect to another server. Previously, this produced the error `Cannot read all data. Bytes read: 0. Bytes expected: 4.` and the server couldn't start. [8218cf3a](https://github.com/yandex/ClickHouse/commit/8218cf3a5f39a43401953769d6d12a0bb8d29da9)
@ -208,7 +303,7 @@
* Added the `DECIMAL(digits, scale)` data type (`Decimal32(scale)`, `Decimal64(scale)`, `Decimal128(scale)`). To enable it, use the setting `allow_experimental_decimal_type`. [#2846](https://github.com/yandex/ClickHouse/pull/2846) [#2970](https://github.com/yandex/ClickHouse/pull/2970) [#3008](https://github.com/yandex/ClickHouse/pull/3008) [#3047](https://github.com/yandex/ClickHouse/pull/3047)
* New `WITH ROLLUP` modifier for `GROUP BY` (alternative syntax: `GROUP BY ROLLUP(...)`). [#2948](https://github.com/yandex/ClickHouse/pull/2948)
* In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2787)
* Added support for JOIN with table functions. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2907)
* Autocomplete by pressing Tab in clickhouse-client. [Sergey Shcherbin](https://github.com/yandex/ClickHouse/pull/2447)
* Ctrl+C in clickhouse-client clears a query that was entered. [#2877](https://github.com/yandex/ClickHouse/pull/2877)
@ -294,7 +389,7 @@
### Backward incompatible changes:
* In queries with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting `asterisk_left_columns_only` to 1 on the user configuration level.
### Build changes:
@ -338,7 +433,7 @@
* Fixed an error for concurrent `Set` or `Join`. [Amos Bird](https://github.com/yandex/ClickHouse/pull/2823)
* Fixed the `Block structure mismatch in UNION stream: different number of columns` error that occurred for `UNION ALL` queries inside a sub-query if one of the `SELECT` queries contains duplicate column names. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2094)
* Fixed a memory leak if an exception occurred when connecting to a MySQL server.
* Fixed incorrect clickhouse-client response code in case of a query error.
* Fixed incorrect behavior of materialized views containing DISTINCT. [#2795](https://github.com/yandex/ClickHouse/issues/2795)
### Backward incompatible changes
@ -452,7 +547,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si
* Fixed a problem with a very small timeout for sockets (one second) for reading and writing when sending and downloading replicated data, which made it impossible to download larger parts if there is a load on the network or disk (it resulted in cyclical attempts to download parts). This error occurred in version 1.1.54388.
* Fixed issues when using chroot in ZooKeeper if you inserted duplicate data blocks in the table.
* The `has` function now works correctly for an array with Nullable elements ([#2115](https://github.com/yandex/ClickHouse/issues/2115)).
* The `system.tables` table now works correctly when used in distributed queries. The `metadata_modification_time` and `engine_full` columns are now non-virtual. Fixed an error that occurred if only these columns were queried from the table.
* Fixed how an empty `TinyLog` table works after inserting an empty data block ([#2563](https://github.com/yandex/ClickHouse/issues/2563)).
* The `system.zookeeper` table works if the value of the node in ZooKeeper is NULL.
@ -701,7 +796,7 @@ The expression must be a chain of equalities joined by the AND operator. Each si
* Added the `parseDateTimeBestEffort`, `parseDateTimeBestEffortOrZero`, and `parseDateTimeBestEffortOrNull` functions to read the DateTime from a string containing text in a wide variety of possible formats.
* Data can be partially reloaded from external dictionaries during updating (load just the records in which the value of the specified field is greater than in the previous download) (Arsen Hakobyan).
* Added the `cluster` table function. Example: `cluster(cluster_name, db, table)`. The `remote` table function can accept the cluster name as the first argument, if it is specified as an identifier.
* The `remote` and `cluster` table functions can be used in `INSERT` queries.
* Added the `create_table_query` and `engine_full` virtual columns to the `system.tables` table. The `metadata_modification_time` column is virtual.
* Added the `data_path` and `metadata_path` columns to the `system.tables` and `system.databases` tables, and added the `path` column to the `system.parts` and `system.parts_columns` tables.
* Added additional information about merges in the `system.part_log` table.
@ -1040,7 +1135,7 @@ This release contains bug fixes for the previous release 1.1.54310:
### Please note when upgrading:
* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT queries will fail with the message "Merges are processing significantly slower than inserts." Use the `SELECT * FROM system.merges` query to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting: in the `<merge_tree>` section of config.xml, set `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server.
## ClickHouse release 1.1.54284, 2017-08-29
@ -1133,7 +1228,7 @@ This release contains bug fixes for the previous release 1.1.54276:
### New features:
* Distributed DDL (for example, `CREATE TABLE ON CLUSTER`)
* The replicated query `ALTER TABLE CLEAR COLUMN IN PARTITION`.
* The engine for Dictionary tables (access to dictionary data in the form of a table).
* Dictionary database engine (this type of database automatically has Dictionary tables available for all the connected external dictionaries).
* You can check for updates to the dictionary by sending a request to the source.

View File

@ -1,4 +1,4 @@
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/base64/lib/lib.c")
set (MISSING_INTERNAL_BASE64_LIBRARY 1)
message (WARNING "submodule contrib/base64 is missing. to fix try run: \n git submodule update --init --recursive")
endif ()

View File

@ -2,7 +2,7 @@
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-stringop-overflow")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-implicit-fallthrough -Wno-class-memaccess -Wno-sign-compare -std=c++1z")
elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-format -Wno-parentheses-equality")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-format -std=c++1z")

View File

@ -61,7 +61,7 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
("block-size,b", boost::program_options::value<unsigned>()->default_value(DBMS_DEFAULT_BUFFER_SIZE), "compress in blocks of specified size")
("hc", "use LZ4HC instead of LZ4")
("zstd", "use ZSTD instead of LZ4")
("level", boost::program_options::value<int>(), "compression level")
("none", "use no compression instead of LZ4")
("stat", "print block statistics of compressed data")
;
@ -94,7 +94,9 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
else if (use_none)
method = DB::CompressionMethod::NONE;
DB::CompressionSettings settings(method, options.count("level")
? options["level"].as<int>()
: DB::CompressionSettings::getDefaultLevel(method));
DB::ReadBufferFromFileDescriptor rb(STDIN_FILENO);
DB::WriteBufferFromFileDescriptor wb(STDOUT_FILENO);

View File

@ -370,19 +370,7 @@ void TCPHandler::processInsertQuery(const Settings & global_settings)
}
/// Send block to the client - table structure.
sendData(state.io.out->getHeader());
/// Support insert from old clients without low cardinality type.
if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
{
for (auto & col : block)
{
col.type = recursiveRemoveLowCardinality(col.type);
col.column = recursiveRemoveLowCardinality(col.column);
}
}
sendData(block);
readData(global_settings);
state.io.out->writeSuffix();
@ -399,16 +387,6 @@ void TCPHandler::processOrdinaryQuery()
{
Block header = state.io.in->getHeader();
/// Send data to old clients without low cardinality type.
if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
{
for (auto & column : header)
{
column.column = recursiveRemoveLowCardinality(column.column);
column.type = recursiveRemoveLowCardinality(column.type);
}
}
if (header)
sendData(header);
}
@ -782,7 +760,8 @@ void TCPHandler::initBlockInput()
state.block_in = std::make_shared<NativeBlockInputStream>(
*state.maybe_compressed_in,
header,
client_revision,
!connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
}
}
@ -803,7 +782,8 @@ void TCPHandler::initBlockOutput(const Block & block)
state.block_out = std::make_shared<NativeBlockOutputStream>(
*state.maybe_compressed_out,
client_revision,
block.cloneEmpty(),
!connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
}
}
@ -815,7 +795,8 @@ void TCPHandler::initLogsBlockOutput(const Block & block)
state.logs_block_out = std::make_shared<NativeBlockOutputStream>(
*out,
client_revision,
block.cloneEmpty(),
!connection_context.getSettingsRef().low_cardinality_allow_in_native_format);
}
}
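The hunks above move the `LowCardinality` compatibility conversion out of `TCPHandler` and into the Native streams, gated by the `low_cardinality_allow_in_native_format` setting referenced in the new constructor arguments. A hedged sketch of toggling it per session (the exact observable effect is an assumption based on this diff):

```sql
-- When the setting is 0, LowCardinality columns are converted to their plain
-- types when data is sent or received in the Native format.
SET low_cardinality_allow_in_native_format = 0;
SELECT toLowCardinality('a') AS c;
```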

View File

@ -25,7 +25,7 @@ namespace Poco { class Logger; }
namespace DB
{
struct ColumnsDescription;
/// State of query processing.
struct QueryState

View File

@ -187,6 +187,20 @@
</replica>
</shard>
</test_shard_localhost_secure>
<test_unavailable_shard>
<shard>
<replica>
<host>localhost</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>localhost</host>
<port>1</port>
</replica>
</shard>
</test_unavailable_shard>
</remote_servers>
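The new `test_unavailable_shard` cluster above pairs a reachable replica (port 9000) with one on a closed port. A hedged sketch of how such a cluster might be queried in tests, using the standard `cluster` table function and the `skip_unavailable_shards` setting (neither is introduced by this diff):

```sql
SELECT count()
FROM cluster('test_unavailable_shard', system, one)
SETTINGS skip_unavailable_shards = 1;   -- ignore the shard on the closed port
```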

View File

@ -0,0 +1,36 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/AggregateFunctionBoundingRatio.h>
#include <AggregateFunctions/FactoryHelpers.h>
namespace DB
{
namespace ErrorCodes
{
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}
namespace
{
AggregateFunctionPtr createAggregateFunctionRate(const std::string & name, const DataTypes & argument_types, const Array & parameters)
{
assertNoParameters(name, parameters);
assertBinary(name, argument_types);
if (argument_types.size() < 2)
throw Exception("Aggregate function " + name + " requires at least two arguments",
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
return std::make_shared<AggregateFunctionBoundingRatio>(argument_types);
}
}
void registerAggregateFunctionRate(AggregateFunctionFactory & factory)
{
factory.registerFunction("boundingRatio", createAggregateFunctionRate, AggregateFunctionFactory::CaseInsensitive);
}
}

View File

@ -0,0 +1,162 @@
#pragma once
#include <DataTypes/DataTypesNumber.h>
#include <Columns/ColumnsNumber.h>
#include <Common/FieldVisitors.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <AggregateFunctions/Helpers.h>
#include <AggregateFunctions/IAggregateFunction.h>
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
}
/** Tracks the leftmost and rightmost (x, y) data points.
*/
struct AggregateFunctionBoundingRatioData
{
struct Point
{
Float64 x;
Float64 y;
};
bool empty = true;
Point left;
Point right;
void add(Float64 x, Float64 y)
{
Point point{x, y};
if (empty)
{
left = point;
right = point;
empty = false;
}
else if (point.x < left.x)
{
left = point;
}
else if (point.x > right.x)
{
right = point;
}
}
void merge(const AggregateFunctionBoundingRatioData & other)
{
if (empty)
{
*this = other;
}
else
{
if (other.left.x < left.x)
left = other.left;
if (other.right.x > right.x)
right = other.right;
}
}
void serialize(WriteBuffer & buf) const
{
writeBinary(empty, buf);
if (!empty)
{
writePODBinary(left, buf);
writePODBinary(right, buf);
}
}
void deserialize(ReadBuffer & buf)
{
readBinary(empty, buf);
if (!empty)
{
readPODBinary(left, buf);
readPODBinary(right, buf);
}
}
};
class AggregateFunctionBoundingRatio final : public IAggregateFunctionDataHelper<AggregateFunctionBoundingRatioData, AggregateFunctionBoundingRatio>
{
private:
/** Calculates the slope of a line between leftmost and rightmost data points.
* (y2 - y1) / (x2 - x1)
*/
Float64 getBoundingRatio(const AggregateFunctionBoundingRatioData & data) const
{
if (data.empty)
return std::numeric_limits<Float64>::quiet_NaN();
return (data.right.y - data.left.y) / (data.right.x - data.left.x);
}
public:
String getName() const override
{
return "boundingRatio";
}
AggregateFunctionBoundingRatio(const DataTypes & arguments)
{
const auto x_arg = arguments.at(0).get();
const auto y_arg = arguments.at(1).get();
if (!x_arg->isValueRepresentedByNumber() || !y_arg->isValueRepresentedByNumber())
throw Exception("Illegal types of arguments of aggregate function " + getName() + ", must have number representation.",
ErrorCodes::BAD_ARGUMENTS);
}
DataTypePtr getReturnType() const override
{
return std::make_shared<DataTypeFloat64>();
}
void add(AggregateDataPtr place, const IColumn ** columns, const size_t row_num, Arena *) const override
{
/// TODO Inefficient.
const auto x = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[0])[row_num]);
const auto y = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[1])[row_num]);
data(place).add(x, y);
}
void merge(AggregateDataPtr place, ConstAggregateDataPtr rhs, Arena *) const override
{
data(place).merge(data(rhs));
}
void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override
{
data(place).serialize(buf);
}
void deserialize(AggregateDataPtr place, ReadBuffer & buf, Arena *) const override
{
data(place).deserialize(buf);
}
void insertResultInto(ConstAggregateDataPtr place, IColumn & to) const override
{
static_cast<ColumnFloat64 &>(to).getData().push_back(getBoundingRatio(data(place)));
}
const char * getHeaderFilePath() const override
{
return __FILE__;
}
};
}
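For context, a usage sketch of the new `boundingRatio` aggregate function defined above; the table `points` and its columns `x` and `y` are hypothetical.

```sql
-- boundingRatio(x, y) returns (y at max x - y at min x) / (max x - min x),
-- i.e. the slope of the line through the leftmost and rightmost points.
SELECT boundingRatio(x, y) FROM points;
```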

View File

@ -17,6 +17,7 @@ namespace ErrorCodes
extern const int PARAMETER_OUT_OF_BOUND;
}
namespace
{
@ -44,6 +45,8 @@ AggregateFunctionPtr createAggregateFunctionHistogram(const std::string & name,
throw Exception("Illegal type " + arguments[0]->getName() + " of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
return res;
return nullptr;
}
}

View File

@ -15,6 +15,7 @@ void registerAggregateFunctionGroupArrayInsertAt(AggregateFunctionFactory &);
void registerAggregateFunctionsQuantile(AggregateFunctionFactory &);
void registerAggregateFunctionsSequenceMatch(AggregateFunctionFactory &);
void registerAggregateFunctionWindowFunnel(AggregateFunctionFactory &);
void registerAggregateFunctionRate(AggregateFunctionFactory &);
void registerAggregateFunctionsMinMaxAny(AggregateFunctionFactory &);
void registerAggregateFunctionsStatisticsStable(AggregateFunctionFactory &);
void registerAggregateFunctionsStatisticsSimple(AggregateFunctionFactory &);
@ -50,6 +51,7 @@ void registerAggregateFunctions()
registerAggregateFunctionsQuantile(factory);
registerAggregateFunctionsSequenceMatch(factory);
registerAggregateFunctionWindowFunnel(factory);
registerAggregateFunctionRate(factory);
registerAggregateFunctionsMinMaxAny(factory);
registerAggregateFunctionsStatisticsStable(factory);
registerAggregateFunctionsStatisticsSimple(factory);

View File

@ -97,8 +97,8 @@ public:
/// Approximate number of allocated bytes in memory - for profiling and limits.
size_t allocatedBytes() const;
operator bool() const { return !!columns(); }
bool operator!() const { return !this->operator bool(); }
/** Get a list of column names separated by commas. */
std::string dumpNames() const;

View File

@ -29,8 +29,8 @@ NativeBlockInputStream::NativeBlockInputStream(ReadBuffer & istr_, UInt64 server
{
}
NativeBlockInputStream::NativeBlockInputStream(ReadBuffer & istr_, const Block & header_, UInt64 server_revision_, bool convert_types_to_low_cardinality_)
: istr(istr_), header(header_), server_revision(server_revision_), convert_types_to_low_cardinality(convert_types_to_low_cardinality_)
{
}
@ -154,7 +154,8 @@ Block NativeBlockInputStream::readImpl()
column.column = std::move(read_column);
/// Support insert from old clients without low cardinality type.
bool revision_without_low_cardinality = server_revision && server_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE;
if (header && (convert_types_to_low_cardinality || revision_without_low_cardinality))
{
column.column = recursiveLowCardinalityConversion(column.column, column.type, header.getByPosition(i).type);
column.type = header.getByPosition(i).type;

View File

@ -65,7 +65,7 @@ public:
/// For cases when data structure (header) is known in advance.
/// NOTE We may use header for data validation and/or type conversions. It is not implemented.
NativeBlockInputStream(ReadBuffer & istr_, const Block & header_, UInt64 server_revision_, bool convert_types_to_low_cardinality_ = false);
/// For cases when we have an index. It allows to skip columns. Only columns specified in the index will be read.
NativeBlockInputStream(ReadBuffer & istr_, UInt64 server_revision_,
@ -91,6 +91,8 @@ private:
IndexForNativeFormat::Blocks::const_iterator index_block_end;
IndexOfBlockForNativeFormat::Columns::const_iterator index_column_it;
bool convert_types_to_low_cardinality = false;
/// If an index is specified, then `istr` must be CompressedReadBufferFromFile. Unused otherwise.
CompressedReadBufferFromFile * istr_concrete = nullptr;

View File

@ -21,10 +21,10 @@ namespace ErrorCodes
NativeBlockOutputStream::NativeBlockOutputStream(
WriteBuffer & ostr_, UInt64 client_revision_, const Block & header_, bool remove_low_cardinality_,
WriteBuffer * index_ostr_, size_t initial_size_of_file_)
: ostr(ostr_), client_revision(client_revision_), header(header_),
index_ostr(index_ostr_), initial_size_of_file(initial_size_of_file_), remove_low_cardinality(remove_low_cardinality_)
{
if (index_ostr)
{
@ -104,7 +104,7 @@ void NativeBlockOutputStream::write(const Block & block)
ColumnWithTypeAndName column = block.safeGetByPosition(i);
/// Send data to old clients without low cardinality type.
if (remove_low_cardinality || (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE))
{
column.column = recursiveRemoveLowCardinality(column.column);
column.type = recursiveRemoveLowCardinality(column.type);

View File

@ -23,7 +23,7 @@ public:
/** If non-zero client_revision is specified, additional block information can be written.
*/
NativeBlockOutputStream(
WriteBuffer & ostr_, UInt64 client_revision_, const Block & header_, bool remove_low_cardinality_ = false,
WriteBuffer * index_ostr_ = nullptr, size_t initial_size_of_file_ = 0);
Block getHeader() const override { return header; }
@ -42,6 +42,8 @@ private:
size_t initial_size_of_file; /// The initial size of the data file, if `append` done. Used for the index.
/// If you need to write index, then `ostr` must be a CompressedWriteBuffer.
CompressedWriteBuffer * ostr_concrete = nullptr;
bool remove_low_cardinality;
};
}

View File

@ -19,6 +19,7 @@ void registerDataTypeInterval(DataTypeFactory & factory)
factory.registerSimpleDataType("IntervalDay", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Day)); });
factory.registerSimpleDataType("IntervalWeek", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Week)); });
factory.registerSimpleDataType("IntervalMonth", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Month)); });
factory.registerSimpleDataType("IntervalQuarter", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Quarter)); });
factory.registerSimpleDataType("IntervalYear", [] { return DataTypePtr(std::make_shared<DataTypeInterval>(DataTypeInterval::Year)); });
}

View File

@ -25,6 +25,7 @@ public:
Day,
Week,
Month,
Quarter,
Year
};
@ -46,6 +47,7 @@ public:
case Day: return "Day";
case Week: return "Week";
case Month: return "Month";
case Quarter: return "Quarter";
case Year: return "Year";
default: __builtin_unreachable();
}

View File

@ -113,6 +113,21 @@ struct AddMonthsImpl
} }
}; };
struct AddQuartersImpl
{
static constexpr auto name = "addQuarters";
static inline UInt32 execute(UInt32 t, Int64 delta, const DateLUTImpl & time_zone)
{
return time_zone.addQuarters(t, delta);
}
static inline UInt16 execute(UInt16 d, Int64 delta, const DateLUTImpl & time_zone)
{
return time_zone.addQuarters(DayNum(d), delta);
}
};
struct AddYearsImpl struct AddYearsImpl
{ {
static constexpr auto name = "addYears"; static constexpr auto name = "addYears";
@ -149,6 +164,7 @@ struct SubtractHoursImpl : SubtractIntervalImpl<AddHoursImpl> { static constexpr
struct SubtractDaysImpl : SubtractIntervalImpl<AddDaysImpl> { static constexpr auto name = "subtractDays"; }; struct SubtractDaysImpl : SubtractIntervalImpl<AddDaysImpl> { static constexpr auto name = "subtractDays"; };
struct SubtractWeeksImpl : SubtractIntervalImpl<AddWeeksImpl> { static constexpr auto name = "subtractWeeks"; }; struct SubtractWeeksImpl : SubtractIntervalImpl<AddWeeksImpl> { static constexpr auto name = "subtractWeeks"; };
struct SubtractMonthsImpl : SubtractIntervalImpl<AddMonthsImpl> { static constexpr auto name = "subtractMonths"; }; struct SubtractMonthsImpl : SubtractIntervalImpl<AddMonthsImpl> { static constexpr auto name = "subtractMonths"; };
struct SubtractQuartersImpl : SubtractIntervalImpl<AddQuartersImpl> { static constexpr auto name = "subtractQuarters"; };
struct SubtractYearsImpl : SubtractIntervalImpl<AddYearsImpl> { static constexpr auto name = "subtractYears"; }; struct SubtractYearsImpl : SubtractIntervalImpl<AddYearsImpl> { static constexpr auto name = "subtractYears"; };
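The quarter arithmetic above follows the existing pattern: each `Add*Impl` delegates to `DateLUTImpl`, and the matching `Subtract*Impl` reuses it with a negated delta. A minimal standalone sketch of that pattern (not the committed code; the quarter length is deliberately approximated, since the real implementation uses the time-zone aware `addQuarters`):

```cpp
#include <cstdint>

// Hypothetical stand-in for AddQuartersImpl: approximates a quarter as three
// 30-day months instead of consulting DateLUTImpl (illustration only).
struct AddQuartersSketch
{
    static uint32_t execute(uint32_t t, int64_t delta)
    {
        return static_cast<uint32_t>(t + delta * 3 * 30 * 86400);
    }
};

// Subtraction reuses the add transform with a negated delta, mirroring
// SubtractIntervalImpl<AddQuartersImpl> above.
template <typename AddTransform>
struct SubtractIntervalSketch
{
    static uint32_t execute(uint32_t t, int64_t delta)
    {
        return AddTransform::execute(t, -delta);
    }
};

using SubtractQuartersSketch = SubtractIntervalSketch<AddQuartersSketch>;
```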


@ -89,6 +89,7 @@ void registerFunctionsConversion(FunctionFactory & factory)
factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalDay, PositiveMonotonicity>>(); factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalDay, PositiveMonotonicity>>();
factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalWeek, PositiveMonotonicity>>(); factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalWeek, PositiveMonotonicity>>();
factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalMonth, PositiveMonotonicity>>(); factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalMonth, PositiveMonotonicity>>();
factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalQuarter, PositiveMonotonicity>>();
factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalYear, PositiveMonotonicity>>(); factory.registerFunction<FunctionConvert<DataTypeInterval, NameToIntervalYear, PositiveMonotonicity>>();
} }


@ -738,6 +738,7 @@ DEFINE_NAME_TO_INTERVAL(Hour)
DEFINE_NAME_TO_INTERVAL(Day) DEFINE_NAME_TO_INTERVAL(Day)
DEFINE_NAME_TO_INTERVAL(Week) DEFINE_NAME_TO_INTERVAL(Week)
DEFINE_NAME_TO_INTERVAL(Month) DEFINE_NAME_TO_INTERVAL(Month)
DEFINE_NAME_TO_INTERVAL(Quarter)
DEFINE_NAME_TO_INTERVAL(Year) DEFINE_NAME_TO_INTERVAL(Year)
#undef DEFINE_NAME_TO_INTERVAL #undef DEFINE_NAME_TO_INTERVAL
@ -1138,6 +1139,9 @@ struct ToIntMonotonicity
static IFunction::Monotonicity get(const IDataType & type, const Field & left, const Field & right) static IFunction::Monotonicity get(const IDataType & type, const Field & left, const Field & right)
{ {
if (!type.isValueRepresentedByNumber())
return {};
size_t size_of_type = type.getSizeOfValueInMemory(); size_t size_of_type = type.getSizeOfValueInMemory();
/// If type is expanding /// If type is expanding
@ -1153,14 +1157,10 @@ struct ToIntMonotonicity
} }
/// If type is same, too. (Enum has separate case, because it is different data type) /// If type is same, too. (Enum has separate case, because it is different data type)
if (checkAndGetDataType<DataTypeNumber<T>>(&type) || if (checkAndGetDataType<DataTypeNumberBase<T>>(&type) ||
checkAndGetDataType<DataTypeEnum<T>>(&type)) checkAndGetDataType<DataTypeEnum<T>>(&type))
return { true, true, true }; return { true, true, true };
/// In other cases, if range is unbounded, we don't know, whether function is monotonic or not.
if (left.isNull() || right.isNull())
return {};
/// If converting from float, for monotonicity, arguments must fit in range of result type. /// If converting from float, for monotonicity, arguments must fit in range of result type.
if (WhichDataType(type).isFloat()) if (WhichDataType(type).isFloat())
{ {


@ -1080,7 +1080,7 @@ void registerFunctionsStringSearch(FunctionFactory & factory)
factory.registerFunction<FunctionReplaceAll>(); factory.registerFunction<FunctionReplaceAll>();
factory.registerFunction<FunctionReplaceRegexpOne>(); factory.registerFunction<FunctionReplaceRegexpOne>();
factory.registerFunction<FunctionReplaceRegexpAll>(); factory.registerFunction<FunctionReplaceRegexpAll>();
factory.registerFunction<FunctionPosition>(); factory.registerFunction<FunctionPosition>(FunctionFactory::CaseInsensitive);
factory.registerFunction<FunctionPositionUTF8>(); factory.registerFunction<FunctionPositionUTF8>();
factory.registerFunction<FunctionPositionCaseInsensitive>(); factory.registerFunction<FunctionPositionCaseInsensitive>();
factory.registerFunction<FunctionPositionCaseInsensitiveUTF8>(); factory.registerFunction<FunctionPositionCaseInsensitiveUTF8>();


@ -48,7 +48,7 @@ template <> struct FunctionUnaryArithmeticMonotonicity<NameAbs>
void registerFunctionAbs(FunctionFactory & factory) void registerFunctionAbs(FunctionFactory & factory)
{ {
factory.registerFunction<FunctionAbs>(); factory.registerFunction<FunctionAbs>(FunctionFactory::CaseInsensitive);
} }
} }


@ -0,0 +1,18 @@
#include <Functions/IFunction.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionDateOrDateTimeAddInterval.h>
namespace DB
{
using FunctionAddQuarters = FunctionDateOrDateTimeAddInterval<AddQuartersImpl>;
void registerFunctionAddQuarters(FunctionFactory & factory)
{
factory.registerFunction<FunctionAddQuarters>();
}
}


@ -9,7 +9,7 @@ using FunctionRand = FunctionRandom<UInt32, NameRand>;
void registerFunctionRand(FunctionFactory & factory) void registerFunctionRand(FunctionFactory & factory)
{ {
factory.registerFunction<FunctionRand>(); factory.registerFunction<FunctionRand>(FunctionFactory::CaseInsensitive);
} }
} }


@ -0,0 +1,114 @@
#include <Columns/ColumnString.h>
#include <DataTypes/DataTypeString.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionHelpers.h>
#include <common/find_symbols.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
class FunctionRegexpQuoteMeta : public IFunction
{
public:
static constexpr auto name = "regexpQuoteMeta";
static FunctionPtr create(const Context &)
{
return std::make_shared<FunctionRegexpQuoteMeta>();
}
String getName() const override
{
return name;
}
size_t getNumberOfArguments() const override
{
return 1;
}
bool useDefaultImplementationForConstants() const override
{
return true;
}
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (!WhichDataType(arguments[0].type).isString())
throw Exception(
"Illegal type " + arguments[0].type->getName() + " of 1 argument of function " + getName() + ". Must be String.",
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
return std::make_shared<DataTypeString>();
}
void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t input_rows_count) override
{
const ColumnPtr & column_string = block.getByPosition(arguments[0]).column;
const ColumnString * input = checkAndGetColumn<ColumnString>(column_string.get());
if (!input)
throw Exception(
"Illegal column " + block.getByPosition(arguments[0]).column->getName() + " of first argument of function " + getName(),
ErrorCodes::ILLEGAL_COLUMN);
auto dst_column = ColumnString::create();
auto & dst_data = dst_column->getChars();
auto & dst_offsets = dst_column->getOffsets();
dst_offsets.resize(input_rows_count);
const ColumnString::Offsets & src_offsets = input->getOffsets();
auto src_begin = reinterpret_cast<const char *>(input->getChars().data());
auto src_pos = src_begin;
for (size_t row_idx = 0; row_idx < input_rows_count; ++row_idx)
{
/// NOTE This implementation slightly differs from re2::RE2::QuoteMeta.
/// It escapes zero byte as \0 instead of \x00
/// and it escapes only required characters.
/// This is Ok. Look at comments in re2.cc
const char * src_end = src_begin + src_offsets[row_idx] - 1;
while (true)
{
const char * next_src_pos = find_first_symbols<'\0', '\\', '|', '(', ')', '^', '$', '.', '[', '?', '*', '+', '{', ':', '-'>(src_pos, src_end);
size_t bytes_to_copy = next_src_pos - src_pos;
size_t old_dst_size = dst_data.size();
dst_data.resize(old_dst_size + bytes_to_copy);
memcpySmallAllowReadWriteOverflow15(dst_data.data() + old_dst_size, src_pos, bytes_to_copy);
src_pos = next_src_pos + 1;
if (next_src_pos == src_end)
{
dst_data.emplace_back('\0');
break;
}
dst_data.emplace_back('\\');
dst_data.emplace_back(*next_src_pos);
}
dst_offsets[row_idx] = dst_data.size();
}
block.getByPosition(result).column = std::move(dst_column);
}
};
void registerFunctionRegexpQuoteMeta(FunctionFactory & factory)
{
factory.registerFunction<FunctionRegexpQuoteMeta>();
}
}
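A standalone sketch of the escaping rule implemented by `FunctionRegexpQuoteMeta` above, using `std::string` instead of ClickHouse column types; the escaped character set mirrors the `find_first_symbols` call (illustration only):

```cpp
#include <string>

// Prefix every regexp metacharacter (and the NUL byte) with a backslash.
std::string regexpQuoteMetaSketch(const std::string & s)
{
    static const std::string special = "\\|()^$.[?*+{:-";
    std::string out;
    out.reserve(s.size() * 2);
    for (char c : s)
    {
        if (c == '\0' || special.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}
```

For example, `regexpQuoteMetaSketch("a.b*c")` yields `a\.b\*c`.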


@ -47,6 +47,7 @@ void registerFunctionAddHours(FunctionFactory &);
void registerFunctionAddDays(FunctionFactory &); void registerFunctionAddDays(FunctionFactory &);
void registerFunctionAddWeeks(FunctionFactory &); void registerFunctionAddWeeks(FunctionFactory &);
void registerFunctionAddMonths(FunctionFactory &); void registerFunctionAddMonths(FunctionFactory &);
void registerFunctionAddQuarters(FunctionFactory &);
void registerFunctionAddYears(FunctionFactory &); void registerFunctionAddYears(FunctionFactory &);
void registerFunctionSubtractSeconds(FunctionFactory &); void registerFunctionSubtractSeconds(FunctionFactory &);
void registerFunctionSubtractMinutes(FunctionFactory &); void registerFunctionSubtractMinutes(FunctionFactory &);
@ -54,6 +55,7 @@ void registerFunctionSubtractHours(FunctionFactory &);
void registerFunctionSubtractDays(FunctionFactory &); void registerFunctionSubtractDays(FunctionFactory &);
void registerFunctionSubtractWeeks(FunctionFactory &); void registerFunctionSubtractWeeks(FunctionFactory &);
void registerFunctionSubtractMonths(FunctionFactory &); void registerFunctionSubtractMonths(FunctionFactory &);
void registerFunctionSubtractQuarters(FunctionFactory &);
void registerFunctionSubtractYears(FunctionFactory &); void registerFunctionSubtractYears(FunctionFactory &);
void registerFunctionDateDiff(FunctionFactory &); void registerFunctionDateDiff(FunctionFactory &);
void registerFunctionToTimeZone(FunctionFactory &); void registerFunctionToTimeZone(FunctionFactory &);
@ -106,6 +108,7 @@ void registerFunctionsDateTime(FunctionFactory & factory)
registerFunctionAddDays(factory); registerFunctionAddDays(factory);
registerFunctionAddWeeks(factory); registerFunctionAddWeeks(factory);
registerFunctionAddMonths(factory); registerFunctionAddMonths(factory);
registerFunctionAddQuarters(factory);
registerFunctionAddYears(factory); registerFunctionAddYears(factory);
registerFunctionSubtractSeconds(factory); registerFunctionSubtractSeconds(factory);
registerFunctionSubtractMinutes(factory); registerFunctionSubtractMinutes(factory);
@ -113,6 +116,7 @@ void registerFunctionsDateTime(FunctionFactory & factory)
registerFunctionSubtractDays(factory); registerFunctionSubtractDays(factory);
registerFunctionSubtractWeeks(factory); registerFunctionSubtractWeeks(factory);
registerFunctionSubtractMonths(factory); registerFunctionSubtractMonths(factory);
registerFunctionSubtractQuarters(factory);
registerFunctionSubtractYears(factory); registerFunctionSubtractYears(factory);
registerFunctionDateDiff(factory); registerFunctionDateDiff(factory);
registerFunctionToTimeZone(factory); registerFunctionToTimeZone(factory);


@ -21,6 +21,8 @@ void registerFunctionSubstringUTF8(FunctionFactory &);
void registerFunctionAppendTrailingCharIfAbsent(FunctionFactory &); void registerFunctionAppendTrailingCharIfAbsent(FunctionFactory &);
void registerFunctionStartsWith(FunctionFactory &); void registerFunctionStartsWith(FunctionFactory &);
void registerFunctionEndsWith(FunctionFactory &); void registerFunctionEndsWith(FunctionFactory &);
void registerFunctionTrim(FunctionFactory &);
void registerFunctionRegexpQuoteMeta(FunctionFactory &);
#if USE_BASE64 #if USE_BASE64
void registerFunctionBase64Encode(FunctionFactory &); void registerFunctionBase64Encode(FunctionFactory &);
@ -46,6 +48,8 @@ void registerFunctionsString(FunctionFactory & factory)
registerFunctionAppendTrailingCharIfAbsent(factory); registerFunctionAppendTrailingCharIfAbsent(factory);
registerFunctionStartsWith(factory); registerFunctionStartsWith(factory);
registerFunctionEndsWith(factory); registerFunctionEndsWith(factory);
registerFunctionTrim(factory);
registerFunctionRegexpQuoteMeta(factory);
#if USE_BASE64 #if USE_BASE64
registerFunctionBase64Encode(factory); registerFunctionBase64Encode(factory);
registerFunctionBase64Decode(factory); registerFunctionBase64Decode(factory);


@ -147,7 +147,7 @@ private:
void registerFunctionReverse(FunctionFactory & factory) void registerFunctionReverse(FunctionFactory & factory)
{ {
factory.registerFunction<FunctionBuilderReverse>(); factory.registerFunction<FunctionBuilderReverse>(FunctionFactory::CaseInsensitive);
} }
} }


@ -0,0 +1,18 @@
#include <Functions/IFunction.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionDateOrDateTimeAddInterval.h>
namespace DB
{
using FunctionSubtractQuarters = FunctionDateOrDateTimeAddInterval<SubtractQuartersImpl>;
void registerFunctionSubtractQuarters(FunctionFactory & factory)
{
factory.registerFunction<FunctionSubtractQuarters>();
}
}

dbms/src/Functions/trim.cpp

@ -0,0 +1,142 @@
#include <Columns/ColumnString.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionStringToString.h>
#if __SSE4_2__
#include <nmmintrin.h>
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
struct TrimModeLeft
{
static constexpr auto name = "trimLeft";
static constexpr bool trim_left = true;
static constexpr bool trim_right = false;
};
struct TrimModeRight
{
static constexpr auto name = "trimRight";
static constexpr bool trim_left = false;
static constexpr bool trim_right = true;
};
struct TrimModeBoth
{
static constexpr auto name = "trimBoth";
static constexpr bool trim_left = true;
static constexpr bool trim_right = true;
};
template <typename mode>
class FunctionTrimImpl
{
public:
static void vector(
const ColumnString::Chars & data,
const ColumnString::Offsets & offsets,
ColumnString::Chars & res_data,
ColumnString::Offsets & res_offsets)
{
size_t size = offsets.size();
res_offsets.resize(size);
res_data.reserve(data.size());
size_t prev_offset = 0;
size_t res_offset = 0;
const UInt8 * start;
size_t length;
for (size_t i = 0; i < size; ++i)
{
execute(reinterpret_cast<const UInt8 *>(&data[prev_offset]), offsets[i] - prev_offset - 1, start, length);
res_data.resize(res_data.size() + length + 1);
memcpy(&res_data[res_offset], start, length);
res_offset += length + 1;
res_data[res_offset - 1] = '\0';
res_offsets[i] = res_offset;
prev_offset = offsets[i];
}
}
static void vector_fixed(const ColumnString::Chars &, size_t, ColumnString::Chars &)
{
throw Exception("Functions trimLeft, trimRight and trimBoth cannot work with FixedString argument", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
private:
static void execute(const UInt8 * data, size_t size, const UInt8 *& res_data, size_t & res_size)
{
size_t chars_to_trim_left = 0;
size_t chars_to_trim_right = 0;
char whitespace = ' ';
#if __SSE4_2__
const auto bytes_sse = sizeof(__m128i);
const auto size_sse = size - (size % bytes_sse);
const auto whitespace_mask = _mm_set1_epi8(whitespace);
constexpr auto base_sse_mode = _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH | _SIDD_NEGATIVE_POLARITY;
auto mask = bytes_sse;
#endif
if constexpr (mode::trim_left)
{
#if __SSE4_2__
/// skip whitespace from left in blocks of up to 16 characters
constexpr auto left_sse_mode = base_sse_mode | _SIDD_LEAST_SIGNIFICANT;
while (mask == bytes_sse && chars_to_trim_left < size_sse)
{
const auto chars = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + chars_to_trim_left));
mask = _mm_cmpistri(whitespace_mask, chars, left_sse_mode);
chars_to_trim_left += mask;
}
#endif
/// skip remaining whitespace from left, character by character
while (chars_to_trim_left < size && data[chars_to_trim_left] == whitespace)
++chars_to_trim_left;
}
if constexpr (mode::trim_right)
{
constexpr auto right_sse_mode = base_sse_mode | _SIDD_MOST_SIGNIFICANT;
const auto trim_right_size = size - chars_to_trim_left;
#if __SSE4_2__
/// try to skip whitespace from right in blocks of up to 16 characters
const auto trim_right_size_sse = trim_right_size - (trim_right_size % bytes_sse);
while (mask == bytes_sse && chars_to_trim_right < trim_right_size_sse)
{
const auto chars = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + size - chars_to_trim_right - bytes_sse));
mask = _mm_cmpistri(whitespace_mask, chars, right_sse_mode);
chars_to_trim_right += mask;
}
#endif
/// skip remaining whitespace from right, character by character
while (chars_to_trim_right < trim_right_size && data[size - chars_to_trim_right - 1] == whitespace)
++chars_to_trim_right;
}
res_data = data + chars_to_trim_left;
res_size = size - chars_to_trim_left - chars_to_trim_right;
}
};
using FunctionTrimLeft = FunctionStringToString<FunctionTrimImpl<TrimModeLeft>, TrimModeLeft>;
using FunctionTrimRight = FunctionStringToString<FunctionTrimImpl<TrimModeRight>, TrimModeRight>;
using FunctionTrimBoth = FunctionStringToString<FunctionTrimImpl<TrimModeBoth>, TrimModeBoth>;
void registerFunctionTrim(FunctionFactory & factory)
{
factory.registerFunction<FunctionTrimLeft>();
factory.registerFunction<FunctionTrimRight>();
factory.registerFunction<FunctionTrimBoth>();
}
}
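Stripped of the SSE4.2 fast path, `FunctionTrimImpl::execute` reduces to two scalar scans over the input; a simplified sketch (space-only trimming, as in the committed code; SIMD omitted):

```cpp
#include <cstddef>

// Count leading and trailing spaces and return a view into the untouched middle.
template <bool trim_left, bool trim_right>
void trimSketch(const char * data, size_t size, const char *& res_data, size_t & res_size)
{
    size_t left = 0;
    size_t right = 0;

    if constexpr (trim_left)
        while (left < size && data[left] == ' ')
            ++left;

    if constexpr (trim_right)
        while (right < size - left && data[size - right - 1] == ' ')
            ++right;

    res_data = data + left;
    res_size = size - left - right;
}
```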


@ -76,7 +76,7 @@ public:
{ {
nextIfAtEnd(); nextIfAtEnd();
size_t bytes_to_copy = std::min(static_cast<size_t>(working_buffer.end() - pos), n - bytes_copied); size_t bytes_to_copy = std::min(static_cast<size_t>(working_buffer.end() - pos), n - bytes_copied);
std::memcpy(pos, from + bytes_copied, bytes_to_copy); memcpy(pos, from + bytes_copied, bytes_to_copy);
pos += bytes_to_copy; pos += bytes_to_copy;
bytes_copied += bytes_to_copy; bytes_copied += bytes_to_copy;
} }


@ -615,7 +615,6 @@ void NO_INLINE Aggregator::executeImplCase(
AggregateDataPtr overflow_row) const AggregateDataPtr overflow_row) const
{ {
/// NOTE When editing this code, also pay attention to SpecializedAggregator.h. /// NOTE When editing this code, also pay attention to SpecializedAggregator.h.
/// TODO for low cardinality optimization.
/// For all rows. /// For all rows.
typename Method::Key prev_key; typename Method::Key prev_key;


@ -397,14 +397,24 @@ void Cluster::initMisc()
std::unique_ptr<Cluster> Cluster::getClusterWithSingleShard(size_t index) const std::unique_ptr<Cluster> Cluster::getClusterWithSingleShard(size_t index) const
{ {
return std::unique_ptr<Cluster>{ new Cluster(*this, index) }; return std::unique_ptr<Cluster>{ new Cluster(*this, {index}) };
} }
Cluster::Cluster(const Cluster & from, size_t index) std::unique_ptr<Cluster> Cluster::getClusterWithMultipleShards(const std::vector<size_t> & indices) const
: shards_info{from.shards_info[index]}
{ {
if (!from.addresses_with_failover.empty()) return std::unique_ptr<Cluster>{ new Cluster(*this, indices) };
addresses_with_failover.emplace_back(from.addresses_with_failover[index]); }
Cluster::Cluster(const Cluster & from, const std::vector<size_t> & indices)
: shards_info{}
{
for (size_t index : indices)
{
shards_info.emplace_back(from.shards_info.at(index));
if (!from.addresses_with_failover.empty())
addresses_with_failover.emplace_back(from.addresses_with_failover.at(index));
}
initMisc(); initMisc();
} }
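The new constructor simply copies the selected shard descriptions by index, with `at()` providing bounds checking. A trivial standalone sketch of the selection step (types here are hypothetical placeholders):

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct ShardInfoSketch { std::string description; };

// Copy only the shards whose indices were requested; at() throws on out-of-range.
std::vector<ShardInfoSketch> selectShards(const std::vector<ShardInfoSketch> & all,
                                          const std::vector<size_t> & indices)
{
    std::vector<ShardInfoSketch> selected;
    for (size_t index : indices)
        selected.push_back(all.at(index));
    return selected;
}
```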


@ -143,6 +143,9 @@ public:
/// Get a subcluster consisting of one shard - index by count (from 0) of the shard of this cluster. /// Get a subcluster consisting of one shard - index by count (from 0) of the shard of this cluster.
std::unique_ptr<Cluster> getClusterWithSingleShard(size_t index) const; std::unique_ptr<Cluster> getClusterWithSingleShard(size_t index) const;
/// Get a subcluster consisting of one or multiple shards - indexes by count (from 0) of the shard of this cluster.
std::unique_ptr<Cluster> getClusterWithMultipleShards(const std::vector<size_t> & indices) const;
private: private:
using SlotToShard = std::vector<UInt64>; using SlotToShard = std::vector<UInt64>;
SlotToShard slot_to_shard; SlotToShard slot_to_shard;
@ -153,8 +156,8 @@ public:
private: private:
void initMisc(); void initMisc();
/// For getClusterWithSingleShard implementation. /// For getClusterWithMultipleShards implementation.
Cluster(const Cluster & from, size_t index); Cluster(const Cluster & from, const std::vector<size_t> & indices);
String hash_of_addresses; String hash_of_addresses;
/// Description of the cluster shards. /// Description of the cluster shards.


@ -204,7 +204,6 @@ static bool isSupportedAlterType(int type)
ASTAlterCommand::ADD_COLUMN, ASTAlterCommand::ADD_COLUMN,
ASTAlterCommand::DROP_COLUMN, ASTAlterCommand::DROP_COLUMN,
ASTAlterCommand::MODIFY_COLUMN, ASTAlterCommand::MODIFY_COLUMN,
ASTAlterCommand::MODIFY_PRIMARY_KEY,
ASTAlterCommand::DROP_PARTITION, ASTAlterCommand::DROP_PARTITION,
ASTAlterCommand::DELETE, ASTAlterCommand::DELETE,
ASTAlterCommand::UPDATE, ASTAlterCommand::UPDATE,


@ -89,6 +89,7 @@ struct Settings
M(SettingBool, skip_unavailable_shards, false, "Silently skip unavailable shards.") \ M(SettingBool, skip_unavailable_shards, false, "Silently skip unavailable shards.") \
\ \
M(SettingBool, distributed_group_by_no_merge, false, "Do not merge aggregation states from different servers for distributed query processing - in case it is for certain that there are different keys on different shards.") \ M(SettingBool, distributed_group_by_no_merge, false, "Do not merge aggregation states from different servers for distributed query processing - in case it is for certain that there are different keys on different shards.") \
M(SettingBool, optimize_skip_unused_shards, false, "Assumes that data is distributed by sharding_key. Optimization to skip unused shards if SELECT query filters by sharding_key.") \
\ \
M(SettingUInt64, merge_tree_min_rows_for_concurrent_read, (20 * 8192), "If at least as many lines are read from one file, the reading can be parallelized.") \ M(SettingUInt64, merge_tree_min_rows_for_concurrent_read, (20 * 8192), "If at least as many lines are read from one file, the reading can be parallelized.") \
M(SettingUInt64, merge_tree_min_rows_for_seek, 0, "You can skip reading more than that number of rows at the price of one seek per file.") \ M(SettingUInt64, merge_tree_min_rows_for_seek, 0, "You can skip reading more than that number of rows at the price of one seek per file.") \
@ -294,7 +295,7 @@ struct Settings
M(SettingBool, parallel_view_processing, false, "Enables pushing to attached views concurrently instead of sequentially.") \ M(SettingBool, parallel_view_processing, false, "Enables pushing to attached views concurrently instead of sequentially.") \
M(SettingBool, enable_debug_queries, false, "Enables debug queries such as AST.") \ M(SettingBool, enable_debug_queries, false, "Enables debug queries such as AST.") \
M(SettingBool, enable_unaligned_array_join, false, "Allow ARRAY JOIN with multiple arrays that have different sizes. When this settings is enabled, arrays will be resized to the longest one.") \ M(SettingBool, enable_unaligned_array_join, false, "Allow ARRAY JOIN with multiple arrays that have different sizes. When this settings is enabled, arrays will be resized to the longest one.") \
M(SettingBool, low_cardinality_allow_in_native_format, true, "Use LowCardinality type in Native format. Otherwise, convert LowCardinality columns to ordinary for select query, and convert ordinary columns to required LowCardinality for insert query.") \
#define DECLARE(TYPE, NAME, DEFAULT, DESCRIPTION) \ #define DECLARE(TYPE, NAME, DEFAULT, DESCRIPTION) \
TYPE NAME {DEFAULT}; TYPE NAME {DEFAULT};


@ -108,7 +108,10 @@ void NO_INLINE Aggregator::executeSpecialized(
AggregateDataPtr overflow_row) const AggregateDataPtr overflow_row) const
{ {
typename Method::State state; typename Method::State state;
state.init(key_columns); if constexpr (Method::low_cardinality_optimization)
state.init(key_columns, aggregation_state_cache);
else
state.init(key_columns);
if (!no_more_keys) if (!no_more_keys)
executeSpecializedCase<false, Method, AggregateFunctionsList>( executeSpecializedCase<false, Method, AggregateFunctionsList>(
@ -133,15 +136,19 @@ void NO_INLINE Aggregator::executeSpecializedCase(
AggregateDataPtr overflow_row) const AggregateDataPtr overflow_row) const
{ {
/// For all rows. /// For all rows.
typename Method::iterator it;
typename Method::Key prev_key; typename Method::Key prev_key;
AggregateDataPtr value = nullptr;
for (size_t i = 0; i < rows; ++i) for (size_t i = 0; i < rows; ++i)
{ {
bool inserted; /// Inserted a new key, or was this key already? bool inserted = false; /// Inserted a new key, or was this key already?
bool overflow = false; /// New key did not fit in the hash table because of no_more_keys.
/// Get the key to insert into the hash table. /// Get the key to insert into the hash table.
typename Method::Key key = state.getKey(key_columns, params.keys_size, i, key_sizes, keys, *aggregates_pool); typename Method::Key key;
if constexpr (!Method::low_cardinality_optimization)
key = state.getKey(key_columns, params.keys_size, i, key_sizes, keys, *aggregates_pool);
AggregateDataPtr * aggregate_data = nullptr;
typename Method::iterator it; /// Is not used if Method::low_cardinality_optimization
if (!no_more_keys) /// Insert. if (!no_more_keys) /// Insert.
{ {
@ -150,8 +157,6 @@ void NO_INLINE Aggregator::executeSpecializedCase(
{ {
if (i != 0 && key == prev_key) if (i != 0 && key == prev_key)
{ {
AggregateDataPtr value = Method::getAggregateData(it->second);
/// Add values into aggregate functions. /// Add values into aggregate functions.
AggregateFunctionsList::forEach(AggregateFunctionsUpdater( AggregateFunctionsList::forEach(AggregateFunctionsUpdater(
aggregate_functions, offsets_of_aggregate_states, aggregate_columns, value, i, aggregates_pool)); aggregate_functions, offsets_of_aggregate_states, aggregate_columns, value, i, aggregates_pool));
@ -163,19 +168,29 @@ void NO_INLINE Aggregator::executeSpecializedCase(
prev_key = key; prev_key = key;
} }
method.data.emplace(key, it, inserted); if constexpr (Method::low_cardinality_optimization)
aggregate_data = state.emplaceKeyFromRow(method.data, i, inserted, params.keys_size, keys, *aggregates_pool);
else
{
method.data.emplace(key, it, inserted);
aggregate_data = &Method::getAggregateData(it->second);
}
} }
else else
{ {
/// Add only if the key already exists. /// Add only if the key already exists.
inserted = false; if constexpr (Method::low_cardinality_optimization)
it = method.data.find(key); aggregate_data = state.findFromRow(method.data, i);
if (method.data.end() == it) else
overflow = true; {
it = method.data.find(key);
if (method.data.end() != it)
aggregate_data = &Method::getAggregateData(it->second);
}
} }
/// If the key does not fit, and the data does not need to be aggregated in a separate row, then there's nothing to do. /// If the key does not fit, and the data does not need to be aggregated in a separate row, then there's nothing to do.
if (no_more_keys && overflow && !overflow_row) if (!aggregate_data && !overflow_row)
{ {
method.onExistingKey(key, keys, *aggregates_pool); method.onExistingKey(key, keys, *aggregates_pool);
continue; continue;
@ -184,22 +199,25 @@ void NO_INLINE Aggregator::executeSpecializedCase(
/// If a new key is inserted, initialize the states of the aggregate functions, and possibly some stuff related to the key. /// If a new key is inserted, initialize the states of the aggregate functions, and possibly some stuff related to the key.
if (inserted) if (inserted)
{ {
AggregateDataPtr & aggregate_data = Method::getAggregateData(it->second); *aggregate_data = nullptr;
aggregate_data = nullptr;
method.onNewKey(*it, params.keys_size, keys, *aggregates_pool); if constexpr (!Method::low_cardinality_optimization)
method.onNewKey(*it, params.keys_size, keys, *aggregates_pool);
AggregateDataPtr place = aggregates_pool->alignedAlloc(total_size_of_aggregate_states, align_aggregate_states); AggregateDataPtr place = aggregates_pool->alignedAlloc(total_size_of_aggregate_states, align_aggregate_states);
AggregateFunctionsList::forEach(AggregateFunctionsCreator( AggregateFunctionsList::forEach(AggregateFunctionsCreator(
aggregate_functions, offsets_of_aggregate_states, place)); aggregate_functions, offsets_of_aggregate_states, place));
aggregate_data = place; *aggregate_data = place;
if constexpr (Method::low_cardinality_optimization)
state.cacheAggregateData(i, place);
} }
else else
method.onExistingKey(key, keys, *aggregates_pool); method.onExistingKey(key, keys, *aggregates_pool);
AggregateDataPtr value = (!no_more_keys || !overflow) ? Method::getAggregateData(it->second) : overflow_row; value = aggregate_data ? *aggregate_data : overflow_row;
/// Add values into the aggregate functions. /// Add values into the aggregate functions.
AggregateFunctionsList::forEach(AggregateFunctionsUpdater( AggregateFunctionsList::forEach(AggregateFunctionsUpdater(


@ -1,18 +1,20 @@
#include <Core/Block.h> #include <Interpreters/evaluateConstantExpression.h>
#include <Columns/ColumnConst.h> #include <Columns/ColumnConst.h>
#include <Columns/ColumnsNumber.h> #include <Columns/ColumnsNumber.h>
#include <Parsers/ASTIdentifier.h> #include <Core/Block.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ExpressionElementParsers.h>
#include <DataTypes/DataTypesNumber.h> #include <DataTypes/DataTypesNumber.h>
#include <Interpreters/Context.h> #include <Interpreters/Context.h>
#include <Interpreters/SyntaxAnalyzer.h> #include <Interpreters/convertFieldToType.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/ExpressionActions.h> #include <Interpreters/ExpressionActions.h>
#include <Interpreters/evaluateConstantExpression.h> #include <Interpreters/ExpressionAnalyzer.h>
#include <Common/typeid_cast.h> #include <Interpreters/SyntaxAnalyzer.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ExpressionElementParsers.h>
#include <TableFunctions/TableFunctionFactory.h> #include <TableFunctions/TableFunctionFactory.h>
#include <Common/typeid_cast.h>
namespace DB namespace DB
@ -77,4 +79,236 @@ ASTPtr evaluateConstantExpressionOrIdentifierAsLiteral(const ASTPtr & node, cons
return evaluateConstantExpressionAsLiteral(node, context); return evaluateConstantExpressionAsLiteral(node, context);
} }
namespace
{
using Conjunction = ColumnsWithTypeAndName;
using Disjunction = std::vector<Conjunction>;
Disjunction analyzeEquals(const ASTIdentifier * identifier, const ASTLiteral * literal, const ExpressionActionsPtr & expr)
{
if (!identifier || !literal)
{
return {};
}
for (const auto & name_and_type : expr->getRequiredColumnsWithTypes())
{
const auto & name = name_and_type.name;
const auto & type = name_and_type.type;
if (name == identifier->name)
{
ColumnWithTypeAndName column;
// FIXME: what to do if field is not convertable?
column.column = type->createColumnConst(1, convertFieldToType(literal->value, *type));
column.name = name;
column.type = type;
return {{std::move(column)}};
}
}
return {};
}
Disjunction andDNF(const Disjunction & left, const Disjunction & right)
{
if (left.empty())
{
return right;
}
Disjunction result;
for (const auto & conjunct1 : left)
{
for (const auto & conjunct2 : right)
{
Conjunction new_conjunct{conjunct1};
new_conjunct.insert(new_conjunct.end(), conjunct2.begin(), conjunct2.end());
result.emplace_back(new_conjunct);
}
}
return result;
}
Disjunction analyzeFunction(const ASTFunction * fn, const ExpressionActionsPtr & expr)
{
if (!fn)
{
return {};
}
// TODO: enumerate all possible function names!
if (fn->name == "equals")
{
const auto * left = fn->arguments->children.front().get();
const auto * right = fn->arguments->children.back().get();
const auto * identifier = typeid_cast<const ASTIdentifier *>(left) ? typeid_cast<const ASTIdentifier *>(left)
: typeid_cast<const ASTIdentifier *>(right);
const auto * literal = typeid_cast<const ASTLiteral *>(left) ? typeid_cast<const ASTLiteral *>(left)
: typeid_cast<const ASTLiteral *>(right);
return analyzeEquals(identifier, literal, expr);
}
else if (fn->name == "in")
{
const auto * left = fn->arguments->children.front().get();
const auto * right = fn->arguments->children.back().get();
const auto * identifier = typeid_cast<const ASTIdentifier *>(left);
const auto * inner_fn = typeid_cast<const ASTFunction *>(right);
if (!inner_fn)
{
return {};
}
const auto * tuple = typeid_cast<const ASTExpressionList *>(inner_fn->children.front().get());
if (!tuple)
{
return {};
}
Disjunction result;
for (const auto & child : tuple->children)
{
const auto * literal = typeid_cast<const ASTLiteral *>(child.get());
const auto dnf = analyzeEquals(identifier, literal, expr);
if (dnf.empty())
{
return {};
}
result.insert(result.end(), dnf.begin(), dnf.end());
}
return result;
}
else if (fn->name == "or")
{
const auto * args = typeid_cast<const ASTExpressionList *>(fn->children.front().get());
if (!args)
{
return {};
}
Disjunction result;
for (const auto & arg : args->children)
{
const auto dnf = analyzeFunction(typeid_cast<const ASTFunction *>(arg.get()), expr);
if (dnf.empty())
{
return {};
}
result.insert(result.end(), dnf.begin(), dnf.end());
}
return result;
}
else if (fn->name == "and")
{
const auto * args = typeid_cast<const ASTExpressionList *>(fn->children.front().get());
if (!args)
{
return {};
}
Disjunction result;
for (const auto & arg : args->children)
{
const auto dnf = analyzeFunction(typeid_cast<const ASTFunction *>(arg.get()), expr);
if (dnf.empty())
{
continue;
}
result = andDNF(result, dnf);
}
return result;
}
return {};
}
}
std::optional<Blocks> evaluateExpressionOverConstantCondition(const ASTPtr & node, const ExpressionActionsPtr & target_expr)
{
Blocks result;
// TODO: `node` may be always-false literal.
if (const auto fn = typeid_cast<const ASTFunction *>(node.get()))
{
const auto dnf = analyzeFunction(fn, target_expr);
if (dnf.empty())
{
return {};
}
auto hasRequiredColumns = [&target_expr](const Block & block) -> bool
{
for (const auto & name : target_expr->getRequiredColumns())
{
bool hasColumn = false;
for (const auto & column_name : block.getNames())
{
if (column_name == name)
{
hasColumn = true;
break;
}
}
if (!hasColumn)
return false;
}
return true;
};
for (const auto & conjunct : dnf)
{
Block block(conjunct);
// Block should contain all required columns from `target_expr`
if (!hasRequiredColumns(block))
{
return {};
}
target_expr->execute(block);
if (block.rows() == 1)
{
result.push_back(block);
}
else if (block.rows() == 0)
{
// filter out cases like "WHERE a = 1 AND a = 2"
continue;
}
else
{
// FIXME: shouldn't happen
return {};
}
}
}
return {result};
}
} }


@ -1,17 +1,22 @@
#pragma once #pragma once
#include <memory> #include <Core/Block.h>
#include <Core/Field.h> #include <Core/Field.h>
#include <Parsers/IAST.h> #include <Parsers/IAST.h>
#include <Parsers/IParser.h> #include <Parsers/IParser.h>
#include <memory>
#include <optional>
namespace DB namespace DB
{ {
class Context; class Context;
class ExpressionActions;
class IDataType; class IDataType;
using ExpressionActionsPtr = std::shared_ptr<ExpressionActions>;
/** Evaluate constant expression and its type. /** Evaluate constant expression and its type.
* Used in rare cases - for elements of set for IN, for data to INSERT. * Used in rare cases - for elements of set for IN, for data to INSERT.
@ -20,17 +25,24 @@ class IDataType;
std::pair<Field, std::shared_ptr<const IDataType>> evaluateConstantExpression(const ASTPtr & node, const Context & context); std::pair<Field, std::shared_ptr<const IDataType>> evaluateConstantExpression(const ASTPtr & node, const Context & context);
/** Evaluate constant expression /** Evaluate constant expression and returns ASTLiteral with its value.
* and returns ASTLiteral with its value.
*/ */
ASTPtr evaluateConstantExpressionAsLiteral(const ASTPtr & node, const Context & context); ASTPtr evaluateConstantExpressionAsLiteral(const ASTPtr & node, const Context & context);
/** Evaluate constant expression /** Evaluate constant expression and returns ASTLiteral with its value.
* and returns ASTLiteral with its value.
* Also, if AST is identifier, then return string literal with its name. * Also, if AST is identifier, then return string literal with its name.
* Useful in places where some name may be specified as identifier, or as result of a constant expression. * Useful in places where some name may be specified as identifier, or as result of a constant expression.
*/ */
ASTPtr evaluateConstantExpressionOrIdentifierAsLiteral(const ASTPtr & node, const Context & context); ASTPtr evaluateConstantExpressionOrIdentifierAsLiteral(const ASTPtr & node, const Context & context);
/** Try to fold condition to countable set of constant values.
* @param condition a condition that we try to fold.
* @param target_expr expression evaluated over a set of constants.
* @return optional blocks each with a single row and a single column for target expression,
* or empty blocks if condition is always false,
* or nothing if condition can't be folded to a set of constants.
*/
std::optional<Blocks> evaluateExpressionOverConstantCondition(const ASTPtr & condition, const ExpressionActionsPtr & target_expr);
} }
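The folding described above works by expanding the condition into disjunctive normal form: `equals` and `in` produce single-column conjunctions, `or` concatenates disjunctions, and `and` takes their cross product (`andDNF` in the implementation). A toy model of that cross product, with conjunctions reduced to lists of strings (illustration only):

```cpp
#include <string>
#include <vector>

using ConjunctionSketch = std::vector<std::string>;
using DisjunctionSketch = std::vector<ConjunctionSketch>;

// AND of two disjunctions is the cross product of their conjunctions.
DisjunctionSketch andDNFSketch(const DisjunctionSketch & left, const DisjunctionSketch & right)
{
    if (left.empty())
        return right;

    DisjunctionSketch result;
    for (const auto & a : left)
        for (const auto & b : right)
        {
            ConjunctionSketch merged = a;
            merged.insert(merged.end(), b.begin(), b.end());
            result.push_back(merged);
        }
    return result;
}

// Example: {{"a=1"}, {"a=2"}} AND {{"b=3"}} -> {{"a=1","b=3"}, {"a=2","b=3"}}.
```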


@ -25,11 +25,6 @@ ASTPtr ASTAlterCommand::clone() const
res->column = column->clone(); res->column = column->clone();
res->children.push_back(res->column); res->children.push_back(res->column);
} }
if (primary_key)
{
res->primary_key = primary_key->clone();
res->children.push_back(res->primary_key);
}
if (order_by) if (order_by)
{ {
res->order_by = order_by->clone(); res->order_by = order_by->clone();
@ -82,11 +77,6 @@ void ASTAlterCommand::formatImpl(
settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY COLUMN " << (settings.hilite ? hilite_none : ""); settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY COLUMN " << (settings.hilite ? hilite_none : "");
col_decl->formatImpl(settings, state, frame); col_decl->formatImpl(settings, state, frame);
} }
else if (type == ASTAlterCommand::MODIFY_PRIMARY_KEY)
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY PRIMARY KEY " << (settings.hilite ? hilite_none : "");
primary_key->formatImpl(settings, state, frame);
}
else if (type == ASTAlterCommand::MODIFY_ORDER_BY) else if (type == ASTAlterCommand::MODIFY_ORDER_BY)
{ {
settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY ORDER BY " << (settings.hilite ? hilite_none : ""); settings.ostr << (settings.hilite ? hilite_keyword : "") << indent_str << "MODIFY ORDER BY " << (settings.hilite ? hilite_none : "");


@ -26,7 +26,6 @@ public:
DROP_COLUMN, DROP_COLUMN,
MODIFY_COLUMN, MODIFY_COLUMN,
COMMENT_COLUMN, COMMENT_COLUMN,
MODIFY_PRIMARY_KEY,
MODIFY_ORDER_BY, MODIFY_ORDER_BY,
DROP_PARTITION, DROP_PARTITION,
@ -55,10 +54,6 @@ public:
*/ */
ASTPtr column; ASTPtr column;
/** For MODIFY PRIMARY KEY
*/
ASTPtr primary_key;
/** For MODIFY ORDER BY /** For MODIFY ORDER BY
*/ */
ASTPtr order_by; ASTPtr order_by;


@ -46,4 +46,98 @@ protected:
} }
}; };
class ParserInterval: public IParserBase
{
public:
enum class IntervalKind
{
Incorrect,
Second,
Minute,
Hour,
Day,
Week,
Month,
Quarter,
Year
};
IntervalKind interval_kind;
ParserInterval() : interval_kind(IntervalKind::Incorrect) {}
const char * getToIntervalKindFunctionName()
{
switch (interval_kind)
{
case ParserInterval::IntervalKind::Second:
return "toIntervalSecond";
case ParserInterval::IntervalKind::Minute:
return "toIntervalMinute";
case ParserInterval::IntervalKind::Hour:
return "toIntervalHour";
case ParserInterval::IntervalKind::Day:
return "toIntervalDay";
case ParserInterval::IntervalKind::Week:
return "toIntervalWeek";
case ParserInterval::IntervalKind::Month:
return "toIntervalMonth";
case ParserInterval::IntervalKind::Quarter:
return "toIntervalQuarter";
case ParserInterval::IntervalKind::Year:
return "toIntervalYear";
default:
return nullptr;
}
}
protected:
const char * getName() const override { return "interval"; }
bool parseImpl(Pos & pos, ASTPtr & /*node*/, Expected & expected) override
{
if (ParserKeyword("SECOND").ignore(pos, expected) || ParserKeyword("SQL_TSI_SECOND").ignore(pos, expected)
|| ParserKeyword("SS").ignore(pos, expected) || ParserKeyword("S").ignore(pos, expected))
interval_kind = IntervalKind::Second;
else if (
ParserKeyword("MINUTE").ignore(pos, expected) || ParserKeyword("SQL_TSI_MINUTE").ignore(pos, expected)
|| ParserKeyword("MI").ignore(pos, expected) || ParserKeyword("N").ignore(pos, expected))
interval_kind = IntervalKind::Minute;
else if (
ParserKeyword("HOUR").ignore(pos, expected) || ParserKeyword("SQL_TSI_HOUR").ignore(pos, expected)
|| ParserKeyword("HH").ignore(pos, expected))
interval_kind = IntervalKind::Hour;
else if (
ParserKeyword("DAY").ignore(pos, expected) || ParserKeyword("SQL_TSI_DAY").ignore(pos, expected)
|| ParserKeyword("DD").ignore(pos, expected) || ParserKeyword("D").ignore(pos, expected))
interval_kind = IntervalKind::Day;
else if (
ParserKeyword("WEEK").ignore(pos, expected) || ParserKeyword("SQL_TSI_WEEK").ignore(pos, expected)
|| ParserKeyword("WK").ignore(pos, expected) || ParserKeyword("WW").ignore(pos, expected))
interval_kind = IntervalKind::Week;
else if (
ParserKeyword("MONTH").ignore(pos, expected) || ParserKeyword("SQL_TSI_MONTH").ignore(pos, expected)
|| ParserKeyword("MM").ignore(pos, expected) || ParserKeyword("M").ignore(pos, expected))
interval_kind = IntervalKind::Month;
else if (
ParserKeyword("QUARTER").ignore(pos, expected) || ParserKeyword("SQL_TSI_QUARTER").ignore(pos, expected)
|| ParserKeyword("QQ").ignore(pos, expected) || ParserKeyword("Q").ignore(pos, expected))
interval_kind = IntervalKind::Quarter;
else if (
ParserKeyword("YEAR").ignore(pos, expected) || ParserKeyword("SQL_TSI_YEAR").ignore(pos, expected)
|| ParserKeyword("YYYY").ignore(pos, expected) || ParserKeyword("YY").ignore(pos, expected))
interval_kind = IntervalKind::Year;
else
interval_kind = IntervalKind::Incorrect;
if (interval_kind == IntervalKind::Incorrect)
{
expected.add(pos, "YEAR, QUARTER, MONTH, WEEK, DAY, HOUR, MINUTE or SECOND");
return false;
}
/// one of ParserKeyword already made ++pos
return true;
}
};
} }
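`ParserInterval` accepts the standard interval keywords plus the ODBC-style `SQL_TSI_*` and abbreviated spellings. A compact standalone sketch of that keyword mapping (string-based; the real parser consumes tokens via `ParserKeyword`):

```cpp
#include <string>
#include <unordered_map>

enum class IntervalKindSketch { Incorrect, Second, Minute, Hour, Day, Week, Month, Quarter, Year };

// Every accepted spelling collapses to one interval kind, as in parseImpl above.
IntervalKindSketch parseIntervalKeyword(const std::string & keyword)
{
    static const std::unordered_map<std::string, IntervalKindSketch> kinds = {
        {"SECOND", IntervalKindSketch::Second}, {"SQL_TSI_SECOND", IntervalKindSketch::Second},
        {"SS", IntervalKindSketch::Second}, {"S", IntervalKindSketch::Second},
        {"MINUTE", IntervalKindSketch::Minute}, {"SQL_TSI_MINUTE", IntervalKindSketch::Minute},
        {"MI", IntervalKindSketch::Minute}, {"N", IntervalKindSketch::Minute},
        {"HOUR", IntervalKindSketch::Hour}, {"SQL_TSI_HOUR", IntervalKindSketch::Hour},
        {"HH", IntervalKindSketch::Hour},
        {"DAY", IntervalKindSketch::Day}, {"SQL_TSI_DAY", IntervalKindSketch::Day},
        {"DD", IntervalKindSketch::Day}, {"D", IntervalKindSketch::Day},
        {"WEEK", IntervalKindSketch::Week}, {"SQL_TSI_WEEK", IntervalKindSketch::Week},
        {"WK", IntervalKindSketch::Week}, {"WW", IntervalKindSketch::Week},
        {"MONTH", IntervalKindSketch::Month}, {"SQL_TSI_MONTH", IntervalKindSketch::Month},
        {"MM", IntervalKindSketch::Month}, {"M", IntervalKindSketch::Month},
        {"QUARTER", IntervalKindSketch::Quarter}, {"SQL_TSI_QUARTER", IntervalKindSketch::Quarter},
        {"QQ", IntervalKindSketch::Quarter}, {"Q", IntervalKindSketch::Quarter},
        {"YEAR", IntervalKindSketch::Year}, {"SQL_TSI_YEAR", IntervalKindSketch::Year},
        {"YYYY", IntervalKindSketch::Year}, {"YY", IntervalKindSketch::Year},
    };
    auto it = kinds.find(keyword);
    return it == kinds.end() ? IntervalKindSketch::Incorrect : it->second;
}
```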


@ -388,6 +388,255 @@ bool ParserSubstringExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & e
return true; return true;
} }
bool ParserTrimExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
/// Handles all possible TRIM/LTRIM/RTRIM call variants
std::string func_name;
bool trim_left = false;
bool trim_right = false;
bool char_override = false;
ASTPtr expr_node;
ASTPtr pattern_node;
ASTPtr to_remove;
if (ParserKeyword("LTRIM").ignore(pos, expected))
{
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
trim_left = true;
}
else if (ParserKeyword("RTRIM").ignore(pos, expected))
{
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
trim_right = true;
}
else if (ParserKeyword("TRIM").ignore(pos, expected))
{
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
if (ParserKeyword("BOTH").ignore(pos, expected))
{
trim_left = true;
trim_right = true;
char_override = true;
}
else if (ParserKeyword("LEADING").ignore(pos, expected))
{
trim_left = true;
char_override = true;
}
else if (ParserKeyword("TRAILING").ignore(pos, expected))
{
trim_right = true;
char_override = true;
}
else
{
trim_left = true;
trim_right = true;
}
if (char_override)
{
if (!ParserExpression().parse(pos, to_remove, expected))
return false;
if (!ParserKeyword("FROM").ignore(pos, expected))
return false;
auto quote_meta_func_node = std::make_shared<ASTFunction>();
auto quote_meta_list_args = std::make_shared<ASTExpressionList>();
quote_meta_list_args->children = {to_remove};
quote_meta_func_node->name = "regexpQuoteMeta";
quote_meta_func_node->arguments = std::move(quote_meta_list_args);
quote_meta_func_node->children.push_back(quote_meta_func_node->arguments);
to_remove = std::move(quote_meta_func_node);
}
}
if (!(trim_left || trim_right))
return false;
if (!ParserExpression().parse(pos, expr_node, expected))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
/// Convert to regexp replace function call
if (char_override)
{
auto pattern_func_node = std::make_shared<ASTFunction>();
auto pattern_list_args = std::make_shared<ASTExpressionList>();
if (trim_left && trim_right)
{
pattern_list_args->children = {
std::make_shared<ASTLiteral>("^["),
to_remove,
std::make_shared<ASTLiteral>("]*|["),
to_remove,
std::make_shared<ASTLiteral>("]*$")
};
func_name = "replaceRegexpAll";
}
else
{
if (trim_left)
{
pattern_list_args->children = {
std::make_shared<ASTLiteral>("^["),
to_remove,
std::make_shared<ASTLiteral>("]*")
};
}
else
{
/// trim_right == false not possible
pattern_list_args->children = {
std::make_shared<ASTLiteral>("["),
to_remove,
std::make_shared<ASTLiteral>("]*$")
};
}
func_name = "replaceRegexpOne";
}
pattern_func_node->name = "concat";
pattern_func_node->arguments = std::move(pattern_list_args);
pattern_func_node->children.push_back(pattern_func_node->arguments);
pattern_node = std::move(pattern_func_node);
}
else
{
if (trim_left && trim_right)
{
func_name = "trimBoth";
}
else
{
if (trim_left)
{
func_name = "trimLeft";
}
else
{
/// trim_right == false not possible
func_name = "trimRight";
}
}
}
auto expr_list_args = std::make_shared<ASTExpressionList>();
if (char_override)
expr_list_args->children = {expr_node, pattern_node, std::make_shared<ASTLiteral>("")};
else
expr_list_args->children = {expr_node};
auto func_node = std::make_shared<ASTFunction>();
func_node->name = func_name;
func_node->arguments = std::move(expr_list_args);
func_node->children.push_back(func_node->arguments);
node = std::move(func_node);
return true;
}
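For the character-override forms, the parser rewrites `TRIM` into a regexp replacement whose character class is built from `regexpQuoteMeta(<chars>)`. A standalone sketch of the resulting behaviour using `std::regex` instead of `replaceRegexpAll` (illustration only; assumes `quoted_chars` is already regexp-escaped):

```cpp
#include <regex>
#include <string>

// TRIM(BOTH <chars> FROM <expr>) becomes, per the parser above,
// replaceRegexpAll(expr, concat("^[", quoted, "]*|[", quoted, "]*$"), "").
std::string trimBothSketch(const std::string & value, const std::string & quoted_chars)
{
    const std::regex pattern("^[" + quoted_chars + "]*|[" + quoted_chars + "]*$");
    return std::regex_replace(value, pattern, "");
}
```

For example, `trimBothSketch("xxhelloxx", "x")` returns `hello`.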
bool ParserLeftExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
/// Rewrites left(expr, length) to SUBSTRING(expr, 1, length)
ASTPtr expr_node;
ASTPtr start_node;
ASTPtr length_node;
if (!ParserKeyword("LEFT").ignore(pos, expected))
return false;
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
if (!ParserExpression().parse(pos, expr_node, expected))
return false;
ParserToken(TokenType::Comma).ignore(pos, expected);
if (!ParserExpression().parse(pos, length_node, expected))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto expr_list_args = std::make_shared<ASTExpressionList>();
start_node = std::make_shared<ASTLiteral>(1);
expr_list_args->children = {expr_node, start_node, length_node};
auto func_node = std::make_shared<ASTFunction>();
func_node->name = "substring";
func_node->arguments = std::move(expr_list_args);
func_node->children.push_back(func_node->arguments);
node = std::move(func_node);
return true;
}
bool ParserRightExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
/// Rewrites RIGHT(expr, length) to substring(expr, -length)
ASTPtr expr_node;
ASTPtr length_node;
if (!ParserKeyword("RIGHT").ignore(pos, expected))
return false;
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
if (!ParserExpression().parse(pos, expr_node, expected))
return false;
ParserToken(TokenType::Comma).ignore(pos, expected);
if (!ParserExpression().parse(pos, length_node, expected))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto start_expr_list_args = std::make_shared<ASTExpressionList>();
start_expr_list_args->children = {length_node};
auto start_node = std::make_shared<ASTFunction>();
start_node->name = "negate";
start_node->arguments = std::move(start_expr_list_args);
start_node->children.push_back(start_node->arguments);
auto expr_list_args = std::make_shared<ASTExpressionList>();
expr_list_args->children = {expr_node, start_node};
auto func_node = std::make_shared<ASTFunction>();
func_node->name = "substring";
func_node->arguments = std::move(expr_list_args);
func_node->children.push_back(func_node->arguments);
node = std::move(func_node);
return true;
}
bool ParserExtractExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected) bool ParserExtractExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{ {
auto begin = pos; auto begin = pos;
@ -402,26 +651,42 @@ bool ParserExtractExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & exp
ASTPtr expr; ASTPtr expr;
const char * function_name = nullptr; const char * function_name = nullptr;
if (ParserKeyword("SECOND").ignore(pos, expected)) ParserInterval interval_parser;
function_name = "toSecond"; if (!interval_parser.ignore(pos, expected))
else if (ParserKeyword("MINUTE").ignore(pos, expected))
function_name = "toMinute";
else if (ParserKeyword("HOUR").ignore(pos, expected))
function_name = "toHour";
else if (ParserKeyword("DAY").ignore(pos, expected))
function_name = "toDayOfMonth";
// TODO: SELECT toRelativeWeekNum(toDate('2017-06-15')) - toRelativeWeekNum(toStartOfYear(toDate('2017-06-15')))
// else if (ParserKeyword("WEEK").ignore(pos, expected))
// function_name = "toRelativeWeekNum";
else if (ParserKeyword("MONTH").ignore(pos, expected))
function_name = "toMonth";
else if (ParserKeyword("YEAR").ignore(pos, expected))
function_name = "toYear";
else
return false; return false;
switch (interval_parser.interval_kind)
{
case ParserInterval::IntervalKind::Second:
function_name = "toSecond";
break;
case ParserInterval::IntervalKind::Minute:
function_name = "toMinute";
break;
case ParserInterval::IntervalKind::Hour:
function_name = "toHour";
break;
case ParserInterval::IntervalKind::Day:
function_name = "toDayOfMonth";
break;
case ParserInterval::IntervalKind::Week:
// TODO: SELECT toRelativeWeekNum(toDate('2017-06-15')) - toRelativeWeekNum(toStartOfYear(toDate('2017-06-15')))
// else if (ParserKeyword("WEEK").ignore(pos, expected))
// function_name = "toRelativeWeekNum";
return false;
case ParserInterval::IntervalKind::Month:
function_name = "toMonth";
break;
case ParserInterval::IntervalKind::Quarter:
function_name = "toQuarter";
break;
case ParserInterval::IntervalKind::Year:
function_name = "toYear";
break;
default:
return false;
}
ParserKeyword s_from("FROM"); ParserKeyword s_from("FROM");
if (!s_from.ignore(pos, expected)) if (!s_from.ignore(pos, expected))
return false; return false;
@ -449,6 +714,168 @@ bool ParserExtractExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & exp
return true; return true;
} }
bool ParserDateAddExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
const char * function_name = nullptr;
ASTPtr timestamp_node;
ASTPtr offset_node;
if (ParserKeyword("DATEADD").ignore(pos, expected) || ParserKeyword("DATE_ADD").ignore(pos, expected)
|| ParserKeyword("TIMESTAMPADD").ignore(pos, expected) || ParserKeyword("TIMESTAMP_ADD").ignore(pos, expected))
function_name = "plus";
else if (ParserKeyword("DATESUB").ignore(pos, expected) || ParserKeyword("DATE_SUB").ignore(pos, expected)
|| ParserKeyword("TIMESTAMPSUB").ignore(pos, expected) || ParserKeyword("TIMESTAMP_SUB").ignore(pos, expected))
function_name = "minus";
else
return false;
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
ParserInterval interval_parser;
if (interval_parser.ignore(pos, expected))
{
/// function(unit, offset, timestamp)
if (pos->type != TokenType::Comma)
return false;
++pos;
if (!ParserExpression().parse(pos, offset_node, expected))
return false;
if (pos->type != TokenType::Comma)
return false;
++pos;
if (!ParserExpression().parse(pos, timestamp_node, expected))
return false;
}
else
{
/// function(timestamp, INTERVAL offset unit)
if (!ParserExpression().parse(pos, timestamp_node, expected))
return false;
if (pos->type != TokenType::Comma)
return false;
++pos;
if (!ParserKeyword("INTERVAL").ignore(pos, expected))
return false;
if (!ParserExpression().parse(pos, offset_node, expected))
return false;
interval_parser.ignore(pos, expected);
}
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
const char * interval_function_name = interval_parser.getToIntervalKindFunctionName();
auto interval_expr_list_args = std::make_shared<ASTExpressionList>();
interval_expr_list_args->children = {offset_node};
auto interval_func_node = std::make_shared<ASTFunction>();
interval_func_node->name = interval_function_name;
interval_func_node->arguments = std::move(interval_expr_list_args);
interval_func_node->children.push_back(interval_func_node->arguments);
auto expr_list_args = std::make_shared<ASTExpressionList>();
expr_list_args->children = {timestamp_node, interval_func_node};
auto func_node = std::make_shared<ASTFunction>();
func_node->name = function_name;
func_node->arguments = std::move(expr_list_args);
func_node->children.push_back(func_node->arguments);
node = std::move(func_node);
return true;
}
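`ParserDateAddExpression` accepts both argument orders and rewrites them to a `plus`/`minus` call over an interval constructor. A textual sketch of the rewrite shape (string concatenation only; the real parser builds `ASTFunction` nodes):

```cpp
#include <string>

// DATE_ADD(QUARTER, 5, now())          -> plus(now(), toIntervalQuarter(5))
// DATE_SUB(now(), INTERVAL 5 QUARTER)  -> minus(now(), toIntervalQuarter(5))
std::string rewriteDateAddSketch(const std::string & op,           // "plus" or "minus"
                                 const std::string & interval_fn,  // e.g. "toIntervalQuarter"
                                 const std::string & timestamp,
                                 const std::string & offset)
{
    return op + "(" + timestamp + ", " + interval_fn + "(" + offset + "))";
}
```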
bool ParserDateDiffExpression::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
const char * interval_name = nullptr;
ASTPtr left_node;
ASTPtr right_node;
if (!(ParserKeyword("DATEDIFF").ignore(pos, expected) || ParserKeyword("DATE_DIFF").ignore(pos, expected)
|| ParserKeyword("TIMESTAMPDIFF").ignore(pos, expected) || ParserKeyword("TIMESTAMP_DIFF").ignore(pos, expected)))
return false;
if (pos->type != TokenType::OpeningRoundBracket)
return false;
++pos;
ParserInterval interval_parser;
if (!interval_parser.ignore(pos, expected))
return false;
switch (interval_parser.interval_kind)
{
case ParserInterval::IntervalKind::Second:
interval_name = "second";
break;
case ParserInterval::IntervalKind::Minute:
interval_name = "minute";
break;
case ParserInterval::IntervalKind::Hour:
interval_name = "hour";
break;
case ParserInterval::IntervalKind::Day:
interval_name = "day";
break;
case ParserInterval::IntervalKind::Week:
interval_name = "week";
break;
case ParserInterval::IntervalKind::Month:
interval_name = "month";
break;
case ParserInterval::IntervalKind::Quarter:
interval_name = "quarter";
break;
case ParserInterval::IntervalKind::Year:
interval_name = "year";
break;
default:
return false;
}
if (pos->type != TokenType::Comma)
return false;
++pos;
if (!ParserExpression().parse(pos, left_node, expected))
return false;
if (pos->type != TokenType::Comma)
return false;
++pos;
if (!ParserExpression().parse(pos, right_node, expected))
return false;
if (pos->type != TokenType::ClosingRoundBracket)
return false;
++pos;
auto expr_list_args = std::make_shared<ASTExpressionList>();
expr_list_args->children = {std::make_shared<ASTLiteral>(interval_name), left_node, right_node};
auto func_node = std::make_shared<ASTFunction>();
func_node->name = "dateDiff";
func_node->arguments = std::move(expr_list_args);
func_node->children.push_back(func_node->arguments);
node = std::move(func_node);
return true;
}
bool ParserNull::parseImpl(Pos & pos, ASTPtr & node, Expected & expected) bool ParserNull::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{ {
@ -750,7 +1177,12 @@ bool ParserExpressionElement::parseImpl(Pos & pos, ASTPtr & node, Expected & exp
|| ParserLiteral().parse(pos, node, expected) || ParserLiteral().parse(pos, node, expected)
|| ParserCastExpression().parse(pos, node, expected) || ParserCastExpression().parse(pos, node, expected)
|| ParserExtractExpression().parse(pos, node, expected) || ParserExtractExpression().parse(pos, node, expected)
|| ParserDateAddExpression().parse(pos, node, expected)
|| ParserDateDiffExpression().parse(pos, node, expected)
|| ParserSubstringExpression().parse(pos, node, expected) || ParserSubstringExpression().parse(pos, node, expected)
|| ParserTrimExpression().parse(pos, node, expected)
|| ParserLeftExpression().parse(pos, node, expected)
|| ParserRightExpression().parse(pos, node, expected)
|| ParserCase().parse(pos, node, expected) || ParserCase().parse(pos, node, expected)
|| ParserFunction().parse(pos, node, expected) || ParserFunction().parse(pos, node, expected)
|| ParserQualifiedAsterisk().parse(pos, node, expected) || ParserQualifiedAsterisk().parse(pos, node, expected)

View File

@ -103,6 +103,27 @@ protected:
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
class ParserTrimExpression : public IParserBase
{
protected:
const char * getName() const override { return "TRIM expression"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
class ParserLeftExpression : public IParserBase
{
protected:
const char * getName() const override { return "LEFT expression"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
class ParserRightExpression : public IParserBase
{
protected:
const char * getName() const override { return "RIGHT expression"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
class ParserExtractExpression : public IParserBase
{
protected:
@ -110,6 +131,19 @@ protected:
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
class ParserDateAddExpression : public IParserBase
{
protected:
const char * getName() const override { return "DATE_ADD expression"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
class ParserDateDiffExpression : public IParserBase
{
protected:
const char * getName() const override { return "DATE_DIFF expression"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
/** NULL literal.
*/

View File

@ -607,25 +607,13 @@ bool ParserIntervalOperatorExpression::parseImpl(Pos & pos, ASTPtr & node, Expec
if (!ParserExpressionWithOptionalAlias(false).parse(pos, expr, expected))
return false;
const char * function_name = nullptr;
if (ParserKeyword("SECOND").ignore(pos, expected))
function_name = "toIntervalSecond";
else if (ParserKeyword("MINUTE").ignore(pos, expected))
function_name = "toIntervalMinute";
else if (ParserKeyword("HOUR").ignore(pos, expected))
function_name = "toIntervalHour";
else if (ParserKeyword("DAY").ignore(pos, expected))
function_name = "toIntervalDay";
else if (ParserKeyword("WEEK").ignore(pos, expected))
function_name = "toIntervalWeek";
else if (ParserKeyword("MONTH").ignore(pos, expected))
function_name = "toIntervalMonth";
else if (ParserKeyword("YEAR").ignore(pos, expected))
function_name = "toIntervalYear";
else
return false;
ParserInterval interval_parser;
if (!interval_parser.ignore(pos, expected))
return false;
const char * function_name = interval_parser.getToIntervalKindFunctionName();
/// the function corresponding to the operator
auto function = std::make_shared<ASTFunction>();

View File

@ -24,7 +24,6 @@ bool ParserAlterCommand::parseImpl(Pos & pos, ASTPtr & node, Expected & expected
ParserKeyword s_clear_column("CLEAR COLUMN");
ParserKeyword s_modify_column("MODIFY COLUMN");
ParserKeyword s_comment_column("COMMENT COLUMN");
ParserKeyword s_modify_primary_key("MODIFY PRIMARY KEY");
ParserKeyword s_modify_order_by("MODIFY ORDER BY");
ParserKeyword s_attach_partition("ATTACH PARTITION");
@ -196,13 +195,6 @@ bool ParserAlterCommand::parseImpl(Pos & pos, ASTPtr & node, Expected & expected
command->type = ASTAlterCommand::MODIFY_COLUMN;
}
else if (s_modify_primary_key.ignore(pos, expected))
{
if (!parser_exp_elem.parse(pos, command->primary_key, expected))
return false;
command->type = ASTAlterCommand::MODIFY_PRIMARY_KEY;
}
else if (s_modify_order_by.ignore(pos, expected))
{
if (!parser_exp_elem.parse(pos, command->order_by, expected))
@ -247,14 +239,16 @@ bool ParserAlterCommand::parseImpl(Pos & pos, ASTPtr & node, Expected & expected
command->children.push_back(command->col_decl);
if (command->column)
command->children.push_back(command->column);
if (command->primary_key)
command->children.push_back(command->primary_key);
if (command->partition)
command->children.push_back(command->partition);
if (command->order_by)
command->children.push_back(command->order_by);
if (command->predicate)
command->children.push_back(command->predicate);
if (command->update_assignments)
command->children.push_back(command->update_assignments);
if (command->comment)
command->children.push_back(command->comment);
return true;
}

View File

@ -101,13 +101,6 @@ std::optional<AlterCommand> AlterCommand::parse(const ASTAlterCommand * command_
command.comment = ast_comment.value.get<String>();
return command;
}
else if (command_ast->type == ASTAlterCommand::MODIFY_PRIMARY_KEY)
{
AlterCommand command;
command.type = AlterCommand::MODIFY_PRIMARY_KEY;
command.primary_key = command_ast->primary_key;
return command;
}
else if (command_ast->type == ASTAlterCommand::MODIFY_ORDER_BY)
{
AlterCommand command;
@ -271,13 +264,6 @@ void AlterCommand::apply(ColumnsDescription & columns_description, ASTPtr & orde
/// both old and new columns have default expression, update it
columns_description.defaults[column_name].expression = default_expression;
}
else if (type == MODIFY_PRIMARY_KEY)
{
if (!primary_key_ast)
order_by_ast = primary_key;
else
primary_key_ast = primary_key;
}
else if (type == MODIFY_ORDER_BY)
{
if (!primary_key_ast)

View File

@ -22,7 +22,6 @@ struct AlterCommand
DROP_COLUMN,
MODIFY_COLUMN,
COMMENT_COLUMN,
MODIFY_PRIMARY_KEY,
MODIFY_ORDER_BY,
UKNOWN_TYPE,
};
@ -44,9 +43,6 @@ struct AlterCommand
/// For ADD - after which column to add a new one. If an empty string, add to the end. To add to the beginning now it is impossible.
String after_column;
/// For MODIFY_PRIMARY_KEY
ASTPtr primary_key;
/// For MODIFY_ORDER_BY
ASTPtr order_by;
@ -73,7 +69,7 @@ class AlterCommands : public std::vector<AlterCommand>
public:
void apply(ColumnsDescription & columns_description, ASTPtr & order_by_ast, ASTPtr & primary_key_ast) const;
/// For storages that don't support MODIFY_PRIMARY_KEY or MODIFY_ORDER_BY.
/// For storages that don't support MODIFY_ORDER_BY.
void apply(ColumnsDescription & columns_description) const;
void validate(const IStorage & table, const Context & context);

View File

@ -313,7 +313,7 @@ bool KeyCondition::addCondition(const String & column, const Range & range)
return true;
}
/** Computes value of constant expression and it data type.
/** Computes value of constant expression and its data type.
* Returns false, if expression isn't constant.
*/
static bool getConstant(const ASTPtr & expr, Block & block_with_constants, Field & out_value, DataTypePtr & out_type)

View File

@ -253,7 +253,7 @@ public:
/// Get the maximum number of the key element used in the condition.
size_t getMaxKeyColumn() const;
/// Impose an additional condition: the value in the column column must be in the `range` range.
/// Impose an additional condition: the value in the column `column` must be in the range `range`.
/// Returns whether there is such a column in the key.
bool addCondition(const String & column, const Range & range);

View File

@ -1229,7 +1229,6 @@ void MergeTreeData::createConvertExpression(const DataPartPtr & part, const Name
MergeTreeData::AlterDataPartTransactionPtr MergeTreeData::alterDataPart(
const DataPartPtr & part,
const NamesAndTypesList & new_columns,
const ASTPtr & new_primary_key_expr_list,
bool skip_sanity_checks)
{
ExpressionActionsPtr expression;
@ -1290,63 +1289,6 @@ MergeTreeData::AlterDataPartTransactionPtr MergeTreeData::alterDataPart(
DataPart::Checksums add_checksums;
/// Update primary key if needed.
size_t new_primary_key_file_size{};
MergeTreeDataPartChecksum::uint128 new_primary_key_hash{};
if (new_primary_key_expr_list)
{
ASTPtr query = new_primary_key_expr_list;
auto syntax_result = SyntaxAnalyzer(context, {}).analyze(query, new_columns);
ExpressionActionsPtr new_primary_expr = ExpressionAnalyzer(query, syntax_result, context).getActions(true);
Block new_primary_key_sample = new_primary_expr->getSampleBlock();
size_t new_key_size = new_primary_key_sample.columns();
Columns new_index(new_key_size);
/// Copy the existing primary key columns. Fill new columns with default values.
/// NOTE default expressions are not supported.
ssize_t prev_position_of_existing_column = -1;
for (size_t i = 0; i < new_key_size; ++i)
{
const String & column_name = new_primary_key_sample.safeGetByPosition(i).name;
if (primary_key_sample.has(column_name))
{
ssize_t position_of_existing_column = primary_key_sample.getPositionByName(column_name);
if (position_of_existing_column < prev_position_of_existing_column)
throw Exception("Permuting of columns of primary key is not supported", ErrorCodes::BAD_ARGUMENTS);
new_index[i] = part->index.at(position_of_existing_column);
prev_position_of_existing_column = position_of_existing_column;
}
else
{
const IDataType & type = *new_primary_key_sample.safeGetByPosition(i).type;
new_index[i] = type.createColumnConstWithDefaultValue(part->marks_count)->convertToFullColumnIfConst();
}
}
if (prev_position_of_existing_column == -1)
throw Exception("No common columns while modifying primary key", ErrorCodes::BAD_ARGUMENTS);
String index_tmp_path = full_path + part->name + "/primary.idx.tmp";
WriteBufferFromFile index_file(index_tmp_path);
HashingWriteBuffer index_stream(index_file);
for (size_t i = 0, marks_count = part->marks_count; i < marks_count; ++i)
for (size_t j = 0; j < new_key_size; ++j)
new_primary_key_sample.getByPosition(j).type->serializeBinary(*new_index[j].get(), i, index_stream);
transaction->rename_map["primary.idx.tmp"] = "primary.idx";
index_stream.next();
new_primary_key_file_size = index_stream.count();
new_primary_key_hash = index_stream.getHash();
}
if (transaction->rename_map.empty() && !force_update_metadata)
{
transaction->clear();
@ -1395,12 +1337,6 @@ MergeTreeData::AlterDataPartTransactionPtr MergeTreeData::alterDataPart(
new_checksums.files[it.second] = add_checksums.files[it.first];
}
if (new_primary_key_file_size)
{
new_checksums.files["primary.idx"].file_size = new_primary_key_file_size;
new_checksums.files["primary.idx"].file_hash = new_primary_key_hash;
}
/// Write the checksums to the temporary file.
if (!part->checksums.empty())
{

View File

@ -479,13 +479,11 @@ public:
/// Performs ALTER of the data part, writes the result to temporary files.
/// Returns an object allowing to rename temporary files to permanent files.
/// If new_primary_key_expr_list is not nullptr, will prepare the new primary.idx file.
/// If the number of affected columns is suspiciously high and skip_sanity_checks is false, throws an exception.
/// If no data transformations are necessary, returns nullptr.
AlterDataPartTransactionPtr alterDataPart(
const DataPartPtr & part,
const NamesAndTypesList & new_columns,
const ASTPtr & new_primary_key_expr_list,
bool skip_sanity_checks);
/// Freezes all parts.

View File

@ -150,7 +150,7 @@ void ReplicatedMergeTreeAlterThread::run()
/// Update the part and write result to temporary files.
/// TODO: You can skip checking for too large changes if ZooKeeper has, for example,
/// node /flags/force_alter.
auto transaction = storage.data.alterDataPart(part, columns_for_parts, nullptr, false);
auto transaction = storage.data.alterDataPart(part, columns_for_parts, false);
if (!transaction)
continue;

View File

@ -1,38 +1,41 @@
#include <Storages/StorageDistributed.h>
#include <DataStreams/OneBlockInputStream.h>
#include <DataStreams/materializeBlock.h>
#include <Databases/IDatabase.h>
#include <DataTypes/DataTypeFactory.h>
#include <DataTypes/DataTypesNumber.h>
#include <Storages/StorageDistributed.h>
#include <Storages/Distributed/DistributedBlockOutputStream.h>
#include <Storages/Distributed/DirectoryMonitor.h>
#include <Storages/Distributed/DistributedBlockOutputStream.h>
#include <Storages/StorageFactory.h>
#include <Common/Macros.h>
#include <Common/escapeForFileName.h>
#include <Common/typeid_cast.h>
#include <Parsers/ASTInsertQuery.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/TablePropertiesQueriesASTs.h>
#include <Parsers/ParserAlterQuery.h>
#include <Parsers/parseQuery.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTTablesInSelectQuery.h>
#include <Parsers/ASTDropQuery.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTInsertQuery.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTTablesInSelectQuery.h>
#include <Parsers/ParserAlterQuery.h>
#include <Parsers/TablePropertiesQueriesASTs.h>
#include <Parsers/parseQuery.h>
#include <Interpreters/InterpreterSelectQuery.h>
#include <Interpreters/ClusterProxy/SelectStreamFactory.h>
#include <Interpreters/ClusterProxy/executeQuery.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/InterpreterAlterQuery.h>
#include <Interpreters/InterpreterDescribeQuery.h>
#include <Interpreters/InterpreterSelectQuery.h>
#include <Interpreters/SyntaxAnalyzer.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/createBlockSelector.h>
#include <Interpreters/evaluateConstantExpression.h>
#include <Interpreters/ClusterProxy/executeQuery.h>
#include <Interpreters/ClusterProxy/SelectStreamFactory.h>
#include <Interpreters/getClusterName.h>
#include <Core/Field.h>
@ -58,6 +61,7 @@ namespace ErrorCodes
extern const int INFINITE_LOOP;
extern const int TYPE_MISMATCH;
extern const int NO_SUCH_COLUMN_IN_TABLE;
extern const int TOO_MANY_ROWS;
}
@ -133,6 +137,29 @@ void initializeFileNamesIncrement(const std::string & path, SimpleIncrement & in
increment.set(getMaximumFileNumber(path));
}
/// the same as DistributedBlockOutputStream::createSelector, should it be static?
IColumn::Selector createSelector(const ClusterPtr cluster, const ColumnWithTypeAndName & result)
{
const auto & slot_to_shard = cluster->getSlotToShard();
#define CREATE_FOR_TYPE(TYPE) \
if (typeid_cast<const DataType##TYPE *>(result.type.get())) \
return createBlockSelector<TYPE>(*result.column, slot_to_shard);
CREATE_FOR_TYPE(UInt8)
CREATE_FOR_TYPE(UInt16)
CREATE_FOR_TYPE(UInt32)
CREATE_FOR_TYPE(UInt64)
CREATE_FOR_TYPE(Int8)
CREATE_FOR_TYPE(Int16)
CREATE_FOR_TYPE(Int32)
CREATE_FOR_TYPE(Int64)
#undef CREATE_FOR_TYPE
throw Exception{"Sharding key expression does not evaluate to an integer type", ErrorCodes::TYPE_MISMATCH};
}
}
@ -267,6 +294,14 @@ BlockInputStreams StorageDistributed::read(
: ClusterProxy::SelectStreamFactory(
header, processed_stage, QualifiedTableName{remote_database, remote_table}, context.getExternalTables());
if (settings.optimize_skip_unused_shards)
{
auto smaller_cluster = skipUnusedShards(cluster, query_info);
if (smaller_cluster)
cluster = smaller_cluster;
}
return ClusterProxy::executeQuery(
select_stream_factory, cluster, modified_query_ast, context, settings);
}
@ -425,6 +460,41 @@ void StorageDistributed::ClusterNodeData::shutdownAndDropAllData()
directory_monitor->shutdownAndDropAllData();
}
/// Returns a new cluster with fewer shards if constant folding for `sharding_key_expr` is possible
/// using constraints from "WHERE" condition, otherwise returns `nullptr`
ClusterPtr StorageDistributed::skipUnusedShards(ClusterPtr cluster, const SelectQueryInfo & query_info)
{
const auto & select = typeid_cast<ASTSelectQuery &>(*query_info.query);
if (!select.where_expression)
{
return nullptr;
}
const auto & blocks = evaluateExpressionOverConstantCondition(select.where_expression, sharding_key_expr);
// Can't get definite answer if we can skip any shards
if (!blocks)
{
return nullptr;
}
std::set<int> shards;
for (const auto & block : *blocks)
{
if (!block.has(sharding_key_column_name))
throw Exception("sharding_key_expr should evaluate as a single row", ErrorCodes::TOO_MANY_ROWS);
const auto result = block.getByName(sharding_key_column_name);
const auto selector = createSelector(cluster, result);
shards.insert(selector.begin(), selector.end());
}
return cluster->getClusterWithMultipleShards({shards.begin(), shards.end()});
}
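For context, the new shard-pruning path is driven by the `optimize_skip_unused_shards` setting and is exercised by the functional test added later in this commit. A minimal sketch of the usage, with table names taken from that test:

```sql
-- The sharding key must be constant-foldable from the WHERE clause for shards to be pruned.
CREATE TABLE test.distributed AS test.mergetree
    ENGINE = Distributed(test_unavailable_shard, test, mergetree, jumpConsistentHash(a + b, 2));

SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0; -- only the matching shard is queried
```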
void registerStorageDistributed(StorageFactory & factory)
{

View File

@ -166,6 +166,8 @@ protected:
const ASTPtr & sharding_key_,
const String & data_path_,
bool attach);
ClusterPtr skipUnusedShards(ClusterPtr cluster, const SelectQueryInfo & query_info);
};
}

View File

@ -210,24 +210,12 @@ void StorageMergeTree::alter(
ASTPtr new_primary_key_ast = data.primary_key_ast;
params.apply(new_columns, new_order_by_ast, new_primary_key_ast);
ASTPtr primary_expr_list_for_altering_parts;
for (const AlterCommand & param : params)
{
if (param.type == AlterCommand::MODIFY_PRIMARY_KEY)
{
if (supportsSampling())
throw Exception("MODIFY PRIMARY KEY only supported for tables without sampling key", ErrorCodes::BAD_ARGUMENTS);
primary_expr_list_for_altering_parts = MergeTreeData::extractKeyExpressionList(param.primary_key);
}
}
auto parts = data.getDataParts({MergeTreeDataPartState::PreCommitted, MergeTreeDataPartState::Committed, MergeTreeDataPartState::Outdated});
auto columns_for_parts = new_columns.getAllPhysical();
std::vector<MergeTreeData::AlterDataPartTransactionPtr> transactions;
for (const MergeTreeData::DataPartPtr & part : parts)
{
if (auto transaction = data.alterDataPart(part, columns_for_parts, primary_expr_list_for_altering_parts, false))
if (auto transaction = data.alterDataPart(part, columns_for_parts, false))
transactions.push_back(std::move(transaction));
}
@ -238,19 +226,7 @@ void StorageMergeTree::alter(
auto & storage_ast = typeid_cast<ASTStorage &>(ast);
if (new_order_by_ast.get() != data.order_by_ast.get())
{
if (storage_ast.order_by)
{
/// The table was created using the "new" syntax (with key expressions in separate clauses).
storage_ast.set(storage_ast.order_by, new_order_by_ast);
}
else
{
/// Primary key is in the second place in table engine description and can be represented as a tuple.
/// TODO: Not always in second place. If there is a sampling key, then the third one. Fix it.
storage_ast.engine->arguments->children.at(1) = new_order_by_ast;
}
}
storage_ast.set(storage_ast.order_by, new_order_by_ast);
if (new_primary_key_ast.get() != data.primary_key_ast.get())
storage_ast.set(storage_ast.primary_key, new_primary_key_ast);
@ -266,9 +242,6 @@ void StorageMergeTree::alter(
/// Columns sizes could be changed
data.recalculateColumnSizes();
if (primary_expr_list_for_altering_parts)
data.loadDataParts(false);
}
@ -725,7 +698,7 @@ void StorageMergeTree::clearColumnInPartition(const ASTPtr & partition, const Fi
if (part->info.partition_id != partition_id)
throw Exception("Unexpected partition ID " + part->info.partition_id + ". This is a bug.", ErrorCodes::LOGICAL_ERROR);
if (auto transaction = data.alterDataPart(part, columns_for_parts, nullptr, false))
if (auto transaction = data.alterDataPart(part, columns_for_parts, false))
transactions.push_back(std::move(transaction));
LOG_DEBUG(log, "Removing column " << get<String>(column_name) << " from part " << part->name);

View File

@ -1504,7 +1504,7 @@ void StorageReplicatedMergeTree::executeClearColumnInPartition(const LogEntry &
LOG_DEBUG(log, "Clearing column " << entry.column_name << " in part " << part->name); LOG_DEBUG(log, "Clearing column " << entry.column_name << " in part " << part->name);
auto transaction = data.alterDataPart(part, columns_for_parts, nullptr, false); auto transaction = data.alterDataPart(part, columns_for_parts, false);
if (!transaction) if (!transaction)
continue; continue;
@ -3059,12 +3059,6 @@ void StorageReplicatedMergeTree::alter(const AlterCommands & params,
data.checkAlter(params); data.checkAlter(params);
for (const AlterCommand & param : params)
{
if (param.type == AlterCommand::MODIFY_PRIMARY_KEY)
throw Exception("Modification of primary key is not supported for replicated tables", ErrorCodes::NOT_IMPLEMENTED);
}
ColumnsDescription new_columns = data.getColumns();
ASTPtr new_order_by_ast = data.order_by_ast;
ASTPtr new_primary_key_ast = data.primary_key_ast;

View File

@ -139,7 +139,7 @@ public:
data_out(data_out_compressed, CompressionSettings(CompressionMethod::LZ4), storage.max_compress_block_size),
index_out_compressed(storage.full_path() + "index.mrk", INDEX_BUFFER_SIZE, O_WRONLY | O_APPEND | O_CREAT),
index_out(index_out_compressed),
block_out(data_out, 0, storage.getSampleBlock(), &index_out, Poco::File(storage.full_path() + "data.bin").getSize())
block_out(data_out, 0, storage.getSampleBlock(), false, &index_out, Poco::File(storage.full_path() + "data.bin").getSize())
{
}

View File

@ -0,0 +1,19 @@
<test>
<name>bounding_ratio</name>
<type>once</type>
<stop_conditions>
<any_of>
<!-- This is only for infinite running query. -->
<average_speed_not_changing_for_ms>1000</average_speed_not_changing_for_ms>
<total_time_ms>10000</total_time_ms>
</any_of>
</stop_conditions>
<metrics>
<max_rows_per_second />
</metrics>
<query>SELECT boundingRatio(number, number) FROM system.numbers</query>
<query>SELECT (argMax(number, number) - argMin(number, number)) / (max(number) - min(number)) FROM system.numbers</query>
</test>

View File

@ -0,0 +1,34 @@
<test>
<name>right</name>
<type>loop</type>
<preconditions>
<table_exists>hits_100m_single</table_exists>
</preconditions>
<stop_conditions>
<all_of>
<total_time_ms>10000</total_time_ms>
</all_of>
<any_of>
<average_speed_not_changing_for_ms>5000</average_speed_not_changing_for_ms>
<total_time_ms>20000</total_time_ms>
</any_of>
</stop_conditions>
<main_metric>
<min_time/>
</main_metric>
<substitutions>
<substitution>
<name>func</name>
<values>
<value>right(URL, 16)</value>
<value>substring(URL, greatest(minus(plus(length(URL), 1), 16), 1))</value>
</values>
</substitution>
</substitutions>
<query>SELECT count() FROM hits_100m_single WHERE NOT ignore({func})</query>
</test>

View File

@ -0,0 +1,34 @@
<test>
<name>trim_numbers</name>
<type>loop</type>
<stop_conditions>
<all_of>
<total_time_ms>10000</total_time_ms>
</all_of>
<any_of>
<average_speed_not_changing_for_ms>5000</average_speed_not_changing_for_ms>
<total_time_ms>20000</total_time_ms>
</any_of>
</stop_conditions>
<main_metric>
<min_time/>
</main_metric>
<substitutions>
<substitution>
<name>func</name>
<values>
<value>trim(</value>
<value>ltrim(</value>
<value>rtrim(</value>
<value>trim(LEADING '012345' FROM </value>
<value>trim(TRAILING '012345' FROM </value>
<value>trim(BOTH '012345' FROM </value>
</values>
</substitution>
</substitutions>
<query>SELECT count() FROM numbers(10000000) WHERE NOT ignore({func}toString(number)))</query>
</test>

View File

@ -0,0 +1,38 @@
<test>
<name>trim_urls</name>
<type>loop</type>
<preconditions>
<table_exists>hits_100m_single</table_exists>
</preconditions>
<stop_conditions>
<all_of>
<total_time_ms>10000</total_time_ms>
</all_of>
<any_of>
<average_speed_not_changing_for_ms>5000</average_speed_not_changing_for_ms>
<total_time_ms>20000</total_time_ms>
</any_of>
</stop_conditions>
<main_metric>
<min_time/>
</main_metric>
<substitutions>
<substitution>
<name>func</name>
<values>
<value>trim(</value>
<value>ltrim(</value>
<value>rtrim(</value>
<value>trim(LEADING 'htpsw:/' FROM </value>
<value>trim(TRAILING '/' FROM </value>
<value>trim(BOTH 'htpsw:/' FROM </value>
</values>
</substitution>
</substitutions>
<query>SELECT count() FROM hits_100m_single WHERE NOT ignore({func}URL))</query>
</test>

View File

@ -0,0 +1,35 @@
<test>
<name>trim_whitespaces</name>
<type>loop</type>
<preconditions>
<table_exists>whitespaces</table_exists>
</preconditions>
<stop_conditions>
<all_of>
<total_time_ms>30000</total_time_ms>
</all_of>
</stop_conditions>
<main_metric>
<min_time/>
</main_metric>
<substitutions>
<substitution>
<name>func</name>
<values>
<value>value</value>
<value>trimLeft(value)</value>
<value>trimRight(value)</value>
<value>trimBoth(value)</value>
<value>replaceRegexpOne(value, '^ *', '')</value>
<value>replaceRegexpOne(value, ' *$', '')</value>
<value>replaceRegexpAll(value, '^ *| *$', '')</value>
</values>
</substitution>
</substitutions>
<query>SELECT count() FROM whitespaces WHERE NOT ignore({func})</query>
</test>

View File

@ -0,0 +1,17 @@
CREATE TABLE whitespaces
(
value String
)
ENGINE = MergeTree()
PARTITION BY tuple()
ORDER BY tuple()
INSERT INTO whitespaces SELECT value
FROM
(
SELECT
arrayStringConcat(groupArray(' ')) AS spaces,
concat(spaces, toString(any(number)), spaces) AS value
FROM numbers(100000000)
GROUP BY pow(number, intHash32(number) % 4) % 12345678
) -- repeat something like this multiple times and/or just copy whitespaces table into itself

View File

@ -1,130 +0,0 @@
1
2
3
2
3
1
2
3
2
3
1
2
3
2
3
2
3
1
1 Hello
2
2 World
3
3 abc
4 def
2
2 World
3
3 abc
4 def
2 World
3 abc
4 def
2
2 World
3
3 abc
4 def
1
2
3
1
1 Hello
2
2 World
3
3 abc
4 def
2
2 World
3
3 abc
4 def
2 World
3 abc
4 def
2
2 World
3
3 abc
4 def
1
2
3
1
1 Hello
2
2 World
3
3 abc
4 def
1
1 Hello
2
2 World
3
3 abc
4 def
1
1 Hello
2
2 World
3
3 abc
4 def
2
2 World
3
3 abc
4 def
2 World
3 abc
4 def
2
2 World
3
3 abc
4 def
1
2
3
1
1 Hello
2
2 World
3
3 abc
4 def
2
2 World
3
3 abc
4 def
2 World
3 abc
4 def
2
2 World
3
3 abc
4 def
1
2
3
*** Check table creation statement ***
CREATE TABLE test.pk2 ( x UInt32, y UInt32, z UInt32) ENGINE = MergeTree PRIMARY KEY (x, y) ORDER BY (x, y, z) SETTINGS index_granularity = 8192
*** Check that the inserted values were correctly sorted ***
100 20 1
100 20 2
100 30 1
100 30 2

View File

@ -1,83 +0,0 @@
SET send_logs_level = 'none';
DROP TABLE IF EXISTS test.pk;
CREATE TABLE test.pk (d Date DEFAULT '2000-01-01', x UInt64) ENGINE = MergeTree(d, x, 1);
INSERT INTO test.pk (x) VALUES (1), (2), (3);
SELECT x FROM test.pk ORDER BY x;
SELECT x FROM test.pk WHERE x >= 2 ORDER BY x;
ALTER TABLE test.pk MODIFY PRIMARY KEY (x);
SELECT x FROM test.pk ORDER BY x;
SELECT x FROM test.pk WHERE x >= 2 ORDER BY x;
ALTER TABLE test.pk ADD COLUMN y String, MODIFY PRIMARY KEY (x, y);
SELECT x, y FROM test.pk ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y = '' ORDER BY x, y;
INSERT INTO test.pk (x, y) VALUES (1, 'Hello'), (2, 'World'), (3, 'abc'), (4, 'def');
SELECT x, y FROM test.pk ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y > '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y >= '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x > 2 AND y > 'z' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE y < 'A' ORDER BY x, y;
DETACH TABLE test.pk;
ATTACH TABLE test.pk (d Date DEFAULT '2000-01-01', x UInt64, y String) ENGINE = MergeTree(d, (x, y), 1);
SELECT x, y FROM test.pk ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y > '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y >= '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x > 2 AND y > 'z' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE y < 'A' ORDER BY x, y;
SET max_rows_to_read = 3;
SELECT x, y FROM test.pk WHERE x > 2 AND y > 'z' ORDER BY x, y;
SET max_rows_to_read = 0;
OPTIMIZE TABLE test.pk;
SELECT x, y FROM test.pk;
SELECT x, y FROM test.pk ORDER BY x, y;
ALTER TABLE test.pk MODIFY PRIMARY KEY (x);
SELECT x, y FROM test.pk ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y > '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y >= '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x > 2 AND y > 'z' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE y < 'A' ORDER BY x, y;
DETACH TABLE test.pk;
ATTACH TABLE test.pk (d Date DEFAULT '2000-01-01', x UInt64, y String) ENGINE = MergeTree(d, (x), 1);
SELECT x, y FROM test.pk ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y > '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x >= 2 AND y >= '' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE x > 2 AND y > 'z' ORDER BY x, y;
SELECT x, y FROM test.pk WHERE y < 'A' ORDER BY x, y;
DROP TABLE test.pk;
DROP TABLE IF EXISTS test.pk2;
CREATE TABLE test.pk2 (x UInt32) ENGINE MergeTree ORDER BY x;
ALTER TABLE test.pk2 ADD COLUMN y UInt32, ADD COLUMN z UInt32, MODIFY ORDER BY (x, y, z);
ALTER TABLE test.pk2 MODIFY PRIMARY KEY (y); -- { serverError 36 }
ALTER TABLE test.pk2 MODIFY PRIMARY KEY (x, y);
SELECT '*** Check table creation statement ***';
SHOW CREATE TABLE test.pk2;
INSERT INTO test.pk2 VALUES (100, 30, 2), (100, 30, 1), (100, 20, 2), (100, 20, 1);
SELECT '*** Check that the inserted values were correctly sorted ***';
SELECT * FROM test.pk2;
DROP TABLE test.pk2;

View File

@ -36,3 +36,4 @@
2029-02-28 01:02:03 2017-03-29 01:02:03
2030-02-28 01:02:03 2017-04-29 01:02:03
2031-02-28 01:02:03 2017-05-29 01:02:03
2015-11-29 01:02:03

View File

@ -2,3 +2,4 @@ SELECT toDateTime('2017-10-30 08:18:19') + INTERVAL 1 DAY + INTERVAL 1 MONTH - I
SELECT toDateTime('2017-10-30 08:18:19') + INTERVAL 1 HOUR + INTERVAL 1000 MINUTE + INTERVAL 10 SECOND;
SELECT toDateTime('2017-10-30 08:18:19') + INTERVAL 1 DAY + INTERVAL number MONTH FROM system.numbers LIMIT 20;
SELECT toDateTime('2016-02-29 01:02:03') + INTERVAL number YEAR, toDateTime('2016-02-29 01:02:03') + INTERVAL number MONTH FROM system.numbers LIMIT 16;
SELECT toDateTime('2016-02-29 01:02:03') - INTERVAL 1 QUARTER;

View File

@ -13,7 +13,7 @@ SELECT EXTRACT(year FROM toDateTime('2017-12-31 18:59:58'));
DROP TABLE IF EXISTS test.Orders;
CREATE TABLE test.Orders (OrderId UInt64, OrderName String, OrderDate DateTime) engine = Log;
insert into test.Orders values (1, 'Jarlsberg Cheese', toDateTime('2008-10-11 13:23:44'));
SELECT EXTRACT(YEAR FROM OrderDate) AS OrderYear, EXTRACT(MONTH FROM OrderDate) AS OrderMonth, EXTRACT(DAY FROM OrderDate) AS OrderDay,
SELECT EXTRACT(YYYY FROM OrderDate) AS OrderYear, EXTRACT(MONTH FROM OrderDate) AS OrderMonth, EXTRACT(DAY FROM OrderDate) AS OrderDay,
EXTRACT(HOUR FROM OrderDate), EXTRACT(MINUTE FROM OrderDate), EXTRACT(SECOND FROM OrderDate) FROM test.Orders WHERE OrderId=1;
DROP TABLE test.Orders;

View File

@ -2,4 +2,3 @@ drop table if exists test.table;
create table test.table (val Int32) engine = MergeTree order by val;
insert into test.table values (-2), (0), (2);
select count() from test.table where toUInt64(val) == 0;

View File

@ -0,0 +1,26 @@
no monotonic int case: String -> UInt64
no monotonic int case: FixedString -> UInt64
monotonic int case: Int32 -> Int64
monotonic int case: Int32 -> UInt64
monotonic int case: Int32 -> Int32
monotonic int case: Int32 -> UInt32
monotonic int case: Int32 -> Int16
monotonic int case: Int32 -> UInt16
monotonic int case: UInt32 -> Int64
monotonic int case: UInt32 -> UInt64
monotonic int case: UInt32 -> Int32
monotonic int case: UInt32 -> UInt32
monotonic int case: UInt32 -> Int16
monotonic int case: UInt32 -> UInt16
monotonic int case: Enum16 -> Int32
monotonic int case: Enum16 -> UInt32
monotonic int case: Enum16 -> Int16
monotonic int case: Enum16 -> UInt16
monotonic int case: Enum16 -> Int8
monotonic int case: Enum16 -> UInt8
monotonic int case: Date -> Int32
monotonic int case: Date -> UInt32
monotonic int case: Date -> Int16
monotonic int case: Date -> UInt16
monotonic int case: Date -> Int8
monotonic int case: Date -> UInt8

View File

@ -0,0 +1,85 @@
#!/usr/bin/env bash
#--------------------------------------------
# Description of test result:
# Test the correctness of the optimization
# by asserting read marks in the log.
# Relation of read marks and optimization:
# read marks =
# the number of monotonic marks filtered through predicates
# + no monotonic marks count
#--------------------------------------------
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
${CLICKHOUSE_CLIENT} --query="SYSTEM STOP MERGES;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.string_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.fixed_string_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.signed_integer_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.unsigned_integer_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.enum_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.date_test_table;"
${CLICKHOUSE_CLIENT} --query="CREATE TABLE test.string_test_table (val String) ENGINE = MergeTree ORDER BY val SETTINGS index_granularity = 1;"
${CLICKHOUSE_CLIENT} --query="CREATE TABLE test.fixed_string_test_table (val FixedString(1)) ENGINE = MergeTree ORDER BY val SETTINGS index_granularity = 1;"
${CLICKHOUSE_CLIENT} --query="CREATE TABLE test.signed_integer_test_table (val Int32) ENGINE = MergeTree ORDER BY val SETTINGS index_granularity = 1;"
${CLICKHOUSE_CLIENT} --query="CREATE TABLE test.unsigned_integer_test_table (val UInt32) ENGINE = MergeTree ORDER BY val SETTINGS index_granularity = 1;"
${CLICKHOUSE_CLIENT} --query="CREATE TABLE test.enum_test_table (val Enum16('hello' = 1, 'world' = 2, 'yandex' = 256, 'clickhouse' = 257)) ENGINE = MergeTree ORDER BY val SETTINGS index_granularity = 1;"
${CLICKHOUSE_CLIENT} --query="CREATE TABLE test.date_test_table (val Date) ENGINE = MergeTree ORDER BY val SETTINGS index_granularity = 1;"
${CLICKHOUSE_CLIENT} --query="INSERT INTO test.string_test_table VALUES ('0'), ('2'), ('2');"
${CLICKHOUSE_CLIENT} --query="INSERT INTO test.fixed_string_test_table VALUES ('0'), ('2'), ('2');"
# 131072 -> 17 bit is 1
${CLICKHOUSE_CLIENT} --query="INSERT INTO test.signed_integer_test_table VALUES (-2), (0), (2), (2), (131072), (131073), (131073);"
${CLICKHOUSE_CLIENT} --query="INSERT INTO test.unsigned_integer_test_table VALUES (0), (2), (2), (131072), (131073), (131073);"
${CLICKHOUSE_CLIENT} --query="INSERT INTO test.enum_test_table VALUES ('hello'), ('world'), ('world'), ('yandex'), ('clickhouse'), ('clickhouse');"
${CLICKHOUSE_CLIENT} --query="INSERT INTO test.date_test_table VALUES (1), (2), (2), (256), (257), (257);"
export CLICKHOUSE_CLIENT=`echo ${CLICKHOUSE_CLIENT} |sed 's/'"${CLICKHOUSE_CLIENT_SERVER_LOGS_LEVEL}"'/debug/g'`
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.string_test_table WHERE toUInt64(val) == 0;" 2>&1 |grep -q "3 marks to read from 1 ranges" && echo "no monotonic int case: String -> UInt64"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.fixed_string_test_table WHERE toUInt64(val) == 0;" 2>&1 |grep -q "3 marks to read from 1 ranges" && echo "no monotonic int case: FixedString -> UInt64"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.signed_integer_test_table WHERE toInt64(val) == 0;" 2>&1 |grep -q "2 marks to read from" && echo "monotonic int case: Int32 -> Int64"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.signed_integer_test_table WHERE toUInt64(val) == 0;" 2>&1 |grep -q "2 marks to read from" && echo "monotonic int case: Int32 -> UInt64"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.signed_integer_test_table WHERE toInt32(val) == 0;" 2>&1 |grep -q "2 marks to read from" && echo "monotonic int case: Int32 -> Int32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.signed_integer_test_table WHERE toUInt32(val) == 0;" 2>&1 |grep -q "2 marks to read from" && echo "monotonic int case: Int32 -> UInt32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.signed_integer_test_table WHERE toInt16(val) == 0;" 2>&1 |grep -q "5 marks to read from" && echo "monotonic int case: Int32 -> Int16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.signed_integer_test_table WHERE toUInt16(val) == 0;" 2>&1 |grep -q "5 marks to read from" && echo "monotonic int case: Int32 -> UInt16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.unsigned_integer_test_table WHERE toInt64(val) == 0;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: UInt32 -> Int64"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.unsigned_integer_test_table WHERE toUInt64(val) == 0;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: UInt32 -> UInt64"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.unsigned_integer_test_table WHERE toInt32(val) == 0;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: UInt32 -> Int32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.unsigned_integer_test_table WHERE toUInt32(val) == 0;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: UInt32 -> UInt32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.unsigned_integer_test_table WHERE toInt16(val) == 0;" 2>&1 |grep -q "4 marks to read from" && echo "monotonic int case: UInt32 -> Int16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.unsigned_integer_test_table WHERE toUInt16(val) == 0;" 2>&1 |grep -q "4 marks to read from" && echo "monotonic int case: UInt32 -> UInt16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.enum_test_table WHERE toInt32(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Enum16 -> Int32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.enum_test_table WHERE toUInt32(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Enum16 -> UInt32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.enum_test_table WHERE toInt16(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Enum16 -> Int16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.enum_test_table WHERE toUInt16(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Enum16 -> UInt16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.enum_test_table WHERE toInt8(val) == 1;" 2>&1 |grep -q "5 marks to read from" && echo "monotonic int case: Enum16 -> Int8"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.enum_test_table WHERE toUInt8(val) == 1;" 2>&1 |grep -q "5 marks to read from" && echo "monotonic int case: Enum16 -> UInt8"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.date_test_table WHERE toInt32(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Date -> Int32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.date_test_table WHERE toUInt32(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Date -> UInt32"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.date_test_table WHERE toInt16(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Date -> Int16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.date_test_table WHERE toUInt16(val) == 1;" 2>&1 |grep -q "1 marks to read from" && echo "monotonic int case: Date -> UInt16"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.date_test_table WHERE toInt8(val) == 1;" 2>&1 |grep -q "5 marks to read from" && echo "monotonic int case: Date -> Int8"
${CLICKHOUSE_CLIENT} --query="SELECT count() FROM test.date_test_table WHERE toUInt8(val) == 1;" 2>&1 |grep -q "5 marks to read from" && echo "monotonic int case: Date -> UInt8"
export CLICKHOUSE_CLIENT=`echo ${CLICKHOUSE_CLIENT} |sed 's/debug/'"${CLICKHOUSE_CLIENT_SERVER_LOGS_LEVEL}"'/g'`
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.string_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.fixed_string_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.signed_integer_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.unsigned_integer_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.enum_test_table;"
${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS test.date_test_table;"
${CLICKHOUSE_CLIENT} --query="SYSTEM START MERGES;"

View File

@ -0,0 +1,21 @@
1
1
1.5
1.5
1.5
0 1.5
1 1.5
2 1.5
3 1.5
4 1.5
5 1.5
6 1.5
7 1.5
8 1.5
9 1.5
0 1.5
1.5
nan
nan
1

View File

@ -0,0 +1,26 @@
drop table if exists rate_test;
create table rate_test (timestamp UInt32, event UInt32) engine=Memory;
insert into rate_test values (0,1000),(1,1001),(2,1002),(3,1003),(4,1004),(5,1005),(6,1006),(7,1007),(8,1008);
select 1.0 = boundingRatio(timestamp, event) from rate_test;
drop table if exists rate_test2;
create table rate_test2 (uid UInt32 default 1,timestamp DateTime, event UInt32) engine=Memory;
insert into rate_test2(timestamp, event) values ('2018-01-01 01:01:01',1001),('2018-01-01 01:01:02',1002),('2018-01-01 01:01:03',1003),('2018-01-01 01:01:04',1004),('2018-01-01 01:01:05',1005),('2018-01-01 01:01:06',1006),('2018-01-01 01:01:07',1007),('2018-01-01 01:01:08',1008);
select 1.0 = boundingRatio(timestamp, event) from rate_test2;
drop table rate_test;
drop table rate_test2;
SELECT boundingRatio(number, number * 1.5) FROM numbers(10);
SELECT boundingRatio(1000 + number, number * 1.5) FROM numbers(10);
SELECT boundingRatio(1000 + number, number * 1.5 - 111) FROM numbers(10);
SELECT number % 10 AS k, boundingRatio(1000 + number, number * 1.5 - 111) FROM numbers(100) GROUP BY k WITH TOTALS ORDER BY k;
SELECT boundingRatio(1000 + number, number * 1.5 - 111) FROM numbers(2);
SELECT boundingRatio(1000 + number, number * 1.5 - 111) FROM numbers(1);
SELECT boundingRatio(1000 + number, number * 1.5 - 111) FROM numbers(1) WHERE 0;
SELECT boundingRatio(number, exp(number)) = e() - 1 FROM numbers(2);

View File

@ -0,0 +1,16 @@
OK
OK
1
OK
0
4
2
1
1
1
4
OK
OK
OK
OK
OK

View File

@ -0,0 +1,105 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
${CLICKHOUSE_CLIENT} --query "DROP TABLE IF EXISTS test.mergetree;"
${CLICKHOUSE_CLIENT} --query "DROP TABLE IF EXISTS test.distributed;"
${CLICKHOUSE_CLIENT} --query "CREATE TABLE test.mergetree (a Int64, b Int64, c Int64) ENGINE = MergeTree ORDER BY (a, b);"
${CLICKHOUSE_CLIENT} --query "CREATE TABLE test.distributed AS test.mergetree ENGINE = Distributed(test_unavailable_shard, test, mergetree, jumpConsistentHash(a+b, 2));"
${CLICKHOUSE_CLIENT} --query "INSERT INTO test.mergetree VALUES (0, 0, 0);"
${CLICKHOUSE_CLIENT} --query "INSERT INTO test.mergetree VALUES (1, 0, 0);"
${CLICKHOUSE_CLIENT} --query "INSERT INTO test.mergetree VALUES (0, 1, 1);"
${CLICKHOUSE_CLIENT} --query "INSERT INTO test.mergetree VALUES (1, 1, 1);"
# Should fail because second shard is unavailable
${CLICKHOUSE_CLIENT} --query "SELECT count(*) FROM test.distributed;" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
# Should fail without setting `optimize_skip_unused_shards`
${CLICKHOUSE_CLIENT} --query "SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0;" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
# Should pass now
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0;
"
# Should still fail because of matching unavailable shard
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 2 AND b = 2;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
# Try more complex expressions for constant folding - all should pass.
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 1 AND a = 0 AND b = 0;
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a IN (0, 1) AND b IN (0, 1);
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0 OR a = 1 AND b = 1;
"
# TODO: should pass one day.
#${CLICKHOUSE_CLIENT} -n --query="
# SET optimize_skip_unused_shards = 1;
# SELECT count(*) FROM test.distributed WHERE a = 0 AND b >= 0 AND b <= 1;
#"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0 AND c = 0;
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0 AND c != 10;
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0 AND (a+b)*b != 12;
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE (a = 0 OR a = 1) AND (b = 0 OR b = 1);
"
# These ones should fail.
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b <= 1;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND c = 0;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 OR a = 1 AND b = 0;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0 OR a = 2 AND b = 2;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM test.distributed WHERE a = 0 AND b = 0 OR c = 0;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

View File

@ -10,3 +10,19 @@ o
1
oo
o
fo
foo
r
bar
foo
foo
xxfoo
fooabba
fooabbafoo
foo*
-11
-3
2021-01-01
2018-07-18 01:02:03
2018-04-01

View File

@ -12,3 +12,19 @@ select mid('foo', 3);
select IF(3>2, 1, 0);
select substring('foo' from 1 + 1);
select SUBSTRING('foo' FROM 2 FOR 1);
select left('foo', 2);
select LEFT('foo', 123);
select RIGHT('bar', 1);
select right('bar', 123);
select ltrim('') || rtrim('') || trim('');
select ltrim(' foo');
select RTRIM(' foo ');
select trim(TRAILING 'x' FROM 'xxfooxx');
select Trim(LEADING 'ab' FROM 'abbafooabba');
select TRIM(both 'ab' FROM 'abbafooabbafooabba');
select trim(LEADING '*[]{}|\\' FROM '\\|[[[}}}*foo*');
select DATE_DIFF(MONTH, toDate('2018-12-18'), toDate('2018-01-01'));
select DATE_DIFF(QQ, toDate('2018-12-18'), toDate('2018-01-01'));
select DATE_ADD(YEAR, 3, toDate('2018-01-01'));
select timestamp_sub(SQL_TSI_MONTH, 5, toDateTime('2018-12-18 01:02:03'));
select timestamp_ADD(toDate('2018-01-01'), INTERVAL 3 MONTH);

View File

@ -0,0 +1,12 @@
hello
hel\\\\lo
h\\{ell}o
\\(h\\{ell}o\\)
\\(
Hello\\(
\\(Hello
\\(\\(\\(\\(\\(\\(\\(\\(\\(
\\\\
\\\0\\\\\\|\\(\\)\\^\\$\\.\\[\\?\\*\\+\\{
1

View File

@ -0,0 +1,12 @@
SELECT regexpQuoteMeta('hello');
SELECT regexpQuoteMeta('hel\\lo');
SELECT regexpQuoteMeta('h{ell}o');
SELECT regexpQuoteMeta('(h{ell}o)');
SELECT regexpQuoteMeta('');
SELECT regexpQuoteMeta('(');
SELECT regexpQuoteMeta('Hello(');
SELECT regexpQuoteMeta('(Hello');
SELECT regexpQuoteMeta('(((((((((');
SELECT regexpQuoteMeta('\\');
SELECT regexpQuoteMeta('\0\\|()^$.[?*+{');
SELECT DISTINCT regexpQuoteMeta(toString(number)) = toString(number) FROM numbers(100000);

View File

@ -53,6 +53,20 @@
<timezone>Europe/Moscow</timezone>
<remote_servers incl="clickhouse_remote_servers" >
<!-- Test only shard config for testing distributed storage -->
<test_unavailable_shard>
<shard>
<replica>
<host>localhost</host>
<port>59000</port>
</replica>
</shard>
<shard>
<replica>
<host>localhost</host>
<port>1</port>
</replica>
</shard>
</test_unavailable_shard>
<test_shard_localhost>
<shard>
<replica>

View File

@ -169,7 +169,7 @@ When formatting, rows are enclosed in double quotes. A double quote inside a str
clickhouse-client --format_csv_delimiter="|" --query="INSERT INTO test.csv FORMAT CSV" < data.csv
```
&ast;By default, the delimiter is `,`. See the [format_csv_delimiter](/operations/settings/settings/#format_csv_delimiter) setting for more information.
&ast;By default, the delimiter is `,`. See the [format_csv_delimiter](../operations/settings/settings.md#format_csv_delimiter) setting for more information.
When parsing, all values can be parsed either with or without quotes. Both double and single quotes are supported. Rows can also be arranged without quotes. In this case, they are parsed up to the delimiter character or line feed (CR or LF). In violation of the RFC, when parsing rows without quotes, the leading and trailing spaces and tabs are ignored. For the line feed, Unix (LF), Windows (CR LF) and Mac OS Classic (CR LF) types are all supported.

View File

@ -681,8 +681,20 @@ For more information, see the section "[Replication](../../operations/table_engi
**Example**
```xml
<zookeeper incl="zookeeper-servers" optional="true" />
<zookeeper>
<node index="1">
<host>example1</host>
<port>2181</port>
</node>
<node index="2">
<host>example2</host>
<port>2181</port>
</node>
<node index="3">
<host>example3</host>
<port>2181</port>
</node>
</zookeeper>
```
[Original article](https://clickhouse.yandex/docs/en/operations/server_settings/settings/) <!--hide-->

View File

@ -149,7 +149,7 @@ Default value: 0 (off).
Used when performing `SELECT` from a distributed table that points to replicated tables.
## max_threads {#max_threads}
The maximum number of query processing threads
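Like other query-processing settings, it can be changed per session or per query; a trivial sketch:

``` sql
-- Limit the current session to a single processing thread
SET max_threads = 1;
```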

View File

@ -4,8 +4,8 @@
This query is exactly the same as `CREATE`, but
- Instead of the word `CREATE` it uses the word `ATTACH`.
- The query does not create data on the disk, but assumes that data is already in the appropriate places, and just adds information about the table to the server.
After executing an ATTACH query, the server will know about the existence of the table.
If the table was previously detached (``DETACH``), meaning that its structure is known, you can use shorthand without defining the structure.
@ -16,6 +16,41 @@ ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
This query is used when starting the server. The server stores table metadata as files with `ATTACH` queries, which it simply runs at launch (with the exception of system tables, which are explicitly created on the server).
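A minimal sketch of the shorthand form, assuming a hypothetical table `test.visits` that was previously detached and whose metadata is therefore already known to the server:

``` sql
-- Shorthand form: the structure is taken from the existing table metadata
ATTACH TABLE test.visits;
```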
## CHECK TABLE
Checks if the data in the table is corrupted.
``` sql
CHECK TABLE [db.]name
```
The `CHECK TABLE` query compares actual file sizes with the expected values which are stored on the server. If the file sizes do not match the stored values, it means the data is corrupted. This can be caused, for example, by a system crash during query execution.
The query response contains the `result` column with a single row. The row has a value of [Boolean](../data_types/boolean.md) type (a sample invocation is sketched after this list):
- 0 - The data in the table is corrupted.
- 1 - The data maintains integrity.
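A sample invocation; the table name is hypothetical and the response shape follows the description above:

``` sql
CHECK TABLE test.log_table;

-- ┌─result─┐
-- │      1 │    -- the data maintains integrity
-- └────────┘
```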
The `CHECK TABLE` query is only supported for the following table engines:
- [Log](../operations/table_engines/log.md)
- [TinyLog](../operations/table_engines/tinylog.md)
- StripeLog
These engines do not provide automatic data recovery on failure. Use the `CHECK TABLE` query to track data loss in a timely manner.
To avoid data loss, use tables from the [MergeTree](../operations/table_engines/mergetree.md) family.
**If the data is corrupted**
If the table is corrupted, you can copy the non-corrupted data to another table (the full sequence is sketched after this list). To do this:
1. Create a new table with the same structure as the damaged table. To do this, execute the query `CREATE TABLE <new_table_name> AS <damaged_table_name>`.
2. Set the [max_threads](../operations/settings/settings.md#max_threads) value to 1 to process the next query in a single thread. To do this, run the query `SET max_threads = 1`.
3. Execute the query `INSERT INTO <new_table_name> SELECT * FROM <damaged_table_name>`. This query copies the non-corrupted data from the damaged table to another table. Only the data before the corrupted part will be copied.
4. Restart the `clickhouse-client` to reset the `max_threads` value.
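Put together, the procedure might look like this; `recovered_hits` and `damaged_hits` are hypothetical table names used only for illustration:

``` sql
-- Hypothetical table names; substitute your own
CREATE TABLE recovered_hits AS damaged_hits;    -- same structure, initially empty
SET max_threads = 1;                            -- copy in a single thread
INSERT INTO recovered_hits SELECT * FROM damaged_hits;
-- Restart clickhouse-client afterwards to reset max_threads
```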
## DESCRIBE TABLE
``` sql
@ -198,8 +233,8 @@ SHOW [TEMPORARY] TABLES [FROM db] [LIKE 'pattern'] [INTO OUTFILE filename] [FORM
Displays a list of tables
- Tables from the current database, or from the 'db' database if "FROM db" is specified.
- All tables, or tables whose name matches the pattern, if "LIKE 'pattern'" is specified.
This query is identical to: `SELECT name FROM system.tables WHERE database = 'db' [AND name LIKE 'pattern'] [INTO OUTFILE filename] [FORMAT format]`.
@ -207,7 +242,7 @@ See also the section "LIKE operator".
## TRUNCATE
``` sql
TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```

View File

@ -683,7 +683,20 @@ ClickHouse uses ZooKeeper to store metadata
**Example**
```xml
<zookeeper>
<node index="1">
<host>example1</host>
<port>2181</port>
</node>
<node index="2">
<host>example2</host>
<port>2181</port>
</node>
<node index="3">
<host>example3</host>
<port>2181</port>
</node>
</zookeeper>
```
[Original article](https://clickhouse.yandex/docs/ru/operations/server_settings/settings/) <!--hide-->

View File

@ -680,7 +680,20 @@ For more information, see the section "[Replication](../../operations/table_engi
**Example**
```xml
<zookeeper>
<node index="1">
<host>example1</host>
<port>2181</port>
</node>
<node index="2">
<host>example2</host>
<port>2181</port>
</node>
<node index="3">
<host>example3</host>
<port>2181</port>
</node>
</zookeeper>
```

View File

@ -584,6 +584,16 @@ public:
}
}
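/// Quarter arithmetic is expressed as three-month steps on top of addMonths.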
inline time_t addQuarters(time_t t, Int64 delta) const
{
return addMonths(t, delta * 3);
}
inline DayNum addQuarters(DayNum d, Int64 delta) const
{
return addMonths(d, delta * 3);
}
/// Saturation can occur if 29 Feb is mapped to non-leap year.
inline time_t addYears(time_t t, Int64 delta) const
{