Merge branch 'master' into CLICKHOUSE-3837

Vadim 2018-07-24 16:21:51 +03:00 committed by GitHub
commit 437f3f20a9
406 changed files with 6407 additions and 4674 deletions

5
.gitignore vendored

@ -11,9 +11,12 @@
/build
/docs/build
/docs/edit
/docs/tools/venv/
/docs/en/development/build/
/docs/ru/development/build/
/docs/en/single.md
/docs/ru/single.md
# callgrind files
callgrind.out.*
@ -238,3 +241,5 @@ node_modules
public
website/docs
website/presentations
.DS_Store
*/.DS_Store


@ -1,15 +1 @@
# en:
## Improvements:
* Added Nullable support for runningDifference function. [#2590](https://github.com/yandex/ClickHouse/issues/2590)
## Bug fixes:
* Fixed switching to the default database when the client reconnects. [#2580](https://github.com/yandex/ClickHouse/issues/2580)
# ru:
## Улучшения:
* Добавлена поддержка Nullable для функции runningDifference. [#2590](https://github.com/yandex/ClickHouse/issues/2590)
## Исправление ошибок:
* Исправлено переключение на дефолтную базу данных при переподключении клиента. [#2580](https://github.com/yandex/ClickHouse/issues/2580)


@ -1,6 +1,54 @@
# ClickHouse release 1.1.54388, 2018-06-28
## ClickHouse release 1.1.54394, 2018-07-12
## New features:
### New features:
* Added the `histogram` aggregate function ([Mikhail Surin](https://github.com/yandex/ClickHouse/pull/2521)).
* Now `OPTIMIZE TABLE ... FINAL` can be used without specifying partitions for `ReplicatedMergeTree` ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2600)).
### Bug fixes:
* Fixed a problem with a very small timeout for sockets (one second) for reading and writing when sending and downloading replicated data, which made it impossible to download larger parts if there is a load on the network or disk (it resulted in cyclical attempts to download parts). This error occurred in version 1.1.54388.
* Fixed issues when using chroot in ZooKeeper if you inserted duplicate data blocks in the table.
* The `has` function now works correctly for an array with Nullable elements ([#2115](https://github.com/yandex/ClickHouse/issues/2115)).
* The `system.tables` table now works correctly when used in distributed queries. The `metadata_modification_time` and `engine_full` columns are now non-virtual. Fixed an error that occurred if only these columns were requested from the table.
* Fixed how an empty `TinyLog` table works after inserting an empty data block ([#2563](https://github.com/yandex/ClickHouse/issues/2563)).
* The `system.zookeeper` table works if the value of the node in ZooKeeper is NULL.
## ClickHouse release 1.1.54390, 2018-07-06
### New features:
* Queries can be sent in `multipart/form-data` format (in the `query` field), which is useful if external data is also sent for query processing ([Olga Hvostikova](https://github.com/yandex/ClickHouse/pull/2490)).
* Added the ability to enable or disable processing single or double quotes when reading data in CSV format. You can configure this in the `format_csv_allow_single_quotes` and `format_csv_allow_double_quotes` settings ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2574)).
* Now `OPTIMIZE TABLE ... FINAL` can be used without specifying the partition for non-replicated variants of `MergeTree` ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2599)).
### Improvements:
* Improved performance, reduced memory consumption, and correct tracking of memory consumption with use of the IN operator when a table index could be used ([#2584](https://github.com/yandex/ClickHouse/pull/2584)).
* Removed redundant checking of checksums when adding a data part. This is important when there are a large number of replicas, because in these cases the total number of checks was equal to N^2.
* Added support for `Array(Tuple(...))` arguments for the `arrayEnumerateUniq` function ([#2573](https://github.com/yandex/ClickHouse/pull/2573)).
* Added `Nullable` support for the `runningDifference` function ([#2594](https://github.com/yandex/ClickHouse/pull/2594)).
* Improved query analysis performance when there is a very large number of expressions ([#2572](https://github.com/yandex/ClickHouse/pull/2572)).
* Faster selection of data parts for merging in `ReplicatedMergeTree` tables. Faster recovery of the ZooKeeper session ([#2597](https://github.com/yandex/ClickHouse/pull/2597)).
* The `format_version.txt` file for `MergeTree` tables is re-created if it is missing, which makes sense if ClickHouse is launched after copying the directory structure without files ([Ciprian Hacman](https://github.com/yandex/ClickHouse/pull/2593)).
### Bug fixes:
* Fixed a bug when working with ZooKeeper that could make it impossible to recover the session and readonly states of tables before restarting the server.
* Fixed a bug when working with ZooKeeper that could result in old nodes not being deleted if the session is interrupted.
* Fixed an error in the `quantileTDigest` function for Float arguments (this bug was introduced in version 1.1.54388) ([Mikhail Surin](https://github.com/yandex/ClickHouse/pull/2553)).
* Fixed a bug in the index for MergeTree tables if the primary key column is located inside the function for converting types between signed and unsigned integers of the same size ([#2603](https://github.com/yandex/ClickHouse/pull/2603)).
* Fixed segfault if `macros` are used but they aren't in the config file ([#2570](https://github.com/yandex/ClickHouse/pull/2570)).
* Fixed switching to the default database when reconnecting the client ([#2583](https://github.com/yandex/ClickHouse/pull/2583)).
* Fixed a bug that occurred when the `use_index_for_in_with_subqueries` setting was disabled.
### Security fix:
* Sending files is no longer possible when connected to MySQL (`LOAD DATA LOCAL INFILE`).
## ClickHouse release 1.1.54388, 2018-06-28
### New features:
* Support for the `ALTER TABLE t DELETE WHERE` query for replicated tables. Added the `system.mutations` table to track progress of this type of queries.
* Support for the `ALTER TABLE t [REPLACE|ATTACH] PARTITION` query for MergeTree tables.
@ -18,12 +66,12 @@
* Added the `date_time_input_format` setting. If you switch this setting to `'best_effort'`, DateTime values will be read in a wide range of formats.
* Added the `clickhouse-obfuscator` utility for data obfuscation. Usage example: publishing data used in performance tests.
## Experimental features:
### Experimental features:
* Added the ability to calculate `and` arguments only where they are needed ([Anastasia Tsarkova](https://github.com/yandex/ClickHouse/pull/2272)).
* JIT compilation to native code is now available for some expressions ([pyos](https://github.com/yandex/ClickHouse/pull/2277)).
## Bug fixes:
### Bug fixes:
* Duplicates no longer appear for a query with `DISTINCT` and `ORDER BY`.
* Queries with `ARRAY JOIN` and `arrayFilter` no longer return an incorrect result.
@ -45,7 +93,7 @@
* Fixed SSRF in the `remote()` table function.
* Fixed exit behavior of `clickhouse-client` in multiline mode ([#2510](https://github.com/yandex/ClickHouse/issues/2510)).
## Improvements:
### Improvements:
* Background tasks in replicated tables are now performed in a thread pool instead of in separate threads ([Silviu Caragea](https://github.com/yandex/ClickHouse/pull/1722)).
* Improved LZ4 compression performance.
@ -58,7 +106,7 @@
* When calculating the number of available CPU cores, limits on cgroups are now taken into account ([Atri Sharma](https://github.com/yandex/ClickHouse/pull/2325)).
* Added chown for config directories in the systemd config file ([Mikhail Shiryaev](https://github.com/yandex/ClickHouse/pull/2421)).
## Build changes:
### Build changes:
* The gcc8 compiler can be used for builds.
* Added the ability to build llvm from a submodule.
@ -69,36 +117,36 @@
* Added the ability to use the libtinfo library instead of libtermcap ([Georgy Kondratiev](https://github.com/yandex/ClickHouse/pull/2519)).
* Fixed a header file conflict in Fedora Rawhide ([#2520](https://github.com/yandex/ClickHouse/issues/2520)).
## Backward incompatible changes:
### Backward incompatible changes:
* Removed escaping in `Vertical` and `Pretty*` formats and deleted the `VerticalRaw` format.
* If servers running version 1.1.54388 (or newer) and servers running an older version are used simultaneously in a distributed query, and the query contains a `cast(x, 'Type')` expression without the `AS` keyword and with `cast` not in uppercase, an exception is thrown with a message like `Not found column cast(0, 'UInt8') in block`. Solution: update the server on all cluster nodes.
# ClickHouse release 1.1.54385, 2018-06-01
## ClickHouse release 1.1.54385, 2018-06-01
## Bug fixes:
### Bug fixes:
* Fixed an error that in some cases caused ZooKeeper operations to block.
# ClickHouse release 1.1.54383, 2018-05-22
## ClickHouse release 1.1.54383, 2018-05-22
## Bug fixes:
### Bug fixes:
* Fixed a slowdown of the replication queue if a table has many replicas.
# ClickHouse release 1.1.54381, 2018-05-14
## ClickHouse release 1.1.54381, 2018-05-14
## Bug fixes:
### Bug fixes:
* Fixed a node leak in ZooKeeper when ClickHouse loses its connection to the ZooKeeper server.
# ClickHouse release 1.1.54380, 2018-04-21
## ClickHouse release 1.1.54380, 2018-04-21
## New features:
### New features:
* Added the table function `file(path, format, structure)`. An example that reads bytes from `/dev/urandom`: run `ln -s /dev/urandom /var/lib/clickhouse/user_files/random`, then `clickhouse-client -q "SELECT * FROM file('random', 'RowBinary', 'd UInt8') LIMIT 10"`.
## Improvements:
### Improvements:
* Subqueries can be wrapped in `()` parentheses to improve query readability. For example, `(SELECT 1) UNION ALL (SELECT 1)`.
* Simple `SELECT` queries from the `system.processes` table are not counted toward the `max_concurrent_queries` limit.
## Bug fixes:
### Bug fixes:
* Fixed incorrect behavior of the `IN` operator when selecting from a `MATERIALIZED VIEW`.
* Fixed incorrect filtering by partition index in expressions like `WHERE partition_key_column IN (...)`.
* Fixed the inability to execute an `OPTIMIZE` query on a non-leader replica if the table was `RENAME`d.
@ -106,11 +154,11 @@
* Fixed freezing of `KILL QUERY` queries.
* Fixed an error in the ZooKeeper client library that led to lost watches, freezing of the distributed DDL queue, and a slowdown of the replication queue when a non-empty `chroot` prefix is used in the ZooKeeper configuration.
## Backward incompatible changes:
### Backward incompatible changes:
* Removed support for expressions like `(a, b) IN (SELECT (a, b))` (use the equivalent `(a, b) IN (SELECT a, b)` instead). In previous releases, these expressions led to undefined data filtering or caused errors.
# ClickHouse release 1.1.54378, 2018-04-16
## New features:
## ClickHouse release 1.1.54378, 2018-04-16
### New features:
* Logging level can be changed without restarting the server.
* Added the `SHOW CREATE DATABASE` query.
@ -124,7 +172,7 @@
* Multiple comma-separated `topics` can be specified for the `Kafka` engine (Tobias Adamson).
* When a query is stopped by `KILL QUERY` or `replace_running_query`, the client receives the `Query was cancelled` exception instead of an incomplete response.
## Improvements:
### Improvements:
* `ALTER TABLE ... DROP/DETACH PARTITION` queries are run at the front of the replication queue.
* `SELECT ... FINAL` and `OPTIMIZE ... FINAL` can be used even when the table has a single data part.
@ -135,7 +183,7 @@
* More robust crash recovery for asynchronous insertion into `Distributed` tables.
* The return type of the `countEqual` function changed from `UInt32` to `UInt64` (谢磊).
## Bug fixes:
### Bug fixes:
* Fixed an error with `IN` when the left side of the expression is `Nullable`.
* Correct results are now returned when using tuples with `IN` when some of the tuple components are in the table index.
@ -151,31 +199,31 @@
* `SummingMergeTree` now works correctly for summation of nested data structures with a composite key.
* Fixed the possibility of a race condition when choosing the leader for `ReplicatedMergeTree` tables.
## Build changes:
### Build changes:
* The build supports `ninja` instead of `make` and uses it by default for building releases.
* Renamed packages: `clickhouse-server-base` is now `clickhouse-common-static`; `clickhouse-server-common` is now `clickhouse-server`; `clickhouse-common-dbg` is now `clickhouse-common-static-dbg`. To install, use `clickhouse-server clickhouse-client`. Packages with the old names will still load in the repositories for backward compatibility.
## Backward-incompatible changes:
### Backward-incompatible changes:
* Removed the special interpretation of an IN expression if an array is specified on the left side. Previously, the expression `arr IN (set)` was interpreted as "at least one `arr` element belongs to the `set`". To get the same behavior in the new version, write `arrayExists(x -> x IN (set), arr)`.
* Disabled the incorrect use of the socket option `SO_REUSEPORT`, which was incorrectly enabled by default in the Poco library. Note that on Linux there is no longer any reason to specify the addresses `::` and `0.0.0.0` for listening at the same time. Use just `::`, which allows listening for connections over both IPv4 and IPv6 (with the default kernel config settings). You can also revert to the behavior from previous versions by specifying `<listen_reuse_port>1</listen_reuse_port>` in the config.
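
The removed `arr IN (set)` interpretation ("at least one `arr` element belongs to the `set`") and its `arrayExists(x -> x IN (set), arr)` replacement can be pictured with a small Python model. This is only a sketch of the semantics described in the entry above, not ClickHouse code; `array_exists` and `allowed` are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def array_exists(pred: Callable[[T], bool], arr: Iterable[T]) -> bool:
    """Hypothetical model of arrayExists(lambda, arr):
    true if the predicate holds for at least one element."""
    return any(pred(x) for x in arr)


# Stand-in for the right-hand side `(set)` of the IN expression.
allowed = {1, 2, 3}

# One element (3) belongs to the set, so the old `arr IN (set)` reading was true.
hit = array_exists(lambda x: x in allowed, [5, 3, 9])

# No element belongs to the set.
miss = array_exists(lambda x: x in allowed, [5, 9])

# An empty array has no element that could match.
empty = array_exists(lambda x: x in allowed, [])

print(hit, miss, empty)
```

Writing the lambda explicitly makes the "any element" reading visible in the query, which is presumably why the implicit form was dropped.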
# ClickHouse release 1.1.54370, 2018-03-16
## ClickHouse release 1.1.54370, 2018-03-16
## New features:
### New features:
* Added the `system.macros` table and auto updating of macros when the config file is changed.
* Added the `SYSTEM RELOAD CONFIG` query.
* Added the `maxIntersections(left_col, right_col)` aggregate function, which returns the maximum number of simultaneously intersecting intervals `[left; right]`. The `maxIntersectionsPosition(left, right)` function returns the beginning of the "maximum" interval. ([Michael Furmur](https://github.com/yandex/ClickHouse/pull/2012)).
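
To make the `maxIntersections` description concrete, here is a minimal sweep-line sketch in Python that counts the maximum number of simultaneously open intervals. It is an illustration of the idea only, not the server's implementation. In particular, the tie rule (an interval that starts exactly where another ends counts as intersecting, following the closed `[left; right]` notation) is an assumption, and `max_intersections` is a hypothetical name.

```python
from typing import Iterable, List, Tuple


def max_intersections(intervals: Iterable[Tuple[float, float]]) -> int:
    """Return the largest number of intervals that overlap at any single point."""
    events: List[Tuple[float, int]] = []
    for left, right in intervals:
        events.append((left, 1))    # interval opens
        events.append((right, -1))  # interval closes

    # Assumption: at equal coordinates, process openings before closings,
    # so closed intervals that merely touch are counted as intersecting.
    events.sort(key=lambda e: (e[0], -e[1]))

    best = 0
    current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


# Three intervals overlap on [3, 4]; the fourth is far away.
print(max_intersections([(1, 4), (2, 5), (3, 6), (10, 11)]))
```

Sorting the 2n endpoints dominates the cost, so the sketch runs in O(n log n).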
## Improvements:
### Improvements:
* When inserting data in a `Replicated` table, fewer requests are made to `ZooKeeper` (and most of the user-level errors have disappeared from the `ZooKeeper` log).
* Added the ability to create aliases for sets. Example: `WITH (1, 2, 3) AS set SELECT number IN set FROM system.numbers LIMIT 10`.
## Bug fixes:
### Bug fixes:
* Fixed the `Illegal PREWHERE` error when reading from `Merge` tables over `Distributed` tables.
* Added fixes that allow you to run `clickhouse-server` in IPv4-only Docker containers.
@ -189,9 +237,9 @@
* Restored the behavior for queries like `SELECT * FROM remote('server2', default.table) WHERE col IN (SELECT col2 FROM default.table)` when the right side argument of the `IN` should use a remote `default.table` instead of a local one. This behavior was broken in version 1.1.54358.
* Removed extraneous error-level logging of `Not found column ... in block`.
# ClickHouse release 1.1.54356, 2018-03-06
## ClickHouse release 1.1.54356, 2018-03-06
## New features:
### New features:
* Aggregation without `GROUP BY` for an empty set (such as `SELECT count(*) FROM table WHERE 0`) now returns a result with one row with null values for aggregate functions, in compliance with the SQL standard. To restore the old behavior (return an empty result), set `empty_result_for_aggregation_by_empty_set` to 1.
* Added type conversion for `UNION ALL`. Different alias names are allowed in `SELECT` positions in `UNION ALL`, in compliance with the SQL standard.
@ -226,7 +274,7 @@
* `RENAME TABLE` can be performed for `VIEW`.
* Added the `odbc_default_field_size` option, which allows you to extend the maximum size of the value loaded from an ODBC source (by default, it is 1024).
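
The empty-set aggregation change in this release follows the SQL standard: an aggregate without `GROUP BY` over zero rows still yields one row. The sketch below shows that behavior using Python's built-in `sqlite3` module as a stand-in engine. It does not talk to ClickHouse, and the table `t` and column `x` are made up for the example.

```python
import sqlite3

# In-memory SQLite database used only to demonstrate standard SQL behavior.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1), (2), (3)")

# The filter removes every row, yet the aggregate query returns exactly one row.
rows = conn.execute("SELECT count(*), sum(x) FROM t WHERE 0").fetchall()
print(rows)  # [(0, None)]

conn.close()
```

Here `count(*)` is 0 and `sum(x)` is NULL (`None` in Python). The old ClickHouse behavior described above would instead have returned no rows at all.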
## Improvements:
### Improvements:
* Limits and quotas on the result are no longer applied to intermediate data for `INSERT SELECT` queries or for `SELECT` subqueries.
* Fewer false triggers of `force_restore_data` when checking the status of `Replicated` tables when the server starts.
@ -242,7 +290,7 @@
* `Enum` values can be used in `min`, `max`, `sum` and some other functions. In these cases, it uses the corresponding numeric values. This feature was previously available but was lost in the release 1.1.54337.
* Added `max_expanded_ast_elements` to restrict the size of the AST after recursively expanding aliases.
## Bug fixes:
### Bug fixes:
* Fixed cases when unnecessary columns were removed from subqueries in error, or not removed from subqueries containing `UNION ALL`.
* Fixed a bug in merges for `ReplacingMergeTree` tables.
@ -268,18 +316,18 @@
* Fixed a crash when passing arrays of different sizes to an `arrayReduce` function when using aggregate functions from multiple arguments.
* Prohibited the use of queries with `UNION ALL` in a `MATERIALIZED VIEW`.
## Backward incompatible changes:
### Backward incompatible changes:
* Removed the `distributed_ddl_allow_replicated_alter` option. This behavior is enabled by default.
* Removed the `UnsortedMergeTree` engine.
# ClickHouse release 1.1.54343, 2018-02-05
## ClickHouse release 1.1.54343, 2018-02-05
* Added macros support for defining cluster names in distributed DDL queries and constructors of Distributed tables: `CREATE TABLE distr ON CLUSTER '{cluster}' (...) ENGINE = Distributed('{cluster}', 'db', 'table')`.
* Now the table index is used for conditions like `expr IN (subquery)`.
* Improved processing of duplicates when inserting to Replicated tables, so they no longer slow down execution of the replication queue.
# ClickHouse release 1.1.54342, 2018-01-22
## ClickHouse release 1.1.54342, 2018-01-22
This release contains bug fixes for the previous release 1.1.54337:
* Fixed a regression in 1.1.54337: if the default user has readonly access, then the server refuses to start up with the message `Cannot create database in readonly mode`.
@ -290,9 +338,9 @@ This release contains bug fixes for the previous release 1.1.54337:
* Buffer tables now work correctly when MATERIALIZED columns are present in the destination table (by zhang2014).
* Fixed a bug in implementation of NULL.
# ClickHouse release 1.1.54337, 2018-01-18
## ClickHouse release 1.1.54337, 2018-01-18
## New features:
### New features:
* Added support for storage of multidimensional arrays and tuples (`Tuple` data type) in tables.
* Added support for table functions in `DESCRIBE` and `INSERT` queries. Added support for subqueries in `DESCRIBE`. Examples: `DESC TABLE remote('host', default.hits)`; `DESC TABLE (SELECT 1)`; `INSERT INTO TABLE FUNCTION remote('host', default.hits)`. Support for `INSERT INTO TABLE` syntax in addition to `INSERT INTO`.
@ -323,7 +371,7 @@ This release contains bug fixes for the previous release 1.1.54337:
* Added the `--silent` option for the `clickhouse-local` tool. It suppresses printing query execution info in stderr.
* Added support for reading values of type `Date` from text in a format where the month and/or day of the month is specified using a single digit instead of two digits (Amos Bird).
## Performance optimizations:
### Performance optimizations:
* Improved performance of `min`, `max`, `any`, `anyLast`, `anyHeavy`, `argMin`, `argMax` aggregate functions for String arguments.
* Improved performance of `isInfinite`, `isFinite`, `isNaN`, `roundToExp2` functions.
@ -332,7 +380,7 @@ This release contains bug fixes for the previous release 1.1.54337:
* Lowered memory usage for `JOIN` in the case when the left and right parts have columns with identical names that are not contained in `USING`.
* Improved performance of `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, and `corr` aggregate functions by reducing computational stability. The old functions are available under the names: `varSampStable`, `varPopStable`, `stddevSampStable`, `stddevPopStable`, `covarSampStable`, `covarPopStable`, `corrStable`.
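
The trade-off between speed and computational stability mentioned for `varSamp` and its `*Stable` counterpart can be illustrated with two textbook formulas for sample variance. This is a sketch under assumptions: the changelog does not say which algorithms ClickHouse uses, so the one-pass sum-of-squares formula and Welford's method below are stand-ins, and both function names are hypothetical.

```python
from typing import Sequence


def var_samp_naive(xs: Sequence[float]) -> float:
    """Sum-of-squares formula: cheap, but subtracts two large,
    nearly equal numbers and can lose precision."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def var_samp_welford(xs: Sequence[float]) -> float:
    """Welford's method: more work per element, numerically stable."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)


# Values with a large common offset; the exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(var_samp_welford(data))
print(var_samp_naive(data))  # may differ from 30 because of cancellation
```

On small, well-scaled inputs the two formulas agree, which is why the faster variant is usually acceptable.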
## Bug fixes:
### Bug fixes:
* Fixed data deduplication after running a `DROP PARTITION` query. In the previous version, dropping a partition and INSERTing the same data again was not working because INSERTed blocks were considered duplicates.
* Fixed a bug that could lead to incorrect interpretation of the `WHERE` clause for `CREATE MATERIALIZED VIEW` queries with `POPULATE`.
@ -371,7 +419,7 @@ This release contains bug fixes for the previous release 1.1.54337:
* Fixed the `SYSTEM DROP DNS CACHE` query: the cache was flushed but addresses of cluster nodes were not updated.
* Fixed the behavior of `MATERIALIZED VIEW` after executing `DETACH TABLE` for the table under the view (Marek Vavruša).
## Build improvements:
### Build improvements:
* Builds use `pbuilder`. The build process is almost completely independent of the build host environment.
* A single build is used for different OS versions. Packages and binaries have been made compatible with a wide range of Linux systems.
@ -385,7 +433,7 @@ This release contains bug fixes for the previous release 1.1.54337:
* Removed usage of GNU extensions from the code. Enabled the `-Wextra` option. When building with `clang`, `libc++` is used instead of `libstdc++`.
* Extracted `clickhouse_parsers` and `clickhouse_common_io` libraries to speed up builds of various tools.
## Backward incompatible changes:
### Backward incompatible changes:
* The format for marks in `Log` type tables that contain `Nullable` columns was changed in a backward incompatible way. If you have these tables, you should convert them to the `TinyLog` type before starting up the new server version. To do this, replace `ENGINE = Log` with `ENGINE = TinyLog` in the corresponding `.sql` file in the `metadata` directory. If your table doesn't have `Nullable` columns or if the type of your table is not `Log`, then you don't need to do anything.
* Removed the `experimental_allow_extended_storage_definition_syntax` setting. Now this feature is enabled by default.
@ -396,16 +444,16 @@ This release contains bug fixes for the previous release 1.1.54337:
* In previous server versions there was an undocumented feature: if an aggregate function depends on parameters, you can still specify it without parameters in the AggregateFunction data type. Example: `AggregateFunction(quantiles, UInt64)` instead of `AggregateFunction(quantiles(0.5, 0.9), UInt64)`. This feature was lost. Although it was undocumented, we plan to support it again in future releases.
* Enum data types cannot be used in min/max aggregate functions. This ability will be restored in a future release.
## Please note when upgrading:
### Please note when upgrading:
* When doing a rolling update on a cluster, at the point when some of the replicas are running the old version of ClickHouse and some are running the new version, replication is temporarily stopped and the message `unknown parameter 'shard'` appears in the log. Replication will continue after all replicas of the cluster are updated.
* If you have different ClickHouse versions on the cluster, you can get incorrect results for distributed queries with the aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, and `corr`. You should update all cluster nodes.
# ClickHouse release 1.1.54327, 2017-12-21
## ClickHouse release 1.1.54327, 2017-12-21
This release contains bug fixes for the previous release 1.1.54318:
* Fixed bug with possible race condition in replication that could lead to data loss. This issue affects versions 1.1.54310 and 1.1.54318. If you use one of these versions with Replicated tables, the update is strongly recommended. This issue shows in logs in Warning messages like `Part ... from own log doesn't exist.` The issue is relevant even if you don't see these messages in logs.
# ClickHouse release 1.1.54318, 2017-11-30
## ClickHouse release 1.1.54318, 2017-11-30
This release contains bug fixes for the previous release 1.1.54310:
* Fixed incorrect row deletions during merges in the SummingMergeTree engine
@ -414,9 +462,9 @@ This release contains bug fixes for the previous release 1.1.54310:
* Fixed an issue that was causing the replication queue to stop running
* Fixed rotation and archiving of server logs
# ClickHouse release 1.1.54310, 2017-11-01
## ClickHouse release 1.1.54310, 2017-11-01
## New features:
### New features:
* Custom partitioning key for the MergeTree family of table engines.
* [Kafka](https://clickhouse.yandex/docs/en/single/index.html#document-table_engines/kafka) table engine.
* Added support for loading [CatBoost](https://catboost.yandex/) models and applying them to data stored in ClickHouse.
@ -432,12 +480,12 @@ This release contains bug fixes for the previous release 1.1.54310:
* Added support for the Cap'n Proto input format.
* You can now customize compression level when using the zstd algorithm.
## Backward incompatible changes:
### Backward incompatible changes:
* Creation of temporary tables with an engine other than Memory is forbidden.
* Explicit creation of tables with the View or MaterializedView engine is forbidden.
* During table creation, a new check verifies that the sampling key expression is included in the primary key.
## Bug fixes:
### Bug fixes:
* Fixed hangups when synchronously inserting into a Distributed table.
* Fixed nonatomic adding and removing of parts in Replicated tables.
* Data inserted into a materialized view is not subjected to unnecessary deduplication.
@ -447,15 +495,15 @@ This release contains bug fixes for the previous release 1.1.54310:
* Fixed hangups when the disk volume containing server logs is full.
* Fixed an overflow in the `toRelativeWeekNum` function for the first week of the Unix epoch.
## Build improvements:
### Build improvements:
* Several third-party libraries (notably Poco) were updated and converted to git submodules.
# ClickHouse release 1.1.54304, 2017-10-19
## ClickHouse release 1.1.54304, 2017-10-19
## New features:
### New features:
* TLS support in the native protocol (to enable, set `tcp_ssl_port` in `config.xml`)
## Bug fixes:
### Bug fixes:
* `ALTER` for replicated tables now tries to start running as soon as possible
* Fixed crashing when reading data with the setting `preferred_block_size_bytes=0`
* Fixed crashes of `clickhouse-client` when `Page Down` is pressed
@ -468,16 +516,16 @@ This release contains bug fixes for the previous release 1.1.54310:
* Users are updated correctly when `users.xml` is invalid
* Correct handling when an executable dictionary returns a non-zero response code
# ClickHouse release 1.1.54292, 2017-09-20
## ClickHouse release 1.1.54292, 2017-09-20
## New features:
### New features:
* Added the `pointInPolygon` function for working with coordinates on a coordinate plane.
* Added the `sumMap` aggregate function for calculating the sum of arrays, similar to `SummingMergeTree`.
* Added the `trunc` function. Improved performance of the rounding functions (`round`, `floor`, `ceil`, `roundToExp2`) and corrected the logic of how they work. Changed the logic of the `roundToExp2` function for fractions and negative numbers.
* The ClickHouse executable file is now less dependent on the libc version. The same ClickHouse executable file can run on a wide variety of Linux systems. Note: There is still a dependency when using compiled queries (with the setting `compile = 1`, which is not used by default).
* Reduced the time needed for dynamic compilation of queries.
## Bug fixes:
### Bug fixes:
* Fixed an error that sometimes produced `part ... intersects previous part` messages and weakened replica consistency.
* Fixed an error that caused the server to lock up if ZooKeeper was unavailable during shutdown.
* Removed excessive logging when restoring replicas.
@ -485,9 +533,9 @@ This release contains bug fixes for the previous release 1.1.54310:
* Fixed an error in the concat function that occurred if the first column in a block has the Array type.
* Progress is now displayed correctly in the system.merges table.
# ClickHouse release 1.1.54289, 2017-09-13
## ClickHouse release 1.1.54289, 2017-09-13
## New features:
### New features:
* `SYSTEM` queries for server administration: `SYSTEM RELOAD DICTIONARY`, `SYSTEM RELOAD DICTIONARIES`, `SYSTEM DROP DNS CACHE`, `SYSTEM SHUTDOWN`, `SYSTEM KILL`.
* Added functions for working with arrays: `concat`, `arraySlice`, `arrayPushBack`, `arrayPushFront`, `arrayPopBack`, `arrayPopFront`.
* Added the `root` and `identity` parameters for the ZooKeeper configuration. This allows you to isolate individual users on the same ZooKeeper cluster.
@ -502,7 +550,7 @@ This release contains bug fixes for the previous release 1.1.54310:
* Option to set `umask` in the config file.
* Improved performance for queries with `DISTINCT`.
## Bug fixes:
### Bug fixes:
* Improved the process for deleting old nodes in ZooKeeper. Previously, old nodes sometimes didn't get deleted if there were very frequent inserts, which caused the server to be slow to shut down, among other things.
* Fixed randomization when choosing hosts for the connection to ZooKeeper.
* Fixed the exclusion of lagging replicas in distributed queries if the replica is localhost.
@ -515,28 +563,28 @@ This release contains bug fixes for the previous release 1.1.54310:
* Resolved the appearance of zombie processes when using a dictionary with an `executable` source.
* Fixed segfault for the HEAD query.
## Improvements to development workflow and ClickHouse build:
### Improvements to development workflow and ClickHouse build:
* You can use `pbuilder` to build ClickHouse.
* You can use `libc++` instead of `libstdc++` for builds on Linux.
* Added instructions for using static code analysis tools: `Coverity`, `clang-tidy`, and `cppcheck`.
## Please note when upgrading:
### Please note when upgrading:
* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT requests will fail with the message "Merges are processing significantly slower than inserts." Use the `SELECT * FROM system.merges` request to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting (to do this, go to the `<merge_tree>` section in config.xml, set `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server).
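
The byte values behind this note are easy to check. The sketch below converts the old and new defaults from GiB to bytes and models the "free space less than twice the running merges" condition from the paragraph above. `merges_blocked` is a hypothetical helper written for illustration, not a ClickHouse function.

```python
GIB = 2 ** 30  # bytes in one gibibyte

old_limit = 100 * GIB  # previous default of max_bytes_to_merge_at_max_space_in_pool
new_limit = 150 * GIB  # new default

print(old_limit)  # 107374182400, the value to put in config.xml to restore the old limit
print(new_limit)  # 161061273600


def merges_blocked(free_bytes: int, running_merge_bytes: int) -> bool:
    """Hypothetical model of the condition in the note: other merges stop
    when free space is less than twice the total size of the running merges."""
    return free_bytes < 2 * running_merge_bytes


# A single 150 GiB merge on a server with 250 GiB free would block other merges.
print(merges_blocked(250 * GIB, new_limit))
```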
# ClickHouse release 1.1.54284, 2017-08-29
## ClickHouse release 1.1.54284, 2017-08-29
* This is a bugfix release for the previous release, 1.1.54282. It fixes a leak of ZooKeeper nodes in the `parts/` directory.
# ClickHouse release 1.1.54282, 2017-08-23
## ClickHouse release 1.1.54282, 2017-08-23
This is a bugfix release. The following bugs were fixed:
* `DB::Exception: Assertion violation: !_path.empty()` error when inserting into a Distributed table.
* Error when parsing inserted data in RowBinary format if the data begins with the `;` character.
* Errors during runtime compilation of certain aggregate functions (e.g. `groupArray()`).
# ClickHouse release 1.1.54276, 2017-08-16
## ClickHouse release 1.1.54276, 2017-08-16
## New features:
### New features:
* You can use an optional WITH clause in a SELECT query. Example query: `WITH 1+1 AS a SELECT a, a*a`
* INSERT can be performed synchronously in a Distributed table: OK is returned only after all the data is saved on all the shards. This is activated by the setting insert_distributed_sync=1.
@ -547,7 +595,7 @@ This is a bugfix release. The following bugs were fixed:
* Added support for non-constant arguments and negative offsets in the function `substring(str, pos, len)`.
* Added the max_size parameter for the `groupArray(max_size)(column)` aggregate function, and optimized its performance.
## Major changes:
### Major changes:
* Improved security: all server files are created with 0640 permissions (can be changed via the `<umask>` config parameter).
* Improved error messages for queries with invalid syntax.
@ -555,11 +603,11 @@ This is a bugfix release. The following bugs were fixed:
* Significantly increased the performance of data merges for the ReplacingMergeTree engine.
* Improved performance for asynchronous inserts from a Distributed table by batching multiple source inserts. To enable this functionality, use the setting `distributed_directory_monitor_batch_inserts=1`.
## Backward incompatible changes:
### Backward incompatible changes:
* Changed the binary format of aggregate states of `groupArray(array_column)` functions for arrays.
## Complete list of changes:
### Complete list of changes:
* Added the `output_format_json_quote_denormals` setting, which enables outputting nan and inf values in JSON format.
* Optimized thread allocation when reading from a Distributed table.
@ -578,7 +626,7 @@ This is a bugfix release. The following bugs were fixed:
* It is possible to connect to MySQL through a socket in the file system.
* The `system.parts` table has a new column with information about the size of marks, in bytes.
## Bug fixes:
### Bug fixes:
* Distributed tables using a Merge table now work correctly for a SELECT query with a condition on the _table field.
* Fixed a rare race condition in ReplicatedMergeTree when checking data parts.
@ -602,15 +650,15 @@ This is a bugfix release. The following bugs were fixed:
* Fixed the "Cannot mremap" error when using arrays in IN and JOIN clauses with more than 2 billion elements.
* Fixed the failover for dictionaries with MySQL as the source.
## Improved workflow for developing and assembling ClickHouse:
### Improved workflow for developing and assembling ClickHouse:
* Builds can be assembled in Arcadia.
* You can use gcc 7 to compile ClickHouse.
* Parallel builds using ccache+distcc are faster now.
# ClickHouse release 1.1.54245, 2017-07-04
## ClickHouse release 1.1.54245, 2017-07-04
## New features:
### New features:
* Distributed DDL (for example, `CREATE TABLE ON CLUSTER`).
* The replicated request `ALTER TABLE CLEAR COLUMN IN PARTITION`.
@ -622,16 +670,16 @@ This is a bugfix release. The following bugs were fixed:
* Sessions in the HTTP interface.
* The OPTIMIZE query for a Replicated table can now run not only on the leader.
## Backward incompatible changes:
### Backward incompatible changes:
* Removed SET GLOBAL.
## Minor changes:
### Minor changes:
* If an alert is triggered, the full stack trace is printed into the log.
* Relaxed the verification of the number of damaged or extra data parts at startup (there were too many false positives).
## Bug fixes:
### Bug fixes:
* Fixed a bad connection "sticking" when inserting into a Distributed table.
* GLOBAL IN now works for a query from a Merge table that looks at a Distributed table.

View File

@ -1,6 +1,50 @@
# ClickHouse release 1.1.54388, 2018-06-28
## ClickHouse release 1.1.54394, 2018-07-12
## New features:
### New features:
* Added the `histogram` aggregate function ([Mikhail Surin](https://github.com/yandex/ClickHouse/pull/2521)).
* `OPTIMIZE TABLE ... FINAL` can now be used without specifying a partition for `ReplicatedMergeTree` ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2600)).
### Bug fixes:
* Fixed a bug where the socket timeout for reading and writing was set too low (one second) when sending and downloading replicated data. This made it impossible to download sufficiently large parts under some load on the network or disk (attempts to download the parts were repeated cyclically). The bug appeared in version 1.1.54388.
* Fixed issues when using chroot in ZooKeeper if duplicate data blocks were inserted into the table.
* Fixed the `has` function for arrays with Nullable elements ([#2115](https://github.com/yandex/ClickHouse/issues/2521)).
* Fixed the `system.tables` table when it is used in distributed queries; the `metadata_modification_time` and `engine_full` columns are now non-virtual; fixed an error that occurred if only these columns were requested from the table.
* Fixed how an empty `TinyLog` table works after an empty data block is inserted into it ([#2563](https://github.com/yandex/ClickHouse/issues/2563)).
* The `system.zookeeper` table works if the value of a node in ZooKeeper is NULL.
## ClickHouse release 1.1.54390, 2018-07-06
### New features:
* Queries can be sent in `multipart/form-data` format (in the `query` field), which is useful if external data is also sent for query processing ([Olga Khvostikova](https://github.com/yandex/ClickHouse/pull/2490)).
* Added the ability to enable or disable processing of single or double quotes when reading data in CSV format. This is configured with the `format_csv_allow_single_quotes` and `format_csv_allow_double_quotes` settings ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2574)).
* `OPTIMIZE TABLE ... FINAL` can now be used without specifying a partition for non-replicated variants of `MergeTree` ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2599)).
### Improvements:
* Improved performance, reduced memory consumption, and added correct memory consumption tracking when the IN operator is used in cases where a table index can be used for it ([#2584](https://github.com/yandex/ClickHouse/pull/2584)).
* Removed redundant checksum checks when adding a data part. This matters when there is a large number of replicas, because in that case the total number of checks was equal to N^2.
* Added support for `Array(Tuple(...))` arguments for the `arrayEnumerateUniq` function ([#2573](https://github.com/yandex/ClickHouse/pull/2573)).
* Added `Nullable` support for the `runningDifference` function ([#2594](https://github.com/yandex/ClickHouse/pull/2594)).
* Improved query analysis performance when there is a very large number of expressions ([#2572](https://github.com/yandex/ClickHouse/pull/2572)).
* Faster selection of data parts to merge in `ReplicatedMergeTree` tables. Faster recovery of the ZooKeeper session ([#2597](https://github.com/yandex/ClickHouse/pull/2597)).
* The `format_version.txt` file for `MergeTree` tables is re-created if it is missing, which makes sense if ClickHouse is launched after copying the directory structure without files ([Ciprian Hacman](https://github.com/yandex/ClickHouse/pull/2593)).
### Bug fixes:
* Fixed a bug when working with ZooKeeper that could make it impossible to recover the session and leave tables in a read-only state until the server was restarted.
* Fixed a bug when working with ZooKeeper that could result in old nodes not being deleted if the session is interrupted.
* Fixed an error in the `quantileTDigest` function for Float arguments (the error appeared in version 1.1.54388) ([Mikhail Surin](https://github.com/yandex/ClickHouse/pull/2553)).
* Fixed a bug in the index for MergeTree tables if the primary key column in the condition is located inside a function that converts types between signed and unsigned integers of the same size ([#2603](https://github.com/yandex/ClickHouse/pull/2603)).
* Fixed a segfault if `macros` are used but are missing from the config file ([#2570](https://github.com/yandex/ClickHouse/pull/2570)).
* Fixed switching to the default database when the client reconnects ([#2583](https://github.com/yandex/ClickHouse/pull/2583)).
* Fixed a bug that occurred when the `use_index_for_in_with_subqueries` setting was disabled.
### Security fixes:
* Sending files is no longer possible over MySQL connections (`LOAD DATA LOCAL INFILE`).
## ClickHouse release 1.1.54388, 2018-06-28
### New features:
* Added support for the `ALTER TABLE t DELETE WHERE` query for replicated tables, and added the `system.mutations` table.
* Added support for the `ALTER TABLE t [REPLACE|ATTACH] PARTITION` query for *MergeTree tables.
* Added support for the `TRUNCATE TABLE` query ([Winter Zhang](https://github.com/yandex/ClickHouse/pull/2260))
@ -17,11 +61,11 @@
* Added the `date_time_input_format` setting. If you switch this setting to `'best_effort'`, DateTime values will be read in a wide range of formats.
* Added the `clickhouse-obfuscator` utility for data obfuscation. Usage example: publishing data used in performance tests.
## Experimental features:
### Experimental features:
* Added the ability to calculate the arguments of the `and` function only where they are needed ([Anastasiya Tsarkova](https://github.com/yandex/ClickHouse/pull/2272))
* Added JIT compilation of some expressions to native code ([pyos](https://github.com/yandex/ClickHouse/pull/2277)).
## Bug fixes:
### Bug fixes:
* Duplicates no longer appear in queries with `DISTINCT` and `ORDER BY`.
* Queries with `ARRAY JOIN` and `arrayFilter` no longer return an incorrect result.
* Fixed an error when reading an array column from a Nested structure ([#2066](https://github.com/yandex/ClickHouse/issues/2066)).
@ -42,7 +86,7 @@
* Fixed an SSRF in the remote() table function.
* Fixed exiting from `clickhouse-client` in multiline mode ([#2510](https://github.com/yandex/ClickHouse/issues/2510)).
## Improvements:
### Improvements:
* Background tasks in replicated tables are now performed in a thread pool instead of in separate threads ([Silviu Caragea](https://github.com/yandex/ClickHouse/pull/1722))
* Improved LZ4 decompression performance.
* Faster analysis of queries with a large number of JOINs and subqueries.
@ -54,7 +98,7 @@
* Limits imposed by cgroups are now taken into account when calculating the number of available CPU cores ([Atri Sharma](https://github.com/yandex/ClickHouse/pull/2325)).
* Added chown for config directories in the systemd config file ([Mikhail Shiryaev](https://github.com/yandex/ClickHouse/pull/2421)).
## Build changes:
### Build changes:
* Added the ability to build with the gcc8 compiler.
* Added the ability to build llvm from a submodule.
* The version of the librdkafka library in use has been updated to v0.11.4.
@ -64,34 +108,34 @@
* Added the ability to use the libtinfo library instead of libtermcap ([Georgy Kondratiev](https://github.com/yandex/ClickHouse/pull/2519)).
* Fixed a header file conflict in Fedora Rawhide ([#2520](https://github.com/yandex/ClickHouse/issues/2520)).
## Backward incompatible changes:
### Backward incompatible changes:
* Removed escaping in the `Vertical` and `Pretty*` formats and deleted the `VerticalRaw` format.
* If servers with version 1.1.54388 or newer and servers with an older version take part in the same distributed query, and the query uses the `cast(x, 'Type')` expression written without `AS` and with the word `cast` not in uppercase, an error like `Not found column cast(0, 'UInt8') in block` occurs. Solution: update the server on the entire cluster.
# ClickHouse release 1.1.54385, 2018-06-01
## Bug fixes:
## ClickHouse release 1.1.54385, 2018-06-01
### Bug fixes:
* Fixed an error that in some cases caused ZooKeeper operations to block.
# ClickHouse release 1.1.54383, 2018-05-22
## Bug fixes:
## ClickHouse release 1.1.54383, 2018-05-22
### Bug fixes:
* Fixed a slowdown of the replication queue when there is a large number of replicas.
# ClickHouse release 1.1.54381, 2018-05-14
## ClickHouse release 1.1.54381, 2018-05-14
## Bug fixes:
### Bug fixes:
* Fixed a bug that caused metadata to "leak" in ZooKeeper when the connection to the ZooKeeper server was lost.
# ClickHouse release 1.1.54380, 2018-04-21
## ClickHouse release 1.1.54380, 2018-04-21
## New features:
### New features:
* Added the table function `file(path, format, structure)`. An example that reads bytes from `/dev/urandom`: `ln -s /dev/urandom /var/lib/clickhouse/user_files/random` `clickhouse-client -q "SELECT * FROM file('random', 'RowBinary', 'd UInt8') LIMIT 10"`.
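To make the `/dev/urandom` example more concrete, here is a minimal Python sketch of what the query reads. It assumes that RowBinary stores a `UInt8` value as a single raw byte with no delimiters between rows; the function name is hypothetical and this is not ClickHouse code.

```python
import os
from typing import List


def parse_rowbinary_uint8(data: bytes, limit: int) -> List[int]:
    """Decode up to `limit` rows of a single UInt8 column: one byte per row."""
    return list(data[:limit])


# Stand-in for the symlinked /dev/urandom file: any random bytes will do.
rows = parse_rowbinary_uint8(os.urandom(32), limit=10)
print(rows)  # ten integers, each in the range 0..255
```

Under that assumption, every byte of the file is a valid row, which is why an endless random source works as input as long as the query has a `LIMIT`.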
## Improvements:
### Improvements:
* Subqueries can be wrapped in `()` brackets to improve query readability. For example: `(SELECT 1) UNION ALL (SELECT 1)`.
* Simple `SELECT` queries from the `system.processes` table are not counted towards the `max_concurrent_queries` limit.
## Bug fixes:
### Bug fixes:
* Fixed incorrect behavior of the `IN` operator in `MATERIALIZED VIEW`.
* Fixed incorrect behavior of the partition key index in expressions like `partition_key_column IN (...)`.
* Fixed the inability to execute an `OPTIMIZE` query on the leader replica after `RENAME` was performed on the table.
@ -99,13 +143,13 @@
* Fixed `KILL QUERY` queries hanging.
* Fixed an error in the ZooKeeper client library that led to lost watches, a stalled distributed DDL queue, and slower replication when a non-empty `chroot` prefix is used in the configuration.
## Backward incompatible changes:
### Backward incompatible changes:
* Removed support for expressions like `(a, b) IN (SELECT (a, b))` (you can use the equivalent expression `(a, b) IN (SELECT a, b)`). Previously, such queries could lead to non-deterministic filtering in `WHERE`.
# ClickHouse release 1.1.54378, 2018-04-16
## ClickHouse release 1.1.54378, 2018-04-16
## New features:
### New features:
* The logging level can be changed without restarting the server.
* Added the `SHOW CREATE DATABASE` query.
@ -119,7 +163,7 @@
* Multiple comma-separated `topics` can be specified for the `Kafka` engine (Tobias Adamson)
* When a query is stopped by `KILL QUERY` or `replace_running_query`, the client receives the `Query was cancelled` exception instead of an incomplete result.
## Improvements:
### Improvements:
* `ALTER TABLE ... DROP/DETACH PARTITION` queries are run at the front of the replication queue.
* `SELECT ... FINAL` and `OPTIMIZE ... FINAL` can be used even when the table has a single data part.
@ -130,7 +174,7 @@
* More reliable crash recovery for asynchronous inserts into `Distributed` tables.
* The return type of the `countEqual` function changed from `UInt32` to `UInt64` (谢磊)
## Bug fixes:
### Bug fixes:
* Fixed an error with `IN` when the left side of the expression is `Nullable`.
* Fixed an incorrect result when using tuples with `IN` when some of the tuple components are in the table index.
@ -146,31 +190,31 @@
* Fixed how `SummingMergeTree` works when summing nested data structures with a composite key.
* Fixed a possible race condition when choosing the leader for `ReplicatedMergeTree` tables.
## Build changes:
### Build changes:
* The build supports `ninja` instead of `make`. `ninja` is used by default for building releases.
* Renamed packages: `clickhouse-server-base` to `clickhouse-common-static`; `clickhouse-server-common` to `clickhouse-server`; `clickhouse-common-dbg` to `clickhouse-common-static-dbg`. To install, use `clickhouse-server clickhouse-client`. For compatibility, packages with the old names are still uploaded to the repository.
## Backward incompatible changes:
### Backward incompatible changes:
* Removed the special interpretation of an IN expression if an array is specified on the left side. Previously, the expression `arr IN (set)` was interpreted as "at least one element of `arr` belongs to `set`". To get the same behavior in the new version, write `arrayExists(x -> x IN (set), arr)`.
* Disabled the incorrect use of the socket option `SO_REUSEPORT` (which was enabled by default in the Poco library by mistake). Note that on Linux there is no longer any reason to specify the addresses `::` and `0.0.0.0` for listen at the same time; use only `::`, which (with the default kernel settings) allows listening for connections over both IPv4 and IPv6. You can also revert to the behavior of previous versions by specifying `<listen_reuse_port>1</listen_reuse_port>` in the config.
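For readers who relied on the removed array form of `IN`, the old semantics ("at least one element of the array belongs to the set") can be modeled in a few lines of Python. This is only an illustration of the logic that `arrayExists(x -> x IN (set), arr)` expresses; the function name is hypothetical.

```python
from typing import Iterable, Set


def old_array_in(arr: Iterable[int], values: Set[int]) -> bool:
    """True if at least one element of `arr` is a member of `values`."""
    return any(x in values for x in arr)


print(old_array_in([1, 5, 9], {9, 10}))  # 9 is in the set
print(old_array_in([1, 5], {9, 10}))     # no element is in the set
```

Note that an empty array never matches under this reading, since there is no element to test.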
# ClickHouse release 1.1.54370, 2018-03-16
## ClickHouse release 1.1.54370, 2018-03-16
## New features:
### New features:
* Added the `system.macros` system table and automatic updating of macros when the config file is changed.
* Added the `SYSTEM RELOAD CONFIG` query.
* Added the `maxIntersections(left_col, right_col)` aggregate function, which returns the maximum number of simultaneously intersecting intervals `[left; right]`. The `maxIntersectionsPosition(left, right)` function returns the beginning of such a "maximum" interval. ([Michael Furmur](https://github.com/yandex/ClickHouse/pull/2012)).
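The idea behind `maxIntersections` and `maxIntersectionsPosition` can be sketched with a sweep over interval endpoints. The Python below is a model of the described behavior, not the server's implementation; it treats intervals as closed, as the `[left; right]` notation suggests, and the handling of touching endpoints and of ties in ClickHouse itself may differ.

```python
from typing import List, Optional, Tuple


def max_intersections(intervals: List[Tuple[int, int]]) -> Tuple[int, Optional[int]]:
    """Return (maximum overlap, coordinate where that overlap is first reached)."""
    events = []
    for left, right in intervals:
        events.append((left, 0))   # kind 0 = interval opens
        events.append((right, 1))  # kind 1 = interval closes
    # At equal coordinates opens sort before closes, so closed intervals
    # that merely touch are counted as intersecting.
    events.sort()

    best, best_pos, current = 0, None, 0
    for coord, kind in events:
        if kind == 0:
            current += 1
            if current > best:
                best, best_pos = current, coord
        else:
            current -= 1
    return best, best_pos


print(max_intersections([(1, 5), (2, 6), (4, 8), (10, 12)]))
```

For the sample input, three intervals overlap starting at coordinate 4, so the sketch reports a maximum of 3 at position 4.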
## Improvements:
### Improvements:
* When inserting data into a `Replicated` table, fewer requests are made to `ZooKeeper` (and most of the user-level errors have disappeared from the `ZooKeeper` log).
* Added the ability to create aliases for sets. Example: `WITH (1, 2, 3) AS set SELECT number IN set FROM system.numbers LIMIT 10`.
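As a rough model of what the set-alias example computes, the following Python sketch evaluates the same membership test for the first ten non-negative integers. It assumes that `system.numbers` yields integers starting from 0 and shows the result of `IN` as 0 or 1; the variable names are illustrative.

```python
# The alias is named `set` in the SQL example; a different name is used
# here to avoid shadowing Python's built-in `set`.
aliased_set = {1, 2, 3}

# One result per row of the first ten numbers: 1 if the number is in the set.
result = [int(number in aliased_set) for number in range(10)]
print(result)  # [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
```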
## Bug fixes:
### Bug fixes:
* Fixed the `Illegal PREWHERE` error when reading from Merge tables over `Distributed` tables.
* Added fixes that allow you to run clickhouse-server in IPv4-only Docker containers.
@ -185,9 +229,9 @@
* Removed unnecessary Error-level logging of `Not found column ... in block`.
# ClickHouse release 1.1.54362, 2018-03-11
## ClickHouse release 1.1.54362, 2018-03-11
## New features:
### New features:
* Aggregation without `GROUP BY` over an empty set (such as `SELECT count(*) FROM table WHERE 0`) now returns a result of one row with zero values of the aggregate functions, in compliance with the SQL standard. To restore the old behavior (return an empty result), set `empty_result_for_aggregation_by_empty_set` to 1.
* Added type conversion for `UNION ALL`. Columns with different aliases are allowed in the corresponding `SELECT` positions in `UNION ALL`, in compliance with the SQL standard.
@ -225,7 +269,7 @@
* Added the `odbc_default_field_size` setting, which lets you extend the maximum size of a value loaded from an ODBC source (the default is 1024).
* The `system.processes` table and `SHOW PROCESSLIST` now have the `is_cancelled` and `peak_memory_usage` columns.
## Improvements:
### Improvements:
* Limits and quotas on the result are no longer applied to intermediate data for `INSERT SELECT` queries or for subqueries in `SELECT`.
* Reduced the number of false positives when checking the state of `Replicated` tables at server startup, which used to make it necessary to set the `force_restore_data` flag.
@ -241,7 +285,7 @@
* `Enum` values can be used in the `min`, `max`, `sum` functions and some others; in these cases, the corresponding numeric values are used. This feature was available previously but was lost in release 1.1.54337.
* Added the `max_expanded_ast_elements` limit, which applies to the size of the AST after recursive expansion of aliases.
## Bug fixes:
### Bug fixes:
* Fixed cases when unnecessary columns were removed from subqueries in error, or were not removed from subqueries containing `UNION ALL`.
* Fixed a bug in merges for `ReplacingMergeTree` tables.
@ -269,19 +313,19 @@
* Queries with `UNION ALL` are no longer allowed in a `MATERIALIZED VIEW`.
* Fixed an error that could occur during initialization of the `part_log` system table when the server starts (`part_log` is disabled by default).
## Backward incompatible changes:
### Backward incompatible changes:
* Removed the `distributed_ddl_allow_replicated_alter` setting. The corresponding behavior is enabled by default.
* Removed the `strict_insert_defaults` setting. If you were using this functionality, write to `clickhouse-feedback@yandex-team.com`.
* Removed the `UnsortedMergeTree` table engine.
# ClickHouse release 1.1.54343, 2018-02-05
## ClickHouse release 1.1.54343, 2018-02-05
* Added support for macros when specifying the cluster name in distributed DDL queries and when creating Distributed tables: `CREATE TABLE distr ON CLUSTER '{cluster}' (...) ENGINE = Distributed('{cluster}', 'db', 'table')`.
* Queries like `SELECT ... FROM table WHERE expr IN (subquery)` are now processed using the index of `table`.
* Improved processing of duplicates when inserting into Replicated tables, so they no longer slow down execution of the replication queue unnecessarily.
# ClickHouse release 1.1.54342, 2018-01-22
## ClickHouse release 1.1.54342, 2018-01-22
This release contains a fix for the previous release 1.1.54337:
* Fixed a regression in 1.1.54337: if the default user has readonly access, the server refused to start with the message `Cannot create database in readonly mode`.
@ -292,9 +336,9 @@
* Buffer tables now work when there are MATERIALIZED columns in the destination table (by zhang2014).
* Fixed a bug in the implementation of NULL.
# ClickHouse release 1.1.54337, 2018-01-18
## ClickHouse release 1.1.54337, 2018-01-18
## New features:
### New features:
* Added support for storing multidimensional arrays and tuples (the `Tuple` data type) in tables.
* Support for table functions in `DESCRIBE` and `INSERT` queries. Support for a subquery in `DESCRIBE`. Examples: `DESC TABLE remote('host', default.hits)`; `DESC TABLE (SELECT 1)`; `INSERT INTO TABLE FUNCTION remote('host', default.hits)`. `INSERT INTO TABLE` can be written instead of `INSERT INTO`.
@ -323,9 +367,9 @@
* Added support for `ALTER` for tables of type `Null` (Anastasiya Tsarkova).
* The `reinterpretAsString` function is extended to all data types whose values are stored contiguously in memory.
* Added the `--silent` option for the `clickhouse-local` tool. It suppresses printing of query execution info to stderr.
* Added support for reading `Date` values from text in a format where the month and the day of the month may be specified with one digit instead of two (Amos Bird).
* Added support for reading `Date` values from text in a format where the month and the day of the month may be specified with one digit instead of two (Amos Bird).
## Performance improvements:
### Performance improvements:
* Improved performance of the aggregate functions `min`, `max`, `any`, `anyLast`, `anyHeavy`, `argMin`, `argMax` for string arguments.
* Improved performance of the functions `isInfinite`, `isFinite`, `isNaN`, `roundToExp2`.
@ -334,7 +378,7 @@
* Reduced memory usage for `JOIN` when the left and right parts contain columns with identical names that are not included in `USING`.
* Improved performance of the aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr` at the cost of reduced numerical stability. The old versions of the functions are available under the names `varSampStable`, `varPopStable`, `stddevSampStable`, `stddevPopStable`, `covarSampStable`, `covarPopStable`, `corrStable`.
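The trade-off between a faster and a more numerically stable variance can be illustrated in Python. The two functions below are textbook formulas chosen for illustration (a sum-of-squares formula and Welford's algorithm); they are an assumption about the general technique and are not claimed to be the formulas ClickHouse uses for `varSamp` and `varSampStable`.

```python
from typing import Sequence


def var_samp_fast(xs: Sequence[float]) -> float:
    """Sample variance from running sums: cheap, but prone to cancellation
    when the values are large compared with their spread."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def var_samp_stable(xs: Sequence[float]) -> float:
    """Sample variance by Welford's algorithm: more work per value, but it
    avoids subtracting two large, nearly equal numbers."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)


small = [1.0, 2.0, 3.0, 4.0]
shifted = [x + 1e9 for x in small]  # same spread, large offset
print(var_samp_fast(small), var_samp_stable(small))
print(var_samp_fast(shifted), var_samp_stable(shifted))
```

On well-scaled data both functions agree closely; on data with a large offset the sum-of-squares version can lose precision, which is the kind of effect the `*Stable` variants exist to avoid.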
## Bug fixes:
### Bug fixes:
* Fixed block deduplication after `DROP` or `DETACH PARTITION`. Previously, dropping a partition and inserting the same data again did not work, because the re-inserted blocks were considered duplicates.
* Fixed a bug that could lead to incorrect handling of `WHERE` in `CREATE MATERIALIZED VIEW` queries with `POPULATE`.
@ -344,7 +388,7 @@
* Added the missing support for the `UUID` data type for `DISTINCT`, `JOIN`, the `uniq` aggregate functions, and external dictionaries (Evgeniy Ivanov). Support for `UUID` is still incomplete.
* Fixed the behavior of `SummingMergeTree` for rows in which all values are zero after summation.
* Numerous improvements for the `Kafka` table engine (Marek Vavruša).
* Fixed incorrect behavior of the `Join` table engine (Amos Bird).
* Fixed incorrect behavior of the `Join` table engine (Amos Bird).
* Fixed the allocator under FreeBSD and OS X.
* The `extractAll` function can now extract empty matches.
* Fixed an error that prevented building with `libressl` instead of `openssl`.
@ -368,12 +412,12 @@
* Fixed `DISTINCT` when all columns are constants.
* Fixed query formatting when the `tupleElement` function is used with a complex constant expression as the element index.
* Fixed `Dictionary` tables for `range_hashed` dictionaries.
* Fixed a bug that led to extra rows in the result of `FULL` and `RIGHT JOIN` (Amos Bird).
* Fixed a bug that led to extra rows in the result of `FULL` and `RIGHT JOIN` (Amos Bird).
* Fixed a server crash when temporary files are created and removed in `config.d` directories while the configuration is being reloaded.
* Fixed the `SYSTEM DROP DNS CACHE` query: previously, flushing the DNS cache did not cause the cluster host names to be resolved again.
* Fixed the behavior of `MATERIALIZED VIEW` after executing `DETACH TABLE` for the table it looks at (Marek Vavruša).
## Build improvements:
### Build improvements:
* `pbuilder` is used for builds. The build is as independent as possible of the environment on the build machine.
* A single package is published for different OS versions; it is compatible with a wide range of Linux systems.
@ -387,27 +431,27 @@
* Removed the use of GNU extensions from the code and enabled the `-Wextra` option. When building with `clang`, `libc++` is used by default instead of `libstdc++`.
* Extracted the `clickhouse_parsers` and `clickhouse_common_io` libraries to speed up builds of the utilities.
## Backward incompatible changes:
### Backward incompatible changes:
* The format of marks in `Log` tables that contain `Nullable` columns was changed in a backward incompatible way. If you have such tables, you can convert them to `TinyLog` before starting the new server version. To do this, replace `ENGINE = Log` with `ENGINE = TinyLog` in the table's `.sql` file in the `metadata` directory. If the table has no `Nullable` columns or its type is not `Log`, you don't need to do anything.
* Removed the `experimental_allow_extended_storage_definition_syntax` setting. The corresponding functionality is enabled by default.
* The `runningIncome` function was renamed to `runningDifferenceStartingWithFirstValue` to avoid confusion.
* Removed the ability to write `FROM ARRAY JOIN arr` without a table specified after FROM (Amos Bird).
* Removed the ability to write `FROM ARRAY JOIN arr` without a table specified after FROM (Amos Bird).
* Removed the `BlockTabSeparated` format, which was used only for demonstration purposes.
* Changed the state format of the aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr`. If you stored these states in tables (the `AggregateFunction` data type for these functions, or materialized views that store these states), write to clickhouse-feedback@yandex-team.com.
* Previous versions had an undocumented feature: in the AggregateFunction data type you could omit the parameters of an aggregate function that depends on parameters. Example: `AggregateFunction(quantiles, UInt64)` instead of `AggregateFunction(quantiles(0.5, 0.9), UInt64)`. This feature has been lost. Although it was undocumented, we plan to bring it back in upcoming releases.
* Values of the Enum data type cannot be passed to the min/max aggregate functions. This ability will be returned in the next release.
## Things to note when upgrading:
### Things to note when upgrading:
* When upgrading a cluster, while some replicas run the new server version and others run the old one, replication is paused and messages like `unknown parameter 'shard'` appear in the log. Replication continues after all replicas of the cluster are updated.
* If different versions of ClickHouse are running on the cluster servers, distributed queries that use the functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr` may return incorrect results. You should update all servers in the cluster.
# ClickHouse release 1.1.54327, 2017-12-21
## ClickHouse release 1.1.54327, 2017-12-21
This release contains a fix for the previous release 1.1.54318:
* Fixed a possible race condition in replication that could lead to data loss. Versions 1.1.54310 and 1.1.54318 are affected. If you use one of these versions and have Replicated tables, the update is mandatory. The issue shows up in the log as Warning messages like `Part ... from own log doesn't exist.` The issue is relevant even if you don't see such messages.
# ClickHouse release 1.1.54318, 2017-11-30
## ClickHouse release 1.1.54318, 2017-11-30
This release contains changes to the previous release 1.1.54310 that fix the following bugs:
* Fixed incorrect row deletion during merges in the SummingMergeTree engine
@ -416,9 +460,9 @@
* Fixed an issue that caused the replication queue to stop running
* Fixed rotation and archiving of server logs
# ClickHouse release 1.1.54310, 2017-11-01
## ClickHouse release 1.1.54310, 2017-11-01
## New features:
### New features:
* Custom partitioning key for the MergeTree family of table engines.
* The [Kafka](https://clickhouse.yandex/docs/en/single/index.html#document-table_engines/kafka) table engine.
* Added support for loading [CatBoost](https://catboost.yandex/) models and applying them to data stored in ClickHouse.
@ -434,12 +478,12 @@
* Support for the Cap'n Proto input format.
* The compression level can be set when using the zstd algorithm.
## Backward incompatible changes:
### Backward incompatible changes:
* Creating temporary tables with an engine other than Memory is no longer allowed.
* Explicitly creating tables with the View or MaterializedView engine is no longer allowed.
* Table creation now checks that the sampling key is included in the primary key.
## Bug fixes:
### Bug fixes:
* Fixed hangs when synchronously inserting into a Distributed table.
* Fixed non-atomic adding and removing of parts in replicated tables.
* Data inserted into a materialized view is no longer subjected to unnecessary deduplication.
@ -449,14 +493,14 @@
* Fixed hangs when there is insufficient disk space on the partition with logs.
* Fixed an overflow in the toRelativeWeekNum function for the first week of the Unix epoch.
## Build improvements:
### Build improvements:
* Several third-party libraries (notably Poco) were updated and converted to git submodules.
# ClickHouse release 1.1.54304, 2017-10-19
## New features:
## ClickHouse release 1.1.54304, 2017-10-19
### New features:
* Added TLS support in the native protocol (to enable it, set `tcp_ssl_port` in `config.xml`)
## Bug fixes:
### Bug fixes:
* `ALTER` for replicated tables now tries to start running as soon as possible
* Fixed crashes when reading data with the setting `preferred_block_size_bytes=0`
* Fixed a crash of `clickhouse-client` when `Page Down` is pressed
@ -469,16 +513,16 @@
* Users are updated correctly when `users.xml` is invalid
* Correct handling of cases when an executable dictionary returns a non-zero response code
# ClickHouse release 1.1.54292, 2017-09-20
## ClickHouse release 1.1.54292, 2017-09-20
## New features:
### New features:
* Added the `pointInPolygon` function for working with coordinates on a plane.
* Added the `sumMap` aggregate function, which sums arrays in a way similar to `SummingMergeTree`.
* Added the `trunc` function. Improved performance of the rounding functions `round`, `floor`, `ceil`, `roundToExp2`. Corrected the logic of the rounding functions. Changed the logic of the `roundToExp2` function for fractional and negative numbers.
* The ClickHouse executable file is now less dependent on the libc version. The same ClickHouse executable file can run on a wide variety of Linux systems. Note: there is still a dependency when using compiled queries (the setting `compile = 1`, which is not used by default).
* Reduced the time needed for dynamic compilation of queries.
## Исправления ошибок:
### Исправления ошибок:
* Исправлена ошибка, которая могла приводить к сообщениям `part ... intersects previous part` и нарушению консистентности реплик.
* Исправлена ошибка, приводящая к блокировке при завершении работы сервера, если в это время ZooKeeper недоступен.
* Удалено избыточное логгирование при восстановлении реплик.
@ -486,9 +530,9 @@
* Fixed an error in the concat function that occurred when the first column of a block has the Array type.
* Fixed the display of progress in the system.merges table.
# ClickHouse release 1.1.54289, 2017-09-13
## ClickHouse release 1.1.54289, 2017-09-13
## New features:
### New features:
* `SYSTEM` queries for server administration: `SYSTEM RELOAD DICTIONARY`, `SYSTEM RELOAD DICTIONARIES`, `SYSTEM DROP DNS CACHE`, `SYSTEM SHUTDOWN`, `SYSTEM KILL`.
* Added functions for working with arrays: `concat`, `arraySlice`, `arrayPushBack`, `arrayPushFront`, `arrayPopBack`, `arrayPopFront`.
* Added the `root` and `identity` parameters for the ZooKeeper configuration. This makes it possible to isolate different users of the same ZooKeeper cluster.
@ -503,7 +547,7 @@
* A `umask` can be set in the configuration file.
* Improved performance of queries with `DISTINCT`.
## Bug fixes:
### Bug fixes:
* Improved the procedure for deleting old nodes in ZooKeeper. Previously, with very frequent inserts, old nodes might not be deleted in time, which led, among other things, to a very slow server shutdown.
* Fixed randomization when choosing hosts for the connection to ZooKeeper.
* Fixed the exclusion of a lagging replica in distributed queries when the replica is localhost.
@ -516,28 +560,28 @@
* Fixed the appearance of zombie processes when working with a dictionary that has an `executable` source.
* Fixed a segfault on a HEAD request.
## Improvements to the ClickHouse development and build process:
### Improvements to the ClickHouse development and build process:
* ClickHouse can be built with `pbuilder`.
* ClickHouse can be built with `libc++` instead of `libstdc++` on Linux.
* Added instructions for using the static code analyzers `Coverity`, `clang-tidy`, and `cppcheck`.
## Please note when upgrading:
### Please note when upgrading:
* The default value of the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of parts to merge, in bytes) has been increased from 100 GiB to 150 GiB. This may cause large merges to start after the server upgrade, which can increase the load on the disk subsystem. If the free space on such a server is less than twice the total size of the running merges, no other merges will run, including merges of small parts. As a result, INSERTs will be rejected with the message "Merges are processing significantly slower than inserts". To monitor the situation, use the query `SELECT * FROM system.merges`. You can also watch the `DiskSpaceReservedForMerge` metric in the `system.metrics` table or in Graphite. You do not need to do anything to fix this, because the situation resolves itself once the large merges finish. If this is not acceptable, you can restore the old value of `max_bytes_to_merge_at_max_space_in_pool` by adding `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` to the `<merge_tree>` section of config.xml and restarting the server.
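
As a quick check on the numbers in the note above: the value suggested for restoring the old limit, 107374182400, is exactly 100 GiB expressed in bytes. A minimal C++ sketch of that arithmetic follows. The names `GiB` and `gibToBytes` are illustrative only and are not ClickHouse identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant and helper for illustration; not part of ClickHouse.
constexpr std::uint64_t GiB = 1024ULL * 1024ULL * 1024ULL;

constexpr std::uint64_t gibToBytes(std::uint64_t gib)
{
    return gib * GiB;
}

// The old default (100 GiB) matches the literal given in the release note.
static_assert(gibToBytes(100) == 107374182400ULL, "100 GiB in bytes");

// The new default (150 GiB) would be written as this many bytes.
static_assert(gibToBytes(150) == 161061273600ULL, "150 GiB in bytes");
```

Both checks are evaluated at compile time, so the snippet compiles only if the byte values are right.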
# ClickHouse release 1.1.54284, 2017-08-29
## ClickHouse release 1.1.54284, 2017-08-29
* This release contains changes to the previous release 1.1.54282 that fix a leak of part records in ZooKeeper
# ClickHouse release 1.1.54282, 2017-08-23
## ClickHouse release 1.1.54282, 2017-08-23
This release contains fixes for the previous release 1.1.54276:
* Fixed the error `DB::Exception: Assertion violation: !_path.empty()` when inserting into a Distributed table.
* Fixed parsing when inserting in RowBinary format if the input data starts with ';'.
* Fixed an error during runtime compilation of certain aggregate functions (for example, `groupArray()`).
# ClickHouse release 1.1.54276, 2017-08-16
## ClickHouse release 1.1.54276, 2017-08-16
## New features:
### New features:
* Added an optional WITH section for the SELECT query. Example query: `WITH 1+1 AS a SELECT a, a*a`
* Added synchronous inserts into a Distributed table: Ok is returned only after all the data has been written to all the shards. Enabled by the setting insert_distributed_sync=1
* Added the UUID data type for working with 16-byte identifiers
@ -547,17 +591,17 @@
* Added support for non-constant arguments and negative offsets in the function `substring(str, pos, len)`
* Added the max_size parameter for the aggregate function `groupArray(max_size)(column)`, and optimized its performance
## Main changes:
### Main changes:
* Security improvement: all server files are created with 0640 permissions (this can be changed with the <umask> config parameter).
* Improved error messages for syntactically invalid queries
* Significantly reduced memory consumption and improved performance when merging large MergeTree data parts
* Significantly increased the performance of data merges for the ReplacingMergeTree engine
* Improved performance of asynchronous inserts from a Distributed table by combining several source inserts. Enabled by the setting distributed_directory_monitor_batch_inserts=1.
## Backward incompatible changes:
### Backward incompatible changes:
* Changed the binary format of aggregate states of the function `groupArray(array_column)` for arrays
## Complete list of changes:
### Complete list of changes:
* Added the `output_format_json_quote_denormals` setting, which enables output of nan and inf values in JSON format
* Improved allocation of streams when reading from Distributed tables
* Settings can be specified in readonly mode if their value does not change
@ -575,7 +619,7 @@
* It is possible to connect to MySQL through a socket in the file system
* Added a column to the system.parts table with the size of marks in bytes
## Bug fixes:
### Bug fixes:
* Fixed incorrect behavior of Distributed tables that use Merge tables for a SELECT with a condition on the _table field
* Fixed a rare race condition in ReplicatedMergeTree when checking data parts
* Fixed a possible hang in the leader election procedure at server startup
@ -598,15 +642,15 @@
* Fixed the "Cannot mremap" error when using sets with more than 2 billion elements in IN and JOIN clauses
* Fixed failover for dictionaries with a MySQL source
## Improvements to the ClickHouse development and build process:
### Improvements to the ClickHouse development and build process:
* Added the ability to build in Arcadia
* Added the ability to build with gcc 7
* Sped up parallel builds using ccache+distcc
# ClickHouse release 1.1.54245, 2017-07-04
## ClickHouse release 1.1.54245, 2017-07-04
## New features:
### New features:
* Distributed DDL (for example, `CREATE TABLE ON CLUSTER`)
* The replicated query `ALTER TABLE CLEAR COLUMN IN PARTITION`
* The Dictionary table engine (access to dictionary data in the form of a table)
@ -617,14 +661,14 @@
* Sessions in the HTTP interface
* The OPTIMIZE query for a Replicated table can now be run not only on the leader
## Backward incompatible changes:
### Backward incompatible changes:
* Removed the SET GLOBAL command
## Minor changes:
### Minor changes:
* The full stack trace is now printed to the log after a signal is received
* Relaxed the check on the number of damaged/extra data parts at startup (there were too many false positives)
## Bug fixes:
### Bug fixes:
* Fixed a bad connection getting stuck when inserting into a Distributed table
* GLOBAL IN now works for a query from a Merge table that looks at a Distributed table
* The number of cores is now detected correctly on Google Compute Engine virtual machines

View File

@ -275,6 +275,9 @@ include (cmake/find_rdkafka.cmake)
include (cmake/find_capnp.cmake)
include (cmake/find_llvm.cmake)
include (cmake/find_cpuid.cmake)
if (ENABLE_TESTS)
include (cmake/find_gtest.cmake)
endif ()
include (cmake/find_contrib_lib.cmake)
find_contrib_lib(cityhash)

View File

@ -1,39 +0,0 @@
## How to increase maxfiles on macOS
To increase maxfiles on macOS, create the following file:
(Note: you'll need to use sudo)
/Library/LaunchDaemons/limit.maxfiles.plist:
```
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>limit.maxfiles</string>
<key>ProgramArguments</key>
<array>
<string>launchctl</string>
<string>limit</string>
<string>maxfiles</string>
<string>524288</string>
<string>524288</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>ServiceIPC</key>
<false/>
</dict>
</plist>
```
Execute the following command:
```
sudo chown root:wheel /Library/LaunchDaemons/limit.maxfiles.plist
```
Reboot.
To check if it's working, you can use `ulimit -n` command.

View File

@ -24,6 +24,15 @@ if (ENABLE_EMBEDDED_COMPILER)
endif ()
endif ()
if (LLVM_FOUND)
find_library (LLD_LIBRARY_TEST lldCore PATHS ${LLVM_LIBRARY_DIRS})
find_path (LLD_INCLUDE_DIR_TEST NAMES lld/Core/AbsoluteAtom.h PATHS ${LLVM_INCLUDE_DIRS})
if (NOT LLD_LIBRARY_TEST OR NOT LLD_INCLUDE_DIR_TEST)
set (LLVM_FOUND 0)
message(WARNING "liblld (${LLD_LIBRARY_TEST}, ${LLD_INCLUDE_DIR_TEST}) not found in ${LLVM_INCLUDE_DIRS} ${LLVM_LIBRARY_DIRS}. Disabling internal compiler.")
endif ()
endif ()
if (LLVM_FOUND)
# Remove dynamically-linked zlib and libedit from LLVM's dependencies:
set_target_properties(LLVMSupport PROPERTIES INTERFACE_LINK_LIBRARIES "-lpthread;LLVMDemangle;${ZLIB_LIBRARIES}")

View File

@ -144,6 +144,7 @@ target_link_libraries (clickhouse_common_io
${EXECINFO_LIBRARY}
${ELF_LIBRARY}
${Boost_SYSTEM_LIBRARY}
apple_rt
${CMAKE_DL_LIBS}
)
@ -244,8 +245,6 @@ add_subdirectory (programs)
add_subdirectory (tests)
if (ENABLE_TESTS)
include (${ClickHouse_SOURCE_DIR}/cmake/find_gtest.cmake)
if (USE_INTERNAL_GTEST_LIBRARY)
# Google Test from sources
add_subdirectory(${ClickHouse_SOURCE_DIR}/contrib/googletest/googletest ${CMAKE_CURRENT_BINARY_DIR}/googletest)

View File

@ -1,11 +1,11 @@
# This strings autochanged from release_lib.sh:
set(VERSION_REVISION 54395 CACHE STRING "")
set(VERSION_MAJOR 1 CACHE STRING "")
set(VERSION_MINOR 1 CACHE STRING "")
set(VERSION_PATCH 54398 CACHE STRING "")
set(VERSION_GITHASH 4b31f389b743c69af688788c0d0cdb8973aefa77 CACHE STRING "")
set(VERSION_DESCRIBE v1.1.54398-testing CACHE STRING "")
set(VERSION_STRING 1.1.54398 CACHE STRING "")
set(VERSION_REVISION 54397 CACHE STRING "")
set(VERSION_MAJOR 18 CACHE STRING "")
set(VERSION_MINOR 2 CACHE STRING "")
set(VERSION_PATCH 0 CACHE STRING "")
set(VERSION_GITHASH 6ad677d7d6961a0c9088ccd9eff55779cfdaa654 CACHE STRING "")
set(VERSION_DESCRIBE v18.2.0-testing CACHE STRING "")
set(VERSION_STRING 18.2.0 CACHE STRING "")
# end of autochange
set(VERSION_EXTRA "" CACHE STRING "")
@ -18,7 +18,8 @@ if (VERSION_EXTRA)
string(CONCAT VERSION_STRING ${VERSION_STRING} "." ${VERSION_EXTRA})
endif ()
set (VERSION_FULL "${PROJECT_NAME} ${VERSION_STRING}")
set (VERSION_NAME "${PROJECT_NAME}")
set (VERSION_FULL "${VERSION_NAME} ${VERSION_STRING}")
if (APPLE)
# dirty hack: ld: malformed 64-bit a.b.c.d.e version number: 1.1.54160

View File

@ -1,6 +1,8 @@
add_library (clickhouse-client-lib Client.cpp)
target_link_libraries (clickhouse-client-lib clickhouse_functions clickhouse_aggregate_functions ${LINE_EDITING_LIBS} ${Boost_PROGRAM_OPTIONS_LIBRARY})
target_include_directories (clickhouse-client-lib SYSTEM PRIVATE ${READLINE_INCLUDE_DIR})
if (READLINE_INCLUDE_DIR)
target_include_directories (clickhouse-client-lib SYSTEM PRIVATE ${READLINE_INCLUDE_DIR})
endif ()
if (CLICKHOUSE_SPLIT_BINARY)
add_executable (clickhouse-client clickhouse-client.cpp)

View File

@ -28,6 +28,7 @@
#include <Common/StringUtils/StringUtils.h>
#include <Common/typeid_cast.h>
#include <Common/Config/ConfigProcessor.h>
#include <Common/config_version.h>
#include <Core/Types.h>
#include <Core/QueryProcessingStage.h>
#include <IO/ReadBufferFromFileDescriptor.h>
@ -1316,10 +1317,7 @@ private:
void showClientVersion()
{
std::cout << "ClickHouse client version " << DBMS_VERSION_MAJOR
<< "." << DBMS_VERSION_MINOR
<< "." << ClickHouseRevision::get()
<< "." << std::endl;
std::cout << DBMS_NAME << " client version " << VERSION_STRING << "." << std::endl;
}
public:

View File

@ -17,6 +17,7 @@
#include <Common/Config/ConfigProcessor.h>
#include <Common/escapeForFileName.h>
#include <Common/ClickHouseRevision.h>
#include <Common/config_version.h>
#include <IO/ReadBufferFromString.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteBufferFromFileDescriptor.h>
@ -355,10 +356,7 @@ void LocalServer::setupUsers()
static void showClientVersion()
{
std::cout << "ClickHouse client version " << DBMS_VERSION_MAJOR
<< "." << DBMS_VERSION_MINOR
<< "." << ClickHouseRevision::get()
<< "." << std::endl;
std::cout << DBMS_NAME << " client version " << VERSION_STRING << "." << std::endl;
}
std::string LocalServer::getHelpHeader() const

View File

@ -1,36 +1,27 @@
#include "TCPHandler.h"
#include <iomanip>
#include <Poco/Net/NetException.h>
#include <Common/ClickHouseRevision.h>
#include <Common/Stopwatch.h>
#include <IO/Progress.h>
#include <IO/CompressedReadBuffer.h>
#include <IO/CompressedWriteBuffer.h>
#include <IO/ReadBufferFromPocoSocket.h>
#include <IO/WriteBufferFromPocoSocket.h>
#include <IO/CompressionSettings.h>
#include <IO/copyData.h>
#include <DataStreams/AsynchronousBlockInputStream.h>
#include <DataStreams/NativeBlockInputStream.h>
#include <DataStreams/NativeBlockOutputStream.h>
#include <Interpreters/executeQuery.h>
#include <Interpreters/Quota.h>
#include <Interpreters/TablesStatus.h>
#include <Storages/StorageMemory.h>
#include <Storages/StorageReplicatedMergeTree.h>
#include <Common/ClickHouseRevision.h>
#include <Common/Stopwatch.h>
#include <Common/ExternalTable.h>
#include "TCPHandler.h"
#include <Common/NetException.h>
#include <Common/config_version.h>
#include <ext/scope_guard.h>

View File

@ -18,6 +18,9 @@ public:
DataTypes transformArguments(const DataTypes & arguments) const override
{
if (0 == arguments.size())
throw Exception("-Array aggregate functions require at least one argument", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
DataTypes nested_arguments;
for (const auto & type : arguments)
{

View File

@ -15,6 +15,10 @@ namespace DB
*/
class AggregateFunctionCombinatorFactory final: public ext::singleton<AggregateFunctionCombinatorFactory>
{
private:
using Dict = std::unordered_map<std::string, AggregateFunctionCombinatorPtr>;
Dict dict;
public:
/// Not thread safe. You must register before using tryGet.
void registerCombinator(const AggregateFunctionCombinatorPtr & value);
@ -22,8 +26,10 @@ public:
/// Example: if the name is 'avgIf', it will return combinator -If.
AggregateFunctionCombinatorPtr tryFindSuffix(const std::string & name) const;
private:
std::unordered_map<std::string, AggregateFunctionCombinatorPtr> dict;
const Dict & getAllAggregateFunctionCombinators() const
{
return dict;
}
};
}

View File

@ -19,6 +19,7 @@
#include <Common/CurrentMetrics.h>
#include <Common/DNSResolver.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/config_version.h>
#include <Interpreters/ClientInfo.h>
#include <Common/config.h>

View File

@ -87,3 +87,14 @@ const std::string & Collator::getLocale() const
{
return locale;
}
std::vector<std::string> Collator::getAvailableCollations()
{
std::vector<std::string> result;
#if USE_ICU
size_t available_locales_count = ucol_countAvailable();
for (size_t i = 0; i < available_locales_count; ++i)
result.push_back(ucol_getAvailable(i));
#endif
return result;
}

View File

@ -1,6 +1,7 @@
#pragma once
#include <string>
#include <vector>
#include <boost/noncopyable.hpp>
struct UCollator;
@ -15,6 +16,8 @@ public:
const std::string & getLocale() const;
static std::vector<std::string> getAvailableCollations();
private:
std::string locale;
UCollator * collator;

View File

@ -377,6 +377,7 @@ namespace ErrorCodes
extern const int CANNOT_STAT = 400;
extern const int FEATURE_IS_NOT_ENABLED_AT_BUILD_TIME = 401;
extern const int CANNOT_IOSETUP = 402;
extern const int INVALID_JOIN_ON_EXPRESSION = 403;
extern const int KEEPER_EXCEPTION = 999;

View File

@ -13,7 +13,31 @@
#cmakedefine VERSION_REVISION @VERSION_REVISION@
#endif
#cmakedefine VERSION_NAME "@VERSION_NAME@"
#define DBMS_NAME VERSION_NAME
#cmakedefine VERSION_MAJOR @VERSION_MAJOR@
#cmakedefine VERSION_MINOR @VERSION_MINOR@
#cmakedefine VERSION_PATCH @VERSION_PATCH@
#cmakedefine VERSION_STRING "@VERSION_STRING@"
#cmakedefine VERSION_FULL "@VERSION_FULL@"
#cmakedefine VERSION_DESCRIBE "@VERSION_DESCRIBE@"
#cmakedefine VERSION_GITHASH "@VERSION_GITHASH@"
#if defined(VERSION_MAJOR)
#define DBMS_VERSION_MAJOR VERSION_MAJOR
#else
#define DBMS_VERSION_MAJOR 0
#endif
#if defined(VERSION_MINOR)
#define DBMS_VERSION_MINOR VERSION_MINOR
#else
#define DBMS_VERSION_MINOR 0
#endif
#if defined(VERSION_PATCH)
#define DBMS_VERSION_PATCH VERSION_PATCH
#else
#define DBMS_VERSION_PATCH 0
#endif
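
The fallback pattern in the template above can be tried outside of CMake. The sketch below defines two of the `VERSION_*` macros by hand and leaves `VERSION_PATCH` undefined. This stands in for the case where `#cmakedefine` emits no definition, which, as far as the CMake documentation describes, happens when the variable is unset or false-like. The helper `versionTriple` is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hand-written stand-ins for what CMake would substitute into the template.
#define VERSION_MAJOR 18
#define VERSION_MINOR 2
// VERSION_PATCH is deliberately left undefined to exercise the fallback.

#if defined(VERSION_MAJOR)
    #define DBMS_VERSION_MAJOR VERSION_MAJOR
#else
    #define DBMS_VERSION_MAJOR 0
#endif

#if defined(VERSION_MINOR)
    #define DBMS_VERSION_MINOR VERSION_MINOR
#else
    #define DBMS_VERSION_MINOR 0
#endif

#if defined(VERSION_PATCH)
    #define DBMS_VERSION_PATCH VERSION_PATCH
#else
    #define DBMS_VERSION_PATCH 0
#endif

// Hypothetical helper: joins the three components as "major.minor.patch".
inline std::string versionTriple()
{
    return std::to_string(DBMS_VERSION_MAJOR) + "."
        + std::to_string(DBMS_VERSION_MINOR) + "."
        + std::to_string(DBMS_VERSION_PATCH);
}
```

With these stand-in values, the `DBMS_VERSION_*` macros always expand to a number, so code that uses them compiles even when a component was not provided by the build system.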

View File

@ -1,9 +1,5 @@
#pragma once
#define DBMS_NAME "ClickHouse"
#define DBMS_VERSION_MAJOR 1
#define DBMS_VERSION_MINOR 1
#define DBMS_DEFAULT_HOST "localhost"
#define DBMS_DEFAULT_PORT 9000
#define DBMS_DEFAULT_SECURE_PORT 9440

View File

@ -71,7 +71,7 @@ void PushingToViewsBlockOutputStream::write(const Block & block)
try
{
BlockInputStreamPtr from = std::make_shared<OneBlockInputStream>(block);
InterpreterSelectQuery select(view.query, *views_context, {}, QueryProcessingStage::Complete, 0, from);
InterpreterSelectQuery select(view.query, *views_context, from);
BlockInputStreamPtr in = std::make_shared<MaterializingBlockInputStream>(select.execute().in);
/// Squashing is needed here because the materialized view query can generate a lot of blocks
/// even when only one block is inserted into the parent table (e.g. if the query is a GROUP BY

View File

@ -323,22 +323,24 @@ void SummingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::
if (current_key.empty()) /// The first key encountered.
{
setPrimaryKeyRef(current_key, current);
key_differs = true;
current_row_is_zero = true;
}
else
key_differs = next_key != current_key;
/// if there are enough rows and the last one is calculated completely
if (key_differs && merged_rows >= max_block_size)
return;
queue.pop();
if (key_differs)
{
/// Write the data for the previous group.
insertCurrentRowIfNeeded(merged_columns, false);
if (!current_key.empty())
/// Write the data for the previous group.
insertCurrentRowIfNeeded(merged_columns, false);
if (merged_rows >= max_block_size)
{
/// The block is now full and the last row is calculated completely.
current_key.reset();
return;
}
current_key.swap(next_key);
@ -375,6 +377,8 @@ void SummingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::
current_row_is_zero = false;
}
queue.pop();
if (!current->isLast())
{
current->next();
@ -481,6 +485,9 @@ void SummingSortedBlockInputStream::addRow(SortCursor & cursor)
{
for (auto & desc : columns_to_aggregate)
{
if (!desc.created)
throw Exception("Logical error in SummingSortedBlockInputStream, there are no description", ErrorCodes::LOGICAL_ERROR);
if (desc.is_agg_func_type)
{
// desc.state is not used for AggregateFunction types
@ -489,9 +496,6 @@ void SummingSortedBlockInputStream::addRow(SortCursor & cursor)
}
else
{
if (!desc.created)
throw Exception("Logical error in SummingSortedBlockInputStream, there are no description", ErrorCodes::LOGICAL_ERROR);
// Specialized case for unary functions
if (desc.column_numbers.size() == 1)
{

View File

@ -44,6 +44,11 @@ public:
/// Register a simple data type, that have no parameters.
void registerSimpleDataType(const String & name, SimpleCreator creator, CaseSensitiveness case_sensitiveness = CaseSensitive);
const DataTypesDictionary & getAllDataTypes() const
{
return data_types;
}
private:
DataTypesDictionary data_types;

View File

@ -50,7 +50,7 @@ ClickHouseDictionarySource::ClickHouseDictionarySource(
table{config.getString(config_prefix + ".table")},
where{config.getString(config_prefix + ".where", "")},
update_field{config.getString(config_prefix + ".update_field", "")},
query_builder{dict_struct, db, table, where, ExternalQueryBuilder::Backticks},
query_builder{dict_struct, db, table, where, IdentifierQuotingStyle::Backticks},
sample_block{sample_block}, context(context),
is_local{isLocalAddress({ host, port }, config.getInt("tcp_port", 0))},
pool{is_local ? nullptr : createPool(host, port, secure, db, user, password, context)},
@ -67,7 +67,7 @@ ClickHouseDictionarySource::ClickHouseDictionarySource(const ClickHouseDictionar
db{other.db}, table{other.table},
where{other.where},
update_field{other.update_field},
query_builder{dict_struct, db, table, where, ExternalQueryBuilder::Backticks},
query_builder{dict_struct, db, table, where, IdentifierQuotingStyle::Backticks},
sample_block{other.sample_block}, context(other.context),
is_local{other.is_local},
pool{is_local ? nullptr : createPool(host, port, secure, db, user, password, context)},

View File

@ -22,7 +22,7 @@ ExternalQueryBuilder::ExternalQueryBuilder(
const std::string & db,
const std::string & table,
const std::string & where,
QuotingStyle quoting_style)
IdentifierQuotingStyle quoting_style)
: dict_struct(dict_struct), db(db), table(table), where(where), quoting_style(quoting_style)
{
}
@ -32,15 +32,15 @@ void ExternalQueryBuilder::writeQuoted(const std::string & s, WriteBuffer & out)
{
switch (quoting_style)
{
case None:
case IdentifierQuotingStyle::None:
writeString(s, out);
break;
case Backticks:
case IdentifierQuotingStyle::Backticks:
writeBackQuotedString(s, out);
break;
case DoubleQuotes:
case IdentifierQuotingStyle::DoubleQuotes:
writeDoubleQuotedString(s, out);
break;
}
@ -138,7 +138,7 @@ std::string ExternalQueryBuilder::composeLoadAllQuery() const
}
std::string ExternalQueryBuilder::composeUpdateQuery(const std::string &update_field, const std::string &time_point) const
std::string ExternalQueryBuilder::composeUpdateQuery(const std::string & update_field, const std::string & time_point) const
{
std::string out = composeLoadAllQuery();
std::string update_query;
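
The change to the `switch` in `writeQuoted` above, from bare `None` to `IdentifierQuotingStyle::None`, is what one would expect when an unscoped nested enum is replaced by a scoped one. The following is a self-contained sketch of that dispatch, not the ClickHouse implementation: the enumerator names are taken from the diff, while `quoteWith`, `quoteIdentifier`, and the backslash-escaping rule are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Enumerator names mirror the diff; the real definition lives in
// Parsers/IdentifierQuotingStyle.h and may differ in detail.
enum class IdentifierQuotingStyle
{
    None,         // write as-is, without quotes
    Backticks,    // `mysql` style
    DoubleQuotes  // "postgres" style
};

// Hypothetical helper: wraps `s` in `quote`, backslash-escaping the quote
// character and backslashes. The escaping rule is an assumption of this sketch.
inline std::string quoteWith(const std::string & s, char quote)
{
    std::string out(1, quote);
    for (char c : s)
    {
        if (c == quote || c == '\\')
            out += '\\';
        out += c;
    }
    out += quote;
    return out;
}

// Hypothetical counterpart of writeQuoted that returns a string.
inline std::string quoteIdentifier(const std::string & s, IdentifierQuotingStyle style)
{
    switch (style)
    {
        case IdentifierQuotingStyle::None:
            return s;
        case IdentifierQuotingStyle::Backticks:
            return quoteWith(s, '`');
        case IdentifierQuotingStyle::DoubleQuotes:
            return quoteWith(s, '"');
    }
    return s; // not reached for valid enumerators; avoids a missing-return warning
}
```

A scoped enum keeps the enumerators out of the enclosing namespace, which is why every `case` label has to be qualified. As the note in the original header warns, the escaping rules of a specific external DBMS may differ from whatever a generic helper like this one does.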

View File

@ -3,6 +3,7 @@
#include <string>
#include <Formats/FormatSettings.h>
#include <Columns/IColumn.h>
#include <Parsers/IdentifierQuotingStyle.h>
namespace DB
@ -21,16 +22,7 @@ struct ExternalQueryBuilder
const std::string & table;
const std::string & where;
/// Method to quote identifiers.
/// NOTE There could be differences in escaping rules inside quotes. Escaping rules may not match that required by specific external DBMS.
enum QuotingStyle
{
None, /// Write as-is, without quotes.
Backticks, /// `mysql` style
DoubleQuotes /// "postgres" style
};
QuotingStyle quoting_style;
IdentifierQuotingStyle quoting_style;
ExternalQueryBuilder(
@ -38,7 +30,7 @@ struct ExternalQueryBuilder
const std::string & db,
const std::string & table,
const std::string & where,
QuotingStyle quoting_style);
IdentifierQuotingStyle quoting_style);
/** Generate a query to load all data. */
std::string composeLoadAllQuery() const;

View File

@ -208,25 +208,30 @@ BlockInputStreamPtr LibraryDictionarySource::loadKeys(const Columns & key_column
{
LOG_TRACE(log, "loadKeys " << toString() << " size = " << requested_rows.size());
auto columns_holder = std::make_unique<ClickHouseLibrary::CString[]>(key_columns.size());
ClickHouseLibrary::CStrings columns_pass{
static_cast<decltype(ClickHouseLibrary::CStrings::data)>(columns_holder.get()), key_columns.size()};
size_t key_columns_n = 0;
for (auto & column : key_columns)
auto holder = std::make_unique<ClickHouseLibrary::Row[]>(key_columns.size());
std::vector<std::unique_ptr<ClickHouseLibrary::Field[]>> column_data_holders;
for (size_t i = 0; i < key_columns.size(); ++i)
{
columns_pass.data[key_columns_n] = column->getName().c_str();
++key_columns_n;
}
const ClickHouseLibrary::VectorUInt64 requested_rows_c{
ext::bit_cast<decltype(ClickHouseLibrary::VectorUInt64::data)>(requested_rows.data()), requested_rows.size()};
void * data_ptr = nullptr;
auto cell_holder = std::make_unique<ClickHouseLibrary::Field[]>(requested_rows.size());
for (size_t j = 0; j < requested_rows.size(); ++j)
{
auto data_ref = key_columns[i]->getDataAt(requested_rows[j]);
cell_holder[j] = ClickHouseLibrary::Field{.data = static_cast<const void *>(data_ref.data), .size = data_ref.size};
}
holder[i]
= ClickHouseLibrary::Row{.data = static_cast<ClickHouseLibrary::Field *>(cell_holder.get()), .size = requested_rows.size()};
column_data_holders.push_back(std::move(cell_holder));
}
ClickHouseLibrary::Table request_cols{.data = static_cast<ClickHouseLibrary::Row *>(holder.get()), .size = key_columns.size()};
void * data_ptr = nullptr;
/// Get function pointer before dataNew call because library->get may throw.
auto func_loadKeys
= library->get<void * (*)(decltype(data_ptr), decltype(&settings->strings), decltype(&columns_pass), decltype(&requested_rows_c))>(
"ClickHouseDictionary_v3_loadKeys");
auto func_loadKeys = library->get<void * (*)(decltype(data_ptr), decltype(&settings->strings), decltype(&request_cols))>(
"ClickHouseDictionary_v3_loadKeys");
data_ptr = library->get<decltype(data_ptr) (*)(decltype(lib_data))>("ClickHouseDictionary_v3_dataNew")(lib_data);
auto data = func_loadKeys(data_ptr, &settings->strings, &columns_pass, &requested_rows_c);
auto data = func_loadKeys(data_ptr, &settings->strings, &request_cols);
auto block = dataToBlock(description.sample_block, data);
SCOPE_EXIT(library->get<void (*)(decltype(lib_data), decltype(data_ptr))>("ClickHouseDictionary_v3_dataDelete")(lib_data, data_ptr));
return std::make_shared<OneBlockInputStream>(block);

View File

@ -35,7 +35,7 @@ MySQLDictionarySource::MySQLDictionarySource(const DictionaryStructure & dict_st
dont_check_update_time{config.getBool(config_prefix + ".dont_check_update_time", false)},
sample_block{sample_block},
pool{config, config_prefix},
query_builder{dict_struct, db, table, where, ExternalQueryBuilder::Backticks},
query_builder{dict_struct, db, table, where, IdentifierQuotingStyle::Backticks},
load_all_query{query_builder.composeLoadAllQuery()},
invalidate_query{config.getString(config_prefix + ".invalidate_query", "")}
{
@ -53,7 +53,7 @@ MySQLDictionarySource::MySQLDictionarySource(const MySQLDictionarySource & other
dont_check_update_time{other.dont_check_update_time},
sample_block{other.sample_block},
pool{other.pool},
query_builder{dict_struct, db, table, where, ExternalQueryBuilder::Backticks},
query_builder{dict_struct, db, table, where, IdentifierQuotingStyle::Backticks},
load_all_query{other.load_all_query}, last_modification{other.last_modification},
invalidate_query{other.invalidate_query}, invalidate_query_response{other.invalidate_query_response}
{

View File

@ -29,7 +29,7 @@ ODBCDictionarySource::ODBCDictionarySource(const DictionaryStructure & dict_stru
where{config.getString(config_prefix + ".where", "")},
update_field{config.getString(config_prefix + ".update_field", "")},
sample_block{sample_block},
query_builder{dict_struct, db, table, where, ExternalQueryBuilder::None}, /// NOTE Better to obtain quoting style via ODBC interface.
query_builder{dict_struct, db, table, where, IdentifierQuotingStyle::None}, /// NOTE Better to obtain quoting style via ODBC interface.
load_all_query{query_builder.composeLoadAllQuery()},
invalidate_query{config.getString(config_prefix + ".invalidate_query", "")}
{
@ -58,7 +58,7 @@ ODBCDictionarySource::ODBCDictionarySource(const ODBCDictionarySource & other)
update_field{other.update_field},
sample_block{other.sample_block},
pool{other.pool},
query_builder{dict_struct, db, table, where, ExternalQueryBuilder::None},
query_builder{dict_struct, db, table, where, IdentifierQuotingStyle::None},
load_all_query{other.load_all_query},
invalidate_query{other.invalidate_query}, invalidate_query_response{other.invalidate_query_response}
{

View File

@ -1,6 +1,7 @@
#include <Common/Exception.h>
#include <IO/WriteHelpers.h>
#include <Formats/BlockInputStreamFromRowInputStream.h>
#include <common/logger_useful.h>
namespace DB
@ -128,4 +129,16 @@ Block BlockInputStreamFromRowInputStream::readImpl()
return sample.cloneWithColumns(std::move(columns));
}
void BlockInputStreamFromRowInputStream::readSuffix()
{
if (allow_errors_num > 0 || allow_errors_ratio > 0)
{
Logger * log = &Logger::get("BlockInputStreamFromRowInputStream");
LOG_TRACE(log, "Skipped " << num_errors << " rows with errors while reading the input stream");
}
row_input->readSuffix();
}
}

View File

@ -25,7 +25,7 @@ public:
const FormatSettings & settings);
void readPrefix() override { row_input->readPrefix(); }
void readSuffix() override { row_input->readSuffix(); }
void readSuffix() override;
String getName() const override { return "BlockInputStreamFromRowInputStream"; }

View File

@ -43,10 +43,10 @@ private:
/* Action for state machine for traversing nested structures. */
struct Action
{
enum Type { POP, PUSH, READ };
Type type;
capnp::StructSchema::Field field = {};
size_t column = 0;
enum Type { POP, PUSH, READ };
Type type;
capnp::StructSchema::Field field = {};
size_t column = 0;
};
// Wrapper for classes that could throw in destructor
@ -54,10 +54,10 @@ private:
template <typename T>
struct DestructorCatcher
{
T impl;
template <typename ... Arg>
DestructorCatcher(Arg && ... args) : impl(kj::fwd<Arg>(args)...) {}
~DestructorCatcher() noexcept try { } catch (...) { }
T impl;
template <typename ... Arg>
DestructorCatcher(Arg && ... args) : impl(kj::fwd<Arg>(args)...) {}
~DestructorCatcher() noexcept try { } catch (...) { return; }
};
using SchemaParser = DestructorCatcher<capnp::SchemaParser>;

View File

@ -58,6 +58,11 @@ public:
void registerInputFormat(const String & name, InputCreator input_creator);
void registerOutputFormat(const String & name, OutputCreator output_creator);
const FormatsDictionary & getAllFormats() const
{
return dict;
}
private:
FormatsDictionary dict;

View File

@ -41,6 +41,7 @@ generate_function_register(Array
FunctionArrayEnumerate
FunctionArrayEnumerateUniq
FunctionArrayUniq
FunctionArrayDistinct
FunctionEmptyArrayUInt8
FunctionEmptyArrayUInt16
FunctionEmptyArrayUInt32

View File

@ -1062,9 +1062,7 @@ void FunctionArrayUniq::executeImpl(Block & block, const ColumnNumbers & argumen
|| executeNumber<Float32>(first_array, first_null_map, res_values)
|| executeNumber<Float64>(first_array, first_null_map, res_values)
|| executeString(first_array, first_null_map, res_values)))
throw Exception("Illegal column " + block.getByPosition(arguments[0]).column->getName()
+ " of first argument of function " + getName(),
ErrorCodes::ILLEGAL_COLUMN);
executeHashed(*offsets, original_data_columns, res_values);
}
else
{
@ -1272,6 +1270,213 @@ void FunctionArrayUniq::executeHashed(
}
}
/// Implementation of FunctionArrayDistinct.
FunctionPtr FunctionArrayDistinct::create(const Context &)
{
return std::make_shared<FunctionArrayDistinct>();
}
String FunctionArrayDistinct::getName() const
{
return name;
}
DataTypePtr FunctionArrayDistinct::getReturnTypeImpl(const DataTypes & arguments) const
{
const DataTypeArray * array_type = checkAndGetDataType<DataTypeArray>(arguments[0].get());
if (!array_type)
throw Exception("Argument for function " + getName() + " must be array but it "
" has type " + arguments[0]->getName() + ".",
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
auto nested_type = removeNullable(array_type->getNestedType());
return std::make_shared<DataTypeArray>(nested_type);
}
void FunctionArrayDistinct::executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t /*input_rows_count*/)
{
ColumnPtr array_ptr = block.getByPosition(arguments[0]).column;
const ColumnArray * array = checkAndGetColumn<ColumnArray>(array_ptr.get());
const auto & return_type = block.getByPosition(result).type;
auto res_ptr = return_type->createColumn();
ColumnArray & res = static_cast<ColumnArray &>(*res_ptr);
const IColumn & src_data = array->getData();
const ColumnArray::Offsets & offsets = array->getOffsets();
ColumnRawPtrs original_data_columns;
original_data_columns.push_back(&src_data);
IColumn & res_data = res.getData();
ColumnArray::Offsets & res_offsets = res.getOffsets();
const ColumnNullable * nullable_col = nullptr;
const IColumn * inner_col;
if (src_data.isColumnNullable())
{
nullable_col = static_cast<const ColumnNullable *>(&src_data);
inner_col = &nullable_col->getNestedColumn();
}
else
{
inner_col = &src_data;
}
if (!(executeNumber<UInt8>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<UInt16>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<UInt32>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<UInt64>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<Int8>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<Int16>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<Int32>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<Int64>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<Float32>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeNumber<Float64>(*inner_col, offsets, res_data, res_offsets, nullable_col)
|| executeString(*inner_col, offsets, res_data, res_offsets, nullable_col)))
executeHashed(offsets, original_data_columns, res_data, res_offsets);
block.getByPosition(result).column = std::move(res_ptr);
}
template <typename T>
bool FunctionArrayDistinct::executeNumber(const IColumn & src_data,
const ColumnArray::Offsets & src_offsets,
IColumn & res_data_col,
ColumnArray::Offsets & res_offsets,
const ColumnNullable * nullable_col)
{
const ColumnVector<T> * src_data_concrete = checkAndGetColumn<ColumnVector<T>>(&src_data);
if (!src_data_concrete)
{
return false;
}
const PaddedPODArray<T> & values = src_data_concrete->getData();
PaddedPODArray<T> & res_data = typeid_cast<ColumnVector<T> &>(res_data_col).getData();
const PaddedPODArray<UInt8> * src_null_map = nullptr;
if (nullable_col)
{
src_null_map = &static_cast<const ColumnUInt8 *>(&nullable_col->getNullMapColumn())->getData();
}
using Set = ClearableHashSet<T,
DefaultHash<T>,
HashTableGrower<INITIAL_SIZE_DEGREE>,
HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(T)>>;
Set set;
size_t prev_off = 0;
for (size_t i = 0; i < src_offsets.size(); ++i)
{
set.clear();
size_t off = src_offsets[i];
for (size_t j = prev_off; j < off; ++j)
{
if ((set.find(values[j]) == set.end()) && (!nullable_col || (*src_null_map)[j] == 0))
{
res_data.emplace_back(values[j]);
set.insert(values[j]);
}
}
res_offsets.emplace_back(set.size() + prev_off);
prev_off = off;
}
return true;
}
bool FunctionArrayDistinct::executeString(
const IColumn & src_data,
const ColumnArray::Offsets & src_offsets,
IColumn & res_data_col,
ColumnArray::Offsets & res_offsets,
const ColumnNullable * nullable_col)
{
const ColumnString * src_data_concrete = checkAndGetColumn<ColumnString>(&src_data);
if (!src_data_concrete)
{
return false;
}
ColumnString & res_data_column_string = typeid_cast<ColumnString &>(res_data_col);
using Set = ClearableHashSet<StringRef,
StringRefHash,
HashTableGrower<INITIAL_SIZE_DEGREE>,
HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(StringRef)>>;
const PaddedPODArray<UInt8> * src_null_map = nullptr;
if (nullable_col)
{
src_null_map = &static_cast<const ColumnUInt8 *>(&nullable_col->getNullMapColumn())->getData();
}
Set set;
size_t prev_off = 0;
for (size_t i = 0; i < src_offsets.size(); ++i)
{
set.clear();
size_t off = src_offsets[i];
for (size_t j = prev_off; j < off; ++j)
{
StringRef str_ref = src_data_concrete->getDataAt(j);
if (set.find(str_ref) == set.end() && (!nullable_col || (*src_null_map)[j] == 0))
{
set.insert(str_ref);
res_data_column_string.insertData(str_ref.data, str_ref.size);
}
}
res_offsets.emplace_back(set.size() + prev_off);
prev_off = off;
}
return true;
}
void FunctionArrayDistinct::executeHashed(
const ColumnArray::Offsets & offsets,
const ColumnRawPtrs & columns,
IColumn & res_data_col,
ColumnArray::Offsets & res_offsets)
{
size_t count = columns.size();
using Set = ClearableHashSet<UInt128, UInt128TrivialHash, HashTableGrower<INITIAL_SIZE_DEGREE>,
HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(UInt128)>>;
Set set;
size_t prev_off = 0;
for (size_t i = 0; i < offsets.size(); ++i)
{
set.clear();
size_t off = offsets[i];
for (size_t j = prev_off; j < off; ++j)
{
auto hash = hash128(j, count, columns);
if (set.find(hash) == set.end())
{
set.insert(hash);
res_data_col.insertFrom(*columns[0], j);
}
}
res_offsets.emplace_back(set.size() + prev_off);
prev_off = off;
}
}
/// Implementation of FunctionArrayEnumerateUniq.
FunctionPtr FunctionArrayEnumerateUniq::create(const Context &)
@ -1334,13 +1539,7 @@ void FunctionArrayEnumerateUniq::executeImpl(Block & block, const ColumnNumbers
ErrorCodes::SIZES_OF_ARRAYS_DOESNT_MATCH);
auto * array_data = &array->getData();
if (auto * tuple_column = checkAndGetColumn<ColumnTuple>(array_data))
{
for (const auto & element : tuple_column->getColumns())
data_columns.push_back(element.get());
}
else
data_columns.push_back(array_data);
data_columns.push_back(array_data);
}
size_t num_columns = data_columns.size();
@ -1383,9 +1582,7 @@ void FunctionArrayEnumerateUniq::executeImpl(Block & block, const ColumnNumbers
|| executeNumber<Float32>(first_array, first_null_map, res_values)
|| executeNumber<Float64>(first_array, first_null_map, res_values)
|| executeString (first_array, first_null_map, res_values)))
throw Exception("Illegal column " + block.getByPosition(arguments[0]).column->getName()
+ " of first argument of function " + getName(),
ErrorCodes::ILLEGAL_COLUMN);
executeHashed(*offsets, original_data_columns, res_values);
}
else
{
@ -2427,8 +2624,6 @@ void FunctionArrayReduce::executeImpl(Block & block, const ColumnNumbers & argum
std::vector<const IColumn *> aggregate_arguments_vec(num_arguments_columns);
const ColumnArray::Offsets * offsets = nullptr;
bool is_const = true;
for (size_t i = 0; i < num_arguments_columns; ++i)
{
const IColumn * col = block.getByPosition(arguments[i + 1]).column.get();
@ -2437,7 +2632,6 @@ void FunctionArrayReduce::executeImpl(Block & block, const ColumnNumbers & argum
{
aggregate_arguments_vec[i] = &arr->getData();
offsets_i = &arr->getOffsets();
is_const = false;
}
else if (const ColumnConst * const_arr = checkAndGetColumnConst<ColumnArray>(col))
{
@ -2493,14 +2687,7 @@ void FunctionArrayReduce::executeImpl(Block & block, const ColumnNumbers & argum
current_offset = next_offset;
}
if (!is_const)
{
block.getByPosition(result).column = std::move(result_holder);
}
else
{
block.getByPosition(result).column = block.getByPosition(result).type->createColumnConst(rows, res_col[0]);
}
block.getByPosition(result).column = std::move(result_holder);
}
/// Implementation of FunctionArrayConcat.
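
The per-array deduplication added above can be sketched outside ClickHouse as follows. This is a simplified illustration, not the code from the diff: it assumes `std::vector` and `std::unordered_set` instead of `ColumnArray` and `ClearableHashSet`, ignores `Nullable`, and the names `FlatArrays` and `distinctPerArray` are hypothetical.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Flattened array column: `data` holds all elements back to back,
// `offsets[i]` is the end position (exclusive) of the i-th array in `data`.
struct FlatArrays
{
    std::vector<int> data;
    std::vector<std::size_t> offsets;
};

// Keeps the first occurrence of every value inside each array.
FlatArrays distinctPerArray(const FlatArrays & src)
{
    FlatArrays res;
    std::unordered_set<int> seen;
    std::size_t prev_off = 0;

    for (std::size_t off : src.offsets)
    {
        seen.clear();
        for (std::size_t j = prev_off; j < off; ++j)
            if (seen.insert(src.data[j]).second)
                res.data.push_back(src.data[j]);

        // Result offsets are running totals over the result data, not the source data.
        res.offsets.push_back(res.data.size());
        prev_off = off;
    }
    return res;
}
```

In this sketch each result offset is derived from the size of the result data. The diff instead pushes `set.size() + prev_off`, where `prev_off` comes from the source offsets; the two values agree only while no earlier element has been dropped, so those lines may deserve a second look.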

View File

@ -46,6 +46,8 @@ namespace ErrorCodes
* arrayUniq(arr) - counts the number of different elements in the array,
* arrayUniq(arr1, arr2, ...) - counts the number of different tuples from the elements in the corresponding positions in several arrays.
*
* arrayDistinct(arr) - returns the distinct elements of the array
*
* arrayEnumerateUniq(arr)
* - outputs an array parallel (having same size) to this, where for each element specified
* how many times this element was encountered before (including this element) among elements with the same value.
@ -1009,10 +1011,11 @@ public:
DataTypePtr observed_type0 = removeNullable(array_type->getNestedType());
DataTypePtr observed_type1 = removeNullable(arguments[1]);
if (!(observed_type0->isNumber() && observed_type1->isNumber())
/// We also support arrays of Enum type (that are represented by number) to search numeric values.
if (!(observed_type0->isValueRepresentedByNumber() && observed_type1->isNumber())
&& !observed_type0->equals(*observed_type1))
throw Exception("Types of array and 2nd argument of function "
+ getName() + " must be identical up to nullability. Passed: "
+ getName() + " must be identical up to nullability or numeric types or Enum and numeric type. Passed: "
+ arguments[0]->getName() + " and " + arguments[1]->getName() + ".",
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
@ -1210,6 +1213,52 @@ private:
};
/// Find different elements in an array.
class FunctionArrayDistinct : public IFunction
{
public:
static constexpr auto name = "arrayDistinct";
static FunctionPtr create(const Context & context);
String getName() const override;
bool isVariadic() const override { return false; }
size_t getNumberOfArguments() const override { return 1; }
bool useDefaultImplementationForConstants() const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override;
void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t input_rows_count) override;
private:
/// Initially allocate a piece of memory for 512 elements. NOTE: This is just a guess.
static constexpr size_t INITIAL_SIZE_DEGREE = 9;
template <typename T>
bool executeNumber(
const IColumn & src_data,
const ColumnArray::Offsets & src_offsets,
IColumn & res_data_col,
ColumnArray::Offsets & res_offsets,
const ColumnNullable * nullable_col);
bool executeString(
const IColumn & src_data,
const ColumnArray::Offsets & src_offsets,
IColumn & res_data_col,
ColumnArray::Offsets & res_offsets,
const ColumnNullable * nullable_col);
void executeHashed(
const ColumnArray::Offsets & offsets,
const ColumnRawPtrs & columns,
IColumn & res_data_col,
ColumnArray::Offsets & res_offsets);
};
class FunctionArrayEnumerateUniq : public IFunction
{
public:
@ -1384,6 +1433,9 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0}; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override;
void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t input_rows_count) override;

View File

@ -15,6 +15,7 @@
#include <Common/UnicodeBar.h>
#include <Common/UTF8Helpers.h>
#include <Common/FieldVisitors.h>
#include <Common/config_version.h>
#include <DataTypes/DataTypeAggregateFunction.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeDate.h>
@ -1867,9 +1868,7 @@ public:
std::string FunctionVersion::getVersion() const
{
std::ostringstream os;
os << DBMS_VERSION_MAJOR << "." << DBMS_VERSION_MINOR << "." << ClickHouseRevision::get();
return os.str();
return VERSION_STRING;
}

View File

@ -640,7 +640,6 @@ inline void readBinary(String & x, ReadBuffer & buf) { readStringBinary(x, buf);
inline void readBinary(UInt128 & x, ReadBuffer & buf) { readPODBinary(x, buf); }
inline void readBinary(UInt256 & x, ReadBuffer & buf) { readPODBinary(x, buf); }
inline void readBinary(LocalDate & x, ReadBuffer & buf) { readPODBinary(x, buf); }
inline void readBinary(LocalDateTime & x, ReadBuffer & buf) { readPODBinary(x, buf); }
/// Generic methods to read value in text tab-separated format.

View File

@ -394,11 +394,13 @@ inline void writeBackQuotedString(const String & s, WriteBuffer & buf)
writeAnyQuotedString<'`'>(s, buf);
}
/// The same, but backquotes apply only if there are characters that do not match the identifier without backquotes.
inline void writeProbablyBackQuotedString(const String & s, WriteBuffer & buf)
/// The same, but quotes apply only if there are characters that do not match the identifier without quotes.
template <typename F>
inline void writeProbablyQuotedStringImpl(const String & s, WriteBuffer & buf, F && write_quoted_string)
{
if (s.empty() || !isValidIdentifierBegin(s[0]))
writeBackQuotedString(s, buf);
write_quoted_string(s, buf);
else
{
const char * pos = s.data() + 1;
@ -407,12 +409,22 @@ inline void writeProbablyBackQuotedString(const String & s, WriteBuffer & buf)
if (!isWordCharASCII(*pos))
break;
if (pos != end)
writeBackQuotedString(s, buf);
write_quoted_string(s, buf);
else
writeString(s, buf);
}
}
inline void writeProbablyBackQuotedString(const String & s, WriteBuffer & buf)
{
writeProbablyQuotedStringImpl(s, buf, [](const String & s, WriteBuffer & buf) { return writeBackQuotedString(s, buf); });
}
inline void writeProbablyDoubleQuotedString(const String & s, WriteBuffer & buf)
{
writeProbablyQuotedStringImpl(s, buf, [](const String & s, WriteBuffer & buf) { return writeDoubleQuotedString(s, buf); });
}
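
The refactoring above moves the "quote only when needed" check into one template and passes the quoting style in as a callable. A rough standalone sketch of the same shape, assuming `std::ostream` in place of `WriteBuffer`, a hypothetical `looksLikeIdentifier` check in place of `isValidIdentifierBegin` / `isWordCharASCII`, and no escaping inside the quotes:

```cpp
#include <cctype>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the identifier checks used in the diff.
inline bool looksLikeIdentifier(const std::string & s)
{
    if (s.empty() || !(std::isalpha(static_cast<unsigned char>(s[0])) || s[0] == '_'))
        return false;
    for (char c : s)
        if (!(std::isalnum(static_cast<unsigned char>(c)) || c == '_'))
            return false;
    return true;
}

// Shared logic: write the string bare if it is a plain identifier,
// otherwise delegate to the quoting function supplied by the caller.
template <typename F>
void writeProbablyQuoted(const std::string & s, std::ostream & out, F && write_quoted)
{
    if (looksLikeIdentifier(s))
        out << s;
    else
        write_quoted(s, out);
}

// Simplified quoting: characters inside the string are not escaped.
inline std::string probablyBackQuoted(const std::string & s)
{
    std::ostringstream out;
    writeProbablyQuoted(s, out, [](const std::string & v, std::ostream & o) { o << '`' << v << '`'; });
    return out.str();
}

inline std::string probablyDoubleQuoted(const std::string & s)
{
    std::ostringstream out;
    writeProbablyQuoted(s, out, [](const std::string & v, std::ostream & o) { o << '"' << v << '"'; });
    return out.str();
}
```

Each quoting style becomes a one-line wrapper, which is how `writeProbablyDoubleQuotedString` is added in the diff without duplicating the identifier check.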
/** Outputs the string in the CSV format.
* Rules:

View File

@ -6,6 +6,7 @@
#include <Core/Defines.h>
#include <Common/getFQDNOrHostName.h>
#include <Common/ClickHouseRevision.h>
#include <Common/config_version.h>
#include <port/unistd.h>

View File

@ -58,7 +58,7 @@ namespace
BlockInputStreamPtr createLocalStream(const ASTPtr & query_ast, const Context & context, QueryProcessingStage::Enum processed_stage)
{
InterpreterSelectQuery interpreter{query_ast, context, {}, processed_stage};
InterpreterSelectQuery interpreter{query_ast, context, Names{}, processed_stage};
BlockInputStreamPtr stream = interpreter.execute().in;
/** Materialization is needed, since from remote servers the constants come materialized.

View File

@ -158,7 +158,7 @@ ExpressionAction ExpressionAction::ordinaryJoin(std::shared_ptr<const Join> join
void ExpressionAction::prepare(Block & sample_block)
{
// std::cerr << "preparing: " << toString() << std::endl;
// std::cerr << "preparing: " << toString() << std::endl;
/** Constant expressions should be evaluated, and put the result in sample_block.
*/
@ -322,8 +322,6 @@ size_t ExpressionAction::getInputRowsCount(Block & block, std::unordered_map<std
void ExpressionAction::execute(Block & block, std::unordered_map<std::string, size_t> & input_rows_counts) const
{
// std::cerr << "executing: " << toString() << std::endl;
size_t input_rows_count = getInputRowsCount(block, input_rows_counts);
if (type == REMOVE_COLUMN || type == COPY_COLUMN)

View File

@ -61,6 +61,7 @@
#include <DataTypes/DataTypeFunction.h>
#include <Functions/FunctionsMiscellaneous.h>
#include <DataTypes/DataTypeTuple.h>
#include <Parsers/queryToString.h>
namespace DB
@ -89,6 +90,7 @@ namespace ErrorCodes
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int CONDITIONAL_TREE_PARENT_NOT_FOUND;
extern const int TYPE_MISMATCH;
extern const int INVALID_JOIN_ON_EXPRESSION;
}
@ -215,7 +217,7 @@ ExpressionAnalyzer::ExpressionAnalyzer(
/// Common subexpression elimination. Rewrite rules.
normalizeTree();
/// Remove unneeded columns according to 'required_source_columns'.
/// Remove unneeded columns according to 'required_result_columns'.
/// Leave all selected columns in case of DISTINCT; columns that contain arrayJoin function inside.
/// Must be after 'normalizeTree' (after expanding aliases, for aliases not get lost)
/// and before 'executeScalarSubqueries', 'analyzeAggregation', etc. to avoid excessive calculations.
@ -1477,18 +1479,7 @@ void ExpressionAnalyzer::tryMakeSetForIndexFromSubquery(const ASTPtr & subquery_
{
BlockIO res = interpretSubquery(subquery_or_table_name, context, subquery_depth + 1, {})->execute();
SizeLimits set_for_index_size_limits;
if (settings.use_index_for_in_with_subqueries_max_values && settings.use_index_for_in_with_subqueries_max_values < settings.max_rows_in_set)
{
/// Silently cancel creating the set for index if the specific limit has been reached.
set_for_index_size_limits = SizeLimits(settings.use_index_for_in_with_subqueries_max_values, settings.max_bytes_in_set, OverflowMode::BREAK);
}
else
{
/// If the limit specific for set for index is lower than general limits for set - use general limit.
set_for_index_size_limits = SizeLimits(settings.max_rows_in_set, settings.max_bytes_in_set, settings.set_overflow_mode);
}
SizeLimits set_for_index_size_limits = SizeLimits(settings.max_rows_in_set, settings.max_bytes_in_set, settings.set_overflow_mode);
SetPtr set = std::make_shared<Set>(set_for_index_size_limits, true);
set->setHeader(res.in->getHeader());
@ -2071,6 +2062,7 @@ void ExpressionAnalyzer::getActionsImpl(const ASTPtr & ast, bool no_subqueries,
ColumnWithTypeAndName fake_column;
fake_column.name = projection_manipulator->getColumnName(getColumnName());
fake_column.type = std::make_shared<DataTypeUInt8>();
fake_column.column = fake_column.type->createColumn();
actions_stack.addAction(ExpressionAction::addColumn(fake_column, projection_manipulator->getProjectionSourceColumn(), false));
getActionsImpl(node->arguments->children.at(0), no_subqueries, only_consts, actions_stack,
projection_manipulator);
@ -2866,22 +2858,53 @@ void ExpressionAnalyzer::collectJoinedColumns(NameSet & joined_columns, NamesAnd
nested_result_sample = InterpreterSelectWithUnionQuery::getSampleBlock(subquery, context);
}
auto add_name_to_join_keys = [](Names & join_keys, const String & name, const char * where)
{
if (join_keys.end() == std::find(join_keys.begin(), join_keys.end(), name))
join_keys.push_back(name);
else
throw Exception("Duplicate column " + name + " " + where, ErrorCodes::DUPLICATE_COLUMN);
};
if (table_join.using_expression_list)
{
auto & keys = typeid_cast<ASTExpressionList &>(*table_join.using_expression_list);
for (const auto & key : keys.children)
{
if (join_key_names_left.end() == std::find(join_key_names_left.begin(), join_key_names_left.end(), key->getColumnName()))
join_key_names_left.push_back(key->getColumnName());
else
throw Exception("Duplicate column " + key->getColumnName() + " in USING list", ErrorCodes::DUPLICATE_COLUMN);
if (join_key_names_right.end() == std::find(join_key_names_right.begin(), join_key_names_right.end(), key->getAliasOrColumnName()))
join_key_names_right.push_back(key->getAliasOrColumnName());
else
throw Exception("Duplicate column " + key->getAliasOrColumnName() + " in USING list", ErrorCodes::DUPLICATE_COLUMN);
add_name_to_join_keys(join_key_names_left, key->getColumnName(), "in USING list");
add_name_to_join_keys(join_key_names_right, key->getAliasOrColumnName(), "in USING list");
}
}
else if (table_join.on_expression)
{
const auto supported_syntax =
"\nSupported syntax: JOIN ON [table.]column = [table.]column [AND [table.]column = [table.]column ...]";
auto throwSyntaxException = [&](const String & msg)
{
throw Exception("Invalid expression for JOIN ON. " + msg + supported_syntax, ErrorCodes::INVALID_JOIN_ON_EXPRESSION);
};
auto add_columns_from_equals_expr = [&](const ASTPtr & expr)
{
auto * func_equals = typeid_cast<const ASTFunction *>(expr.get());
if (!func_equals || func_equals->name != "equals")
throwSyntaxException("Expected equals expression, got " + queryToString(expr));
String left_name = func_equals->arguments->children.at(0)->getAliasOrColumnName();
String right_name = func_equals->arguments->children.at(1)->getAliasOrColumnName();
add_name_to_join_keys(join_key_names_left, left_name, "in JOIN ON expression for left table");
add_name_to_join_keys(join_key_names_right, right_name, "in JOIN ON expression for right table");
};
auto * func = typeid_cast<const ASTFunction *>(table_join.on_expression.get());
if (func && func->name == "and")
{
for (auto expr : func->children)
add_columns_from_equals_expr(expr);
}
else
add_columns_from_equals_expr(table_join.on_expression);
}
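
The `add_name_to_join_keys` lambda above collects key names and rejects repeats. A minimal standalone sketch of that behaviour, assuming `std::vector<std::string>` for `Names` and `std::invalid_argument` in place of the ClickHouse `Exception` with `ErrorCodes::DUPLICATE_COLUMN`; `tryAddNameToJoinKeys` is a hypothetical helper added only for the usage example:

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

using Names = std::vector<std::string>;

// Appends `name` unless it is already present; a duplicate is reported as an error.
void addNameToJoinKeys(Names & join_keys, const std::string & name, const char * where)
{
    if (std::find(join_keys.begin(), join_keys.end(), name) != join_keys.end())
        throw std::invalid_argument("Duplicate column " + name + " " + where);
    join_keys.push_back(name);
}

// Hypothetical wrapper: returns false instead of throwing on a duplicate.
bool tryAddNameToJoinKeys(Names & join_keys, const std::string & name)
{
    try
    {
        addNameToJoinKeys(join_keys, name, "in USING list");
        return true;
    }
    catch (const std::invalid_argument &)
    {
        return false;
    }
}
```

The linear `std::find` keeps insertion order, which matters here because the left and right key lists are matched by position.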
for (const auto i : ext::range(0, nested_result_sample.columns()))
{

View File

@ -63,40 +63,70 @@ namespace ErrorCodes
InterpreterSelectQuery::InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const Names & required_result_column_names_,
const Names & required_result_column_names,
QueryProcessingStage::Enum to_stage_,
size_t subquery_depth_,
const BlockInputStreamPtr & input,
bool only_analyze)
: query_ptr(query_ptr_->clone()) /// Note: the query is cloned because it will be modified during analysis.
, query(typeid_cast<ASTSelectQuery &>(*query_ptr))
, context(context_)
, to_stage(to_stage_)
, subquery_depth(subquery_depth_)
, only_analyze(only_analyze)
, input(input)
, log(&Logger::get("InterpreterSelectQuery"))
bool only_analyze_)
: InterpreterSelectQuery(query_ptr_, context_, nullptr, nullptr, required_result_column_names, to_stage_, subquery_depth_, only_analyze_)
{
init(required_result_column_names_);
}
InterpreterSelectQuery::InterpreterSelectQuery(OnlyAnalyzeTag, const ASTPtr & query_ptr_, const Context & context_)
: query_ptr(query_ptr_->clone())
, query(typeid_cast<ASTSelectQuery &>(*query_ptr))
, context(context_)
, to_stage(QueryProcessingStage::Complete)
, subquery_depth(0)
, only_analyze(true)
, log(&Logger::get("InterpreterSelectQuery"))
InterpreterSelectQuery::InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const BlockInputStreamPtr & input_,
QueryProcessingStage::Enum to_stage_,
bool only_analyze_)
: InterpreterSelectQuery(query_ptr_, context_, input_, nullptr, Names{}, to_stage_, 0, only_analyze_)
{
}
InterpreterSelectQuery::InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const StoragePtr & storage_,
QueryProcessingStage::Enum to_stage_,
bool only_analyze_)
: InterpreterSelectQuery(query_ptr_, context_, nullptr, storage_, Names{}, to_stage_, 0, only_analyze_)
{
init({});
}
InterpreterSelectQuery::~InterpreterSelectQuery() = default;
void InterpreterSelectQuery::init(const Names & required_result_column_names)
/** There are no limits on the maximum size of the result for the subquery,
* since the result of the subquery is not the result of the entire query.
*/
static Context getSubqueryContext(const Context & context)
{
Context subquery_context = context;
Settings subquery_settings = context.getSettings();
subquery_settings.max_result_rows = 0;
subquery_settings.max_result_bytes = 0;
/// The calculation of extremes does not make sense and is not necessary (if you do it, then the extremes of the subquery can be taken for whole query).
subquery_settings.extremes = 0;
subquery_context.setSettings(subquery_settings);
return subquery_context;
}
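
`getSubqueryContext` follows a copy-then-override pattern: the caller's context stays untouched and the subquery gets a copy with result limits and extremes switched off. A simplified sketch, assuming hypothetical `SketchSettings` and `SketchContext` structs that only model the three fields touched in the diff:

```cpp
#include <cstdint>

// Hypothetical, heavily reduced stand-ins for Settings and Context.
struct SketchSettings
{
    std::uint64_t max_result_rows = 0;
    std::uint64_t max_result_bytes = 0;
    bool extremes = false;
};

struct SketchContext
{
    SketchSettings settings;
};

// Returns a copy of `context` suitable for a subquery:
// no limits on the result size and no calculation of extremes.
SketchContext makeSubqueryContext(const SketchContext & context)
{
    SketchContext subquery_context = context;   // copy, the original is left as is
    subquery_context.settings.max_result_rows = 0;
    subquery_context.settings.max_result_bytes = 0;
    subquery_context.settings.extremes = false;
    return subquery_context;
}
```

Making this a free function lets the constructor build the subquery interpreter directly, instead of assembling the context inline as the removed code in `executeFetchColumns` did.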
InterpreterSelectQuery::InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const BlockInputStreamPtr & input_,
const StoragePtr & storage_,
const Names & required_result_column_names,
QueryProcessingStage::Enum to_stage_,
size_t subquery_depth_,
bool only_analyze_)
: query_ptr(query_ptr_->clone()) /// Note: the query is cloned because it will be modified during analysis.
, query(typeid_cast<ASTSelectQuery &>(*query_ptr))
, context(context_)
, to_stage(to_stage_)
, subquery_depth(subquery_depth_)
, only_analyze(only_analyze_)
, storage(storage_)
, input(input_)
, log(&Logger::get("InterpreterSelectQuery"))
{
if (!context.hasQueryContext())
context.setQueryContext(context);
@ -111,39 +141,44 @@ void InterpreterSelectQuery::init(const Names & required_result_column_names)
max_streams = settings.max_threads;
const auto & table_expression = query.table();
NamesAndTypesList source_columns;
if (input)
{
/// Read from prepared input.
source_columns = input->getHeader().getNamesAndTypesList();
source_header = input->getHeader();
}
else if (table_expression && typeid_cast<const ASTSelectWithUnionQuery *>(table_expression.get()))
{
/// Read from subquery.
source_columns = InterpreterSelectWithUnionQuery::getSampleBlock(table_expression, context).getNamesAndTypesList();
}
else if (table_expression && typeid_cast<const ASTFunction *>(table_expression.get()))
{
/// Read from table function.
storage = context.getQueryContext().executeTableFunction(table_expression);
}
else
{
/// Read from table. Even without table expression (implicit SELECT ... FROM system.one).
String database_name;
String table_name;
interpreter_subquery = std::make_unique<InterpreterSelectWithUnionQuery>(
table_expression, getSubqueryContext(context), required_columns, QueryProcessingStage::Complete, subquery_depth + 1, only_analyze);
getDatabaseAndTableNames(database_name, table_name);
source_header = interpreter_subquery->getSampleBlock();
}
else if (!storage)
{
if (table_expression && typeid_cast<const ASTFunction *>(table_expression.get()))
{
/// Read from table function.
storage = context.getQueryContext().executeTableFunction(table_expression);
}
else
{
/// Read from table. Even without table expression (implicit SELECT ... FROM system.one).
String database_name;
String table_name;
storage = context.getTable(database_name, table_name);
getDatabaseAndTableNames(database_name, table_name);
storage = context.getTable(database_name, table_name);
}
}
if (storage)
table_lock = storage->lockStructure(false, __PRETTY_FUNCTION__);
query_analyzer = std::make_unique<ExpressionAnalyzer>(
query_ptr, context, storage, source_columns, required_result_column_names, subquery_depth, !only_analyze);
query_ptr, context, storage, source_header.getNamesAndTypesList(), required_result_column_names, subquery_depth, !only_analyze);
if (!only_analyze)
{
@ -161,6 +196,25 @@ void InterpreterSelectQuery::init(const Names & required_result_column_names)
if (!context.tryGetExternalTable(it.first))
context.addExternalTable(it.first, it.second);
}
if (interpreter_subquery)
{
/// If there is an aggregation in the outer query, WITH TOTALS is ignored in the subquery.
if (query_analyzer->hasAggregation())
interpreter_subquery->ignoreWithTotals();
}
required_columns = query_analyzer->getRequiredSourceColumns();
if (storage)
source_header = storage->getSampleBlockForColumns(required_columns);
/// Calculate structure of the result.
{
Pipeline pipeline;
executeImpl(pipeline, input, true);
result_header = pipeline.firstStream()->getHeader();
}
}
@ -194,23 +248,14 @@ void InterpreterSelectQuery::getDatabaseAndTableNames(String & database_name, St
Block InterpreterSelectQuery::getSampleBlock()
{
Pipeline pipeline;
executeImpl(pipeline, input, true);
auto res = pipeline.firstStream()->getHeader();
return res;
}
Block InterpreterSelectQuery::getSampleBlock(const ASTPtr & query_ptr_, const Context & context_)
{
return InterpreterSelectQuery(OnlyAnalyzeTag(), query_ptr_, context_).getSampleBlock();
return result_header;
}
BlockIO InterpreterSelectQuery::execute()
{
Pipeline pipeline;
executeImpl(pipeline, input, false);
executeImpl(pipeline, input, only_analyze);
executeUnion(pipeline);
BlockIO res;
@ -221,12 +266,12 @@ BlockIO InterpreterSelectQuery::execute()
BlockInputStreams InterpreterSelectQuery::executeWithMultipleStreams()
{
Pipeline pipeline;
executeImpl(pipeline, input, false);
executeImpl(pipeline, input, only_analyze);
return pipeline.streams;
}
InterpreterSelectQuery::AnalysisResult InterpreterSelectQuery::analyzeExpressions(QueryProcessingStage::Enum from_stage)
InterpreterSelectQuery::AnalysisResult InterpreterSelectQuery::analyzeExpressions(QueryProcessingStage::Enum from_stage, bool dry_run)
{
AnalysisResult res;
@ -247,16 +292,16 @@ InterpreterSelectQuery::AnalysisResult InterpreterSelectQuery::analyzeExpression
res.need_aggregate = query_analyzer->hasAggregation();
query_analyzer->appendArrayJoin(chain, !res.first_stage);
query_analyzer->appendArrayJoin(chain, dry_run || !res.first_stage);
if (query_analyzer->appendJoin(chain, !res.first_stage))
if (query_analyzer->appendJoin(chain, dry_run || !res.first_stage))
{
res.has_join = true;
res.before_join = chain.getLastActions();
chain.addStep();
}
if (query_analyzer->appendWhere(chain, !res.first_stage))
if (query_analyzer->appendWhere(chain, dry_run || !res.first_stage))
{
res.has_where = true;
res.before_where = chain.getLastActions();
@ -265,14 +310,14 @@ InterpreterSelectQuery::AnalysisResult InterpreterSelectQuery::analyzeExpression
if (res.need_aggregate)
{
query_analyzer->appendGroupBy(chain, !res.first_stage);
query_analyzer->appendAggregateFunctionsArguments(chain, !res.first_stage);
query_analyzer->appendGroupBy(chain, dry_run || !res.first_stage);
query_analyzer->appendAggregateFunctionsArguments(chain, dry_run || !res.first_stage);
res.before_aggregation = chain.getLastActions();
chain.finalize();
chain.clear();
if (query_analyzer->appendHaving(chain, !res.second_stage))
if (query_analyzer->appendHaving(chain, dry_run || !res.second_stage))
{
res.has_having = true;
res.before_having = chain.getLastActions();
@ -281,13 +326,13 @@ InterpreterSelectQuery::AnalysisResult InterpreterSelectQuery::analyzeExpression
}
/// If there is aggregation, we execute expressions in SELECT and ORDER BY on the initiating server, otherwise on the source servers.
query_analyzer->appendSelect(chain, res.need_aggregate ? !res.second_stage : !res.first_stage);
query_analyzer->appendSelect(chain, dry_run || (res.need_aggregate ? !res.second_stage : !res.first_stage));
res.selected_columns = chain.getLastStep().required_output;
res.has_order_by = query_analyzer->appendOrderBy(chain, res.need_aggregate ? !res.second_stage : !res.first_stage);
res.has_order_by = query_analyzer->appendOrderBy(chain, dry_run || (res.need_aggregate ? !res.second_stage : !res.first_stage));
res.before_order_and_select = chain.getLastActions();
chain.addStep();
if (query_analyzer->appendLimitBy(chain, !res.second_stage))
if (query_analyzer->appendLimitBy(chain, dry_run || !res.second_stage))
{
res.has_limit_by = true;
res.before_limit_by = chain.getLastActions();
@ -328,16 +373,25 @@ void InterpreterSelectQuery::executeImpl(Pipeline & pipeline, const BlockInputSt
* then perform the remaining operations with one resulting stream.
*/
/** Read the data from Storage. from_stage - to what stage the request was completed in Storage. */
QueryProcessingStage::Enum from_stage = executeFetchColumns(pipeline, dry_run);
AnalysisResult expressions;
if (from_stage == QueryProcessingStage::WithMergeableState && to_stage == QueryProcessingStage::WithMergeableState)
throw Exception("Distributed on Distributed is not supported", ErrorCodes::NOT_IMPLEMENTED);
if (dry_run)
{
pipeline.streams.emplace_back(std::make_shared<NullBlockInputStream>(source_header));
expressions = analyzeExpressions(QueryProcessingStage::FetchColumns, true);
}
else
{
/** Read the data from Storage. from_stage - to what stage the request was completed in Storage. */
QueryProcessingStage::Enum from_stage = executeFetchColumns(pipeline);
if (from_stage == QueryProcessingStage::WithMergeableState && to_stage == QueryProcessingStage::WithMergeableState)
throw Exception("Distributed on Distributed is not supported", ErrorCodes::NOT_IMPLEMENTED);
if (!dry_run)
LOG_TRACE(log, QueryProcessingStage::toString(from_stage) << " -> " << QueryProcessingStage::toString(to_stage));
AnalysisResult expressions = analyzeExpressions(from_stage);
expressions = analyzeExpressions(from_stage, false);
}
const Settings & settings = context.getSettingsRef();
@ -502,11 +556,8 @@ static void getLimitLengthAndOffset(ASTSelectQuery & query, size_t & length, siz
}
}
QueryProcessingStage::Enum InterpreterSelectQuery::executeFetchColumns(Pipeline & pipeline, bool dry_run)
QueryProcessingStage::Enum InterpreterSelectQuery::executeFetchColumns(Pipeline & pipeline)
{
/// List of columns to read to execute the query.
Names required_columns = query_analyzer->getRequiredSourceColumns();
/// Actions to calculate ALIAS if required.
ExpressionActionsPtr alias_actions;
/// Are ALIAS columns required for query execution?
@ -546,36 +597,11 @@ QueryProcessingStage::Enum InterpreterSelectQuery::executeFetchColumns(Pipeline
}
}
/// The subquery interpreter, if the source is a subquery.
std::unique_ptr<InterpreterSelectWithUnionQuery> interpreter_subquery;
auto query_table = query.table();
if (query_table && typeid_cast<ASTSelectWithUnionQuery *>(query_table.get()))
{
/** There are no limits on the maximum size of the result for the subquery,
* since the result of the subquery is not the result of the entire query.
*/
Context subquery_context = context;
Settings subquery_settings = context.getSettings();
subquery_settings.max_result_rows = 0;
subquery_settings.max_result_bytes = 0;
/// The calculation of extremes does not make sense and is not necessary (if you do it, the extremes of the subquery can be mistaken for those of the whole query).
subquery_settings.extremes = 0;
subquery_context.setSettings(subquery_settings);
interpreter_subquery = std::make_unique<InterpreterSelectWithUnionQuery>(
query_table, subquery_context, required_columns, QueryProcessingStage::Complete, subquery_depth + 1);
/// If there is an aggregation in the outer query, WITH TOTALS is ignored in the subquery.
if (query_analyzer->hasAggregation())
interpreter_subquery->ignoreWithTotals();
}
const Settings & settings = context.getSettingsRef();
/// Limitation on the number of columns to read.
/// It's not applied in 'dry_run' mode, because the query could be analyzed without removal of unnecessary columns.
if (!dry_run && settings.max_columns_to_read && required_columns.size() > settings.max_columns_to_read)
/// It's not applied in 'only_analyze' mode, because the query could be analyzed without removal of unnecessary columns.
if (!only_analyze && settings.max_columns_to_read && required_columns.size() > settings.max_columns_to_read)
throw Exception("Limit for number of columns to read exceeded. "
"Requested: " + toString(required_columns.size())
+ ", maximum: " + settings.max_columns_to_read.toString(),
@ -631,10 +657,17 @@ QueryProcessingStage::Enum InterpreterSelectQuery::executeFetchColumns(Pipeline
{
/// Subquery.
if (!dry_run)
pipeline.streams = interpreter_subquery->executeWithMultipleStreams();
else
pipeline.streams.emplace_back(std::make_shared<NullBlockInputStream>(interpreter_subquery->getSampleBlock()));
/// If we need fewer columns than the subquery has, update the interpreter.
if (required_columns.size() < source_header.columns())
{
interpreter_subquery = std::make_unique<InterpreterSelectWithUnionQuery>(
query.table(), getSubqueryContext(context), required_columns, QueryProcessingStage::Complete, subquery_depth + 1, only_analyze);
if (query_analyzer->hasAggregation())
interpreter_subquery->ignoreWithTotals();
}
pipeline.streams = interpreter_subquery->executeWithMultipleStreams();
}
else if (storage)
{
@ -668,8 +701,7 @@ QueryProcessingStage::Enum InterpreterSelectQuery::executeFetchColumns(Pipeline
optimize_prewhere(*merge_tree);
}
if (!dry_run)
pipeline.streams = storage->read(required_columns, query_info, context, from_stage, max_block_size, max_streams);
pipeline.streams = storage->read(required_columns, query_info, context, from_stage, max_block_size, max_streams);
if (pipeline.streams.empty())
pipeline.streams.emplace_back(std::make_shared<NullBlockInputStream>(storage->getSampleBlockForColumns(required_columns)));

View File

@ -1,5 +1,7 @@
#pragma once
#include <memory>
#include <Core/QueryProcessingStage.h>
#include <Interpreters/Context.h>
#include <Interpreters/IInterpreter.h>
@ -16,6 +18,7 @@ namespace DB
class ExpressionAnalyzer;
class ASTSelectQuery;
struct SubqueryForSet;
class InterpreterSelectWithUnionQuery;
/** Interprets the SELECT query. Returns the stream of blocks with the results of the query before `to_stage` stage.
@ -35,9 +38,6 @@ public:
* - to control the limit on the depth of nesting of subqueries. For subqueries, a value that is incremented by one is passed;
* for INSERT SELECT, a value 1 is passed instead of 0.
*
* input
* - if given - read not from the table specified in the query, but from prepared source.
*
* required_result_column_names
* - don't calculate any columns of the query except the specified ones
* - it is used to remove calculation (and reading) of unnecessary columns from subqueries.
@ -50,8 +50,23 @@ public:
const Names & required_result_column_names = Names{},
QueryProcessingStage::Enum to_stage_ = QueryProcessingStage::Complete,
size_t subquery_depth_ = 0,
const BlockInputStreamPtr & input = nullptr,
bool only_analyze = false);
bool only_analyze_ = false);
/// Read data not from the table specified in the query, but from the prepared source `input`.
InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const BlockInputStreamPtr & input_,
QueryProcessingStage::Enum to_stage_ = QueryProcessingStage::Complete,
bool only_analyze_ = false);
/// Read data not from the table specified in the query, but from the specified `storage_`.
InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const StoragePtr & storage_,
QueryProcessingStage::Enum to_stage_ = QueryProcessingStage::Complete,
bool only_analyze_ = false);
~InterpreterSelectQuery() override;
@ -63,13 +78,20 @@ public:
Block getSampleBlock();
static Block getSampleBlock(
const ASTPtr & query_ptr_,
const Context & context_);
void ignoreWithTotals();
private:
InterpreterSelectQuery(
const ASTPtr & query_ptr_,
const Context & context_,
const BlockInputStreamPtr & input_,
const StoragePtr & storage_,
const Names & required_result_column_names,
QueryProcessingStage::Enum to_stage_,
size_t subquery_depth_,
bool only_analyze_);
struct Pipeline
{
/** Streams of data.
@ -103,14 +125,6 @@ private:
}
};
struct OnlyAnalyzeTag {};
InterpreterSelectQuery(
OnlyAnalyzeTag,
const ASTPtr & query_ptr_,
const Context & context_);
void init(const Names & required_result_column_names);
void executeImpl(Pipeline & pipeline, const BlockInputStreamPtr & input, bool dry_run);
@ -142,7 +156,7 @@ private:
SubqueriesForSets subqueries_for_sets;
};
AnalysisResult analyzeExpressions(QueryProcessingStage::Enum from_stage);
AnalysisResult analyzeExpressions(QueryProcessingStage::Enum from_stage, bool dry_run);
/** From which table to read. With JOIN, the "left" table is returned.
@ -155,7 +169,7 @@ private:
void executeWithMultipleStreamsImpl(Pipeline & pipeline, const BlockInputStreamPtr & input, bool dry_run);
/// Fetch data from the table. Returns the stage to which the query was processed in Storage.
QueryProcessingStage::Enum executeFetchColumns(Pipeline & pipeline, bool dry_run);
QueryProcessingStage::Enum executeFetchColumns(Pipeline & pipeline);
void executeWhere(Pipeline & pipeline, const ExpressionActionsPtr & expression);
void executeAggregation(Pipeline & pipeline, const ExpressionActionsPtr & expression, bool overflow_row, bool final);
@ -186,7 +200,7 @@ private:
ASTSelectQuery & query;
Context context;
QueryProcessingStage::Enum to_stage;
size_t subquery_depth;
size_t subquery_depth = 0;
std::unique_ptr<ExpressionAnalyzer> query_analyzer;
/// How many streams we ask the storage to produce, and in how many threads we will do further processing.
@ -195,6 +209,16 @@ private:
/// The object was created only for query analysis.
bool only_analyze = false;
/// List of columns to read to execute the query.
Names required_columns;
/// Structure of query source (table, subquery, etc).
Block source_header;
/// Structure of query result.
Block result_header;
/// The subquery interpreter, if the source is a subquery.
std::unique_ptr<InterpreterSelectWithUnionQuery> interpreter_subquery;
/// Table from where to read data, if not subquery.
StoragePtr storage;
TableStructureReadLockPtr table_lock;

View File

@ -26,7 +26,8 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
const Context & context_,
const Names & required_result_column_names,
QueryProcessingStage::Enum to_stage_,
size_t subquery_depth_)
size_t subquery_depth_,
bool only_analyze)
: query_ptr(query_ptr_),
context(context_),
to_stage(to_stage_),
@ -56,7 +57,7 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
/// We use it to determine positions of 'required_result_column_names' in SELECT clause.
Block full_result_header = InterpreterSelectQuery(
ast.list_of_selects->children.at(0), context, Names(), to_stage, subquery_depth).getSampleBlock();
ast.list_of_selects->children.at(0), context, Names(), to_stage, subquery_depth, true).getSampleBlock();
std::vector<size_t> positions_of_required_result_columns(required_result_column_names.size());
for (size_t required_result_num = 0, size = required_result_column_names.size(); required_result_num < size; ++required_result_num)
@ -65,7 +66,7 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
for (size_t query_num = 1; query_num < num_selects; ++query_num)
{
Block full_result_header_for_current_select = InterpreterSelectQuery(
ast.list_of_selects->children.at(query_num), context, Names(), to_stage, subquery_depth).getSampleBlock();
ast.list_of_selects->children.at(query_num), context, Names(), to_stage, subquery_depth, true).getSampleBlock();
if (full_result_header_for_current_select.columns() != full_result_header.columns())
throw Exception("Different number of columns in UNION ALL elements", ErrorCodes::UNION_ALL_RESULT_STRUCTURES_MISMATCH);
@ -83,7 +84,7 @@ InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(
: required_result_column_names_for_other_selects[query_num];
nested_interpreters.emplace_back(std::make_unique<InterpreterSelectQuery>(
ast.list_of_selects->children.at(query_num), context, current_required_result_column_names, to_stage, subquery_depth));
ast.list_of_selects->children.at(query_num), context, current_required_result_column_names, to_stage, subquery_depth, only_analyze));
}
/// Determine structure of result.
@ -165,7 +166,7 @@ Block InterpreterSelectWithUnionQuery::getSampleBlock(
return cache[key];
}
return cache[key] = InterpreterSelectWithUnionQuery(query_ptr, context).getSampleBlock();
return cache[key] = InterpreterSelectWithUnionQuery(query_ptr, context, {}, QueryProcessingStage::Complete, 0, true).getSampleBlock();
}

View File

@ -21,7 +21,8 @@ public:
const Context & context_,
const Names & required_result_column_names = Names{},
QueryProcessingStage::Enum to_stage_ = QueryProcessingStage::Complete,
size_t subquery_depth_ = 0);
size_t subquery_depth_ = 0,
bool only_analyze = false);
~InterpreterSelectWithUnionQuery();

View File

@ -357,11 +357,10 @@ void LogicalExpressionsOptimizer::fixBrokenOrExpressions()
for (auto & parent : parents)
{
parent->children.push_back(operands[0]);
auto first_erased = std::remove_if(parent->children.begin(), parent->children.end(),
[or_function](const ASTPtr & ptr) { return ptr.get() == or_function; });
parent->children.erase(first_erased, parent->children.end());
// The order of children matters if the OR node is a child of some function, e.g. minus.
std::replace_if(parent->children.begin(), parent->children.end(),
[or_function](const ASTPtr & ptr) { return ptr.get() == or_function; },
operands[0] );
}
/// If the OR node was the root of the WHERE, PREWHERE, or HAVING expression, then update this root.
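The switch from `push_back` plus erase-remove to `std::replace_if` in the hunk above can be illustrated with a minimal standalone sketch. `Node`, `replaceByAppendAndErase`, `replaceInPlace` and `childNames` are hypothetical names invented for this illustration and are not part of the ClickHouse sources; the sketch only assumes that a parent keeps its children in a vector of shared pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; not the ClickHouse IAST class.
struct Node
{
    std::string name;
    std::vector<std::shared_ptr<Node>> children;
};

using NodePtr = std::shared_ptr<Node>;

// Old approach: append the replacement, then erase the old node.
// The replacement always ends up last, so the argument order can change.
inline void replaceByAppendAndErase(Node & parent, const Node * old_node, const NodePtr & replacement)
{
    parent.children.push_back(replacement);
    auto first_erased = std::remove_if(parent.children.begin(), parent.children.end(),
        [old_node](const NodePtr & ptr) { return ptr.get() == old_node; });
    parent.children.erase(first_erased, parent.children.end());
}

// New approach: overwrite the old node where it stands, keeping its position.
inline void replaceInPlace(Node & parent, const Node * old_node, const NodePtr & replacement)
{
    std::replace_if(parent.children.begin(), parent.children.end(),
        [old_node](const NodePtr & ptr) { return ptr.get() == old_node; },
        replacement);
}

// Joins the names of the children with ", " for easy comparison.
inline std::string childNames(const Node & parent)
{
    std::string res;
    for (const auto & child : parent.children)
    {
        if (!res.empty())
            res += ", ";
        res += child->name;
    }
    return res;
}
```

For a parent such as `minus(or_node, y)`, the first helper leaves the children as `y, x`, while the second keeps `x, y`, which matters for any function whose arguments are not interchangeable.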

View File

@ -20,10 +20,10 @@ Block QueryLogElement::createBlock()
{
return
{
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "type"},
{ColumnUInt16::create(), std::make_shared<DataTypeDate>(), "event_date"},
{ColumnUInt32::create(), std::make_shared<DataTypeDateTime>(), "event_time"},
{ColumnUInt32::create(), std::make_shared<DataTypeDateTime>(), "query_start_time"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "type"},
{ColumnUInt16::create(), std::make_shared<DataTypeDate>(), "event_date"},
{ColumnUInt32::create(), std::make_shared<DataTypeDateTime>(), "event_time"},
{ColumnUInt32::create(), std::make_shared<DataTypeDateTime>(), "query_start_time"},
{ColumnUInt64::create(), std::make_shared<DataTypeUInt64>(), "query_duration_ms"},
{ColumnUInt64::create(), std::make_shared<DataTypeUInt64>(), "read_rows"},
@ -41,7 +41,7 @@ Block QueryLogElement::createBlock()
{ColumnString::create(), std::make_shared<DataTypeString>(), "exception"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "stack_trace"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "is_initial_query"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "is_initial_query"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "user"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "query_id"},
@ -53,14 +53,14 @@ Block QueryLogElement::createBlock()
{ColumnFixedString::create(16), std::make_shared<DataTypeFixedString>(16), "initial_address"},
{ColumnUInt16::create(), std::make_shared<DataTypeUInt16>(), "initial_port"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "interface"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "interface"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "os_user"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "client_hostname"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "client_name"},
{ColumnUInt32::create(), std::make_shared<DataTypeUInt32>(), "client_revision"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "http_method"},
{ColumnUInt8::create(), std::make_shared<DataTypeUInt8>(), "http_method"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "http_user_agent"},
{ColumnString::create(), std::make_shared<DataTypeString>(), "quota_key"},

View File

@ -187,8 +187,6 @@ struct Settings
M(SettingSeconds, http_receive_timeout, DEFAULT_HTTP_READ_BUFFER_TIMEOUT, "HTTP receive timeout") \
M(SettingBool, optimize_throw_if_noop, false, "If setting is enabled and OPTIMIZE query didn't actually assign a merge then an explanatory exception is thrown") \
M(SettingBool, use_index_for_in_with_subqueries, true, "Try using an index if there is a subquery or a table expression on the right side of the IN operator.") \
M(SettingUInt64, use_index_for_in_with_subqueries_max_values, 100000, "Don't use index of a table for filtering by right hand side of the IN operator if the size of set is larger than specified threshold. This allows to avoid performance degradation and higher memory usage due to preparation of additional data structures.") \
\
M(SettingBool, empty_result_for_aggregation_by_empty_set, false, "Return empty result when aggregating without keys on empty set.") \
M(SettingBool, allow_distributed_ddl, true, "If it is set to true, then a user is allowed to execute distributed DDL queries.") \
M(SettingUInt64, odbc_max_field_size, 1024, "Max size of a field that can be read from ODBC dictionary. Long strings are truncated.") \

View File

@ -13,7 +13,7 @@ void ASTIdentifier::formatImplWithoutAlias(const FormatSettings & settings, Form
settings.ostr << (settings.hilite ? hilite_identifier : "");
WriteBufferFromOStream wb(settings.ostr, 32);
writeProbablyBackQuotedString(name, wb);
settings.writeIdentifier(name, wb);
wb.next();
settings.ostr << (settings.hilite ? hilite_none : "");

View File

@ -6,6 +6,18 @@
namespace DB
{
void ASTWithAlias::writeAlias(const String & name, const FormatSettings & settings) const
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << " AS " << (settings.hilite ? hilite_alias : "");
WriteBufferFromOStream wb(settings.ostr, 32);
settings.writeIdentifier(name, wb);
wb.next();
settings.ostr << (settings.hilite ? hilite_none : "");
}
void ASTWithAlias::formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const
{
if (!alias.empty())
@ -14,7 +26,7 @@ void ASTWithAlias::formatImpl(const FormatSettings & settings, FormatState & sta
if (!state.printed_asts_with_alias.emplace(frame.current_select, alias).second)
{
WriteBufferFromOStream wb(settings.ostr, 32);
writeProbablyBackQuotedString(alias, wb);
settings.writeIdentifier(alias, wb);
return;
}
}
@ -27,7 +39,7 @@ void ASTWithAlias::formatImpl(const FormatSettings & settings, FormatState & sta
if (!alias.empty())
{
writeAlias(alias, settings.ostr, settings.hilite);
writeAlias(alias, settings);
if (frame.need_parens)
settings.ostr <<')';
}

View File

@ -32,6 +32,8 @@ public:
protected:
virtual void appendColumnNameImpl(WriteBuffer & ostr) const = 0;
void writeAlias(const String & name, const FormatSettings & settings) const;
};
/// helper for setting aliases and chaining result to other functions

View File

@ -1,6 +1,8 @@
#include <errno.h>
#include <cstdlib>
#include <Poco/String.h>
#include <IO/ReadHelpers.h>
#include <IO/ReadBufferFromMemory.h>
@ -236,6 +238,17 @@ bool ParserFunction::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
, ErrorCodes::SYNTAX_ERROR);
}
/// Temporary compatibility fix for Yandex.Metrika.
/// An exception was thrown in a distributed query when all of the following held:
/// - the query contained cast(x, 'Type') with `cast` not in uppercase;
/// - the expression was written as a function, not as an operator like cast(x AS Type);
/// - a newer ClickHouse server (1.1.54388) interacted with an older ClickHouse server (1.1.54381).
auto & identifier_concrete = typeid_cast<ASTIdentifier &>(*identifier);
if (Poco::toLower(identifier_concrete.name) == "cast")
identifier_concrete.name = "CAST";
/// The parametric aggregate function has two lists (parameters and arguments) in parentheses. Example: quantile(0.9)(x).
if (pos->type == TokenType::OpeningRoundBracket)
{
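The `cast` compatibility fix in this hunk amounts to a case-insensitive comparison followed by rewriting the name to its canonical uppercase form. A minimal sketch of that step, using only the standard library instead of `Poco::toLower`, might look as follows; `normalizeFunctionName` is a hypothetical helper and does not exist in the parser.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns "CAST" for any spelling of "cast",
// and the name unchanged otherwise.
inline std::string normalizeFunctionName(const std::string & name)
{
    std::string lowered = name;
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (lowered == "cast")
        return "CAST";
    return name;
}
```

Only the one function name is rewritten; every other identifier keeps its original case.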

View File

@ -13,6 +13,7 @@ namespace ErrorCodes
{
extern const int TOO_BIG_AST;
extern const int TOO_DEEP_AST;
extern const int BAD_ARGUMENTS;
}
@ -36,18 +37,6 @@ String backQuoteIfNeed(const String & x)
}
void IAST::writeAlias(const String & name, std::ostream & s, bool hilite) const
{
s << (hilite ? hilite_keyword : "") << " AS " << (hilite ? hilite_alias : "");
WriteBufferFromOStream wb(s, 32);
writeProbablyBackQuotedString(name, wb);
wb.next();
s << (hilite ? hilite_none : "");
}
size_t IAST::checkSize(size_t max_size) const
{
size_t res = 1;
@ -109,4 +98,36 @@ String IAST::getColumnName() const
return write_buffer.str();
}
void IAST::FormatSettings::writeIdentifier(const String & name, WriteBuffer & out) const
{
switch (identifier_quoting_style)
{
case IdentifierQuotingStyle::None:
{
if (always_quote_identifiers)
throw Exception("Incompatible arguments: always_quote_identifiers = true && identifier_quoting_style == IdentifierQuotingStyle::None",
ErrorCodes::BAD_ARGUMENTS);
writeString(name, out);
break;
}
case IdentifierQuotingStyle::Backticks:
{
if (always_quote_identifiers)
writeBackQuotedString(name, out);
else
writeProbablyBackQuotedString(name, out);
break;
}
case IdentifierQuotingStyle::DoubleQuotes:
{
if (always_quote_identifiers)
writeDoubleQuotedString(name, out);
else
writeProbablyDoubleQuotedString(name, out);
break;
}
}
}
}
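The dispatch in `writeIdentifier` above can be sketched as a standalone function that returns a string instead of writing to a `WriteBuffer`. Everything here is an assumption made for illustration: `QuotingStyle`, `isPlainIdentifier`, `quoteWith` and `formatIdentifier` are hypothetical names, "probably quoted" is taken to mean "quoted unless the name matches `[A-Za-z_][A-Za-z0-9_]*`", and escaping is reduced to a backslash before the quote character and before a backslash. The real `writeProbablyBackQuotedString` family may apply different rules.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical mirror of IdentifierQuotingStyle.
enum class QuotingStyle { None, Backticks, DoubleQuotes };

// Assumption: a plain identifier matches [A-Za-z_][A-Za-z0-9_]*.
inline bool isPlainIdentifier(const std::string & name)
{
    if (name.empty())
        return false;
    if (!(std::isalpha(static_cast<unsigned char>(name[0])) || name[0] == '_'))
        return false;
    for (char c : name)
        if (!(std::isalnum(static_cast<unsigned char>(c)) || c == '_'))
            return false;
    return true;
}

// Assumption: escape the quote character and the backslash with a backslash.
inline std::string quoteWith(const std::string & name, char quote)
{
    std::string res(1, quote);
    for (char c : name)
    {
        if (c == quote || c == '\\')
            res += '\\';
        res += c;
    }
    res += quote;
    return res;
}

inline std::string formatIdentifier(const std::string & name, QuotingStyle style, bool always_quote)
{
    switch (style)
    {
        case QuotingStyle::None:
            // Same incompatibility check as in the hunk above.
            if (always_quote)
                throw std::invalid_argument("always_quote is incompatible with QuotingStyle::None");
            return name;
        case QuotingStyle::Backticks:
            return (always_quote || !isPlainIdentifier(name)) ? quoteWith(name, '`') : name;
        case QuotingStyle::DoubleQuotes:
            return (always_quote || !isPlainIdentifier(name)) ? quoteWith(name, '"') : name;
    }
    return name; // Not reached for valid enum values.
}
```

Keeping the style and the "always quote" flag as two separate inputs is what makes the `None` plus `always_quote` combination representable, hence the explicit check.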

View File

@ -7,6 +7,7 @@
#include <Core/Types.h>
#include <Common/Exception.h>
#include <Parsers/StringRange.h>
#include <Parsers/IdentifierQuotingStyle.h>
class SipHash;
@ -150,16 +151,20 @@ public:
struct FormatSettings
{
std::ostream & ostr;
bool hilite;
bool hilite = false;
bool one_line;
bool always_quote_identifiers = false;
IdentifierQuotingStyle identifier_quoting_style = IdentifierQuotingStyle::Backticks;
char nl_or_ws;
FormatSettings(std::ostream & ostr_, bool hilite_, bool one_line_)
: ostr(ostr_), hilite(hilite_), one_line(one_line_)
FormatSettings(std::ostream & ostr_, bool one_line_)
: ostr(ostr_), one_line(one_line_)
{
nl_or_ws = one_line ? ' ' : '\n';
}
void writeIdentifier(const String & name, WriteBuffer & out) const;
};
/// State. For example, the set of nodes that we have already walked through can be remembered.
@ -194,8 +199,6 @@ public:
ErrorCodes::UNKNOWN_ELEMENT_IN_AST);
}
void writeAlias(const String & name, std::ostream & s, bool hilite) const;
void cloneChildren();
public:

View File

@ -0,0 +1,16 @@
#pragma once
namespace DB
{
/// Method to quote identifiers.
/// NOTE There could be differences in escaping rules inside quotes. Escaping rules may not match those required by a specific external DBMS.
enum class IdentifierQuotingStyle
{
None, /// Write as-is, without quotes.
Backticks, /// `mysql` style
DoubleQuotes /// "postgres" style
};
}

View File

@ -7,7 +7,9 @@ namespace DB
void formatAST(const IAST & ast, std::ostream & s, bool hilite, bool one_line)
{
IAST::FormatSettings settings(s, hilite, one_line);
IAST::FormatSettings settings(s, one_line);
settings.hilite = hilite;
ast.format(settings);
}

View File

@ -12,6 +12,7 @@ namespace DB
*/
void formatAST(const IAST & ast, std::ostream & s, bool hilite = true, bool one_line = false);
inline std::ostream & operator<<(std::ostream & os, const IAST & ast)
{
formatAST(ast, os, false, true);

View File

@ -69,10 +69,24 @@ Block ITableDeclaration::getSampleBlockForColumns(const Names & column_names) co
{
Block res;
NamesAndTypesList all_columns = getColumns().getAll();
std::unordered_map<String, DataTypePtr> columns_map;
for (const auto & elem : all_columns)
columns_map.emplace(elem.name, elem.type);
for (const auto & name : column_names)
{
auto col = getColumn(name);
res.insert({ col.type->createColumn(), col.type, name });
auto it = columns_map.find(name);
if (it != columns_map.end())
{
res.insert({ it->second->createColumn(), it->second, it->first });
}
else
{
/// Virtual columns.
NameAndTypePair elem = getColumn(name);
res.insert({ elem.type->createColumn(), elem.type, elem.name });
}
}
return res;
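The new `getSampleBlockForColumns` builds a hash map of the ordinary columns once and falls back to a per-name lookup only for names that are missing from it (virtual columns). A minimal standalone sketch of that lookup pattern is given below. `Column`, `lookupVirtualColumnType` and `sampleForColumns` are hypothetical, types are reduced to strings, and the `String` type given to `_part` is an assumption made only so the example has a fallback to exercise.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical column description: (name, type name).
using Column = std::pair<std::string, std::string>;

// Hypothetical slow path for virtual columns; the type of `_part` is assumed.
inline std::string lookupVirtualColumnType(const std::string & name)
{
    if (name == "_part")
        return "String";
    throw std::out_of_range("no such column: " + name);
}

inline std::vector<Column> sampleForColumns(
    const std::vector<Column> & all_columns, const std::vector<std::string> & names)
{
    // Build the name -> type map once instead of searching per requested name.
    std::unordered_map<std::string, std::string> columns_map;
    for (const auto & col : all_columns)
        columns_map.emplace(col.first, col.second);

    std::vector<Column> res;
    for (const auto & name : names)
    {
        auto it = columns_map.find(name);
        if (it != columns_map.end())
            res.emplace_back(it->first, it->second);
        else
            res.emplace_back(name, lookupVirtualColumnType(name)); // Virtual columns.
    }
    return res;
}
```

The result keeps the order of the requested names, not the order of the table definition.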

View File

@ -22,6 +22,8 @@ public:
Block getSampleBlock() const;
Block getSampleBlockNonMaterialized() const;
/// Including virtual and alias columns.
Block getSampleBlockForColumns(const Names & column_names) const;
/** Verify that all the requested names are in the table and are set correctly.

View File

@ -2354,12 +2354,7 @@ MergeTreeData::MutableDataPartPtr MergeTreeData::cloneAndLoadDataPart(const Merg
const String & tmp_part_prefix,
const MergeTreePartInfo & dst_part_info)
{
String dst_part_name;
if (format_version < MERGE_TREE_DATA_MIN_FORMAT_VERSION_WITH_CUSTOM_PARTITIONING)
dst_part_name = dst_part_info.getPartNameV0(src_part->getMinDate(), src_part->getMaxDate());
else
dst_part_name = dst_part_info.getPartName();
String dst_part_name = src_part->getNewName(dst_part_info);
String tmp_dst_part_name = tmp_part_prefix + dst_part_name;
Poco::Path dst_part_absolute_path = Poco::Path(full_path + tmp_dst_part_name).absolute();

View File

@ -5,6 +5,8 @@
#include <Storages/MergeTree/SimpleMergeSelector.h>
#include <Storages/MergeTree/AllMergeSelector.h>
#include <Storages/MergeTree/MergeList.h>
#include <Storages/MergeTree/StorageFromMergeTreeDataPart.h>
#include <Storages/MergeTree/BackgroundProcessingPool.h>
#include <DataStreams/DistinctSortedBlockInputStream.h>
#include <DataStreams/ExpressionBlockInputStream.h>
#include <DataStreams/MergingSortedBlockInputStream.h>
@ -18,14 +20,16 @@
#include <DataStreams/ConcatBlockInputStream.h>
#include <DataStreams/ColumnGathererStream.h>
#include <DataStreams/ApplyingMutationsBlockInputStream.h>
#include <Parsers/ASTAsterisk.h>
#include <Interpreters/InterpreterSelectQuery.h>
#include <IO/CompressedWriteBuffer.h>
#include <IO/CompressedReadBufferFromFile.h>
#include <DataTypes/NestedUtils.h>
#include <DataTypes/DataTypeArray.h>
#include <Storages/MergeTree/BackgroundProcessingPool.h>
#include <Common/SimpleIncrement.h>
#include <Common/interpolate.h>
#include <Common/typeid_cast.h>
#include <Common/localBackup.h>
#include <Poco/File.h>
@ -811,6 +815,124 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mergePartsToTempor
}
static bool isStorageTouchedByMutation(
const StoragePtr & storage, const std::vector<MutationCommand> & commands, const Context & context)
{
if (commands.empty())
return false;
for (const MutationCommand & command : commands)
{
if (!command.predicate) /// The command touches all rows.
return true;
}
/// Execute `SELECT count() FROM storage WHERE predicate1 OR predicate2 OR ...` query.
/// The result is the number of affected rows.
auto select = std::make_shared<ASTSelectQuery>();
select->select_expression_list = std::make_shared<ASTExpressionList>();
select->children.push_back(select->select_expression_list);
auto count_func = std::make_shared<ASTFunction>();
count_func->name = "count";
count_func->arguments = std::make_shared<ASTExpressionList>();
select->select_expression_list->children.push_back(count_func);
if (commands.size() == 1)
select->where_expression = commands[0].predicate;
else
{
auto coalesced_predicates = std::make_shared<ASTFunction>();
coalesced_predicates->name = "or";
coalesced_predicates->arguments = std::make_shared<ASTExpressionList>();
coalesced_predicates->children.push_back(coalesced_predicates->arguments);
for (const MutationCommand & command : commands)
coalesced_predicates->arguments->children.push_back(command.predicate);
select->where_expression = std::move(coalesced_predicates);
}
select->children.push_back(select->where_expression);
auto context_copy = context;
context_copy.getSettingsRef().merge_tree_uniform_read_distribution = 0;
context_copy.getSettingsRef().max_threads = 1;
InterpreterSelectQuery interpreter_select(select, context_copy, storage, QueryProcessingStage::Complete);
BlockInputStreamPtr in = interpreter_select.execute().in;
Block block = in->read();
if (!block.rows())
return false;
else if (block.rows() != 1)
throw Exception("count() expression returned " + toString(block.rows()) + " rows, not 1",
ErrorCodes::LOGICAL_ERROR);
auto count = (*block.getByName("count()").column)[0].get<UInt64>();
return count != 0;
}
static BlockInputStreamPtr createInputStreamWithMutatedData(
const StoragePtr & storage, std::vector<MutationCommand> commands, const Context & context)
{
auto select = std::make_shared<ASTSelectQuery>();
select->select_expression_list = std::make_shared<ASTExpressionList>();
select->children.push_back(select->select_expression_list);
select->select_expression_list->children.push_back(std::make_shared<ASTAsterisk>());
/// For all commands that are in front of the list and are DELETE commands, we can push them down
/// to the SELECT statement and remove them from commands.
auto deletes_end = commands.begin();
for (; deletes_end != commands.end(); ++deletes_end)
{
if (deletes_end->type != MutationCommand::DELETE)
break;
}
std::vector<ASTPtr> predicates;
for (auto it = commands.begin(); it != deletes_end; ++it)
{
auto predicate = std::make_shared<ASTFunction>();
predicate->name = "not";
predicate->arguments = std::make_shared<ASTExpressionList>();
predicate->arguments->children.push_back(it->predicate);
predicate->children.push_back(predicate->arguments);
predicates.push_back(predicate);
}
commands.erase(commands.begin(), deletes_end);
if (!predicates.empty())
{
ASTPtr where_expression;
if (predicates.size() == 1)
where_expression = predicates[0];
else
{
auto coalesced_predicates = std::make_shared<ASTFunction>();
coalesced_predicates->name = "and";
coalesced_predicates->arguments = std::make_shared<ASTExpressionList>();
coalesced_predicates->children.push_back(coalesced_predicates->arguments);
coalesced_predicates->arguments->children = predicates;
where_expression = std::move(coalesced_predicates);
}
select->where_expression = where_expression;
select->children.push_back(where_expression);
}
InterpreterSelectQuery interpreter_select(select, context, storage);
BlockInputStreamPtr in = interpreter_select.execute().in;
if (!commands.empty())
in = std::make_shared<ApplyingMutationsBlockInputStream>(in, commands, context);
return in;
}
MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTemporaryPart(
const FuturePart & future_part,
const std::vector<MutationCommand> & commands,
@ -826,7 +948,19 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTempor
CurrentMetrics::Increment num_mutations{CurrentMetrics::PartMutation};
const auto & source_part = future_part.parts[0];
LOG_TRACE(log, "Mutating part " << source_part->name << " to mutation version " << future_part.part_info.mutation);
auto storage_from_source_part = StorageFromMergeTreeDataPart::create(source_part);
auto context_for_reading = context;
context_for_reading.getSettingsRef().merge_tree_uniform_read_distribution = 0;
context_for_reading.getSettingsRef().max_threads = 1;
if (!isStorageTouchedByMutation(storage_from_source_part, commands, context_for_reading))
{
LOG_TRACE(log, "Part " << source_part->name << " doesn't change up to mutation version " << future_part.part_info.mutation);
return data.cloneAndLoadDataPart(source_part, "tmp_clone_", future_part.part_info);
}
else
LOG_TRACE(log, "Mutating part " << source_part->name << " to mutation version " << future_part.part_info.mutation);
MergeTreeData::MutableDataPartPtr new_data_part = std::make_shared<MergeTreeData::DataPart>(
data, future_part.name, future_part.part_info);
@ -835,21 +969,16 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTempor
String new_part_tmp_path = new_data_part->getFullPath();
Poco::File(new_part_tmp_path).createDirectories();
NamesAndTypesList all_columns = data.getColumns().getAllPhysical();
BlockInputStreamPtr in = std::make_shared<MergeTreeBlockInputStream>(
data, source_part, DEFAULT_MERGE_BLOCK_SIZE, 0, 0, all_columns.getNames(),
MarkRanges(1, MarkRange(0, source_part->marks_count)),
false, nullptr, String(), true, 0, DBMS_DEFAULT_BUFFER_SIZE, false);
in = std::make_shared<ApplyingMutationsBlockInputStream>(in, commands, context);
auto in = createInputStreamWithMutatedData(storage_from_source_part, commands, context_for_reading);
if (data.hasPrimaryKey())
in = std::make_shared<MaterializingBlockInputStream>(
std::make_shared<ExpressionBlockInputStream>(in, data.getPrimaryExpression()));
Poco::File(new_part_tmp_path).createDirectories();
NamesAndTypesList all_columns = data.getColumns().getAllPhysical();
auto compression_settings = context.chooseCompressionSettings(
source_part->bytes_on_disk,
static_cast<double>(source_part->bytes_on_disk) / data.getTotalActiveSizeInBytes());
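The push-down in `createInputStreamWithMutatedData` above takes the DELETE commands at the front of the list, turns each predicate into `not(predicate)`, joins them with AND as the WHERE clause of the SELECT, and leaves the remaining commands for `ApplyingMutationsBlockInputStream`. A minimal sketch of that splitting step follows, with predicates reduced to strings. `Command`, its `Other` type and `pushDownLeadingDeletes` are hypothetical. Presumably only the leading run is pushed down because a later DELETE has to see the rows as modified by the commands before it, but the source does not state the reason.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for MutationCommand, with the predicate as plain text.
struct Command
{
    enum class Type { Delete, Other }; // `Other` is a placeholder for any non-DELETE command.
    Type type;
    std::string predicate;
};

// Removes the leading Delete commands from `commands` and returns the
// equivalent WHERE condition: not(p1) AND not(p2) AND ...
// Returns an empty string if the list does not start with a Delete.
inline std::string pushDownLeadingDeletes(std::vector<Command> & commands)
{
    auto deletes_end = commands.begin();
    while (deletes_end != commands.end() && deletes_end->type == Command::Type::Delete)
        ++deletes_end;

    std::string where;
    for (auto it = commands.begin(); it != deletes_end; ++it)
    {
        if (!where.empty())
            where += " AND ";
        where += "not(" + it->predicate + ")";
    }

    commands.erase(commands.begin(), deletes_end);
    return where;
}
```

A DELETE that follows a non-DELETE command stays in the list and is applied later, in order.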

View File

@ -92,6 +92,7 @@ public:
MergeListEntry & merge_entry,
size_t aio_threshold, time_t time_of_merge, DiskSpaceMonitor::Reservation * disk_reservation, bool deduplication);
/// Mutate a single data part with the specified commands. Will create and return a temporary part.
MergeTreeData::MutableDataPartPtr mutatePartToTemporaryPart(
const FuturePart & future_part,
const std::vector<MutationCommand> & commands,

View File

@ -140,9 +140,22 @@ BlockInputStreams MergeTreeDataSelectExecutor::read(
const unsigned num_streams,
Int64 max_block_number_to_read) const
{
size_t part_index = 0;
return readFromParts(
data.getDataPartsVector(), column_names_to_return, query_info, context, processed_stage,
max_block_size, num_streams, max_block_number_to_read);
}
MergeTreeData::DataPartsVector parts = data.getDataPartsVector();
BlockInputStreams MergeTreeDataSelectExecutor::readFromParts(
MergeTreeData::DataPartsVector parts,
const Names & column_names_to_return,
const SelectQueryInfo & query_info,
const Context & context,
QueryProcessingStage::Enum & processed_stage,
const size_t max_block_size,
const unsigned num_streams,
Int64 max_block_number_to_read) const
{
size_t part_index = 0;
/// If query contains restrictions on the virtual column `_part` or `_part_index`, select only parts suitable for it.
/// The virtual column `_sample_factor` (which is equal to 1 / used sample rate) can be requested in the query.

View File

@ -31,6 +31,16 @@ public:
unsigned num_streams,
Int64 max_block_number_to_read) const;
BlockInputStreams readFromParts(
MergeTreeData::DataPartsVector parts,
const Names & column_names,
const SelectQueryInfo & query_info,
const Context & context,
QueryProcessingStage::Enum & processed_stage,
size_t max_block_size,
unsigned num_streams,
Int64 max_block_number_to_read) const;
private:
MergeTreeData & data;

View File

@ -0,0 +1,42 @@
#pragma once
#include <Storages/IStorage.h>
#include <Storages/MergeTree/MergeTreeDataPart.h>
#include <Storages/MergeTree/MergeTreeDataSelectExecutor.h>
#include <Core/Defines.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
/// A Storage that allows reading from a single MergeTree data part.
class StorageFromMergeTreeDataPart : public ext::shared_ptr_helper<StorageFromMergeTreeDataPart>, public IStorage
{
public:
String getName() const override { return "FromMergeTreeDataPart"; }
String getTableName() const override { return part->storage.getTableName() + " (part " + part->name + ")"; }
BlockInputStreams read(
const Names & column_names,
const SelectQueryInfo & query_info,
const Context & context,
QueryProcessingStage::Enum & processed_stage,
size_t max_block_size,
unsigned num_streams) override
{
return MergeTreeDataSelectExecutor(part->storage).readFromParts(
{part}, column_names, query_info, context, processed_stage, max_block_size, num_streams, 0);
}
protected:
StorageFromMergeTreeDataPart(const MergeTreeData::DataPartPtr & part_)
: IStorage(part_->storage.getColumns()), part(part_)
{}
private:
MergeTreeData::DataPartPtr part;
};
}

View File

@ -135,7 +135,7 @@ BlockInputStreams StorageBuffer::read(
*/
if (processed_stage > QueryProcessingStage::FetchColumns)
for (auto & stream : streams_from_buffers)
stream = InterpreterSelectQuery(query_info.query, context, {}, processed_stage, 0, stream).execute().in;
stream = InterpreterSelectQuery(query_info.query, context, stream, processed_stage).execute().in;
streams_from_dst.insert(streams_from_dst.end(), streams_from_buffers.begin(), streams_from_buffers.end());
return streams_from_dst;

View File

@ -257,7 +257,7 @@ BlockInputStreams StorageDistributed::read(
ClusterProxy::SelectStreamFactory select_stream_factory(
header, processed_stage, remote_table_function_ptr, context.getExternalTables());
return ClusterProxy::executeQuery(
select_stream_factory, cluster, modified_query_ast, context, settings);
select_stream_factory, cluster, modified_query_ast, context, settings);
}
else
{

View File

@ -241,12 +241,10 @@ BlockInputStreams StorageMerge::read(
header = getSampleBlockForColumns(column_names);
break;
case QueryProcessingStage::WithMergeableState:
header = materializeBlock(InterpreterSelectQuery(query_info.query, context, {}, QueryProcessingStage::WithMergeableState, 0,
std::make_shared<OneBlockInputStream>(getSampleBlockForColumns(column_names)), true).getSampleBlock());
break;
case QueryProcessingStage::Complete:
header = materializeBlock(InterpreterSelectQuery(query_info.query, context, {}, QueryProcessingStage::Complete, 0,
std::make_shared<OneBlockInputStream>(getSampleBlockForColumns(column_names)), true).getSampleBlock());
header = materializeBlock(InterpreterSelectQuery(
query_info.query, context, std::make_shared<OneBlockInputStream>(getSampleBlockForColumns(column_names)),
processed_stage_in_source_table, true).getSampleBlock());
break;
}
}

View File

@ -311,6 +311,7 @@ std::vector<MergeTreeMutationStatus> StorageMergeTree::getMutationsStatus() cons
part_data_versions.reserve(data_parts.size());
for (const auto & part : data_parts)
part_data_versions.push_back(part->info.getDataVersion());
std::sort(part_data_versions.begin(), part_data_versions.end());
std::vector<MergeTreeMutationStatus> result;
for (const auto & kv : current_mutations_by_version)

View File

@ -57,7 +57,8 @@ BlockInputStreams StorageMySQL::read(
{
check(column_names);
processed_stage = QueryProcessingStage::FetchColumns;
String query = transformQueryForExternalDatabase(*query_info.query, getColumns().ordinary, remote_database_name, remote_table_name, context);
String query = transformQueryForExternalDatabase(
*query_info.query, getColumns().ordinary, IdentifierQuotingStyle::Backticks, remote_database_name, remote_table_name, context);
Block sample_block;
for (const String & name : column_names)

View File

@ -44,7 +44,7 @@ BlockInputStreams StorageODBC::read(
check(column_names);
processed_stage = QueryProcessingStage::FetchColumns;
String query = transformQueryForExternalDatabase(
*query_info.query, getColumns().ordinary, remote_database_name, remote_table_name, context);
*query_info.query, getColumns().ordinary, IdentifierQuotingStyle::DoubleQuotes, remote_database_name, remote_table_name, context);
Block sample_block;
for (const String & name : column_names)

View File

@ -0,0 +1,60 @@
#pragma once
#include <DataStreams/OneBlockInputStream.h>
#include <DataTypes/DataTypeString.h>
#include <Storages/ColumnsDescription.h>
#include <Storages/IStorage.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
class Context;
/** Base class for system tables in which every column has the String type.
*/
template <typename Self>
class IStorageSystemWithStringColumns : public IStorage
{
protected:
virtual void fillData(MutableColumns & res_columns) const = 0;
public:
IStorageSystemWithStringColumns(const String & name_) : name(name_)
{
auto names = Self::getColumnNames();
NamesAndTypesList name_list;
for (const auto & name : names)
{
name_list.push_back(NameAndTypePair{name, std::make_shared<DataTypeString>()});
}
setColumns(ColumnsDescription(name_list));
}
std::string getTableName() const override
{
return name;
}
BlockInputStreams read(const Names & column_names,
const SelectQueryInfo & /*query_info*/,
const Context & /*context*/,
QueryProcessingStage::Enum & processed_stage,
size_t /*max_block_size*/,
unsigned /*num_streams*/) override
{
check(column_names);
processed_stage = QueryProcessingStage::FetchColumns;
Block sample_block = getSampleBlock();
MutableColumns res_columns = sample_block.cloneEmptyColumns();
fillData(res_columns);
return BlockInputStreams(1, std::make_shared<OneBlockInputStream>(sample_block.cloneWithColumns(std::move(res_columns))));
}
private:
const String name;
};
}
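The header above combines a CRTP parameter (`Self::getColumnNames()` supplies the schema) with a virtual `fillData()` hook (the derived class supplies the rows). The following standalone sketch illustrates that pattern only; it is not ClickHouse code. `Column`, `Columns`, `SystemTableWithStringColumns`, and `DemoCollations` are hypothetical simplifications introduced here, and the collation names are made up.

```cpp
#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for ClickHouse column types (assumptions for this sketch).
using Column = std::vector<std::string>;
using Columns = std::vector<Column>;

// Base class: takes the column names from Self::getColumnNames() and lets the
// derived class produce the rows through fillData().
template <typename Self>
class SystemTableWithStringColumns
{
public:
    explicit SystemTableWithStringColumns(std::string name_)
        : name(std::move(name_)), column_names(Self::getColumnNames())
    {
    }

    virtual ~SystemTableWithStringColumns() = default;

    const std::string & getTableName() const { return name; }
    const std::vector<std::string> & getColumns() const { return column_names; }

    // Analogue of read(): allocate one empty column per name, then delegate.
    Columns read() const
    {
        Columns res_columns(column_names.size());
        fillData(res_columns);
        return res_columns;
    }

protected:
    virtual void fillData(Columns & res_columns) const = 0;

private:
    std::string name;
    std::vector<std::string> column_names;
};

// Hypothetical table with a single "name" column and three fixed rows.
class DemoCollations : public SystemTableWithStringColumns<DemoCollations>
{
public:
    using SystemTableWithStringColumns::SystemTableWithStringColumns;

    static std::vector<std::string> getColumnNames() { return {"name"}; }

protected:
    void fillData(Columns & res_columns) const override
    {
        for (const char * collation : {"en", "ru", "tr"})
            res_columns[0].push_back(collation);
    }
};
```

With this shape, adding a new table means writing only `getColumnNames()` and `fillData()`, which is how the `StorageSystem*` classes later in this diff appear to be structured.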

View File

@ -0,0 +1,14 @@
#include <AggregateFunctions/AggregateFunctionCombinatorFactory.h>
#include <Storages/System/StorageSystemAggregateFunctionCombinators.h>
namespace DB
{
void StorageSystemAggregateFunctionCombinators::fillData(MutableColumns & res_columns) const
{
const auto & combinators = AggregateFunctionCombinatorFactory::instance().getAllAggregateFunctionCombinators();
for (const auto & pair : combinators)
{
res_columns[0]->insert(pair.first);
}
}
}

View File

@ -0,0 +1,26 @@
#pragma once
#include <Storages/System/IStorageSystemWithStringColumns.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
class StorageSystemAggregateFunctionCombinators : public ext::shared_ptr_helper<StorageSystemAggregateFunctionCombinators>,
public IStorageSystemWithStringColumns<StorageSystemAggregateFunctionCombinators>
{
protected:
void fillData(MutableColumns & res_columns) const override;
public:
using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns;
std::string getName() const override
{
return "SystemAggregateFunctionCombinators";
}
static std::vector<String> getColumnNames()
{
return {"name"};
}
};
}

View File

@ -0,0 +1,13 @@
#include <Columns/Collator.h>
#include <Storages/System/StorageSystemCollations.h>
namespace DB
{
void StorageSystemCollations::fillData(MutableColumns & res_columns) const
{
for (const auto & collation : Collator::getAvailableCollations())
{
res_columns[0]->insert(collation);
}
}
}

View File

@ -0,0 +1,26 @@
#pragma once
#include <Storages/System/IStorageSystemWithStringColumns.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
class StorageSystemCollations : public ext::shared_ptr_helper<StorageSystemCollations>,
public IStorageSystemWithStringColumns<StorageSystemCollations>
{
protected:
void fillData(MutableColumns & res_columns) const override;
public:
using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns;
std::string getName() const override
{
return "SystemTableCollations";
}
static std::vector<String> getColumnNames()
{
return {"name"};
}
};
}

View File

@ -0,0 +1,91 @@
#include <Core/Field.h>
#include <DataTypes/DataTypeFactory.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTLiteral.h>
#include <Storages/System/StorageSystemDataTypeFamilies.h>
#include <boost/algorithm/string/join.hpp>
#include <boost/algorithm/string/predicate.hpp>
#include <sstream>
namespace DB
{
namespace
{
String getPropertiesAsString(const DataTypePtr data_type)
{
std::vector<std::string> properties;
if (data_type->isParametric())
properties.push_back("parametric");
if (data_type->haveSubtypes())
properties.push_back("have_subtypes");
if (data_type->cannotBeStoredInTables())
properties.push_back("cannot_be_stored_in_tables");
if (data_type->isComparable())
properties.push_back("comparable");
if (data_type->canBeComparedWithCollation())
properties.push_back("can_be_compared_with_collation");
if (data_type->canBeUsedAsVersion())
properties.push_back("can_be_used_as_version");
if (data_type->isSummable())
properties.push_back("summable");
if (data_type->canBeUsedInBitOperations())
properties.push_back("can_be_used_in_bit_operations");
if (data_type->canBeUsedInBooleanContext())
properties.push_back("can_be_used_in_boolean_context");
if (data_type->isValueRepresentedByNumber())
properties.push_back("value_represented_by_number");
if (data_type->isCategorial())
properties.push_back("categorial");
if (data_type->isNullable())
properties.push_back("nullable");
if (data_type->onlyNull())
properties.push_back("only_null");
if (data_type->canBeInsideNullable())
properties.push_back("can_be_inside_nullable");
return boost::algorithm::join(properties, ",");
}
ASTPtr createFakeEnumCreationAst()
{
String fakename{"e"};
ASTPtr name = std::make_shared<ASTLiteral>(Field(fakename.c_str(), fakename.size()));
ASTPtr value = std::make_shared<ASTLiteral>(Field(UInt64(1)));
ASTPtr ast_func = makeASTFunction("equals", name, value);
ASTPtr clone = ast_func->clone();
clone->children.clear();
clone->children.push_back(ast_func);
return clone;
}
}
void StorageSystemDataTypeFamilies::fillData(MutableColumns & res_columns) const
{
const auto & factory = DataTypeFactory::instance();
const auto & data_types = factory.getAllDataTypes();
for (const auto & pair : data_types)
{
res_columns[0]->insert(pair.first);
try
{
DataTypePtr type_ptr;
// Special case for Enum: it takes arguments, but its properties
// don't depend on them.
if (boost::starts_with(pair.first, "Enum"))
{
type_ptr = factory.get(pair.first, createFakeEnumCreationAst());
}
else
{
type_ptr = factory.get(pair.first);
}
res_columns[1]->insert(getPropertiesAsString(type_ptr));
}
catch (Exception & ex)
{
res_columns[1]->insert(String{"depends_on_arguments"});
}
}
}
}

View File

@ -0,0 +1,25 @@
#pragma once
#include <Storages/System/IStorageSystemWithStringColumns.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
class StorageSystemDataTypeFamilies : public ext::shared_ptr_helper<StorageSystemDataTypeFamilies>,
public IStorageSystemWithStringColumns<StorageSystemDataTypeFamilies>
{
protected:
void fillData(MutableColumns & res_columns) const override;
public:
using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns;
std::string getName() const override
{
return "SystemTableDataTypeFamilies";
}
static std::vector<String> getColumnNames()
{
return {"name", "properties"};
}
};
}

View File

@ -0,0 +1,30 @@
#include <Formats/FormatFactory.h>
#include <Storages/System/StorageSystemFormats.h>
namespace DB
{
void StorageSystemFormats::fillData(MutableColumns & res_columns) const
{
const auto & formats = FormatFactory::instance().getAllFormats();
for (const auto & pair : formats)
{
const auto & [name, creator_pair] = pair;
bool has_input_format = (creator_pair.first != nullptr);
bool has_output_format = (creator_pair.second != nullptr);
res_columns[0]->insert(name);
std::string format_type;
if (has_input_format)
format_type = "input";
if (has_output_format)
{
if (!format_type.empty())
format_type += "/output";
else
format_type = "output";
}
res_columns[1]->insert(format_type);
}
}
}
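To make the possible values of the `type` column explicit, the branching in `fillData` above can be isolated into a small function. This is a sketch for illustration; `describeFormatType` is a hypothetical name, not part of the commit.

```cpp
#include <string>

// Same branching as StorageSystemFormats::fillData: a format may have an
// input creator, an output creator, both, or (in principle) neither.
std::string describeFormatType(bool has_input_format, bool has_output_format)
{
    std::string format_type;
    if (has_input_format)
        format_type = "input";
    if (has_output_format)
    {
        if (!format_type.empty())
            format_type += "/output";
        else
            format_type = "output";
    }
    return format_type;
}
```

The function yields `input`, `output`, or `input/output`, and an empty string when neither creator is registered.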

View File

@ -0,0 +1,26 @@
#pragma once
#include <Storages/System/IStorageSystemWithStringColumns.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
class StorageSystemFormats : public ext::shared_ptr_helper<StorageSystemFormats>, public IStorageSystemWithStringColumns<StorageSystemFormats>
{
protected:
void fillData(MutableColumns & res_columns) const override;
public:
using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns;
std::string getName() const override
{
return "SystemFormats";
}
static std::vector<String> getColumnNames()
{
return {"name", "type"};
}
};
}

View File

@ -0,0 +1,14 @@
#include <Storages/System/StorageSystemTableFunctions.h>
#include <TableFunctions/TableFunctionFactory.h>
namespace DB
{
void StorageSystemTableFunctions::fillData(MutableColumns & res_columns) const
{
const auto & functions = TableFunctionFactory::instance().getAllTableFunctions();
for (const auto & pair : functions)
{
res_columns[0]->insert(pair.first);
}
}
}

View File

@ -0,0 +1,26 @@
#pragma once
#include <Storages/System/IStorageSystemWithStringColumns.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
class StorageSystemTableFunctions : public ext::shared_ptr_helper<StorageSystemTableFunctions>,
public IStorageSystemWithStringColumns<StorageSystemTableFunctions>
{
protected:
void fillData(MutableColumns & res_columns) const override;
public:
using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns;
std::string getName() const override
{
return "SystemTableFunctions";
}
static std::vector<String> getColumnNames()
{
return {"name"};
}
};
}

View File

@ -1,13 +1,17 @@
#include <Databases/IDatabase.h>
#include <Storages/System/attachSystemTables.h>
#include <Storages/System/StorageSystemAggregateFunctionCombinators.h>
#include <Storages/System/StorageSystemAsynchronousMetrics.h>
#include <Storages/System/StorageSystemBuildOptions.h>
#include <Storages/System/StorageSystemCollations.h>
#include <Storages/System/StorageSystemClusters.h>
#include <Storages/System/StorageSystemColumns.h>
#include <Storages/System/StorageSystemDatabases.h>
#include <Storages/System/StorageSystemDataTypeFamilies.h>
#include <Storages/System/StorageSystemDictionaries.h>
#include <Storages/System/StorageSystemEvents.h>
#include <Storages/System/StorageSystemFormats.h>
#include <Storages/System/StorageSystemFunctions.h>
#include <Storages/System/StorageSystemGraphite.h>
#include <Storages/System/StorageSystemMacros.h>
@ -23,6 +27,7 @@
#include <Storages/System/StorageSystemReplicas.h>
#include <Storages/System/StorageSystemReplicationQueue.h>
#include <Storages/System/StorageSystemSettings.h>
#include <Storages/System/StorageSystemTableFunctions.h>
#include <Storages/System/StorageSystemTables.h>
#include <Storages/System/StorageSystemZooKeeper.h>
@ -42,6 +47,11 @@ void attachSystemTablesLocal(IDatabase & system_database)
system_database.attachTable("events", StorageSystemEvents::create("events"));
system_database.attachTable("settings", StorageSystemSettings::create("settings"));
system_database.attachTable("build_options", StorageSystemBuildOptions::create("build_options"));
system_database.attachTable("formats", StorageSystemFormats::create("formats"));
system_database.attachTable("table_functions", StorageSystemTableFunctions::create("table_functions"));
system_database.attachTable("aggregate_function_combinators", StorageSystemAggregateFunctionCombinators::create("aggregate_function_combinators"));
system_database.attachTable("data_type_families", StorageSystemDataTypeFamilies::create("data_type_families"));
system_database.attachTable("collations", StorageSystemCollations::create("collations"));
}
void attachSystemTablesServer(IDatabase & system_database, bool has_zookeeper)

View File

@ -1,3 +1,4 @@
#include <sstream>
#include <Common/typeid_cast.h>
#include <Parsers/IAST.h>
#include <Parsers/ASTFunction.h>
@ -5,7 +6,6 @@
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/queryToString.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Storages/transformQueryForExternalDatabase.h>
@ -57,6 +57,7 @@ static bool isCompatible(const IAST & node)
String transformQueryForExternalDatabase(
const IAST & query,
const NamesAndTypesList & available_columns,
IdentifierQuotingStyle identifier_quoting_style,
const String & database,
const String & table,
const Context & context)
@ -105,7 +106,13 @@ String transformQueryForExternalDatabase(
}
}
return queryToString(select);
std::stringstream out;
IAST::FormatSettings settings(out, true);
settings.always_quote_identifiers = true;
settings.identifier_quoting_style = identifier_quoting_style;
select->format(settings);
return out.str();
}
}
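The change above makes the generated query always quote identifiers, using a style chosen by the caller (backticks for MySQL, double quotes for ODBC). The sketch below shows the general idea only. `QuotingStyle` and `quoteIdentifier` are hypothetical names, and escaping an embedded quote by doubling it is an assumption made for this example; the formatter in ClickHouse may escape differently.

```cpp
#include <string>

// Hypothetical stand-in for DB::IdentifierQuotingStyle.
enum class QuotingStyle
{
    Backticks,
    DoubleQuotes
};

// Wrap an identifier in the quote character of the given style.
// Assumption: an embedded quote character is escaped by doubling it.
std::string quoteIdentifier(const std::string & name, QuotingStyle style)
{
    const char quote = (style == QuotingStyle::Backticks) ? '`' : '"';

    std::string res(1, quote);
    for (char c : name)
    {
        if (c == quote)
            res += quote;
        res += c;
    }
    res += quote;
    return res;
}
```

Quoting unconditionally avoids clashes between column names and reserved words in the remote database, which is presumably the motivation for `always_quote_identifiers = true`.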

View File

@ -2,6 +2,7 @@
#include <Core/Types.h>
#include <Core/NamesAndTypes.h>
#include <Parsers/IdentifierQuotingStyle.h>
namespace DB
@ -27,6 +28,7 @@ class Context;
String transformQueryForExternalDatabase(
const IAST & query,
const NamesAndTypesList & available_columns,
IdentifierQuotingStyle identifier_quoting_style,
const String & database,
const String & table,
const Context & context);

View File

@ -23,6 +23,7 @@ class TableFunctionFactory final: public ext::singleton<TableFunctionFactory>
public:
using Creator = std::function<TableFunctionPtr()>;
using TableFunctions = std::unordered_map<std::string, Creator>;
/// Register a function by its name.
/// No locking; you must register all functions before the first call to get.
void registerFunction(const std::string & name, Creator creator);
@ -47,6 +48,11 @@ public:
private:
using TableFunctions = std::unordered_map<std::string, Creator>;
const TableFunctions & getAllTableFunctions() const {
return functions;
}
private:
TableFunctions functions;
};

View File

@ -9,8 +9,9 @@ ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && cd ../.. && pwd)
#TODO: DATA_DIR=${DATA_DIR:=`mktemp -d /tmp/clickhouse.test..XXXXX`}
DATA_DIR=${DATA_DIR:=/tmp/clickhouse}
LOG_DIR=${LOG_DIR:=$DATA_DIR/log}
BUILD_DIR=${BUILD_DIR:=$ROOT_DIR/build${BUILD_TYPE}}
export CLICKHOUSE_BINARY=${CLICKHOUSE_BINARY:="clickhouse"}
[ -x "$ROOT_DIR/dbms/programs/${CLICKHOUSE_BINARY}-server" ] && BUILD_DIR=${BUILD_DIR:=$ROOT_DIR} # Build without separate build dir
BUILD_DIR=${BUILD_DIR:=$ROOT_DIR/build${BUILD_TYPE}}
[ -x "$CUR_DIR/clickhouse-server" ] && [ -x "${CUR_DIR}/${CLICKHOUSE_BINARY}-client" ] && BIN_DIR= # Allow run in /usr/bin
[ -x "$BUILD_DIR/dbms/programs/${CLICKHOUSE_BINARY}-server" ] && BIN_DIR=${BIN_DIR:=$BUILD_DIR/dbms/programs/}
[ -f "$CUR_DIR/server-test.xml" ] && CONFIG_DIR=${CONFIG_DIR=$CUR_DIR}/
@ -52,9 +53,9 @@ CERT=`${BIN_DIR}clickhouse-extract-from-config --config=$CLICKHOUSE_CONFIG --key
[ -n "$DHPARAM" ] && openssl dhparam -out $DHPARAM 256
[ -n "$PRIVATEKEY" ] && [ -n "$CERT" ] && openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout $PRIVATEKEY -out $CERT
if [ "$TEST_GDB" ]; then
if [ "$TEST_GDB" ] || [ "$GDB" ]; then
echo -e "run \nset pagination off \nset logging file $DATA_DIR/gdb.log \nset logging on \nthread apply all backtrace \ndetach \nquit " > $DATA_DIR/gdb.cmd
GDB="gdb -x $DATA_DIR/gdb.cmd --args "
GDB=${GDB:="gdb -x $DATA_DIR/gdb.cmd --args "}
fi
# Start a local clickhouse server which will be used to run tests

View File

@ -136,10 +136,7 @@ void * ClickHouseDictionary_v3_loadAll(void * data_ptr, ClickHouseLibrary::CStri
return static_cast<void *>(&ptr->ctable);
}
void * ClickHouseDictionary_v3_loadKeys(void * data_ptr,
ClickHouseLibrary::CStrings * settings,
ClickHouseLibrary::CStrings * columns,
const ClickHouseLibrary::VectorUInt64 * requested_rows)
void * ClickHouseDictionary_v3_loadKeys(void * data_ptr, ClickHouseLibrary::CStrings * settings, ClickHouseLibrary::Table * requested_keys)
{
auto ptr = static_cast<DataHolder *>(data_ptr);
LOG(ptr->lib->log, "loadKeys lib call ptr=" << data_ptr << " => " << ptr);
@ -151,20 +148,11 @@ void * ClickHouseDictionary_v3_loadKeys(void * data_ptr,
LOG(ptr->lib->log, "setting " << i << " :" << settings->data[i]);
}
}
if (columns)
if (requested_keys)
{
LOG(ptr->lib->log, "columns passed:" << columns->size);
for (size_t i = 0; i < columns->size; ++i)
{
LOG(ptr->lib->log, "col " << i << " :" << columns->data[i]);
}
}
if (requested_rows)
{
LOG(ptr->lib->log, "requested_rows passed: " << requested_rows->size);
for (size_t i = 0; i < requested_rows->size; ++i)
{
LOG(ptr->lib->log, "id " << i << " :" << requested_rows->data[i]);
LOG(ptr->lib->log, "requested_keys columns passed: " << requested_keys->size);
for (size_t i = 0; i < requested_keys->size; ++i) {
LOG(ptr->lib->log, "requested_keys at column " << i << " passed: " << requested_keys->data[i].size);
}
}

View File

@ -30,6 +30,27 @@ typedef struct
int someField;
} DataHolder;
typedef struct
{
const void * data;
uint64_t size;
} ClickHouseLibField;
typedef struct
{
const ClickHouseLibField * data;
uint64_t size;
} ClickHouseLibRow;
typedef struct
{
const ClickHouseLibRow * data;
uint64_t size;
uint64_t error_code;
const char * error_string;
} ClickHouseLibTable;
#define LOG(logger, format, ...) \
do \
{ \
@ -54,11 +75,10 @@ void * ClickHouseDictionary_v3_loadAll(void * data_ptr, ClickHouseLibCStrings *
return 0;
}
void * ClickHouseDictionary_v3_loadKeys(
void * data_ptr, ClickHouseLibCStrings * settings, ClickHouseLibCStrings * columns, const ClickHouseLibVectorUInt64 * requested_rows)
void * ClickHouseDictionary_v3_loadKeys(void * data_ptr, ClickHouseLibCStrings * settings, ClickHouseLibTable* requested_keys)
{
LibHolder * lib = ((DataHolder *)(data_ptr))->lib;
LOG(lib->log, "loadKeys c lib call ptr=%p size=%" PRIu64, data_ptr, requested_rows->size);
LOG(lib->log, "loadKeys c lib call ptr=%p size=%" PRIu64, data_ptr, requested_keys->size);
return 0;
}
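The new `requested_keys` argument is a nested structure: a table holds an array of rows, and each row holds an array of fields. A minimal sketch of walking it follows. The struct layouts are copied from the test library above; `countFields` is a hypothetical helper. Note that the struct calls the outer entries rows while the C++ example library logs them as columns, so their exact meaning is left open here.

```cpp
#include <stdint.h>

// Struct layouts as declared in the C test library above.
typedef struct
{
    const void * data;
    uint64_t size;
} ClickHouseLibField;

typedef struct
{
    const ClickHouseLibField * data;
    uint64_t size;
} ClickHouseLibRow;

typedef struct
{
    const ClickHouseLibRow * data;
    uint64_t size;
    uint64_t error_code;
    const char * error_string;
} ClickHouseLibTable;

// Hypothetical helper: total number of fields across all outer entries.
// A null table (or null data pointer) is treated as empty.
uint64_t countFields(const ClickHouseLibTable * table)
{
    if (!table || !table->data)
        return 0;

    uint64_t total = 0;
    for (uint64_t i = 0; i < table->size; ++i)
        total += table->data[i].size;
    return total;
}
```

The null check matters in practice: the C++ example library guards with `if (requested_keys)`, whereas the C version dereferences `requested_keys->size` directly.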

View File

@ -1 +1,3 @@
6
[1,1,1,2]
[1,1,2]

View File

@ -1,2 +1,5 @@
SELECT max(arrayJoin(arr)) FROM (SELECT arrayEnumerateUniq(groupArray(intDiv(number, 54321)) AS nums, groupArray(toString(intDiv(number, 98765)))) AS arr FROM (SELECT number FROM system.numbers LIMIT 1000000) GROUP BY intHash32(number) % 100000)
SELECT max(arrayJoin(arr)) FROM (SELECT arrayEnumerateUniq(groupArray(intDiv(number, 54321)) AS nums, groupArray(toString(intDiv(number, 98765)))) AS arr FROM (SELECT number FROM system.numbers LIMIT 1000000) GROUP BY intHash32(number) % 100000);
SELECT arrayEnumerateUniq([[1], [2], [34], [1]]);
SELECT arrayEnumerateUniq([(1, 2), (3, 4), (1, 2)]);
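The two queries added above exercise `arrayEnumerateUniq` on arrays of arrays and arrays of tuples; the reference file expects `[1,1,1,2]` and `[1,1,2]`. The semantics for a single argument can be sketched as follows. `enumerateUniq` is a hypothetical standalone function built on `std::map`, not the ClickHouse implementation.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// For each element, return how many times an equal element has been seen
// so far, counting the current occurrence.
template <typename T>
std::vector<std::uint64_t> enumerateUniq(const std::vector<T> & values)
{
    std::map<T, std::uint64_t> seen;
    std::vector<std::uint64_t> result;
    result.reserve(values.size());
    for (const T & value : values)
        result.push_back(++seen[value]);
    return result;
}
```

Because `std::map` only needs `operator<`, the same template works for nested vectors and for pairs, which mirrors the nested-array and tuple cases covered by the new tests.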

View File

@ -1,3 +1,5 @@
10000
10000
1 6 3 3
1 6 3 3
1 6 [3,2]

View File

@ -1,6 +1,33 @@
drop table if exists test.summing_merge_tree_aggregate_function;
drop table if exists test.summing_merge_tree_null;
---- partition merge
create table test.summing_merge_tree_aggregate_function (
d Date,
k UInt64,
u AggregateFunction(uniq, UInt64)
) engine=SummingMergeTree(d, k, 1);
insert into test.summing_merge_tree_aggregate_function
select today() as d,
number as k,
uniqState(toUInt64(number % 500))
from numbers(5000)
group by d, k;
insert into test.summing_merge_tree_aggregate_function
select today() as d,
number + 5000 as k,
uniqState(toUInt64(number % 500))
from numbers(5000)
group by d, k;
select count() from test.summing_merge_tree_aggregate_function;
optimize table test.summing_merge_tree_aggregate_function;
select count() from test.summing_merge_tree_aggregate_function;
drop table test.summing_merge_tree_aggregate_function;
---- sum + uniq + uniqExact
create table test.summing_merge_tree_aggregate_function (
d materialized today(),

View File

@ -428,5 +428,7 @@ SELECT COVAR_SAMPArray([CAST( 0 AS Int8)],arrayPopBack([CAST( 0 AS Int8)]));
SELECT medianTimingWeightedArray([CAST( 0 AS Int8)],arrayPopBack([CAST( 0 AS Int8)]));
SELECT quantilesDeterministicArray([CAST( 0 AS Int8)],arrayPopBack([CAST( 0 AS Int32)]));
SELECT maxIntersections([], [])
SELECT sumMap([], [])
SELECT maxIntersections([], []);
SELECT sumMap([], []);
SELECT countArray();

View File

@ -0,0 +1,2 @@
1
0

View File

@ -0,0 +1,8 @@
drop table if exists test.orin_test;
create table test.orin_test (c1 Int32) engine=Memory;
insert into test.orin_test values(1), (100);
select minus(c1 = 1 or c1=2 or c1 =3, c1=5) from test.orin_test;
drop table test.orin_test;

View File

@ -0,0 +1,12 @@
[1,2,3]
[1,2,3]
[1,2,5]
['1212','sef','343r4']
['1212','sef','343r4']
['1212','sef','343r4','232']
[1,2,3]
[21]
['a','b','c']
['123']
[['1212'],['sef'],['343r4']]
[(1,2),(1,3),(1,5)]

Some files were not shown because too many files have changed in this diff