Merge remote-tracking branch 'upstream/master'

BayoNet 2018-09-05 17:06:44 +03:00
commit f1edc50599
161 changed files with 3251 additions and 1381 deletions

@@ -1,3 +1,16 @@
## ClickHouse release 18.6.0, 2018-08-02
### New features:
* Added support for `ON` expressions for the `JOIN ON` syntax (see the example after this list):
`JOIN ON Expr([table.]column ...) = Expr([table.]column, ...) [AND Expr([table.]column, ...) = Expr([table.]column, ...) ...]`
The expression must be a chain of equalities joined by the `AND` operator. Each side of an equality can be an arbitrary expression over the columns of one of the tables. The use of fully qualified column names is supported (`table.name`, `database.table.name`, `table_alias.name`, `subquery_alias.name`) for the right table. [#2742](https://github.com/yandex/ClickHouse/pull/2742)
* HTTPS can be enabled for replication. [#2760](https://github.com/yandex/ClickHouse/pull/2760)
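A minimal sketch of the new equality-chain form. The tables `t1` and `t2` and their columns (`id`, `event_date`, `value`) are hypothetical; each side of an equality uses columns of only one table:

```sql
SELECT t1.id, t2.value
FROM t1
ALL INNER JOIN t2
ON t1.id = t2.id AND toDate(t1.event_date) = t2.event_date
```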
### Improvements:
* The server passes the patch component of its version to the client. Data about the patch version component is in `system.processes` and `query_log`. [#2646](https://github.com/yandex/ClickHouse/pull/2646)
## ClickHouse release 18.5.1, 2018-07-31
### New features:
@@ -6,7 +19,7 @@
### Improvements:
* Now you can use the `from_env` attribute to set values in config files from environment variables [#2741](https://github.com/yandex/ClickHouse/pull/2741).
* Added case-insensitive versions of the `coalesce`, `ifNull`, and `nullIf` functions [#2752](https://github.com/yandex/ClickHouse/pull/2752).
### Bug fixes:
@@ -18,21 +31,21 @@
### New features:
* Added system tables: `formats`, `data_type_families`, `aggregate_function_combinators`, `table_functions`, `table_engines`, `collations` [#2721](https://github.com/yandex/ClickHouse/pull/2721).
* Added the ability to use a table function instead of a table as an argument of a `remote` or `cluster` table function [#2708](https://github.com/yandex/ClickHouse/pull/2708).
* Support for `HTTP Basic` authentication in the replication protocol [#2727](https://github.com/yandex/ClickHouse/pull/2727).
* The `has` function now allows searching for a numeric value in an array of `Enum` values (see the example after this list) [Maxim Khrisanfov](https://github.com/yandex/ClickHouse/pull/2699).
* Support for adding arbitrary message separators when reading from `Kafka` [Amos Bird](https://github.com/yandex/ClickHouse/pull/2701).
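A small illustration of searching an `Enum` array by numeric value; the table and values here are hypothetical:

```sql
CREATE TABLE enum_arrays (e Array(Enum8('a' = 1, 'b' = 2))) ENGINE = Memory;
INSERT INTO enum_arrays VALUES (['a', 'b']);
-- The numeric value 2 corresponds to 'b', so this returns 1
SELECT has(e, 2) FROM enum_arrays;
```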
### Improvements:
* The `ALTER TABLE t DELETE WHERE` query does not rewrite data parts that were not affected by the `WHERE` condition (see the example after this list) [#2694](https://github.com/yandex/ClickHouse/pull/2694).
* The `use_minimalistic_checksums_in_zookeeper` option for `ReplicatedMergeTree` tables is enabled by default. This setting was added in version 1.1.54378, 2018-04-16. Versions that are older than 1.1.54378 can no longer be installed.
* Support for running `KILL` and `OPTIMIZE` queries that specify `ON CLUSTER` [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2689).
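A sketch of the mutation workflow, assuming a hypothetical MergeTree table `hits` with a `UserID` column; only data parts that contain matching rows are rewritten:

```sql
ALTER TABLE hits DELETE WHERE UserID = 123;
-- Mutation progress can be checked in system.mutations
SELECT mutation_id, is_done FROM system.mutations WHERE table = 'hits';
```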
### Bug fixes:
* Fixed the error `Column ... is not under an aggregate function and not in GROUP BY` for aggregation with an IN expression. This bug appeared in version 18.1.0. ([bbdd780b](https://github.com/yandex/ClickHouse/commit/bbdd780be0be06a0f336775941cdd536878dd2c2))
* Fixed a bug in the `windowFunnel` aggregate function [Winter Zhang](https://github.com/yandex/ClickHouse/pull/2735).
* Fixed a bug in the `anyHeavy` aggregate function ([a2101df2](https://github.com/yandex/ClickHouse/commit/a2101df25a6a0fba99aa71f8793d762af2b801ee))
* Fixed server crash when using the `countArray()` aggregate function.
@@ -72,7 +85,6 @@
* Converting a string containing the number zero to DateTime does not work. Example: `SELECT toDateTime('0')`. This is also the reason that `DateTime DEFAULT '0'` does not work in tables, as well as `<null_value>0</null_value>` in dictionaries. Solution: replace `0` with `0000-00-00 00:00:00`.
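A short illustration of the workaround described above; the table name is hypothetical:

```sql
SELECT toDateTime('0');                    -- does not work (parse error)
SELECT toDateTime('0000-00-00 00:00:00');  -- returns the zero DateTime value
CREATE TABLE t_defaults (d DateTime DEFAULT '0000-00-00 00:00:00') ENGINE = Memory;
```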
## ClickHouse release 1.1.54394, 2018-07-12
### New features:
@@ -99,7 +111,7 @@
### Improvements:
* Improved performance, reduced memory consumption, and correct memory consumption tracking with use of the `IN` operator when a table index could be used ([#2584](https://github.com/yandex/ClickHouse/pull/2584)).
* Removed redundant checking of checksums when adding a data part. This is important when there are a large number of replicas, because in these cases the total number of checks was equal to N^2.
* Added support for `Array(Tuple(...))` arguments for the `arrayEnumerateUniq` function ([#2573](https://github.com/yandex/ClickHouse/pull/2573)).
* Added `Nullable` support for the `runningDifference` function ([#2594](https://github.com/yandex/ClickHouse/pull/2594)).
@@ -126,8 +138,8 @@
### New features:
* Support for the `ALTER TABLE t DELETE WHERE` query for replicated tables. Added the `system.mutations` table to track the progress of this type of query.
* Support for the `ALTER TABLE t [REPLACE|ATTACH] PARTITION` query for \*MergeTree tables (see the example after this list).
* Support for the `TRUNCATE TABLE` query ([Winter Zhang](https://github.com/yandex/ClickHouse/pull/2260)).
* Several new `SYSTEM` queries for replicated tables (`RESTART REPLICAS`, `SYNC REPLICA`, `[STOP|START] [MERGES|FETCHES|SENDS REPLICATED|REPLICATION QUEUES]`).
* Added the ability to write to a table with the MySQL engine and the corresponding table function ([sundy-li](https://github.com/yandex/ClickHouse/pull/2294)).
* Added the `url()` table function and the `URL` table engine ([Alexander Sapin](https://github.com/yandex/ClickHouse/pull/2501)).
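A sketch of the new partition manipulation and `TRUNCATE` queries, assuming hypothetical monthly-partitioned MergeTree tables `visits` and `visits_staging` with identical structure:

```sql
-- Copy the July 2018 partition from visits_staging into visits, replacing existing data
ALTER TABLE visits REPLACE PARTITION 201807 FROM visits_staging;
-- Remove all data from the staging table
TRUNCATE TABLE visits_staging;
```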
@@ -137,13 +149,13 @@
* The password to `clickhouse-client` can be entered interactively.
* Server logs can now be sent to syslog ([Alexander Krasheninnikov](https://github.com/yandex/ClickHouse/pull/2459)).
* Support for logging in dictionaries with a shared library source ([Alexander Sapin](https://github.com/yandex/ClickHouse/pull/2472)).
* Support for custom CSV delimiters ([Ivan Zhukov](https://github.com/yandex/ClickHouse/pull/2263)).
* Added the `date_time_input_format` setting. If you switch this setting to `'best_effort'`, DateTime values will be read in a wide range of formats (see the example after this list).
* Added the `clickhouse-obfuscator` utility for data obfuscation. Usage example: publishing data used in performance tests.
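A sketch of the `best_effort` parsing mode, using a hypothetical table and assuming the setting applies to DateTime text inserted via the `Values` format; under the default mode, DateTime text is expected as `YYYY-MM-DD hh:mm:ss` or a Unix timestamp:

```sql
CREATE TABLE events (t DateTime) ENGINE = Memory;
SET date_time_input_format = 'best_effort';
-- An ISO 8601 value with a 'T' separator and a time zone offset is now parsed on insert
INSERT INTO events VALUES ('2018-08-02T12:34:56+03:00');
```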
### Experimental features:
* Added the ability to calculate `and` arguments only where they are needed ([Anastasia Tsarkova](https://github.com/yandex/ClickHouse/pull/2272)).
* JIT compilation to native code is now available for some expressions ([pyos](https://github.com/yandex/ClickHouse/pull/2277)).
### Bug fixes:
@@ -151,11 +163,11 @@
* Duplicates no longer appear for a query with `DISTINCT` and `ORDER BY`.
* Queries with `ARRAY JOIN` and `arrayFilter` no longer return an incorrect result.
* Fixed an error when reading an array column from a Nested structure ([#2066](https://github.com/yandex/ClickHouse/issues/2066)).
* Fixed an error when analyzing queries with a HAVING clause like `HAVING tuple IN (...)`.
* Fixed an error when analyzing queries with recursive aliases.
* Fixed an error when reading from ReplacingMergeTree with a condition in PREWHERE that filters all rows ([#2525](https://github.com/yandex/ClickHouse/issues/2525)).
* User profile settings were not applied when using sessions in the HTTP interface.
* Fixed how settings are applied from the command line parameters in `clickhouse-local`.
* The ZooKeeper client library now uses the session timeout received from the server.
* Fixed a bug in the ZooKeeper client library when the client waited for the server response longer than the timeout.
* Fixed pruning of parts for queries with conditions on partition key columns ([#2342](https://github.com/yandex/ClickHouse/issues/2342)).
@@ -165,7 +177,7 @@
* Fixed syntactic parsing and formatting of the `CAST` operator.
* Fixed insertion into a materialized view for the Distributed table engine ([Babacar Diassé](https://github.com/yandex/ClickHouse/pull/2411)).
* Fixed a race condition when writing data from the `Kafka` engine to materialized views ([Yangkuan Liu](https://github.com/yandex/ClickHouse/pull/2448)).
* Fixed SSRF in the `remote()` table function.
* Fixed exit behavior of `clickhouse-client` in multiline mode ([#2510](https://github.com/yandex/ClickHouse/issues/2510)).
### Improvements:
@@ -184,7 +196,7 @@
### Build changes:
* The gcc8 compiler can be used for builds.
* Added the ability to build llvm from a submodule.
* The version of the librdkafka library has been updated to v0.11.4.
* Added the ability to use the system libcpuid library. The library version has been updated to 0.4.0.
* Fixed the build using the vectorclass library ([Babacar Diassé](https://github.com/yandex/ClickHouse/pull/2274)).
@@ -195,44 +207,52 @@
### Backward incompatible changes:
* Removed escaping in `Vertical` and `Pretty*` formats and deleted the `VerticalRaw` format.
* If servers with version 1.1.54388 (or newer) and servers with an older version are used simultaneously in a distributed query, and the query has the `cast(x, 'Type')` expression without the `AS` keyword and with `cast` not written in uppercase, an exception with a message like `Not found column cast(0, 'UInt8') in block` will be thrown. Solution: update the server on all cluster nodes.
## ClickHouse release 1.1.54385, 2018-06-01
### Bug fixes:
* Fixed an error that in some cases caused ZooKeeper operations to block.
## ClickHouse release 1.1.54383, 2018-05-22
### Bug fixes:
* Fixed a slowdown of the replication queue if a table has many replicas.
## ClickHouse release 1.1.54381, 2018-05-14
### Bug fixes:
* Fixed a node leak in ZooKeeper when ClickHouse loses the connection to the ZooKeeper server.
## ClickHouse release 1.1.54380, 2018-04-21
### New features:
* Added the table function `file(path, format, structure)`. An example reading bytes from `/dev/urandom`: `ln -s /dev/urandom /var/lib/clickhouse/user_files/random` `clickhouse-client -q "SELECT * FROM file('random', 'RowBinary', 'd UInt8') LIMIT 10"`.
### Improvements:
* Subqueries can be wrapped in `()` brackets to enhance query readability. For example: `(SELECT 1) UNION ALL (SELECT 1)`.
* Simple `SELECT` queries from the `system.processes` table are not counted in the `max_concurrent_queries` limit.
### Bug fixes:
* Fixed incorrect behavior of the `IN` operator when selecting from a `MATERIALIZED VIEW`.
* Fixed incorrect filtering by partition index in expressions like `WHERE partition_key_column IN (...)`.
* Fixed inability to execute an `OPTIMIZE` query on a non-leader replica if the table was `RENAME`d.
* Fixed the authorization error when executing `OPTIMIZE` or `ALTER` queries on a non-leader replica.
* Fixed freezing of `KILL QUERY` queries.
* Fixed an error in the ZooKeeper client library which led to loss of watches, freezing of the distributed DDL queue, and slowdowns in the replication queue if a non-empty `chroot` prefix is used in the ZooKeeper configuration.
### Backward incompatible changes:
* Removed support for expressions like `(a, b) IN (SELECT (a, b))` (you can use the equivalent expression `(a, b) IN (SELECT a, b)` instead). In previous releases, these expressions led to undetermined `WHERE` filtering or caused errors.
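For illustration, with hypothetical tables `t1` and `t2` that both have columns `a` and `b`, the removed form and its supported equivalent look like this:

```sql
-- No longer supported:
--   SELECT count() FROM t1 WHERE (a, b) IN (SELECT (a, b) FROM t2)
-- Equivalent supported form:
SELECT count() FROM t1 WHERE (a, b) IN (SELECT a, b FROM t2)
```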
## ClickHouse release 1.1.54378, 2018-04-16
### New features:
* Logging level can be changed without restarting the server.
@@ -242,10 +262,10 @@
* Added support for `ALTER TABLE ... PARTITION ...` for `MATERIALIZED VIEW`.
* Added information about the size of data parts in uncompressed form in the system table.
* Server-to-server encryption support for distributed tables (`<secure>1</secure>` in the replica config in `<remote_servers>`).
* Configuration at the table level for the `ReplicatedMergeTree` family in order to minimize the amount of data stored in ZooKeeper: `use_minimalistic_checksums_in_zookeeper = 1`.
* Configuration of the `clickhouse-client` prompt. By default, server names are now output to the prompt. The server's display name can be changed; it's also sent in the `X-ClickHouse-Display-Name` HTTP header (Kirill Shvakov).
* Multiple comma-separated `topics` can be specified for the `Kafka` engine (Tobias Adamson).
* When a query is stopped by `KILL QUERY` or `replace_running_query`, the client receives the `Query was cancelled` exception instead of an incomplete result.
### Improvements:
@@ -264,11 +284,11 @@
* Correct results are now returned when using tuples with `IN` when some of the tuple components are in the table index.
* The `max_execution_time` limit now works correctly with distributed queries.
* Fixed errors when calculating the size of composite columns in the `system.columns` table.
* Fixed an error when creating a temporary table `CREATE TEMPORARY TABLE IF NOT EXISTS`.
* Fixed errors in `StorageKafka` (#2075).
* Fixed server crashes from invalid arguments of certain aggregate functions.
* Fixed the error that prevented the `DETACH DATABASE` query from stopping background tasks for `ReplicatedMergeTree` tables.
* `Too many parts` state is less likely to happen when inserting into aggregated materialized views (#2084).
* Corrected recursive handling of substitutions in the config if a substitution must be followed by another substitution on the same level.
* Corrected the syntax in the metadata file when creating a `VIEW` that uses a query with `UNION ALL`.
* `SummingMergeTree` now works correctly for summation of nested data structures with a composite key.
@@ -276,15 +296,14 @@
### Build changes:
* The build supports `ninja` instead of `make` and uses `ninja` by default for building releases.
* Renamed packages: `clickhouse-server-base` is now `clickhouse-common-static`; `clickhouse-server-common` is now `clickhouse-server`; `clickhouse-common-dbg` is now `clickhouse-common-static-dbg`. To install, use `clickhouse-server clickhouse-client`. Packages with the old names will still load in the repositories for backward compatibility.
### Backward incompatible changes:
* Removed the special interpretation of an IN expression if an array is specified on the left side. Previously, the expression `arr IN (set)` was interpreted as "at least one `arr` element belongs to the `set`". To get the same behavior in the new version, write `arrayExists(x -> x IN (set), arr)` (see the example below).
* Disabled the incorrect use of the socket option `SO_REUSEPORT`, which was incorrectly enabled by default in the Poco library. Note that on Linux there is no longer any reason to simultaneously specify the addresses `::` and `0.0.0.0` for listen; use just `::`, which allows listening to connections over both IPv4 and IPv6 (with the default kernel config settings). You can also revert to the behavior from previous versions by specifying `<listen_reuse_port>1</listen_reuse_port>` in the config.
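A brief sketch of the rewrite, assuming a hypothetical table `t` with an `Array` column `arr`; `arrayExists` with a lambda expresses the old "at least one element belongs to the set" semantics explicitly:

```sql
-- Old implicit form (no longer special-cased): arr IN (1, 2, 3)
SELECT count() FROM t WHERE arrayExists(x -> x IN (1, 2, 3), arr)
```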
## ClickHouse release 1.1.54370, 2018-03-16
### New features:
@@ -296,43 +315,44 @@
### Improvements:
* When inserting data in a `Replicated` table, fewer requests are made to `ZooKeeper` (and most of the user-level errors have disappeared from the `ZooKeeper` log).
* Added the ability to create aliases for sets. Example: `WITH (1, 2, 3) AS set SELECT number IN set FROM system.numbers LIMIT 10`.
### Bug fixes:
* Fixed the `Illegal PREWHERE` error when reading from `Merge` tables over `Distributed` tables.
* Added fixes that allow you to run `clickhouse-server` in IPv4-only Docker containers.
* Fixed a race condition when reading from the `system.parts_columns` table.
* Removed double buffering during a synchronous insert to a `Distributed` table, which could have caused the connection to timeout.
* Fixed a bug that caused excessively long waits for an unavailable replica before beginning a `SELECT` query.
* Fixed incorrect dates in the `system.parts` table.
* Fixed a bug that made it impossible to insert data in a `Replicated` table if `chroot` was non-empty in the configuration of the `ZooKeeper` cluster.
* Fixed the vertical merging algorithm for an empty `ORDER BY` table.
* Restored the ability to use dictionaries in queries to remote tables, even if these dictionaries are not present on the requestor server. This functionality was lost in release 1.1.54362.
* Restored the behavior for queries like `SELECT * FROM remote('server2', default.table) WHERE col IN (SELECT col2 FROM default.table)` when the right side of the `IN` should use a remote `default.table` instead of a local one. This behavior was broken in version 1.1.54358.
* Removed extraneous error-level logging of `Not found column ... in block`.
## ClickHouse release 1.1.54362, 2018-03-11
### New features:
* Aggregation without `GROUP BY` for an empty set (such as `SELECT count(*) FROM table WHERE 0`) now returns a result with one row with null values for aggregate functions, in compliance with the SQL standard. To restore the old behavior (return an empty result), set `empty_result_for_aggregation_by_empty_set` to 1 (see the example after this list).
* Added type conversion for `UNION ALL`. Different alias names are allowed in `SELECT` positions in `UNION ALL`, in compliance with the SQL standard.
* Arbitrary expressions are supported in `LIMIT BY` clauses. Previously, it was only possible to use columns resulting from `SELECT`.
* An index of `MergeTree` tables is used when `IN` is applied to a tuple of expressions from the columns of the primary key. Example: `WHERE (UserID, EventDate) IN ((123, '2000-01-01'), ...)` (Anastasiya Tsarkova).
* Added the `clickhouse-copier` tool for copying between clusters and resharding data (beta).
* Added consistent hashing functions: `yandexConsistentHash`, `jumpConsistentHash`, `sumburConsistentHash`. They can be used as a sharding key in order to reduce the amount of network traffic during subsequent reshardings.
* Added functions: `arrayAny`, `arrayAll`, `hasAny`, `hasAll`, `arrayIntersect`, `arrayResize`.
* Added the `arrayCumSum` function (Javi Santana).
* Added the `parseDateTimeBestEffort`, `parseDateTimeBestEffortOrZero`, and `parseDateTimeBestEffortOrNull` functions to read the DateTime from a string containing text in a wide variety of possible formats.
* It is now possible to change the logging settings without restarting the server.
* Data can be partially reloaded from external dictionaries during updating (loading just the records in which the value of the specified field is greater than in the previous download) (Arsen Hakobyan).
* Added the `cluster` table function. Example: `cluster(cluster_name, db, table)`. The `remote` table function can accept the cluster name as the first argument, if it is specified as an identifier.
* The `remote` and `cluster` table functions can be used in `INSERT` queries.
* Added the `create_table_query` and `engine_full` virtual columns to the `system.tables` table. The `metadata_modification_time` column is virtual.
* Added the `data_path` and `metadata_path` columns to the `system.tables` and `system.databases` tables, and added the `path` column to the `system.parts` and `system.parts_columns` tables.
* Added additional information about merges in the `system.part_log` table.
* An arbitrary partitioning key can be used for the `system.query_log` table (Kirill Shvakov).
* The `SHOW TABLES` query now also shows temporary tables. Added temporary tables and the `is_temporary` column to `system.tables` (zhang2014).
* Added `DROP TEMPORARY TABLE` and `EXISTS TEMPORARY TABLE` queries (zhang2014).
* Support for `SHOW CREATE TABLE` for temporary tables (zhang2014).
* Added the `system_profile` configuration parameter for the settings used by internal processes.
* Support for loading `object_id` as an attribute in `MongoDB` dictionaries (Pavel Litvinenko).
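A quick illustration of the new default behavior for aggregation over an empty set, using the always-present `system.one` table:

```sql
-- Returns a single row with count() = 0, in compliance with the SQL standard
SELECT count(*) FROM system.one WHERE 0;
-- Restore the old behavior (an empty result set) for the current session
SET empty_result_for_aggregation_by_empty_set = 1;
```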
@@ -347,7 +367,9 @@
* `MergeTree` tables can be used without a primary key (you need to specify `ORDER BY tuple()`; see the example after this list).
* A `Nullable` type can be `CAST` to a non-`Nullable` type if the argument is not `NULL`.
* `RENAME TABLE` can be performed for `VIEW`.
* Added the `throwIf` function.
* Added the `odbc_default_field_size` option, which allows you to extend the maximum size of the value loaded from an ODBC source (by default, it is 1024).
* The `system.processes` table and `SHOW PROCESSLIST` now have the `is_cancelled` and `peak_memory_usage` columns.
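A sketch of two of the items above, using a hypothetical table: a `MergeTree` table without a primary key (`ORDER BY tuple()`) and the `throwIf` function:

```sql
CREATE TABLE events_unordered (d Date, x UInt32)
ENGINE = MergeTree() PARTITION BY toYYYYMM(d) ORDER BY tuple();

-- throwIf returns 0 while the condition is zero and raises an exception otherwise
SELECT throwIf(x > 1000000) FROM events_unordered;
```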
### Improvements:
@@ -371,6 +393,7 @@
* Fixed a bug in merges for `ReplacingMergeTree` tables.
* Fixed synchronous insertions in `Distributed` tables (`insert_distributed_sync = 1`).
* Fixed segfault for certain uses of `FULL` and `RIGHT JOIN` with duplicate columns in subqueries.
* Fixed segfault for certain uses of `replace_running_query` and `KILL QUERY`.
* Fixed the order of the `source` and `last_exception` columns in the `system.dictionaries` table.
* Fixed a bug when the `DROP DATABASE` query did not delete the file with metadata.
* Fixed the `DROP DATABASE` query for `Dictionary` databases.
@@ -390,74 +413,77 @@
* Fixed a race condition in the query execution pipeline that occurred in very rare cases when using `Merge` tables with a large number of tables, and when using `GLOBAL` subqueries.
* Fixed a crash when passing arrays of different sizes to an `arrayReduce` function when using aggregate functions from multiple arguments.
* Prohibited the use of queries with `UNION ALL` in a `MATERIALIZED VIEW`.
* Fixed an error during initialization of the `part_log` system table when the server starts (by default, `part_log` is disabled).
### Backward incompatible changes:
* Removed the `distributed_ddl_allow_replicated_alter` option. This behavior is enabled by default.
* Removed the `strict_insert_defaults` setting. If you were using this functionality, write to `clickhouse-feedback@yandex-team.com`.
* Removed the `UnsortedMergeTree` engine.
## ClickHouse release 1.1.54343, 2018-02-05
* Added macros support for defining cluster names in distributed DDL queries and constructors of Distributed tables: `CREATE TABLE distr ON CLUSTER '{cluster}' (...) ENGINE = Distributed('{cluster}', 'db', 'table')`.
* Queries like `SELECT ... FROM table WHERE expr IN (subquery)` are now processed using the table index.
* Improved processing of duplicates when inserting to Replicated tables, so they no longer slow down execution of the replication queue.
## ClickHouse release 1.1.54342, 2018-01-22
This release contains bug fixes for the previous release 1.1.54337:
* Fixed a regression in 1.1.54337: if the default user has readonly access, then the server refuses to start up with the message `Cannot create database in readonly mode`.
* Fixed a regression in 1.1.54337: on systems with `systemd`, logs are always written to syslog regardless of the configuration; the watchdog script still uses `init.d`.
* Fixed a regression in 1.1.54337: wrong default configuration in the Docker image.
* Fixed nondeterministic behavior of `GraphiteMergeTree` (you can see it in log messages `Data after merge is not byte-identical to data on another replicas`).
* Fixed a bug that may lead to inconsistent merges after an `OPTIMIZE` query to Replicated tables (you may see it in log messages `Part ... intersects previous part`).
* Buffer tables now work correctly when MATERIALIZED columns are present in the destination table (by zhang2014).
* Fixed a bug in the implementation of NULL.
## ClickHouse release 1.1.54337, 2018-01-18
### New features:
* Added support for storage of multidimensional arrays and tuples (`Tuple` data type) in tables.
* Added support for table functions in `DESCRIBE` and `INSERT` queries. Added support for subqueries in `DESCRIBE`. Examples: `DESC TABLE remote('host', default.hits)`; `DESC TABLE (SELECT 1)`; `INSERT INTO TABLE FUNCTION remote('host', default.hits)`. Support for `INSERT INTO TABLE` syntax in addition to `INSERT INTO`.
* Improved support for time zones. The `DateTime` data type can be annotated with the time zone that is used for parsing and formatting in text formats. Example: `DateTime('Europe/Moscow')`. When time zones are specified in functions for `DateTime` arguments, the return type will track the time zone, and the value will be displayed as expected (see the example after this list).
* Added the functions `toTimeZone`, `timeDiff`, `toQuarter`, `toRelativeQuarterNum`. The `toRelativeHour`/`Minute`/`Second` functions can take a value of type `Date` as an argument. The name of the `now` function has been made case-insensitive.
* Added the `toStartOfFifteenMinutes` function (Kirill Shvakov).
* Added the `clickhouse format` tool for formatting queries.
* Added the `format_schema_path` configuration parameter (Marek Vavruša). It is used for specifying a schema in `Cap'n'Proto` format. Schema files can be located only in the specified directory.
* Added support for config substitutions (`incl` and `conf.d`) for configuration of external dictionaries and models (Pavel Yakunin).
* Added a column with documentation for the `system.settings` table (Kirill Shvakov).
* Added the `system.parts_columns` table with information about column sizes in each data part of `MergeTree` tables.
* Added the `system.models` table with information about loaded `CatBoost` machine learning models.
* Added the `mysql` and `odbc` table functions along with the corresponding `MySQL` and `ODBC` table engines for working with foreign databases. This feature is in the beta stage.
* Added the possibility to pass an argument of type `AggregateFunction` for the `groupArray` aggregate function (so you can create an array of states of some aggregate function).
* Removed restrictions on various combinations of aggregate function combinators. For example, you can use `avgForEachIf` as well as `avgIfForEach` aggregate functions, which have different behaviors.
* The `-ForEach` aggregate function combinator is extended for the case of aggregate functions of multiple arguments.
* Added support for aggregate functions of `Nullable` arguments even for cases when the function returns a non-`Nullable` result (added with the contribution of Silviu Caragea). Examples: `groupArray`, `groupUniqArray`, `topK`.
* Added the `max_client_network_bandwidth` command line parameter for `clickhouse-client` (Kirill Shvakov).
* Users with the `readonly = 2` setting are allowed to work with TEMPORARY tables (CREATE, DROP, INSERT...) (Kirill Shvakov).
* Added support for using multiple consumers with the `Kafka` engine. Extended configuration options for `Kafka` (Marek Vavruša).
* Added the `intExp2` and `intExp10` functions.
* Added the `sumKahan` aggregate function (computationally stable summation of floating point numbers).
* Added to*Number*OrNull functions, where *Number* is a numeric type.
* Added support for the `WITH` clause for an `INSERT SELECT` query (by zhang2014).
* Added the `http_connection_timeout`, `http_send_timeout`, and `http_receive_timeout` settings. In particular, these settings are used for downloading data parts for replication. Changing these settings allows for faster failover if the network is overloaded.
* Added support for the `ALTER` query for tables of type `Null` (Anastasiya Tsarkova). Tables of type `Null` are often used with materialized views.
* The `reinterpretAsString` function is extended for all data types that are stored contiguously in memory.
* Added the `--silent` option for the `clickhouse-local` tool. It suppresses printing query execution info in stderr.
* Added support for reading values of type `Date` from text in a format where the month and/or day of the month is specified using a single digit instead of two digits (Amos Bird).
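A short sketch of the time zone annotation and the new `toTimeZone` function, using a hypothetical table:

```sql
CREATE TABLE tz_demo (t DateTime('Europe/Moscow')) ENGINE = Memory;
INSERT INTO tz_demo VALUES ('2018-01-18 12:00:00');
-- The value is stored once; formatting and the return type follow the requested time zone
SELECT t, toTimeZone(t, 'UTC') FROM tz_demo;
```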
### Performance optimizations:
* Improved performance of the `min`, `max`, `any`, `anyLast`, `anyHeavy`, `argMin`, and `argMax` aggregate functions for `String` arguments.
* Improved performance of the `isInfinite`, `isFinite`, `isNaN`, and `roundToExp2` functions.
* Improved performance of parsing and formatting values of type `Date` and `DateTime` in text formats.
* Improved performance and precision of parsing floating point numbers.
* Lowered memory usage for `JOIN` in the case when the left and right parts have columns with identical names that are not contained in `USING`.
* Improved performance of the `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, and `corr` aggregate functions by reducing computational stability. The old functions are available under the names `varSampStable`, `varPopStable`, `stddevSampStable`, `stddevPopStable`, `covarSampStable`, `covarPopStable`, and `corrStable`.
### Bug fixes:
* Fixed data deduplication after running a `DROP` or `DETACH PARTITION` query. In the previous version, dropping a partition and inserting the same data again was not working because inserted blocks were considered duplicates.
* Fixed a bug that could lead to incorrect interpretation of the `WHERE` clause for `CREATE MATERIALIZED VIEW` queries with `POPULATE`.
* Fixed a bug in using the `root_path` parameter in the `zookeeper_servers` configuration.
* Fixed unexpected results of passing the `Date` argument to `toStartOfDay`.
@ -476,19 +502,19 @@ This release contains bug fixes for the previous release 1.1.54337:
* Fixed a bug in the background check of parts (`MergeTreePartChecker` ) when using a custom partition key. * Fixed a bug in the background check of parts (`MergeTreePartChecker` ) when using a custom partition key.
* Fixed parsing of tuples (values of the `Tuple` data type) in text formats. * Fixed parsing of tuples (values of the `Tuple` data type) in text formats.
* Improved error messages about incompatible types passed to `multiIf` , `array` and some other functions. * Improved error messages about incompatible types passed to `multiIf` , `array` and some other functions.
* Support for `Nullable` types is completely reworked. Fixed bugs that may lead to a server crash. Fixed almost all other bugs related to NULL support: incorrect type conversions in INSERT SELECT, insufficient support for Nullable in HAVING and PREWHERE, `join_use_nulls` mode, Nullable types as arguments of OR operator, etc. * Redesigned support for `Nullable` types. Fixed bugs that may lead to a server crash. Fixed almost all other bugs related to ` NULL` support: incorrect type conversions in INSERT SELECT, insufficient support for Nullable in HAVING and PREWHERE, `join_use_nulls` mode, Nullable types as arguments of `OR` operator, etc.
* Fixed various bugs related to internal semantics of data types. Examples: unnecessary summing of `Enum` type fields in `SummingMergeTree`; alignment of `Enum` types in Pretty formats, etc. * Fixed various bugs related to internal semantics of data types. Examples: unnecessary summing of `Enum` type fields in `SummingMergeTree` ; alignment of `Enum` types in `Pretty` formats, etc.
* Stricter checks for allowed combinations of composite columns. Fixed several bugs that could lead to a server crash. * Stricter checks for allowed combinations of composite columns.
* Fixed the overflow when specifying a very large parameter for the `FixedString` data type. * Fixed the overflow when specifying a very large parameter for the `FixedString` data type.
* Fixed a bug in the `topK` aggregate function in a generic case. * Fixed a bug in the `topK` aggregate function in a generic case.
* Added the missing check for equality of array sizes in arguments of n-ary variants of aggregate functions with an `-Array` combinator. * Added the missing check for equality of array sizes in arguments of n-ary variants of aggregate functions with an `-Array` combinator.
* Fixed the `--pager` option for `clickhouse-client` (by ks1322). * Fixed a bug in `--pager` for `clickhouse-client` (author: ks1322).
* Fixed the precision of the `exp10` function. * Fixed the precision of the `exp10` function.
* Fixed the behavior of the `visitParamExtract` function for better compliance with documentation. * Fixed the behavior of the `visitParamExtract` function for better compliance with documentation.
* Fixed the crash when incorrect data types are specified. * Fixed the crash when incorrect data types are specified.
* Fixed the behavior of `DISTINCT` in the case when all columns are constants. * Fixed the behavior of `DISTINCT` in the case when all columns are constants.
* Fixed query formatting in the case of using the `tupleElement` function with a complex constant expression as the tuple element index. * Fixed query formatting in the case of using the `tupleElement` function with a complex constant expression as the tuple element index.
* Fixed the `Dictionary` table engine for dictionaries of type `range_hashed`. * Fixed a bug in `Dictionary` tables for `range_hashed` dictionaries.
* Fixed a bug that leads to excessive rows in the result of `FULL` and `RIGHT JOIN` (Amos Bird). * Fixed a bug that leads to excessive rows in the result of `FULL` and `RIGHT JOIN` (Amos Bird).
* Fixed a server crash when creating and removing temporary files in `config.d` directories during config reload. * Fixed a server crash when creating and removing temporary files in `config.d` directories during config reload.
* Fixed the `SYSTEM DROP DNS CACHE` query: the cache was flushed but addresses of cluster nodes were not updated. * Fixed the `SYSTEM DROP DNS CACHE` query: the cache was flushed but addresses of cluster nodes were not updated.
@ -496,41 +522,44 @@ This release contains bug fixes for the previous release 1.1.54337:
### Build improvements: ### Build improvements:
* Builds use `pbuilder`. The build process is almost completely independent of the build host environment. * The `pbuilder` tool is used for builds. The build process is almost completely independent of the build host environment.
* A single build is used for different OS versions. Packages and binaries have been made compatible with a wide range of Linux systems. * A single build is used for different OS versions. Packages and binaries have been made compatible with a wide range of Linux systems.
* Added the `clickhouse-test` package. It can be used to run functional tests. * Added the `clickhouse-test` package. It can be used to run functional tests.
* The source tarball can now be published to the repository. It can be used to reproduce the build without using GitHub. * The source tarball can now be published to the repository. It can be used to reproduce the build without using GitHub.
* Added limited integration with Travis CI. Due to limits on build time in Travis, only the debug build is tested and a limited subset of tests are run. * Added limited integration with Travis CI. Due to limits on build time in Travis, only the debug build is tested and a limited subset of tests are run.
* Added support for `Cap'n'Proto` in the default build. * Added support for `Cap'n'Proto` in the default build.
* Changed the format of documentation sources from `Restructured Text` to `Markdown`. * Changed the format of documentation sources from `Restructured Text` to `Markdown`.
* Added support for `systemd` (Vladimir Smirnov). It is disabled by default due to incompatibility with some OS images and can be enabled manually. * Added support for `systemd` (Vladimir Smirnov). It is disabled by default due to incompatibility with some OS images and can be enabled manually.
* For dynamic code generation, `clang` and `lld` are embedded into the `clickhouse` binary. They can also be invoked as `clickhouse clang` and `clickhouse lld`. * For dynamic code generation, `clang` and `lld` are embedded into the `clickhouse` binary. They can also be invoked as `clickhouse clang` and `clickhouse lld`.
* Removed usage of GNU extensions from the code. Enabled the `-Wextra` option. When building with `clang`, `libc++` is used instead of `libstdc++`. * Removed usage of GNU extensions from the code. Enabled the `-Wextra` option. When building with `clang` the default is `libc++` instead of `libstdc++`.
* Extracted `clickhouse_parsers` and `clickhouse_common_io` libraries to speed up builds of various tools. * Extracted `clickhouse_parsers` and `clickhouse_common_io` libraries to speed up builds of various tools.
### Backward incompatible changes: ### Backward incompatible changes:
* The format for marks in `Log` type tables that contain `Nullable` columns was changed in a backward incompatible way. If you have these tables, you should convert them to the `TinyLog` type before starting up the new server version. To do this, replace `ENGINE = Log` with `ENGINE = TinyLog` in the corresponding `.sql` file in the `metadata` directory. If your table doesn't have `Nullable` columns or if the type of your table is not `Log`, then you don't need to do anything. * The format for marks in `Log` type tables that contain `Nullable` columns was changed in a backward incompatible way. If you have these tables, you should convert them to the `TinyLog` type before starting up the new server version. To do this, replace `ENGINE = Log` with `ENGINE = TinyLog` in the corresponding `.sql` file in the `metadata` directory. If your table doesn't have `Nullable` columns or if the type of your table is not `Log`, then you don't need to do anything.
* Removed the `experimental_allow_extended_storage_definition_syntax` setting. Now this feature is enabled by default. * Removed the `experimental_allow_extended_storage_definition_syntax` setting. Now this feature is enabled by default.
* To avoid confusion, the `runningIncome` function has been renamed to `runningDifferenceStartingWithFirstValue`. * The `runningIncome` function was renamed to `runningDifferenceStartingWithFirstValue` to avoid confusion.
* Removed the `FROM ARRAY JOIN arr` syntax when ARRAY JOIN is specified directly after FROM with no table (Amos Bird). * Removed the `FROM ARRAY JOIN arr` syntax when ARRAY JOIN is specified directly after FROM with no table (Amos Bird).
* Removed the `BlockTabSeparated` format that was used solely for demonstration purposes. * Removed the `BlockTabSeparated` format that was used solely for demonstration purposes.
* Changed the serialization format of intermediate states of the aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, and `corr`. If you have stored states of these aggregate functions in tables (using the AggregateFunction data type or materialized views with corresponding states), please write to clickhouse-feedback@yandex-team.com. * Changed the state format for aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr`. If you have stored states of these aggregate functions in tables (using the `AggregateFunction` data type or materialized views with corresponding states), please write to clickhouse-feedback@yandex-team.com.
* In previous server versions there was an undocumented feature: if an aggregate function depends on parameters, you can still specify it without parameters in the AggregateFunction data type. Example: `AggregateFunction(quantiles, UInt64)` instead of `AggregateFunction(quantiles(0.5, 0.9), UInt64)`. This feature was lost. Although it was undocumented, we plan to support it again in future releases. * In previous server versions there was an undocumented feature: if an aggregate function depends on parameters, you can still specify it without parameters in the AggregateFunction data type. Example: `AggregateFunction(quantiles, UInt64)` instead of `AggregateFunction(quantiles(0.5, 0.9), UInt64)`. This feature was lost. Although it was undocumented, we plan to support it again in future releases.
* Enum data types cannot be used in min/max aggregate functions. This ability will be restored in a future release. * Enum data types cannot be used in min/max aggregate functions. This ability will be restored in a future release.
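For reference, a sketch of the fully parameterized `AggregateFunction` form that remains supported (the table and column names are hypothetical):

```sql
CREATE TABLE quantile_states
(
    date Date,
    q AggregateFunction(quantiles(0.5, 0.9), UInt64)
) ENGINE = AggregatingMergeTree() ORDER BY date;

-- Merge the stored states back into final quantile values.
SELECT quantilesMerge(0.5, 0.9)(q) FROM quantile_states;
```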
### Please note when upgrading: ### Please note when upgrading:
* When doing a rolling update on a cluster, at the point when some of the replicas are running the old version of ClickHouse and some are running the new version, replication is temporarily stopped and the message `unknown parameter 'shard'` appears in the log. Replication will continue after all replicas of the cluster are updated. * When doing a rolling update on a cluster, at the point when some of the replicas are running the old version of ClickHouse and some are running the new version, replication is temporarily stopped and the message `unknown parameter 'shard'` appears in the log. Replication will continue after all replicas of the cluster are updated.
* If you have different ClickHouse versions on the cluster, you can get incorrect results for distributed queries with the aggregate functions `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, and `corr`. You should update all cluster nodes. * If different versions of ClickHouse are running on the cluster servers, it is possible that distributed queries using the following functions will have incorrect results: `varSamp`, `varPop`, `stddevSamp`, `stddevPop`, `covarSamp`, `covarPop`, `corr`. You should update all cluster nodes.
## ClickHouse release 1.1.54327, 2017-12-21 ## ClickHouse release 1.1.54327, 2017-12-21
This release contains bug fixes for the previous release 1.1.54318: This release contains bug fixes for the previous release 1.1.54318:
* Fixed a bug with a possible race condition in replication that could lead to data loss. This issue affects versions 1.1.54310 and 1.1.54318. If you use one of these versions with Replicated tables, updating is strongly recommended. This issue shows up in the log as warning messages like `Part ... from own log doesn't exist.` The issue is relevant even if you don't see these messages in the log. * Fixed a bug with a possible race condition in replication that could lead to data loss. This issue affects versions 1.1.54310 and 1.1.54318. If you use one of these versions with Replicated tables, updating is strongly recommended. This issue shows up in the log as warning messages like `Part ... from own log doesn't exist.` The issue is relevant even if you don't see these messages in the log.
## ClickHouse release 1.1.54318, 2017-11-30 ## ClickHouse release 1.1.54318, 2017-11-30
This release contains bug fixes for the previous release 1.1.54310: This release contains bug fixes for the previous release 1.1.54310:
* Fixed incorrect row deletions during merges in the SummingMergeTree engine * Fixed incorrect row deletions during merges in the SummingMergeTree engine
* Fixed a memory leak in unreplicated MergeTree engines * Fixed a memory leak in unreplicated MergeTree engines
* Fixed performance degradation with frequent inserts in MergeTree engines * Fixed performance degradation with frequent inserts in MergeTree engines
@ -540,6 +569,7 @@ This release contains bug fixes for the previous release 1.1.54310:
## ClickHouse release 1.1.54310, 2017-11-01 ## ClickHouse release 1.1.54310, 2017-11-01
### New features: ### New features:
* Custom partitioning key for the MergeTree family of table engines. * Custom partitioning key for the MergeTree family of table engines.
* [ Kafka](https://clickhouse.yandex/docs/en/single/index.html#document-table_engines/kafka) table engine. * [ Kafka](https://clickhouse.yandex/docs/en/single/index.html#document-table_engines/kafka) table engine.
* Added support for loading [CatBoost](https://catboost.yandex/) models and applying them to data stored in ClickHouse. * Added support for loading [CatBoost](https://catboost.yandex/) models and applying them to data stored in ClickHouse.
@ -550,17 +580,19 @@ This release contains bug fixes for the previous release 1.1.54310:
* Added the `ATTACH TABLE` query without arguments. * Added the `ATTACH TABLE` query without arguments.
* The processing logic for Nested columns with names ending in -Map in a SummingMergeTree table was extracted to the sumMap aggregate function. You can now specify such columns explicitly. * The processing logic for Nested columns with names ending in -Map in a SummingMergeTree table was extracted to the sumMap aggregate function. You can now specify such columns explicitly.
* Max size of the IP trie dictionary is increased to 128M entries. * Max size of the IP trie dictionary is increased to 128M entries.
* Added the `getSizeOfEnumType` function. * Added the getSizeOfEnumType function.
* Added the `sumWithOverflow` aggregate function. * Added the sumWithOverflow aggregate function.
* Added support for the Cap'n Proto input format. * Added support for the Cap'n Proto input format.
* You can now customize compression level when using the zstd algorithm. * You can now customize compression level when using the zstd algorithm.
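A small, self-contained illustration of the two new functions mentioned above:

```sql
-- getSizeOfEnumType returns how many values an Enum type declares.
SELECT getSizeOfEnumType(CAST('a' AS Enum8('a' = 1, 'b' = 2)));  -- 2

-- sumWithOverflow keeps the argument type instead of widening it,
-- so the UInt8 result below wraps around instead of reaching 400.
SELECT sumWithOverflow(toUInt8(200)) FROM (SELECT * FROM system.numbers LIMIT 2);
```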
### Backward incompatible changes: ### Backward incompatible changes:
* Creation of temporary tables with an engine other than Memory is forbidden. * Creation of temporary tables with an engine other than Memory is not allowed.
* Explicit creation of tables with the View or MaterializedView engine is forbidden. * Explicit creation of tables with the View or MaterializedView engine is not allowed.
* During table creation, a new check verifies that the sampling key expression is included in the primary key. * During table creation, a new check verifies that the sampling key expression is included in the primary key.
### Bug fixes: ### Bug fixes:
* Fixed hangups when synchronously inserting into a Distributed table. * Fixed hangups when synchronously inserting into a Distributed table.
* Fixed nonatomic adding and removing of parts in Replicated tables. * Fixed nonatomic adding and removing of parts in Replicated tables.
* Data inserted into a materialized view is not subjected to unnecessary deduplication. * Data inserted into a materialized view is not subjected to unnecessary deduplication.
@ -568,39 +600,44 @@ This release contains bug fixes for the previous release 1.1.54310:
* Users don't need access permissions to the `default` database to create temporary tables anymore. * Users don't need access permissions to the `default` database to create temporary tables anymore.
* Fixed crashing when specifying the Array type without arguments. * Fixed crashing when specifying the Array type without arguments.
* Fixed hangups when the disk volume containing server logs is full. * Fixed hangups when the disk volume containing server logs is full.
* Fixed an overflow in the `toRelativeWeekNum` function for the first week of the Unix epoch. * Fixed an overflow in the toRelativeWeekNum function for the first week of the Unix epoch.
### Build improvements: ### Build improvements:
* Several third-party libraries (notably Poco) were updated and converted to git submodules. * Several third-party libraries (notably Poco) were updated and converted to git submodules.
## ClickHouse release 1.1.54304, 2017-10-19 ## ClickHouse release 1.1.54304, 2017-10-19
### New features: ### New features:
* TLS support in the native protocol (to enable, set `tcp_ssl_port` in `config.xml`) * TLS support in the native protocol (to enable, set `tcp_ssl_port` in `config.xml`).
### Bug fixes: ### Bug fixes:
* `ALTER` for replicated tables now tries to start running as soon as possible * `ALTER` for replicated tables now tries to start running as soon as possible.
* Fixed crashing when reading data with the setting `preferred_block_size_bytes=0` * Fixed crashing when reading data with the setting `preferred_block_size_bytes=0`.
* Fixed crashes of `clickhouse-client` when `Page Down` is pressed * Fixed crashes of `clickhouse-client` when pressing `Page Down`
* Correct interpretation of certain complex queries with `GLOBAL IN` and `UNION ALL` * Correct interpretation of certain complex queries with `GLOBAL IN` and `UNION ALL`
* `FREEZE PARTITION` always works atomically now * `FREEZE PARTITION` always works atomically now.
* Empty POST requests now return a response with code 411 * Empty POST requests now return a response with code 411.
* Fixed interpretation errors for expressions like `CAST(1 AS Nullable(UInt8))` * Fixed interpretation errors for expressions like `CAST(1 AS Nullable(UInt8))`.
* Fixed an error when reading columns like `Array(Nullable(String))` from `MergeTree` tables * Fixed an error when reading `Array(Nullable(String))` columns from `MergeTree` tables.
* Fixed crashing when parsing queries like `SELECT dummy AS dummy, dummy AS b` * Fixed crashing when parsing queries like `SELECT dummy AS dummy, dummy AS b`
* Users are updated correctly when `users.xml` is invalid * Users are updated correctly with invalid `users.xml`
* Correct handling when an executable dictionary returns a non-zero response code * Correct handling when an executable dictionary returns a non-zero response code.
## ClickHouse release 1.1.54292, 2017-09-20 ## ClickHouse release 1.1.54292, 2017-09-20
### New features: ### New features:
* Added the `pointInPolygon` function for working with coordinates on a coordinate plane. * Added the `pointInPolygon` function for working with coordinates on a coordinate plane.
* Added the `sumMap` aggregate function for calculating the sum of arrays, similar to `SummingMergeTree`. * Added the `sumMap` aggregate function for calculating the sum of arrays, similar to `SummingMergeTree`.
* Added the `trunc` function. Improved performance of the rounding functions (`round`, `floor`, `ceil`, `roundToExp2`) and corrected the logic of how they work. Changed the logic of the `roundToExp2` function for fractions and negative numbers. * Added the `trunc` function. Improved performance of the rounding functions (`round`, `floor`, `ceil`, `roundToExp2`) and corrected the logic of how they work. Changed the logic of the `roundToExp2` function for fractions and negative numbers.
* The ClickHouse executable file is now less dependent on the libc version. The same ClickHouse executable file can run on a wide variety of Linux systems. Note: There is still a dependency when using compiled queries (with the setting `compile = 1`, which is not used by default). * The ClickHouse executable file is now less dependent on the libc version. The same ClickHouse executable file can run on a wide variety of Linux systems. There is still a dependency when using compiled queries (with the setting `compile = 1`, which is not used by default).
* Reduced the time needed for dynamic compilation of queries. * Reduced the time needed for dynamic compilation of queries.
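A brief sketch of the functions mentioned above (the `requests` table and its columns are hypothetical):

```sql
-- sumMap sums the value arrays positionally, keyed by the first array.
SELECT sumMap(status_codes, request_counts) FROM requests;

-- trunc and the revised rounding functions.
SELECT round(2.7), floor(-1.2), ceil(1.2), trunc(-1.7), roundToExp2(100);
```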
### Bug fixes: ### Bug fixes:
* Fixed an error that sometimes produced `part ... intersects previous part` messages and weakened replica consistency. * Fixed an error that sometimes produced `part ... intersects previous part` messages and weakened replica consistency.
* Fixed an error that caused the server to lock up if ZooKeeper was unavailable during shutdown. * Fixed an error that caused the server to lock up if ZooKeeper was unavailable during shutdown.
* Removed excessive logging when restoring replicas. * Removed excessive logging when restoring replicas.
@ -611,12 +648,13 @@ This release contains bug fixes for the previous release 1.1.54310:
## ClickHouse release 1.1.54289, 2017-09-13 ## ClickHouse release 1.1.54289, 2017-09-13
### New features: ### New features:
* `SYSTEM` queries for server administration: `SYSTEM RELOAD DICTIONARY`, `SYSTEM RELOAD DICTIONARIES`, `SYSTEM DROP DNS CACHE`, `SYSTEM SHUTDOWN`, `SYSTEM KILL`. * `SYSTEM` queries for server administration: `SYSTEM RELOAD DICTIONARY`, `SYSTEM RELOAD DICTIONARIES`, `SYSTEM DROP DNS CACHE`, `SYSTEM SHUTDOWN`, `SYSTEM KILL`.
* Added functions for working with arrays: `concat`, `arraySlice`, `arrayPushBack`, `arrayPushFront`, `arrayPopBack`, `arrayPopFront`. * Added functions for working with arrays: `concat`, `arraySlice`, `arrayPushBack`, `arrayPushFront`, `arrayPopBack`, `arrayPopFront`.
* Added the `root` and `identity` parameters for the ZooKeeper configuration. This allows you to isolate individual users on the same ZooKeeper cluster. * Added `root` and `identity` parameters for the ZooKeeper configuration. This allows you to isolate individual users on the same ZooKeeper cluster.
* Added the aggregate functions `groupBitAnd`, `groupBitOr`, and `groupBitXor` (for compatibility, they can also be accessed with the names `BIT_AND`, `BIT_OR`, and `BIT_XOR`). * Added aggregate functions `groupBitAnd`, `groupBitOr`, and `groupBitXor` (for compatibility, they are also available under the names `BIT_AND`, `BIT_OR`, and `BIT_XOR`).
* External dictionaries can be loaded from MySQL by specifying a socket in the filesystem. * External dictionaries can be loaded from MySQL by specifying a socket in the filesystem.
* External dictionaries can be loaded from MySQL over SSL (the `ssl_cert`, `ssl_key`, and `ssl_ca` parameters). * External dictionaries can be loaded from MySQL over SSL (`ssl_cert`, `ssl_key`, `ssl_ca` parameters).
* Added the `max_network_bandwidth_for_user` setting to restrict the overall bandwidth use for queries per user. * Added the `max_network_bandwidth_for_user` setting to restrict the overall bandwidth use for queries per user.
* Support for `DROP TABLE` for temporary tables. * Support for `DROP TABLE` for temporary tables.
* Support for reading `DateTime` values in Unix timestamp format from the `CSV` and `JSONEachRow` formats. * Support for reading `DateTime` values in Unix timestamp format from the `CSV` and `JSONEachRow` formats.
@ -626,6 +664,7 @@ This release contains bug fixes for the previous release 1.1.54310:
* Improved performance for queries with `DISTINCT`. * Improved performance for queries with `DISTINCT`.
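To illustrate a few of the additions listed above (the `events` table and `flags` column are hypothetical):

```sql
-- Administrative queries.
SYSTEM RELOAD DICTIONARIES;
SYSTEM DROP DNS CACHE;

-- Array manipulation functions.
SELECT arrayPushBack([1, 2], 3), arraySlice([1, 2, 3, 4], 2, 2), arrayPopFront([1, 2, 3]);

-- Bitwise aggregate functions.
SELECT groupBitAnd(flags), groupBitOr(flags), groupBitXor(flags) FROM events;
```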
### Bug fixes: ### Bug fixes:
* Improved the process for deleting old nodes in ZooKeeper. Previously, old nodes sometimes didn't get deleted if there were very frequent inserts, which caused the server to be slow to shut down, among other things. * Improved the process for deleting old nodes in ZooKeeper. Previously, old nodes sometimes didn't get deleted if there were very frequent inserts, which caused the server to be slow to shut down, among other things.
* Fixed randomization when choosing hosts for the connection to ZooKeeper. * Fixed randomization when choosing hosts for the connection to ZooKeeper.
* Fixed the exclusion of lagging replicas in distributed queries if the replica is localhost. * Fixed the exclusion of lagging replicas in distributed queries if the replica is localhost.
@ -638,30 +677,33 @@ This release contains bug fixes for the previous release 1.1.54310:
* Resolved the appearance of zombie processes when using a dictionary with an `executable` source. * Resolved the appearance of zombie processes when using a dictionary with an `executable` source.
* Fixed segfault for the HEAD query. * Fixed segfault for the HEAD query.
### Improvements to development workflow and ClickHouse build: ### Improved workflow for developing and assembling ClickHouse:
* You can use `pbuilder` to build ClickHouse. * You can use `pbuilder` to build ClickHouse.
* You can use `libc++` instead of `libstdc++` for builds on Linux. * You can use `libc++` instead of `libstdc++` for builds on Linux.
* Added instructions for using static code analysis tools: `Coverity`, `clang-tidy`, and `cppcheck`. * Added instructions for using static code analysis tools: `Coverity`, `clang-tidy`, `cppcheck`.
### Please note when upgrading: ### Please note when upgrading:
* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT requests will fail with the message "Merges are processing significantly slower than inserts." Use the `SELECT * FROM system.merges` request to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting (to do this, go to the `<merge_tree>` section in config.xml, set `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server).
* There is now a higher default value for the MergeTree setting `max_bytes_to_merge_at_max_space_in_pool` (the maximum total size of data parts to merge, in bytes): it has increased from 100 GiB to 150 GiB. This might result in large merges running after the server upgrade, which could cause an increased load on the disk subsystem. If the free space available on the server is less than twice the total amount of the merges that are running, this will cause all other merges to stop running, including merges of small data parts. As a result, INSERT requests will fail with the message "Merges are processing significantly slower than inserts." Use the `SELECT * FROM system.merges` request to monitor the situation. You can also check the `DiskSpaceReservedForMerge` metric in the `system.metrics` table, or in Graphite. You don't need to do anything to fix this, since the issue will resolve itself once the large merges finish. If you find this unacceptable, you can restore the previous value for the `max_bytes_to_merge_at_max_space_in_pool` setting. To do this, go to the `<merge_tree>` section in config.xml, set `<max_bytes_to_merge_at_max_space_in_pool>107374182400</max_bytes_to_merge_at_max_space_in_pool>` and restart the server.
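A sketch of the monitoring queries mentioned above (the column sets of the system tables can differ between versions):

```sql
-- Merges currently in progress.
SELECT * FROM system.merges;

-- Disk space currently reserved for running merges.
SELECT value FROM system.metrics WHERE metric = 'DiskSpaceReservedForMerge';
```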
## ClickHouse release 1.1.54284, 2017-08-29 ## ClickHouse release 1.1.54284, 2017-08-29
* This is a bugfix release for the previous 1.1.54282 release. It fixes a leak of ZooKeeper nodes in the `parts/` directory. * This is a bugfix release for the previous 1.1.54282 release. It fixes leaks in the parts directory in ZooKeeper.
## ClickHouse release 1.1.54282, 2017-08-23 ## ClickHouse release 1.1.54282, 2017-08-23
This is a bugfix release. The following bugs were fixed: This release contains bug fixes for the previous release 1.1.54276:
* `DB::Exception: Assertion violation: !_path.empty()` error when inserting into a Distributed table. * Fixed `DB::Exception: Assertion violation: !_path.empty()` when inserting into a Distributed table.
* Error when parsing inserted data in RowBinary format if the data begins with ';' character. * Fixed parsing when inserting in RowBinary format if input data starts with ';'.
* Errors during runtime compilation of certain aggregate functions (e.g. `groupArray()`). * Errors during runtime compilation of certain aggregate functions (e.g. `groupArray()`).
## ClickHouse release 1.1.54276, 2017-08-16 ## ClickHouse release 1.1.54276, 2017-08-16
### New features: ### New features:
* You can use an optional WITH clause in a SELECT query. Example query: `WITH 1+1 AS a SELECT a, a*a` * Added an optional WITH section for a SELECT query. Example query: `WITH 1+1 AS a SELECT a, a*a`
* INSERT can be performed synchronously in a Distributed table: OK is returned only after all the data is saved on all the shards. This is activated by the setting insert_distributed_sync=1. * INSERT can be performed synchronously in a Distributed table: OK is returned only after all the data is saved on all the shards. This is activated by the setting insert_distributed_sync=1.
* Added the UUID data type for working with 16-byte identifiers. * Added the UUID data type for working with 16-byte identifiers.
* Added aliases of CHAR, FLOAT and other types for compatibility with Tableau. * Added aliases of CHAR, FLOAT and other types for compatibility with Tableau.
@ -670,13 +712,13 @@ This is a bugfix release. The following bugs were fixed:
* Added support for non-constant arguments and negative offsets in the function `substring(str, pos, len)`. * Added support for non-constant arguments and negative offsets in the function `substring(str, pos, len)`.
* Added the max_size parameter for the `groupArray(max_size)(column)` aggregate function, and optimized its performance. * Added the max_size parameter for the `groupArray(max_size)(column)` aggregate function, and optimized its performance.
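For illustration, the WITH clause and the bounded `groupArray` mentioned above (the `visits` table and its columns are hypothetical):

```sql
WITH 1 + 1 AS a SELECT a, a * a;

-- Keep at most 10 URLs per user.
SELECT user_id, groupArray(10)(url) FROM visits GROUP BY user_id;
```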
### Major changes: ### Main changes:
* Improved security: all server files are created with 0640 permissions (can be changed via `<umask>` config parameter). * Security improvements: all server files are created with 0640 permissions (can be changed via `<umask>` config parameter).
* Improved error messages for queries with invalid syntax. * Improved error messages for queries with invalid syntax.
* Significantly reduced memory consumption and improved performance when merging large sections of MergeTree data. * Significantly reduced memory consumption and improved performance when merging large sections of MergeTree data.
* Significantly increased the performance of data merges for the ReplacingMergeTree engine. * Significantly increased the performance of data merges for the ReplacingMergeTree engine.
* Improved performance for asynchronous inserts from a Distributed table by batching multiple source inserts. To enable this functionality, use the setting distributed_directory_monitor_batch_inserts=1. * Improved performance for asynchronous inserts from a Distributed table by combining multiple source inserts. To enable this functionality, use the setting distributed_directory_monitor_batch_inserts=1.
### Backward incompatible changes: ### Backward incompatible changes:
@ -685,12 +727,12 @@ This is a bugfix release. The following bugs were fixed:
### Complete list of changes: ### Complete list of changes:
* Added the `output_format_json_quote_denormals` setting, which enables outputting nan and inf values in JSON format. * Added the `output_format_json_quote_denormals` setting, which enables outputting nan and inf values in JSON format.
* Optimized thread allocation when reading from a Distributed table. * Optimized stream allocation when reading from a Distributed table.
* Settings can be modified in readonly mode if the value doesn't change. * Settings can be configured in readonly mode if the value doesn't change.
* Added the ability to read fractional granules of the MergeTree engine in order to meet restrictions on the block size specified in the preferred_block_size_bytes setting. The purpose is to reduce the consumption of RAM and increase cache locality when processing queries from tables with large columns. * Added the ability to retrieve non-integer granules of the MergeTree engine in order to meet restrictions on the block size specified in the preferred_block_size_bytes setting. The purpose is to reduce the consumption of RAM and increase cache locality when processing queries from tables with large columns.
* Efficient use of indexes that contain expressions like `toStartOfHour(x)` for conditions like `toStartOfHour(x) op constexpr`. * Efficient use of indexes that contain expressions like `toStartOfHour(x)` for conditions like `toStartOfHour(x) op constexpr`.
* Added new settings for MergeTree engines (the merge_tree section in config.xml): * Added new settings for MergeTree engines (the merge_tree section in config.xml):
- replicated_deduplication_window_seconds sets the size of deduplication window in seconds for Replicated tables. - replicated_deduplication_window_seconds sets the number of seconds allowed for deduplicating inserts in Replicated tables.
- cleanup_delay_period sets how often to start cleanup to remove outdated data. - cleanup_delay_period sets how often to start cleanup to remove outdated data.
- replicated_can_become_leader can prevent a replica from becoming the leader (and assigning merges). - replicated_can_become_leader can prevent a replica from becoming the leader (and assigning merges).
* Accelerated cleanup to remove outdated data from ZooKeeper. * Accelerated cleanup to remove outdated data from ZooKeeper.
@ -699,11 +741,11 @@ This is a bugfix release. The following bugs were fixed:
* Added the "none" value for the compression method. * Added the "none" value for the compression method.
* You can use multiple dictionaries_config sections in config.xml. * You can use multiple dictionaries_config sections in config.xml.
* It is possible to connect to MySQL through a socket in the file system. * It is possible to connect to MySQL through a socket in the file system.
* The `system.parts` table has a new column with information about the size of marks, in bytes. * The system.parts table has a new column with information about the size of marks, in bytes.
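Two small sketches of the items above (the `events` table and `event_time` column are hypothetical):

```sql
-- Output nan/inf values in JSON formats.
SET output_format_json_quote_denormals = 1;
SELECT 0 / 0 AS nan_value, 1 / 0 AS inf_value FORMAT JSON;

-- A condition of the form toStartOfHour(x) <op> constant can now use
-- an index that contains the expression toStartOfHour(x).
SELECT count() FROM events WHERE toStartOfHour(event_time) = toStartOfHour(now());
```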
### Bug fixes: ### Bug fixes:
* Distributed tables using a Merge table now work correctly for a SELECT query with a condition on the _table field. * Distributed tables using a Merge table now work correctly for a SELECT query with a condition on the `_table` field.
* Fixed a rare race condition in ReplicatedMergeTree when checking data parts. * Fixed a rare race condition in ReplicatedMergeTree when checking data parts.
* Fixed possible freezing on "leader election" when starting a server. * Fixed possible freezing on "leader election" when starting a server.
* The max_replica_delay_for_distributed_queries setting was ignored when using a local replica of the data source. This has been fixed. * The max_replica_delay_for_distributed_queries setting was ignored when using a local replica of the data source. This has been fixed.
@ -717,11 +759,11 @@ This is a bugfix release. The following bugs were fixed:
* Too many threads were used for parallel aggregation. This has been fixed. * Too many threads were used for parallel aggregation. This has been fixed.
* Fixed how the "if" function works with FixedString arguments. * Fixed how the "if" function works with FixedString arguments.
* SELECT worked incorrectly from a Distributed table for shards with a weight of 0. This has been fixed. * SELECT worked incorrectly from a Distributed table for shards with a weight of 0. This has been fixed.
* Crashes no longer occur when running `CREATE VIEW IF EXISTS`. * Running `CREATE VIEW IF EXISTS` no longer causes crashes.
* Fixed incorrect behavior when input_format_skip_unknown_fields=1 is set and there are negative numbers. * Fixed incorrect behavior when input_format_skip_unknown_fields=1 is set and there are negative numbers.
* Fixed an infinite loop in the `dictGetHierarchy()` function if there is some invalid data in the dictionary. * Fixed an infinite loop in the `dictGetHierarchy()` function if there is some invalid data in the dictionary.
* Fixed `Syntax error: unexpected (...)` errors when running distributed queries with subqueries in an IN or JOIN clause and Merge tables. * Fixed `Syntax error: unexpected (...)` errors when running distributed queries with subqueries in an IN or JOIN clause and Merge tables.
* Fixed the incorrect interpretation of a SELECT query from Dictionary tables. * Fixed an incorrect interpretation of a SELECT query from Dictionary tables.
* Fixed the "Cannot mremap" error when using arrays in IN and JOIN clauses with more than 2 billion elements. * Fixed the "Cannot mremap" error when using arrays in IN and JOIN clauses with more than 2 billion elements.
* Fixed the failover for dictionaries with MySQL as the source. * Fixed the failover for dictionaries with MySQL as the source.
@ -735,7 +777,7 @@ This is a bugfix release. The following bugs were fixed:
### New features: ### New features:
* Distributed DDL (for example, `CREATE TABLE ON CLUSTER`). * Distributed DDL (for example, `CREATE TABLE ON CLUSTER`)
* The replicated request `ALTER TABLE CLEAR COLUMN IN PARTITION`. * The replicated request `ALTER TABLE CLEAR COLUMN IN PARTITION`.
* The engine for Dictionary tables (access to dictionary data in the form of a table). * The engine for Dictionary tables (access to dictionary data in the form of a table).
* Dictionary database engine (this type of database automatically has Dictionary tables available for all the connected external dictionaries). * Dictionary database engine (this type of database automatically has Dictionary tables available for all the connected external dictionaries).
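A sketch of the Dictionary database engine mentioned above (the dictionary name `my_dict` is hypothetical and must already be declared in the dictionary configuration):

```sql
CREATE DATABASE dictionaries ENGINE = Dictionary;

-- Every configured external dictionary is now visible as a table.
SELECT * FROM dictionaries.my_dict;
```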
@ -751,8 +793,8 @@ This is a bugfix release. The following bugs were fixed:
### Minor changes: ### Minor changes:
* If an alert is triggered, the full stack trace is printed into the log. * Now after an alert is triggered, the log prints the full stack trace.
* Relaxed the verification of the number of damaged or extra data parts at startup (there were too many false positives). * Relaxed the verification of the number of damaged/extra data parts at startup (there were too many false positives).
### Bug fixes: ### Bug fixes:
@ -762,7 +804,7 @@ This is a bugfix release. The following bugs were fixed:
* Changes in how an executable source of cached external dictionaries works. * Changes in how an executable source of cached external dictionaries works.
* Fixed the comparison of strings containing null characters. * Fixed the comparison of strings containing null characters.
* Fixed the comparison of Float32 primary key fields with constants. * Fixed the comparison of Float32 primary key fields with constants.
* Previously, an incorrect estimate of the size of a field could lead to overly large allocations. This has been fixed. * Previously, an incorrect estimate of the size of a field could lead to overly large allocations.
* Fixed a crash when querying a Nullable column added to a table using ALTER. * Fixed a crash when querying a Nullable column added to a table using ALTER.
* Fixed a crash when sorting by a Nullable column, if the number of rows is less than LIMIT. * Fixed a crash when sorting by a Nullable column, if the number of rows is less than LIMIT.
* Fixed an ORDER BY subquery consisting of only constant values. * Fixed an ORDER BY subquery consisting of only constant values.

View File

@ -180,21 +180,6 @@ if (OS_LINUX AND CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
endif () endif ()
endif () endif ()
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
# If we leave this optimization enabled, gcc-7 replaces a pair of SSE intrinsics (16 byte load, store) with a call to memcpy. It leads to slow code. This is compiler bug.
# It looks like this:
#
# (gdb) bt
#0 memcpy (destination=0x7faa6e9f1638, source=0x7faa81d9e9a8, size=16) at ../libs/libmemcpy/memcpy.h:11
#1 0x0000000005341c5f in _mm_storeu_si128 (__B=..., __P=<optimized out>) at /usr/lib/gcc/x86_64-linux-gnu/7/include/emmintrin.h:720
#2 memcpySmallAllowReadWriteOverflow15Impl (n=<optimized out>, src=<optimized out>, dst=<optimized out>) at ../dbms/src/Common/memcpySmall.h:37
#3 memcpySmallAllowReadWriteOverflow15 (n=<optimized out>, src=<optimized out>, dst=<optimized out>) at ../dbms/src/Common/memcpySmall.h:52
#4 extractKeysAndPlaceInPoolContiguous (pool=..., keys=..., key_columns=..., keys_size=<optimized out>, i=<optimized out>) at ../dbms/src/Interpreters/AggregationCommon.h:262
set (CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -fno-tree-loop-distribute-patterns")
set (CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} -fno-tree-loop-distribute-patterns")
endif ()
if (USE_STATIC_LIBRARIES AND HAVE_NO_PIE) if (USE_STATIC_LIBRARIES AND HAVE_NO_PIE)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG_NO_PIE}") set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG_NO_PIE}")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${FLAG_NO_PIE}") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${FLAG_NO_PIE}")

View File

@ -41,6 +41,18 @@ if (USE_DEBUG_HELPERS)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -include ${ClickHouse_SOURCE_DIR}/libs/libcommon/include/common/iostream_debug_helpers.h") set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -include ${ClickHouse_SOURCE_DIR}/libs/libcommon/include/common/iostream_debug_helpers.h")
endif () endif ()
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
# If we leave this optimization enabled, gcc-7 replaces a pair of SSE intrinsics (16 byte load, store) with a call to memcpy.
# It leads to slow code. This is compiler bug. It looks like this:
#
# (gdb) bt
#0 memcpy (destination=0x7faa6e9f1638, source=0x7faa81d9e9a8, size=16) at ../libs/libmemcpy/memcpy.h:11
#1 0x0000000005341c5f in _mm_storeu_si128 (__B=..., __P=<optimized out>) at /usr/lib/gcc/x86_64-linux-gnu/7/include/emmintrin.h:720
#2 memcpySmallAllowReadWriteOverflow15Impl (n=<optimized out>, src=<optimized out>, dst=<optimized out>) at ../dbms/src/Common/memcpySmall.h:37
add_definitions ("-fno-tree-loop-distribute-patterns")
endif ()
find_package (Threads) find_package (Threads)
add_subdirectory (src) add_subdirectory (src)

View File

@ -2,10 +2,10 @@
set(VERSION_REVISION 54407 CACHE STRING "") set(VERSION_REVISION 54407 CACHE STRING "")
set(VERSION_MAJOR 18 CACHE STRING "") set(VERSION_MAJOR 18 CACHE STRING "")
set(VERSION_MINOR 12 CACHE STRING "") set(VERSION_MINOR 12 CACHE STRING "")
set(VERSION_PATCH 1 CACHE STRING "") set(VERSION_PATCH 2 CACHE STRING "")
set(VERSION_GITHASH 76eaacf1be15102a732a90949739b6605d8596a1 CACHE STRING "") set(VERSION_GITHASH d12c1b02bc50119d67db2690c6bc7aeeae9d55ef CACHE STRING "")
set(VERSION_DESCRIBE v18.12.1-testing CACHE STRING "") set(VERSION_DESCRIBE v18.12.2-testing CACHE STRING "")
set(VERSION_STRING 18.12.1 CACHE STRING "") set(VERSION_STRING 18.12.2 CACHE STRING "")
# end of autochange # end of autochange
set(VERSION_EXTRA "" CACHE STRING "") set(VERSION_EXTRA "" CACHE STRING "")

View File

@ -25,7 +25,6 @@
#include <Parsers/IAST.h> #include <Parsers/IAST.h>
#include <common/ErrorHandlers.h> #include <common/ErrorHandlers.h>
#include <Common/StatusFile.h> #include <Common/StatusFile.h>
#include <Common/ThreadStatus.h>
#include <Functions/registerFunctions.h> #include <Functions/registerFunctions.h>
#include <AggregateFunctions/registerAggregateFunctions.h> #include <AggregateFunctions/registerAggregateFunctions.h>
#include <TableFunctions/registerTableFunctions.h> #include <TableFunctions/registerTableFunctions.h>

View File

@ -66,7 +66,7 @@ struct AggregateFunctionWindowFunnelData
/// either sort whole container or do so partially merging ranges afterwards /// either sort whole container or do so partially merging ranges afterwards
if (!sorted && !other.sorted) if (!sorted && !other.sorted)
std::sort(std::begin(events_list), std::end(events_list), Comparator{}); std::stable_sort(std::begin(events_list), std::end(events_list), Comparator{});
else else
{ {
const auto begin = std::begin(events_list); const auto begin = std::begin(events_list);
@ -74,10 +74,10 @@ struct AggregateFunctionWindowFunnelData
const auto end = std::end(events_list); const auto end = std::end(events_list);
if (!sorted) if (!sorted)
std::sort(begin, middle, Comparator{}); std::stable_sort(begin, middle, Comparator{});
if (!other.sorted) if (!other.sorted)
std::sort(middle, end, Comparator{}); std::stable_sort(middle, end, Comparator{});
std::inplace_merge(begin, middle, end, Comparator{}); std::inplace_merge(begin, middle, end, Comparator{});
} }
@ -89,7 +89,7 @@ struct AggregateFunctionWindowFunnelData
{ {
if (!sorted) if (!sorted)
{ {
std::sort(std::begin(events_list), std::end(events_list), Comparator{}); std::stable_sort(std::begin(events_list), std::end(events_list), Comparator{});
sorted = true; sorted = true;
} }
} }
@ -215,19 +215,13 @@ public:
void add(AggregateDataPtr place, const IColumn ** columns, const size_t row_num, Arena *) const override void add(AggregateDataPtr place, const IColumn ** columns, const size_t row_num, Arena *) const override
{ {
UInt8 event_level = 0; const auto timestamp = static_cast<const ColumnVector<UInt32> *>(columns[0])->getData()[row_num];
for (const auto i : ext::range(1, events_size + 1)) // reverse iteration and stable sorting are needed for events that are qualified by more than one condition.
for (auto i = events_size; i > 0; --i)
{ {
auto event = static_cast<const ColumnVector<UInt8> *>(columns[i])->getData()[row_num]; auto event = static_cast<const ColumnVector<UInt8> *>(columns[i])->getData()[row_num];
if (event) if (event)
{ this->data(place).add(timestamp, i);
event_level = i;
break;
}
}
if (event_level)
{
this->data(place).add(static_cast<const ColumnVector<UInt32> *>(columns[0])->getData()[row_num], event_level);
} }
} }

View File

@ -77,7 +77,7 @@ public:
const char * getFamilyName() const override { return TypeName<T>::get(); } const char * getFamilyName() const override { return TypeName<T>::get(); }
bool isNumeric() const override { return false; } bool isNumeric() const override { return false; }
bool canBeInsideNullable() const override { return false; } bool canBeInsideNullable() const override { return true; }
bool isFixedAndContiguous() const override { return true; } bool isFixedAndContiguous() const override { return true; }
size_t sizeOfValueIfFixed() const override { return sizeof(T); } size_t sizeOfValueIfFixed() const override { return sizeof(T); }

View File

@ -54,7 +54,7 @@ public:
"You must create it manulally with appropriate value or 0 for first start."); "You must create it manulally with appropriate value or 0 for first start.");
} }
int fd = open(path.c_str(), O_RDWR | O_CREAT, 0666); int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0666);
if (-1 == fd) if (-1 == fd)
DB::throwFromErrno("Cannot open file " + path); DB::throwFromErrno("Cannot open file " + path);
@ -128,7 +128,7 @@ public:
{ {
bool file_exists = Poco::File(path).exists(); bool file_exists = Poco::File(path).exists();
int fd = open(path.c_str(), O_RDWR | O_CREAT, 0666); int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0666);
if (-1 == fd) if (-1 == fd)
DB::throwFromErrno("Cannot open file " + path); DB::throwFromErrno("Cannot open file " + path);

View File

@ -1,3 +1,5 @@
#include <memory>
#include "CurrentThread.h" #include "CurrentThread.h"
#include <common/logger_useful.h> #include <common/logger_useful.h>
#include <Common/ThreadStatus.h> #include <Common/ThreadStatus.h>
@ -15,6 +17,10 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR; extern const int LOGICAL_ERROR;
} }
/// Order of current_thread and current_thread_scope matters
thread_local ThreadStatusPtr current_thread = ThreadStatus::create();
thread_local CurrentThread::ThreadScopePtr current_thread_scope = std::make_shared<CurrentThread::ThreadScope>();
void CurrentThread::updatePerformanceCounters() void CurrentThread::updatePerformanceCounters()
{ {
get()->updatePerformanceCounters(); get()->updatePerformanceCounters();
@ -33,6 +39,11 @@ ThreadStatusPtr CurrentThread::get()
return current_thread; return current_thread;
} }
CurrentThread::ThreadScopePtr CurrentThread::getScope()
{
return current_thread_scope;
}
ProfileEvents::Counters & CurrentThread::getProfileEvents() ProfileEvents::Counters & CurrentThread::getProfileEvents()
{ {
return current_thread->performance_counters; return current_thread->performance_counters;

View File

@ -1,7 +1,9 @@
#pragma once #pragma once
#include <memory> #include <memory>
#include <string> #include <string>
#include <Common/ThreadStatus.h>
namespace ProfileEvents namespace ProfileEvents
{ {
@ -16,13 +18,8 @@ namespace DB
class Context; class Context;
class QueryStatus; class QueryStatus;
class ThreadStatus;
struct Progress; struct Progress;
using ThreadStatusPtr = std::shared_ptr<ThreadStatus>;
class InternalTextLogsQueue; class InternalTextLogsQueue;
class ThreadGroupStatus;
using ThreadGroupStatusPtr = std::shared_ptr<ThreadGroupStatus>;
class CurrentThread class CurrentThread
{ {
@ -30,6 +27,7 @@ public:
/// Handler to current thread /// Handler to current thread
static ThreadStatusPtr get(); static ThreadStatusPtr get();
/// Group to which belongs current thread /// Group to which belongs current thread
static ThreadGroupStatusPtr getGroup(); static ThreadGroupStatusPtr getGroup();
@ -77,6 +75,29 @@ public:
explicit QueryScope(Context & query_context); explicit QueryScope(Context & query_context);
~QueryScope(); ~QueryScope();
}; };
public:
/// Implicitly finalizes current thread in the destructor
class ThreadScope
{
public:
void (*deleter)() = nullptr;
ThreadScope() = default;
~ThreadScope()
{
if (deleter)
deleter();
/// std::terminate on exception: this is Ok.
}
};
using ThreadScopePtr = std::shared_ptr<ThreadScope>;
static ThreadScopePtr getScope();
private:
static void defaultThreadDeleter();
}; };
} }

View File

@ -8,6 +8,7 @@
#include <Poco/Path.h> #include <Poco/Path.h>
#include <Poco/Util/AbstractConfiguration.h> #include <Poco/Util/AbstractConfiguration.h>
#include <Common/ShellCommand.h> #include <Common/ShellCommand.h>
#include <Common/config.h>
#include <common/logger_useful.h> #include <common/logger_useful.h>
#include <ext/range.h> #include <ext/range.h>
@ -36,13 +37,28 @@ ODBCBridgeHelper::ODBCBridgeHelper(
void ODBCBridgeHelper::startODBCBridge() const void ODBCBridgeHelper::startODBCBridge() const
{ {
Poco::Path path{config.getString("application.dir", "")}; Poco::Path path{config.getString("application.dir", "")};
path.setFileName("clickhouse");
path.setFileName(
#if CLICKHOUSE_SPLIT_BINARY
"clickhouse-odbc-bridge"
#else
"clickhouse"
#endif
);
if (!Poco::File(path).exists()) if (!Poco::File(path).exists())
throw Exception("clickhouse binary is not found", ErrorCodes::EXTERNAL_EXECUTABLE_NOT_FOUND); throw Exception("clickhouse binary (" + path.toString() + ") is not found", ErrorCodes::EXTERNAL_EXECUTABLE_NOT_FOUND);
std::stringstream command; std::stringstream command;
command << path.toString() << " odbc-bridge ";
command << path.toString() <<
#if CLICKHOUSE_SPLIT_BINARY
" "
#else
" odbc-bridge "
#endif
;
command << "--http-port " << config.getUInt("odbc_bridge.port", DEFAULT_PORT) << ' '; command << "--http-port " << config.getUInt("odbc_bridge.port", DEFAULT_PORT) << ' ';
command << "--listen-host " << config.getString("odbc_bridge.listen_host", DEFAULT_HOST) << ' '; command << "--listen-host " << config.getString("odbc_bridge.listen_host", DEFAULT_HOST) << ' ';
command << "--http-timeout " << http_timeout.totalMicroseconds() << ' '; command << "--http-timeout " << http_timeout.totalMicroseconds() << ' ';

View File

@ -40,7 +40,7 @@ StatusFile::StatusFile(const std::string & path_)
LOG_INFO(&Logger::get("StatusFile"), "Status file " << path << " already exists and is empty - probably unclean hardware restart."); LOG_INFO(&Logger::get("StatusFile"), "Status file " << path << " already exists and is empty - probably unclean hardware restart.");
} }
fd = open(path.c_str(), O_WRONLY | O_CREAT, 0666); fd = ::open(path.c_str(), O_WRONLY | O_CREAT, 0666);
if (-1 == fd) if (-1 == fd)
throwFromErrno("Cannot open file " + path); throwFromErrno("Cannot open file " + path);

View File

@ -1,6 +1,5 @@
#include <sstream> #include <sstream>
#include <common/Types.h>
#include <Common/CurrentThread.h> #include <Common/CurrentThread.h>
#include <Common/Exception.h> #include <Common/Exception.h>
#include <Common/ThreadProfileEvents.h> #include <Common/ThreadProfileEvents.h>
@ -21,20 +20,13 @@ namespace ErrorCodes
extern const int PTHREAD_ERROR; extern const int PTHREAD_ERROR;
} }
/// Order of current_thread and current_thread_scope matters
thread_local ThreadStatusPtr current_thread = ThreadStatus::create();
thread_local ThreadStatus::CurrentThreadScope current_thread_scope;
TasksStatsCounters TasksStatsCounters::current() TasksStatsCounters TasksStatsCounters::current()
{ {
TasksStatsCounters res; TasksStatsCounters res;
current_thread->taskstats_getter->getStat(res.stat, current_thread->os_thread_id); CurrentThread::get()->taskstats_getter->getStat(res.stat, CurrentThread::get()->os_thread_id);
return res; return res;
} }
ThreadStatus::ThreadStatus() ThreadStatus::ThreadStatus()
{ {
thread_number = Poco::ThreadNumber::get(); thread_number = Poco::ThreadNumber::get();
@ -82,6 +74,7 @@ void ThreadStatus::initPerformanceCounters()
static SimpleObjectPool<TaskStatsInfoGetter> pool; static SimpleObjectPool<TaskStatsInfoGetter> pool;
taskstats_getter = pool.getDefault(); taskstats_getter = pool.getDefault();
} }
*last_taskstats = TasksStatsCounters::current(); *last_taskstats = TasksStatsCounters::current();
} }
} }

View File

@ -166,29 +166,6 @@ protected:
/// Set to non-nullptr only if we have enough capabilities. /// Set to non-nullptr only if we have enough capabilities.
/// We use pool because creation and destruction of TaskStatsInfoGetter objects are expensive. /// We use pool because creation and destruction of TaskStatsInfoGetter objects are expensive.
SimpleObjectPool<TaskStatsInfoGetter>::Pointer taskstats_getter; SimpleObjectPool<TaskStatsInfoGetter>::Pointer taskstats_getter;
public:
/// Implicitly finalizes current thread in the destructor
class CurrentThreadScope
{
public:
void (*deleter)() = nullptr;
CurrentThreadScope() = default;
~CurrentThreadScope()
{
if (deleter)
deleter();
/// std::terminate on exception: this is Ok.
}
}; };
private:
static void defaultThreadDeleter();
};
extern thread_local ThreadStatusPtr current_thread;
extern thread_local ThreadStatus::CurrentThreadScope current_thread_scope;
} }

View File

@ -14,3 +14,4 @@
#cmakedefine01 USE_POCO_DATAODBC #cmakedefine01 USE_POCO_DATAODBC
#cmakedefine01 USE_POCO_MONGODB #cmakedefine01 USE_POCO_MONGODB
#cmakedefine01 USE_POCO_NETSSL #cmakedefine01 USE_POCO_NETSSL
#cmakedefine01 CLICKHOUSE_SPLIT_BINARY

View File

@ -1,6 +1,9 @@
#pragma once #pragma once
#include <cmath>
#include <limits> #include <limits>
#include <Common/NaNUtils.h>
#include <Core/Types.h> #include <Core/Types.h>
#include <Common/UInt128.h> #include <Common/UInt128.h>
@ -396,6 +399,8 @@ inline bool_if_safe_conversion<A, B> lessOp(A a, B b)
template <typename A, typename B> template <typename A, typename B>
inline bool_if_not_safe_conversion<A, B> lessOrEqualsOp(A a, B b) inline bool_if_not_safe_conversion<A, B> lessOrEqualsOp(A a, B b)
{ {
if (isNaN(a) || isNaN(b))
return false;
return !greaterOp(a, b); return !greaterOp(a, b);
} }
@ -409,6 +414,8 @@ inline bool_if_safe_conversion<A, B> lessOrEqualsOp(A a, B b)
template <typename A, typename B> template <typename A, typename B>
inline bool_if_not_safe_conversion<A, B> greaterOrEqualsOp(A a, B b) inline bool_if_not_safe_conversion<A, B> greaterOrEqualsOp(A a, B b)
{ {
if (isNaN(a) || isNaN(b))
return false;
return !greaterOp(b, a); return !greaterOp(b, a);
} }

View File

@ -20,8 +20,8 @@ namespace ErrorCodes
} }
bool decimalCheckComparisonOverflow(const Context & context) { return context.getSettingsRef().decimal_check_comparison_overflow; } bool decimalCheckComparisonOverflow(const Context & context) { return context.getSettingsRef().decimal_check_overflow; }
bool decimalCheckArithmeticOverflow(const Context & context) { return context.getSettingsRef().decimal_check_arithmetic_overflow; } bool decimalCheckArithmeticOverflow(const Context & context) { return context.getSettingsRef().decimal_check_overflow; }
// //

View File

@ -148,7 +148,7 @@ public:
bool canBeUsedInBooleanContext() const override { return true; } bool canBeUsedInBooleanContext() const override { return true; }
bool isNumber() const override { return true; } bool isNumber() const override { return true; }
bool isInteger() const override { return false; } bool isInteger() const override { return false; }
bool canBeInsideNullable() const override { return false; } bool canBeInsideNullable() const override { return true; }
/// Decimal specific /// Decimal specific

View File

@ -1,3 +1,4 @@
#include <IO/createReadBufferFromFileBase.h>
#include <IO/CachedCompressedReadBuffer.h> #include <IO/CachedCompressedReadBuffer.h>
#include <IO/WriteHelpers.h> #include <IO/WriteHelpers.h>
#include <IO/CompressedStream.h> #include <IO/CompressedStream.h>

View File

@ -2,7 +2,7 @@
#include <memory> #include <memory>
#include <time.h> #include <time.h>
#include <IO/createReadBufferFromFileBase.h> #include <IO/ReadBufferFromFileBase.h>
#include <IO/CompressedReadBufferBase.h> #include <IO/CompressedReadBufferBase.h>
#include <IO/UncompressedCache.h> #include <IO/UncompressedCache.h>
#include <port/clock.h> #include <port/clock.h>

View File

@ -9,7 +9,6 @@
namespace ProfileEvents namespace ProfileEvents
{ {
extern const Event FileOpen; extern const Event FileOpen;
extern const Event FileOpenFailed;
} }
namespace DB namespace DB

View File

@ -38,7 +38,7 @@ ReadBufferFromFile::ReadBufferFromFile(
if (o_direct) if (o_direct)
flags = flags & ~O_DIRECT; flags = flags & ~O_DIRECT;
#endif #endif
fd = open(file_name.c_str(), flags == -1 ? O_RDONLY : flags); fd = ::open(file_name.c_str(), flags == -1 ? O_RDONLY : flags);
if (-1 == fd) if (-1 == fd)
throwFromErrno("Cannot open file " + file_name, errno == ENOENT ? ErrorCodes::FILE_DOESNT_EXIST : ErrorCodes::CANNOT_OPEN_FILE); throwFromErrno("Cannot open file " + file_name, errno == ENOENT ? ErrorCodes::FILE_DOESNT_EXIST : ErrorCodes::CANNOT_OPEN_FILE);

View File

@ -47,6 +47,8 @@ WriteBufferAIO::WriteBufferAIO(const std::string & filename_, size_t buffer_size
flush_buffer(BufferWithOwnMemory<WriteBuffer>(this->memory.size(), nullptr, DEFAULT_AIO_FILE_BLOCK_SIZE)), flush_buffer(BufferWithOwnMemory<WriteBuffer>(this->memory.size(), nullptr, DEFAULT_AIO_FILE_BLOCK_SIZE)),
filename(filename_) filename(filename_)
{ {
ProfileEvents::increment(ProfileEvents::FileOpen);
/// Correct the buffer size information so that additional pages do not touch the base class `BufferBase`. /// Correct the buffer size information so that additional pages do not touch the base class `BufferBase`.
this->buffer().resize(this->buffer().size() - DEFAULT_AIO_FILE_BLOCK_SIZE); this->buffer().resize(this->buffer().size() - DEFAULT_AIO_FILE_BLOCK_SIZE);
this->internalBuffer().resize(this->internalBuffer().size() - DEFAULT_AIO_FILE_BLOCK_SIZE); this->internalBuffer().resize(this->internalBuffer().size() - DEFAULT_AIO_FILE_BLOCK_SIZE);

View File

@ -41,7 +41,7 @@ WriteBufferFromFile::WriteBufferFromFile(
flags = flags & ~O_DIRECT; flags = flags & ~O_DIRECT;
#endif #endif
fd = open(file_name.c_str(), flags == -1 ? O_WRONLY | O_TRUNC | O_CREAT : flags, mode); fd = ::open(file_name.c_str(), flags == -1 ? O_WRONLY | O_TRUNC | O_CREAT : flags, mode);
if (-1 == fd) if (-1 == fd)
throwFromErrno("Cannot open file " + file_name, errno == ENOENT ? ErrorCodes::FILE_DOESNT_EXIST : ErrorCodes::CANNOT_OPEN_FILE); throwFromErrno("Cannot open file " + file_name, errno == ENOENT ? ErrorCodes::FILE_DOESNT_EXIST : ErrorCodes::CANNOT_OPEN_FILE);

View File

@ -35,7 +35,7 @@ std::unique_ptr<ReadBufferFromFileBase> createReadBufferFromFileBase(const std::
ProfileEvents::increment(ProfileEvents::CreatedReadBufferAIO); ProfileEvents::increment(ProfileEvents::CreatedReadBufferAIO);
return std::make_unique<ReadBufferAIO>(filename_, buffer_size_, flags_, existing_memory_); return std::make_unique<ReadBufferAIO>(filename_, buffer_size_, flags_, existing_memory_);
#else #else
throw Exception("AIO is not implemented yet on MacOS X", ErrorCodes::NOT_IMPLEMENTED); throw Exception("AIO is not implemented yet on non-Linux OS", ErrorCodes::NOT_IMPLEMENTED);
#endif #endif
} }
} }

View File

@ -15,7 +15,8 @@ namespace DB
* If aio_threshold = 0 or estimated_size < aio_threshold, read operations are executed synchronously. * If aio_threshold = 0 or estimated_size < aio_threshold, read operations are executed synchronously.
* Otherwise, the read operations are performed asynchronously. * Otherwise, the read operations are performed asynchronously.
*/ */
std::unique_ptr<ReadBufferFromFileBase> createReadBufferFromFileBase(const std::string & filename_, std::unique_ptr<ReadBufferFromFileBase> createReadBufferFromFileBase(
const std::string & filename_,
size_t estimated_size, size_t estimated_size,
size_t aio_threshold, size_t aio_threshold,
size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE, size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE,

View File

@ -22,22 +22,22 @@ namespace ErrorCodes
} }
#endif #endif
WriteBufferFromFileBase * createWriteBufferFromFileBase(const std::string & filename_, size_t estimated_size, std::unique_ptr<WriteBufferFromFileBase> createWriteBufferFromFileBase(const std::string & filename_, size_t estimated_size,
size_t aio_threshold, size_t buffer_size_, int flags_, mode_t mode, char * existing_memory_, size_t aio_threshold, size_t buffer_size_, int flags_, mode_t mode, char * existing_memory_,
size_t alignment) size_t alignment)
{ {
if ((aio_threshold == 0) || (estimated_size < aio_threshold)) if ((aio_threshold == 0) || (estimated_size < aio_threshold))
{ {
ProfileEvents::increment(ProfileEvents::CreatedWriteBufferOrdinary); ProfileEvents::increment(ProfileEvents::CreatedWriteBufferOrdinary);
return new WriteBufferFromFile(filename_, buffer_size_, flags_, mode, existing_memory_, alignment); return std::make_unique<WriteBufferFromFile>(filename_, buffer_size_, flags_, mode, existing_memory_, alignment);
} }
else else
{ {
#if defined(__linux__) #if defined(__linux__)
ProfileEvents::increment(ProfileEvents::CreatedWriteBufferAIO); ProfileEvents::increment(ProfileEvents::CreatedWriteBufferAIO);
return new WriteBufferAIO(filename_, buffer_size_, flags_, mode, existing_memory_); return std::make_unique<WriteBufferAIO>(filename_, buffer_size_, flags_, mode, existing_memory_);
#else #else
throw Exception("AIO is not implemented yet on MacOS X", ErrorCodes::NOT_IMPLEMENTED); throw Exception("AIO is not implemented yet on non-Linux OS", ErrorCodes::NOT_IMPLEMENTED);
#endif #endif
} }
} }

View File

@ -2,6 +2,8 @@
#include <IO/WriteBufferFromFileBase.h> #include <IO/WriteBufferFromFileBase.h>
#include <string> #include <string>
#include <memory>
namespace DB namespace DB
{ {
@ -13,7 +15,8 @@ namespace DB
* If aio_threshold = 0 or estimated_size < aio_threshold, the write operations are executed synchronously. * If aio_threshold = 0 or estimated_size < aio_threshold, the write operations are executed synchronously.
* Otherwise, write operations are performed asynchronously. * Otherwise, write operations are performed asynchronously.
*/ */
WriteBufferFromFileBase * createWriteBufferFromFileBase(const std::string & filename_, std::unique_ptr<WriteBufferFromFileBase> createWriteBufferFromFileBase(
const std::string & filename_,
size_t estimated_size, size_t estimated_size,
size_t aio_threshold, size_t aio_threshold,
size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE, size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE,

View File

@ -69,7 +69,7 @@ BlockIO InterpreterDropQuery::executeToTable(String & database_name_, String & t
{ {
database_and_table.second->shutdown(); database_and_table.second->shutdown();
/// If table was already dropped by anyone, an exception will be thrown /// If table was already dropped by anyone, an exception will be thrown
auto table_lock = database_and_table.second->lockDataForAlter(__PRETTY_FUNCTION__); auto table_lock = database_and_table.second->lockForAlter(__PRETTY_FUNCTION__);
/// Drop table from memory, don't touch data and metadata /// Drop table from memory, don't touch data and metadata
database_and_table.first->detachTable(database_and_table.second->getTableName()); database_and_table.first->detachTable(database_and_table.second->getTableName());
} }
@ -78,7 +78,7 @@ BlockIO InterpreterDropQuery::executeToTable(String & database_name_, String & t
database_and_table.second->checkTableCanBeDropped(); database_and_table.second->checkTableCanBeDropped();
/// If table was already dropped by anyone, an exception will be thrown /// If table was already dropped by anyone, an exception will be thrown
auto table_lock = database_and_table.second->lockDataForAlter(__PRETTY_FUNCTION__); auto table_lock = database_and_table.second->lockForAlter(__PRETTY_FUNCTION__);
/// Drop table data, don't touch metadata /// Drop table data, don't touch metadata
database_and_table.second->truncate(query_ptr); database_and_table.second->truncate(query_ptr);
} }
@ -88,7 +88,7 @@ BlockIO InterpreterDropQuery::executeToTable(String & database_name_, String & t
database_and_table.second->shutdown(); database_and_table.second->shutdown();
/// If table was already dropped by anyone, an exception will be thrown /// If table was already dropped by anyone, an exception will be thrown
auto table_lock = database_and_table.second->lockDataForAlter(__PRETTY_FUNCTION__); auto table_lock = database_and_table.second->lockForAlter(__PRETTY_FUNCTION__);
/// Delete table metadata and table itself from memory /// Delete table metadata and table itself from memory
database_and_table.first->removeTable(context, database_and_table.second->getTableName()); database_and_table.first->removeTable(context, database_and_table.second->getTableName());
/// Delete table data /// Delete table data
@ -124,7 +124,7 @@ BlockIO InterpreterDropQuery::executeToTemporaryTable(String & table_name, ASTDr
if (kind == ASTDropQuery::Kind::Truncate) if (kind == ASTDropQuery::Kind::Truncate)
{ {
/// If table was already dropped by anyone, an exception will be thrown /// If table was already dropped by anyone, an exception will be thrown
auto table_lock = table->lockDataForAlter(__PRETTY_FUNCTION__); auto table_lock = table->lockForAlter(__PRETTY_FUNCTION__);
/// Drop table data, don't touch metadata /// Drop table data, don't touch metadata
table->truncate(query_ptr); table->truncate(query_ptr);
} }

View File

@ -13,7 +13,9 @@ class IAST;
using ASTPtr = std::shared_ptr<IAST>; using ASTPtr = std::shared_ptr<IAST>;
using DatabaseAndTable = std::pair<DatabasePtr, StoragePtr>; using DatabaseAndTable = std::pair<DatabasePtr, StoragePtr>;
/** Allow to either drop table with all its data (DROP), or remove information about table (just forget) from server (DETACH). /** Allow to either drop table with all its data (DROP),
* or remove information about table (just forget) from server (DETACH),
* or just clear all data in table (TRUNCATE).
*/ */
class InterpreterDropQuery : public IInterpreter class InterpreterDropQuery : public IInterpreter
{ {

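The expanded comment covers the three statement kinds this interpreter handles. A minimal sketch of how they differ in effect, written against the integration-test `query` helper from `helpers/cluster.py` further down in this commit (the `node` instance and table name are hypothetical):
```
# DROP removes both data and metadata; DETACH only makes the server
# forget the table (data and metadata stay on disk and can be
# re-attached); TRUNCATE keeps the table but deletes all its data.
node.query("CREATE TABLE t (x UInt32) ENGINE = MergeTree ORDER BY x")
node.query("INSERT INTO t VALUES (1)")

node.query("TRUNCATE TABLE t")               # table remains, 0 rows
assert node.query("SELECT count() FROM t") == "0\n"

node.query("DETACH TABLE t")                 # forgotten by the server
node.query("ATTACH TABLE t")                 # ...and brought back

node.query("DROP TABLE t")                   # gone, data included
```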
View File

@ -281,8 +281,7 @@ struct Settings
M(SettingBool, low_cardinality_use_single_dictionary_for_part, false, "LowCardinality type serialization setting. If is true, than will use additional keys when global dictionary overflows. Otherwise, will create several shared dictionaries.") \ M(SettingBool, low_cardinality_use_single_dictionary_for_part, false, "LowCardinality type serialization setting. If is true, than will use additional keys when global dictionary overflows. Otherwise, will create several shared dictionaries.") \
M(SettingBool, allow_experimental_low_cardinality_type, false, "Allows to create table with LowCardinality types.") \ M(SettingBool, allow_experimental_low_cardinality_type, false, "Allows to create table with LowCardinality types.") \
M(SettingBool, allow_experimental_decimal_type, false, "Enables Decimal data type.") \ M(SettingBool, allow_experimental_decimal_type, false, "Enables Decimal data type.") \
M(SettingBool, decimal_check_comparison_overflow, true, "Check overflow of decimal comparison operations") \ M(SettingBool, decimal_check_overflow, true, "Check overflow of decimal arithmetic/comparison operations") \
M(SettingBool, decimal_check_arithmetic_overflow, true, "Check overflow of decimal arithmetic operations") \
\ \
M(SettingBool, prefer_localhost_replica, 1, "1 - always send query to local replica, if it exists. 0 - choose replica to send query between local and remote ones according to load_balancing") \ M(SettingBool, prefer_localhost_replica, 1, "1 - always send query to local replica, if it exists. 0 - choose replica to send query between local and remote ones according to load_balancing") \
M(SettingUInt64, max_fetch_partition_retries_count, 5, "Amount of retries while fetching partition from another host.") \ M(SettingUInt64, max_fetch_partition_retries_count, 5, "Amount of retries while fetching partition from another host.") \

View File

@ -34,7 +34,7 @@ String ThreadStatus::getQueryID()
return {}; return {};
} }
void ThreadStatus::defaultThreadDeleter() void CurrentThread::defaultThreadDeleter()
{ {
ThreadStatus & thread = *CurrentThread::get(); ThreadStatus & thread = *CurrentThread::get();
LOG_TRACE(thread.log, "Thread " << thread.thread_number << " exited"); LOG_TRACE(thread.log, "Thread " << thread.thread_number << " exited");
@ -56,7 +56,6 @@ void ThreadStatus::initializeQuery()
initPerformanceCounters(); initPerformanceCounters();
thread_state = ThreadState::AttachedToQuery; thread_state = ThreadState::AttachedToQuery;
current_thread_scope.deleter = ThreadStatus::defaultThreadDeleter;
} }
void ThreadStatus::attachQuery(const ThreadGroupStatusPtr & thread_group_, bool check_detached) void ThreadStatus::attachQuery(const ThreadGroupStatusPtr & thread_group_, bool check_detached)
@ -94,7 +93,6 @@ void ThreadStatus::attachQuery(const ThreadGroupStatusPtr & thread_group_, bool
initPerformanceCounters(); initPerformanceCounters();
thread_state = ThreadState::AttachedToQuery; thread_state = ThreadState::AttachedToQuery;
current_thread_scope.deleter = ThreadStatus::defaultThreadDeleter;
} }
void ThreadStatus::finalizePerformanceCounters() void ThreadStatus::finalizePerformanceCounters()
@ -190,15 +188,16 @@ void ThreadStatus::logToQueryThreadLog(QueryThreadLog & thread_log)
thread_log.add(elem); thread_log.add(elem);
} }
void CurrentThread::initializeQuery() void CurrentThread::initializeQuery()
{ {
get()->initializeQuery(); get()->initializeQuery();
getScope()->deleter = CurrentThread::defaultThreadDeleter;
} }
void CurrentThread::attachTo(const ThreadGroupStatusPtr & thread_group) void CurrentThread::attachTo(const ThreadGroupStatusPtr & thread_group)
{ {
get()->attachQuery(thread_group, true); get()->attachQuery(thread_group, true);
getScope()->deleter = CurrentThread::defaultThreadDeleter;
} }
void CurrentThread::attachToIfDetached(const ThreadGroupStatusPtr & thread_group) void CurrentThread::attachToIfDetached(const ThreadGroupStatusPtr & thread_group)
@ -208,10 +207,10 @@ void CurrentThread::attachToIfDetached(const ThreadGroupStatusPtr & thread_group
std::string CurrentThread::getCurrentQueryID() std::string CurrentThread::getCurrentQueryID()
{ {
if (!current_thread || current_thread.use_count() <= 0) if (!get() || get().use_count() <= 0)
return {}; return {};
return current_thread->getQueryID(); return get()->getQueryID();
} }
void CurrentThread::attachQueryContext(Context & query_context) void CurrentThread::attachQueryContext(Context & query_context)

View File

@ -244,6 +244,7 @@ bool ParserCreateQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
query->attach = attach; query->attach = attach;
query->if_not_exists = if_not_exists; query->if_not_exists = if_not_exists;
query->cluster = cluster_str;
if (database) if (database)
query->database = typeid_cast<ASTIdentifier &>(*database).name; query->database = typeid_cast<ASTIdentifier &>(*database).name;

View File

@ -126,7 +126,7 @@ public:
return res; return res;
} }
/** Does not allow reading the table structure. It is taken for ALTER, RENAME and DROP. /** Does not allow reading the table structure. It is taken for ALTER, RENAME, DROP, and TRUNCATE.
*/ */
TableFullWriteLock lockForAlter(const std::string & who = "Alter") TableFullWriteLock lockForAlter(const std::string & who = "Alter")
{ {

View File

@ -222,18 +222,6 @@ BlockInputStreams StorageSystemColumns::read(
} }
} }
/// We should exit quickly in case of LIMIT. This helps when we have an extraordinarily huge number of tables.
std::optional<UInt64> limit;
{
const ASTSelectQuery * select = typeid_cast<const ASTSelectQuery *>(query_info.query.get());
if (!select)
throw Exception("Logical error: not a SELECT query in StorageSystemColumns::read method", ErrorCodes::LOGICAL_ERROR);
if (select->limit_length)
limit = typeid_cast<const ASTLiteral &>(*select->limit_length).value.get<UInt64>();
if (select->limit_offset)
*limit += typeid_cast<const ASTLiteral &>(*select->limit_offset).value.get<UInt64>();
}
Block block_to_filter; Block block_to_filter;
Storages storages; Storages storages;
@ -276,16 +264,6 @@ BlockInputStreams StorageSystemColumns::read(
std::forward_as_tuple(iterator->table())); std::forward_as_tuple(iterator->table()));
table_column_mut->insert(table_name); table_column_mut->insert(table_name);
++offsets[i]; ++offsets[i];
if (limit && offsets[i] >= *limit)
break;
}
if (limit && offsets[i] >= *limit)
{
offsets.resize(i);
database_column = database_column->cut(0, i);
break;
} }
} }

View File

@ -147,7 +147,7 @@ class ClickHouseCluster:
print "Mysql Started" print "Mysql Started"
return return
except Exception as ex: except Exception as ex:
print "Can't connecto to MySQL " + str(ex) print "Can't connect to MySQL " + str(ex)
time.sleep(0.5) time.sleep(0.5)
raise Exception("Cannot wait MySQL container") raise Exception("Cannot wait MySQL container")
@ -162,7 +162,7 @@ class ClickHouseCluster:
print "All instances of ZooKeeper started" print "All instances of ZooKeeper started"
return return
except Exception as ex: except Exception as ex:
print "Can't connec to to ZooKeeper " + str(ex) print "Can't connect to ZooKeeper " + str(ex)
time.sleep(0.5) time.sleep(0.5)
raise Exception("Cannot wait ZooKeeper container") raise Exception("Cannot wait ZooKeeper container")
@ -322,8 +322,24 @@ class ClickHouseInstance:
self.image = image self.image = image
# Connects to the instance via clickhouse-client, sends a query (1st argument) and returns the answer # Connects to the instance via clickhouse-client, sends a query (1st argument) and returns the answer
def query(self, *args, **kwargs): def query(self, sql, stdin=None, timeout=None, settings=None, user=None, ignore_error=False):
return self.client.query(*args, **kwargs) return self.client.query(sql, stdin, timeout, settings, user, ignore_error)
def query_with_retry(self, sql, stdin=None, timeout=None, settings=None, user=None, ignore_error=False, retry_count=20, sleep_time=0.5, check_callback=lambda x: True):
result = None
for i in range(retry_count):
try:
result = self.query(sql, stdin, timeout, settings, user, ignore_error)
if check_callback(result):
return result
time.sleep(sleep_time)
except Exception as ex:
print "Retry {} got exception {}".format(i + 1, ex)
time.sleep(sleep_time)
if result is not None:
return result
raise Exception("Can't execute query {}".format(sql))
# As query() but doesn't wait response and returns response handler # As query() but doesn't wait response and returns response handler
def get_query_request(self, *args, **kwargs): def get_query_request(self, *args, **kwargs):

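The new `query_with_retry` helper wraps `query` in a bounded retry loop, which the reworked tests below use instead of fixed `time.sleep` calls. A usage sketch (the node, query, and callback are hypothetical):
```
# Re-run the query until the callback accepts the result, retrying up
# to 20 times (the default) with a 0.5 s pause between attempts.
rows = node2.query_with_retry(
    "SELECT count() FROM test_table",
    check_callback=lambda result: int(result) >= 100,
)

# Without a callback the first successful execution is returned, which
# is still useful while a table is being re-attached or re-created.
node2.query_with_retry("ATTACH TABLE test_table", retry_count=10)
```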
View File

@ -1,17 +1,40 @@
import difflib import difflib
import time
class TSV: class TSV:
"""Helper to get pretty diffs between expected and actual tab-separated value files""" """Helper to get pretty diffs between expected and actual tab-separated value files"""
def __init__(self, contents): def __init__(self, contents):
self.lines = contents.readlines() if isinstance(contents, file) else contents.splitlines(True) raw_lines = contents.readlines() if isinstance(contents, file) else contents.splitlines(True)
self.lines = [l.strip() for l in raw_lines if l.strip()]
def __eq__(self, other): def __eq__(self, other):
return self.lines == other.lines return self.lines == other.lines
def diff(self, other): def __ne__(self, other):
return list(line.rstrip() for line in difflib.context_diff(self.lines, other.lines))[2:] return self.lines != other.lines
def diff(self, other, n1=None, n2=None):
return list(line.rstrip() for line in difflib.unified_diff(self.lines, other.lines, fromfile=n1, tofile=n2))[2:]
def __str__(self):
return '\n'.join(self.lines)
@staticmethod @staticmethod
def toMat(contents): def toMat(contents):
return [line.split("\t") for line in contents.split("\n") if line.strip()] return [line.split("\t") for line in contents.split("\n") if line.strip()]
def assert_eq_with_retry(instance, query, expectation, retry_count=20, sleep_time=0.5, stdin=None, timeout=None, settings=None, user=None, ignore_error=False):
expectation_tsv = TSV(expectation)
for i in xrange(retry_count):
try:
if TSV(instance.query(query)) == expectation_tsv:
break
time.sleep(sleep_time)
except Exception as ex:
print "assert_eq_with_retry retry {} exception {}".format(i + 1, ex)
time.sleep(sleep_time)
else:
val = TSV(instance.query(query))
if expectation_tsv != val:
raise AssertionError("'{}' != '{}'\n{}".format(expectation_tsv, val, '\n'.join(expectation_tsv.diff(val, n1="expectation", n2="query"))))

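`assert_eq_with_retry` combines `TSV` normalization with the same retry pattern, so tests can assert on eventually consistent state without hand-rolled sleep loops. A usage sketch (node and expected values are hypothetical):
```
from helpers.test_tools import assert_eq_with_retry

# Re-runs the query (20 tries, 0.5 s apart by default) until its
# TSV-normalized output matches the expectation, then asserts; because
# TSV strips trailing blank lines, '111\n222' equals '111\n222\n'.
assert_eq_with_retry(node1, "SELECT id FROM test_table ORDER BY id", "111\n222")
```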
View File

@ -5,7 +5,7 @@ import pytest
from helpers.cluster import ClickHouseCluster from helpers.cluster import ClickHouseCluster
from helpers.network import PartitionManager from helpers.network import PartitionManager
from helpers.test_tools import TSV from helpers.test_tools import assert_eq_with_retry
cluster = ClickHouseCluster(__file__) cluster = ClickHouseCluster(__file__)
@ -56,14 +56,14 @@ CREATE TABLE distributed(date Date, id UInt32, shard_id UInt32)
def test(started_cluster): def test(started_cluster):
# Check that the data has been inserted into correct tables. # Check that the data has been inserted into correct tables.
assert node1.query("SELECT id FROM shard_0.replicated") == '111\n' assert_eq_with_retry(node1, "SELECT id FROM shard_0.replicated", '111')
assert node1.query("SELECT id FROM shard_2.replicated") == '333\n' assert_eq_with_retry(node1, "SELECT id FROM shard_2.replicated", '333')
assert node2.query("SELECT id FROM shard_0.replicated") == '111\n' assert_eq_with_retry(node2, "SELECT id FROM shard_0.replicated", '111')
assert node2.query("SELECT id FROM shard_1.replicated") == '222\n' assert_eq_with_retry(node2, "SELECT id FROM shard_1.replicated", '222')
assert node3.query("SELECT id FROM shard_1.replicated") == '222\n' assert_eq_with_retry(node3, "SELECT id FROM shard_1.replicated", '222')
assert node3.query("SELECT id FROM shard_2.replicated") == '333\n' assert_eq_with_retry(node3, "SELECT id FROM shard_2.replicated", '333')
# Check that SELECT from the Distributed table works. # Check that SELECT from the Distributed table works.
expected_from_distributed = '''\ expected_from_distributed = '''\
@ -71,20 +71,20 @@ def test(started_cluster):
2017-06-16 222 1 2017-06-16 222 1
2017-06-16 333 2 2017-06-16 333 2
''' '''
assert TSV(node1.query("SELECT * FROM distributed ORDER BY id")) == TSV(expected_from_distributed) assert_eq_with_retry(node1, "SELECT * FROM distributed ORDER BY id", expected_from_distributed)
assert TSV(node2.query("SELECT * FROM distributed ORDER BY id")) == TSV(expected_from_distributed) assert_eq_with_retry(node2, "SELECT * FROM distributed ORDER BY id", expected_from_distributed)
assert TSV(node3.query("SELECT * FROM distributed ORDER BY id")) == TSV(expected_from_distributed) assert_eq_with_retry(node3, "SELECT * FROM distributed ORDER BY id", expected_from_distributed)
# Now isolate node3 from other nodes and check that SELECTs on other nodes still work. # Now isolate node3 from other nodes and check that SELECTs on other nodes still work.
with PartitionManager() as pm: with PartitionManager() as pm:
pm.partition_instances(node3, node1, action='REJECT --reject-with tcp-reset') pm.partition_instances(node3, node1, action='REJECT --reject-with tcp-reset')
pm.partition_instances(node3, node2, action='REJECT --reject-with tcp-reset') pm.partition_instances(node3, node2, action='REJECT --reject-with tcp-reset')
assert TSV(node1.query("SELECT * FROM distributed ORDER BY id")) == TSV(expected_from_distributed) assert_eq_with_retry(node1, "SELECT * FROM distributed ORDER BY id", expected_from_distributed)
assert TSV(node2.query("SELECT * FROM distributed ORDER BY id")) == TSV(expected_from_distributed) assert_eq_with_retry(node2, "SELECT * FROM distributed ORDER BY id", expected_from_distributed)
with pytest.raises(Exception): with pytest.raises(Exception):
print node3.query("SELECT * FROM distributed ORDER BY id") print node3.query_with_retry("SELECT * FROM distributed ORDER BY id", retry_count=5)
if __name__ == '__main__': if __name__ == '__main__':

View File

@ -279,7 +279,9 @@ ENGINE = Distributed(cluster_without_replication, default, merge, i)
assert TSV(instance.query("SELECT i FROM all_merge_32 ORDER BY i")) == TSV(''.join(['{}\n'.format(x) for x in xrange(4)])) assert TSV(instance.query("SELECT i FROM all_merge_32 ORDER BY i")) == TSV(''.join(['{}\n'.format(x) for x in xrange(4)]))
time.sleep(5)
ddl_check_query(instance, "ALTER TABLE merge ON CLUSTER cluster_without_replication MODIFY COLUMN i Int64") ddl_check_query(instance, "ALTER TABLE merge ON CLUSTER cluster_without_replication MODIFY COLUMN i Int64")
time.sleep(5)
ddl_check_query(instance, "ALTER TABLE merge ON CLUSTER cluster_without_replication ADD COLUMN s DEFAULT toString(i) FORMAT TSV") ddl_check_query(instance, "ALTER TABLE merge ON CLUSTER cluster_without_replication ADD COLUMN s DEFAULT toString(i) FORMAT TSV")
assert TSV(instance.query("SELECT i, s FROM all_merge_64 ORDER BY i")) == TSV(''.join(['{}\t{}\n'.format(x,x) for x in xrange(4)])) assert TSV(instance.query("SELECT i, s FROM all_merge_64 ORDER BY i")) == TSV(''.join(['{}\t{}\n'.format(x,x) for x in xrange(4)]))

View File

@ -3,6 +3,8 @@ import pytest
from helpers.cluster import ClickHouseCluster from helpers.cluster import ClickHouseCluster
from helpers.test_tools import assert_eq_with_retry
""" """
Both ssl_conf.xml and no_ssl_conf.xml have the same port Both ssl_conf.xml and no_ssl_conf.xml have the same port
""" """
@ -35,16 +37,14 @@ def both_https_cluster():
def test_both_https(both_https_cluster): def test_both_https(both_https_cluster):
node1.query("insert into test_table values ('2017-06-16', 111, 0)") node1.query("insert into test_table values ('2017-06-16', 111, 0)")
time.sleep(1)
assert node1.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node1, "SELECT id FROM test_table order by id", '111')
assert node2.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node2, "SELECT id FROM test_table order by id", '111')
node2.query("insert into test_table values ('2017-06-17', 222, 1)") node2.query("insert into test_table values ('2017-06-17', 222, 1)")
time.sleep(1)
assert node1.query("SELECT id FROM test_table order by id") == '111\n222\n' assert_eq_with_retry(node1, "SELECT id FROM test_table order by id", '111\n222')
assert node2.query("SELECT id FROM test_table order by id") == '111\n222\n' assert_eq_with_retry(node2, "SELECT id FROM test_table order by id", '111\n222')
node3 = cluster.add_instance('node3', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/no_ssl_conf.xml'], with_zookeeper=True) node3 = cluster.add_instance('node3', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/no_ssl_conf.xml'], with_zookeeper=True)
node4 = cluster.add_instance('node4', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/no_ssl_conf.xml'], with_zookeeper=True) node4 = cluster.add_instance('node4', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/no_ssl_conf.xml'], with_zookeeper=True)
@ -63,16 +63,14 @@ def both_http_cluster():
def test_both_http(both_http_cluster): def test_both_http(both_http_cluster):
node3.query("insert into test_table values ('2017-06-16', 111, 0)") node3.query("insert into test_table values ('2017-06-16', 111, 0)")
time.sleep(1)
assert node3.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node3, "SELECT id FROM test_table order by id", '111')
assert node4.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node4, "SELECT id FROM test_table order by id", '111')
node4.query("insert into test_table values ('2017-06-17', 222, 1)") node4.query("insert into test_table values ('2017-06-17', 222, 1)")
time.sleep(1)
assert node3.query("SELECT id FROM test_table order by id") == '111\n222\n' assert_eq_with_retry(node3, "SELECT id FROM test_table order by id", '111\n222')
assert node4.query("SELECT id FROM test_table order by id") == '111\n222\n' assert_eq_with_retry(node4, "SELECT id FROM test_table order by id", '111\n222')
node5 = cluster.add_instance('node5', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/ssl_conf.xml'], with_zookeeper=True) node5 = cluster.add_instance('node5', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/ssl_conf.xml'], with_zookeeper=True)
node6 = cluster.add_instance('node6', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/no_ssl_conf.xml'], with_zookeeper=True) node6 = cluster.add_instance('node6', config_dir="configs", main_configs=['configs/remote_servers.xml', 'configs/no_ssl_conf.xml'], with_zookeeper=True)
@ -91,13 +89,11 @@ def mixed_protocol_cluster():
def test_mixed_protocol(mixed_protocol_cluster): def test_mixed_protocol(mixed_protocol_cluster):
node5.query("insert into test_table values ('2017-06-16', 111, 0)") node5.query("insert into test_table values ('2017-06-16', 111, 0)")
time.sleep(1)
assert node5.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node5, "SELECT id FROM test_table order by id", '111')
assert node6.query("SELECT id FROM test_table order by id") == '' assert_eq_with_retry(node6, "SELECT id FROM test_table order by id", '')
node6.query("insert into test_table values ('2017-06-17', 222, 1)") node6.query("insert into test_table values ('2017-06-17', 222, 1)")
time.sleep(1)
assert node5.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node5, "SELECT id FROM test_table order by id", '111')
assert node6.query("SELECT id FROM test_table order by id") == '222\n' assert_eq_with_retry(node6, "SELECT id FROM test_table order by id", '222')

View File

@ -2,6 +2,7 @@ import time
import pytest import pytest
from helpers.cluster import ClickHouseCluster from helpers.cluster import ClickHouseCluster
from helpers.test_tools import assert_eq_with_retry
def fill_nodes(nodes, shard): def fill_nodes(nodes, shard):
for node in nodes: for node in nodes:
@ -40,10 +41,6 @@ def test_recovery(start_cluster):
for i in range(100): for i in range(100):
node1.query("INSERT INTO test_table VALUES (1, {})".format(i)) node1.query("INSERT INTO test_table VALUES (1, {})".format(i))
time.sleep(2) node2.query_with_retry("ATTACH TABLE test_table", check_callback=lambda x: len(node2.query("select * from test_table")) > 0)
node2.query("ATTACH TABLE test_table") assert_eq_with_retry(node2, "SELECT count(*) FROM test_table", node1.query("SELECT count(*) FROM test_table"))
time.sleep(2)
assert node1.query("SELECT count(*) FROM test_table") == node2.query("SELECT count(*) FROM test_table")

View File

@ -5,6 +5,8 @@ import sys
from helpers.cluster import ClickHouseCluster from helpers.cluster import ClickHouseCluster
from helpers.network import PartitionManager from helpers.network import PartitionManager
from helpers.test_tools import assert_eq_with_retry
cluster = ClickHouseCluster(__file__) cluster = ClickHouseCluster(__file__)
def _fill_nodes(nodes, shard): def _fill_nodes(nodes, shard):
@ -42,17 +44,15 @@ def normal_work():
def test_normal_work(normal_work): def test_normal_work(normal_work):
node1.query("insert into test_table values ('2017-06-16', 111, 0)") node1.query("insert into test_table values ('2017-06-16', 111, 0)")
node1.query("insert into real_table values ('2017-06-16', 222, 0)") node1.query("insert into real_table values ('2017-06-16', 222, 0)")
time.sleep(1)
assert node1.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node1, "SELECT id FROM test_table order by id", '111')
assert node1.query("SELECT id FROM real_table order by id") == '222\n' assert_eq_with_retry(node1, "SELECT id FROM real_table order by id", '222')
assert node2.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node2, "SELECT id FROM test_table order by id", '111')
node1.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM real_table") node1.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM real_table")
time.sleep(1)
assert node1.query("SELECT id FROM test_table order by id") == '222\n' assert_eq_with_retry(node1, "SELECT id FROM test_table order by id", '222')
assert node2.query("SELECT id FROM test_table order by id") == '222\n' assert_eq_with_retry(node2, "SELECT id FROM test_table order by id", '222')
node3 = cluster.add_instance('node3', main_configs=['configs/remote_servers.xml'], with_zookeeper=True) node3 = cluster.add_instance('node3', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
node4 = cluster.add_instance('node4', main_configs=['configs/remote_servers.xml'], with_zookeeper=True) node4 = cluster.add_instance('node4', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
@ -72,11 +72,10 @@ def drop_failover():
def test_drop_failover(drop_failover): def test_drop_failover(drop_failover):
node3.query("insert into test_table values ('2017-06-16', 111, 0)") node3.query("insert into test_table values ('2017-06-16', 111, 0)")
node3.query("insert into real_table values ('2017-06-16', 222, 0)") node3.query("insert into real_table values ('2017-06-16', 222, 0)")
time.sleep(1)
assert node3.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node3, "SELECT id FROM test_table order by id", '111')
assert node3.query("SELECT id FROM real_table order by id") == '222\n' assert_eq_with_retry(node3, "SELECT id FROM real_table order by id", '222')
assert node4.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node4, "SELECT id FROM test_table order by id", '111')
with PartitionManager() as pm: with PartitionManager() as pm:
@ -88,23 +87,18 @@ def test_drop_failover(drop_failover):
node3.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM real_table") node3.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM real_table")
# Node3 replace is ok # Node3 replace is ok
assert node3.query("SELECT id FROM test_table order by id") == '222\n' assert_eq_with_retry(node3, "SELECT id FROM test_table order by id", '222')
# Network interrupted -- replace is not ok, but it's ok # Network interrupted -- replace is not ok, but it's ok
assert node4.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node4, "SELECT id FROM test_table order by id", '111')
#Drop partition on source node #Drop partition on source node
node3.query("ALTER TABLE test_table DROP PARTITION 201706") node3.query("ALTER TABLE test_table DROP PARTITION 201706")
time.sleep(1)
# connection restored # connection restored
counter = 0
while counter < 10: # will lasts forever node4.query_with_retry("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'", check_callback=lambda x: 'Not found part' not in x, sleep_time=1)
if 'Not found part' not in node4.query("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'"):
break
time.sleep(1)
counter += 1
assert 'Not found part' not in node4.query("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'") assert 'Not found part' not in node4.query("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'")
assert node4.query("SELECT id FROM test_table order by id") == '' assert_eq_with_retry(node4, "SELECT id FROM test_table order by id", '')
node5 = cluster.add_instance('node5', main_configs=['configs/remote_servers.xml'], with_zookeeper=True) node5 = cluster.add_instance('node5', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
node6 = cluster.add_instance('node6', main_configs=['configs/remote_servers.xml'], with_zookeeper=True) node6 = cluster.add_instance('node6', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
@ -125,12 +119,11 @@ def test_replace_after_replace_failover(replace_after_replace_failover):
node5.query("insert into test_table values ('2017-06-16', 111, 0)") node5.query("insert into test_table values ('2017-06-16', 111, 0)")
node5.query("insert into real_table values ('2017-06-16', 222, 0)") node5.query("insert into real_table values ('2017-06-16', 222, 0)")
node5.query("insert into other_table values ('2017-06-16', 333, 0)") node5.query("insert into other_table values ('2017-06-16', 333, 0)")
time.sleep(1)
assert node5.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node5, "SELECT id FROM test_table order by id", '111')
assert node5.query("SELECT id FROM real_table order by id") == '222\n' assert_eq_with_retry(node5, "SELECT id FROM real_table order by id", '222')
assert node5.query("SELECT id FROM other_table order by id") == '333\n' assert_eq_with_retry(node5, "SELECT id FROM other_table order by id", '333')
assert node6.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node6, "SELECT id FROM test_table order by id", '111')
with PartitionManager() as pm: with PartitionManager() as pm:
@ -142,22 +135,15 @@ def test_replace_after_replace_failover(replace_after_replace_failover):
node5.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM real_table") node5.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM real_table")
# Node5 replace is ok # Node5 replace is ok
assert node5.query("SELECT id FROM test_table order by id") == '222\n' assert_eq_with_retry(node5, "SELECT id FROM test_table order by id", '222')
# Network interrupted -- replace is not ok, but it's ok # Network interrupted -- replace is not ok, but it's ok
assert node6.query("SELECT id FROM test_table order by id") == '111\n' assert_eq_with_retry(node6, "SELECT id FROM test_table order by id", '111')
#Replace partition on source node #Replace partition on source node
node5.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM other_table") node5.query("ALTER TABLE test_table REPLACE PARTITION 201706 FROM other_table")
assert node5.query("SELECT id FROM test_table order by id") == '333\n' assert_eq_with_retry(node5, "SELECT id FROM test_table order by id", '333')
time.sleep(1) node6.query_with_retry("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'", check_callback=lambda x: 'Not found part' not in x, sleep_time=1)
# connection restored
counter = 0
while counter < 10: # will lasts forever
if 'Not found part' not in node6.query("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'"):
break
time.sleep(1)
counter += 1
assert 'Not found part' not in node6.query("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'") assert 'Not found part' not in node6.query("select last_exception from system.replication_queue where type = 'REPLACE_RANGE'")
assert node6.query("SELECT id FROM test_table order by id") == '333\n' assert_eq_with_retry(node6, "SELECT id FROM test_table order by id", '333')

View File

@ -57,7 +57,10 @@ def test_SYSTEM_RELOAD_DICTIONARY(started_cluster):
def test_DROP_DNS_CACHE(started_cluster): def test_DROP_DNS_CACHE(started_cluster):
instance = cluster.instances['ch1'] instance = cluster.instances['ch1']
instance.exec_in_container(['bash', '-c', 'echo 127.255.255.255 lost_host > /etc/hosts'], privileged=True, user='root') instance.exec_in_container(['bash', '-c', 'echo 127.0.0.1 localhost > /etc/hosts'], privileged=True, user='root')
instance.exec_in_container(['bash', '-c', 'echo ::1 localhost >> /etc/hosts'], privileged=True, user='root')
instance.exec_in_container(['bash', '-c', 'echo 127.255.255.255 lost_host >> /etc/hosts'], privileged=True, user='root')
instance.query("SYSTEM DROP DNS CACHE") instance.query("SYSTEM DROP DNS CACHE")
with pytest.raises(QueryRuntimeException): with pytest.raises(QueryRuntimeException):
@ -67,7 +70,10 @@ def test_DROP_DNS_CACHE(started_cluster):
with pytest.raises(QueryRuntimeException): with pytest.raises(QueryRuntimeException):
instance.query("SELECT * FROM distributed_lost_host") instance.query("SELECT * FROM distributed_lost_host")
instance.exec_in_container(['bash', '-c', 'echo 127.0.0.1 lost_host > /etc/hosts'], privileged=True, user='root') instance.exec_in_container(['bash', '-c', 'echo 127.0.0.1 localhost > /etc/hosts'], privileged=True, user='root')
instance.exec_in_container(['bash', '-c', 'echo ::1 localhost >> /etc/hosts'], privileged=True, user='root')
instance.exec_in_container(['bash', '-c', 'echo 127.0.0.1 lost_host >> /etc/hosts'], privileged=True, user='root')
instance.query("SYSTEM DROP DNS CACHE") instance.query("SYSTEM DROP DNS CACHE")
instance.query("SELECT * FROM remote('lost_host', 'system', 'one')") instance.query("SELECT * FROM remote('lost_host', 'system', 'one')")

View File

@ -1,2 +1,2 @@
SET compile = 1, min_count_to_compile = 0, max_threads = 1; SET compile = 1, min_count_to_compile = 0, max_threads = 1, send_logs_level = 'none';
SELECT arrayJoin([1, 2, 1]) AS UserID, argMax('Hello', today()) AS res GROUP BY UserID; SELECT arrayJoin([1, 2, 1]) AS UserID, argMax('Hello', today()) AS res GROUP BY UserID;

View File

@ -25,6 +25,7 @@ select 5 = windowFunnel(4)(timestamp, event = 1003, event = 1004, event = 1005,
select 2 = windowFunnel(10000)(timestamp, event = 1001, event = 1008) from funnel_test2; select 2 = windowFunnel(10000)(timestamp, event = 1001, event = 1008) from funnel_test2;
select 1 = windowFunnel(10000)(timestamp, event = 1008, event = 1001) from funnel_test2; select 1 = windowFunnel(10000)(timestamp, event = 1008, event = 1001) from funnel_test2;
select 5 = windowFunnel(4)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test2; select 5 = windowFunnel(4)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test2;
select 4 = windowFunnel(4)(timestamp, event <= 1007, event >= 1002, event <= 1006, event >= 1004) from funnel_test2;
drop table funnel_test; drop table funnel_test;
drop table funnel_test2; drop table funnel_test2;

View File

@ -1,3 +1,6 @@
1 10000 1 6667
1 10000 2 3333
1 10000 1 6667
2 3333
1 6667
2 3333

View File

@ -0,0 +1,11 @@
0
1
1
1
1
1
1
1
1
1
1

View File

@ -0,0 +1,11 @@
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 0);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 1);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 2);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 3);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 4);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 5);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 6);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 7);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 8);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 9);
SELECT count() > 0 FROM (SELECT * FROM system.columns LIMIT 10);

View File

@ -58,7 +58,7 @@ SELECT 21 + j, 21 - j, 84 - j, 21 * j, -21 * j, 21 / j, 84 / j FROM test.decimal
SELECT a, -a, -b, -c, -d, -e, -f, -g, -h, -j from test.decimal ORDER BY a; SELECT a, -a, -b, -c, -d, -e, -f, -g, -h, -j from test.decimal ORDER BY a;
SELECT abs(a), abs(b), abs(c), abs(d), abs(e), abs(f), abs(g), abs(h), abs(j) from test.decimal ORDER BY a; SELECT abs(a), abs(b), abs(c), abs(d), abs(e), abs(f), abs(g), abs(h), abs(j) from test.decimal ORDER BY a;
SET decimal_check_arithmetic_overflow = 0; SET decimal_check_overflow = 0;
SELECT (h * h) != 0, (h / h) != 1 FROM test.decimal WHERE h > 0; SELECT (h * h) != 0, (h / h) != 1 FROM test.decimal WHERE h > 0;
SELECT (i * i) != 0, (i / i) = 1 FROM test.decimal WHERE i > 0; SELECT (i * i) != 0, (i / i) = 1 FROM test.decimal WHERE i > 0;

View File

@ -1,14 +1,6 @@
SET allow_experimental_decimal_type = 1; SET allow_experimental_decimal_type = 1;
SET send_logs_level = 'none'; SET send_logs_level = 'none';
CREATE TABLE IF NOT EXISTS test.x (a Nullable(Decimal(9, 2))) ENGINE = Memory; -- { serverError 43 }
CREATE TABLE IF NOT EXISTS test.x (a Nullable(Decimal(18, 2))) ENGINE = Memory; -- { serverError 43 }
CREATE TABLE IF NOT EXISTS test.x (a Nullable(Decimal(38, 2))) ENGINE = Memory; -- { serverError 43 }
SELECT toNullable(toDecimal32(0, 0)); -- { serverError 43 }
SELECT toNullable(toDecimal64(0, 0)); -- { serverError 43 }
SELECT toNullable(toDecimal128(0, 0)); -- { serverError 43 }
SELECT toDecimal32('1.1', 1), toDecimal32('1.1', 2), toDecimal32('1.1', 8); SELECT toDecimal32('1.1', 1), toDecimal32('1.1', 2), toDecimal32('1.1', 8);
SELECT toDecimal32('1.1', 0); -- { serverError 69 } SELECT toDecimal32('1.1', 0); -- { serverError 69 }
SELECT toDecimal32(1.1, 0), toDecimal32(1.1, 1), toDecimal32(1.1, 2), toDecimal32(1.1, 8); SELECT toDecimal32(1.1, 0), toDecimal32(1.1, 1), toDecimal32(1.1, 2), toDecimal32(1.1, 8);

View File

@ -0,0 +1,39 @@
32 32
64 64
128 128
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
\N \N
\N \N
1 1
1 1
\N
\N
1
1
1.10 1.10000 1.10000 1.1000 1.10000000 1.10000000
2.20 2.20000 2.20000 2.2000 \N \N
3.30 3.30000 3.30000 \N 3.30000000 \N
4.40 4.40000 4.40000 \N \N 4.40000000
5.50 5.50000 5.50000 \N \N \N
0 1
0 1
0 1
1 0
1 0
1 0
5
5
5
3
3
3
2
2
2

View File

@ -0,0 +1,64 @@
SET send_logs_level = 'none';
SET allow_experimental_decimal_type = 1;
CREATE DATABASE IF NOT EXISTS test;
DROP TABLE IF EXISTS test.decimal;
CREATE TABLE IF NOT EXISTS test.decimal
(
a DEC(9, 2),
b DEC(18, 5),
c DEC(38, 5),
d Nullable(DEC(9, 4)),
e Nullable(DEC(18, 8)),
f Nullable(DEC(38, 8))
) ENGINE = Memory;
SELECT toNullable(toDecimal32(32, 0)) AS x, assumeNotNull(x);
SELECT toNullable(toDecimal64(64, 0)) AS x, assumeNotNull(x);
SELECT toNullable(toDecimal128(128, 0)) AS x, assumeNotNull(x);
SELECT ifNull(toDecimal32(1, 0), NULL), ifNull(toDecimal64(1, 0), NULL), ifNull(toDecimal128(1, 0), NULL);
SELECT ifNull(toNullable(toDecimal32(2, 0)), NULL), ifNull(toNullable(toDecimal64(2, 0)), NULL), ifNull(toNullable(toDecimal128(2, 0)), NULL);
SELECT ifNull(NULL, toDecimal32(3, 0)), ifNull(NULL, toDecimal64(3, 0)), ifNull(NULL, toDecimal128(3, 0));
SELECT ifNull(NULL, toNullable(toDecimal32(4, 0))), ifNull(NULL, toNullable(toDecimal64(4, 0))), ifNull(NULL, toNullable(toDecimal128(4, 0)));
SELECT coalesce(toDecimal32(5, 0), NULL), coalesce(toDecimal64(5, 0), NULL), coalesce(toDecimal128(5, 0), NULL);
SELECT coalesce(NULL, toDecimal32(6, 0)), coalesce(NULL, toDecimal64(6, 0)), coalesce(NULL, toDecimal128(6, 0));
SELECT coalesce(toNullable(toDecimal32(7, 0)), NULL), coalesce(toNullable(toDecimal64(7, 0)), NULL), coalesce(toNullable(toDecimal128(7, 0)), NULL);
SELECT coalesce(NULL, toNullable(toDecimal32(8, 0))), coalesce(NULL, toNullable(toDecimal64(8, 0))), coalesce(NULL, toNullable(toDecimal128(8, 0)));
SELECT nullIf(toNullable(toDecimal32(1, 0)), toDecimal32(1, 0)), nullIf(toNullable(toDecimal64(1, 0)), toDecimal64(1, 0));
SELECT nullIf(toDecimal32(1, 0), toNullable(toDecimal32(1, 0))), nullIf(toDecimal64(1, 0), toNullable(toDecimal64(1, 0)));
SELECT nullIf(toNullable(toDecimal32(1, 0)), toDecimal32(2, 0)), nullIf(toNullable(toDecimal64(1, 0)), toDecimal64(2, 0));
SELECT nullIf(toDecimal32(1, 0), toNullable(toDecimal32(2, 0))), nullIf(toDecimal64(1, 0), toNullable(toDecimal64(2, 0)));
SELECT nullIf(toNullable(toDecimal128(1, 0)), toDecimal128(1, 0));
SELECT nullIf(toDecimal128(1, 0), toNullable(toDecimal128(1, 0)));
SELECT nullIf(toNullable(toDecimal128(1, 0)), toDecimal128(2, 0));
SELECT nullIf(toDecimal128(1, 0), toNullable(toDecimal128(2, 0)));
INSERT INTO test.decimal (a, b, c, d, e, f) VALUES (1.1, 1.1, 1.1, 1.1, 1.1, 1.1);
INSERT INTO test.decimal (a, b, c, d) VALUES (2.2, 2.2, 2.2, 2.2);
INSERT INTO test.decimal (a, b, c, e) VALUES (3.3, 3.3, 3.3, 3.3);
INSERT INTO test.decimal (a, b, c, f) VALUES (4.4, 4.4, 4.4, 4.4);
INSERT INTO test.decimal (a, b, c) VALUES (5.5, 5.5, 5.5);
SELECT * FROM test.decimal ORDER BY d, e, f;
SELECT isNull(a), isNotNull(a) FROM test.decimal WHERE a = toDecimal32(5.5, 1);
SELECT isNull(b), isNotNull(b) FROM test.decimal WHERE a = toDecimal32(5.5, 1);
SELECT isNull(c), isNotNull(c) FROM test.decimal WHERE a = toDecimal32(5.5, 1);
SELECT isNull(d), isNotNull(d) FROM test.decimal WHERE a = toDecimal32(5.5, 1);
SELECT isNull(e), isNotNull(e) FROM test.decimal WHERE a = toDecimal32(5.5, 1);
SELECT isNull(f), isNotNull(f) FROM test.decimal WHERE a = toDecimal32(5.5, 1);
SELECT count() FROM test.decimal WHERE a IS NOT NULL;
SELECT count() FROM test.decimal WHERE b IS NOT NULL;
SELECT count() FROM test.decimal WHERE c IS NOT NULL;
SELECT count() FROM test.decimal WHERE d IS NULL;
SELECT count() FROM test.decimal WHERE e IS NULL;
SELECT count() FROM test.decimal WHERE f IS NULL;
SELECT count() FROM test.decimal WHERE d IS NULL AND e IS NULL;
SELECT count() FROM test.decimal WHERE d IS NULL AND f IS NULL;
SELECT count() FROM test.decimal WHERE e IS NULL AND f IS NULL;
DROP TABLE IF EXISTS test.decimal;

View File

@ -0,0 +1,27 @@
#!/usr/bin/env bash
set -e
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
${CLICKHOUSE_CLIENT} --multiquery --query="
DROP TABLE IF EXISTS test.memory;
CREATE TABLE test.memory (x UInt64) ENGINE = Memory;
SET max_block_size = 1, min_insert_block_size_rows = 0, min_insert_block_size_bytes = 0;
INSERT INTO test.memory SELECT * FROM numbers(1000);"
${CLICKHOUSE_CLIENT} --multiquery --query="
SET max_threads = 1;
SELECT count() FROM test.memory WHERE NOT ignore(sleep(0.0001));" &
sleep 0.05;
${CLICKHOUSE_CLIENT} --multiquery --query="
TRUNCATE TABLE test.memory;
DROP TABLE test.memory;
"
wait

View File

@ -0,0 +1,31 @@
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 1
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan
nan nan nan nan nan
nan nan nan nan nan nan nan
-1 1
-1 1

View File

@ -0,0 +1,36 @@
SELECT nan = toUInt8(0), nan != toUInt8(0), nan < toUInt8(0), nan > toUInt8(0), nan <= toUInt8(0), nan >= toUInt8(0);
SELECT nan = toInt8(0), nan != toInt8(0), nan < toInt8(0), nan > toInt8(0), nan <= toInt8(0), nan >= toInt8(0);
SELECT nan = toUInt16(0), nan != toUInt16(0), nan < toUInt16(0), nan > toUInt16(0), nan <= toUInt16(0), nan >= toUInt16(0);
SELECT nan = toInt16(0), nan != toInt16(0), nan < toInt16(0), nan > toInt16(0), nan <= toInt16(0), nan >= toInt16(0);
SELECT nan = toUInt32(0), nan != toUInt32(0), nan < toUInt32(0), nan > toUInt32(0), nan <= toUInt32(0), nan >= toUInt32(0);
SELECT nan = toInt32(0), nan != toInt32(0), nan < toInt32(0), nan > toInt32(0), nan <= toInt32(0), nan >= toInt32(0);
SELECT nan = toUInt64(0), nan != toUInt64(0), nan < toUInt64(0), nan > toUInt64(0), nan <= toUInt64(0), nan >= toUInt64(0);
SELECT nan = toInt64(0), nan != toInt64(0), nan < toInt64(0), nan > toInt64(0), nan <= toInt64(0), nan >= toInt64(0);
SELECT nan = toFloat32(0.0), nan != toFloat32(0.0), nan < toFloat32(0.0), nan > toFloat32(0.0), nan <= toFloat32(0.0), nan >= toFloat32(0.0);
SELECT nan = toFloat64(0.0), nan != toFloat64(0.0), nan < toFloat64(0.0), nan > toFloat64(0.0), nan <= toFloat64(0.0), nan >= toFloat64(0.0);
SELECT -nan = toUInt8(0), -nan != toUInt8(0), -nan < toUInt8(0), -nan > toUInt8(0), -nan <= toUInt8(0), -nan >= toUInt8(0);
SELECT -nan = toInt8(0), -nan != toInt8(0), -nan < toInt8(0), -nan > toInt8(0), -nan <= toInt8(0), -nan >= toInt8(0);
SELECT -nan = toUInt16(0), -nan != toUInt16(0), -nan < toUInt16(0), -nan > toUInt16(0), -nan <= toUInt16(0), -nan >= toUInt16(0);
SELECT -nan = toInt16(0), -nan != toInt16(0), -nan < toInt16(0), -nan > toInt16(0), -nan <= toInt16(0), -nan >= toInt16(0);
SELECT -nan = toUInt32(0), -nan != toUInt32(0), -nan < toUInt32(0), -nan > toUInt32(0), -nan <= toUInt32(0), -nan >= toUInt32(0);
SELECT -nan = toInt32(0), -nan != toInt32(0), -nan < toInt32(0), -nan > toInt32(0), -nan <= toInt32(0), -nan >= toInt32(0);
SELECT -nan = toUInt64(0), -nan != toUInt64(0), -nan < toUInt64(0), -nan > toUInt64(0), -nan <= toUInt64(0), -nan >= toUInt64(0);
SELECT -nan = toInt64(0), -nan != toInt64(0), -nan < toInt64(0), -nan > toInt64(0), -nan <= toInt64(0), -nan >= toInt64(0);
SELECT -nan = toFloat32(0.0), -nan != toFloat32(0.0), -nan < toFloat32(0.0), -nan > toFloat32(0.0), -nan <= toFloat32(0.0), -nan >= toFloat32(0.0);
SELECT -nan = toFloat64(0.0), -nan != toFloat64(0.0), -nan < toFloat64(0.0), -nan > toFloat64(0.0), -nan <= toFloat64(0.0), -nan >= toFloat64(0.0);
SELECT nan = nan, nan != nan, nan = -nan, nan != -nan;
SELECT nan < nan, nan <= nan, nan < -nan, nan <= -nan;
SELECT nan > nan, nan >= nan, nan > -nan, nan >= -nan;
SELECT -nan < -nan, -nan <= -nan, -nan < nan, -nan <= nan;
SELECT -nan > -nan, -nan >= -nan, -nan > nan, -nan >= nan;
--SELECT 1 % nan, nan % 1, pow(x, 1), pow(1, x); -- TODO
SELECT 1 + nan, 1 - nan, nan - 1, 1 * nan, 1 / nan, nan / 1;
SELECT nan AS x, exp(x), exp2(x), exp10(x), log(x), log2(x), log10(x), sqrt(x), cbrt(x);
SELECT nan AS x, erf(x), erfc(x), lgamma(x), tgamma(x);
SELECT nan AS x, sin(x), cos(x), tan(x), asin(x), acos(x), atan(x);
SELECT min(x), max(x) FROM (SELECT arrayJoin([toFloat32(0.0), nan, toFloat32(1.0), toFloat32(-1.0)]) AS x);
SELECT min(x), max(x) FROM (SELECT arrayJoin([toFloat64(0.0), -nan, toFloat64(1.0), toFloat64(-1.0)]) AS x);

debian/.pbuilderrc vendored
View File

@ -189,12 +189,13 @@ export DEB_BUILD_OPTIONS=parallel=`nproc`
# chown -R $BUILDUSERID:$BUILDUSERID $CCACHEDIR # chown -R $BUILDUSERID:$BUILDUSERID $CCACHEDIR
# Do not create source package inside pbuilder (-b)
# Use current dir to make package (by default should have src archive) # Use current dir to make package (by default should have src archive)
# echo "3.0 (native)" > debian/source/format # echo "3.0 (native)" > debian/source/format
# OR # OR
# pdebuild --debbuildopts "--source-option=--format=\"3.0 (native)\"" # pdebuild -b --debbuildopts "--source-option=--format=\"3.0 (native)\""
# OR # OR
DEBBUILDOPTS="--source-option=--format=\"3.0 (native)\"" DEBBUILDOPTS="-b --source-option=--format=\"3.0 (native)\""
HOOKDIR="debian/pbuilder-hooks" HOOKDIR="debian/pbuilder-hooks"

debian/changelog vendored
View File

@ -1,5 +1,5 @@
clickhouse (18.12.1) unstable; urgency=low clickhouse (18.12.2) unstable; urgency=low
* Modified source code * Modified source code
-- <root@yandex-team.ru> Thu, 30 Aug 2018 22:28:33 +0300 -- <root@yandex-team.ru> Wed, 05 Sep 2018 00:28:49 +0300

View File

@ -5,3 +5,5 @@ tar-ignore="contrib/poco/openssl/*"
tar-ignore="contrib/poco/gradle/*" tar-ignore="contrib/poco/gradle/*"
tar-ignore="contrib/poco/Data/SQLite/*" tar-ignore="contrib/poco/Data/SQLite/*"
tar-ignore="contrib/poco/PDF/*" tar-ignore="contrib/poco/PDF/*"
compression-level=3
compression=gzip

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04 FROM ubuntu:18.04
ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=18.12.1 ARG version=18.12.2
RUN apt-get update && \ RUN apt-get update && \
apt-get install -y apt-transport-https dirmngr && \ apt-get install -y apt-transport-https dirmngr && \

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04 FROM ubuntu:18.04
ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=18.12.1 ARG version=18.12.2
RUN apt-get update && \ RUN apt-get update && \
apt-get install -y apt-transport-https dirmngr && \ apt-get install -y apt-transport-https dirmngr && \

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04 FROM ubuntu:18.04
ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=18.12.1 ARG version=18.12.2
RUN apt-get update && \ RUN apt-get update && \
apt-get install -y apt-transport-https dirmngr && \ apt-get install -y apt-transport-https dirmngr && \

View File

@ -1,5 +1,85 @@
<a name="data_type-array"></a>
# Array(T) # Array(T)
An array of elements of type T. The T type can be any type, including an array. Array of `T`-type items.
We don't recommend using multidimensional arrays, because they are not well supported (for example, you can't store multidimensional arrays in tables with a MergeTree engine).
`T` can be anything, including an array. Use multi-dimensional arrays with caution. ClickHouse has limited support for multi-dimensional arrays. For example, they can't be stored in `MergeTree` tables.
## Creating an array
You can use a function to create an array:
```
array(T)
```
You can also use square brackets.
```
[]
```
Example of creating an array:
```
:) SELECT array(1, 2) AS x, toTypeName(x)
SELECT
[1, 2] AS x,
toTypeName(x)
┌─x─────┬─toTypeName(array(1, 2))─┐
│ [1,2] │ Array(UInt8) │
└───────┴─────────────────────────┘
1 rows in set. Elapsed: 0.002 sec.
:) SELECT [1, 2] AS x, toTypeName(x)
SELECT
[1, 2] AS x,
toTypeName(x)
┌─x─────┬─toTypeName([1, 2])─┐
│ [1,2] │ Array(UInt8) │
└───────┴────────────────────┘
1 rows in set. Elapsed: 0.002 sec.
```
## Working with data types
When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [NULL](../query_language/syntax.md#null-literal) or [Nullable](nullable.md#data_type-nullable) type arguments, the type of array elements is [Nullable](nullable.md#data_type-nullable).
If ClickHouse couldn't determine the data type, it will generate an exception. For instance, this will happen when trying to create an array with strings and numbers simultaneously (`SELECT array(1, 'a')`).
Examples of automatic data type detection:
```
:) SELECT array(1, 2, NULL) AS x, toTypeName(x)
SELECT
[1, 2, NULL] AS x,
toTypeName(x)
┌─x──────────┬─toTypeName(array(1, 2, NULL))─┐
│ [1,2,NULL] │ Array(Nullable(UInt8)) │
└────────────┴───────────────────────────────┘
1 rows in set. Elapsed: 0.002 sec.
```
If you try to create an array of incompatible data types, ClickHouse throws an exception:
```
:) SELECT array(1, 'a')
SELECT [1, 'a']
Received exception from server (version 1.1.54388):
Code: 386. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: There is no supertype for types UInt8, String because some of them are String/FixedString and some of them are not.
0 rows in set. Elapsed: 0.246 sec.
```

View File

@ -1,3 +1,5 @@
<a name="data_type-datetime"></a>
# DateTime # DateTime
Date with time. Stored in four bytes as a Unix timestamp (unsigned). Allows storing values in the same range as for the Date type. The minimal value is output as 0000-00-00 00:00:00. Date with time. Stored in four bytes as a Unix timestamp (unsigned). Allows storing values in the same range as for the Date type. The minimal value is output as 0000-00-00 00:00:00.

View File

@ -1,18 +1,101 @@
# Enum <a name="data_type-enum"></a>
Enum8 or Enum16. A finite set of string values that can be stored more efficiently than the `String` data type. # Enum8, Enum16
Example: Includes the `Enum8` and `Enum16` types. An `Enum` stores a finite set of `'string' = integer` pairs. In ClickHouse, all operations with the `Enum` data type are performed as if on numbers, although the user works with string constants. This is more efficient in terms of performance than working with the `String` data type.
```text - `Enum8` is described by pairs of `'String' = Int8`.
Enum8('hello' = 1, 'world' = 2) - `Enum16` is described by pairs of `'String' = Int16`.
## Usage examples
Here we create a table with an `Enum8('hello' = 1, 'world' = 2)` type column.
```
CREATE TABLE t_enum
(
x Enum8('hello' = 1, 'world' = 2)
)
ENGINE = TinyLog
``` ```
- A data type with two possible values: 'hello' and 'world'. This column `x` can only store the values that are listed in the type definition: `'hello'` or `'world'`. If you try to save a different value, ClickHouse generates an exception.
```
:) INSERT INTO t_enum Values('hello'),('world'),('hello')
INSERT INTO t_enum VALUES
Ok.
3 rows in set. Elapsed: 0.002 sec.
:) insert into t_enum values('a')
INSERT INTO t_enum VALUES
Exception on client:
Code: 49. DB::Exception: Unknown element 'a' for type Enum8('hello' = 1, 'world' = 2)
```
When you query data from the table, ClickHouse outputs the string values from `Enum`.
```
SELECT * FROM t_enum
┌─x─────┐
│ hello │
│ world │
│ hello │
└───────┘
```
If you need to see the numeric equivalents of the values, you must cast the `Enum` value to an integer type.
```
SELECT CAST(x, 'Int8') FROM t_enum
┌─CAST(x, 'Int8')─┐
│ 1 │
│ 2 │
│ 1 │
└─────────────────┘
```
To create an Enum value in a query, you also need the `CAST` function.
```
SELECT toTypeName(CAST('a', 'Enum8(\'a\' = 1, \'b\' = 2)'))
┌─toTypeName(CAST('a', 'Enum8(\'a\' = 1, \'b\' = 2)'))─┐
│ Enum8('a' = 1, 'b' = 2) │
└──────────────────────────────────────────────────────┘
```
## General rules and usage
Each of the values is assigned a number in the range `-128 ... 127` for `Enum8` or in the range `-32768 ... 32767` for `Enum16`. All the strings and numbers must be different. An empty string is allowed. If this type is specified (in a table definition), numbers can be in an arbitrary order. However, the order does not matter. Each of the values is assigned a number in the range `-128 ... 127` for `Enum8` or in the range `-32768 ... 32767` for `Enum16`. All the strings and numbers must be different. An empty string is allowed. If this type is specified (in a table definition), numbers can be in an arbitrary order. However, the order does not matter.
In RAM, this type of column is stored in the same way as `Int8` or `Int16` of the corresponding numerical values. Neither the string nor the numeric value in an `Enum` can be [NULL](../query_language/syntax.md#null-literal).
An `Enum` can be contained in a [Nullable](nullable.md#data_type-nullable) type. So if you create a table using the query
```
CREATE TABLE t_enum_nullable
(
x Nullable( Enum8('hello' = 1, 'world' = 2) )
)
ENGINE = TinyLog
```
it can store not only `'hello'` and `'world'`, but `NULL`, as well.
```
INSERT INTO t_enum_nullable Values('hello'),('world'),(NULL)
```
In RAM, an `Enum` column is stored in the same way as `Int8` or `Int16` of the corresponding numerical values.
When reading in text form, ClickHouse parses the value as a string and looks up the corresponding numeric value among the set of `Enum` values; if it is not found, an exception is thrown. When reading in text form, ClickHouse parses the value as a string and looks up the corresponding numeric value among the set of `Enum` values; if it is not found, an exception is thrown.
When writing in text form, it writes the value as the corresponding string. If column data contains garbage (numbers that are not from the valid set), an exception is thrown. When reading and writing in binary form, it works the same way as for Int8 and Int16 data types. When writing in text form, it writes the value as the corresponding string. If column data contains garbage (numbers that are not from the valid set), an exception is thrown. When reading and writing in binary form, it works the same way as for Int8 and Int16 data types.
The implicit default value is the value with the lowest number. The implicit default value is the value with the lowest number.

View File

@ -16,6 +16,7 @@ We recommend that you store data in integer form whenever possible. For example,
```sql ```sql
SELECT 1 - 0.9 SELECT 1 - 0.9
``` ```
``` ```
┌───────minus(1, 0.9)─┐ ┌───────minus(1, 0.9)─┐
│ 0.09999999999999998 │ │ 0.09999999999999998 │

View File

@ -1,3 +1,5 @@
<a name="data_type-int"></a>
# UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64 # UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64
Fixed-length integers, with or without a sign. Fixed-length integers, with or without a sign.

View File

@ -0,0 +1,63 @@
<a name="data_type-nullable"></a>
# Nullable(TypeName)
Allows you to work with a `TypeName` value or with no value ([NULL](../query_language/syntax.md#null-literal)) in the same field, including storing `NULL` values in tables alongside `TypeName` values. For example, a `Nullable(Int8)` type column can store `Int8` type values, and the rows that don't have a value will store `NULL`.
For a `TypeName`, you can't use composite data types [Array](array.md#data_type-array) and [Tuple](tuple.md#data_type-tuple). Composite data types can contain `Nullable` type values, such as `Array(Nullable(Int8))`.
A `Nullable` type field can't be included in indexes.
`NULL` is the default value for the `Nullable` type, unless specified otherwise in the ClickHouse server configuration.
## Storage features
For storing `Nullable` type values, ClickHouse uses:
- A separate file with `NULL` masks (referred to as the mask).
- The file with the values.
The mask determines what is in a data cell: `NULL` or a value.
When the mask indicates that `NULL` is stored in a cell, the file with values stores the default value for the data type. So if the field has the type `Nullable(Int8)`, the cell will store the default value for `Int8`. This approach increases the amount of storage required.
!!! Note:
Using `Nullable` almost always reduces performance, so keep this in mind when designing your databases.
## Usage example
```
:) CREATE TABLE t_null(x Int8, y Nullable(Int8)) ENGINE TinyLog
CREATE TABLE t_null
(
x Int8,
y Nullable(Int8)
)
ENGINE = TinyLog
Ok.
0 rows in set. Elapsed: 0.012 sec.
:) INSERT INTO t_null VALUES (1, NULL), (2, 3)
INSERT INTO t_null VALUES
Ok.
2 rows in set. Elapsed: 0.007 sec.
:) SELECT x + y from t_null
SELECT x + y
FROM t_null
┌─plus(x, y)─┐
│ ᴺᵁᴸᴸ │
│ 5 │
└────────────┘
2 rows in set. Elapsed: 0.144 sec.
```

View File

@ -0,0 +1,20 @@
<a name="special_data_type-nothing"></a>
# Nothing
The only purpose of this data type is to represent [NULL](../../query_language/syntax.md#null-literal), i.e., no value.
You can't create a `Nothing` type value, because it is used where a value is not expected. For example, `NULL` is written as `Nullable(Nothing)` ([Nullable](../../data_types/nullable.md#data_type-nullable) — this is the data type that allows storing `NULL` in tables.) The `Nothing` type is also used to denote empty arrays:
```
:) SELECT toTypeName(Array())
SELECT toTypeName([])
┌─toTypeName(array())─┐
│ Array(Nothing) │
└─────────────────────┘
1 rows in set. Elapsed: 0.062 sec.
```

View File

@ -1,3 +1,5 @@
<a name="data_types-string"></a>
# String # String
Strings of an arbitrary length. The length is not limited. The value can contain an arbitrary set of bytes, including null bytes. Strings of an arbitrary length. The length is not limited. The value can contain an arbitrary set of bytes, including null bytes.

View File

@ -1,6 +1,54 @@
<a name="data_type-tuple"></a>
# Tuple(T1, T2, ...) # Tuple(T1, T2, ...)
Tuples can't be written to tables (other than Memory tables). They are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see "IN operators" and "Higher order functions". A tuple of elements of any [type](index.md#data_types). There can be one or more types of elements in a tuple.
Tuples can be output as the result of running a query. In this case, for text formats other than JSON\*, values are comma-separated in brackets. In JSON\* formats, tuples are output as arrays (in square brackets). You can't store tuples in tables (other than Memory tables). They are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see the sections [IN operators](../query_language/select.md#in_operators) and [Higher order functions](../query_language/functions/higher_order_functions.md#higher_order_functions).
Tuples can be the result of a query. In this case, for text formats other than JSON, values are comma-separated in brackets. In JSON formats, tuples are output as arrays (in square brackets).
## Creating a tuple
You can use a function to create a tuple:
```
tuple(T1, T2, ...)
```
Example of creating a tuple:
```
:) SELECT tuple(1,'a') AS x, toTypeName(x)
SELECT
(1, 'a') AS x,
toTypeName(x)
┌─x───────┬─toTypeName(tuple(1, 'a'))─┐
│ (1,'a') │ Tuple(UInt8, String) │
└─────────┴───────────────────────────┘
1 rows in set. Elapsed: 0.021 sec.
```
## Working with data types
When creating a tuple on the fly, ClickHouse automatically detects the type of each argument as the minimum of the types which can store the argument value. If the argument is [NULL](../query_language/syntax.md#null-literal), the type of the tuple element is [Nullable](nullable.md#data_type-nullable).
Example of automatic data type detection:
```
SELECT tuple(1,NULL) AS x, toTypeName(x)
SELECT
(1, NULL) AS x,
toTypeName(x)
┌─x────────┬─toTypeName(tuple(1, NULL))──────┐
│ (1,NULL) │ Tuple(UInt8, Nullable(Nothing)) │
└──────────┴─────────────────────────────────┘
1 rows in set. Elapsed: 0.002 sec.
```

View File

@ -16,7 +16,7 @@
**2.** Indents are 4 spaces. Configure your development environment so that a tab adds four spaces. **2.** Indents are 4 spaces. Configure your development environment so that a tab adds four spaces.
**3.** A left curly bracket must be separated on a new line. (And the right one, as well.) **3.** Opening and closing curly brackets must be on a separate line.
```cpp ```cpp
inline void readBoolText(bool & x, ReadBuffer & buf) inline void readBoolText(bool & x, ReadBuffer & buf)
@ -27,28 +27,30 @@ inline void readBoolText(bool & x, ReadBuffer & buf)
} }
``` ```
**4.** **4.** If the entire function body is a single `statement`, it can be placed on a single line. Place spaces around curly braces (besides the space at the end of the line).
But if the entire function body is quite short (a single statement), you can place it entirely on one line if you wish. Place spaces around curly braces (besides the space at the end of the line).
```cpp ```cpp
inline size_t mask() const { return buf_size() - 1; } inline size_t mask() const { return buf_size() - 1; }
inline size_t place(HashValue x) const { return x & mask(); } inline size_t place(HashValue x) const { return x & mask(); }
``` ```
**5.** For functions, don't put spaces around brackets. **5.** For functions, don't put spaces around brackets.
```cpp ```cpp
void reinsert(const Value & x) void reinsert(const Value & x)
```
```cpp
memcpy(&buf[place_value], &x, sizeof(x)); memcpy(&buf[place_value], &x, sizeof(x));
``` ```
**6.** When using statements such as `if`, `for`, and `while` (unlike function calls), put a space before the opening bracket. **6.** In `if`, `for`, `while` and other expressions, a space is inserted in front of the opening bracket (as opposed to function calls).
```cpp ```cpp
for (size_t i = 0; i < rows; i += storage.index_granularity) for (size_t i = 0; i < rows; i += storage.index_granularity)
``` ```
**7.** Put spaces around binary operators (`+`, `-`, `*`, `/`, `%`, ...), as well as the ternary operator `?:`. **7.** Add spaces around binary operators (`+`, `-`, `*`, `/`, `%`, ...) and the ternary operator `?:`.
```cpp ```cpp
UInt16 year = (s[0] - '0') * 1000 + (s[1] - '0') * 100 + (s[2] - '0') * 10 + (s[3] - '0'); UInt16 year = (s[0] - '0') * 1000 + (s[1] - '0') * 100 + (s[2] - '0') * 10 + (s[3] - '0');
@ -77,13 +79,13 @@ dst.ClickGoodEvent = click.GoodEvent;
If necessary, the operator can be wrapped to the next line. In this case, the offset in front of it is increased. If necessary, the operator can be wrapped to the next line. In this case, the offset in front of it is increased.
**11.** Do not use a space to separate unary operators (`-`, `+`, `*`, `&`, ...) from the argument. **11.** Do not use a space to separate unary operators (`--`, `++`, `*`, `&`, ...) from the argument.
**12.** Put a space after a comma, but not before it. The same rule goes for a semicolon inside a for expression. **12.** Put a space after a comma, but not before it. The same rule goes for a semicolon inside a `for` expression.
**13.** Do not use spaces to separate the `[]` operator. **13.** Do not use spaces to separate the `[]` operator.
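To illustrate points 12 and 13, a minimal sketch (not from the original guide; the names are hypothetical):
```cpp
#include <cstddef>

int sum3(int a, int b, int c) { return a + b + c; }   /// space after each comma, none before

int sumFirst(const int * buf, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i)    /// space after each semicolon inside the `for`
        total += buf[i];              /// no spaces around the [] operator
    return total;
}
```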
**14.** In a `template <...>` expression, use a space between `template` and `<`. No spaces after `<` or before `>`. **14.** In a `template <...>` expression, use a space between `template` and `<`; no spaces after `<` or before `>`.
```cpp ```cpp
template <typename TKey, typename TValue> template <typename TKey, typename TValue>
@ -91,7 +93,7 @@ struct AggregatedStatElement
{} {}
``` ```
**15.** In classes and structures, public, private, and protected are written on the same level as the `class/struct`, but all other internal elements should be deeper. **15.** In classes and structures, write `public`, `private`, and `protected` on the same level as `class/struct`, and indent the rest of the code.
```cpp ```cpp
template <typename T> template <typename T>
@ -104,9 +106,11 @@ public:
} }
``` ```
**16.** If the same namespace is used for the entire file, and there isn't anything else significant, an offset is not necessary inside namespace. **16.** If the same `namespace` is used for the entire file, and there isn't anything else significant, an offset is not necessary inside `namespace`.
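A minimal sketch of point 16 (the class and member names are hypothetical): the contents of a file-wide namespace are written without an extra indent.
```cpp
#include <cstddef>

namespace DB
{

class BlockReader
{
public:
    size_t rowsRead() const { return rows_read; }

private:
    size_t rows_read = 0;
};

}
```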
**17.** If the block for `if`, `for`, `while`... expressions consists of a single statement, you don't need to use curly brackets. Place the statement on a separate line, instead. The same is true for a nested if, for, while... statement. But if the inner statement contains curly brackets or else, the external block should be written in curly brackets. **17.** If the block for an `if`, `for`, `while`, or other expression consists of a single `statement`, the curly brackets are optional. Place the `statement` on a separate line, instead. This rule is also valid for nested `if`, `for`, `while`, ...
But if the inner `statement` contains curly brackets or `else`, the external block should be written in curly brackets.
```cpp ```cpp
/// Finish write. /// Finish write.
@ -114,9 +118,9 @@ for (auto & stream : streams)
stream.second->finalize(); stream.second->finalize();
``` ```
**18.** There should be any spaces at the ends of lines. **18.** There shouldn't be any spaces at the ends of lines.
**19.** Sources are UTF-8 encoded. **19.** Source files are UTF-8 encoded.
**20.** Non-ASCII characters can be used in string literals. **20.** Non-ASCII characters can be used in string literals.
@ -124,13 +128,13 @@ for (auto & stream : streams)
<< ", " << (timer.elapsed() / chunks_stats.hits) << " μsec/hit."; << ", " << (timer.elapsed() / chunks_stats.hits) << " μsec/hit.";
``` ```
**21.** Do not write multiple expressions in a single line. **21.** Do not write multiple expressions in a single line.
**22.** Group sections of code inside functions and separate them with no more than one empty line. **22.** Group sections of code inside functions and separate them with no more than one empty line.
**23.** Separate functions, classes, and so on with one or two empty lines. **23.** Separate functions, classes, and so on with one or two empty lines.
**24.** A `const` (related to a value) must be written before the type name. **24.** A `const` (related to a value) must be written before the type name.
```cpp ```cpp
//correct //correct
@ -206,7 +210,7 @@ This is very important. Writing the comment might help you realize that the code
/** Part of piece of memory, that can be used. /** Part of piece of memory, that can be used.
* For example, if internal_buffer is 1MB, and there was only 10 bytes loaded to buffer from file for reading, * For example, if internal_buffer is 1MB, and there was only 10 bytes loaded to buffer from file for reading,
* then working_buffer will have size of only 10 bytes * then working_buffer will have size of only 10 bytes
* (working_buffer.end() will point to the position right after those 10 bytes available for read). * (working_buffer.end() will point to position right after those 10 bytes available for read).
*/ */
``` ```
@ -221,8 +225,8 @@ void executeQuery(
ReadBuffer & istr, /// Where to read the query from (and data for INSERT, if applicable) ReadBuffer & istr, /// Where to read the query from (and data for INSERT, if applicable)
WriteBuffer & ostr, /// Where to write the result WriteBuffer & ostr, /// Where to write the result
Context & context, /// DB, tables, data types, engines, functions, aggregate functions... Context & context, /// DB, tables, data types, engines, functions, aggregate functions...
BlockInputStreamPtr & query_plan, /// A description of query processing can be included here BlockInputStreamPtr & query_plan, /// Here could be written the description on how query was executed
QueryProcessingStage::Enum stage = QueryProcessingStage::Complete /// The last stage to process the SELECT query to QueryProcessingStage::Enum stage = QueryProcessingStage::Complete /// Up to which stage process the SELECT query
) )
``` ```
@ -253,7 +257,7 @@ void executeQuery(
*/ */
``` ```
The example is borrowed from [http://home.tamk.fi/~jaalto/course/coding-style/doc/unmaintainable-code/](http://home.tamk.fi/~jaalto/course/coding-style/doc/unmaintainable-code/). The example is borrowed from the resource [http://home.tamk.fi/~jaalto/course/coding-style/doc/unmaintainable-code/](http://home.tamk.fi/~jaalto/course/coding-style/doc/unmaintainable-code/).
**7.** Do not write garbage comments (author, creation date ..) at the beginning of each file. **7.** Do not write garbage comments (author, creation date ..) at the beginning of each file.
@ -263,9 +267,9 @@ Note: You can use Doxygen to generate documentation from these comments. But Dox
**9.** Multi-line comments must not have empty lines at the beginning and end (except the line that closes a multi-line comment). **9.** Multi-line comments must not have empty lines at the beginning and end (except the line that closes a multi-line comment).
**10.** For commenting out code, use basic comments, not "documenting" comments. **10.** For commenting out code, use basic comments, not “documenting” comments.
**11.** Delete the commented out parts of the code before commiting. **11.** Delete the commented out parts of the code before committing.
**12.** Do not use profanity in comments or code. **12.** Do not use profanity in comments or code.
@ -275,63 +279,63 @@ Note: You can use Doxygen to generate documentation from these comments. But Dox
/// WHAT THE FAIL??? /// WHAT THE FAIL???
``` ```
**14.** Do not make delimiters from comments. **14.** Do not use comments to make delimiters.
``` ```cpp
///****************************************************** ///******************************************************
``` ```
**15.** Do not start discussions in comments. **15.** Do not start discussions in comments.
``` ```cpp
/// Why did you do this stuff? /// Why did you do this stuff?
``` ```
**16.** There's no need to write a comment at the end of a block describing what it was about. **16.** There's no need to write a comment at the end of a block describing what it was about.
``` ```cpp
/// for /// for
``` ```
## Names ## Names
**1.** The names of variables and class members use lowercase letters with underscores. **1.** Use lowercase letters with underscores in the names of variables and class members.
```cpp ```cpp
size_t max_block_size; size_t max_block_size;
``` ```
**2.** The names of functions (methods) use camelCase beginning with a lowercase letter. **2.** For the names of functions (methods), use camelCase beginning with a lowercase letter.
```cpp ```cpp
std::string getName() const override { return "Memory"; } std::string getName() const override { return "Memory"; }
``` ```
**3.** The names of classes (structures) use CamelCase beginning with an uppercase letter. Prefixes other than I are not used for interfaces. **3.** For the names of classes (structs), use CamelCase beginning with an uppercase letter. Prefixes other than I are not used for interfaces.
```cpp ```cpp
class StorageMemory : public IStorage class StorageMemory : public IStorage
``` ```
**4.** The names of usings follow the same rules as classes, or you can add _t at the end. **4.** `using` are named the same way as classes, or with `_t` on the end.
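A hypothetical illustration of point 4 (the aliases below are not from the original guide):
```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

class Connection;

using ConnectionPtr = std::shared_ptr<Connection>;          /// named like a class
using name_to_id_map_t = std::map<std::string, size_t>;     /// or with the _t suffix
```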
**5.** Names of template type arguments for simple cases: T; T, U; T1, T2. **5.** Names of template type arguments: in simple cases, use `T`; `T`, `U`; `T1`, `T2`.
For more complex cases, either follow the rules for class names, or add the prefix T. For more complex cases, either follow the rules for class names, or add the prefix `T`.
```cpp ```cpp
template <typename TKey, typename TValue> template <typename TKey, typename TValue>
struct AggregatedStatElement struct AggregatedStatElement
``` ```
**6.** Names of template constant arguments: either follow the rules for variable names, or use N in simple cases. **6.** Names of template constant arguments: either follow the rules for variable names, or use `N` in simple cases.
```cpp ```cpp
template <bool without_www> template <bool without_www>
struct ExtractDomain struct ExtractDomain
``` ```
**7.** For abstract classes (interfaces) you can add the I prefix. **7.** For abstract classes (interfaces) you can add the `I` prefix.
```cpp ```cpp
class IBlockInputStream class IBlockInputStream
@ -339,13 +343,13 @@ class IBlockInputStream
**8.** If you use a variable locally, you can use the short name. **8.** If you use a variable locally, you can use the short name.
In other cases, use a descriptive name that conveys the meaning. In all other cases, use a name that describes the meaning.
```cpp ```cpp
bool info_successfully_loaded = false; bool info_successfully_loaded = false;
``` ```
**9.** `define`s should be in ALL_CAPS with underscores. The same is true for global constants. **9.** Names of `define`s and global constants use ALL_CAPS with underscores.
```cpp ```cpp
#define MAX_SRC_TABLE_NAMES_TO_STORE 1000 #define MAX_SRC_TABLE_NAMES_TO_STORE 1000
@ -353,9 +357,9 @@ bool info_successfully_loaded = false;
**10.** File names should use the same style as their contents. **10.** File names should use the same style as their contents.
If a file contains a single class, name the file the same way as the class, in CamelCase. If a file contains a single class, name the file the same way as the class (CamelCase).
If the file contains a single function, name the file the same way as the function, in camelCase. If the file contains a single function, name the file the same way as the function (camelCase).
**11.** If the name contains an abbreviation, then: **11.** If the name contains an abbreviation, then:
@ -385,7 +389,7 @@ The underscore suffix can be omitted if the argument is not used in the construc
timer (not m_timer) timer (not m_timer)
``` ```
**14.** Constants in enums use CamelCase beginning with an uppercase letter. ALL_CAPS is also allowed. If the enum is not local, use enum class. **14.** For the constants in an `enum`, use CamelCase with a capital letter. ALL_CAPS is also acceptable. If the `enum` is non-local, use an `enum class`.
```cpp ```cpp
enum class CompressionMethod enum class CompressionMethod
@ -397,7 +401,7 @@ enum class CompressionMethod
**15.** All names must be in English. Transliteration of Russian words is not allowed. **15.** All names must be in English. Transliteration of Russian words is not allowed.
```cpp ```
not Stroka not Stroka
``` ```
@ -419,9 +423,9 @@ You can also use an abbreviation if the full name is included next to it in the
**1.** Memory management. **1.** Memory management.
Manual memory deallocation (delete) can only be used in library code. Manual memory deallocation (`delete`) can only be used in library code.
In library code, the delete operator can only be used in destructors. In library code, the `delete` operator can only be used in destructors.
In application code, memory must be freed by the object that owns it. In application code, memory must be freed by the object that owns it.
@ -429,36 +433,42 @@ Examples:
- The easiest way is to place an object on the stack, or make it a member of another class. - The easiest way is to place an object on the stack, or make it a member of another class.
- For a large number of small objects, use containers. - For a large number of small objects, use containers.
- For automatic deallocation of a small number of objects that reside in the heap, use shared_ptr/unique_ptr. - For automatic deallocation of a small number of objects that reside in the heap, use `shared_ptr/unique_ptr`.
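For the last point above, a minimal sketch (the `Block` type is hypothetical) of automatic deallocation with `unique_ptr`:
```cpp
#include <memory>
#include <vector>

struct Block
{
    std::vector<int> data;
};

void process()
{
    auto block = std::make_unique<Block>();   /// deallocated automatically when `block` goes out of scope
    block->data.push_back(42);
}
```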
**2.** Resource management. **2.** Resource management.
Use RAII and see the previous point. Use `RAII` and see above.
**3.** Error handling. **3.** Error handling.
Use exceptions. In most cases, you only need to throw an exception, and don't need to catch it (because of RAII). Use exceptions. In most cases, you only need to throw an exception, and don't need to catch it (because of `RAII`).
In offline data processing applications, it's often acceptable to not catch exceptions. In offline data processing applications, it's often acceptable to not catch exceptions.
In servers that handle user requests, it's usually enough to catch exceptions at the top level of the connection handler. In servers that handle user requests, it's usually enough to catch exceptions at the top level of the connection handler.
In thread functions, you should catch and keep all exceptions to rethrow them in the main thread after `join`.
```cpp ```cpp
/// If there were no other calculations yet, do it synchronously /// If there weren't any calculations yet, calculate the first block synchronously
if (!started) if (!started)
{ {
calculate(); calculate();
started = true; started = true;
} }
else /// If the calculations are already in progress, wait for results else /// If calculations are already in progress, wait for the result
pool.wait(); pool.wait();
if (exception) if (exception)
exception->rethrow(); exception->rethrow();
``` ```
Never hide exceptions without handling. Never just blindly put all exceptions to log. Never hide exceptions without handling. Never just blindly put all exceptions to log.
Not `catch (...) {}`. ```cpp
//Not correct
catch (...) {}
```
If you need to ignore some exceptions, do so only for specific ones and rethrow the rest. If you need to ignore some exceptions, do so only for specific ones and rethrow the rest.
@ -472,14 +482,14 @@ catch (const DB::Exception & e)
} }
``` ```
When using functions with response codes or errno, always check the result and throw an exception in case of error. When using functions with response codes or `errno`, always check the result and throw an exception in case of error.
```cpp ```cpp
if (0 != close(fd)) if (0 != close(fd))
throwFromErrno("Cannot close file " + file_name, ErrorCodes::CANNOT_CLOSE_FILE); throwFromErrno("Cannot close file " + file_name, ErrorCodes::CANNOT_CLOSE_FILE);
``` ```
Asserts are not used. Do not use `assert`.
**4.** Exception types. **4.** Exception types.
@ -491,10 +501,10 @@ This is not recommended, but it is allowed.
Use the following options: Use the following options:
- Create a (done() or finalize()) function that will do all the work in advance that might lead to an exception. If that function was called, there should be no exceptions in the destructor later. - Create a function (`done()` or `finalize()`) that will do all the work in advance that might lead to an exception. If that function was called, there should be no exceptions in the destructor later.
- Tasks that are too complex (such as sending messages over the network) can be put in separate method that the class user will have to call before destruction. - Tasks that are too complex (such as sending messages over the network) can be put in separate method that the class user will have to call before destruction.
- If there is an exception in the destructor, its better to log it than to hide it (if the logger is available). - If there is an exception in the destructor, its better to log it than to hide it (if the logger is available).
- In simple applications, it is acceptable to rely on std::terminate (for cases of noexcept by default in C++11) to handle exceptions. - In simple applications, it is acceptable to rely on `std::terminate` (for cases of `noexcept` by default in C++11) to handle exceptions.
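For the first option in the list above (a `done()` or `finalize()` method), a rough sketch under assumed names (the `Writer` class and its members are hypothetical):
```cpp
#include <cstdio>
#include <stdexcept>

class Writer
{
public:
    /// May throw; the caller invokes it explicitly before the object is destroyed.
    void finalize()
    {
        if (!flushBuffer())
            throw std::runtime_error("Cannot flush buffer");
        finalized = true;
    }

    ~Writer()
    {
        /// The destructor only logs; it never throws.
        if (!finalized)
            std::fprintf(stderr, "Writer destroyed without finalize()\n");
    }

private:
    bool flushBuffer() { return true; }   /// placeholder for the real work that may fail
    bool finalized = false;
};
```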
**6.** Anonymous code blocks. **6.** Anonymous code blocks.
@ -514,7 +524,7 @@ ready_any.set();
**7.** Multithreading. **7.** Multithreading.
For offline data processing applications: In offline data processing programs:
- Try to get the best possible performance on a single CPU core. You can then parallelize your code if necessary. - Try to get the best possible performance on a single CPU core. You can then parallelize your code if necessary.
@ -524,11 +534,11 @@ In server applications:
Fork is not used for parallelization. Fork is not used for parallelization.
**8.** Synchronizing threads. **8.** Syncing threads.
Often it is possible to make different threads use different memory cells (even better: different cache lines,) and to not use any thread synchronization (except joinAll). Often it is possible to make different threads use different memory cells (even better: different cache lines,) and to not use any thread synchronization (except `joinAll`).
If synchronization is required, in most cases, it is sufficient to use mutex under lock_guard. If synchronization is required, in most cases, it is sufficient to use mutex under `lock_guard`.
In other cases use system synchronization primitives. Do not use busy wait. In other cases use system synchronization primitives. Do not use busy wait.
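An illustrative sketch of point 8 (the counter class is hypothetical): protect shared state with a mutex held under `lock_guard`.
```cpp
#include <cstddef>
#include <mutex>

class Counter
{
public:
    void add(size_t value)
    {
        std::lock_guard<std::mutex> lock(mutex);   /// the mutex is released at the end of the scope
        total += value;
    }

    size_t get() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        return total;
    }

private:
    mutable std::mutex mutex;
    size_t total = 0;
};
```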
@ -542,40 +552,40 @@ In most cases, prefer references.
**10.** const. **10.** const.
Use constant references, pointers to constants, `const_iterator`, `const` methods. Use constant references, pointers to constants, `const_iterator`, and const methods.
Consider `const` to be default and use non-const only when necessary. Consider `const` to be default and use non-`const` only when necessary.
When passing variable by value, using `const` usually does not make sense. When passing variables by value, using `const` usually does not make sense.
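A brief hypothetical illustration of point 10, with a const method, a constant reference argument, and a `const_iterator`:
```cpp
#include <string>
#include <vector>

class Dictionary
{
public:
    bool contains(const std::string & key) const
    {
        for (std::vector<std::string>::const_iterator it = keys.begin(); it != keys.end(); ++it)
            if (*it == key)
                return true;
        return false;
    }

private:
    std::vector<std::string> keys;
};
```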
**11.** unsigned. **11.** unsigned.
Use `unsigned`, if needed. Use `unsigned` if necessary.
**12.** Numeric types **12.** Numeric types.
Use `UInt8`, `UInt16`, `UInt32`, `UInt64`, `Int8`, `Int16`, `Int32`, `Int64`, and `size_t`, `ssize_t`, `ptrdiff_t`. Use the types `UInt8`, `UInt16`, `UInt32`, `UInt64`, `Int8`, `Int16`, `Int32`, and `Int64`, as well as `size_t`, `ssize_t`, and `ptrdiff_t`.
Don't use `signed/unsigned long`, `long long`, `short`, `signed char`, `unsigned char`, or `char` types for numbers. Don't use these types for numbers: `signed/unsigned long`, `long long`, `short`, `signed/unsigned char`, `char`.
**13.** Passing arguments. **13.** Passing arguments.
Pass complex values by reference (including `std::string`). Pass complex values by reference (including `std::string`).
If a function captures ownership of an objected created in the heap, make the argument type `shared_ptr` or `unique_ptr`. If a function captures ownership of an object created in the heap, make the argument type `shared_ptr` or `unique_ptr`.
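A hypothetical sketch of point 13: complex values passed by constant reference, ownership transferred via `unique_ptr` (the functions below are illustrative, not part of the codebase):
```cpp
#include <iostream>
#include <memory>
#include <string>
#include <utility>

struct Query
{
    std::string text;
};

/// A complex value is passed by constant reference.
void logQuery(const std::string & query_text)
{
    std::cout << query_text << '\n';
}

/// The callee takes ownership of the heap-allocated object.
void enqueue(std::unique_ptr<Query> query)
{
    logQuery(query->text);
}

void example()
{
    auto query = std::make_unique<Query>();
    query->text = "SELECT 1";
    enqueue(std::move(query));   /// ownership is transferred explicitly
}
```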
**14.** Returning values. **14.** Return values.
In most cases, just use return. Do not write `[return std::move(res)]{.strike}`. In most cases, just use `return`. Do not write `[return std::move(res)]{.strike}`.
If the function allocates an object on heap and returns it, use `shared_ptr` or `unique_ptr`. If the function allocates an object on heap and returns it, use `shared_ptr` or `unique_ptr`.
In rare cases you might need to return the value via an argument. In this case, the argument should be a reference. In rare cases you might need to return the value via an argument. In this case, the argument should be a reference.
``` ```cpp
using AggregateFunctionPtr = std::shared_ptr<IAggregateFunction>; using AggregateFunctionPtr = std::shared_ptr<IAggregateFunction>;
/** Creates an aggregate function by name. /** Allows creating an aggregate function by its name.
*/ */
class AggregateFunctionFactory class AggregateFunctionFactory
{ {
@ -586,28 +596,28 @@ public:
**15.** namespace. **15.** namespace.
There is no need to use a separate namespace for application code or small libraries. There is no need to use a separate `namespace` for application code.
or small libraries. Small libraries don't need this, either.
For medium to large libraries, put everything in the namespace. For medium to large libraries, put everything in a `namespace`.
You can use the additional detail namespace in a library's `.h` file to hide implementation details. In the library's `.h` file, you can use `namespace detail` to hide implementation details not needed for the application code.
In a `.cpp` file, you can use the static or anonymous namespace to hide symbols. In a `.cpp` file, you can use a `static` or anonymous namespace to hide symbols.
You can also use namespace for enums to prevent its names from polluting the outer namespace, but its better to use the enum class. Also, a `namespace` can be used for an `enum` to prevent the corresponding names from falling into an external `namespace` (but it's better to use an `enum class`).
**16.** Delayed initialization. **16.** Deferred initialization.
If arguments are required for initialization then do not write a default constructor. If arguments are required for initialization, then you normally shouldn't write a default constructor.
If later youll need to delay initialization, you can add a default constructor that will create an invalid object. Or, for a small number of objects, you can use `shared_ptr/unique_ptr`. If later youll need to delay initialization, you can add a default constructor that will create an invalid object. Or, for a small number of objects, you can use `shared_ptr/unique_ptr`.
```cpp ```cpp
Loader(DB::Connection * connection_, const std::string & query, size_t max_block_size_); Loader(DB::Connection * connection_, const std::string & query, size_t max_block_size_);
/// For delayed initialization /// For deferred initialization
Loader() {} Loader() {}
``` ```
@ -639,11 +649,11 @@ Do not use profanity in the log.
Use UTF-8 encoding in the log. In rare cases you can use non-ASCII characters in the log. Use UTF-8 encoding in the log. In rare cases you can use non-ASCII characters in the log.
**20.** I/O. **20.** Input-output.
Don't use iostreams in internal cycles that are critical for application performance (and never use stringstream). Don't use `iostreams` in internal cycles that are critical for application performance (and never use `stringstream`).
Use the DB/IO library instead. Use the `DB/IO` library instead.
**21.** Date and time. **21.** Date and time.
@ -655,30 +665,26 @@ Always use `#pragma once` instead of include guards.
**23.** using. **23.** using.
The `using namespace` is not used. `using namespace` is not used. You can use `using` with something specific. But make it local inside a class or function.
It's fine if you are 'using' something specific, but make it local inside a class or function. **24.** Do not use `trailing return type` for functions unless necessary.
**24.** Do not use trailing return type for functions unless necessary. ```cpp
```
[auto f() -&gt; void;]{.strike} [auto f() -&gt; void;]{.strike}
``` ```
**25.** Do not declare and init variables like this: **25.** Declaration and initialization of variables.
```cpp ```cpp
//right way
std::string s = "Hello";
std::string s{"Hello"};
//wrong way
auto s = std::string{"Hello"}; auto s = std::string{"Hello"};
``` ```
Do it like this: **26.** For virtual functions, write `virtual` in the base class, but write `override` instead of `virtual` in descendent classes.
```cpp
std::string s = "Hello";
std::string s{"Hello"};
```
**26.** For virtual functions, write `virtual` in the base class, but write `override` in descendent classes.
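A minimal hypothetical example of point 26:
```cpp
#include <string>

class IDataSource
{
public:
    virtual std::string getName() const = 0;   /// `virtual` in the base class
    virtual ~IDataSource() = default;
};

class MemoryDataSource : public IDataSource
{
public:
    std::string getName() const override { return "Memory"; }   /// `override` in the descendant
};
```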
## Unused Features of C++ ## Unused Features of C++
@ -692,11 +698,11 @@ std::string s{"Hello"};
But other things being equal, cross-platform or portable code is preferred. But other things being equal, cross-platform or portable code is preferred.
**2.** The language is C++17. **2.** Language: C++17.
**3.** The compiler is `gcc`. At this time (December 2017), the code is compiled using version 7.2. (It can also be compiled using clang 5.) **3.** Compiler: `gcc`. At this time (December 2017), the code is compiled using version 7.2. (It can also be compiled using `clang 4`.)
The standard library is used (implementation of `libstdc++` or `libc++`). The standard library is used (`libstdc++` or `libc++`).
**4.** OS: Linux Ubuntu, not older than Precise. **4.** OS: Linux Ubuntu, not older than Precise.
@ -712,17 +718,17 @@ The CPU instruction set is the minimum supported set among our servers. Currentl
## Tools ## Tools
**1.** `KDevelop` is a good IDE. **1.** KDevelop is a good IDE.
**2.** For debugging, use `gdb`, `valgrind` (`memcheck`), `strace`, `-fsanitize=`, ..., `tcmalloc_minimal_debug`. **2.** For debugging, use `gdb`, `valgrind` (`memcheck`), `strace`, `-fsanitize=...`, or `tcmalloc_minimal_debug`.
**3.** For profiling, use Linux Perf `valgrind` (`callgrind`), `strace-cf`. **3.** For profiling, use `Linux Perf`, `valgrind` (`callgrind`), or `strace -cf`.
**4.** Sources are in Git. **4.** Sources are in Git.
**5.** Compilation is managed by `CMake`. **5.** Assembly uses `CMake`.
**6.** Releases are in `deb` packages. **6.** Programs are released using `deb` packages.
**7.** Commits to master must not break the build. **7.** Commits to master must not break the build.
@ -732,15 +738,15 @@ Though only selected revisions are considered workable.
Use branches for this purpose. Use branches for this purpose.
If your code is not buildable yet, exclude it from the build before pushing to master. You'll need to finish it or remove it from master within a few days. If your code in the `master` branch is not buildable yet, exclude it from the build before the `push`. You'll need to finish it or remove it within a few days.
**9.** For non-trivial changes, used branches and publish them on the server. **9.** For non-trivial changes, use branches and publish them on the server.
**10.** Unused code is removed from the repository. **10.** Unused code is removed from the repository.
## Libraries ## Libraries
**1.** The C++14 standard library is used (experimental extensions are fine), as well as boost and Poco frameworks. **1.** The C++14 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
**2.** If necessary, you can use any well-known libraries available in the OS package. **2.** If necessary, you can use any well-known libraries available in the OS package.
@ -750,9 +756,9 @@ If there is a good solution already available, then use it, even if it means you
**3.** You can install a library that isn't in the packages, if the packages don't have what you need or have an outdated version or the wrong type of compilation. **3.** You can install a library that isn't in the packages, if the packages don't have what you need or have an outdated version or the wrong type of compilation.
**4.** If the library is small and doesn't have its own complex build system, put the source files in the contrib folder. **4.** If the library is small and doesn't have its own complex build system, put the source files in the `contrib` folder.
**5.** Preference is always given to libraries that are already used. **5.** Preference is always given to libraries that are already in use.
## General Recommendations ## General Recommendations
@ -762,31 +768,31 @@ If there is a good solution already available, then use it, even if it means you
**3.** Don't write code until you know how it's going to work and how the inner loop will function. **3.** Don't write code until you know how it's going to work and how the inner loop will function.
**4.** In the simplest cases, use 'using' instead of classes or structs. **4.** In the simplest cases, use `using` instead of classes or structs.
**5.** If possible, do not write copy constructors, assignment operators, destructors (other than a virtual one, if the class contains at least one virtual function), mpve-constructors and move assignment operators. In other words, the compiler-generated functions must work correctly. You can use 'default'. **5.** If possible, do not write copy constructors, assignment operators, destructors (other than a virtual one, if the class contains at least one virtual function), move constructors or move assignment operators. In other words, the compiler-generated functions must work correctly. You can use `default`.
**6.** Code simplification is encouraged. Reduce the size of your code where possible. **6.** Code simplification is encouraged. Reduce the size of your code where possible.
## Additional Recommendations ## Additional Recommendations
**1.** Explicit `std::` for types from `stddef.h` is not recommended. **1.** Explicitly specifying `std::` for types from `stddef.h`
We recommend writing `size_t` instead of `std::size_t` because it's shorter. is not recommended. In other words, we recommend writing `size_t` instead of `std::size_t`, because it's shorter.
But if you prefer, `std::` is acceptable. It is acceptable to add `std::`.
**2.** Explicit `std::` for functions from the standard C library is not recommended. **2.** Explicitly specifying `std::` for functions from the standard C library
Write `memcpy` instead of `std::memcpy`. is not recommended. In other words, write `memcpy` instead of `std::memcpy`.
The reason is that there are similar non-standard functions, such as `memmem`. We do use these functions on occasion. These functions do not exist in namespace `std`. The reason is that there are similar non-standard functions, such as `memmem`. We do use these functions on occasion. These functions do not exist in `namespace std`.
If you write `std::memcpy` instead of `memcpy` everywhere, then `memmem` without `std::` will look awkward. If you write `std::memcpy` instead of `memcpy` everywhere, then `memmem` without `std::` will look strange.
Nevertheless, `std::` is allowed if you prefer it. Nevertheless, you can still use `std::` if you prefer it.
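A small sketch reflecting recommendations 1 and 2 (the function is hypothetical; assumes the implementation provides the global `size_t` and `memcpy` names, as gcc does):
```cpp
#include <cstddef>
#include <cstring>

void copyPrefix(char * dst, const char * src, size_t n)   /// size_t rather than std::size_t
{
    memcpy(dst, src, n);   /// memcpy rather than std::memcpy
}
```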
**3.** Using functions from C when the ones are available in the standard C++ library. **3.** Using functions from C when the same ones are available in the standard C++ library.
This is acceptable if it is more efficient. This is acceptable if it is more efficient.

View File

@ -2,11 +2,12 @@
## Why Not Use Something Like MapReduce? ## Why Not Use Something Like MapReduce?
We can refer to systems like MapReduce as distributed computing systems in which the reduce operation is based on a distributed sort. The most common opensource solution of this kind is [Apache Hadoop](http://hadoop.apache.org), while Yandex internally uses it's own MapReduce implementation — YT. We can refer to systems like MapReduce as distributed computing systems in which the reduce operation is based on distributed sorting. The most common open source solution in this class is [Apache Hadoop](http://hadoop.apache.org). Yandex uses their in-house solution, YT.
The systems of this kind are not suitable for online queries due to their high latency. In other words, they can't be used as the back-end for a web interface. These systems aren't appropriate for online queries due to their high latency. In other words, they can't be used as the back-end for a web interface.
These types of systems aren't useful for real-time data updates.
Distributed sorting isn't the best way to perform reduce operations if the result of the operation and all the intermediate results (if there are any) are located in the RAM of a single server, which is usually the case for online queries. In such a case, a hash table is the optimal way to perform reduce operations. A common approach to optimizing map-reduce tasks is pre-aggregation (partial reduce) using a hash table in RAM. The user performs this optimization manually.
Distributed sorting is one of the main causes of reduced performance when running simple map-reduce tasks.
Distributed sorting isn't the best way to perform reduce operations if the result of the operation and all the intermediate results (if there are any) are located in the RAM of a single server, which is usually the case for online queries. In such a case, a hash table is the optimal way to perform reduce operations. A common approach to optimizing MapReduce tasks is pre-aggregation (partial reduce) using a hash table in RAM. The user performs this optimization manually. Most MapReduce implementations allow you to execute arbitrary code on a cluster. But a declarative query language is better suited to OLAP in order to run experiments quickly. For example, Hadoop has Hive and Pig. Also consider Cloudera Impala or Shark (outdated) for Spark, as well as Spark SQL, Presto, and Apache Drill. Performance when running such tasks is highly sub-optimal compared to specialized systems, but relatively high latency makes it unrealistic to use these systems as the backend for a web interface.
Distributed sorting is one of the main causes of reduced performance when running simple MapReduce tasks.
Most MapReduce implementations allow executing any code on the cluster. But a declarative query language is better suited to OLAP in order to run experiments quickly. For example, Hadoop has Hive and Pig. Also consider Cloudera Impala, Shark (outdated) for Spark, and Spark SQL, Presto, and Apache Drill. Performance when running such tasks is highly sub-optimal compared to specialized systems, but relatively high latency makes it unrealistic to use these systems as the backend for a web interface.

View File

@ -118,3 +118,4 @@ GROUP BY sourceIP
ORDER BY totalRevenue DESC ORDER BY totalRevenue DESC
LIMIT 1 LIMIT 1
``` ```

View File

@ -2,7 +2,7 @@
## How to Import The Raw Data ## How to Import The Raw Data
See <https://github.com/toddwschneider/nyc-taxi-data> and <http://tech.marksblogg.com/billion-nyc-taxi-rides-redshift.html> for the description of the dataset and instructions for downloading. See <https://github.com/toddwschneider/nyc-taxi-data> and <http://tech.marksblogg.com/billion-nyc-taxi-rides-redshift.html> for the description of a dataset and instructions for downloading.
Downloading will result in about 227 GB of uncompressed data in CSV files. The download takes about an hour over a 1 Gbit connection (parallel downloading from s3.amazonaws.com recovers at least half of a 1 Gbit channel). Downloading will result in about 227 GB of uncompressed data in CSV files. The download takes about an hour over a 1 Gbit connection (parallel downloading from s3.amazonaws.com recovers at least half of a 1 Gbit channel).
Some of the files might not download fully. Check the file sizes and re-download any that seem doubtful. Some of the files might not download fully. Check the file sizes and re-download any that seem doubtful.
@ -311,9 +311,7 @@ ORDER BY year, count(*) DESC
The following server was used: The following server was used:
Two Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 16 physical kernels total, Two Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 16 physical cores total, 128 GiB RAM, 8x6 TB HDD on hardware RAID-5
128 GiB RAM,
8x6 TB HD on hardware RAID-5
Execution time is the best of three runs. But starting from the second run, queries read data from the file system cache. No further caching occurs: the data is read out and processed in each run. Execution time is the best of three runs. But starting from the second run, queries read data from the file system cache. No further caching occurs: the data is read out and processed in each run.
@ -360,8 +358,9 @@ We ran queries using a client located in a Yandex datacenter in Finland on a clu
## Summary ## Summary
| nodes | Q1 | Q2 | Q3 | Q4 | | servers | Q1 | Q2 | Q3 | Q4 |
| ----- | ----- | ----- | ----- | ----- | | ------- | ----- | ----- | ----- | ----- |
| 1 | 0.490 | 1.224 | 2.104 | 3.593 | | 1 | 0.490 | 1.224 | 2.104 | 3.593 |
| 3 | 0.212 | 0.438 | 0.733 | 1.241 | | 3 | 0.212 | 0.438 | 0.733 | 1.241 |
| 140 | 0.028 | 0.043 | 0.051 | 0.072 | | 140 | 0.028 | 0.043 | 0.051 | 0.072 |

View File

@ -308,7 +308,7 @@ SELECT OriginCityName, DestCityName, count() AS c FROM ontime GROUP BY OriginCit
SELECT OriginCityName, count() AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10; SELECT OriginCityName, count() AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
``` ```
This performance test was created by Vadim Tkachenko. For mode details see: This performance test was created by Vadim Tkachenko. See:
- <https://www.percona.com/blog/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/> - <https://www.percona.com/blog/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/>
- <https://www.percona.com/blog/2009/10/26/air-traffic-queries-in-luciddb/> - <https://www.percona.com/blog/2009/10/26/air-traffic-queries-in-luciddb/>
@ -316,3 +316,4 @@ This performance test was created by Vadim Tkachenko. For mode details see:
- <https://www.percona.com/blog/2014/04/21/using-apache-hadoop-and-impala-together-with-mysql-for-data-analysis/> - <https://www.percona.com/blog/2014/04/21/using-apache-hadoop-and-impala-together-with-mysql-for-data-analysis/>
- <https://www.percona.com/blog/2016/01/07/apache-spark-with-air-ontime-performance-data/> - <https://www.percona.com/blog/2016/01/07/apache-spark-with-air-ontime-performance-data/>
- <http://nickmakos.blogspot.ru/2012/08/analyzing-air-traffic-performance-with.html> - <http://nickmakos.blogspot.ru/2012/08/analyzing-air-traffic-performance-with.html>

View File

@ -82,3 +82,4 @@ Downloading data (change 'customer' to 'customerd' in the distributed version):
cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INTO customer FORMAT CSV" cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INTO customer FORMAT CSV"
cat lineorder.tbl | clickhouse-client --query "INSERT INTO lineorder FORMAT CSV" cat lineorder.tbl | clickhouse-client --query "INSERT INTO lineorder FORMAT CSV"
``` ```

View File

@ -24,3 +24,4 @@ for i in {2007..2016}; do for j in {01..12}; do echo $i-$j >&2; curl -sSL "http:
cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pagecounts-raw/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1/')/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1-\2/')/$link; done cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pagecounts-raw/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1/')/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1-\2/')/$link; done
ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done
``` ```

View File

@ -2,15 +2,19 @@
## System Requirements ## System Requirements
This is not a cross-platform system. It requires Linux Ubuntu Precise (12.04) or newer, with x86_64 architecture and support for the SSE 4.2 instruction set. Installation from the official repository requires Linux with x86_64 architecture and support for the SSE 4.2 instruction set.
To check for SSE 4.2: To check for SSE 4.2:
```bash ```bash
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported" grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
``` ```
We recommend using Ubuntu Trusty, Ubuntu Xenial, or Ubuntu Precise. We recommend using Ubuntu or Debian. The terminal must use UTF-8 encoding.
The terminal must use UTF-8 encoding (the default in Ubuntu).
For rpm-based systems, you can use 3rd-party packages: https://packagecloud.io/altinity/clickhouse or install debian packages.
ClickHouse also works on FreeBSD and Mac OS X. It can be compiled for x86_64 processors without SSE 4.2 support, and for AArch64 CPUs.
## Installation ## Installation
@ -62,7 +66,7 @@ For the server, create a catalog with data, such as:
(Configurable in the server config.) (Configurable in the server config.)
Run 'chown' for the desired user. Run 'chown' for the desired user.
Note the path to logs in the server config (src/dbms/programs/config.xml). Note the path to logs in the server config (src/dbms/programs/server/config.xml).
### Other Installation Methods ### Other Installation Methods

View File

@ -1,76 +1,75 @@
# What is ClickHouse? # What is ClickHouse?
ClickHouse is a columnar database management system (DBMS) for online analytical processing (OLAP). ClickHouse is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).
In a "normal" row-oriented DBMS, data is stored in this order: In a "normal" row-oriented DBMS, data is stored in this order:
| Row    | WatchID             | JavaEnable | Title              | GoodEvent | EventTime           |
| ------ | ------------------- | ---------- | ------------------ | --------- | ------------------- |
| #0     | 5385521489354350662 | 1          | Investor Relations | 1         | 2016-05-18 05:19:20 |
| #1     | 5385521490329509958 | 0          | Contact us         | 1         | 2016-05-18 08:10:20 |
| #2     | 5385521489953706054 | 1          | Mission            | 1         | 2016-05-18 07:38:00 |
| #N     | ...                 | ...        | ...                | ...       | ...                 |
In other words, all the values related to a row are physically stored next to each other.
Examples of a row-oriented DBMS are MySQL, Postgres, and MS SQL Server.
{: .grey }
In a column-oriented DBMS, data is stored like this:
| Row:        | #0                  | #1                  | #2                  | #N                  |
| ----------- | ------------------- | ------------------- | ------------------- | ------------------- |
| WatchID:    | 5385521489354350662 | 5385521490329509958 | 5385521489953706054 | ...                 |
| JavaEnable: | 1                   | 0                   | 1                   | ...                 |
| Title:      | Investor Relations  | Contact us          | Mission             | ...                 |
| GoodEvent:  | 1                   | 1                   | 1                   | ...                 |
| EventTime:  | 2016-05-18 05:19:20 | 2016-05-18 08:10:20 | 2016-05-18 07:38:00 | ...                 |
These examples only show the order that data is arranged in.
The values from different columns are stored separately, and data from the same column is stored together.
Examples of a column-oriented DBMS: Vertica, Paraccel (Actian Matrix and Amazon Redshift), Sybase IQ, Exasol, Infobright, InfiniDB, MonetDB (VectorWise and Actian Vector), LucidDB, SAP HANA, Google Dremel, Google PowerDrill, Druid, and kdb+.
{: .grey }
Different orders for storing data are better suited to different scenarios.
The data access scenario refers to what queries are made, how often, and in what proportion; how much data is read for each type of query: rows, columns, and bytes; the relationship between reading and updating data; the working size of the data and how locally it is used; whether transactions are used, and how isolated they are; requirements for data replication and logical integrity; requirements for latency and throughput for each type of query, and so on.
The higher the load on the system, the more important it is to customize the system to the scenario, and the more specific this customization becomes. There is no system that is equally well-suited to significantly different scenarios. If a system is adaptable to a wide set of scenarios, under a high load, the system will handle all the scenarios equally poorly, or will work well for just one of the scenarios.
## Key features of the OLAP scenario
- The vast majority of requests are for read access.
- Data is updated in fairly large batches (> 1000 rows), not by single rows; or it is not updated at all.
- Data is added to the DB but is not modified.
- For reads, quite a large number of rows are extracted from the DB, but only a small subset of columns.
- Tables are "wide," meaning they contain a large number of columns.
- Queries are relatively rare (usually hundreds of queries per server per second or less).
- For simple queries, latencies around 50 ms are allowed.
- Column values are fairly small: numbers and short strings (for example, 60 bytes per URL).
- Requires high throughput when processing a single query (up to billions of rows per second per server).
- There are no transactions.
- Low requirements for data consistency.
- There is one large table per query. All tables are small, except for one.
- A query result is significantly smaller than the source data. In other words, data is filtered or aggregated. The result fits in a single server's RAM.
It is easy to see that the OLAP scenario is very different from other popular scenarios (such as OLTP or Key-Value access). So it doesn't make sense to try to use OLTP or a Key-Value DB for processing analytical queries if you want to get decent performance. For example, if you try to use MongoDB or Redis for analytics, you will get very poor performance compared to OLAP databases.
## Why column-oriented databases are better for the OLAP scenario
Column-oriented databases are better suited to OLAP scenarios: they are at least 100 times faster in processing most queries. The reasons are explained in detail below, but the fact is easier to demonstrate visually:
**Row-oriented DBMS**
![Row-oriented](images/row_oriented.gif#)
**Column-oriented DBMS**
![Column-oriented](images/column_oriented.gif#)
See the difference?
### About input/output
1. For an analytical query, only a small number of table columns need to be read. In a column-oriented database, you can read just the data you need. For example, if you need 5 columns out of 100, you can expect a 20-fold reduction in I/O.
2. Since data is read in packets, it is easier to compress. Data in columns is also easier to compress. This further reduces the I/O volume.
@ -121,12 +120,11 @@ LIMIT 20
20 rows in set. Elapsed: 0.153 sec. Processed 1.00 billion rows, 4.00 GB (6.53 billion rows/s., 26.10 GB/s.)
:)</pre>
</p></details>
### CPU
Since executing a query requires processing a large number of rows, it helps to dispatch all operations for entire vectors instead of for separate rows, or to implement the query engine so that there is almost no dispatching cost. If you don't do this, with any half-decent disk subsystem, the query interpreter inevitably stalls the CPU.
It makes sense to both store data in columns and process it, when possible, by columns.
@ -140,4 +138,3 @@ There are two ways to do this:
This is not done in "normal" databases, because it doesn't make sense when running simple queries. However, there are exceptions. For example, MemSQL uses code generation to reduce latency when processing SQL queries. (For comparison, analytical DBMSs require optimization of throughput, not latency.)
Note that for CPU efficiency, the query language must be declarative (SQL or MDX), or at least a vector (J, K). The query should only contain implicit loops, allowing for optimization.

View File

@ -31,6 +31,7 @@ _EOF
cat file.csv | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
```
In batch mode, the default data format is TabSeparated. You can set the format in the FORMAT clause of the query.
By default, you can only process a single query in batch mode. To make multiple queries from a "script," use the --multiquery parameter. This works for all queries except INSERT. Query results are output consecutively without additional separators.
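A minimal sketch of batch mode with `--multiquery` (the queries here are placeholders):
```bash
# Two queries in one batch invocation; results are printed one after another.
echo "SELECT 1; SELECT 2;" | clickhouse-client --multiquery
```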
@ -51,7 +52,8 @@ The history is written to `~/.clickhouse-client-history`.
By default, the format used is PrettyCompact. You can change the format in the FORMAT clause of the query, or by specifying `\G` at the end of the query, using the `--format` or `--vertical` argument in the command line, or using the client configuration file.
To exit the client, press Ctrl+D (or Ctrl+C), or enter one of the following instead of a query:
"exit", "quit", "logout", "exit;", "quit;", "logout;", "q", "Q", ":q"
When processing a query, the client shows:

View File

@ -1,36 +1,36 @@
<a name="formats"></a> <a name="formats"></a>
# Input and Output Formats # Formats for input and output data
The format determines how data is returned to you after SELECTs (how it is written and formatted by the server), and how it is accepted for INSERTs (how it is read and parsed by the server). ClickHouse can accept (`INSERT`) and return (`SELECT`) data in various formats.
See the table below for the list of supported formats for either kinds of queries. The table below lists supported formats and how they can be used in `INSERT` and `SELECT` queries.
Format | INSERT | SELECT | Format | INSERT | SELECT |
-------|--------|-------- | ------- | -------- | -------- |
[TabSeparated](formats.md#tabseparated) | ✔ | ✔ | | [TabSeparated](#tabseparated) | ✔ | ✔ |
[TabSeparatedRaw](formats.md#tabseparatedraw) | ✗ | ✔ | | [TabSeparatedRaw](#tabseparatedraw) | ✗ | ✔ |
[TabSeparatedWithNames](formats.md#tabseparatedwithnames) | ✔ | ✔ | | [TabSeparatedWithNames](#tabseparatedwithnames) | ✔ | ✔ |
[TabSeparatedWithNamesAndTypes](formats.md#tabseparatedwithnamesandtypes) | ✔ | ✔ | | [TabSeparatedWithNamesAndTypes](#tabseparatedwithnamesandtypes) | ✔ | ✔ |
[CSV](formats.md#csv) | ✔ | ✔ | | [CSV](#csv) | ✔ | ✔ |
[CSVWithNames](formats.md#csvwithnames) | ✔ | ✔ | | [CSVWithNames](#csvwithnames) | ✔ | ✔ |
[Values](formats.md#values) | ✔ | ✔ | | [Values](#values) | ✔ | ✔ |
[Vertical](formats.md#vertical) | ✗ | ✔ | | [Vertical](#vertical) | ✗ | ✔ |
[VerticalRaw](formats.md#verticalraw) | ✗ | ✔ | | [VerticalRaw](#verticalraw) | ✗ | ✔ |
[JSON](formats.md#json) | ✗ | ✔ | | [JSON](#json) | ✗ | ✔ |
[JSONCompact](formats.md#jsoncompact) | ✗ | ✔ | | [JSONCompact](#jsoncompact) | ✗ | ✔ |
[JSONEachRow](formats.md#jsoneachrow) | ✔ | ✔ | | [JSONEachRow](#jsoneachrow) | ✔ | ✔ |
[TSKV](formats.md#tskv) | ✔ | ✔ | | [TSKV](#tskv) | ✔ | ✔ |
[Pretty](formats.md#pretty) | ✗ | ✔ | | [Pretty](#pretty) | ✗ | ✔ |
[PrettyCompact](formats.md#prettycompact) | ✗ | ✔ | | [PrettyCompact](#prettycompact) | ✗ | ✔ |
[PrettyCompactMonoBlock](formats.md#prettycompactmonoblock) | ✗ | ✔ | | [PrettyCompactMonoBlock](#prettycompactmonoblock) | ✗ | ✔ |
[PrettyNoEscapes](formats.md#prettynoescapes) | ✗ | ✔ | | [PrettyNoEscapes](#prettynoescapes) | ✗ | ✔ |
[PrettySpace](formats.md#prettyspace) | ✗ | ✔ | | [PrettySpace](#prettyspace) | ✗ | ✔ |
[RowBinary](formats.md#rowbinary) | ✔ | ✔ | | [RowBinary](#rowbinary) | ✔ | ✔ |
[Native](formats.md#native) | ✔ | ✔ | | [Native](#native) | ✔ | ✔ |
[Null](formats.md#null) | ✗ | ✔ | | [Null](#null) | ✗ | ✔ |
[XML](formats.md#xml) | ✗ | ✔ | | [XML](#xml) | ✗ | ✔ |
[CapnProto](formats.md#capnproto) | ✔ | ✔ | | [CapnProto](#capnproto) | ✔ | ✔ |
<a name="format_capnproto"></a> <a name="format_capnproto"></a>
@ -57,26 +57,30 @@ struct Message {
Schema files are located in the directory specified in [format_schema_path](../operations/server_settings/settings.md#server_settings-format_schema_path) in the server configuration.
Deserialization is effective and usually doesn't increase the system load.
<a name="csv"></a>
## CSV
Comma Separated Values format ([RFC](https://tools.ietf.org/html/rfc4180)).
When formatting, rows are enclosed in double quotes. A double quote inside a string is output as two double quotes in a row. There are no other rules for escaping characters. Date and date-time are enclosed in double quotes. Numbers are output without quotes. Values are separated by a delimiter character, which is `,` by default. The delimiter character is defined in the setting [format_csv_delimiter](../operations/settings/settings.md#format_csv_delimiter). Rows are separated using the Unix line feed (LF). Arrays are serialized in CSV as follows: first the array is serialized to a string as in TabSeparated format, and then the resulting string is output to CSV in double quotes. Tuples in CSV format are serialized as separate columns (that is, their nesting in the tuple is lost).
```
clickhouse-client --format_csv_delimiter="|" --query="INSERT INTO test.csv FORMAT CSV" < data.csv
```
When parsing, all values can be parsed either with or without quotes. Both double and single quotes are supported. Rows can also be arranged without quotes. In this case, they are parsed up to the delimiter character or line feed (CR or LF). In violation of the RFC, when parsing rows without quotes, the leading and trailing spaces and tabs are ignored. For the line feed, Unix (LF), Windows (CR LF) and Mac OS Classic (CR) types are all supported.
`NULL` is formatted as `\N`.
The CSV format supports the output of totals and extremes the same way as `TabSeparated`.
## CSVWithNames
Also prints the header row, similar to `TabSeparatedWithNames`.
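For illustration, a quick sketch of the header row (the output shown in the comments is approximate):
```bash
clickhouse-client --query "SELECT 1 AS x, 'hello' AS s FORMAT CSVWithNames"
# "x","s"
# 1,"hello"
```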
<a name="json"></a>
## JSON
@ -162,7 +166,13 @@ If the query contains GROUP BY, rows_before_limit_at_least is the exact number o
`extremes` Extreme values (when extremes is set to 1).
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
ClickHouse supports [NULL](../query_language/syntax.md#null-literal), which is displayed as `null` in the JSON output.
See also the JSONEachRow format.
<a name="jsoncompact"></a>
## JSONCompact
Differs from JSON only in that data rows are output in arrays, not in objects.
@ -188,8 +198,8 @@ Example:
["", "8267016"], ["", "8267016"],
["bathroom interior design", "2166"], ["bathroom interior design", "2166"],
["yandex", "1655"], ["yandex", "1655"],
["spring 2014 fashion", "1549"], ["fashion trends spring 2014", "1549"],
["freeform photos", "1480"] ["freeform photo", "1480"]
], ],
"totals": ["","8873898"], "totals": ["","8873898"],
@ -208,6 +218,7 @@ Example:
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
See also the `JSONEachRow` format.
<a name="jsoneachrow"></a>
## JSONEachRow
@ -217,35 +228,51 @@ Outputs data as separate JSON objects for each row (newline delimited JSON).
{"SearchPhrase":"","count()":"8267016"} {"SearchPhrase":"","count()":"8267016"}
{"SearchPhrase": "bathroom interior design","count()": "2166"} {"SearchPhrase": "bathroom interior design","count()": "2166"}
{"SearchPhrase":"yandex","count()":"1655"} {"SearchPhrase":"yandex","count()":"1655"}
{"SearchPhrase":"spring 2014 fashion","count()":"1549"} {"SearchPhrase":"2014 spring fashion","count()":"1549"}
{"SearchPhrase":"freeform photo","count()":"1480"} {"SearchPhrase":"freeform photo","count()":"1480"}
{"SearchPhrase":"angelina jolie","count()":"1245"} {"SearchPhrase":"angelina jolie","count()":"1245"}
{"SearchPhrase":"omsk","count()":"1112"} {"SearchPhrase":"omsk","count()":"1112"}
{"SearchPhrase":"photos of dog breeds","count()":"1091"} {"SearchPhrase":"photos of dog breeds","count()":"1091"}
{"SearchPhrase":"curtain design","count()":"1064"} {"SearchPhrase":"curtain designs","count()":"1064"}
{"SearchPhrase":"baku","count()":"1000"} {"SearchPhrase":"baku","count()":"1000"}
``` ```
Unlike the JSON format, there is no substitution of invalid UTF-8 sequences. Any set of bytes can be output in the rows. This is necessary so that data can be formatted without losing any information. Values are escaped in the same way as for JSON. Unlike the JSON format, there is no substitution of invalid UTF-8 sequences. Any set of bytes can be output in the rows. This is necessary so that data can be formatted without losing any information. Values are escaped in the same way as for JSON.
For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults. Whitespace between elements is ignored. If a comma is placed after the objects, it is ignored. Objects don't necessarily have to be separated by new lines. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults. Whitespace between elements is ignored. If a comma is placed after the objects, it is ignored. Objects don't necessarily have to be separated by new lines.
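A hedged sketch of parsing with an omitted column (the table `test.json_demo` and its schema are made up for this example):
```bash
clickhouse-client --query "CREATE TABLE test.json_demo (x UInt32, s String) ENGINE = Memory"
# The second object omits "s", so it gets the default value (an empty string).
echo '{"x":1,"s":"a"} {"x":2}' | clickhouse-client --query "INSERT INTO test.json_demo FORMAT JSONEachRow"
clickhouse-client --query "SELECT * FROM test.json_demo FORMAT JSONEachRow"
```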
<a name="native"></a>
## Native
The most efficient format. Data is written and read by blocks in binary format. For each block, the number of rows, number of columns, column names and types, and parts of columns in this block are recorded one after another. In other words, this format is "columnar": it doesn't convert columns to rows. This is the format used in the native interface for interaction between servers, for using the command-line client, and for C++ clients.
You can use this format to quickly generate dumps that can only be read by the ClickHouse DBMS. It doesn't make sense to work with this format yourself.
<a name="null"></a>
## Null
Nothing is output. However, the query is processed, and when using the command-line client, data is transmitted to the client. This is used for tests, including performance testing.
Obviously, this format is only appropriate for output, not for parsing.
<a name="pretty"></a>
## Pretty
Outputs data as Unicode-art tables, also using ANSI-escape sequences for setting colors in the terminal.
A full grid of the table is drawn, and each row occupies two lines in the terminal.
Each result block is output as a separate table. This is necessary so that blocks can be output without buffering results (buffering would be necessary in order to pre-calculate the visible width of all the values).
[NULL](../query_language/syntax.md#null-literal) is output as `ᴺᵁᴸᴸ`.
```sql
SELECT * FROM t_null
```
```
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
└───┴──────┘
```
To avoid dumping too much data to the terminal, only the first 10,000 rows are printed. If the number of rows is greater than or equal to 10,000, the message "Showed first 10 000" is printed.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
@ -278,14 +305,18 @@ Extremes:
└────────────┴─────────┘
```
<a name="prettycompact"></a>
## PrettyCompact
Differs from `Pretty` in that the grid is drawn between rows and the result is more compact.
This format is used by default in the command-line client in interactive mode.
<a name="prettycompactmonoblock"></a>
## PrettyCompactMonoBlock
Differs from [PrettyCompact](#prettycompact) in that up to 10,000 rows are buffered, then output as a single table, not by blocks.
<a name="prettynoescapes"></a>
## PrettyNoEscapes
@ -306,10 +337,12 @@ The same as the previous setting.
### PrettySpaceNoEscapes
The same as the previous setting.
<a name="prettyspace"></a>
## PrettySpace
Differs from [PrettyCompact](#prettycompact) in that whitespace (space characters) is used instead of the grid.
<a name="rowbinary"></a>
## RowBinary
@ -324,10 +357,41 @@ FixedString is represented simply as a sequence of bytes.
Array is represented as a varint length (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by successive elements of the array.
For [NULL](../query_language/syntax.md#null-literal) support, an additional byte containing 1 or 0 is added before each [Nullable](../data_types/nullable.md#data_type-nullable) value. If 1, then the value is `NULL` and this byte is interpreted as a separate value. If 0, the value after the byte is not `NULL`.
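To see the layout, one can dump a row to a hex viewer (a sketch; it assumes `xxd` is available on the machine):
```bash
# One UInt8 byte, then the string as a varint length followed by its bytes.
clickhouse-client --query "SELECT toUInt8(1), 'abc' FORMAT RowBinary" | xxd
```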
<a name="tabseparated"></a>
## TabSeparated
In TabSeparated format, data is written by row. Each row contains values separated by tabs. Each value is followed by a tab, except the last value in the row, which is followed by a line feed. Strictly Unix line feeds are assumed everywhere. The last row also must contain a line feed at the end. Values are written in text format, without enclosing quotation marks, and with special characters escaped.
This format is also available under the name `TSV`.
The `TabSeparated` format is convenient for processing data using custom programs and scripts. It is used by default in the HTTP interface, and in the command-line client's batch mode. This format also allows transferring data between different DBMSs. For example, you can get a dump from MySQL and upload it to ClickHouse, or vice versa.
The `TabSeparated` format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:
```sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
```
```text
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
2014-03-20 1353623
2014-03-21 1245779
2014-03-22 1031592
2014-03-23 1046491
0000-00-00 8873898
2014-03-17 1031592
2014-03-23 1406958
```
## Data formatting
Integer numbers are written in decimal form. Numbers can contain an extra "+" character at the beginning (ignored when parsing, and not recorded when formatting). Non-negative numbers can't contain the negative sign. When reading, it is allowed to parse an empty string as a zero, or (for signed types) a string consisting of just a minus sign as a zero. Numbers that do not fit into the corresponding data type may be parsed as a different number, without an error message.
Floating-point numbers are written in decimal form. The dot is used as the decimal separator. Exponential entries are supported, as are 'inf', '+inf', '-inf', and 'nan'. An entry of floating-point numbers may begin or end with a decimal point.
@ -358,30 +422,9 @@ Only a small set of symbols are escaped. You can easily stumble onto a string va
Arrays are written as a list of comma-separated values in square brackets. Numeric items in the array are formatted as normal, but dates, dates with times, and strings are written in single quotes with the same escaping rules as above.
[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
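A short sketch of the escaping rules (the tab inside the string and the NULL come back escaped as `\t` and `\N`; the exact output is approximate):
```bash
# String with an embedded tab, an array, and a NULL, in one TabSeparated row.
clickhouse-client --query "SELECT 'a\tb', [1,2], NULL FORMAT TabSeparated"
```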
<a name="tabseparatedraw"></a>
## TabSeparatedRaw
@ -389,6 +432,7 @@ Differs from `TabSeparated` format in that the rows are written without escaping
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
This format is also available under the name `TSVRaw`.
<a name="tabseparatedwithnames"></a>
## TabSeparatedWithNames
@ -397,6 +441,7 @@ During parsing, the first row is completely ignored. You can't use column names
(Support for parsing the header row may be added in the future.)
This format is also available under the name `TSVWithNames`.
<a name="tabseparatedwithnamesandtypes"></a>
## TabSeparatedWithNamesAndTypes
@ -404,6 +449,7 @@ Differs from the `TabSeparated` format in that the column names are written to t
During parsing, the first and second rows are completely ignored.
This format is also available under the name `TSVWithNamesAndTypes`.
<a name="tskv"></a>
## TSKV
@ -413,15 +459,25 @@ Similar to TabSeparated, but outputs a value in name=value format. Names are esc
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior design count()=2166
SearchPhrase=yandex count()=1655
SearchPhrase=2014 spring fashion count()=1549
SearchPhrase=freeform photos count()=1480
SearchPhrase=angelina jolie count()=1245
SearchPhrase=omsk count()=1112
SearchPhrase=photos of dog breeds count()=1091
SearchPhrase=curtain designs count()=1064
SearchPhrase=baku count()=1000
```
[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
```sql
SELECT * FROM t_null FORMAT TSKV
```
```
x=1 y=\N
```
When there is a large number of small columns, this format is ineffective, and there is generally no reason to use it. It is used in some departments of Yandex.
Both data output and parsing are supported in this format. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted; they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults.
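As a sketch (the `test.tskv_demo` table is hypothetical), fields may come in any order and omitted fields fall back to defaults:
```bash
clickhouse-client --query "CREATE TABLE test.tskv_demo (x UInt32, s String) ENGINE = Memory"
# Fields are tab-separated; the last row omits "s" entirely.
printf 'x=1\ts=hello\ns=world\tx=2\nx=3\n' | clickhouse-client --query "INSERT INTO test.tskv_demo FORMAT TSKV"
clickhouse-client --query "SELECT * FROM test.tskv_demo"
```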
@ -430,17 +486,37 @@ Parsing allows the presence of the additional field `tskv` without the equal sig
## Values
Prints every row in brackets. Rows are separated by commas. There is no comma after the last row. The values inside the brackets are also comma-separated. Numbers are output in decimal format without quotes. Arrays are output in square brackets. Strings, dates, and dates with times are output in quotes. Escaping rules and parsing are similar to the [TabSeparated](#tabseparated) format. During formatting, extra spaces aren't inserted, but during parsing, they are allowed and skipped (except for spaces inside array values, which are not allowed). [NULL](../query_language/syntax.md#null-literal) is represented as `NULL`.
The minimum set of characters that you need to escape when passing data in Values format: single quotes and backslashes.
This is the format that is used in `INSERT INTO t VALUES ...`, but you can also use it for formatting query results.
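A small sketch of Values as an output format (the output in the comment is approximate):
```bash
clickhouse-client --query "SELECT number, toString(number) FROM system.numbers LIMIT 3 FORMAT Values"
# (0,'0'),(1,'1'),(2,'2')
```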
<a name="vertical"></a>
## Vertical
Prints each value on a separate line with the column name specified. This format is convenient for printing just one or a few rows, if each row consists of a large number of columns.
[NULL](../query_language/syntax.md#null-literal) is output as `ᴺᵁᴸᴸ`.
Example:
```sql
SELECT * FROM t_null FORMAT Vertical
```
```
Row 1:
──────
x: 1
y: ᴺᵁᴸᴸ
```
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
<a name="verticalraw"></a>
## VerticalRaw
Differs from `Vertical` format in that the rows are not escaped.
@ -469,6 +545,9 @@ Row 1:
──────
test: string with \'quotes\' and \t with some special \n characters
```
<a name="xml"></a>
## XML
XML format is suitable only for output, not for parsing. Example:
@ -502,7 +581,7 @@ XML format is suitable only for output, not for parsing. Example:
<field>1655</field>
</row>
<row>
<SearchPhrase>2014 spring fashion</SearchPhrase>
<field>1549</field>
</row>
<row>
@ -522,7 +601,7 @@ XML format is suitable only for output, not for parsing. Example:
<field>1091</field>
</row>
<row>
<SearchPhrase>curtain designs</SearchPhrase>
<field>1064</field>
</row>
<row>
@ -540,6 +619,4 @@ Just as for JSON, invalid UTF-8 sequences are changed to the replacement charact
In string values, the characters `<` and `&` are escaped as `&lt;` and `&amp;`.
Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`, and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.

View File

@ -34,7 +34,8 @@ Date: Fri, 16 Nov 2012 19:21:50 GMT
1
```
As you can see, curl is somewhat inconvenient in that spaces must be URL escaped.
Although wget escapes everything itself, we don't recommend using it because it doesn't work well over HTTP 1.1 when using keep-alive and Transfer-Encoding: chunked.
```bash
$ echo 'SELECT 1' | curl 'http://localhost:8123/' --data-binary @-
@ -170,8 +171,7 @@ echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @
```
If the user name is not indicated, the username 'default' is used. If the password is not indicated, an empty password is used.
You can also use the URL parameters to specify any settings for processing a single query, or entire profiles of settings. Example: http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1
For more information, see the section "Settings".
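For example, a sketch that passes the `max_rows_to_read` setting together with the query (the values are illustrative):
```bash
echo 'SELECT 1' | curl 'http://localhost:8123/?max_rows_to_read=1000000000' --data-binary @-
```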

View File

@ -1,7 +1,5 @@
# JDBC Driver
- [Official driver](https://github.com/yandex/clickhouse-jdbc).
- Third-party driver from [ClickHouse-Native-JDBC](https://github.com/housepower/ClickHouse-Native-JDBC).

View File

@ -1,6 +1,6 @@
# Libraries from Third-party Developers
We have not tested the libraries listed below.
- Python
    - [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm)
@ -45,4 +45,3 @@ There are libraries for working with ClickHouse for:
- Nim
    - [nim-clickhouse](https://github.com/leonardoce/nim-clickhouse)

View File

@ -4,7 +4,8 @@
Web interface for ClickHouse in the [Tabix](https://github.com/tabixio/tabix) project.
Main features:
- Works with ClickHouse directly from the browser, without the need to install additional software.
- Query editor with syntax highlighting.
- Auto-completion of commands.
@ -13,22 +14,25 @@ Web interface for ClickHouse in the [Tabix](https://github.com/tabixio/tabix) pr
[Tabix documentation](https://tabix.io/doc/).
## HouseOps
[HouseOps](https://github.com/HouseOps/HouseOps) is a UI/IDE for OSX, Linux and Windows.
Main features:
- Query builder with syntax highlighting. View the response in a table or JSON view.
- Export query results as CSV or JSON.
- List of processes with descriptions. Write mode. Ability to stop (`KILL`) a process.
- Database graph. Shows all tables and their columns with additional information.
- Quick view of the column size.
- Server configuration.
The following features are planned for development:
- Database management.
- User management.
- Real-time data analysis.
- Cluster monitoring.
- Cluster management.
- Monitoring replicated and Kafka tables.

View File

@ -2,68 +2,61 @@
## True Column-oriented DBMS
In a true column-oriented DBMS, no extra data is stored with the values. Among other things, this means that constant-length values must be supported, to avoid storing their length as a separate number next to the values. As an example, a billion UInt8-type values should actually consume around 1 GB uncompressed, or this will strongly affect the CPU use. It is very important to store data compactly (without any "garbage") even when uncompressed, since the speed of decompression (CPU usage) depends mainly on the volume of uncompressed data.
This is worth noting because there are systems that can store values of different columns separately, but that can't effectively process analytical queries due to their optimization for other scenarios. Examples are HBase, BigTable, Cassandra, and HyperTable. In these systems, you will get throughput around a hundred thousand rows per second, but not hundreds of millions of rows per second.
It's also worth noting that ClickHouse is a database management system, not a single database. ClickHouse allows creating tables and databases in runtime, loading data, and running queries without reconfiguring and restarting the server.
## Data Compression
Some column-oriented DBMSs (InfiniDB CE and MonetDB) do not use data compression. However, data compression does play a key role in achieving excellent performance.
## Disk Storage of Data
Many column-oriented DBMSs (such as SAP HANA and Google PowerDrill) can only work in RAM. This approach encourages more budgeting for hardware than is actually needed for real-time analysis. ClickHouse is designed to work on regular hard drives, which means the cost per GB of data storage is low, but SSDs and additional RAM are also used fully if available.
## Parallel Processing on Multiple Cores
Large queries are parallelized in a natural way, taking all the necessary resources from what is available on the server.
## Distributed Processing on Multiple Servers
Almost none of the columnar DBMSs listed above have support for distributed processing.
In ClickHouse, data can reside on different shards. Each shard can be a group of replicas that are used for fault tolerance. The query is processed on all the shards in parallel. This is transparent for the user.
## SQL Support
ClickHouse supports a declarative query language based on SQL that is identical to the SQL standard in many cases.
Supported queries include GROUP BY, ORDER BY, subqueries in FROM, IN, and JOIN clauses, and scalar subqueries.
Dependent subqueries and window functions are not supported.
## Vector Engine
Data is not only stored by columns, but is also processed by vectors (parts of columns). This allows us to achieve high CPU performance.
## Real-time Data Updates
ClickHouse supports tables with a primary key. In order to quickly perform queries on the range of the primary key, the data is sorted incrementally using the merge tree. Due to this, data can continually be added to the table. There is no locking when adding data.
## Index
Physical sorting of data by primary key makes it possible to get data for specific key values, or ranges of values, with low latency of less than several dozen milliseconds.
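A hypothetical sketch: with a MergeTree table whose primary key is `(EventDate, CounterID)`, a query over a key range reads only the matching part of the data (the table and column names are made up for the example):
```bash
clickhouse-client --query "CREATE TABLE test.events (EventDate Date, CounterID UInt32, Value UInt64) ENGINE = MergeTree(EventDate, (EventDate, CounterID), 8192)"
# Reads only the key range for one counter on one day, not the whole table.
clickhouse-client --query "SELECT sum(Value) FROM test.events WHERE EventDate = today() AND CounterID = 42"
```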
## Suitable for Online Queries
Low latency means queries can be processed without delay and without preparing the response ahead of time, so a query can be processed while the user interface page is loading. In other words, in online mode.
## Support for Approximated Calculations
ClickHouse provides various ways to trade accuracy for performance (see the sketch after this list):
1. Aggregate functions for approximated calculation of the number of distinct values, medians, and quantiles.
2. Running a query based on a part (sample) of data and getting an approximated result. In this case, proportionally less data is retrieved from the disk.
3. Running an aggregation for a limited number of random keys, instead of for all keys. Under certain conditions for key distribution in the data, this provides a reasonably accurate result while using fewer resources.
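A sketch of the first two approaches on a hypothetical `test.hits` table (the `SAMPLE` clause requires the table to declare a sampling expression):
```bash
# Approximate aggregate functions: uniq() and median() trade exactness for speed.
clickhouse-client --query "SELECT uniq(UserID), median(ResolutionWidth) FROM test.hits"
# Query over a 10% sample of the data.
clickhouse-client --query "SELECT count() FROM test.hits SAMPLE 0.1"
```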
## Data replication and data integrity support
ClickHouse uses asynchronous multimaster replication. After being written to any available replica, data is distributed to all the remaining replicas in the background. The system maintains identical data on different replicas. Recovery after most failures is performed automatically, or semi-automatically in complex cases.
For more information, see the [Data replication](../operations/table_engines/replication.md#table_engines-replication) section.

View File

@ -1,6 +1,7 @@
# ClickHouse Features that Can be Considered Disadvantages
1. No full-fledged transactions.
2. Lack of ability to modify or delete already inserted data with high rate and low latency. Batch deletes are available to clean up data that is no longer needed or to comply with [GDPR](https://gdpr-info.eu). Batch updates are in development as of July 2018.
3. The sparse index makes ClickHouse poorly suited for point queries that retrieve single rows by their keys.

View File

@ -1,24 +1,23 @@
# Performance
According to internal testing results at Yandex, ClickHouse shows the best performance for comparable operating scenarios among systems of its class that were available for testing. This includes both the highest throughput for long queries and the lowest latency on short queries. Testing results are shown on a [separate page](https://clickhouse.yandex/benchmark.html).
Numerous independent benchmarks confirm this as well. You can find them with an internet search, or see this small [collection of independent benchmark links](https://clickhouse.yandex/#independent-benchmarks).
## Throughput for a Single Large Query
Throughput can be measured in rows per second or in megabytes per second. If the data is placed in the page cache, a query that is not too complex is processed on modern hardware at a speed of approximately 2-10 GB/s of uncompressed data on a single server (for the simplest cases, the speed may reach 30 GB/s). If the data is not placed in the page cache, the speed is bound by the disk subsystem and the data compression rate. For example, if the disk subsystem allows reading data at 400 MB/s and the data compression rate is 3, the speed will be around 1.2 GB/s. To get the speed in rows per second, divide the speed in bytes per second by the total size of the columns used in the query. For example, if 10 bytes of columns are extracted, the speed will be around 100-200 million rows per second.
The processing speed increases almost linearly for distributed processing, but only if the number of rows resulting from aggregation or sorting is not too large.
## Latency When Processing Short Queries
If a query uses a primary key, does not select too many rows to process (hundreds of thousands), and does not use too many columns, we can expect less than 50 milliseconds of latency (single digits of milliseconds in the best case) if the data is placed in the page cache. Otherwise, latency is dominated by the number of disk seeks. If you use rotating drives, for a system that is not overloaded, the approximate latency can be estimated with this formula: seek time (10 ms) \* number of columns queried \* number of data parts.
## Throughput When Processing a Large Quantity of Short Queries
Under the same conditions, ClickHouse can handle several hundred queries per second on a single server (up to several thousand in the best case). Since this scenario is not typical for analytical DBMSs, it is better to plan for a maximum of about 100 queries per second.
## Performance When Inserting Data
We recommend inserting data in batches of at least 1000 rows, or no more than a single request per second. When inserting into a MergeTree table from a tab-separated dump, the insertion speed will be from 50 to 200 MB/s. If the inserted rows are around 1 KB in size, the speed will be from 50,000 to 200,000 rows per second. If the rows are small, the performance in rows per second will be higher (over 500,000 rows per second on Banner System data; over 1,000,000 rows per second on Graphite data). To improve performance, you can make multiple INSERT queries in parallel, and performance will increase linearly.
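For illustration, a hedged sketch of batched insertion. The `events` table and its schema are made up; only the general advice (few large INSERTs rather than many small ones) comes from the text above:

```sql
-- Example table using the MergeTree syntax of this documentation's era:
-- MergeTree(date column, primary key, index granularity).
CREATE TABLE IF NOT EXISTS events
(
    EventDate Date,
    UserID UInt64,
    Value Float64
) ENGINE = MergeTree(EventDate, (EventDate, UserID), 8192);

-- One INSERT with many rows performs far better than many single-row INSERTs.
INSERT INTO events VALUES
    ('2018-07-01', 1, 0.5),
    ('2018-07-01', 2, 1.5),
    ('2018-07-01', 3, 2.5); -- in practice, batch at least ~1000 rows per request
```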

View File

@ -1,10 +1,10 @@
# Yandex.Metrica Use Case
ClickHouse was originally developed to power [Yandex.Metrica](https://metrica.yandex.com/), [the second largest web analytics platform in the world](http://w3techs.com/technologies/overview/traffic_analysis/all), and continues to be its core component. With more than 13 trillion records in the database and more than 20 billion events daily, ClickHouse allows generating custom reports on the fly directly from non-aggregated data. This article gives a brief historical background on the main goals of ClickHouse before it became an open-source product.
Yandex.Metrica generates custom reports on the fly based on hits and sessions, with arbitrary segments and time periods chosen by the end user. Complex aggregates are often required, such as the number of unique visitors, and new data for the reports arrives in real time.
As of April 2014, Yandex.Metrica received approximately 12 billion events (page views and clicks) daily. All these events must be stored in order to build custom reports. A single query may require scanning millions of rows within a few hundred milliseconds, or hundreds of millions of rows in just a few seconds.
## Usage in Yandex.Metrica and Other Yandex Services

View File

@ -11,11 +11,11 @@ Users are recorded in the `users` section. Here is a fragment of the `users.xml`
<default>
<!-- Password could be specified in plaintext or in SHA256 (in hex format).
If you want to specify the password in plain text (not recommended), place it in the 'password' element.
Example: <password>qwerty</password>.
Password can be empty.
If you want to specify SHA256, place it in the 'password_sha256_hex' element.
Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
How to generate a decent password:
@ -23,16 +23,17 @@ Users are recorded in the `users` section. Here is a fragment of the `users.xml`
The first line will be the password, and the second the corresponding SHA256.
-->
<password></password>
<!-- A list of networks that access is allowed from.
Each list item has one of the following forms:
<ip> The IP address or subnet mask. For example: 198.51.100.0/24 or 2001:DB8::/32.
<host> Host name. For example: example01. A DNS query is made for verification, and all addresses obtained are compared with the client address.
<host_regexp> Regular expression for host names. For example: ^example\d\d-\d\d-\d\.yandex\.ru$
For verification, a DNS PTR query is made for the client's address and a regular expression is applied to the result.
Then another DNS query is made for the result of the PTR query, and all received addresses are compared to the client address.
We strongly recommend that the regex ends with \.yandex\.ru$.
If you are installing ClickHouse yourself, specify here:
<networks>
<ip>::/0</ip>
</networks>
@ -56,7 +57,6 @@ Users are recorded in the `users` section. Here is a fragment of the `users.xml`
<database>test</database>
</allow_databases>
</web>
</users>
```
You can see a declaration of two users: `default` and `web`. We added the `web` user separately.
@ -67,7 +67,7 @@ The user that is used for exchanging information between servers combined in a c
The password is specified in clear text (not recommended) or in SHA-256. The hash isn't salted. In this regard, you should not consider these passwords as providing security against potential malicious attacks. Rather, they are necessary for protection from employees.
A list of networks is specified that access is allowed from. In this example, the list of networks for both users is loaded from a separate file (`/etc/metrika.xml`) containing the `networks` substitution. Here is a fragment of it:
```xml
<yandex>
@ -81,15 +81,15 @@ A list of networks is specified that access is allowed from. In this example, th
</yandex>
```
You can define this list of networks directly in `users.xml`, or in a file in the `users.d` directory (for more information, see the section "[Configuration files](configuration_files.md#configuration_files)").
The config includes comments explaining how to open access from everywhere.
For use in production, only specify `ip` elements (IP addresses and their masks), since using `host` and `host_regexp` might cause extra latency.
Next, the user settings profile is specified (see the section "[Settings profiles](settings/settings_profiles.md#settings_profiles)"). You can specify the default profile, `default`. The profile can have any name, and the same profile can be specified for different users. The most important thing you can write in the settings profile is `readonly` set to 1, which ensures read-only access.
Then the quota to be used is specified (see the section "[Quotas](quotas.md#quotas)"). You can specify the default quota, `default`. It is set in the config by default so that it only counts resource usage without restricting it. The quota can have any name, and the same quota can be specified for different users; in this case, resource usage is calculated for each user individually.
In the optional `<allow_databases>` section, you can also specify a list of databases that the user can access. By default, all databases are available to the user. You can specify the `default` database. In this case, the user will receive access to the database by default.
@ -98,3 +98,4 @@ Access to the `system` database is always allowed (since this database is used f
The user can get a list of all databases and tables in them by using `SHOW` queries or system tables, even if access to individual databases isn't allowed.
Database access is not related to the [readonly](settings/query_complexity.md#query_complexity_readonly) setting. You can't grant full access to one database and `readonly` access to another one.
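As a hedged illustration of the `readonly` profile mentioned above, here is roughly what a restricted user such as `web` can and cannot do (the table name is made up):

```sql
-- Reads are allowed for a readonly user.
SELECT count() FROM system.tables;

-- Writes are rejected; ClickHouse refuses to run modifying queries in readonly mode.
-- INSERT INTO hits VALUES (...);
```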

View File

@ -14,7 +14,7 @@ If `replace` is specified, it replaces the entire element with the specified one
If `remove` is specified, it deletes the element.
The config can also define "substitutions". If an element has the `incl` attribute, the corresponding substitution from the file will be used as the value. By default, the path to the file with substitutions is `/etc/metrika.xml`. This can be changed in the [include_from](server_settings/settings.md#server_settings-include_from) element in the server config. The substitution values are specified in `/yandex/substitution_name` elements in this file. If a substitution specified in `incl` does not exist, it is recorded in the log. To prevent ClickHouse from logging missing substitutions, specify the `optional="true"` attribute (for example, settings for [macros](server_settings/settings.md#server_settings-macros)).
Substitutions can also be performed from ZooKeeper. To do this, specify the attribute `from_zk = "/path/to/node"`. The element value is replaced with the contents of the node at `/path/to/node` in ZooKeeper. You can also put an entire XML subtree on the ZooKeeper node, and it will be fully inserted into the source element.
@ -25,4 +25,3 @@ In addition, `users_config` may have overrides in files from the `users_config.d
For each config file, the server also generates `file-preprocessed.xml` files when starting. These files contain all the completed substitutions and overrides, and they are intended for informational use. If ZooKeeper substitutions were used in the config files but ZooKeeper is not available at server start, the server loads the configuration from the preprocessed file.
The server tracks changes in config files, as well as files and ZooKeeper nodes that were used when performing substitutions and overrides, and reloads the settings for users and clusters on the fly. This means that you can modify the cluster, users, and their settings without restarting the server.

View File

@ -1,2 +1,2 @@
# Operations

View File

@ -1,3 +1,5 @@
<a name="quotas"></a>
# Quotas
Quotas allow you to limit resource usage over a period of time, or simply track the use of resources.
@ -13,7 +15,7 @@ In contrast to query complexity restrictions, quotas:
Let's look at the section of the 'users.xml' file that defines quotas.
```xml
<!-- Quotas -->
<quotas>
<!-- Quota name. -->
<default>
@ -84,7 +86,7 @@ Quotas can use the "quota key" feature in order to report on resources for multi
```xml
<!-- For the global reports designer. -->
<web_global>
<!-- keyed – The quota_key "key" is passed in the query parameter,
and the quota is tracked separately for each key value.
For example, you can pass a Yandex.Metrica username as the key,
so the quota will be counted separately for each username.

View File

@ -22,7 +22,7 @@ Default value: 3600.
Data compression settings.
!!! warning "Warning"
Don't use it if you have just started using ClickHouse.
The configuration looks like this:
@ -64,7 +64,7 @@ ClickHouse checks ` min_part_size` and ` min_part_size_ratio` and processes th
The default database.
To get a list of databases, use the [SHOW DATABASES](../../query_language/misc.md#query_language_queries_show_databases) query.
**Example**
@ -176,7 +176,7 @@ You can configure multiple `<graphite>` clauses. For instance, you can use this
Settings for thinning data for Graphite.
For more information, see [GraphiteMergeTree](../../operations/table_engines/graphitemergetree.md#table_engines-graphitemergetree).
**Example**
@ -308,7 +308,7 @@ Logging settings.
Keys:
- level – Logging level. Acceptable values: ``trace``, ``debug``, ``information``, ``warning``, ``error``.
- log – The log file. Contains all the entries according to ``level``.
- errorlog – Error log file.
- size – Size of the file. Applies to ``log`` and ``errorlog``. Once the file reaches ``size``, ClickHouse archives and renames it, and creates a new log file in its place.
- count – The number of archived log files that ClickHouse stores.
@ -325,7 +325,8 @@ Keys:
</logger>
```
Logging to syslog is also supported. Configuration example:
```xml
<logger>
<use_syslog>1</use_syslog>
@ -339,13 +340,14 @@ Also, logging to syslog is possible. Configuration example:
```
Keys:
- user_syslog — Required setting if you want to write to the syslog.
- address — The host[:port] of syslogd. If omitted, the local daemon is used.
- hostname — Optional. The name of the host that logs are sent from.
- facility — [The syslog facility keyword](https://en.wikipedia.org/wiki/Syslog#Facility) in uppercase letters with the "LOG_" prefix (``LOG_USER``, ``LOG_DAEMON``, ``LOG_LOCAL3``, and so on). Default value: ``LOG_USER`` if ``address`` is specified, ``LOG_DAEMON`` otherwise.
- format — Message format. Possible values: ``bsd`` and ``syslog``.
<a name="server_settings-macros"></a>
@ -367,7 +369,7 @@ For more information, see the section "[Creating replicated tables](../../operat
## mark_cache_size
Approximate size (in bytes) of the cache of "marks" used by [MergeTree](../../operations/table_engines/mergetree.md#table_engines-mergetree) engines.
The cache is shared for the server and memory is allocated as needed. The cache size must be at least 5368709120.
@ -423,7 +425,7 @@ We recommend using this option in Mac OS X, since the ` getrlimit()` function re
Restriction on deleting tables.
If the size of a [MergeTree](../../operations/table_engines/mergetree.md#table_engines-mergetree) table exceeds `max_table_size_to_drop` (in bytes), you can't delete it using a DROP query.
If you still need to delete the table without restarting the ClickHouse server, create the `<clickhouse-path>/flags/force_drop_table` file and run the DROP query.
@ -441,7 +443,7 @@ The value 0 means that you can delete all tables without any restrictions.
## merge_tree
Fine tuning for tables in the [MergeTree](../../operations/table_engines/mergetree.md#table_engines-mergetree) family.
For more information, see the MergeTreeSettings.h header file.
@ -459,25 +461,25 @@ For more information, see the MergeTreeSettings.h header file.
SSL client/server configuration.
Support for SSL is provided by the `libpoco` library. The interface is described in the file [SSLManager.h](https://github.com/ClickHouse-Extras/poco/blob/master/NetSSL_OpenSSL/include/Poco/Net/SSLManager.h).
Keys for server/client settings:
- privateKeyFile – The path to the file with the secret key of the PEM certificate. The file may contain a key and certificate at the same time.
- certificateFile – The path to the client/server certificate file in PEM format. You can omit it if ``privateKeyFile`` contains the certificate.
- caConfig – The path to the file or directory that contains trusted root certificates.
- verificationMode – The method for checking the node's certificates. Details are in the description of the [Context](https://github.com/ClickHouse-Extras/poco/blob/master/NetSSL_OpenSSL/include/Poco/Net/Context.h) class. Possible values: ``none``, ``relaxed``, ``strict``, ``once``.
- verificationDepth – The maximum length of the verification chain. Verification will fail if the certificate chain length exceeds the set value.
- loadDefaultCAFile – Indicates that built-in CA certificates for OpenSSL will be used. Acceptable values: ``true``, ``false``.
- cipherList – Supported OpenSSL encryptions. For example: ``ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH``.
- cacheSessions – Enables or disables caching sessions. Must be used in combination with ``sessionIdContext``. Acceptable values: ``true``, ``false``.
- sessionIdContext – A unique set of random characters that the server appends to each generated identifier. The length of the string must not exceed ``SSL_MAX_SSL_SESSION_ID_LENGTH``. This parameter is always recommended, since it helps avoid problems both if the server caches the session and if the client requested caching. Default value: ``${application.name}``.
- sessionCacheSize – The maximum number of sessions that the server caches. Default value: 1024\*20. 0 – Unlimited sessions.
- sessionTimeout – Time for caching the session on the server.
- extendedVerification – Automatically extended verification of certificates after the session ends. Acceptable values: ``true``, ``false``.
- requireTLSv1 – Require a TLSv1 connection. Acceptable values: ``true``, ``false``.
- requireTLSv1_1 – Require a TLSv1.1 connection. Acceptable values: ``true``, ``false``.
- requireTLSv1_2 – Require a TLSv1.2 connection. Acceptable values: ``true``, ``false``.
- fips – Activates OpenSSL FIPS mode. Supported if the library's OpenSSL version supports FIPS.
- privateKeyPassphraseHandler – Class (PrivateKeyPassphraseHandler subclass) that requests the passphrase for accessing the private key. For example: ``<privateKeyPassphraseHandler>``, ``<name>KeyFileHandler</name>``, ``<options><password>test</password></options>``, ``</privateKeyPassphraseHandler>``.
- invalidCertificateHandler – Class (a CertificateHandler subclass) for verifying invalid certificates. For example: ``<invalidCertificateHandler> <name>ConsoleCertificateHandler</name> </invalidCertificateHandler>``.
@ -518,7 +520,7 @@ Keys for server/client settings:
## part_log
Logging events that are associated with [MergeTree](../../operations/table_engines/mergetree.md#table_engines-mergetree) data, such as adding or merging data parts. You can use the log to simulate merge algorithms and compare their characteristics. You can visualize the merge process.
Queries are logged in the ClickHouse table, not in a separate file.
@ -558,9 +560,8 @@ Use the following parameters to configure logging:
The path to the directory containing data.
!!! warning "Attention"
The trailing slash is mandatory.
**Example**
@ -646,8 +647,8 @@ Port for communicating with clients over the TCP protocol.
Path to temporary data for processing large queries.
!!! warning "Attention"
The trailing slash is mandatory.
**Example**
@ -659,7 +660,7 @@ Path to temporary data for processing large queries.
## uncompressed_cache_size
Cache size (in bytes) for uncompressed data used by table engines from the [MergeTree](../../operations/table_engines/mergetree.md#table_engines-mergetree) family.
There is one shared cache for the server. Memory is allocated on demand. The cache is used if the option [use_uncompressed_cache](../settings/settings.md#settings-use_uncompressed_cache) is enabled.
@ -673,7 +674,7 @@ The uncompressed cache is advantageous for very short queries in individual case
## user_files_path
The directory with user files. Used in the [file()](../../query_language/table_functions/file.md#table_functions-file) table function.
**Example**
@ -715,3 +716,4 @@ For more information, see the section "[Replication](../../operations/table_engi
```xml
<zookeeper incl="zookeeper-servers" optional="true" />
```

View File

@ -7,17 +7,18 @@ Settings are configured in layers, so each subsequent layer redefines the previo
Ways to configure settings, in order of priority:
- Settings in the `users.xml` server configuration file.
    Set them in the user profile, in the `<profiles>` element.
- Session settings.
    Send `SET setting=value` from the ClickHouse console client in interactive mode.
    Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, specify the `session_id` HTTP parameter.
- Query settings.
    - When starting the ClickHouse console client in non-interactive mode, set the startup parameter `--setting=value`.
    - When using the HTTP API, pass CGI parameters (`URL?setting_1=value&setting_2=value...`).
Settings that can only be made in the server config file are not covered in this section.
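A small sketch of the session-level approach. `max_threads` is a real setting; the chosen value and the server address are only examples:

```sql
-- Applies for the rest of the current session.
SET max_threads = 8;

-- The same setting can be passed per query over HTTP as a CGI parameter, e.g.:
--   http://localhost:8123/?query=SELECT+1&max_threads=8
```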

View File

@ -4,18 +4,24 @@
## distributed_product_mode
Changes the behavior of [distributed subqueries](../../query_language/select.md#queries-distributed-subrequests).
ClickHouse applies this setting when the query contains the product of distributed tables, i.e. when a query for a distributed table contains a non-GLOBAL subquery for a distributed table.
Restrictions:
- Only applied for IN and JOIN subqueries.
- Only used if the FROM section uses a distributed table containing more than one shard.
- Only used if the subquery concerns a distributed table containing more than one shard.
- Not used for the table-valued [remote](../../query_language/table_functions/remote.md#table_functions-remote) function.
The possible values are:
- `deny` — Default value. Prohibits using these types of subqueries (returns the "Double-distributed IN/JOIN subqueries is denied" exception).
- `local` — Replaces the database and table in the subquery with local ones for the destination server (shard), leaving the normal `IN`/`JOIN`.
- `global` — Replaces the `IN`/`JOIN` query with `GLOBAL IN`/`GLOBAL JOIN`.
- `allow` — Allows the use of these types of subqueries.
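A hedged sketch of how the setting changes a double-distributed subquery. Here `distributed_table` stands for any Distributed table with more than one shard, and the column names are illustrative:

```sql
SET distributed_product_mode = 'local';

-- With the default 'deny' mode, this query would fail with
-- "Double-distributed IN/JOIN subqueries is denied".
SELECT count()
FROM distributed_table
WHERE UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34);
-- With 'local', the inner subquery is rewritten to read the shard-local table on each shard.
```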
<a name="settings-settings-fallback_to_stale_replicas_for_distributed_queries"></a>
## fallback_to_stale_replicas_for_distributed_queries
@ -158,7 +164,7 @@ Don't confuse blocks for compression (a chunk of memory consisting of bytes) and
## min_compress_block_size
For [MergeTree](../../operations/table_engines/mergetree.md#table_engines-mergetree) tables. In order to reduce latency when processing queries, a block is compressed when writing the next mark if its size is at least 'min_compress_block_size'. By default, 65,536.
The actual size of the block, if the uncompressed data is less than 'max_compress_block_size', is no less than this value and no less than the volume of data for one mark.
@ -343,4 +349,12 @@ If the value is true, integers appear in quotes when using JSON\* Int64 and UInt
## format_csv_delimiter
The character interpreted as the delimiter in CSV data. By default, the delimiter is `,`.
<!--a name="settings-join_use_nulls"></a-->
## join_use_nulls {: #settings-join_use_nulls}
Affects the behavior of [JOIN](../../query_language/select.md#query_language-join).
With `join_use_nulls=1`, `JOIN` behaves like in standard SQL, i.e. if empty cells appear when merging, the type of the corresponding field is converted to [Nullable](../../data_types/nullable.md#data_type-nullable), and empty cells are filled with [NULL](../../query_language/syntax.md#null-literal).
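A minimal self-contained sketch of the difference; the inline subqueries are made up for illustration:

```sql
SET join_use_nulls = 1;

SELECT k, v
FROM (SELECT 1 AS k)
ANY LEFT JOIN (SELECT 2 AS k, 'x' AS v) USING k;
-- The unmatched right side returns v = NULL (type Nullable(String)).
-- With join_use_nulls = 0, it would return the type's default value ('' for String).
```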

View File

@ -1,11 +1,13 @@
<a name="settings_profiles"></a>
# Settings profiles
A settings profile is a collection of settings grouped under the same name. Each ClickHouse user has a profile.
To apply all the settings in a profile, set the `profile` setting.
Example:
Setting the `web` profile:
```sql
SET profile = 'web'
@ -60,3 +62,4 @@ Example:
The example specifies two profiles: `default` and `web`. The `default` profile has a special purpose: it must always be present and is applied when starting the server. In other words, the `default` profile contains default settings. The `web` profile is a regular profile that can be set using the `SET` query or using a URL parameter in an HTTP query.
Settings profiles can inherit from each other. To use inheritance, indicate the `profile` setting before the other settings that are listed in the profile.

View File

@ -5,7 +5,6 @@ You can't delete a system table (but you can perform DETACH).
System tables don't have files with data on the disk or files with metadata. The server creates all the system tables when it starts.
System tables are read-only.
They are located in the 'system' database.
<a name="system_tables-system.asynchronous_metrics"></a>
## system.asynchronous_metrics
@ -20,27 +19,28 @@ Contains information about clusters available in the config file and the servers
Columns:
```text
cluster String      — The cluster name.
shard_num UInt32    — The shard number in the cluster, starting from 1.
shard_weight UInt32 — The relative weight of the shard when writing data.
replica_num UInt32  — The replica number in the shard, starting from 1.
host_name String    — The host name, as specified in the config.
host_address String — The host IP address obtained from DNS.
port UInt16         — The port to use for connecting to the server.
user String         — The name of the user for connecting to the server.
```
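For example, a quick way to inspect the cluster topology defined in the config, using only the columns listed above:

```sql
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
ORDER BY cluster, shard_num, replica_num;
```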
## system.columns
Contains information about the columns in all tables.
You can use this table to get information similar to `DESCRIBE TABLE`, but for multiple tables at once.
```text
database String           — The name of the database the table is in.
table String              — Table name.
name String               — Column name.
type String               — Column type.
default_type String       — Expression type (DEFAULT, MATERIALIZED, ALIAS) for the default value, or an empty string if it is not defined.
default_expression String — Expression for the default value, or an empty string if it is not defined.
```
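For example, a query that lists column metadata for every table in one database; the database name `default` is just an assumption:

```sql
SELECT table, name, type, default_type, default_expression
FROM system.columns
WHERE database = 'default'
ORDER BY table, name;
```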
## system.databases
@ -55,19 +55,19 @@ Contains information about external dictionaries.
Columns:
- `name String` — Dictionary name.
- `type String` — Dictionary type: Flat, Hashed, Cache.
- `origin String` — Path to the configuration file that describes the dictionary.
- `attribute.names Array(String)` — Array of attribute names provided by the dictionary.
- `attribute.types Array(String)` — Corresponding array of attribute types provided by the dictionary.
- `has_hierarchy UInt8` — Whether the dictionary is hierarchical.
- `bytes_allocated UInt64` — The amount of RAM the dictionary uses.
- `hit_rate Float64` — For cache dictionaries, the percentage of uses for which the value was in the cache.
- `element_count UInt64` — The number of items stored in the dictionary.
- `load_factor Float64` — The filled percentage of the dictionary (for a hashed dictionary, the filled percentage of the hash table).
- `creation_time DateTime` — The time when the dictionary was created or last successfully reloaded.
- `last_exception String` — Text of the error that occurred when creating or reloading the dictionary, if the dictionary couldn't be created.
- `source String` — Text describing the data source for the dictionary.
Note that the amount of memory used by the dictionary is not proportional to the number of items stored in it. So for flat and cached dictionaries, all the memory cells are pre-assigned, regardless of how full the dictionary actually is.
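For example, a quick health check over the columns listed above:

```sql
SELECT name, type, bytes_allocated, element_count, last_exception
FROM system.dictionaries
ORDER BY bytes_allocated DESC;
```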
<a name="system_tables-system.events"></a>
@ -84,26 +84,27 @@ Contains information about normal and aggregate functions.
Columns:
- `name` (`String`) — The name of the function.
- `is_aggregate` (`UInt8`) — Whether the function is aggregate.
## system.merges
Contains information about merges currently in process for tables in the MergeTree family.
Columns:
- `database String` — The name of the database the table is in.
- `table String` — Table name.
- `elapsed Float64` — The time elapsed (in seconds) since the merge started.
- `progress Float64` — The percentage of completed work, from 0 to 1.
- `num_parts UInt64` — The number of parts to be merged.
- `result_part_name String` — The name of the part that will be formed as the result of the merge.
- `total_size_bytes_compressed UInt64` — The total size of the compressed data in the parts being merged.
- `total_size_marks UInt64` — The total number of marks in the parts being merged.
- `bytes_read_uncompressed UInt64` — Number of bytes read, uncompressed.
- `rows_read UInt64` — Number of rows read.
- `bytes_written_uncompressed UInt64` — Number of bytes written, uncompressed.
- `rows_written UInt64` — Number of rows written.
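For example, a query to watch the merges that are currently running, using only the columns listed above:

```sql
SELECT database, table, elapsed, progress, num_parts, result_part_name
FROM system.merges;
```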
<a name="system_tables-system.metrics"></a>
## system.metrics
@ -127,31 +128,54 @@ This is similar to the DUAL table found in other DBMSs.
## system.parts
Contains information about parts of [MergeTree](table_engines/mergetree.md#table_engines-mergetree) tables.
Each row describes one data part.
Columns:
- partition (String) – The partition name. To learn what a partition is, see the description of the [ALTER](../query_language/alter.md#query_language_queries_alter) query. Formats:
    - `YYYYMM` for automatic partitioning by month.
    - `any_string` when partitioning manually.
- name (String) – Name of the data part.
- active (UInt8) – Indicates whether the part is active. If a part is active, it is used in a table; otherwise, it will be deleted. Inactive data parts remain after merging.
- marks (UInt64) – The number of marks. To get the approximate number of rows in a data part, multiply ``marks`` by the index granularity (usually 8192).
- marks_size (UInt64) – The size of the file with marks.
- rows (UInt64) – The number of rows.
- bytes (UInt64) – The number of bytes when compressed.
- modification_time (DateTime) – The modification time of the directory with the data part. This usually corresponds to the time of data part creation.
- remove_time (DateTime) – The time when the data part became inactive.
- refcount (UInt32) – The number of places where the data part is used. A value greater than 2 indicates that the data part is used in queries or merges.
- min_date (Date) – The minimum value of the date key in the data part.
- max_date (Date) – The maximum value of the date key in the data part.
- min_block_number (UInt64) – The minimum number of data parts that make up the current part after merging.
- max_block_number (UInt64) – The maximum number of data parts that make up the current part after merging.
- level (UInt32) – Depth of the merge tree. If a merge was not performed, ``level=0``.
- primary_key_bytes_in_memory (UInt64) – The amount of memory (in bytes) used by primary key values.
- primary_key_bytes_in_memory_allocated (UInt64) – The amount of memory (in bytes) reserved for primary key values.
- database (String) – Name of the database.
- table (String) – Name of the table.
- engine (String) – Name of the table engine without parameters.
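
As an illustration only (not part of the original reference), a query of the following form lists the active parts of a table grouped by partition; the database and table names here are placeholders:

```sql
SELECT
    partition,
    count() AS parts,
    sum(rows) AS total_rows,
    sum(bytes) AS bytes_compressed
FROM system.parts
WHERE active AND database = 'default' AND table = 'hits'  -- 'default' and 'hits' are placeholder names
GROUP BY partition
ORDER BY partition
```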
## system.processes

Columns:

```text
user String – Name of the user who made the request. For distributed query processing, this is the user who helped the requestor server send the query to this server, not the user who made the distributed request on the requestor server.
address String – The IP address that the query was made from. The same is true for distributed query processing.
elapsed Float64 – The time in seconds since request execution started.
rows_read UInt64 – The number of rows read from the table. For distributed processing, on the requestor server, this is the total for all remote servers.
bytes_read UInt64 – The number of uncompressed bytes read from the table. For distributed processing, on the requestor server, this is the total for all remote servers.
total_rows_approx UInt64 – The approximate total number of rows that must be read. For distributed processing, on the requestor server, this is the total for all remote servers. It can be updated during request processing, when new sources to process become known.
memory_usage UInt64 – Memory consumption by the query. It might not include some types of dedicated memory.
query String – The query text. For INSERT, it doesn't include the data to insert.
query_id String – Query ID, if defined.
```
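
For instance, a sketch added here for illustration, assuming the column names match the list above, inspects the currently running queries:

```sql
SELECT
    user,
    address,
    elapsed,
    rows_read,
    bytes_read,
    memory_usage,
    query
FROM system.processes
ORDER BY elapsed DESC
```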
## system.replicas

Columns:

```text
database: Database name
table: Table name
engine: Table engine name
is_leader: Whether the replica is the leader.
Only one replica at a time can be the leader. The leader is responsible for selecting background merges to perform.
Note that writes can be performed to any replica that is available and has a session in ZooKeeper, regardless of whether it is a leader.
is_readonly: Whether the replica is in read-only mode.
This mode is turned on if the config doesn't have sections with ZooKeeper, if an unknown error occurred when reinitializing sessions in ZooKeeper, and during session reinitialization in ZooKeeper.
is_session_expired: Whether the session with ZooKeeper has expired.
Basically the same as 'is_readonly'.
future_parts: The number of data parts that will appear as the result of INSERTs or merges that haven't been done yet.
parts_to_check: The number of data parts in the queue for verification.
A part is put in the verification queue if there is suspicion that it might be damaged.
zookeeper_path: Path to the table data in ZooKeeper.
replica_name: Name of the replica in ZooKeeper. Different replicas of the same table have different names.
replica_path: Path to the replica data in ZooKeeper. The same as concatenating 'zookeeper_path/replicas/replica_name'.
columns_version: Version number of the table structure.
Indicates how many times ALTER was performed. If replicas have different versions, it means some replicas haven't made all of the ALTERs yet.
queue_size: Size of the queue for operations waiting to be performed.
Operations include inserting blocks of data, merges, and certain other actions.
It usually coincides with 'future_parts'.
inserts_in_queue: Number of inserts of blocks of data that need to be made.
Insertions are usually replicated fairly quickly. If this number is large, it means something is wrong.
merges_in_queue: The number of merges waiting to be made.
Sometimes merges are lengthy, so this value may be greater than zero for a long time.
The next 4 columns have a non-zero value only if there is an active session with ZooKeeper.
log_max_index: Maximum entry number in the log of general activity.
log_pointer: Maximum entry number in the log of general activity that the replica copied to its execution queue, plus one.
If log_pointer is much smaller than log_max_index, something is wrong.
total_replicas: The total number of known replicas of this table.
active_replicas: The number of replicas of this table that have a session in ZooKeeper (i.e., the number of functioning replicas).
```
If you request all the columns, the table may work a bit slowly, since several reads from ZooKeeper are made for each row.
If you don't request the last 4 columns (log_max_index, log_pointer, total_replicas, active_replicas), the table works quickly.

For example, you can check that everything is working correctly like this:
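
The query itself is omitted from this excerpt; a sketch of such a health check, with purely illustrative thresholds, could look like this:

```sql
SELECT
    database,
    table,
    is_readonly,
    is_session_expired,
    future_parts,
    parts_to_check,
    queue_size,
    inserts_in_queue,
    merges_in_queue,
    log_max_index,
    log_pointer,
    total_replicas,
    active_replicas
FROM system.replicas
WHERE
       is_readonly
    OR is_session_expired
    OR future_parts > 20        -- thresholds below are illustrative, not normative
    OR parts_to_check > 10
    OR queue_size > 20
    OR inserts_in_queue > 10
    OR log_max_index - log_pointer > 10
    OR total_replicas < 2
    OR active_replicas < total_replicas
```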
If this query doesn't return anything, it means that everything is fine.
## system.settings

Contains information about settings that are currently in use.
I.e. used for executing the query you are using to read from the system.settings table.

Columns:
```text
name String – Setting name.
value String – Setting value.
changed UInt8 – Whether the setting was explicitly defined in the config or explicitly changed.
```
Example:
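
The example output is not reproduced in this excerpt; a typical query to show only the settings that differ from the defaults (a minimal sketch) is:

```sql
SELECT *
FROM system.settings
WHERE changed
```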
## system.zookeeper

This table exists only when ZooKeeper is configured. It allows reading data from the ZooKeeper cluster defined in the config.
The query must have a 'path' equality condition in the WHERE clause. This is the path in ZooKeeper for the children that you want to get data for.
The query `SELECT * FROM system.zookeeper WHERE path = '/clickhouse'` outputs data for all children on the `/clickhouse` node.
If the path specified in 'path' doesn't exist, an exception will be thrown.
Columns:

- `name String` — Name of the node.
- `path String` — Path to the node.
- `value String` — Value of the node.
- `dataLength Int32` — Size of the value.
- `numChildren Int32` — Number of children.
- `czxid Int64` — ID of the transaction that created the node.
- `mzxid Int64` — ID of the transaction that last changed the node.
- `pzxid Int64` — ID of the transaction that last added or removed children.
- `ctime DateTime` — Time of node creation.
- `mtime DateTime` — Time of the last modification of the node.
- `version Int32` — Node version: the number of times the node was changed.
- `cversion Int32` — Number of added or removed children.
- `aversion Int32` — Number of changes to the ACL.
- `ephemeralOwner Int64` — For ephemeral nodes, the ID of the session that owns this node.
Example:
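
The example output is omitted from this excerpt; a query in the spirit of the one mentioned above, selecting a few of the columns listed, might look like this:

```sql
SELECT
    name,
    value,
    numChildren,
    mtime
FROM system.zookeeper
WHERE path = '/clickhouse'
```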
