Merge branch 'master' into shellcheck3

This commit is contained in:
alexey-milovidov 2020-08-15 19:49:22 +03:00 committed by GitHub
commit 966dd287b7
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
500 changed files with 20021 additions and 46806 deletions

View File

@ -1,5 +1,184 @@
## ClickHouse release 20.6
### ClickHouse release v20.6.3.28-stable
#### New Feature
* Added an initial implementation of `EXPLAIN` query. Syntax: `EXPLAIN SELECT ...`. This fixes [#1118](https://github.com/ClickHouse/ClickHouse/issues/1118). [#11873](https://github.com/ClickHouse/ClickHouse/pull/11873) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Added storage `RabbitMQ`. [#11069](https://github.com/ClickHouse/ClickHouse/pull/11069) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Implemented PostgreSQL-like `ILIKE` operator for [#11710](https://github.com/ClickHouse/ClickHouse/issues/11710). [#12125](https://github.com/ClickHouse/ClickHouse/pull/12125) ([Mike](https://github.com/myrrc)).
* Supported RIGHT and FULL JOIN with `SET join_algorithm = 'partial_merge'`. Only ALL strictness is allowed (ANY, SEMI, ANTI, ASOF are not). [#12118](https://github.com/ClickHouse/ClickHouse/pull/12118) ([Artem Zuikov](https://github.com/4ertus2)).
* Added a function `initializeAggregation` to initialize an aggregation based on a single value. [#12109](https://github.com/ClickHouse/ClickHouse/pull/12109) ([Guillaume Tassery](https://github.com/YiuRULE)).
* Supported `ALTER TABLE ... [ADD|MODIFY] COLUMN ... FIRST` [#4006](https://github.com/ClickHouse/ClickHouse/issues/4006). [#12073](https://github.com/ClickHouse/ClickHouse/pull/12073) ([Winter Zhang](https://github.com/zhang2014)).
* Added function `parseDateTimeBestEffortUS`. [#12028](https://github.com/ClickHouse/ClickHouse/pull/12028) ([flynn](https://github.com/ucasFL)).
* Support format `ORC` for output (was supported only for input). [#11662](https://github.com/ClickHouse/ClickHouse/pull/11662) ([Kruglov Pavel](https://github.com/Avogar)).
#### Bug Fix
* Fixed `aggregate function any(x) is found inside another aggregate function in query` error with `SET optimize_move_functions_out_of_any = 1` and aliases inside `any()`. [#13419](https://github.com/ClickHouse/ClickHouse/pull/13419) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed `PrettyCompactMonoBlock` for clickhouse-local. Fixed extremes/totals with `PrettyCompactMonoBlock`. This fixes [#7746](https://github.com/ClickHouse/ClickHouse/issues/7746). [#13394](https://github.com/ClickHouse/ClickHouse/pull/13394) ([Azat Khuzhin](https://github.com/azat)).
* Fixed possible error `Totals having transform was already added to pipeline` in case of a query from delayed replica. [#13290](https://github.com/ClickHouse/ClickHouse/pull/13290) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* The server may crash if user passed specifically crafted arguments to the function `h3ToChildren`. This fixes [#13275](https://github.com/ClickHouse/ClickHouse/issues/13275). [#13277](https://github.com/ClickHouse/ClickHouse/pull/13277) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potentially low performance and slightly incorrect result for `uniqExact`, `topK`, `sumDistinct` and similar aggregate functions called on Float types with NaN values. It also triggered assert in debug build. This fixes [#12491](https://github.com/ClickHouse/ClickHouse/issues/12491). [#13254](https://github.com/ClickHouse/ClickHouse/pull/13254) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed function if with nullable constexpr as cond that is not literal NULL. Fixes [#12463](https://github.com/ClickHouse/ClickHouse/issues/12463). [#13226](https://github.com/ClickHouse/ClickHouse/pull/13226) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed assert in `arrayElement` function in case of array elements are Nullable and array subscript is also Nullable. This fixes [#12172](https://github.com/ClickHouse/ClickHouse/issues/12172). [#13224](https://github.com/ClickHouse/ClickHouse/pull/13224) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `DateTime64` conversion functions with constant argument. [#13205](https://github.com/ClickHouse/ClickHouse/pull/13205) ([Azat Khuzhin](https://github.com/azat)).
* Fixed wrong index analysis with functions. It could lead to pruning wrong parts, while reading from `MergeTree` tables. Fixes [#13060](https://github.com/ClickHouse/ClickHouse/issues/13060). Fixes [#12406](https://github.com/ClickHouse/ClickHouse/issues/12406). [#13081](https://github.com/ClickHouse/ClickHouse/pull/13081) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed error `Cannot convert column because it is constant but values of constants are different in source and result` for remote queries which use deterministic functions in scope of query, but not deterministic between queries, like `now()`, `now64()`, `randConstant()`. Fixes [#11327](https://github.com/ClickHouse/ClickHouse/issues/11327). [#13075](https://github.com/ClickHouse/ClickHouse/pull/13075) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed unnecessary limiting for the number of threads for selects from local replica. [#12840](https://github.com/ClickHouse/ClickHouse/pull/12840) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed rare bug when `ALTER DELETE` and `ALTER MODIFY COLUMN` queries executed simultaneously as a single mutation. Bug leads to an incorrect amount of rows in `count.txt` and as a consequence incorrect data in part. Also, fix a small bug with simultaneous `ALTER RENAME COLUMN` and `ALTER ADD COLUMN`. [#12760](https://github.com/ClickHouse/ClickHouse/pull/12760) ([alesapin](https://github.com/alesapin)).
* Fixed `CAST(Nullable(String), Enum())`. [#12745](https://github.com/ClickHouse/ClickHouse/pull/12745) ([Azat Khuzhin](https://github.com/azat)).
* Fixed a performance with large tuples, which are interpreted as functions in `IN` section. The case when user write `WHERE x IN tuple(1, 2, ...)` instead of `WHERE x IN (1, 2, ...)` for some obscure reason. [#12700](https://github.com/ClickHouse/ClickHouse/pull/12700) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed memory tracking for `input_format_parallel_parsing` (by attaching thread to group). [#12672](https://github.com/ClickHouse/ClickHouse/pull/12672) ([Azat Khuzhin](https://github.com/azat)).
* Fixed bloom filter index with const expression. This fixes [#10572](https://github.com/ClickHouse/ClickHouse/issues/10572). [#12659](https://github.com/ClickHouse/ClickHouse/pull/12659) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `SIGSEGV` in `StorageKafka` when broker is unavailable (and not only). [#12658](https://github.com/ClickHouse/ClickHouse/pull/12658) ([Azat Khuzhin](https://github.com/azat)).
* Added support for function `if` with `Array(UUID)` arguments. This fixes [#11066](https://github.com/ClickHouse/ClickHouse/issues/11066). [#12648](https://github.com/ClickHouse/ClickHouse/pull/12648) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* `CREATE USER IF NOT EXISTS` now doesn't throw exception if the user exists. This fixes [#12507](https://github.com/ClickHouse/ClickHouse/issues/12507). [#12646](https://github.com/ClickHouse/ClickHouse/pull/12646) ([Vitaly Baranov](https://github.com/vitlibar)).
* Better exception message in disk access storage. [#12625](https://github.com/ClickHouse/ClickHouse/pull/12625) ([alesapin](https://github.com/alesapin)).
* The function `groupArrayMoving*` was not working for distributed queries. It's result was calculated within incorrect data type (without promotion to the largest type). The function `groupArrayMovingAvg` was returning integer number that was inconsistent with the `avg` function. This fixes [#12568](https://github.com/ClickHouse/ClickHouse/issues/12568). [#12622](https://github.com/ClickHouse/ClickHouse/pull/12622) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed lack of aliases with function `any`. [#12593](https://github.com/ClickHouse/ClickHouse/pull/12593) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed race condition in external dictionaries with cache layout which can lead server crash. [#12566](https://github.com/ClickHouse/ClickHouse/pull/12566) ([alesapin](https://github.com/alesapin)).
* Remove data for Distributed tables (blocks from async INSERTs) on DROP TABLE. [#12556](https://github.com/ClickHouse/ClickHouse/pull/12556) ([Azat Khuzhin](https://github.com/azat)).
* Fixed bug which lead to broken old parts after `ALTER DELETE` query when `enable_mixed_granularity_parts=1`. Fixes [#12536](https://github.com/ClickHouse/ClickHouse/issues/12536). [#12543](https://github.com/ClickHouse/ClickHouse/pull/12543) ([alesapin](https://github.com/alesapin)).
* Better exception for function `in` with invalid number of arguments. [#12529](https://github.com/ClickHouse/ClickHouse/pull/12529) ([Anton Popov](https://github.com/CurtizJ)).
* Fixing race condition in live view tables which could cause data duplication. [#12519](https://github.com/ClickHouse/ClickHouse/pull/12519) ([vzakaznikov](https://github.com/vzakaznikov)).
* Fixed performance issue, while reading from compact parts. [#12492](https://github.com/ClickHouse/ClickHouse/pull/12492) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed backwards compatibility in binary format of `AggregateFunction(avg, ...)` values. This fixes [#12342](https://github.com/ClickHouse/ClickHouse/issues/12342). [#12486](https://github.com/ClickHouse/ClickHouse/pull/12486) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed SETTINGS parse after FORMAT. [#12480](https://github.com/ClickHouse/ClickHouse/pull/12480) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the deadlock if `text_log` is enabled. [#12452](https://github.com/ClickHouse/ClickHouse/pull/12452) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed overflow when very large `LIMIT` or `OFFSET` is specified. This fixes [#10470](https://github.com/ClickHouse/ClickHouse/issues/10470). This fixes [#11372](https://github.com/ClickHouse/ClickHouse/issues/11372). [#12427](https://github.com/ClickHouse/ClickHouse/pull/12427) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed possible segfault if `StorageMerge`. This fixes [#12054](https://github.com/ClickHouse/ClickHouse/issues/12054). [#12401](https://github.com/ClickHouse/ClickHouse/pull/12401) ([tavplubix](https://github.com/tavplubix)).
* Reverted change introduced in [#11079](https://github.com/ClickHouse/ClickHouse/issues/11079) to resolve [#12098](https://github.com/ClickHouse/ClickHouse/issues/12098). [#12397](https://github.com/ClickHouse/ClickHouse/pull/12397) ([Mike](https://github.com/myrrc)).
* Additional check for arguments of bloom filter index. This fixes [#11408](https://github.com/ClickHouse/ClickHouse/issues/11408). [#12388](https://github.com/ClickHouse/ClickHouse/pull/12388) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid exception when negative or floating point constant is used in WHERE condition for indexed tables. This fixes [#11905](https://github.com/ClickHouse/ClickHouse/issues/11905). [#12384](https://github.com/ClickHouse/ClickHouse/pull/12384) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Allowed to `CLEAR` column even if there are depending `DEFAULT` expressions. This fixes [#12333](https://github.com/ClickHouse/ClickHouse/issues/12333). [#12378](https://github.com/ClickHouse/ClickHouse/pull/12378) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix `TOTALS/ROLLUP/CUBE` for aggregate functions with `-State` and `Nullable` arguments. This fixes [#12163](https://github.com/ClickHouse/ClickHouse/issues/12163). [#12376](https://github.com/ClickHouse/ClickHouse/pull/12376) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed error message and exit codes for `ALTER RENAME COLUMN` queries, when `RENAME` is not allowed. Fixes [#12301](https://github.com/ClickHouse/ClickHouse/issues/12301) and [#12303](https://github.com/ClickHouse/ClickHouse/issues/12303). [#12335](https://github.com/ClickHouse/ClickHouse/pull/12335) ([alesapin](https://github.com/alesapin)).
* Fixed very rare race condition in `ReplicatedMergeTreeQueue`. [#12315](https://github.com/ClickHouse/ClickHouse/pull/12315) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* When using codec `Delta` or `DoubleDelta` with non fixed width types, exception with code `LOGICAL_ERROR` was returned instead of exception with code `BAD_ARGUMENTS` (we ensure that exceptions with code logical error never happen). This fixes [#12110](https://github.com/ClickHouse/ClickHouse/issues/12110). [#12308](https://github.com/ClickHouse/ClickHouse/pull/12308) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed order of columns in `WITH FILL` modifier. Previously order of columns of `ORDER BY` statement wasn't respected. [#12306](https://github.com/ClickHouse/ClickHouse/pull/12306) ([Anton Popov](https://github.com/CurtizJ)).
* Avoid "bad cast" exception when there is an expression that filters data by virtual columns (like `_table` in `Merge` tables) or by "index" columns in system tables such as filtering by database name when querying from `system.tables`, and this expression returns `Nullable` type. This fixes [#12166](https://github.com/ClickHouse/ClickHouse/issues/12166). [#12305](https://github.com/ClickHouse/ClickHouse/pull/12305) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `TTL` after renaming column, on which depends TTL expression. [#12304](https://github.com/ClickHouse/ClickHouse/pull/12304) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed SIGSEGV if there is an message with error in the middle of the batch in `Kafka` Engine. [#12302](https://github.com/ClickHouse/ClickHouse/pull/12302) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the situation when some threads might randomly hang for a few seconds during `DNS` cache updating. [#12296](https://github.com/ClickHouse/ClickHouse/pull/12296) ([tavplubix](https://github.com/tavplubix)).
* Fixed typo in setting name. [#12292](https://github.com/ClickHouse/ClickHouse/pull/12292) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Show error after `TrieDictionary` failed to load. [#12290](https://github.com/ClickHouse/ClickHouse/pull/12290) ([Vitaly Baranov](https://github.com/vitlibar)).
* The function `arrayFill` worked incorrectly for empty arrays that may lead to crash. This fixes [#12263](https://github.com/ClickHouse/ClickHouse/issues/12263). [#12279](https://github.com/ClickHouse/ClickHouse/pull/12279) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Implement conversions to the common type for `LowCardinality` types. This allows to execute UNION ALL of tables with columns of LowCardinality and other columns. This fixes [#8212](https://github.com/ClickHouse/ClickHouse/issues/8212). This fixes [#4342](https://github.com/ClickHouse/ClickHouse/issues/4342). [#12275](https://github.com/ClickHouse/ClickHouse/pull/12275) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the behaviour on reaching redirect limit in request to `S3` storage. [#12256](https://github.com/ClickHouse/ClickHouse/pull/12256) ([ianton-ru](https://github.com/ianton-ru)).
* Fixed the behaviour when during multiple sequential inserts in `StorageFile` header for some special types was written more than once. This fixed [#6155](https://github.com/ClickHouse/ClickHouse/issues/6155). [#12197](https://github.com/ClickHouse/ClickHouse/pull/12197) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed logical functions for UInt8 values when they are not equal to 0 or 1. [#12196](https://github.com/ClickHouse/ClickHouse/pull/12196) ([Alexander Kazakov](https://github.com/Akazz)).
* Cap max_memory_usage* limits to the process resident memory. [#12182](https://github.com/ClickHouse/ClickHouse/pull/12182) ([Azat Khuzhin](https://github.com/azat)).
* Fix dictGet arguments check during `GROUP BY` injective functions elimination. [#12179](https://github.com/ClickHouse/ClickHouse/pull/12179) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the behaviour when `SummingMergeTree` engine sums up columns from partition key. Added an exception in case of explicit definition of columns to sum which intersects with partition key columns. This fixes [#7867](https://github.com/ClickHouse/ClickHouse/issues/7867). [#12173](https://github.com/ClickHouse/ClickHouse/pull/12173) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Don't split the dictionary source's table name into schema and table name itself if ODBC connection doesn't support schema. [#12165](https://github.com/ClickHouse/ClickHouse/pull/12165) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed wrong logic in `ALTER DELETE` that leads to deleting of records when condition evaluates to NULL. This fixes [#9088](https://github.com/ClickHouse/ClickHouse/issues/9088). This closes [#12106](https://github.com/ClickHouse/ClickHouse/issues/12106). [#12153](https://github.com/ClickHouse/ClickHouse/pull/12153) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed transform of query to send to external DBMS (e.g. MySQL, ODBC) in presense of aliases. This fixes [#12032](https://github.com/ClickHouse/ClickHouse/issues/12032). [#12151](https://github.com/ClickHouse/ClickHouse/pull/12151) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed bad code in redundant ORDER BY optimization. The bug was introduced in [#10067](https://github.com/ClickHouse/ClickHouse/issues/10067). [#12148](https://github.com/ClickHouse/ClickHouse/pull/12148) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential overflow in integer division. This fixes [#12119](https://github.com/ClickHouse/ClickHouse/issues/12119). [#12140](https://github.com/ClickHouse/ClickHouse/pull/12140) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential infinite loop in `greatCircleDistance`, `geoDistance`. This fixes [#12117](https://github.com/ClickHouse/ClickHouse/issues/12117). [#12137](https://github.com/ClickHouse/ClickHouse/pull/12137) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Normalize "pid" file handling. In previous versions the server may refuse to start if it was killed without proper shutdown and if there is another process that has the same pid as previously runned server. Also pid file may be removed in unsuccessful server startup even if there is another server running. This fixes [#3501](https://github.com/ClickHouse/ClickHouse/issues/3501). [#12133](https://github.com/ClickHouse/ClickHouse/pull/12133) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed bug which leads to incorrect table metadata in ZooKeepeer for ReplicatedVersionedCollapsingMergeTree tables. Fixes [#12093](https://github.com/ClickHouse/ClickHouse/issues/12093). [#12121](https://github.com/ClickHouse/ClickHouse/pull/12121) ([alesapin](https://github.com/alesapin)).
* Avoid "There is no query" exception for materialized views with joins or with subqueries attached to system logs (system.query_log, metric_log, etc) or to engine=Buffer underlying table. [#12120](https://github.com/ClickHouse/ClickHouse/pull/12120) ([filimonov](https://github.com/filimonov)).
* Fixed handling dependency of table with ENGINE=Dictionary on dictionary. This fixes [#10994](https://github.com/ClickHouse/ClickHouse/issues/10994). This fixes [#10397](https://github.com/ClickHouse/ClickHouse/issues/10397). [#12116](https://github.com/ClickHouse/ClickHouse/pull/12116) ([Vitaly Baranov](https://github.com/vitlibar)).
* Format `Parquet` now properly works with `LowCardinality` and `LowCardinality(Nullable)` types. Fixes [#12086](https://github.com/ClickHouse/ClickHouse/issues/12086), [#8406](https://github.com/ClickHouse/ClickHouse/issues/8406). [#12108](https://github.com/ClickHouse/ClickHouse/pull/12108) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed performance for selects with `UNION` caused by wrong limit for the total number of threads. Fixes [#12030](https://github.com/ClickHouse/ClickHouse/issues/12030). [#12103](https://github.com/ClickHouse/ClickHouse/pull/12103) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed segfault with `-StateResample` combinators. [#12092](https://github.com/ClickHouse/ClickHouse/pull/12092) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed empty `result_rows` and `result_bytes` metrics in `system.quey_log` for selects. Fixes [#11595](https://github.com/ClickHouse/ClickHouse/issues/11595). [#12089](https://github.com/ClickHouse/ClickHouse/pull/12089) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed unnecessary limiting the number of threads for selects from `VIEW`. Fixes [#11937](https://github.com/ClickHouse/ClickHouse/issues/11937). [#12085](https://github.com/ClickHouse/ClickHouse/pull/12085) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed SIGSEGV in StorageKafka on DROP TABLE. [#12075](https://github.com/ClickHouse/ClickHouse/pull/12075) ([Azat Khuzhin](https://github.com/azat)).
* Fixed possible crash while using wrong type for `PREWHERE`. Fixes [#12053](https://github.com/ClickHouse/ClickHouse/issues/12053), [#12060](https://github.com/ClickHouse/ClickHouse/issues/12060). [#12060](https://github.com/ClickHouse/ClickHouse/pull/12060) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed error `Cannot capture column` for higher-order functions with `Tuple(LowCardinality)` argument. Fixes [#9766](https://github.com/ClickHouse/ClickHouse/issues/9766). [#12055](https://github.com/ClickHouse/ClickHouse/pull/12055) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed constraints check if constraint is a constant expression. This fixes [#11360](https://github.com/ClickHouse/ClickHouse/issues/11360). [#12042](https://github.com/ClickHouse/ClickHouse/pull/12042) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed wrong result and potential crash when invoking function `if` with arguments of type `FixedString` with different sizes. This fixes [#11362](https://github.com/ClickHouse/ClickHouse/issues/11362). [#12021](https://github.com/ClickHouse/ClickHouse/pull/12021) ([alexey-milovidov](https://github.com/alexey-milovidov)).
#### Improvement
* Allowed to set `JOIN` kind and type in more standard way: `LEFT SEMI JOIN` instead of `SEMI LEFT JOIN`. For now both are correct. [#12520](https://github.com/ClickHouse/ClickHouse/pull/12520) ([Artem Zuikov](https://github.com/4ertus2)).
* lifetime_rows/lifetime_bytes for Buffer engine. [#12421](https://github.com/ClickHouse/ClickHouse/pull/12421) ([Azat Khuzhin](https://github.com/azat)).
* Write the detail exception message to the client instead of 'MySQL server has gone away'. [#12383](https://github.com/ClickHouse/ClickHouse/pull/12383) ([BohuTANG](https://github.com/BohuTANG)).
* Allows to change a charset which is used for printing grids borders. Available charsets are following: UTF-8, ASCII. Setting `output_format_pretty_grid_charset` enables this feature. [#12372](https://github.com/ClickHouse/ClickHouse/pull/12372) ([Sabyanin Maxim](https://github.com/s-mx)).
* Supported MySQL 'SELECT DATABASE()' [#9336](https://github.com/ClickHouse/ClickHouse/issues/9336) 2. Add MySQL replacement query integration test. [#12314](https://github.com/ClickHouse/ClickHouse/pull/12314) ([BohuTANG](https://github.com/BohuTANG)).
* Added `KILL QUERY [connection_id]` for the MySQL client/driver to cancel the long query, issue [#12038](https://github.com/ClickHouse/ClickHouse/issues/12038). [#12152](https://github.com/ClickHouse/ClickHouse/pull/12152) ([BohuTANG](https://github.com/BohuTANG)).
* Added support for `%g` (two digit ISO year) and `%G` (four digit ISO year) substitutions in `formatDateTime` function. [#12136](https://github.com/ClickHouse/ClickHouse/pull/12136) ([vivarum](https://github.com/vivarum)).
* Added 'type' column in system.disks. [#12115](https://github.com/ClickHouse/ClickHouse/pull/12115) ([ianton-ru](https://github.com/ianton-ru)).
* Improved `REVOKE` command: now it requires grant/admin option for only access which will be revoked. For example, to execute `REVOKE ALL ON *.* FROM user1` now it doesn't require to have full access rights granted with grant option. Added command `REVOKE ALL FROM user1` - it revokes all granted roles from `user1`. [#12083](https://github.com/ClickHouse/ClickHouse/pull/12083) ([Vitaly Baranov](https://github.com/vitlibar)).
* Added replica priority for load_balancing (for manual prioritization of the load balancing). [#11995](https://github.com/ClickHouse/ClickHouse/pull/11995) ([Azat Khuzhin](https://github.com/azat)).
* Switched paths in S3 metadata to relative which allows to handle S3 blobs more easily. [#11892](https://github.com/ClickHouse/ClickHouse/pull/11892) ([Vladimir Chebotarev](https://github.com/excitoon)).
#### Performance Improvement
* Improved performace of 'ORDER BY' and 'GROUP BY' by prefix of sorting key (enabled with `optimize_aggregation_in_order` setting, disabled by default). [#11696](https://github.com/ClickHouse/ClickHouse/pull/11696) ([Anton Popov](https://github.com/CurtizJ)).
* Removed injective functions inside `uniq*()` if `set optimize_injective_functions_inside_uniq=1`. [#12337](https://github.com/ClickHouse/ClickHouse/pull/12337) ([Ruslan Kamalov](https://github.com/kamalov-ruslan)).
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
* Implemented single part uploads for DiskS3 (experimental feature). [#12026](https://github.com/ClickHouse/ClickHouse/pull/12026) ([Vladimir Chebotarev](https://github.com/excitoon)).
#### Experimental Feature
* Added new in-memory format of parts in `MergeTree`-family tables, which stores data in memory. Parts are written on disk at first merge. Part will be created in in-memory format if its size in rows or bytes is below thresholds `min_rows_for_compact_part` and `min_bytes_for_compact_part`. Also optional support of Write-Ahead-Log is available, which is enabled by default and is controlled by setting `in_memory_parts_enable_wal`. [#10697](https://github.com/ClickHouse/ClickHouse/pull/10697) ([Anton Popov](https://github.com/CurtizJ)).
#### Build/Testing/Packaging Improvement
* Implement AST-based query fuzzing mode for clickhouse-client. See [this label](https://github.com/ClickHouse/ClickHouse/issues?q=label%3Afuzz+is%3Aissue) for the list of issues we recently found by fuzzing. Most of them were found by this tool, and a couple by SQLancer and `00746_sql_fuzzy.pl`. [#12111](https://github.com/ClickHouse/ClickHouse/pull/12111) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Add new type of tests based on Testflows framework. [#12090](https://github.com/ClickHouse/ClickHouse/pull/12090) ([vzakaznikov](https://github.com/vzakaznikov)).
* Added S3 HTTPS integration test. [#12412](https://github.com/ClickHouse/ClickHouse/pull/12412) ([Pavel Kovalenko](https://github.com/Jokser)).
* Log sanitizer trap messages from separate thread. This will prevent possible deadlock under thread sanitizer. [#12313](https://github.com/ClickHouse/ClickHouse/pull/12313) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Now functional and stress tests will be able to run with old version of `clickhouse-test` script. [#12287](https://github.com/ClickHouse/ClickHouse/pull/12287) ([alesapin](https://github.com/alesapin)).
* Remove strange file creation during build in `orc`. [#12258](https://github.com/ClickHouse/ClickHouse/pull/12258) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Place common docker compose files to integration docker container. [#12168](https://github.com/ClickHouse/ClickHouse/pull/12168) ([Ilya Yatsishin](https://github.com/qoega)).
* Fix warnings from CodeQL. `CodeQL` is another static analyzer that we will use along with `clang-tidy` and `PVS-Studio` that we use already. [#12138](https://github.com/ClickHouse/ClickHouse/pull/12138) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Minor CMake fixes for UNBUNDLED build. [#12131](https://github.com/ClickHouse/ClickHouse/pull/12131) ([Matwey V. Kornilov](https://github.com/matwey)).
* Added a showcase of the minimal Docker image without using any Linux distribution. [#12126](https://github.com/ClickHouse/ClickHouse/pull/12126) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Perform an upgrade of system packages in the `clickhouse-server` docker image. [#12124](https://github.com/ClickHouse/ClickHouse/pull/12124) ([Ivan Blinkov](https://github.com/blinkov)).
* Add `UNBUNDLED` flag to `system.build_options` table. Move skip lists for `clickhouse-test` to clickhouse repo. [#12107](https://github.com/ClickHouse/ClickHouse/pull/12107) ([alesapin](https://github.com/alesapin)).
* Regular check by [Anchore Container Analysis](https://docs.anchore.com) security analysis tool that looks for [CVE](https://cve.mitre.org/) in `clickhouse-server` Docker image. Also confirms that `Dockerfile` is buildable. Runs daily on `master` and on pull-requests to `Dockerfile`. [#12102](https://github.com/ClickHouse/ClickHouse/pull/12102) ([Ivan Blinkov](https://github.com/blinkov)).
* Daily check by [GitHub CodeQL](https://securitylab.github.com/tools/codeql) security analysis tool that looks for [CWE](https://cwe.mitre.org/). [#12101](https://github.com/ClickHouse/ClickHouse/pull/12101) ([Ivan Blinkov](https://github.com/blinkov)).
* Install `ca-certificates` before the first `apt-get update` in Dockerfile. [#12095](https://github.com/ClickHouse/ClickHouse/pull/12095) ([Ivan Blinkov](https://github.com/blinkov)).
## ClickHouse release 20.5
### ClickHouse release v20.5.4.40-stable 2020-08-10
#### Bug Fix
* Fixed wrong index analysis with functions. It could lead to pruning wrong parts, while reading from `MergeTree` tables. Fixes [#13060](https://github.com/ClickHouse/ClickHouse/issues/13060). Fixes [#12406](https://github.com/ClickHouse/ClickHouse/issues/12406). [#13081](https://github.com/ClickHouse/ClickHouse/pull/13081) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed unnecessary limiting for the number of threads for selects from local replica. [#12840](https://github.com/ClickHouse/ClickHouse/pull/12840) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed performance with large tuples, which are interpreted as functions in `IN` section. The case when user write `WHERE x IN tuple(1, 2, ...)` instead of `WHERE x IN (1, 2, ...)` for some obscure reason. [#12700](https://github.com/ClickHouse/ClickHouse/pull/12700) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed memory tracking for input_format_parallel_parsing (by attaching thread to group). [#12672](https://github.com/ClickHouse/ClickHouse/pull/12672) ([Azat Khuzhin](https://github.com/azat)).
* Fixed bloom filter index with const expression. This fixes [#10572](https://github.com/ClickHouse/ClickHouse/issues/10572). [#12659](https://github.com/ClickHouse/ClickHouse/pull/12659) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `SIGSEGV` in `StorageKafka` when broker is unavailable (and not only). [#12658](https://github.com/ClickHouse/ClickHouse/pull/12658) ([Azat Khuzhin](https://github.com/azat)).
* Added support for function `if` with `Array(UUID)` arguments. This fixes [#11066](https://github.com/ClickHouse/ClickHouse/issues/11066). [#12648](https://github.com/ClickHouse/ClickHouse/pull/12648) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed lack of aliases with function `any`. [#12593](https://github.com/ClickHouse/ClickHouse/pull/12593) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed race condition in external dictionaries with cache layout which can lead server crash. [#12566](https://github.com/ClickHouse/ClickHouse/pull/12566) ([alesapin](https://github.com/alesapin)).
* Remove data for Distributed tables (blocks from async INSERTs) on DROP TABLE. [#12556](https://github.com/ClickHouse/ClickHouse/pull/12556) ([Azat Khuzhin](https://github.com/azat)).
* Fixed bug which lead to broken old parts after `ALTER DELETE` query when `enable_mixed_granularity_parts=1`. Fixes [#12536](https://github.com/ClickHouse/ClickHouse/issues/12536). [#12543](https://github.com/ClickHouse/ClickHouse/pull/12543) ([alesapin](https://github.com/alesapin)).
* Better exception for function `in` with invalid number of arguments. [#12529](https://github.com/ClickHouse/ClickHouse/pull/12529) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed race condition in live view tables which could cause data duplication. [#12519](https://github.com/ClickHouse/ClickHouse/pull/12519) ([vzakaznikov](https://github.com/vzakaznikov)).
* Fixed performance issue, while reading from compact parts. [#12492](https://github.com/ClickHouse/ClickHouse/pull/12492) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed backwards compatibility in binary format of `AggregateFunction(avg, ...)` values. This fixes [#12342](https://github.com/ClickHouse/ClickHouse/issues/12342). [#12486](https://github.com/ClickHouse/ClickHouse/pull/12486) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the deadlock if `text_log` is enabled. [#12452](https://github.com/ClickHouse/ClickHouse/pull/12452) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed overflow when very large LIMIT or OFFSET is specified. This fixes [#10470](https://github.com/ClickHouse/ClickHouse/issues/10470). This fixes [#11372](https://github.com/ClickHouse/ClickHouse/issues/11372). [#12427](https://github.com/ClickHouse/ClickHouse/pull/12427) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed possible segfault if StorageMerge. Closes [#12054](https://github.com/ClickHouse/ClickHouse/issues/12054). [#12401](https://github.com/ClickHouse/ClickHouse/pull/12401) ([tavplubix](https://github.com/tavplubix)).
* Reverts change introduced in [#11079](https://github.com/ClickHouse/ClickHouse/issues/11079) to resolve [#12098](https://github.com/ClickHouse/ClickHouse/issues/12098). [#12397](https://github.com/ClickHouse/ClickHouse/pull/12397) ([Mike](https://github.com/myrrc)).
* Avoid exception when negative or floating point constant is used in WHERE condition for indexed tables. This fixes [#11905](https://github.com/ClickHouse/ClickHouse/issues/11905). [#12384](https://github.com/ClickHouse/ClickHouse/pull/12384) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Allow to CLEAR column even if there are depending DEFAULT expressions. This fixes [#12333](https://github.com/ClickHouse/ClickHouse/issues/12333). [#12378](https://github.com/ClickHouse/ClickHouse/pull/12378) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed TOTALS/ROLLUP/CUBE for aggregate functions with `-State` and `Nullable` arguments. This fixes [#12163](https://github.com/ClickHouse/ClickHouse/issues/12163). [#12376](https://github.com/ClickHouse/ClickHouse/pull/12376) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed SIGSEGV if there is an message with error in the middle of the batch in `Kafka` Engine. [#12302](https://github.com/ClickHouse/ClickHouse/pull/12302) ([Azat Khuzhin](https://github.com/azat)).
* Fixed the behaviour when `SummingMergeTree` engine sums up columns from partition key. Added an exception in case of explicit definition of columns to sum which intersects with partition key columns. This fixes [#7867](https://github.com/ClickHouse/ClickHouse/issues/7867). [#12173](https://github.com/ClickHouse/ClickHouse/pull/12173) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed transform of query to send to external DBMS (e.g. MySQL, ODBC) in presense of aliases. This fixes [#12032](https://github.com/ClickHouse/ClickHouse/issues/12032). [#12151](https://github.com/ClickHouse/ClickHouse/pull/12151) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed bug which leads to incorrect table metadata in ZooKeepeer for ReplicatedVersionedCollapsingMergeTree tables. Fixes [#12093](https://github.com/ClickHouse/ClickHouse/issues/12093). [#12121](https://github.com/ClickHouse/ClickHouse/pull/12121) ([alesapin](https://github.com/alesapin)).
* Fixed unnecessary limiting the number of threads for selects from `VIEW`. Fixes [#11937](https://github.com/ClickHouse/ClickHouse/issues/11937). [#12085](https://github.com/ClickHouse/ClickHouse/pull/12085) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed crash in JOIN with LowCardinality type with `join_algorithm=partial_merge`. [#12035](https://github.com/ClickHouse/ClickHouse/pull/12035) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed wrong result for `if()` with NULLs in condition. [#11807](https://github.com/ClickHouse/ClickHouse/pull/11807) ([Artem Zuikov](https://github.com/4ertus2)).
#### Performance Improvement
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
#### Build/Testing/Packaging Improvement
* Install `ca-certificates` before the first `apt-get update` in Dockerfile. [#12095](https://github.com/ClickHouse/ClickHouse/pull/12095) ([Ivan Blinkov](https://github.com/blinkov)).
### ClickHouse release v20.5.2.7-stable 2020-07-02
#### Backward Incompatible Change
@ -355,6 +534,80 @@
## ClickHouse release v20.4
### ClickHouse release v20.4.8.99-stable 2020-08-10
#### Bug Fix
* Fixed error in `parseDateTimeBestEffort` function when unix timestamp was passed as an argument. This fixes [#13362](https://github.com/ClickHouse/ClickHouse/issues/13362). [#13441](https://github.com/ClickHouse/ClickHouse/pull/13441) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potentially low performance and slightly incorrect result for `uniqExact`, `topK`, `sumDistinct` and similar aggregate functions called on Float types with NaN values. It also triggered assert in debug build. This fixes [#12491](https://github.com/ClickHouse/ClickHouse/issues/12491). [#13254](https://github.com/ClickHouse/ClickHouse/pull/13254) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed function if with nullable constexpr as cond that is not literal NULL. Fixes [#12463](https://github.com/ClickHouse/ClickHouse/issues/12463). [#13226](https://github.com/ClickHouse/ClickHouse/pull/13226) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed assert in `arrayElement` function in case of array elements are Nullable and array subscript is also Nullable. This fixes [#12172](https://github.com/ClickHouse/ClickHouse/issues/12172). [#13224](https://github.com/ClickHouse/ClickHouse/pull/13224) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed wrong index analysis with functions. It could lead to pruning wrong parts, while reading from `MergeTree` tables. Fixes [#13060](https://github.com/ClickHouse/ClickHouse/issues/13060). Fixes [#12406](https://github.com/ClickHouse/ClickHouse/issues/12406). [#13081](https://github.com/ClickHouse/ClickHouse/pull/13081) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed unnecessary limiting for the number of threads for selects from local replica. [#12840](https://github.com/ClickHouse/ClickHouse/pull/12840) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed possible extra overflow row in data which could appear for queries `WITH TOTALS`. [#12747](https://github.com/ClickHouse/ClickHouse/pull/12747) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed performance with large tuples, which are interpreted as functions in `IN` section. The case when user write `WHERE x IN tuple(1, 2, ...)` instead of `WHERE x IN (1, 2, ...)` for some obscure reason. [#12700](https://github.com/ClickHouse/ClickHouse/pull/12700) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed memory tracking for `input_format_parallel_parsing` (by attaching thread to group). [#12672](https://github.com/ClickHouse/ClickHouse/pull/12672) ([Azat Khuzhin](https://github.com/azat)).
* Fixed [#12293](https://github.com/ClickHouse/ClickHouse/issues/12293) allow push predicate when subquery contains with clause. [#12663](https://github.com/ClickHouse/ClickHouse/pull/12663) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed [#10572](https://github.com/ClickHouse/ClickHouse/issues/10572) fix bloom filter index with const expression. [#12659](https://github.com/ClickHouse/ClickHouse/pull/12659) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `SIGSEGV` in `StorageKafka` when broker is unavailable (and not only). [#12658](https://github.com/ClickHouse/ClickHouse/pull/12658) ([Azat Khuzhin](https://github.com/azat)).
* Added support for function `if` with `Array(UUID)` arguments. This fixes [#11066](https://github.com/ClickHouse/ClickHouse/issues/11066). [#12648](https://github.com/ClickHouse/ClickHouse/pull/12648) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed race condition in external dictionaries with cache layout which can lead server crash. [#12566](https://github.com/ClickHouse/ClickHouse/pull/12566) ([alesapin](https://github.com/alesapin)).
* Removed data for Distributed tables (blocks from async INSERTs) on DROP TABLE. [#12556](https://github.com/ClickHouse/ClickHouse/pull/12556) ([Azat Khuzhin](https://github.com/azat)).
* Fixed bug which lead to broken old parts after `ALTER DELETE` query when `enable_mixed_granularity_parts=1`. Fixes [#12536](https://github.com/ClickHouse/ClickHouse/issues/12536). [#12543](https://github.com/ClickHouse/ClickHouse/pull/12543) ([alesapin](https://github.com/alesapin)).
* Better exception for function `in` with invalid number of arguments. [#12529](https://github.com/ClickHouse/ClickHouse/pull/12529) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed performance issue, while reading from compact parts. [#12492](https://github.com/ClickHouse/ClickHouse/pull/12492) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed crash in JOIN with dictionary when we are joining over expression of dictionary key: `t JOIN dict ON expr(dict.id) = t.id`. Disable dictionary join optimisation for this case. [#12458](https://github.com/ClickHouse/ClickHouse/pull/12458) ([Artem Zuikov](https://github.com/4ertus2)).
* Fixed possible segfault if StorageMerge. Closes [#12054](https://github.com/ClickHouse/ClickHouse/issues/12054). [#12401](https://github.com/ClickHouse/ClickHouse/pull/12401) ([tavplubix](https://github.com/tavplubix)).
* Fixed order of columns in `WITH FILL` modifier. Previously order of columns of `ORDER BY` statement wasn't respected. [#12306](https://github.com/ClickHouse/ClickHouse/pull/12306) ([Anton Popov](https://github.com/CurtizJ)).
* Avoid "bad cast" exception when there is an expression that filters data by virtual columns (like `_table` in `Merge` tables) or by "index" columns in system tables such as filtering by database name when querying from `system.tables`, and this expression returns `Nullable` type. This fixes [#12166](https://github.com/ClickHouse/ClickHouse/issues/12166). [#12305](https://github.com/ClickHouse/ClickHouse/pull/12305) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Show error after TrieDictionary failed to load. [#12290](https://github.com/ClickHouse/ClickHouse/pull/12290) ([Vitaly Baranov](https://github.com/vitlibar)).
* The function `arrayFill` worked incorrectly for empty arrays that may lead to crash. This fixes [#12263](https://github.com/ClickHouse/ClickHouse/issues/12263). [#12279](https://github.com/ClickHouse/ClickHouse/pull/12279) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Implemented conversions to the common type for `LowCardinality` types. This allows to execute UNION ALL of tables with columns of LowCardinality and other columns. This fixes [#8212](https://github.com/ClickHouse/ClickHouse/issues/8212). This fixes [#4342](https://github.com/ClickHouse/ClickHouse/issues/4342). [#12275](https://github.com/ClickHouse/ClickHouse/pull/12275) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the behaviour when during multiple sequential inserts in `StorageFile` header for some special types was written more than once. This fixed [#6155](https://github.com/ClickHouse/ClickHouse/issues/6155). [#12197](https://github.com/ClickHouse/ClickHouse/pull/12197) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed logical functions for UInt8 values when they are not equal to 0 or 1. [#12196](https://github.com/ClickHouse/ClickHouse/pull/12196) ([Alexander Kazakov](https://github.com/Akazz)).
* Cap max_memory_usage* limits to the process resident memory. [#12182](https://github.com/ClickHouse/ClickHouse/pull/12182) ([Azat Khuzhin](https://github.com/azat)).
* Fixed `dictGet` arguments check during GROUP BY injective functions elimination. [#12179](https://github.com/ClickHouse/ClickHouse/pull/12179) ([Azat Khuzhin](https://github.com/azat)).
* Don't split the dictionary source's table name into schema and table name itself if ODBC connection doesn't support schema. [#12165](https://github.com/ClickHouse/ClickHouse/pull/12165) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed wrong logic in `ALTER DELETE` that leads to deleting of records when condition evaluates to NULL. This fixes [#9088](https://github.com/ClickHouse/ClickHouse/issues/9088). This closes [#12106](https://github.com/ClickHouse/ClickHouse/issues/12106). [#12153](https://github.com/ClickHouse/ClickHouse/pull/12153) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed transform of query to send to external DBMS (e.g. MySQL, ODBC) in presense of aliases. This fixes [#12032](https://github.com/ClickHouse/ClickHouse/issues/12032). [#12151](https://github.com/ClickHouse/ClickHouse/pull/12151) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential overflow in integer division. This fixes [#12119](https://github.com/ClickHouse/ClickHouse/issues/12119). [#12140](https://github.com/ClickHouse/ClickHouse/pull/12140) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential infinite loop in `greatCircleDistance`, `geoDistance`. This fixes [#12117](https://github.com/ClickHouse/ClickHouse/issues/12117). [#12137](https://github.com/ClickHouse/ClickHouse/pull/12137) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Normalize "pid" file handling. In previous versions the server may refuse to start if it was killed without proper shutdown and if there is another process that has the same pid as previously runned server. Also pid file may be removed in unsuccessful server startup even if there is another server running. This fixes [#3501](https://github.com/ClickHouse/ClickHouse/issues/3501). [#12133](https://github.com/ClickHouse/ClickHouse/pull/12133) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed handling dependency of table with ENGINE=Dictionary on dictionary. This fixes [#10994](https://github.com/ClickHouse/ClickHouse/issues/10994). This fixes [#10397](https://github.com/ClickHouse/ClickHouse/issues/10397). [#12116](https://github.com/ClickHouse/ClickHouse/pull/12116) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed performance for selects with `UNION` caused by wrong limit for the total number of threads. Fixes [#12030](https://github.com/ClickHouse/ClickHouse/issues/12030). [#12103](https://github.com/ClickHouse/ClickHouse/pull/12103) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed segfault with `-StateResample` combinators. [#12092](https://github.com/ClickHouse/ClickHouse/pull/12092) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed empty `result_rows` and `result_bytes` metrics in `system.quey_log` for selects. Fixes [#11595](https://github.com/ClickHouse/ClickHouse/issues/11595). [#12089](https://github.com/ClickHouse/ClickHouse/pull/12089) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed unnecessary limiting the number of threads for selects from `VIEW`. Fixes [#11937](https://github.com/ClickHouse/ClickHouse/issues/11937). [#12085](https://github.com/ClickHouse/ClickHouse/pull/12085) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed possible crash while using wrong type for `PREWHERE`. Fixes [#12053](https://github.com/ClickHouse/ClickHouse/issues/12053), [#12060](https://github.com/ClickHouse/ClickHouse/issues/12060). [#12060](https://github.com/ClickHouse/ClickHouse/pull/12060) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed error `Expected single dictionary argument for function` for function `defaultValueOfArgumentType` with `LowCardinality` type. Fixes [#11808](https://github.com/ClickHouse/ClickHouse/issues/11808). [#12056](https://github.com/ClickHouse/ClickHouse/pull/12056) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed error `Cannot capture column` for higher-order functions with `Tuple(LowCardinality)` argument. Fixes [#9766](https://github.com/ClickHouse/ClickHouse/issues/9766). [#12055](https://github.com/ClickHouse/ClickHouse/pull/12055) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Parse tables metadata in parallel when loading database. This fixes slow server startup when there are large number of tables. [#12045](https://github.com/ClickHouse/ClickHouse/pull/12045) ([tavplubix](https://github.com/tavplubix)).
* Make `topK` aggregate function return Enum for Enum types. This fixes [#3740](https://github.com/ClickHouse/ClickHouse/issues/3740). [#12043](https://github.com/ClickHouse/ClickHouse/pull/12043) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed constraints check if constraint is a constant expression. This fixes [#11360](https://github.com/ClickHouse/ClickHouse/issues/11360). [#12042](https://github.com/ClickHouse/ClickHouse/pull/12042) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed incorrect comparison of tuples with `Nullable` columns. Fixes [#11985](https://github.com/ClickHouse/ClickHouse/issues/11985). [#12039](https://github.com/ClickHouse/ClickHouse/pull/12039) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed calculation of access rights when allow_introspection_functions=0. [#12031](https://github.com/ClickHouse/ClickHouse/pull/12031) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed wrong result and potential crash when invoking function `if` with arguments of type `FixedString` with different sizes. This fixes [#11362](https://github.com/ClickHouse/ClickHouse/issues/11362). [#12021](https://github.com/ClickHouse/ClickHouse/pull/12021) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* A query with function `neighbor` as the only returned expression may return empty result if the function is called with offset `-9223372036854775808`. This fixes [#11367](https://github.com/ClickHouse/ClickHouse/issues/11367). [#12019](https://github.com/ClickHouse/ClickHouse/pull/12019) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed calculation of access rights when allow_ddl=0. [#12015](https://github.com/ClickHouse/ClickHouse/pull/12015) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fixed potential array size overflow in generateRandom that may lead to crash. This fixes [#11371](https://github.com/ClickHouse/ClickHouse/issues/11371). [#12013](https://github.com/ClickHouse/ClickHouse/pull/12013) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential floating point exception. This closes [#11378](https://github.com/ClickHouse/ClickHouse/issues/11378). [#12005](https://github.com/ClickHouse/ClickHouse/pull/12005) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed wrong setting name in log message at server startup. [#11997](https://github.com/ClickHouse/ClickHouse/pull/11997) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `Query parameter was not set` in `Values` format. Fixes [#11918](https://github.com/ClickHouse/ClickHouse/issues/11918). [#11936](https://github.com/ClickHouse/ClickHouse/pull/11936) ([tavplubix](https://github.com/tavplubix)).
* Keep aliases for substitutions in query (parametrized queries). This fixes [#11914](https://github.com/ClickHouse/ClickHouse/issues/11914). [#11916](https://github.com/ClickHouse/ClickHouse/pull/11916) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed bug with no moves when changing storage policy from default one. [#11893](https://github.com/ClickHouse/ClickHouse/pull/11893) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Fixed potential floating point exception when parsing `DateTime64`. This fixes [#11374](https://github.com/ClickHouse/ClickHouse/issues/11374). [#11875](https://github.com/ClickHouse/ClickHouse/pull/11875) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed memory accounting via HTTP interface (can be significant with `wait_end_of_query=1`). [#11840](https://github.com/ClickHouse/ClickHouse/pull/11840) ([Azat Khuzhin](https://github.com/azat)).
* Parse metadata stored in zookeeper before checking for equality. [#11739](https://github.com/ClickHouse/ClickHouse/pull/11739) ([Azat Khuzhin](https://github.com/azat)).
#### Performance Improvement
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
#### Build/Testing/Packaging Improvement
* Install `ca-certificates` before the first `apt-get update` in Dockerfile. [#12095](https://github.com/ClickHouse/ClickHouse/pull/12095) ([Ivan Blinkov](https://github.com/blinkov)).
### ClickHouse release v20.4.6.53-stable 2020-06-25
#### Bug Fix
@ -762,6 +1015,70 @@ No changes compared to v20.4.3.16-stable.
## ClickHouse release v20.3
### ClickHouse release v20.3.16.165-lts 2020-08-10
#### Bug Fix
* Fixed error in `parseDateTimeBestEffort` function when unix timestamp was passed as an argument. This fixes [#13362](https://github.com/ClickHouse/ClickHouse/issues/13362). [#13441](https://github.com/ClickHouse/ClickHouse/pull/13441) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potentially low performance and slightly incorrect result for `uniqExact`, `topK`, `sumDistinct` and similar aggregate functions called on Float types with `NaN` values. It also triggered assert in debug build. This fixes [#12491](https://github.com/ClickHouse/ClickHouse/issues/12491). [#13254](https://github.com/ClickHouse/ClickHouse/pull/13254) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed function if with nullable constexpr as cond that is not literal NULL. Fixes [#12463](https://github.com/ClickHouse/ClickHouse/issues/12463). [#13226](https://github.com/ClickHouse/ClickHouse/pull/13226) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed assert in `arrayElement` function in case of array elements are Nullable and array subscript is also Nullable. This fixes [#12172](https://github.com/ClickHouse/ClickHouse/issues/12172). [#13224](https://github.com/ClickHouse/ClickHouse/pull/13224) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed unnecessary limiting for the number of threads for selects from local replica. [#12840](https://github.com/ClickHouse/ClickHouse/pull/12840) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed possible extra overflow row in data which could appear for queries `WITH TOTALS`. [#12747](https://github.com/ClickHouse/ClickHouse/pull/12747) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed performance with large tuples, which are interpreted as functions in `IN` section. The case when user write `WHERE x IN tuple(1, 2, ...)` instead of `WHERE x IN (1, 2, ...)` for some obscure reason. [#12700](https://github.com/ClickHouse/ClickHouse/pull/12700) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed memory tracking for input_format_parallel_parsing (by attaching thread to group). [#12672](https://github.com/ClickHouse/ClickHouse/pull/12672) ([Azat Khuzhin](https://github.com/azat)).
* Fixed [#12293](https://github.com/ClickHouse/ClickHouse/issues/12293) allow push predicate when subquery contains with clause. [#12663](https://github.com/ClickHouse/ClickHouse/pull/12663) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed [#10572](https://github.com/ClickHouse/ClickHouse/issues/10572) fix bloom filter index with const expression. [#12659](https://github.com/ClickHouse/ClickHouse/pull/12659) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed SIGSEGV in StorageKafka when broker is unavailable (and not only). [#12658](https://github.com/ClickHouse/ClickHouse/pull/12658) ([Azat Khuzhin](https://github.com/azat)).
* Fixed race condition in external dictionaries with cache layout which can lead server crash. [#12566](https://github.com/ClickHouse/ClickHouse/pull/12566) ([alesapin](https://github.com/alesapin)).
* Fixed bug which lead to broken old parts after `ALTER DELETE` query when `enable_mixed_granularity_parts=1`. Fixes [#12536](https://github.com/ClickHouse/ClickHouse/issues/12536). [#12543](https://github.com/ClickHouse/ClickHouse/pull/12543) ([alesapin](https://github.com/alesapin)).
* Better exception for function `in` with invalid number of arguments. [#12529](https://github.com/ClickHouse/ClickHouse/pull/12529) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed performance issue, while reading from compact parts. [#12492](https://github.com/ClickHouse/ClickHouse/pull/12492) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed the deadlock if `text_log` is enabled. [#12452](https://github.com/ClickHouse/ClickHouse/pull/12452) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed possible segfault if StorageMerge. Closes [#12054](https://github.com/ClickHouse/ClickHouse/issues/12054). [#12401](https://github.com/ClickHouse/ClickHouse/pull/12401) ([tavplubix](https://github.com/tavplubix)).
* Fixed `TOTALS/ROLLUP/CUBE` for aggregate functions with `-State` and `Nullable` arguments. This fixes [#12163](https://github.com/ClickHouse/ClickHouse/issues/12163). [#12376](https://github.com/ClickHouse/ClickHouse/pull/12376) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed order of columns in `WITH FILL` modifier. Previously order of columns of `ORDER BY` statement wasn't respected. [#12306](https://github.com/ClickHouse/ClickHouse/pull/12306) ([Anton Popov](https://github.com/CurtizJ)).
* Avoid "bad cast" exception when there is an expression that filters data by virtual columns (like `_table` in `Merge` tables) or by "index" columns in system tables such as filtering by database name when querying from `system.tables`, and this expression returns `Nullable` type. This fixes [#12166](https://github.com/ClickHouse/ClickHouse/issues/12166). [#12305](https://github.com/ClickHouse/ClickHouse/pull/12305) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Show error after `TrieDictionary` failed to load. [#12290](https://github.com/ClickHouse/ClickHouse/pull/12290) ([Vitaly Baranov](https://github.com/vitlibar)).
* The function `arrayFill` worked incorrectly for empty arrays that may lead to crash. This fixes [#12263](https://github.com/ClickHouse/ClickHouse/issues/12263). [#12279](https://github.com/ClickHouse/ClickHouse/pull/12279) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Implement conversions to the common type for `LowCardinality` types. This allows to execute UNION ALL of tables with columns of LowCardinality and other columns. This fixes [#8212](https://github.com/ClickHouse/ClickHouse/issues/8212). This fixes [#4342](https://github.com/ClickHouse/ClickHouse/issues/4342). [#12275](https://github.com/ClickHouse/ClickHouse/pull/12275) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the behaviour when during multiple sequential inserts in `StorageFile` header for some special types was written more than once. This fixed [#6155](https://github.com/ClickHouse/ClickHouse/issues/6155). [#12197](https://github.com/ClickHouse/ClickHouse/pull/12197) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed logical functions for UInt8 values when they are not equal to 0 or 1. [#12196](https://github.com/ClickHouse/ClickHouse/pull/12196) ([Alexander Kazakov](https://github.com/Akazz)).
* Fixed `dictGet` arguments check during GROUP BY injective functions elimination. [#12179](https://github.com/ClickHouse/ClickHouse/pull/12179) ([Azat Khuzhin](https://github.com/azat)).
* Fixed wrong logic in `ALTER DELETE` that leads to deleting of records when condition evaluates to NULL. This fixes [#9088](https://github.com/ClickHouse/ClickHouse/issues/9088). This closes [#12106](https://github.com/ClickHouse/ClickHouse/issues/12106). [#12153](https://github.com/ClickHouse/ClickHouse/pull/12153) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed transform of query to send to external DBMS (e.g. MySQL, ODBC) in presense of aliases. This fixes [#12032](https://github.com/ClickHouse/ClickHouse/issues/12032). [#12151](https://github.com/ClickHouse/ClickHouse/pull/12151) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential overflow in integer division. This fixes [#12119](https://github.com/ClickHouse/ClickHouse/issues/12119). [#12140](https://github.com/ClickHouse/ClickHouse/pull/12140) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential infinite loop in `greatCircleDistance`, `geoDistance`. This fixes [#12117](https://github.com/ClickHouse/ClickHouse/issues/12117). [#12137](https://github.com/ClickHouse/ClickHouse/pull/12137) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid `There is no query` exception for materialized views with joins or with subqueries attached to system logs (system.query_log, metric_log, etc) or to engine=Buffer underlying table. [#12120](https://github.com/ClickHouse/ClickHouse/pull/12120) ([filimonov](https://github.com/filimonov)).
* Fixed performance for selects with `UNION` caused by wrong limit for the total number of threads. Fixes [#12030](https://github.com/ClickHouse/ClickHouse/issues/12030). [#12103](https://github.com/ClickHouse/ClickHouse/pull/12103) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed segfault with `-StateResample` combinators. [#12092](https://github.com/ClickHouse/ClickHouse/pull/12092) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed unnecessary limiting the number of threads for selects from `VIEW`. Fixes [#11937](https://github.com/ClickHouse/ClickHouse/issues/11937). [#12085](https://github.com/ClickHouse/ClickHouse/pull/12085) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed possible crash while using wrong type for `PREWHERE`. Fixes [#12053](https://github.com/ClickHouse/ClickHouse/issues/12053), [#12060](https://github.com/ClickHouse/ClickHouse/issues/12060). [#12060](https://github.com/ClickHouse/ClickHouse/pull/12060) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed error `Expected single dictionary argument for function` for function `defaultValueOfArgumentType` with `LowCardinality` type. Fixes [#11808](https://github.com/ClickHouse/ClickHouse/issues/11808). [#12056](https://github.com/ClickHouse/ClickHouse/pull/12056) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed error `Cannot capture column` for higher-order functions with `Tuple(LowCardinality)` argument. Fixes [#9766](https://github.com/ClickHouse/ClickHouse/issues/9766). [#12055](https://github.com/ClickHouse/ClickHouse/pull/12055) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Parse tables metadata in parallel when loading database. This fixes slow server startup when there are large number of tables. [#12045](https://github.com/ClickHouse/ClickHouse/pull/12045) ([tavplubix](https://github.com/tavplubix)).
* Make `topK` aggregate function return Enum for Enum types. This fixes [#3740](https://github.com/ClickHouse/ClickHouse/issues/3740). [#12043](https://github.com/ClickHouse/ClickHouse/pull/12043) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed constraints check if constraint is a constant expression. This fixes [#11360](https://github.com/ClickHouse/ClickHouse/issues/11360). [#12042](https://github.com/ClickHouse/ClickHouse/pull/12042) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed incorrect comparison of tuples with `Nullable` columns. Fixes [#11985](https://github.com/ClickHouse/ClickHouse/issues/11985). [#12039](https://github.com/ClickHouse/ClickHouse/pull/12039) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed wrong result and potential crash when invoking function `if` with arguments of type `FixedString` with different sizes. This fixes [#11362](https://github.com/ClickHouse/ClickHouse/issues/11362). [#12021](https://github.com/ClickHouse/ClickHouse/pull/12021) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* A query with function `neighbor` as the only returned expression may return empty result if the function is called with offset `-9223372036854775808`. This fixes [#11367](https://github.com/ClickHouse/ClickHouse/issues/11367). [#12019](https://github.com/ClickHouse/ClickHouse/pull/12019) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential array size overflow in generateRandom that may lead to crash. This fixes [#11371](https://github.com/ClickHouse/ClickHouse/issues/11371). [#12013](https://github.com/ClickHouse/ClickHouse/pull/12013) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential floating point exception. This closes [#11378](https://github.com/ClickHouse/ClickHouse/issues/11378). [#12005](https://github.com/ClickHouse/ClickHouse/pull/12005) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed wrong setting name in log message at server startup. [#11997](https://github.com/ClickHouse/ClickHouse/pull/11997) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `Query parameter was not set` in `Values` format. Fixes [#11918](https://github.com/ClickHouse/ClickHouse/issues/11918). [#11936](https://github.com/ClickHouse/ClickHouse/pull/11936) ([tavplubix](https://github.com/tavplubix)).
* Keep aliases for substitutions in query (parametrized queries). This fixes [#11914](https://github.com/ClickHouse/ClickHouse/issues/11914). [#11916](https://github.com/ClickHouse/ClickHouse/pull/11916) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed potential floating point exception when parsing DateTime64. This fixes [#11374](https://github.com/ClickHouse/ClickHouse/issues/11374). [#11875](https://github.com/ClickHouse/ClickHouse/pull/11875) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed memory accounting via `HTTP` interface (can be significant with `wait_end_of_query=1`). [#11840](https://github.com/ClickHouse/ClickHouse/pull/11840) ([Azat Khuzhin](https://github.com/azat)).
* Fixed wrong result for `if()` with NULLs in condition. [#11807](https://github.com/ClickHouse/ClickHouse/pull/11807) ([Artem Zuikov](https://github.com/4ertus2)).
* Parse metadata stored in zookeeper before checking for equality. [#11739](https://github.com/ClickHouse/ClickHouse/pull/11739) ([Azat Khuzhin](https://github.com/azat)).
* Fixed `LIMIT n WITH TIES` usage together with `ORDER BY` statement, which contains aliases. [#11689](https://github.com/ClickHouse/ClickHouse/pull/11689) ([Anton Popov](https://github.com/CurtizJ)).
* Fix potential read of uninitialized memory in cache dictionary. [#10834](https://github.com/ClickHouse/ClickHouse/pull/10834) ([alexey-milovidov](https://github.com/alexey-milovidov)).
#### Performance Improvement
* Index not used for IN operator with literals", performance regression introduced around v19.3. This fixes "[#10574](https://github.com/ClickHouse/ClickHouse/issues/10574). [#12062](https://github.com/ClickHouse/ClickHouse/pull/12062) ([nvartolomei](https://github.com/nvartolomei)).
### ClickHouse release v20.3.12.112-lts 2020-06-25
#### Bug Fix

View File

@ -142,6 +142,27 @@ endif ()
# Make sure the final executable has symbols exported
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -rdynamic")
find_program (OBJCOPY_PATH NAMES "llvm-objcopy" "llvm-objcopy-10" "llvm-objcopy-9" "llvm-objcopy-8" "objcopy")
if (OBJCOPY_PATH)
message(STATUS "Using objcopy: ${OBJCOPY_PATH}.")
if (ARCH_AMD64)
set(OBJCOPY_ARCH_OPTIONS -O elf64-x86-64 -B i386)
elseif (ARCH_AARCH64)
set(OBJCOPY_ARCH_OPTIONS -O elf64-aarch64 -B aarch64)
endif ()
else ()
message(FATAL_ERROR "Cannot find objcopy.")
endif ()
if (OS_DARWIN)
set(WHOLE_ARCHIVE -all_load)
set(NO_WHOLE_ARCHIVE -noall_load)
else ()
set(WHOLE_ARCHIVE --whole-archive)
set(NO_WHOLE_ARCHIVE --no-whole-archive)
endif ()
option (ADD_GDB_INDEX_FOR_GOLD "Set to add .gdb-index to resulting binaries for gold linker. NOOP if lld is used." 0)
if (NOT CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE")
if (LINKER_NAME STREQUAL "lld")
@ -347,7 +368,6 @@ include (cmake/find/icu.cmake)
include (cmake/find/zlib.cmake)
include (cmake/find/zstd.cmake)
include (cmake/find/ltdl.cmake) # for odbc
include (cmake/find/termcap.cmake)
# openssl, zlib before poco
include (cmake/find/sparsehash.cmake)
include (cmake/find/re2.cmake)

View File

@ -7,6 +7,7 @@ add_subdirectory (daemon)
add_subdirectory (loggers)
add_subdirectory (pcg-random)
add_subdirectory (widechar_width)
add_subdirectory (readpassphrase)
if (USE_MYSQL)
add_subdirectory (mysqlxx)

View File

@ -17,6 +17,7 @@ set (SRCS
sleep.cpp
terminalColors.cpp
errnoToString.cpp
getResource.cpp
)
if (ENABLE_REPLXX)

View File

@ -3,11 +3,9 @@
#include <cctz/civil_time.h>
#include <cctz/time_zone.h>
#include <cctz/zone_info_source.h>
#include <common/unaligned.h>
#include <common/getResource.h>
#include <Poco/Exception.h>
#include <dlfcn.h>
#include <algorithm>
#include <cassert>
#include <chrono>
@ -213,19 +211,9 @@ namespace cctz_extension
const std::string & name,
const std::function<std::unique_ptr<cctz::ZoneInfoSource>(const std::string & name)> & fallback)
{
std::string name_replaced = name;
std::replace(name_replaced.begin(), name_replaced.end(), '/', '_');
std::replace(name_replaced.begin(), name_replaced.end(), '-', '_');
/// These are the names that are generated by "ld -r -b binary"
std::string symbol_name_data = "_binary_" + name_replaced + "_start";
std::string symbol_name_size = "_binary_" + name_replaced + "_size";
const void * sym_data = dlsym(RTLD_DEFAULT, symbol_name_data.c_str());
const void * sym_size = dlsym(RTLD_DEFAULT, symbol_name_size.c_str());
if (sym_data && sym_size)
return std::make_unique<Source>(static_cast<const char *>(sym_data), unalignedLoad<size_t>(&sym_size));
std::string_view resource = getResource(name);
if (!resource.empty())
return std::make_unique<Source>(resource.data(), resource.size());
return fallback(name);
}

View File

@ -308,7 +308,7 @@ inline void splitInto(To & to, const std::string & what, bool token_compress = f
const char * delimiter_or_end = find_first_symbols<symbols...>(pos, end);
if (!token_compress || pos < delimiter_or_end)
to.emplace_back(pos, delimiter_or_end);
to.emplace_back(pos, delimiter_or_end - pos);
if (delimiter_or_end < end)
pos = delimiter_or_end + 1;

View File

@ -0,0 +1,24 @@
#include "getResource.h"
#include "unaligned.h"
#include <dlfcn.h>
#include <string>
std::string_view getResource(std::string_view name)
{
std::string name_replaced(name);
std::replace(name_replaced.begin(), name_replaced.end(), '/', '_');
std::replace(name_replaced.begin(), name_replaced.end(), '-', '_');
std::replace(name_replaced.begin(), name_replaced.end(), '.', '_');
/// These are the names that are generated by "ld -r -b binary"
std::string symbol_name_data = "_binary_" + name_replaced + "_start";
std::string symbol_name_size = "_binary_" + name_replaced + "_size";
const void * sym_data = dlsym(RTLD_DEFAULT, symbol_name_data.c_str());
const void * sym_size = dlsym(RTLD_DEFAULT, symbol_name_size.c_str());
if (sym_data && sym_size)
return { static_cast<const char *>(sym_data), unalignedLoad<size_t>(&sym_size) };
return {};
}

View File

@ -0,0 +1,7 @@
#pragma once
#include <string_view>
/// Get resource from binary if exists. Otherwise return empty string view.
/// Resources are data that is embedded into executable at link time.
std::string_view getResource(std::string_view name);

View File

@ -1,3 +1,4 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
LIBRARY()
ADDINCL(
@ -10,6 +11,7 @@ CFLAGS (GLOBAL -DARCADIA_BUILD)
CFLAGS (GLOBAL -DUSE_CPUID=1)
CFLAGS (GLOBAL -DUSE_JEMALLOC=0)
CFLAGS (GLOBAL -DUSE_RAPIDJSON=1)
CFLAGS (GLOBAL -DUSE_SSL=1)
IF (OS_DARWIN)
CFLAGS (GLOBAL -DOS_DARWIN)
@ -23,6 +25,7 @@ PEERDIR(
contrib/libs/cctz/src
contrib/libs/cxxsupp/libcxx-filesystem
contrib/libs/poco/Net
contrib/libs/poco/NetSSL_OpenSSL
contrib/libs/poco/Util
contrib/libs/fmt
contrib/restricted/boost
@ -35,8 +38,10 @@ SRCS(
DateLUT.cpp
DateLUTImpl.cpp
demangle.cpp
errnoToString.cpp
getFQDNOrHostName.cpp
getMemoryAmount.cpp
getResource.cpp
getThreadId.cpp
JSON.cpp
LineReader.cpp
@ -47,7 +52,6 @@ SRCS(
shift10.cpp
sleep.cpp
terminalColors.cpp
errnoToString.cpp
)
END()

36
base/common/ya.make.in Normal file
View File

@ -0,0 +1,36 @@
LIBRARY()
ADDINCL(
GLOBAL clickhouse/base
GLOBAL contrib/libs/cctz/include
)
CFLAGS (GLOBAL -DARCADIA_BUILD)
CFLAGS (GLOBAL -DUSE_CPUID=1)
CFLAGS (GLOBAL -DUSE_JEMALLOC=0)
CFLAGS (GLOBAL -DUSE_RAPIDJSON=1)
IF (OS_DARWIN)
CFLAGS (GLOBAL -DOS_DARWIN)
ELSEIF (OS_FREEBSD)
CFLAGS (GLOBAL -DOS_FREEBSD)
ELSEIF (OS_LINUX)
CFLAGS (GLOBAL -DOS_LINUX)
ENDIF ()
PEERDIR(
contrib/libs/cctz/src
contrib/libs/cxxsupp/libcxx-filesystem
contrib/libs/poco/Net
contrib/libs/poco/Util
contrib/libs/fmt
contrib/restricted/boost
contrib/restricted/cityhash-1.0.2
)
SRCS(
<? find . -name '*.cpp' | grep -v -F tests/ | grep -v -F Replxx | grep -v -F Readline | sed 's/^\.\// /' | sort ?>
)
END()

View File

@ -459,6 +459,11 @@ BaseDaemon::~BaseDaemon()
{
writeSignalIDtoSignalPipe(SignalListener::StopThread);
signal_listener_thread.join();
/// Reset signals to SIG_DFL to avoid trying to write to the signal_pipe that will be closed after.
for (int sig : handled_signals)
{
signal(sig, SIG_DFL);
}
signal_pipe.close();
}
@ -713,7 +718,7 @@ void BaseDaemon::initializeTerminationAndSignalProcessing()
/// Setup signal handlers.
auto add_signal_handler =
[](const std::vector<int> & signals, signal_function handler)
[this](const std::vector<int> & signals, signal_function handler)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
@ -737,6 +742,8 @@ void BaseDaemon::initializeTerminationAndSignalProcessing()
for (auto signal : signals)
if (sigaction(signal, &sa, nullptr))
throw Poco::Exception("Cannot set signal handler.");
std::copy(signals.begin(), signals.end(), std::back_inserter(handled_signals));
}
};

View File

@ -192,6 +192,8 @@ protected:
Poco::Util::AbstractConfiguration * last_configuration = nullptr;
String build_id_info;
std::vector<int> handled_signals;
};

View File

@ -5,3 +5,4 @@ add_library(readpassphrase readpassphrase.c)
set_target_properties(readpassphrase PROPERTIES LINKER_LANGUAGE C)
target_compile_options(readpassphrase PRIVATE -Wno-unused-result -Wno-reserved-id-macro)
target_include_directories(readpassphrase PUBLIC .)

View File

@ -23,7 +23,9 @@
/* OPENBSD ORIGINAL: lib/libc/gen/readpassphrase.c */
#include "includes.h"
#ifndef _PATH_TTY
#define _PATH_TTY "/dev/tty"
#endif
#include <termios.h>
#include <signal.h>

View File

@ -4,4 +4,5 @@ RECURSE(
loggers
pcg-random
widechar_width
readpassphrase
)

1774
benchmark/monetdb/aws.log Normal file

File diff suppressed because it is too large Load Diff

10
benchmark/monetdb/benchmark.sh Executable file
View File

@ -0,0 +1,10 @@
#!/bin/bash
grep -v -P '^#' queries.sql | sed -e 's/{table}/hits/' | while read query; do
echo 3 | sudo tee /proc/sys/vm/drop_caches
echo "$query";
for i in {1..3}; do
./send-query "$query" 2>&1 | grep -P '\d+ tuple|clk: |unknown|overflow|error';
done;
done;

View File

@ -1,4 +0,0 @@
CONF_DIR=/home/kartavyy/benchmark/monetdb
expect_file=$CONF_DIR/expect.tcl
test_file=$CONF_DIR/queries.sql
etc_init_d_service=/etc/init.d/monetdb5-sql

View File

@ -1,3 +0,0 @@
create table hits_10m ( WatchID BIGINT, JavaEnable SMALLINT, Title VARCHAR(1400), GoodEvent SMALLINT, EventTime TIMESTAMP, EventDate DATE, CounterID BIGINT, ClientIP BIGINT, RegionID BIGINT, UserID BIGINT, CounterClass TINYINT, OS SMALLINT, UserAgent SMALLINT, URL VARCHAR(7800), Referer VARCHAR(3125), Refresh TINYINT, RefererCategoryID INT, RefererRegionID BIGINT, URLCategoryID INT, URLRegionID BIGINT, ResolutionWidth INT, ResolutionHeight INT, ResolutionDepth SMALLINT, FlashMajor SMALLINT, FlashMinor SMALLINT, FlashMinor2 VARCHAR(256), NetMajor SMALLINT, NetMinor SMALLINT, UserAgentMajor INT, UserAgentMinor CHAR(2), CookieEnable SMALLINT, JavascriptEnable SMALLINT, IsMobile SMALLINT, MobilePhone SMALLINT, MobilePhoneModel VARCHAR(80), Params VARCHAR(2925), IPNetworkID BIGINT, TraficSourceID SMALLINT, SearchEngineID INT, SearchPhrase VARCHAR(2008), AdvEngineID SMALLINT, IsArtifical SMALLINT, WindowClientWidth INT, WindowClientHeight INT, ClientTimeZone INTEGER, ClientEventTime TIMESTAMP, SilverlightVersion1 SMALLINT, SilverlightVersion2 SMALLINT, SilverlightVersion3 BIGINT, SilverlightVersion4 INT, PageCharset VARCHAR(80), CodeVersion BIGINT, IsLink SMALLINT, IsDownload SMALLINT, IsNotBounce SMALLINT, FUniqID BIGINT, OriginalURL VARCHAR(8181), HID BIGINT, IsOldCounter SMALLINT, IsEvent SMALLINT, IsParameter SMALLINT, DontCountHits SMALLINT, WithHash SMALLINT, HitColor CHAR(1), LocalEventTime TIMESTAMP, Age SMALLINT, Sex SMALLINT, Income SMALLINT, Interests INT, Robotness SMALLINT, RemoteIP BIGINT, WindowName INT, OpenerName INT, HistoryLength SMALLINT, BrowserLanguage CHAR(2), BrowserCountry CHAR(2), SocialNetwork VARCHAR(128), SocialAction VARCHAR(128), HTTPError INT, SendTiming BIGINT, DNSTiming BIGINT, ConnectTiming BIGINT, ResponseStartTiming BIGINT, ResponseEndTiming BIGINT, FetchTiming BIGINT, SocialSourceNetworkID SMALLINT, SocialSourcePage VARCHAR(256), ParamPrice BIGINT, ParamOrderID VARCHAR(80), ParamCurrency CHAR(3), ParamCurrencyID INT, OpenstatServiceName VARCHAR(80), OpenstatCampaignID VARCHAR(512), OpenstatAdID VARCHAR(80), OpenstatSourceID VARCHAR(256), UTMSource VARCHAR(256), UTMMedium VARCHAR(256), UTMCampaign VARCHAR(407), UTMContent VARCHAR(256), UTMTerm VARCHAR(437), FromTag VARCHAR(428), HasGCLID SMALLINT, RefererHash BIGINT, URLHash BIGINT, CLID BIGINT, UserIDHash BIGINT ); CREATE INDEX hits_10m_ind ON hits_10m(CounterID, EventDate, UserIDHash, EventTime);
COPY INTO hits_10m FROM ('/opt/dump/dump_0.3/dump_hits_10m_meshed_utf8.tsv') DELIMITERS '\t';

View File

@ -0,0 +1,356 @@
Go to https://www.monetdb.org/
Dowload now.
Latest binary releases.
Ubuntu & Debian.
https://www.monetdb.org/downloads/deb/
Go to the server where you want to install MonetDB.
```
$ sudo mcedit /etc/apt/sources.list.d/monetdb.list
```
Write:
```
deb https://dev.monetdb.org/downloads/deb/ bionic monetdb
```
```
$ wget --output-document=- https://www.monetdb.org/downloads/MonetDB-GPG-KEY | sudo apt-key add -
$ sudo apt update
$ sudo apt install monetdb5-sql monetdb-client
$ sudo systemctl enable monetdbd
$ sudo systemctl start monetdbd
$ sudo usermod -a -G monetdb $USER
```
Logout and login back to your server.
Tutorial:
https://www.monetdb.org/Documentation/UserGuide/Tutorial
Creating the database:
```
$ sudo mkdir /opt/monetdb
$ sudo chmod 777 /opt/monetdb
$ monetdbd create /opt/monetdb
$ monetdbd start /opt/monetdb
cannot remove socket files
```
Now you have to stop MonetDB, copy the contents of `/var/monetdb5` to `/opt/monetdb` and replace the `/var/monetdb5` with symlink to `/opt/monetdb`. This is necessary, because I don't have free space in `/var` and creation of database in `/opt` did not succeed.
Start MonetDB again.
```
$ sudo systemctl start monetdbd
```
```
$ monetdb create test
created database in maintenance mode: test
$ monetdb release test
taken database out of maintenance mode: test
```
Run client:
```
$ mclient -u monetdb -d test
```
Type password: monetdb
```
CREATE TABLE hits
(
"WatchID" BIGINT,
"JavaEnable" TINYINT,
"Title" TEXT,
"GoodEvent" SMALLINT,
"EventTime" TIMESTAMP,
"EventDate" Date,
"CounterID" INTEGER,
"ClientIP" INTEGER,
"RegionID" INTEGER,
"UserID" BIGINT,
"CounterClass" TINYINT,
"OS" TINYINT,
"UserAgent" TINYINT,
"URL" TEXT,
"Referer" TEXT,
"Refresh" TINYINT,
"RefererCategoryID" SMALLINT,
"RefererRegionID" INTEGER,
"URLCategoryID" SMALLINT,
"URLRegionID" INTEGER,
"ResolutionWidth" SMALLINT,
"ResolutionHeight" SMALLINT,
"ResolutionDepth" TINYINT,
"FlashMajor" TINYINT,
"FlashMinor" TINYINT,
"FlashMinor2" TEXT,
"NetMajor" TINYINT,
"NetMinor" TINYINT,
"UserAgentMajor" SMALLINT,
"UserAgentMinor" TEXT,
"CookieEnable" TINYINT,
"JavascriptEnable" TINYINT,
"IsMobile" TINYINT,
"MobilePhone" TINYINT,
"MobilePhoneModel" TEXT,
"Params" TEXT,
"IPNetworkID" INTEGER,
"TraficSourceID" TINYINT,
"SearchEngineID" SMALLINT,
"SearchPhrase" TEXT,
"AdvEngineID" TINYINT,
"IsArtifical" TINYINT,
"WindowClientWidth" SMALLINT,
"WindowClientHeight" SMALLINT,
"ClientTimeZone" SMALLINT,
"ClientEventTime" TIMESTAMP,
"SilverlightVersion1" TINYINT,
"SilverlightVersion2" TINYINT,
"SilverlightVersion3" INTEGER,
"SilverlightVersion4" SMALLINT,
"PageCharset" TEXT,
"CodeVersion" INTEGER,
"IsLink" TINYINT,
"IsDownload" TINYINT,
"IsNotBounce" TINYINT,
"FUniqID" BIGINT,
"OriginalURL" TEXT,
"HID" INTEGER,
"IsOldCounter" TINYINT,
"IsEvent" TINYINT,
"IsParameter" TINYINT,
"DontCountHits" TINYINT,
"WithHash" TINYINT,
"HitColor" TEXT,
"LocalEventTime" TIMESTAMP,
"Age" TINYINT,
"Sex" TINYINT,
"Income" TINYINT,
"Interests" SMALLINT,
"Robotness" TINYINT,
"RemoteIP" INTEGER,
"WindowName" INTEGER,
"OpenerName" INTEGER,
"HistoryLength" SMALLINT,
"BrowserLanguage" TEXT,
"BrowserCountry" TEXT,
"SocialNetwork" TEXT,
"SocialAction" TEXT,
"HTTPError" SMALLINT,
"SendTiming" INTEGER,
"DNSTiming" INTEGER,
"ConnectTiming" INTEGER,
"ResponseStartTiming" INTEGER,
"ResponseEndTiming" INTEGER,
"FetchTiming" INTEGER,
"SocialSourceNetworkID" TINYINT,
"SocialSourcePage" TEXT,
"ParamPrice" BIGINT,
"ParamOrderID" TEXT,
"ParamCurrency" TEXT,
"ParamCurrencyID" SMALLINT,
"OpenstatServiceName" TEXT,
"OpenstatCampaignID" TEXT,
"OpenstatAdID" TEXT,
"OpenstatSourceID" TEXT,
"UTMSource" TEXT,
"UTMMedium" TEXT,
"UTMCampaign" TEXT,
"UTMContent" TEXT,
"UTMTerm" TEXT,
"FromTag" TEXT,
"HasGCLID" TINYINT,
"RefererHash" BIGINT,
"URLHash" BIGINT,
"CLID" INTEGER
);
```
# How to prepare data
Download the 100 million rows dataset from here and insert into ClickHouse:
https://clickhouse.tech/docs/en/getting-started/example-datasets/metrica/
Create the dataset from ClickHouse:
```
SELECT
toInt64(WatchID) = -9223372036854775808 ? -9223372036854775807 : toInt64(WatchID),
toInt8(JavaEnable) = -128 ? -127 : toInt8(JavaEnable),
toValidUTF8(toString(Title)),
toInt16(GoodEvent) = -32768 ? -32767 : toInt16(GoodEvent),
EventTime,
EventDate,
toInt32(CounterID) = -2147483648 ? -2147483647 : toInt32(CounterID),
toInt32(ClientIP) = -2147483648 ? -2147483647 : toInt32(ClientIP),
toInt32(RegionID) = -2147483648 ? -2147483647 : toInt32(RegionID),
toInt64(UserID) = -9223372036854775808 ? -9223372036854775807 : toInt64(UserID),
toInt8(CounterClass) = -128 ? -127 : toInt8(CounterClass),
toInt8(OS) = -128 ? -127 : toInt8(OS),
toInt8(UserAgent) = -128 ? -127 : toInt8(UserAgent),
toValidUTF8(toString(URL)),
toValidUTF8(toString(Referer)),
toInt8(Refresh) = -128 ? -127 : toInt8(Refresh),
toInt16(RefererCategoryID) = -32768 ? -32767 : toInt16(RefererCategoryID),
toInt32(RefererRegionID) = -2147483648 ? -2147483647 : toInt32(RefererRegionID),
toInt16(URLCategoryID) = -32768 ? -32767 : toInt16(URLCategoryID),
toInt32(URLRegionID) = -2147483648 ? -2147483647 : toInt32(URLRegionID),
toInt16(ResolutionWidth) = -32768 ? -32767 : toInt16(ResolutionWidth),
toInt16(ResolutionHeight) = -32768 ? -32767 : toInt16(ResolutionHeight),
toInt8(ResolutionDepth) = -128 ? -127 : toInt8(ResolutionDepth),
toInt8(FlashMajor) = -128 ? -127 : toInt8(FlashMajor),
toInt8(FlashMinor) = -128 ? -127 : toInt8(FlashMinor),
toValidUTF8(toString(FlashMinor2)),
toInt8(NetMajor) = -128 ? -127 : toInt8(NetMajor),
toInt8(NetMinor) = -128 ? -127 : toInt8(NetMinor),
toInt16(UserAgentMajor) = -32768 ? -32767 : toInt16(UserAgentMajor),
toValidUTF8(toString(UserAgentMinor)),
toInt8(CookieEnable) = -128 ? -127 : toInt8(CookieEnable),
toInt8(JavascriptEnable) = -128 ? -127 : toInt8(JavascriptEnable),
toInt8(IsMobile) = -128 ? -127 : toInt8(IsMobile),
toInt8(MobilePhone) = -128 ? -127 : toInt8(MobilePhone),
toValidUTF8(toString(MobilePhoneModel)),
toValidUTF8(toString(Params)),
toInt32(IPNetworkID) = -2147483648 ? -2147483647 : toInt32(IPNetworkID),
toInt8(TraficSourceID) = -128 ? -127 : toInt8(TraficSourceID),
toInt16(SearchEngineID) = -32768 ? -32767 : toInt16(SearchEngineID),
toValidUTF8(toString(SearchPhrase)),
toInt8(AdvEngineID) = -128 ? -127 : toInt8(AdvEngineID),
toInt8(IsArtifical) = -128 ? -127 : toInt8(IsArtifical),
toInt16(WindowClientWidth) = -32768 ? -32767 : toInt16(WindowClientWidth),
toInt16(WindowClientHeight) = -32768 ? -32767 : toInt16(WindowClientHeight),
toInt16(ClientTimeZone) = -32768 ? -32767 : toInt16(ClientTimeZone),
ClientEventTime,
toInt8(SilverlightVersion1) = -128 ? -127 : toInt8(SilverlightVersion1),
toInt8(SilverlightVersion2) = -128 ? -127 : toInt8(SilverlightVersion2),
toInt32(SilverlightVersion3) = -2147483648 ? -2147483647 : toInt32(SilverlightVersion3),
toInt16(SilverlightVersion4) = -32768 ? -32767 : toInt16(SilverlightVersion4),
toValidUTF8(toString(PageCharset)),
toInt32(CodeVersion) = -2147483648 ? -2147483647 : toInt32(CodeVersion),
toInt8(IsLink) = -128 ? -127 : toInt8(IsLink),
toInt8(IsDownload) = -128 ? -127 : toInt8(IsDownload),
toInt8(IsNotBounce) = -128 ? -127 : toInt8(IsNotBounce),
toInt64(FUniqID) = -9223372036854775808 ? -9223372036854775807 : toInt64(FUniqID),
toValidUTF8(toString(OriginalURL)),
toInt32(HID) = -2147483648 ? -2147483647 : toInt32(HID),
toInt8(IsOldCounter) = -128 ? -127 : toInt8(IsOldCounter),
toInt8(IsEvent) = -128 ? -127 : toInt8(IsEvent),
toInt8(IsParameter) = -128 ? -127 : toInt8(IsParameter),
toInt8(DontCountHits) = -128 ? -127 : toInt8(DontCountHits),
toInt8(WithHash) = -128 ? -127 : toInt8(WithHash),
toValidUTF8(toString(HitColor)),
LocalEventTime,
toInt8(Age) = -128 ? -127 : toInt8(Age),
toInt8(Sex) = -128 ? -127 : toInt8(Sex),
toInt8(Income) = -128 ? -127 : toInt8(Income),
toInt16(Interests) = -32768 ? -32767 : toInt16(Interests),
toInt8(Robotness) = -128 ? -127 : toInt8(Robotness),
toInt32(RemoteIP) = -2147483648 ? -2147483647 : toInt32(RemoteIP),
toInt32(WindowName) = -2147483648 ? -2147483647 : toInt32(WindowName),
toInt32(OpenerName) = -2147483648 ? -2147483647 : toInt32(OpenerName),
toInt16(HistoryLength) = -32768 ? -32767 : toInt16(HistoryLength),
toValidUTF8(toString(BrowserLanguage)),
toValidUTF8(toString(BrowserCountry)),
toValidUTF8(toString(SocialNetwork)),
toValidUTF8(toString(SocialAction)),
toInt16(HTTPError) = -32768 ? -32767 : toInt16(HTTPError),
toInt32(SendTiming) = -2147483648 ? -2147483647 : toInt32(SendTiming),
toInt32(DNSTiming) = -2147483648 ? -2147483647 : toInt32(DNSTiming),
toInt32(ConnectTiming) = -2147483648 ? -2147483647 : toInt32(ConnectTiming),
toInt32(ResponseStartTiming) = -2147483648 ? -2147483647 : toInt32(ResponseStartTiming),
toInt32(ResponseEndTiming) = -2147483648 ? -2147483647 : toInt32(ResponseEndTiming),
toInt32(FetchTiming) = -2147483648 ? -2147483647 : toInt32(FetchTiming),
toInt8(SocialSourceNetworkID) = -128 ? -127 : toInt8(SocialSourceNetworkID),
toValidUTF8(toString(SocialSourcePage)),
toInt64(ParamPrice) = -9223372036854775808 ? -9223372036854775807 : toInt64(ParamPrice),
toValidUTF8(toString(ParamOrderID)),
toValidUTF8(toString(ParamCurrency)),
toInt16(ParamCurrencyID) = -32768 ? -32767 : toInt16(ParamCurrencyID),
toValidUTF8(toString(OpenstatServiceName)),
toValidUTF8(toString(OpenstatCampaignID)),
toValidUTF8(toString(OpenstatAdID)),
toValidUTF8(toString(OpenstatSourceID)),
toValidUTF8(toString(UTMSource)),
toValidUTF8(toString(UTMMedium)),
toValidUTF8(toString(UTMCampaign)),
toValidUTF8(toString(UTMContent)),
toValidUTF8(toString(UTMTerm)),
toValidUTF8(toString(FromTag)),
toInt8(HasGCLID) = -128 ? -127 : toInt8(HasGCLID),
toInt64(RefererHash) = -9223372036854775808 ? -9223372036854775807 : toInt64(RefererHash),
toInt64(URLHash) = -9223372036854775808 ? -9223372036854775807 : toInt64(URLHash),
toInt32(CLID) = -2147483648 ? -2147483647 : toInt32(CLID)
FROM hits_100m_obfuscated
INTO OUTFILE '/home/milovidov/example_datasets/hits_100m_obfuscated_monetdb.tsv'
FORMAT TSV;
```
Note that MonetDB does not support the most negative numbers like -128. And we have to convert them by adding one.
It makes impossible to store the values of 64bit identifiers in BIGINT.
Maybe it's a trick to optimize NULLs?
Upload the data:
```
$ mclient -u monetdb -d test
```
Type password: monetdb
```
COPY INTO hits FROM '/home/milovidov/example_datasets/hits_100m_obfuscated_monetdb.tsv' USING DELIMITERS '\t';
```
It takes 28 minutes 02 seconds on a server (Linux Ubuntu, Xeon E5-2560v2, 32 logical CPU, 128 GiB RAM, 8xHDD RAID-5, 40 TB).
It is roughly 60 000 rows per second.
Validate the data:
```
SELECT count(*) FROM hits;
```
Create an index:
```
CREATE INDEX hits_idx ON hits ("CounterID", "EventDate");
```
(it takes 5 seconds)
Run the benchmark:
```
./benchmark.sh | tee log.txt
```
You can find the log in `log.txt` file.
Postprocess data:
```
grep clk log.txt | tr -d '\r' | awk '{ if ($3 == "ms") { print $2 / 1000; } else if ($3 == "sec") { print $2 } else { print } }'
```
Then replace values with "min" (minutes) timing manually and save to `tmp.txt`.
Then process to JSON format:
```
awk '{
if (i % 3 == 0) { a = $1 }
else if (i % 3 == 1) { b = $1 }
else if (i % 3 == 2) { c = $1; print "[" a ", " b ", " c "]," };
++i; }' < tmp.txt
```
And paste to `/website/benchmark/dbms/results/005_monetdb.json` in the repository.

341
benchmark/monetdb/log.txt Normal file
View File

@ -0,0 +1,341 @@
3
SELECT count(*) FROM hits;
1 tuple
clk: 1.262 ms
1 tuple
clk: 1.420 ms
1 tuple
clk: 1.190 ms
3
SELECT count(*) FROM hits WHERE "AdvEngineID" <> 0;
1 tuple
clk: 1.530 sec
1 tuple
clk: 1.489 sec
1 tuple
clk: 1.490 sec
3
SELECT sum("AdvEngineID"), count(*), avg("ResolutionWidth") FROM hits;
1 tuple
clk: 597.512 ms
1 tuple
clk: 579.383 ms
1 tuple
clk: 598.220 ms
3
SELECT sum("UserID") FROM hits;
overflow in calculation.
clk: 568.003 ms
overflow in calculation.
clk: 554.572 ms
overflow in calculation.
clk: 552.076 ms
3
SELECT COUNT(DISTINCT "UserID") FROM hits;
1 tuple
clk: 6.688 sec
1 tuple
clk: 6.689 sec
1 tuple
clk: 6.652 sec
3
SELECT COUNT(DISTINCT "SearchPhrase") FROM hits;
1 tuple
clk: 15.702 sec
1 tuple
clk: 17.189 sec
1 tuple
clk: 15.514 sec
3
SELECT min("EventDate"), max("EventDate") FROM hits;
1 tuple
clk: 697.770 ms
1 tuple
clk: 711.870 ms
1 tuple
clk: 697.177 ms
3
SELECT "AdvEngineID", count(*) FROM hits WHERE "AdvEngineID" <> 0 GROUP BY "AdvEngineID" ORDER BY count(*) DESC;
18 tuples
clk: 1.536 sec
18 tuples
clk: 1.505 sec
18 tuples
clk: 1.492 sec
3
SELECT "RegionID", COUNT(DISTINCT "UserID") AS u FROM hits GROUP BY "RegionID" ORDER BY u DESC LIMIT 10;
10 tuples
clk: 9.965 sec
10 tuples
clk: 10.106 sec
10 tuples
clk: 10.136 sec
3
SELECT "RegionID", sum("AdvEngineID"), count(*) AS c, avg("ResolutionWidth"), COUNT(DISTINCT "UserID") FROM hits GROUP BY "RegionID" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 8.329 sec
10 tuples
clk: 8.601 sec
10 tuples
clk: 8.039 sec
3
SELECT "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u FROM hits WHERE "MobilePhoneModel" <> '' GROUP BY "MobilePhoneModel" ORDER BY u DESC LIMIT 10;
10 tuples
clk: 3.385 sec
10 tuples
clk: 3.321 sec
10 tuples
clk: 3.326 sec
3
SELECT "MobilePhone", "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u FROM hits WHERE "MobilePhoneModel" <> '' GROUP BY "MobilePhone", "MobilePhoneModel" ORDER BY u DESC LIMIT 10;
10 tuples
clk: 3.510 sec
10 tuples
clk: 3.431 sec
10 tuples
clk: 3.382 sec
3
SELECT "SearchPhrase", count(*) AS c FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 10.891 sec
10 tuples
clk: 11.483 sec
10 tuples
clk: 10.352 sec
3
SELECT "SearchPhrase", COUNT(DISTINCT "UserID") AS u FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY u DESC LIMIT 10;
10 tuples
clk: 15.711 sec
10 tuples
clk: 15.444 sec
10 tuples
clk: 15.503 sec
3
SELECT "SearchEngineID", "SearchPhrase", count(*) AS c FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "SearchPhrase" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 11.433 sec
10 tuples
clk: 11.399 sec
10 tuples
clk: 11.285 sec
3
SELECT "UserID", count(*) FROM hits GROUP BY "UserID" ORDER BY count(*) DESC LIMIT 10;
10 tuples
clk: 7.184 sec
10 tuples
clk: 7.015 sec
10 tuples
clk: 6.849 sec
3
SELECT "UserID", "SearchPhrase", count(*) FROM hits GROUP BY "UserID", "SearchPhrase" ORDER BY count(*) DESC LIMIT 10;
10 tuples
clk: 29.096 sec
10 tuples
clk: 28.328 sec
10 tuples
clk: 29.247 sec
3
SELECT "UserID", "SearchPhrase", count(*) FROM hits GROUP BY "UserID", "SearchPhrase" LIMIT 10;
10 tuples
clk: 29.457 sec
10 tuples
clk: 29.364 sec
10 tuples
clk: 29.269 sec
3
SELECT "UserID", extract(minute FROM "EventTime") AS m, "SearchPhrase", count(*) FROM hits GROUP BY "UserID", m, "SearchPhrase" ORDER BY count(*) DESC LIMIT 10;
10 tuples
clk: 47.141 sec
10 tuples
clk: 46.495 sec
10 tuples
clk: 46.472 sec
3
SELECT "UserID" FROM hits WHERE "UserID" = -6101065172474983726;
0 tuples
clk: 783.332 ms
0 tuples
clk: 771.157 ms
0 tuples
clk: 783.082 ms
3
SELECT count(*) FROM hits WHERE "URL" LIKE '%metrika%';
1 tuple
clk: 3.963 sec
1 tuple
clk: 3.930 sec
1 tuple
clk: 3.964 sec
3
SELECT "SearchPhrase", min("URL"), count(*) AS c FROM hits WHERE "URL" LIKE '%metrika%' AND "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 3.925 sec
10 tuples
clk: 3.817 sec
10 tuples
clk: 3.802 sec
3
SELECT "SearchPhrase", min("URL"), min("Title"), count(*) AS c, COUNT(DISTINCT "UserID") FROM hits WHERE "Title" LIKE '%Яндекс%' AND "URL" NOT LIKE '%.yandex.%' AND "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 6.067 sec
10 tuples
clk: 6.120 sec
10 tuples
clk: 6.012 sec
3
SELECT * FROM hits WHERE "URL" LIKE '%metrika%' ORDER BY "EventTime" LIMIT 10;
10 tuples !87 columns dropped, 29 fields truncated!
clk: 4.251 sec
10 tuples !87 columns dropped, 29 fields truncated!
clk: 4.190 sec
10 tuples !87 columns dropped, 29 fields truncated!
clk: 4.379 sec
3
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "EventTime" LIMIT 10;
10 tuples
clk: 6.699 sec
10 tuples
clk: 6.718 sec
10 tuples
clk: 6.802 sec
3
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
10 tuples
clk: 6.887 sec
10 tuples
clk: 6.838 sec
10 tuples
clk: 6.844 sec
3
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "EventTime", "SearchPhrase" LIMIT 10;
10 tuples
clk: 6.806 sec
10 tuples
clk: 6.878 sec
10 tuples
clk: 6.807 sec
3
SELECT "CounterID", avg(length("URL")) AS l, count(*) AS c FROM hits WHERE "URL" <> '' GROUP BY "CounterID" HAVING count(*) > 100000 ORDER BY l DESC LIMIT 25;
25 tuples
clk: 1:01 min
25 tuples
clk: 55.553 sec
25 tuples
clk: 56.188 sec
3
SELECT sys.getdomain("Referer") AS key, avg(length("Referer")) AS l, count(*) AS c, min("Referer") FROM hits WHERE "Referer" <> '' GROUP BY key HAVING count(*) > 100000 ORDER BY l DESC LIMIT 25;
clk: 1:00 min
clk: 1:00 min
clk: 1:00 min
3
SELECT sum("ResolutionWidth"), sum("ResolutionWidth" + 1), sum("ResolutionWidth" + 2), sum("ResolutionWidth" + 3), sum("ResolutionWidth" + 4), sum("ResolutionWidth" + 5), sum("ResolutionWidth" + 6), sum("ResolutionWidth" + 7), sum("ResolutionWidth" + 8), sum("ResolutionWidth" + 9), sum("ResolutionWidth" + 10), sum("ResolutionWidth" + 11), sum("ResolutionWidth" + 12), sum("ResolutionWidth" + 13), sum("ResolutionWidth" + 14), sum("ResolutionWidth" + 15), sum("ResolutionWidth" + 16), sum("ResolutionWidth" + 17), sum("ResolutionWidth" + 18), sum("ResolutionWidth" + 19), sum("ResolutionWidth" + 20), sum("ResolutionWidth" + 21), sum("ResolutionWidth" + 22), sum("ResolutionWidth" + 23), sum("ResolutionWidth" + 24), sum("ResolutionWidth" + 25), sum("ResolutionWidth" + 26), sum("ResolutionWidth" + 27), sum("ResolutionWidth" + 28), sum("ResolutionWidth" + 29), sum("ResolutionWidth" + 30), sum("ResolutionWidth" + 31), sum("ResolutionWidth" + 32), sum("ResolutionWidth" + 33), sum("ResolutionWidth" + 34), sum("ResolutionWidth" + 35), sum("ResolutionWidth" + 36), sum("ResolutionWidth" + 37), sum("ResolutionWidth" + 38), sum("ResolutionWidth" + 39), sum("ResolutionWidth" + 40), sum("ResolutionWidth" + 41), sum("ResolutionWidth" + 42), sum("ResolutionWidth" + 43), sum("ResolutionWidth" + 44), sum("ResolutionWidth" + 45), sum("ResolutionWidth" + 46), sum("ResolutionWidth" + 47), sum("ResolutionWidth" + 48), sum("ResolutionWidth" + 49), sum("ResolutionWidth" + 50), sum("ResolutionWidth" + 51), sum("ResolutionWidth" + 52), sum("ResolutionWidth" + 53), sum("ResolutionWidth" + 54), sum("ResolutionWidth" + 55), sum("ResolutionWidth" + 56), sum("ResolutionWidth" + 57), sum("ResolutionWidth" + 58), sum("ResolutionWidth" + 59), sum("ResolutionWidth" + 60), sum("ResolutionWidth" + 61), sum("ResolutionWidth" + 62), sum("ResolutionWidth" + 63), sum("ResolutionWidth" + 64), sum("ResolutionWidth" + 65), sum("ResolutionWidth" + 66), sum("ResolutionWidth" + 67), sum("ResolutionWidth" + 68), sum("ResolutionWidth" + 69), sum("ResolutionWidth" + 70), sum("ResolutionWidth" + 71), sum("ResolutionWidth" + 72), sum("ResolutionWidth" + 73), sum("ResolutionWidth" + 74), sum("ResolutionWidth" + 75), sum("ResolutionWidth" + 76), sum("ResolutionWidth" + 77), sum("ResolutionWidth" + 78), sum("ResolutionWidth" + 79), sum("ResolutionWidth" + 80), sum("ResolutionWidth" + 81), sum("ResolutionWidth" + 82), sum("ResolutionWidth" + 83), sum("ResolutionWidth" + 84), sum("ResolutionWidth" + 85), sum("ResolutionWidth" + 86), sum("ResolutionWidth" + 87), sum("ResolutionWidth" + 88), sum("ResolutionWidth" + 89) FROM hits;
1 tuple !77 columns dropped!
clk: 6.221 sec
1 tuple !77 columns dropped!
clk: 6.170 sec
1 tuple !77 columns dropped!
clk: 6.382 sec
3
SELECT "SearchEngineID", "ClientIP", count(*) AS c, sum("Refresh"), avg("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 5.684 sec
10 tuples
clk: 5.585 sec
10 tuples
clk: 5.463 sec
3
SELECT "WatchID", "ClientIP", count(*) AS c, sum("Refresh"), avg("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 6.281 sec
10 tuples
clk: 6.574 sec
10 tuples
clk: 6.243 sec
3
SELECT "WatchID", "ClientIP", count(*) AS c, sum("Refresh"), avg("ResolutionWidth") FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 44.641 sec
10 tuples
clk: 41.904 sec
10 tuples
clk: 43.218 sec
3
SELECT "URL", count(*) AS c FROM hits GROUP BY "URL" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 1:24 min
10 tuples
clk: 1:31 min
10 tuples
clk: 1:24 min
3
SELECT 1, "URL", count(*) AS c FROM hits GROUP BY 1, "URL" ORDER BY c DESC LIMIT 10;
10 tuples
clk: 1:24 min
10 tuples
clk: 1:25 min
10 tuples
clk: 1:24 min
3
SELECT "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3, count(*) AS c FROM hits GROUP BY "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3 ORDER BY c DESC LIMIT 10;
10 tuples
clk: 26.438 sec
10 tuples
clk: 26.033 sec
10 tuples
clk: 26.147 sec
3
SELECT "URL", count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "DontCountHits" = 0 AND "Refresh" = 0 AND "URL" <> '' GROUP BY "URL" ORDER BY "PageViews" DESC LIMIT 10;
10 tuples
clk: 4.825 sec
10 tuples
clk: 4.618 sec
10 tuples
clk: 4.623 sec
3
SELECT "Title", count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "DontCountHits" = 0 AND "Refresh" = 0 AND "Title" <> '' GROUP BY "Title" ORDER BY "PageViews" DESC LIMIT 10;
10 tuples
clk: 4.380 sec
10 tuples
clk: 4.418 sec
10 tuples
clk: 4.413 sec
3
SELECT "URL", count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 AND "IsLink" <> 0 AND "IsDownload" = 0 GROUP BY "URL" ORDER BY "PageViews" DESC LIMIT 1000;
1000 tuples
clk: 4.259 sec
1000 tuples
clk: 4.195 sec
1000 tuples
clk: 4.195 sec
3
SELECT "TraficSourceID", "SearchEngineID", "AdvEngineID", CASE WHEN ("SearchEngineID" = 0 AND "AdvEngineID" = 0) THEN "Referer" ELSE '' END AS Src, "URL" AS Dst, count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 GROUP BY "TraficSourceID", "SearchEngineID", "AdvEngineID", CASE WHEN ("SearchEngineID" = 0 AND "AdvEngineID" = 0) THEN "Referer" ELSE '' END, "URL" ORDER BY "PageViews" DESC LIMIT 1000;
1000 tuples
clk: 3.233 sec
1000 tuples
clk: 3.180 sec
1000 tuples
clk: 3.181 sec
3
SELECT "URLHash", "EventDate", count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 AND "TraficSourceID" IN (-1, 6) AND "RefererHash" = 686716256552154761 GROUP BY "URLHash", "EventDate" ORDER BY "PageViews" DESC LIMIT 100;
0 tuples
clk: 2.656 sec
0 tuples
clk: 2.557 sec
0 tuples
clk: 2.561 sec
3
SELECT "WindowClientWidth", "WindowClientHeight", count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 AND "DontCountHits" = 0 AND "URLHash" = 686716256552154761 GROUP BY "WindowClientWidth", "WindowClientHeight" ORDER BY "PageViews" DESC LIMIT 10000;
0 tuples
clk: 4.161 sec
0 tuples
clk: 4.243 sec
0 tuples
clk: 4.166 sec
3
SELECT DATE_TRUNC('minute', "EventTime") AS "Minute", count(*) AS "PageViews" FROM hits WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-02' AND "Refresh" = 0 AND "DontCountHits" = 0 GROUP BY DATE_TRUNC('minute', "EventTime") ORDER BY DATE_TRUNC('minute', "EventTime");
0 tuples
clk: 4.199 sec
0 tuples
clk: 4.211 sec
0 tuples
clk: 4.190 sec

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -1,111 +1,43 @@
SELECT count(*) FROM hits_10m;
SELECT count(*) FROM hits_10m WHERE AdvEngineID <> 0;
SELECT sum(AdvEngineID), count(*), avg(ResolutionWidth) FROM hits_10m;
SELECT sum(UserID) FROM hits_10m;
SELECT count(DISTINCT UserID) FROM hits_10m;
SELECT count(DISTINCT SearchPhrase) FROM hits_10m;
SELECT min(EventDate), max(EventDate) FROM hits_10m;
SELECT AdvEngineID, count(*) FROM hits_10m WHERE AdvEngineID <> 0 GROUP BY AdvEngineID ORDER BY count(*) DESC;
-- мощная фильтрация. После фильтрации почти ничего не остаётся, но делаем ещё агрегацию.;
SELECT RegionID, count(DISTINCT UserID) AS u FROM hits_10m GROUP BY RegionID ORDER BY u DESC LIMIT 10;
-- агрегация, среднее количество ключей.;
SELECT RegionID, sum(AdvEngineID), count(*) AS c, avg(ResolutionWidth), count(DISTINCT UserID) FROM hits_10m GROUP BY RegionID ORDER BY count(*) DESC LIMIT 10;
-- агрегация, среднее количество ключей, несколько агрегатных функций.;
SELECT MobilePhoneModel, count(DISTINCT UserID) AS u FROM hits_10m WHERE MobilePhoneModel <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
-- мощная фильтрация по строкам, затем агрегация по строкам.;
SELECT MobilePhone, MobilePhoneModel, count(DISTINCT UserID) AS u FROM hits_10m WHERE MobilePhoneModel <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
-- мощная фильтрация по строкам, затем агрегация по паре из числа и строки.;
SELECT SearchPhrase, count(*) FROM hits_10m WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY count(*) DESC LIMIT 10;
-- средняя фильтрация по строкам, затем агрегация по строкам, большое количество ключей.;
SELECT SearchPhrase, count(DISTINCT UserID) AS u FROM hits_10m WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
-- агрегация чуть сложнее.;
SELECT SearchEngineID, SearchPhrase, count(*) FROM hits_10m WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY count(*) DESC LIMIT 10;
-- агрегация по числу и строке, большое количество ключей.;
SELECT UserID, count(*) FROM hits_10m GROUP BY UserID ORDER BY count(*) DESC LIMIT 10;
-- агрегация по очень большому количеству ключей, может не хватить оперативки.;
SELECT UserID, SearchPhrase, count(*) FROM hits_10m GROUP BY UserID, SearchPhrase ORDER BY count(*) DESC LIMIT 10;
-- ещё более сложная агрегация.;
SELECT UserID, SearchPhrase, count(*) FROM hits_10m GROUP BY UserID, SearchPhrase LIMIT 10;
-- то же самое, но без сортировки.;
SELECT UserID, extract (minute from EventTime) AS m, SearchPhrase, count(*) FROM hits_10m GROUP BY UserID, m, SearchPhrase ORDER BY count(*) DESC LIMIT 10;
-- ещё более сложная агрегация, не стоит выполнять на больших таблицах.;
SELECT UserID FROM hits_10m WHERE UserID = 1234567890;
-- мощная фильтрация по столбцу типа UInt64.;
SELECT count(*) FROM hits_10m WHERE URL LIKE '%metrika%';
-- фильтрация по поиску подстроки в строке.;
SELECT SearchPhrase, MAX(URL), count(*) FROM hits_10m WHERE URL LIKE '%metrika%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY count(*) DESC LIMIT 10;
-- вынимаем большие столбцы, фильтрация по строке.;
SELECT SearchPhrase, MAX(URL), MAX(Title), count(*) AS c, count(DISTINCT UserID) FROM hits_10m WHERE Title LIKE '%Яндекс%' AND URL <> '%.yandex.%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY count(*) DESC LIMIT 10;
-- чуть больше столбцы.;
SELECT * FROM hits_10m WHERE URL LIKE '%metrika%' ORDER BY EventTime LIMIT 10;
-- плохой запрос - вынимаем все столбцы.;
SELECT SearchPhrase FROM hits_10m WHERE SearchPhrase <> '' ORDER BY EventTime LIMIT 10;
-- большая сортировка.;
SELECT SearchPhrase FROM hits_10m WHERE SearchPhrase <> '' ORDER BY SearchPhrase LIMIT 10;
-- большая сортировка по строкам.;
SELECT SearchPhrase FROM hits_10m WHERE SearchPhrase <> '' ORDER BY EventTime, SearchPhrase LIMIT 10;
-- большая сортировка по кортежу.;
SELECT CounterID, avg(length(URL)) AS l, count(*) FROM hits_10m WHERE URL <> '' GROUP BY CounterID HAVING count(*) > 100000 ORDER BY l DESC LIMIT 25;
-- считаем средние длины URL для крупных счётчиков.;
SELECT SUBSTRING( SUBSTRING(Referer, POSITION('//' IN Referer) + 2), 1, ifthenelse( (0 > POSITION('/' IN SUBSTRING(Referer, POSITION('//' IN Referer) + 2)) - 1), 0, POSITION('/' IN SUBSTRING(Referer, POSITION('//' IN Referer) + 2)) - 1 ) ) AS k, avg(length(Referer)) AS l, count(*) AS c, MAX(Referer) FROM hits_10m WHERE Referer <> '' GROUP BY k HAVING count(*) > 100000 ORDER BY l DESC LIMIT 25;
-- то же самое, но с разбивкой по доменам.;
SELECT sum(ResolutionWidth), sum(ResolutionWidth + 1), sum(ResolutionWidth + 2), sum(ResolutionWidth + 3), sum(ResolutionWidth + 4), sum(ResolutionWidth + 5), sum(ResolutionWidth + 6), sum(ResolutionWidth + 7), sum(ResolutionWidth + 8), sum(ResolutionWidth + 9), sum(ResolutionWidth + 10), sum(ResolutionWidth + 11), sum(ResolutionWidth + 12), sum(ResolutionWidth + 13), sum(ResolutionWidth + 14), sum(ResolutionWidth + 15), sum(ResolutionWidth + 16), sum(ResolutionWidth + 17), sum(ResolutionWidth + 18), sum(ResolutionWidth + 19), sum(ResolutionWidth + 20), sum(ResolutionWidth + 21), sum(ResolutionWidth + 22), sum(ResolutionWidth + 23), sum(ResolutionWidth + 24), sum(ResolutionWidth + 25), sum(ResolutionWidth + 26), sum(ResolutionWidth + 27), sum(ResolutionWidth + 28), sum(ResolutionWidth + 29), sum(ResolutionWidth + 30), sum(ResolutionWidth + 31), sum(ResolutionWidth + 32), sum(ResolutionWidth + 33), sum(ResolutionWidth + 34), sum(ResolutionWidth + 35), sum(ResolutionWidth + 36), sum(ResolutionWidth + 37), sum(ResolutionWidth + 38), sum(ResolutionWidth + 39), sum(ResolutionWidth + 40), sum(ResolutionWidth + 41), sum(ResolutionWidth + 42), sum(ResolutionWidth + 43), sum(ResolutionWidth + 44), sum(ResolutionWidth + 45), sum(ResolutionWidth + 46), sum(ResolutionWidth + 47), sum(ResolutionWidth + 48), sum(ResolutionWidth + 49), sum(ResolutionWidth + 50), sum(ResolutionWidth + 51), sum(ResolutionWidth + 52), sum(ResolutionWidth + 53), sum(ResolutionWidth + 54), sum(ResolutionWidth + 55), sum(ResolutionWidth + 56), sum(ResolutionWidth + 57), sum(ResolutionWidth + 58), sum(ResolutionWidth + 59), sum(ResolutionWidth + 60), sum(ResolutionWidth + 61), sum(ResolutionWidth + 62), sum(ResolutionWidth + 63), sum(ResolutionWidth + 64), sum(ResolutionWidth + 65), sum(ResolutionWidth + 66), sum(ResolutionWidth + 67), sum(ResolutionWidth + 68), sum(ResolutionWidth + 69), sum(ResolutionWidth + 70), sum(ResolutionWidth + 71), sum(ResolutionWidth + 72), sum(ResolutionWidth + 73), sum(ResolutionWidth + 74), sum(ResolutionWidth + 75), sum(ResolutionWidth + 76), sum(ResolutionWidth + 77), sum(ResolutionWidth + 78), sum(ResolutionWidth + 79), sum(ResolutionWidth + 80), sum(ResolutionWidth + 81), sum(ResolutionWidth + 82), sum(ResolutionWidth + 83), sum(ResolutionWidth + 84), sum(ResolutionWidth + 85), sum(ResolutionWidth + 86), sum(ResolutionWidth + 87), sum(ResolutionWidth + 88), sum(ResolutionWidth + 89) FROM hits_10m;
-- много тупых агрегатных функций.;
SELECT SearchEngineID, ClientIP, count(*) AS c, sum(Refresh), avg(ResolutionWidth) FROM hits_10m WHERE SearchPhrase <> '' GROUP BY SearchEngineID, ClientIP ORDER BY count(*) DESC LIMIT 10;
-- сложная агрегация, для больших таблиц может не хватить оперативки.;
SELECT WatchID, ClientIP, count(*) AS c, sum(Refresh), avg(ResolutionWidth) FROM hits_10m WHERE SearchPhrase <> '' GROUP BY WatchID, ClientIP ORDER BY count(*) DESC LIMIT 10;
-- агрегация по двум полям, которая ничего не агрегирует. Для больших таблиц выполнить не получится.;
SELECT WatchID, ClientIP, count(*) AS c, sum(Refresh), avg(ResolutionWidth) FROM hits_10m GROUP BY WatchID, ClientIP ORDER BY count(*) DESC LIMIT 10;
-- то же самое, но ещё и без фильтрации.;
SELECT URL, count(*) FROM hits_10m GROUP BY URL ORDER BY count(*) DESC LIMIT 10;
-- агрегация по URL.;
SELECT 1, URL, count(*) FROM hits_10m GROUP BY 1, URL ORDER BY count(*) DESC LIMIT 10;
-- агрегация по URL и числу.;
SELECT ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3, count(*) FROM hits_10m GROUP BY ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3 ORDER BY count(*) DESC LIMIT 10;
SELECT URL, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT DontCountHits AND NOT Refresh AND URL <> '' GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
SELECT Title, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT DontCountHits AND NOT Refresh AND Title <> '' GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
SELECT URL, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND IsLink AND NOT IsDownload GROUP BY URL ORDER BY PageViews DESC LIMIT 1000;
SELECT TraficSourceID, SearchEngineID, AdvEngineID, CASE WHEN SearchEngineID = 0 AND AdvEngineID = 0 THEN Referer ELSE '' END AS Src, URL AS Dst, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 1000;
SELECT URLHash, EventDate, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND TraficSourceID IN (-1, 6) AND RefererHash = 6202628419148573758 GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 100000;
SELECT WindowClientWidth, WindowClientHeight, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND NOT DontCountHits AND URLHash = 6202628419148573758 GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10000;
SELECT EventTime - extract (SECOND from EventTime) AS M, count(*) AS PageViews FROM hits_10m WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-02' AND NOT Refresh AND NOT DontCountHits GROUP BY M ORDER BY M;
SELECT count(*) FROM {table};
SELECT count(*) FROM {table} WHERE "AdvEngineID" <> 0;
SELECT sum("AdvEngineID"), count(*), avg("ResolutionWidth") FROM {table};
SELECT sum("UserID") FROM {table};
SELECT COUNT(DISTINCT "UserID") FROM {table};
SELECT COUNT(DISTINCT "SearchPhrase") FROM {table};
SELECT min("EventDate"), max("EventDate") FROM {table};
SELECT "AdvEngineID", count(*) FROM {table} WHERE "AdvEngineID" <> 0 GROUP BY "AdvEngineID" ORDER BY count(*) DESC;
SELECT "RegionID", COUNT(DISTINCT "UserID") AS u FROM {table} GROUP BY "RegionID" ORDER BY u DESC LIMIT 10;
SELECT "RegionID", sum("AdvEngineID"), count(*) AS c, avg("ResolutionWidth"), COUNT(DISTINCT "UserID") FROM {table} GROUP BY "RegionID" ORDER BY c DESC LIMIT 10;
SELECT "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u FROM {table} WHERE "MobilePhoneModel" <> '' GROUP BY "MobilePhoneModel" ORDER BY u DESC LIMIT 10;
SELECT "MobilePhone", "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u FROM {table} WHERE "MobilePhoneModel" <> '' GROUP BY "MobilePhone", "MobilePhoneModel" ORDER BY u DESC LIMIT 10;
SELECT "SearchPhrase", count(*) AS c FROM {table} WHERE "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
SELECT "SearchPhrase", COUNT(DISTINCT "UserID") AS u FROM {table} WHERE "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY u DESC LIMIT 10;
SELECT "SearchEngineID", "SearchPhrase", count(*) AS c FROM {table} WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "SearchPhrase" ORDER BY c DESC LIMIT 10;
SELECT "UserID", count(*) FROM {table} GROUP BY "UserID" ORDER BY count(*) DESC LIMIT 10;
SELECT "UserID", "SearchPhrase", count(*) FROM {table} GROUP BY "UserID", "SearchPhrase" ORDER BY count(*) DESC LIMIT 10;
SELECT "UserID", "SearchPhrase", count(*) FROM {table} GROUP BY "UserID", "SearchPhrase" LIMIT 10;
SELECT "UserID", extract(minute FROM "EventTime") AS m, "SearchPhrase", count(*) FROM {table} GROUP BY "UserID", m, "SearchPhrase" ORDER BY count(*) DESC LIMIT 10;
SELECT "UserID" FROM {table} WHERE "UserID" = -6101065172474983726;
SELECT count(*) FROM {table} WHERE "URL" LIKE '%metrika%';
SELECT "SearchPhrase", min("URL"), count(*) AS c FROM {table} WHERE "URL" LIKE '%metrika%' AND "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
SELECT "SearchPhrase", min("URL"), min("Title"), count(*) AS c, COUNT(DISTINCT "UserID") FROM {table} WHERE "Title" LIKE '%Яндекс%' AND "URL" NOT LIKE '%.yandex.%' AND "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
SELECT * FROM {table} WHERE "URL" LIKE '%metrika%' ORDER BY "EventTime" LIMIT 10;
SELECT "SearchPhrase" FROM {table} WHERE "SearchPhrase" <> '' ORDER BY "EventTime" LIMIT 10;
SELECT "SearchPhrase" FROM {table} WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM {table} WHERE "SearchPhrase" <> '' ORDER BY "EventTime", "SearchPhrase" LIMIT 10;
SELECT "CounterID", avg(length("URL")) AS l, count(*) AS c FROM {table} WHERE "URL" <> '' GROUP BY "CounterID" HAVING count(*) > 100000 ORDER BY l DESC LIMIT 25;
SELECT sys.getdomain("Referer") AS key, avg(length("Referer")) AS l, count(*) AS c, min("Referer") FROM {table} WHERE "Referer" <> '' GROUP BY key HAVING count(*) > 100000 ORDER BY l DESC LIMIT 25;
SELECT sum("ResolutionWidth"), sum("ResolutionWidth" + 1), sum("ResolutionWidth" + 2), sum("ResolutionWidth" + 3), sum("ResolutionWidth" + 4), sum("ResolutionWidth" + 5), sum("ResolutionWidth" + 6), sum("ResolutionWidth" + 7), sum("ResolutionWidth" + 8), sum("ResolutionWidth" + 9), sum("ResolutionWidth" + 10), sum("ResolutionWidth" + 11), sum("ResolutionWidth" + 12), sum("ResolutionWidth" + 13), sum("ResolutionWidth" + 14), sum("ResolutionWidth" + 15), sum("ResolutionWidth" + 16), sum("ResolutionWidth" + 17), sum("ResolutionWidth" + 18), sum("ResolutionWidth" + 19), sum("ResolutionWidth" + 20), sum("ResolutionWidth" + 21), sum("ResolutionWidth" + 22), sum("ResolutionWidth" + 23), sum("ResolutionWidth" + 24), sum("ResolutionWidth" + 25), sum("ResolutionWidth" + 26), sum("ResolutionWidth" + 27), sum("ResolutionWidth" + 28), sum("ResolutionWidth" + 29), sum("ResolutionWidth" + 30), sum("ResolutionWidth" + 31), sum("ResolutionWidth" + 32), sum("ResolutionWidth" + 33), sum("ResolutionWidth" + 34), sum("ResolutionWidth" + 35), sum("ResolutionWidth" + 36), sum("ResolutionWidth" + 37), sum("ResolutionWidth" + 38), sum("ResolutionWidth" + 39), sum("ResolutionWidth" + 40), sum("ResolutionWidth" + 41), sum("ResolutionWidth" + 42), sum("ResolutionWidth" + 43), sum("ResolutionWidth" + 44), sum("ResolutionWidth" + 45), sum("ResolutionWidth" + 46), sum("ResolutionWidth" + 47), sum("ResolutionWidth" + 48), sum("ResolutionWidth" + 49), sum("ResolutionWidth" + 50), sum("ResolutionWidth" + 51), sum("ResolutionWidth" + 52), sum("ResolutionWidth" + 53), sum("ResolutionWidth" + 54), sum("ResolutionWidth" + 55), sum("ResolutionWidth" + 56), sum("ResolutionWidth" + 57), sum("ResolutionWidth" + 58), sum("ResolutionWidth" + 59), sum("ResolutionWidth" + 60), sum("ResolutionWidth" + 61), sum("ResolutionWidth" + 62), sum("ResolutionWidth" + 63), sum("ResolutionWidth" + 64), sum("ResolutionWidth" + 65), sum("ResolutionWidth" + 66), sum("ResolutionWidth" + 67), sum("ResolutionWidth" + 68), sum("ResolutionWidth" + 69), sum("ResolutionWidth" + 70), sum("ResolutionWidth" + 71), sum("ResolutionWidth" + 72), sum("ResolutionWidth" + 73), sum("ResolutionWidth" + 74), sum("ResolutionWidth" + 75), sum("ResolutionWidth" + 76), sum("ResolutionWidth" + 77), sum("ResolutionWidth" + 78), sum("ResolutionWidth" + 79), sum("ResolutionWidth" + 80), sum("ResolutionWidth" + 81), sum("ResolutionWidth" + 82), sum("ResolutionWidth" + 83), sum("ResolutionWidth" + 84), sum("ResolutionWidth" + 85), sum("ResolutionWidth" + 86), sum("ResolutionWidth" + 87), sum("ResolutionWidth" + 88), sum("ResolutionWidth" + 89) FROM {table};
SELECT "SearchEngineID", "ClientIP", count(*) AS c, sum("Refresh"), avg("ResolutionWidth") FROM {table} WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;
SELECT "WatchID", "ClientIP", count(*) AS c, sum("Refresh"), avg("ResolutionWidth") FROM {table} WHERE "SearchPhrase" <> '' GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;
SELECT "WatchID", "ClientIP", count(*) AS c, sum("Refresh"), avg("ResolutionWidth") FROM {table} GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;
SELECT "URL", count(*) AS c FROM {table} GROUP BY "URL" ORDER BY c DESC LIMIT 10;
SELECT 1, "URL", count(*) AS c FROM {table} GROUP BY 1, "URL" ORDER BY c DESC LIMIT 10;
SELECT "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3, count(*) AS c FROM {table} GROUP BY "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3 ORDER BY c DESC LIMIT 10;
SELECT "URL", count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "DontCountHits" = 0 AND "Refresh" = 0 AND "URL" <> '' GROUP BY "URL" ORDER BY "PageViews" DESC LIMIT 10;
SELECT "Title", count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "DontCountHits" = 0 AND "Refresh" = 0 AND "Title" <> '' GROUP BY "Title" ORDER BY "PageViews" DESC LIMIT 10;
SELECT "URL", count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 AND "IsLink" <> 0 AND "IsDownload" = 0 GROUP BY "URL" ORDER BY "PageViews" DESC LIMIT 1000;
SELECT "TraficSourceID", "SearchEngineID", "AdvEngineID", CASE WHEN ("SearchEngineID" = 0 AND "AdvEngineID" = 0) THEN "Referer" ELSE '' END AS Src, "URL" AS Dst, count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 GROUP BY "TraficSourceID", "SearchEngineID", "AdvEngineID", CASE WHEN ("SearchEngineID" = 0 AND "AdvEngineID" = 0) THEN "Referer" ELSE '' END, "URL" ORDER BY "PageViews" DESC LIMIT 1000;
SELECT "URLHash", "EventDate", count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 AND "TraficSourceID" IN (-1, 6) AND "RefererHash" = 686716256552154761 GROUP BY "URLHash", "EventDate" ORDER BY "PageViews" DESC LIMIT 100;
SELECT "WindowClientWidth", "WindowClientHeight", count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-31' AND "Refresh" = 0 AND "DontCountHits" = 0 AND "URLHash" = 686716256552154761 GROUP BY "WindowClientWidth", "WindowClientHeight" ORDER BY "PageViews" DESC LIMIT 10000;
SELECT DATE_TRUNC('minute', "EventTime") AS "Minute", count(*) AS "PageViews" FROM {table} WHERE "CounterID" = 62 AND "EventDate" >= '2013-07-01' AND "EventDate" <= '2013-07-02' AND "Refresh" = 0 AND "DontCountHits" = 0 GROUP BY DATE_TRUNC('minute', "EventTime") ORDER BY DATE_TRUNC('minute', "EventTime");

View File

@ -1,5 +1,4 @@
#!/usr/bin/env bash
#!/bin/expect
#!/usr/bin/expect
# Set timeout
set timeout 600
@ -7,14 +6,14 @@ set timeout 600
# Get arguments
set query [lindex $argv 0]
spawn mclient -u monetdb -d hits
spawn mclient -u monetdb -d test --timer=clock
expect "password:"
send "monetdb\r"
expect "sql>"
send "$query\r"
send "$query;\r"
expect "sql>"
send "\\q\r"
expect eof
expect eof

File diff suppressed because it is too large Load Diff

View File

@ -1,6 +1,10 @@
option(ENABLE_CASSANDRA "Enable Cassandra" ${ENABLE_LIBRARIES})
if (ENABLE_CASSANDRA)
if (APPLE)
SET(CMAKE_MACOSX_RPATH ON)
endif()
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libuv")
message (ERROR "submodule contrib/libuv is missing. to fix try run: \n git submodule update --init --recursive")
elseif (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/cassandra")

View File

@ -1,8 +0,0 @@
find_library (TERMCAP_LIBRARY tinfo)
if (NOT TERMCAP_LIBRARY)
find_library (TERMCAP_LIBRARY ncurses)
endif()
if (NOT TERMCAP_LIBRARY)
find_library (TERMCAP_LIBRARY termcap)
endif()
message (STATUS "Using termcap: ${TERMCAP_LIBRARY}")

View File

@ -18,7 +18,6 @@ file(GLOB AWS_CORE_SOURCES
"${AWS_CORE_LIBRARY_DIR}/source/client/*.cpp"
"${AWS_CORE_LIBRARY_DIR}/source/http/*.cpp"
"${AWS_CORE_LIBRARY_DIR}/source/http/standard/*.cpp"
"${AWS_CORE_LIBRARY_DIR}/source/http/curl/*.cpp"
"${AWS_CORE_LIBRARY_DIR}/source/config/*.cpp"
"${AWS_CORE_LIBRARY_DIR}/source/external/cjson/*.cpp"
"${AWS_CORE_LIBRARY_DIR}/source/external/tinyxml2/*.cpp"
@ -91,7 +90,6 @@ set_property(TARGET aws_s3_checksums PROPERTY C_STANDARD 99)
add_library(aws_s3 ${S3_UNIFIED_SRC})
target_compile_definitions(aws_s3 PUBLIC -DENABLE_CURL_CLIENT)
target_compile_definitions(aws_s3 PUBLIC "AWS_SDK_VERSION_MAJOR=1")
target_compile_definitions(aws_s3 PUBLIC "AWS_SDK_VERSION_MINOR=7")
target_compile_definitions(aws_s3 PUBLIC "AWS_SDK_VERSION_PATCH=231")
@ -102,4 +100,4 @@ if (OPENSSL_FOUND)
target_link_libraries(aws_s3 PRIVATE ${OPENSSL_LIBRARIES})
endif()
target_link_libraries(aws_s3 PRIVATE aws_s3_checksums curl)
target_link_libraries(aws_s3 PRIVATE aws_s3_checksums)

View File

@ -26,14 +26,7 @@ if (USE_INTERNAL_CCTZ)
# Build a libray with embedded tzdata
if (OS_LINUX AND ARCH_AMD64)
find_program (OBJCOPY_PATH NAMES "llvm-objcopy" "llvm-objcopy-10" "llvm-objcopy-9" "llvm-objcopy-8" "objcopy")
if (OBJCOPY_PATH)
message(STATUS "Using objcopy: ${OBJCOPY_PATH}.")
else ()
message(FATAL_ERROR "Cannot find objcopy.")
endif ()
if (OS_LINUX)
set (TIMEZONES
Africa/Abidjan
@ -609,7 +602,7 @@ if (USE_INTERNAL_CCTZ)
# https://stackoverflow.com/questions/14776463/compile-and-add-an-object-file-from-a-binary-with-cmake
add_custom_command(OUTPUT ${TZ_OBJ}
COMMAND cd ${TZDIR} && ${OBJCOPY_PATH} -I binary -O elf64-x86-64 -B i386 ${TIMEZONE} ${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ}
COMMAND cd ${TZDIR} && ${OBJCOPY_PATH} -I binary ${OBJCOPY_ARCH_OPTIONS} ${TIMEZONE} ${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ}
COMMAND ${OBJCOPY_PATH} --rename-section .data=.rodata,alloc,load,readonly,data,contents
${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ} ${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ})
@ -623,7 +616,7 @@ if (USE_INTERNAL_CCTZ)
# libraries in linker command. To avoid this we hardcode whole-archive
# library into single string.
add_dependencies(cctz tzdata)
target_link_libraries(cctz INTERFACE "-Wl,--whole-archive $<TARGET_FILE:tzdata> -Wl,--no-whole-archive")
target_link_libraries(cctz INTERFACE "-Wl,${WHOLE_ARCHIVE} $<TARGET_FILE:tzdata> -Wl,${NO_WHOLE_ARCHIVE}")
endif ()
else ()

View File

@ -122,11 +122,11 @@ initdb()
CLICKHOUSE_DATADIR_FROM_CONFIG=$CLICKHOUSE_DATADIR
fi
if ! getent group ${CLICKHOUSE_USER} >/dev/null; then
if ! getent passwd ${CLICKHOUSE_USER} >/dev/null; then
echo "Can't chown to non-existing user ${CLICKHOUSE_USER}"
return
fi
if ! getent passwd ${CLICKHOUSE_GROUP} >/dev/null; then
if ! getent group ${CLICKHOUSE_GROUP} >/dev/null; then
echo "Can't chown to non-existing group ${CLICKHOUSE_GROUP}"
return
fi

View File

@ -92,6 +92,10 @@
"name": "yandex/clickhouse-fasttest",
"dependent": []
},
"docker/test/style": {
"name": "yandex/clickhouse-style-test",
"dependent": []
},
"docker/test/integration/s3_proxy": {
"name": "yandex/clickhouse-s3-proxy",
"dependent": []
@ -103,5 +107,25 @@
"docker/test/integration/helper_container": {
"name": "yandex/clickhouse-integration-helper",
"dependent": []
},
"docker/test/integration/mysql_golang_client": {
"name": "yandex/clickhouse-mysql-golang-client",
"dependent": []
},
"docker/test/integration/mysql_java_client": {
"name": "yandex/clickhouse-mysql-java-client",
"dependent": []
},
"docker/test/integration/mysql_js_client": {
"name": "yandex/clickhouse-mysql-js-client",
"dependent": []
},
"docker/test/integration/mysql_php_client": {
"name": "yandex/clickhouse-mysql-php-client",
"dependent": []
},
"docker/test/integration/postgresql_java_client": {
"name": "yandex/clickhouse-postgresql-java-client",
"dependent": []
}
}

View File

@ -2,6 +2,9 @@
set -x -e
# Update tzdata to the latest version. It is embedded into clickhouse binary.
sudo apt-get update && sudo apt-get install tzdata
mkdir -p build/cmake/toolchain/darwin-x86_64
tar xJf MacOSX10.14.sdk.tar.xz -C build/cmake/toolchain/darwin-x86_64 --strip-components=1

View File

@ -2,6 +2,9 @@
set -x -e
# Update tzdata to the latest version. It is embedded into clickhouse binary.
sudo apt-get update && sudo apt-get install tzdata
ccache --show-stats ||:
ccache --zero-stats ||:
build/release --no-pbuilder $ALIEN_PKGS | ts '%Y-%m-%d %H:%M:%S'

View File

@ -158,6 +158,8 @@ TESTS_TO_SKIP=(
01280_ssd_complex_key_dictionary
00652_replicated_mutations_zookeeper
01411_bayesian_ab_testing
01238_http_memory_tracking # max_memory_usage_for_user can interfere another queries running concurrently
01281_group_by_limit_memory_tracking # max_memory_usage_for_user can interfere another queries running concurrently
)
clickhouse-test -j 4 --no-long --testname --shard --zookeeper --skip ${TESTS_TO_SKIP[*]} 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/test_log.txt

View File

@ -1,3 +1,6 @@
# docker build -t yandex/clickhouse-mysql-golang-client .
# MySQL golang client docker container
FROM golang:1.12.2
RUN go get "github.com/go-sql-driver/mysql"

View File

@ -1,3 +1,6 @@
# docker build -t yandex/clickhouse-mysql-java-client .
# MySQL Java client docker container
FROM ubuntu:18.04
RUN apt-get update && \

View File

@ -0,0 +1,8 @@
# docker build -t yandex/clickhouse-mysql-js-client .
# MySQL JavaScript client docker container
FROM node:8
RUN npm install mysql
COPY ./test.js test.js

View File

@ -1,3 +1,6 @@
# docker build -t yandex/clickhouse-mysql-php-client .
# MySQL PHP client docker container
FROM php:7.3-cli
COPY ./client.crt client.crt

View File

@ -1,3 +1,6 @@
# docker build -t yandex/clickhouse-postgresql-java-client .
# PostgreSQL Java client docker container
FROM ubuntu:18.04
RUN apt-get update && \

View File

@ -1,8 +1,6 @@
version: '2.3'
services:
golang1:
build:
context: ./
network: host
image: yandex/clickhouse-mysql-golang-client:${DOCKER_MYSQL_GOLANG_CLIENT_TAG}
# to keep container running
command: sleep infinity

View File

@ -1,8 +1,6 @@
version: '2.3'
services:
java1:
build:
context: ./
network: host
image: yandex/clickhouse-mysql-java-client:${DOCKER_MYSQL_JAVA_CLIENT_TAG}
# to keep container running
command: sleep infinity

View File

@ -1,8 +1,6 @@
version: '2.3'
services:
mysqljs1:
build:
context: ./
network: host
image: yandex/clickhouse-mysql-js-client:${DOCKER_MYSQL_JS_CLIENT_TAG}
# to keep container running
command: sleep infinity

View File

@ -1,7 +1,6 @@
version: '2.3'
services:
php1:
build:
context: ./
image: yandex/clickhouse-mysql-php-client:${DOCKER_MYSQL_PHP_CLIENT_TAG}
# to keep container running
command: sleep infinity

View File

@ -1,8 +1,6 @@
version: '2.2'
services:
java:
build:
context: ./
network: host
image: yandex/clickhouse-postgresql-java-client:${DOCKER_POSTGRESQL_JAVA_CLIENT_TAG}
# to keep container running
command: sleep infinity

View File

@ -22,5 +22,11 @@ export CLICKHOUSE_TESTS_CLIENT_BIN_PATH=/clickhouse
export CLICKHOUSE_TESTS_BASE_CONFIG_DIR=/clickhouse-config
export CLICKHOUSE_ODBC_BRIDGE_BINARY_PATH=/clickhouse-odbc-bridge
export DOCKER_MYSQL_GOLANG_CLIENT_TAG=${DOCKER_MYSQL_GOLANG_CLIENT_TAG:=latest}
export DOCKER_MYSQL_JAVA_CLIENT_TAG=${DOCKER_MYSQL_JAVA_CLIENT_TAG:=latest}
export DOCKER_MYSQL_JS_CLIENT_TAG=${DOCKER_MYSQL_JS_CLIENT_TAG:=latest}
export DOCKER_MYSQL_PHP_CLIENT_TAG=${DOCKER_MYSQL_PHP_CLIENT_TAG:=latest}
export DOCKER_POSTGRESQL_JAVA_CLIENT_TAG=${DOCKER_POSTGRESQL_JAVA_CLIENT_TAG:=latest}
cd /ClickHouse/tests/integration
exec "$@"

View File

@ -318,10 +318,10 @@ create view right_query_log as select *
create view query_logs as
select 0 version, query_id, ProfileEvents.Names, ProfileEvents.Values,
query_duration_ms from left_query_log
query_duration_ms, memory_usage from left_query_log
union all
select 1 version, query_id, ProfileEvents.Names, ProfileEvents.Values,
query_duration_ms from right_query_log
query_duration_ms, memory_usage from right_query_log
;
-- This is a single source of truth on all metrics we have for query runs. The
@ -345,10 +345,11 @@ create table query_run_metric_arrays engine File(TSV, 'analyze/query-run-metric-
arrayMap(x->toFloat64(x), ProfileEvents.Values))]
),
arrayReduce('sumMapState', [(
['client_time', 'server_time'],
['client_time', 'server_time', 'memory_usage'],
arrayMap(x->if(x != 0., x, -0.), [
toFloat64(query_runs.time),
toFloat64(query_duration_ms / 1000.)]))])
toFloat64(query_duration_ms / 1000.),
toFloat64(memory_usage)]))])
]
)) as metrics_tuple).1 metric_names,
metrics_tuple.2 metric_values

View File

@ -33,13 +33,16 @@ function download
fi
done
# Might have the same version on left and right (for testing).
# Might have the same version on left and right (for testing) -- in this case we just copy
# already downloaded 'right' to the 'left. There is the third case when we don't have to
# download anything, for example in some manual runs. In this case, SHAs are not set.
if ! [ "$left_sha" = "$right_sha" ]
then
wget -nv -nd -c "$left_path" -O- | tar -C left --strip-components=1 -zxv &
else
elif [ "$right_sha" != "" ]
then
mkdir left ||:
cp -a right/* left &
cp -an right/* left &
fi
for dataset_name in $datasets

View File

@ -15,24 +15,24 @@ function download
mkdir left right db0 ||:
"$script_dir/download.sh" ||: &
cp -vP "$repo_dir"/../build-gcc9-rel/programs/clickhouse* right &
cp -vP "$repo_dir"/../build-clang10-rel/programs/clickhouse* left &
cp -nvP "$repo_dir"/../build-gcc9-rel/programs/clickhouse* left &
cp -nvP "$repo_dir"/../build-clang10-rel/programs/clickhouse* right &
wait
}
function configure
{
# Test files
cp -av "$repo_dir/tests/performance" right
cp -av "$repo_dir/tests/performance" left
cp -nav "$repo_dir/tests/performance" right
cp -nav "$repo_dir/tests/performance" left
# Configs
cp -av "$script_dir/config" right
cp -av "$script_dir/config" left
cp -av "$repo_dir"/programs/server/config* right/config
cp -av "$repo_dir"/programs/server/user* right/config
cp -av "$repo_dir"/programs/server/config* left/config
cp -av "$repo_dir"/programs/server/user* left/config
cp -nav "$script_dir/config" right
cp -nav "$script_dir/config" left
cp -nav "$repo_dir"/programs/server/config* right/config
cp -nav "$repo_dir"/programs/server/user* right/config
cp -nav "$repo_dir"/programs/server/config* left/config
cp -nav "$repo_dir"/programs/server/user* left/config
tree left
}

View File

@ -222,17 +222,22 @@ for query_index, q in enumerate(test_queries):
query_error_on_connection[conn_index] = traceback.format_exc();
continue
# If prewarm fails for the query on both servers -- report the error, skip
# the query and continue testing the next query.
if query_error_on_connection.count(None) == 0:
print(query_error_on_connection[0], file = sys.stderr)
continue
# Report all errors that ocurred during prewarm and decide what to do next.
# If prewarm fails for the query on all servers -- skip the query and
# continue testing the next query.
# If prewarm fails on one of the servers, run the query on the rest of them.
# Useful for queries that use new functions added in the new server version.
if query_error_on_connection.count(None) < len(query_error_on_connection):
no_error = [i for i, e in enumerate(query_error_on_connection) if not e]
print(f'partial\t{query_index}\t{no_error}')
no_errors = []
for i, e in enumerate(query_error_on_connection):
if e:
print(e, file = sys.stderr)
else:
no_errors.append(i)
if len(no_errors) == 0:
continue
elif len(no_errors) < len(connections):
print(f'partial\t{query_index}\t{no_errors}')
# Now, perform measured runs.
# Track the time spent by the client to process this query, so that we can
@ -245,7 +250,15 @@ for query_index, q in enumerate(test_queries):
for conn_index, c in enumerate(connections):
if query_error_on_connection[conn_index]:
continue
res = c.execute(q, query_id = run_id)
try:
res = c.execute(q, query_id = run_id)
except Exception as e:
# Add query id to the exception to make debugging easier.
e.args = (run_id, *e.args)
e.message = run_id + ': ' + e.message
raise
print(f'query\t{query_index}\t{run_id}\t{conn_index}\t{c.last_query.elapsed}')
server_seconds += c.last_query.elapsed

View File

@ -353,7 +353,7 @@ if args.report == 'main':
attrs[1] = ''
if float(row[0]) > allowed_single_run_time:
attrs[0] = f'style="background: {color_bad}"'
errors_explained.append([f'<a href="#{anchor}">The query no. {row[3]} of test \'{row[2]}\' is taking too long to run. Keep the run time below {allowed_single_run} seconds"</a>'])
errors_explained.append([f'<a href="#{anchor}">The query no. {row[3]} of test \'{row[2]}\' is taking too long to run. Keep the run time below {allowed_single_run_time} seconds"</a>'])
slow_average_tests += 1
else:
attrs[0] = ''

View File

@ -24,20 +24,26 @@ def run_perf_test(cmd, xmls_path, output_folder):
return p
def get_options(i):
options = ""
if 0 < i:
options += " --order=random"
if i == 1:
options += " --atomic-db-engine"
return options
def run_func_test(cmd, output_prefix, num_processes, skip_tests_option):
skip_list_opt = get_skip_list_cmd(cmd)
output_paths = [os.path.join(output_prefix, "stress_test_run_{}.txt".format(i)) for i in range(num_processes)]
f = open(output_paths[0], 'w')
main_command = "{} {} {}".format(cmd, skip_list_opt, skip_tests_option)
logging.info("Run func tests main cmd '%s'", main_command)
pipes = [Popen(main_command, shell=True, stdout=f, stderr=f)]
for output_path in output_paths[1:]:
time.sleep(0.5)
f = open(output_path, 'w')
full_command = "{} {} --order=random {}".format(cmd, skip_list_opt, skip_tests_option)
pipes = []
for i in range(0, len(output_paths)):
f = open(output_paths[i], 'w')
full_command = "{} {} {} {}".format(cmd, skip_list_opt, get_options(i), skip_tests_option)
logging.info("Run func tests '%s'", full_command)
p = Popen(full_command, shell=True, stdout=f, stderr=f)
pipes.append(p)
time.sleep(0.5)
return pipes

View File

@ -0,0 +1,8 @@
# docker build -t yandex/clickhouse-style-test .
FROM ubuntu:20.04
RUN apt-get update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes shellcheck libxml2-utils git
CMD cd /ClickHouse/utils/check-style && ./check-style -n | tee /test_output/style_output.txt && \
./check-duplicate-includes.sh | tee /test_output/duplicate_output.txt

View File

@ -464,7 +464,7 @@ If the query contains GROUP BY, rows\_before\_limit\_at\_least is the exact numb
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
ClickHouse supports [NULL](../sql-reference/syntax.md), which is displayed as `null` in the JSON output.
ClickHouse supports [NULL](../sql-reference/syntax.md), which is displayed as `null` in the JSON output. To enable `+nan`, `-nan`, `+inf`, `-inf` values in output, set the [output\_format\_json\_quote\_denormals](../operations/settings/settings.md#settings-output_format_json_quote_denormals) to 1.
See also the [JSONEachRow](#jsoneachrow) format.

View File

@ -65,6 +65,7 @@ toc_title: Adopters
| <a href="https://rambler.ru" class="favicon">Rambler</a> | Internet services | Analytics | — | — | [Talk in Russian, April 2018](https://medium.com/@ramblertop/разработка-api-clickhouse-для-рамблер-топ-100-f4c7e56f3141) |
| <a href="https://www.s7.ru" class="favicon">S7 Airlines</a> | Airlines | Metrics, Logging | — | — | [Talk in Russian, March 2019](https://www.youtube.com/watch?v=nwG68klRpPg&t=15s) |
| <a href="https://www.scireum.de/" class="favicon">scireum GmbH</a> | e-Commerce | Main product | — | — | [Talk in German, February 2020](https://www.youtube.com/watch?v=7QWAn5RbyR4) |
| <a href="https://segment.com/" class="favicon">Segment</a> | Data processing | Main product | 9 * i3en.3xlarge nodes 7.5TB NVME SSDs, 96GB Memory, 12 vCPUs | — | [Slides, 2019](https://slides.com/abraithwaite/segment-clickhouse) |
| <a href="https://www.semrush.com/" class="favicon">SEMrush</a> | Marketing | Main product | — | — | [Slides in Russian, August 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup17/5_semrush.pdf) |
| <a href="https://sentry.io/" class="favicon">Sentry</a> | Software Development | Main product | — | — | [Blog Post in English, May 2019](https://blog.sentry.io/2019/05/16/introducing-snuba-sentrys-new-search-infrastructure) |
| <a href="https://seo.do/" class="favicon">seo.do</a> | Analytics | Main product | — | — | [Slides in English, November 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup35/CH%20Presentation-%20Metehan%20Çetinkaya.pdf) |

View File

@ -399,7 +399,7 @@ The cache is shared for the server and memory is allocated as needed. The cache
```
## max\_server\_memory\_usage {#max_server_memory_usage}
Limits total RAM usage by the ClickHouse server. You can specify it only for the default profile.
Limits total RAM usage by the ClickHouse server.
Possible values:

View File

@ -654,7 +654,7 @@ log_query_threads=1
## max\_insert\_block\_size {#settings-max_insert_block_size}
The size of blocks to form for insertion into a table.
The size of blocks (in a count of rows) to form for insertion into a table.
This setting only applies in cases when the server forms the blocks.
For example, for an INSERT via the HTTP interface, the server parses the data format and forms blocks of the specified size.
But when using clickhouse-client, the client parses the data itself, and the max\_insert\_block\_size setting on the server doesnt affect the size of the inserted blocks.
@ -998,6 +998,105 @@ The results of the compilation are saved in the build directory in the form of .
If the value is true, integers appear in quotes when using JSON\* Int64 and UInt64 formats (for compatibility with most JavaScript implementations); otherwise, integers are output without the quotes.
## output\_format\_json\_quote\_denormals {#settings-output_format_json_quote_denormals}
Enables `+nan`, `-nan`, `+inf`, `-inf` outputs in [JSON](../../interfaces/formats.md#json) output format.
Possible values:
- 0 — Disabled.
- 1 — Enabled.
Default value: 0.
**Example**
Consider the following table `account_orders`:
```text
┌─id─┬─name───┬─duration─┬─period─┬─area─┐
│ 1 │ Andrew │ 20 │ 0 │ 400 │
│ 2 │ John │ 40 │ 0 │ 0 │
│ 3 │ Bob │ 15 │ 0 │ -100 │
└────┴────────┴──────────┴────────┴──────┘
```
When `output_format_json_quote_denormals = 0`, the query returns `null` values in output:
```sql
SELECT area/period FROM account_orders FORMAT JSON;
```
```json
{
"meta":
[
{
"name": "divide(area, period)",
"type": "Float64"
}
],
"data":
[
{
"divide(area, period)": null
},
{
"divide(area, period)": null
},
{
"divide(area, period)": null
}
],
"rows": 3,
"statistics":
{
"elapsed": 0.003648093,
"rows_read": 3,
"bytes_read": 24
}
}
```
When `output_format_json_quote_denormals = 1`, the query returns:
```json
{
"meta":
[
{
"name": "divide(area, period)",
"type": "Float64"
}
],
"data":
[
{
"divide(area, period)": "inf"
},
{
"divide(area, period)": "-nan"
},
{
"divide(area, period)": "-inf"
}
],
"rows": 3,
"statistics":
{
"elapsed": 0.000070241,
"rows_read": 3,
"bytes_read": 24
}
}
```
## format\_csv\_delimiter {#settings-format_csv_delimiter}
The character interpreted as a delimiter in the CSV data. By default, the delimiter is `,`.
@ -1722,6 +1821,20 @@ SELECT * FROM a;
| 1 |
+---+
```
## optimize_read_in_order {#optimize_read_in_order}
Enables [ORDER BY](../../sql-reference/statements/select/order-by.md#optimize_read_in_order) optimization in [SELECT](../../sql-reference/statements/select/index.md) queries for reading data from [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) tables.
Possible values:
- 0 — `ORDER BY` optimization is disabled.
- 1 — `ORDER BY` optimization is enabled.
Default value: `1`.
**See Also**
- [ORDER BY Clause](../../sql-reference/statements/select/order-by.md#optimize_read_in_order)
## mutations_sync {#mutations_sync}

View File

@ -88,8 +88,8 @@ SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
## regexpQuoteMeta(s) {#regexpquotemetas}
The function adds a backslash before some predefined characters in the string.
Predefined characters: 0, \\, \|, (, ), ^, $, ., \[, \], ?, \*,+,{,:,-.
This implementation slightly differs from re2::RE2::QuoteMeta. It escapes zero byte as \\0 instead of 00 and it escapes only required characters.
Predefined characters: `\0`, `\\`, `|`, `(`, `)`, `^`, `$`, `.`, `[`, `]`, `?`, `*`, `+`, `{`, `:`, `-`.
This implementation slightly differs from re2::RE2::QuoteMeta. It escapes zero byte as `\0` instead of `\x00` and it escapes only required characters.
For more information, see the link: [RE2](https://github.com/google/re2/blob/master/re2/re2.cc#L473)
[Original article](https://clickhouse.tech/docs/en/query_language/functions/string_replace_functions/) <!--hide-->

View File

@ -70,6 +70,25 @@ Running a query may use more memory than `max_bytes_before_external_sort`. For t
External sorting works much less effectively than sorting in RAM.
## Optimization of Data Reading {#optimize_read_in_order}
If `ORDER BY` expression has a prefix that coincides with the table sorting key, you can optimize the query by using the [optimize_read_in_order](../../../operations/settings/settings.md#optimize_read_in_order) setting.
When the `optimize_read_in_order` setting is enabled, the Clickhouse server uses the table index and reads the data in order of the `ORDER BY` key. This allows to avoid reading all data in case of specified [LIMIT](../../../sql-reference/statements/select/limit.md). So queries on big data with small limit are processed faster.
Optimization works with both `ASC` and `DESC` and doesn't work together with [GROUP BY](../../../sql-reference/statements/select/group-by.md) clause and [FINAL](../../../sql-reference/statements/select/from.md#select-from-final) modifier.
When the `optimize_read_in_order` setting is disabled, the Clickhouse server does not use the table index while processing `SELECT` queries.
Consider disabling `optimize_read_in_order` manually, when running queries that have `ORDER BY` clause, large `LIMIT` and [WHERE](../../../sql-reference/statements/select/where.md) condition that requires to read huge amount of records before queried data is found.
Optimization is supported in the following table engines:
- [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md)
- [Merge](../../../engines/table-engines/special/merge.md), [Buffer](../../../engines/table-engines/special/buffer.md), and [MaterializedView](../../../engines/table-engines/special/materializedview.md) table engines over `MergeTree`-engine tables
In `MaterializedView`-engine tables the optimization works with views like `SELECT ... FROM merge_tree_table ORDER BY pk`. But it is not supported in the queries like `SELECT ... FROM view ORDER BY pk` if the view query doesn't have the `ORDER BY` clause.
## ORDER BY Expr WITH FILL Modifier {#orderby-with-fill}
This modifier also can be combined with [LIMIT … WITH TIES modifier](../../../sql-reference/statements/select/limit.md#limit-with-ties).

View File

@ -1,3 +1,5 @@
# Инструкция для разработчиков
Сборка ClickHouse поддерживается на Linux, FreeBSD, Mac OS X.
# Если вы используете Windows {#esli-vy-ispolzuete-windows}

View File

@ -7,7 +7,7 @@ ClickHouse может работать на любой операционной
Предварительно собранные пакеты компилируются для x86\_64 и используют набор инструкций SSE 4.2, поэтому, если не указано иное, его поддержка в используемом процессоре, становится дополнительным требованием к системе. Вот команда, чтобы проверить, поддерживает ли текущий процессор SSE 4.2:
``` bash
$ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
```
Чтобы запустить ClickHouse на процессорах, которые не поддерживают SSE 4.2, либо имеют архитектуру AArch64 или PowerPC64LE, необходимо самостоятельно [собрать ClickHouse из исходного кода](#from-sources) с соответствующими настройками конфигурации.
@ -84,7 +84,7 @@ sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh
### Из исполняемых файлов для нестандартных окружений {#from-binaries-non-linux}
Для других операционных систем и арихитектуры AArch64, сборки ClickHouse предоставляются в виде кросс-компилированного бинарника с последнего коммита ветки master (с задержкой в несколько часов).
Для других операционных систем и архитектуры AArch64, сборки ClickHouse предоставляются в виде кросс-компилированного бинарника с последнего коммита ветки master (с задержкой в несколько часов).
- [macOS](https://builds.clickhouse.tech/master/macos/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/macos/clickhouse' && chmod a+x ./clickhouse`
- [AArch64](https://builds.clickhouse.tech/master/aarch64/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/aarch64/clickhouse' && chmod a+x ./clickhouse`
@ -115,7 +115,7 @@ sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh
Для запуска сервера в качестве демона, выполните:
``` bash
$ sudo service clickhouse-server start
sudo service clickhouse-server start
```
Смотрите логи в директории `/var/log/clickhouse-server/`.
@ -125,7 +125,7 @@ $ sudo service clickhouse-server start
Также можно запустить сервер вручную из консоли:
``` bash
$ clickhouse-server --config-file=/etc/clickhouse-server/config.xml
clickhouse-server --config-file=/etc/clickhouse-server/config.xml
```
При этом, лог будет выводиться в консоль, что удобно для разработки.
@ -134,7 +134,7 @@ $ clickhouse-server --config-file=/etc/clickhouse-server/config.xml
После запуска сервера, соединиться с ним можно с помощью клиента командной строки:
``` bash
$ clickhouse-client
clickhouse-client
```
По умолчанию он соединяется с localhost:9000, от имени пользователя `default` без пароля. Также клиент может быть использован для соединения с удалённым сервером с помощью аргумента `--host`.

View File

@ -432,7 +432,7 @@ JSON совместим с JavaScript. Для этого, дополнитель
Этот формат подходит только для вывода результата выполнения запроса, но не для парсинга (приёма данных для вставки в таблицу).
ClickHouse поддерживает [NULL](../sql-reference/syntax.md), который при выводе JSON будет отображен как `null`.
ClickHouse поддерживает [NULL](../sql-reference/syntax.md), который при выводе JSON будет отображен как `null`. Чтобы включить отображение в результате значений `+nan`, `-nan`, `+inf`, `-inf`, установите параметр [output\_format\_json\_quote\_denormals](../operations/settings/settings.md#settings-output_format_json_quote_denormals) равным 1.
Смотрите также формат [JSONEachRow](#jsoneachrow) .

View File

@ -374,7 +374,7 @@ ClickHouse проверит условия `min_part_size` и `min_part_size_rat
## max_server_memory_usage {#max_server_memory_usage}
Ограничивает объём оперативной памяти, используемой сервером ClickHouse. Настройка может быть задана только для профиля `default`.
Ограничивает объём оперативной памяти, используемой сервером ClickHouse.
Возможные значения:

View File

@ -571,7 +571,7 @@ log_query_threads=1
## max\_insert\_block\_size {#settings-max_insert_block_size}
Формировать блоки указанного размера, при вставке в таблицу.
Формировать блоки указанного размера (в количестве строк), при вставке в таблицу.
Эта настройка действует только в тех случаях, когда сервер сам формирует такие блоки.
Например, при INSERT-е через HTTP интерфейс, сервер парсит формат данных, и формирует блоки указанного размера.
А при использовании clickhouse-client, клиент сам парсит данные, и настройка max\_insert\_block\_size на сервере не влияет на размер вставляемых блоков.
@ -908,6 +908,105 @@ load_balancing = first_or_random
Если значение истинно, то при использовании JSON\* форматов UInt64 и Int64 числа выводятся в кавычках (из соображений совместимости с большинством реализаций JavaScript), иначе - без кавычек.
## output\_format\_json\_quote\_denormals {#settings-output_format_json_quote_denormals}
При выводе данных в формате [JSON](../../interfaces/formats.md#json) включает отображение значений `+nan`, `-nan`, `+inf`, `-inf`.
Возможные значения:
- 0 — выключена.
- 1 — включена.
Значение по умолчанию: 0.
**Пример**
Рассмотрим следующую таблицу `account_orders`:
```text
┌─id─┬─name───┬─duration─┬─period─┬─area─┐
│ 1 │ Andrew │ 20 │ 0 │ 400 │
│ 2 │ John │ 40 │ 0 │ 0 │
│ 3 │ Bob │ 15 │ 0 │ -100 │
└────┴────────┴──────────┴────────┴──────┘
```
Когда `output_format_json_quote_denormals = 0`, следующий запрос возвращает значения `null`.
```sql
SELECT area/period FROM account_orders FORMAT JSON;
```
```json
{
"meta":
[
{
"name": "divide(area, period)",
"type": "Float64"
}
],
"data":
[
{
"divide(area, period)": null
},
{
"divide(area, period)": null
},
{
"divide(area, period)": null
}
],
"rows": 3,
"statistics":
{
"elapsed": 0.003648093,
"rows_read": 3,
"bytes_read": 24
}
}
```
Если `output_format_json_quote_denormals = 1`, то запрос вернет:
```json
{
"meta":
[
{
"name": "divide(area, period)",
"type": "Float64"
}
],
"data":
[
{
"divide(area, period)": "inf"
},
{
"divide(area, period)": "-nan"
},
{
"divide(area, period)": "-inf"
}
],
"rows": 3,
"statistics":
{
"elapsed": 0.000070241,
"rows_read": 3,
"bytes_read": 24
}
}
```
## format\_csv\_delimiter {#settings-format_csv_delimiter}
Символ, интерпретируемый как разделитель в данных формата CSV. По умолчанию — `,`.
@ -1480,6 +1579,21 @@ SELECT idx, i FROM null_in WHERE i IN (1, NULL) SETTINGS transform_null_in = 1;
- [min_insert_block_size_bytes](#min-insert-block-size-bytes)
## optimize_read_in_order {#optimize_read_in_order}
Включает или отключает оптимизацию в запросах [SELECT](../../sql-reference/statements/select/index.md) с секцией [ORDER BY](../../sql-reference/statements/select/order-by.md#optimize_read_in_order) при работе с таблицами семейства [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md).
Возможные значения:
- 0 — оптимизация отключена.
- 1 — оптимизация включена.
Значение по умолчанию: `1`.
**См. также**
- [Оптимизация чтения данных](../../sql-reference/statements/select/order-by.md#optimize_read_in_order) в секции `ORDER BY`
## mutations_sync {#mutations_sync}
Позволяет выполнять запросы `ALTER TABLE ... UPDATE|DELETE` ([мутации](../../sql-reference/statements/alter.md#mutations)) синхронно.
@ -1497,4 +1611,5 @@ SELECT idx, i FROM null_in WHERE i IN (1, NULL) SETTINGS transform_null_in = 1;
- [Синхронность запросов ALTER](../../sql-reference/statements/alter.md#synchronicity-of-alter-queries)
- [Мутации](../../sql-reference/statements/alter.md#mutations)
[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/settings/settings/) <!--hide-->

View File

@ -2,7 +2,7 @@
Промежуточное состояние агрегатной функции. Чтобы его получить, используются агрегатные функции с суффиксом `-State`. Чтобы в дальнейшем получить агрегированные данные необходимо использовать те же агрегатные функции с суффиксом `-Merge`.
`AggregateFunction(name, types\_of\_arguments…)` — параметрический тип данных.
`AggregateFunction(name, types_of_arguments…)` — параметрический тип данных.
**Параметры**
@ -58,6 +58,6 @@ SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP
## Пример использования {#primer-ispolzovaniia}
Смотрите в описании движка [AggregatingMergeTree](../../sql-reference/data-types/aggregatefunction.md).
Смотрите в описании движка [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md).
[Оригинальная статья](https://clickhouse.tech/docs/ru/data_types/nested_data_structures/aggregatefunction/) <!--hide-->

View File

@ -493,7 +493,7 @@ ALTER TABLE [db.]table MATERIALIZE INDEX name IN PARTITION partition_name
Мутации линейно упорядочены между собой и накладываются на каждый кусок в порядке добавления. Мутации также упорядочены со вставками - гарантируется, что данные, вставленные в таблицу до начала выполнения запроса мутации, будут изменены, а данные, вставленные после окончания запроса мутации, изменены не будут. При этом мутации никак не блокируют вставки.
Запрос завершается немедленно после добавления информации о мутации (для реплицированных таблиц - в ZooKeeper, для нереплицированных - на файловую систему). Сама мутация выполняется асинхронно, используя настройки системного профиля. Следить за ходом её выполнения можно по таблице [`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations). Добавленные мутации будут выполняться до конца даже в случае перезапуска серверов ClickHouse. Откатить мутацию после её добавления нельзя, но если мутация по какой-то причине не может выполниться до конца, её можно остановить с помощью запроса [`KILL MUTATION`](misc.md#kill-mutation-statement).
Запрос завершается немедленно после добавления информации о мутации (для реплицированных таблиц - в ZooKeeper, для нереплицированных - на файловую систему). Сама мутация выполняется асинхронно, используя настройки системного профиля. Следить за ходом её выполнения можно по таблице [`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations). Добавленные мутации будут выполняться до конца даже в случае перезапуска серверов ClickHouse. Откатить мутацию после её добавления нельзя, но если мутация по какой-то причине не может выполниться до конца, её можно остановить с помощью запроса [`KILL MUTATION`](../../sql-reference/statements/kill.md#kill-mutation).
Записи о последних выполненных мутациях удаляются не сразу (количество сохраняемых мутаций определяется параметром движка таблиц `finished_mutations_to_keep`). Более старые записи удаляются.

View File

@ -0,0 +1,23 @@
---
toc_priority: 42
toc_title: ATTACH
---
# ATTACH Statement {#attach}
Запрос полностью аналогичен запросу `CREATE`, но:
- вместо слова `CREATE` используется слово `ATTACH`;
- запрос не создаёт данные на диске, а предполагает, что данные уже лежат в соответствующих местах, и всего лишь добавляет информацию о таблице на сервер. После выполнения запроса `ATTACH` сервер будет знать о существовании таблицы.
Если таблица перед этим была отсоединена (`DETACH`), т.е. её структура известна, можно использовать сокращенную форму записи без определения структуры.
``` sql
ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
```
Этот запрос используется при старте сервера. Сервер хранит метаданные таблиц в виде файлов с запросами `ATTACH`, которые он просто исполняет при запуске (за исключением системных таблиц, которые явно создаются на сервере).
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/attach/) <!--hide-->

View File

@ -0,0 +1,44 @@
---
toc_priority: 43
toc_title: CHECK
---
# CHECK TABLE Statement {#check-table}
Проверяет таблицу на повреждение данных.
``` sql
CHECK TABLE [db.]name
```
Запрос `CHECK TABLE` сравнивает текущие размеры файлов (в которых хранятся данные из колонок) с ожидаемыми значениями. Если значения не совпадают, данные в таблице считаются поврежденными. Искажение возможно, например, из-за сбоя при записи данных.
Ответ содержит колонку `result`, содержащую одну строку с типом [Boolean](../../sql-reference/data-types/boolean.md). Допустимые значения:
- 0 - данные в таблице повреждены;
- 1 - данные не повреждены.
Запрос `CHECK TABLE` поддерживает следующие движки таблиц:
- [Log](../../engines/table-engines/log-family/log.md)
- [TinyLog](../../engines/table-engines/log-family/tinylog.md)
- [StripeLog](../../engines/table-engines/log-family/stripelog.md)
- [Семейство MergeTree](../../engines/table-engines/mergetree-family/index.md)
При попытке выполнить запрос с таблицами с другими табличными движками, ClickHouse генерирует исключение.
В движках `*Log` не предусмотрено автоматическое восстановление данных после сбоя. Используйте запрос `CHECK TABLE`, чтобы своевременно выявлять повреждение данных.
Для движков из семейства `MergeTree` запрос `CHECK TABLE` показывает статус проверки для каждого отдельного куска данных таблицы на локальном сервере.
**Что делать, если данные повреждены**
В этом случае можно скопировать оставшиеся неповрежденные данные в другую таблицу. Для этого:
1. Создайте новую таблицу с такой же структурой, как у поврежденной таблицы. Для этого выполните запрос `CREATE TABLE <new_table_name> AS <damaged_table_name>`.
2. Установите значение параметра [max\_threads](../../operations/settings/settings.md#settings-max_threads) в 1. Это нужно для того, чтобы выполнить следующий запрос в одном потоке. Установить значение параметра можно через запрос: `SET max_threads = 1`.
3. Выполните запрос `INSERT INTO <new_table_name> SELECT * FROM <damaged_table_name>`. В результате неповрежденные данные будут скопированы в другую таблицу. Обратите внимание, будут скопированы только те данные, которые следуют до поврежденного участка.
4. Перезапустите `clickhouse-client`, чтобы вернуть предыдущее значение параметра `max_threads`.
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/check-table/) <!--hide-->

View File

@ -0,0 +1,24 @@
---
toc_priority: 44
toc_title: DESCRIBE
---
# DESCRIBE TABLE Statement {#misc-describe-table}
``` sql
DESC|DESCRIBE TABLE [db.]table [INTO OUTFILE filename] [FORMAT format]
```
Возвращает описание столбцов таблицы.
Результат запроса содержит столбцы (все столбцы имеют тип String):
- `name` — имя столбца таблицы;
- `type`— тип столбца;
- `default_type` — в каком виде задано [выражение для значения по умолчанию](../../sql-reference/statements/create/table.md#create-default-values): `DEFAULT`, `MATERIALIZED` или `ALIAS`. Столбец содержит пустую строку, если значение по умолчанию не задано.
- `default_expression` — значение, заданное в секции `DEFAULT`;
- `comment_expression` — комментарий к столбцу.
Вложенные структуры данных выводятся в «развёрнутом» виде. То есть, каждый столбец - по отдельности, с именем через точку.
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/describe-table/) <!--hide-->

View File

@ -0,0 +1,19 @@
---
toc_priority: 45
toc_title: DETACH
---
# DETACH {#detach-statement}
Удаляет из сервера информацию о таблице name. Сервер перестаёт знать о существовании таблицы.
``` sql
DETACH TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
Но ни данные, ни метаданные таблицы не удаляются. При следующем запуске сервера, сервер прочитает метаданные и снова узнает о таблице.
Также, «отцепленную» таблицу можно прицепить заново запросом `ATTACH` (за исключением системных таблиц, для которых метаданные не хранятся).
Запроса `DETACH DATABASE` нет.
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/detach/) <!--hide-->

View File

@ -0,0 +1,84 @@
---
toc_priority: 46
toc_title: DROP
---
# DROP {#drop}
Запрос имеет два вида: `DROP DATABASE` и `DROP TABLE`.
``` sql
DROP DATABASE [IF EXISTS] db [ON CLUSTER cluster]
```
``` sql
DROP [TEMPORARY] TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
Удаляет таблицу.
Если указано `IF EXISTS` - не выдавать ошибку, если таблица не существует или база данных не существует.
## DROP USER {#drop-user-statement}
Удаляет пользователя.
### Синтаксис {#drop-user-syntax}
```sql
DROP USER [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP ROLE {#drop-role-statement}
Удаляет роль.
При удалении роль отзывается у всех объектов системы доступа, которым она присвоена.
### Синтаксис {#drop-role-syntax}
```sql
DROP ROLE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP ROW POLICY {#drop-row-policy-statement}
Удаляет политику доступа к строкам.
При удалении политика отзывается у всех объектов системы доступа, которым она присвоена.
### Синтаксис {#drop-row-policy-syntax}
``` sql
DROP [ROW] POLICY [IF EXISTS] name [,...] ON [database.]table [,...] [ON CLUSTER cluster_name]
```
## DROP QUOTA {#drop-quota-statement}
Удаляет квоту.
При удалении квота отзывается у всех объектов системы доступа, которым она присвоена.
### Синтаксис {#drop-quota-syntax}
``` sql
DROP QUOTA [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP SETTINGS PROFILE {#drop-settings-profile-statement}
Удаляет профиль настроек.
При удалении профиль отзывается у всех объектов системы доступа, которым он присвоен.
### Синтаксис {#drop-settings-profile-syntax}
``` sql
DROP [SETTINGS] PROFILE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/drop/) <!--hide-->

View File

@ -0,0 +1,15 @@
---
toc_priority: 47
toc_title: EXISTS
---
# EXISTS {#exists}
``` sql
EXISTS [TEMPORARY] TABLE [db.]name [INTO OUTFILE filename] [FORMAT format]
```
Возвращает один столбец типа `UInt8`, содержащий одно значение - `0`, если таблицы или БД не существует и `1`, если таблица в указанной БД существует.
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/exists/) <!--hide-->

View File

@ -291,7 +291,7 @@ GRANT INSERT(x,y) ON db.table TO john
- Привилегия `MODIFY SETTING` позволяет изменять настройки движков таблиц. Не влияет на настройки или конфигурационные параметры сервера.
- Операция `ATTACH` требует наличие привилегии [CREATE](#grant-create).
- Операция `DETACH` требует наличие привилегии [DROP](#grant-drop).
- Для остановки мутации с помощью [KILL MUTATION](misc.md#kill-mutation-statement), необходима привилегия на выполнение данной мутации. Например, чтобы остановить запрос `ALTER UPDATE`, необходима одна из привилегий: `ALTER UPDATE`, `ALTER TABLE` или `ALTER`.
- Для остановки мутации с помощью [KILL MUTATION](../../sql-reference/statements/kill.md#kill-mutation), необходима привилегия на выполнение данной мутации. Например, чтобы остановить запрос `ALTER UPDATE`, необходима одна из привилегий: `ALTER UPDATE`, `ALTER TABLE` или `ALTER`.
### CREATE {#grant-create}
@ -321,7 +321,7 @@ GRANT INSERT(x,y) ON db.table TO john
### TRUNCATE {#grant-truncate}
Разрешает выполнять запросы [TRUNCATE](misc.md#truncate-statement).
Разрешает выполнять запросы [TRUNCATE](../../sql-reference/statements/truncate.md).
Уровень: `TABLE`.
@ -348,7 +348,7 @@ GRANT INSERT(x,y) ON db.table TO john
### KILL QUERY {#grant-kill-query}
Разрешает выполнять запросы [KILL](misc.md#kill-query-statement) в соответствии со следующей иерархией привилегий:
Разрешает выполнять запросы [KILL](../../sql-reference/statements/kill.md#kill-query) в соответствии со следующей иерархией привилегий:
Уровень: `GLOBAL`.

View File

@ -0,0 +1,73 @@
---
toc_priority: 48
toc_title: KILL
---
# KILL {#kill-statements}
Существует два вида операторов KILL: KILL QUERY и KILL MUTATION
## KILL QUERY {#kill-query}
``` sql
KILL QUERY [ON CLUSTER cluster]
WHERE <where expression to SELECT FROM system.processes query>
[SYNC|ASYNC|TEST]
[FORMAT format]
```
Пытается принудительно остановить исполняющиеся в данный момент запросы.
Запросы для принудительной остановки выбираются из таблицы system.processes с помощью условия, указанного в секции `WHERE` запроса `KILL`.
Примеры
``` sql
-- Принудительно останавливает все запросы с указанным query_id:
KILL QUERY WHERE query_id='2-857d-4a57-9ee0-327da5d60a90'
-- Синхронно останавливает все запросы пользователя 'username':
KILL QUERY WHERE user='username' SYNC
```
Readonly-пользователи могут останавливать только свои запросы.
По умолчанию используется асинхронный вариант запроса (`ASYNC`), который не дожидается подтверждения остановки запросов.
Синхронный вариант (`SYNC`) ожидает остановки всех запросов и построчно выводит информацию о процессах по ходу их остановки.
Ответ содержит колонку `kill_status`, которая может принимать следующие значения:
1. finished - запрос был успешно остановлен;
2. waiting - запросу отправлен сигнал завершения, ожидается его остановка;
3. остальные значения описывают причину невозможности остановки запроса.
Тестовый вариант запроса (`TEST`) только проверяет права пользователя и выводит список запросов для остановки.
## KILL MUTATION {#kill-mutation}
``` sql
KILL MUTATION [ON CLUSTER cluster]
WHERE <where expression to SELECT FROM system.mutations query>
[TEST]
[FORMAT format]
```
Пытается остановить выполняющиеся в данные момент [мутации](alter.md#mutations). Мутации для остановки выбираются из таблицы [`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations) с помощью условия, указанного в секции `WHERE` запроса `KILL`.
Тестовый вариант запроса (`TEST`) только проверяет права пользователя и выводит список запросов для остановки.
Примеры:
``` sql
-- Останавливает все мутации одной таблицы:
KILL MUTATION WHERE database = 'default' AND table = 'table'
-- Останавливает конкретную мутацию:
KILL MUTATION WHERE database = 'default' AND table = 'table' AND mutation_id = 'mutation_3.txt'
```
Запрос полезен в случаях, когда мутация не может выполниться до конца (например, если функция в запросе мутации бросает исключение на данных таблицы).
Данные, уже изменённые мутацией, остаются в таблице (отката на старую версию данных не происходит).
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/kill/) <!--hide-->

View File

@ -1,347 +1,22 @@
---
toc_hidden: true
toc_priority: 41
---
# Прочие виды запросов {#prochie-vidy-zaprosov}
## ATTACH {#attach}
Запрос полностью аналогичен запросу `CREATE`, но:
- вместо слова `CREATE` используется слово `ATTACH`;
- запрос не создаёт данные на диске, а предполагает, что данные уже лежат в соответствующих местах, и всего лишь добавляет информацию о таблице на сервер. После выполнения запроса `ATTACH` сервер будет знать о существовании таблицы.
Если таблица перед этим была отсоединена (`DETACH`), т.е. её структура известна, можно использовать сокращенную форму записи без определения структуры.
``` sql
ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
```
Этот запрос используется при старте сервера. Сервер хранит метаданные таблиц в виде файлов с запросами `ATTACH`, которые он просто исполняет при запуске (за исключением системных таблиц, которые явно создаются на сервере).
## CHECK TABLE {#check-table}
Проверяет таблицу на повреждение данных.
``` sql
CHECK TABLE [db.]name
```
Запрос `CHECK TABLE` сравнивает текущие размеры файлов (в которых хранятся данные из колонок) с ожидаемыми значениями. Если значения не совпадают, данные в таблице считаются поврежденными. Искажение возможно, например, из-за сбоя при записи данных.
Ответ содержит колонку `result`, содержащую одну строку с типом [Boolean](../../sql-reference/data-types/boolean.md). Допустимые значения:
- 0 - данные в таблице повреждены;
- 1 - данные не повреждены.
Запрос `CHECK TABLE` поддерживает следующие движки таблиц:
- [Log](../../engines/table-engines/log-family/log.md)
- [TinyLog](../../engines/table-engines/log-family/tinylog.md)
- [StripeLog](../../engines/table-engines/log-family/stripelog.md)
- [Семейство MergeTree](../../engines/table-engines/mergetree-family/index.md)
При попытке выполнить запрос с таблицами с другими табличными движками, ClickHouse генерирует исключение.
В движках `*Log` не предусмотрено автоматическое восстановление данных после сбоя. Используйте запрос `CHECK TABLE`, чтобы своевременно выявлять повреждение данных.
Для движков из семейства `MergeTree` запрос `CHECK TABLE` показывает статус проверки для каждого отдельного куска данных таблицы на локальном сервере.
**Что делать, если данные повреждены**
В этом случае можно скопировать оставшиеся неповрежденные данные в другую таблицу. Для этого:
1. Создайте новую таблицу с такой же структурой, как у поврежденной таблицы. Для этого выполните запрос `CREATE TABLE <new_table_name> AS <damaged_table_name>`.
2. Установите значение параметра [max\_threads](../../operations/settings/settings.md#settings-max_threads) в 1. Это нужно для того, чтобы выполнить следующий запрос в одном потоке. Установить значение параметра можно через запрос: `SET max_threads = 1`.
3. Выполните запрос `INSERT INTO <new_table_name> SELECT * FROM <damaged_table_name>`. В результате неповрежденные данные будут скопированы в другую таблицу. Обратите внимание, будут скопированы только те данные, которые следуют до поврежденного участка.
4. Перезапустите `clickhouse-client`, чтобы вернуть предыдущее значение параметра `max_threads`.
## DESCRIBE TABLE {#misc-describe-table}
``` sql
DESC|DESCRIBE TABLE [db.]table [INTO OUTFILE filename] [FORMAT format]
```
Возвращает описание столбцов таблицы.
Результат запроса содержит столбцы (все столбцы имеют тип String):
- `name` — имя столбца таблицы;
- `type`— тип столбца;
- `default_type` — в каком виде задано [выражение для значения по умолчанию](create/table.md#create-default-values): `DEFAULT`, `MATERIALIZED` или `ALIAS`. Столбец содержит пустую строку, если значение по умолчанию не задано.
- `default_expression` — значение, заданное в секции `DEFAULT`;
- `comment_expression` — комментарий к столбцу.
Вложенные структуры данных выводятся в «развёрнутом» виде. То есть, каждый столбец - по отдельности, с именем через точку.
## DETACH {#detach-statement}
Удаляет из сервера информацию о таблице name. Сервер перестаёт знать о существовании таблицы.
``` sql
DETACH TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
Но ни данные, ни метаданные таблицы не удаляются. При следующем запуске сервера, сервер прочитает метаданные и снова узнает о таблице.
Также, «отцепленную» таблицу можно прицепить заново запросом `ATTACH` (за исключением системных таблиц, для которых метаданные не хранятся).
Запроса `DETACH DATABASE` нет.
## DROP {#drop}
Запрос имеет два вида: `DROP DATABASE` и `DROP TABLE`.
``` sql
DROP DATABASE [IF EXISTS] db [ON CLUSTER cluster]
```
``` sql
DROP [TEMPORARY] TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
Удаляет таблицу.
Если указано `IF EXISTS` - не выдавать ошибку, если таблица не существует или база данных не существует.
## DROP USER {#drop-user-statement}
Удаляет пользователя.
### Синтаксис {#drop-user-syntax}
```sql
DROP USER [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP ROLE {#drop-role-statement}
Удаляет роль.
При удалении роль отзывается у всех объектов системы доступа, которым она присвоена.
### Синтаксис {#drop-role-syntax}
```sql
DROP ROLE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP ROW POLICY {#drop-row-policy-statement}
Удаляет политику доступа к строкам.
При удалении политика отзывается у всех объектов системы доступа, которым она присвоена.
### Синтаксис {#drop-row-policy-syntax}
``` sql
DROP [ROW] POLICY [IF EXISTS] name [,...] ON [database.]table [,...] [ON CLUSTER cluster_name]
```
## DROP QUOTA {#drop-quota-statement}
Удаляет квоту.
При удалении квота отзывается у всех объектов системы доступа, которым она присвоена.
### Синтаксис {#drop-quota-syntax}
``` sql
DROP QUOTA [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## DROP SETTINGS PROFILE {#drop-settings-profile-statement}
Удаляет профиль настроек.
При удалении профиль отзывается у всех объектов системы доступа, которым он присвоен.
### Синтаксис {#drop-settings-profile-syntax}
``` sql
DROP [SETTINGS] PROFILE [IF EXISTS] name [,...] [ON CLUSTER cluster_name]
```
## EXISTS {#exists}
``` sql
EXISTS [TEMPORARY] TABLE [db.]name [INTO OUTFILE filename] [FORMAT format]
```
Возвращает один столбец типа `UInt8`, содержащий одно значение - `0`, если таблицы или БД не существует и `1`, если таблица в указанной БД существует.
## KILL QUERY {#kill-query-statement}
``` sql
KILL QUERY [ON CLUSTER cluster]
WHERE <where expression to SELECT FROM system.processes query>
[SYNC|ASYNC|TEST]
[FORMAT format]
```
Пытается принудительно остановить исполняющиеся в данный момент запросы.
Запросы для принудительной остановки выбираются из таблицы system.processes с помощью условия, указанного в секции `WHERE` запроса `KILL`.
Примеры
``` sql
-- Принудительно останавливает все запросы с указанным query_id:
KILL QUERY WHERE query_id='2-857d-4a57-9ee0-327da5d60a90'
-- Синхронно останавливает все запросы пользователя 'username':
KILL QUERY WHERE user='username' SYNC
```
Readonly-пользователи могут останавливать только свои запросы.
По умолчанию используется асинхронный вариант запроса (`ASYNC`), который не дожидается подтверждения остановки запросов.
Синхронный вариант (`SYNC`) ожидает остановки всех запросов и построчно выводит информацию о процессах по ходу их остановки.
Ответ содержит колонку `kill_status`, которая может принимать следующие значения:
1. finished - запрос был успешно остановлен;
2. waiting - запросу отправлен сигнал завершения, ожидается его остановка;
3. остальные значения описывают причину невозможности остановки запроса.
Тестовый вариант запроса (`TEST`) только проверяет права пользователя и выводит список запросов для остановки.
## KILL MUTATION {#kill-mutation-statement}
``` sql
KILL MUTATION [ON CLUSTER cluster]
WHERE <where expression to SELECT FROM system.mutations query>
[TEST]
[FORMAT format]
```
Пытается остановить выполняющиеся в данные момент [мутации](alter.md#mutations). Мутации для остановки выбираются из таблицы [`system.mutations`](../../operations/system-tables/mutations.md#system_tables-mutations) с помощью условия, указанного в секции `WHERE` запроса `KILL`.
Тестовый вариант запроса (`TEST`) только проверяет права пользователя и выводит список запросов для остановки.
Примеры:
``` sql
-- Останавливает все мутации одной таблицы:
KILL MUTATION WHERE database = 'default' AND table = 'table'
-- Останавливает конкретную мутацию:
KILL MUTATION WHERE database = 'default' AND table = 'table' AND mutation_id = 'mutation_3.txt'
```
Запрос полезен в случаях, когда мутация не может выполниться до конца (например, если функция в запросе мутации бросает исключение на данных таблицы).
Данные, уже изменённые мутацией, остаются в таблице (отката на старую версию данных не происходит).
## OPTIMIZE {#misc_operations-optimize}
``` sql
OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition | PARTITION ID 'partition_id'] [FINAL] [DEDUPLICATE]
```
Запрос пытается запустить внеплановый мёрж кусков данных для таблиц семейства [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md). Другие движки таблиц не поддерживаются.
Если `OPTIMIZE` применяется к таблицам семейства [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md), ClickHouse создаёт задачу на мёрж и ожидает её исполнения на всех узлах (если активирована настройка `replication_alter_partitions_sync`).
- Если `OPTIMIZE` не выполняет мёрж по любой причине, ClickHouse не оповещает об этом клиента. Чтобы включить оповещения, используйте настройку [optimize\_throw\_if\_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop).
- Если указать `PARTITION`, то оптимизация выполняется только для указанной партиции. [Как задавать имя партиции в запросах](alter.md#alter-how-to-specify-part-expr).
- Если указать `FINAL`, то оптимизация выполняется даже в том случае, если все данные уже лежат в одном куске.
- Если указать `DEDUPLICATE`, то произойдет схлопывание полностью одинаковых строк (сравниваются значения во всех колонках), имеет смысл только для движка MergeTree.
!!! warning "Внимание"
Запрос `OPTIMIZE` не может устранить причину появления ошибки «Too many parts».
## RENAME {#misc_operations-rename}
Переименовывает одну или несколько таблиц.
``` sql
RENAME TABLE [db11.]name11 TO [db12.]name12, [db21.]name21 TO [db22.]name22, ... [ON CLUSTER cluster]
```
Переименовывание таблицы является лёгкой операцией. Если вы указали после `TO` другую базу данных, то таблица будет перенесена в эту базу данных. При этом, директории с базами данных должны быть расположены в одной файловой системе (иначе возвращается ошибка). В случае переименования нескольких таблиц в одном запросе — это неатомарная операция, может выполнится частично, запросы в других сессиях могут получить ошибку `Table ... doesn't exist...`.
## SET {#query-set}
``` sql
SET param = value
```
Устанавливает значение `value` для [настройки](../../operations/settings/index.md) `param` в текущей сессии. [Конфигурационные параметры сервера](../../operations/server-configuration-parameters/settings.md) нельзя изменить подобным образом.
Можно одним запросом установить все настройки из заданного профиля настроек.
``` sql
SET profile = 'profile-name-from-the-settings-file'
```
Подробности смотрите в разделе [Настройки](../../operations/settings/settings.md).
## SET ROLE {#set-role-statement}
Активирует роли для текущего пользователя.
### Синтаксис {#set-role-syntax}
``` sql
SET ROLE {DEFAULT | NONE | role [,...] | ALL | ALL EXCEPT role [,...]}
```
## SET DEFAULT ROLE {#set-default-role-statement}
Устанавливает роли по умолчанию для пользователя.
Роли по умолчанию активируются автоматически при входе пользователя. Ролями по умолчанию могут быть установлены только ранее назначенные роли. Если роль не назначена пользователю, ClickHouse выбрасывает исключение.
### Синтаксис {#set-default-role-syntax}
``` sql
SET DEFAULT ROLE {NONE | role [,...] | ALL | ALL EXCEPT role [,...]} TO {user|CURRENT_USER} [,...]
```
### Примеры {#set-default-role-examples}
Установить несколько ролей по умолчанию для пользователя:
``` sql
SET DEFAULT ROLE role1, role2, ... TO user
```
Установить ролями по умолчанию все назначенные пользователю роли:
``` sql
SET DEFAULT ROLE ALL TO user
```
Удалить роли по умолчанию для пользователя:
``` sql
SET DEFAULT ROLE NONE TO user
```
Установить ролями по умолчанию все назначенные пользователю роли за исключением указанных:
```sql
SET DEFAULT ROLE ALL EXCEPT role1, role2 TO user
```
## TRUNCATE {#truncate-statement}
``` sql
TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
Удаляет все данные из таблицы. Если условие `IF EXISTS` не указано, запрос вернет ошибку, если таблицы не существует.
Запрос `TRUNCATE` не поддерживается для следующих движков: [View](../../engines/table-engines/special/view.md), [File](../../engines/table-engines/special/file.md), [URL](../../engines/table-engines/special/url.md) и [Null](../../engines/table-engines/special/null.md).
## USE {#use}
``` sql
USE db
```
Позволяет установить текущую базу данных для сессии.
Текущая база данных используется для поиска таблиц, если база данных не указана в запросе явно через точку перед именем таблицы.
При использовании HTTP протокола запрос не может быть выполнен, так как понятия сессии не существует.
[Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/misc/) <!--hide-->
- [ATTACH](../../sql-reference/statements/attach.md)
- [CHECK TABLE](../../sql-reference/statements/check-table.md)
- [DESCRIBE TABLE](../../sql-reference/statements/describe-table.md)
- [DETACH](../../sql-reference/statements/detach.md)
- [DROP](../../sql-reference/statements/drop.md)
- [EXISTS](../../sql-reference/statements/exists.md)
- [KILL](../../sql-reference/statements/kill.md)
- [OPTIMIZE](../../sql-reference/statements/optimize.md)
- [RENAME](../../sql-reference/statements/rename.md)
- [SET](../../sql-reference/statements/set.md)
- [SET ROLE](../../sql-reference/statements/set-role.md)
- [TRUNCATE](../../sql-reference/statements/truncate.md)
- [USE](../../sql-reference/statements/use.md)
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/misc/) <!--hide-->

View File

@ -0,0 +1,25 @@
---
toc_priority: 49
toc_title: OPTIMIZE
---
# OPTIMIZE {#misc_operations-optimize}
``` sql
OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition | PARTITION ID 'partition_id'] [FINAL] [DEDUPLICATE]
```
Запрос пытается запустить внеплановый мёрж кусков данных для таблиц семейства [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md). Другие движки таблиц не поддерживаются.
Если `OPTIMIZE` применяется к таблицам семейства [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md), ClickHouse создаёт задачу на мёрж и ожидает её исполнения на всех узлах (если активирована настройка `replication_alter_partitions_sync`).
- Если `OPTIMIZE` не выполняет мёрж по любой причине, ClickHouse не оповещает об этом клиента. Чтобы включить оповещения, используйте настройку [optimize\_throw\_if\_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop).
- Если указать `PARTITION`, то оптимизация выполняется только для указанной партиции. [Как задавать имя партиции в запросах](alter.md#alter-how-to-specify-part-expr).
- Если указать `FINAL`, то оптимизация выполняется даже в том случае, если все данные уже лежат в одном куске.
- Если указать `DEDUPLICATE`, то произойдет схлопывание полностью одинаковых строк (сравниваются значения во всех колонках), имеет смысл только для движка MergeTree.
!!! warning "Внимание"
Запрос `OPTIMIZE` не может устранить причину появления ошибки «Too many parts».
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/optimize/) <!--hide-->

View File

@ -0,0 +1,17 @@
---
toc_priority: 50
toc_title: RENAME
---
# RENAME {#misc_operations-rename}
Переименовывает одну или несколько таблиц.
``` sql
RENAME TABLE [db11.]name11 TO [db12.]name12, [db21.]name21 TO [db22.]name22, ... [ON CLUSTER cluster]
```
Переименовывание таблицы является лёгкой операцией. Если вы указали после `TO` другую базу данных, то таблица будет перенесена в эту базу данных. При этом, директории с базами данных должны быть расположены в одной файловой системе (иначе возвращается ошибка). В случае переименования нескольких таблиц в одном запросе — это неатомарная операция, может выполнится частично, запросы в других сессиях могут получить ошибку `Table ... doesn't exist...`.
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/rename/) <!--hide-->

View File

@ -70,6 +70,25 @@ toc_title: ORDER BY
Внешняя сортировка работает существенно менее эффективно, чем сортировка в оперативке.
## Оптимизация чтения данных {#optimize_read_in_order}
Если в списке выражений в секции `ORDER BY` первыми указаны те поля, по которым проиндексирована таблица, по которой строится выборка, такой запрос можно оптимизировать — для этого используйте настройку [optimize_read_in_order](../../../operations/settings/settings.md#optimize_read_in_order).
Когда настройка `optimize_read_in_order` включена, при выполнении запроса сервер использует табличные индексы и считывает данные в том порядке, который задан списком выражений `ORDER BY`. Поэтому если в запросе установлен [LIMIT](../../../sql-reference/statements/select/limit.md), сервер не станет считывать лишние данные. Таким образом, запросы к большим таблицам, но имеющие ограничения по числу записей, выполняются быстрее.
Оптимизация работает при любом порядке сортировки `ASC` или `DESC`, но не работает при использовании группировки [GROUP BY](../../../sql-reference/statements/select/group-by.md) и модификатора [FINAL](../../../sql-reference/statements/select/from.md#select-from-final).
Когда настройка `optimize_read_in_order` отключена, при выполнении запросов `SELECT` табличные индексы не используются.
Для запросов с сортировкой `ORDER BY`, большим значением `LIMIT` и условиями отбора [WHERE](../../../sql-reference/statements/select/where.md), требующими чтения больших объемов данных, рекомендуется отключать `optimize_read_in_order` вручную.
Оптимизация чтения данных поддерживается в следующих движках:
- [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md)
- [Merge](../../../engines/table-engines/special/merge.md), [Buffer](../../../engines/table-engines/special/buffer.md) и [MaterializedView](../../../engines/table-engines/special/materializedview.md), работающими с таблицами `MergeTree`
В движке `MaterializedView` оптимизация поддерживается при работе с сохраненными запросами (представлениями) вида `SELECT ... FROM merge_tree_table ORDER BY pk`. Но оптимизация не поддерживается для запросов вида `SELECT ... FROM view ORDER BY pk`, если в сохраненном запросе нет секции `ORDER BY`.
## Модификатор ORDER BY expr WITH FILL {#orderby-with-fill}
Этот модификатор также может быть скобинирован с модификатором [LIMIT ... WITH TIES](../../../sql-reference/statements/select/limit.md#limit-with-ties)

View File

@ -0,0 +1,57 @@
---
toc_priority: 52
toc_title: SET ROLE
---
# SET ROLE {#set-role-statement}
Активирует роли для текущего пользователя.
### Синтаксис {#set-role-syntax}
``` sql
SET ROLE {DEFAULT | NONE | role [,...] | ALL | ALL EXCEPT role [,...]}
```
## SET DEFAULT ROLE {#set-default-role-statement}
Устанавливает роли по умолчанию для пользователя.
Роли по умолчанию активируются автоматически при входе пользователя. Ролями по умолчанию могут быть установлены только ранее назначенные роли. Если роль не назначена пользователю, ClickHouse выбрасывает исключение.
### Синтаксис {#set-default-role-syntax}
``` sql
SET DEFAULT ROLE {NONE | role [,...] | ALL | ALL EXCEPT role [,...]} TO {user|CURRENT_USER} [,...]
```
### Примеры {#set-default-role-examples}
Установить несколько ролей по умолчанию для пользователя:
``` sql
SET DEFAULT ROLE role1, role2, ... TO user
```
Установить ролями по умолчанию все назначенные пользователю роли:
``` sql
SET DEFAULT ROLE ALL TO user
```
Удалить роли по умолчанию для пользователя:
``` sql
SET DEFAULT ROLE NONE TO user
```
Установить ролями по умолчанию все назначенные пользователю роли за исключением указанных:
```sql
SET DEFAULT ROLE ALL EXCEPT role1, role2 TO user
```
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/set-role/) <!--hide-->

View File

@ -0,0 +1,22 @@
---
toc_priority: 51
toc_title: SET
---
# SET {#query-set}
``` sql
SET param = value
```
Устанавливает значение `value` для [настройки](../../operations/settings/index.md) `param` в текущей сессии. [Конфигурационные параметры сервера](../../operations/server-configuration-parameters/settings.md) нельзя изменить подобным образом.
Можно одним запросом установить все настройки из заданного профиля настроек.
``` sql
SET profile = 'profile-name-from-the-settings-file'
```
Подробности смотрите в разделе [Настройки](../../operations/settings/settings.md).
[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/statements/set/) <!--hide-->

Some files were not shown because too many files have changed in this diff Show More