Merge branch 'master' into database_atomic

Commit 89a31e4253 by Alexander Tokmakov, 2020-01-29 15:05:48 +03:00. 90 changed files with 1100 additions and 386 deletions.

## ClickHouse release v20.1
### ClickHouse release v20.1.2.4, 2020-01-22
### Backward Incompatible Change
* Make the setting `merge_tree_uniform_read_distribution` obsolete. The server still recognizes this setting but it has no effect. [#8308](https://github.com/ClickHouse/ClickHouse/pull/8308) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Changed return type of the function `greatCircleDistance` to `Float32` because now the result of calculation is `Float32`. [#7993](https://github.com/ClickHouse/ClickHouse/pull/7993) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Query parameters are now expected to be represented in "escaped" format. For example, to pass the string `a<tab>b` you have to write `a\tb` or `a\<tab>b` (and, respectively, `a%5Ctb` or `a%5C%09b` in a URL); see the sketch after this list. This is needed to add the possibility to pass NULL as `\N`. This fixes [#7488](https://github.com/ClickHouse/ClickHouse/issues/7488). [#8517](https://github.com/ClickHouse/ClickHouse/pull/8517) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Enable the `use_minimalistic_part_header_in_zookeeper` setting for `ReplicatedMergeTree` by default. This will significantly reduce the amount of data stored in ZooKeeper. The setting has been supported since version 19.1 and we have already used it in production in multiple services without any issues for more than half a year. Disable this setting if you may need to downgrade to a version older than 19.1. [#6850](https://github.com/ClickHouse/ClickHouse/pull/6850) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Data skipping indices are production ready and enabled by default. The settings `allow_experimental_data_skipping_indices`, `allow_experimental_cross_to_join_conversion` and `allow_experimental_multiple_joins_emulation` are now obsolete and do nothing. [#7974](https://github.com/ClickHouse/ClickHouse/pull/7974) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add new `ANY JOIN` logic for `StorageJoin` consistent with the `JOIN` operation. To upgrade without changes in behaviour, you need to add `SETTINGS any_join_distinct_right_table_keys = 1` to the metadata of Engine Join tables or recreate these tables after the upgrade (see the sketch after this list). [#8400](https://github.com/ClickHouse/ClickHouse/pull/8400) ([Artem Zuikov](https://github.com/4ertus2))
* Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see [#8696](https://github.com/ClickHouse/ClickHouse/issues/8696)). [#8707](https://github.com/ClickHouse/ClickHouse/pull/8707) ([Alexander Kuzmenkov](https://github.com/akuzm))
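A minimal sketch of the escaped query-parameter format described above; the parameter name `s` and the query are illustrative and not taken from the release notes:

```sql
-- The parameter value a<tab>b is passed in escaped form, e.g. as a\tb via
-- clickhouse-client --param_s or as a%5C%09b in an HTTP URL, and is
-- substituted into the query with the {name:Type} syntax:
SELECT {s:String} AS value;
```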
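Likewise, a hedged sketch of the upgrade path for Engine Join tables mentioned above; the table and column names are hypothetical:

```sql
-- Recreate (or add to the metadata of) an existing Join-engine table with the
-- compatibility setting so that its ANY JOIN behaviour stays the same after the upgrade.
CREATE TABLE join_state (k UInt64, v String)
ENGINE = Join(ANY, LEFT, k)
SETTINGS any_join_distinct_right_table_keys = 1;
```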
### New Feature
* Added information about part paths to `system.merges`. [#8043](https://github.com/ClickHouse/ClickHouse/pull/8043) ([Vladimir Chebotarev](https://github.com/excitoon))
* Add ability to execute `SYSTEM RELOAD DICTIONARY` query in `ON CLUSTER` mode. [#8288](https://github.com/ClickHouse/ClickHouse/pull/8288) ([Guillaume Tassery](https://github.com/YiuRULE))
* Add ability to execute `CREATE DICTIONARY` queries in `ON CLUSTER` mode. [#8163](https://github.com/ClickHouse/ClickHouse/pull/8163) ([alesapin](https://github.com/alesapin))
* Now user's profile in `users.xml` can inherit multiple profiles. [#8343](https://github.com/ClickHouse/ClickHouse/pull/8343) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Added `system.stack_trace` table that allows looking at stack traces of all server threads. This is useful for developers to introspect server state. This fixes [#7576](https://github.com/ClickHouse/ClickHouse/issues/7576). [#8344](https://github.com/ClickHouse/ClickHouse/pull/8344) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add `DateTime64` datatype with configurable sub-second precision. [#7170](https://github.com/ClickHouse/ClickHouse/pull/7170) ([Vasily Nemkov](https://github.com/Enmk))
* Add table function `clusterAllReplicas` which allows querying all the nodes in the cluster. [#8493](https://github.com/ClickHouse/ClickHouse/pull/8493) ([kiran sunkari](https://github.com/kiransunkari))
* Add aggregate function `categoricalInformationValue` which calculates the information value of a discrete feature. [#8117](https://github.com/ClickHouse/ClickHouse/pull/8117) ([hcz](https://github.com/hczhcz))
* Speed up parsing of data files in `CSV`, `TSV` and `JSONEachRow` format by doing it in parallel. [#7780](https://github.com/ClickHouse/ClickHouse/pull/7780) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Add function `bankerRound` which performs banker's rounding. [#8112](https://github.com/ClickHouse/ClickHouse/pull/8112) ([hcz](https://github.com/hczhcz))
* Support more languages in embedded dictionary for region names: 'ru', 'en', 'ua', 'uk', 'by', 'kz', 'tr', 'de', 'uz', 'lv', 'lt', 'et', 'pt', 'he', 'vi'. [#8189](https://github.com/ClickHouse/ClickHouse/pull/8189) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Improvements in consistency of `ANY JOIN` logic. Now `t1 ANY LEFT JOIN t2` equals `t2 ANY RIGHT JOIN t1`. [#7665](https://github.com/ClickHouse/ClickHouse/pull/7665) ([Artem Zuikov](https://github.com/4ertus2))
* Add setting `any_join_distinct_right_table_keys` which enables old behaviour for `ANY INNER JOIN`. [#7665](https://github.com/ClickHouse/ClickHouse/pull/7665) ([Artem Zuikov](https://github.com/4ertus2))
* Add new `SEMI` and `ANTI JOIN`. The old `ANY INNER JOIN` behaviour is now available as `SEMI LEFT JOIN` (see the sketch after this list). [#7665](https://github.com/ClickHouse/ClickHouse/pull/7665) ([Artem Zuikov](https://github.com/4ertus2))
* Added `Distributed` format for the `File` engine and the `file` table function which allows reading from `.bin` files generated by asynchronous inserts into a `Distributed` table. [#8535](https://github.com/ClickHouse/ClickHouse/pull/8535) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Add an optional reset-column argument for `runningAccumulate` which allows resetting aggregation results for each new key value. [#8326](https://github.com/ClickHouse/ClickHouse/pull/8326) ([Sergey Kononenko](https://github.com/kononencheg))
* Add ability to use ClickHouse as Prometheus endpoint. [#7900](https://github.com/ClickHouse/ClickHouse/pull/7900) ([vdimir](https://github.com/Vdimir))
* Add section `<remote_url_allow_hosts>` in `config.xml` which restricts allowed hosts for remote table engines and table functions `URL`, `S3`, `HDFS`. [#7154](https://github.com/ClickHouse/ClickHouse/pull/7154) ([Mikhail Korotov](https://github.com/millb))
* Added function `greatCircleAngle` which calculates the distance on a sphere in degrees. [#8105](https://github.com/ClickHouse/ClickHouse/pull/8105) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Changed Earth radius to be consistent with H3 library. [#8105](https://github.com/ClickHouse/ClickHouse/pull/8105) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added `JSONCompactEachRow` and `JSONCompactEachRowWithNamesAndTypes` formats for input and output. [#7841](https://github.com/ClickHouse/ClickHouse/pull/7841) ([Mikhail Korotov](https://github.com/millb))
* Added a feature for file-related table engines and table functions (`File`, `S3`, `URL`, `HDFS`) which allows reading and writing `gzip` files based on an additional engine parameter or the file extension. [#7840](https://github.com/ClickHouse/ClickHouse/pull/7840) ([Andrey Bodrov](https://github.com/apbodrov))
* Added the `randomASCII(length)` function, generating a string with a random set of [ASCII](https://en.wikipedia.org/wiki/ASCII#Printable_characters) printable characters. [#8401](https://github.com/ClickHouse/ClickHouse/pull/8401) ([BayoNet](https://github.com/BayoNet))
* Added function `JSONExtractArrayRaw` which returns an array of unparsed JSON array elements from a `JSON` string. [#8081](https://github.com/ClickHouse/ClickHouse/pull/8081) ([Oleg Matrokhin](https://github.com/errx))
* Add the `arrayZip` function which combines multiple arrays of equal length into one array of tuples. [#8149](https://github.com/ClickHouse/ClickHouse/pull/8149) ([Winter Zhang](https://github.com/zhang2014))
* Add ability to move data between disks according to configured `TTL`-expressions for `*MergeTree` table engines family. [#8140](https://github.com/ClickHouse/ClickHouse/pull/8140) ([Vladimir Chebotarev](https://github.com/excitoon))
* Added new aggregate function `avgWeighted` which calculates a weighted average (see the combined example after this list). [#7898](https://github.com/ClickHouse/ClickHouse/pull/7898) ([Andrey Bodrov](https://github.com/apbodrov))
* Now parallel parsing is enabled by default for `TSV`, `TSKV`, `CSV` and `JSONEachRow` formats. [#7894](https://github.com/ClickHouse/ClickHouse/pull/7894) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Add several geo functions from `H3` library: `h3GetResolution`, `h3EdgeAngle`, `h3EdgeLength`, `h3IsValid` and `h3kRing`. [#8034](https://github.com/ClickHouse/ClickHouse/pull/8034) ([Konstantin Malanchev](https://github.com/hombit))
* Added support for brotli (`br`) compression in file-related storages and table functions. This fixes [#8156](https://github.com/ClickHouse/ClickHouse/issues/8156). [#8526](https://github.com/ClickHouse/ClickHouse/pull/8526) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add `groupBit*` functions for the `SimpleAggregationFunction` type. [#8485](https://github.com/ClickHouse/ClickHouse/pull/8485) ([Guillaume Tassery](https://github.com/YiuRULE))
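A few of the additions above in action, as a hedged sketch with illustrative values that are not taken from the release notes:

```sql
-- avgWeighted: weighted average of x with weights w
SELECT avgWeighted(x, w) FROM (SELECT 1 AS x, 10 AS w UNION ALL SELECT 3 AS x, 1 AS w);

-- arrayZip: combine arrays of equal length into one array of tuples
SELECT arrayZip([1, 2, 3], ['a', 'b', 'c']);

-- DateTime64 with millisecond (scale 3) precision
SELECT toDateTime64('2020-01-22 12:34:56.789', 3);
```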
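And a hedged sketch of the new join kinds; `t1`, `t2` and `key` are hypothetical names:

```sql
-- SEMI LEFT JOIN: rows of t1 that have at least one match in t2
SELECT * FROM t1 SEMI LEFT JOIN t2 ON t1.key = t2.key;

-- ANTI LEFT JOIN: rows of t1 that have no match in t2
SELECT * FROM t1 ANTI LEFT JOIN t2 ON t1.key = t2.key;
```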
### Bug Fix
* Fix rename of tables with `Distributed` engine. Fixes issue [#7868](https://github.com/ClickHouse/ClickHouse/issues/7868). [#8306](https://github.com/ClickHouse/ClickHouse/pull/8306) ([tavplubix](https://github.com/tavplubix))
* Now dictionaries support `EXPRESSION` for attributes as an arbitrary string in a non-ClickHouse SQL dialect. [#8098](https://github.com/ClickHouse/ClickHouse/pull/8098) ([alesapin](https://github.com/alesapin))
* Fix broken `INSERT SELECT FROM mysql(...)` query. This fixes [#8070](https://github.com/ClickHouse/ClickHouse/issues/8070) and [#7960](https://github.com/ClickHouse/ClickHouse/issues/7960). [#8234](https://github.com/ClickHouse/ClickHouse/pull/8234) ([tavplubix](https://github.com/tavplubix))
* Fix error "Mismatch column sizes" when inserting default `Tuple` from `JSONEachRow`. This fixes [#5653](https://github.com/ClickHouse/ClickHouse/issues/5653). [#8606](https://github.com/ClickHouse/ClickHouse/pull/8606) ([tavplubix](https://github.com/tavplubix))
* Now an exception will be thrown in case of using `WITH TIES` alongside `LIMIT BY`. Also add ability to use `TOP` with `LIMIT BY`. This fixes [#7472](https://github.com/ClickHouse/ClickHouse/issues/7472). [#7637](https://github.com/ClickHouse/ClickHouse/pull/7637) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Fix unintended dependency on a fresh glibc version in the `clickhouse-odbc-bridge` binary. [#8046](https://github.com/ClickHouse/ClickHouse/pull/8046) ([Amos Bird](https://github.com/amosbird))
* Fix bug in the check function of the `*MergeTree` engines family. Now it doesn't fail when the last granule and the last (non-final) mark contain an equal number of rows. [#8047](https://github.com/ClickHouse/ClickHouse/pull/8047) ([alesapin](https://github.com/alesapin))
* Fix insert into `Enum*` columns after an `ALTER` query, when the underlying numeric type is equal to the type specified for the table. This fixes [#7836](https://github.com/ClickHouse/ClickHouse/issues/7836). [#7908](https://github.com/ClickHouse/ClickHouse/pull/7908) ([Anton Popov](https://github.com/CurtizJ))
* Allowed non-constant negative "size" argument for function `substring`. It was not allowed by mistake. This fixes [#4832](https://github.com/ClickHouse/ClickHouse/issues/4832). [#7703](https://github.com/ClickHouse/ClickHouse/pull/7703) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix parsing bug when wrong number of arguments passed to `(O|J)DBC` table engine. [#7709](https://github.com/ClickHouse/ClickHouse/pull/7709) ([alesapin](https://github.com/alesapin))
* Use the command name of the running clickhouse process when sending logs to syslog. In previous versions, an empty string was used instead of the command name. [#8460](https://github.com/ClickHouse/ClickHouse/pull/8460) ([Michael Nacharov](https://github.com/mnach))
* Fix check of allowed hosts for `localhost`. This PR fixes the solution provided in [#8241](https://github.com/ClickHouse/ClickHouse/pull/8241). [#8342](https://github.com/ClickHouse/ClickHouse/pull/8342) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix rare crash in `argMin` and `argMax` functions for long string arguments, when result is used in `runningAccumulate` function. This fixes [#8325](https://github.com/ClickHouse/ClickHouse/issues/8325) [#8341](https://github.com/ClickHouse/ClickHouse/pull/8341) ([dinosaur](https://github.com/769344359))
* Fix memory overcommit for tables with `Buffer` engine. [#8345](https://github.com/ClickHouse/ClickHouse/pull/8345) ([Azat Khuzhin](https://github.com/azat))
* Fixed potential bug in functions that can take `NULL` as one of the arguments and return non-NULL. [#8196](https://github.com/ClickHouse/ClickHouse/pull/8196) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Better metrics calculations in thread pool for background processes for `MergeTree` table engines. [#8194](https://github.com/ClickHouse/ClickHouse/pull/8194) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix function `IN` inside `WHERE` statement when row-level table filter is present. Fixes [#6687](https://github.com/ClickHouse/ClickHouse/issues/6687) [#8357](https://github.com/ClickHouse/ClickHouse/pull/8357) ([Ivan](https://github.com/abyss7))
* Now an exception is thrown if an integer value for a setting is not parsed completely. [#7678](https://github.com/ClickHouse/ClickHouse/pull/7678) ([Mikhail Korotov](https://github.com/millb))
* Fix exception when aggregate function is used in query to distributed table with more than two local shards. [#8164](https://github.com/ClickHouse/ClickHouse/pull/8164) ([小路](https://github.com/nicelulu))
* Now bloom filter can handle zero length arrays and doesn't perform redundant calculations. [#8242](https://github.com/ClickHouse/ClickHouse/pull/8242) ([achimbab](https://github.com/achimbab))
* Fixed checking if a client host is allowed by matching the client host to `host_regexp` specified in `users.xml`. [#8241](https://github.com/ClickHouse/ClickHouse/pull/8241) ([Vitaly Baranov](https://github.com/vitlibar))
* Relax ambiguous column check that leads to false positives in multiple `JOIN ON` section. [#8385](https://github.com/ClickHouse/ClickHouse/pull/8385) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed possible server crash (`std::terminate`) when the server cannot send or write data in `JSON` or `XML` format with values of `String` data type (that require `UTF-8` validation) or when compressing result data with Brotli algorithm or in some other rare cases. This fixes [#7603](https://github.com/ClickHouse/ClickHouse/issues/7603) [#8384](https://github.com/ClickHouse/ClickHouse/pull/8384) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix race condition in `StorageDistributedDirectoryMonitor` found by CI. This fixes [#8364](https://github.com/ClickHouse/ClickHouse/issues/8364). [#8383](https://github.com/ClickHouse/ClickHouse/pull/8383) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Now background merges in `*MergeTree` table engines family preserve storage policy volume order more accurately. [#8549](https://github.com/ClickHouse/ClickHouse/pull/8549) ([Vladimir Chebotarev](https://github.com/excitoon))
* Now table engine `Kafka` works properly with `Native` format. This fixes [#6731](https://github.com/ClickHouse/ClickHouse/issues/6731) [#7337](https://github.com/ClickHouse/ClickHouse/issues/7337) [#8003](https://github.com/ClickHouse/ClickHouse/issues/8003). [#8016](https://github.com/ClickHouse/ClickHouse/pull/8016) ([filimonov](https://github.com/filimonov))
* Fixed formats with headers (like `CSVWithNames`) which were throwing exception about EOF for table engine `Kafka`. [#8016](https://github.com/ClickHouse/ClickHouse/pull/8016) ([filimonov](https://github.com/filimonov))
* Fixed a bug with making set from subquery in right part of `IN` section. This fixes [#5767](https://github.com/ClickHouse/ClickHouse/issues/5767) and [#2542](https://github.com/ClickHouse/ClickHouse/issues/2542). [#7755](https://github.com/ClickHouse/ClickHouse/pull/7755) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Fix possible crash while reading from storage `File`. [#7756](https://github.com/ClickHouse/ClickHouse/pull/7756) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed reading of the files in `Parquet` format containing columns of type `list`. [#8334](https://github.com/ClickHouse/ClickHouse/pull/8334) ([maxulan](https://github.com/maxulan))
* Fix error `Not found column` for distributed queries with `PREWHERE` condition dependent on sampling key if `max_parallel_replicas > 1`. [#7913](https://github.com/ClickHouse/ClickHouse/pull/7913) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix error `Not found column` if query used `PREWHERE` dependent on table's alias and the result set was empty because of primary key condition. [#7911](https://github.com/ClickHouse/ClickHouse/pull/7911) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed return type for functions `rand` and `randConstant` in case of `Nullable` argument. Now functions always return `UInt32` and never `Nullable(UInt32)`. [#8204](https://github.com/ClickHouse/ClickHouse/pull/8204) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Disabled predicate push-down for `WITH FILL` expression. This fixes [#7784](https://github.com/ClickHouse/ClickHouse/issues/7784). [#7789](https://github.com/ClickHouse/ClickHouse/pull/7789) ([Winter Zhang](https://github.com/zhang2014))
* Fixed incorrect `count()` result for `SummingMergeTree` when `FINAL` section is used. [#3280](https://github.com/ClickHouse/ClickHouse/issues/3280) [#7786](https://github.com/ClickHouse/ClickHouse/pull/7786) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Fix possible incorrect result for constant functions from remote servers. It happened for queries with functions like `version()`, `uptime()`, etc. which returns different constant values for different servers. This fixes [#7666](https://github.com/ClickHouse/ClickHouse/issues/7666). [#7689](https://github.com/ClickHouse/ClickHouse/pull/7689) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix complicated bug in push-down predicate optimization which leads to wrong results. This fixes a lot of issues on push-down predicate optimization. [#8503](https://github.com/ClickHouse/ClickHouse/pull/8503) ([Winter Zhang](https://github.com/zhang2014))
* Fix crash in `CREATE TABLE .. AS dictionary` query. [#8508](https://github.com/ClickHouse/ClickHouse/pull/8508) ([Azat Khuzhin](https://github.com/azat))
* Several improvements to the ClickHouse grammar in the `.g4` file. [#8294](https://github.com/ClickHouse/ClickHouse/pull/8294) ([taiyang-li](https://github.com/taiyang-li))
* Fix bug that leads to crashes in `JOIN`s with tables with engine `Join`. This fixes [#7556](https://github.com/ClickHouse/ClickHouse/issues/7556) [#8254](https://github.com/ClickHouse/ClickHouse/issues/8254) [#7915](https://github.com/ClickHouse/ClickHouse/issues/7915) [#8100](https://github.com/ClickHouse/ClickHouse/issues/8100). [#8298](https://github.com/ClickHouse/ClickHouse/pull/8298) ([Artem Zuikov](https://github.com/4ertus2))
* Fix redundant dictionaries reload on `CREATE DATABASE`. [#7916](https://github.com/ClickHouse/ClickHouse/pull/7916) ([Azat Khuzhin](https://github.com/azat))
* Limit maximum number of streams for read from `StorageFile` and `StorageHDFS`. Fixes https://github.com/ClickHouse/ClickHouse/issues/7650. [#7981](https://github.com/ClickHouse/ClickHouse/pull/7981) ([alesapin](https://github.com/alesapin))
* Fix bug in `ALTER ... MODIFY ... CODEC` query when the user specifies both a default expression and a codec. Fixes [#8593](https://github.com/ClickHouse/ClickHouse/issues/8593). [#8614](https://github.com/ClickHouse/ClickHouse/pull/8614) ([alesapin](https://github.com/alesapin))
* Fix error in background merge of columns with `SimpleAggregateFunction(LowCardinality)` type. [#8613](https://github.com/ClickHouse/ClickHouse/pull/8613) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed type check in function `toDateTime64`. [#8375](https://github.com/ClickHouse/ClickHouse/pull/8375) ([Vasily Nemkov](https://github.com/Enmk))
* Now the server does not crash on `LEFT` or `FULL JOIN` with a Join engine and an unsupported `join_use_nulls` setting. [#8479](https://github.com/ClickHouse/ClickHouse/pull/8479) ([Artem Zuikov](https://github.com/4ertus2))
* Now `DROP DICTIONARY IF EXISTS db.dict` query doesn't throw exception if `db` doesn't exist. [#8185](https://github.com/ClickHouse/ClickHouse/pull/8185) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix possible crashes in table functions (`file`, `mysql`, `remote`) caused by usage of reference to removed `IStorage` object. Fix incorrect parsing of columns specified at insertion into table function. [#7762](https://github.com/ClickHouse/ClickHouse/pull/7762) ([tavplubix](https://github.com/tavplubix))
* Ensure network be up before starting `clickhouse-server`. This fixes [#7507](https://github.com/ClickHouse/ClickHouse/issues/7507). [#8570](https://github.com/ClickHouse/ClickHouse/pull/8570) ([Zhichang Yu](https://github.com/yuzhichang))
* Fix timeout handling for secure connections, so queries don't hang indefinitely. This fixes [#8126](https://github.com/ClickHouse/ClickHouse/issues/8126). [#8128](https://github.com/ClickHouse/ClickHouse/pull/8128) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `clickhouse-copier`'s redundant contention between concurrent workers. [#7816](https://github.com/ClickHouse/ClickHouse/pull/7816) ([Ding Xiang Fei](https://github.com/dingxiangfei2009))
* Now mutations don't skip attached parts, even if their mutation version is larger than the current mutation version. [#7812](https://github.com/ClickHouse/ClickHouse/pull/7812) ([Zhichang Yu](https://github.com/yuzhichang)) [#8250](https://github.com/ClickHouse/ClickHouse/pull/8250) ([alesapin](https://github.com/alesapin))
* Ignore redundant copies of `*MergeTree` data parts after move to another disk and server restart. [#7810](https://github.com/ClickHouse/ClickHouse/pull/7810) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix crash in `FULL JOIN` with `LowCardinality` in `JOIN` key. [#8252](https://github.com/ClickHouse/ClickHouse/pull/8252) ([Artem Zuikov](https://github.com/4ertus2))
* Forbid using a column name more than once in an insert query like `INSERT INTO tbl (x, y, x)` (see the example after this list). This fixes [#5465](https://github.com/ClickHouse/ClickHouse/issues/5465), [#7681](https://github.com/ClickHouse/ClickHouse/issues/7681). [#7685](https://github.com/ClickHouse/ClickHouse/pull/7685) ([alesapin](https://github.com/alesapin))
* Added a fallback for detecting the number of physical CPU cores on unknown CPUs (using the number of logical CPU cores). This fixes [#5239](https://github.com/ClickHouse/ClickHouse/issues/5239). [#7726](https://github.com/ClickHouse/ClickHouse/pull/7726) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `There's no column` error for materialized and alias columns. [#8210](https://github.com/ClickHouse/ClickHouse/pull/8210) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed server crash when an `EXISTS` query was used without a `TABLE` or `DICTIONARY` qualifier, just like `EXISTS t`. This fixes [#8172](https://github.com/ClickHouse/ClickHouse/issues/8172). This bug was introduced in version 19.17. [#8213](https://github.com/ClickHouse/ClickHouse/pull/8213) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix rare bug with error `"Sizes of columns doesn't match"` that might appear when using `SimpleAggregateFunction` column. [#7790](https://github.com/ClickHouse/ClickHouse/pull/7790) ([Boris Granveaud](https://github.com/bgranvea))
* Fix bug where user with empty `allow_databases` got access to all databases (and same for `allow_dictionaries`). [#7793](https://github.com/ClickHouse/ClickHouse/pull/7793) ([DeifyTheGod](https://github.com/DeifyTheGod))
* Fix client crash when server already disconnected from client. [#8071](https://github.com/ClickHouse/ClickHouse/pull/8071) ([Azat Khuzhin](https://github.com/azat))
* Fix `ORDER BY` behaviour in case of sorting by primary key prefix and non primary key suffix. [#7759](https://github.com/ClickHouse/ClickHouse/pull/7759) ([Anton Popov](https://github.com/CurtizJ))
* Check if qualified column present in the table. This fixes [#6836](https://github.com/ClickHouse/ClickHouse/issues/6836). [#7758](https://github.com/ClickHouse/ClickHouse/pull/7758) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed behavior where an `ALTER MOVE` run immediately after a merge finished could move a superpart of the specified part. Fixes [#8103](https://github.com/ClickHouse/ClickHouse/issues/8103). [#8104](https://github.com/ClickHouse/ClickHouse/pull/8104) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix possible server crash while using `UNION` with different number of columns. Fixes [#7279](https://github.com/ClickHouse/ClickHouse/issues/7279). [#7929](https://github.com/ClickHouse/ClickHouse/pull/7929) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix size of result substring for function `substr` with negative size. [#8589](https://github.com/ClickHouse/ClickHouse/pull/8589) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Now server does not execute part mutation in `MergeTree` if there are not enough free threads in background pool. [#8588](https://github.com/ClickHouse/ClickHouse/pull/8588) ([tavplubix](https://github.com/tavplubix))
* Fix a minor typo on formatting `UNION ALL` AST. [#7999](https://github.com/ClickHouse/ClickHouse/pull/7999) ([litao91](https://github.com/litao91))
* Fixed incorrect bloom filter results for negative numbers. This fixes [#8317](https://github.com/ClickHouse/ClickHouse/issues/8317). [#8566](https://github.com/ClickHouse/ClickHouse/pull/8566) ([Winter Zhang](https://github.com/zhang2014))
* Fixed potential buffer overflow in decompression. A malicious user can pass fabricated compressed data that will cause a read past the end of the buffer. This issue was found by Eldar Zaitov from the Yandex information security team. [#8404](https://github.com/ClickHouse/ClickHouse/pull/8404) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix incorrect result because of integers overflow in `arrayIntersect`. [#7777](https://github.com/ClickHouse/ClickHouse/pull/7777) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Now `OPTIMIZE TABLE` query will not wait for offline replicas to perform the operation. [#8314](https://github.com/ClickHouse/ClickHouse/pull/8314) ([javi santana](https://github.com/javisantana))
* Fixed `ALTER TTL` parser for `Replicated*MergeTree` tables. [#8318](https://github.com/ClickHouse/ClickHouse/pull/8318) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix communication between server and client, so the server reads temporary table info after a query failure. [#8084](https://github.com/ClickHouse/ClickHouse/pull/8084) ([Azat Khuzhin](https://github.com/azat))
* Fix `bitmapAnd` function error when intersecting an aggregated bitmap and a scalar bitmap. [#8082](https://github.com/ClickHouse/ClickHouse/pull/8082) ([Yue Huang](https://github.com/moon03432))
* Refine the definition of `ZXid` according to the ZooKeeper Programmer's Guide which fixes bug in `clickhouse-cluster-copier`. [#8088](https://github.com/ClickHouse/ClickHouse/pull/8088) ([Ding Xiang Fei](https://github.com/dingxiangfei2009))
* `odbc` table function now respects `external_table_functions_use_nulls` setting. [#7506](https://github.com/ClickHouse/ClickHouse/pull/7506) ([Vasily Nemkov](https://github.com/Enmk))
* Fixed a bug that led to a rare data race. [#8143](https://github.com/ClickHouse/ClickHouse/pull/8143) ([Alexander Kazakov](https://github.com/Akazz))
* Now `SYSTEM RELOAD DICTIONARY` reloads a dictionary completely, ignoring `update_field`. This fixes [#7440](https://github.com/ClickHouse/ClickHouse/issues/7440). [#8037](https://github.com/ClickHouse/ClickHouse/pull/8037) ([Vitaly Baranov](https://github.com/vitlibar))
* Add ability to check if dictionary exists in create query. [#8032](https://github.com/ClickHouse/ClickHouse/pull/8032) ([alesapin](https://github.com/alesapin))
* Fix `Float*` parsing in `Values` format. This fixes [#7817](https://github.com/ClickHouse/ClickHouse/issues/7817). [#7870](https://github.com/ClickHouse/ClickHouse/pull/7870) ([tavplubix](https://github.com/tavplubix))
* Fix crash when we cannot reserve space in some background operations of `*MergeTree` table engines family. [#7873](https://github.com/ClickHouse/ClickHouse/pull/7873) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix crash of merge operation when table contains `SimpleAggregateFunction(LowCardinality)` column. This fixes [#8515](https://github.com/ClickHouse/ClickHouse/issues/8515). [#8522](https://github.com/ClickHouse/ClickHouse/pull/8522) ([Azat Khuzhin](https://github.com/azat))
* Restore support of all ICU locales and add the ability to apply collations for constant expressions. Also add language name to `system.collations` table. [#8051](https://github.com/ClickHouse/ClickHouse/pull/8051) ([alesapin](https://github.com/alesapin))
* Fix bug when external dictionaries with zero minimal lifetime (`LIFETIME(MIN 0 MAX N)`, `LIFETIME(N)`) don't update in background. [#7983](https://github.com/ClickHouse/ClickHouse/pull/7983) ([alesapin](https://github.com/alesapin))
* Fix crash when external dictionary with ClickHouse source has subquery in query. [#8351](https://github.com/ClickHouse/ClickHouse/pull/8351) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix incorrect parsing of file extension in table with engine `URL`. This fixes [#8157](https://github.com/ClickHouse/ClickHouse/issues/8157). [#8419](https://github.com/ClickHouse/ClickHouse/pull/8419) ([Andrey Bodrov](https://github.com/apbodrov))
* Fix `CHECK TABLE` query for `*MergeTree` tables without key. Fixes [#7543](https://github.com/ClickHouse/ClickHouse/issues/7543). [#7979](https://github.com/ClickHouse/ClickHouse/pull/7979) ([alesapin](https://github.com/alesapin))
* Fixed conversion of `Float64` to MySQL type. [#8079](https://github.com/ClickHouse/ClickHouse/pull/8079) ([Yuriy Baranov](https://github.com/yurriy))
* Now, if a table was not completely dropped because of a server crash, the server will try to restore and load it. [#8176](https://github.com/ClickHouse/ClickHouse/pull/8176) ([tavplubix](https://github.com/tavplubix))
* Fixed crash in the `file` table function while inserting into a file that doesn't exist. Now, in this case, the file is created and then the insert is processed. [#8177](https://github.com/ClickHouse/ClickHouse/pull/8177) ([Olga Khvostikova](https://github.com/stavrolia))
* Fix rare deadlock which can happen when `trace_log` is enabled. [#7838](https://github.com/ClickHouse/ClickHouse/pull/7838) ([filimonov](https://github.com/filimonov))
* Add ability to work with types other than `Date` in a `RangeHashed` external dictionary created from a DDL query. Fixes [#7899](https://github.com/ClickHouse/ClickHouse/issues/7899). [#8275](https://github.com/ClickHouse/ClickHouse/pull/8275) ([alesapin](https://github.com/alesapin))
* Fix crash when `now64()` is called with the result of another function. [#8270](https://github.com/ClickHouse/ClickHouse/pull/8270) ([Vasily Nemkov](https://github.com/Enmk))
* Fixed bug with detecting the client IP for connections through the MySQL wire protocol. [#7743](https://github.com/ClickHouse/ClickHouse/pull/7743) ([Dmitry Muzyka](https://github.com/dmitriy-myz))
* Fix empty array handling in `arraySplit` function. This fixes [#7708](https://github.com/ClickHouse/ClickHouse/issues/7708). [#7747](https://github.com/ClickHouse/ClickHouse/pull/7747) ([hcz](https://github.com/hczhcz))
* Fixed the issue when `pid-file` of another running `clickhouse-server` may be deleted. [#8487](https://github.com/ClickHouse/ClickHouse/pull/8487) ([Weiqing Xu](https://github.com/weiqxu))
* Fix dictionary reload when the dictionary has an `invalidate_query` that stopped updates after an exception on previous update attempts. [#8029](https://github.com/ClickHouse/ClickHouse/pull/8029) ([alesapin](https://github.com/alesapin))
* Fixed error in function `arrayReduce` that may lead to "double free" and error in aggregate function combinator `Resample` that may lead to memory leak. Added aggregate function `aggThrow`. This function can be used for testing purposes. [#8446](https://github.com/ClickHouse/ClickHouse/pull/8446) ([alexey-milovidov](https://github.com/alexey-milovidov))
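Two of the fixes above, illustrated with hypothetical object names (a hedged sketch, not taken from the release notes):

```sql
-- No longer throws if database `db` does not exist
DROP DICTIONARY IF EXISTS db.dict;

-- Now rejected with an error, because column `x` is listed twice
INSERT INTO tbl (x, y, x) VALUES (1, 2, 3);
```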
### Improvement
* Improved logging when working with `S3` table engine. [#8251](https://github.com/ClickHouse/ClickHouse/pull/8251) ([Grigory Pervakov](https://github.com/GrigoryPervakov))
* Print a help message when no arguments are passed when calling `clickhouse-local`. This fixes [#5335](https://github.com/ClickHouse/ClickHouse/issues/5335). [#8230](https://github.com/ClickHouse/ClickHouse/pull/8230) ([Andrey Nagorny](https://github.com/Melancholic))
* Add setting `mutations_sync` which allows waiting for `ALTER UPDATE/DELETE` queries synchronously (see the sketch after this list). [#8237](https://github.com/ClickHouse/ClickHouse/pull/8237) ([alesapin](https://github.com/alesapin))
* Allow to set up relative `user_files_path` in `config.xml` (in the way similar to `format_schema_path`). [#7632](https://github.com/ClickHouse/ClickHouse/pull/7632) ([hcz](https://github.com/hczhcz))
* Add exception for illegal types for conversion functions with `-OrZero` postfix. [#7880](https://github.com/ClickHouse/ClickHouse/pull/7880) ([Andrey Konyaev](https://github.com/akonyaev90))
* Simplify the format of the header of data sent to a shard in a distributed query. [#8044](https://github.com/ClickHouse/ClickHouse/pull/8044) ([Vitaly Baranov](https://github.com/vitlibar))
* `Live View` table engine refactoring. [#8519](https://github.com/ClickHouse/ClickHouse/pull/8519) ([vzakaznikov](https://github.com/vzakaznikov))
* Add additional checks for external dictionaries created from DDL-queries. [#8127](https://github.com/ClickHouse/ClickHouse/pull/8127) ([alesapin](https://github.com/alesapin))
* Fix error `Column ... already exists` while using `FINAL` and `SAMPLE` together, e.g. `select count() from table final sample 1/2`. Fixes [#5186](https://github.com/ClickHouse/ClickHouse/issues/5186). [#7907](https://github.com/ClickHouse/ClickHouse/pull/7907) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Now the first argument of the `joinGet` function can be a table identifier (also shown in the sketch after this list). [#7707](https://github.com/ClickHouse/ClickHouse/pull/7707) ([Amos Bird](https://github.com/amosbird))
* Allow using `MaterializedView` with subqueries above `Kafka` tables. [#8197](https://github.com/ClickHouse/ClickHouse/pull/8197) ([filimonov](https://github.com/filimonov))
* Now background moves between disks run in a separate thread pool. [#7670](https://github.com/ClickHouse/ClickHouse/pull/7670) ([Vladimir Chebotarev](https://github.com/excitoon))
* `SYSTEM RELOAD DICTIONARY` now executes synchronously. [#8240](https://github.com/ClickHouse/ClickHouse/pull/8240) ([Vitaly Baranov](https://github.com/vitlibar))
* Stack traces now display physical addresses (offsets in object file) instead of virtual memory addresses (where the object file was loaded). That allows the use of `addr2line` when binary is position independent and ASLR is active. This fixes [#8360](https://github.com/ClickHouse/ClickHouse/issues/8360). [#8387](https://github.com/ClickHouse/ClickHouse/pull/8387) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Support new syntax for row-level security filters: `<table name='table_name'>…</table>`. Fixes [#5779](https://github.com/ClickHouse/ClickHouse/issues/5779). [#8381](https://github.com/ClickHouse/ClickHouse/pull/8381) ([Ivan](https://github.com/abyss7))
* Now `cityHash` function can work with `Decimal` and `UUID` types. Fixes [#5184](https://github.com/ClickHouse/ClickHouse/issues/5184). [#7693](https://github.com/ClickHouse/ClickHouse/pull/7693) ([Mikhail Korotov](https://github.com/millb))
* Removed fixed index granularity (it was 1024) from system logs because it's obsolete after implementation of adaptive granularity. [#7698](https://github.com/ClickHouse/ClickHouse/pull/7698) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Enabled MySQL compatibility server when ClickHouse is compiled without SSL. [#7852](https://github.com/ClickHouse/ClickHouse/pull/7852) ([Yuriy Baranov](https://github.com/yurriy))
* Now server checksums distributed batches, which gives more verbose errors in case of corrupted data in batch. [#7914](https://github.com/ClickHouse/ClickHouse/pull/7914) ([Azat Khuzhin](https://github.com/azat))
* Support `DROP DATABASE`, `DETACH TABLE`, `DROP TABLE` and `ATTACH TABLE` for `MySQL` database engine. [#8202](https://github.com/ClickHouse/ClickHouse/pull/8202) ([Winter Zhang](https://github.com/zhang2014))
* Add authentication in S3 table function and table engine. [#7623](https://github.com/ClickHouse/ClickHouse/pull/7623) ([Vladimir Chebotarev](https://github.com/excitoon))
* Added a check for extra parts of `MergeTree` on different disks, in order not to miss data parts on undefined disks. [#8118](https://github.com/ClickHouse/ClickHouse/pull/8118) ([Vladimir Chebotarev](https://github.com/excitoon))
* Enable SSL support for Mac client and server. [#8297](https://github.com/ClickHouse/ClickHouse/pull/8297) ([Ivan](https://github.com/abyss7))
* Now ClickHouse can work as MySQL federated server (see https://dev.mysql.com/doc/refman/5.7/en/federated-create-server.html). [#7717](https://github.com/ClickHouse/ClickHouse/pull/7717) ([Maxim Fedotov](https://github.com/MaxFedotov))
* `clickhouse-client` now only enables `bracketed-paste` when multiquery is on and multiline is off. This fixes [#7757](https://github.com/ClickHouse/ClickHouse/issues/7757). [#7761](https://github.com/ClickHouse/ClickHouse/pull/7761) ([Amos Bird](https://github.com/amosbird))
* Support `Array(Decimal)` in `if` function. [#7721](https://github.com/ClickHouse/ClickHouse/pull/7721) ([Artem Zuikov](https://github.com/4ertus2))
* Support Decimals in `arrayDifference`, `arrayCumSum` and `arrayCumSumNegative` functions. [#7724](https://github.com/ClickHouse/ClickHouse/pull/7724) ([Artem Zuikov](https://github.com/4ertus2))
* Added `lifetime` column to `system.dictionaries` table. [#6820](https://github.com/ClickHouse/ClickHouse/issues/6820) [#7727](https://github.com/ClickHouse/ClickHouse/pull/7727) ([kekekekule](https://github.com/kekekekule))
* Improved check for existing parts on different disks for `*MergeTree` table engines. Addresses [#7660](https://github.com/ClickHouse/ClickHouse/issues/7660). [#8440](https://github.com/ClickHouse/ClickHouse/pull/8440) ([Vladimir Chebotarev](https://github.com/excitoon))
* Integration with `AWS SDK` for `S3` interactions which allows to use all S3 features out of the box. [#8011](https://github.com/ClickHouse/ClickHouse/pull/8011) ([Pavel Kovalenko](https://github.com/Jokser))
* Added support for subqueries in `Live View` tables. [#7792](https://github.com/ClickHouse/ClickHouse/pull/7792) ([vzakaznikov](https://github.com/vzakaznikov))
* The check that `TTL` expressions use a `Date` or `DateTime` column was removed. [#7920](https://github.com/ClickHouse/ClickHouse/pull/7920) ([Vladimir Chebotarev](https://github.com/excitoon))
* Information about disk was added to `system.detached_parts` table. [#7833](https://github.com/ClickHouse/ClickHouse/pull/7833) ([Vladimir Chebotarev](https://github.com/excitoon))
* Now settings `max_(table|partition)_size_to_drop` can be changed without a restart. [#7779](https://github.com/ClickHouse/ClickHouse/pull/7779) ([Grigory Pervakov](https://github.com/GrigoryPervakov))
* Slightly better usability of error messages. Ask user not to remove the lines below `Stack trace:`. [#7897](https://github.com/ClickHouse/ClickHouse/pull/7897) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Better reading messages from `Kafka` engine in various formats after [#7935](https://github.com/ClickHouse/ClickHouse/issues/7935). [#8035](https://github.com/ClickHouse/ClickHouse/pull/8035) ([Ivan](https://github.com/abyss7))
* Better compatibility with MySQL clients which don't support `sha2_password` auth plugin. [#8036](https://github.com/ClickHouse/ClickHouse/pull/8036) ([Yuriy Baranov](https://github.com/yurriy))
* Support more column types in MySQL compatibility server. [#7975](https://github.com/ClickHouse/ClickHouse/pull/7975) ([Yuriy Baranov](https://github.com/yurriy))
* Implement `ORDER BY` optimization for `Merge`, `Buffer` and `MaterializedView` storages with underlying `MergeTree` tables. [#8130](https://github.com/ClickHouse/ClickHouse/pull/8130) ([Anton Popov](https://github.com/CurtizJ))
* Now we always use POSIX implementation of `getrandom` to have better compatibility with old kernels (< 3.17). [#7940](https://github.com/ClickHouse/ClickHouse/pull/7940) ([Amos Bird](https://github.com/amosbird))
* Better check for valid destination in a move TTL rule. [#8410](https://github.com/ClickHouse/ClickHouse/pull/8410) ([Vladimir Chebotarev](https://github.com/excitoon))
* Better checks for broken insert batches for `Distributed` table engine. [#7933](https://github.com/ClickHouse/ClickHouse/pull/7933) ([Azat Khuzhin](https://github.com/azat))
* Add a column with the array of names of parts that a mutation must process in the future to the `system.mutations` table. [#8179](https://github.com/ClickHouse/ClickHouse/pull/8179) ([alesapin](https://github.com/alesapin))
* Parallel merge sort optimization for processors. [#8552](https://github.com/ClickHouse/ClickHouse/pull/8552) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* The setting `mark_cache_min_lifetime` is now obsolete and does nothing. In previous versions, the mark cache could grow in memory larger than `mark_cache_size` to accommodate data within `mark_cache_min_lifetime` seconds. That was leading to confusion and higher memory usage than expected, which is especially bad on memory-constrained systems. If you see performance degradation after installing this release, you should increase `mark_cache_size`. [#8484](https://github.com/ClickHouse/ClickHouse/pull/8484) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Preparation to use `tid` everywhere. This is needed for [#7477](https://github.com/ClickHouse/ClickHouse/issues/7477). [#8276](https://github.com/ClickHouse/ClickHouse/pull/8276) ([alexey-milovidov](https://github.com/alexey-milovidov))
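A hedged sketch of two of the improvements above; the table, column and database names are hypothetical:

```sql
-- mutations_sync = 1: the query waits for the mutation to finish on the current server
ALTER TABLE hits DELETE WHERE event_date < '2019-01-01' SETTINGS mutations_sync = 1;

-- joinGet with a table identifier as the first argument
SELECT joinGet(db.join_state, 'v', toUInt64(1));
```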
### Performance Improvement
* Performance optimizations in processors pipeline. [#7988](https://github.com/ClickHouse/ClickHouse/pull/7988) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Non-blocking updates of expired keys in cache dictionaries (with permission to read old ones). [#8303](https://github.com/ClickHouse/ClickHouse/pull/8303) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Compile ClickHouse without `-fno-omit-frame-pointer` globally to spare one more register. [#8097](https://github.com/ClickHouse/ClickHouse/pull/8097) ([Amos Bird](https://github.com/amosbird))
* Speedup `greatCircleDistance` function and add performance tests for it. [#7307](https://github.com/ClickHouse/ClickHouse/pull/7307) ([Olga Khvostikova](https://github.com/stavrolia))
* Improved performance of function `roundDown`. [#8465](https://github.com/ClickHouse/ClickHouse/pull/8465) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Improved performance of `max`, `min`, `argMin`, `argMax` for `DateTime64` data type. [#8199](https://github.com/ClickHouse/ClickHouse/pull/8199) ([Vasily Nemkov](https://github.com/Enmk))
* Improved performance of sorting without a limit or with big limit and external sorting. [#8545](https://github.com/ClickHouse/ClickHouse/pull/8545) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Improved performance of formatting floating point numbers up to 6 times. [#8542](https://github.com/ClickHouse/ClickHouse/pull/8542) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Improved performance of `modulo` function. [#7750](https://github.com/ClickHouse/ClickHouse/pull/7750) ([Amos Bird](https://github.com/amosbird))
* Optimized `ORDER BY` and merging with single column key. [#8335](https://github.com/ClickHouse/ClickHouse/pull/8335) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Better implementation for `arrayReduce`, `-Array` and `-State` combinators. [#7710](https://github.com/ClickHouse/ClickHouse/pull/7710) ([Amos Bird](https://github.com/amosbird))
* Now `PREWHERE` should be optimized to be at least as efficient as `WHERE`. [#7769](https://github.com/ClickHouse/ClickHouse/pull/7769) ([Amos Bird](https://github.com/amosbird))
* Improve the way `round` and `roundBankers` handle negative numbers. [#8229](https://github.com/ClickHouse/ClickHouse/pull/8229) ([hcz](https://github.com/hczhcz))
* Improved decoding performance of `DoubleDelta` and `Gorilla` codecs by roughly 30-40%. This fixes [#7082](https://github.com/ClickHouse/ClickHouse/issues/7082). [#8019](https://github.com/ClickHouse/ClickHouse/pull/8019) ([Vasily Nemkov](https://github.com/Enmk))
* Improved performance of `base64` related functions. [#8444](https://github.com/ClickHouse/ClickHouse/pull/8444) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added a function `geoDistance`. It is similar to `greatCircleDistance` but uses an approximation of the WGS-84 ellipsoid model. The performance of both functions is nearly the same (see the example after this list). [#8086](https://github.com/ClickHouse/ClickHouse/pull/8086) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Faster `min` and `max` aggregation functions for `Decimal` data type. [#8144](https://github.com/ClickHouse/ClickHouse/pull/8144) ([Artem Zuikov](https://github.com/4ertus2))
* Vectorize processing `arrayReduce`. [#7608](https://github.com/ClickHouse/ClickHouse/pull/7608) ([Amos Bird](https://github.com/amosbird))
* `if` chains are now optimized as `multiIf`. [#8355](https://github.com/ClickHouse/ClickHouse/pull/8355) ([kamalov-ruslan](https://github.com/kamalov-ruslan))
* Fix performance regression of `Kafka` table engine introduced in 19.15. This fixes [#7261](https://github.com/ClickHouse/ClickHouse/issues/7261). [#7935](https://github.com/ClickHouse/ClickHouse/pull/7935) ([filimonov](https://github.com/filimonov))
* Removed "pie" code generation that `gcc` from Debian packages occasionally brings by default. [#8483](https://github.com/ClickHouse/ClickHouse/pull/8483) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Parallel parsing data formats [#6553](https://github.com/ClickHouse/ClickHouse/pull/6553) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Enable optimized parser of `Values` with expressions by default (`input_format_values_deduce_templates_of_expressions=1`). [#8231](https://github.com/ClickHouse/ClickHouse/pull/8231) ([tavplubix](https://github.com/tavplubix))
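For instance, a hedged example of the `geoDistance` function mentioned above; the coordinates (longitude first) are illustrative and the result is in meters:

```sql
-- Approximate distance between Moscow and Saint Petersburg on the WGS-84 ellipsoid
SELECT geoDistance(37.62, 55.75, 30.31, 59.94) AS meters;
```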
### Build/Testing/Packaging Improvement
* Build fixes for `ARM` and in minimal mode. [#8304](https://github.com/ClickHouse/ClickHouse/pull/8304) ([proller](https://github.com/proller))
* Add coverage file flush for `clickhouse-server` when std::atexit is not called. Also slightly improved logging in stateless tests with coverage. [#8267](https://github.com/ClickHouse/ClickHouse/pull/8267) ([alesapin](https://github.com/alesapin))
* Update LLVM library in contrib. Avoid using LLVM from OS packages. [#8258](https://github.com/ClickHouse/ClickHouse/pull/8258) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Make bundled `curl` build fully quiet. [#8232](https://github.com/ClickHouse/ClickHouse/pull/8232) [#8203](https://github.com/ClickHouse/ClickHouse/pull/8203) ([Pavel Kovalenko](https://github.com/Jokser))
* Fix some `MemorySanitizer` warnings. [#8235](https://github.com/ClickHouse/ClickHouse/pull/8235) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Use `add_warning` and `no_warning` macros in `CMakeLists.txt`. [#8604](https://github.com/ClickHouse/ClickHouse/pull/8604) ([Ivan](https://github.com/abyss7))
* Add support for the MinIO S3-compatible object storage (https://min.io/) for better integration tests. [#7863](https://github.com/ClickHouse/ClickHouse/pull/7863) [#7875](https://github.com/ClickHouse/ClickHouse/pull/7875) ([Pavel Kovalenko](https://github.com/Jokser))
* Imported `libc` headers to contrib. This allows making builds more consistent across various systems (only for `x86_64-linux-gnu`). [#5773](https://github.com/ClickHouse/ClickHouse/pull/5773) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Remove `-fPIC` from some libraries. [#8464](https://github.com/ClickHouse/ClickHouse/pull/8464) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Clean `CMakeLists.txt` for curl. See https://github.com/ClickHouse/ClickHouse/pull/8011#issuecomment-569478910 [#8459](https://github.com/ClickHouse/ClickHouse/pull/8459) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Silence warnings in the `CapNProto` library. [#8220](https://github.com/ClickHouse/ClickHouse/pull/8220) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add performance tests for short string optimized hash tables. [#7679](https://github.com/ClickHouse/ClickHouse/pull/7679) ([Amos Bird](https://github.com/amosbird))
* Now ClickHouse will build on `AArch64` even if `MADV_FREE` is not available. This fixes [#8027](https://github.com/ClickHouse/ClickHouse/issues/8027). [#8243](https://github.com/ClickHouse/ClickHouse/pull/8243) ([Amos Bird](https://github.com/amosbird))
* Update `zlib-ng` to fix memory sanitizer problems. [#7182](https://github.com/ClickHouse/ClickHouse/pull/7182) [#8206](https://github.com/ClickHouse/ClickHouse/pull/8206) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Enable internal MySQL library on non-Linux system, because usage of OS packages is very fragile and usually doesn't work at all. This fixes [#5765](https://github.com/ClickHouse/ClickHouse/issues/5765). [#8426](https://github.com/ClickHouse/ClickHouse/pull/8426) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed build on some systems after enabling `libc++`. This supersedes [#8374](https://github.com/ClickHouse/ClickHouse/issues/8374). [#8380](https://github.com/ClickHouse/ClickHouse/pull/8380) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Make `Field` methods more type-safe to find more errors. [#7386](https://github.com/ClickHouse/ClickHouse/pull/7386) [#8209](https://github.com/ClickHouse/ClickHouse/pull/8209) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Added missing files to the `libc-headers` submodule. [#8507](https://github.com/ClickHouse/ClickHouse/pull/8507) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix wrong `JSON` quoting in performance test output. [#8497](https://github.com/ClickHouse/ClickHouse/pull/8497) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Now stack trace is displayed for `std::exception` and `Poco::Exception`. In previous versions it was available only for `DB::Exception`. This improves diagnostics. [#8501](https://github.com/ClickHouse/ClickHouse/pull/8501) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Porting `clock_gettime` and `clock_nanosleep` for fresh glibc versions. [#8054](https://github.com/ClickHouse/ClickHouse/pull/8054) ([Amos Bird](https://github.com/amosbird))
* Enable `part_log` in example config for developers. [#8609](https://github.com/ClickHouse/ClickHouse/pull/8609) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix async nature of reload in `01036_no_superfluous_dict_reload_on_create_database*`. [#8111](https://github.com/ClickHouse/ClickHouse/pull/8111) ([Azat Khuzhin](https://github.com/azat))
* Fixed codec performance tests. [#8615](https://github.com/ClickHouse/ClickHouse/pull/8615) ([Vasily Nemkov](https://github.com/Enmk))
* Add install scripts for `.tgz` build and documentation for them. [#8612](https://github.com/ClickHouse/ClickHouse/pull/8612) [#8591](https://github.com/ClickHouse/ClickHouse/pull/8591) ([alesapin](https://github.com/alesapin))
* Removed old `ZSTD` test (it was created in year 2016 to reproduce the bug that pre 1.0 version of ZSTD has had). This fixes [#8618](https://github.com/ClickHouse/ClickHouse/issues/8618). [#8619](https://github.com/ClickHouse/ClickHouse/pull/8619) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed build on Mac OS Catalina. [#8600](https://github.com/ClickHouse/ClickHouse/pull/8600) ([meo](https://github.com/meob))
* Increased number of rows in codec performance tests to make results noticeable. [#8574](https://github.com/ClickHouse/ClickHouse/pull/8574) ([Vasily Nemkov](https://github.com/Enmk))
* In debug builds, treat `LOGICAL_ERROR` exceptions as assertion failures, so that they are easier to notice. [#8475](https://github.com/ClickHouse/ClickHouse/pull/8475) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Make formats-related performance test more deterministic. [#8477](https://github.com/ClickHouse/ClickHouse/pull/8477) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Update `lz4` to fix a MemorySanitizer failure. [#8181](https://github.com/ClickHouse/ClickHouse/pull/8181) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Suppress a known MemorySanitizer false positive in exception handling. [#8182](https://github.com/ClickHouse/ClickHouse/pull/8182) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Update `gcc` and `g++` to version 9 in `build/docker/build.sh` [#7766](https://github.com/ClickHouse/ClickHouse/pull/7766) ([TLightSky](https://github.com/tlightsky))
* Add performance test case to test that `PREWHERE` is worse than `WHERE`. [#7768](https://github.com/ClickHouse/ClickHouse/pull/7768) ([Amos Bird](https://github.com/amosbird))
* Progress towards fixing one flaky test. [#8621](https://github.com/ClickHouse/ClickHouse/pull/8621) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Avoid MemorySanitizer report for data from `libunwind`. [#8539](https://github.com/ClickHouse/ClickHouse/pull/8539) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Updated `libc++` to the latest version. [#8324](https://github.com/ClickHouse/ClickHouse/pull/8324) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Build ICU library from sources. This fixes [#6460](https://github.com/ClickHouse/ClickHouse/issues/6460). [#8219](https://github.com/ClickHouse/ClickHouse/pull/8219) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Switched from `libressl` to `openssl`. ClickHouse should support TLS 1.3 and SNI after this change. This fixes [#8171](https://github.com/ClickHouse/ClickHouse/issues/8171). [#8218](https://github.com/ClickHouse/ClickHouse/pull/8218) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed UBSan report when using `chacha20_poly1305` from SSL (happens on connect to https://yandex.ru/). [#8214](https://github.com/ClickHouse/ClickHouse/pull/8214) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix mode of default password file for `.deb` linux distros. [#8075](https://github.com/ClickHouse/ClickHouse/pull/8075) ([proller](https://github.com/proller))
* Improved expression for getting `clickhouse-server` PID in `clickhouse-test`. [#8063](https://github.com/ClickHouse/ClickHouse/pull/8063) ([Alexander Kazakov](https://github.com/Akazz))
* Updated contrib/googletest to v1.10.0. [#8587](https://github.com/ClickHouse/ClickHouse/pull/8587) ([Alexander Burmak](https://github.com/Alex-Burmak))
* Fixed ThreadSanitizer report in the `base64` library. Also updated this library to the latest version, but it doesn't matter. This fixes [#8397](https://github.com/ClickHouse/ClickHouse/issues/8397). [#8403](https://github.com/ClickHouse/ClickHouse/pull/8403) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `00600_replace_running_query` for processors. [#8272](https://github.com/ClickHouse/ClickHouse/pull/8272) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Remove support for `tcmalloc` to make `CMakeLists.txt` simpler. [#8310](https://github.com/ClickHouse/ClickHouse/pull/8310) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Release gcc builds now use `libc++` instead of `libstdc++`. Recently `libc++` was used only with clang. This will improve consistency of build configurations and portability. [#8311](https://github.com/ClickHouse/ClickHouse/pull/8311) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Enable ICU library for build with MemorySanitizer. [#8222](https://github.com/ClickHouse/ClickHouse/pull/8222) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Suppress warnings from `CapNProto` library. [#8224](https://github.com/ClickHouse/ClickHouse/pull/8224) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Removed special cases of code for `tcmalloc`, because it's no longer supported. [#8225](https://github.com/ClickHouse/ClickHouse/pull/8225) ([alexey-milovidov](https://github.com/alexey-milovidov))
* In CI coverage task, kill the server gracefully to allow it to save the coverage report. This fixes incomplete coverage reports we've been seeing lately. [#8142](https://github.com/ClickHouse/ClickHouse/pull/8142) ([alesapin](https://github.com/alesapin))
* Performance tests for all codecs against `Float64` and `UInt64` values. [#8349](https://github.com/ClickHouse/ClickHouse/pull/8349) ([Vasily Nemkov](https://github.com/Enmk))
* `termcap` is very much deprecated and leads to various problems (e.g. missing "up" cap and echoing `^J` instead of multi-line). Favor `terminfo` or bundled `ncurses`. [#7737](https://github.com/ClickHouse/ClickHouse/pull/7737) ([Amos Bird](https://github.com/amosbird))
* Fix `test_storage_s3` integration test. [#7734](https://github.com/ClickHouse/ClickHouse/pull/7734) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Support `StorageFile(<format>, null)` to insert a block into the given format file without actually writing to disk. This is required for performance tests (see the example after this list). [#8455](https://github.com/ClickHouse/ClickHouse/pull/8455) ([Amos Bird](https://github.com/amosbird))
* Added argument `--print-time` to functional tests which prints execution time per test. [#8001](https://github.com/ClickHouse/ClickHouse/pull/8001) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Added asserts to `KeyCondition` while evaluating RPN. This fixes a warning from gcc-9. [#8279](https://github.com/ClickHouse/ClickHouse/pull/8279) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Dump cmake options in CI builds. [#8273](https://github.com/ClickHouse/ClickHouse/pull/8273) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Don't generate debug info for some fat libraries. [#8271](https://github.com/ClickHouse/ClickHouse/pull/8271) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Make `log_to_console.xml` always log to stderr, regardless of whether it is interactive or not. [#8395](https://github.com/ClickHouse/ClickHouse/pull/8395) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Removed some unused features from `clickhouse-performance-test` tool. [#8555](https://github.com/ClickHouse/ClickHouse/pull/8555) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Now we will also search for `lld-X` with corresponding `clang-X` version. [#8092](https://github.com/ClickHouse/ClickHouse/pull/8092) ([alesapin](https://github.com/alesapin))
* Parquet build improvement. [#8421](https://github.com/ClickHouse/ClickHouse/pull/8421) ([maxulan](https://github.com/maxulan))
* More GCC warnings. [#8221](https://github.com/ClickHouse/ClickHouse/pull/8221) ([kreuzerkrieg](https://github.com/kreuzerkrieg))
* The package for Arch Linux now allows running the ClickHouse server, not only the client. [#8534](https://github.com/ClickHouse/ClickHouse/pull/8534) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix test with processors. Tiny performance fixes. [#7672](https://github.com/ClickHouse/ClickHouse/pull/7672) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Update contrib/protobuf. [#8256](https://github.com/ClickHouse/ClickHouse/pull/8256) ([Matwey V. Kornilov](https://github.com/matwey))
* In preparation for switching to C++20 as a new year celebration. "May the C++ force be with ClickHouse." [#8447](https://github.com/ClickHouse/ClickHouse/pull/8447) ([Amos Bird](https://github.com/amosbird))
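A minimal sketch of the `StorageFile(<format>, null)` usage mentioned above, assuming the second engine argument is written as the bare literal `null` and using an illustrative table name; the point is to measure formatting cost without touching the disk:

```sql
-- Hypothetical example: blocks are formatted as TSV but never written to disk.
CREATE TABLE format_sink (n UInt64, s String) ENGINE = File(TSV, null);

-- Only the cost of forming and formatting blocks is measured, since nothing is persisted.
INSERT INTO format_sink SELECT number, toString(number) FROM system.numbers LIMIT 1000000;
```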
### Experimental Feature
* Added experimental setting `min_bytes_to_use_mmap_io`. It allows reading big files without copying data from kernel to userspace. The setting is disabled by default. The recommended threshold is about 64 MB, because mmap/munmap is slow (see the sketch after this list). [#8520](https://github.com/ClickHouse/ClickHouse/pull/8520) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Reworked quotas as part of the access control system. Added the new table `system.quotas`, new functions `currentQuota`, `currentQuotaKey`, and new SQL syntax `CREATE QUOTA`, `ALTER QUOTA`, `DROP QUOTA`, `SHOW QUOTA` (see the inspection queries after this list). [#7257](https://github.com/ClickHouse/ClickHouse/pull/7257) ([Vitaly Baranov](https://github.com/vitlibar))
* Allow skipping unknown settings with warnings instead of throwing exceptions. [#7653](https://github.com/ClickHouse/ClickHouse/pull/7653) ([Vitaly Baranov](https://github.com/vitlibar))
* Reworked row policies as part of the access control system. Added the new table `system.row_policies`, new function `currentRowPolicies()`, and new SQL syntax `CREATE POLICY`, `ALTER POLICY`, `DROP POLICY`, `SHOW CREATE POLICY`, `SHOW POLICIES` (see the inspection queries after this list). [#7808](https://github.com/ClickHouse/ClickHouse/pull/7808) ([Vitaly Baranov](https://github.com/vitlibar))
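A minimal sketch of enabling the `min_bytes_to_use_mmap_io` setting described above for a single session; the threshold value and table name are illustrative, not prescribed:

```sql
-- Use mmap for reads of files larger than ~64 MB (the recommended threshold above).
SET min_bytes_to_use_mmap_io = 67108864;

-- `hits` is a placeholder table name; large sequential scans are where mmap reads may help.
SELECT count() FROM hits;
```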
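The new quota and row-policy objects can be inspected with the tables and statements named in the two entries above (no names are assumed beyond those listed there):

```sql
SHOW QUOTA;                         -- quota consumption for the current user
SELECT * FROM system.quotas;        -- all defined quotas
SELECT * FROM system.row_policies;  -- all defined row policies
SELECT currentRowPolicies();        -- row policies applied to the current session
```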
### Security Fix
* Fixed the possibility of reading the directory structure in tables with the `File` table engine. This fixes [#8536](https://github.com/ClickHouse/ClickHouse/issues/8536). [#8537](https://github.com/ClickHouse/ClickHouse/pull/8537) ([alexey-milovidov](https://github.com/alexey-milovidov))
## ClickHouse release v19.17
### ClickHouse release v19.17.6.36, 2019-12-27

View File

@ -12,7 +12,7 @@ If you want to contribute to documentation, read the [Contributing to ClickHouse
## Legal Info
In order for us (YANDEX LLC) to accept patches and other contributions from you, you will have to adopt our Yandex Contributor License Agreement (the "**CLA**"). The current version of the CLA you may find here:
In order for us (YANDEX LLC) to accept patches and other contributions from you, you may adopt our Yandex Contributor License Agreement (the "**CLA**"). The current version of the CLA you may find here:
1) https://yandex.ru/legal/cla/?lang=en (in English) and
2) https://yandex.ru/legal/cla/?lang=ru (in Russian).
@ -37,4 +37,7 @@ Replace the bracketed text as follows:
It is enough to provide us such notification once.
If you don't agree with the CLA, you still can open a pull request to provide your contributions.
As an alternative, you can provide DCO instead of CLA. You can find the text of DCO here: https://developercertificate.org/
It is enough to read and copy it verbatim to your pull request.
If you don't agree with the CLA and don't want to provide DCO, you still can open a pull request to provide your contributions.

View File

@ -9,7 +9,10 @@ cat "$QUERIES_FILE" | sed "s|{table}|\"${TABLE}\"|g" | while read query; do
echo -n "["
for i in $(seq 1 $TRIES); do
while true; do
RES=$(command time -f %e -o /dev/stdout curl -sS --location-trusted -H "Authorization: OAuth $YT_TOKEN" "$YT_PROXY.yt.yandex.net/query?default_format=Null&database=*$YT_CLIQUE_ID" --data-binary @- <<< "$query" 2>/dev/null) && break;
RES=$(command time -f %e -o /dev/stdout curl -sS -G --data-urlencode "query=$query" --data "default_format=Null&max_memory_usage=100000000000&max_memory_usage_for_all_queries=100000000000&max_concurrent_queries_for_user=100&database=*$YT_CLIQUE_ID" --location-trusted -H "Authorization: OAuth $YT_TOKEN" "$YT_PROXY.yt.yandex.net/query" 2>/dev/null);
if [[ $? == 0 ]]; then
[[ $RES =~ 'fail|Exception' ]] || break;
fi
done
[[ "$?" == "0" ]] && echo -n "${RES}" || echo -n "null"

View File

@ -34,10 +34,10 @@ SELECT WatchID, ClientIP, count() AS c, sum(Refresh), avg(ResolutionWidth) FROM
SELECT URL, count() AS c FROM {table} GROUP BY URL ORDER BY c DESC LIMIT 10;
SELECT 1, URL, count() AS c FROM {table} GROUP BY 1, URL ORDER BY c DESC LIMIT 10;
SELECT ClientIP AS x, x - 1, x - 2, x - 3, count() AS c FROM {table} GROUP BY x, x - 1, x - 2, x - 3 ORDER BY c DESC LIMIT 10;
SELECT URL, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-31') AND NOT DontCountHits AND NOT Refresh AND notEmpty(URL) GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
SELECT Title, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-31') AND NOT DontCountHits AND NOT Refresh AND notEmpty(Title) GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
SELECT URL, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-31') AND NOT Refresh AND IsLink AND NOT IsDownload GROUP BY URL ORDER BY PageViews DESC LIMIT 1000;
SELECT TraficSourceID, SearchEngineID, AdvEngineID, ((SearchEngineID = 0 AND AdvEngineID = 0) ? Referer : '') AS Src, URL AS Dst, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-31') AND NOT Refresh GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 1000;
SELECT URLHash, EventDate, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-31') AND NOT Refresh AND TraficSourceID IN (-1, 6) AND RefererHash = halfMD5('http://example.ru/') GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 100;
SELECT WindowClientWidth, WindowClientHeight, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-31') AND NOT Refresh AND NOT DontCountHits AND URLHash = halfMD5('http://example.ru/') GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10000;
SELECT toStartOfMinute(EventTime) AS Minute, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= toDate('2013-07-01') AND EventDate <= toDate('2013-07-02') AND NOT Refresh AND NOT DontCountHits GROUP BY Minute ORDER BY Minute;
SELECT URL, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT DontCountHits AND NOT Refresh AND notEmpty(URL) GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
SELECT Title, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT DontCountHits AND NOT Refresh AND notEmpty(Title) GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
SELECT URL, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND IsLink AND NOT IsDownload GROUP BY URL ORDER BY PageViews DESC LIMIT 1000;
SELECT TraficSourceID, SearchEngineID, AdvEngineID, ((SearchEngineID = 0 AND AdvEngineID = 0) ? Referer : '') AS Src, URL AS Dst, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 1000;
SELECT URLHash, EventDate, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND TraficSourceID IN (-1, 6) AND RefererHash = halfMD5('http://example.ru/') GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 100;
SELECT WindowClientWidth, WindowClientHeight, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND NOT DontCountHits AND URLHash = halfMD5('http://example.ru/') GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10000;
SELECT toStartOfMinute(EventTime) AS Minute, count() AS PageViews FROM {table} WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-02' AND NOT Refresh AND NOT DontCountHits GROUP BY Minute ORDER BY Minute;

View File

@ -513,6 +513,7 @@ private:
if (input.empty())
break;
has_vertical_output_suffix = false;
if (input.ends_with("\\G"))
{
input.resize(input.size() - 2);

View File

@ -65,6 +65,8 @@ void Suggest::loadImpl(Connection & connection, const ConnectionTimeouts & timeo
" UNION ALL "
"SELECT name FROM system.data_type_families"
" UNION ALL "
"SELECT name FROM system.merge_tree_settings"
" UNION ALL "
"SELECT name FROM system.settings"
" UNION ALL "
"SELECT cluster FROM system.clusters"

View File

@ -184,7 +184,7 @@ template <> constexpr bool isDecimalField<DecimalField<Decimal128>>() { return t
class FieldVisitorAccurateEquals : public StaticVisitor<bool>
{
public:
bool operator() (const UInt64 & l, const Null & r) const { return cantCompare(l, r); }
bool operator() (const UInt64 &, const Null &) const { return false; }
bool operator() (const UInt64 & l, const UInt64 & r) const { return l == r; }
bool operator() (const UInt64 & l, const UInt128 & r) const { return cantCompare(l, r); }
bool operator() (const UInt64 & l, const Int64 & r) const { return accurate::equalsOp(l, r); }
@ -194,7 +194,7 @@ public:
bool operator() (const UInt64 & l, const Tuple & r) const { return cantCompare(l, r); }
bool operator() (const UInt64 & l, const AggregateFunctionStateData & r) const { return cantCompare(l, r); }
bool operator() (const Int64 & l, const Null & r) const { return cantCompare(l, r); }
bool operator() (const Int64 &, const Null &) const { return false; }
bool operator() (const Int64 & l, const UInt64 & r) const { return accurate::equalsOp(l, r); }
bool operator() (const Int64 & l, const UInt128 & r) const { return cantCompare(l, r); }
bool operator() (const Int64 & l, const Int64 & r) const { return l == r; }
@ -204,7 +204,7 @@ public:
bool operator() (const Int64 & l, const Tuple & r) const { return cantCompare(l, r); }
bool operator() (const Int64 & l, const AggregateFunctionStateData & r) const { return cantCompare(l, r); }
bool operator() (const Float64 & l, const Null & r) const { return cantCompare(l, r); }
bool operator() (const Float64 &, const Null &) const { return false; }
bool operator() (const Float64 & l, const UInt64 & r) const { return accurate::equalsOp(l, r); }
bool operator() (const Float64 & l, const UInt128 & r) const { return cantCompare(l, r); }
bool operator() (const Float64 & l, const Int64 & r) const { return accurate::equalsOp(l, r); }
@ -227,6 +227,8 @@ public:
return l == r;
if constexpr (std::is_same_v<T, UInt128>)
return stringToUUID(l) == r;
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -237,6 +239,8 @@ public:
return l == r;
if constexpr (std::is_same_v<T, String>)
return l == stringToUUID(r);
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -245,6 +249,8 @@ public:
{
if constexpr (std::is_same_v<T, Array>)
return l == r;
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -253,6 +259,8 @@ public:
{
if constexpr (std::is_same_v<T, Tuple>)
return l == r;
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -263,6 +271,8 @@ public:
return l == r;
if constexpr (std::is_same_v<U, Int64> || std::is_same_v<U, UInt64>)
return l == DecimalField<Decimal128>(r, 0);
if constexpr (std::is_same_v<U, Null>)
return false;
return cantCompare(l, r);
}
@ -289,11 +299,10 @@ private:
}
};
class FieldVisitorAccurateLess : public StaticVisitor<bool>
{
public:
bool operator() (const UInt64 & l, const Null & r) const { return cantCompare(l, r); }
bool operator() (const UInt64 &, const Null &) const { return false; }
bool operator() (const UInt64 & l, const UInt64 & r) const { return l < r; }
bool operator() (const UInt64 & l, const UInt128 & r) const { return cantCompare(l, r); }
bool operator() (const UInt64 & l, const Int64 & r) const { return accurate::lessOp(l, r); }
@ -303,7 +312,7 @@ public:
bool operator() (const UInt64 & l, const Tuple & r) const { return cantCompare(l, r); }
bool operator() (const UInt64 & l, const AggregateFunctionStateData & r) const { return cantCompare(l, r); }
bool operator() (const Int64 & l, const Null & r) const { return cantCompare(l, r); }
bool operator() (const Int64 &, const Null &) const { return false; }
bool operator() (const Int64 & l, const UInt64 & r) const { return accurate::lessOp(l, r); }
bool operator() (const Int64 & l, const UInt128 & r) const { return cantCompare(l, r); }
bool operator() (const Int64 & l, const Int64 & r) const { return l < r; }
@ -313,7 +322,7 @@ public:
bool operator() (const Int64 & l, const Tuple & r) const { return cantCompare(l, r); }
bool operator() (const Int64 & l, const AggregateFunctionStateData & r) const { return cantCompare(l, r); }
bool operator() (const Float64 & l, const Null & r) const { return cantCompare(l, r); }
bool operator() (const Float64 &, const Null &) const { return false; }
bool operator() (const Float64 & l, const UInt64 & r) const { return accurate::lessOp(l, r); }
bool operator() (const Float64 & l, const UInt128 & r) const { return cantCompare(l, r); }
bool operator() (const Float64 & l, const Int64 & r) const { return accurate::lessOp(l, r); }
@ -336,6 +345,8 @@ public:
return l < r;
if constexpr (std::is_same_v<T, UInt128>)
return stringToUUID(l) < r;
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -346,6 +357,8 @@ public:
return l < r;
if constexpr (std::is_same_v<T, String>)
return l < stringToUUID(r);
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -354,6 +367,8 @@ public:
{
if constexpr (std::is_same_v<T, Array>)
return l < r;
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -362,6 +377,8 @@ public:
{
if constexpr (std::is_same_v<T, Tuple>)
return l < r;
if constexpr (std::is_same_v<T, Null>)
return false;
return cantCompare(l, r);
}
@ -370,8 +387,10 @@ public:
{
if constexpr (isDecimalField<U>())
return l < r;
else if constexpr (std::is_same_v<U, Int64> || std::is_same_v<U, UInt64>)
if constexpr (std::is_same_v<U, Int64> || std::is_same_v<U, UInt64>)
return l < DecimalField<Decimal128>(r, 0);
if constexpr (std::is_same_v<U, Null>)
return false;
return cantCompare(l, r);
}

View File

@ -265,7 +265,19 @@ static void toStringEveryLineImpl(const StackTrace::Frames & frames, size_t offs
uintptr_t virtual_offset = object ? uintptr_t(object->address_begin) : 0;
const void * physical_addr = reinterpret_cast<const void *>(uintptr_t(virtual_addr) - virtual_offset);
out << i << ". " << physical_addr << " ";
out << i << ". ";
if (object)
{
if (std::filesystem::exists(object->name))
{
auto dwarf_it = dwarfs.try_emplace(object->name, *object->elf).first;
DB::Dwarf::LocationInfo location;
if (dwarf_it->second.findAddress(uintptr_t(physical_addr), location, DB::Dwarf::LocationInfoMode::FAST))
out << location.file.toString() << ":" << location.line << ": ";
}
}
auto symbol = symbol_index.findSymbol(virtual_addr);
if (symbol)
@ -276,22 +288,8 @@ static void toStringEveryLineImpl(const StackTrace::Frames & frames, size_t offs
else
out << "?";
out << " ";
if (object)
{
if (std::filesystem::exists(object->name))
{
auto dwarf_it = dwarfs.try_emplace(object->name, *object->elf).first;
DB::Dwarf::LocationInfo location;
if (dwarf_it->second.findAddress(uintptr_t(physical_addr), location, DB::Dwarf::LocationInfoMode::FAST))
out << location.file.toString() << ":" << location.line;
}
out << " in " << object->name;
}
else
out << "?";
out << " @ " << physical_addr;
out << " in " << (object ? object->name : "?");
callback(out.str());
out.str({});

View File

@ -151,10 +151,18 @@ public:
func = std::forward<Function>(func),
args = std::make_tuple(std::forward<Args>(args)...)]
{
try
{
/// Thread status holds a raw pointer to the query context, thus it must always be destroyed
/// before sending the signal that permits joining this thread.
DB::ThreadStatus thread_status;
std::apply(func, args);
}
catch (...)
{
state->set();
throw;
}
state->set();
});
}

View File

@ -17,6 +17,8 @@ namespace DB
namespace ErrorCodes
{
extern const int TOO_SLOW;
extern const int LOGICAL_ERROR;
extern const int TIMEOUT_EXCEEDED;
}
static void limitProgressingSpeed(size_t total_progress_size, size_t max_speed_in_seconds, UInt64 total_elapsed_microseconds)
@ -88,4 +90,29 @@ void ExecutionSpeedLimits::throttle(
}
}
static bool handleOverflowMode(OverflowMode mode, const String & message, int code)
{
switch (mode)
{
case OverflowMode::THROW:
throw Exception(message, code);
case OverflowMode::BREAK:
return false;
default:
throw Exception("Logical error: unknown overflow mode", ErrorCodes::LOGICAL_ERROR);
}
}
bool ExecutionSpeedLimits::checkTimeLimit(UInt64 elapsed_ns, OverflowMode overflow_mode)
{
if (max_execution_time != 0
&& elapsed_ns > static_cast<UInt64>(max_execution_time.totalMicroseconds()) * 1000)
return handleOverflowMode(overflow_mode,
"Timeout exceeded: elapsed " + toString(static_cast<double>(elapsed_ns) / 1000000000ULL)
+ " seconds, maximum: " + toString(max_execution_time.totalMicroseconds() / 1000000.0),
ErrorCodes::TIMEOUT_EXCEEDED);
return true;
}
}

View File

@ -2,6 +2,7 @@
#include <Poco/Timespan.h>
#include <Core/Types.h>
#include <DataStreams/SizeLimits.h>
namespace DB
{
@ -23,6 +24,8 @@ public:
/// Pause execution in case if speed limits were exceeded.
void throttle(size_t read_rows, size_t read_bytes, size_t total_rows_to_read, UInt64 total_elapsed_microseconds);
bool checkTimeLimit(UInt64 elapsed_ns, OverflowMode overflow_mode);
};
}

View File

@ -203,30 +203,9 @@ void IBlockInputStream::updateExtremes(Block & block)
}
static bool handleOverflowMode(OverflowMode mode, const String & message, int code)
{
switch (mode)
{
case OverflowMode::THROW:
throw Exception(message, code);
case OverflowMode::BREAK:
return false;
default:
throw Exception("Logical error: unknown overflow mode", ErrorCodes::LOGICAL_ERROR);
}
}
bool IBlockInputStream::checkTimeLimit()
{
if (limits.speed_limits.max_execution_time != 0
&& info.total_stopwatch.elapsed() > static_cast<UInt64>(limits.speed_limits.max_execution_time.totalMicroseconds()) * 1000)
return handleOverflowMode(limits.timeout_overflow_mode,
"Timeout exceeded: elapsed " + toString(info.total_stopwatch.elapsedSeconds())
+ " seconds, maximum: " + toString(limits.speed_limits.max_execution_time.totalMicroseconds() / 1000000.0),
ErrorCodes::TIMEOUT_EXCEEDED);
return true;
return limits.speed_limits.checkTimeLimit(info.total_stopwatch.elapsed(), limits.timeout_overflow_mode);
}

View File

@ -23,6 +23,8 @@ namespace DB
class Context;
/** Create dictionary according to its layout.
*/
class DictionaryFactory : private boost::noncopyable
{
public:

View File

@ -10,10 +10,8 @@
#include <Common/Stopwatch.h>
#include <Processors/ISource.h>
#include <Common/setThreadName.h>
#include <Interpreters/ProcessList.h>
#if !defined(__APPLE__) && !defined(__FreeBSD__)
#include <sched.h>
#endif
namespace DB
{
@ -33,12 +31,13 @@ static bool checkCanAddAdditionalInfoToException(const DB::Exception & exception
&& exception.code() != ErrorCodes::QUERY_WAS_CANCELLED;
}
PipelineExecutor::PipelineExecutor(Processors & processors_)
PipelineExecutor::PipelineExecutor(Processors & processors_, QueryStatus * elem)
: processors(processors_)
, cancelled(false)
, finished(false)
, num_processing_executors(0)
, expand_pipeline_task(nullptr)
, process_list_element(elem)
{
buildGraph();
}
@ -143,8 +142,7 @@ static void executeJob(IProcessor * processor)
catch (Exception & exception)
{
if (checkCanAddAdditionalInfoToException(exception))
exception.addMessage("While executing " + processor->getName() + " ("
+ toString(reinterpret_cast<std::uintptr_t>(processor)) + ") ");
exception.addMessage("While executing " + processor->getName());
throw;
}
}
@ -265,7 +263,9 @@ bool PipelineExecutor::prepareProcessor(UInt64 pid, size_t thread_number, Queue
std::vector<Edge *> updated_direct_edges;
{
/// Stopwatch watch;
#ifndef N_DEBUG
Stopwatch watch;
#endif
std::unique_lock<std::mutex> lock(std::move(node_lock));
@ -279,7 +279,9 @@ bool PipelineExecutor::prepareProcessor(UInt64 pid, size_t thread_number, Queue
return false;
}
/// node.execution_state->preparation_time_ns += watch.elapsed();
#ifndef N_DEBUG
node.execution_state->preparation_time_ns += watch.elapsed();
#endif
node.updated_input_ports.clear();
node.updated_output_ports.clear();
@ -464,14 +466,17 @@ void PipelineExecutor::execute(size_t num_threads)
if (node.execution_state->exception)
std::rethrow_exception(node.execution_state->exception);
}
catch (Exception & exception)
catch (...)
{
if (checkCanAddAdditionalInfoToException(exception))
exception.addMessage("\nCurrent state:\n" + dumpPipeline());
#ifndef N_DEBUG
LOG_TRACE(log, "Exception while executing query. Current state:\n" << dumpPipeline());
#endif
throw;
}
if (process_list_element && process_list_element->isKilled())
throw Exception("Query was cancelled", ErrorCodes::QUERY_WAS_CANCELLED);
if (cancelled)
return;
@ -486,28 +491,15 @@ void PipelineExecutor::execute(size_t num_threads)
void PipelineExecutor::executeSingleThread(size_t thread_num, size_t num_threads)
{
#if !defined(__APPLE__) && !defined(__FreeBSD__)
/// Specify CPU core for thread if can.
/// It may reduce the number of context swithches.
/*
if (num_threads > 1)
{
cpu_set_t cpu_set;
CPU_ZERO(&cpu_set);
CPU_SET(thread_num, &cpu_set);
#ifndef N_DEBUG
UInt64 total_time_ns = 0;
UInt64 execution_time_ns = 0;
UInt64 processing_time_ns = 0;
UInt64 wait_time_ns = 0;
if (sched_setaffinity(0, sizeof(cpu_set_t), &cpu_set) == -1)
LOG_TRACE(log, "Cannot set affinity for thread " << num_threads);
}
*/
Stopwatch total_time_watch;
#endif
// UInt64 total_time_ns = 0;
// UInt64 execution_time_ns = 0;
// UInt64 processing_time_ns = 0;
// UInt64 wait_time_ns = 0;
// Stopwatch total_time_watch;
ExecutionState * state = nullptr;
auto prepare_processor = [&](UInt64 pid, Queue & queue)
@ -585,9 +577,15 @@ void PipelineExecutor::executeSingleThread(size_t thread_num, size_t num_threads
addJob(state);
{
// Stopwatch execution_time_watch;
#ifndef N_DEBUG
Stopwatch execution_time_watch;
#endif
state->job();
// execution_time_ns += execution_time_watch.elapsed();
#ifndef N_DEBUG
execution_time_ns += execution_time_watch.elapsed();
#endif
}
if (state->exception)
@ -596,7 +594,9 @@ void PipelineExecutor::executeSingleThread(size_t thread_num, size_t num_threads
if (finished)
break;
// Stopwatch processing_time_watch;
#ifndef N_DEBUG
Stopwatch processing_time_watch;
#endif
/// Try to execute neighbour processor.
{
@ -648,19 +648,22 @@ void PipelineExecutor::executeSingleThread(size_t thread_num, size_t num_threads
doExpandPipeline(task, false);
}
// processing_time_ns += processing_time_watch.elapsed();
#ifndef N_DEBUG
processing_time_ns += processing_time_watch.elapsed();
#endif
}
}
// total_time_ns = total_time_watch.elapsed();
// wait_time_ns = total_time_ns - execution_time_ns - processing_time_ns;
/*
#ifndef N_DEBUG
total_time_ns = total_time_watch.elapsed();
wait_time_ns = total_time_ns - execution_time_ns - processing_time_ns;
LOG_TRACE(log, "Thread finished."
<< " Total time: " << (total_time_ns / 1e9) << " sec."
<< " Execution time: " << (execution_time_ns / 1e9) << " sec."
<< " Processing time: " << (processing_time_ns / 1e9) << " sec."
<< " Wait time: " << (wait_time_ns / 1e9) << "sec.");
*/
#endif
}
void PipelineExecutor::executeImpl(size_t num_threads)
@ -762,10 +765,18 @@ String PipelineExecutor::dumpPipeline() const
for (auto & node : graph)
{
if (node.execution_state)
node.processor->setDescription(
"(" + std::to_string(node.execution_state->num_executed_jobs) + " jobs, execution time: "
+ std::to_string(node.execution_state->execution_time_ns / 1e9) + " sec., preparation time: "
+ std::to_string(node.execution_state->preparation_time_ns / 1e9) + " sec.)");
{
WriteBufferFromOwnString buffer;
buffer << "(" << node.execution_state->num_executed_jobs << " jobs";
#ifndef N_DEBUG
buffer << ", execution time: " << node.execution_state->execution_time_ns / 1e9 << " sec.";
buffer << ", preparation time: " << node.execution_state->preparation_time_ns / 1e9 << " sec.";
#endif
buffer << ")";
node.processor->setDescription(buffer.str());
}
}
std::vector<IProcessor::Status> statuses;

View File

@ -13,6 +13,7 @@
namespace DB
{
class QueryStatus;
/// Executes query pipeline.
class PipelineExecutor
@ -24,7 +25,7 @@ public:
/// During pipeline execution new processors can appear. They will be added to existing set.
///
/// Explicit graph representation is built in constructor. Throws if graph is not correct.
explicit PipelineExecutor(Processors & processors_);
explicit PipelineExecutor(Processors & processors_, QueryStatus * elem = nullptr);
/// Execute pipeline in multiple threads. Must be called once.
/// In case of exception during execution throws any occurred.
@ -242,6 +243,9 @@ private:
using ProcessorsMap = std::unordered_map<const IProcessor *, UInt64>;
ProcessorsMap processors_map;
/// Now it's used to check if query was killed.
QueryStatus * process_list_element = nullptr;
/// Graph related methods.
bool addEdges(UInt64 node);
void buildGraph();

View File

@ -1,5 +1,6 @@
#include <Processors/Executors/TreeExecutorBlockInputStream.h>
#include <Processors/Sources/SourceWithProgress.h>
#include <Interpreters/ProcessList.h>
#include <stack>
namespace DB
@ -152,6 +153,12 @@ void TreeExecutorBlockInputStream::execute()
case IProcessor::Status::Ready:
{
node->work();
/// This is handled inside PipelineExecutor now,
/// and isn't checked by processors as it was in IInputStream before.
if (process_list_element && process_list_element->isKilled())
throw Exception("Query was cancelled", ErrorCodes::QUERY_WAS_CANCELLED);
break;
}
case IProcessor::Status::Async:
@ -188,6 +195,8 @@ void TreeExecutorBlockInputStream::setProgressCallback(const ProgressCallback &
void TreeExecutorBlockInputStream::setProcessListElement(QueryStatus * elem)
{
process_list_element = elem;
for (auto & source : sources_with_progress)
source->setProcessListElement(elem);
}

View File

@ -46,6 +46,8 @@ private:
/// Remember sources that support progress.
std::vector<ISourceWithProgress *> sources_with_progress;
QueryStatus * process_list_element = nullptr;
void init();
/// Execute tree step-by-step until root returns next chunk or execution is finished.
void execute();

View File

@ -222,7 +222,11 @@ public:
/// In case if query was cancelled executor will wait till all processors finish their jobs.
/// Generally, there is no reason to check this flag. However, it may be reasonable for long operations (e.g. i/o).
bool isCancelled() const { return is_cancelled; }
void cancel() { is_cancelled = true; }
void cancel()
{
is_cancelled = true;
onCancel();
}
virtual ~IProcessor() = default;
@ -275,6 +279,9 @@ public:
void enableQuota() { has_quota = true; }
bool hasQuota() const { return has_quota; }
protected:
virtual void onCancel() {}
private:
std::atomic<bool> is_cancelled{false};

View File

@ -11,7 +11,7 @@ ISource::ISource(Block header)
ISource::Status ISource::prepare()
{
if (finished)
if (finished || isCancelled())
{
output.finish();
return Status::Finished;
@ -46,7 +46,7 @@ void ISource::work()
try
{
current_chunk.chunk = generate();
if (!current_chunk.chunk)
if (!current_chunk.chunk || isCancelled())
finished = true;
else
has_input = true;

View File

@ -76,7 +76,7 @@ LimitTransform::Status LimitTransform::prepare()
if (!input.hasData())
return Status::NeedData;
current_chunk = input.pull();
current_chunk = input.pull(true);
has_block = true;
auto rows = current_chunk.getNumRows();
@ -95,6 +95,7 @@ LimitTransform::Status LimitTransform::prepare()
}
/// Now, we pulled from input, and it must be empty.
input.setNeeded();
return Status::NeedData;
}
@ -114,6 +115,7 @@ LimitTransform::Status LimitTransform::prepare()
}
/// Now, we pulled from input, and it must be empty.
input.setNeeded();
return Status::NeedData;
}

View File

@ -523,6 +523,8 @@ void QueryPipeline::setProgressCallback(const ProgressCallback & callback)
void QueryPipeline::setProcessListElement(QueryStatus * elem)
{
process_list_element = elem;
for (auto & processor : processors)
{
if (auto * source = dynamic_cast<ISourceWithProgress *>(processor.get()))
@ -630,7 +632,7 @@ PipelineExecutorPtr QueryPipeline::execute()
if (!output_format)
throw Exception("Cannot execute pipeline because it doesn't have output.", ErrorCodes::LOGICAL_ERROR);
return std::make_shared<PipelineExecutor>(processors);
return std::make_shared<PipelineExecutor>(processors, process_list_element);
}
}

View File

@ -123,6 +123,8 @@ private:
size_t max_threads = 0;
QueryStatus * process_list_element = nullptr;
void checkInitialized();
void checkSource(const ProcessorPtr & source, bool can_have_totals);

View File

@ -35,7 +35,7 @@ IProcessor::Status SourceFromInputStream::prepare()
is_generating_finished = true;
/// Read postfix and get totals if needed.
if (!is_stream_finished)
if (!is_stream_finished && !isCancelled())
return Status::Ready;
if (has_totals_port)
@ -109,7 +109,7 @@ Chunk SourceFromInputStream::generate()
}
auto block = stream->read();
if (!block)
if (!block && !isCancelled())
{
stream->readSuffix();

View File

@ -30,6 +30,9 @@ public:
void setProgressCallback(const ProgressCallback & callback) final { stream->setProgressCallback(callback); }
void addTotalRowsApprox(size_t value) final { stream->addTotalRowsApprox(value); }
protected:
void onCancel() override { stream->cancel(false); }
private:
bool has_aggregate_functions = false;
bool force_add_aggregating_info;

View File

@ -10,6 +10,15 @@ namespace ErrorCodes
{
extern const int TOO_MANY_ROWS;
extern const int TOO_MANY_BYTES;
extern const int TIMEOUT_EXCEEDED;
}
void SourceWithProgress::work()
{
if (!limits.speed_limits.checkTimeLimit(total_stopwatch.elapsed(), limits.timeout_overflow_mode))
cancel();
else
ISourceWithProgress::work();
}
/// Aggregated copy-paste from IBlockInputStream::progressImpl.

View File

@ -58,6 +58,8 @@ protected:
/// Call this method to provide information about progress.
void progress(const Progress & value);
void work() override;
private:
LocalLimits limits;
std::shared_ptr<QuotaContext> quota;

View File

@ -6,6 +6,7 @@
#include <Processors/ISource.h>
#include <Processors/Transforms/MergingAggregatedMemoryEfficientTransform.h>
namespace ProfileEvents
{
extern const Event ExternalAggregationMerge;
@ -14,9 +15,11 @@ namespace ProfileEvents
namespace DB
{
namespace
{
/// Convert block to chunk.
/// Adds additional info about aggregation.
static Chunk convertToChunk(const Block & block)
Chunk convertToChunk(const Block & block)
{
auto info = std::make_shared<AggregatedChunkInfo>();
info->bucket_num = block.info.bucket_num;
@ -29,8 +32,19 @@ static Chunk convertToChunk(const Block & block)
return chunk;
}
namespace
const AggregatedChunkInfo * getInfoFromChunk(const Chunk & chunk)
{
auto & info = chunk.getChunkInfo();
if (!info)
throw Exception("Chunk info was not set for chunk.", ErrorCodes::LOGICAL_ERROR);
auto * agg_info = typeid_cast<const AggregatedChunkInfo *>(info.get());
if (!agg_info)
throw Exception("Chunk should have AggregatedChunkInfo.", ErrorCodes::LOGICAL_ERROR);
return agg_info;
}
/// Reads chunks from file in native format. Provide chunks with aggregation info.
class SourceFromNativeStream : public ISource
{
@ -77,13 +91,13 @@ public:
struct SharedData
{
std::atomic<UInt32> next_bucket_to_merge = 0;
std::array<std::atomic<Int32>, NUM_BUCKETS> source_for_bucket;
std::array<std::atomic<bool>, NUM_BUCKETS> is_bucket_processed;
std::atomic<bool> is_cancelled = false;
SharedData()
{
for (auto & source : source_for_bucket)
source = -1;
for (auto & flag : is_bucket_processed)
flag = false;
}
};
@ -93,13 +107,11 @@ public:
AggregatingTransformParamsPtr params_,
ManyAggregatedDataVariantsPtr data_,
SharedDataPtr shared_data_,
Int32 source_number_,
Arena * arena_)
: ISource(params_->getHeader())
, params(std::move(params_))
, data(std::move(data_))
, shared_data(std::move(shared_data_))
, source_number(source_number_)
, arena(arena_)
{}
@ -116,7 +128,7 @@ protected:
Block block = params->aggregator.mergeAndConvertOneBucketToBlock(*data, arena, params->final, bucket_num, &shared_data->is_cancelled);
Chunk chunk = convertToChunk(block);
shared_data->source_for_bucket[bucket_num] = source_number;
shared_data->is_bucket_processed[bucket_num] = true;
return chunk;
}
@ -125,7 +137,6 @@ private:
AggregatingTransformParamsPtr params;
ManyAggregatedDataVariantsPtr data;
SharedDataPtr shared_data;
Int32 source_number;
Arena * arena;
};
@ -249,16 +260,23 @@ private:
{
auto & output = outputs.front();
Int32 next_input_num = shared_data->source_for_bucket[current_bucket_num];
if (next_input_num < 0)
for (auto & input : inputs)
{
if (!input.isFinished() && input.hasData())
{
auto chunk = input.pull();
auto bucket = getInfoFromChunk(chunk)->bucket_num;
chunks[bucket] = std::move(chunk);
}
}
if (!shared_data->is_bucket_processed[current_bucket_num])
return Status::NeedData;
auto next_input = std::next(inputs.begin(), next_input_num);
/// next_input can't be finished till data was not pulled.
if (!next_input->hasData())
if (!chunks[current_bucket_num])
return Status::NeedData;
output.push(next_input->pull());
output.push(std::move(chunks[current_bucket_num]));
++current_bucket_num;
if (current_bucket_num == NUM_BUCKETS)
@ -286,6 +304,7 @@ private:
UInt32 current_bucket_num = 0;
static constexpr Int32 NUM_BUCKETS = 256;
std::array<Chunk, NUM_BUCKETS> chunks;
Processors processors;
@ -359,7 +378,7 @@ private:
{
Arena * arena = first->aggregates_pools.at(thread).get();
auto source = std::make_shared<ConvertingAggregatedToChunksSource>(
params, data, shared_data, thread, arena);
params, data, shared_data, arena);
processors.emplace_back(std::move(source));
}

View File

@ -17,20 +17,6 @@ namespace ErrorCodes
}
static bool handleOverflowMode(OverflowMode mode, const String & message, int code)
{
switch (mode)
{
case OverflowMode::THROW:
throw Exception(message, code);
case OverflowMode::BREAK:
return false;
default:
throw Exception("Logical error: unknown overflow mode", ErrorCodes::LOGICAL_ERROR);
}
}
void ProcessorProfileInfo::update(const Chunk & block)
{
++blocks;
@ -44,13 +30,6 @@ LimitsCheckingTransform::LimitsCheckingTransform(const Block & header_, LocalLim
{
}
//LimitsCheckingTransform::LimitsCheckingTransform(const Block & header, LocalLimits limits, QueryStatus * process_list_elem)
// : ISimpleTransform(header, header, false)
// , limits(std::move(limits))
// , process_list_elem(process_list_elem)
//{
//}
void LimitsCheckingTransform::transform(Chunk & chunk)
{
if (!info.started)
@ -59,7 +38,7 @@ void LimitsCheckingTransform::transform(Chunk & chunk)
info.started = true;
}
if (!checkTimeLimit())
if (!limits.speed_limits.checkTimeLimit(info.total_stopwatch.elapsed(), limits.timeout_overflow_mode))
{
stopReading();
return;
@ -78,18 +57,6 @@ void LimitsCheckingTransform::transform(Chunk & chunk)
}
}
bool LimitsCheckingTransform::checkTimeLimit()
{
if (limits.speed_limits.max_execution_time != 0
&& info.total_stopwatch.elapsed() > static_cast<UInt64>(limits.speed_limits.max_execution_time.totalMicroseconds()) * 1000)
return handleOverflowMode(limits.timeout_overflow_mode,
"Timeout exceeded: elapsed " + toString(info.total_stopwatch.elapsedSeconds())
+ " seconds, maximum: " + toString(limits.speed_limits.max_execution_time.totalMicroseconds() / 1000000.0),
ErrorCodes::TIMEOUT_EXCEEDED);
return true;
}
void LimitsCheckingTransform::checkQuota(Chunk & chunk)
{
switch (limits.mode)

View File

@ -29,10 +29,7 @@ public:
using LocalLimits = IBlockInputStream::LocalLimits;
using LimitsMode = IBlockInputStream::LimitsMode;
/// LIMITS_CURRENT
LimitsCheckingTransform(const Block & header_, LocalLimits limits_);
/// LIMITS_TOTAL
/// LimitsCheckingTransform(const Block & header, LocalLimits limits, QueryStatus * process_list_elem);
String getName() const override { return "LimitsCheckingTransform"; }

View File

@ -208,6 +208,12 @@ void ColumnsDescription::flattenNested()
continue;
}
if (!type_tuple->haveExplicitNames())
{
++it;
continue;
}
ColumnDescription column = std::move(*it);
it = columns.get<0>().erase(it);

View File

@ -117,6 +117,9 @@ public:
/// Returns true if the blocks shouldn't be pushed to associated views on insert.
virtual bool noPushingToViews() const { return false; }
/// Read query returns streams which automatically distribute data between themselves.
/// So, it's impossible for one stream to run out of data when there is data in other streams.
/// Example is StorageSystemNumbers.
virtual bool hasEvenlyDistributedRead() const { return false; }
/// Optional size information of each physical column.

View File

@ -412,7 +412,7 @@ bool StorageKafka::streamToViews()
void registerStorageKafka(StorageFactory & factory)
{
factory.registerStorage("Kafka", [](const StorageFactory::Arguments & args)
auto creator_fn = [](const StorageFactory::Arguments & args)
{
ASTs & engine_args = args.engine_args;
size_t args_count = engine_args.size();
@ -637,7 +637,9 @@ void registerStorageKafka(StorageFactory & factory)
return StorageKafka::create(
args.table_id, args.context, args.columns,
brokers, group, topics, format, row_delimiter, schema, num_consumers, max_block_size, skip_broken, intermediate_commit);
});
};
factory.registerStorage("Kafka", creator_fn, StorageFactory::StorageFeatures{ .supports_settings = true, });
}

View File

@ -669,21 +669,31 @@ static StoragePtr create(const StorageFactory::Arguments & args)
void registerStorageMergeTree(StorageFactory & factory)
{
factory.registerStorage("MergeTree", create);
factory.registerStorage("CollapsingMergeTree", create);
factory.registerStorage("ReplacingMergeTree", create);
factory.registerStorage("AggregatingMergeTree", create);
factory.registerStorage("SummingMergeTree", create);
factory.registerStorage("GraphiteMergeTree", create);
factory.registerStorage("VersionedCollapsingMergeTree", create);
StorageFactory::StorageFeatures features{
.supports_settings = true,
.supports_skipping_indices = true,
.supports_sort_order = true,
.supports_ttl = true,
};
factory.registerStorage("ReplicatedMergeTree", create);
factory.registerStorage("ReplicatedCollapsingMergeTree", create);
factory.registerStorage("ReplicatedReplacingMergeTree", create);
factory.registerStorage("ReplicatedAggregatingMergeTree", create);
factory.registerStorage("ReplicatedSummingMergeTree", create);
factory.registerStorage("ReplicatedGraphiteMergeTree", create);
factory.registerStorage("ReplicatedVersionedCollapsingMergeTree", create);
factory.registerStorage("MergeTree", create, features);
factory.registerStorage("CollapsingMergeTree", create, features);
factory.registerStorage("ReplacingMergeTree", create, features);
factory.registerStorage("AggregatingMergeTree", create, features);
factory.registerStorage("SummingMergeTree", create, features);
factory.registerStorage("GraphiteMergeTree", create, features);
factory.registerStorage("VersionedCollapsingMergeTree", create, features);
features.supports_replication = true;
features.supports_deduplication = true;
factory.registerStorage("ReplicatedMergeTree", create, features);
factory.registerStorage("ReplicatedCollapsingMergeTree", create, features);
factory.registerStorage("ReplicatedReplacingMergeTree", create, features);
factory.registerStorage("ReplicatedAggregatingMergeTree", create, features);
factory.registerStorage("ReplicatedSummingMergeTree", create, features);
factory.registerStorage("ReplicatedGraphiteMergeTree", create, features);
factory.registerStorage("ReplicatedVersionedCollapsingMergeTree", create, features);
}
}

View File

@ -31,9 +31,9 @@ static void checkAllTypesAreAllowedInTable(const NamesAndTypesList & names_and_t
}
void StorageFactory::registerStorage(const std::string & name, Creator creator)
void StorageFactory::registerStorage(const std::string & name, CreatorFn creator_fn, StorageFeatures features)
{
if (!storages.emplace(name, std::move(creator)).second)
if (!storages.emplace(name, Creator{std::move(creator_fn), features}).second)
throw Exception("TableFunctionFactory: the table function name '" + name + "' is not unique",
ErrorCodes::LOGICAL_ERROR);
}
@ -93,24 +93,6 @@ StoragePtr StorageFactory::get(
name = engine_def.name;
if (storage_def->settings && !endsWith(name, "MergeTree") && name != "Kafka" && name != "Join")
{
throw Exception(
"Engine " + name + " doesn't support SETTINGS clause. "
"Currently only the MergeTree family of engines, Kafka engine and Join engine support it",
ErrorCodes::BAD_ARGUMENTS);
}
if ((storage_def->partition_by || storage_def->primary_key || storage_def->order_by || storage_def->sample_by ||
storage_def->ttl_table || !columns.getColumnTTLs().empty() ||
(query.columns_list && query.columns_list->indices && !query.columns_list->indices->children.empty()))
&& !endsWith(name, "MergeTree"))
{
throw Exception(
"Engine " + name + " doesn't support PARTITION BY, PRIMARY KEY, ORDER BY, TTL or SAMPLE BY clauses and skipping indices. "
"Currently only the MergeTree family of engines supports them", ErrorCodes::BAD_ARGUMENTS);
}
if (name == "View")
{
throw Exception(
@ -129,8 +111,6 @@ StoragePtr StorageFactory::get(
"Direct creation of tables with ENGINE LiveView is not supported, use CREATE LIVE VIEW statement",
ErrorCodes::INCORRECT_QUERY);
}
}
}
auto it = storages.find(name);
if (it == storages.end())
@ -142,6 +122,46 @@ StoragePtr StorageFactory::get(
throw Exception("Unknown table engine " + name, ErrorCodes::UNKNOWN_STORAGE);
}
auto checkFeature = [&](String feature_description, FeatureMatcherFn feature_matcher_fn)
{
if (!feature_matcher_fn(it->second.features))
{
String msg = "Engine " + name + " doesn't support " + feature_description + ". "
"Currently only the following engines have support for the feature: [";
auto supporting_engines = getAllRegisteredNamesByFeatureMatcherFn(feature_matcher_fn);
for (size_t index = 0; index < supporting_engines.size(); ++index)
{
if (index)
msg += ", ";
msg += supporting_engines[index];
}
msg += "]";
throw Exception(msg, ErrorCodes::BAD_ARGUMENTS);
}
};
if (storage_def->settings)
checkFeature(
"SETTINGS clause",
[](StorageFeatures features) { return features.supports_settings; });
if (storage_def->partition_by || storage_def->primary_key || storage_def->order_by || storage_def->sample_by)
checkFeature(
"PARTITION_BY, PRIMARY_KEY, ORDER_BY or SAMPLE_BY clauses",
[](StorageFeatures features) { return features.supports_sort_order; });
if (storage_def->ttl_table || !columns.getColumnTTLs().empty())
checkFeature(
"TTL clause",
[](StorageFeatures features) { return features.supports_ttl; });
if (query.columns_list && query.columns_list->indices && !query.columns_list->indices->children.empty())
checkFeature(
"skipping indices",
[](StorageFeatures features) { return features.supports_skipping_indices; });
}
}
Arguments arguments
{
.engine_name = name,
@ -158,7 +178,7 @@ StoragePtr StorageFactory::get(
.has_force_restore_data_flag = has_force_restore_data_flag
};
return it->second(arguments);
return storages.at(name).creator_fn(arguments);
}
StorageFactory & StorageFactory::instance()

View File

@ -46,7 +46,24 @@ public:
bool has_force_restore_data_flag;
};
using Creator = std::function<StoragePtr(const Arguments & arguments)>;
struct StorageFeatures
{
bool supports_settings = false;
bool supports_skipping_indices = false;
bool supports_sort_order = false;
bool supports_ttl = false;
bool supports_replication = false;
bool supports_deduplication = false;
};
using CreatorFn = std::function<StoragePtr(const Arguments & arguments)>;
struct Creator
{
CreatorFn creator_fn;
StorageFeatures features;
};
using Storages = std::unordered_map<std::string, Creator>;
StoragePtr get(
const ASTCreateQuery & query,
@ -59,9 +76,16 @@ public:
/// Register a table engine by its name.
/// No locking, you must register all engines before usage of get.
void registerStorage(const std::string & name, Creator creator);
void registerStorage(const std::string & name, CreatorFn creator_fn, StorageFeatures features = StorageFeatures{
.supports_settings = false,
.supports_skipping_indices = false,
.supports_sort_order = false,
.supports_ttl = false,
.supports_replication = false,
.supports_deduplication = false,
});
const auto & getAllStorages() const
const Storages & getAllStorages() const
{
return storages;
}
@ -74,8 +98,18 @@ public:
return result;
}
using FeatureMatcherFn = std::function<bool(StorageFeatures)>;
std::vector<String> getAllRegisteredNamesByFeatureMatcherFn(FeatureMatcherFn feature_matcher_fn) const
{
std::vector<String> result;
for (const auto& pair : storages)
if (feature_matcher_fn(pair.second.features))
result.push_back(pair.first);
return result;
}
private:
using Storages = std::unordered_map<std::string, Creator>;
Storages storages;
};

View File

@ -203,11 +203,12 @@ class StorageFileBlockInputStream : public IBlockInputStream
{
public:
StorageFileBlockInputStream(std::shared_ptr<StorageFile> storage_,
const Context & context, UInt64 max_block_size,
const Context & context_, UInt64 max_block_size_,
std::string file_path_, bool need_path, bool need_file,
const CompressionMethod compression_method,
const CompressionMethod compression_method_,
BlockInputStreamPtr prepared_reader = nullptr)
: storage(std::move(storage_)), reader(std::move(prepared_reader))
: storage(std::move(storage_)), reader(std::move(prepared_reader)),
context(context_), max_block_size(max_block_size_), compression_method(compression_method_)
{
if (storage->use_table_fd)
{
@ -227,7 +228,6 @@ public:
}
storage->table_fd_was_used = true;
read_buf = wrapReadBufferWithCompressionMethod(std::make_unique<ReadBufferFromFileDescriptor>(storage->table_fd), compression_method);
}
else
{
@ -235,11 +235,7 @@ public:
file_path = std::make_optional(file_path_);
with_file_column = need_file;
with_path_column = need_path;
read_buf = wrapReadBufferWithCompressionMethod(std::make_unique<ReadBufferFromFile>(file_path.value()), compression_method);
}
if (!reader)
reader = FormatFactory::instance().getInput(storage->format_name, *read_buf, storage->getSampleBlock(), context, max_block_size);
}
String getName() const override
@ -249,7 +245,21 @@ public:
Block readImpl() override
{
/// Open file lazily on first read. This is needed to avoid too many open files from different streams.
if (!reader)
{
read_buf = wrapReadBufferWithCompressionMethod(storage->use_table_fd
? std::make_unique<ReadBufferFromFileDescriptor>(storage->table_fd)
: std::make_unique<ReadBufferFromFile>(file_path.value()),
compression_method);
reader = FormatFactory::instance().getInput(storage->format_name, *read_buf, storage->getSampleBlock(), context, max_block_size);
reader->readPrefix();
}
auto res = reader->read();
/// Enrich with virtual columns.
if (res && file_path)
{
if (with_path_column)
@ -263,12 +273,22 @@ public:
std::make_shared<DataTypeString>(), "_file"});
}
}
/// Close the file prematurely if the stream has ended.
if (!res)
{
reader->readSuffix();
reader.reset();
read_buf.reset();
}
return res;
}
Block getHeader() const override
{
auto res = reader->getHeader();
auto res = storage->getSampleBlock();
if (res && file_path)
{
if (with_path_column)
@ -276,19 +296,10 @@ public:
if (with_file_column)
res.insert({DataTypeString().createColumn(), std::make_shared<DataTypeString>(), "_file"});
}
return res;
}
void readPrefixImpl() override
{
reader->readPrefix();
}
void readSuffixImpl() override
{
reader->readSuffix();
}
private:
std::shared_ptr<StorageFile> storage;
std::optional<std::string> file_path;
@ -298,6 +309,10 @@ private:
std::unique_ptr<ReadBuffer> read_buf;
BlockInputStreamPtr reader;
const Context & context; /// TODO Untangle potential issues with context lifetime.
UInt64 max_block_size;
const CompressionMethod compression_method;
std::shared_lock<std::shared_mutex> shared_lock;
std::unique_lock<std::shared_mutex> unique_lock;
};
@ -314,13 +329,13 @@ BlockInputStreams StorageFile::read(
const ColumnsDescription & columns_ = getColumns();
auto column_defaults = columns_.getDefaults();
BlockInputStreams blocks_input;
if (use_table_fd) /// need to call ctr BlockInputStream
paths = {""}; /// when use fd, paths are empty
else
{
if (paths.size() == 1 && !Poco::File(paths[0]).exists())
throw Exception("File " + paths[0] + " doesn't exist", ErrorCodes::FILE_DOESNT_EXIST);
}
blocks_input.reserve(paths.size());
bool need_path_column = false;
bool need_file_column = false;

View File

@ -16,7 +16,7 @@ namespace ErrorCodes
}
StorageInput::StorageInput(const String & table_name_, const ColumnsDescription & columns_)
: IStorage({"", table_name_}, columns_)
: IStorage({"", table_name_})
{
setColumns(columns_);
}

View File

@ -95,7 +95,7 @@ size_t StorageJoin::getSize() const { return join->getTotalRowCount(); }
void registerStorageJoin(StorageFactory & factory)
{
factory.registerStorage("Join", [](const StorageFactory::Arguments & args)
auto creator_fn = [](const StorageFactory::Arguments & args)
{
/// Join(ANY, LEFT, k1, k2, ...)
@ -209,7 +209,9 @@ void registerStorageJoin(StorageFactory & factory)
args.constraints,
join_any_take_last_row,
args.context);
});
};
factory.registerStorage("Join", creator_fn, StorageFactory::StorageFeatures{ .supports_settings = true, });
}
template <typename T>

View File

@ -5,6 +5,9 @@
#include <DataStreams/LimitBlockInputStream.h>
#include <Storages/System/StorageSystemNumbers.h>
#include <Processors/Sources/SourceWithProgress.h>
#include <Processors/Pipe.h>
#include <Processors/LimitTransform.h>
namespace DB
{
@ -12,21 +15,16 @@ namespace DB
namespace
{
class NumbersBlockInputStream : public IBlockInputStream
class NumbersSource : public SourceWithProgress
{
public:
NumbersBlockInputStream(UInt64 block_size_, UInt64 offset_, UInt64 step_)
: block_size(block_size_), next(offset_), step(step_) {}
NumbersSource(UInt64 block_size_, UInt64 offset_, UInt64 step_)
: SourceWithProgress(createHeader()), block_size(block_size_), next(offset_), step(step_) {}
String getName() const override { return "Numbers"; }
Block getHeader() const override
{
return { ColumnWithTypeAndName(ColumnUInt64::create(), std::make_shared<DataTypeUInt64>(), "number") };
}
protected:
Block readImpl() override
Chunk generate() override
{
auto column = ColumnUInt64::create(block_size);
ColumnUInt64::Container & vec = column->getData();
@ -38,12 +36,21 @@ protected:
*pos++ = curr++;
next += step;
return { ColumnWithTypeAndName(std::move(column), std::make_shared<DataTypeUInt64>(), "number") };
progress({column->size(), column->byteSize()});
return { Columns {std::move(column)}, block_size };
}
private:
UInt64 block_size;
UInt64 next;
UInt64 step;
static Block createHeader()
{
return { ColumnWithTypeAndName(ColumnUInt64::create(), std::make_shared<DataTypeUInt64>(), "number") };
}
};
@ -55,21 +62,19 @@ struct NumbersMultiThreadedState
using NumbersMultiThreadedStatePtr = std::shared_ptr<NumbersMultiThreadedState>;
class NumbersMultiThreadedBlockInputStream : public IBlockInputStream
class NumbersMultiThreadedSource : public SourceWithProgress
{
public:
NumbersMultiThreadedBlockInputStream(NumbersMultiThreadedStatePtr state_, UInt64 block_size_, UInt64 max_counter_)
: state(std::move(state_)), block_size(block_size_), max_counter(max_counter_) {}
NumbersMultiThreadedSource(NumbersMultiThreadedStatePtr state_, UInt64 block_size_, UInt64 max_counter_)
: SourceWithProgress(createHeader())
, state(std::move(state_))
, block_size(block_size_)
, max_counter(max_counter_) {}
String getName() const override { return "NumbersMt"; }
Block getHeader() const override
{
return { ColumnWithTypeAndName(ColumnUInt64::create(), std::make_shared<DataTypeUInt64>(), "number") };
}
protected:
Block readImpl() override
Chunk generate() override
{
if (block_size == 0)
return {};
@ -90,7 +95,9 @@ protected:
while (pos < end)
*pos++ = curr++;
return { ColumnWithTypeAndName(std::move(column), std::make_shared<DataTypeUInt64>(), "number") };
progress({column->size(), column->byteSize()});
return { Columns {std::move(column)}, block_size };
}
private:
@ -98,6 +105,11 @@ private:
UInt64 block_size;
UInt64 max_counter;
Block createHeader() const
{
return { ColumnWithTypeAndName(ColumnUInt64::create(), std::make_shared<DataTypeUInt64>(), "number") };
}
};
}
@ -109,7 +121,7 @@ StorageSystemNumbers::StorageSystemNumbers(const std::string & name_, bool multi
setColumns(ColumnsDescription({{"number", std::make_shared<DataTypeUInt64>()}}));
}
BlockInputStreams StorageSystemNumbers::read(
Pipes StorageSystemNumbers::readWithProcessors(
const Names & column_names,
const SelectQueryInfo &,
const Context & /*context*/,
@ -128,7 +140,8 @@ BlockInputStreams StorageSystemNumbers::read(
if (!multithreaded)
num_streams = 1;
BlockInputStreams res(num_streams);
Pipes res;
res.reserve(num_streams);
if (num_streams > 1 && !even_distribution && *limit)
{
@ -136,17 +149,26 @@ BlockInputStreams StorageSystemNumbers::read(
UInt64 max_counter = offset + *limit;
for (size_t i = 0; i < num_streams; ++i)
res[i] = std::make_shared<NumbersMultiThreadedBlockInputStream>(state, max_block_size, max_counter);
res.emplace_back(std::make_shared<NumbersMultiThreadedSource>(state, max_block_size, max_counter));
return res;
}
for (size_t i = 0; i < num_streams; ++i)
{
res[i] = std::make_shared<NumbersBlockInputStream>(max_block_size, offset + i * max_block_size, num_streams * max_block_size);
auto source = std::make_shared<NumbersSource>(max_block_size, offset + i * max_block_size, num_streams * max_block_size);
if (limit) /// This formula is how to split 'limit' elements to 'num_streams' chunks almost uniformly.
res[i] = std::make_shared<LimitBlockInputStream>(res[i], *limit * (i + 1) / num_streams - *limit * i / num_streams, 0, false, true);
if (limit && i == 0)
source->addTotalRowsApprox(*limit);
res.emplace_back(std::move(source));
if (limit)
{
/// This formula is how to split 'limit' elements to 'num_streams' chunks almost uniformly.
res.back().addSimpleTransform(std::make_shared<LimitTransform>(
res.back().getHeader(), *limit * (i + 1) / num_streams - *limit * i / num_streams, 0, false));
}
}
return res;
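
The per-stream limit above relies on the expression `limit * (i + 1) / num_streams - limit * i / num_streams`. A minimal sketch (plain Python with hypothetical values, integer division standing in for the C++ UInt64 arithmetic) shows that the chunks cover the limit exactly and differ in size by at most one row:

```python
def split_limit(limit, num_streams):
    """Split 'limit' rows into 'num_streams' chunks almost uniformly."""
    return [limit * (i + 1) // num_streams - limit * i // num_streams
            for i in range(num_streams)]

chunks = split_limit(10, 3)
assert sum(chunks) == 10                # nothing is lost or duplicated
assert max(chunks) - min(chunks) <= 1   # sizes differ by at most one row
print(chunks)                           # [3, 3, 4]
```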

View File

@ -29,7 +29,7 @@ class StorageSystemNumbers : public ext::shared_ptr_helper<StorageSystemNumbers>
public:
std::string getName() const override { return "SystemNumbers"; }
BlockInputStreams read(
Pipes readWithProcessors(
const Names & column_names,
const SelectQueryInfo & query_info,
const Context & context,
@ -37,6 +37,8 @@ public:
size_t max_block_size,
unsigned num_streams) override;
bool supportProcessorsPipeline() const override { return true; }
bool hasEvenlyDistributedRead() const override { return true; }
private:

View File

@ -1,4 +1,5 @@
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <Storages/StorageFactory.h>
#include <Storages/System/StorageSystemTableEngines.h>
@ -7,15 +8,26 @@ namespace DB
NamesAndTypesList StorageSystemTableEngines::getNamesAndTypes()
{
return {{"name", std::make_shared<DataTypeString>()}};
return {{"name", std::make_shared<DataTypeString>()},
{"supports_settings", std::make_shared<DataTypeUInt8>()},
{"supports_skipping_indices", std::make_shared<DataTypeUInt8>()},
{"supports_sort_order", std::make_shared<DataTypeUInt8>()},
{"supports_ttl", std::make_shared<DataTypeUInt8>()},
{"supports_replication", std::make_shared<DataTypeUInt8>()},
{"supports_deduplication", std::make_shared<DataTypeUInt8>()}};
}
void StorageSystemTableEngines::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const
{
const auto & storages = StorageFactory::instance().getAllStorages();
for (const auto & pair : storages)
for (const auto & pair : StorageFactory::instance().getAllStorages())
{
res_columns[0]->insert(pair.first);
res_columns[1]->insert(pair.second.features.supports_settings);
res_columns[2]->insert(pair.second.features.supports_skipping_indices);
res_columns[3]->insert(pair.second.features.supports_sort_order);
res_columns[4]->insert(pair.second.features.supports_ttl);
res_columns[5]->insert(pair.second.features.supports_replication);
res_columns[6]->insert(pair.second.features.supports_deduplication);
}
}

View File

@ -99,7 +99,7 @@ def get_processlist(client_cmd):
def get_stacktraces(server_pid):
cmd = "gdb -q -ex 'set pagination off' -ex 'backtrace' -ex 'thread apply all backtrace' -ex 'detach' -ex 'quit' --pid {} 2>/dev/null".format(server_pid)
cmd = "gdb -batch -ex 'thread apply all backtrace' -p {}".format(server_pid)
try:
return subprocess.check_output(cmd, shell=True)
except Exception as ex:
@ -107,12 +107,11 @@ def get_stacktraces(server_pid):
def get_server_pid(server_tcp_port):
cmd = "lsof -i tcp:{port} | fgrep 'TCP *:{port} (LISTEN)'".format(port=server_tcp_port)
cmd = "lsof -i tcp:{port} -s tcp:LISTEN -Fp | awk '/^p[0-9]+$/{{print substr($0, 2)}}'".format(port=server_tcp_port)
try:
output = subprocess.check_output(cmd, shell=True)
if output:
columns = output.strip().split(' ')
return int(columns[1])
# lsof -Fp reports the PID as 'p<pid>'; the awk filter above already strips the leading 'p'
return int(output)
else:
return None # server dead
except Exception as ex:
@ -453,12 +452,22 @@ def main(args):
if args.hung_check:
processlist = get_processlist(args.client_with_database)
if processlist:
server_pid = get_server_pid(os.getenv("CLICKHOUSE_PORT_TCP", '9000'))
print(colored("\nFound hung queries in processlist:", args, "red", attrs=["bold"]))
print(processlist)
clickhouse_tcp_port = os.getenv("CLICKHOUSE_PORT_TCP", '9000')
server_pid = get_server_pid(clickhouse_tcp_port)
if server_pid:
print("\nStacktraces of all threads:")
print("\nLocated ClickHouse server process {} listening at TCP port {}".format(server_pid, clickhouse_tcp_port))
print("\nCollecting stacktraces from all running threads:")
print(get_stacktraces(server_pid))
else:
print(
colored(
"\nUnable to locate ClickHouse server process listening at TCP port {}. "
"It must have crashed or exited prematurely!".format(clickhouse_tcp_port),
args, "red", attrs=["bold"]))
exit_code = 1
else:
print(colored("\nNo queries hung.", args, "green", attrs=["bold"]))
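
For reference, `lsof -F p` reports each matching process as a single field line of the form `p<PID>`; a small sketch of the parsing that `get_server_pid` performs (hypothetical PID, assuming exactly one listening process):

```python
raw = "p12345\n"                 # what `lsof -i tcp:9000 -s tcp:LISTEN -Fp` would print
pid_line = raw.strip()           # 'p12345'
assert pid_line.startswith('p')
server_pid = int(pid_line[1:])   # the awk filter in get_server_pid strips the 'p' the same way
print(server_pid)                # 12345
```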

View File

@ -67,7 +67,11 @@ class CommandRequest:
#print " ".join(command)
self.process = sp.Popen(command, stdin=stdin_file, stdout=self.stdout_file, stderr=self.stderr_file)
# we suppress stderr on the client because sometimes the thread sanitizer
# prints debug information there
env = {}
env["TSAN_OPTIONS"] = "verbosity=0"
self.process = sp.Popen(command, stdin=stdin_file, stdout=self.stdout_file, stderr=self.stderr_file, env=env)
self.timer = None
self.process_finished_before_timeout = True

View File

@ -5,6 +5,7 @@ from helpers.cluster import ClickHouseCluster
from dictionary import Field, Row, Dictionary, DictionaryStructure, Layout
from external_sources import SourceMySQL, SourceClickHouse, SourceFile, SourceExecutableCache, SourceExecutableHashed
from external_sources import SourceMongo, SourceHTTP, SourceHTTPS, SourceRedis
import math
SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__))
dict_configs_path = os.path.join(SCRIPT_DIR, 'configs/dictionaries')
@ -198,12 +199,23 @@ def started_cluster():
cluster.shutdown()
def test_simple_dictionaries(started_cluster):
def get_dictionaries(fold, total_folds, all_dicts):
chunk_len = int(math.ceil(len(all_dicts) / float(total_folds)))
if chunk_len * fold >= len(all_dicts):
return []
return all_dicts[fold * chunk_len : (fold + 1) * chunk_len]
@pytest.mark.parametrize("fold", list(range(10)))
def test_simple_dictionaries(started_cluster, fold):
fields = FIELDS["simple"]
values = VALUES["simple"]
data = [Row(fields, vals) for vals in values]
simple_dicts = [d for d in DICTIONARIES if d.structure.layout.layout_type == "simple"]
all_simple_dicts = [d for d in DICTIONARIES if d.structure.layout.layout_type == "simple"]
simple_dicts = get_dictionaries(fold, 10, all_simple_dicts)
print "Length of dicts:", len(simple_dicts)
for dct in simple_dicts:
dct.load_data(data)
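
The `get_dictionaries` helper above splits the dictionary list into ten roughly equal folds so that each parametrized run loads only its share. A minimal sketch with a hypothetical list of 23 dictionary names:

```python
import math

def get_dictionaries(fold, total_folds, all_dicts):
    chunk_len = int(math.ceil(len(all_dicts) / float(total_folds)))
    if chunk_len * fold >= len(all_dicts):
        return []
    return all_dicts[fold * chunk_len : (fold + 1) * chunk_len]

dicts = ['dict_%d' % i for i in range(23)]
folds = [get_dictionaries(f, 10, dicts) for f in range(10)]
assert sum(len(f) for f in folds) == 23   # every dictionary is covered exactly once
assert max(len(f) for f in folds) == 3    # chunk_len = ceil(23 / 10) = 3
```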
@ -295,12 +307,14 @@ def test_ranged_dictionaries(started_cluster):
assert node.query(query) == str(answer) + '\n'
def test_key_value_simple_dictionaries(started_cluster):
@pytest.mark.parametrize("fold", list(range(10)))
def test_key_value_simple_dictionaries(started_cluster, fold):
fields = FIELDS["simple"]
values = VALUES["simple"]
data = [Row(fields, vals) for vals in values]
simple_dicts = [d for d in DICTIONARIES_KV if d.structure.layout.layout_type == "simple"]
all_simple_dicts = [d for d in DICTIONARIES_KV if d.structure.layout.layout_type == "simple"]
simple_dicts = get_dictionaries(fold, 10, all_simple_dicts)
for dct in simple_dicts:
queries_with_answers = []

View File

@ -162,35 +162,48 @@ def test_inserts_local(started_cluster):
assert instance.query("SELECT count(*) FROM local").strip() == '1'
def test_prefer_localhost_replica(started_cluster):
test_query = "SELECT * FROM distributed ORDER BY id;"
test_query = "SELECT * FROM distributed ORDER BY id"
node1.query("INSERT INTO distributed VALUES (toDate('2017-06-17'), 11)")
node2.query("INSERT INTO distributed VALUES (toDate('2017-06-17'), 22)")
time.sleep(1.0)
expected_distributed = '''\
2017-06-17\t11
2017-06-17\t22
'''
assert TSV(node1.query(test_query)) == TSV(expected_distributed)
assert TSV(node2.query(test_query)) == TSV(expected_distributed)
with PartitionManager() as pm:
pm.partition_instances(node1, node2, action='REJECT --reject-with tcp-reset')
node1.query("INSERT INTO replicated VALUES (toDate('2017-06-17'), 33)")
node2.query("INSERT INTO replicated VALUES (toDate('2017-06-17'), 44)")
time.sleep(1.0)
expected_from_node2 = '''\
2017-06-17\t11
2017-06-17\t22
2017-06-17\t44
'''
# Query is sent to node2, as it is local and prefer_localhost_replica=1
assert TSV(node2.query(test_query)) == TSV(expected_from_node2)
expected_from_node1 = '''\
2017-06-17\t11
2017-06-17\t22
2017-06-17\t33
'''
assert TSV(node1.query(test_query)) == TSV(expected_distributed)
assert TSV(node2.query(test_query)) == TSV(expected_distributed)
# Make replicas inconsistent by disabling merges and fetches,
# so that we can determine which replica the query was sent to
node1.query("SYSTEM STOP MERGES")
node1.query("SYSTEM STOP FETCHES")
node2.query("SYSTEM STOP MERGES")
node2.query("SYSTEM STOP FETCHES")
node1.query("INSERT INTO replicated VALUES (toDate('2017-06-17'), 33)")
node2.query("INSERT INTO replicated VALUES (toDate('2017-06-17'), 44)")
time.sleep(1.0)
# Query is sent to node2, as it is local and prefer_localhost_replica=1
assert TSV(node2.query(test_query)) == TSV(expected_from_node2)
# Now the query is sent to node1, as it is higher in order
assert TSV(node2.query("SET load_balancing='in_order'; SET prefer_localhost_replica=0;" + test_query)) == TSV(expected_from_node1)
assert TSV(node2.query(test_query + " SETTINGS load_balancing='in_order', prefer_localhost_replica=0")) == TSV(expected_from_node1)
def test_inserts_low_cardinality(started_cluster):
instance = shard1

View File

@ -0,0 +1,4 @@
[(1,2),(2,3),(3,4)]
[(1,2)]
[(1,2,3),(2,3,4)]
[(4,3,1)]

View File

@ -0,0 +1,16 @@
SET send_logs_level = 'none';
DROP TABLE IF EXISTS array_of_tuples;
CREATE TABLE array_of_tuples
(
f Array(Tuple(Float64, Float64)),
s Array(Tuple(UInt8, UInt16, UInt32))
) ENGINE = Memory;
INSERT INTO array_of_tuples values ([(1, 2), (2, 3), (3, 4)], array(tuple(1, 2, 3), tuple(2, 3, 4))), (array((1.0, 2.0)), [tuple(4, 3, 1)]);
SELECT f from array_of_tuples;
SELECT s from array_of_tuples;
DROP TABLE array_of_tuples;

View File

@ -1,6 +0,0 @@
a 1
b 0
a 1 2
b 0 3
a 1 a 2
0 b 3

View File

@ -1,15 +0,0 @@
DROP TABLE IF EXISTS Alpha;
DROP TABLE IF EXISTS Beta;
CREATE TABLE Alpha (foo String, bar UInt64) ENGINE = Memory;
CREATE TABLE Beta (foo LowCardinality(String), baz UInt64) ENGINE = Memory;
INSERT INTO Alpha VALUES ('a', 1);
INSERT INTO Beta VALUES ('a', 2), ('b', 3);
SELECT * FROM Alpha FULL JOIN (SELECT 'b' as foo) USING (foo);
SELECT * FROM Alpha FULL JOIN Beta USING (foo);
SELECT * FROM Alpha FULL JOIN Beta ON Alpha.foo = Beta.foo;
DROP TABLE Alpha;
DROP TABLE Beta;

View File

@ -0,0 +1,12 @@
a 1
b 0
a 1 2
b 0 3
0 b 3
a 1 a 2
a 1
b \N
a 1 2
b \N 3
a 1 a 2
\N \N b 3

View File

@ -0,0 +1,21 @@
DROP TABLE IF EXISTS Alpha;
DROP TABLE IF EXISTS Beta;
CREATE TABLE Alpha (foo String, bar UInt64) ENGINE = Memory;
CREATE TABLE Beta (foo LowCardinality(String), baz UInt64) ENGINE = Memory;
INSERT INTO Alpha VALUES ('a', 1);
INSERT INTO Beta VALUES ('a', 2), ('b', 3);
SELECT * FROM Alpha FULL JOIN (SELECT 'b' as foo) USING (foo) ORDER BY foo;
SELECT * FROM Alpha FULL JOIN Beta USING (foo) ORDER BY foo;
SELECT * FROM Alpha FULL JOIN Beta ON Alpha.foo = Beta.foo ORDER BY foo;
SET join_use_nulls = 1;
SELECT * FROM Alpha FULL JOIN (SELECT 'b' as foo) USING (foo) ORDER BY foo;
SELECT * FROM Alpha FULL JOIN Beta USING (foo) ORDER BY foo;
SELECT * FROM Alpha FULL JOIN Beta ON Alpha.foo = Beta.foo ORDER BY foo;
DROP TABLE Alpha;
DROP TABLE Beta;

View File

@ -27,15 +27,15 @@ function download
# might have the same version on left and right
if ! [ "$la" = "$ra" ]
then
wget -q -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O "$la" && tar -C left --strip-components=1 -zxvf "$la" &
wget -q -nd -c "https://clickhouse-builds.s3.yandex.net/$right_pr/$right_sha/performance/performance.tgz" -O "$ra" && tar -C right --strip-components=1 -zxvf "$ra" &
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O "$la" && tar -C left --strip-components=1 -zxvf "$la" &
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$right_pr/$right_sha/performance/performance.tgz" -O "$ra" && tar -C right --strip-components=1 -zxvf "$ra" &
else
wget -q -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O "$la" && { tar -C left --strip-components=1 -zxvf "$la" & tar -C right --strip-components=1 -zxvf "$ra" & } &
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O "$la" && { tar -C left --strip-components=1 -zxvf "$la" & tar -C right --strip-components=1 -zxvf "$ra" & } &
fi
cd db0 && wget -q -nd -c "https://s3.mds.yandex.net/clickhouse-private-datasets/hits_10m_single/partitions/hits_10m_single.tar" && tar -xvf hits_10m_single.tar &
cd db0 && wget -q -nd -c "https://s3.mds.yandex.net/clickhouse-private-datasets/hits_100m_single/partitions/hits_100m_single.tar" && tar -xvf hits_100m_single.tar &
cd db0 && wget -q -nd -c "https://clickhouse-datasets.s3.yandex.net/hits/partitions/hits_v1.tar" && tar -xvf hits_v1.tar &
cd db0 && wget -nv -nd -c "https://s3.mds.yandex.net/clickhouse-private-datasets/hits_10m_single/partitions/hits_10m_single.tar" && tar -xvf hits_10m_single.tar &
cd db0 && wget -nv -nd -c "https://s3.mds.yandex.net/clickhouse-private-datasets/hits_100m_single/partitions/hits_100m_single.tar" && tar -xvf hits_100m_single.tar &
cd db0 && wget -nv -nd -c "https://clickhouse-datasets.s3.yandex.net/hits/partitions/hits_v1.tar" && tar -xvf hits_v1.tar &
wait
}
@ -151,4 +151,4 @@ right/clickhouse local --file '*-report.tsv' -S "$result_structure" --query "sel
right/clickhouse local --file '*-client-time.tsv' -S "query text, client float, server float" -q "select client, server, floor(client/server, 3) p, query from table where p > 1.01 order by p desc" > slow-on-client.tsv
grep Exception:[^:] *-err.log > run-errors.log
./report.py > report.html
$script_dir/report.py > report.html

View File

@ -15,7 +15,14 @@ echo Reference tag is $ref_tag
# We use annotated tags which have their own shas, so we have to further
# dereference the tag to get the commit it points to, hence the '~0' thing.
ref_sha=$(cd ch && git rev-parse $ref_tag~0)
# Show what we're testing
echo Reference SHA is $ref_sha
(cd ch && git log -1 --decorate $ref_sha) ||:
echo
echo SHA to test is $SHA_TO_TEST
(cd ch && git log -1 --decorate $SHA_TO_TEST) ||:
echo
# Set python output encoding so that we can print queries with Russian letters.
export PYTHONIOENCODING=utf-8

View File

@ -1,10 +1,10 @@
-- input is table(query text, run UInt32, version int, time float)
select
-- abs(diff_percent) > rd_quantiles_percent[3] fail,
floor(original_medians_array.time_by_version[1], 4) left,
floor(original_medians_array.time_by_version[2], 4) right,
floor((right - left) / left, 3) diff_percent,
arrayMap(x -> floor(x / left, 3), rd.rd_quantiles) rd_quantiles_percent,
floor(original_medians_array.time_by_version[1], 4) l,
floor(original_medians_array.time_by_version[2], 4) r,
floor((r - l) / l, 3) diff_percent,
arrayMap(x -> floor(x / l, 3), rd.rd_quantiles) rd_quantiles_percent,
query
from
(

View File

@ -87,15 +87,15 @@ params['header'] = "ClickHouse Performance Comparison"
params['test_part'] = (table_template.format_map(
collections.defaultdict(str,
caption = 'Changes in performance',
header = table_header(['Left', 'Right', 'Diff', 'RD', 'Query']),
header = table_header(['Old, s', 'New, s', 'Relative difference (new&nbsp;-&nbsp;old)/old', 'Randomization distribution quantiles [5%,&nbsp;50%,&nbsp;95%]', 'Query']),
rows = tsv_rows('changed-perf.tsv'))) +
table_template.format(
caption = 'Slow on client',
header = table_header(['Client', 'Server', 'Ratio', 'Query']),
header = table_header(['Client time, s', 'Server time, s', 'Ratio', 'Query']),
rows = tsv_rows('slow-on-client.tsv')) +
table_template.format(
caption = 'Unstable',
header = table_header(['Left', 'Right', 'Diff', 'RD', 'Query']),
header = table_header(['Old, s', 'New, s', 'Relative difference (new&nbsp;-&nbsp;old)/old', 'Randomization distribution quantiles [5%,&nbsp;50%,&nbsp;95%]', 'Query']),
rows = tsv_rows('unstable.tsv')) +
table_template.format(
caption = 'Run errors',

View File

@ -0,0 +1,7 @@
# Browse ClickHouse Source Code
You can use the **Woboq** online code browser available [here](https://clickhouse-test-reports.s3.yandex.net/codebrowser/html_report///ClickHouse/dbms/src/index.html). It provides code navigation and semantic highlighting, search and indexing. The code snapshot is updated daily.
You can also browse the sources on [GitHub](https://github.com/ClickHouse/ClickHouse) as usual.
If you're wondering which IDE to use, we recommend CLion, QT Creator, VS Code and KDevelop (with caveats). You can use any IDE you like; Vim and Emacs also count.

View File

@ -368,7 +368,7 @@ For more information, see the section "[Creating replicated tables](../../operat
## mark_cache_size {#server-mark-cache-size}
Approximate size (in bytes) of the cache of marks used by table engines of the [MergeTree](../../operations/table_engines/mergetree.md) family.
Approximate size (in bytes) of the cache of marks used by table engines of the [MergeTree](../table_engines/mergetree.md) family.
The cache is shared for the server and memory is allocated as needed. The cache size must be at least 5368709120.
@ -420,7 +420,7 @@ We recommend using this option in Mac OS X, since the `getrlimit()` function ret
Restriction on deleting tables.
If the size of a [MergeTree](../../operations/table_engines/mergetree.md) table exceeds `max_table_size_to_drop` (in bytes), you can't delete it using a DROP query.
If the size of a [MergeTree](../table_engines/mergetree.md) table exceeds `max_table_size_to_drop` (in bytes), you can't delete it using a DROP query.
If you still need to delete the table without restarting the ClickHouse server, create the `<clickhouse-path>/flags/force_drop_table` file and run the DROP query.
@ -437,7 +437,7 @@ The value 0 means that you can delete all tables without any restrictions.
## merge_tree {#server_settings-merge_tree}
Fine tuning for tables in the [MergeTree](../../operations/table_engines/mergetree.md).
Fine tuning for tables in the [MergeTree](../table_engines/mergetree.md).
For more information, see the MergeTreeSettings.h header file.
@ -512,7 +512,7 @@ Keys for server/client settings:
## part_log {#server_settings-part-log}
Logging events that are associated with [MergeTree](../../operations/table_engines/mergetree.md). For instance, adding or merging data. You can use the log to simulate merge algorithms and compare their characteristics. You can visualize the merge process.
Logging events that are associated with [MergeTree](../table_engines/mergetree.md). For instance, adding or merging data. You can use the log to simulate merge algorithms and compare their characteristics. You can visualize the merge process.
Queries are logged in the [system.part_log](../system_tables.md#system_tables-part-log) table, not in a separate file. You can configure the name of this table in the `table` parameter (see below).
@ -739,7 +739,7 @@ Path to temporary data for processing large queries.
## tmp_policy {#server-settings-tmp_policy}
Policy from [`storage_configuration`](mergetree.md#table_engine-mergetree-multiple-volumes) to store temporary files.
Policy from [`storage_configuration`](../table_engines/mergetree.md#table_engine-mergetree-multiple-volumes) to store temporary files.
If not set [`tmp_path`](#server-settings-tmp_path) is used, otherwise it is ignored.
!!! note
@ -750,7 +750,7 @@ If not set [`tmp_path`](#server-settings-tmp_path) is used, otherwise it is igno
## uncompressed_cache_size {#server-settings-uncompressed_cache_size}
Cache size (in bytes) for uncompressed data used by table engines from the [MergeTree](../../operations/table_engines/mergetree.md).
Cache size (in bytes) for uncompressed data used by table engines from the [MergeTree](../table_engines/mergetree.md).
There is one shared cache for the server. Memory is allocated on demand. The cache is used if the option [use_uncompressed_cache](../settings/settings.md#setting-use_uncompressed_cache) is enabled.

View File

@ -773,6 +773,43 @@ WHERE changed
└────────────────────────┴─────────────┴─────────┘
```
## system.table_engines
Contains descriptions of the table engines supported by the server and their feature support information.
This table contains the following columns (the column type is shown in brackets):
- `name` (String) — The name of the table engine.
- `supports_settings` (UInt8) — Flag that indicates if the table engine supports the `SETTINGS` clause.
- `supports_skipping_indices` (UInt8) — Flag that indicates if the table engine supports [data skipping indices](table_engines/mergetree/#table_engine-mergetree-data_skipping-indexes).
- `supports_ttl` (UInt8) — Flag that indicates if the table engine supports [TTL](table_engines/mergetree/#table_engine-mergetree-ttl).
- `supports_sort_order` (UInt8) — Flag that indicates if the table engine supports the `PARTITION BY`, `PRIMARY KEY`, `ORDER BY` and `SAMPLE BY` clauses.
- `supports_replication` (UInt8) — Flag that indicates if the table engine supports [data replication](table_engines/replication/).
- `supports_deduplication` (UInt8) — Flag that indicates if the table engine supports data deduplication.
Example:
```sql
SELECT *
FROM system.table_engines
WHERE name in ('Kafka', 'MergeTree', 'ReplicatedCollapsingMergeTree')
```
```text
┌─name──────────────────────────┬─supports_settings─┬─supports_skipping_indices─┬─supports_sort_order─┬─supports_ttl─┬─supports_replication─┬─supports_deduplication─┐
│ Kafka │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │
│ MergeTree │ 1 │ 1 │ 1 │ 1 │ 0 │ 0 │
│ ReplicatedCollapsingMergeTree │ 1 │ 1 │ 1 │ 1 │ 1 │ 1 │
└───────────────────────────────┴───────────────────┴───────────────────────────┴─────────────────────┴──────────────┴──────────────────────┴────────────────────────┘
```
**See also**
- MergeTree family [query clauses](table_engines/mergetree.md#mergetree-query-clauses)
- Kafka [settings](table_engines/kafka.md#table_engine-kafka-creating-a-table)
- Join [settings](table_engines/join.md#join-limitations-and-settings)
## system.tables
Contains metadata of each table that the server knows about. Detached tables are not shown in `system.tables`.

View File

@ -14,7 +14,7 @@ The Distributed engine accepts parameters:
See also:
- `insert_distributed_sync` setting
- [MergeTree](../mergetree.md#table_engine-mergetree-multiple-volumes) for the examples
- [MergeTree](mergetree.md#table_engine-mergetree-multiple-volumes) for the examples
Example:

View File

@ -77,7 +77,7 @@ You cannot perform a `SELECT` query directly from the table. Instead, use one of
- Place the table to the right side in a `JOIN` clause.
- Call the [joinGet](../../query_language/functions/other_functions.md#other_functions-joinget) function, which lets you extract data from the table the same way as from a dictionary.
### Limitations and Settings
### Limitations and Settings {#join-limitations-and-settings}
When creating a table, the following settings are applied:

View File

@ -50,7 +50,7 @@ For a description of parameters, see the [CREATE query description](../../query_
!!!note "Note"
`INDEX` is an experimental feature, see [Data Skipping Indexes](#table_engine-mergetree-data_skipping-indexes).
### Query Clauses
### Query Clauses {#mergetree-query-clauses}
- `ENGINE` — Name and parameters of the engine. `ENGINE = MergeTree()`. The `MergeTree` engine does not have parameters.

View File

@ -2,8 +2,8 @@
## Q1 2020
- Resource pools for more precise distribution of cluster capacity between users
- Fine-grained authorization
- Role-based access control
- Integration with external authentication services
- Resource pools for more precise distribution of cluster capacity between users
[Original article](https://clickhouse.yandex/docs/en/roadmap/) <!--hide-->

View File

@ -0,0 +1 @@
../../en/development/browse_code.md

View File

@ -0,0 +1 @@
../../en/development/browse_code.md

View File

@ -0,0 +1,7 @@
# Навигация по коду ClickHouse
Для навигации по коду онлайн доступен **Woboq**, он расположен [здесь](https://clickhouse-test-reports.s3.yandex.net/codebrowser/html_report///ClickHouse/dbms/src/index.html). В нём реализовано удобное перемещение между исходными файлами, семантическая подсветка, подсказки, индексация и поиск. Слепок кода обновляется ежедневно.
Также вы можете просматривать исходники на [GitHub](https://github.com/ClickHouse/ClickHouse).
Если вы интересуетесь, какую среду разработки выбрать для работы с ClickHouse, мы рекомендуем CLion, QT Creator, VSCode или KDevelop (с некоторыми предостережениями). Вы можете использовать свою любимую среду разработки, Vim и Emacs тоже считаются.

View File

@ -125,7 +125,7 @@ SELECT CounterID, count() FROM hits GROUP BY CounterID ORDER BY count() DESC LIM
2. Кодогенерация. Для запроса генерируется код, в котором подставлены все косвенные вызовы.
В "обычных" БД этого не делается, так как не имеет смысла при выполнении простых запросов. Хотя есть исключения. Например, в MemSQL кодогенерация используется для уменьшения latency при выполнении SQL запросов. (Для сравнения - в аналитических СУБД, требуется оптимизация throughput, а не latency).
В "обычных" БД этого не делается, так как не имеет смысла при выполнении простых запросов. Хотя есть исключения. Например, в MemSQL кодогенерация используется для уменьшения latency при выполнении SQL запросов. Для сравнения, в аналитических СУБД требуется оптимизация throughput, а не latency.
Стоит заметить, что для эффективности по CPU требуется, чтобы язык запросов был декларативным (SQL, MDX) или хотя бы векторным (J, K). То есть, чтобы запрос содержал циклы только в неявном виде, открывая возможности для оптимизации.

View File

@ -695,6 +695,43 @@ WHERE changed
└────────────────────────┴─────────────┴─────────┘
```
## system.table_engines
Содержит информацию про движки таблиц, поддерживаемые сервером, а также об их возможностях.
Эта таблица содержит следующие столбцы (тип столбца показан в скобках):
- `name` (String) — имя движка.
- `supports_settings` (UInt8) — флаг, показывающий поддержку секции `SETTINGS`.
- `supports_skipping_indices` (UInt8) — флаг, показывающий поддержку [индексов пропуска данных](table_engines/mergetree/#table_engine-mergetree-data_skipping-indexes).
- `supports_ttl` (UInt8) — флаг, показывающий поддержку [TTL](table_engines/mergetree/#table_engine-mergetree-ttl).
- `supports_sort_order` (UInt8) — флаг, показывающий поддержку секций `PARTITION BY`, `PRIMARY KEY`, `ORDER BY` и `SAMPLE BY`.
- `supports_replication` (UInt8) — флаг, показывающий поддержку [репликации](table_engines/replication/).
- `supports_deduplication` (UInt8) — флаг, показывающий наличие в движке дедупликации данных.
Пример:
```sql
SELECT *
FROM system.table_engines
WHERE name in ('Kafka', 'MergeTree', 'ReplicatedCollapsingMergeTree')
```
```text
┌─name──────────────────────────┬─supports_settings─┬─supports_skipping_indices─┬─supports_sort_order─┬─supports_ttl─┬─supports_replication─┬─supports_deduplication─┐
│ Kafka │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │
│ MergeTree │ 1 │ 1 │ 1 │ 1 │ 0 │ 0 │
│ ReplicatedCollapsingMergeTree │ 1 │ 1 │ 1 │ 1 │ 1 │ 1 │
└───────────────────────────────┴───────────────────┴───────────────────────────┴─────────────────────┴──────────────┴──────────────────────┴────────────────────────┘
```
**Смотрите также**
- [Секции движка](table_engines/mergetree/#mergetree-query-clauses) семейства MergeTree
- [Настройки](table_engines/kafka.md#table_engine-kafka-creating-a-table) Kafka
- [Настройки](table_engines/join/#join-limitations-and-settings) Join
## system.tables
Содержит метаданные каждой таблицы, о которой знает сервер. Отсоединённые таблицы не отображаются в `system.tables`.

View File

@ -79,7 +79,7 @@ SELECT joinGet('id_val_join', 'val', toUInt32(1))
- Используйте таблицу как правую в секции `JOIN`.
- Используйте функцию [joinGet](../../query_language/functions/other_functions.md#other_functions-joinget), которая позволяет извлекать данные из таблицы таким же образом как из словаря.
### Ограничения и настройки
### Ограничения и настройки {#join-limitations-and-settings}
При создании таблицы, применяются следующие параметры :

View File

@ -49,7 +49,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
!!!note "Note"
`INDEX` — экспериментальная возможность, смотрите [Индексы пропуска данных](#table_engine-mergetree-data_skipping-indexes).
### Секции запроса
### Секции запроса {#mergetree-query-clauses}
- `ENGINE` — имя и параметры движка. `ENGINE = MergeTree()`. `MergeTree` не имеет параметров.

View File

@ -218,14 +218,15 @@ nav:
- 'Development':
- 'hidden': 'development/index.md'
- 'The Beginner ClickHouse Developer Instruction': 'development/developer_instruction.md'
- 'Overview of ClickHouse Architecture': 'development/architecture.md'
- 'Browse ClickHouse Source Code': 'development/browse_code.md'
- 'How to Build ClickHouse on Linux': 'development/build.md'
- 'How to Build ClickHouse on Mac OS X': 'development/build_osx.md'
- 'How to Build ClickHouse on Linux for Mac OS X': 'development/build_cross_osx.md'
- 'How to Build ClickHouse on Linux for AARCH64 (ARM64)': 'development/build_cross_arm.md'
- 'How to Write C++ Code': 'development/style.md'
- 'How to Run ClickHouse Tests': 'development/tests.md'
- 'The Beginner ClickHouse Developer Instruction': 'development/developer_instruction.md'
- 'Third-Party Libraries Used': 'development/contrib.md'
- 'What''s New':

View File

@ -216,12 +216,13 @@ nav:
- 'Development':
- 'hidden': 'development/index.md'
- 'The Beginner ClickHouse Developer Instruction': 'development/developer_instruction.md'
- 'Overview of ClickHouse Architecture': 'development/architecture.md'
- 'Browse ClickHouse Source Code': 'development/browse_code.md'
- 'How to Build ClickHouse on Linux': 'development/build.md'
- 'How to Build ClickHouse on Mac OS X': 'development/build_osx.md'
- 'How to Write C++ code': 'development/style.md'
- 'How to Run ClickHouse Tests': 'development/tests.md'
- 'The Beginner ClickHouse Developer Instruction': 'development/developer_instruction.md'
- 'Third-Party Libraries Used': 'development/contrib.md'
- 'What''s New':

View File

@ -217,14 +217,15 @@ nav:
- 'Development':
- 'hidden': 'development/index.md'
- 'The Beginner ClickHouse Developer Instruction': 'development/developer_instruction.md'
- 'Overview of ClickHouse Architecture': 'development/architecture.md'
- 'Browse ClickHouse Source Code': 'development/browse_code.md'
- 'How to Build ClickHouse on Linux': 'development/build.md'
- 'How to Build ClickHouse on Mac OS X': 'development/build_osx.md'
- 'How to Build ClickHouse on Linux for Mac OS X': 'development/build_cross_osx.md'
- 'How to Build ClickHouse on Linux for AARCH64 (ARM64)': 'development/build_cross_arm.md'
- 'How to Write C++ Code': 'development/style.md'
- 'How to Run ClickHouse Tests': 'development/tests.md'
- 'The Beginner ClickHouse Developer Instruction': 'development/developer_instruction.md'
- 'Third-Party Libraries Used': 'development/contrib.md'
- 'What''s New':

View File

@ -216,13 +216,14 @@ nav:
- 'Разработка':
- 'hidden': 'development/index.md'
- 'Инструкция для начинающего разработчика ClickHouse': 'development/developer_instruction.md'
- 'Обзор архитектуры ClickHouse': 'development/architecture.md'
- 'Навигация по коду ClickHouse': 'development/browse_code.md'
- 'Как собрать ClickHouse на Linux': 'development/build.md'
- 'Как собрать ClickHouse на Mac OS X': 'development/build_osx.md'
- 'Как собрать ClickHouse на Linux для Mac OS X': 'development/build_cross_osx.md'
- 'Как писать код на C++': 'development/style.md'
- 'Как запустить тесты': 'development/tests.md'
- 'Инструкция для начинающего разработчика ClickHouse': 'development/developer_instruction.md'
- 'Сторонние библиотеки': 'development/contrib.md'
- 'Что нового':

View File

@ -215,13 +215,14 @@ nav:
- '开发者指南':
- 'hidden': 'development/index.md'
- '开发者指南': 'development/developer_instruction.md'
- 'ClickHouse架构概述': 'development/architecture.md'
- 'ClickHouse Code Browser': 'development/browse_code.md'
- '如何在Linux中编译ClickHouse': 'development/build.md'
- '如何在Mac OS X中编译ClickHouse': 'development/build_osx.md'
- '如何在Linux中编译Mac OS X ClickHouse': 'development/build_cross_osx.md'
- '如何编写C++代码': 'development/style.md'
- '如何运行ClickHouse测试': 'development/tests.md'
- '开发者指南': 'development/developer_instruction.md'
- '使用的第三方库': 'development/contrib.md'
- '新功能特性':

View File

@ -12,14 +12,22 @@ import util
def choose_latest_releases():
seen = collections.OrderedDict()
candidates = requests.get('https://api.github.com/repos/ClickHouse/ClickHouse/tags?per_page=100').json()
candidates = []
for page in range(1, 10):
url = 'https://api.github.com/repos/ClickHouse/ClickHouse/tags?per_page=100&page=%d' % page
candidates += requests.get(url).json()
for tag in candidates:
name = tag.get('name', '')
if ('v18' in name) or ('stable' not in name) or ('prestable' in name):
is_unstable = ('stable' not in name) and ('lts' not in name)
is_in_blacklist = ('v18' in name) or ('prestable' in name) or ('v1.1' in name)
if is_unstable or is_in_blacklist:
continue
major_version = '.'.join((name.split('.', 2))[:2])
if major_version not in seen:
seen[major_version] = (name, tag.get('tarball_url'),)
if len(seen) > 10:
break
return seen.items()
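
A rough sketch of the tag filtering and major-version deduplication performed above, using hypothetical tag names in place of the GitHub API response:

```python
import collections

seen = collections.OrderedDict()
tags = ['v20.1.2.4-stable', 'v20.1.1.1-prestable', 'v19.17.9.60-stable',
        'v19.16.14.65-stable', 'v18.16.1-stable', 'v1.1.54390-testing']
for name in tags:
    is_unstable = ('stable' not in name) and ('lts' not in name)
    is_in_blacklist = ('v18' in name) or ('prestable' in name) or ('v1.1' in name)
    if is_unstable or is_in_blacklist:
        continue
    # keep only the first (newest) tag seen for each major version
    major_version = '.'.join(name.split('.', 2)[:2])
    seen.setdefault(major_version, name)

print(list(seen))  # ['v20.1', 'v19.17', 'v19.16']
```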

View File

@ -3,7 +3,7 @@
{% set palette = config.theme.palette %}
{% set font = config.theme.font %}
<!DOCTYPE html>
<html lang="{{ lang.t('language') }}" class="no-js" data-version="{{ config.extra.version_prefix }}">
<html lang="{{ lang.t('language') }}" class="no-js" data-version="{{ config.extra.version_prefix or 'master' }}" data-single-page="{% if config.extra.single_page %}true{% else %}false{% endif %}">
<head>
{% block site_meta %}
<meta charset="utf-8">
@ -276,6 +276,8 @@ apiKey: 'e239649803024433599de47a53b2d416',
indexName: 'yandex_clickhouse',
inputSelector: '#md-search__input',
algoliaOptions: {
advancedSyntax: true,
clickAnalytics: true,
hitsPerPage: 25,
'facetFilters': ["lang:{{ lang.t('language') }}"]
},

View File

@ -3,6 +3,7 @@ set -ex
BASE_DIR=$(dirname $(readlink -f $0))
BUILD_DIR="${BASE_DIR}/../build"
PUBLISH_DIR="${BASE_DIR}/../publish"
IMAGE="clickhouse/website"
if [[ -z "$1" ]]
then
@ -17,6 +18,20 @@ if [[ -z "$1" ]]
then
source "${BASE_DIR}/venv/bin/activate"
python "${BASE_DIR}/build.py" "--enable-stable-releases"
set +e
rm -rf "${PUBLISH_DIR}" || true
git clone git@github.com:ClickHouse/clickhouse.github.io.git "${PUBLISH_DIR}"
cd "${PUBLISH_DIR}"
git rm -rf *
git commit -a -m "wipe old release"
cp -R "${BUILD_DIR}"/* .
echo -n "clickhouse.tech" > CNAME
echo -n "" > README.md
cp "${BASE_DIR}/../../LICENSE" .
git add *
git commit -a -m "add new release at $(date)"
git push origin master
set -e
cd "${BUILD_DIR}"
docker build -t "${FULL_NAME}" "${BUILD_DIR}"
docker tag "${FULL_NAME}" "${REMOTE_NAME}"

View File

@ -0,0 +1 @@
../../en/development/browse_code.md

View File

@ -15,5 +15,16 @@ Join(ANY|ALL, LEFT|INNER, k1[, k2, ...])
跟 Set 引擎类似Join 引擎把数据存储在磁盘中。
### Limitations and Settings {#join-limitations-and-settings}
When creating a table, the following settings are applied:
- join_use_nulls
- max_rows_in_join
- max_bytes_in_join
- join_overflow_mode
- join_any_take_last_row
The `Join`-engine tables can't be used in `GLOBAL JOIN` operations.
[来源文章](https://clickhouse.yandex/docs/en/operations/table_engines/join/) <!--hide-->

View File

@ -8,6 +8,8 @@ Kafka 特性:
- 容错存储机制。
- 处理流数据。
<a name="table_engine-kafka-creating-a-table"></a>
老版格式:
```

View File

@ -46,6 +46,8 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
请求参数的描述,参考 [请求描述](../../query_language/create.md) 。
<a name="mergetree-query-clauses"></a>
**子句**
- `ENGINE` - 引擎名和参数。 `ENGINE = MergeTree()`. `MergeTree` 引擎没有参数。
@ -270,7 +272,7 @@ SELECT count() FROM table WHERE s < 'z'
SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234
```
#### 索引的可用类型
#### 索引的可用类型 {#table_engine-mergetree-data_skipping-indexes}
* `minmax`
存储指定表达式的极值(如果表达式是 `tuple` ,则存储 `tuple` 中每个元素的极值),这些信息用于跳过数据块,类似主键。

View File

@ -196,7 +196,7 @@ def get_users_info(pull_requests, commits_info, token, max_retries, retry_timeou
update_user(pull_request['user'])
for commit_info in commits_info.values():
if 'author' in commit_info and commit_info['author'] is not None:
if 'committer' in commit_info and commit_info['committer'] is not None and 'login' in commit_info['committer']:
update_user(commit_info['committer']['login'])
else:
logging.warning('Author not found for commit %s.', commit_info['html_url'])

View File

@ -160,25 +160,16 @@ def clear_old_incoming_packages(ssh_connection, user):
for pkg in ('deb', 'rpm', 'tgz'):
for release_type in ('stable', 'testing', 'prestable', 'lts'):
try:
if pkg != 'tgz':
ssh_connection.execute("rm /home/{user}/incoming/clickhouse/{pkg}/{release_type}/*".format(
user=user, pkg=pkg, release_type=release_type))
else:
ssh_connection.execute("rm /home/{user}/incoming/clickhouse/{pkg}/*".format(
user=user, pkg=pkg))
except Exception:
logging.info("rm is not required")
def _get_incoming_path(repo_url, user=None, pkg_type=None, release_type=None):
if repo_url == 'repo.mirror.yandex.net':
if pkg_type != 'tgz':
return "/home/{user}/incoming/clickhouse/{pkg}/{release_type}".format(
user=user, pkg=pkg_type, release_type=release_type)
else:
return "/home/{user}/incoming/clickhouse/{pkg}".format(
user=user, pkg=pkg_type)
else:
return "/repo/{0}/mini-dinstall/incoming/".format(repo_url.split('.')[0])

View File

@ -32,7 +32,7 @@
<a class="menu_item" href="#quick-start">Quick Start</a>
<a class="menu_item" href="#performance">Performance</a>
<a class="menu_item" href="docs/en/">Documentation</a>
<a class="menu_item" href="blog/en/">Blog</a>
<a class="menu_item" href="https://clickhouse.yandex/blog/en/">Blog</a>
<a class="menu_item" href="#contacts">Contacts</a>
</div>
@ -567,7 +567,7 @@ sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh
});
var hostParts = window.location.host.split('.');
if (hostParts.length > 2 && hostParts[0] != 'test') {
if (hostParts.length > 2 && hostParts[0] != 'test' && hostParts[1] != 'github') {
window.location.host = hostParts[0] + '.' + hostParts[1];
}