Merge remote-tracking branch 'upstream/master' into min_mv_insert_block_size

This commit is contained in:
Nikolai Kochetov 2020-05-19 18:24:21 +03:00
commit c147badf4f
266 changed files with 10459 additions and 1317 deletions

View File

@ -1,8 +1,288 @@
## ClickHouse release v20.4
### ClickHouse release v20.4.2.9, 2020-05-12
#### Backward Incompatible Change
* System tables (e.g. system.query_log, system.trace_log, system.metric_log) are using compact data part format for parts smaller than 10 MiB in size. Compact data part format is supported since version 20.3. If you are going to downgrade to version less than 20.3, you should manually delete table data for system logs in `/var/lib/clickhouse/data/system/`.
* When string comparison involves FixedString and compared arguments are of different sizes, do comparison as if smaller string is padded to the length of the larger. This is intented for SQL compatibility if we imagine that FixedString data type corresponds to SQL CHAR. This closes [#9272](https://github.com/ClickHouse/ClickHouse/issues/9272). [#10363](https://github.com/ClickHouse/ClickHouse/pull/10363) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Make SHOW CREATE TABLE multiline. Now it is more readable and more like MySQL. [#10049](https://github.com/ClickHouse/ClickHouse/pull/10049) ([Azat Khuzhin](https://github.com/azat))
* Added a setting `validate_polygons` that is used in `pointInPolygon` function and enabled by default. [#9857](https://github.com/ClickHouse/ClickHouse/pull/9857) ([alexey-milovidov](https://github.com/alexey-milovidov))
#### New Feature
* Add support for secured connection from ClickHouse to Zookeeper [#10184](https://github.com/ClickHouse/ClickHouse/pull/10184) ([Konstantin Lebedev](https://github.com/xzkostyan))
* Support custom HTTP handlers. See ISSUES-5436 for description. [#7572](https://github.com/ClickHouse/ClickHouse/pull/7572) ([Winter Zhang](https://github.com/zhang2014))
* Add MessagePack Input/Output format. [#9889](https://github.com/ClickHouse/ClickHouse/pull/9889) ([Kruglov Pavel](https://github.com/Avogar))
* Add Regexp input format. [#9196](https://github.com/ClickHouse/ClickHouse/pull/9196) ([Kruglov Pavel](https://github.com/Avogar))
* Added output format `Markdown` for embedding tables in markdown documents. [#10317](https://github.com/ClickHouse/ClickHouse/pull/10317) ([Kruglov Pavel](https://github.com/Avogar))
* Added support for custom settings section in dictionaries. Also fixes issue [#2829](https://github.com/ClickHouse/ClickHouse/issues/2829). [#10137](https://github.com/ClickHouse/ClickHouse/pull/10137) ([Artem Streltsov](https://github.com/kekekekule))
* Added custom settings support in DDL-queries for CREATE DICTIONARY [#10465](https://github.com/ClickHouse/ClickHouse/pull/10465) ([Artem Streltsov](https://github.com/kekekekule))
* Add simple server-wide memory profiler that will collect allocation contexts when server memory usage becomes higher than the next allocation threshold. [#10444](https://github.com/ClickHouse/ClickHouse/pull/10444) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add setting `always_fetch_merged_part` which restrict replica to merge parts by itself and always prefer dowloading from other replicas. [#10379](https://github.com/ClickHouse/ClickHouse/pull/10379) ([alesapin](https://github.com/alesapin))
* Add function JSONExtractKeysAndValuesRaw which extracts raw data from JSON objects [#10378](https://github.com/ClickHouse/ClickHouse/pull/10378) ([hcz](https://github.com/hczhcz))
* Add memory usage from OS to `system.asynchronous_metrics`. [#10361](https://github.com/ClickHouse/ClickHouse/pull/10361) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added generic variants for functions `least` and `greatest`. Now they work with arbitrary number of arguments of arbitrary types. This fixes [#4767](https://github.com/ClickHouse/ClickHouse/issues/4767) [#10318](https://github.com/ClickHouse/ClickHouse/pull/10318) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Now ClickHouse controls timeouts of dictionary sources on its side. Two new settings added to cache dictionary configuration: `strict_max_lifetime_seconds`, which is `max_lifetime` by default, and `query_wait_timeout_milliseconds`, which is one minute by default. The first settings is also useful with `allow_read_expired_keys` settings (to forbid reading very expired keys). [#10337](https://github.com/ClickHouse/ClickHouse/pull/10337) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Add log_queries_min_type to filter which entries will be written to query_log [#10053](https://github.com/ClickHouse/ClickHouse/pull/10053) ([Azat Khuzhin](https://github.com/azat))
* Added function `isConstant`. This function checks whether its argument is constant expression and returns 1 or 0. It is intended for development, debugging and demonstration purposes. [#10198](https://github.com/ClickHouse/ClickHouse/pull/10198) ([alexey-milovidov](https://github.com/alexey-milovidov))
* add joinGetOrNull to return NULL when key is missing instead of returning the default value. [#10094](https://github.com/ClickHouse/ClickHouse/pull/10094) ([Amos Bird](https://github.com/amosbird))
* Consider `NULL` to be equal to `NULL` in `IN` operator, if the option `transform_null_in` is set. [#10085](https://github.com/ClickHouse/ClickHouse/pull/10085) ([achimbab](https://github.com/achimbab))
* Add `ALTER TABLE ... RENAME COLUMN` for MergeTree table engines family. [#9948](https://github.com/ClickHouse/ClickHouse/pull/9948) ([alesapin](https://github.com/alesapin))
* Support parallel distributed INSERT SELECT. [#9759](https://github.com/ClickHouse/ClickHouse/pull/9759) ([vxider](https://github.com/Vxider))
* Add ability to query Distributed over Distributed (w/o `distributed_group_by_no_merge`) ... [#9923](https://github.com/ClickHouse/ClickHouse/pull/9923) ([Azat Khuzhin](https://github.com/azat))
* Add function `arrayReduceInRanges` which aggregates array elements in given ranges. [#9598](https://github.com/ClickHouse/ClickHouse/pull/9598) ([hcz](https://github.com/hczhcz))
* Add Dictionary Status on prometheus exporter. [#9622](https://github.com/ClickHouse/ClickHouse/pull/9622) ([Guillaume Tassery](https://github.com/YiuRULE))
* Add function arrayAUC [#8698](https://github.com/ClickHouse/ClickHouse/pull/8698) ([taiyang-li](https://github.com/taiyang-li))
* Support `DROP VIEW` statement for better TPC-H compatibility. [#9831](https://github.com/ClickHouse/ClickHouse/pull/9831) ([Amos Bird](https://github.com/amosbird))
* Add 'strict_order' option to windowFunnel() [#9773](https://github.com/ClickHouse/ClickHouse/pull/9773) ([achimbab](https://github.com/achimbab))
* Support `DATE` and `TIMESTAMP` SQL operators, e.g. `SELECT date '2001-01-01'` [#9691](https://github.com/ClickHouse/ClickHouse/pull/9691) ([Artem Zuikov](https://github.com/4ertus2))
#### Experimental Feature
* Added experimental database engine Atomic. It supports non-blocking `DROP` and `RENAME TABLE` queries and atomic `EXCHANGE TABLES t1 AND t2` query [#7512](https://github.com/ClickHouse/ClickHouse/pull/7512) ([tavplubix](https://github.com/tavplubix))
* Initial support for ReplicatedMergeTree over S3 (it works in suboptimal way) [#10126](https://github.com/ClickHouse/ClickHouse/pull/10126) ([Pavel Kovalenko](https://github.com/Jokser))
#### Bug Fix
* Fixed incorrect scalar results inside inner query of `MATERIALIZED VIEW` in case if this query contained dependent table [#10603](https://github.com/ClickHouse/ClickHouse/pull/10603) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed bug, which caused HTTP requests to get stuck on client closing connection when `readonly=2` and `cancel_http_readonly_queries_on_client_close=1`. [#10684](https://github.com/ClickHouse/ClickHouse/pull/10684) ([tavplubix](https://github.com/tavplubix))
* Fix segfault in StorageBuffer when exception is thrown on server startup. Fixes [#10550](https://github.com/ClickHouse/ClickHouse/issues/10550) [#10609](https://github.com/ClickHouse/ClickHouse/pull/10609) ([tavplubix](https://github.com/tavplubix))
* The query`SYSTEM DROP DNS CACHE` now also drops caches used to check if user is allowed to connect from some IP addresses [#10608](https://github.com/ClickHouse/ClickHouse/pull/10608) ([tavplubix](https://github.com/tavplubix))
* Fix usage of multiple `IN` operators with an identical set in one query. Fixes [#10539](https://github.com/ClickHouse/ClickHouse/issues/10539) [#10686](https://github.com/ClickHouse/ClickHouse/pull/10686) ([Anton Popov](https://github.com/CurtizJ))
* Fix crash in `generateRandom` with nested types. Fixes [#10583](https://github.com/ClickHouse/ClickHouse/issues/10583). [#10734](https://github.com/ClickHouse/ClickHouse/pull/10734) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix data corruption for `LowCardinality(FixedString)` key column in `SummingMergeTree` which could have happened after merge. Fixes [#10489](https://github.com/ClickHouse/ClickHouse/issues/10489). [#10721](https://github.com/ClickHouse/ClickHouse/pull/10721) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix logic for aggregation_memory_efficient_merge_threads setting. [#10667](https://github.com/ClickHouse/ClickHouse/pull/10667) ([palasonic1](https://github.com/palasonic1))
* Fix disappearing totals. Totals could have being filtered if query had `JOIN` or subquery with external `WHERE` condition. Fixes [#10674](https://github.com/ClickHouse/ClickHouse/issues/10674) [#10698](https://github.com/ClickHouse/ClickHouse/pull/10698) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix the lack of parallel execution of remote queries with `distributed_aggregation_memory_efficient` enabled. Fixes [#10655](https://github.com/ClickHouse/ClickHouse/issues/10655) [#10664](https://github.com/ClickHouse/ClickHouse/pull/10664) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix possible incorrect number of rows for queries with `LIMIT`. Fixes [#10566](https://github.com/ClickHouse/ClickHouse/issues/10566), [#10709](https://github.com/ClickHouse/ClickHouse/issues/10709) [#10660](https://github.com/ClickHouse/ClickHouse/pull/10660) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix index corruption, which may occur in some cases after merging compact parts into another compact part. [#10531](https://github.com/ClickHouse/ClickHouse/pull/10531) ([Anton Popov](https://github.com/CurtizJ))
* Fix the situation, when mutation finished all parts, but hung up in `is_done=0`. [#10526](https://github.com/ClickHouse/ClickHouse/pull/10526) ([alesapin](https://github.com/alesapin))
* Fix overflow at beginning of unix epoch for timezones with fractional offset from UTC. Fixes [#9335](https://github.com/ClickHouse/ClickHouse/issues/9335). [#10513](https://github.com/ClickHouse/ClickHouse/pull/10513) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Better diagnostics for input formats. Fixes [#10204](https://github.com/ClickHouse/ClickHouse/issues/10204) [#10418](https://github.com/ClickHouse/ClickHouse/pull/10418) ([tavplubix](https://github.com/tavplubix))
* Fix numeric overflow in `simpleLinearRegression()` over large integers [#10474](https://github.com/ClickHouse/ClickHouse/pull/10474) ([hcz](https://github.com/hczhcz))
* Fix use-after-free in Distributed shutdown, avoid waiting for sending all batches [#10491](https://github.com/ClickHouse/ClickHouse/pull/10491) ([Azat Khuzhin](https://github.com/azat))
* Add CA certificates to clickhouse-server docker image [#10476](https://github.com/ClickHouse/ClickHouse/pull/10476) ([filimonov](https://github.com/filimonov))
* Fix a rare endless loop that might have occurred when using the `addressToLine` function or AggregateFunctionState columns. [#10466](https://github.com/ClickHouse/ClickHouse/pull/10466) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Handle zookeeper "no node error" during distributed query [#10050](https://github.com/ClickHouse/ClickHouse/pull/10050) ([Daniel Chen](https://github.com/Phantomape))
* Fix bug when server cannot attach table after column's default was altered. [#10441](https://github.com/ClickHouse/ClickHouse/pull/10441) ([alesapin](https://github.com/alesapin))
* Implicitly cast the default expression type to the column type for the ALIAS columns [#10563](https://github.com/ClickHouse/ClickHouse/pull/10563) ([Azat Khuzhin](https://github.com/azat))
* Don't remove metadata directory if `ATTACH DATABASE` fails [#10442](https://github.com/ClickHouse/ClickHouse/pull/10442) ([Winter Zhang](https://github.com/zhang2014))
* Avoid dependency on system tzdata. Fixes loading of `Africa/Casablanca` timezone on CentOS 8. Fixes [#10211](https://github.com/ClickHouse/ClickHouse/issues/10211) [#10425](https://github.com/ClickHouse/ClickHouse/pull/10425) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix some issues if data is inserted with quorum and then gets deleted (DROP PARTITION, TTL, etc.). It led to stuck of INSERTs or false-positive exceptions in SELECTs. Fixes [#9946](https://github.com/ClickHouse/ClickHouse/issues/9946) [#10188](https://github.com/ClickHouse/ClickHouse/pull/10188) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Check the number and type of arguments when creating BloomFilter index [#9623](https://github.com/ClickHouse/ClickHouse/issues/9623) [#10431](https://github.com/ClickHouse/ClickHouse/pull/10431) ([Winter Zhang](https://github.com/zhang2014))
* Prefer `fallback_to_stale_replicas` over `skip_unavailable_shards`, otherwise when both settings specified and there are no up-to-date replicas the query will fail (patch from @alex-zaitsev ) [#10422](https://github.com/ClickHouse/ClickHouse/pull/10422) ([Azat Khuzhin](https://github.com/azat))
* Fix the issue when a query with ARRAY JOIN, ORDER BY and LIMIT may return incomplete result. Fixes [#10226](https://github.com/ClickHouse/ClickHouse/issues/10226). [#10427](https://github.com/ClickHouse/ClickHouse/pull/10427) ([Vadim Plakhtinskiy](https://github.com/VadimPlh))
* Add database name to dictionary name after DETACH/ATTACH. Fixes system.dictionaries table and `SYSTEM RELOAD` query [#10415](https://github.com/ClickHouse/ClickHouse/pull/10415) ([Azat Khuzhin](https://github.com/azat))
* Fix possible incorrect result for extremes in processors pipeline. [#10131](https://github.com/ClickHouse/ClickHouse/pull/10131) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix possible segfault when the setting `distributed_group_by_no_merge` is enabled (introduced in 20.3.7.46 by [#10131](https://github.com/ClickHouse/ClickHouse/issues/10131)). [#10399](https://github.com/ClickHouse/ClickHouse/pull/10399) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix wrong flattening of `Array(Tuple(...))` data types. Fixes [#10259](https://github.com/ClickHouse/ClickHouse/issues/10259) [#10390](https://github.com/ClickHouse/ClickHouse/pull/10390) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix column names of constants inside JOIN that may clash with names of constants outside of JOIN [#9950](https://github.com/ClickHouse/ClickHouse/pull/9950) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Fix order of columns after Block::sortColumns() [#10826](https://github.com/ClickHouse/ClickHouse/pull/10826) ([Azat Khuzhin](https://github.com/azat))
* Fix possible `Pipeline stuck` error in `ConcatProcessor` which may happen in remote query. [#10381](https://github.com/ClickHouse/ClickHouse/pull/10381) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Don't make disk reservations for aggregations. Fixes [#9241](https://github.com/ClickHouse/ClickHouse/issues/9241) [#10375](https://github.com/ClickHouse/ClickHouse/pull/10375) ([Azat Khuzhin](https://github.com/azat))
* Fix wrong behaviour of datetime functions for timezones that has altered between positive and negative offsets from UTC (e.g. Pacific/Kiritimati). Fixes [#7202](https://github.com/ClickHouse/ClickHouse/issues/7202) [#10369](https://github.com/ClickHouse/ClickHouse/pull/10369) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Avoid infinite loop in `dictIsIn` function. Fixes #515 [#10365](https://github.com/ClickHouse/ClickHouse/pull/10365) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Disable GROUP BY sharding_key optimization by default and fix it for WITH ROLLUP/CUBE/TOTALS [#10516](https://github.com/ClickHouse/ClickHouse/pull/10516) ([Azat Khuzhin](https://github.com/azat))
* Check for error code when checking parts and don't mark part as broken if the error is like "not enough memory". Fixes [#6269](https://github.com/ClickHouse/ClickHouse/issues/6269) [#10364](https://github.com/ClickHouse/ClickHouse/pull/10364) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Show information about not loaded dictionaries in system tables. [#10234](https://github.com/ClickHouse/ClickHouse/pull/10234) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix nullptr dereference in StorageBuffer if server was shutdown before table startup. [#10641](https://github.com/ClickHouse/ClickHouse/pull/10641) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed `DROP` vs `OPTIMIZE` race in `ReplicatedMergeTree`. `DROP` could left some garbage in replica path in ZooKeeper if there was concurrent `OPTIMIZE` query. [#10312](https://github.com/ClickHouse/ClickHouse/pull/10312) ([tavplubix](https://github.com/tavplubix))
* Fix 'Logical error: CROSS JOIN has expressions' error for queries with comma and names joins mix. Fixes [#9910](https://github.com/ClickHouse/ClickHouse/issues/9910) [#10311](https://github.com/ClickHouse/ClickHouse/pull/10311) ([Artem Zuikov](https://github.com/4ertus2))
* Fix queries with `max_bytes_before_external_group_by`. [#10302](https://github.com/ClickHouse/ClickHouse/pull/10302) ([Artem Zuikov](https://github.com/4ertus2))
* Fix the issue with limiting maximum recursion depth in parser in certain cases. This fixes [#10283](https://github.com/ClickHouse/ClickHouse/issues/10283) This fix may introduce minor incompatibility: long and deep queries via clickhouse-client may refuse to work, and you should adjust settings `max_query_size` and `max_parser_depth` accordingly. [#10295](https://github.com/ClickHouse/ClickHouse/pull/10295) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow to use `count(*)` with multiple JOINs. Fixes [#9853](https://github.com/ClickHouse/ClickHouse/issues/9853) [#10291](https://github.com/ClickHouse/ClickHouse/pull/10291) ([Artem Zuikov](https://github.com/4ertus2))
* Fix error `Pipeline stuck` with `max_rows_to_group_by` and `group_by_overflow_mode = 'break'`. [#10279](https://github.com/ClickHouse/ClickHouse/pull/10279) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix 'Cannot add column' error while creating `range_hashed` dictionary using DDL query. Fixes [#10093](https://github.com/ClickHouse/ClickHouse/issues/10093). [#10235](https://github.com/ClickHouse/ClickHouse/pull/10235) ([alesapin](https://github.com/alesapin))
* Fix rare possible exception `Cannot drain connections: cancel first`. [#10239](https://github.com/ClickHouse/ClickHouse/pull/10239) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed bug where ClickHouse would throw "Unknown function lambda." error message when user tries to run ALTER UPDATE/DELETE on tables with ENGINE = Replicated*. Check for nondeterministic functions now handles lambda expressions correctly. [#10237](https://github.com/ClickHouse/ClickHouse/pull/10237) ([Alexander Kazakov](https://github.com/Akazz))
* Fixed reasonably rare segfault in StorageSystemTables that happens when SELECT ... FROM system.tables is run on a database with Lazy engine. [#10209](https://github.com/ClickHouse/ClickHouse/pull/10209) ([Alexander Kazakov](https://github.com/Akazz))
* Fix possible infinite query execution when the query actually should stop on LIMIT, while reading from infinite source like `system.numbers` or `system.zeros`. [#10206](https://github.com/ClickHouse/ClickHouse/pull/10206) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed "generateRandom" function for Date type. This fixes [#9973](https://github.com/ClickHouse/ClickHouse/issues/9973). Fix an edge case when dates with year 2106 are inserted to MergeTree tables with old-style partitioning but partitions are named with year 1970. [#10218](https://github.com/ClickHouse/ClickHouse/pull/10218) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Convert types if the table definition of a View does not correspond to the SELECT query. This fixes [#10180](https://github.com/ClickHouse/ClickHouse/issues/10180) and [#10022](https://github.com/ClickHouse/ClickHouse/issues/10022) [#10217](https://github.com/ClickHouse/ClickHouse/pull/10217) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `parseDateTimeBestEffort` for strings in RFC-2822 when day of week is Tuesday or Thursday. This fixes [#10082](https://github.com/ClickHouse/ClickHouse/issues/10082) [#10214](https://github.com/ClickHouse/ClickHouse/pull/10214) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix column names of constants inside JOIN that may clash with names of constants outside of JOIN. [#10207](https://github.com/ClickHouse/ClickHouse/pull/10207) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix move-to-prewhere optimization in presense of arrayJoin functions (in certain cases). This fixes [#10092](https://github.com/ClickHouse/ClickHouse/issues/10092) [#10195](https://github.com/ClickHouse/ClickHouse/pull/10195) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix issue with separator appearing in SCRAMBLE for native mysql-connector-java (JDBC) [#10140](https://github.com/ClickHouse/ClickHouse/pull/10140) ([BohuTANG](https://github.com/BohuTANG))
* Fix using the current database for an access checking when the database isn't specified. [#10192](https://github.com/ClickHouse/ClickHouse/pull/10192) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix ALTER of tables with compact parts. [#10130](https://github.com/ClickHouse/ClickHouse/pull/10130) ([Anton Popov](https://github.com/CurtizJ))
* Add the ability to relax the restriction on non-deterministic functions usage in mutations with `allow_nondeterministic_mutations` setting. [#10186](https://github.com/ClickHouse/ClickHouse/pull/10186) ([filimonov](https://github.com/filimonov))
* Fix `DROP TABLE` invoked for dictionary [#10165](https://github.com/ClickHouse/ClickHouse/pull/10165) ([Azat Khuzhin](https://github.com/azat))
* Convert blocks if structure does not match when doing `INSERT` into Distributed table [#10135](https://github.com/ClickHouse/ClickHouse/pull/10135) ([Azat Khuzhin](https://github.com/azat))
* The number of rows was logged incorrectly (as sum across all parts) when inserted block is split by parts with partition key. [#10138](https://github.com/ClickHouse/ClickHouse/pull/10138) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add some arguments check and support identifier arguments for MySQL Database Engine [#10077](https://github.com/ClickHouse/ClickHouse/pull/10077) ([Winter Zhang](https://github.com/zhang2014))
* Fix incorrect `index_granularity_bytes` check while creating new replica. Fixes [#10098](https://github.com/ClickHouse/ClickHouse/issues/10098). [#10121](https://github.com/ClickHouse/ClickHouse/pull/10121) ([alesapin](https://github.com/alesapin))
* Fix bug in `CHECK TABLE` query when table contain skip indices. [#10068](https://github.com/ClickHouse/ClickHouse/pull/10068) ([alesapin](https://github.com/alesapin))
* Fix Distributed-over-Distributed with the only one shard in a nested table [#9997](https://github.com/ClickHouse/ClickHouse/pull/9997) ([Azat Khuzhin](https://github.com/azat))
* Fix possible rows loss for queries with `JOIN` and `UNION ALL`. Fixes [#9826](https://github.com/ClickHouse/ClickHouse/issues/9826), [#10113](https://github.com/ClickHouse/ClickHouse/issues/10113). ... [#10099](https://github.com/ClickHouse/ClickHouse/pull/10099) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix bug in dictionary when local clickhouse server is used as source. It may caused memory corruption if types in dictionary and source are not compatible. [#10071](https://github.com/ClickHouse/ClickHouse/pull/10071) ([alesapin](https://github.com/alesapin))
* Fixed replicated tables startup when updating from an old ClickHouse version where `/table/replicas/replica_name/metadata` node doesn't exist. Fixes [#10037](https://github.com/ClickHouse/ClickHouse/issues/10037). [#10095](https://github.com/ClickHouse/ClickHouse/pull/10095) ([alesapin](https://github.com/alesapin))
* Fix error `Cannot clone block with columns because block has 0 columns ... While executing GroupingAggregatedTransform`. It happened when setting `distributed_aggregation_memory_efficient` was enabled, and distributed query read aggregating data with mixed single and two-level aggregation from different shards. [#10063](https://github.com/ClickHouse/ClickHouse/pull/10063) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix deadlock when database with materialized view failed attach at start [#10054](https://github.com/ClickHouse/ClickHouse/pull/10054) ([Azat Khuzhin](https://github.com/azat))
* Fix a segmentation fault that could occur in GROUP BY over string keys containing trailing zero bytes ([#8636](https://github.com/ClickHouse/ClickHouse/issues/8636), [#8925](https://github.com/ClickHouse/ClickHouse/issues/8925)). ... [#10025](https://github.com/ClickHouse/ClickHouse/pull/10025) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Fix wrong results of distributed queries when alias could override qualified column name. Fixes [#9672](https://github.com/ClickHouse/ClickHouse/issues/9672) [#9714](https://github.com/ClickHouse/ClickHouse/issues/9714) [#9972](https://github.com/ClickHouse/ClickHouse/pull/9972) ([Artem Zuikov](https://github.com/4ertus2))
* Fix possible deadlock in `SYSTEM RESTART REPLICAS` [#9955](https://github.com/ClickHouse/ClickHouse/pull/9955) ([tavplubix](https://github.com/tavplubix))
* Fix the number of threads used for remote query execution (performance regression, since 20.3). This happened when query from `Distributed` table was executed simultaneously on local and remote shards. Fixes [#9965](https://github.com/ClickHouse/ClickHouse/issues/9965) [#9971](https://github.com/ClickHouse/ClickHouse/pull/9971) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fixed `DeleteOnDestroy` logic in `ATTACH PART` which could lead to automatic removal of attached part and added few tests [#9410](https://github.com/ClickHouse/ClickHouse/pull/9410) ([Vladimir Chebotarev](https://github.com/excitoon))
* Fix a bug with `ON CLUSTER` DDL queries freezing on server startup. [#9927](https://github.com/ClickHouse/ClickHouse/pull/9927) ([Gagan Arneja](https://github.com/garneja))
* Fix bug in which the necessary tables weren't retrieved at one of the processing stages of queries to some databases. Fixes [#9699](https://github.com/ClickHouse/ClickHouse/issues/9699). [#9949](https://github.com/ClickHouse/ClickHouse/pull/9949) ([achulkov2](https://github.com/achulkov2))
* Fix 'Not found column in block' error when `JOIN` appears with `TOTALS`. Fixes [#9839](https://github.com/ClickHouse/ClickHouse/issues/9839) [#9939](https://github.com/ClickHouse/ClickHouse/pull/9939) ([Artem Zuikov](https://github.com/4ertus2))
* Fix parsing multiple hosts set in the CREATE USER command [#9924](https://github.com/ClickHouse/ClickHouse/pull/9924) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix `TRUNCATE` for Join table engine ([#9917](https://github.com/ClickHouse/ClickHouse/issues/9917)). [#9920](https://github.com/ClickHouse/ClickHouse/pull/9920) ([Amos Bird](https://github.com/amosbird))
* Fix race condition between drop and optimize in `ReplicatedMergeTree`. [#9901](https://github.com/ClickHouse/ClickHouse/pull/9901) ([alesapin](https://github.com/alesapin))
* Fix `DISTINCT` for Distributed when `optimize_skip_unused_shards` is set. [#9808](https://github.com/ClickHouse/ClickHouse/pull/9808) ([Azat Khuzhin](https://github.com/azat))
* Fix "scalar doesn't exist" error in ALTERs ([#9878](https://github.com/ClickHouse/ClickHouse/issues/9878)). ... [#9904](https://github.com/ClickHouse/ClickHouse/pull/9904) ([Amos Bird](https://github.com/amosbird))
* Fix error with qualified names in `distributed_product_mode=\'local\'`. Fixes [#4756](https://github.com/ClickHouse/ClickHouse/issues/4756) [#9891](https://github.com/ClickHouse/ClickHouse/pull/9891) ([Artem Zuikov](https://github.com/4ertus2))
* For INSERT queries shards now do clamp the settings from the initiator to their constraints instead of throwing an exception. This fix allows to send INSERT queries to a shard with another constraints. This change improves fix [#9447](https://github.com/ClickHouse/ClickHouse/issues/9447). [#9852](https://github.com/ClickHouse/ClickHouse/pull/9852) ([Vitaly Baranov](https://github.com/vitlibar))
* Add some retries when commiting offsets to Kafka broker, since it can reject commit if during `offsets.commit.timeout.ms` there were no enough replicas available for the `__consumer_offsets` topic [#9884](https://github.com/ClickHouse/ClickHouse/pull/9884) ([filimonov](https://github.com/filimonov))
* Fix Distributed engine behavior when virtual columns of the underlying table used in `WHERE` [#9847](https://github.com/ClickHouse/ClickHouse/pull/9847) ([Azat Khuzhin](https://github.com/azat))
* Fixed some cases when timezone of the function argument wasn't used properly. [#9574](https://github.com/ClickHouse/ClickHouse/pull/9574) ([Vasily Nemkov](https://github.com/Enmk))
* Fix 'Different expressions with the same alias' error when query has PREWHERE and WHERE on distributed table and `SET distributed_product_mode = 'local'`. [#9871](https://github.com/ClickHouse/ClickHouse/pull/9871) ([Artem Zuikov](https://github.com/4ertus2))
* Fix mutations excessive memory consumption for tables with a composite primary key. This fixes [#9850](https://github.com/ClickHouse/ClickHouse/issues/9850). [#9860](https://github.com/ClickHouse/ClickHouse/pull/9860) ([alesapin](https://github.com/alesapin))
* Fix calculating grants for introspection functions from the setting `allow_introspection_functions`. [#9840](https://github.com/ClickHouse/ClickHouse/pull/9840) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix max_distributed_connections (w/ and w/o Processors) [#9673](https://github.com/ClickHouse/ClickHouse/pull/9673) ([Azat Khuzhin](https://github.com/azat))
* Fix possible exception `Got 0 in totals chunk, expected 1` on client. It happened for queries with `JOIN` in case if right joined table had zero rows. Example: `select * from system.one t1 join system.one t2 on t1.dummy = t2.dummy limit 0 FORMAT TabSeparated;`. Fixes [#9777](https://github.com/ClickHouse/ClickHouse/issues/9777). ... [#9823](https://github.com/ClickHouse/ClickHouse/pull/9823) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix 'COMMA to CROSS JOIN rewriter is not enabled or cannot rewrite query' error in case of subqueries with COMMA JOIN out of tables lists (i.e. in WHERE). Fixes [#9782](https://github.com/ClickHouse/ClickHouse/issues/9782) [#9830](https://github.com/ClickHouse/ClickHouse/pull/9830) ([Artem Zuikov](https://github.com/4ertus2))
* Fix server crashing when `optimize_skip_unused_shards` is set and expression for key can't be converted to its field type [#9804](https://github.com/ClickHouse/ClickHouse/pull/9804) ([Azat Khuzhin](https://github.com/azat))
* Fix empty string handling in `splitByString`. [#9767](https://github.com/ClickHouse/ClickHouse/pull/9767) ([hcz](https://github.com/hczhcz))
* Fix broken `ALTER TABLE DELETE COLUMN` query for compact parts. [#9779](https://github.com/ClickHouse/ClickHouse/pull/9779) ([alesapin](https://github.com/alesapin))
* Fixed missing `rows_before_limit_at_least` for queries over http (with processors pipeline). Fixes [#9730](https://github.com/ClickHouse/ClickHouse/issues/9730) [#9757](https://github.com/ClickHouse/ClickHouse/pull/9757) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix excessive memory consumption in `ALTER` queries (mutations). This fixes [#9533](https://github.com/ClickHouse/ClickHouse/issues/9533) and [#9670](https://github.com/ClickHouse/ClickHouse/issues/9670). [#9754](https://github.com/ClickHouse/ClickHouse/pull/9754) ([alesapin](https://github.com/alesapin))
* Fix possible permanent "Cannot schedule a task" error. [#9154](https://github.com/ClickHouse/ClickHouse/pull/9154) ([Azat Khuzhin](https://github.com/azat))
* Fix bug in backquoting in external dictionaries DDL. Fixes [#9619](https://github.com/ClickHouse/ClickHouse/issues/9619). [#9734](https://github.com/ClickHouse/ClickHouse/pull/9734) ([alesapin](https://github.com/alesapin))
* Fixed data race in `text_log`. It does not correspond to any real bug. [#9726](https://github.com/ClickHouse/ClickHouse/pull/9726) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix bug in a replication that doesn't allow replication to work if the user has executed mutations on the previous version. This fixes [#9645](https://github.com/ClickHouse/ClickHouse/issues/9645). [#9652](https://github.com/ClickHouse/ClickHouse/pull/9652) ([alesapin](https://github.com/alesapin))
* Fixed incorrect internal function names for `sumKahan` and `sumWithOverflow`. It led to exception while using this functions in remote queries. [#9636](https://github.com/ClickHouse/ClickHouse/pull/9636) ([Azat Khuzhin](https://github.com/azat))
* Add setting `use_compact_format_in_distributed_parts_names` which allows to write files for `INSERT` queries into `Distributed` table with more compact format. This fixes [#9647](https://github.com/ClickHouse/ClickHouse/issues/9647). [#9653](https://github.com/ClickHouse/ClickHouse/pull/9653) ([alesapin](https://github.com/alesapin))
* Fix RIGHT and FULL JOIN with LowCardinality in JOIN keys. [#9610](https://github.com/ClickHouse/ClickHouse/pull/9610) ([Artem Zuikov](https://github.com/4ertus2))
* Fix possible exceptions `Size of filter doesn't match size of column` and `Invalid number of rows in Chunk` in `MergeTreeRangeReader`. They could appear while executing `PREWHERE` in some cases. [#9612](https://github.com/ClickHouse/ClickHouse/pull/9612) ([Anton Popov](https://github.com/CurtizJ))
* Allow `ALTER ON CLUSTER` of Distributed tables with internal replication. This fixes [#3268](https://github.com/ClickHouse/ClickHouse/issues/3268) [#9617](https://github.com/ClickHouse/ClickHouse/pull/9617) ([shinoi2](https://github.com/shinoi2))
* Fix issue when timezone was not preserved if you write a simple arithmetic expression like `time + 1` (in contrast to an expression like `time + INTERVAL 1 SECOND`). This fixes [#5743](https://github.com/ClickHouse/ClickHouse/issues/5743) [#9323](https://github.com/ClickHouse/ClickHouse/pull/9323) ([alexey-milovidov](https://github.com/alexey-milovidov))
#### Improvement
* Use time zone when comparing DateTime with string literal. This fixes [#5206](https://github.com/ClickHouse/ClickHouse/issues/5206). [#10515](https://github.com/ClickHouse/ClickHouse/pull/10515) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Print verbose diagnostic info if Decimal value cannot be parsed from text input format. [#10205](https://github.com/ClickHouse/ClickHouse/pull/10205) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add tasks/memory metrics for distributed/buffer schedule pools [#10449](https://github.com/ClickHouse/ClickHouse/pull/10449) ([Azat Khuzhin](https://github.com/azat))
* Display result as soon as it's ready for SELECT DISTINCT queries in clickhouse-local and HTTP interface. This fixes [#8951](https://github.com/ClickHouse/ClickHouse/issues/8951) [#9559](https://github.com/ClickHouse/ClickHouse/pull/9559) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow to use `SAMPLE OFFSET` query instead of `cityHash64(PRIMARY KEY) % N == n` for splitting in `clickhouse-copier`. To use this feature, pass `--experimental-use-sample-offset 1` as a command line argument. [#10414](https://github.com/ClickHouse/ClickHouse/pull/10414) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Allow to parse BOM in TSV if the first column cannot contain BOM in its value. This fixes [#10301](https://github.com/ClickHouse/ClickHouse/issues/10301) [#10424](https://github.com/ClickHouse/ClickHouse/pull/10424) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add Avro nested fields insert support [#10354](https://github.com/ClickHouse/ClickHouse/pull/10354) ([Andrew Onyshchuk](https://github.com/oandrew))
* Allowed to alter column in non-modifying data mode when the same type is specified. [#10382](https://github.com/ClickHouse/ClickHouse/pull/10382) ([Vladimir Chebotarev](https://github.com/excitoon))
* Auto `distributed_group_by_no_merge` on GROUP BY sharding key (if `optimize_skip_unused_shards` is set) [#10341](https://github.com/ClickHouse/ClickHouse/pull/10341) ([Azat Khuzhin](https://github.com/azat))
* Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed with GROUP BY sharding_key [#10373](https://github.com/ClickHouse/ClickHouse/pull/10373) ([Azat Khuzhin](https://github.com/azat))
* Added a setting `max_server_memory_usage` to limit total memory usage of the server. The metric `MemoryTracking` is now calculated without a drift. The setting `max_memory_usage_for_all_queries` is now obsolete and does nothing. This closes [#10293](https://github.com/ClickHouse/ClickHouse/issues/10293). [#10362](https://github.com/ClickHouse/ClickHouse/pull/10362) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add config option `system_tables_lazy_load`. If it's set to false, then system tables with logs are loaded at the server startup. [Alexander Burmak](https://github.com/Alex-Burmak), [Svyatoslav Tkhon Il Pak](https://github.com/DeifyTheGod), [#9642](https://github.com/ClickHouse/ClickHouse/pull/9642) [#10359](https://github.com/ClickHouse/ClickHouse/pull/10359) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Use background thread pool (background_schedule_pool_size) for distributed sends [#10263](https://github.com/ClickHouse/ClickHouse/pull/10263) ([Azat Khuzhin](https://github.com/azat))
* Use background thread pool for background buffer flushes. [#10315](https://github.com/ClickHouse/ClickHouse/pull/10315) ([Azat Khuzhin](https://github.com/azat))
* Support for one special case of removing incompletely written parts. This fixes [#9940](https://github.com/ClickHouse/ClickHouse/issues/9940). [#10221](https://github.com/ClickHouse/ClickHouse/pull/10221) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Use isInjective() over manual list of such functions for GROUP BY optimization. [#10342](https://github.com/ClickHouse/ClickHouse/pull/10342) ([Azat Khuzhin](https://github.com/azat))
* Avoid printing error message in log if client sends RST packet immediately on connect. It is typical behaviour of IPVS balancer with keepalived and VRRP. This fixes [#1851](https://github.com/ClickHouse/ClickHouse/issues/1851) [#10274](https://github.com/ClickHouse/ClickHouse/pull/10274) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow to parse `+inf` for floating point types. This closes [#1839](https://github.com/ClickHouse/ClickHouse/issues/1839) [#10272](https://github.com/ClickHouse/ClickHouse/pull/10272) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Implemented `generateRandom` table function for Nested types. This closes [#9903](https://github.com/ClickHouse/ClickHouse/issues/9903) [#10219](https://github.com/ClickHouse/ClickHouse/pull/10219) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Provide `max_allowed_packed` in MySQL compatibility interface that will help some clients to communicate with ClickHouse via MySQL protocol. [#10199](https://github.com/ClickHouse/ClickHouse/pull/10199) ([BohuTANG](https://github.com/BohuTANG))
* Allow literals for GLOBAL IN (i.e. `SELECT * FROM remote('localhost', system.one) WHERE dummy global in (0)`) [#10196](https://github.com/ClickHouse/ClickHouse/pull/10196) ([Azat Khuzhin](https://github.com/azat))
* Fix various small issues in interactive mode of clickhouse-client [#10194](https://github.com/ClickHouse/ClickHouse/pull/10194) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Avoid superfluous dictionaries load (system.tables, DROP/SHOW CREATE TABLE) [#10164](https://github.com/ClickHouse/ClickHouse/pull/10164) ([Azat Khuzhin](https://github.com/azat))
* Update to RWLock: timeout parameter for getLock() + implementation reworked to be phase fair [#10073](https://github.com/ClickHouse/ClickHouse/pull/10073) ([Alexander Kazakov](https://github.com/Akazz))
* Enhanced compatibility with native mysql-connector-java(JDBC) [#10021](https://github.com/ClickHouse/ClickHouse/pull/10021) ([BohuTANG](https://github.com/BohuTANG))
* The function `toString` is considered monotonic and can be used for index analysis even when applied in tautological cases with String or LowCardinality(String) argument. [#10110](https://github.com/ClickHouse/ClickHouse/pull/10110) ([Amos Bird](https://github.com/amosbird))
* Add `ON CLUSTER` clause support to commands `{CREATE|DROP} USER/ROLE/ROW POLICY/SETTINGS PROFILE/QUOTA`, `GRANT`. [#9811](https://github.com/ClickHouse/ClickHouse/pull/9811) ([Vitaly Baranov](https://github.com/vitlibar))
* Virtual hosted-style support for S3 URI [#9998](https://github.com/ClickHouse/ClickHouse/pull/9998) ([Pavel Kovalenko](https://github.com/Jokser))
* Now layout type for dictionaries with no arguments can be specified without round brackets in dictionaries DDL-queries. Fixes [#10057](https://github.com/ClickHouse/ClickHouse/issues/10057). [#10064](https://github.com/ClickHouse/ClickHouse/pull/10064) ([alesapin](https://github.com/alesapin))
* Add ability to use number ranges with leading zeros in filepath [#9989](https://github.com/ClickHouse/ClickHouse/pull/9989) ([Olga Khvostikova](https://github.com/stavrolia))
* Better memory usage in CROSS JOIN. [#10029](https://github.com/ClickHouse/ClickHouse/pull/10029) ([Artem Zuikov](https://github.com/4ertus2))
* Try to connect to all shards in cluster when getting structure of remote table and skip_unavailable_shards is set. [#7278](https://github.com/ClickHouse/ClickHouse/pull/7278) ([nvartolomei](https://github.com/nvartolomei))
* Add `total_rows`/`total_bytes` into the `system.tables` table. [#9919](https://github.com/ClickHouse/ClickHouse/pull/9919) ([Azat Khuzhin](https://github.com/azat))
* System log tables now use polymorpic parts by default. [#9905](https://github.com/ClickHouse/ClickHouse/pull/9905) ([Anton Popov](https://github.com/CurtizJ))
* Add type column into system.settings/merge_tree_settings [#9909](https://github.com/ClickHouse/ClickHouse/pull/9909) ([Azat Khuzhin](https://github.com/azat))
* Check for available CPU instructions at server startup as early as possible. [#9888](https://github.com/ClickHouse/ClickHouse/pull/9888) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Remove `ORDER BY` stage from mutations because we read from a single ordered part in a single thread. Also add check that the rows in mutation are ordered by sorting key and this order is not violated. [#9886](https://github.com/ClickHouse/ClickHouse/pull/9886) ([alesapin](https://github.com/alesapin))
* Implement operator LIKE for FixedString at left hand side. This is needed to better support TPC-DS queries. [#9890](https://github.com/ClickHouse/ClickHouse/pull/9890) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add `force_optimize_skip_unused_shards_no_nested` that will disable `force_optimize_skip_unused_shards` for nested Distributed table [#9812](https://github.com/ClickHouse/ClickHouse/pull/9812) ([Azat Khuzhin](https://github.com/azat))
* Now columns size is calculated only once for MergeTree data parts. [#9827](https://github.com/ClickHouse/ClickHouse/pull/9827) ([alesapin](https://github.com/alesapin))
* Evaluate constant expressions for `optimize_skip_unused_shards` (i.e. `SELECT * FROM foo_dist WHERE key=xxHash32(0)`) [#8846](https://github.com/ClickHouse/ClickHouse/pull/8846) ([Azat Khuzhin](https://github.com/azat))
* Check for using `Date` or `DateTime` column from TTL expressions was removed. [#9967](https://github.com/ClickHouse/ClickHouse/pull/9967) ([Vladimir Chebotarev](https://github.com/excitoon))
* DiskS3 hard links optimal implementation. [#9760](https://github.com/ClickHouse/ClickHouse/pull/9760) ([Pavel Kovalenko](https://github.com/Jokser))
* If `set multiple_joins_rewriter_version = 2` enables second version of multiple JOIN rewrites that keeps not clashed column names as is. It supports multiple JOINs with `USING` and allow `select *` for JOINs with subqueries. [#9739](https://github.com/ClickHouse/ClickHouse/pull/9739) ([Artem Zuikov](https://github.com/4ertus2))
* Implementation of "non-blocking" alter for StorageMergeTree [#9606](https://github.com/ClickHouse/ClickHouse/pull/9606) ([alesapin](https://github.com/alesapin))
* Add MergeTree full support for DiskS3 [#9646](https://github.com/ClickHouse/ClickHouse/pull/9646) ([Pavel Kovalenko](https://github.com/Jokser))
* Extend `splitByString` to support empty strings as separators. [#9742](https://github.com/ClickHouse/ClickHouse/pull/9742) ([hcz](https://github.com/hczhcz))
* Add a `timestamp_ns` column to `system.trace_log`. It contains a high-definition timestamp of the trace event, and allows to build timelines of thread profiles ("flame charts"). [#9696](https://github.com/ClickHouse/ClickHouse/pull/9696) ([Alexander Kuzmenkov](https://github.com/akuzm))
* When the setting `send_logs_level` is enabled, avoid intermixing of log messages and query progress. [#9634](https://github.com/ClickHouse/ClickHouse/pull/9634) ([Azat Khuzhin](https://github.com/azat))
* Added support of `MATERIALIZE TTL IN PARTITION`. [#9581](https://github.com/ClickHouse/ClickHouse/pull/9581) ([Vladimir Chebotarev](https://github.com/excitoon))
* Support complex types inside Avro nested fields [#10502](https://github.com/ClickHouse/ClickHouse/pull/10502) ([Andrew Onyshchuk](https://github.com/oandrew))
#### Performance Improvement
* Better insert logic for right table for Partial MergeJoin. [#10467](https://github.com/ClickHouse/ClickHouse/pull/10467) ([Artem Zuikov](https://github.com/4ertus2))
* Improved performance of row-oriented formats (more than 10% for CSV and more than 35% for Avro in case of narrow tables). [#10503](https://github.com/ClickHouse/ClickHouse/pull/10503) ([Andrew Onyshchuk](https://github.com/oandrew))
* Improved performance of queries with explicitly defined sets at right side of IN operator and tuples on the left side. [#10385](https://github.com/ClickHouse/ClickHouse/pull/10385) ([Anton Popov](https://github.com/CurtizJ))
* Use less memory for hash table in HashJoin. [#10416](https://github.com/ClickHouse/ClickHouse/pull/10416) ([Artem Zuikov](https://github.com/4ertus2))
* Special HashJoin over StorageDictionary. Allow rewrite `dictGet()` functions with JOINs. It's not backward incompatible itself but could uncover [#8400](https://github.com/ClickHouse/ClickHouse/issues/8400) on some installations. [#10133](https://github.com/ClickHouse/ClickHouse/pull/10133) ([Artem Zuikov](https://github.com/4ertus2))
* Enable parallel insert of materialized view when its target table supports. [#10052](https://github.com/ClickHouse/ClickHouse/pull/10052) ([vxider](https://github.com/Vxider))
* Improved performance of index analysis with monotonic functions. [#9607](https://github.com/ClickHouse/ClickHouse/pull/9607)[#10026](https://github.com/ClickHouse/ClickHouse/pull/10026) ([Anton Popov](https://github.com/CurtizJ))
* Using SSE2 or SSE4.2 SIMD intrinsics to speed up tokenization in bloom filters. [#9968](https://github.com/ClickHouse/ClickHouse/pull/9968) ([Vasily Nemkov](https://github.com/Enmk))
* Improved performance of queries with explicitly defined sets at right side of `IN` operator. This fixes performance regression in version 20.3. [#9740](https://github.com/ClickHouse/ClickHouse/pull/9740) ([Anton Popov](https://github.com/CurtizJ))
* Now clickhouse-copier splits each partition in number of pieces and copies them independently. [#9075](https://github.com/ClickHouse/ClickHouse/pull/9075) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Adding more aggregation methods. For example TPC-H query 1 will now pick `FixedHashMap<UInt16, AggregateDataPtr>` and gets 25% performance gain [#9829](https://github.com/ClickHouse/ClickHouse/pull/9829) ([Amos Bird](https://github.com/amosbird))
* Use single row counter for multiple streams in pre-limit transform. This helps to avoid uniting pipeline streams in queries with `limit` but without `order by` (like `select f(x) from (select x from t limit 1000000000)`) and use multiple threads for further processing. [#9602](https://github.com/ClickHouse/ClickHouse/pull/9602) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
#### Build/Testing/Packaging Improvement
* Use a fork of AWS SDK libraries from ClickHouse-Extras [#10527](https://github.com/ClickHouse/ClickHouse/pull/10527) ([Pavel Kovalenko](https://github.com/Jokser))
* Add integration tests for new ALTER RENAME COLUMN query. [#10654](https://github.com/ClickHouse/ClickHouse/pull/10654) ([vzakaznikov](https://github.com/vzakaznikov))
* Fix possible signed integer overflow in invocation of function `now64` with wrong arguments. This fixes [#8973](https://github.com/ClickHouse/ClickHouse/issues/8973) [#10511](https://github.com/ClickHouse/ClickHouse/pull/10511) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Split fuzzer and sanitizer configurations to make build config compatible with Oss-fuzz. [#10494](https://github.com/ClickHouse/ClickHouse/pull/10494) ([kyprizel](https://github.com/kyprizel))
* Fixes for clang-tidy on clang-10. [#10420](https://github.com/ClickHouse/ClickHouse/pull/10420) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Display absolute paths in error messages. Otherwise KDevelop fails to navigate to correct file and opens a new file instead. [#10434](https://github.com/ClickHouse/ClickHouse/pull/10434) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added `ASAN_OPTIONS` environment variable to investigate errors in CI stress tests with Address sanitizer. [#10440](https://github.com/ClickHouse/ClickHouse/pull/10440) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
* Enable ThinLTO for clang builds (experimental). [#10435](https://github.com/ClickHouse/ClickHouse/pull/10435) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Remove accidential dependency on Z3 that may be introduced if the system has Z3 solver installed. [#10426](https://github.com/ClickHouse/ClickHouse/pull/10426) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Move integration tests docker files to docker/ directory. [#10335](https://github.com/ClickHouse/ClickHouse/pull/10335) ([Ilya Yatsishin](https://github.com/qoega))
* Allow to use `clang-10` in CI. It ensures that [#10238](https://github.com/ClickHouse/ClickHouse/issues/10238) is fixed. [#10384](https://github.com/ClickHouse/ClickHouse/pull/10384) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Update OpenSSL to upstream master. Fixed the issue when TLS connections may fail with the message `OpenSSL SSL_read: error:14094438:SSL routines:ssl3_read_bytes:tlsv1 alert internal error` and `SSL Exception: error:2400006E:random number generator::error retrieving entropy`. The issue was present in version 20.1. [#8956](https://github.com/ClickHouse/ClickHouse/pull/8956) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix clang-10 build. https://github.com/ClickHouse/ClickHouse/issues/10238 [#10370](https://github.com/ClickHouse/ClickHouse/pull/10370) ([Amos Bird](https://github.com/amosbird))
* Add performance test for [Parallel INSERT for materialized view](https://github.com/ClickHouse/ClickHouse/pull/10052). [#10345](https://github.com/ClickHouse/ClickHouse/pull/10345) ([vxider](https://github.com/Vxider))
* Fix flaky test `test_settings_constraints_distributed.test_insert_clamps_settings`. [#10346](https://github.com/ClickHouse/ClickHouse/pull/10346) ([Vitaly Baranov](https://github.com/vitlibar))
* Add util to test results upload in CI ClickHouse [#10330](https://github.com/ClickHouse/ClickHouse/pull/10330) ([Ilya Yatsishin](https://github.com/qoega))
* Convert test results to JSONEachRow format in junit_to_html tool [#10323](https://github.com/ClickHouse/ClickHouse/pull/10323) ([Ilya Yatsishin](https://github.com/qoega))
* Update cctz. [#10215](https://github.com/ClickHouse/ClickHouse/pull/10215) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow to create HTML report from the purest JUnit XML report. [#10247](https://github.com/ClickHouse/ClickHouse/pull/10247) ([Ilya Yatsishin](https://github.com/qoega))
* Update the check for minimal compiler version. Fix the root cause of the issue [#10250](https://github.com/ClickHouse/ClickHouse/issues/10250) [#10256](https://github.com/ClickHouse/ClickHouse/pull/10256) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Initial support for live view tables over distributed [#10179](https://github.com/ClickHouse/ClickHouse/pull/10179) ([vzakaznikov](https://github.com/vzakaznikov))
* Fix (false) MSan report in MergeTreeIndexFullText. The issue first appeared in [#9968](https://github.com/ClickHouse/ClickHouse/issues/9968). [#10801](https://github.com/ClickHouse/ClickHouse/pull/10801) ([alexey-milovidov](https://github.com/alexey-milovidov))
* clickhouse-docker-util [#10151](https://github.com/ClickHouse/ClickHouse/pull/10151) ([filimonov](https://github.com/filimonov))
* Update pdqsort to recent version [#10171](https://github.com/ClickHouse/ClickHouse/pull/10171) ([Ivan](https://github.com/abyss7))
* Update libdivide to v3.0 [#10169](https://github.com/ClickHouse/ClickHouse/pull/10169) ([Ivan](https://github.com/abyss7))
* Add check with enabled polymorphic parts. [#10086](https://github.com/ClickHouse/ClickHouse/pull/10086) ([Anton Popov](https://github.com/CurtizJ))
* Add cross-compile build for FreeBSD. This fixes [#9465](https://github.com/ClickHouse/ClickHouse/issues/9465) [#9643](https://github.com/ClickHouse/ClickHouse/pull/9643) ([Ivan](https://github.com/abyss7))
* Add performance test for [#6924](https://github.com/ClickHouse/ClickHouse/issues/6924) [#6980](https://github.com/ClickHouse/ClickHouse/pull/6980) ([filimonov](https://github.com/filimonov))
* Add support of `/dev/null` in the `File` engine for better performance testing [#8455](https://github.com/ClickHouse/ClickHouse/pull/8455) ([Amos Bird](https://github.com/amosbird))
* Move all folders inside /dbms one level up [#9974](https://github.com/ClickHouse/ClickHouse/pull/9974) ([Ivan](https://github.com/abyss7))
* Add a test that checks that read from MergeTree with single thread is performed in order. Addition to [#9670](https://github.com/ClickHouse/ClickHouse/issues/9670) [#9762](https://github.com/ClickHouse/ClickHouse/pull/9762) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix the `00964_live_view_watch_events_heartbeat.py` test to avoid race condition. [#9944](https://github.com/ClickHouse/ClickHouse/pull/9944) ([vzakaznikov](https://github.com/vzakaznikov))
* Fix integration test `test_settings_constraints` [#9962](https://github.com/ClickHouse/ClickHouse/pull/9962) ([Vitaly Baranov](https://github.com/vitlibar))
* Every function in its own file, part 12. [#9922](https://github.com/ClickHouse/ClickHouse/pull/9922) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added performance test for the case of extremely slow analysis of array of tuples. [#9872](https://github.com/ClickHouse/ClickHouse/pull/9872) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Update zstd to 1.4.4. It has some minor improvements in performance and compression ratio. If you run replicas with different versions of ClickHouse you may see reasonable error messages `Data after merge is not byte-identical to data on another replicas.` with explanation. These messages are Ok and you should not worry. [#10663](https://github.com/ClickHouse/ClickHouse/pull/10663) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix TSan report in `system.stack_trace`. [#9832](https://github.com/ClickHouse/ClickHouse/pull/9832) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Removed dependency on `clock_getres`. [#9833](https://github.com/ClickHouse/ClickHouse/pull/9833) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added identifier names check with clang-tidy. [#9799](https://github.com/ClickHouse/ClickHouse/pull/9799) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Update "builder" docker image. This image is not used in CI but is useful for developers. [#9809](https://github.com/ClickHouse/ClickHouse/pull/9809) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Remove old `performance-test` tool that is no longer used in CI. `clickhouse-performance-test` is great but now we are using way superior tool that is doing comparison testing with sophisticated statistical formulas to achieve confident results regardless to various changes in environment. [#9796](https://github.com/ClickHouse/ClickHouse/pull/9796) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added most of clang-static-analyzer checks. [#9765](https://github.com/ClickHouse/ClickHouse/pull/9765) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Update Poco to 1.9.3 in preparation for MongoDB URI support. [#6892](https://github.com/ClickHouse/ClickHouse/pull/6892) ([Alexander Kuzmenkov](https://github.com/akuzm))
* Fix build with `-DUSE_STATIC_LIBRARIES=0 -DENABLE_JEMALLOC=0` [#9651](https://github.com/ClickHouse/ClickHouse/pull/9651) ([Artem Zuikov](https://github.com/4ertus2))
* For change log script, if merge commit was cherry-picked to release branch, take PR name from commit description. [#9708](https://github.com/ClickHouse/ClickHouse/pull/9708) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Support `vX.X-conflicts` tag in backport script. [#9705](https://github.com/ClickHouse/ClickHouse/pull/9705) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix `auto-label` for backporting script. [#9685](https://github.com/ClickHouse/ClickHouse/pull/9685) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Use libc++ in Darwin cross-build to make it consistent with native build. [#9665](https://github.com/ClickHouse/ClickHouse/pull/9665) ([Hui Wang](https://github.com/huiwang))
* Fix flacky test `01017_uniqCombined_memory_usage`. Continuation of [#7236](https://github.com/ClickHouse/ClickHouse/issues/7236). [#9667](https://github.com/ClickHouse/ClickHouse/pull/9667) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix build for native MacOS Clang compiler [#9649](https://github.com/ClickHouse/ClickHouse/pull/9649) ([Ivan](https://github.com/abyss7))
* Allow to add various glitches around `pthread_mutex_lock`, `pthread_mutex_unlock` functions. [#9635](https://github.com/ClickHouse/ClickHouse/pull/9635) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add support for `clang-tidy` in `packager` script. [#9625](https://github.com/ClickHouse/ClickHouse/pull/9625) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add ability to use unbundled msgpack. [#10168](https://github.com/ClickHouse/ClickHouse/pull/10168) ([Azat Khuzhin](https://github.com/azat))
## ClickHouse release v20.3
### ClickHouse release v20.3.8.53, 2020-04-23
### Bug Fix
#### Bug Fix
* Fixed wrong behaviour of datetime functions for timezones that has altered between positive and negative offsets from UTC (e.g. Pacific/Kiritimati). This fixes [#7202](https://github.com/ClickHouse/ClickHouse/issues/7202) [#10369](https://github.com/ClickHouse/ClickHouse/pull/10369) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix possible segfault with `distributed_group_by_no_merge` enabled (introduced in 20.3.7.46 by [#10131](https://github.com/ClickHouse/ClickHouse/issues/10131)). [#10399](https://github.com/ClickHouse/ClickHouse/pull/10399) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix wrong flattening of `Array(Tuple(...))` data types. This fixes [#10259](https://github.com/ClickHouse/ClickHouse/issues/10259) [#10390](https://github.com/ClickHouse/ClickHouse/pull/10390) ([alexey-milovidov](https://github.com/alexey-milovidov))
@ -18,7 +298,7 @@
* Fix the issue when a query with ARRAY JOIN, ORDER BY and LIMIT may return incomplete result. This fixes [#10226](https://github.com/ClickHouse/ClickHouse/issues/10226). Author: [Vadim Plakhtinskiy](https://github.com/VadimPlh). [#10427](https://github.com/ClickHouse/ClickHouse/pull/10427) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Check the number and type of arguments when creating BloomFilter index [#9623](https://github.com/ClickHouse/ClickHouse/issues/9623) [#10431](https://github.com/ClickHouse/ClickHouse/pull/10431) ([Winter Zhang](https://github.com/zhang2014))
### Performance Improvement
#### Performance Improvement
* Improved performance of queries with explicitly defined sets at right side of `IN` operator and tuples in the left side. This fixes performance regression in version 20.3. [#9740](https://github.com/ClickHouse/ClickHouse/pull/9740), [#10385](https://github.com/ClickHouse/ClickHouse/pull/10385) ([Anton Popov](https://github.com/CurtizJ))
### ClickHouse release v20.3.7.46, 2020-04-17
@ -59,7 +339,6 @@
* Fix bug in `CHECK TABLE` query when table contain skip indices. [#10068](https://github.com/ClickHouse/ClickHouse/pull/10068) ([alesapin](https://github.com/alesapin)).
* Fix error `Cannot clone block with columns because block has 0 columns ... While executing GroupingAggregatedTransform`. It happened when setting `distributed_aggregation_memory_efficient` was enabled, and distributed query read aggregating data with different level from different shards (mixed single and two level aggregation). [#10063](https://github.com/ClickHouse/ClickHouse/pull/10063) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a segmentation fault that could occur in GROUP BY over string keys containing trailing zero bytes ([#8636](https://github.com/ClickHouse/ClickHouse/issues/8636), [#8925](https://github.com/ClickHouse/ClickHouse/issues/8925)). [#10025](https://github.com/ClickHouse/ClickHouse/pull/10025) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix parallel distributed INSERT SELECT for remote table. This PR fixes the solution provided in [#9759](https://github.com/ClickHouse/ClickHouse/pull/9759). [#9999](https://github.com/ClickHouse/ClickHouse/pull/9999) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix the number of threads used for remote query execution (performance regression, since 20.3). This happened when query from `Distributed` table was executed simultaneously on local and remote shards. Fixes [#9965](https://github.com/ClickHouse/ClickHouse/issues/9965). [#9971](https://github.com/ClickHouse/ClickHouse/pull/9971) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix bug in which the necessary tables weren't retrieved at one of the processing stages of queries to some databases. Fixes [#9699](https://github.com/ClickHouse/ClickHouse/issues/9699). [#9949](https://github.com/ClickHouse/ClickHouse/pull/9949) ([achulkov2](https://github.com/achulkov2)).
* Fix 'Not found column in block' error when `JOIN` appears with `TOTALS`. Fixes [#9839](https://github.com/ClickHouse/ClickHouse/issues/9839). [#9939](https://github.com/ClickHouse/ClickHouse/pull/9939) ([Artem Zuikov](https://github.com/4ertus2)).

View File

@ -385,9 +385,6 @@ if (OS_LINUX AND NOT ENABLE_JEMALLOC)
endif ()
if (USE_OPENCL)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -DUSE_OPENCL=1")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DUSE_OPENCL=1")
if (OS_DARWIN)
set(OPENCL_LINKER_FLAGS "-framework OpenCL")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OPENCL_LINKER_FLAGS}")

View File

@ -1,13 +1,19 @@
# TODO: enable by default
if(0)
option(ENABLE_OPENCL "Enable OpenCL support" ${ENABLE_LIBRARIES})
endif()
if(ENABLE_OPENCL)
# Intel OpenCl driver: sudo apt install intel-opencl-icd
# TODO It's possible to add it as submodules: https://github.com/intel/compute-runtime/releases
# @sa https://github.com/intel/compute-runtime/releases
# OpenCL applications should link wiht ICD loader
# sudo apt install opencl-headers ocl-icd-libopencl1
# sudo ln -s /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/libOpenCL.so
# TODO: add https://github.com/OCL-dev/ocl-icd as submodule instead
find_package(OpenCL REQUIRED)
find_package(OpenCL)
if(OpenCL_FOUND)
set(USE_OPENCL 1)
endif()

View File

@ -27,7 +27,7 @@ function configure
kill -0 $left_pid
disown $left_pid
set +m
while ! clickhouse-client --port 9001 --query "select 1" ; do kill -0 $left_pid ; echo . ; sleep 1 ; done
while ! clickhouse-client --port 9001 --query "select 1" && kill -0 $left_pid ; do echo . ; sleep 1 ; done
echo server for setup started
clickhouse-client --port 9001 --query "create database test" ||:
@ -71,9 +71,9 @@ function restart
set +m
while ! clickhouse-client --port 9001 --query "select 1" ; do kill -0 $left_pid ; echo . ; sleep 1 ; done
while ! clickhouse-client --port 9001 --query "select 1" && kill -0 $left_pid ; do echo . ; sleep 1 ; done
echo left ok
while ! clickhouse-client --port 9002 --query "select 1" ; do kill -0 $right_pid ; echo . ; sleep 1 ; done
while ! clickhouse-client --port 9002 --query "select 1" && kill -0 $right_pid ; do echo . ; sleep 1 ; done
echo right ok
clickhouse-client --port 9001 --query "select * from system.tables where database != 'system'"
@ -263,7 +263,7 @@ done
wait
unset IFS
parallel --verbose --null < analyze-commands.txt
parallel --null < analyze-commands.txt
}
# Analyze results
@ -314,6 +314,25 @@ create table queries_old_format engine File(TSVWithNamesAndTypes, 'queries.rep')
from queries
;
-- save all test runs as JSON for the new comparison page
create table all_query_funs_json engine File(JSON, 'report/all-query-runs.json') as
select test, query, versions_runs[1] runs_left, versions_runs[2] runs_right
from (
select
test, query,
groupArrayInsertAt(runs, version) versions_runs
from (
select
replaceAll(_file, '-queries.tsv', '') test,
query, version,
groupArray(time) runs
from file('*-queries.tsv', TSV, 'query text, run int, version UInt32, time float')
group by test, query, version
)
group by test, query
)
;
create table changed_perf_tsv engine File(TSV, 'report/changed-perf.tsv') as
select left, right, diff, stat_threshold, changed_fail, test, query from queries where changed_show
order by abs(diff) desc;
@ -542,7 +561,7 @@ case "$stage" in
# to collect the logs. Prefer not to restart, because addresses might change
# and we won't be able to process trace_log data. Start in a subshell, so that
# it doesn't interfere with the watchdog through `wait`.
( time get_profiles || restart || get_profiles ||: )
( get_profiles || restart || get_profiles ||: )
# Kill the whole process group, because somehow when the subshell is killed,
# the sleep inside remains alive and orphaned.

View File

@ -1,10 +1,10 @@
## system.table\_name {#system-tables_table-name}
## system.table_name {#system-tables_table-name}
Description.
Columns:
- `column_name` ([data\_type\_name](path/to/data_type.md)) — Description.
- `column_name` ([data_type_name](path/to/data_type.md)) — Description.
**Example**

View File

@ -5,7 +5,7 @@ toc_title: How to Build ClickHouse on Mac OS X
# How to Build ClickHouse on Mac OS X {#how-to-build-clickhouse-on-mac-os-x}
Build should work on Mac OS X 10.15 (Catalina)
Build should work on Mac OS X 10.15 (Catalina).
## Install Homebrew {#install-homebrew}

View File

@ -3,4 +3,4 @@ toc_folder_title: Engines
toc_priority: 25
---
{## [Original article](https://clickhouse.tech/docs/en/engines/) ##}

View File

@ -50,7 +50,7 @@ sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
```
If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments). The `prestable` tag is sometimes available too.
If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments). `prestable` is sometimes also available.
Then run these commands to install packages:

View File

@ -8,27 +8,27 @@ toc_title: Playground
[ClickHouse Playground](https://play.clickhouse.tech) allows people to experiment with ClickHouse by running queries instantly, without setting up their server or cluster.
Several example datasets are available in the Playground as well as sample queries that show ClickHouse features. There's also a selection of ClickHouse LTS releases to experiment with.
ClickHouse Playground gives the experience of m2.small [Managed Service for ClickHouse](https://cloud.yandex.com/services/managed-clickhouse) instance hosted in [Yandex.Cloud](https://cloud.yandex.com/). More information about [cloud providers](../commercial/cloud.md).
ClickHouse Playground gives the experience of m2.small [Managed Service for ClickHouse](https://cloud.yandex.com/services/managed-clickhouse) instance (4 vCPU, 32 GB RAM) hosted in [Yandex.Cloud](https://cloud.yandex.com/). More information about [cloud providers](../commercial/cloud.md).
You can make queries to playground using any HTTP client, for example [curl](https://curl.haxx.se) or [wget](https://www.gnu.org/software/wget/), or set up a connection using [JDBC](../interfaces/jdbc.md) or [ODBC](../interfaces/odbc.md) drivers. More information about software products that support ClickHouse is available [here](../interfaces/index.md).
## Credentials
| Parameter | Value |
|:------------------|:----------------------------------------|
| HTTPS endpoint | `https://play-api.clickhouse.tech:8443` |
| Native endpoint | `play-api.clickhouse.tech:9440` |
| User | `playground` |
| Password | `clickhouse` |
!!! note "Note"
Note that all endpoints require a secure TLS connection.
| Parameter | Value |
|:--------------------|:----------------------------------------|
| HTTPS endpoint | `https://play-api.clickhouse.tech:8443` |
| Native TCP endpoint | `play-api.clickhouse.tech:9440` |
| User | `playground` |
| Password | `clickhouse` |
There are additional endpoints with specific ClickHouse releases to experiment with their differences (ports and user/password are the same as above):
* 20.3 LTS: `play-api-v20-3.clickhouse.tech`
* 19.14 LTS: `play-api-v19-14.clickhouse.tech`
!!! note "Note"
All these endpoints require a secure TLS connection.
## Limitations
The queries are executed as a read-only user. It implies some limitations:
@ -50,7 +50,7 @@ HTTPS endpoint example with `curl`:
curl "https://play-api.clickhouse.tech:8443/?query=SELECT+'Play+ClickHouse!';&user=playground&password=clickhouse&database=datasets"
```
TCP endpoint example with [../interfaces/cli.md]:
TCP endpoint example with [CLI](../interfaces/cli.md):
``` bash
clickhouse client --secure -h play-api.clickhouse.tech --port 9440 -u playground --password clickhouse -q "SELECT 'Play ClickHouse!'"
```

View File

@ -11,7 +11,7 @@ toc_title: Adopters
| Company | Industry | Usecase | Cluster Size | (Un)Compressed Data Size<abbr title="of single replica"><sup>\*</sup></abbr> | Reference |
|---------------------------------------------------------------------|---------------------------------|-----------------------|------------------------------------------------------------|------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [2gis](https://2gis.ru){.favicon} | Maps | Monitoring | — | — | [Talk in Russian, July 2019](https://youtu.be/58sPkXfq6nw) |
| [Aloha&nbsp;Browser](https://alohabrowser.com/){.favicon} | Mobile App | Browser backend | — | — | [Slides in Russian, May 2019](https://github.com/yandex/clickhouse-presentations/blob/master/meetup22/aloha.pdf) |
| [Aloha&nbsp;Browser](https://alohabrowser.com/){.favicon} | Mobile App | Browser backend | — | — | [Slides in Russian, May 2019](https://presentations.clickhouse.tech/meetup22/aloha.pdf) |
| [Amadeus](https://amadeus.com/){.favicon} | Travel | Analytics | — | — | [Press Release, April 2018](https://www.altinity.com/blog/2018/4/5/amadeus-technologies-launches-investment-and-insights-tool-based-on-machine-learning-and-strategy-algorithms) |
| [Appsflyer](https://www.appsflyer.com){.favicon} | Mobile analytics | Main product | — | — | [Talk in Russian, July 2019](https://www.youtube.com/watch?v=M3wbRlcpBbY) |
| [ArenaData](https://arenadata.tech/){.favicon} | Data Platform | Main product | — | — | [Slides in Russian, December 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup38/indexes.pdf) |

View File

@ -27,4 +27,4 @@ Under the same conditions, ClickHouse can handle several hundred queries per sec
We recommend inserting data in packets of at least 1000 rows, or no more than a single request per second. When inserting to a MergeTree table from a tab-separated dump, the insertion speed can be from 50 to 200 MB/s. If the inserted rows are around 1 Kb in size, the speed will be from 50,000 to 200,000 rows per second. If the rows are small, the performance can be higher in rows per second (on Banner System data -`>` 500,000 rows per second; on Graphite data -`>` 1,000,000 rows per second). To improve performance, you can make multiple INSERT queries in parallel, which scales linearly.
[Original article](https://clickhouse.tech/docs/en/introduction/performance/) <!--hide-->
{## [Original article](https://clickhouse.tech/docs/en/introduction/performance/) ##}

View File

@ -207,7 +207,7 @@ If `http_port` is specified, the OpenSSL configuration is ignored even if it is
**Example**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}
@ -733,7 +733,7 @@ Example
<mysql_port>9004</mysql_port>
```
## tmp\_path {#server-settings-tmp_path}
## tmp_path {#tmp-path}
Path to temporary data for processing large queries.
@ -746,16 +746,17 @@ Path to temporary data for processing large queries.
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
```
## tmp\_policy {#server-settings-tmp-policy}
## tmp_policy {#tmp-policy}
Policy from [`storage_configuration`](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-multiple-volumes) to store temporary files.
If not set [`tmp_path`](#server-settings-tmp_path) is used, otherwise it is ignored.
Policy from [storage_configuration](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-multiple-volumes) to store temporary files.
If not set, [tmp_path](#tmp-path) is used, otherwise it is ignored.
!!! note "Note"
- `move_factor` is ignored
- `keep_free_space_bytes` is ignored
- `max_data_part_size_bytes` is ignored
- you must have exactly one volume in that policy
- `move_factor` is ignored.
- `keep_free_space_bytes` is ignored.
- `max_data_part_size_bytes` is ignored.
- Уou must have exactly one volume in that policy.
## uncompressed\_cache\_size {#server-settings-uncompressed_cache_size}

View File

@ -1026,27 +1026,32 @@ Possible values:
Default value: 0.
## optimize\_skip\_unused\_shards {#settings-optimize_skip_unused_shards}
## optimize_skip_unused_shards {#optimize-skip-unused-shards}
Enables or disables skipping of unused shards for SELECT queries that have sharding key condition in PREWHERE/WHERE (assumes that the data is distributed by sharding key, otherwise do nothing).
Default value: 0
## force\_optimize\_skip\_unused\_shards {#settings-force_optimize_skip_unused_shards}
Enables or disables query execution if [`optimize_skip_unused_shards`](#settings-optimize_skip_unused_shards) enabled and skipping of unused shards is not possible. If the skipping is not possible and the setting is enabled exception will be thrown.
Enables or disables skipping of unused shards for [SELECT](../../sql-reference/statements/select/index.md) queries that have sharding key condition in `WHERE/PREWHERE` (assuming that the data is distributed by sharding key, otherwise does nothing).
Possible values:
- 0 - Disabled (do not throws)
- 1 - Disable query execution only if the table has sharding key
- 2 - Disable query execution regardless sharding key is defined for the table
- 0 — Disabled.
- 1 — Enabled.
Default value: 0
## force_optimize_skip_unused_shards {#force-optimize-skip-unused-shards}
Enables or disables query execution if [optimize_skip_unused_shards](#optimize-skip-unused-shards) is enabled and skipping of unused shards is not possible. If the skipping is not possible and the setting is enabled, an exception will be thrown.
Possible values:
- 0 — Disabled. ClickHouse doesn't throw an exception.
- 1 — Enabled. Query execution is disabled only if the table has a sharding key.
- 2 — Enabled. Query execution is disabled regardless of whether a sharding key is defined for the table.
Default value: 0
## force\_optimize\_skip\_unused\_shards\_no\_nested {#settings-force_optimize_skip_unused_shards_no_nested}
Reset [`optimize_skip_unused_shards`](#settings-force_optimize_skip_unused_shards) for nested `Distributed` table
Reset [`optimize_skip_unused_shards`](#optimize-skip-unused-shards) for nested `Distributed` table
Possible values:
@ -1250,7 +1255,9 @@ Default value: Empty
## background\_pool\_size {#background_pool_size}
Sets the number of threads performing background operations in table engines (for example, merges in [MergeTree engine](../../engines/table-engines/mergetree-family/index.md) tables). This setting is applied at ClickHouse server start and cant be changed in a user session. By adjusting this setting, you manage CPU and disk load. Smaller pool size utilizes less CPU and disk resources, but background processes advance slower which might eventually impact query performance.
Sets the number of threads performing background operations in table engines (for example, merges in [MergeTree engine](../../engines/table-engines/mergetree-family/index.md) tables). This setting is applied from `default` profile at ClickHouse server start and cant be changed in a user session. By adjusting this setting, you manage CPU and disk load. Smaller pool size utilizes less CPU and disk resources, but background processes advance slower which might eventually impact query performance.
Before changing it, please also take a look at related [MergeTree settings](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-merge_tree), such as `number_of_free_entries_in_pool_to_lower_max_size_of_merge` and `number_of_free_entries_in_pool_to_execute_mutation`.
Possible values:

View File

@ -536,26 +536,26 @@ Contains logging entries. Logging level which goes to this table can be limited
Columns:
- `event_date` (`Date`) - Date of the entry.
- `event_time` (`DateTime`) - Time of the entry.
- `microseconds` (`UInt32`) - Microseconds of the entry.
- `event_date` (Date) — Date of the entry.
- `event_time` (DateTime) — Time of the entry.
- `microseconds` (UInt32) — Microseconds of the entry.
- `thread_name` (String) — Name of the thread from which the logging was done.
- `thread_id` (UInt64) — OS thread ID.
- `level` (`Enum8`) - Entry level.
- `'Fatal' = 1`
- `'Critical' = 2`
- `'Error' = 3`
- `'Warning' = 4`
- `'Notice' = 5`
- `'Information' = 6`
- `'Debug' = 7`
- `'Trace' = 8`
- `query_id` (`String`) - ID of the query.
- `logger_name` (`LowCardinality(String)`) - Name of the logger (i.e. `DDLWorker`)
- `message` (`String`) - The message itself.
- `revision` (`UInt32`) - ClickHouse revision.
- `source_file` (`LowCardinality(String)`) - Source file from which the logging was done.
- `source_line` (`UInt64`) - Source line from which the logging was done.
- `level` (`Enum8`) — Entry level. Possible values:
- `1` or `'Fatal'`.
- `2` or `'Critical'`.
- `3` or `'Error'`.
- `4` or `'Warning'`.
- `5` or `'Notice'`.
- `6` or `'Information'`.
- `7` or `'Debug'`.
- `8` or `'Trace'`.
- `query_id` (String) — ID of the query.
- `logger_name` (LowCardinality(String)) — Name of the logger (i.e. `DDLWorker`).
- `message` (String) — The message itself.
- `revision` (UInt32) — ClickHouse revision.
- `source_file` (LowCardinality(String)) — Source file from which the logging was done.
- `source_line` (UInt64) — Source line from which the logging was done.
## system.query\_log {#system_tables-query_log}

View File

@ -58,6 +58,7 @@ LAYOUT(LAYOUT_TYPE(param value)) -- layout settings
- [range\_hashed](#range-hashed)
- [complex\_key\_hashed](#complex-key-hashed)
- [complex\_key\_cache](#complex-key-cache)
- [complex\_key\_direct](#complex-key-direct)
- [ip\_trie](#ip-trie)
### flat {#flat}
@ -317,6 +318,10 @@ or
LAYOUT(DIRECT())
```
### complex\_key\_direct {#complex-key-direct}
This type of storage is for use with composite [keys](external-dicts-dict-structure.md). Similar to `direct`.
### ip\_trie {#ip-trie}
This type of storage is for mapping network prefixes (IP addresses) to metadata such as ASN.

View File

@ -53,16 +53,16 @@ An exception is thrown when dividing by zero or when dividing a minimal negative
Differs from intDiv in that it returns zero when dividing by zero or when dividing a minimal negative number by minus one.
## modulo(a, b), a % b operator {#moduloa-b-a-b-operator}
## modulo(a, b), a % b operator {#modulo}
Calculates the remainder after division.
If arguments are floating-point numbers, they are pre-converted to integers by dropping the decimal portion.
The remainder is taken in the same sense as in C++. Truncated division is used for negative numbers.
An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.
## moduloOrZero(a, b) {#moduloorzeroa-b}
## moduloOrZero(a, b) {#modulo-or-zero}
Differs from modulo in that it returns zero when the divisor is zero.
Differs from [modulo](#modulo) in that it returns zero when the divisor is zero.
## negate(a), -a operator {#negatea-a-operator}

View File

@ -201,17 +201,17 @@ All changes on replicated tables are broadcasting to ZooKeeper so will be applie
The following operations with [partitions](../../engines/table-engines/mergetree-family/custom-partitioning-key.md) are available:
- [DETACH PARTITION](#alter_detach-partition) Moves a partition to the `detached` directory and forget it.
- [DROP PARTITION](#alter_drop-partition) Deletes a partition.
- [ATTACH PART\|PARTITION](#alter_attach-partition) Adds a part or partition from the `detached` directory to the table.
- [ATTACH PARTITION FROM](#alter_attach-partition-from) Copies the data partition from one table to another and adds.
- [REPLACE PARTITION](#alter_replace-partition) - Copies the data partition from one table to another and replaces.
- [MOVE PARTITION TO TABLE](#alter_move_to_table-partition)(#alter_move_to_table-partition) - Move the data partition from one table to another.
- [CLEAR COLUMN IN PARTITION](#alter_clear-column-partition) - Resets the value of a specified column in a partition.
- [CLEAR INDEX IN PARTITION](#alter_clear-index-partition) - Resets the specified secondary index in a partition.
- [FREEZE PARTITION](#alter_freeze-partition) Creates a backup of a partition.
- [FETCH PARTITION](#alter_fetch-partition) Downloads a partition from another server.
- [MOVE PARTITION\|PART](#alter_move-partition) Move partition/data part to another disk or volume.
- [DETACH PARTITION](#alter_detach-partition) Moves a partition to the `detached` directory and forget it.
- [DROP PARTITION](#alter_drop-partition) Deletes a partition.
- [ATTACH PART\|PARTITION](#alter_attach-partition) Adds a part or partition from the `detached` directory to the table.
- [ATTACH PARTITION FROM](#alter_attach-partition-from) Copies the data partition from one table to another and adds.
- [REPLACE PARTITION](#alter_replace-partition) Copies the data partition from one table to another and replaces.
- [MOVE PARTITION TO TABLE](#alter_move_to_table-partition) — Moves the data partition from one table to another.
- [CLEAR COLUMN IN PARTITION](#alter_clear-column-partition) Resets the value of a specified column in a partition.
- [CLEAR INDEX IN PARTITION](#alter_clear-index-partition) Resets the specified secondary index in a partition.
- [FREEZE PARTITION](#alter_freeze-partition) Creates a backup of a partition.
- [FETCH PARTITION](#alter_fetch-partition) Downloads a partition from another server.
- [MOVE PARTITION\|PART](#alter_move-partition) Move partition/data part to another disk or volume.
<!-- -->
@ -307,13 +307,13 @@ For the query to run successfully, the following conditions must be met:
ALTER TABLE table_source MOVE PARTITION partition_expr TO TABLE table_dest
```
This query move the data partition from the `table_source` to `table_dest` with deleting the data from `table_source`.
This query moves the data partition from the `table_source` to `table_dest` with deleting the data from `table_source`.
For the query to run successfully, the following conditions must be met:
- Both tables must have the same structure.
- Both tables must have the same partition key.
- Both tables must be the same engine family. (replicated or non-replicated)
- Both tables must be the same engine family (replicated or non-replicated).
- Both tables must have the same storage policy.
#### CLEAR COLUMN IN PARTITION {#alter_clear-column-partition}

View File

@ -5,13 +5,12 @@ toc_title: Roadmap
# Roadmap {#roadmap}
## Q1 2020 {#q1-2020}
- Role-based access control
## Q2 2020 {#q2-2020}
- Integration with external authentication services
## Q3 2020 {#q3-2020}
- Resource pools for more precise distribution of cluster capacity between users
{## [Original article](https://clickhouse.tech/docs/en/roadmap/) ##}

View File

@ -25,6 +25,7 @@ toc_title: Integrations
- Message queues
- [Kafka](https://kafka.apache.org)
- [clickhouse\_sinker](https://github.com/housepower/clickhouse_sinker) (uses [Go client](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- Stream processing
- [Flink](https://flink.apache.org)
- [flink-clickhouse-sink](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -209,7 +209,7 @@ Si `http_port` se especifica, la configuración de OpenSSL se ignora incluso si
**Ejemplo**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}

View File

@ -27,6 +27,7 @@ toc_title: "\u06CC\u06A9\u067E\u0627\u0631\u0686\u06AF\u06CC"
- صف پیام
- [کافکا](https://kafka.apache.org)
- [در حال بارگذاری](https://github.com/housepower/clickhouse_sinker) (استفاده [برو کارگیر](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- پردازش جریان
- [لرزش](https://flink.apache.org)
- [سینک فلینک-کلیک](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -210,7 +210,7 @@ toc_title: "\u062A\u0646\u0638\u06CC\u0645\u0627\u062A \u06A9\u0627\u0631\u06AF\
**مثال**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## نقلقولهای جدید از این نویسنده {#server_configuration_parameters-http_server_default_response}

View File

@ -27,6 +27,7 @@ toc_title: "Int\xE9gration"
- Files d'attente de messages
- [Kafka](https://kafka.apache.org)
- [clickhouse\_sinker](https://github.com/housepower/clickhouse_sinker) (utiliser [Allez client](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- Traitement de flux
- [Flink](https://flink.apache.org)
- [flink-clickhouse-évier](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -209,7 +209,7 @@ Si `http_port` est spécifié, la configuration OpenSSL est ignorée même si el
**Exemple**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}

View File

@ -60,6 +60,7 @@ LAYOUT(LAYOUT_TYPE(param value)) -- layout settings
- [range\_hashed](#range-hashed)
- [complex\_key\_hashed](#complex-key-hashed)
- [complex\_key\_cache](#complex-key-cache)
- [complex\_key\_direct](#complex-key-direct)
- [ip\_trie](#ip-trie)
### plat {#flat}
@ -319,6 +320,11 @@ ou
LAYOUT(DIRECT())
```
### complex\_key\_cache {#complex-key-cache}
Ce type de stockage est pour une utilisation avec composite [touches](external-dicts-dict-structure.md). Semblable à `direct`.
### ip\_trie {#ip-trie}
Ce type de stockage permet de mapper des préfixes de réseau (adresses IP) à des métadonnées telles que ASN.

View File

@ -27,6 +27,7 @@ toc_title: "\u7D71\u5408"
- メッセージキュ
- [カフカ](https://kafka.apache.org)
- [clickhouse\_sinker](https://github.com/housepower/clickhouse_sinker) (用途 [Goクライアント](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- ストリーム処理
- [フリンク](https://flink.apache.org)
- [フリンク-クリックハウス-シンク](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -209,7 +209,7 @@ HTTP経由でサーバーに接続するためのポート。
**例**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}

View File

@ -38,7 +38,7 @@ sudo rpm --import https://repo.yandex.ru/clickhouse/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.yandex.ru/clickhouse/rpm/stable/x86_64
```
Для использования наиболее свежих версий нужно заменить `stable` на `testing` (рекомендуется для тестовых окружений).
Для использования наиболее свежих версий нужно заменить `stable` на `testing` (рекомендуется для тестовых окружений). Также иногда доступен `prestable`.
Для, собственно, установки пакетов необходимо выполнить следующие команды:

View File

@ -20,6 +20,7 @@
- Очереди сообщений
- [Kafka](https://kafka.apache.org)
- [clickhouse\_sinker](https://github.com/housepower/clickhouse_sinker) (использует [Go client](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- Потоковая обработка
- [Flink](https://flink.apache.org)
- [flink-clickhouse-sink](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -195,7 +195,7 @@ ClickHouse проверит условия `min_part_size` и `min_part_size_rat
**Пример**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}
@ -686,7 +686,7 @@ TCP порт для защищённого обмена данными с кли
<mysql_port>9004</mysql_port>
```
## tmp\_path {#tmp-path}
## tmp_path {#tmp-path}
Путь ко временным данным для обработки больших запросов.
@ -698,6 +698,17 @@ TCP порт для защищённого обмена данными с кли
``` xml
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
```
## tmp_policy {#tmp-policy}
Политика из [storage_configuration](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-multiple-volumes) для хранения временных файлов.
Если политика не задана, используется [tmp_path](#tmp-path). В противном случае `tmp_path` игнорируется.
!!! note "Примечание"
- `move_factor` игнорируется.
- `keep_free_space_bytes` игнорируется.
- `max_data_part_size_bytes` игнорируется.
- В данной политике у вас должен быть ровно один том.
## uncompressed\_cache\_size {#server-settings-uncompressed_cache_size}

View File

@ -1025,6 +1025,29 @@ ClickHouse генерирует исключение
Значение по умолчанию: 0.
## optimize_skip_unused_shards {#optimize-skip-unused-shards}
Включает или отключает пропуск неиспользуемых шардов для запросов [SELECT](../../sql-reference/statements/select/index.md) , в которых условие ключа шардирования задано в секции `WHERE/PREWHERE`. Предполагается, что данные распределены с помощью ключа шардирования, в противном случае настройка ничего не делает.
Возможные значения:
- 0 — Выключена.
- 1 — Включена.
Значение по умолчанию: 0
## force_optimize_skip_unused_shards {#force-optimize-skip-unused-shards}
Разрешает или запрещает выполнение запроса, если настройка [optimize_skip_unused_shards](#optimize-skip-unused-shards) включена, а пропуск неиспользуемых шардов невозможен. Если данная настройка включена и пропуск невозможен, ClickHouse генерирует исключение.
Возможные значения:
- 0 — Выключена. ClickHouse не генерирует исключение.
- 1 — Включена. Выполнение запроса запрещается, только если у таблицы есть ключ шардирования.
- 2 — Включена. Выполнение запроса запрещается, даже если для таблицы не определен ключ шардирования.
Значение по умолчанию: 0
## optimize\_throw\_if\_noop {#setting-optimize_throw_if_noop}
Включает или отключает генерирование исключения в в случаях, когда запрос [OPTIMIZE](../../sql-reference/statements/misc.md#misc_operations-optimize) не выполняет мёрж.

View File

@ -517,6 +517,33 @@ CurrentMetric_ReplicatedChecks: 0
- `query` (String) текст запроса. Для запросов `INSERT` не содержит встаявляемые данные.
- `query_id` (String) идентификатор запроса, если был задан.
## system.text\_log {#system-tables-text-log}
Содержит записи логов. Уровень логирования для таблицы может быть ограничен параметром сервера `text_log.level`.
Столбцы:
- `event_date` (Date) — Дата создания записи.
- `event_time` (DateTime) — Время создания записи.
- `microseconds` (UInt32) — Время создания записи в микросекундах.
- `thread_name` (String) — Название потока, из которого была сделана запись.
- `thread_id` (UInt64) — Идентификатор потока ОС.
- `level` (Enum8) — Уровень логирования записи. Возможные значения:
- `1` или `'Fatal'`.
- `2` или `'Critical'`.
- `3` или `'Error'`.
- `4` или `'Warning'`.
- `5` или `'Notice'`.
- `6` или `'Information'`.
- `7` или `'Debug'`.
- `8` или `'Trace'`.
- `query_id` (String) — Идентификатор запроса.
- `logger_name` (LowCardinality(String)) — Название логгера (`DDLWorker`).
- `message` (String) — Само тело записи.
- `revision` (UInt32) — Ревизия ClickHouse.
- `source_file` (LowCardinality(String)) — Исходный файл, из которого была сделана запись.
- `source_line` (UInt64) — Исходная строка, из которой была сделана запись.
## system.query\_log {#system_tables-query_log}
Содержит информацию о выполнении запросов. Для каждого запроса вы можете увидеть время начала обработки, продолжительность обработки, сообщения об ошибках и другую информацию.

View File

@ -53,6 +53,7 @@ LAYOUT(LAYOUT_TYPE(param value)) -- layout settings
- [range\_hashed](#range-hashed)
- [complex\_key\_hashed](#complex-key-hashed)
- [complex\_key\_cache](#complex-key-cache)
- [complex\_key\_direct](#complex-key-direct)
- [ip\_trie](#ip-trie)
### flat {#flat}
@ -315,6 +316,10 @@ LAYOUT(CACHE(SIZE_IN_CELLS 1000000000))
LAYOUT(DIRECT())
```
### complex\_key\_direct {#complex-key-direct}
Тип размещения предназначен для использования с составными [ключами](external-dicts-dict-structure.md). Аналогичен `direct`.
### ip\_trie {#ip-trie}
Тип размещения предназначен для сопоставления префиксов сети (IP адресов) с метаданными, такими как ASN.

View File

@ -48,13 +48,17 @@ SELECT toTypeName(0), toTypeName(0 + 0), toTypeName(0 + 0 + 0), toTypeName(0 + 0
Отличается от intDiv тем, что при делении на ноль или при делении минимального отрицательного числа на минус единицу, возвращается ноль.
## modulo(a, b), оператор a % b {#moduloa-b-operator-a-b}
## modulo(a, b), оператор a % b {#modulo}
Вычисляет остаток от деления.
Если аргументы - числа с плавающей запятой, то они предварительно преобразуются в целые числа, путём отбрасывания дробной части.
Берётся остаток в том же смысле, как это делается в C++. По факту, для отрицательных чисел, используется truncated division.
При делении на ноль или при делении минимального отрицательного числа на минус единицу, кидается исключение.
## moduloOrZero(a, b) {#modulo-or-zero}
В отличие от [modulo](#modulo), возвращает ноль при делении на ноль.
## negate(a), оператор -a {#negatea-operator-a}
Вычисляет число, обратное по знаку. Результат всегда имеет знаковый тип.

View File

@ -1,6 +1,6 @@
# Функции для работы с внешними словарями {#ext_dict_functions}
Информацию о подключении и настройке внешних словарей смотрите в разделе [Внешние словари](../../sql-reference/functions/ext-dict-functions.md).
Информацию о подключении и настройке внешних словарей смотрите в разделе [Внешние словари](../../sql-reference/dictionaries/external-dictionaries/external-dicts.md).
## dictGet {#dictget}

View File

@ -204,17 +204,17 @@ ALTER TABLE [db].name DROP CONSTRAINT constraint_name;
Для работы с [партициями](../../sql-reference/statements/alter.md) доступны следующие операции:
- [DETACH PARTITION](#alter_detach-partition) перенести партицию в директорию `detached`;
- [DROP PARTITION](#alter_drop-partition) удалить партицию;
- [ATTACH PARTITION\|PART](#alter_attach-partition) добавить партицию/кусок в таблицу из директории `detached`;
- [ATTACH PARTITION FROM](#alter_attach-partition-from) скопировать партицию из другой таблицы;
- [REPLACE PARTITION](#alter_replace-partition) скопировать партицию из другой таблицы с заменой;
- [MOVE PARTITION TO TABLE](#alter_move_to_table-partition) (\#alter\_move\_to\_table-partition) - переместить партицию в другую таблицу;
- [CLEAR COLUMN IN PARTITION](#alter_clear-column-partition) удалить все значения в столбце для заданной партиции;
- [CLEAR INDEX IN PARTITION](#alter_clear-index-partition) - очистить построенные вторичные индексы для заданной партиции;
- [FREEZE PARTITION](#alter_freeze-partition) создать резервную копию партиции;
- [FETCH PARTITION](#alter_fetch-partition) скачать партицию с другого сервера;
- [MOVE PARTITION\|PART](#alter_move-partition) переместить партицию/кускок на другой диск или том.
- [DETACH PARTITION](#alter_detach-partition) перенести партицию в директорию `detached`;
- [DROP PARTITION](#alter_drop-partition) удалить партицию;
- [ATTACH PARTITION\|PART](#alter_attach-partition) добавить партицию/кусок в таблицу из директории `detached`;
- [ATTACH PARTITION FROM](#alter_attach-partition-from) скопировать партицию из другой таблицы;
- [REPLACE PARTITION](#alter_replace-partition) скопировать партицию из другой таблицы с заменой;
- [MOVE PARTITION TO TABLE](#alter_move_to_table-partition) переместить партицию в другую таблицу;
- [CLEAR COLUMN IN PARTITION](#alter_clear-column-partition) удалить все значения в столбце для заданной партиции;
- [CLEAR INDEX IN PARTITION](#alter_clear-index-partition) очистить построенные вторичные индексы для заданной партиции;
- [FREEZE PARTITION](#alter_freeze-partition) создать резервную копию партиции;
- [FETCH PARTITION](#alter_fetch-partition) скачать партицию с другого сервера;
- [MOVE PARTITION\|PART](#alter_move-partition) переместить партицию/кускок на другой диск или том.
#### DETACH PARTITION {#alter_detach-partition}
@ -312,12 +312,14 @@ ALTER TABLE table2 REPLACE PARTITION partition_expr FROM table1
ALTER TABLE table_source MOVE PARTITION partition_expr TO TABLE table_dest
```
Перемещает партицию из таблицы `table_source` в таблицу `table_dest` (добавляет к существующим данным в `table_dest`), с удалением данных из таблицы `table_source`.
Перемещает партицию из таблицы `table_source` в таблицу `table_dest` (добавляет к существующим данным в `table_dest`) с удалением данных из таблицы `table_source`.
Следует иметь в виду:
- Таблицы должны иметь одинаковую структуру.
- Для таблиц должен быть задан одинаковый ключ партиционирования.
- Движки таблиц должны быть одинакового семейства (реплицированные или нереплицированные).
- Для таблиц должна быть задана одинаковая политика хранения.
#### CLEAR COLUMN IN PARTITION {#alter_clear-column-partition}

View File

@ -119,6 +119,11 @@ class PatchedMacrosPlugin(macros.plugin.MacrosPlugin):
def on_page_markdown(self, markdown, page, config, files):
markdown = super(PatchedMacrosPlugin, self).on_page_markdown(markdown, page, config, files)
if os.path.islink(page.file.abs_src_path):
lang = config.data['theme']['language']
page.canonical_url = page.canonical_url.replace(f'/{lang}/', '/en/', 1)
if config.data['extra'].get('version_prefix') or config.data['extra'].get('single_page'):
return markdown
if self.skip_git_log:

View File

@ -44,7 +44,7 @@ then
if [[ ! -z "${CLOUDFLARE_TOKEN}" ]]
then
sleep 1m
git diff --stat="9999,9999" --diff-filter=M HEAD~1 | grep '|' | awk '$1 ~ /\.html$/ { if ($3>4) { url="https://clickhouse.tech/"$1; sub(/\/index.html/, "/", url); print "\""url"\""; }}' | split -l 25 /dev/stdin PURGE
git diff --stat="9999,9999" --diff-filter=M HEAD~1 | grep '|' | awk '$1 ~ /\.html$/ { if ($3>8) { url="https://content.clickhouse.tech/"$1; sub(/\/index.html/, "/", url); print "\""url"\""; }}' | split -l 25 /dev/stdin PURGE
for FILENAME in $(ls PURGE*)
do
POST_DATA=$(cat "${FILENAME}" | sed -n -e 'H;${x;s/\n/,/g;s/^,//;p;}' | awk '{print "{\"files\":["$0"]}";}')

View File

@ -1,7 +1,7 @@
Babel==2.8.0
backports-abc==0.5
backports.functools-lru-cache==1.6.1
beautifulsoup4==4.9.0
beautifulsoup4==4.9.1
certifi==2020.4.5.1
chardet==3.0.4
click==7.1.2
@ -21,7 +21,7 @@ mkdocs-htmlproofer-plugin==0.0.3
mkdocs-macros-plugin==0.4.7
nltk==3.5
nose==1.3.7
protobuf==3.11.3
protobuf==3.12.0
numpy==1.18.4
Pygments==2.5.2
pymdown-extensions==7.1
@ -31,7 +31,7 @@ repackage==0.7.3
requests==2.23.0
singledispatch==3.4.0.3
six==1.14.0
soupsieve==2.0
soupsieve==2.0.1
termcolor==1.1.0
tornado==5.1.1
Unidecode==1.1.1

View File

@ -1,9 +1,11 @@
import concurrent.futures
import hashlib
import json
import logging
import os
import shutil
import subprocess
import sys
import bs4
import closure
@ -20,25 +22,31 @@ def adjust_markdown_html(content):
content,
features='html.parser'
)
for a in soup.find_all('a'):
a_class = a.attrs.get('class')
if a_class and 'headerlink' in a_class:
a.string = '\xa0'
for details in soup.find_all('details'):
for summary in details.find_all('summary'):
if summary.parent != details:
summary.extract()
details.insert(0, summary)
for div in soup.find_all('div'):
div.attrs['role'] = 'alert'
div_class = div.attrs.get('class')
for a in div.find_all('a'):
a_class = a.attrs.get('class')
if a_class:
a.attrs['class'] = a_class + ['alert-link']
else:
a.attrs['class'] = 'alert-link'
is_admonition = div_class and 'admonition' in div.attrs.get('class')
if is_admonition:
for a in div.find_all('a'):
a_class = a.attrs.get('class')
if a_class:
a.attrs['class'] = a_class + ['alert-link']
else:
a.attrs['class'] = 'alert-link'
for p in div.find_all('p'):
p_class = p.attrs.get('class')
if p_class and ('admonition-title' in p_class):
p.attrs['class'] = p_class + ['alert-heading', 'display-5', 'mb-2']
if div_class and 'admonition' in div.attrs.get('class'):
if is_admonition and p_class and ('admonition-title' in p_class):
p.attrs['class'] = p_class + ['alert-heading', 'display-6', 'mb-2']
if is_admonition:
div.attrs['role'] = 'alert'
if ('info' in div_class) or ('note' in div_class):
mode = 'alert-primary'
elif ('attention' in div_class) or ('warning' in div_class):
@ -49,7 +57,7 @@ def adjust_markdown_html(content):
mode = 'alert-info'
else:
mode = 'alert-secondary'
div.attrs['class'] = div_class + ['alert', 'lead', 'pb-0', 'mb-4', mode]
div.attrs['class'] = div_class + ['alert', 'pb-0', 'mb-4', mode]
return str(soup)
@ -138,6 +146,7 @@ def get_js_in(args):
f"'{args.website_dir}/js/jquery.js'",
f"'{args.website_dir}/js/popper.js'",
f"'{args.website_dir}/js/bootstrap.js'",
f"'{args.website_dir}/js/sentry.js'",
f"'{args.website_dir}/js/base.js'",
f"'{args.website_dir}/js/index.js'",
f"'{args.website_dir}/js/docsearch.js'",
@ -145,6 +154,28 @@ def get_js_in(args):
]
def minify_file(path, css_digest, js_digest):
if not (
path.endswith('.html') or
path.endswith('.css')
):
return
logging.info('Minifying %s', path)
with open(path, 'rb') as f:
content = f.read().decode('utf-8')
if path.endswith('.html'):
content = minify_html(content)
content = content.replace('base.css?css_digest', f'base.css?{css_digest}')
content = content.replace('base.js?js_digest', f'base.js?{js_digest}')
elif path.endswith('.css'):
content = cssmin.cssmin(content)
elif path.endswith('.js'):
content = jsmin.jsmin(content)
with open(path, 'wb') as f:
f.write(content.encode('utf-8'))
def minify_website(args):
css_in = ' '.join(get_css_in(args))
css_out = f'{args.output_dir}/css/base.css'
@ -190,28 +221,17 @@ def minify_website(args):
if args.minify:
logging.info('Minifying website')
for root, _, filenames in os.walk(args.output_dir):
for filename in filenames:
path = os.path.join(root, filename)
if not (
filename.endswith('.html') or
filename.endswith('.css')
):
continue
logging.info('Minifying %s', path)
with open(path, 'rb') as f:
content = f.read().decode('utf-8')
if filename.endswith('.html'):
content = minify_html(content)
content = content.replace('base.css?css_digest', f'base.css?{css_digest}')
content = content.replace('base.js?js_digest', f'base.js?{js_digest}')
elif filename.endswith('.css'):
content = cssmin.cssmin(content)
elif filename.endswith('.js'):
content = jsmin.jsmin(content)
with open(path, 'wb') as f:
f.write(content.encode('utf-8'))
with concurrent.futures.ThreadPoolExecutor() as executor:
futures = []
for root, _, filenames in os.walk(args.output_dir):
for filename in filenames:
path = os.path.join(root, filename)
futures.append(executor.submit(minify_file, path, css_digest, js_digest))
for future in futures:
exc = future.exception()
if exc:
logging.error(exc)
sys.exit(1)
def process_benchmark_results(args):

View File

@ -27,6 +27,7 @@ toc_title: Entegrasyonlar
- Mesaj kuyrukları
- [Kafka](https://kafka.apache.org)
- [clickhouse\_sinker](https://github.com/housepower/clickhouse_sinker) (kullanma [Go client](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- Akış işleme
- [Flink](https://flink.apache.org)
- [flink-clickhouse-lavabo](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -209,7 +209,7 @@ Eğer `http_port` belirtilmişse, OpenSSL yapılandırması ayarlanmış olsa bi
**Örnek**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}

View File

@ -1,7 +1,5 @@
---
machine_translated: true
machine_translated_rev: 72537a2d527c63c07aa5d2361a8829f3895cf2bd
toc_folder_title: "\u53D1\u52A8\u673A"
toc_folder_title: "\u5f15\u64ce"
toc_priority: 25
---

View File

@ -19,6 +19,7 @@
- 消息队列
- [卡夫卡](https://kafka.apache.org)
- [clickhouse\_sinker](https://github.com/housepower/clickhouse_sinker) (使用 [去客户](https://github.com/ClickHouse/clickhouse-go/))
- [stream-loader-clickhouse](https://github.com/adform/stream-loader)
- 流处理
- [Flink](https://flink.apache.org)
- [flink-clickhouse-sink](https://github.com/ivi-ru/flink-clickhouse-sink)

View File

@ -207,7 +207,7 @@ ClickHouse每x秒重新加载内置字典。 这使得编辑字典 “on the fly
**示例**
``` xml
<https>0000</https>
<https_port>9999</https_port>
```
## http\_server\_default\_response {#server_configuration_parameters-http_server_default_response}

View File

@ -398,6 +398,10 @@ private:
ignore_error = config().getBool("ignore-error", false);
}
ClientInfo & client_info = context.getClientInfo();
client_info.setInitialQuery();
client_info.quota_key = config().getString("quota_key", "");
connect();
/// Initialize DateLUT here to avoid counting time spent here as query execution time.
@ -606,9 +610,7 @@ private:
server_version = toString(server_version_major) + "." + toString(server_version_minor) + "." + toString(server_version_patch);
if (
server_display_name = connection->getServerDisplayName(connection_parameters.timeouts);
server_display_name.length() == 0)
if (server_display_name = connection->getServerDisplayName(connection_parameters.timeouts); server_display_name.empty())
{
server_display_name = config().getString("host", "localhost");
}
@ -914,7 +916,7 @@ private:
query_id,
QueryProcessingStage::Complete,
&context.getSettingsRef(),
nullptr,
&context.getClientInfo(),
true);
sendExternalTables();
@ -946,7 +948,15 @@ private:
if (!parsed_insert_query.data && (is_interactive || (!stdin_is_a_tty && std_in.eof())))
throw Exception("No data to insert", ErrorCodes::NO_DATA_TO_INSERT);
connection->sendQuery(connection_parameters.timeouts, query_without_data, query_id, QueryProcessingStage::Complete, &context.getSettingsRef(), nullptr, true);
connection->sendQuery(
connection_parameters.timeouts,
query_without_data,
query_id,
QueryProcessingStage::Complete,
&context.getSettingsRef(),
&context.getClientInfo(),
true);
sendExternalTables();
/// Receive description of table structure.
@ -1719,6 +1729,7 @@ public:
*/
("password", po::value<std::string>()->implicit_value("\n", ""), "password")
("ask-password", "ask-password")
("quota_key", po::value<std::string>(), "A string to differentiate quotas when the user have keyed quotas configured on server")
("query_id", po::value<std::string>(), "query_id")
("query,q", po::value<std::string>(), "query")
("database,d", po::value<std::string>(), "database")
@ -1854,6 +1865,8 @@ public:
config().setString("password", options["password"].as<std::string>());
if (options.count("ask-password"))
config().setBool("ask-password", true);
if (options.count("quota_key"))
config().setString("quota_key", options["quota_key"].as<std::string>());
if (options.count("multiline"))
config().setBool("multiline", true);
if (options.count("multiquery"))

View File

@ -29,8 +29,10 @@ ConnectionParameters::ConnectionParameters(const Poco::Util::AbstractConfigurati
"port", config.getInt(is_secure ? "tcp_port_secure" : "tcp_port", is_secure ? DBMS_DEFAULT_SECURE_PORT : DBMS_DEFAULT_PORT));
default_database = config.getString("database", "");
/// changed the default value to "default" to fix the issue when the user in the prompt is blank
user = config.getString("user", "default");
bool password_prompt = false;
if (config.getBool("ask-password", false))
{
@ -52,6 +54,7 @@ ConnectionParameters::ConnectionParameters(const Poco::Util::AbstractConfigurati
if (auto result = readpassphrase(prompt.c_str(), buf, sizeof(buf), 0))
password = result;
}
compression = config.getBool("compression", true) ? Protocol::Compression::Enable : Protocol::Compression::Disable;
timeouts = ConnectionTimeouts(

View File

@ -23,7 +23,6 @@ struct ConnectionParameters
ConnectionTimeouts timeouts;
ConnectionParameters() {}
ConnectionParameters(const Poco::Util::AbstractConfiguration & config);
};

View File

@ -279,7 +279,7 @@ void LocalServer::processQueries()
context->makeSessionContext();
context->makeQueryContext();
context->setUser("default", "", Poco::Net::SocketAddress{}, "");
context->setUser("default", "", Poco::Net::SocketAddress{});
context->setCurrentQueryId("");
applyCmdSettings();

View File

@ -283,8 +283,10 @@ void HTTPHandler::processQuery(
}
std::string query_id = params.get("query_id", "");
context.setUser(user, password, request.clientAddress(), quota_key);
context.setUser(user, password, request.clientAddress());
context.setCurrentQueryId(query_id);
if (!quota_key.empty())
context.setQuotaKey(quota_key);
/// The user could specify session identifier and session timeout.
/// It allows to modify settings, create temporary tables and reuse them in subsequent requests.

View File

@ -61,13 +61,12 @@
#include <Common/ThreadFuzzer.h>
#include "MySQLHandlerFactory.h"
#if USE_OPENCL
#include "Common/BitonicSort.h"
#endif
#if !defined(ARCADIA_BUILD)
# include "config_core.h"
# include "Common/config_version.h"
# include "config_core.h"
# include "Common/config_version.h"
# if USE_OPENCL
# include "Common/BitonicSort.h" // Y_IGNORE
# endif
#endif
#if defined(OS_LINUX)
@ -225,8 +224,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
registerDictionaries();
registerDisks();
#if !defined(ARCADIA_BUILD)
#if USE_OPENCL
BitonicSort::getInstance().configure();
BitonicSort::getInstance().configure();
#endif
#endif
CurrentMetrics::set(CurrentMetrics::Revision, ClickHouseRevision::get());
@ -379,7 +380,7 @@ int Server::main(const std::vector<std::string> & /*args*/)
std::string tmp_path = config().getString("tmp_path", path + "tmp/");
std::string tmp_policy = config().getString("tmp_policy", "");
const VolumePtr & volume = global_context->setTemporaryStorage(tmp_path, tmp_policy);
for (const DiskPtr & disk : volume->disks)
for (const DiskPtr & disk : volume->getDisks())
setupTmpPath(log, disk->getPath());
}

View File

@ -735,7 +735,7 @@ void TCPHandler::receiveHello()
<< (!user.empty() ? ", user: " + user : "")
<< ".");
connection_context.setUser(user, password, socket().peerAddress(), "");
connection_context.setUser(user, password, socket().peerAddress());
}

View File

@ -61,8 +61,11 @@ void Connection::connect(const ConnectionTimeouts & timeouts)
if (connected)
disconnect();
LOG_TRACE(log_wrapper.get(), "Connecting. Database: " << (default_database.empty() ? "(not specified)" : default_database) << ". User: " << user
<< (static_cast<bool>(secure) ? ". Secure" : "") << (static_cast<bool>(compression) ? "" : ". Uncompressed"));
LOG_TRACE(log_wrapper.get(), "Connecting. Database: "
<< (default_database.empty() ? "(not specified)" : default_database)
<< ". User: " << user
<< (static_cast<bool>(secure) ? ". Secure" : "")
<< (static_cast<bool>(compression) ? "" : ". Uncompressed"));
if (static_cast<bool>(secure))
{
@ -165,12 +168,14 @@ void Connection::sendHello()
|| has_control_character(password))
throw Exception("Parameters 'default_database', 'user' and 'password' must not contain ASCII control characters", ErrorCodes::BAD_ARGUMENTS);
auto client_revision = ClickHouseRevision::get();
writeVarUInt(Protocol::Client::Hello, *out);
writeStringBinary((DBMS_NAME " ") + client_name, *out);
writeVarUInt(DBMS_VERSION_MAJOR, *out);
writeVarUInt(DBMS_VERSION_MINOR, *out);
// NOTE For backward compatibility of the protocol, client cannot send its version_patch.
writeVarUInt(ClickHouseRevision::get(), *out);
writeVarUInt(client_revision, *out);
writeStringBinary(default_database, *out);
writeStringBinary(user, *out);
writeStringBinary(password, *out);
@ -394,23 +399,10 @@ void Connection::sendQuery(
/// Client info.
if (server_revision >= DBMS_MIN_REVISION_WITH_CLIENT_INFO)
{
ClientInfo client_info_to_send;
if (!client_info || client_info->empty())
{
/// No client info passed - means this query initiated by me.
client_info_to_send.query_kind = ClientInfo::QueryKind::INITIAL_QUERY;
client_info_to_send.fillOSUserHostNameAndVersionInfo();
client_info_to_send.client_name = (DBMS_NAME " ") + client_name;
}
if (client_info && !client_info->empty())
client_info->write(*out, server_revision);
else
{
/// This query is initiated by another query.
client_info_to_send = *client_info;
client_info_to_send.query_kind = ClientInfo::QueryKind::SECONDARY_QUERY;
}
client_info_to_send.write(*out, server_revision);
ClientInfo().write(*out, server_revision);
}
/// Per query settings.

View File

@ -15,8 +15,8 @@ namespace DB
*
* void thread()
* {
* auto connection = pool.get();
* connection->sendQuery("SELECT 'Hello, world!' AS world");
* auto connection = pool.get();
* connection->sendQuery(...);
* }
*/

View File

@ -94,7 +94,7 @@ void MultiplexedConnections::sendQuery(
const String & query,
const String & query_id,
UInt64 stage,
const ClientInfo * client_info,
const ClientInfo & client_info,
bool with_pending_data)
{
std::lock_guard lock(cancel_mutex);
@ -126,14 +126,14 @@ void MultiplexedConnections::sendQuery(
{
modified_settings.parallel_replica_offset = i;
replica_states[i].connection->sendQuery(timeouts, query, query_id,
stage, &modified_settings, client_info, with_pending_data);
stage, &modified_settings, &client_info, with_pending_data);
}
}
else
{
/// Use single replica.
replica_states[0].connection->sendQuery(timeouts, query, query_id, stage,
&modified_settings, client_info, with_pending_data);
replica_states[0].connection->sendQuery(timeouts, query, query_id,
stage, &modified_settings, &client_info, with_pending_data);
}
sent_query = true;

View File

@ -36,10 +36,10 @@ public:
void sendQuery(
const ConnectionTimeouts & timeouts,
const String & query,
const String & query_id = "",
UInt64 stage = QueryProcessingStage::Complete,
const ClientInfo * client_info = nullptr,
bool with_pending_data = false);
const String & query_id,
UInt64 stage,
const ClientInfo & client_info,
bool with_pending_data);
/// Get packet from any replica.
Packet receivePacket();

View File

@ -21,7 +21,7 @@
#if !defined(ARCADIA_BUILD)
# include <Common/config.h>
# if USE_OPENCL
# include "Common/BitonicSort.h"
# include "Common/BitonicSort.h" // Y_IGNORE
# endif
#else
#undef USE_OPENCL
@ -38,6 +38,7 @@ namespace ErrorCodes
{
extern const int PARAMETER_OUT_OF_BOUND;
extern const int SIZES_OF_COLUMNS_DOESNT_MATCH;
extern const int OPENCL_ERROR;
extern const int LOGICAL_ERROR;
}
@ -120,6 +121,30 @@ namespace
};
}
template <typename T>
void ColumnVector<T>::getSpecialPermutation(bool reverse, size_t limit, int nan_direction_hint, IColumn::Permutation & res,
IColumn::SpecialSort special_sort) const
{
if (special_sort == IColumn::SpecialSort::OPENCL_BITONIC)
{
#if !defined(ARCADIA_BUILD)
#if USE_OPENCL
if (!limit || limit >= data.size())
{
res.resize(data.size());
if (data.empty() || BitonicSort::getInstance().sort(data, res, !reverse))
return;
}
#else
throw DB::Exception("'special_sort = bitonic' specified but OpenCL not available", DB::ErrorCodes::OPENCL_ERROR);
#endif
#endif
}
getPermutation(reverse, limit, nan_direction_hint, res);
}
template <typename T>
void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_direction_hint, IColumn::Permutation & res) const
{
@ -144,12 +169,6 @@ void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_directi
}
else
{
#if USE_OPENCL
/// If bitonic sort if specified as preferred than `nan_direction_hint` equals specific value 42.
if (nan_direction_hint == 42 && BitonicSort::getInstance().sort(data, res, !reverse))
return;
#endif
/// A case for radix sort
if constexpr (is_arithmetic_v<T> && !std::is_same_v<T, UInt128>)
{

View File

@ -189,6 +189,8 @@ public:
}
void getPermutation(bool reverse, size_t limit, int nan_direction_hint, IColumn::Permutation & res) const override;
void getSpecialPermutation(bool reverse, size_t limit, int nan_direction_hint, IColumn::Permutation & res,
IColumn::SpecialSort) const override;
void reserve(size_t n) override
{

View File

@ -245,6 +245,17 @@ public:
*/
virtual void getPermutation(bool reverse, size_t limit, int nan_direction_hint, Permutation & res) const = 0;
enum class SpecialSort
{
NONE = 0,
OPENCL_BITONIC,
};
virtual void getSpecialPermutation(bool reverse, size_t limit, int nan_direction_hint, Permutation & res, SpecialSort) const
{
getPermutation(reverse, limit, nan_direction_hint, res);
}
/** Copies each element according offsets parameter.
* (i-th element should be copied offsets[i] - offsets[i - 1] times.)
* It is necessary in ARRAY JOIN operation.
@ -306,8 +317,9 @@ public:
static MutablePtr mutate(Ptr ptr)
{
MutablePtr res = ptr->shallowMutate();
res->forEachSubcolumn([](WrappedPtr & subcolumn) { subcolumn = IColumn::mutate(std::move(subcolumn)); });
MutablePtr res = ptr->shallowMutate(); /// Now use_count is 2.
ptr.reset(); /// Reset use_count to 1.
res->forEachSubcolumn([](WrappedPtr & subcolumn) { subcolumn = IColumn::mutate(std::move(subcolumn).detach()); });
return res;
}

View File

@ -11,25 +11,32 @@
#include <CL/cl.h>
#endif
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <cstdint>
#include <map>
#include <type_traits>
#include <ext/bit_cast.h>
#include <Core/Types.h>
#include <Core/Defines.h>
#include <Common/PODArray.h>
#include <Columns/ColumnsCommon.h>
#include "oclBasics.cpp"
#include "oclBasics.h"
#include "bitonicSortKernels.cl"
class BitonicSort
{
public:
using KernelType = OCL::KernelType;
enum Types
{
KernelInt8 = 0,
KernelUInt8,
KernelInt16,
KernelUInt16,
KernelInt32,
KernelUInt32,
KernelInt64,
KernelUInt64,
KernelMax
};
static BitonicSort & getInstance()
{
@ -39,40 +46,50 @@ public:
/// Sorts given array in specified order. Returns `true` if given sequence was sorted, `false` otherwise.
template <typename T>
bool sort(const DB::PaddedPODArray<T> & data, DB::IColumn::Permutation & res, cl_uint sort_ascending)
bool sort(const DB::PaddedPODArray<T> & data, DB::IColumn::Permutation & res, cl_uint sort_ascending [[maybe_unused]]) const
{
size_t s = data.size();
/// Getting the nearest power of 2.
size_t power = 1;
if (s <= 8) power = 8;
else while (power < s) power <<= 1;
/// Allocates more space for additional stubs to be added if needed.
std::vector<T> pairs_content(power);
std::vector<UInt32> pairs_indices(power);
for (UInt32 i = 0; i < s; ++i)
if constexpr (
std::is_same_v<T, Int8> ||
std::is_same_v<T, UInt8> ||
std::is_same_v<T, Int16> ||
std::is_same_v<T, UInt16> ||
std::is_same_v<T, Int32> ||
std::is_same_v<T, UInt32> ||
std::is_same_v<T, Int64> ||
std::is_same_v<T, UInt64>)
{
pairs_content[i] = data[i];
pairs_indices[i] = i;
}
size_t data_size = data.size();
bool result = sort(pairs_content.data(), pairs_indices.data(), s, power - s, sort_ascending);
/// Getting the nearest power of 2.
size_t power = 8;
while (power < data_size)
power <<= 1;
if (!result) return false;
/// Allocates more space for additional stubs to be added if needed.
std::vector<T> pairs_content(power);
std::vector<UInt32> pairs_indices(power);
for (size_t i = 0, shift = 0; i < power; ++i)
{
if (pairs_indices[i] >= s)
memcpy(&pairs_content[0], &data[0], sizeof(T) * data_size);
for (UInt32 i = 0; i < data_size; ++i)
pairs_indices[i] = i;
fillWithStubs(pairs_content.data(), pairs_indices.data(), data_size, power - data_size, sort_ascending);
sort(pairs_content.data(), pairs_indices.data(), power, sort_ascending);
for (size_t i = 0, shift = 0; i < power; ++i)
{
++shift;
continue;
if (pairs_indices[i] >= data_size)
{
++shift;
continue;
}
res[i - shift] = pairs_indices[i];
}
res[i - shift] = pairs_indices[i];
return true;
}
return true;
return false;
}
/// Creating a configuration instance with making all OpenCl required variables
@ -84,29 +101,36 @@ public:
cl_platform_id platform = OCL::getPlatformID(settings);
cl_device_id device = OCL::getDeviceID(platform, settings);
cl_context gpu_context = OCL::makeContext(device, settings);
cl_command_queue command_queue = OCL::makeCommandQueue(device, gpu_context, settings);
cl_command_queue command_queue = OCL::makeCommandQueue<2>(device, gpu_context, settings);
cl_program program = OCL::makeProgram(bitonic_sort_kernels, gpu_context, device, settings);
/// Creating kernels for each specified data type.
cl_int error = 0;
kernels.resize(KernelMax);
kernels["char"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_char", &error),
clReleaseKernel);
kernels["uchar"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_uchar", &error),
clReleaseKernel);
kernels["short"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_short", &error),
clReleaseKernel);
kernels["ushort"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_ushort", &error),
clReleaseKernel);
kernels["int"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_int", &error),
clReleaseKernel);
kernels["uint"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_uint", &error),
clReleaseKernel);
kernels["long"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_long", &error),
clReleaseKernel);
kernels["ulong"] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_ulong", &error),
clReleaseKernel);
kernels[KernelInt8] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_char", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelUInt8] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_uchar", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelInt16] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_short", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelUInt16] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_ushort", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelInt32] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_int", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelUInt32] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_uint", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelInt64] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_long", &error), clReleaseKernel);
OCL::checkError(error);
kernels[KernelUInt64] = std::shared_ptr<KernelType>(clCreateKernel(program, "bitonicSort_ulong", &error), clReleaseKernel);
OCL::checkError(error);
configuration = std::shared_ptr<OCL::Configuration>(new OCL::Configuration(device, gpu_context, command_queue, program));
@ -114,97 +138,24 @@ public:
private:
/// Dictionary with kernels for each type from list: uchar, char, ushort, short, uint, int, ulong and long.
std::map<std::string, std::shared_ptr<KernelType>> kernels;
std::vector<std::shared_ptr<KernelType>> kernels;
/// Current configuration with core OpenCL instances.
std::shared_ptr<OCL::Configuration> configuration = nullptr;
/// Returns `true` if given sequence was sorted, `false` otherwise.
template <typename T>
bool sort(T * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
if (typeid(T).name() == typeid(cl_char).name())
sort_char(reinterpret_cast<cl_char *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_uchar))
sort_uchar(reinterpret_cast<cl_uchar *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_short))
sort_short(reinterpret_cast<cl_short *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_ushort))
sort_ushort(reinterpret_cast<cl_ushort *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_int))
sort_int(reinterpret_cast<cl_int *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_uint))
sort_uint(reinterpret_cast<cl_uint *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_long))
sort_long(reinterpret_cast<cl_long *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else if (typeid(T) == typeid(cl_ulong))
sort_ulong(reinterpret_cast<cl_ulong *>(p_input), indices, array_size, number_of_stubs, sort_ascending);
else
return false;
return true;
}
/// Specific functions for each integer type.
void sort_char(cl_char * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_char stubs_value = sort_ascending ? CHAR_MAX : CHAR_MIN;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["char"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_uchar(cl_uchar * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_uchar stubs_value = sort_ascending ? UCHAR_MAX : 0;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["uchar"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_short(cl_short * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_short stubs_value = sort_ascending ? SHRT_MAX : SHRT_MIN;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["short"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_ushort(cl_ushort * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_ushort stubs_value = sort_ascending ? USHRT_MAX : 0;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["ushort"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_int(cl_int * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_int stubs_value = sort_ascending ? INT_MAX : INT_MIN;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["int"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_uint(cl_uint * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_uint stubs_value = sort_ascending ? UINT_MAX : 0;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["uint"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_long(cl_long * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_long stubs_value = sort_ascending ? LONG_MAX : LONG_MIN;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["long"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
void sort_ulong(cl_ulong * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending)
{
cl_ulong stubs_value = sort_ascending ? ULONG_MAX : 0;
fillWithStubs(number_of_stubs, stubs_value, p_input, indices, array_size);
sort(kernels["ulong"].get(), p_input, indices, array_size + number_of_stubs, sort_ascending);
}
cl_kernel getKernel(Int8) const { return kernels[KernelInt8].get(); }
cl_kernel getKernel(UInt8) const { return kernels[KernelUInt8].get(); }
cl_kernel getKernel(Int16) const { return kernels[KernelInt16].get(); }
cl_kernel getKernel(UInt16) const { return kernels[KernelUInt16].get(); }
cl_kernel getKernel(Int32) const { return kernels[KernelInt32].get(); }
cl_kernel getKernel(UInt32) const { return kernels[KernelUInt32].get(); }
cl_kernel getKernel(Int64) const { return kernels[KernelInt64].get(); }
cl_kernel getKernel(UInt64) const { return kernels[KernelUInt64].get(); }
/// Sorts p_input inplace with indices. Works only with arrays which size equals to power of two.
template <class T>
void sort(cl_kernel kernel, T * p_input, cl_uint * indices, cl_int array_size, cl_uint sort_ascending)
void sort(T * p_input, cl_uint * indices, cl_int array_size, cl_uint sort_ascending) const
{
cl_kernel kernel = getKernel(T(0));
cl_int error = CL_SUCCESS;
cl_int num_stages = 0;
@ -246,7 +197,7 @@ private:
}
template <class T>
void configureKernel(cl_kernel kernel, int number_of_argument, void * source)
void configureKernel(cl_kernel kernel, int number_of_argument, void * source) const
{
cl_int error = clSetKernelArg(kernel, number_of_argument, sizeof(T), source);
OCL::checkError(error);
@ -254,9 +205,9 @@ private:
/// Fills given sequences from `arraySize` index with `numberOfStubs` values.
template <class T>
void fillWithStubs(cl_int number_of_stubs, T value, T * p_input,
cl_uint * indices, cl_int array_size)
void fillWithStubs(T * p_input, cl_uint * indices, cl_int array_size, cl_int number_of_stubs, cl_uint sort_ascending) const
{
T value = sort_ascending ? std::numeric_limits<T>::max() : std::numeric_limits<T>::min();
for (cl_int index = 0; index < number_of_stubs; ++index)
{
p_input[array_size + index] = value;
@ -264,7 +215,7 @@ private:
}
}
BitonicSort() {}
BitonicSort(BitonicSort const &);
void operator=(BitonicSort const &);
BitonicSort() = default;
BitonicSort(BitonicSort const &) = delete;
void operator = (BitonicSort const &) = delete;
};

View File

@ -217,6 +217,9 @@ protected:
operator const immutable_ptr<T> & () const { return value; }
operator immutable_ptr<T> & () { return value; }
/// Get internal immutable ptr. Does not change internal use counter.
immutable_ptr<T> detach() && { return std::move(value); }
operator bool() const { return value != nullptr; }
bool operator! () const { return value == nullptr; }

View File

@ -495,6 +495,9 @@ namespace ErrorCodes
extern const int ATOMIC_RENAME_FAIL = 521;
extern const int OPENCL_ERROR = 522;
extern const int UNKNOWN_ROW_POLICY = 523;
extern const int ALTER_OF_COLUMN_IS_FORBIDDEN = 524;
extern const int INCORRECT_DISK_INDEX = 525;
extern const int UNKNOWN_VOLUME_TYPE = 526;
extern const int KEEPER_EXCEPTION = 999;
extern const int POCO_EXCEPTION = 1000;

View File

@ -17,6 +17,8 @@ public:
explicit WeakHash32(size_t size) : data(size, ~UInt32(0)) {}
WeakHash32(const WeakHash32 & other) { data.assign(other.data); }
void reset(size_t size) { data.assign(size, ~UInt32(0)); }
const Container & getData() const { return data; }
Container & getData() { return data; }

View File

@ -151,16 +151,22 @@ public:
LOG_TRACE(log, BridgeHelperMixin::serviceAlias() + " is not running, will try to start it");
startBridge();
bool started = false;
for (size_t counter : ext::range(1, 20))
uint64_t milliseconds_to_wait = 10; /// Exponential backoff
uint64_t counter = 0;
while (milliseconds_to_wait < 10000)
{
++counter;
LOG_TRACE(log, "Checking " + BridgeHelperMixin::serviceAlias() + " is running, try " << counter);
if (checkBridgeIsRunning())
{
started = true;
break;
}
std::this_thread::sleep_for(std::chrono::milliseconds(10));
std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds_to_wait));
milliseconds_to_wait *= 2;
}
if (!started)
throw Exception(BridgeHelperMixin::getName() + "BridgeHelper: " + BridgeHelperMixin::serviceAlias() + " is not responding",
ErrorCodes::EXTERNAL_SERVER_IS_NOT_RESPONDING);

View File

@ -1,3 +1,5 @@
#pragma once
#include <Common/config.h>
#if USE_OPENCL
@ -15,24 +17,18 @@
#include <Core/Types.h>
#include <Common/Exception.h>
#ifndef CL_VERSION_2_0
#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#endif
using KernelType = std::remove_reference<decltype(*cl_kernel())>::type;
namespace DB
{
namespace ErrorCodes
{
extern const int OPENCL_ERROR;
}
namespace ErrorCodes
{
extern const int OPENCL_ERROR;
}
}
struct OCL
{
using KernelType = std::remove_reference<decltype(*cl_kernel())>::type;
/**
* Structure which represents the most essential settings of common OpenCl entities.
@ -209,7 +205,7 @@ struct OCL
static void checkError(cl_int error)
{
if (error != CL_SUCCESS)
throw DB::Exception("OpenCL error " + opencl_error_to_str(error), DB::ErrorCodes::OPENCL_ERROR);
throw DB::Exception("OpenCL error: " + opencl_error_to_str(error), DB::ErrorCodes::OPENCL_ERROR);
}
@ -221,22 +217,18 @@ struct OCL
cl_int error = clGetPlatformIDs(settings.number_of_platform_entries, &platform,
settings.number_of_available_platforms);
checkError(error);
return platform;
}
static cl_device_id getDeviceID(cl_platform_id & platform, const Settings & settings)
{
cl_device_id device;
cl_int error = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, settings.number_of_devices_entries,
&device, settings.number_of_available_devices);
OCL::checkError(error);
return device;
}
static cl_context makeContext(cl_device_id & device, const Settings & settings)
{
cl_int error;
@ -244,32 +236,43 @@ struct OCL
&device, settings.context_callback, settings.context_callback_data,
&error);
OCL::checkError(error);
return gpu_context;
}
template <int version>
static cl_command_queue makeCommandQueue(cl_device_id & device, cl_context & context, const Settings & settings [[maybe_unused]])
{
cl_int error;
#ifdef CL_USE_DEPRECATED_OPENCL_1_2_APIS
cl_command_queue command_queue = clCreateCommandQueue(context, device, settings.command_queue_properties, &error);
#else
cl_command_queue command_queue = clCreateCommandQueueWithProperties(context, device, nullptr, &error);
#endif
OCL::checkError(error);
cl_command_queue command_queue;
if constexpr (version == 1)
{
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
command_queue = clCreateCommandQueue(context, device, settings.command_queue_properties, &error);
#pragma GCC diagnostic pop
}
else
{
#ifdef CL_VERSION_2_0
command_queue = clCreateCommandQueueWithProperties(context, device, nullptr, &error);
#else
throw DB::Exception("Binary is built with OpenCL version < 2.0", DB::ErrorCodes::OPENCL_ERROR);
#endif
}
OCL::checkError(error);
return command_queue;
}
static cl_program makeProgram(const char * source_code, cl_context context,
cl_device_id device_id, const Settings & settings)
{
cl_int error = 0;
size_t source_size = strlen(source_code);
cl_program program = clCreateProgramWithSource(context, settings.number_of_program_source_pointers, &source_code, &source_size, &error);
cl_program program = clCreateProgramWithSource(context, settings.number_of_program_source_pointers,
&source_code, &source_size, &error);
checkError(error);
error = clBuildProgram(program, settings.number_of_devices_entries, &device_id, settings.build_options,
@ -291,39 +294,30 @@ struct OCL
}
checkError(error);
return program;
}
/// Configuring buffer for given input data
template<typename K>
static cl_mem createBuffer(K * p_input, cl_int array_size, cl_context context,
cl_int elements_size = sizeof(K))
static cl_mem createBuffer(K * p_input, cl_int array_size, cl_context context, cl_int elements_size = sizeof(K))
{
cl_int error = CL_SUCCESS;
cl_mem cl_input_buffer =
clCreateBuffer
(
cl_mem cl_input_buffer = clCreateBuffer(
context,
CL_MEM_USE_HOST_PTR,
zeroCopySizeAlignment(elements_size * array_size),
p_input,
&error
);
&error);
checkError(error);
return cl_input_buffer;
}
static size_t zeroCopySizeAlignment(size_t required_size)
{
return required_size + (~required_size + 1) % 64;
}
/// Manipulating with common OpenCL variables.
static void finishCommandQueue(cl_command_queue command_queue)
@ -333,10 +327,8 @@ struct OCL
OCL::checkError(error);
}
template<class T>
static void releaseData(T * origin, cl_int array_size, cl_mem cl_buffer,
cl_command_queue command_queue, size_t offset = 0)
static void releaseData(T * origin, cl_int array_size, cl_mem cl_buffer, cl_command_queue command_queue, size_t offset = 0)
{
cl_int error = CL_SUCCESS;
@ -357,7 +349,6 @@ struct OCL
error = clReleaseMemObject(cl_buffer);
checkError(error);
}
};
#endif

View File

@ -4,6 +4,7 @@
#if defined(linux) || defined(__linux) || defined(__linux__)
#include <unistd.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <linux/fs.h>
#include <sys/utsname.h>
@ -51,13 +52,11 @@ static void renameat2(const std::string & old_path, const std::string & new_path
{
if (old_path.empty() || new_path.empty())
throw Exception("Cannot rename " + old_path + " to " + new_path + ": path is empty", ErrorCodes::LOGICAL_ERROR);
if (old_path[0] != '/' || new_path[0] != '/')
throw Exception("Cannot rename " + old_path + " to " + new_path + ": path is relative", ErrorCodes::LOGICAL_ERROR);
/// int olddirfd (ignored for absolute oldpath), const char *oldpath,
/// int newdirfd (ignored for absolute newpath), const char *newpath,
/// unsigned int flags
if (0 == syscall(__NR_renameat2, 0, old_path.c_str(), 0, new_path.c_str(), flags))
if (0 == syscall(__NR_renameat2, AT_FDCWD, old_path.c_str(), AT_FDCWD, new_path.c_str(), flags))
return;
if (errno == EEXIST)

View File

@ -37,7 +37,7 @@ target_link_libraries (radix_sort PRIVATE clickhouse_common_io)
if (USE_OPENCL)
add_executable (bitonic_sort bitonic_sort.cpp)
target_link_libraries (bitonic_sort PRIVATE clickhouse_common_io ${OPENCL_LINKER_FLAGS})
target_link_libraries (bitonic_sort PRIVATE clickhouse_common_io ${OPENCL_LINKER_FLAGS} ${OpenCL_LIBRARIES})
endif ()
add_executable (arena_with_free_lists arena_with_free_lists.cpp)

View File

@ -1,8 +1,6 @@
#include <Common/config.h>
#include <iostream>
#if USE_OPENCL
#if !defined(__APPLE__) && !defined(__FreeBSD__)
#include <malloc.h>
#endif
@ -16,13 +14,10 @@
#include "Common/BitonicSort.h"
using Key = cl_ulong;
/// Generates vector of size 8 for testing.
/// Vector contains max possible value, min possible value and duplicate values.
template <class Type>
static void generateTest(std::vector<Type>& data, Type min_value, Type max_value)
static void generateTest(std::vector<Type> & data, Type min_value, Type max_value)
{
int size = 10;
@ -62,8 +57,7 @@ static void check(const std::vector<size_t> & indices, bool reverse = true)
template <class Type>
static void sortBitonicSortWithPodArrays(const std::vector<Type>& data,
std::vector<size_t> & indices, bool ascending = true)
static void sortBitonicSortWithPodArrays(const std::vector<Type> & data, std::vector<size_t> & indices, bool ascending = true)
{
DB::PaddedPODArray<Type> pod_array_data = DB::PaddedPODArray<Type>(data.size());
DB::IColumn::Permutation pod_array_indices = DB::IColumn::Permutation(data.size());
@ -74,7 +68,6 @@ static void sortBitonicSortWithPodArrays(const std::vector<Type>& data,
*(pod_array_indices.data() + index) = index;
}
BitonicSort::getInstance().configure();
BitonicSort::getInstance().sort(pod_array_data, pod_array_indices, ascending);
for (size_t index = 0; index < data.size(); ++index)
@ -83,7 +76,7 @@ static void sortBitonicSortWithPodArrays(const std::vector<Type>& data,
template <class Type>
static void testBitonicSort(std::string test_name, Type min_value, Type max_value)
static void testBitonicSort(const std::string & test_name, Type min_value, Type max_value)
{
std::cerr << test_name << std::endl;
@ -102,147 +95,80 @@ static void testBitonicSort(std::string test_name, Type min_value, Type max_valu
static void straightforwardTests()
{
testBitonicSort<cl_char>("Test 01: cl_char.", CHAR_MIN, CHAR_MAX);
testBitonicSort<cl_uchar>("Test 02: cl_uchar.", 0, UCHAR_MAX);
testBitonicSort<cl_short>("Test 03: cl_short.", SHRT_MIN, SHRT_MAX);
testBitonicSort<cl_ushort>("Test 04: cl_ushort.", 0, USHRT_MAX);
testBitonicSort<cl_int>("Test 05: cl_int.", INT_MIN, INT_MAX);
testBitonicSort<cl_uint >("Test 06: cl_uint.", 0, UINT_MAX);
testBitonicSort<cl_long >("Test 07: cl_long.", LONG_MIN, LONG_MAX);
testBitonicSort<cl_ulong >("Test 08: cl_ulong.", 0, ULONG_MAX);
testBitonicSort<DB::Int8>("Test 01: Int8.", CHAR_MIN, CHAR_MAX);
testBitonicSort<DB::UInt8>("Test 02: UInt8.", 0, UCHAR_MAX);
testBitonicSort<DB::Int16>("Test 03: Int16.", SHRT_MIN, SHRT_MAX);
testBitonicSort<DB::UInt16>("Test 04: UInt16.", 0, USHRT_MAX);
testBitonicSort<DB::Int32>("Test 05: Int32.", INT_MIN, INT_MAX);
testBitonicSort<DB::UInt32>("Test 06: UInt32.", 0, UINT_MAX);
testBitonicSort<DB::Int64>("Test 07: Int64.", LONG_MIN, LONG_MAX);
testBitonicSort<DB::UInt64>("Test 08: UInt64.", 0, ULONG_MAX);
}
static void NO_INLINE sort1(Key * data, size_t size)
template <typename T>
static void bitonicSort(std::vector<T> & data)
{
std::sort(data, data + size);
}
static void NO_INLINE sort2(std::vector<Key> & data, std::vector<size_t> & indices)
{
BitonicSort::getInstance().configure();
size_t size = data.size();
std::vector<size_t> indices(size);
for (size_t i = 0; i < size; ++i)
indices[i] = i;
sortBitonicSortWithPodArrays(data, indices);
std::vector<Key> result(data.size());
for (size_t index = 0; index < data.size(); ++index)
result[index] = data[indices[index]];
std::vector<T> result(size);
for (size_t i = 0; i < size; ++i)
result[i] = data[indices[i]];
data = std::move(result);
}
int main(int argc, char ** argv)
template <typename T>
static bool checkSort(const std::vector<T> & data, size_t size)
{
straightforwardTests();
std::vector<T> copy1(data.begin(), data.begin() + size);
std::vector<T> copy2(data.begin(), data.begin() + size);
if (argc < 3)
{
std::cerr << "Not enough arguments were passed\n";
return 1;
}
std::sort(copy1.data(), copy1.data() + size);
bitonicSort<T>(copy2);
size_t n = DB::parse<size_t>(argv[1]);
size_t method = DB::parse<size_t>(argv[2]);
for (size_t i = 0; i < size; ++i)
if (copy1[i] != copy2[i])
return false;
std::vector<Key> data(n);
std::vector<size_t> indices(n);
{
Stopwatch watch;
for (auto & elem : data)
elem = static_cast<Key>(rand());
for (size_t i = 0; i < n; ++i)
indices[i] = i;
watch.stop();
double elapsed = watch.elapsedSeconds();
std::cerr
<< "Filled in " << elapsed
<< " (" << n / elapsed << " elem/sec., "
<< n * sizeof(Key) / elapsed / 1048576 << " MB/sec.)"
<< std::endl;
}
if (n <= 100)
{
std::cerr << std::endl;
for (const auto & elem : data)
std::cerr << elem << ' ';
std::cerr << std::endl;
for (const auto & index : indices)
std::cerr << index << ' ';
std::cerr << std::endl;
}
{
Stopwatch watch;
if (method == 1) sort1(data.data(), n);
if (method == 2) sort2(data, indices);
watch.stop();
double elapsed = watch.elapsedSeconds();
std::cerr
<< "Sorted in " << elapsed
<< " (" << n / elapsed << " elem/sec., "
<< n * sizeof(Key) / elapsed / 1048576 << " MB/sec.)"
<< std::endl;
}
{
Stopwatch watch;
size_t i = 1;
while (i < n)
{
if (!(data[i - 1] <= data[i]))
break;
++i;
}
watch.stop();
double elapsed = watch.elapsedSeconds();
std::cerr
<< "Checked in " << elapsed
<< " (" << n / elapsed << " elem/sec., "
<< n * sizeof(Key) / elapsed / 1048576 << " MB/sec.)"
<< std::endl
<< "Result: " << (i == n ? "Ok." : "Fail!") << std::endl;
}
if (n <= 1000)
{
std::cerr << std::endl;
std::cerr << data[0] << ' ';
for (size_t i = 1; i < n; ++i)
{
if (!(data[i - 1] <= data[i]))
std::cerr << "*** ";
std::cerr << data[i] << ' ';
}
std::cerr << std::endl;
for (const auto & index : indices)
std::cerr << index << ' ';
std::cerr << std::endl;
}
return 0;
return true;
}
#else
int main()
{
std::cerr << "Openc CL disabled.";
BitonicSort::getInstance().configure();
straightforwardTests();
size_t size = 1100;
std::vector<int> data(size);
for (size_t i = 0; i < size; ++i)
data[i] = rand();
for (size_t i = 0; i < 128; ++i)
{
if (!checkSort<int>(data, i))
{
std::cerr << "fail at length " << i << std::endl;
return 1;
}
}
for (size_t i = 128; i < size; i += 7)
{
if (!checkSort<int>(data, i))
{
std::cerr << "fail at length " << i << std::endl;
return 1;
}
}
return 0;
}
#endif

View File

@ -959,7 +959,7 @@ public:
if (auth_response->empty())
{
context.setUser(user_name, "", address, "");
context.setUser(user_name, "", address);
return;
}
@ -982,7 +982,7 @@ public:
{
password_sha1[i] = digest[i] ^ static_cast<unsigned char>((*auth_response)[i]);
}
context.setUser(user_name, password_sha1, address, "");
context.setUser(user_name, password_sha1, address);
}
private:
String scramble;
@ -1124,7 +1124,7 @@ public:
password.pop_back();
}
context.setUser(user_name, password, address, "");
context.setUser(user_name, password, address);
}
private:

View File

@ -56,6 +56,7 @@ struct Settings : public SettingsCollection<Settings>
M(SettingUInt64, min_insert_block_size_bytes_for_materialized_views, 0, "Like min_insert_block_size_bytes, but applied only during pushing to MATERIALIZED VIEW (default: min_insert_block_size_bytes)", 0) \
M(SettingUInt64, max_joined_block_size_rows, DEFAULT_BLOCK_SIZE, "Maximum block size for JOIN result (if join algorithm supports it). 0 means unlimited.", 0) \
M(SettingUInt64, max_insert_threads, 0, "The maximum number of threads to execute the INSERT SELECT query. Values 0 or 1 means that INSERT SELECT is not run in parallel. Higher values will lead to higher memory usage. Parallel INSERT SELECT has effect only if the SELECT part is run on parallel, see 'max_threads' setting.", 0) \
M(SettingUInt64, max_final_threads, 16, "The maximum number of threads to read from table with FINAL.", 0) \
M(SettingMaxThreads, max_threads, 0, "The maximum number of threads to execute the request. By default, it is determined automatically.", 0) \
M(SettingMaxThreads, max_alter_threads, 0, "The maximum number of threads to execute the ALTER requests. By default, it is determined automatically.", 0) \
M(SettingUInt64, max_read_buffer_size, DBMS_DEFAULT_BUFFER_SIZE, "The maximum size of the buffer to read from the filesystem.", 0) \
@ -436,7 +437,6 @@ struct Settings : public SettingsCollection<Settings>
M(SettingBool, partial_merge_join, false, "Obsolete. Use join_algorithm='prefer_partial_merge' instead.", 0) \
M(SettingUInt64, max_memory_usage_for_all_queries, 0, "Obsolete. Will be removed after 2020-10-20", 0) \
DECLARE_SETTINGS_COLLECTION(LIST_OF_SETTINGS)
/** Set multiple settings from "profile" (in server configuration file (users.xml), profiles contain groups of multiple settings).

View File

@ -8,3 +8,4 @@
#cmakedefine01 USE_EMBEDDED_COMPILER
#cmakedefine01 USE_INTERNAL_LLVM_LIBRARY
#cmakedefine01 USE_SSL
#cmakedefine01 USE_OPENCL

View File

@ -347,7 +347,10 @@ void RemoteBlockInputStream::sendQuery()
established = true;
auto timeouts = ConnectionTimeouts::getTCPTimeoutsWithFailover(settings);
multiplexed_connections->sendQuery(timeouts, query, query_id, stage, &context.getClientInfo(), true);
ClientInfo modified_client_info = context.getClientInfo();
modified_client_info.query_kind = ClientInfo::QueryKind::SECONDARY_QUERY;
multiplexed_connections->sendQuery(timeouts, query, query_id, stage, modified_client_info, true);
established = false;
sent_query = true;

View File

@ -21,14 +21,17 @@ namespace ErrorCodes
RemoteBlockOutputStream::RemoteBlockOutputStream(Connection & connection_,
const ConnectionTimeouts & timeouts,
const String & query_,
const Settings * settings_,
const ClientInfo * client_info_)
: connection(connection_), query(query_), settings(settings_), client_info(client_info_)
const Settings & settings_,
const ClientInfo & client_info_)
: connection(connection_), query(query_)
{
/** Send query and receive "header", that describe table structure.
ClientInfo modified_client_info = client_info_;
modified_client_info.query_kind = ClientInfo::QueryKind::SECONDARY_QUERY;
/** Send query and receive "header", that describes table structure.
* Header is needed to know, what structure is required for blocks to be passed to 'write' method.
*/
connection.sendQuery(timeouts, query, "", QueryProcessingStage::Complete, settings, client_info);
connection.sendQuery(timeouts, query, "", QueryProcessingStage::Complete, &settings_, &modified_client_info);
while (true)
{

View File

@ -22,8 +22,8 @@ public:
RemoteBlockOutputStream(Connection & connection_,
const ConnectionTimeouts & timeouts,
const String & query_,
const Settings * settings_ = nullptr,
const ClientInfo * client_info_ = nullptr);
const Settings & settings_,
const ClientInfo & client_info_);
Block getHeader() const override { return header; }
@ -38,8 +38,6 @@ public:
private:
Connection & connection;
String query;
const Settings * settings;
const ClientInfo * client_info;
Block header;
bool finished = false;
};

View File

@ -376,15 +376,19 @@ void registerDataTypeString(DataTypeFactory & factory)
/// These synonyms are added for compatibility.
factory.registerAlias("CHAR", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("CHARACTER", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("VARCHAR", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("VARCHAR2", "String", DataTypeFactory::CaseInsensitive); /// Oracle
factory.registerAlias("TEXT", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("TINYTEXT", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("MEDIUMTEXT", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("LONGTEXT", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("BLOB", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("CLOB", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("TINYBLOB", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("MEDIUMBLOB", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("LONGBLOB", "String", DataTypeFactory::CaseInsensitive);
factory.registerAlias("BYTEA", "String", DataTypeFactory::CaseInsensitive); /// PostgreSQL
}
}

View File

@ -186,6 +186,8 @@ void registerDataTypeDecimal(DataTypeFactory & factory)
factory.registerDataType("Decimal", create, DataTypeFactory::CaseInsensitive);
factory.registerAlias("DEC", "Decimal", DataTypeFactory::CaseInsensitive);
factory.registerAlias("NUMERIC", "Decimal", DataTypeFactory::CaseInsensitive);
factory.registerAlias("FIXED", "Decimal", DataTypeFactory::CaseInsensitive);
}
/// Explicit template instantiations.

View File

@ -23,11 +23,17 @@ void registerDataTypeNumbers(DataTypeFactory & factory)
/// These synonyms are added for compatibility.
factory.registerAlias("TINYINT", "Int8", DataTypeFactory::CaseInsensitive);
factory.registerAlias("BOOL", "Int8", DataTypeFactory::CaseInsensitive);
factory.registerAlias("BOOLEAN", "Int8", DataTypeFactory::CaseInsensitive);
factory.registerAlias("INT1", "Int8", DataTypeFactory::CaseInsensitive); /// MySQL
factory.registerAlias("BYTE", "Int8", DataTypeFactory::CaseInsensitive); /// MS Access
factory.registerAlias("SMALLINT", "Int16", DataTypeFactory::CaseInsensitive);
factory.registerAlias("INT", "Int32", DataTypeFactory::CaseInsensitive);
factory.registerAlias("INTEGER", "Int32", DataTypeFactory::CaseInsensitive);
factory.registerAlias("BIGINT", "Int64", DataTypeFactory::CaseInsensitive);
factory.registerAlias("FLOAT", "Float32", DataTypeFactory::CaseInsensitive);
factory.registerAlias("REAL", "Float32", DataTypeFactory::CaseInsensitive);
factory.registerAlias("SINGLE", "Float32", DataTypeFactory::CaseInsensitive); /// MS Access
factory.registerAlias("DOUBLE", "Float64", DataTypeFactory::CaseInsensitive);
}

View File

@ -1,12 +1,14 @@
#include <Databases/DatabaseAtomic.h>
#include <Databases/DatabaseOnDisk.h>
#include <Poco/File.h>
#include <Poco/Path.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <Common/Stopwatch.h>
#include <Parsers/formatAST.h>
#include <Common/renameat2.h>
#include <Storages/StorageMaterializedView.h>
#include <filesystem>
namespace DB
@ -227,7 +229,7 @@ void DatabaseAtomic::commitCreateTable(const ASTCreateQuery & query, const Stora
void DatabaseAtomic::commitAlterTable(const StorageID & table_id, const String & table_metadata_tmp_path, const String & table_metadata_path)
{
SCOPE_EXIT({ Poco::File(table_metadata_tmp_path).remove(); });
SCOPE_EXIT({ std::error_code code; std::filesystem::remove(table_metadata_tmp_path, code); });
std::unique_lock lock{mutex};
auto actual_table_id = getTableUnlocked(table_id.table_name, lock)->getStorageID();
@ -323,7 +325,7 @@ void DatabaseAtomic::tryCreateSymlink(const String & table_name, const String &
try
{
String link = path_to_table_symlinks + escapeForFileName(table_name);
String data = global_context.getPath() + actual_data_path;
String data = Poco::Path(global_context.getPath()).makeAbsolute().toString() + actual_data_path;
Poco::File{data}.linkTo(link, Poco::File::LINK_SYMBOLIC);
}
catch (...)

View File

@ -29,7 +29,6 @@
#include <Common/escapeForFileName.h>
#include <Common/typeid_cast.h>
#include <common/logger_useful.h>
#include <ext/scope_guard.h>
namespace DB

View File

@ -74,7 +74,7 @@ ClickHouseDictionarySource::ClickHouseDictionarySource(
, load_all_query{query_builder.composeLoadAllQuery()}
{
/// We should set user info even for the case when the dictionary is loaded in-process (without TCP communication).
context.setUser(user, password, Poco::Net::SocketAddress("127.0.0.1", 0), {});
context.setUser(user, password, Poco::Net::SocketAddress("127.0.0.1", 0));
context = copyContextAndApplySettings(path_to_settings, context, config);
/// Query context is needed because some code in executeQuery function may assume it exists.

View File

@ -0,0 +1,604 @@
#include "ComplexKeyDirectDictionary.h"
#include <IO/WriteHelpers.h>
#include "DictionaryBlockInputStream.h"
#include "DictionaryFactory.h"
#include <Core/Defines.h>
namespace DB
{
namespace ErrorCodes
{
extern const int TYPE_MISMATCH;
extern const int BAD_ARGUMENTS;
extern const int UNSUPPORTED_METHOD;
}
ComplexKeyDirectDictionary::ComplexKeyDirectDictionary(
const std::string & database_,
const std::string & name_,
const DictionaryStructure & dict_struct_,
DictionarySourcePtr source_ptr_,
BlockPtr saved_block_)
: database(database_)
, name(name_)
, full_name{database_.empty() ? name_ : (database_ + "." + name_)}
, dict_struct(dict_struct_)
, source_ptr{std::move(source_ptr_)}
, saved_block{std::move(saved_block_)}
{
if (!this->source_ptr->supportsSelectiveLoad())
throw Exception{full_name + ": source cannot be used with ComplexKeyDirectDictionary", ErrorCodes::UNSUPPORTED_METHOD};
createAttributes();
}
#define DECLARE(TYPE) \
void ComplexKeyDirectDictionary::get##TYPE(const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ResultArrayType<TYPE> & out) const \
{ \
dict_struct.validateKeyTypes(key_types); \
const auto & attribute = getAttribute(attribute_name); \
checkAttributeType(full_name, attribute_name, attribute.type, AttributeUnderlyingType::ut##TYPE); \
\
const auto null_value = std::get<TYPE>(attribute.null_values); \
\
getItemsImpl<TYPE, TYPE>( \
attribute, key_columns, [&](const size_t row, const auto value) { out[row] = value; }, [&](const size_t) { return null_value; }); \
}
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
DECLARE(UInt64)
DECLARE(UInt128)
DECLARE(Int8)
DECLARE(Int16)
DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void ComplexKeyDirectDictionary::getString(
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ColumnString * out) const
{
dict_struct.validateKeyTypes(key_types);
const auto & attribute = getAttribute(attribute_name);
checkAttributeType(full_name, attribute_name, attribute.type, AttributeUnderlyingType::utString);
const auto & null_value = std::get<StringRef>(attribute.null_values);
getItemsStringImpl<StringRef, StringRef>(
attribute,
key_columns,
[&](const size_t, const String value) { const auto ref = StringRef{value}; out->insertData(ref.data, ref.size); },
[&](const size_t) { return String(null_value.data, null_value.size); });
}
#define DECLARE(TYPE) \
void ComplexKeyDirectDictionary::get##TYPE( \
const std::string & attribute_name, \
const Columns & key_columns, \
const DataTypes & key_types, \
const PaddedPODArray<TYPE> & def, \
ResultArrayType<TYPE> & out) const \
{ \
dict_struct.validateKeyTypes(key_types); \
const auto & attribute = getAttribute(attribute_name); \
checkAttributeType(full_name, attribute_name, attribute.type, AttributeUnderlyingType::ut##TYPE); \
\
getItemsImpl<TYPE, TYPE>( \
attribute, key_columns, [&](const size_t row, const auto value) { out[row] = value; }, [&](const size_t row) { return def[row]; }); \
}
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
DECLARE(UInt64)
DECLARE(UInt128)
DECLARE(Int8)
DECLARE(Int16)
DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void ComplexKeyDirectDictionary::getString(
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, const ColumnString * const def, ColumnString * const out) const
{
dict_struct.validateKeyTypes(key_types);
const auto & attribute = getAttribute(attribute_name);
checkAttributeType(full_name, attribute_name, attribute.type, AttributeUnderlyingType::utString);
getItemsStringImpl<StringRef, StringRef>(
attribute,
key_columns,
[&](const size_t, const String value) { const auto ref = StringRef{value}; out->insertData(ref.data, ref.size); },
[&](const size_t row) { const auto ref = def->getDataAt(row); return String(ref.data, ref.size); });
}
#define DECLARE(TYPE) \
void ComplexKeyDirectDictionary::get##TYPE( \
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, const TYPE def, ResultArrayType<TYPE> & out) const \
{ \
dict_struct.validateKeyTypes(key_types); \
const auto & attribute = getAttribute(attribute_name); \
checkAttributeType(full_name, attribute_name, attribute.type, AttributeUnderlyingType::ut##TYPE); \
\
getItemsImpl<TYPE, TYPE>( \
attribute, key_columns, [&](const size_t row, const auto value) { out[row] = value; }, [&](const size_t) { return def; }); \
}
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
DECLARE(UInt64)
DECLARE(UInt128)
DECLARE(Int8)
DECLARE(Int16)
DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void ComplexKeyDirectDictionary::getString(
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, const String & def, ColumnString * const out) const
{
dict_struct.validateKeyTypes(key_types);
const auto & attribute = getAttribute(attribute_name);
checkAttributeType(full_name, attribute_name, attribute.type, AttributeUnderlyingType::utString);
ComplexKeyDirectDictionary::getItemsStringImpl<StringRef, StringRef>(
attribute,
key_columns,
[&](const size_t, const String value) { const auto ref = StringRef{value}; out->insertData(ref.data, ref.size); },
[&](const size_t) { return def; });
}
void ComplexKeyDirectDictionary::has(const Columns & key_columns, const DataTypes & key_types, PaddedPODArray<UInt8> & out) const
{
dict_struct.validateKeyTypes(key_types);
const auto & attribute = attributes.front();
switch (attribute.type)
{
case AttributeUnderlyingType::utUInt8:
has<UInt8>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utUInt16:
has<UInt16>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utUInt32:
has<UInt32>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utUInt64:
has<UInt64>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utUInt128:
has<UInt128>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utInt8:
has<Int8>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utInt16:
has<Int16>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utInt32:
has<Int32>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utInt64:
has<Int64>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utFloat32:
has<Float32>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utFloat64:
has<Float64>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utString:
has<String>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utDecimal32:
has<Decimal32>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utDecimal64:
has<Decimal64>(attribute, key_columns, out);
break;
case AttributeUnderlyingType::utDecimal128:
has<Decimal128>(attribute, key_columns, out);
break;
}
}
void ComplexKeyDirectDictionary::createAttributes()
{
const auto size = dict_struct.attributes.size();
attributes.reserve(size);
for (const auto & attribute : dict_struct.attributes)
{
attribute_index_by_name.emplace(attribute.name, attributes.size());
attribute_name_by_index.emplace(attributes.size(), attribute.name);
attributes.push_back(createAttributeWithType(attribute.underlying_type, attribute.null_value, attribute.name));
if (attribute.hierarchical)
throw Exception{full_name + ": hierarchical attributes not supported for dictionary of type " + getTypeName(),
ErrorCodes::TYPE_MISMATCH};
}
}
template <typename T>
void ComplexKeyDirectDictionary::createAttributeImpl(Attribute & attribute, const Field & null_value)
{
attribute.null_values = T(null_value.get<NearestFieldType<T>>());
}
template <>
void ComplexKeyDirectDictionary::createAttributeImpl<String>(Attribute & attribute, const Field & null_value)
{
attribute.string_arena = std::make_unique<Arena>();
const String & string = null_value.get<String>();
const char * string_in_arena = attribute.string_arena->insert(string.data(), string.size());
attribute.null_values.emplace<StringRef>(string_in_arena, string.size());
}
ComplexKeyDirectDictionary::Attribute ComplexKeyDirectDictionary::createAttributeWithType(const AttributeUnderlyingType type, const Field & null_value, const std::string & attr_name)
{
Attribute attr{type, {}, {}, attr_name};
switch (type)
{
case AttributeUnderlyingType::utUInt8:
createAttributeImpl<UInt8>(attr, null_value);
break;
case AttributeUnderlyingType::utUInt16:
createAttributeImpl<UInt16>(attr, null_value);
break;
case AttributeUnderlyingType::utUInt32:
createAttributeImpl<UInt32>(attr, null_value);
break;
case AttributeUnderlyingType::utUInt64:
createAttributeImpl<UInt64>(attr, null_value);
break;
case AttributeUnderlyingType::utUInt128:
createAttributeImpl<UInt128>(attr, null_value);
break;
case AttributeUnderlyingType::utInt8:
createAttributeImpl<Int8>(attr, null_value);
break;
case AttributeUnderlyingType::utInt16:
createAttributeImpl<Int16>(attr, null_value);
break;
case AttributeUnderlyingType::utInt32:
createAttributeImpl<Int32>(attr, null_value);
break;
case AttributeUnderlyingType::utInt64:
createAttributeImpl<Int64>(attr, null_value);
break;
case AttributeUnderlyingType::utFloat32:
createAttributeImpl<Float32>(attr, null_value);
break;
case AttributeUnderlyingType::utFloat64:
createAttributeImpl<Float64>(attr, null_value);
break;
case AttributeUnderlyingType::utString:
createAttributeImpl<String>(attr, null_value);
break;
case AttributeUnderlyingType::utDecimal32:
createAttributeImpl<Decimal32>(attr, null_value);
break;
case AttributeUnderlyingType::utDecimal64:
createAttributeImpl<Decimal64>(attr, null_value);
break;
case AttributeUnderlyingType::utDecimal128:
createAttributeImpl<Decimal128>(attr, null_value);
break;
}
return attr;
}
template <typename Pool>
StringRef ComplexKeyDirectDictionary::placeKeysInPool(
const size_t row, const Columns & key_columns, StringRefs & keys, const std::vector<DictionaryAttribute> & key_attributes, Pool & pool) const
{
const auto keys_size = key_columns.size();
size_t sum_keys_size{};
for (size_t j = 0; j < keys_size; ++j)
{
keys[j] = key_columns[j]->getDataAt(row);
sum_keys_size += keys[j].size;
if (key_attributes[j].underlying_type == AttributeUnderlyingType::utString)
sum_keys_size += sizeof(size_t) + 1;
}
auto place = pool.alloc(sum_keys_size);
auto key_start = place;
for (size_t j = 0; j < keys_size; ++j)
{
if (key_attributes[j].underlying_type == AttributeUnderlyingType::utString)
{
auto start = key_start;
auto key_size = keys[j].size + 1;
memcpy(key_start, &key_size, sizeof(size_t));
key_start += sizeof(size_t);
memcpy(key_start, keys[j].data, keys[j].size);
key_start += keys[j].size;
*key_start = '\0';
++key_start;
keys[j].data = start;
keys[j].size += sizeof(size_t) + 1;
}
else
{
memcpy(key_start, keys[j].data, keys[j].size);
keys[j].data = key_start;
key_start += keys[j].size;
}
}
return {place, sum_keys_size};
}
template <typename AttributeType, typename OutputType, typename ValueSetter, typename DefaultGetter>
void ComplexKeyDirectDictionary::getItemsImpl(
const Attribute & attribute, const Columns & key_columns, ValueSetter && set_value, DefaultGetter && get_default) const
{
const auto rows = key_columns.front()->size();
const auto keys_size = dict_struct.key->size();
StringRefs keys_array(keys_size);
MapType<OutputType> value_by_key;
Arena temporary_keys_pool;
std::vector<size_t> to_load(rows);
PODArray<StringRef> keys(rows);
for (const auto row : ext::range(0, rows))
{
const StringRef key = placeKeysInPool(row, key_columns, keys_array, *dict_struct.key, temporary_keys_pool);
keys[row] = key;
value_by_key[key] = get_default(row);
to_load[row] = row;
}
auto stream = source_ptr->loadKeys(key_columns, to_load);
const auto attributes_size = attributes.size();
stream->readPrefix();
while (const auto block = stream->read())
{
const auto columns = ext::map<Columns>(
ext::range(0, keys_size), [&](const size_t attribute_idx) { return block.safeGetByPosition(attribute_idx).column; });
const auto attribute_columns = ext::map<Columns>(ext::range(0, attributes_size), [&](const size_t attribute_idx)
{
return block.safeGetByPosition(keys_size + attribute_idx).column;
});
for (const size_t attribute_idx : ext::range(0, attributes.size()))
{
const IColumn & attribute_column = *attribute_columns[attribute_idx];
Arena pool;
StringRefs keys_temp(keys_size);
const auto columns_size = columns.front()->size();
for (const auto row_idx : ext::range(0, columns_size))
{
const StringRef key = placeKeysInPool(row_idx, columns, keys_temp, *dict_struct.key, pool);
if (value_by_key.has(key) && attribute.name == attribute_name_by_index.at(attribute_idx))
{
if (attribute.type == AttributeUnderlyingType::utFloat32)
{
value_by_key[key] = static_cast<Float32>(attribute_column[row_idx].template get<Float64>());
}
else
{
value_by_key[key] = static_cast<OutputType>(attribute_column[row_idx].template get<AttributeType>());
}
}
}
}
}
stream->readSuffix();
for (const auto row : ext::range(0, rows))
{
set_value(row, value_by_key[keys[row]]);
}
query_count.fetch_add(rows, std::memory_order_relaxed);
}
template <typename AttributeType, typename OutputType, typename ValueSetter, typename DefaultGetter>
void ComplexKeyDirectDictionary::getItemsStringImpl(
const Attribute & attribute, const Columns & key_columns, ValueSetter && set_value, DefaultGetter && get_default) const
{
const auto rows = key_columns.front()->size();
const auto keys_size = dict_struct.key->size();
StringRefs keys_array(keys_size);
MapType<String> value_by_key;
Arena temporary_keys_pool;
std::vector<size_t> to_load(rows);
PODArray<StringRef> keys(rows);
for (const auto row : ext::range(0, rows))
{
const StringRef key = placeKeysInPool(row, key_columns, keys_array, *dict_struct.key, temporary_keys_pool);
keys[row] = key;
value_by_key[key] = get_default(row);
to_load[row] = row;
}
auto stream = source_ptr->loadKeys(key_columns, to_load);
const auto attributes_size = attributes.size();
stream->readPrefix();
while (const auto block = stream->read())
{
const auto columns = ext::map<Columns>(
ext::range(0, keys_size), [&](const size_t attribute_idx) { return block.safeGetByPosition(attribute_idx).column; });
const auto attribute_columns = ext::map<Columns>(ext::range(0, attributes_size), [&](const size_t attribute_idx)
{
return block.safeGetByPosition(keys_size + attribute_idx).column;
});
for (const size_t attribute_idx : ext::range(0, attributes.size()))
{
const IColumn & attribute_column = *attribute_columns[attribute_idx];
Arena pool;
StringRefs keys_temp(keys_size);
const auto columns_size = columns.front()->size();
for (const auto row_idx : ext::range(0, columns_size))
{
const StringRef key = placeKeysInPool(row_idx, columns, keys_temp, *dict_struct.key, pool);
if (value_by_key.has(key) && attribute.name == attribute_name_by_index.at(attribute_idx))
{
const String from_source = attribute_column[row_idx].template get<String>();
value_by_key[key] = from_source;
}
}
}
}
stream->readSuffix();
for (const auto row : ext::range(0, rows))
{
set_value(row, value_by_key[keys[row]]);
}
query_count.fetch_add(rows, std::memory_order_relaxed);
}
const ComplexKeyDirectDictionary::Attribute & ComplexKeyDirectDictionary::getAttribute(const std::string & attribute_name) const
{
const auto it = attribute_index_by_name.find(attribute_name);
if (it == std::end(attribute_index_by_name))
throw Exception{full_name + ": no such attribute '" + attribute_name + "'", ErrorCodes::BAD_ARGUMENTS};
return attributes[it->second];
}
template <typename T>
void ComplexKeyDirectDictionary::has(const Attribute & attribute, const Columns & key_columns, PaddedPODArray<UInt8> & out) const
{
const auto rows = key_columns.front()->size();
const auto keys_size = dict_struct.key->size();
StringRefs keys_array(keys_size);
MapType<UInt8> has_key;
Arena temporary_keys_pool;
std::vector<size_t> to_load(rows);
PODArray<StringRef> keys(rows);
for (const auto row : ext::range(0, rows))
{
const StringRef key = placeKeysInPool(row, key_columns, keys_array, *dict_struct.key, temporary_keys_pool);
keys[row] = key;
has_key[key] = 0;
to_load[row] = row;
}
auto stream = source_ptr->loadKeys(key_columns, to_load);
stream->readPrefix();
while (const auto block = stream->read())
{
const auto columns = ext::map<Columns>(
ext::range(0, keys_size), [&](const size_t attribute_idx) { return block.safeGetByPosition(attribute_idx).column; });
for (const size_t attribute_idx : ext::range(0, attributes.size()))
{
Arena pool;
StringRefs keys_temp(keys_size);
const auto columns_size = columns.front()->size();
for (const auto row_idx : ext::range(0, columns_size))
{
const StringRef key = placeKeysInPool(row_idx, columns, keys_temp, *dict_struct.key, pool);
if (has_key.has(key) && attribute.name == attribute_name_by_index.at(attribute_idx))
{
has_key[key] = 1;
}
}
}
}
stream->readSuffix();
for (const auto row : ext::range(0, rows))
{
out[row] = has_key[keys[row]];
}
query_count.fetch_add(rows, std::memory_order_relaxed);
}
BlockInputStreamPtr ComplexKeyDirectDictionary::getBlockInputStream(const Names & /* column_names */, size_t /* max_block_size */) const
{
return source_ptr->loadAll();
}
void registerDictionaryComplexKeyDirect(DictionaryFactory & factory)
{
auto create_layout = [=](const std::string & full_name,
const DictionaryStructure & dict_struct,
const Poco::Util::AbstractConfiguration & config,
const std::string & config_prefix,
DictionarySourcePtr source_ptr) -> DictionaryPtr
{
if (!dict_struct.key)
throw Exception{"'key' is required for dictionary of layout 'complex_key_direct'", ErrorCodes::BAD_ARGUMENTS};
if (dict_struct.range_min || dict_struct.range_max)
throw Exception{full_name
+ ": elements .structure.range_min and .structure.range_max should be defined only "
"for a dictionary of layout 'range_hashed'",
ErrorCodes::BAD_ARGUMENTS};
const String database = config.getString(config_prefix + ".database", "");
const String name = config.getString(config_prefix + ".name");
if (config.has(config_prefix + ".lifetime.min") || config.has(config_prefix + ".lifetime.max"))
throw Exception{"'lifetime' parameter is redundant for the dictionary' of layout 'direct'", ErrorCodes::BAD_ARGUMENTS};
return std::make_unique<ComplexKeyDirectDictionary>(database, name, dict_struct, std::move(source_ptr));
};
factory.registerLayout("complex_key_direct", create_layout, false);
}
}

View File

@ -0,0 +1,225 @@
#pragma once
#include <atomic>
#include <variant>
#include <vector>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <Common/Arena.h>
#include <Core/Block.h>
#include <Common/HashTable/HashMap.h>
#include <ext/range.h>
#include <ext/size.h>
#include <ext/map.h>
#include "DictionaryStructure.h"
#include "IDictionary.h"
#include "IDictionarySource.h"
namespace DB
{
using BlockPtr = std::shared_ptr<Block>;
class ComplexKeyDirectDictionary final : public IDictionaryBase
{
public:
ComplexKeyDirectDictionary(
const std::string & database_,
const std::string & name_,
const DictionaryStructure & dict_struct_,
DictionarySourcePtr source_ptr_,
BlockPtr saved_block_ = nullptr);
const std::string & getDatabase() const override { return database; }
const std::string & getName() const override { return name; }
const std::string & getFullName() const override { return full_name; }
std::string getTypeName() const override { return "ComplexKeyDirect"; }
size_t getBytesAllocated() const override { return 0; }
size_t getQueryCount() const override { return query_count.load(std::memory_order_relaxed); }
double getHitRate() const override { return 1.0; }
size_t getElementCount() const override { return 0; }
double getLoadFactor() const override { return 0; }
std::string getKeyDescription() const { return key_description; }
std::shared_ptr<const IExternalLoadable> clone() const override
{
return std::make_shared<ComplexKeyDirectDictionary>(database, name, dict_struct, source_ptr->clone(), saved_block);
}
const IDictionarySource * getSource() const override { return source_ptr.get(); }
const DictionaryLifetime & getLifetime() const override { return dict_lifetime; }
const DictionaryStructure & getStructure() const override { return dict_struct; }
bool isInjective(const std::string & attribute_name) const override
{
return dict_struct.attributes[&getAttribute(attribute_name) - attributes.data()].injective;
}
template <typename T>
using ResultArrayType = std::conditional_t<IsDecimalNumber<T>, DecimalPaddedPODArray<T>, PaddedPODArray<T>>;
#define DECLARE(TYPE) \
void get##TYPE(const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
DECLARE(UInt64)
DECLARE(UInt128)
DECLARE(Int8)
DECLARE(Int16)
DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ColumnString * out) const;
#define DECLARE(TYPE) \
void get##TYPE( \
const std::string & attribute_name, \
const Columns & key_columns, \
const DataTypes & key_types, \
const PaddedPODArray<TYPE> & def, \
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
DECLARE(UInt64)
DECLARE(UInt128)
DECLARE(Int8)
DECLARE(Int16)
DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, const ColumnString * const def, ColumnString * const out) const;
#define DECLARE(TYPE) \
void get##TYPE(const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, const TYPE def, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
DECLARE(UInt64)
DECLARE(UInt128)
DECLARE(Int8)
DECLARE(Int16)
DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, const String & def, ColumnString * const out) const;
void has(const Columns & key_columns, const DataTypes & key_types, PaddedPODArray<UInt8> & out) const;
BlockInputStreamPtr getBlockInputStream(const Names & column_names, size_t max_block_size) const override;
private:
template <typename Value>
using MapType = HashMapWithSavedHash<StringRef, Value, StringRefHash>;
struct Attribute final
{
AttributeUnderlyingType type;
std::variant<
UInt8,
UInt16,
UInt32,
UInt64,
UInt128,
Int8,
Int16,
Int32,
Int64,
Decimal32,
Decimal64,
Decimal128,
Float32,
Float64,
StringRef>
null_values;
std::unique_ptr<Arena> string_arena;
std::string name;
};
void createAttributes();
template <typename T>
void addAttributeSize(const Attribute & attribute);
void calculateBytesAllocated();
template <typename T>
void createAttributeImpl(Attribute & attribute, const Field & null_value);
Attribute createAttributeWithType(const AttributeUnderlyingType type, const Field & null_value, const std::string & name);
template <typename Pool>
StringRef placeKeysInPool(
const size_t row, const Columns & key_columns, StringRefs & keys, const std::vector<DictionaryAttribute> & key_attributes, Pool & pool) const;
template <typename AttributeType, typename OutputType, typename ValueSetter, typename DefaultGetter>
void getItemsStringImpl(
const Attribute & attribute, const Columns & key_columns, ValueSetter && set_value, DefaultGetter && get_default) const;
template <typename AttributeType, typename OutputType, typename ValueSetter, typename DefaultGetter>
void getItemsImpl(
const Attribute & attribute, const Columns & key_columns, ValueSetter && set_value, DefaultGetter && get_default) const;
template <typename T>
void resize(Attribute & attribute, const Key id);
template <typename T>
void setAttributeValueImpl(Attribute & attribute, const Key id, const T & value);
void setAttributeValue(Attribute & attribute, const Key id, const Field & value);
const Attribute & getAttribute(const std::string & attribute_name) const;
template <typename T>
void has(const Attribute & attribute, const Columns & key_columns, PaddedPODArray<UInt8> & out) const;
const std::string database;
const std::string name;
const std::string full_name;
const DictionaryStructure dict_struct;
const DictionarySourcePtr source_ptr;
const DictionaryLifetime dict_lifetime;
std::map<std::string, size_t> attribute_index_by_name;
std::map<size_t, std::string> attribute_name_by_index;
std::vector<Attribute> attributes;
mutable std::atomic<size_t> query_count{0};
BlockPtr saved_block;
const std::string key_description{dict_struct.getKeyDescription()};
};
}

View File

@ -28,6 +28,9 @@ DirectDictionary::DirectDictionary(
, source_ptr{std::move(source_ptr_)}
, saved_block{std::move(saved_block_)}
{
if (!this->source_ptr->supportsSelectiveLoad())
throw Exception{full_name + ": source cannot be used with DirectDictionary", ErrorCodes::UNSUPPORTED_METHOD};
createAttributes();
}

View File

@ -414,7 +414,8 @@ void checkAST(const ASTCreateQuery & query)
if (query.dictionary->layout == nullptr)
throw Exception("Cannot create dictionary with empty layout", ErrorCodes::INCORRECT_DICTIONARY_DEFINITION);
const auto is_direct_layout = !strcasecmp(query.dictionary->layout->layout_type.data(), "direct");
const auto is_direct_layout = !strcasecmp(query.dictionary->layout->layout_type.data(), "direct") ||
!strcasecmp(query.dictionary->layout->layout_type.data(), "complex_key_direct");
if (query.dictionary->lifetime == nullptr && !is_direct_layout)
throw Exception("Cannot create dictionary with empty lifetime", ErrorCodes::INCORRECT_DICTIONARY_DEFINITION);

View File

@ -25,6 +25,7 @@ void registerDictionaries()
registerDictionaryRangeHashed(factory);
registerDictionaryComplexKeyHashed(factory);
registerDictionaryComplexKeyCache(factory);
registerDictionaryComplexKeyDirect(factory);
#if !defined(ARCADIA_BUILD)
registerDictionaryTrie(factory);
#endif

View File

@ -20,6 +20,7 @@ class DictionaryFactory;
void registerDictionaryRangeHashed(DictionaryFactory & factory);
void registerDictionaryComplexKeyHashed(DictionaryFactory & factory);
void registerDictionaryComplexKeyCache(DictionaryFactory & factory);
void registerDictionaryComplexKeyDirect(DictionaryFactory & factory);
void registerDictionaryTrie(DictionaryFactory & factory);
void registerDictionaryFlat(DictionaryFactory & factory);
void registerDictionaryHashed(DictionaryFactory & factory);

View File

@ -25,6 +25,7 @@ SRCS(
ComplexKeyCacheDictionary_setAttributeValue.cpp
ComplexKeyCacheDictionary_setDefaultAttributeValue.cpp
ComplexKeyHashedDictionary.cpp
ComplexKeyDirectDictionary.cpp
DictionaryBlockInputStreamBase.cpp
DictionaryFactory.cpp
DictionarySourceFactory.cpp

View File

@ -12,11 +12,13 @@
namespace DB
{
namespace ErrorCodes
{
extern const int UNKNOWN_ELEMENT_IN_CONFIG;
extern const int EXCESSIVE_ELEMENT_IN_CONFIG;
extern const int PATH_ACCESS_DENIED;
extern const int INCORRECT_DISK_INDEX;
}
std::mutex DiskLocal::reservation_mutex;
@ -34,7 +36,9 @@ public:
UInt64 getSize() const override { return size; }
DiskPtr getDisk() const override { return disk; }
DiskPtr getDisk(size_t i) const override;
Disks getDisks() const override { return {disk}; }
void update(UInt64 new_size) override;
@ -282,6 +286,15 @@ void DiskLocal::copy(const String & from_path, const std::shared_ptr<IDisk> & to
IDisk::copy(from_path, to_disk, to_path); /// Copy files through buffers.
}
DiskPtr DiskLocalReservation::getDisk(size_t i) const
{
if (i != 0)
{
throw Exception("Can't use i != 0 with single disk reservation", ErrorCodes::INCORRECT_DISK_INDEX);
}
return disk;
}
void DiskLocalReservation::update(UInt64 new_size)
{
std::lock_guard lock(DiskLocal::reservation_mutex);

View File

@ -55,7 +55,7 @@ DiskSelectorPtr DiskSelector::updateFromConfig(
constexpr auto default_disk_name = "default";
std::set<String> old_disks_minus_new_disks;
for (const auto & [disk_name, _] : result->disks)
for (const auto & [disk_name, _] : result->getDisksMap())
{
old_disks_minus_new_disks.insert(disk_name);
}
@ -65,10 +65,10 @@ DiskSelectorPtr DiskSelector::updateFromConfig(
if (!std::all_of(disk_name.begin(), disk_name.end(), isWordCharASCII))
throw Exception("Disk name can contain only alphanumeric and '_' (" + disk_name + ")", ErrorCodes::EXCESSIVE_ELEMENT_IN_CONFIG);
if (result->disks.count(disk_name) == 0)
if (result->getDisksMap().count(disk_name) == 0)
{
auto disk_config_prefix = config_prefix + "." + disk_name;
result->disks.emplace(disk_name, factory.create(disk_name, config, disk_config_prefix, context));
result->addToDiskMap(disk_name, factory.create(disk_name, config, disk_config_prefix, context));
}
else
{

View File

@ -29,6 +29,10 @@ public:
/// Get all disks with names
const auto & getDisksMap() const { return disks; }
void addToDiskMap(String name, DiskPtr disk)
{
disks.emplace(name, disk);
}
private:
std::map<String, DiskPtr> disks;

View File

@ -206,8 +206,11 @@ public:
/// Get reservation size.
virtual UInt64 getSize() const = 0;
/// Get disk where reservation take place.
virtual DiskPtr getDisk() const = 0;
/// Get i-th disk where reservation take place.
virtual DiskPtr getDisk(size_t i = 0) const = 0;
/// Get all disks, used in reservation
virtual Disks getDisks() const = 0;
/// Changes amount of reserved space.
virtual void update(UInt64 new_size) = 0;

View File

@ -8,6 +8,17 @@
namespace DB
{
enum class VolumeType
{
JBOD,
SINGLE_DISK,
UNKNOWN
};
class IVolume;
using VolumePtr = std::shared_ptr<IVolume>;
using Volumes = std::vector<VolumePtr>;
/**
* Disks group by some (user) criteria. For example,
* - VolumeJBOD("slow_disks", [d1, d2], 100)
@ -22,7 +33,7 @@ namespace DB
class IVolume : public Space
{
public:
IVolume(String name_, Disks disks_): disks(std::move(disks_)), name(std::move(name_))
IVolume(String name_, Disks disks_): disks(std::move(disks_)), name(name_)
{
}
@ -37,16 +48,17 @@ public:
/// Volume name from config
const String & getName() const override { return name; }
virtual VolumeType getType() const = 0;
/// Return biggest unreserved space across all disks
UInt64 getMaxUnreservedFreeSpace() const;
Disks disks;
DiskPtr getDisk(size_t i = 0) const { return disks[i]; }
const Disks & getDisks() const { return disks; }
protected:
Disks disks;
const String name;
};
using VolumePtr = std::shared_ptr<IVolume>;
using Volumes = std::vector<VolumePtr>;
}

View File

@ -28,6 +28,7 @@ namespace ErrorCodes
extern const int FILE_ALREADY_EXISTS;
extern const int CANNOT_SEEK_THROUGH_FILE;
extern const int UNKNOWN_FORMAT;
extern const int INCORRECT_DISK_INDEX;
}
namespace
@ -369,7 +370,16 @@ public:
UInt64 getSize() const override { return size; }
DiskPtr getDisk() const override { return disk; }
DiskPtr getDisk(size_t i) const override
{
if (i != 0)
{
throw Exception("Can't use i != 0 with single disk reservation", ErrorCodes::INCORRECT_DISK_INDEX);
}
return disk;
}
Disks getDisks() const override { return {disk}; }
void update(UInt64 new_size) override
{

View File

@ -0,0 +1,6 @@
#include <Disks/SingleDiskVolume.h>
namespace DB
{
}

View File

@ -0,0 +1,26 @@
#pragma once
#include <Disks/IVolume.h>
namespace DB
{
class SingleDiskVolume : public IVolume
{
public:
SingleDiskVolume(const String & name_, DiskPtr disk): IVolume(name_, {disk})
{
}
ReservationPtr reserve(UInt64 bytes) override
{
return disks[0]->reserve(bytes);
}
VolumeType getType() const override { return VolumeType::SINGLE_DISK; }
};
using VolumeSingleDiskPtr = std::shared_ptr<SingleDiskVolume>;
}

View File

@ -55,7 +55,7 @@ StoragePolicy::StoragePolicy(
std::set<String> disk_names;
for (const auto & volume : volumes)
{
for (const auto & disk : volume->disks)
for (const auto & disk : volume->getDisks())
{
if (disk_names.find(disk->getName()) != disk_names.end())
throw Exception(
@ -102,7 +102,7 @@ bool StoragePolicy::isDefaultPolicy() const
if (volumes[0]->getName() != "default")
return false;
const auto & disks = volumes[0]->disks;
const auto & disks = volumes[0]->getDisks();
if (disks.size() != 1)
return false;
@ -117,7 +117,7 @@ Disks StoragePolicy::getDisks() const
{
Disks res;
for (const auto & volume : volumes)
for (const auto & disk : volume->disks)
for (const auto & disk : volume->getDisks())
res.push_back(disk);
return res;
}
@ -130,17 +130,17 @@ DiskPtr StoragePolicy::getAnyDisk() const
if (volumes.empty())
throw Exception("StoragePolicy has no volumes. It's a bug.", ErrorCodes::LOGICAL_ERROR);
if (volumes[0]->disks.empty())
if (volumes[0]->getDisks().empty())
throw Exception("Volume '" + volumes[0]->getName() + "' has no disks. It's a bug.", ErrorCodes::LOGICAL_ERROR);
return volumes[0]->disks[0];
return volumes[0]->getDisks()[0];
}
DiskPtr StoragePolicy::getDiskByName(const String & disk_name) const
{
for (auto && volume : volumes)
for (auto && disk : volume->disks)
for (auto && disk : volume->getDisks())
if (disk->getName() == disk_name)
return disk;
return {};
@ -181,7 +181,7 @@ ReservationPtr StoragePolicy::makeEmptyReservationOnLargestDisk() const
DiskPtr max_disk;
for (const auto & volume : volumes)
{
for (const auto & disk : volume->disks)
for (const auto & disk : volume->getDisks())
{
auto avail_space = disk->getAvailableSpace();
if (avail_space > max_space)
@ -207,10 +207,10 @@ void StoragePolicy::checkCompatibleWith(const StoragePolicyPtr & new_storage_pol
throw Exception("New storage policy shall contain volumes of old one", ErrorCodes::LOGICAL_ERROR);
std::unordered_set<String> new_disk_names;
for (const auto & disk : new_storage_policy->getVolumeByName(volume->getName())->disks)
for (const auto & disk : new_storage_policy->getVolumeByName(volume->getName())->getDisks())
new_disk_names.insert(disk->getName());
for (const auto & disk : volume->disks)
for (const auto & disk : volume->getDisks())
if (new_disk_names.count(disk->getName()) == 0)
throw Exception("New storage policy shall contain disks of old one", ErrorCodes::LOGICAL_ERROR);
}
@ -222,7 +222,7 @@ size_t StoragePolicy::getVolumeIndexByDisk(const DiskPtr & disk_ptr) const
for (size_t i = 0; i < volumes.size(); ++i)
{
const auto & volume = volumes[i];
for (const auto & disk : volume->disks)
for (const auto & disk : volume->getDisks())
if (disk->getName() == disk_ptr->getName())
return i;
}

View File

@ -4,6 +4,7 @@
#include <Disks/IDisk.h>
#include <Disks/IVolume.h>
#include <Disks/VolumeJBOD.h>
#include <Disks/SingleDiskVolume.h>
#include <IO/WriteHelpers.h>
#include <Common/CurrentMetrics.h>
#include <Common/Exception.h>

View File

@ -25,6 +25,8 @@ public:
DiskSelectorPtr disk_selector
);
VolumeType getType() const override { return VolumeType::JBOD; }
/// Next disk (round-robin)
///
/// - Used with policy for temporary data

View File

@ -0,0 +1,17 @@
#include "createVolume.h"
namespace DB
{
VolumePtr createVolumeFromReservation(const ReservationPtr & reservation, VolumePtr other_volume)
{
if (other_volume->getType() == VolumeType::JBOD || other_volume->getType() == VolumeType::SINGLE_DISK)
{
/// Since reservation on JBOD chosies one of disks and makes reservation there, volume
/// for such type of reservation will be with one disk.
return std::make_shared<SingleDiskVolume>(other_volume->getName(), reservation->getDisk());
}
return nullptr;
}
}

12
src/Disks/createVolume.h Normal file
View File

@ -0,0 +1,12 @@
#pragma once
#include <Disks/IVolume.h>
#include <Disks/VolumeJBOD.h>
#include <Disks/SingleDiskVolume.h>
namespace DB
{
VolumePtr createVolumeFromReservation(const ReservationPtr & reservation, VolumePtr other_volume);
}

Some files were not shown because too many files have changed in this diff Show More